An expert assistant for working with the homer_peakcalling_from_bam Nextflow pipeline, a ChIP-seq/ChEC-seq analysis tool built with the nf-core template.
This skill provides specialized guidance for pipeline execution, module development, HOMER parameter tuning, and ChIP-seq/ChEC-seq analysis workflows.
- **Repository**: cmatkhan/homer_peakcalling_from_bam
- **Type**: Independent Nextflow pipeline (nf-core template v3.3.2)
- **Purpose**: ChIP-seq/ChEC-seq peak calling from BAM files using HOMER
- **Nextflow requirement**: >=24.10.5
1. **Entry point**: `main.nf` - Contains `CMATKHAN_HOMER_PEAKCALLING_FROM_BAM` workflow
2. **Primary workflow**: `workflows/homer_peakcalling_from_bam.nf`
   - BAM filtering with SAMTOOLS_VIEW
   - Optional blacklist removal with BEDTOOLS_INTERSECT
   - HOMER_PEAKCALLING subworkflow
   - MultiQC report generation
3. **HOMER subworkflow**: `subworkflows/local/homer_peakcalling/main.nf`
   - Tag directory creation
   - Peak calling with/without control
   - Peak format conversion
   - Optional annotation and merging

**HOMER modules**:
- `maketagdirectory` - Creates tag directories
- `findpeaks` - Peak calling (multiple styles)
- `pos2bed` - Peak format conversion
- `makeucscfile` - bedGraph generation
- `mergepeaks` - Multi-sample peak merging
- `annotatepeaks` - Genomic annotation
When users need to execute the pipeline:
```bash
# Basic run with Docker
nextflow run cmatkhan/homer_peakcalling_from_bam \
    -profile docker \
    --input samplesheet.csv \
    --outdir results

# Resume a previous run
nextflow run cmatkhan/homer_peakcalling_from_bam \
    -profile docker \
    --input samplesheet.csv \
    --outdir results \
    -resume

# ChIP-exo profile
nextflow run cmatkhan/homer_peakcalling_from_bam \
    -profile chipexo,docker \
    --input samplesheet.csv \
    --outdir results

# ChEC-seq profile
nextflow run cmatkhan/homer_peakcalling_from_bam \
    -profile chec,docker \
    --input samplesheet.csv \
    --outdir results

# Test profile
nextflow run cmatkhan/homer_peakcalling_from_bam \
    -profile test,docker \
    --outdir test_results
```
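The commands above differ mainly in the `-profile` value. A minimal sketch of choosing that value from the experiment type follows; `profile_for` is a hypothetical helper, not part of the pipeline, and `docker` is assumed as the container profile only because the examples use it.

```bash
# Hypothetical helper: map an experiment type to a -profile value.
# The "docker" suffix mirrors the examples; swap in the container
# profile you actually use.
profile_for() {
    case "$1" in
        chipexo) echo "chipexo,docker" ;;
        chec)    echo "chec,docker" ;;
        default) echo "docker" ;;
        *)       echo "unknown experiment type: $1" >&2; return 1 ;;
    esac
}

# Example invocation (not executed here):
#   nextflow run cmatkhan/homer_peakcalling_from_bam \
#       -profile "$(profile_for chec)" --input samplesheet.csv --outdir results
```

Rejecting unknown types up front avoids silently falling back to the default HOMER settings for a ChIP-exo or ChEC-seq dataset.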
**Input samplesheet format** (CSV):
```csv
sample,bam
SAMPLE1,/path/to/sample1.bam
SAMPLE2,/path/to/sample2.bam
```
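For a directory of BAM files, a samplesheet in this format could be generated along the following lines. This is a sketch only: `make_samplesheet` is a hypothetical helper, and it assumes the sample ID should be the file name without its `.bam` extension.

```bash
# Hypothetical helper: write a sample,bam samplesheet for every *.bam
# file in a directory. Assumption: sample ID = file name minus ".bam".
make_samplesheet() {
    local bam_dir out_csv bam
    bam_dir="$(cd "$1" && pwd)"    # absolute path to the BAM directory
    out_csv="$2"
    echo "sample,bam" > "$out_csv"
    for bam in "$bam_dir"/*.bam; do
        [ -e "$bam" ] || continue  # the glob matched nothing
        printf '%s,%s\n' "$(basename "$bam" .bam)" "$bam" >> "$out_csv"
    done
}

# Example: make_samplesheet /path/to/bams samplesheet.csv
```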
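**Required**:
- `--input`: CSV samplesheet with `sample,bam` columns
- `--outdir`: Output directory
- `--fasta`: Reference genome FASTA

**Optional**:
- `--gtf`: Gene annotation, needed only for annotation features
- `--control_bam`: Control BAM for peak calling against a control
- `--blacklist_bed`: BED file of blacklist regions to remove
- `--merge_peaks`, `--annotate_individual`, `--make_bedgraph`: Toggles for peak merging, annotation, and bedGraph generation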
```
outdir/
├── homer/tagdir/ # Tag directories per sample
├── findpeaks/ # Peak files (.txt and .bed)
├── mergepeaks/ # Merged peaks (if enabled)
├── annotatepeaks/ # Annotated peaks (if enabled)
├── multiqc/ # QC report
└── pipeline_info/ # Execution metadata
```
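After a run, the layout above can be sanity-checked with a few lines of shell. The sketch below is illustrative: `check_outputs` is a hypothetical helper, and it only checks the directories that the tree does not mark as conditional ("if enabled").

```bash
# Hypothetical helper: report which of the always-expected output
# directories are missing under a given outdir.
check_outputs() {
    local outdir="$1" missing=0 sub
    for sub in homer/tagdir findpeaks multiqc pipeline_info; do
        if [ ! -d "$outdir/$sub" ]; then
            echo "missing: $sub"
            missing=1
        fi
    done
    return "$missing"   # 0 when everything is present, 1 otherwise
}

# Example: check_outputs results && echo "output layout looks complete"
```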
HOMER parameters are controlled via profile configs. Edit `conf/chipexo.config`, `conf/chec.conf`, or `conf/modules.config`:
```groovy
process {
    withName: '.*:HOMER_FINDPEAKS.*' {
        ext.args = [
            '-L 2',          // Fold enrichment over local background
            '-P 0.001',      // Poisson p-value threshold (relative to control)
            '-minDist 50',   // Minimum distance between peaks
            '-fdr 0.05'      // False discovery rate
        ].join(' ')
    }
    withName: '.*:HOMER_MAKETAGDIRECTORY.*' {
        ext.args = [
            '-fragLength 1', // Fragment length
            '-single'        // Write a single tags.tsv for all chromosomes
        ].join(' ')
    }
}
```
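To try different HOMER arguments without editing the pipeline's own config files, one option is to put the overrides in a separate file and pass it with Nextflow's `-c` option. The sketch below assumes the same process selector as the config above; the file name `custom_homer.config` is hypothetical.

```bash
# Write a small override file (hypothetical name: custom_homer.config).
cat > custom_homer.config <<'EOF'
process {
    withName: '.*:HOMER_FINDPEAKS.*' {
        ext.args = [
            '-L 2',
            '-P 0.001'
        ].join(' ')
    }
}
EOF

# Then layer it on top of the chosen profile (not executed here):
#   nextflow run cmatkhan/homer_peakcalling_from_bam -profile docker \
#       -c custom_homer.config --input samplesheet.csv --outdir results
```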
**For nf-core modules**:
```bash
nf-core modules install <module_name>
```
**For local HOMER modules**:
- `main.nf` - Module definition
- `meta.yml` - Metadata
- `environment.yml` - Conda environment (optional)
**Channel pattern**: All modules use `[ meta, file ]` tuples where `meta` is a map with sample info
```bash
# Pipeline-level test
nf-test test tests/default.nf.test
# HOMER subworkflow test
nf-test test subworkflows/local/homer_peakcalling/tests/main.nf.test
# Lint and formatting hooks
pre-commit run --all-files
```
**chipexo profile** (`conf/chipexo.config`):
**chec profile** (`conf/chec.conf`):
**base profile** (`conf/base.config`):
**Known limitations**:
**For resume issues**:
**For parameter issues**:
```bash
# ChIP-seq with a control BAM and annotation inputs
nextflow run cmatkhan/homer_peakcalling_from_bam \
    -profile docker \
    --input samples.csv \
    --fasta genome.fa \
    --gtf genes.gtf \
    --control_bam input.bam \
    --outdir chip_results
```
```bash
# ChIP-exo with peak merging and annotation
nextflow run cmatkhan/homer_peakcalling_from_bam \
    -profile chipexo,docker \
    --input chipexo_samples.csv \
    --fasta genome.fa \
    --gtf genes.gtf \
    --merge_peaks true \
    --annotate_individual true \
    --outdir chipexo_results
```
```bash
# ChEC-seq with blacklist filtering and bedGraph output
nextflow run cmatkhan/homer_peakcalling_from_bam \
    -profile chec,docker \
    --input chec_samples.csv \
    --fasta genome.fa \
    --blacklist_bed blacklist.bed \
    --make_bedgraph true \
    --outdir chec_results
```
1. **Nextflow version**: Requires >=24.10.5
2. **Input format**: CSV samplesheet with sample,bam columns
3. **Reference genome**: FASTA file required
4. **GTF requirement**: Needed only for annotation features
5. **Control BAM**: Converted to special `[[id: 'realControl'], file]` format internally
6. **Empty inputs**: Use `[]` or `Channel.empty()` for optional inputs
7. **DSL2 conventions**: All channels typed as `[ meta, file ]` tuples
8. **Testing framework**: Uses nf-test (not native Nextflow tests)
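
The Nextflow version requirement in the first item can be checked before launching a run. The following is a minimal sketch: `version_ge` is a hypothetical helper, it relies on `sort -V` (available in GNU coreutils), and it takes the installed version as a plain string rather than parsing Nextflow's own output.

```bash
# Hypothetical helper: succeed when version $1 is >= version $2.
# Assumes a sort that supports -V (version ordering).
version_ge() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n1)" = "$2" ]
}

# Example, with the installed version supplied by hand:
#   version_ge "25.04.0" "24.10.5" || echo "Nextflow is too old for this pipeline"
```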
1. Always use the appropriate profile for the experiment type (chipexo, chec, or default)
2. Test parameter changes with the test profile before running the full dataset
3. Use `-resume` to restart failed runs and save compute time
4. Keep module-specific arguments in `conf/modules.config`
5. Follow nf-core module patterns when adding functionality
6. Run pre-commit checks before committing changes
7. Document new parameters in schema and documentation