Usage
General Usage
To run longcallR
for SNP calling and read phasing, use the following command:
- -b, --bam-path: Input BAM file (must be sorted and indexed)
- -f, --ref-path: Reference FASTA file
- -p, --preset: Preset for sequencing platform
- -t, --threads: Number of threads
- -o, --output: Output file prefix
Available Presets
Preset | Description |
---|---|
hifi-masseq |
PacBio Mas-Seq data (strand bias filtering disabled) |
hifi-isoseq |
PacBio Iso-Seq data (strand bias filtering enabled) |
ont-cdna |
ONT cDNA data (strand bias filtering enabled) |
ont-drna |
ONT direct RNA data (strand bias filtering disabled) |
Advanced Usage
To perform SNP calling and phasing in a specific genomic region, use the following command:
longcallR
-b <input.sorted.bam>
-f <reference.fa>
-p <preset>
-t <threads>
-r <chr:start-end>
-o <output_prefix>
To perform SNP calling and phasing based on user-provided candidate SNPs (e.g., known genomic SNPs), use the following command:
longcallR
-b <input.sorted.bam>
-f <reference.fa>
-v <input.vcf>
-p <preset>
-t <threads>
-o <output_prefix>
To customize thresholds for read coverage, read length, mapping quality, base quality, allele fraction, allele fraction for low fraction candidates (more challenge sites), strand bias filtering, use:
longcallR
-b <input.sorted.bam>
-f <reference.fa>
-p <preset>
--min-depth <min_depth>
--max-depth <max_depth>
--min-read-length <min_length>
--min-mapq <min_mapq>
--min-baseq <min_baseq>
--min-allele-freq <min_allele_freq>
--low-allele-frac-cutoff <low_allele_frac_cutoff>
--strand-bias <strand_bias>
-t <threads>
-o <output_prefix>
Full Arguments
- -b, --bam-path: Input BAM file (must be sorted and indexed)
- -f, --ref-path: Reference FASTA file
- -p, --preset: Preset for sequencing platform
- -t, --threads: Number of threads
- -o, --output: Output file prefix
- -a, --annotation: Annotation file, GFF3 or GTF format
- -r, --region: Region to be processed. Format: chr:start-end, left-closed, right-open
- -x, --contigs: Conitgs to be processed. Example: -x chr1 chr2 chr3
- -v , --input-vcf: Input vcf file as Candidate SNPs
- --min-allele-freq: Minimum allele frequency for high allele fraction candidate SNPs [Default: 0.20]
- --min-allele-freq-include-intron: Minimum allele frequency for high allele fraction candidate SNPs include intron [Default: 0.0]
- --low-allele-frac-cutoff: Minimum allele frequency for low allele fraction candidate SNPs [Default: 0.05]
- --low-allele-cnt-cutoff: Minimum allele count for low allele fraction candidate SNPs [Default: 10]
- --min-read-length: Minimum length for reads [Default: 500]
- --min-mapq: Minimum mapping quality for reads [Default: 20]
- --min-baseq: Minimim base quality for allele calling [Default: 10]
- --divergence: Max sequence divergence for valid reads [Default: 0.05]
- --min-depth: Minimum depth for a candidate SNP [Default: 10]
- --max-depth: Maximum depth for a candidate SNP [Default: 50000]
- --strand-bias: Whether to use strand bias to filter SNPs [Default: false] [possible values: true, false]
- --min-qual: Minimum QUAL for candidate SNPs [Default: 2]
- --distance-to-read-end: Ignore bases within distance to read end [Default: 20]
- --polya-tail-length: PolyA tail length threshold for filtering [Default: 5]
- --dense-win-size: Window size used to identify dense regions of candidate SNPs [Default: 500]
- --min-dense-cnt: Minimum number of candidate SNPs within the dense window to consider the region as dense [Default: 5]
- --min-linkers: Minimum number of related candidate heterozygous SNPs required to perform phasing in a region [Default: 1]
- --max-enum-snps: Maximum number of SNPs for enumerate haplotypes [Default: 10]
- --min-phase-score: Minimum phase score to filter candidate SNPs [Default: 8.0]
- --min-read-assignment-diff: Minimum absolute difference between haplotype assignment probabilities required for a read to be confidently assigned [Default: 0.15]
- --truncation: When set, apply truncation of high coverage regions
- --truncation-coverage: Read number threshold for region truncation [Default: 200000]
- --downsample: When set, allow downsampling of high coverage regions
- --downsample-depth: Downsample depth [Default: 10000]
- --exon-only: When set, only call SNPs in exons
- --no-bam-output: When set, do not output phased bam file
- --get-blocks: When set, show all regions to be processed.