.. This file was automatically generated by docs/create.py.

CLI
***

This page describes the command line interface (CLI) for PyPGx.

For getting help on the CLI:

.. code-block:: text

   $ pypgx -h
   usage: pypgx [-h] [-v] COMMAND ...

   positional arguments:
     COMMAND
       call-genotypes      Call genotypes for target gene.
       call-phenotypes     Call phenotypes for target gene.
       combine-results     Combine various results for target gene.
       compare-genotypes   Calculate concordance between two genotype results.
       compute-control-statistics
                           Compute summary statistics for control gene from BAM
                           files.
       compute-copy-number
                           Compute copy number from read depth for target gene.
       compute-target-depth
                           Compute read depth for target gene from BAM files.
       create-consolidated-vcf
                           Create a consolidated VCF file.
       create-input-vcf    Call SNVs/indels from BAM files for all target genes.
       create-regions-bed  Create a BED file which contains all regions used by
                           PyPGx.
       estimate-phase-beagle
                           Estimate haplotype phase of observed variants with
                           the Beagle program.
       filter-samples      Filter Archive file for specified samples.
       import-read-depth   Import read depth data for target gene.
       import-variants     Import SNV/indel data for target gene.
       plot-bam-copy-number
                           Plot copy number profile from CovFrame[CopyNumber].
       plot-bam-read-depth
                           Plot read depth profile with BAM data.
       plot-cn-af          Plot both copy number profile and allele fraction
                           profile in one figure.
       plot-vcf-allele-fraction
                           Plot allele fraction profile with VCF data.
       plot-vcf-read-depth
                           Plot read depth profile with VCF data.
       predict-alleles     Predict candidate star alleles based on observed
                           variants.
       predict-cnv         Predict CNV from copy number data for target gene.
       prepare-depth-of-coverage
                           Prepare a depth of coverage file for all target
                           genes with SV from BAM files.
       print-data          Print the main data of specified archive.
       print-metadata      Print the metadata of specified archive.
       run-chip-pipeline   Run genotyping pipeline for chip data.
       run-long-read-pipeline
                           Run genotyping pipeline for long-read sequencing
                           data.
       run-ngs-pipeline    Run genotyping pipeline for NGS data.
       slice-bam           Slice BAM file for all genes used by PyPGx.
       test-cnv-caller     Test CNV caller for target gene.
       train-cnv-caller    Train CNV caller for target gene.

   options:
     -h, --help            Show this help message and exit.
     -v, --version         Show the version number and exit.

For getting help on a specific command (e.g. call-genotypes):

.. code-block:: text

   $ pypgx call-genotypes -h

call-genotypes
==============

.. code-block:: text

   $ pypgx call-genotypes -h
   usage: pypgx call-genotypes [-h] [--alleles PATH] [--cnv-calls PATH]
                               genotypes

   Call genotypes for target gene.

   Positional arguments:
     genotypes             Output archive file with the semantic type
                           SampleTable[Genotypes].

   Optional arguments:
     -h, --help            Show this help message and exit.
     --alleles PATH        Input archive file with the semantic type
                           SampleTable[Alleles].
     --cnv-calls PATH      Input archive file with the semantic type
                           SampleTable[CNVCalls].

call-phenotypes
===============

.. code-block:: text

   $ pypgx call-phenotypes -h
   usage: pypgx call-phenotypes [-h] genotypes phenotypes

   Call phenotypes for target gene.

   Positional arguments:
     genotypes             Input archive file with the semantic type
                           SampleTable[Genotypes].
     phenotypes            Output archive file with the semantic type
                           SampleTable[Phenotypes].

   Optional arguments:
     -h, --help            Show this help message and exit.

combine-results
===============

.. code-block:: text

   $ pypgx combine-results -h
   usage: pypgx combine-results [-h] [--genotypes PATH] [--phenotypes PATH]
                                [--alleles PATH] [--cnv-calls PATH]
                                results

   Combine various results for target gene.

   Positional arguments:
     results               Output archive file with the semantic type
                           SampleTable[Results].

   Optional arguments:
     -h, --help            Show this help message and exit.
     --genotypes PATH      Input archive file with the semantic type
                           SampleTable[Genotypes].
     --phenotypes PATH     Input archive file with the semantic type
                           SampleTable[Phenotypes].
     --alleles PATH        Input archive file with the semantic type
                           SampleTable[Alleles].
     --cnv-calls PATH      Input archive file with the semantic type
                           SampleTable[CNVCalls].

compare-genotypes
=================

.. code-block:: text

   $ pypgx compare-genotypes -h
   usage: pypgx compare-genotypes [-h] [--verbose] first second

   Calculate concordance between two genotype results.

   Only samples that appear in both genotype results will be used to
   calculate concordance for genotype calls as well as CNV calls.

   Positional arguments:
     first                 First archive file with the semantic type
                           SampleTable[Results].
     second                Second archive file with the semantic type
                           SampleTable[Results].

   Optional arguments:
     -h, --help            Show this help message and exit.
     --verbose             Whether to print the verbose version of output,
                           including discordant calls.

compute-control-statistics
==========================

.. code-block:: text

   $ pypgx compute-control-statistics -h
   usage: pypgx compute-control-statistics [-h] [--assembly TEXT] [--bed PATH]
                                           gene control-statistics bams
                                           [bams ...]

   Compute summary statistics for control gene from BAM files.

   Note that for the arguments gene and --bed, the 'chr' prefix in contig
   names (e.g. 'chr1' vs. '1') will be automatically added or removed as
   necessary to match the input BAM's contig names.

   Positional arguments:
     gene                  Control gene (recommended choices: 'EGFR', 'RYR1',
                           'VDR'). Alternatively, you can provide a custom
                           region (format: chrom:start-end).
     control-statistics    Output archive file with the semantic type
                           SampleTable[Statistics].
     bams                  One or more input BAM files. Alternatively, you can
                           provide a text file (.txt, .tsv, .csv, or .list)
                           containing one BAM file per line.

   Optional arguments:
     -h, --help            Show this help message and exit.
     --assembly TEXT       Reference genome assembly (default: 'GRCh37')
                           (choices: 'GRCh37', 'GRCh38').
     --bed PATH            By default, the input data is assumed to be WGS. If
                           it's targeted sequencing, you must provide a BED
                           file to indicate probed regions.

   [Example] For the VDR gene from WGS data:
     $ pypgx compute-control-statistics \
     VDR \
     control-statistics.zip \
     1.bam 2.bam

   [Example] For a custom region from targeted sequencing data:
     $ pypgx compute-control-statistics \
     chr1:100-200 \
     control-statistics.zip \
     bam.list \
     --bed probes.bed

compute-copy-number
===================

.. code-block:: text

   $ pypgx compute-copy-number -h
   usage: pypgx compute-copy-number [-h]
                                    [--samples-without-sv TEXT [TEXT ...]]
                                    read-depth control-statistics copy-number

   Compute copy number from read depth for target gene.

   The command will convert read depth to copy number by performing
   intra-sample normalization using summary statistics from the control gene.

   During copy number analysis, if the input data is targeted sequencing, the
   command will apply inter-sample normalization using summary statistics
   across all samples. For best results, it is recommended to specify known
   samples without SV using --samples-without-sv.

   Positional arguments:
     read-depth            Input archive file with the semantic type
                           CovFrame[ReadDepth].
     control-statistics    Input archive file with the semantic type
                           SampleTable[Statistics].
     copy-number           Output archive file with the semantic type
                           CovFrame[CopyNumber].

   Optional arguments:
     -h, --help            Show this help message and exit.
     --samples-without-sv TEXT [TEXT ...]
                           List of known samples with no SV.

compute-target-depth
====================

.. code-block:: text

   $ pypgx compute-target-depth -h
   usage: pypgx compute-target-depth [-h] [--assembly TEXT] [--bed PATH]
                                     gene read-depth bams [bams ...]

   Compute read depth for target gene from BAM files.

   Positional arguments:
     gene                  Target gene.
     read-depth            Output archive file with the semantic type
                           CovFrame[ReadDepth].
     bams                  One or more input BAM files. Alternatively, you can
                           provide a text file (.txt, .tsv, .csv, or .list)
                           containing one BAM file per line.

   Optional arguments:
     -h, --help            Show this help message and exit.
     --assembly TEXT       Reference genome assembly (default: 'GRCh37')
                           (choices: 'GRCh37', 'GRCh38').
     --bed PATH            By default, the input data is assumed to be WGS. If
                           it is targeted sequencing, you must provide a BED
                           file to indicate probed regions.

   [Example] For the CYP2D6 gene from WGS data:
     $ pypgx compute-target-depth \
     CYP2D6 \
     read-depth.zip \
     1.bam 2.bam

   [Example] For the CYP2D6 gene from targeted sequencing data:
     $ pypgx compute-target-depth \
     CYP2D6 \
     read-depth.zip \
     bam.list \
     --bed probes.bed

create-consolidated-vcf
=======================

.. code-block:: text

   $ pypgx create-consolidated-vcf -h
   usage: pypgx create-consolidated-vcf [-h]
                                        imported-variants phased-variants
                                        consolidated-variants

   Create a consolidated VCF file.

   Positional arguments:
     imported-variants     Input archive file with the semantic type
                           VcfFrame[Imported].
     phased-variants       Input archive file with the semantic type
                           VcfFrame[Phased].
     consolidated-variants
                           Output archive file with the semantic type
                           VcfFrame[Consolidated].

   Optional arguments:
     -h, --help            Show this help message and exit.

create-input-vcf
================

.. code-block:: text

   $ pypgx create-input-vcf -h
   usage: pypgx create-input-vcf [-h] [--assembly TEXT]
                                 [--genes TEXT [TEXT ...]] [--exclude]
                                 [--dir-path PATH] [--max-depth INT]
                                 vcf fasta bams [bams ...]

   Call SNVs/indels from BAM files for all target genes.

   To save computing resources, this method will call variants only for
   target genes whose at least one star allele is defined by SNVs/indels.
   Therefore, variants will not be called for target genes that have star
   alleles defined only by structural variation (e.g. UGT2B17).

   Positional arguments:
     vcf                   Output VCF file. It must have .vcf.gz as suffix.
     fasta                 Reference FASTA file.
     bams                  One or more input BAM files. Alternatively, you can
                           provide a text file (.txt, .tsv, .csv, or .list)
                           containing one BAM file per line.

   Optional arguments:
     -h, --help            Show this help message and exit.
     --assembly TEXT       Reference genome assembly (default: 'GRCh37')
                           (choices: 'GRCh37', 'GRCh38').
     --genes TEXT [TEXT ...]
                           List of genes to include.
     --exclude             Exclude specified genes. Ignored when --genes is
                           not used.
     --dir-path PATH       By default, intermediate files (likelihoods.bcf,
                           calls.bcf, and calls.normalized.bcf) will be stored
                           in a temporary directory, which is automatically
                           deleted after creating final VCF. If you provide a
                           directory path, intermediate files will be stored
                           there.
     --max-depth INT       At a position, read maximally this number of reads
                           per input file (default: 250). If your input data
                           is from WGS (e.g. 30X), you don't need to change
                           this option. However, if it's from targeted
                           sequencing with ultra-deep coverage (e.g. 500X),
                           then you need to increase the maximum depth.

create-regions-bed
==================

.. code-block:: text

   $ pypgx create-regions-bed -h
   usage: pypgx create-regions-bed [-h] [--assembly TEXT] [--add-chr-prefix]
                                   [--merge] [--target-genes] [--sv-genes]
                                   [--var-genes] [--genes TEXT [TEXT ...]]
                                   [--exclude]

   Create a BED file which contains all regions used by PyPGx.

   Optional arguments:
     -h, --help            Show this help message and exit.
     --assembly TEXT       Reference genome assembly (default: 'GRCh37')
                           (choices: 'GRCh37', 'GRCh38').
     --add-chr-prefix      Whether to add the 'chr' string in contig names.
     --merge               Whether to merge overlapping intervals (gene names
                           will be removed too).
     --target-genes        Whether to only return target genes, excluding
                           control genes and paralogs.
     --sv-genes            Whether to only return target genes whose at least
                           one star allele is defined by structural variation
     --var-genes           Whether to only return target genes whose at least
                           one star allele is defined by SNVs/indels.
     --genes TEXT [TEXT ...]
                           List of genes to include.
     --exclude             Exclude specified genes. Ignored when --genes is
                           not used.

estimate-phase-beagle
=====================

.. code-block:: text

   $ pypgx estimate-phase-beagle -h
   usage: pypgx estimate-phase-beagle [-h] [--panel PATH] [--impute]
                                      imported-variants phased-variants

   Estimate haplotype phase of observed variants with the Beagle program.

   Positional arguments:
     imported-variants     Input archive file with the semantic type
                           VcfFrame[Imported]. The 'chr' prefix in contig
                           names (e.g. 'chr1' vs. '1') will be automatically
                           added or removed as necessary to match the
                           reference VCF's contig names.
     phased-variants       Output archive file with the semantic type
                           VcfFrame[Phased].

   Optional arguments:
     -h, --help            Show this help message and exit.
     --panel PATH          VCF file (compressed or uncompressed) corresponding
                           to a reference haplotype panel. By default, the
                           1KGP panel in the pypgx-bundle directory will be
                           used.
     --impute              Perform imputation of missing genotypes.

filter-samples
==============

.. code-block:: text

   $ pypgx filter-samples -h
   usage: pypgx filter-samples [-h] [--exclude]
                               input output samples [samples ...]

   Filter Archive file for specified samples.

   Positional arguments:
     input                 Input archive file.
     output                Output archive file.
     samples               Specify which samples should be included for
                           analysis by providing a text file (.txt, .tsv,
                           .csv, or .list) containing one sample per line.
                           Alternatively, you can provide a list of samples.

   Optional arguments:
     -h, --help            Show this help message and exit.
     --exclude             Exclude specified samples.

import-read-depth
=================

.. code-block:: text

   $ pypgx import-read-depth -h
   usage: pypgx import-read-depth [-h] [--samples TEXT [TEXT ...]] [--exclude]
                                  gene depth-of-coverage read-depth

   Import read depth data for target gene.

   Positional arguments:
     gene                  Target gene.
     depth-of-coverage     Input archive file with the semantic type
                           CovFrame[DepthOfCoverage].
     read-depth            Output archive file with the semantic type
                           CovFrame[ReadDepth].

   Optional arguments:
     -h, --help            Show this help message and exit.
     --samples TEXT [TEXT ...]
                           Specify which samples should be included for
                           analysis by providing a text file (.txt, .tsv,
                           .csv, or .list) containing one sample per line.
                           Alternatively, you can provide a list of samples.
     --exclude             Exclude specified samples.

import-variants
===============

.. code-block:: text

   $ pypgx import-variants -h
   usage: pypgx import-variants [-h] [--assembly TEXT] [--platform TEXT]
                                [--samples TEXT [TEXT ...]] [--exclude]
                                gene vcf imported-variants

   Import SNV/indel data for target gene.

   The command will slice the input VCF for the target gene to create an
   archive file with the semantic type VcfFrame[Imported] or
   VcfFrame[Consolidated].

   Positional arguments:
     gene                  Target gene.
     vcf                   Input VCF file must be already BGZF compressed
                           (.gz) and indexed (.tbi) to allow random access.
     imported-variants     Output archive file with the semantic type
                           VcfFrame[Imported] or VcfFrame[Consolidated].

   Optional arguments:
     -h, --help            Show this help message and exit.
     --assembly TEXT       Reference genome assembly (default: 'GRCh37')
                           (choices: 'GRCh37', 'GRCh38').
     --platform TEXT       Genotyping platform used (default: 'WGS') (choices:
                           'WGS', 'Targeted', 'Chip', 'LongRead'). When the
                           platform is 'WGS', 'Targeted', or 'Chip', the
                           command will assess whether every genotype call in
                           the sliced VCF is haplotype phased (e.g. '0|1').
                           If the sliced VCF is fully phased, the command
                           will return VcfFrame[Consolidated] or otherwise
                           VcfFrame[Imported]. When the platform is
                           'LongRead', the command will return
                           VcfFrame[Consolidated] after applying the
                           phase-extension algorithm to estimate haplotype
                           phase of any variants that could not be resolved
                           by read-backed phasing.
     --samples TEXT [TEXT ...]
                           Specify which samples should be included for
                           analysis by providing a text file (.txt, .tsv,
                           .csv, or .list) containing one sample per line.
                           Alternatively, you can provide a list of samples.
     --exclude             Exclude specified samples.

plot-bam-copy-number
====================

.. code-block:: text

   $ pypgx plot-bam-copy-number -h
   usage: pypgx plot-bam-copy-number [-h] [--fitted] [--path PATH]
                                     [--samples TEXT [TEXT ...]]
                                     [--ymin FLOAT] [--ymax FLOAT]
                                     [--fontsize FLOAT]
                                     copy-number

   Plot copy number profile from CovFrame[CopyNumber].

   Positional arguments:
     copy-number           Input archive file with the semantic type
                           CovFrame[CopyNumber].

   Optional arguments:
     -h, --help            Show this help message and exit.
     --fitted              Show the fitted line as well.
     --path PATH           Create plots in this directory (default: current
                           directory).
     --samples TEXT [TEXT ...]
                           Specify which samples should be included for
                           analysis by providing a text file (.txt, .tsv,
                           .csv, or .list) containing one sample per line.
                           Alternatively, you can provide a list of samples.
     --ymin FLOAT          Y-axis bottom (default: -0.3).
     --ymax FLOAT          Y-axis top (default: 6.3).
     --fontsize FLOAT      Text fontsize (default: 25).

plot-bam-read-depth
===================

.. code-block:: text

   $ pypgx plot-bam-read-depth -h
   usage: pypgx plot-bam-read-depth [-h] [--path PATH]
                                    [--samples TEXT [TEXT ...]]
                                    [--ymin FLOAT] [--ymax FLOAT]
                                    [--fontsize FLOAT]
                                    read-depth

   Plot read depth profile with BAM data.

   Positional arguments:
     read-depth            Input archive file with the semantic type
                           CovFrame[ReadDepth].

   Optional arguments:
     -h, --help            Show this help message and exit.
     --path PATH           Create plots in this directory (default: current
                           directory).
     --samples TEXT [TEXT ...]
                           Specify which samples should be included for
                           analysis by providing a text file (.txt, .tsv,
                           .csv, or .list) containing one sample per line.
                           Alternatively, you can provide a list of samples.
     --ymin FLOAT          Y-axis bottom.
     --ymax FLOAT          Y-axis top.
     --fontsize FLOAT      Text fontsize (default: 25).

plot-cn-af
==========

.. code-block:: text

   $ pypgx plot-cn-af -h
   usage: pypgx plot-cn-af [-h] [--path PATH] [--samples TEXT [TEXT ...]]
                           [--ymin FLOAT] [--ymax FLOAT] [--fontsize FLOAT]
                           copy-number imported-variants

   Plot both copy number profile and allele fraction profile in one figure.

   Positional arguments:
     copy-number           Input archive file with the semantic type
                           CovFrame[CopyNumber].
     imported-variants     Input archive file with the semantic type
                           VcfFrame[Imported].

   Optional arguments:
     -h, --help            Show this help message and exit.
     --path PATH           Create plots in this directory (default: current
                           directory).
     --samples TEXT [TEXT ...]
                           Specify which samples should be included for
                           analysis by providing a text file (.txt, .tsv,
                           .csv, or .list) containing one sample per line.
                           Alternatively, you can provide a list of samples.
     --ymin FLOAT          Y-axis bottom (default: -0.3).
     --ymax FLOAT          Y-axis top (default: 6.3).
     --fontsize FLOAT      Text fontsize (default: 25).

plot-vcf-allele-fraction
========================

.. code-block:: text

   $ pypgx plot-vcf-allele-fraction -h
   usage: pypgx plot-vcf-allele-fraction [-h] [--path PATH]
                                         [--samples TEXT [TEXT ...]]
                                         [--fontsize FLOAT]
                                         imported-variants

   Plot allele fraction profile from VcfFrame[Imported].

   Positional arguments:
     imported-variants     Input archive file with the semantic type
                           VcfFrame[Imported].

   Optional arguments:
     -h, --help            Show this help message and exit.
     --path PATH           Create plots in this directory (default: current
                           directory).
     --samples TEXT [TEXT ...]
                           Specify which samples should be included for
                           analysis by providing a text file (.txt, .tsv,
                           .csv, or .list) containing one sample per line.
                           Alternatively, you can provide a list of samples.
     --fontsize FLOAT      Text fontsize (default: 25).

plot-vcf-read-depth
===================

.. code-block:: text

   $ pypgx plot-vcf-read-depth -h
   usage: pypgx plot-vcf-read-depth [-h] [--assembly TEXT] [--path PATH]
                                    [--samples TEXT [TEXT ...]]
                                    [--ymin FLOAT] [--ymax FLOAT]
                                    gene vcf

   Plot read depth profile with VCF data.

   Positional arguments:
     gene                  Target gene.
     vcf                   Input VCF file.

   Optional arguments:
     -h, --help            Show this help message and exit.
     --assembly TEXT       Reference genome assembly (default: 'GRCh37')
                           (choices: 'GRCh37', 'GRCh38').
     --path PATH           Create plots in this directory (default: current
                           directory).
     --samples TEXT [TEXT ...]
                           Specify which samples should be included for
                           analysis by providing a text file (.txt, .tsv,
                           .csv, or .list) containing one sample per line.
                           Alternatively, you can provide a list of samples.
     --ymin FLOAT          Y-axis bottom.
     --ymax FLOAT          Y-axis top.

predict-alleles
===============

.. code-block:: text

   $ pypgx predict-alleles -h
   usage: pypgx predict-alleles [-h] consolidated-variants alleles

   Predict candidate star alleles based on observed variants.

   Positional arguments:
     consolidated-variants
                           Input archive file with the semantic type
                           VcfFrame[Consolidated].
     alleles               Output archive file with the semantic type
                           SampleTable[Alleles].

   Optional arguments:
     -h, --help            Show this help message and exit.

predict-cnv
===========

.. code-block:: text

   $ pypgx predict-cnv -h
   usage: pypgx predict-cnv [-h] [--cnv-caller PATH] copy-number cnv-calls

   Predict CNV from copy number data for target gene.

   Genomic positions that are missing copy number because, for example, the
   input data is targeted sequencing will be imputed with forward filling.

   Positional arguments:
     copy-number           Input archive file with the semantic type
                           CovFrame[CopyNumber].
     cnv-calls             Output archive file with the semantic type
                           SampleTable[CNVCalls].

   Optional arguments:
     -h, --help            Show this help message and exit.
     --cnv-caller PATH     Archive file with the semantic type Model[CNV]. By
                           default, a pre-trained CNV caller in the
                           pypgx-bundle directory will be used.

prepare-depth-of-coverage
=========================

.. code-block:: text

   $ pypgx prepare-depth-of-coverage -h
   usage: pypgx prepare-depth-of-coverage [-h] [--assembly TEXT] [--bed PATH]
                                          [--genes TEXT [TEXT ...]]
                                          [--exclude]
                                          depth-of-coverage bams [bams ...]

   Prepare a depth of coverage file for all target genes with SV from BAM
   files.

   To save computing resources, this method will count read depth only for
   target genes whose at least one star allele is defined by structural
   variation. Therefore, read depth will not be computed for target genes
   that have star alleles defined only by SNVs/indels (e.g. CYP3A5).

   Positional arguments:
     depth-of-coverage     Output archive file with the semantic type
                           CovFrame[DepthOfCoverage].
     bams                  One or more input BAM files. Alternatively, you can
                           provide a text file (.txt, .tsv, .csv, or .list)
                           containing one BAM file per line.

   Optional arguments:
     -h, --help            Show this help message and exit.
     --assembly TEXT       Reference genome assembly (default: 'GRCh37')
                           (choices: 'GRCh37', 'GRCh38').
     --bed PATH            By default, the input data is assumed to be WGS. If
                           it's targeted sequencing, you must provide a BED
                           file to indicate probed regions. Note that the
                           'chr' prefix in contig names (e.g. 'chr1' vs. '1')
                           will be automatically added or removed as
                           necessary to match the input BAM's contig names.
     --genes TEXT [TEXT ...]
                           List of genes to include.
     --exclude             Exclude specified genes. Ignored when --genes is
                           not used.

   [Example] From WGS data:
     $ pypgx prepare-depth-of-coverage \
     depth-of-coverage.zip \
     1.bam 2.bam

   [Example] From targeted sequencing data:
     $ pypgx prepare-depth-of-coverage \
     depth-of-coverage.zip \
     bam.list \
     --bed probes.bed

print-data
==========

.. code-block:: text

   $ pypgx print-data -h
   usage: pypgx print-data [-h] input

   Print the main data of specified archive.

   Positional arguments:
     input                 Input archive file.

   Optional arguments:
     -h, --help            Show this help message and exit.

print-metadata
==============

.. code-block:: text

   $ pypgx print-metadata -h
   usage: pypgx print-metadata [-h] input

   Print the metadata of specified archive.

   Positional arguments:
     input                 Input archive file.

   Optional arguments:
     -h, --help            Show this help message and exit.

run-chip-pipeline
=================

.. code-block:: text

   $ pypgx run-chip-pipeline -h
   usage: pypgx run-chip-pipeline [-h] [--assembly TEXT] [--panel PATH]
                                  [--impute] [--force]
                                  [--samples TEXT [TEXT ...]] [--exclude]
                                  gene output variants

   Run genotyping pipeline for chip data.

   Positional arguments:
     gene                  Target gene.
     output                Output directory.
     variants              Input VCF file must be already BGZF compressed
                           (.gz) and indexed (.tbi) to allow random access.
                           Statistical haplotype phasing will be skipped if
                           input VCF is already fully phased.

   Optional arguments:
     -h, --help            Show this help message and exit.
     --assembly TEXT       Reference genome assembly (default: 'GRCh37')
                           (choices: 'GRCh37', 'GRCh38').
     --panel PATH          VCF file corresponding to a reference haplotype
                           panel (compressed or uncompressed). By default,
                           the 1KGP panel in the pypgx-bundle directory will
                           be used.
     --impute              Perform imputation of missing genotypes.
     --force               Overwrite output directory if it already exists.
     --samples TEXT [TEXT ...]
                           Specify which samples should be included for
                           analysis by providing a text file (.txt, .tsv,
                           .csv, or .list) containing one sample per line.
                           Alternatively, you can provide a list of samples.
     --exclude             Exclude specified samples.

   [Example] To genotype the CYP3A5 gene from chip data:
     $ pypgx run-chip-pipeline \
     CYP3A5 \
     CYP3A5-pipeline \
     variants.vcf.gz

run-long-read-pipeline
======================

.. code-block:: text

   $ pypgx run-long-read-pipeline -h
   usage: pypgx run-long-read-pipeline [-h] [--assembly TEXT] [--force]
                                       [--samples TEXT [TEXT ...]]
                                       [--exclude]
                                       gene output variants

   Run genotyping pipeline for long-read sequencing data.

   Positional arguments:
     gene                  Target gene.
     output                Output directory.
     variants              Input VCF file must be already BGZF compressed
                           (.gz) and indexed (.tbi) to allow random access.

   Optional arguments:
     -h, --help            Show this help message and exit.
     --assembly TEXT       Reference genome assembly (default: 'GRCh37')
                           (choices: 'GRCh37', 'GRCh38').
     --force               Overwrite output directory if it already exists.
     --samples TEXT [TEXT ...]
                           Specify which samples should be included for
                           analysis by providing a text file (.txt, .tsv,
                           .csv, or .list) containing one sample per line.
                           Alternatively, you can provide a list of samples.
     --exclude             Exclude specified samples.

   [Example] To genotype the CYP3A5 gene from long-read sequencing data:
     $ pypgx run-long-read-pipeline \
     CYP3A5 \
     CYP3A5-pipeline \
     variants.vcf.gz

run-ngs-pipeline
================

.. code-block:: text

   $ pypgx run-ngs-pipeline -h
   usage: pypgx run-ngs-pipeline [-h] [--variants PATH]
                                 [--depth-of-coverage PATH]
                                 [--control-statistics PATH]
                                 [--platform TEXT] [--assembly TEXT]
                                 [--panel PATH] [--force]
                                 [--samples TEXT [TEXT ...]] [--exclude]
                                 [--samples-without-sv TEXT [TEXT ...]]
                                 [--do-not-plot-copy-number]
                                 [--do-not-plot-allele-fraction]
                                 [--cnv-caller PATH]
                                 gene output

   Run genotyping pipeline for NGS data.

   During copy number analysis, if the input data is targeted sequencing, the
   command will apply inter-sample normalization using summary statistics
   across all samples. For best results, it is recommended to specify known
   samples without SV using --samples-without-sv.

   Positional arguments:
     gene                  Target gene.
     output                Output directory.

   Optional arguments:
     -h, --help            Show this help message and exit.
     --variants PATH       Input VCF file must be already BGZF compressed
                           (.gz) and indexed (.tbi) to allow random access.
                           Statistical haplotype phasing will be skipped if
                           input VCF is already fully phased.
     --depth-of-coverage PATH
                           Archive file with the semantic type
                           CovFrame[DepthOfCoverage].
     --control-statistics PATH
                           Archive file with the semantic type
                           SampleTable[Statistics].
     --platform TEXT       Genotyping platform (default: 'WGS') (choices:
                           'WGS', 'Targeted')
     --assembly TEXT       Reference genome assembly (default: 'GRCh37')
                           (choices: 'GRCh37', 'GRCh38').
     --panel PATH          VCF file corresponding to a reference haplotype
                           panel (compressed or uncompressed). By default,
                           the 1KGP panel in the pypgx-bundle directory will
                           be used.
     --force               Overwrite output directory if it already exists.
     --samples TEXT [TEXT ...]
                           Specify which samples should be included for
                           analysis by providing a text file (.txt, .tsv,
                           .csv, or .list) containing one sample per line.
                           Alternatively, you can provide a list of samples.
     --exclude             Exclude specified samples.
     --samples-without-sv TEXT [TEXT ...]
                           List of known samples without SV.
     --do-not-plot-copy-number
                           Do not plot copy number profile.
     --do-not-plot-allele-fraction
                           Do not plot allele fraction profile.
     --cnv-caller PATH     Archive file with the semantic type Model[CNV]. By
                           default, a pre-trained CNV caller in the
                           pypgx-bundle directory will be used.

   [Example] To genotype the CYP3A5 gene, which does not have SV, from WGS data:
     $ pypgx run-ngs-pipeline \
     CYP3A5 \
     CYP3A5-pipeline \
     --variants variants.vcf.gz

   [Example] To genotype the CYP2D6 gene, which does have SV, from WGS data:
     $ pypgx run-ngs-pipeline \
     CYP2D6 \
     CYP2D6-pipeline \
     --variants variants.vcf.gz \
     --depth-of-coverage depth-of-coverage.zip \
     --control-statistics control-statistics-VDR.zip

   [Example] To genotype the CYP2D6 gene from targeted sequencing data:
     $ pypgx run-ngs-pipeline \
     CYP2D6 \
     CYP2D6-pipeline \
     --variants variants.vcf.gz \
     --depth-of-coverage depth-of-coverage.zip \
     --control-statistics control-statistics-VDR.zip \
     --platform Targeted

slice-bam
=========

.. code-block:: text

   $ pypgx slice-bam -h
   usage: pypgx slice-bam [-h] [--assembly TEXT] [--genes TEXT [TEXT ...]]
                          [--exclude]
                          input output

   Slice BAM file for all genes used by PyPGx.

   Positional arguments:
     input                 Input BAM file. It must be already indexed to allow
                           random access.
     output                Output BAM file.

   Optional arguments:
     -h, --help            Show this help message and exit.
     --assembly TEXT       Reference genome assembly (default: 'GRCh37')
                           (choices: 'GRCh37', 'GRCh38').
     --genes TEXT [TEXT ...]
                           List of genes to include.
     --exclude             Exclude specified genes. Ignored when --genes is
                           not used.

test-cnv-caller
===============

.. code-block:: text

   $ pypgx test-cnv-caller -h
   usage: pypgx test-cnv-caller [-h] [--confusion-matrix PATH]
                                [--comparison-table PATH]
                                cnv-caller copy-number cnv-calls

   Test CNV caller for target gene.

   Positional arguments:
     cnv-caller            Input archive file with the semantic type
                           Model[CNV].
     copy-number           Input archive file with the semantic type
                           CovFrame[CopyNumber].
     cnv-calls             Input archive file with the semantic type
                           SampleTable[CNVCalls].

   Optional arguments:
     -h, --help            Show this help message and exit.
     --confusion-matrix PATH
                           Write the confusion matrix as a CSV file where rows
                           indicate actual class and columns indicate
                           prediction class.
     --comparison-table PATH
                           Write a CSV file comparing actual vs. predicted CNV
                           calls for each sample.

train-cnv-caller
================

.. code-block:: text

   $ pypgx train-cnv-caller -h
   usage: pypgx train-cnv-caller [-h] [--confusion-matrix PATH]
                                 [--comparison-table PATH]
                                 copy-number cnv-calls cnv-caller

   Train CNV caller for target gene.

   This command will return a SVM-based multiclass classifier that makes CNV
   calls using the one-vs-rest strategy.

   Positional arguments:
     copy-number           Input archive file with the semantic type
                           CovFrame[CopyNumber].
     cnv-calls             Input archive file with the semantic type
                           SampleTable[CNVCalls].
     cnv-caller            Output archive file with the semantic type
                           Model[CNV].

   Optional arguments:
     -h, --help            Show this help message and exit.
     --confusion-matrix PATH
                           Write the confusion matrix as a CSV file where rows
                           indicate actual class and columns indicate
                           prediction class.
     --comparison-table PATH
                           Write a CSV file comparing actual vs. predicted CNV
                           calls for each sample.
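
As a rough illustration of how ``train-cnv-caller``, ``test-cnv-caller``, and ``predict-cnv`` might fit together, the sketch below trains a caller, evaluates it on separately labelled data, and then applies it to new copy number data. All file names are hypothetical placeholders, and the ``run`` helper with its ``DRY_RUN`` switch is an assumption added here so the commands can be previewed without PyPGx installed; only the subcommands and options themselves are taken from the help text on this page.

.. code-block:: shell

   #!/usr/bin/env bash
   set -euo pipefail

   # DRY_RUN=1 (the default here) only prints each command;
   # set DRY_RUN=0 to actually execute them.
   DRY_RUN="${DRY_RUN:-1}"

   # Hypothetical helper: echo the command in dry-run mode, otherwise run it.
   run() {
       if [ "$DRY_RUN" = "1" ]; then
           echo "$*"
       else
           "$@"
       fi
   }

   # Train a CNV caller from copy number data and known CNV calls.
   run pypgx train-cnv-caller \
       train-copy-number.zip train-cnv-calls.zip cnv-caller.zip \
       --confusion-matrix train-confusion-matrix.csv

   # Evaluate the trained caller on a separate, labelled data set.
   run pypgx test-cnv-caller \
       cnv-caller.zip test-copy-number.zip test-cnv-calls.zip \
       --comparison-table test-comparison-table.csv

   # Apply the custom caller instead of the pre-trained one in pypgx-bundle.
   run pypgx predict-cnv \
       new-copy-number.zip new-cnv-calls.zip \
       --cnv-caller cnv-caller.zip

Defaulting to dry-run keeps the sketch safe to paste into a terminal; once the placeholder archives are replaced with real ones, running it with ``DRY_RUN=0`` executes the same three commands.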