``auto_process`` commands
=========================

.. note::

   This documentation has been auto-generated from the
   command help

``auto_process.py`` implements the following commands:

.. contents::
   :local:

.. _commands_info:

info
****

::

    usage: auto_process.py info [-h] [--version] [--debug] [ANALYSIS_DIR]

    Print information about the analysis associated with ANALYSIS_DIR.

    positional arguments:
      ANALYSIS_DIR          auto_process analysis directory (optional:
                            defaults to the current directory)

    optional arguments:
      -h, --help            show this help message and exit
      --version             show program's version number and exit
      --debug               Turn on debugging output

.. _commands_setup:

setup
*****

::

    usage: auto_process.py setup [-h] [--version] -r RUN_NUMBER
            [-s SAMPLE_SHEET] [-n ANALYSIS_NUMBER] [-f FILE]
            [--fastq-dir UNALIGNED_DIR] [--analysis-dir ANALYSIS_DIR]
            [--debug]
            RUN_DIR

    Set up automatic processing of Illumina sequencing data from RUN_DIR.

    positional arguments:
      RUN_DIR               directory with the output from an Illumina
                            sequencer

    optional arguments:
      -h, --help            show this help message and exit
      --version             show program's version number and exit
      -r RUN_NUMBER, --run-number RUN_NUMBER
                            Set facility run number (required)
      -s SAMPLE_SHEET, --samplesheet SAMPLE_SHEET, --sample-sheet SAMPLE_SHEET
                            Copy sample sheet file from name and location
                            SAMPLE_SHEET (default is to look for
                            SampleSheet.csv inside RUN_DIR). SAMPLE_SHEET can
                            be a local or remote file, or a URL
      -n ANALYSIS_NUMBER, --analysis-number ANALYSIS_NUMBER
                            Set analysis number (e.g. if reprocessing a run);
                            will be appended to analysis directory name if
                            '--analysis-dir' not supplied
      -f FILE, --file FILE  Additional file(s) to copy into new analysis
                            directory. FILE can be a local or remote file, or
                            a URL
      --fastq-dir UNALIGNED_DIR
                            Import fastq.gz files from UNALIGNED_DIR (which
                            should be a subdirectory of RUN_DIR with the same
                            structure as the 'Unaligned' or 'bcl2fastq2'
                            output directory produced by CASAVA/bcl2fastq)
      --analysis-dir ANALYSIS_DIR
                            Make new directory called ANALYSIS_DIR (otherwise
                            default is '_analysis[]')
      --debug               Turn on debugging output

.. _commands_make_fastqs:

make_fastqs
***********

::

    usage: auto_process.py make_fastqs [-h] [--version] [--no-save] [--debug]
            [--id NAME] [--force-copy]
            [--protocol {standard,mirna,10x_chromium_sc,10x_atac,10x_multiome,10x_multiome_atac,10x_multiome_gex,10x_visium,10x_visium_v1,10x_visium_hd,10x_visium_hd_3prime,parse_evercode,biorad_ddseq}]
            [--sample-sheet SAMPLE_SHEET] [--lanes LANES[:OPTIONS]]
            [--output-dir OUT_DIR] [--platform PLATFORM]
            [--use-bases-mask BASES_MASK] [--bcl-converter CONVERTER]
            [--no-lane-splitting] [--use-lane-splitting]
            [--find-adapters-with-sliding-window]
            [--create-empty-fastqs] [--no-create-empty-fastqs]
            [--create-fastq-for-index-reads] [--ignore-missing-bcls]
            [--nprocessors NPROCESSORS] [--runner RUNNER]
            [--adapter ADAPTER_SEQUENCE]
            [--adapter-read2 ADAPTER_SEQUENCE_READ2]
            [--minimum-trimmed-read-length MINIMUM_TRIMMED_READ_LENGTH]
            [--mask-short-adapter-reads MASK_SHORT_ADAPTER_READS]
            [--no-adapter-trimming] [--r1-length R1_LENGTH]
            [--r2-length R2_LENGTH] [--r3-length R3_LENGTH]
            [--10x_jobmode CELLRANGER_JOBMODE]
            [--10x_localcores CELLRANGER_LOCALCORES]
            [--10x_localmem CELLRANGER_LOCALMEM]
            [--10x_maxjobs CELLRANGER_MAXJOBS]
            [--10x_mempercore CELLRANGER_MEMPERCORE]
            [--10x_jobinterval CELLRANGER_JOBINTERVAL]
            [--ignore-dual-index] [--rc-i2-override RC_I2_OVERRIDE]
            [--stats-file STATS_FILE]
            [--per-lane-stats-file PER_LANE_STATS_FILE] [--no-stats]
            [--barcode-analysis-dir BARCODE_ANALYSIS_DIR]
            [--no-barcode-analysis] [--enable-conda {yes,no}]
            [--conda-env-dir CONDA_ENV_DIR] [-j NJOBS] [-c NCORES]
            [-b NBATCHES] [--verbose] [--work-dir WORKING_DIR]
            [--require-bcl2fastq-version BCL2FASTQ_VERSION]
            [ANALYSIS_DIR]

    Generate fastq files from raw bcl files produced by Illumina sequencer.

    positional arguments:
      ANALYSIS_DIR          auto_process analysis directory (optional:
                            defaults to the current directory)

    optional arguments:
      -h, --help            show this help message and exit
      --version             show program's version number and exit
      --no-save             Don't save parameter changes to the
                            auto_process.info file
      --debug               Turn on debugging output
      --id NAME             identifier for output files

    Primary data management:
      --force-copy          force primary data to be copied (by default only
                            data on a remote system will be copied; data on a
                            local system will be symlinked)

    General Fastq generation:
      --protocol {standard,mirna,10x_chromium_sc,10x_atac,10x_multiome,10x_multiome_atac,10x_multiome_gex,10x_visium,10x_visium_v1,10x_visium_hd,10x_visium_hd_3prime,parse_evercode,biorad_ddseq}
                            specify Fastq generation protocol depending on
                            the data being processed (default: 'standard')
      --sample-sheet SAMPLE_SHEET
                            use an alternative sample sheet to the default
                            'custom_SampleSheet.csv' created on setup.
      --lanes LANES[:OPTIONS]
                            define a set of lanes to group for processing.
                            LANES can be a single lane (e.g. '1'), a list
                            ('1,2,3,7'), a range ('1-3'), or a combination
                            ('1-3,7'). Specified lanes are processed together
                            in a group, using OPTIONS (if supplied). OPTIONS
                            takes the form
                            '[PROTOCOL:][KEY=VALUE:[KEY=VALUE]...]' (for
                            example --lanes=1-4:standard:trim_adapters=no)
      --output-dir OUT_DIR  set the directory for the output Fastqs (default:
                            'bcl2fastq')
      --platform PLATFORM   explicitly specify the sequencing platform. Only
                            use this if the platform cannot be identified
                            from the instrument name
      --use-bases-mask BASES_MASK
                            explicitly set the bases-mask string to indicate
                            how each cycle should be used in the BCL to Fastq
                            conversion (overrides default). Set to 'auto' to
                            determine automatically

    Bcl conversion options:
      --bcl-converter CONVERTER
                            explicitly set BCL conversion software to use for
                            non-10xGenomics runs (either 'bcl2fastq' or
                            'bcl-convert'; can also include a version
                            specifier e.g. 'bcl2fastq>=2.0'). Default:
                            bcl2fastq>=2.20 (may be overridden by
                            platform-specific settings)
      --no-lane-splitting   don't split the output FASTQ files by lane.
                            Default: off (may be overridden by
                            platform-specific settings); turn off using
                            --use-lane-splitting
      --use-lane-splitting  split the output FASTQ files by lane. Default: on
                            (but may be overridden by platform-specific
                            settings); turn off using --no-lane-splitting
      --find-adapters-with-sliding-window
                            use sliding window algorithm to identify adapters
                            for trimming
      --create-empty-fastqs
                            create empty files as placeholders for missing
                            FASTQs from demultiplexing step. Default: off
                            (but may be overridden by platform-specific
                            settings); turn off using
                            --no-create-empty-fastqs. NB Fastq generation
                            must have finished without errors for this option
                            to be applied
      --no-create-empty-fastqs
                            don't create empty files as placeholders for
                            missing FASTQs from demultiplexing step. Default:
                            on (but may be overridden by platform-specific
                            settings); turn off using --create-empty-fastqs.
      --create-fastq-for-index-reads
                            also create FASTQs for index reads
      --ignore-missing-bcls
                            ignores missing or corrupt BCL files and assumes
                            'N'/'#' for missing calls (only applies if using
                            bcl2fastq as the BCL conversion software)
      --nprocessors NPROCESSORS
                            explicitly specify number of processors/cores to
                            use (default taken from job runner)
      --runner RUNNER       explicitly specify runner definition (e.g.
                            'GEJobRunner(-j y)')

    Adapter trimming and masking:
      --adapter ADAPTER_SEQUENCE
                            sequence of adapter to be trimmed. Specify
                            multiple adapters by separating them with plus
                            sign (+). Only used for read 1 if --adapter-read2
                            is also specified (default: use adapter sequence
                            from sample sheet)
      --adapter-read2 ADAPTER_SEQUENCE_READ2
                            sequence of adapter to be trimmed in read 2.
                            Specify multiple adapters by separating them with
                            plus sign (+) (default: use adapter sequence from
                            sample sheet)
      --minimum-trimmed-read-length MINIMUM_TRIMMED_READ_LENGTH
                            Minimum read length after adapter trimming.
                            bcl2fastq trims the adapter from the read down to
                            this value; if there is more adapter match below
                            this length then those bases are masked not
                            trimmed (i.e. replaced by N rather than removed)
                            (default: 35)
      --mask-short-adapter-reads MASK_SHORT_ADAPTER_READS
                            minimum length of unmasked bases that a read can
                            be after adapter trimming; reads with fewer ACGT
                            bases will be completely masked with Ns (default:
                            22)
      --no-adapter-trimming
                            turn off adapter trimming even if adapter
                            sequences are supplied

    Read truncation options:
      --r1-length R1_LENGTH
                            truncate R1 reads to R1_LENGTH (ignored if
                            --use-bases-mask is explicitly set)
      --r2-length R2_LENGTH
                            truncate R2 reads to R2_LENGTH (ignored if
                            --use-bases-mask is explicitly set, or if there
                            is no R2 read)
      --r3-length R3_LENGTH
                            truncate R3 reads to R3_LENGTH (ignored if
                            --use-bases-mask is explicitly set, or if there
                            is no R3 read)

    10x Genomics data options (Cellranger*/Spaceranger):
      --10x_jobmode CELLRANGER_JOBMODE
                            job mode to run cellranger in (default: 'local')
      --10x_localcores CELLRANGER_LOCALCORES
                            maximum cores cellranger can request at one time
                            for jobmode 'local' (ignored for other jobmodes)
                            (default: 1)
      --10x_localmem CELLRANGER_LOCALMEM
                            maximum total memory cellranger can request at
                            one time for jobmode 'local' (ignored for other
                            jobmodes) (in Gbs; default: 5)
      --10x_maxjobs CELLRANGER_MAXJOBS
                            maximum number of concurrent jobs to run NB only
                            used if jobmode is not 'local' (default: 24)
      --10x_mempercore CELLRANGER_MEMPERCORE
                            memory assumed per core (in Gbs; default: 5); NB
                            only used if jobmode is not 'local'
      --10x_jobinterval CELLRANGER_JOBINTERVAL
                            how often jobs are submitted (in ms; default:
                            100); only used if jobmode is not 'local'
      --ignore-dual-index   on a dual-indexed flowcell where the second index
                            was not used for the 10x sample, ignore it

    10x Genomics Spaceranger options:
      --rc-i2-override RC_I2_OVERRIDE
                            (Spaceranger only) explicitly indicate whether
                            bases in I2 read were emitted as reverse
                            complement by the sequencing workflow: set to
                            'true' for the Reverse Complement Workflow
                            (Workflow B) / NovaSeq Reagent Kit v1.5 or
                            greater, 'false' for the Forward Strand Workflow
                            (Workflow A) / older NovaSeq Reagent Kits. If
                            unset then workflow will be determined
                            automatically (recommended)

    Statistics generation:
      --stats-file STATS_FILE
                            specify output file for fastq statistics
      --per-lane-stats-file PER_LANE_STATS_FILE
                            specify output file for per-lane statistics
      --no-stats            don't generate statistics file; use
                            'update_fastq_stats' command to (re)generate
                            statistics

    Barcode analysis:
      --barcode-analysis-dir BARCODE_ANALYSIS_DIR
                            specify subdirectory where barcode analysis will
                            be performed and outputs will be written
      --no-barcode-analysis
                            don't perform barcode analysis; use
                            'analyse_barcodes' command to run barcode
                            analysis separately

    Conda dependency resolution:
      --enable-conda {yes,no}
                            use conda to resolve task dependencies; can be
                            'yes' or 'no' (default: yes)
      --conda-env-dir CONDA_ENV_DIR
                            specify directory for conda environments
                            (default: temporary directory)

    Job control options:
      -j NJOBS, --maxjobs NJOBS
                            maximum number of jobs to run concurrently
                            (default: 12)
      -c NCORES, --maxcores NCORES
                            maximum number of cores available for running
                            jobs (default: no limit)
      -b NBATCHES, --maxbatches NBATCHES
                            enable dynamic batching of pipeline jobs with
                            maximum number of batches set to NBATCHES
                            (default: no batching)

    Advanced/debugging options:
      --verbose             run pipeline in 'verbose' mode
      --work-dir WORKING_DIR
                            specify the working directory for the pipeline
                            operations

    Deprecated options:
      --require-bcl2fastq-version BCL2FASTQ_VERSION
                            deprecated: explicitly specify version of
                            bcl2fastq software to use (e.g. '=1.8.4' or
                            '>=2.0') (use --bcl-converter instead)

.. _commands_analyse_barcodes:

analyse_barcodes
****************

::

    usage: auto_process.py analyse_barcodes [-h] [--version]
            [--unaligned-dir UNALIGNED_DIR] [--lanes LANES]
            [--mismatches MISMATCHES] [--cutoff CUTOFF]
            [--sample-sheet SAMPLE_SHEET] [--id NAME]
            [--barcode-analysis-dir BARCODE_ANALYSIS_DIR] [--force]
            [--runner RUNNER] [--debug]
            [ANALYSIS_DIR]

    Analyse barcode sequences for Fastq files in specified lanes in
    ANALYSIS_DIR, and report the most common barcodes found across all reads
    from each lane.

    positional arguments:
      ANALYSIS_DIR          auto_process analysis directory (optional:
                            defaults to the current directory)

    optional arguments:
      -h, --help            show this help message and exit
      --version             show program's version number and exit
      --unaligned-dir UNALIGNED_DIR
                            explicitly set the (sub)directory with
                            bcl-to-fastq outputs
      --lanes LANES         specify which lanes to analyse barcodes for
                            (default is to do analysis for all lanes).
      --mismatches MISMATCHES
                            maximum number of mismatches to use when grouping
                            similar barcodes (default is to determine
                            automatically from the bases mask)
      --cutoff CUTOFF       exclude barcodes with a smaller fraction of
                            associated reads than CUTOFF, e.g. '0.01'
                            excludes barcodes with < 1% of reads (default is
                            0.01%)
      --sample-sheet SAMPLE_SHEET
                            use an alternative sample sheet to the default
                            'custom_SampleSheet.csv' created on setup.
      --id NAME             specify an identifier to be written into the
                            default output barcode analysis directory name
                            (e.g. 'barcode_analysis_NAME') and report title
      --barcode-analysis-dir BARCODE_ANALYSIS_DIR
                            specify subdirectory where barcode analysis will
                            be performed and outputs will be written
      --force               discard and regenerate counts (by default
                            existing counts will be used)
      --runner RUNNER       explicitly specify runner definition (e.g.
                            'GEJobRunner(-j y)')
      --debug               Turn on debugging output

.. _commands_setup_analysis_dirs:

setup_analysis_dirs
*******************

::

    usage: auto_process.py setup_analysis_dirs [-h] [--version]
            [--ignore-missing-metadata] [--unaligned-dir UNALIGNED_DIR]
            [--undetermined UNDETERMINED] [--short-fastq-names]
            [--link-to-fastqs] [--id NAME] [--debug]
            [ANALYSIS_DIR]

    Create analysis subdirectories for projects defined in projects.info file
    in ANALYSIS_DIR.

    positional arguments:
      ANALYSIS_DIR          auto_process analysis directory (optional:
                            defaults to the current directory)

    optional arguments:
      -h, --help            show this help message and exit
      --version             show program's version number and exit
      --ignore-missing-metadata
                            force creation of project directories even if
                            metadata is not set (default is to fail if
                            metadata is missing)
      --unaligned-dir UNALIGNED_DIR
                            explicitly specify the subdirectory with output
                            Fastqs
      --undetermined UNDETERMINED
                            explicitly specify name for project directory
                            with 'undetermined' fastqs
      --short-fastq-names   shorten fastq file names when copying or linking
                            from project directory (default is to keep long
                            names from bcl2fastq)
      --link-to-fastqs      create symbolic links to original fastqs from
                            project directory (default is to make hard links)
      --id NAME             identifier to append to project names
      --debug               Turn on debugging output

.. _commands_run_qc:

run_qc
******

::

    usage: auto_process.py run_qc [-h] [--version]
            [--projects PROJECT_PATTERN] [--qc_dir QC_DIR]
            [--fastq_dir FASTQ_DIR] [--protocol PROJECTNAME=QCPROTOCOL]
            [--organism PROJECTNAME=ORGANISM] [--fastq_subset SUBSET]
            [-t NTHREADS] [--cellranger CELLRANGER_EXE]
            [--10x_chemistry {ARC-v1,SC3Pv1,SC3Pv2,SC3Pv3,SC3Pv3HT,SC3Pv3LT,SC3Pv4,SC5P-PE,SC5P-PE-v3,SC5P-R2,SC5P-R2-v3,auto,fiveprime,threeprime}]
            [--10x_force_cells N_CELLS]
            [--10x_extra_projects PROJECT_DIRS]
            [--10x_transcriptome ORGANISM=REFERENCE]
            [--10x_premrna_reference ORGANISM=REFERENCE]
            [--report HTML_FILE] [--enable-conda {yes,no}]
            [--conda-env-dir CONDA_ENV_DIR] [-c NCORES] [-j NJOBS]
            [-b NBATCHES] [--verbose] [--work-dir WORKING_DIR]
            [--runner RUNNER] [--debug]
            [ANALYSIS_DIR]

    Run QC procedures for sequencing projects in ANALYSIS_DIR.

    positional arguments:
      ANALYSIS_DIR          auto_process analysis directory (optional:
                            defaults to the current directory)

    optional arguments:
      -h, --help            show this help message and exit
      --version             show program's version number and exit
      --projects PROJECT_PATTERN
                            simple wildcard-based pattern specifying a subset
                            of projects and samples to run the QC on.
                            PROJECT_PATTERN should be of the form
                            'pname[/sname]', where 'pname' specifies a
                            project (or set of projects) and 'sname'
                            optionally specifies a sample (or set of
                            samples).
      --qc_dir QC_DIR       explicitly specify QC output directory (nb if
                            supplied then the same QC_DIR will be used for
                            each project. Non-absolute paths are assumed to
                            be relative to the project directory). Default:
                            'qc'
      --fastq_dir FASTQ_DIR
                            explicitly specify subdirectory of DIR with Fastq
                            files to run the QC on.

    QC options:
      --protocol PROJECTNAME=QCPROTOCOL
                            specify QC protocol for project PROJECTNAME
                            (overrides the automatic protocol determination
                            for that project)
      --organism PROJECTNAME=ORGANISM
                            specify organism for QC run for project
                            PROJECTNAME (overrides the organism set for that
                            project)
      --fastq_subset SUBSET
                            specify size of subset of total reads to use for
                            fastq_screen, BAM file generation etc (default
                            100000, set to 0 to use all reads)
      -t NTHREADS, --threads NTHREADS
                            number of threads to use for QC script (default:
                            taken from job runner)

    Cellranger/10xGenomics options:
      --cellranger CELLRANGER_EXE
                            explicitly specify path to Cellranger executable
                            to use for single library analysis (NB will be
                            used for all projects)
      --10x_chemistry {ARC-v1,SC3Pv1,SC3Pv2,SC3Pv3,SC3Pv3HT,SC3Pv3LT,SC3Pv4,SC5P-PE,SC5P-PE-v3,SC5P-R2,SC5P-R2-v3,auto,fiveprime,threeprime}
                            assay configuration for 10xGenomics scRNA-seq; if
                            set to 'auto' (the default) then cellranger will
                            attempt to determine this automatically
      --10x_force_cells N_CELLS
                            force number of cells for 10xGenomics scRNA-seq
                            and scATAC-seq, overriding automatic cell
                            detection algorithms (default is to use built-in
                            cell detection)
      --10x_extra_projects PROJECT_DIRS
                            specify additional projects to include samples
                            from in single library analyses, as
                            comma-separated list
      --10x_transcriptome ORGANISM=REFERENCE
                            specify cellranger transcriptome reference
                            datasets to associate with organisms (overrides
                            references defined in config file)
      --10x_premrna_reference ORGANISM=REFERENCE
                            specify cellranger pre-mRNA reference datasets to
                            associate with organisms (overrides references
                            defined in config file)

    Output and reporting:
      --report HTML_FILE    file name for output HTML QC report (default:
                            _report.html)

    Conda dependency resolution:
      --enable-conda {yes,no}
                            use conda to resolve task dependencies; can be
                            'yes' or 'no' (default: no)
      --conda-env-dir CONDA_ENV_DIR
                            specify directory for conda environments
                            (default: temporary directory)

    Job control options:
      -c NCORES, --maxcores NCORES
                            maximum number of cores available for running
                            jobs (default: no limit)
      -j NJOBS, --maxjobs NJOBS
                            maximum number of jobs to run concurrently
                            (default: 12)
      -b NBATCHES, --maxbatches NBATCHES
                            enable dynamic batching of pipeline jobs with
                            maximum number of batches set to NBATCHES
                            (default: no batching)

    Advanced/debugging options:
      --verbose             run pipeline in 'verbose' mode
      --work-dir WORKING_DIR
                            specify the working directory for the pipeline
                            operations
      --runner RUNNER       explicitly specify runner definition (e.g.
                            'GEJobRunner(-j y)')
      --debug               Turn on debugging output

.. _commands_publish_qc:

publish_qc
**********

::

    usage: auto_process.py publish_qc [-h] [--version] [--qc_dir QC_DIR]
            [--use-hierarchy {yes,no}] [--url BASE_URL]
            [--projects PROJECT_PATTERN] [--ignore-missing-qc]
            [--exclude-zip-files {yes,no}] [--regenerate-reports]
            [--force] [--suppress-warnings] [--legacy] [--runner RUNNER]
            [--debug]
            [ANALYSIS_DIR]

    Copy QC reports from ANALYSIS_DIR to local or remote directory (e.g. web
    server). By default existing QC reports will be copied without further
    checking; if no report is found then QC results will be verified and a
    report generated first.

    positional arguments:
      ANALYSIS_DIR          auto_process analysis directory (optional:
                            defaults to the current directory)

    optional arguments:
      -h, --help            show this help message and exit
      --version             show program's version number and exit

    Destination options:
      --qc_dir QC_DIR       specify target directory to copy QC reports to.
                            QC_DIR can be a local directory, or a remote
                            location in the form '[[user@]host:]directory'.
                            Overrides the default settings.
      --use-hierarchy {yes,no}
                            use YEAR/PLATFORM hierarchy under QC_DIR; can be
                            'yes' or 'no' (default: no)
      --url BASE_URL        specify the 'base' URL for accessing the
                            published reports. Overrides the default settings

    Projects and data options:
      --projects PROJECT_PATTERN
                            simple wildcard-based pattern specifying a subset
                            of projects and samples to publish the QC for.
                            PROJECT_PATTERN can specify a single project, or
                            a set of projects.
      --ignore-missing-qc   skip projects where QC results are missing or
                            can't be verified, or where reports can't be
                            generated.
      --exclude-zip-files {yes,no}
                            exclude ZIP archives from publication; can be
                            'yes' or 'no' (default: no)

    QC reporting options:
      --regenerate-reports  attempt to regenerate existing QC reports
      --force               force generation of QC reports for all projects
                            even if verification has failed
      --suppress-warnings   don't include warning messages in (re)generated
                            QC reports or top level index even if there are
                            missing metrics in individual QC reports (NB
                            won't be applied for pre-existing reports;
                            combine with --regenerate-reports and --force to
                            update all reports)
      --legacy              legacy mode: include links to MultiQC and
                            'cellranger count' reports in the top-level index
                            page

    Advanced/debugging options:
      --runner RUNNER       explicitly specify runner definition (e.g.
                            'GEJobRunner(-j y)')
      --debug               Turn on debugging output

.. _commands_archive:

archive
*******

::

    usage: auto_process.py archive [-h] [--version]
            [--archive_dir ARCHIVE_DIR] [--platform PLATFORM]
            [--year YEAR] [--group GROUP] [--chmod PERMISSIONS]
            [--logging_file LOGGING_FILE] [--final] [--force]
            [--runner RUNNER] [--dry-run] [--debug]
            [ANALYSIS_DIR]

    Copy sequencing analysis data directory ANALYSIS_DIR to 'archive'
    destination.

    positional arguments:
      ANALYSIS_DIR          auto_process analysis directory (optional:
                            defaults to the current directory)

    optional arguments:
      -h, --help            show this help message and exit
      --version             show program's version number and exit
      --archive_dir ARCHIVE_DIR
                            specify top-level archive directory to copy data
                            under. ARCHIVE_DIR can be a local directory, or a
                            remote location in the form
                            '[[user@]host:]directory'. Overrides the default
                            settings.
      --platform PLATFORM   specify the platform e.g. 'hiseq', 'miseq' etc
                            (overrides automatically determined platform, if
                            any). Use 'other' for cases where the platform is
                            unknown.
      --year YEAR           specify the year e.g. '2014' (default is the
                            current year)
      --group GROUP         specify the name of group for the archived files
                            (default: None)
      --chmod PERMISSIONS   specify permissions for the archived files.
                            PERMISSIONS should be a string recognised by the
                            'chmod' command (e.g. 'o-rwX') (default: None)
      --logging_file LOGGING_FILE
                            log run details to LOGGING_FILE on final archive
                            (default: None)
      --final               copy data to final archive location (default is
                            to copy to staging area)
      --force               attempt to complete archiving operations ignoring
                            any errors (e.g. key metadata items not set,
                            unable to set group etc)
      --runner RUNNER       explicitly specify runner definition (e.g.
                            'GEJobRunner(-j y)')
      --dry-run             Dry run i.e. report what would be done but don't
                            perform any actions
      --debug               Turn on debugging output

.. _commands_report:

report
******

::

    usage: auto_process.py report [-h] [--version]
            [--logging | --summary | --projects] [--fields FIELDS]
            [--template TEMPLATE] [--file OUT_FILE] [--debug]
            [ANALYSIS_DIR]

    Report information on analysis in ANALYSIS_DIR.

    positional arguments:
      ANALYSIS_DIR          auto_process analysis directory (optional:
                            defaults to the current directory)

    optional arguments:
      -h, --help            show this help message and exit
      --version             show program's version number and exit
      --logging             print short report suitable for logging file
      --summary             print full report suitable for bioinformaticians
      --projects            print tab-delimited line (one per project)
                            suitable for injection into a spreadsheet
      --fields FIELDS       fields to report
      --template TEMPLATE   name of template with fields to report (templates
                            should be defined in the config file)
      --file OUT_FILE       write report to OUT_FILE; destination can be a
                            local file, or a remote file specified as
                            [[USER@]HOST:]PATH (default is to write to
                            stdout)
      --debug               Turn on debugging output

.. _commands_samplesheet:

samplesheet
***********

::

    usage: auto_process.py samplesheet [-h] [--version]
            [--use SAMPLE_SHEET |
             --set-project [LANES:][COL=PATTERN:]NEW_PROJECT |
             --set-sample-id [LANES:][COL=PATTERN:]NEW_ID |
             --set-sample-name NEW_NAME |
             -i SAMPLE_SHEET | -e | -p | -s]
            [--debug]
            [ANALYSIS_DIR]

    Query and manipulate sample sheets

    positional arguments:
      ANALYSIS_DIR          auto_process analysis directory (optional:
                            defaults to the current directory)

    optional arguments:
      -h, --help            show this help message and exit
      --version             show program's version number and exit
      --use SAMPLE_SHEET    update the default sample sheet file to
                            SAMPLE_SHEET (must be a file on the local file
                            system)
      --set-project [LANES:][COL=PATTERN:]NEW_PROJECT
                            update the sample project field. Optional LANES
                            specifies one or more lanes (e.g. '1', '1,2,3',
                            '1-3', '1,3-5') to update; optional COL=PATTERN
                            specifies a glob-style pattern to match to an
                            arbitrary column (e.g. 'Sample_Name=ITS*');
                            NEW_PROJECT is the new project name
      --set-sample-id [LANES:][COL=PATTERN:]NEW_ID
                            update the sample ID field. Optional LANES
                            specifies one or more lanes (e.g. '1', '1,2,3',
                            '1-3', '1,3-5') to update; optional COL=PATTERN
                            specifies a glob-style pattern to match to an
                            arbitrary column (e.g. 'Sample_Name=ITS*');
                            NEW_ID can be either 'SAMPLE_NAME' or an
                            arbitrary string
      --set-sample-name NEW_NAME
                            update the sample name field. Optional LANES
                            specifies one or more lanes (e.g. '1', '1,2,3',
                            '1-3', '1,3-5') to update; optional COL=PATTERN
                            specifies a glob-style pattern to match to an
                            arbitrary column (e.g. 'Sample_Name=ITS*');
                            NEW_NAME can be either 'SAMPLE_ID' or an
                            arbitrary string
      -i SAMPLE_SHEET, --import SAMPLE_SHEET
                            replace existing sample sheet file with version
                            copied from the specified location; SAMPLE_SHEET
                            can be a local or remote file, or a URL
      -e, --edit            bring up sample sheet file in an editor to make
                            changes manually
      -p, --predict         show predicted outputs from sample sheet
      -s, --summarise       summarise predicted outputs from sample sheet

    Advanced options:
      --debug               Turn on debugging output

.. _commands_update:

update
******

::

    usage: auto_process.py update [-h] [--version] [--paths]
            [--project-metadata] [--project-dirs] [--qc-reports]
            [--debug]
            [ANALYSIS_DIR]

    Update paths and metadata across ANALYSIS_DIR and its projects and QC
    outputs when directory has been moved or copied, or project metadata has
    been updated.

    positional arguments:
      ANALYSIS_DIR          existing auto_process analysis directory to
                            update (optional: defaults to the current
                            directory)

    optional arguments:
      -h, --help            show this help message and exit
      --version             show program's version number and exit
      --paths               update paths stored in the metadata and parameter
                            files for ANALYSIS_DIR
      --project-metadata    propagate modified metadata in 'projects.info' to
                            project directories in ANALYSIS_DIR
      --project-dirs        synchronise project entries in 'projects.info'
                            with project directories within ANALYSIS_DIR
      --qc-reports          regenerate QC reports for projects where metadata
                            has been updated
      --debug               Turn on debugging output

.. _commands_merge_fastq_dirs:

merge_fastq_dirs
****************

::

    usage: auto_process.py merge_fastq_dirs [-h] [--version]
            [--primary-unaligned-dir UNALIGNED_DIR]
            [--output-dir OUTPUT_DIR] [--dry-run] [--debug]
            [ANALYSIS_DIR]

    Automatically merge fastq directories from multiple bcl-to-fastq runs
    within ANALYSIS_DIR. Use this command if 'make_fastqs' step was run
    multiple times to process subsets of lanes.

    positional arguments:
      ANALYSIS_DIR          auto_process analysis directory (optional:
                            defaults to the current directory)

    optional arguments:
      -h, --help            show this help message and exit
      --version             show program's version number and exit
      --primary-unaligned-dir UNALIGNED_DIR
                            merge fastqs from additional bcl-to-fastq
                            directories into UNALIGNED_DIR. Original data
                            will be moved out of the way first. Defaults to
                            'bcl2fastq'.
      --output-dir OUTPUT_DIR
                            merge fastqs into OUTPUT_DIR (relative to
                            ANALYSIS_DIR). Defaults to UNALIGNED_DIR.
      --dry-run             Dry run i.e. report what would be done but don't
                            perform any actions
      --debug               Turn on debugging output

.. _commands_update_fastq_stats:

update_fastq_stats
******************

::

    usage: auto_process.py update_fastq_stats [-h] [--version]
            [--unaligned-dir UNALIGNED_DIR] [--sample-sheet SAMPLE_SHEET]
            [--id NAME] [--stats-file STATS_FILE]
            [--per-lane-stats-file PER_LANE_STATS_FILE] [-a] [--force]
            [--nprocessors NPROCESSORS] [--runner RUNNER] [--debug]
            [ANALYSIS_DIR]

    (Re)generate statistics for fastq files produced from 'make_fastqs'.

    positional arguments:
      ANALYSIS_DIR          auto_process analysis directory (optional:
                            defaults to the current directory)

    optional arguments:
      -h, --help            show this help message and exit
      --version             show program's version number and exit
      --unaligned-dir UNALIGNED_DIR
                            explicitly set the (sub)directory with
                            bcl-to-fastq outputs
      --sample-sheet SAMPLE_SHEET
                            explicitly specify the sample sheet to use
                            (defaults to the sample sheet stored in the
                            analysis directory parameters)
      --id NAME             specify an identifier to be written into the
                            output statistics file name (e.g.
                            'statistics.NAME.info')
      --stats-file STATS_FILE
                            specify output file for fastq statistics
      --per-lane-stats-file PER_LANE_STATS_FILE
                            specify output file for per-lane statistics
      -a, --add             add new data from UNALIGNED_DIR to existing
                            statistics
      --force               force statistics to be regenerated even if
                            existing statistics files are newer than fastqs
      --nprocessors NPROCESSORS
                            explicitly specify number of processors/cores to
                            use (default taken from job runner)
      --runner RUNNER       explicitly specify runner definition (e.g.
                            'GEJobRunner(-j y)')
      --debug               Turn on debugging output

.. _commands_import_project:

import_project
**************

::

    usage: auto_process.py import_project [-h] [--version] [--debug]
            [--comment COMMENT]
            [ANALYSIS_DIR] PROJECT_DIR

    Copy a project directory PROJECT_DIR from another analysis directory into
    ANALYSIS_DIR, update metadata appropriately, and regenerate QC reports.

    positional arguments:
      ANALYSIS_DIR          auto_process analysis directory (optional:
                            defaults to the current directory)
      PROJECT_DIR           path to project directory to import

    optional arguments:
      -h, --help            show this help message and exit
      --version             show program's version number and exit
      --debug               Turn on debugging output
      --comment COMMENT     specify comment text to be appended to the stored
                            comments associated with the project

.. _commands_config:

config
******

::

    usage: auto_process.py config [-h] [--version] [--debug]
            [--init | --set KEY_VALUE | --add NEW_SECTION] [--raw]
            [--show]

    Query and change global configuration. Run without options or arguments
    to display configuration settings.

    optional arguments:
      -h, --help            show this help message and exit
      --version             show program's version number and exit
      --debug               Turn on debugging output

    Creation and edit options:
      --init                Create a new default configuration file based on
                            the sample template.
      --set KEY_VALUE       Set the value of a parameter. KEY_VALUE should be
                            of the form '<KEY>=<VALUE>' (<KEY> should be of
                            the form 'SECTION[:SUBSECTION].NAME'). Multiple
                            --set options can be specified.
      --add NEW_SECTION     Add a new section called NEW_SECTION to the
                            config (to add e.g. a new platform, use
                            'platform:NAME'). Multiple --add options can be
                            specified.

    Display options:
      --raw                 Show the 'raw' configuration (i.e. only
                            parameters and values explicitly defined in the
                            config before defaults are loaded)

    Deprecated/defunct options:
      --show                Show the values of parameters and settings (does
                            nothing; use 'config' with no options to display
                            settings)

.. _commands_params:

params
******

::

    usage: auto_process.py params [-h] [--version] [--set KEY_VALUE]
            [--debug]
            [ANALYSIS_DIR]

    Query and change processing parameters and settings for ANALYSIS_DIR.

    positional arguments:
      ANALYSIS_DIR          auto_process analysis directory (optional:
                            defaults to the current directory)

    optional arguments:
      -h, --help            show this help message and exit
      --version             show program's version number and exit
      --set KEY_VALUE       Set the value of a parameter. KEY_VALUE should be
                            of the form '<KEY>=<VALUE>'. Multiple --set
                            options can be specified.
      --debug               Turn on debugging output

.. _commands_metadata:

metadata
********

::

    usage: auto_process.py metadata [-h] [--version]
            [--set [PROJECT:]PARAM=VALUE] [--update] [--debug]
            [ANALYSIS_DIR]

    Query and change metadata associated with ANALYSIS_DIR.

    positional arguments:
      ANALYSIS_DIR          auto_process analysis directory (optional:
                            defaults to the current directory)

    optional arguments:
      -h, --help            show this help message and exit
      --version             show program's version number and exit
      --set [PROJECT:]PARAM=VALUE
                            Set metadata item PARAM to VALUE for
                            ANALYSIS_DIR, or if the PROJECT identifier is
                            also specified, then for item PARAM in that
                            project. Multiple --set options can be specified.
      --update              Automatically update metadata items where
                            possible (e.g. for older analyses which have old
                            or missing metadata files)
      --debug               Turn on debugging output

.. _commands_readme:

readme
******

::

    usage: auto_process.py readme [-h] [--version] [--init] [-V] [-e]
            [-m MESSAGE] [--debug]
            [ANALYSIS_DIR]

    Add or amend a README file in the analysis directory ANALYSIS_DIR.

    positional arguments:
      ANALYSIS_DIR          auto_process analysis directory (optional:
                            defaults to the current directory)

    optional arguments:
      -h, --help            show this help message and exit
      --version             show program's version number and exit
      --init                create a new README file
      -V, --view            display the contents of the README file
      -e, --edit            bring up README file in an editor to make changes
      -m MESSAGE, --message MESSAGE
                            append MESSAGE text to the README file
      --debug               Turn on debugging output

.. _commands_clone:

clone
*****

::

    usage: auto_process.py clone [-h] [--version] [--copy-fastqs]
            [--exclude-projects] [--debug]
            [ANALYSIS_DIR] CLONE_DIR

    Make a copy of an existing directory ANALYSIS_DIR in a new directory
    CLONE_DIR.

    positional arguments:
      ANALYSIS_DIR          existing auto_process analysis directory to clone
                            (optional: defaults to the current directory)
      CLONE_DIR             path to cloned directory

    optional arguments:
      -h, --help            show this help message and exit
      --version             show program's version number and exit
      --copy-fastqs         Copy fastq.gz files from ANALYSIS_DIR into
                            CLONE_DIR (default is to make a link to the
                            bcl-to-fastq directory)
      --exclude-projects    Exclude (i.e. don't copy) project directories
                            from ANALYSIS_DIR
      --debug               Turn on debugging output
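
The ``LANES`` syntax described for ``make_fastqs --lanes`` and for the ``samplesheet`` ``--set-project`` and ``--set-sample-id`` options (a single lane, a comma-separated list, a range, or a combination such as ``1-3,7``) can be illustrated with a short Python sketch. This is only an illustration of how such a string might be expanded: the function name ``expand_lanes`` is hypothetical and is not taken from ``auto_process.py``, whose own parsing may differ (for example in how it treats malformed input).

```python
def expand_lanes(spec):
    """Expand a LANES string such as '1-3,7' into a sorted list of ints.

    Hypothetical helper for illustration only; not part of auto_process.
    """
    lanes = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            # Tolerate stray commas, e.g. '1,,2'
            continue
        if "-" in part:
            # A range such as '1-3' includes both end points
            start, end = (int(x) for x in part.split("-", 1))
            if start > end:
                raise ValueError(f"invalid lane range: {part!r}")
            lanes.update(range(start, end + 1))
        else:
            # A single lane number
            lanes.add(int(part))
    return sorted(lanes)


if __name__ == "__main__":
    # The four forms mentioned in the --lanes help text
    for spec in ("1", "1,2,3,7", "1-3", "1-3,7"):
        print(spec, "->", expand_lanes(spec))
```

Under this reading, ``--lanes=1-4:standard:trim_adapters=no`` would group lanes 1 to 4; the text after the first colon is the ``OPTIONS`` part and is not handled by the sketch.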