``auto_process`` commands
=========================

.. note::

   This documentation has been auto-generated from the
   command help.

``auto_process.py`` implements the following commands:

.. contents:: :local:

.. _commands_info:

info
****

::

    usage: auto_process.py info [-h] [--version] [--debug] [ANALYSIS_DIR]
    
    Print information about the analysis associated with ANALYSIS_DIR.
    
    positional arguments:
      ANALYSIS_DIR  auto_process analysis directory (optional: defaults to the
                    current directory)
    
    optional arguments:
      -h, --help    show this help message and exit
      --version     show program's version number and exit
      --debug       Turn on debugging output
    
.. _commands_setup:

setup
*****

::

    usage: auto_process.py setup [-h] [--version] -r RUN_NUMBER [-s SAMPLE_SHEET]
                                 [-n ANALYSIS_NUMBER] [-f FILE]
                                 [--fastq-dir UNALIGNED_DIR]
                                 [--analysis-dir ANALYSIS_DIR] [--debug]
                                 RUN_DIR
    
    Set up automatic processing of Illumina sequencing data from RUN_DIR.
    
    positional arguments:
      RUN_DIR               directory with the output from an Illumina sequencer
    
    optional arguments:
      -h, --help            show this help message and exit
      --version             show program's version number and exit
      -r RUN_NUMBER, --run-number RUN_NUMBER
                            Set facility run number (required)
      -s SAMPLE_SHEET, --samplesheet SAMPLE_SHEET, --sample-sheet SAMPLE_SHEET
                            Copy sample sheet file from name and location
                            SAMPLE_SHEET (default is to look for SampleSheet.csv
                            inside RUN_DIR). SAMPLE_SHEET can be a local or remote
                            file, or a URL
      -n ANALYSIS_NUMBER, --analysis-number ANALYSIS_NUMBER
                            Set analysis number (e.g. if reprocessing a run); will
                            be appended to analysis directory name if '--analysis-
                            dir' not supplied
      -f FILE, --file FILE  Additional file(s) to copy into new analysis
                            directory. FILE can be a local or remote file, or a
                            URL
      --fastq-dir UNALIGNED_DIR
                            Import fastq.gz files from UNALIGNED_DIR (which should
                            be a subdirectory of RUN_DIR with the same structure
                            as the 'Unaligned' or 'bcl2fastq2' output directory
                            produced by CASAVA/bcl2fastq)
      --analysis-dir ANALYSIS_DIR
                            Make new directory called ANALYSIS_DIR (otherwise
                            default is '<RUN_DIR>_analysis[<ANALYSIS_NUMBER>]')
      --debug               Turn on debugging output
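
The default directory name described for ``--analysis-dir`` can be
illustrated with a short Python sketch. This is not code from
``auto_process.py``: the helper name ``default_analysis_dir`` and the
example run name are hypothetical, and the sketch assumes that the name
is built from the final path component of ``RUN_DIR``.

.. code-block:: python

    import os

    def default_analysis_dir(run_dir, analysis_number=None):
        """Hypothetical helper mimicking '<RUN_DIR>_analysis[<ANALYSIS_NUMBER>]'."""
        # Assumption: only the last component of RUN_DIR is used
        name = os.path.basename(os.path.normpath(run_dir)) + "_analysis"
        # The analysis number is optional and is appended without a separator
        if analysis_number is not None:
            name += str(analysis_number)
        return name

    # Made-up run directory name, for illustration only
    print(default_analysis_dir("/data/runs/170901_M00879_0087_000000000-AGEW9"))
    print(default_analysis_dir("/data/runs/170901_M00879_0087_000000000-AGEW9", 2))

Supplying ``--analysis-dir`` bypasses this naming scheme entirely.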
    
.. _commands_make_fastqs:

make_fastqs
***********

::

    usage: auto_process.py make_fastqs [-h] [--version] [--no-save] [--debug]
                                       [--id NAME] [--force-copy]
                                       [--protocol {standard,mirna,10x_chromium_sc,10x_atac,10x_multiome,10x_multiome_atac,10x_multiome_gex,10x_visium,10x_visium_v1,10x_visium_hd,10x_visium_hd_3prime,parse_evercode,biorad_ddseq}]
                                       [--sample-sheet SAMPLE_SHEET]
                                       [--lanes LANES[:OPTIONS]]
                                       [--output-dir OUT_DIR]
                                       [--platform PLATFORM]
                                       [--use-bases-mask BASES_MASK]
                                       [--bcl-converter CONVERTER]
                                       [--no-lane-splitting]
                                       [--use-lane-splitting]
                                       [--find-adapters-with-sliding-window]
                                       [--create-empty-fastqs]
                                       [--no-create-empty-fastqs]
                                       [--create-fastq-for-index-reads]
                                       [--ignore-missing-bcls]
                                       [--nprocessors NPROCESSORS]
                                       [--runner RUNNER]
                                       [--adapter ADAPTER_SEQUENCE]
                                       [--adapter-read2 ADAPTER_SEQUENCE_READ2]
                                       [--minimum-trimmed-read-length MINIMUM_TRIMMED_READ_LENGTH]
                                       [--mask-short-adapter-reads MASK_SHORT_ADAPTER_READS]
                                       [--no-adapter-trimming]
                                       [--r1-length R1_LENGTH]
                                       [--r2-length R2_LENGTH]
                                       [--r3-length R3_LENGTH]
                                       [--10x_jobmode CELLRANGER_JOBMODE]
                                       [--10x_localcores CELLRANGER_LOCALCORES]
                                       [--10x_localmem CELLRANGER_LOCALMEM]
                                       [--10x_maxjobs CELLRANGER_MAXJOBS]
                                       [--10x_mempercore CELLRANGER_MEMPERCORE]
                                       [--10x_jobinterval CELLRANGER_JOBINTERVAL]
                                       [--ignore-dual-index]
                                       [--rc-i2-override RC_I2_OVERRIDE]
                                       [--stats-file STATS_FILE]
                                       [--per-lane-stats-file PER_LANE_STATS_FILE]
                                       [--no-stats]
                                       [--barcode-analysis-dir BARCODE_ANALYSIS_DIR]
                                       [--no-barcode-analysis]
                                       [--enable-conda {yes,no}]
                                       [--conda-env-dir CONDA_ENV_DIR] [-j NJOBS]
                                       [-c NCORES] [-b NBATCHES] [--verbose]
                                       [--work-dir WORKING_DIR]
                                       [--require-bcl2fastq-version BCL2FASTQ_VERSION]
                                       [ANALYSIS_DIR]
    
    Generate fastq files from raw bcl files produced by an Illumina sequencer.
    
    positional arguments:
      ANALYSIS_DIR          auto_process analysis directory (optional: defaults to
                            the current directory)
    
    optional arguments:
      -h, --help            show this help message and exit
      --version             show program's version number and exit
      --no-save             Don't save parameter changes to the auto_process.info
                            file
      --debug               Turn on debugging output
      --id NAME             identifier for output files
    
    Primary data management:
      --force-copy          force primary data to be copied (by default only data
                            on a remote system will be copied; data on a local
                            system will be symlinked)
    
    General Fastq generation:
      --protocol {standard,mirna,10x_chromium_sc,10x_atac,10x_multiome,10x_multiome_atac,10x_multiome_gex,10x_visium,10x_visium_v1,10x_visium_hd,10x_visium_hd_3prime,parse_evercode,biorad_ddseq}
                            specify Fastq generation protocol depending on the
                            data being processed (default: 'standard')
      --sample-sheet SAMPLE_SHEET
                            use an alternative sample sheet to the default
                            'custom_SampleSheet.csv' created on setup.
      --lanes LANES[:OPTIONS]
                            define a set of lanes to group for processing. LANES
                            can be a single lane (e.g. '1'), a list ('1,2,3,7'), a
                            range ('1-3'), or a combination ('1-3,7'). Specified
                            lanes are processed together in a group, using OPTIONS
                            (if supplied). OPTIONS takes the form
                            '[PROTOCOL:][KEY=VALUE:[KEY=VALUE]...]' (for example
                            --lanes=1-4:standard:trim_adapters=no)
      --output-dir OUT_DIR  set the directory for the output Fastqs (default:
                            'bcl2fastq')
      --platform PLATFORM   explicitly specify the sequencing platform. Only use
                            this if the platform cannot be identified from the
                            instrument name
      --use-bases-mask BASES_MASK
                            explicitly set the bases-mask string to indicate how
                            each cycle should be used in the BCL to Fastq
                            conversion (overrides default). Set to 'auto' to
                            determine automatically
    
    Bcl conversion options:
      --bcl-converter CONVERTER
                            explicitly set BCL conversion software to use for
                            non-10xGenomics runs (either 'bcl2fastq' or 'bcl-
                            convert'; can also include a version specifier e.g.
                            'bcl2fastq>=2.0'). Default: bcl2fastq>=2.20 (may be
                            overridden by platform-specific settings)
      --no-lane-splitting   don't split the output FASTQ files by lane. Default:
                            off (may be overridden by platform-specific settings);
                            turn off using --use-lane-splitting
      --use-lane-splitting  split the output FASTQ files by lane. Default: on (but
                            may be overridden by platform-specific settings); turn
                            off using --no-lane-splitting
      --find-adapters-with-sliding-window
                            use sliding window algorithm to identify adapters for
                            trimming
      --create-empty-fastqs
                            create empty files as placeholders for missing FASTQs
                            from demultiplexing step. Default: off (but may be
                            overridden by platform-specific settings); turn off
                            using --no-create-empty-fastqs. NB Fastq generation
                            must have finished without errors for this option to
                            be applied
      --no-create-empty-fastqs
                            don't create empty files as placeholders for missing
                            FASTQs from demultiplexing step. Default: on (but may
                            be overridden by platform-specific settings); turn off
                            using --create-empty-fastqs.
      --create-fastq-for-index-reads
                            also create FASTQs for index reads
      --ignore-missing-bcls
                            ignores missing or corrupt BCL files and assumes
                            'N'/'#' for missing calls (only applies if using
                            bcl2fastq as the BCL conversion software)
      --nprocessors NPROCESSORS
                            explicitly specify number of processors/cores to use
                            (default taken from job runner)
      --runner RUNNER       explicitly specify runner definition (e.g.
                            'GEJobRunner(-j y)')
    
    Adapter trimming and masking:
      --adapter ADAPTER_SEQUENCE
                            sequence of adapter to be trimmed. Specify multiple
                            adapters by separating them with plus sign (+). Only
                            used for read 1 if --adapter-read2 is also specified
                            (default: use adapter sequence from sample sheet)
      --adapter-read2 ADAPTER_SEQUENCE_READ2
                            sequence of adapter to be trimmed in read 2. Specify
                            multiple adapters by separating them with plus sign
                            (+) (default: use adapter sequence from sample sheet)
      --minimum-trimmed-read-length MINIMUM_TRIMMED_READ_LENGTH
                            Minimum read length after adapter trimming. bcl2fastq
                            trims the adapter from the read down to this value; if
                            there is more adapter match below this length then
                            those bases are masked not trimmed (i.e. replaced by N
                            rather than removed) (default: 35)
      --mask-short-adapter-reads MASK_SHORT_ADAPTER_READS
                            minimum length of unmasked bases that a read can be
                            after adapter trimming; reads with fewer ACGT bases
                            will be completely masked with Ns (default: 22)
      --no-adapter-trimming
                            turn off adapter trimming even if adapter sequences
                            are supplied
    
    Read truncation options:
      --r1-length R1_LENGTH
                            truncate R1 reads to R1_LENGTH (ignored if --use-
                            bases-mask is explicitly set)
      --r2-length R2_LENGTH
                            truncate R2 reads to R2_LENGTH (ignored if --use-
                            bases-mask is explicitly set, or if there is no R2
                            read)
      --r3-length R3_LENGTH
                            truncate R3 reads to R3_LENGTH (ignored if --use-
                            bases-mask is explicitly set, or if there is no R3
                            read)
    
    10x Genomics data options (Cellranger*/Spaceranger):
      --10x_jobmode CELLRANGER_JOBMODE
                            job mode to run cellranger in (default: 'local')
      --10x_localcores CELLRANGER_LOCALCORES
                            maximum cores cellranger can request at one time for
                            jobmode 'local' (ignored for other jobmodes) (default:
                            1)
      --10x_localmem CELLRANGER_LOCALMEM
                            maximum total memory cellranger can request at one
                            time for jobmode 'local' (ignored for other jobmodes)
                            (in GB; default: 5)
      --10x_maxjobs CELLRANGER_MAXJOBS
                            maximum number of concurrent jobs to run; NB only used
                            if jobmode is not 'local' (default: 24)
      --10x_mempercore CELLRANGER_MEMPERCORE
                            memory assumed per core (in GB; default: 5); NB only
                            used if jobmode is not 'local'
      --10x_jobinterval CELLRANGER_JOBINTERVAL
                            how often jobs are submitted (in ms; default: 100);
                            only used if jobmode is not 'local'
      --ignore-dual-index   on a dual-indexed flowcell where the second index was
                            not used for the 10x sample, ignore it
    
    10x Genomics Spaceranger options:
      --rc-i2-override RC_I2_OVERRIDE
                            (Spaceranger only) explicitly indicate whether bases
                            in I2 read were emitted as reverse complement by the
                            sequencing workflow: set to 'true' for the Reverse
                            Complement Workflow (Workflow B) / NovaSeq Reagent Kit
                            v1.5 or greater, 'false' for the Forward Strand
                            Workflow (Workflow A) / older NovaSeq Reagent Kits. If
                            unset then workflow will be determined automatically
                            (recommended)
    
    Statistics generation:
      --stats-file STATS_FILE
                            specify output file for fastq statistics
      --per-lane-stats-file PER_LANE_STATS_FILE
                            specify output file for per-lane statistics
      --no-stats            don't generate statistics file; use
                            'update_fastq_stats' command to (re)generate
                            statistics
    
    Barcode analysis:
      --barcode-analysis-dir BARCODE_ANALYSIS_DIR
                            specify subdirectory where barcode analysis will be
                            performed and outputs will be written
      --no-barcode-analysis
                            don't perform barcode analysis; use 'analyse_barcodes'
                            command to run barcode analysis separately
    
    Conda dependency resolution:
      --enable-conda {yes,no}
                            use conda to resolve task dependencies; can be 'yes'
                            or 'no' (default: yes)
      --conda-env-dir CONDA_ENV_DIR
                            specify directory for conda environments (default:
                            temporary directory)
    
    Job control options:
      -j NJOBS, --maxjobs NJOBS
                            maximum number of jobs to run concurrently (default:
                            12)
      -c NCORES, --maxcores NCORES
                            maximum number of cores available for running jobs
                            (default: no limit)
      -b NBATCHES, --maxbatches NBATCHES
                            enable dynamic batching of pipeline jobs with maximum
                            number of batches set to NBATCHES (default: no
                            batching)
    
    Advanced/debugging options:
      --verbose             run pipeline in 'verbose' mode
      --work-dir WORKING_DIR
                            specify the working directory for the pipeline
                            operations
    
    Deprecated options:
      --require-bcl2fastq-version BCL2FASTQ_VERSION
                            deprecated: explicitly specify version of bcl2fastq
                            software to use (e.g. '=1.8.4' or '>=2.0') (use --bcl-
                            converter instead)
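
The ``LANES[:OPTIONS]`` syntax accepted by ``--lanes`` can be unpacked
as in the following Python sketch. It only illustrates the syntax as
described in the help text and is not the parser used by
``auto_process.py``. The function names ``parse_lanes`` and
``parse_lanes_option`` are hypothetical, and the sketch assumes that any
field without an ``=`` is the protocol name.

.. code-block:: python

    def parse_lanes(spec):
        """Expand a lane expression such as '1-3,7' into [1, 2, 3, 7]."""
        lanes = set()
        for part in spec.split(","):
            if "-" in part:
                # A range such as '1-3' includes both end points
                start, end = part.split("-", 1)
                lanes.update(range(int(start), int(end) + 1))
            else:
                lanes.add(int(part))
        return sorted(lanes)

    def parse_lanes_option(value):
        """Split 'LANES[:OPTIONS]' into (lanes, protocol, options)."""
        fields = value.split(":")
        lanes = parse_lanes(fields[0])
        protocol = None
        options = {}
        for field in fields[1:]:
            if "=" in field:
                # KEY=VALUE setting applied to this group of lanes
                key, val = field.split("=", 1)
                options[key] = val
            else:
                # Assumption: a bare field names the protocol
                protocol = field
        return lanes, protocol, options

    print(parse_lanes_option("1-4:standard:trim_adapters=no"))
    # ([1, 2, 3, 4], 'standard', {'trim_adapters': 'no'})
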
    
.. _commands_analyse_barcodes:

analyse_barcodes
****************

::

    usage: auto_process.py analyse_barcodes [-h] [--version]
                                            [--unaligned-dir UNALIGNED_DIR]
                                            [--lanes LANES]
                                            [--mismatches MISMATCHES]
                                            [--cutoff CUTOFF]
                                            [--sample-sheet SAMPLE_SHEET]
                                            [--id NAME]
                                            [--barcode-analysis-dir BARCODE_ANALYSIS_DIR]
                                            [--force] [--runner RUNNER] [--debug]
                                            [ANALYSIS_DIR]
    
    Analyse barcode sequences for Fastq files in specified lanes in ANALYSIS_DIR,
    and report the most common barcodes found across all reads from each lane.
    
    positional arguments:
      ANALYSIS_DIR          auto_process analysis directory (optional: defaults to
                            the current directory)
    
    optional arguments:
      -h, --help            show this help message and exit
      --version             show program's version number and exit
      --unaligned-dir UNALIGNED_DIR
                            explicitly set the (sub)directory with bcl-to-fastq
                            outputs
      --lanes LANES         specify which lanes to analyse barcodes for (default
                            is to do analysis for all lanes).
      --mismatches MISMATCHES
                            maximum number of mismatches to use when grouping
                            similar barcodes (default is to determine
                            automatically from the bases mask)
      --cutoff CUTOFF       exclude barcodes with a smaller fraction of associated
                            reads than CUTOFF, e.g. '0.01' excludes barcodes with
                            < 1% of reads (default is 0.01%)
      --sample-sheet SAMPLE_SHEET
                            use an alternative sample sheet to the default
                            'custom_SampleSheet.csv' created on setup.
      --id NAME             specify an identifier to be written into the default
                            output barcode analysis directory name (e.g.
                            'barcode_analysis_NAME') and report title
      --barcode-analysis-dir BARCODE_ANALYSIS_DIR
                            specify subdirectory where barcode analysis will be
                            performed and outputs will be written
      --force               discard and regenerate counts (by default existing
                            counts will be used)
      --runner RUNNER       explicitly specify runner definition (e.g.
                            'GEJobRunner(-j y)')
      --debug               Turn on debugging output
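
The effect of ``--cutoff`` can be pictured with the Python sketch below.
It is an illustration under stated assumptions rather than the
tool's own implementation: the function ``apply_cutoff`` is hypothetical,
the barcode sequences and read counts are invented, and the fraction is
assumed to be taken relative to all reads counted in the lane.

.. code-block:: python

    def apply_cutoff(counts, cutoff):
        """Keep barcodes whose fraction of all reads is at least cutoff."""
        total = sum(counts.values())
        if total == 0:
            return {}
        return {bc: n for bc, n in counts.items() if n / total >= cutoff}

    # Invented counts for a single lane (10000 reads in total)
    counts = {"ACGTACGT": 9500, "TTGCATGC": 495, "GGGGGGGG": 5}

    # '0.01' drops barcodes associated with fewer than 1% of reads
    print(sorted(apply_cutoff(counts, 0.01)))
    # ['ACGTACGT', 'TTGCATGC']

    # A fraction of 0.0001 corresponds to 0.01% of reads
    print(sorted(apply_cutoff(counts, 0.0001)))
    # ['ACGTACGT', 'GGGGGGGG', 'TTGCATGC']
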
    
.. _commands_setup_analysis_dirs:

setup_analysis_dirs
*******************

::

    usage: auto_process.py setup_analysis_dirs [-h] [--version]
                                               [--ignore-missing-metadata]
                                               [--unaligned-dir UNALIGNED_DIR]
                                               [--undetermined UNDETERMINED]
                                               [--short-fastq-names]
                                               [--link-to-fastqs] [--id NAME]
                                               [--debug]
                                               [ANALYSIS_DIR]
    
    Create analysis subdirectories for projects defined in projects.info file in
    ANALYSIS_DIR.
    
    positional arguments:
      ANALYSIS_DIR          auto_process analysis directory (optional: defaults to
                            the current directory)
    
    optional arguments:
      -h, --help            show this help message and exit
      --version             show program's version number and exit
      --ignore-missing-metadata
                            force creation of project directories even if metadata
                            is not set (default is to fail if metadata is missing)
      --unaligned-dir UNALIGNED_DIR
                            explicitly specify the subdirectory with output Fastqs
      --undetermined UNDETERMINED
                            explicitly specify name for project directory with
                            'undetermined' fastqs
      --short-fastq-names   shorten fastq file names when copying or linking from
                            project directory (default is to keep long names from
                            bcl2fastq)
      --link-to-fastqs      create symbolic links to original fastqs from project
                            directory (default is to make hard links)
      --id NAME             identifier to append to project names
      --debug               Turn on debugging output
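
The difference between the default hard links and the symbolic links
requested with ``--link-to-fastqs`` is sketched below in Python. This is
a generic demonstration of the two link types on a POSIX filesystem, not
code from ``auto_process.py``; the helper ``link_fastq`` and the file
names are hypothetical.

.. code-block:: python

    import os
    import tempfile

    def link_fastq(src, dest, symbolic=False):
        """Link dest to src with a hard link (default) or a symbolic link."""
        if symbolic:
            os.symlink(src, dest)
        else:
            os.link(src, dest)

    with tempfile.TemporaryDirectory() as tmp:
        # Hypothetical Fastq name; an empty file stands in for real data
        fastq = os.path.join(tmp, "AB1_S1_R1_001.fastq.gz")
        open(fastq, "wb").close()
        hard = os.path.join(tmp, "hard_AB1_R1.fastq.gz")
        soft = os.path.join(tmp, "soft_AB1_R1.fastq.gz")
        link_fastq(fastq, hard)
        link_fastq(fastq, soft, symbolic=True)
        hard_is_symlink = os.path.islink(hard)
        soft_is_symlink = os.path.islink(soft)
        same_file = os.path.samefile(fastq, hard)
        # Removing the original leaves the hard link usable,
        # whereas the symbolic link now points at nothing
        os.remove(fastq)
        hard_survives = os.path.exists(hard)
        soft_survives = os.path.exists(soft)

    print(hard_is_symlink, soft_is_symlink, same_file)
    print(hard_survives, soft_survives)

Hard links can only be made within a single filesystem, which is one
reason symbolic links may be preferred in some setups.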
    
.. _commands_run_qc:

run_qc
******

::

    usage: auto_process.py run_qc [-h] [--version] [--projects PROJECT_PATTERN]
                                  [--qc_dir QC_DIR] [--fastq_dir FASTQ_DIR]
                                  [--protocol PROJECTNAME=QCPROTOCOL]
                                  [--organism PROJECTNAME=ORGANISM]
                                  [--fastq_subset SUBSET] [-t NTHREADS]
                                  [--cellranger CELLRANGER_EXE]
                                  [--10x_chemistry {ARC-v1,SC3Pv1,SC3Pv2,SC3Pv3,SC3Pv3HT,SC3Pv3LT,SC3Pv4,SC5P-PE,SC5P-PE-v3,SC5P-R2,SC5P-R2-v3,auto,fiveprime,threeprime}]
                                  [--10x_force_cells N_CELLS]
                                  [--10x_extra_projects PROJECT_DIRS]
                                  [--10x_transcriptome ORGANISM=REFERENCE]
                                  [--10x_premrna_reference ORGANISM=REFERENCE]
                                  [--report HTML_FILE] [--enable-conda {yes,no}]
                                  [--conda-env-dir CONDA_ENV_DIR] [-c NCORES]
                                  [-j NJOBS] [-b NBATCHES] [--verbose]
                                  [--work-dir WORKING_DIR] [--runner RUNNER]
                                  [--debug]
                                  [ANALYSIS_DIR]
    
    Run QC procedures for sequencing projects in ANALYSIS_DIR.
    
    positional arguments:
      ANALYSIS_DIR          auto_process analysis directory (optional: defaults to
                            the current directory)
    
    optional arguments:
      -h, --help            show this help message and exit
      --version             show program's version number and exit
      --projects PROJECT_PATTERN
                            simple wildcard-based pattern specifying a subset of
                            projects and samples to run the QC on. PROJECT_PATTERN
                            should be of the form 'pname[/sname]', where 'pname'
                            specifies a project (or set of projects) and 'sname'
                            optionally specifies a sample (or set of samples).
      --qc_dir QC_DIR       explicitly specify QC output directory (nb if supplied
                            then the same QC_DIR will be used for each project.
                            Non-absolute paths are assumed to be relative to the
                            project directory). Default: 'qc'
      --fastq_dir FASTQ_DIR
                            explicitly specify subdirectory of DIR with Fastq
                            files to run the QC on.
    
    QC options:
      --protocol PROJECTNAME=QCPROTOCOL
                            specify QC protocol for project PROJECTNAME (overrides
                            the automatic protocol determination for that project)
      --organism PROJECTNAME=ORGANISM
                            specify organism for QC run for project PROJECTNAME
                            (overrides the organism set for that project)
      --fastq_subset SUBSET
                            specify size of subset of total reads to use for
                            fastq_screen, BAM file generation etc (default 100000,
                            set to 0 to use all reads)
      -t NTHREADS, --threads NTHREADS
                            number of threads to use for QC script (default: taken
                            from job runner)
    
    Cellranger/10xGenomics options:
      --cellranger CELLRANGER_EXE
                            explicitly specify path to Cellranger executable to
                            use for single library analysis (NB will be used for
                            all projects)
      --10x_chemistry {ARC-v1,SC3Pv1,SC3Pv2,SC3Pv3,SC3Pv3HT,SC3Pv3LT,SC3Pv4,SC5P-PE,SC5P-PE-v3,SC5P-R2,SC5P-R2-v3,auto,fiveprime,threeprime}
                            assay configuration for 10xGenomics scRNA-seq; if set
                            to 'auto' (the default) then cellranger will attempt
                            to determine this automatically
      --10x_force_cells N_CELLS
                            force number of cells for 10xGenomics scRNA-seq and
                            scATAC-seq, overriding automatic cell detection
                            algorithms (default is to use built-in cell detection)
      --10x_extra_projects PROJECT_DIRS
                            specify additional projects to include samples from in
                            single library analyses, as comma-separated list
      --10x_transcriptome ORGANISM=REFERENCE
                            specify cellranger transcriptome reference datasets to
                            associate with organisms (overrides references defined
                            in config file)
      --10x_premrna_reference ORGANISM=REFERENCE
                            specify cellranger pre-mRNA reference datasets to
                            associate with organisms (overrides references defined
                            in config file)
    
    Output and reporting:
      --report HTML_FILE    file name for output HTML QC report (default:
                            <QC_DIR>_report.html)
    
    Conda dependency resolution:
      --enable-conda {yes,no}
                            use conda to resolve task dependencies; can be 'yes'
                            or 'no' (default: no)
      --conda-env-dir CONDA_ENV_DIR
                            specify directory for conda environments (default:
                            temporary directory)
    
    Job control options:
      -c NCORES, --maxcores NCORES
                            maximum number of cores available for running jobs
                            (default: no limit)
      -j NJOBS, --maxjobs NJOBS
                            maximum number of jobs to run concurrently (default:
                            12)
      -b NBATCHES, --maxbatches NBATCHES
                            enable dynamic batching of pipeline jobs with maximum
                            number of batches set to NBATCHES (default: no
                            batching)
    
    Advanced/debugging options:
      --verbose             run pipeline in 'verbose' mode
      --work-dir WORKING_DIR
                            specify the working directory for the pipeline
                            operations
      --runner RUNNER       explicitly specify runner definition (e.g.
                            'GEJobRunner(-j y)')
      --debug               Turn on debugging output
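
The ``pname[/sname]`` form of ``PROJECT_PATTERN`` can be illustrated
with Python's standard ``fnmatch`` module. The sketch assumes
shell-style wildcards and case-sensitive matching, which may differ from
the rules actually applied by ``auto_process.py``; the function
``matches_pattern`` and the project and sample names are hypothetical.

.. code-block:: python

    from fnmatch import fnmatchcase

    def matches_pattern(pattern, project, sample):
        """Return True if project/sample matches a 'pname[/sname]' pattern."""
        pname, _, sname = pattern.partition("/")
        if not fnmatchcase(project, pname):
            return False
        # Without a sample part, every sample in the project matches
        return fnmatchcase(sample, sname) if sname else True

    print(matches_pattern("AB*", "AB_RNAseq", "AB1"))      # True
    print(matches_pattern("*/AB1*", "CD_ChIP", "AB10"))    # True
    print(matches_pattern("AB_RNAseq/CD*", "AB_RNAseq", "AB1"))  # False
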
    
.. _commands_publish_qc:

publish_qc
**********

::

    usage: auto_process.py publish_qc [-h] [--version] [--qc_dir QC_DIR]
                                      [--use-hierarchy {yes,no}] [--url BASE_URL]
                                      [--projects PROJECT_PATTERN]
                                      [--ignore-missing-qc]
                                      [--exclude-zip-files {yes,no}]
                                      [--regenerate-reports] [--force]
                                      [--suppress-warnings] [--legacy]
                                      [--runner RUNNER] [--debug]
                                      [ANALYSIS_DIR]
    
    Copy QC reports from ANALYSIS_DIR to local or remote directory (e.g. web
    server). By default existing QC reports will be copied without further
    checking; if no report is found then QC results will be verified and a report
    generated first.
    
    positional arguments:
      ANALYSIS_DIR          auto_process analysis directory (optional: defaults to
                            the current directory)
    
    optional arguments:
      -h, --help            show this help message and exit
      --version             show program's version number and exit
    
    Destination options:
      --qc_dir QC_DIR       specify target directory to copy QC reports to. QC_DIR
                            can be a local directory, or a remote location in the
                            form '[[user@]host:]directory'. Overrides the default
                            settings.
      --use-hierarchy {yes,no}
                            use YEAR/PLATFORM hierarchy under QC_DIR; can be 'yes'
                            or 'no' (default: no)
      --url BASE_URL        specify the 'base' URL for accessing the published
                            reports. Overrides the default settings
    
    Projects and data options:
      --projects PROJECT_PATTERN
                            simple wildcard-based pattern specifying a subset of
                            projects and samples to publish the QC for.
                            PROJECT_PATTERN can specify a single project, or a set
                            of projects.
      --ignore-missing-qc   skip projects where QC results are missing or can't be
                            verified, or where reports can't be generated.
      --exclude-zip-files {yes,no}
                            exclude ZIP archives from publication; can be 'yes' or
                            'no' (default: no)
    
    QC reporting options:
      --regenerate-reports  attempt to regenerate existing QC reports
      --force               force generation of QC reports for all projects even
                            if verification has failed
      --suppress-warnings   don't include warning messages in (re)generated QC
                            reports or top level index even if there are missing
                            metrics in individual QC reports (NB won't be applied
                            for pre-existing reports; combine with
                            --regenerate-reports and --force to update all
                            reports)
      --legacy              legacy mode: include links to MultiQC and 'cellranger
                            count' reports in the top-level index page
    
    Advanced/debugging options:
      --runner RUNNER       explicitly specify runner definition (e.g.
                            'GEJobRunner(-j y)')
      --debug               Turn on debugging output
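
The ``--qc_dir`` option accepts either a local directory or a remote
location of the form ``[[user@]host:]directory``. As an illustration of how
such a string breaks down, here is a minimal sketch; the ``Location`` type
and ``split_location`` function are hypothetical names invented for this
example and are not part of ``auto_process.py``. It assumes that a local
path never contains a colon.

.. code-block:: python

    from typing import NamedTuple, Optional


    class Location(NamedTuple):
        """Parts of a '[[user@]host:]directory' string (hypothetical helper)."""
        user: Optional[str]
        host: Optional[str]
        path: str


    def split_location(spec: str) -> Location:
        """Split SPEC into user, host and directory components."""
        host_part, colon, path = spec.partition(":")
        if not colon:
            # No colon present: treat the whole string as a local directory
            return Location(None, None, spec)
        # Anything before an '@' is the user name; the rest is the host
        user, at, host = host_part.rpartition("@")
        return Location(user if at else None, host, path)

With this sketch, ``split_location("jane@web.example.org:/srv/qc")`` yields
the user ``jane``, the host ``web.example.org`` and the path ``/srv/qc``,
while ``split_location("/data/qc")`` yields only a path.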
    
.. _commands_archive:

archive
*******

::

    usage: auto_process.py archive [-h] [--version] [--archive_dir ARCHIVE_DIR]
                                   [--platform PLATFORM] [--year YEAR]
                                   [--group GROUP] [--chmod PERMISSIONS]
                                   [--logging_file LOGGING_FILE] [--final]
                                   [--force] [--runner RUNNER] [--dry-run]
                                   [--debug]
                                   [ANALYSIS_DIR]
    
    Copy sequencing analysis data directory ANALYSIS_DIR to 'archive' destination.
    
    positional arguments:
      ANALYSIS_DIR          auto_process analysis directory (optional: defaults to
                            the current directory)
    
    optional arguments:
      -h, --help            show this help message and exit
      --version             show program's version number and exit
      --archive_dir ARCHIVE_DIR
                            specify top-level archive directory to copy data
                            under. ARCHIVE_DIR can be a local directory, or a
                            remote location in the form '[[user@]host:]directory'.
                            Overrides the default settings.
      --platform PLATFORM   specify the platform e.g. 'hiseq', 'miseq' etc
                            (overrides automatically determined platform, if any).
                            Use 'other' for cases where the platform is unknown.
      --year YEAR           specify the year e.g. '2014' (default is the current
                            year)
      --group GROUP         specify the name of group for the archived files
                            (default: None)
      --chmod PERMISSIONS   specify permissions for the archived files.
                            PERMISSIONS should be a string recognised by the
                            'chmod' command (e.g. 'o-rwX') (default: None)
      --logging_file LOGGING_FILE
                            log run details to LOGGING_FILE on final archive
                            (default: None)
      --final               copy data to final archive location (default is to
                            copy to staging area)
      --force               attempt to complete archiving operations ignoring any
                            errors (e.g. key metadata items not set, unable to set
                            group etc)
      --runner RUNNER       explicitly specify runner definition (e.g.
                            'GEJobRunner(-j y)')
      --dry-run             Dry run i.e. report what would be done but don't
                            perform any actions
      --debug               Turn on debugging output
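
Because ``archive`` copies to a staging area by default and only writes to
the final location when ``--final`` is given, a cautious workflow is to
preview each step with ``--dry-run`` first. The following is a sketch of
assembling those command lines from a script; ``archive_command`` is a
hypothetical helper and the paths are placeholders. It only uses options
listed in the help text above.

.. code-block:: python

    import shlex


    def archive_command(analysis_dir, final=False, dry_run=True,
                        archive_dir=None):
        """Build the argument list for 'auto_process.py archive'."""
        cmd = ["auto_process.py", "archive"]
        if archive_dir is not None:
            # Override the default archive location from the configuration
            cmd += ["--archive_dir", archive_dir]
        if final:
            # Copy to the final location rather than the staging area
            cmd.append("--final")
        if dry_run:
            # Report what would be done without performing any actions
            cmd.append("--dry-run")
        cmd.append(analysis_dir)
        return cmd


    # Preview the staging copy, then the final copy (placeholder path)
    for final in (False, True):
        cmd = archive_command("/data/run_analysis", final=final)
        print(" ".join(shlex.quote(arg) for arg in cmd))

The resulting lists can be passed to ``subprocess.run`` once the dry-run
output looks correct, with ``dry_run=False``.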
    
.. _commands_report:

report
******

::

    usage: auto_process.py report [-h] [--version]
                                  [--logging | --summary | --projects]
                                  [--fields FIELDS] [--template TEMPLATE]
                                  [--file OUT_FILE] [--debug]
                                  [ANALYSIS_DIR]
    
    Report information on analysis in ANALYSIS_DIR.
    
    positional arguments:
      ANALYSIS_DIR         auto_process analysis directory (optional: defaults to
                           the current directory)
    
    optional arguments:
      -h, --help           show this help message and exit
      --version            show program's version number and exit
      --logging            print short report suitable for logging file
      --summary            print full report suitable for bioinformaticians
      --projects           print tab-delimited line (one per project) suitable for
                           injection into a spreadsheet
      --fields FIELDS      fields to report
      --template TEMPLATE  name of template with fields to report (templates
                           should be defined in the config file)
      --file OUT_FILE      write report to OUT_FILE; destination can be a local
                           file, or a remote file specified as [[USER@]HOST:]PATH
                           (default is to write to stdout)
      --debug              Turn on debugging output
    
.. _commands_samplesheet:

samplesheet
***********

::

    usage: auto_process.py samplesheet [-h] [--version]
                                       [--use SAMPLE_SHEET | --set-project [LANES:][COL=PATTERN:]NEW_PROJECT | --set-sample-id [LANES:][COL=PATTERN:]NEW_ID | --set-sample-name NEW_NAME | -i SAMPLE_SHEET | -e | -p | -s]
                                       [--debug]
                                       [ANALYSIS_DIR]
    
    Query and manipulate sample sheets
    
    positional arguments:
      ANALYSIS_DIR          auto_process analysis directory (optional: defaults to
                            the current directory)
    
    optional arguments:
      -h, --help            show this help message and exit
      --version             show program's version number and exit
      --use SAMPLE_SHEET    update the default sample sheet file to SAMPLE_SHEET
                            (must be a file on the local file system)
      --set-project [LANES:][COL=PATTERN:]NEW_PROJECT
                            update the sample project field. Optional LANES
                            specifies one or more lanes (e.g. '1', '1,2,3', '1-3',
                            '1,3-5') to update; optional COL=PATTERN specifies a
                            glob-style pattern to match to an arbitrary column
                            (e.g. 'Sample_Name=ITS*'); NEW_PROJECT is the new
                            project name
      --set-sample-id [LANES:][COL=PATTERN:]NEW_ID
                            update the sample ID field. Optional LANES specifies
                            one or more lanes (e.g. '1', '1,2,3', '1-3', '1,3-5')
                            to update; optional COL=PATTERN specifies a glob-style
                            pattern to match to an arbitrary column (e.g.
                            'Sample_Name=ITS*'); NEW_ID can be either
                            'SAMPLE_NAME' or an arbitrary string
      --set-sample-name NEW_NAME
                            update the sample name field. Optional LANES specifies
                            one or more lanes (e.g. '1', '1,2,3', '1-3', '1,3-5')
                            to update; optional COL=PATTERN specifies a glob-style
                            pattern to match to an arbitrary column (e.g.
                            'Sample_Name=ITS*'); NEW_NAME can be either
                            'SAMPLE_ID' or an arbitrary string
      -i SAMPLE_SHEET, --import SAMPLE_SHEET
                            replace existing sample sheet file with version copied
                            from the specified location; SAMPLE_SHEET can be a
                            local or remote file, or a URL
      -e, --edit            bring up sample sheet file in an editor to make
                            changes manually
      -p, --predict         show predicted outputs from sample sheet
      -s, --summarise       summarise predicted outputs from sample sheet
    
    Advanced options:
      --debug               Turn on debugging output
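
The ``[LANES:][COL=PATTERN:]NEW_PROJECT`` syntax combines a lane list, a
glob-style column match and the new value. To make the syntax concrete,
here is a sketch of how such a specification could be interpreted. The
functions ``parse_lanes`` and ``parse_spec`` are hypothetical and are not
taken from the ``auto_process.py`` source; the sketch also assumes that
the new value contains no colon.

.. code-block:: python

    from fnmatch import fnmatchcase


    def parse_lanes(lanes):
        """Expand a lane expression such as '1,3-5' into a list of ints."""
        result = []
        for item in lanes.split(","):
            if "-" in item:
                # A range such as '3-5' includes both end points
                start, end = item.split("-")
                result.extend(range(int(start), int(end) + 1))
            else:
                result.append(int(item))
        return result


    def parse_spec(spec):
        """Split '[LANES:][COL=PATTERN:]VALUE' into its four parts."""
        *selectors, value = spec.split(":")
        lanes, column, pattern = None, None, None
        for selector in selectors:
            if "=" in selector:
                column, pattern = selector.split("=", 1)
            else:
                lanes = parse_lanes(selector)
        return lanes, column, pattern, value


    lanes, column, pattern, value = parse_spec("1,3-5:Sample_Name=ITS*:NewProject")
    # A sample sheet line would be updated only if both selectors match it
    matches = fnmatchcase("ITS_01", pattern)

Here ``lanes`` is ``[1, 3, 4, 5]``, ``column`` is ``Sample_Name`` and
``matches`` is true, because ``ITS_01`` matches the pattern ``ITS*``.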
    
.. _commands_update:

update
******

::

    usage: auto_process.py update [-h] [--version] [--paths] [--project-metadata]
                                  [--project-dirs] [--qc-reports] [--debug]
                                  [ANALYSIS_DIR]
    
    Update paths and metadata across ANALYSIS_DIR and its projects and QC outputs
    when directory has been moved or copied, or project metadata has been updated.
    
    positional arguments:
      ANALYSIS_DIR        existing auto_process analysis directory to update
                          (optional: defaults to the current directory)
    
    optional arguments:
      -h, --help          show this help message and exit
      --version           show program's version number and exit
      --paths             update paths stored in the metadata and parameter files
                          for ANALYSIS_DIR
      --project-metadata  propagate modified metadata in 'projects.info' to
                          project directories in ANALYSIS_DIR
      --project-dirs      synchronise project entries in 'projects.info' with
                          project directories within ANALYSIS_DIR
      --qc-reports        regenerate QC reports for projects where metadata has
                          been updated
      --debug             Turn on debugging output
    
.. _commands_merge_fastq_dirs:

merge_fastq_dirs
****************

::

    usage: auto_process.py merge_fastq_dirs [-h] [--version]
                                            [--primary-unaligned-dir UNALIGNED_DIR]
                                            [--output-dir OUTPUT_DIR] [--dry-run]
                                            [--debug]
                                            [ANALYSIS_DIR]
    
    Automatically merge fastq directories from multiple bcl-to-fastq runs within
    ANALYSIS_DIR. Use this command if 'make_fastqs' step was run multiple times to
    process subsets of lanes.
    
    positional arguments:
      ANALYSIS_DIR          auto_process analysis directory (optional: defaults to
                            the current directory)
    
    optional arguments:
      -h, --help            show this help message and exit
      --version             show program's version number and exit
      --primary-unaligned-dir UNALIGNED_DIR
                            merge fastqs from additional bcl-to-fastq directories
                            into UNALIGNED_DIR. Original data will be moved out of
                            the way first. Defaults to 'bcl2fastq'.
      --output-dir OUTPUT_DIR
                            merge fastqs into OUTPUT_DIR (relative to
                            ANALYSIS_DIR). Defaults to UNALIGNED_DIR.
      --dry-run             Dry run i.e. report what would be done but don't
                            perform any actions
      --debug               Turn on debugging output
    
.. _commands_update_fastq_stats:

update_fastq_stats
******************

::

    usage: auto_process.py update_fastq_stats [-h] [--version]
                                              [--unaligned-dir UNALIGNED_DIR]
                                              [--sample-sheet SAMPLE_SHEET]
                                              [--id NAME]
                                              [--stats-file STATS_FILE]
                                              [--per-lane-stats-file PER_LANE_STATS_FILE]
                                              [-a] [--force]
                                              [--nprocessors NPROCESSORS]
                                              [--runner RUNNER] [--debug]
                                              [ANALYSIS_DIR]
    
    (Re)generate statistics for fastq files produced from 'make_fastqs'.
    
    positional arguments:
      ANALYSIS_DIR          auto_process analysis directory (optional: defaults to
                            the current directory)
    
    optional arguments:
      -h, --help            show this help message and exit
      --version             show program's version number and exit
      --unaligned-dir UNALIGNED_DIR
                            explicitly set the (sub)directory with bcl-to-fastq
                            outputs
      --sample-sheet SAMPLE_SHEET
                            explicitly specify the sample sheet to use (defaults
                            to the sample sheet stored in the analysis directory
                            parameters)
      --id NAME             specify an identifier to be written into the output
                            statistics file name (e.g. 'statistics.NAME.info')
      --stats-file STATS_FILE
                            specify output file for fastq statistics
      --per-lane-stats-file PER_LANE_STATS_FILE
                            specify output file for per-lane statistics
      -a, --add             add new data from UNALIGNED_DIR to existing statistics
      --force               force statistics to be regenerated even if existing
                            statistics files are newer than fastqs
      --nprocessors NPROCESSORS
                            explicitly specify number of processors/cores to use
                            (default taken from job runner)
      --runner RUNNER       explicitly specify runner definition (e.g.
                            'GEJobRunner(-j y)')
      --debug               Turn on debugging output
    
.. _commands_import_project:

import_project
**************

::

    usage: auto_process.py import_project [-h] [--version] [--debug]
                                          [--comment COMMENT]
                                          [ANALYSIS_DIR] PROJECT_DIR
    
    Copy a project directory PROJECT_DIR from another analysis directory into
    ANALYSIS_DIR, update metadata appropriately, and regenerate QC reports.
    
    positional arguments:
      ANALYSIS_DIR       auto_process analysis directory (optional: defaults to
                         the current directory)
      PROJECT_DIR        path to project directory to import
    
    optional arguments:
      -h, --help         show this help message and exit
      --version          show program's version number and exit
      --debug            Turn on debugging output
      --comment COMMENT  specify comment text to be appended to the stored
                         comments associated with the project
    
.. _commands_config:

config
******

::

    usage: auto_process.py config [-h] [--version] [--debug]
                                  [--init | --set KEY_VALUE | --add NEW_SECTION]
                                  [--raw] [--show]
    
    Query and change global configuration. Run without any options to display
    configuration settings.
    
    optional arguments:
      -h, --help         show this help message and exit
      --version          show program's version number and exit
      --debug            Turn on debugging output
    
    Creation and edit options:
      --init             Create a new default configuration file based on the
                         sample template.
      --set KEY_VALUE    Set the value of a parameter. KEY_VALUE should be of the
                         form '<param>=<value>' (<param> should be of the form
                         'SECTION[:SUBSECTION].NAME'). Multiple --set options can
                         be specified.
      --add NEW_SECTION  Add a new section called NEW_SECTION to the config (to
                         add e.g. a new platform, use 'platform:NAME'). Multiple
                         --add options can be specified.
    
    Display options:
      --raw              Show the 'raw' configuration (i.e. only parameters and
                         values explicitly defined in the config before defaults
                         are loaded)
    
    Deprecated/defunct options:
      --show             Show the values of parameters and settings (does nothing;
                         use 'config' with no options to display settings)
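
The ``--set`` option expects ``KEY_VALUE`` in the form ``<param>=<value>``,
where ``<param>`` is ``SECTION[:SUBSECTION].NAME``. The sketch below shows
one way to break such a string into its components. ``parse_key_value`` is
a hypothetical function, and the section and parameter names in the usage
note are made up for illustration rather than taken from the real
configuration.

.. code-block:: python

    def parse_key_value(key_value):
        """Split 'SECTION[:SUBSECTION].NAME=VALUE' into four components."""
        param, equals, value = key_value.partition("=")
        if not equals:
            raise ValueError("expected '<param>=<value>': %r" % key_value)
        # The parameter name follows the last '.' in <param>
        section_part, dot, name = param.rpartition(".")
        if not dot:
            raise ValueError("expected 'SECTION.NAME': %r" % param)
        # An optional subsection is separated from the section by ':'
        section, _, subsection = section_part.partition(":")
        return section, subsection or None, name, value

For example, ``parse_key_value("platform:miseq.example_setting=2")`` returns
``("platform", "miseq", "example_setting", "2")``, and a key without a
subsection such as ``general.example_setting=2`` returns ``None`` in the
second position.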
    
.. _commands_params:

params
******

::

    usage: auto_process.py params [-h] [--version] [--set KEY_VALUE] [--debug]
                                  [ANALYSIS_DIR]
    
    Query and change processing parameters and settings for ANALYSIS_DIR.
    
    positional arguments:
      ANALYSIS_DIR     auto_process analysis directory (optional: defaults to the
                       current directory)
    
    optional arguments:
      -h, --help       show this help message and exit
      --version        show program's version number and exit
      --set KEY_VALUE  Set the value of a parameter. KEY_VALUE should be of the
                       form '<param>=<value>'. Multiple --set options can be
                       specified.
      --debug          Turn on debugging output
    
.. _commands_metadata:

metadata
********

::

    usage: auto_process.py metadata [-h] [--version] [--set [PROJECT:]PARAM=VALUE]
                                    [--update] [--debug]
                                    [ANALYSIS_DIR]
    
    Query and change metadata associated with ANALYSIS_DIR.
    
    positional arguments:
      ANALYSIS_DIR          auto_process analysis directory (optional: defaults to
                            the current directory)
    
    optional arguments:
      -h, --help            show this help message and exit
      --version             show program's version number and exit
      --set [PROJECT:]PARAM=VALUE
                            Set metadata item PARAM to VALUE for ANALYSIS_DIR, or
                            if the PROJECT identifier is also specified, then for
                            item PARAM in that project. Multiple --set options can
                            be specified.
      --update              Automatically update metadata items where possible
                            (e.g. for older analyses which have old or missing
                            metadata files)
      --debug               Turn on debugging output
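
The ``--set`` argument takes the form ``[PROJECT:]PARAM=VALUE``: without a
``PROJECT`` prefix the item applies to the analysis directory, and with one
it applies to that project. A minimal sketch of that distinction follows.
``parse_metadata_item`` is a hypothetical function, and the item and project
names in the usage note are illustrative only.

.. code-block:: python

    def parse_metadata_item(item):
        """Split '[PROJECT:]PARAM=VALUE' into (project, param, value)."""
        target, equals, value = item.partition("=")
        if not equals:
            raise ValueError("expected '[PROJECT:]PARAM=VALUE': %r" % item)
        # Split on '=' first so that a ':' inside VALUE is left untouched
        project, colon, param = target.rpartition(":")
        return (project if colon else None), param, value

For example, ``parse_metadata_item("example_item=42")`` returns
``(None, "example_item", "42")``, meaning the item is set on the analysis
directory, while ``parse_metadata_item("ProjectA:example_item=42")`` returns
``("ProjectA", "example_item", "42")``.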
    
.. _commands_readme:

readme
******

::

    usage: auto_process.py readme [-h] [--version] [--init] [-V] [-e] [-m MESSAGE]
                                  [--debug]
                                  [ANALYSIS_DIR]
    
    Add or amend a README file in the analysis directory ANALYSIS_DIR.
    
    positional arguments:
      ANALYSIS_DIR          auto_process analysis directory (optional: defaults to
                            the current directory)
    
    optional arguments:
      -h, --help            show this help message and exit
      --version             show program's version number and exit
      --init                create a new README file
      -V, --view            display the contents of the README file
      -e, --edit            bring up README file in an editor to make changes
      -m MESSAGE, --message MESSAGE
                            append MESSAGE text to the README file
      --debug               Turn on debugging output
    
.. _commands_clone:

clone
*****

::

    usage: auto_process.py clone [-h] [--version] [--copy-fastqs]
                                 [--exclude-projects] [--debug]
                                 [ANALYSIS_DIR] CLONE_DIR
    
    Make a copy of an existing analysis directory ANALYSIS_DIR in a new directory
    CLONE_DIR.
    
    positional arguments:
      ANALYSIS_DIR        existing auto_process analysis directory to clone
                          (optional: defaults to the current directory)
      CLONE_DIR           path to cloned directory
    
    optional arguments:
      -h, --help          show this help message and exit
      --version           show program's version number and exit
      --copy-fastqs       Copy fastq.gz files from ANALYSIS_DIR into CLONE_DIR
                          (default is to make a link to the bcl-to-fastq
                          directory)
      --exclude-projects  Exclude (i.e. don't copy) project directories from
                          ANALYSIS_DIR
      --debug             Turn on debugging output