

This documentation has been auto-generated from the command help

In addition to the main auto_process.py command, a number of utilities are available:


    analyse_barcodes.py FASTQ [FASTQ...]
    analyse_barcodes.py DIR
    analyse_barcodes.py -c COUNTS_FILE [COUNTS_FILE...]

Collate and report counts and statistics for Fastq index sequences (aka
barcodes). If multiple Fastq files are supplied then sequences will be pooled
before being analysed. If a single directory is supplied then this will be
assumed to be an output directory from bcl2fastq and files will be processed
on a per-lane basis. If the -c option is supplied then the input must be one
or more files of barcode counts generated previously using the -o option.

optional arguments:
  -h, --help            show this help message and exit
  -v, --version         show program's version number and exit

Input and output options:
  -c, --counts          input is one or more counts files generated by
                        previous runs using the '-o/--output' option
  -o COUNTS_FILE_OUT, --output COUNTS_FILE_OUT
                        output all counts to tab-delimited file
                        COUNTS_FILE_OUT. This can be used again in another run
                        by specifying the '-c' option

Reporting options:
  -l LANES, --lanes LANES
                        restrict analysis to the specified lane numbers
                        (default is to process all lanes). Multiple lanes can
                        be specified using ranges (e.g. '2-3'), comma-
                        separated list ('5,7') or a mixture ('2-3,5,7')
  -m MISMATCHES, --mismatches MISMATCHES
                        maximum number of mismatches to use when grouping
                        similar barcodes (will be determined automatically if
                        samplesheet is supplied, otherwise defaults to 0)
  --cutoff CUTOFF       exclude barcodes/barcode groups from reporting with a
                        smaller fraction of associated reads than CUTOFF, e.g.
                        '0.01' excludes barcodes with < 1.0% of reads
                        (default: 0.001)
  -s SAMPLE_SHEET, --sample-sheet SAMPLE_SHEET
                        report best matches against barcodes in SAMPLE_SHEET
  --report REPORT_FILE
                        write report to REPORT_FILE (otherwise write to
  -x XLS_FILE, --xls XLS_FILE
                        write XLS version of report to XLS_FILE
  -f HTML_FILE, --html HTML_FILE
                        write HTML version of report to HTML_FILE
  -t TITLE, --title TITLE
                        title for HTML report (default: 'Barcodes Report')
  -n, --no-report       suppress reporting (overrides --report)

Advanced options:
  --minimum_read_fraction FRACTION
                        weed out individual barcodes from initial analysis
                        which have a smaller fraction of reads than FRACTION,
                        e.g. '0.001' removes barcodes with < 0.1% of reads;
                        speeds up analysis at the expense of accuracy as
                        reported counts will be approximate (default: 1.0e-6)
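
The lane expressions accepted by -l/--lanes (ranges, comma-separated lists, or
a mixture) can be expanded with a few lines of Python. The following is a
minimal sketch only; the function name parse_lanes is hypothetical and is not
taken from the analyse_barcodes.py source.

```python
def parse_lanes(spec):
    """Expand a lane expression such as '2-3,5,7' into a sorted list of ints.

    Hypothetical helper for illustration; not the utility's own parser.
    """
    lanes = set()
    for item in spec.split(","):
        item = item.strip()
        if not item:
            continue
        if "-" in item:
            # A range such as '2-3' is inclusive at both ends
            start, end = (int(x) for x in item.split("-", 1))
            lanes.update(range(start, end + 1))
        else:
            lanes.add(int(item))
    return sorted(lanes)


print(parse_lanes("2-3,5,7"))  # [2, 3, 5, 7]
```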


usage: assign_barcodes.py [-h] [-n N] INPUT.fq OUTPUT.fq

Extract arbitrary sequence fragments from reads in INPUT.fq FASTQ file and
assign these as the index (barcode) sequences in the read headers in
OUTPUT.fq.

positional arguments:
  INPUT.fq    Input FASTQ file
  OUTPUT.fq   Output FASTQ file

optional arguments:
  -h, --help  show this help message and exit
  -n N        remove first N bases from each read and assign these as barcode
              index sequence (default: 5)
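
The effect of the -n option on a single read can be sketched as follows. This
is an illustration under stated assumptions, not the utility's implementation:
it assumes an Illumina-style header whose final ':'-separated field holds the
index sequence, and that the quality string is trimmed along with the bases.
The function name assign_barcode is hypothetical.

```python
def assign_barcode(header, seq, qual, n=5):
    """Move the first n bases of a read into the index field of its header.

    Assumption: the header ends with ':<INDEX>' (Illumina-style), and the
    quality string is trimmed by the same number of positions as the read.
    """
    barcode = seq[:n]
    # Drop whatever follows the last ':' and substitute the extracted bases
    prefix, _, _ = header.rpartition(":")
    return f"{prefix}:{barcode}", seq[n:], qual[n:]


header, seq, qual = assign_barcode(
    "@M1:1:FC:1:1101:100:200 1:N:0:NNNN", "ACGTTGGCA", "IIIIIHHHH")
print(header)  # @M1:1:FC:1:1101:100:200 1:N:0:ACGTT
print(seq)     # GGCA
```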


usage: audit_projects.py [-h] [--pi PI_NAME] [--unassigned] [DIR [DIR ...]]

Summarise the disk usage for runs that have been processed using auto_process.
The supplied DIRs are directories holding one or more top-level analysis
directories corresponding to different runs. The program reports total disk
usage for projects assigned to each PI across all DIRs.

positional arguments:
  DIR           directory to search for analysis directories for auditing

optional arguments:
  -h, --help    show this help message and exit
  --pi PI_NAME  list data for PI(s) matching PI_NAME (can use glob-style
  --unassigned  list data for projects where PI is not assigned


usage: build_index.py [-h] [-v] [-o OUT_DIR] [--ebwt_base NAME]
                      [--bt2_base NAME] [--overhang N] [-V VERSION]
                      [-r RUNNER]
                      ALIGNER FASTA [ANNOTATION]

Generate indexes for aligners

positional arguments:
  ALIGNER               aligner to build index for (one of 'bowtie',
                        'bowtie2', 'star')
  FASTA                 FASTA file with sequence
  ANNOTATION            annotation file (for use with STAR)

optional arguments:
  -h, --help            show this help message and exit
  -v, --version         show program's version number and exit
  -o OUT_DIR            output directory for indexes

Bowtie-specific options:
  --ebwt_base NAME      specify basename for output .ebwt files (defaults to
                        FASTA file basename)

Bowtie2-specific options:
  --bt2_base NAME       specify basename for output .bt2 files (defaults to
                        FASTA file basename)

STAR-specific options:
  --overhang N          set value for STAR --sjdbOverhang option (default:

Advanced options:
  -V VERSION, --aligner-version VERSION
                        specify the version of the aligner to target (only
                        works if conda dependency resolution is configured)
  -r RUNNER, --runner RUNNER
                        explicitly specify runner definition for building the
                        index. RUNNER must be a valid job runner specification
                        e.g. 'GEJobRunner(-pe smp.pe 8)' (default: use
                        appropriate runner from configuration)


usage: concat_fastqs.py [-h] [--version] [-v] FASTQ [FASTQ ...] FASTQ_OUT

Concatenate reads from one or more input Fastq files into a single new file

positional arguments:
  FASTQ          Input FASTQ to concatenate
  FASTQ_OUT      Output FASTQ with concatenated reads

optional arguments:
  -h, --help     show this help message and exit
  --version      show program's version number and exit
  -v, --verbose  verbose output


    barcode_splitter.py [OPTIONS] FASTQ [FASTQ ...]
    barcode_splitter.py [OPTIONS] FASTQ_R1,FASTQ_R2 [FASTQ_R1,FASTQ_R2 ...]
    barcode_splitter.py [OPTIONS] DIR

Split reads from one or more input Fastq files into new Fastqs based on
matching supplied barcodes.

optional arguments:
  -h, --help            show this help message and exit
  --version             show program's version number and exit
  -b INDEX_SEQ, --barcode INDEX_SEQ
                        specify index sequence to filter using
  -m N_MISMATCHES, --mismatches N_MISMATCHES
                        maximum number of differing bases to allow for two
                        index sequences to count as a match. Default is zero
                        (i.e. exact matches only)
  -n BASE_NAME, --name BASE_NAME
                        basename to use for output files
  -o OUT_DIR, --output-dir OUT_DIR
                        specify directory for output split Fastqs
                        specify subdirectory with outputs from bcl2fastq
  -l LANE, --lane LANE  specify lane to collect and split Fastqs for
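
The -m/--mismatches option counts differing bases between two index
sequences. A minimal sketch of that comparison is given below; it assumes a
simple position-by-position (Hamming) comparison of equal-length sequences,
which may differ from how barcode_splitter.py handles edge cases. The function
name barcodes_match is hypothetical.

```python
def barcodes_match(barcode, index_seq, max_mismatches=0):
    """Return True if two index sequences differ at no more than
    max_mismatches positions (sequences of unequal length never match).

    Illustrative assumption: a plain Hamming-distance comparison.
    """
    if len(barcode) != len(index_seq):
        return False
    mismatches = sum(1 for a, b in zip(barcode, index_seq) if a != b)
    return mismatches <= max_mismatches


print(barcodes_match("ACGT", "ACGA"))     # False (exact matches only)
print(barcodes_match("ACGT", "ACGA", 1))  # True
```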


usage: download_fastqs.py [-h] URL [DIR]

Download checksum file and fastqs from URL into current directory (or
directory DIR, if specified), and verify the downloaded files against the
checksum file.

positional arguments:
  URL         URL with checksum file and fastqs
  DIR         directory to put downloaded fastqs into (defaults to current
              directory)

optional arguments:
  -h, --help  show this help message and exit
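
Verifying downloaded files against a checksum file can be sketched as below.
This is a hedged illustration only: it assumes MD5 checksums in the common
'md5sum' layout (hex digest, whitespace, file name per line), which the help
text above does not confirm. The names md5sum and verify are hypothetical.

```python
import hashlib
import os


def md5sum(path, chunk_size=1024 * 1024):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fp:
        for chunk in iter(lambda: fp.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(checksum_file, directory="."):
    """Return the names of files whose checksum does not match.

    Assumption: each non-blank line is '<md5 hex digest>  <file name>'.
    """
    failed = []
    with open(checksum_file) as fp:
        for line in fp:
            if not line.strip():
                continue
            expected, name = line.split(None, 1)
            # md5sum marks binary-mode entries with a leading '*'
            name = name.strip().lstrip("*")
            if md5sum(os.path.join(directory, name)) != expected:
                failed.append(name)
    return failed
```

An empty list returned from verify() indicates that every listed file matched
its recorded checksum.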


usage: demultiplex_icell8_atac.py [-h] [-v] [-o OUTDIR] [-b N] [-n N]
                                  [-m {samples,barcodes}] [--unassigned NAME]
                                  [--reverse-complement {i1,i2,both}] [-u]
                                  WELL_LIST FASTQ_R1 FASTQ_R2 FASTQ_I1 FASTQ_I2

Assign reads from ICELL8 ATAC R1/R2/I1/I2 Fastq set to barcodes and samples in
a well list file

positional arguments:
  WELL_LIST             Well list file
  FASTQ_R1              FASTQ R1
  FASTQ_R2              FASTQ R2
  FASTQ_I1              FASTQ I1
  FASTQ_I2              FASTQ I2

optional arguments:
  -h, --help            show this help message and exit
  -v, --version         show program's version number and exit
  -o OUTDIR, --output-dir OUTDIR
                        path to demultiplexed output
  -b N, --batch_size N  batch size for splitting index read Fastqs (default:
  -n N, --nprocessors N
                        number of processors to use
  -m {samples,barcodes}, --mode {samples,barcodes}
                        demultiplex reads by sample (default) or by barcode
  --unassigned NAME     basename for output Fastqs with reads which cannot be
                        assigned to any sample or barcode (default:
  --swap-i1-i2          swap supplied I1 and I2 Fastqs
  --reverse-complement {i1,i2,both}
                        reverse complement one or both of the indices from the
                        well list file
  -u, --update-read-headers
                        update read headers in the output Fastqs to include
                        the matching index sequence (i.e. barcode) from the
                        well list file
  --no-demultiplexing   don't generate demultiplexed Fastqs (only the stats)
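
The --reverse-complement option operates on index sequences from the well
list. For reference, reverse complementing a DNA sequence can be sketched as
below; the function name is hypothetical and the handling of 'N' (left
unchanged) is an assumption for this example.

```python
# Translation table mapping each base to its complement ('N' stays 'N')
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (upper-cased)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


print(reverse_complement("AACGT"))  # ACGTT
```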


usage: fastq_statistics.py [-h] [-v] [--unaligned UNALIGNED_DIR]
                           [--sample-sheet SAMPLE_SHEET] [-o STATS_FILE]
                           [-p PER_LANE_STATS_FILE]
                           [-s PER_LANE_SAMPLE_STATS_FILE]
                           [-f FULL_STATS_FILE] [-u] [-n N] [--debug]
                           ILLUMINA_RUN_DIR

Generate statistics for FASTQ files in ILLUMINA_RUN_DIR (top-level directory
of a processed Illumina run)

positional arguments:
  ILLUMINA_RUN_DIR      input Illumina run directory

optional arguments:
  -h, --help            show this help message and exit
  -v, --version         show program's version number and exit
  --unaligned UNALIGNED_DIR
                        specify an alternative name for the 'Unaligned'
                        directory containing the fastq.gz files
  --sample-sheet SAMPLE_SHEET
                        specify a sample sheet file to get additional
                        information from
  -o STATS_FILE, --output STATS_FILE
                        name of output file for per-file statistics (default
                        is 'statistics.info')
  -p PER_LANE_STATS_FILE
                        name of output file for per-lane statistics (default
                        is 'per_lane_statistics.info')
  -s PER_LANE_SAMPLE_STATS_FILE
                        name of output file for per-lane sample statistics
                        (default is 'per_lane_sample_stats.info')
  -f FULL_STATS_FILE
                        name of output file for full statistics (default is
  -u, --update          update existing full statistics file with stats for
                        additional files
  -n N, --nprocessors N
                        spread work across N processors/cores (default is 1)
  --debug               turn on debugging output

Deprecated/defunct options:
  --force               does nothing: retained for backwards compatibility


usage: icell8_contamination_filter.py [-h] [-o OUT_DIR] [-m MAMMALIAN_CONF]
                                      [-c CONTAMINANTS_CONF]
                                      [-a {bowtie,bowtie2}] [-n THREADS]
                                      FQ_R1 FQ_R2

positional arguments:
  FQ_R1                 R1 FASTQ file
  FQ_R2                 Matching R2 FASTQ file

optional arguments:
  -h, --help            show this help message and exit
  -o OUT_DIR, --outdir OUT_DIR
                        directory to write output FASTQ files to (default:
                        current directory)
  -m MAMMALIAN_CONF
                        fastq_screen 'conf' file with the mammalian genome
  -c CONTAMINANTS_CONF
                        fastq_screen 'conf' file with the contaminant genome
  -a {bowtie,bowtie2}, --aligner {bowtie,bowtie2}
                        aligner to use with fastq_screen (default: don't
                        specify an aligner)
  -n THREADS, --threads THREADS
                        number of threads to run fastq_screen with (default:


usage: icell8_report.py [-h] [-s STATS_FILE] [-o OUT_FILE] [-n NAME] [DIR]

positional arguments:
  DIR                   directory with ICell8 processing outputs

optional arguments:
  -h, --help            show this help message and exit
  -s STATS_FILE, --stats_file STATS_FILE
                        ICell8 stats file (default:
  -o OUT_FILE, --out_file OUT_FILE
                        Output HTML file (default: 'icell8_processing.html')
  -n NAME, --name NAME  specify a string to append to the zip archive name and


usage: icell8_stats.py [-h] [-w WELL_LIST_FILE] [-u] [-f STATS_FILE] [-a]
                       [-s SUFFIX] [-n NPROCESSORS] [-m MAX_BATCH_SIZE]
                       [-T DIR]
                       [FASTQ_R1 FASTQ_R2 [FASTQ_R1 FASTQ_R2 ...]]

positional arguments:
  FASTQ_R1 FASTQ_R2     FASTQ file pairs

optional arguments:
  -h, --help            show this help message and exit
  -w WELL_LIST_FILE
                        iCell8 'well list' file
  -u, --unassigned      include 'unassigned' reads
  -f STATS_FILE, --stats-file STATS_FILE
                        output statistics file
  -a, --append          append to statistics file
  -s SUFFIX, --suffix SUFFIX
                        suffix to attach to column names
  -n NPROCESSORS
                        number of processors/cores available for statistics
                        generation (default: 1)
  -m MAX_BATCH_SIZE, --max-batch-size MAX_BATCH_SIZE
                        maximum number of reads per batch when dividing Fastqs
                        (multicore only; default: 100000000)
  -T DIR, --temporary-directory DIR
                        use DIR for temporaries, not $TMPDIR or /tmp


    manage_fastqs.py DIR
    manage_fastqs.py DIR PROJECT
    manage_fastqs.py DIR PROJECT copy [[user@]host:]DEST
    manage_fastqs.py DIR PROJECT md5
    manage_fastqs.py DIR PROJECT zip

Fastq management utility. If only DIR is supplied then list the projects; if
PROJECT is supplied then list the fastqs; 'copy' command copies fastqs for the
specified PROJECT to DEST on a local or remote server; 'md5' command generates
checksums for the fastqs; 'zip' command creates a zip file with the fastq

optional arguments:
  -h, --help            show this help message and exit
  -v, --version         show program's version number and exit
  --filter PATTERN      filter file names for reporting and copying based on
  --fastq_dir FASTQ_DIR
                        explicitly specify subdirectory of DIR with Fastq
                        files to run the QC on
  --max_zip_size MAX_ZIP_SIZE
                        for 'zip' command, defines the maximum size for the
                        output zip file; multiple zip files will be created if
                        the data exceeds this limit (default is create a
                        single zip file with no size limit)
  --link                hard link files instead of copying
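
The --max_zip_size behaviour (several archives when the data exceed the
limit) can be pictured as grouping files so that each group stays within the
size limit. The sketch below uses a simple greedy grouping in input order;
this is an assumption for illustration, and the actual strategy used by
manage_fastqs.py may differ. The function name group_by_size is hypothetical.

```python
def group_by_size(files, max_size):
    """Group (name, size) pairs so each group's total size <= max_size.

    Greedy, order-preserving grouping (an illustrative assumption). A single
    file larger than max_size is placed in a group of its own.
    """
    groups, current, total = [], [], 0
    for name, size in files:
        if current and total + size > max_size:
            # Adding this file would exceed the limit: start a new group
            groups.append(current)
            current, total = [], 0
        current.append(name)
        total += size
    if current:
        groups.append(current)
    return groups


print(group_by_size([("a.fq.gz", 4), ("b.fq.gz", 4), ("c.fq.gz", 4)], 8))
# [['a.fq.gz', 'b.fq.gz'], ['c.fq.gz']]
```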


usage: process_icell8.py [-h] [-u UNALIGNED_DIR] [-p NAME] [-o OUTDIR]
                         [-m MAMMALIAN_CONF] [-c CONTAMINANTS_CONF] [-q]
                         [-a {bowtie,bowtie2}] [-r STAGE=RUNNER] [-n STAGE=N]
                         [-s BATCH_SIZE] [-j MAX_JOBS]
                         [--no-contaminant-filter] [--no-cleanup] [--force]
                         [-v] [--no-quality-filter] [--threads THREADS]
                         WELL_LIST [FASTQ_R1 FASTQ_R2 [FASTQ_R1 FASTQ_R2 ...]]

Perform initial QC on FASTQs from Wafergen ICell8: assign to barcodes, filter
on barcode & UMI quality, trim reads, perform contaminant filtering and split
by barcode.

positional arguments:
  WELL_LIST             Well list file
  FASTQ_R1 FASTQ_R2     FASTQ file pairs

optional arguments:
  -h, --help            show this help message and exit
  -u UNALIGNED_DIR
                        process FASTQs from 'unaligned' dir with output from
                        bcl2fastq (NB cannot be used with -p option)
  -p NAME, --project NAME
                        process FASTQS from project directory NAME (NB if -o
                        not specified then this will also be used as the
                        output directory; cannot be used with -u option)
  -o OUTDIR, --outdir OUTDIR
                        directory to write outputs to (default: 'CWD/icell8',
                        or project dir if -p is specified)
  -m MAMMALIAN_CONF
                        fastq_screen 'conf' file with the 'mammalian' genome
                        indices (default: None)
  -c CONTAMINANTS_CONF
                        fastq_screen 'conf' file with the 'contaminant' genome
                        indices (default: None)
  -q, --quality-filter  filter out read pairs with low quality barcode and UMI
                        sequences (not recommended for NextSeq data)
  -a {bowtie,bowtie2}, --aligner {bowtie,bowtie2}
                        aligner to use with fastq_screen (default: don't
                        specify the aligner)
  -r STAGE=RUNNER, --runner STAGE=RUNNER
                        explicitly specify runner definitions for running
                        pipeline jobs at each stage. STAGE can be one of 'defa
                        If STAGE is not specified then it is assumed to be
                        'default'. RUNNER must be a valid job runner
                        specification e.g. 'GEJobRunner(-j y)'. Multiple
                        --runner arguments can be specified (default:
  -n STAGE=N, --nprocessors STAGE=N
                        specify number of processors to use at each stage.
                        STAGE can be one of 'default','contaminant_filter',
                        'qc','statistics','report'. If STAGE is not specified
                        then it is assumed to be 'default'. Multiple
                        --nprocessors arguments can be specified (default: 1)
  -s BATCH_SIZE
                        number of reads per batch when splitting FASTQ files
                        for processing (default: 5000000)
  -j MAX_JOBS, --max-jobs MAX_JOBS
                        maximum number of concurrent jobs to run (default:
  --no-contaminant-filter
                        don't perform contaminant filter step (default is to
                        do contaminant filtering)
  --no-cleanup          don't remove intermediate Fastq files (default is to
                        delete intermediate Fastqs once no longer needed)
  --force               force overwrite of existing outputs
  -v, --verbose         produce verbose output for diagnostics
  --no-quality-filter   deprecated: kept for backwards compatibility only as
                        barcode/UMI quality checks are now disabled by default
  --threads THREADS     deprecated (use -n/--nprocessors option instead):
                        number of threads to use with multicore tasks (e.g.
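
The STAGE=N form used by -n/--nprocessors, where a bare N applies to the
'default' stage, can be parsed along the following lines. This is a sketch
under assumptions: the stage names are those listed in the help text for
-n/--nprocessors, and the function name parse_nprocessors is hypothetical
rather than taken from process_icell8.py.

```python
# Stage names as listed in the help for -n/--nprocessors
STAGES = ("default", "contaminant_filter", "qc", "statistics", "report")


def parse_nprocessors(args):
    """Convert a list of 'STAGE=N' or 'N' strings into a dict of ints.

    A value without a STAGE prefix is assigned to the 'default' stage.
    """
    nprocs = {}
    for arg in args:
        stage, sep, value = arg.rpartition("=")
        if not sep:
            stage = "default"
        if stage not in STAGES:
            raise ValueError(f"unknown stage: '{stage}'")
        nprocs[stage] = int(value)
    return nprocs


print(parse_nprocessors(["4", "qc=8"]))  # {'default': 4, 'qc': 8}
```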


usage: run_qc.py [-h] [--version] [--info] [-n NAME] [-o OUT_DIR]
                 [--qc_dir QC_DIR] [-f FILENAME] [-u] [--organism ORGANISM]
                 [--library-type LIBRARY] [--single-cell-platform PLATFORM]
                 [-p PROTOCOL] [--fastq_subset SUBSET] [-t NTHREADS]
                 [--star-index INDEX] [--gtf GTF]
                 [--cellranger CELLRANGER_EXE]
                 [--cellranger-reference REFERENCE]
                 [--10x_chemistry {ARC-v1,SC3Pv1,SC3Pv2,SC3Pv3,SC5P-PE,SC5P-R2,auto,fiveprime,threeprime}]
                 [--10x_force_cells N_CELLS] [--enable-conda {yes,no}]
                 [--conda-env-dir CONDA_ENV_DIR] [--local] [-c N] [-m M]
                 [-j N] [-b NBATCHES] [-r RUNNER] [-s N] [--ignore-metadata]
                 [--split-fastqs-by-lane] [--use-legacy-screen-names {yes,no}]
                 [--no-multiqc] [--verbose] [--work-dir WORKING_DIR]
                 [--no-cleanup] [--fastq_screen_subset SUBSET] [--force]
                 DIR | FASTQ [FASTQ ...] [DIR | FASTQ [FASTQ ...] ...]

Run the QC pipeline standalone on an arbitrary set of Fastq files.

positional arguments:
  DIR | FASTQ [ FASTQ ... ]
                        directory or list of Fastq files to run the QC on

optional arguments:
  -h, --help            show this help message and exit
  --version             show program's version number and exit
  --info                display information on protocols, organisms and other
                        settings (then exit)

Output and reporting:
  -n NAME, --name NAME  name for the project (used in report title)
  -o OUT_DIR, --out_dir OUT_DIR
                        top-level directory for reports and QC output
                        subdirectory (default: current working directory)
  --qc_dir QC_DIR       explicitly specify QC output directory. NB if a
                        relative path is supplied then it's assumed to be a
                        subdirectory of OUT_DIR (default: <OUT_DIR>/qc)
  -f FILENAME, --filename FILENAME
                        file name for output QC report (default:
  -u, --update          force QC pipeline to run even if output QC directory
                        already exists in <OUT_DIR> (default: stop if output
                        QC directory already exists)

  --organism ORGANISM   explicitly specify organism (e.g. 'human', 'mouse').
                        Multiple organisms should be separated by commas (e.g.
                        'human,mouse'). HINT use the --info option to list the
                        defined organisms
  --library-type LIBRARY
                        explicitly specify library type (e.g. 'RNA-seq',
  --single-cell-platform PLATFORM
                        explicitly specify the single cell platform (e.g.
                        '10xGenomics Chromium 3'v3')

QC options:
  -p PROTOCOL, --protocol PROTOCOL
                        explicitly specify the QC protocol to use; can be one
                        of 'standardSE', 'standardPE', '10x_scRNAseq',
                        '10x_snRNAseq', '10x_scATAC', '10x_Multiome_GEX',
                        '10x_Multiome_ATAC', '10x_CellPlex', '10x_Flex',
                        '10x_ImmuneProfiling', '10x_Visium',
                        '10x_Visium_FFPE', '10x_Visium_FFPE_PEX',
                        'ParseEvercode', 'singlecell', 'ICELL8_scATAC'. If not
                        set then protocol will be determined automatically
                        based on directory contents and metadata.
  --fastq_subset SUBSET
                        specify size of subset of reads to use for
                        FastQScreen, strandedness, coverage etc
                        (default 100000, set to 0 to use all reads)
  -t NTHREADS, --threads NTHREADS
                        number of threads to use for QC script (default: taken
                        from job runner)

Reference data:
  --star-index INDEX    specify the path to the STAR genome index to use when
                        mapping reads for metrics such as strandedness etc
                        (overrides the organism-specific indexes defined in
                        the config file)
  --gtf GTF             specify the path to the GTF annotation file to use for
                        metrics such as 'qualimap rnaseq' (overrides the
                        organism-specific GTF files defined in the config
                        file)

Cellranger/10xGenomics options:
  --cellranger CELLRANGER_EXE
                        explicitly specify path to Cellranger executable to
                        use for single library analysis
  --cellranger-reference REFERENCE
                        specify the path to the reference dataset to use when
                        running single library analysis (overrides the
                        organism-specific references defined in the config
                        file)
  --10x_chemistry {ARC-v1,SC3Pv1,SC3Pv2,SC3Pv3,SC5P-PE,SC5P-R2,auto,fiveprime,threeprime}
                        assay configuration for 10xGenomics scRNA-seq; if set
                        to 'auto' (the default) then cellranger will attempt
                        to determine this automatically
  --10x_force_cells N_CELLS
                        force number of cells for 10xGenomics scRNA-seq and
                        scATAC-seq, overriding automatic cell detection
                        algorithms (default is to use built-in cell detection)

Conda dependency resolution:
  --enable-conda {yes,no}
                        use conda to resolve task dependencies; can be 'yes'
                        or 'no' (default: no)
  --conda-env-dir CONDA_ENV_DIR
                        specify directory for conda environments (default:
                        temporary directory)

Job control options:
  --local               run the QC on the local system (overrides any runners
                        defined in the configuration or on the command line)
  -c N, --maxcores N    maximum number of cores available for QC jobs when
                        using --local (default no limit, change in settings
                        file)
  -m M, --maxmem M      maximum total memory jobs can request at once when
                        using --local (in Gbs; default: unlimited)
  -j N, --maxjobs N     explicitly specify maximum number of concurrent QC
                        jobs to run (default 12, change in settings file;
                        ignored when using --local)
  -b NBATCHES, --maxbatches NBATCHES
                        enable dynamic batching of pipeline jobs with maximum
                        number of batches set to NBATCHES (default: no

Advanced options:
  -r RUNNER, --runner RUNNER
                        explicitly specify runner definition for running QC
                        components. RUNNER must be a valid job runner
                        specification e.g. 'GEJobRunner(-j y)' (default: use
                        runners set in configuration)
  -s N, --batch_size N  batch QC commands with N commands per job (default: no
  --ignore-metadata     ignore information from project metadata file even if
                        one is located (default is to use project metadata)
  --split-fastqs-by-lane
                        run QC on copies of input Fastqs where reads have been
                        split according to lane (default is to run QC on
                        original Fastqs)
  --use-legacy-screen-names {yes,no}
                        use 'legacy' naming convention for FastqScreen output
                        files; can be 'yes' or 'no' (default: no)
  --no-multiqc          turn off generation of MultiQC report

Debugging options:
  --verbose             run pipeline in 'verbose' mode
  --work-dir WORKING_DIR
                        specify the working directory for the pipeline
  --no-cleanup          don't remove the temporary project directory on
                        completion (by default the temporary directory is

Deprecated/redundant options:
  --fastq_screen_subset SUBSET
                        redundant: use the --fastq_subset option instead
  --force               redundant: HTML report generation will always be
                        attempted (even when pipeline fails)
  --multiqc             redundant: MultiQC report is generated by default (use
                        --no-multiqc to disable)


usage: split_icell8_fastqs.py [-h] [-w WELL_LIST_FILE]
                              [-m {barcodes,batch,none}] [-s BATCH_SIZE]
                              [-b BASENAME] [-o OUT_DIR] [-d] [-q] [-c]
                              [FASTQ_R1 FASTQ_R2 [FASTQ_R1 FASTQ_R2 ...]]

positional arguments:
  FASTQ_R1 FASTQ_R2     FASTQ file pairs

optional arguments:
  -h, --help            show this help message and exit
  -w WELL_LIST_FILE
                        iCell8 'well list' file
  -m {barcodes,batch,none}, --mode {barcodes,batch,none}
                        how to split the input FASTQs: 'barcodes' (one FASTQ
                        pair per barcode), 'batch' (one or more FASTQ pairs
                        with fixed number of reads not exceeding BATCH_SIZE),
                        or 'none' (output all reads to a single FASTQ pair)
                        (default: 'barcodes')
  -s BATCH_SIZE
                        number of reads per batch in 'batch' mode (default:
  -b BASENAME, --basename BASENAME
                        basename for output FASTQ files (default: 'icell8')
  -o OUT_DIR, --outdir OUT_DIR
                        directory to write output FASTQ files to (default:
                        current directory)
  -d, --discard-unknown-barcodes
                        discard reads with barcodes which don't match any of
                        those in the WELL_LIST_FILE (default: keep all reads)
  -q, --quality-filter  filter reads by barcode and UMI quality (default:
                        don't filter reads on quality)
  -c, --compress        output compressed .gz FASTQ files
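
In 'batch' mode the reads are divided into groups of at most BATCH_SIZE. The
idea can be sketched with a small generator; this is illustrative only (it
batches any iterable of read records) and the function name batches is
hypothetical, not part of split_icell8_fastqs.py.

```python
from itertools import islice


def batches(reads, batch_size):
    """Yield successive lists of at most batch_size items from reads."""
    it = iter(reads)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch


# Seven reads with a batch size of three gives batches of 3, 3 and 1
print([len(b) for b in batches(range(7), 3)])  # [3, 3, 1]
```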


usage: transfer_data.py [-h] [--version] [--subdir {random_bin,run_id}]
                        [--zip_fastqs] [--max_zip_size MAX_ZIP_SIZE]
                        [--no_fastqs] [--readme README_TEMPLATE]
                        [--weburl WEBURL] [--include_downloader]
                        [--include_qc_report] [--include_10x_outputs] [--link]
                        [--filter FILTER_PATTERN] [--runner RUNNER]
                        DEST PROJECT

Transfer copies of Fastq data from an analysis project to an arbitrary
destination for sharing with other people

positional arguments:
  DEST                  destination to copy Fastqs to; can be the name of a
                        destination defined in the configuration file, or an
                        arbitrary location of the form '[[USER@]HOST:]DIR' (no
                        destinations currently defined)
  PROJECT               path to project directory (or to a Fastqs subdirectory
                        in a project) to copy Fastqs from

optional arguments:
  -h, --help            show this help message and exit
  --version             show program's version number and exit
  --subdir {random_bin,run_id}
                        subdirectory naming scheme: 'random_bin' locates a
                        random pre-existing empty subdirectory under the
                        target directory; 'run_id' creates a new subdirectory
                        'PLATFORM_DATESTAMP.RUN_ID-PROJECT'. If this option is
                        not set then no subdirectory will be used
  --zip_fastqs          put Fastqs into a ZIP file
  --max_zip_size MAX_ZIP_SIZE
                        when using '--zip_fastqs' option, defines the maximum
                        size for the output zip file; multiple zip files will
                        be created if the data exceeds this limit (default is
                        create a single zip file with no size limit)
  --no_fastqs           don't copy Fastqs (other artefacts will be copied, if
  --readme README_TEMPLATE
                        template file to generate README file from; can be
                        full path to a template file, or the name of a file in
                        the 'templates' directory
  --weburl WEBURL       base URL for webserver (sets the value of the WEBURL
                        variable in the template README)
  --include_downloader  copy the 'download_fastqs.py' utility to the final
                        location
  --include_qc_report   copy the zipped QC reports to the final location
  --include_10x_outputs
                        copy outputs from 10xGenomics pipelines (e.g.
                        'cellranger count') to the final location
  --link                hard link files instead of copying
  --filter FILTER_PATTERN
                        filter Fastq file names based on PATTERN
  --runner RUNNER       specify the job runner to use for executing the
                        checksumming, Fastq copy and tar gzipping operations
                        (defaults to job runner defined for copying in config
                        file [SimpleJobRunner(join_logs=True)])
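
An arbitrary DEST of the form '[[USER@]HOST:]DIR' has up to three parts. The
sketch below shows one way such a string could be split; the regular
expression and the function name parse_dest are assumptions made for this
example and are not taken from transfer_data.py.

```python
import re

# Optional 'USER@' and 'HOST:' prefixes followed by a mandatory DIR
_DEST = re.compile(
    r"^(?:(?:(?P<user>[^@:/]+)@)?(?P<host>[^@:/]+):)?(?P<dir>.+)$")


def parse_dest(dest):
    """Split '[[USER@]HOST:]DIR' into a (user, host, dir) tuple.

    Missing components are returned as None.
    """
    match = _DEST.match(dest)
    if match is None:
        raise ValueError(f"invalid destination: '{dest}'")
    return match.group("user"), match.group("host"), match.group("dir")


print(parse_dest("pjb@remote.example.org:/data/share"))
# ('pjb', 'remote.example.org', '/data/share')
print(parse_dest("/local/dir"))
# (None, None, '/local/dir')
```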


usage: update_project_metadata.py [-h] [-i] [-u UPDATE] DIR PROJECT

positional arguments:
  DIR                   analysis directory to update metadata for
  PROJECT               project within the analysis directory to update
                        metadata for

optional arguments:
  -h, --help            show this help message and exit
  -i, --init            initialise metadata file for the selected project (nb
                        can only be applied to one project at a time)
  -u UPDATE, --update UPDATE
                        update the metadata in the selected project by
                        specifying key=value pairs e.g. user='Peter Briggs'
                        (nb can only be applied to one project at a time)
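
The key=value pairs given to -u/--update can be split as sketched below. This
is an illustration only: the function name parse_update is hypothetical, and
the stripping of surrounding quotes is an assumption (a shell would normally
remove them before the program sees the argument).

```python
def parse_update(item):
    """Split a 'key=value' string into a (key, value) tuple.

    Surrounding single or double quotes on the value are removed.
    """
    key, sep, value = item.partition("=")
    if not sep or not key.strip():
        raise ValueError(f"expected key=value, got '{item}'")
    return key.strip(), value.strip().strip("'\"")


print(parse_update("user='Peter Briggs'"))  # ('user', 'Peter Briggs')
```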