auto-process-ngs: automated processing of NGS data
auto_process_ngs provides a set of utilities which automate
the processing of sequencing data from Illumina Next Generation
Sequencing (NGS) platforms, specifically:
- Generation of Fastq files from raw Bcl data produced by the sequencer
- Dividing Fastqs into projects for subsequent analysis
- Running quality checks (QC) on each project
- Copying final data to an archive location ready for appropriate bioinformatic analyses
In addition to standard Illumina sequencing data, it can also handle data prepared using a number of single-cell (SC) RNA-seq platforms.
Together these utilities form the pipeline used for the initial processing, QC and management of sequencing data within the Bioinformatics Core Facility at the University of Manchester.
Getting started
- Overview
- Requirements
- Installation
- Configuration
  - Overview
  - Basic configuration
  - Sequencers and platforms
  - Metadata
  - Job Runners
  - Setting number of available CPUs
  - Running on a compute cluster
  - Managing concurrent jobs and process loads
  - Using environment modules
  - Using conda to resolve pipeline dependencies
  - Specifying BCL to Fastq conversion software and options
  - QC pipeline configuration
  - Data transfer destinations
  - Bash tab completion
Pipeline stages
- Starting an analysis
- Fastq generation
  - Overview
  - Fastq generation protocols
  - Commonly used options
  - Truncating R1/R2/R3 read lengths and setting bases mask
  - Configuring adapter trimming and masking
  - Fastq generation for runs with mixed protocols and options
  - Processing a single run multiple times
  - Specifying Illumina BCL conversion software
  - Outputs
- Setting up projects
- Running QC
- Publishing QC
- Archiving analyses
- Troubleshooting
Post-processing
- Reporting analyses
- Managing and sharing data
  - fetch_data.py: import files and directories
  - manage_fastqs.py: managing and copying Fastq files
  - transfer_data.py: copying data for transfer to end users
  - download_fastqs.py: fetch Fastqs from a webserver in batch
  - update_project_metadata.py: manage metadata associated with a project
  - audit_projects.py: auditing disk usage for multiple runs
  - Exporting Fastqs to a data library in a local Galaxy instance
- Importing projects
- Running QC stand-alone
  - Specifying the QC metadata
  - Specifying the outputs
  - Updating existing QC outputs
  - Specifying reference data
  - Running 10x Genomics single library analyses
  - Running 10x Genomics CellRanger multi analysis
  - Running on different platforms: --local
  - Listing organisms and other information: --info
  - Per-lane QC: --split-fastqs-by-lane
  - Specifying and customising the QC protocols
  - Handling non-standard Fastq file names
Spatial data
Helpers
Control files
Reference Documentation
Developer Documentation
- Updating CellRanger
- Developers’ API documentation
  - auto_process_ngs.analysis
  - auto_process_ngs.applications
  - auto_process_ngs.apps
  - auto_process_ngs.auto_processor
  - auto_process_ngs.barcodes
  - auto_process_ngs.barcodes.analysis
  - auto_process_ngs.barcodes.pipeline
  - auto_process_ngs.barcodes.splitter
  - auto_process_ngs.bcl2fastq
  - auto_process_ngs.bcl2fastq.apps
  - auto_process_ngs.bcl2fastq.pipeline
  - auto_process_ngs.bcl2fastq.protocols
  - auto_process_ngs.bcl2fastq.reporting
  - auto_process_ngs.bcl2fastq.utils
  - auto_process_ngs.cli
  - auto_process_ngs.cli.auto_process
  - auto_process_ngs.cli.build_index
  - auto_process_ngs.cli.fetch_data
  - auto_process_ngs.cli.reportqc
  - auto_process_ngs.cli.run_qc
  - auto_process_ngs.cli.transfer_data
  - auto_process_ngs.command
  - auto_process_ngs.commands
  - auto_process_ngs.commands.analyse_barcodes_cmd
  - auto_process_ngs.commands.archive_cmd
  - auto_process_ngs.commands.clone_cmd
  - auto_process_ngs.commands.import_project_cmd
  - auto_process_ngs.commands.make_fastqs_cmd
  - auto_process_ngs.commands.merge_fastq_dirs_cmd
  - auto_process_ngs.commands.publish_qc_cmd
  - auto_process_ngs.commands.report_cmd
  - auto_process_ngs.commands.run_qc_cmd
  - auto_process_ngs.commands.samplesheet_cmd
  - auto_process_ngs.commands.setup_analysis_dirs_cmd
  - auto_process_ngs.commands.setup_cmd
  - auto_process_ngs.commands.update_cmd
  - auto_process_ngs.commands.update_fastq_stats_cmd
  - auto_process_ngs.conda
  - auto_process_ngs.config
  - auto_process_ngs.conftest
  - auto_process_ngs.css_rules
  - auto_process_ngs.decorators
  - auto_process_ngs.docs
  - auto_process_ngs.docwriter
  - auto_process_ngs.exceptions
  - auto_process_ngs.fastq_utils
  - auto_process_ngs.fileops
  - auto_process_ngs.indexes
  - auto_process_ngs.metadata
  - auto_process_ngs.mock
  - auto_process_ngs.mock10xdata
  - auto_process_ngs.mockqc
  - auto_process_ngs.mockqcdata
  - auto_process_ngs.pipeliner
  - auto_process_ngs.qc
  - auto_process_ngs.qc.apps
  - auto_process_ngs.qc.apps.cellranger
  - auto_process_ngs.qc.apps.fastq_screen
  - auto_process_ngs.qc.apps.fastq_strand
  - auto_process_ngs.qc.apps.fastqc
  - auto_process_ngs.qc.apps.picard
  - auto_process_ngs.qc.apps.qualimap
  - auto_process_ngs.qc.apps.rseqc
  - auto_process_ngs.qc.apps.seqlens
  - auto_process_ngs.qc.fastq_stats
  - auto_process_ngs.qc.modules
  - auto_process_ngs.qc.modules.cellranger_arc_count
  - auto_process_ngs.qc.modules.cellranger_atac_count
  - auto_process_ngs.qc.modules.cellranger_count
  - auto_process_ngs.qc.modules.cellranger_multi
  - auto_process_ngs.qc.modules.fastq_screen
  - auto_process_ngs.qc.modules.fastqc
  - auto_process_ngs.qc.modules.multiqc
  - auto_process_ngs.qc.modules.picard_insert_size_metrics
  - auto_process_ngs.qc.modules.qualimap_rnaseq
  - auto_process_ngs.qc.modules.rseqc_genebody_coverage
  - auto_process_ngs.qc.modules.rseqc_infer_experiment
  - auto_process_ngs.qc.modules.sequence_lengths
  - auto_process_ngs.qc.modules.strandedness
  - auto_process_ngs.qc.outputs
  - auto_process_ngs.qc.pipeline
  - auto_process_ngs.qc.plots
  - auto_process_ngs.qc.protocols
  - auto_process_ngs.qc.qc_modules
  - auto_process_ngs.qc.reporting
  - auto_process_ngs.qc.utils
  - auto_process_ngs.qc.verification
  - auto_process_ngs.samplesheet_utils
  - auto_process_ngs.settings
  - auto_process_ngs.simple_scheduler
  - auto_process_ngs.stats
  - auto_process_ngs.tenx
  - auto_process_ngs.tenx.cellplex
  - auto_process_ngs.tenx.metrics
  - auto_process_ngs.tenx.multiome
  - auto_process_ngs.tenx.utils
  - auto_process_ngs.utils