auto_process_ngs.qc.modules.cellranger_multi
Implements the ‘cellranger_multi’ QC module:
CellrangerMulti: core QCModule class
GetCellrangerMultiConfigs: pipeline task to acquire multi config files
RunCellrangerMulti: pipeline task to run ‘cellranger multi’
expected_outputs: helper function for handling ‘cellranger multi’ outputs
Also imports the following pipeline tasks:
Get10xPackage
DetermineRequired10xPackage
- class auto_process_ngs.qc.modules.cellranger_multi.CellrangerMulti
Class for handling the ‘cellranger_multi’ QC module
- classmethod add_to_pipeline(p, project_name, project, qc_dir, qc_module_name, cellranger_exe=None, cellranger_out_dir=None, cellranger_jobmode=None, cellranger_maxjobs=None, cellranger_mempercore=None, cellranger_jobinterval=None, cellranger_localcores=None, cellranger_localmem=None, cellranger_required_version=None, required_tasks=None, cellranger_runner=None, envmodules=None, working_dir=None)
Adds tasks for ‘cellranger_multi’ module to pipeline
Arguments:
- p (Pipeline): pipeline to extend
- project_name (str): name to associate with project for reporting tasks
- project (AnalysisProject): project to run 10x cellranger pipeline within
- qc_dir (str): directory for QC outputs (defaults to subdirectory ‘qc’ of project directory)
- qc_module_name (str): QC module being used
- cellranger_exe (str): optional, explicitly specify the cellranger executable to use (default: cellranger executable is determined automatically)
- cellranger_jobmode (str): specify the job mode to pass to cellranger (default: “local”)
- cellranger_maxjobs (int): specify the maximum number of jobs to pass to cellranger (default: None)
- cellranger_mempercore (int): specify the memory per core (in Gb) to pass to cellranger (default: None)
- cellranger_jobinterval (int): specify the interval between launching jobs (in ms) to pass to cellranger (default: None)
- cellranger_localcores (int): maximum number of cores cellranger can request in jobmode ‘local’ (default: None)
- cellranger_localmem (int): maximum memory cellranger can request in jobmode ‘local’ (default: None)
- required_tasks (list): list of tasks that the cellranger pipeline should wait for
- cellranger_runner (JobRunner): runner to use for running ‘cellranger multi’
- envmodules (list): environment module names to load for running Cellranger
- working_dir (str): explicitly specify path to working directory
- classmethod collect_qc_outputs(qc_dir)
Collect information on Cellranger multi outputs
Returns an AttributeDictionary with the following attributes:
name: set to ‘cellranger_multi’
software: dictionary of software and versions
references: list of associated reference datasets
probe_sets: list of associated probe sets
fastqs: list of associated Fastq names
multiplexed_samples: list of associated multiplexed sample names
pipelines: list of tuples defining 10x pipelines in the form (name,version,reference)
samples_by_pipeline: dictionary with lists of multiplexed sample names associated with each 10x pipeline tuple
config_files: list of associated config files (‘10x_multi_config[.<SAMPLE>].csv’)
output_files: list of associated output files
tags: list of associated output classes
- Parameters:
qc_dir (QCDir) – QC directory to examine
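A minimal sketch of how the returned attributes might be consumed. So that it runs without auto_process_ngs installed, types.SimpleNamespace stands in for the AttributeDictionary, and every value (sample names, version, reference name, config file names) is made up for illustration. The helper names summarise and sample_from_config are hypothetical and are not part of the module.

```python
import re
from types import SimpleNamespace

# Stand-in for the AttributeDictionary returned by collect_qc_outputs();
# all values below are hypothetical.
pipeline = ("cellranger", "7.1.0", "refdata-gex-GRCh38-2020-A")
outputs = SimpleNamespace(
    name="cellranger_multi",
    pipelines=[pipeline],
    samples_by_pipeline={pipeline: ["SAMPLE_A", "SAMPLE_B"]},
    config_files=["10x_multi_config.csv", "10x_multi_config.SAMPLE_A.csv"],
)

def summarise(qc_outputs):
    """Return one summary line per (name, version, reference) tuple."""
    lines = []
    for name, version, reference in qc_outputs.pipelines:
        samples = qc_outputs.samples_by_pipeline[(name, version, reference)]
        lines.append(f"{name} {version} ({reference}): {', '.join(samples)}")
    return lines

def sample_from_config(filename):
    """Extract <SAMPLE> from '10x_multi_config[.<SAMPLE>].csv', or None."""
    match = re.match(r"^10x_multi_config(?:\.(?P<sample>.+))?\.csv$", filename)
    if match is None:
        raise ValueError(f"not a multi config file name: {filename}")
    return match.group("sample")

summary = summarise(outputs)
config_samples = [sample_from_config(f) for f in outputs.config_files]
```

With the stand-in data above, summary holds a single line for the one pipeline tuple, and config_samples distinguishes the unqualified config file (None) from the sample-specific one.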
- classmethod verify(params, qc_outputs)
Verify ‘cellranger_multi’ QC module against outputs
Returns one of 3 values:
True: outputs verified ok
False: outputs failed to verify
None: verification not possible
- Parameters:
params (AttributeDictionary) – values of parameters used as inputs
qc_outputs (AttributeDictionary) – QC outputs returned from the ‘collect_qc_outputs’ method
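Because None and False are both falsy, callers of verify need an identity check to tell "failed" apart from "verification not possible". The sketch below shows one way to handle the three documented return values; describe_verification is a hypothetical helper and not part of the module.

```python
def describe_verification(status):
    """Map the three documented return values of verify() to a label."""
    # Test for None first: a plain truthiness check would treat
    # None (not verifiable) the same as False (failed).
    if status is None:
        return "not verifiable"
    return "ok" if status else "failed"

labels = [describe_verification(s) for s in (True, False, None)]
```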
- class auto_process_ngs.qc.modules.cellranger_multi.GetCellrangerMultiConfigs(_name, *args, **kws)
Locate ‘config.csv’ files for cellranger multi
- init(project, qc_dir)
Initialise the GetCellrangerMultiConfigs task.
- Parameters:
project (AnalysisProject) – project to run QC for
qc_dir (str) – top-level QC directory to put ‘config.csv’ files
- setup()
Set up commands to be performed by the task
Must be implemented by the subclass
- class auto_process_ngs.qc.modules.cellranger_multi.RunCellrangerMulti(_name, *args, **kws)
Run ‘cellranger multi’
- finish()
Perform actions on task completion
Performs any actions that are required on completion of the task, such as moving or copying data, and setting the values of any output parameters.
Must be implemented by the subclass
- init(project, config_csvs, samples, reference_data_path, probe_set_path, out_dir, qc_dir=None, cellranger_exe=None, cellranger_version=None, cellranger_jobmode='local', cellranger_maxjobs=None, cellranger_mempercore=None, cellranger_jobinterval=None, cellranger_localcores=None, cellranger_localmem=None, cellranger_required_version=None, working_dir=None)
Initialise the RunCellrangerMulti task.
- Parameters:
project (AnalysisProject) – project to run QC for
config_csvs (list) – list of paths to ‘cellranger multi’ configuration files
samples (list) – list of sample names from the config.csv file
reference_data_path (str) – path to the cellranger compatible reference dataset from the config.csv file
probe_set_path (str) – path to the probe set reference dataset from the config.csv file
out_dir (str) – top-level directory to copy all final ‘multi’ outputs into. Outputs won’t be copied if no value is supplied
qc_dir (str) – top-level QC directory to put ‘count’ QC outputs (e.g. metrics CSV and summary HTML files) into. Outputs won’t be copied if no value is supplied
cellranger_exe (str) – the path to the Cellranger software package to use (e.g. ‘cellranger’, ‘cellranger-atac’, ‘spaceranger’)
cellranger_version (str) – the version string for the Cellranger package
cellranger_jobmode (str) – specify the job mode to pass to cellranger (default: “local”)
cellranger_maxjobs (int) – specify the maximum number of jobs to pass to cellranger (default: None)
cellranger_mempercore (int) – specify the memory per core (in Gb) to pass to cellranger (default: None)
cellranger_jobinterval (int) – specify the interval between launching jobs (in ms) to pass to cellranger (default: None)
cellranger_localcores (int) – maximum number of cores cellranger can request in jobmode ‘local’ (defaults to number of slots set in runner)
cellranger_localmem (int) – maximum memory cellranger can request in jobmode ‘local’ (default: None)
cellranger_required_version (str) – string specifying the required Cellranger version (default: None)
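As a rough illustration of how the job control parameters above could map onto a ‘cellranger multi’ command line, the sketch below builds an argument list and omits any option left as None. It is an assumption about the mapping, not the task's actual implementation: build_multi_command is a hypothetical helper, and the ID and config file name in the usage line are made up.

```python
def build_multi_command(multi_id, config_csv, cellranger_exe="cellranger",
                        jobmode="local", maxjobs=None, mempercore=None,
                        jobinterval=None, localcores=None, localmem=None):
    """Return a 'cellranger multi' command as a list of arguments."""
    cmd = [cellranger_exe, "multi",
           f"--id={multi_id}",
           f"--csv={config_csv}",
           f"--jobmode={jobmode}"]
    # Optional settings are only passed when a value was supplied
    optional = (("--maxjobs", maxjobs),
                ("--mempercore", mempercore),
                ("--jobinterval", jobinterval),
                ("--localcores", localcores),
                ("--localmem", localmem))
    for flag, value in optional:
        if value is not None:
            cmd.append(f"{flag}={value}")
    return cmd

# Hypothetical run ID and config file name
cmd = build_multi_command("PJB1", "10x_multi_config.csv",
                          localcores=8, localmem=64)
```

Returning a list rather than a single string avoids shell quoting problems if the command is later passed to something like subprocess.run.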
- setup()
Set up commands to be performed by the task
Must be implemented by the subclass
- auto_process_ngs.qc.modules.cellranger_multi.expected_outputs(config_csv, multi_id=None, prefix=None)
Generate expected output file paths from 10x multi config
- Parameters:
config_csv (str) – path to the 10x multi config file to generate the output file names for
multi_id (str) – optional, the ID of the multi run (supplied via the --id argument), used if the config file doesn’t define multiplexed samples
prefix (str) – optional path to prepend to the expected file paths
- Returns:
list of paths to expected output files.
- Return type: