auto_process_ngs.qc.modules.rseqc_infer_experiment
Implements the ‘rseqc_infer_experiment’ QC module:
RseqcInferExperiment: core QCModule class
RunRSeQCInferExperiment: pipeline task to run ‘infer_experiment.py’
- class auto_process_ngs.qc.modules.rseqc_infer_experiment.RseqcInferExperiment
Class for handling the ‘rseqc_infer_experiment’ QC module
- classmethod add_to_pipeline(p, project_name, qc_dir, bam_files, reference_gene_model, organism_name, required_tasks=[], rseqc_runner=None)
Adds tasks for ‘rseqc_infer_experiment’ module to pipeline
- Parameters:
p (Pipeline) – pipeline to extend
project_name (str) – name of project
qc_dir (str) – path to QC directory
bam_files (list) – BAM files to run the module on
reference_gene_model (str) – path to reference gene model BED file
organism_name (str) – normalised name for organism that BAMs are aligned to
required_tasks (list) – list of tasks that the module needs to wait for
rseqc_runner (JobRunner) – runner to use for RSeQC
- classmethod collect_qc_outputs(qc_dir)
Collect information on RSeQC infer_experiment.py outputs
Returns an AttributeDictionary with the following attributes:
name: set to ‘rseqc_infer_experiment’
software: dictionary of software and versions
organisms: list of organisms with associated outputs
bam_files: list of associated BAM file names
output_files: list of associated output files
tags: list of associated output classes
- Parameters:
qc_dir (QCDir) – QC directory to examine
- classmethod verify(params, qc_outputs)
Verify ‘rseqc_infer_experiment’ QC module against outputs
Returns one of 3 values:
True: outputs verified ok
False: outputs failed to verify
None: verification not possible
- Parameters:
params (AttributeDictionary) – values of parameters used as inputs
qc_outputs (AttributeDictionary) – QC outputs returned from the ‘collect_qc_outputs’ method
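Because the result can be None as well as True or False, callers should probably test it with identity comparisons rather than truthiness, since None and False would otherwise be conflated. A minimal sketch, assuming a hypothetical helper named describe_verification (not part of auto_process_ngs):

```python
def describe_verification(status):
    """Map a tri-state verification result onto a readable label.

    Hypothetical helper for illustration only; it is not part of
    auto_process_ngs.
    """
    # Use 'is' so that None ("not possible") is never mistaken
    # for False ("failed")
    if status is True:
        return "ok"
    if status is False:
        return "failed"
    if status is None:
        return "not possible"
    raise ValueError(f"unexpected verification status: {status!r}")
```

The status argument here stands in for whatever value verify returned for a given set of parameters and QC outputs.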
- class auto_process_ngs.qc.modules.rseqc_infer_experiment.RunRSeQCInferExperiment(_name, *args, **kws)
Run RSeQC’s ‘infer_experiment.py’ on BAM files
Given a list of BAM files, runs the RSeQC ‘infer_experiment.py’ utility (http://rseqc.sourceforge.net/#infer-experiment-py) on each file.
The log for each run is written to a file called ‘<BASENAME>.infer_experiment.log’; the data are also extracted and put into an output parameter for direct consumption by downstream tasks.
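As a rough illustration of the per-file work described above, the sketch below builds an ‘infer_experiment.py’ command line and the matching log file path. The function names are hypothetical and this is not the task's actual implementation; in particular, it assumes that <BASENAME> is the BAM file name with its directory and ‘.bam’ extension removed, which the documentation does not state. The ‘-i’ (input alignment file) and ‘-r’ (reference gene model in BED format) options are those of RSeQC's infer_experiment.py.

```python
import os


def infer_experiment_command(bam_file, reference_gene_model):
    """Build an argument list for RSeQC's infer_experiment.py.

    Hypothetical helper; not part of auto_process_ngs.
    """
    # -i: input alignment file, -r: reference gene model (BED)
    return ["infer_experiment.py", "-i", bam_file, "-r", reference_gene_model]


def infer_experiment_log(bam_file, out_dir):
    """Return the path of the log file for a BAM file.

    Assumption: BASENAME is the BAM file name without its leading
    directory and without its final extension.
    """
    basename = os.path.splitext(os.path.basename(bam_file))[0]
    return os.path.join(out_dir, f"{basename}.infer_experiment.log")
```

Under these assumptions, a BAM file called ‘sample1.bam’ would produce a log called ‘sample1.infer_experiment.log’ in the output directory.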
- finish()
Perform actions on task completion
Performs any actions that are required on completion of the task, such as moving or copying data, and setting the values of any output parameters.
Must be implemented by the subclass
- init(bam_files, reference_gene_model, out_dir)
Initialise the RunRSeQCInferExperiment task
- Parameters:
bam_files (list) – list of paths to BAM files to run infer_experiment.py on
reference_gene_model (str) – path to BED file with the reference gene model data
out_dir (str) – path to a directory where the output files will be written
- Outputs:
- experiments: a dictionary with BAM files as keys; each value is another dictionary with keys ‘paired_end’ (True for paired-end data, False for single-end), ‘reverse’, ‘forward’ and ‘unstranded’ (fractions of reads mapped in each configuration).
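To make the shape of the ‘experiments’ output concrete, the sketch below shows one way a downstream task might read it. The BAM names and fractions are invented for illustration, and classify_strandedness with its 0.8 threshold is a hypothetical helper, not something provided by auto_process_ngs.

```python
def classify_strandedness(result, threshold=0.8):
    """Pick a strandedness label from one entry of 'experiments'.

    Hypothetical helper; the threshold is an arbitrary value chosen
    for this illustration.
    """
    if result["forward"] >= threshold:
        return "forward"
    if result["reverse"] >= threshold:
        return "reverse"
    # Neither orientation dominates
    return "unstranded"


# Illustrative data only: same layout as described for 'experiments'
experiments = {
    "sample1.bam": {
        "paired_end": True,
        "forward": 0.02,
        "reverse": 0.95,
        "unstranded": 0.03,
    },
    "sample2.bam": {
        "paired_end": False,
        "forward": 0.49,
        "reverse": 0.48,
        "unstranded": 0.03,
    },
}

# Map each BAM file onto its inferred strandedness label
labels = {bam: classify_strandedness(r) for bam, r in experiments.items()}
```

With the made-up values above, ‘sample1.bam’ is labelled ‘reverse’ and ‘sample2.bam’ is labelled ‘unstranded’.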
- setup()
Set up commands to be performed by the task
Must be implemented by the subclass