auto_process_ngs.qc.modules.rseqc_genebody_coverage

Implements the ‘rseqc_genebody_coverage’ QC module:

  • RseqcGenebodyCoverage: core QCModule class

  • RunRSeQCGenebodyCoverage: pipeline task to run ‘geneBody_coverage.py’

class auto_process_ngs.qc.modules.rseqc_genebody_coverage.RseqcGenebodyCoverage

Class for handling the ‘rseqc_genebody_coverage’ QC module

classmethod add_to_pipeline(p, project_name, project, qc_dir, bam_files, reference_gene_model, organism_name, required_tasks=[], rseqc_runner=None)

Adds tasks for ‘rseqc_genebody_coverage’ module to pipeline

Parameters:
  • p (Pipeline) – pipeline to extend

  • project_name (str) – name of project

  • project (AnalysisProject) – project to run module on

  • qc_dir (str) – path to QC directory

  • bam_files (list) – BAM files to run the module on

  • reference_gene_model (str) – path to reference gene model BED file

  • organism_name (str) – normalised name for organism that BAMs are aligned to

  • required_tasks (list) – list of tasks that the module needs to wait for

  • rseqc_runner (JobRunner) – runner to use for RSeQC

classmethod collect_qc_outputs(qc_dir)

Collect information on RSeQC geneBody_coverage.py outputs

Returns an AttributeDictionary with the following attributes:

  • name: set to ‘rseqc_genebody_coverage’

  • software: dictionary of software and versions

  • organisms: list of organisms with associated outputs

  • output_files: list of associated output files

  • tags: list of associated output classes

Parameters:
  • qc_dir (QCDir) – QC directory to examine
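
The shape of the returned object can be illustrated with a stand-in. The following is a minimal sketch only: it uses types.SimpleNamespace in place of the real AttributeDictionary, and the helper name summarise_qc_outputs and every value in the example (version string, organism, file path, tag) are hypothetical placeholders, not values produced by the library.

```python
from types import SimpleNamespace


def summarise_qc_outputs(outputs):
    """Return a one-line summary of a collect_qc_outputs-style result.

    Only relies on the attributes documented above: name, software,
    organisms and output_files.
    """
    return (
        f"{outputs.name}: {len(outputs.output_files)} file(s), "
        f"organisms={sorted(outputs.organisms)}, "
        f"software={sorted(outputs.software)}"
    )


# Stand-in for the AttributeDictionary; all values are hypothetical
example_outputs = SimpleNamespace(
    name="rseqc_genebody_coverage",
    software={"rseqc_genebody_coverage": ["0.0.0"]},
    organisms=["human"],
    output_files=["/path/to/qc/placeholder_output.txt"],
    tags=["placeholder_tag"],
)

summary = summarise_qc_outputs(example_outputs)
```

Attribute-style access (outputs.name rather than outputs["name"]) is used here because the documented return type is an AttributeDictionary.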

classmethod verify(params, qc_outputs)

Verify ‘rseqc_genebody_coverage’ QC module against outputs

Returns one of three values:

  • True: outputs verified ok

  • False: outputs failed to verify

  • None: verification not possible

Parameters:
  • params (AttributeDictionary) – values of parameters used as inputs

  • qc_outputs (AttributeDictionary) – QC outputs returned from the ‘collect_qc_outputs’ method
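
Because the result is tri-state, callers need to tell None apart from False; a plain truthiness test would treat both as failure. A minimal sketch of one way to handle this (the helper name describe_verification and the returned strings are illustrative assumptions, not part of the library):

```python
def describe_verification(status):
    """Map a tri-state verification result to a human-readable label.

    None is checked by identity first, since it is falsy and would
    otherwise be indistinguishable from False.
    """
    if status is None:
        return "verification not possible"
    return "verified ok" if status else "failed to verify"


labels = [describe_verification(s) for s in (True, False, None)]
```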

class auto_process_ngs.qc.modules.rseqc_genebody_coverage.RunRSeQCGenebodyCoverage(_name, *args, **kws)

Run RSeQC’s ‘geneBody_coverage.py’ on BAM files

Given a collection of BAM files, runs the RSeQC ‘geneBody_coverage.py’ utility (http://rseqc.sourceforge.net/#genebody-coverage-py).

finish()

Perform actions on task completion

Performs any actions that are required on completion of the task, such as moving or copying data, and setting the values of any output parameters.

Must be implemented by the subclass

init(bam_files, reference_gene_model, out_dir, name='rseqc')

Initialise the RunRSeQCGenebodyCoverage task

Parameters:
  • bam_files (list) – list of paths to BAM files to run geneBody_coverage.py on

  • reference_gene_model (str) – path to BED file with the reference gene model data

  • out_dir (str) – path to a directory where the output files will be written

  • name (str) – optional basename for the output files (defaults to ‘rseqc’)
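
How these parameters might map onto a geneBody_coverage.py invocation is sketched below. This is an assumption about the mapping, not the task's actual implementation: the helper name genebody_coverage_cmd is hypothetical, and the sketch relies only on the RSeQC options -r (reference gene model), -i (comma-separated BAM files) and -o (output prefix).

```python
import os


def genebody_coverage_cmd(bam_files, reference_gene_model, out_dir,
                          name="rseqc"):
    """Build an argument list for RSeQC's geneBody_coverage.py.

    Hypothetical helper: the output prefix is assumed to be
    <out_dir>/<name>, and the BAM paths are joined with commas
    as accepted by the -i option.
    """
    if not bam_files:
        raise ValueError("at least one BAM file is required")
    return [
        "geneBody_coverage.py",
        "-r", reference_gene_model,
        "-i", ",".join(bam_files),
        "-o", os.path.join(out_dir, name),
    ]


cmd = genebody_coverage_cmd(
    ["sample1.bam", "sample2.bam"],  # placeholder BAM paths
    "genes.bed",                     # placeholder gene model
    "qc_out",                        # placeholder output directory
)
```

Returning a list rather than a single string keeps the arguments unambiguous if the command is later passed to something like subprocess.run.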

setup()

Set up commands to be performed by the task

Must be implemented by the subclass