auto_process_ngs.qc.apps.picard

Provides utility classes and functions for handling Picard outputs.

Provides the following classes:

  • CollectInsertSizeMetrics: wrapper for handling outputs from Picard ‘CollectInsertSizeMetrics’

Provides the following functions:

  • picard_collect_insert_size_metrics_output: generates names of Picard CollectInsertSizeMetrics output files

class auto_process_ngs.qc.apps.picard.CollectInsertSizeMetrics(metrics_file)

Wrapper class for outputs from CollectInsertSizeMetrics

The CollectInsertSizeMetrics object gives access to various aspects of the outputs of the Picard CollectInsertSizeMetrics utility.

The following properties are available:

  • metrics (dict): dictionary mapping metrics to values

  • histogram (dict): dictionary holding histogram data

property histogram

Histogram data for insert sizes

Dictionary mapping insert sizes (keys) to associated number of alignments.
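
As an illustration of working with a dictionary of this shape, the sketch below computes the total number of alignments and the count-weighted mean insert size. It uses a plain dict with a few made-up entries rather than a real CollectInsertSizeMetrics object, and the helper name summarise_histogram is hypothetical (it is not part of auto_process_ngs).

```python
def summarise_histogram(histogram):
    """Return (total alignments, weighted mean insert size).

    Hypothetical helper: 'histogram' is assumed to map integer
    insert sizes to integer alignment counts.
    """
    total = sum(histogram.values())
    if total == 0:
        # No alignments, so the mean is undefined
        return 0, None
    weighted_sum = sum(size * count for size, count in histogram.items())
    return total, weighted_sum / total


# Illustrative data only (not taken from a real run)
example = {28: 1, 29: 1, 30: 1, 31: 1, 32: 1, 33: 2, 34: 3}
total, mean = summarise_histogram(example)
```

With a real object the same helper could presumably be called on its histogram property, assuming that property holds integer keys and counts as described above.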

property metrics

Dictionary mapping metrics to values

To get the value associated with a metric, use e.g.:

>>> mean = insertsizes.metrics['MEAN_INSERT_SIZE']

To see what metrics are available, use e.g.:

>>> insertsizes.metrics.keys()

property metrics_file

Return path to source metrics file

auto_process_ngs.qc.apps.picard.logger = <Logger auto_process_ngs.qc.apps.picard (WARNING)>

Example CollectInsertSizeMetrics output (SAMPLE.insert_size_metrics.txt):

## htsjdk.samtools.metrics.StringHeader
# CollectInsertSizeMetrics --Histogram_FILE <SAMPLE>.insert_size_histogram.pdf --INPUT <BAM> --OUTPUT <BASENAME>.insert_size_metrics.txt --DEVIATIONS 10.0 --MINIMUM_PCT 0.05 --METRIC_ACCUMULATION_LEVEL ALL_READS --INCLUDE_DUPLICATES false --ASSUME_SORTED true --STOP_AFTER 0 --VERBOSITY INFO --QUIET false --VALIDATION_STRINGENCY STRICT --COMPRESSION_LEVEL 5 --MAX_RECORDS_IN_RAM 500000 --CREATE_INDEX false --CREATE_MD5_FILE false --GA4GH_CLIENT_SECRETS client_secrets.json --help false --version false --showHidden false --USE_JDK_DEFLATER false --USE_JDK_INFLATER false
## htsjdk.samtools.metrics.StringHeader
# Started on: Mon Oct 04 12:53:24 BST 2021

## METRICS CLASS picard.analysis.InsertSizeMetrics
MEDIAN_INSERT_SIZE MODE_INSERT_SIZE MEDIAN_ABSOLUTE_DEVIATION MIN_INSERT_SIZE MAX_INSERT_SIZE MEAN_INSERT_SIZE STANDARD_DEVIATION READ_PAIRS PAIR_ORIENTATION WIDTH_OF_10_PERCENT WIDTH_OF_20_PERCENT WIDTH_OF_30_PERCENT WIDTH_OF_40_PERCENT WIDTH_OF_50_PERCENT WIDTH_OF_60_PERCENT WIDTH_OF_70_PERCENT WIDTH_OF_80_PERCENT WIDTH_OF_90_PERCENT WIDTH_OF_95_PERCENT WIDTH_OF_99_PERCENT SAMPLE LIBRARY READ_GROUP
139 103 37 28 753074 153.754829 69.675347 175847 FR 15 31 47 61 75 93 115 157 323 975 31983

## HISTOGRAM java.lang.Integer
insert_size All_Reads.fr_count
28 1
29 1
30 1
31 1
32 1
33 2
34 3
…
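
A file laid out like the example above could be read roughly as follows. This is a minimal sketch and not the implementation used by the CollectInsertSizeMetrics class: the function name parse_insert_size_metrics is hypothetical, fields are assumed to be tab-separated, metric values are kept as strings, and the sample text is a shortened, made-up version of the output.

```python
import io


def parse_insert_size_metrics(lines):
    """Parse CollectInsertSizeMetrics-style text into (metrics, histogram).

    Hypothetical sketch: assumes tab-separated fields and blank lines
    between sections.
    """
    metrics = {}
    histogram = {}
    section = None   # current section: "metrics", "histogram" or None
    header = None    # column names for the current section
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("## METRICS CLASS"):
            section, header = "metrics", None
        elif line.startswith("## HISTOGRAM"):
            section, header = "histogram", None
        elif line.startswith("#"):
            continue  # other header/comment lines
        elif not line.strip():
            section = None  # blank line ends the current section
        elif section is not None and header is None:
            header = line.split("\t")  # first data line holds column names
        elif section == "metrics":
            # Values are kept as strings in this sketch
            metrics = dict(zip(header, line.split("\t")))
        elif section == "histogram":
            fields = line.split("\t")
            histogram[int(fields[0])] = int(fields[1])
    return metrics, histogram


# Shortened, made-up sample in the same layout
sample = (
    "## htsjdk.samtools.metrics.StringHeader\n"
    "# CollectInsertSizeMetrics --INPUT example.bam\n"
    "\n"
    "## METRICS CLASS\tpicard.analysis.InsertSizeMetrics\n"
    "MEDIAN_INSERT_SIZE\tMEAN_INSERT_SIZE\tPAIR_ORIENTATION\n"
    "139\t153.754829\tFR\n"
    "\n"
    "## HISTOGRAM\tjava.lang.Integer\n"
    "insert_size\tAll_Reads.fr_count\n"
    "28\t1\n"
    "33\t2\n"
)
metrics, histogram = parse_insert_size_metrics(io.StringIO(sample))
```

Tracking the current section with a small state variable keeps the loop to a single pass, and accepting any iterable of lines means the same function works on an open file handle.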

auto_process_ngs.qc.apps.picard.picard_collect_insert_size_metrics_output(filen, prefix=None)

Generate names of Picard CollectInsertSizeMetrics output

Given a Fastq or BAM file name, the output from Picard’s CollectInsertSizeMetrics function will look like:

  • {PREFIX}/{FASTQ}.insert_size_metrics.txt

  • {PREFIX}/{FASTQ}.insert_size_histogram.pdf

Parameters:
  • filen (str) – name of Fastq or BAM file

  • prefix (str) – optional directory to prepend to outputs

Returns:

CollectInsertSizeMetrics output (without leading paths)

Return type:

tuple
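
The naming scheme described above can be sketched as follows. This is an illustration under stated assumptions, not the library's code: the function name insert_size_output_names is hypothetical, and the set of file extensions stripped from the input name (gz, fastq, fq, bam) is a guess at reasonable behaviour.

```python
import os


def insert_size_output_names(filen, prefix=None):
    """Build metrics and histogram output names for a Fastq or BAM file.

    Hypothetical sketch: the stripped extensions are an assumption.
    """
    basename = os.path.basename(filen)
    known_extensions = ("gz", "fastq", "fq", "bam")
    # Strip trailing extensions such as '.fastq.gz' or '.bam'
    while "." in basename and basename.rsplit(".", 1)[1] in known_extensions:
        basename = basename.rsplit(".", 1)[0]
    names = [
        basename + ".insert_size_metrics.txt",
        basename + ".insert_size_histogram.pdf",
    ]
    if prefix is not None:
        # Prepend the optional directory to each output name
        names = [os.path.join(prefix, name) for name in names]
    return tuple(names)


no_prefix = insert_size_output_names("/data/SAMPLE_R1.fastq.gz")
with_prefix = insert_size_output_names("SAMPLE.bam", prefix="qc")
```

Here no_prefix holds bare file names, while with_prefix holds the same kind of names joined onto the "qc" directory.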