`auto_process_ngs.qc.reporting`

Utilities for reporting QC pipeline outputs.

Provides the following core class:

QCReport: create QC report document for one or more projects

In addition there are a number of supporting classes:

QCProject: gather information about the QC associated with a project
SampleQCReporter: reports the QC for a sample
FastqGroupQCReporter: reports the QC for a group of Fastqs
FastqQCReporter: interface to QC outputs for a single Fastq

There are also a number of utility functions:

report: report the QC for a project
pretty_print_reads: print number of reads with commas at each thousand
sanitize_name: replace ‘unsafe’ characters in HTML link targets

Overview

The SampleQCReporter, FastqGroupQCReporter and FastqQCReporter classes are used by the top-level QCProject class to report QC outputs at the level of samples (for example, single library analyses), groups of Fastqs (for example, strandedness), and individual Fastqs (for example, FastQC or screen data).

Adding support for new metadata

Support for new metadata items should be implemented within the _init_metadata_table method of QCReport. Descriptions of new items should also be added to the METADATA_FIELD_DESCRIPTIONS module constant.

Adding support for new QC outputs

When adding reporting of new QC outputs it is recommended first to ensure that they are detected by the QCOutputs class (in the outputs module); then the relevant reporter class should be extended depending on the level that the QC outputs are associated with (i.e. project, sample, Fastq group or individual Fastq).

Typically this is done by adding support for new summary table fields, which can be implemented within the get_value or get_10x_value methods in SampleQCReporter (for sample-level QC) or the get_value method of FastqGroupQCReporter (for Fastq-group level QC). (This may require additional internal functionality to be implemented within the relevant class.)

Descriptions for new fields also need to be added to the SUMMARY_FIELD_DESCRIPTIONS module constant. Additionally: if the QC outputs are produced using new software packages then these should be added to the SOFTWARE_PACKAGE_NAMES module constant; as long as these are reported by QCOutputs then they will also be listed automatically within the QC report.
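As a rough illustration of the pattern described above, the following self-contained sketch shows a new summary field being given a description and a matching branch in a get_value method. The ExampleGroupReporter class, the my_new_metric field and the layout of the descriptions dictionary are all hypothetical; they are not taken from the package and the real constant may be structured differently.

```python
# Hypothetical stand-in for the module constant; the real structure
# in auto_process_ngs may differ.
SUMMARY_FIELD_DESCRIPTIONS = {
    "reads": "Number of reads",
    "my_new_metric": "Description of the new (made-up) metric",
}


class ExampleGroupReporter:
    """Toy reporter holding precomputed values for one Fastq group."""

    def __init__(self, values):
        self._values = dict(values)

    def get_value(self, field):
        # One branch per supported field; unrecognised fields raise
        # KeyError, matching the documented behaviour of get_value
        if field == "reads":
            return f"{self._values['reads']:,}"
        elif field == "my_new_metric":
            return f"{self._values['my_new_metric']:.1f}%"
        raise KeyError(f"'{field}': unrecognised field")


reporter = ExampleGroupReporter({"reads": 10409789, "my_new_metric": 12.345})
for name in ("reads", "my_new_metric"):
    print(SUMMARY_FIELD_DESCRIPTIONS[name], "=", reporter.get_value(name))
```

Keeping the description and the get_value branch under the same field name is what lets a summary table pick up both the column header text and the cell content from a single identifier.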

class auto_process_ngs.qc.reporting.FastqGroupQCReporter(fastqs, qc_dir, project, project_id=None, fastq_attrs=<class 'auto_process_ngs.analysis.AnalysisFastq'>, is_seq_data=True)

Utility class for reporting the QC for a Fastq group

Provides the following properties:

reads: list of read ids e.g. [‘r1’,’r2’]
fastqs: dictionary mapping read ids to Fastq paths
reporters: dictionary mapping read ids to FastqQCReporter instances
paired_end: whether FastqGroup is paired end
bam: associated BAM file name
fastq_strand_txt: location of associated Fastq_strand output

Provides the following methods:

infer_experiment: fetch data from RSeQC ‘infer_experiment.py’
ustrandednessplot: return mini-plot of strandedness data from RSeQC ‘infer_experiment.py’
insert_size_metrics: fetch insert size metrics
uinsertsizeplot: return mini-plot of insert size histogram
qualimap_rnaseq: fetch Qualimap ‘rnaseq’ metrics
ucoverageprofileplot: return mini-plot of gene coverage profile
ugenomicoriginplot: return mini-plot of genomic origin of reads data
strandedness: fetch strandedness data for this group
ustrandplot: return mini-strand stats summary plot
report_strandedness: write report for strandedness
report: write report for the group
update_summary_table: add line to summary table for the group

Parameters:

fastqs (list) – list of paths to Fastqs in the group
qc_dir (str) – path to the QC output dir; relative path will be treated as a subdirectory of the project
project (QCProject) – parent project
project_id (str) – identifier for the project
fastq_attrs (BaseFastqAttrs) – class for extracting data from Fastq names
is_seq_data (bool) – if True then indicates that the group contains biological data

property fastq_strand_txt: Locate output from fastq_strand (None if not found)

get_value(field, relpath=None)

Return the value for the specified field

The following fields can be reported for each Fastq group:

fastqs (if paired-end)
fastq (if single-end)
bam_file
reads
read_lengths
read_counts
sequence_duplication
adapter_content
read_lengths_dist_r1
read_lengths_dist_r2
read_lengths_dist_r3
boxplot_r1
boxplot_r2
boxplot_r3
fastqc_r1
fastqc_r2
fastqc_r3
screens_r1
screens_r2
screens_r3
strandedness
strand_specificity
insert_size_histogram
coverage_profile_along_genes
reads_genomic_origin

Parameters:

field (str) – name of the field to report; if the field is not recognised then KeyError is raised
relpath (str) – if set then make link paths relative to ‘relpath’

infer_experiment(organism): Return RSeQC infer_experiment.py data for organism

insert_size_metrics(organism): Return Picard insert size metrics for specified organism

property paired_end: True if pair consists of R1/R2 files

qualimap_rnaseq(organism): Return Qualimap ‘rnaseq’ outputs instance for specified organism

report(sample_report, attrs=None, relpath=None)

Add report for Fastq group to a document section

Creates a new subsection in ‘sample_report’ for the Fastq pair, within which are additional subsections for each Fastq file.

The following ‘attributes’ can be reported for each Fastq:

fastqc
fastq_screen
program versions

By default all attributes are reported.

Parameters:

sample_report (Section) – section to add the report to
attrs (list) – optional list of custom ‘attributes’ to report
relpath (str) – if set then make link paths relative to ‘relpath’

report_strandedness(document)

Report the strandedness outputs to a document

Creates a new subsection called “Strandedness” with a table of the strandedness determination outputs.

Parameters:

document (Section) – section to add report to

strandedness(): Return strandedness from fastq_strand.py

ucoverageprofileplot(organism): Return a mini-plot of the Qualimap gene coverage profile

ugenomicoriginplot(organism, width=100, height=40): Return a mini-barplot of the Qualimap genomic origin of reads data

uinsertsizeplot(organism): Return a mini-plot with the Picard insert size histogram

update_summary_table(summary_table, idx=None, fields=None, relpath=None)

Add a line to a summary table reporting a Fastq group

Creates a new line in ‘summary_table’ (or updates an existing line) for the Fastq pair, adding content for each specified field.

See the ‘get_value’ method for a list of valid fields.

Parameters:

summary_table (Table) – table to add the summary to
idx (int) – if supplied then indicates which existing table row to update (if None then a new row is appended)
fields (list) – list of custom fields to report
relpath (str) – if set then make link paths relative to ‘relpath’

Returns:

True if report didn’t contain any issues, False otherwise.

Return type:

Boolean

ustrandednessplot(organism, width=50, height=24): Return a mini-plot for RSeQC strandedness data

ustrandplot(): Return a mini-strand stats summary plot

class auto_process_ngs.qc.reporting.FastqQCReporter(fastq, qc_dir, project_id=None, fastq_attrs=<class 'auto_process_ngs.analysis.AnalysisFastq'>)

Provides interface to QC outputs for Fastq file

Provides the following attributes:

name: basename of the Fastq
path: path to the Fastq
safe_name: name suitable for use in HTML links etc
sample_name: sample name derived from the Fastq basename
sequence_lengths: SeqLens instance
fastqc: Fastqc instance
fastq_screen.names: list of FastQScreen names
fastq_screen.SCREEN.description: description of SCREEN
fastq_screen.SCREEN.png: associated PNG file for SCREEN
fastq_screen.SCREEN.txt: associated TXT file for SCREEN
fastq_screen.SCREEN.version: associated version for SCREEN
program_versions.NAME: version of package NAME
adapters: list of adapters from Fastqc
adapters_summary: dictionary summarising adapter content

Provides the following methods:

report_fastqc
report_fastq_screens
report_program_versions
useqlenplot
ureadcountplot
uboxplot
ufastqcplot
uduplicationplot
uadapterplot
uscreenplot

Parameters:

fastq (str) – path to Fastq file
qc_dir (str) – path to QC directory
project_id (str) – identifier for the parent project
fastq_attrs (BaseFastqAttrs) – class for extracting data from Fastq names

fastqc_summary(): Return plaintext version of the FastQC summary

report_fastq_screens(document, relpath=None)

Report the FastQScreen outputs to a document

Creates a new subsection called “Screens” with copies of the screen plots for each screen and links to the “raw” text files.

Parameters:

document (Section) – section to add report to
relpath (str) – if set then make link paths relative to ‘relpath’

report_fastqc(document, relpath=None)

Report the FastQC outputs to a document

Creates a new subsection called “FastQC” with a copy of the FastQC sequence quality boxplot and a summary table of the results from each FastQC module.

Parameters:

document (Section) – section to add report to
relpath (str) – if set then make link paths relative to ‘relpath’

report_program_versions(document)

Report the program versions to a document

Creates a new subsection called “Program versions” with a table listing the versions of the QC programs.

Parameters:

document (Section) – section to add report to

uadapterplot(height=40, multi_bar=False, inline=True)

Return a mini-adapter content summary plot

Parameters:

height (int) – optionally set the plot height in pixels
multi_bar (bool) – if True then create a plot with one bar per adapter class (otherwise summarise adapter content on one bar in the plot)
inline (bool) – if True then return plot in format for inlining in HTML document

uboxplot(inline=True)

Return a mini-sequence quality boxplot

Parameters:

inline (bool) – if True then return plot in format for inlining in HTML document

uduplicationplot(mode='dup', inline=True)

Return a mini-sequence duplication plot

Parameters:

mode (str) – either ‘dup’ or ‘dedup’
inline (bool) – if True then return plot in format for inlining in HTML document

ufastqcplot(inline=True)

Return a mini-FastQC summary plot

Parameters:

inline (bool) – if True then return plot in format for inlining in HTML document

ureadcountplot(max_reads=None, inline=True)

Return a mini-sequence composition plot

Parameters:

max_reads (int) – if set then scale the reads for this Fastq against this value in the plot
inline (bool) – if True then return plot in format for inlining in HTML document

uscreenplot(inline=True)

Return a mini-FastQScreen summary plot

Parameters:

inline (bool) – if True then return plot in format for inlining in HTML document

useqlenplot(max_len=None, min_len=None, height=None, inline=True)

Return a mini-sequence length distribution plot

Parameters:

max_len (int) – set the upper limit of the x-axis
min_len (int) – set the lower limit of the x-axis
height (int) – optionally set the plot height in pixels
inline (bool) – if True then return plot in format for inlining in HTML document

class auto_process_ngs.qc.reporting.QCProject(project, qc_dir=None)

Gather information about the QC associated with a project

Collects data about the QC for an AnalysisProject and makes it available via the following properties:

project: the AnalysisProject instance
qc_dir: the directory to examine for QC outputs
run_metadata: AnalysisDirMetadata instance with metadata from the parent run (if present)
processing_software: dictionary with information on software used to process the initial data

Properties that shortcut to properties of the parent AnalysisProject:

name: project name
dirn: path to associated directory
comments: comments associated with the project
info: shortcut to the project’s AnalysisProjectMetadata instance
qc_info: shortcut to the QC directory’s AnalysisProjectQCDirInfo instance
fastq_attrs: class to use for extracting data from Fastq file names

Properties based on artefacts detected in the QC directory:

fastqs: sorted list of Fastq names
reads: list of reads (e.g. ‘r1’, ‘r2’, ‘i1’ etc)
samples: sorted list of sample names extracted from Fastqs
bams: sorted list of BAM file names
multiplexed_samples: sorted list of sample names for multiplexed samples (e.g. 10x CellPlex)
organisms: sorted list of organism names
outputs: list of QC output categories detected (see below for valid values)
output_files: list of absolute paths to QC output files
software: dictionary with information on the QC software packages
stats: AttributeDictionary with useful stats from across the project

The ‘stats’ property has the following attributes:

max_seqs: maximum number of sequences across all Fastq files
min_sequence_length: minimum sequence length across all Fastq files
max_sequence_length: maximum sequence length across all Fastq files
max_sequence_length_read[READ]: maximum sequence length across all READ Fastqs (where READ is ‘r1’, ‘r2’ etc)
min_sequence_length_read[READ]: minimum sequence length across all READ Fastqs (where READ is ‘r1’, ‘r2’ etc)
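To make the relationship between the overall and per-read length statistics concrete, here is a small self-contained sketch. The input layout (a dictionary mapping each read id to a list of (min, max) length pairs, one pair per Fastq) and the function name are assumptions made for illustration only; QCProject gathers these values from the QC outputs itself.

```python
def sequence_length_stats(lengths_by_read):
    """Summarise sequence lengths overall and per read.

    lengths_by_read: hypothetical input mapping a read id ('r1', 'r2', ...)
    to a list of (min_length, max_length) pairs, one pair per Fastq.
    """
    stats = {
        "min_sequence_length_read": {},
        "max_sequence_length_read": {},
    }
    for read, ranges in lengths_by_read.items():
        # Per-read extremes across all Fastqs for that read
        stats["min_sequence_length_read"][read] = min(lo for lo, _ in ranges)
        stats["max_sequence_length_read"][read] = max(hi for _, hi in ranges)
    # Overall extremes are the extremes of the per-read values
    stats["min_sequence_length"] = min(
        stats["min_sequence_length_read"].values())
    stats["max_sequence_length"] = max(
        stats["max_sequence_length_read"].values())
    return stats


example = sequence_length_stats({
    "r1": [(26, 26), (26, 28)],
    "r2": [(35, 98), (40, 91)],
})
print(example)
```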

Valid values of the ‘outputs’ property are taken from the QCOutputs class.

General properties about the project:

is_single_cell: True if the project has single cell data (10xGenomics, ICELL8 etc)

Parameters:

project (AnalysisProject) – project to report QC for
qc_dir (str) – path to the QC output dir; relative path will be treated as a subdirectory of the project

property comments: Comments associated with the project

property dirn: Path to project directory

property id

Identifier for the project

This is a string of the form:

<RUN_ID>:<PROJECT_NAME>

e.g. MINISEQ_201120#22:PJB

If the run id can’t be determined then the name of the parent directory is used instead.
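The construction of the identifier, including the fallback, can be sketched as below. This is an illustration only: the function name and arguments are hypothetical, and it assumes that "the parent directory" means the directory containing the project directory.

```python
import os


def build_project_id(run_id, project_name, project_dir):
    """Return '<RUN_ID>:<PROJECT_NAME>' for a project (illustrative only)."""
    if run_id is None:
        # Assumed fallback: name of the directory that contains the project
        parent = os.path.dirname(os.path.abspath(project_dir))
        run_id = os.path.basename(parent)
    return f"{run_id}:{project_name}"


print(build_project_id("MINISEQ_201120#22", "PJB", "/data/201120_analysis/PJB"))
print(build_project_id(None, "PJB", "/data/201120_analysis/PJB"))
```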

property is_single_cell: Check whether project has single cell data

property name: Name of project

property run_id

Identifier for parent run

This is the standard identifier constructed from the platform, datestamp and facility run number (e.g. MINISEQ_201120#22).

If an identifier can’t be constructed then None is returned.
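A minimal sketch of how such an identifier might be assembled from its three parts is shown below. The function name, the upper-casing of the platform and the exact test for "can't be constructed" are assumptions; only the overall PLATFORM_DATESTAMP#NUMBER shape comes from the example above.

```python
def build_run_id(platform, datestamp, run_number):
    """Return e.g. 'MINISEQ_201120#22', or None if any part is missing."""
    if not platform or not datestamp or run_number is None:
        # Not enough information to construct an identifier
        return None
    return f"{platform.upper()}_{datestamp}#{run_number}"


print(build_run_id("miniseq", "201120", 22))
print(build_run_id(None, "201120", 22))
```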

software_info(pkg, exclude_processing=False)

Get information on software package

Parameters:

pkg (str) – name of software package to get information about
exclude_processing (bool) – if True then don’t fall back to processing software information if package is not found (default: False, do fall back to checking processing software)

Returns:

software version information, or None if no information is stored.

Return type:

String
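The fallback behaviour controlled by exclude_processing can be pictured with the following sketch. It assumes, purely for illustration, that the QC and processing software information are plain dictionaries mapping package names to version strings; the real properties may hold richer data, and lookup_software is a hypothetical name.

```python
def lookup_software(pkg, qc_software, processing_software,
                    exclude_processing=False):
    """Look up a package version, optionally falling back to processing info."""
    if pkg in qc_software:
        return qc_software[pkg]
    if not exclude_processing and pkg in processing_software:
        # Fall back to the software used to process the initial data
        return processing_software[pkg]
    return None


qc = {"fastqc": "0.11.9"}
processing = {"bcl2fastq": "2.20.0"}
print(lookup_software("bcl2fastq", qc, processing))
print(lookup_software("bcl2fastq", qc, processing, exclude_processing=True))
```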

class auto_process_ngs.qc.reporting.QCReport(projects, title=None, qc_dir=None, report_attrs=None, summary_fields=None, relpath=None, data_dir=None, suppress_warning=False)

Create a QC report document for one or more projects

Example usage:

>>> report = QCReport(project)
>>> report.write("qc_report.html")

To control the fields written to the summary table, specify a list of field names via the ‘summary_fields’ argument. Valid field names are:

sample: sample name
fastq: Fastq name
fastqs: Fastq R1/R2 names
reads: number of reads
read_lengths: length of reads
read_lengths_distributions: mini-plots of read length distributions
read_counts: mini-plots of fractions of masked/padded/total reads
adapter_content: mini-plots summarising adapter content for all reads
read_lengths_dist_[read]: length dist mini-plot for [read] (r1,r2,…)
fastqc_[read]: FastQC mini-plot for [read]
boxplot_[read]: FastQC per-base-quality mini-boxplot for [read]
screens_[read]: FastQScreen mini-plots for [read]
strandedness: ‘forward’, ‘reverse’ or ‘unstranded’ for pair
cellranger_count: ‘cellranger count’ outputs for each sample

To control the elements written to the reports for each Fastq pair, specify a list of element names via the ‘report_attrs’ argument. Valid element names are:

fastqc: FastQC report
fastq_screen: FastQScreen report
program_versions: program versions
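Field names containing the ‘[read]’ placeholder need to be written out once per read before being passed via ‘summary_fields’. The helper below is a hypothetical convenience for doing that expansion; it is not part of the package, and is only a sketch of one way to build such a list.

```python
def expand_read_fields(templates, reads):
    """Expand '[read]' placeholders into one field name per read."""
    fields = []
    for template in templates:
        if "[read]" in template:
            # e.g. 'fastqc_[read]' -> 'fastqc_r1', 'fastqc_r2'
            fields.extend(template.replace("[read]", r) for r in reads)
        else:
            fields.append(template)
    return fields


summary_fields = expand_read_fields(
    ["sample", "fastqc_[read]", "screens_[read]"], ["r1", "r2"])
print(summary_fields)
```

The resulting list could then be supplied as the ‘summary_fields’ argument when creating the report.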

Parameters:

projects (list) – list of AnalysisProject instances to report QC for
title (str) – title for the report (defaults to “QC report: <PROJECT_NAME>”)
qc_dir (str) – path to the QC output dir; relative path will be treated as a subdirectory of the project
report_attrs (list) – list of elements to report for each Fastq pair
summary_fields (list) – list of fields to report for each sample in the summary table
relpath (str) – if set then make link paths relative to ‘relpath’
data_dir (str) – if set then copy external data files to this directory and make link paths to these copies; relative path will be treated as a subdirectory of the project
suppress_warning (bool) – if True then don’t show the warning message even when there are missing metrics (default: show the warning if there are missing metrics)

add_multiplex_analysis_table(project, fields, section): Create a new table for summarising 10x multiplexing analyses

add_single_library_analysis_table(package, project, fields, section): Create a new table for summarising 10x single library analyses

add_summary_table(project, fields, section)

Create a new table for summarising samples from a project

Associated CSS classes are ‘summary’ and ‘fastq_summary’

fetch_qc_dir(project)

Return path to QC dir for reporting

If a ‘data directory’ has been defined for this report then QC artefacts will have been copied to a project-specific subdirectory of that directory; otherwise QC artefacts will be in the QC directory of the original project.

report_additional_metrics(project, section)

Report additional QC metrics

Creates a summary table in the specified section to report additional metrics to those in the main summary section (e.g. adapters, FastQC summary, read length distributions etc)

Parameters:

project (QCProject) – project to report
section (Section) – document section to add the report to

report_comments(project)

Report the comments associated with a project

Adds the comments from the project metadata as a list to the comments section.

Parameters:

project (QCProject) – project to report

report_extra_outputs(project, section)

Report extra/external outputs

Populates the specified section with a list of links to the extra/external outputs referenced in the ‘extra_outputs.tsv’ file in the QC directory of the project.

Parameters:

project (QCProject) – project to report
section (Section) – document section to add the extra outputs to

report_genebody_coverage(project, section)

Add RSeQC gene body coverage reports to a document section

Parameters:

project (QCProject) – parent project
section (Section) – section to add the report to

report_insert_size_metrics(project, section)

Add links to insert size metrics to a document section

Parameters:

project (QCProject) – parent project
section (Section) – section to add the report to

report_metadata(project, tbl, items)

Report project metadata to a table

Adds entries for project metadata items to the specified table

Parameters:

project (QCProject) – project to report
tbl (Table) – table to report the metadata items to
items (list) – list of metadata items to report to the table

report_multiplexing_analyses(project, sample, multiplexing_analysis_table, fields)

Report the multiplexing analyses for a sample

Writes lines to the multiplexing analysis summary table for each analysis found that is associated with the specified sample.

Parameters:

project (QCProject) – project to report
sample (str) – name of multiplexed sample to report
multiplexing_analysis_table (Table) – summary table to report each analysis in
fields (list) – list of fields to report for each analysis in the summary table

report_multiqc(project, section)

Add link to MultiQC report to a document section

Parameters:

project (QCProject) – parent project
section (Section) – section to add the report to

report_processing_software(project)

Report the software versions used in the processing

Adds entries for the software versions to the “processing software” table in the report

report_qc_software(project)

Report the software versions used in the QC

Adds entries for the software versions to the “qc_software” table in the report

report_sample(project, sample, report_attrs, summary_table, summary_fields)

Report the QC for a sample

Reports the QC for the sample and Fastqs to the summary table and appends a section with detailed reports to the document.

Parameters:

project (QCProject) – project to report
sample (str) – name of sample to report
report_attrs (list) – list of elements to report for each set of Fastqs
summary_table (Table) – summary table to report each sample in
summary_fields (list) – list of fields to report for each sample in the summary table

report_single_library_analyses(package, project, sample, single_library_analysis_table, fields)

Report the single library analyses for a sample

Writes lines to the single library analysis summary table for each analysis found that is associated with the specified sample.

Parameters:

package (str) – name of 10x package to report
project (QCProject) – project to report
sample (str) – name of sample to report
single_library_analysis_table (Table) – summary table to report each analysis in
fields (list) – list of fields to report for each analysis in the summary table

report_status(): Set the visibility of the “warnings” section

class auto_process_ngs.qc.reporting.SampleQCReporter(project, sample, is_seq_data=True, qc_dir=None, fastq_attrs=<class 'auto_process_ngs.analysis.AnalysisFastq'>)

Utility class for reporting the QC for a sample

Provides the following properties:

sample: name of the sample
fastqs: list of the Fastqs associated with the sample
reads: list of read ids e.g. [‘r1’,’r2’]
fastq_groups: list of FastqGroupQCReporter instances from grouped Fastqs associated with the sample
cellranger_count: list of CellrangerCount instances associated with the sample
multiome_libraries: MultiomeLibraries instance with data for 10xGenomics libraries for the project (or None, if the libraries file doesn’t exist)

Provides the following methods:

report_fastq_groups: add reports for Fastq groups in the sample to a document section
update_summary_table: add lines to summary table for the sample
update_single_library_table: add lines to the single library analysis summary table
update_multiplexing_analysis_table: add lines to the multiplexing analysis summary table

get_10x_value(field, cellranger_data, metrics, web_summary, relpath=None)

Return the value for the specified field

The following fields can be reported for each sample:

10x_cells
10x_reads_per_cell
10x_genes_per_cell
10x_frac_reads_in_cell
10x_fragments_per_cell
10x_fragments_overlapping_targets
10x_fragments_overlapping_peaks
10x_tss_enrichment_score
10x_atac_fragments_per_cell
10x_gex_genes_per_cell
10x_genes_detected
10x_umis_per_cell
10x_pipeline
10x_reference
10x_web_summary
linked_sample

Parameters:

field (str) – name of the field to report; if the field is not recognised then KeyError is raised
cellranger_data (object) – parent CellrangerCount object for the sample
metrics (MetricsSummary) – summary metrics for the sample
web_summary (str) – path to the web_summary.html report for the sample
relpath (str) – if set then make link paths relative to ‘relpath’

get_value(field, relpath=None)

Return the value for the specified field

The following fields can be reported for each Fastq pair:

sample
cellranger_count

Parameters:

field (str) – name of the field to report; if the field is not recognised then KeyError is raised
relpath (str) – if set then make link paths relative to ‘relpath’

report_fastq_groups(sample_report, attrs, relpath=None)

Add reports for all Fastq groups to a document section

Creates new subsections in ‘sample_report’ for each Fastq group, to report data on each Fastq file in the group.

The ‘attributes’ that can be reported for each Fastq are those available for the ‘report’ method of the ‘FastqGroupQCReporter’ class.

By default all attributes are reported.

Parameters:

sample_report (Section) – section to add the reports to
attrs (list) – optional list of custom ‘attributes’ to report
relpath (str) – if set then make link paths relative to ‘relpath’

update_multiplexing_analysis_table(multiplexing_analysis_table, fields=None, relpath=None)

Add lines to a table reporting multiplexing analyses

Creates new lines in ‘multiplexing_analysis_table’ for the sample (one line per multiplexed analysis), adding content for each specified field.

Valid fields are any supported by the ‘get_10x_value’ method.

Parameters:

multiplexing_analysis_table (Table) – table to update
fields (list) – list of custom fields to report
relpath (str) – if set then make link paths relative to ‘relpath’

Returns:

True if report didn’t contain any issues, False otherwise.

Return type:

Boolean

update_single_library_table(package, single_library_table, fields=None, relpath=None)

Add lines to a table reporting single library analyses

Creates new lines in ‘single_library_table’ for the sample (one line per single library analysis group), adding content for each specified field.

Valid fields are any supported by the ‘get_10x_value’ method.

Parameters:

package (str) – 10x package to report on
single_library_table (Table) – table to update
fields (list) – list of custom fields to report
relpath (str) – if set then make link paths relative to ‘relpath’

Returns:

True if report didn’t contain any issues, False otherwise.

Return type:

Boolean

update_summary_table(summary_table, fields=None, sample_report=None, relpath=None)

Add lines to a summary table reporting a sample

Creates new lines in ‘summary_table’ for the sample (one line per Fastq group), adding content for each specified field.

See the ‘get_value’ method for a list of valid fields.

Parameters:

summary_table (Table) – table to update
fields (list) – list of custom fields to report
relpath (str) – if set then make link paths relative to ‘relpath’

Returns:

True if report didn’t contain any issues, False otherwise.

Return type:

Boolean

class auto_process_ngs.qc.reporting.ToggleButton(toggle_section, show_text='Show', hide_text='Hide', css_classes=None, help_text=None, hidden_by_default=True)

Utility class for creating a ‘toggle button’

A ‘toggle button’ is an HTML button that is linked to a document section such that successive button clicks toggle the section’s visibility, between ‘visible’ and ‘hidden’ states.

The toggle functionality is provided by a JavaScript function which changes the section’s ‘display’ attribute to ‘block’ (to make it visible) and ‘none’ (to hide it).

The button will always have the ‘toggle_button’ CSS class associated with it, in addition to any others specified.

Example usage:

>>> d = Document()
>>> s = Section('Toggle section')
>>> b = ToggleButton(s)
>>> d.add(b,s)

Parameters:

toggle_section (Section) – section to toggle the visibility of
show_text (str) – text to show on the button when the section is in the hidden state
hide_text (str) – text to show on the button when the section is in the visible state
help_text (str) – text to show when the mouse is over the button
css_classes (list) – additional CSS classes to associate with the button
hidden_by_default (bool) – if True then the section will be hidden by default

html()

Generate HTML version of the toggle button

Returns:

HTML representation of the button.

Return type:

String
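The toggling mechanism described for this class can be sketched independently of the package as follows. This is not the HTML that ToggleButton.html() produces; the function name, the use of an element id to locate the section, and the inline onclick handler are all assumptions chosen to keep the example self-contained. It also assumes that the target element carries an inline ‘display’ style and that the button texts contain no quote characters.

```python
from html import escape


def toggle_button_html(section_id, show_text="Show", hide_text="Hide",
                       css_classes=None, hidden_by_default=True):
    """Return HTML for a button that toggles one element's visibility."""
    # The 'toggle_button' class is always present, extras are appended
    classes = " ".join(["toggle_button"] + list(css_classes or []))
    label = show_text if hidden_by_default else hide_text
    # JavaScript run on each click: flip 'display' and swap the label
    onclick = (
        f"var s=document.getElementById('{section_id}');"
        "var hide=(s.style.display!=='none');"
        "s.style.display=hide?'none':'block';"
        f"this.textContent=hide?'{show_text}':'{hide_text}';"
    )
    return (f'<button class="{classes}" onclick="{escape(onclick)}">'
            f'{escape(label)}</button>')


button = toggle_button_html("warnings", css_classes=["help"])
print(button)
```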

auto_process_ngs.qc.reporting.pretty_print_reads(n)

Print the number of reads with commas at each thousand

For example:

>>> pretty_print_reads(10409789)
10,409,789

Parameters:

n (int) – number of reads

Returns:

representation with commas for every thousand.

Return type:

String
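For reference, the same formatting can be reproduced with Python’s built-in format specification, as in the sketch below. The function name is hypothetical and this is not necessarily how pretty_print_reads is implemented internally.

```python
def format_with_commas(n):
    """Return an integer as a string with commas at each thousand."""
    # The ',' option in the format spec inserts thousands separators
    return f"{n:,}"


print(format_with_commas(10409789))
```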

auto_process_ngs.qc.reporting.report(projects, title=None, filename=None, qc_dir=None, report_attrs=None, summary_fields=None, relative_links=False, use_data_dir=False, make_zip=False, suppress_warning=False)

Report the QC for a project

Parameters:

projects (list) – AnalysisProject instances to report QC for
title (str) – optional, specify title for the report (defaults to ‘<PROJECT_NAME>: QC report’)
filename (str) – optional, specify path and name for the output report file (defaults to ‘<PROJECT_NAME>.qc_report.html’)
qc_dir (str) – path to the QC output dir
report_attrs (list) – optional, list of elements to report for each Fastq pair
summary_fields (list) – optional, list of fields to report for each sample in the summary table
relative_links (boolean) – optional, if set to True then use relative paths for links in the report (default is to use absolute paths)
use_data_dir (boolean) – if True then copy QC artefacts to a data directory parallel to the output report file
make_zip (boolean) – if True then also create a ZIP archive of the QC report and outputs (default is not to create the ZIP archive)
suppress_warning (bool) – if True then don’t show the warning message even when there are missing metrics (default: show the warning if there are missing metrics)

Returns:

filename of the output HTML report.

Return type:

String

auto_process_ngs.qc.reporting.sanitize_name(s, new_char='_')

Replace ‘unsafe’ characters in HTML link targets

Replaces ‘unsafe’ characters in a string used as a link target in an HTML document with underscore characters.

Parameters:

s (str) – string to sanitize
new_char (str) – string to replace ‘unsafe’ characters with (default: ‘_’)
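The general idea can be sketched with a regular expression substitution, as below. Which characters count as ‘unsafe’ is not stated here, so the character class used in this sketch (anything other than ASCII letters, digits, underscore, dot and hyphen) is an assumption, and the function name is hypothetical.

```python
import re


def sanitize_link_target(s, new_char="_"):
    """Replace characters assumed to be unsafe in HTML link targets."""
    # Assumed definition of 'unsafe': anything outside [A-Za-z0-9_.-]
    return re.sub(r"[^A-Za-z0-9_.-]", new_char, s)


print(sanitize_link_target("MINISEQ_201120#22:PJB"))
```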
