auto_process_ngs.qc.reporting
Utilities for reporting QC pipeline outputs.
Provides the following core class:
QCReport: create QC report document for one or more projects
In addition there are a number of supporting classes:
QCProject: gather information about the QC associated with a project
SampleQCReporter: reports the QC for a sample
FastqGroupQCReporter: reports the QC for a group of Fastqs
FastqQCReporter: interface to QC outputs for a single Fastq
There are also a number of utility functions:
report: report the QC for a project
pretty_print_reads: print number of reads with commas at each thousand
sanitize_name: replace ‘unsafe’ characters in HTML link targets
Overview
The SampleQCReporter, FastqGroupQCReporter and FastqQCReporter classes are used by the top-level QCProject class to report QC outputs at the level of samples (for example, single library analyses), groups of Fastqs (for example, strandedness), and individual Fastqs (for example, FastQC or screen data).
Adding support for new metadata
Support for new metadata items should be implemented within the _init_metadata_table method of QCReport. Descriptions of new items should also be added to the METADATA_FIELD_DESCRIPTIONS module constant.
Adding support for new QC outputs
When adding reporting of new QC outputs it is recommended first to ensure that they are detected by the QCOutputs class (in the outputs module); then the relevant reporter class should be extended depending on the level that the QC outputs are associated with (i.e. project, sample, Fastq group or individual Fastq).
Typically this is done by adding support for new summary table fields, which can be implemented within the get_value or get_10x_value methods of SampleQCReporter (for sample-level QC) or the get_value method of FastqGroupQCReporter (for Fastq-group level QC). (This may require additional internal functionality to be implemented within the relevant class.)
Descriptions for new fields also need to be added to the SUMMARY_FIELD_DESCRIPTIONS module constant. Additionally: if the QC outputs are produced using new software packages then these should be added to the SOFTWARE_PACKAGE_NAMES module constant; as long as these are reported by QCOutputs then they will also be listed automatically within the QC report.
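The pattern described above (a new summary field handled in get_value, plus a matching description) can be sketched in isolation as follows. This is a toy illustration only: the classes, the field name my_new_metric and the structure of the descriptions dictionary are all hypothetical stand-ins, and in the real module the change would be made inside the existing reporter class rather than in a subclass.

```python
# Hypothetical stand-in for the module constant of the same name;
# the real constant's structure may differ
SUMMARY_FIELD_DESCRIPTIONS = {
    "reads": ("#reads", "Number of reads"),
}


class GroupReporterSketch:
    """Toy reporter that maps summary field names to values"""

    def __init__(self, nreads):
        self.nreads = nreads

    def get_value(self, field, relpath=None):
        if field == "reads":
            return format(self.nreads, ",")
        # Unrecognised fields raise KeyError, as documented for get_value
        raise KeyError("'%s': unrecognised field" % field)


class ExtendedReporterSketch(GroupReporterSketch):
    """Toy reporter with support for one extra (hypothetical) field"""

    def __init__(self, nreads, my_metric):
        super().__init__(nreads)
        self.my_metric = my_metric

    def get_value(self, field, relpath=None):
        if field == "my_new_metric":
            return "%.1f%%" % self.my_metric
        # Fall back to the existing fields
        return super().get_value(field, relpath=relpath)


# A description for the new field is registered alongside the code change
SUMMARY_FIELD_DESCRIPTIONS["my_new_metric"] = (
    "My metric", "Hypothetical new QC metric")

reporter = ExtendedReporterSketch(1000, 12.34)
print(reporter.get_value("my_new_metric"))
```

Keeping the field name as the single key shared by the value lookup and the description table means a new field only needs to be declared in those two places.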
- class auto_process_ngs.qc.reporting.FastqGroupQCReporter(fastqs, qc_dir, project, project_id=None, fastq_attrs=<class 'auto_process_ngs.analysis.AnalysisFastq'>, is_seq_data=True)
Utility class for reporting the QC for a Fastq group
Provides the following properties:
reads: list of read ids e.g. [‘r1’,’r2’]
fastqs: dictionary mapping read ids to Fastq paths
reporters: dictionary mapping read ids to FastqQCReporter instances
paired_end: whether FastqGroup is paired end
bam: associated BAM file name
fastq_strand_txt: location of associated Fastq_strand output
Provides the following methods:
infer_experiment: fetch data from RSeQC ‘infer_experiment.py’
ustrandednessplot: return mini-plot of strandedness data from RSeQC ‘infer_experiment.py’
insert_size_metrics: fetch insert size metrics
uinsertsizeplot: return mini-plot of insert size histogram
qualimap_rnaseq: fetch Qualimap ‘rnaseq’ metrics
ucoverageprofileplot: return mini-plot of gene coverage profile
ugenomicoriginplot: return mini-plot of genomic origin of reads data
strandedness: fetch strandedness data for this group
ustrandplot: return mini-strand stats summary plot
report_strandedness: write report for strandedness
report: write report for the group
update_summary_table: add line to summary table for the group
- Parameters:
fastqs (list) – list of paths to Fastqs in the group
qc_dir (str) – path to the QC output dir; relative path will be treated as a subdirectory of the project
project (QCProject) – parent project
project_id (str) – identifier for the project
fastq_attrs (BaseFastqAttrs) – class for extracting data from Fastq names
is_seq_data (bool) – if True then indicates that the group contains biological data
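The handling of a relative qc_dir described above can be pictured with a small standalone sketch. The helper name resolve_qc_dir_sketch and the example paths are hypothetical, and the class's own logic may differ in detail.

```python
import os


def resolve_qc_dir_sketch(project_dir, qc_dir):
    """Return qc_dir unchanged if absolute, else as a subdir of the project"""
    if os.path.isabs(qc_dir):
        return qc_dir
    # Relative paths are interpreted as subdirectories of the project
    return os.path.join(project_dir, qc_dir)


print(resolve_qc_dir_sketch(os.path.join(os.sep, "data", "PJB"), "qc"))
```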
- property fastq_strand_txt
Locate output from fastq_strand (None if not found)
- get_value(field, relpath=None)
Return the value for the specified field
The following fields can be reported for each Fastq group:
fastqs (if paired-end)
fastq (if single-end)
bam_file
reads
read_lengths
read_counts
sequence_duplication
adapter_content
read_lengths_dist_r1
read_lengths_dist_r2
read_lengths_dist_r3
boxplot_r1
boxplot_r2
boxplot_r3
fastqc_r1
fastqc_r2
fastqc_r3
screens_r1
screens_r2
screens_r3
strandedness
strand_specificity
insert_size_histogram
coverage_profile_along_genes
reads_genomic_origin
- Parameters:
field (str) – name of the field to report; if the field is not recognised then KeyError is raised
relpath (str) – if set then make link paths relative to ‘relpath’
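As a rough illustration of what making link paths relative to ‘relpath’ means, the sketch below uses the standard library's os.path.relpath. The helper name link_target_sketch and the paths are hypothetical; this is not necessarily how the method computes its links.

```python
import os


def link_target_sketch(path, relpath=None):
    """Return a link target, made relative to 'relpath' if it is set"""
    if relpath:
        return os.path.relpath(path, relpath)
    # No relpath: leave the path as it is
    return path


report_dir = os.path.join(os.sep, "data", "PJB")
plot = os.path.join(report_dir, "qc", "PJB1_S1_R1_001_fastqc.html")
print(link_target_sketch(plot, relpath=report_dir))
```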
- infer_experiment(organism)
Return RSeQC infer_experiment.py data for organism
- insert_size_metrics(organism)
Return Picard insert size metrics for specified organism
- property paired_end
True if pair consists of R1/R2 files
- qualimap_rnaseq(organism)
Return Qualimap ‘rnaseq’ outputs instance for specified organism
- report(sample_report, attrs=None, relpath=None)
Add report for Fastq group to a document section
Creates a new subsection in ‘sample_report’ for the Fastq pair, within which are additional subsections for each Fastq file.
The following ‘attributes’ can be reported for each Fastq:
fastqc
fastq_screen
program versions
By default all attributes are reported.
- Parameters:
sample_report (Section) – section to add the report to
attrs (list) – optional list of custom ‘attributes’ to report
relpath (str) – if set then make link paths relative to ‘relpath’
- report_strandedness(document)
Report the strandedness outputs to a document
Creates a new subsection called “Strandedness” with a table of the strandedness determination outputs.
- Parameters:
document (Section) – section to add report to
- strandedness()
Return strandedness from fastq_strand.py
- ucoverageprofileplot(organism)
Return a mini-plot of the Qualimap gene coverage profile
- ugenomicoriginplot(organism, width=100, height=40)
Return a mini-barplot of the Qualimap genomic origin of reads data
- uinsertsizeplot(organism)
Return a mini-plot with the Picard insert size histogram
- update_summary_table(summary_table, idx=None, fields=None, relpath=None)
Add a line to a summary table reporting a Fastq group
Creates a new line in ‘summary_table’ (or updates an existing line) for the Fastq pair, adding content for each specified field.
See the ‘get_value’ method for a list of valid fields.
- Parameters:
summary_table (Table) – table to add the summary to
idx (int) – if supplied then indicates which existing table row to update (if None then a new row is appended)
fields (list) – list of custom fields to report
relpath (str) – if set then make link paths relative to ‘relpath’
- Returns:
True if report didn’t contain any issues, False otherwise.
- Return type:
Boolean
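The ‘idx’ behaviour (append a new row when it is None, otherwise update an existing one) and the boolean return value can be sketched with plain Python lists and dictionaries. Everything here is an assumption made for illustration: the function name, the use of dictionaries for rows, and in particular the idea that an "issue" means a missing (None) value.

```python
def update_row_sketch(rows, values, idx=None):
    """Append or update a row; return True if no values were missing"""
    if idx is None:
        # No index supplied: append a fresh row and update that one
        rows.append({})
        idx = len(rows) - 1
    rows[idx].update(values)
    # Assumption: a value of None stands for a missing/problem metric
    return all(v is not None for v in values.values())


table = []
ok = update_row_sketch(table, {"fastq": "PJB1_S1_R1_001", "reads": 1200})
ok_again = update_row_sketch(table, {"strandedness": None}, idx=0)
print(len(table), ok, ok_again)
```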
- ustrandednessplot(organism, width=50, height=24)
Return a mini-plot for RSeQC strandedness data
- ustrandplot()
Return a mini-strand stats summary plot
- class auto_process_ngs.qc.reporting.FastqQCReporter(fastq, qc_dir, project_id=None, fastq_attrs=<class 'auto_process_ngs.analysis.AnalysisFastq'>)
Provides interface to QC outputs for Fastq file
Provides the following attributes:
name: basename of the Fastq
path: path to the Fastq
safe_name: name suitable for use in HTML links etc
sample_name: sample name derived from the Fastq basename
sequence_lengths: SeqLens instance
fastqc: Fastqc instance
fastq_screen.names: list of FastQScreen names
fastq_screen.SCREEN.description: description of SCREEN
fastq_screen.SCREEN.png: associated PNG file for SCREEN
fastq_screen.SCREEN.txt: associated TXT file for SCREEN
fastq_screen.SCREEN.version: associated version for SCREEN
program_versions.NAME: version of package NAME
adapters: list of adapters from Fastqc
adapters_summary: dictionary summarising adapter content
Provides the following methods:
report_fastqc
report_fastq_screens
report_program_versions
useqlenplot
ureadcountplot
uboxplot
ufastqcplot
uduplicationplot
uadapterplot
uscreenplot
- Parameters:
fastq (str) – path to Fastq file
qc_dir (str) – path to QC directory
project_id (str) – identifier for the parent project
fastq_attrs (BaseFastqAttrs) – class for extracting data from Fastq names
- fastqc_summary()
Return plaintext version of the FastQC summary
- report_fastq_screens(document, relpath=None)
Report the FastQScreen outputs to a document
Creates a new subsection called “Screens” with copies of the screen plots for each screen and links to the “raw” text files.
- Parameters:
document (Section) – section to add report to
relpath (str) – if set then make link paths relative to ‘relpath’
- report_fastqc(document, relpath=None)
Report the FastQC outputs to a document
Creates a new subsection called “FastQC” with a copy of the FastQC sequence quality boxplot and a summary table of the results from each FastQC module.
- Parameters:
document (Section) – section to add report to
relpath (str) – if set then make link paths relative to ‘relpath’
- report_program_versions(document)
Report the program versions to a document
Creates a new subsection called “Program versions” with a table listing the versions of the QC programs.
- Parameters:
document (Section) – section to add report to
relpath (str) – if set then make link paths relative to ‘relpath’
- uadapterplot(height=40, multi_bar=False, inline=True)
Return a mini-adapter content summary plot
- Parameters:
height (int) – optionally set the plot height in pixels
multi_bar (bool) – if True then create a plot with one bar per adapter class (otherwise summarise adapter content on one bar in the plot)
inline (bool) – if True then return plot in format for inlining in HTML document
- uboxplot(inline=True)
Return a mini-sequence quality boxplot
- Parameters:
inline (bool) – if True then return plot in format for inlining in HTML document
- uduplicationplot(mode='dup', inline=True)
Return a mini-sequence duplication plot
- Parameters:
mode (str) – either ‘dup’ or ‘dedup’
inline (bool) – if True then return plot in format for inlining in HTML document
- ufastqcplot(inline=True)
Return a mini-FastQC summary plot
- Parameters:
inline (bool) – if True then return plot in format for inlining in HTML document
- ureadcountplot(max_reads=None, inline=True)
Return a mini-sequence composition plot
- Parameters:
max_reads (int) – if set then scale the reads for this Fastq against this value in the plot
inline (bool) – if True then return plot in format for inlining in HTML document
- uscreenplot(inline=True)
Return a mini-FastQScreen summary plot
- Parameters:
inline (bool) – if True then return plot in format for inlining in HTML document
- useqlenplot(max_len=None, min_len=None, height=None, inline=True)
Return a mini-sequence length distribution plot
- Parameters:
max_len (int) – set the upper limit of the x-axis
min_len (int) – set the lower limit of the x-axis
height (int) – optionally set the plot height in pixels
inline (bool) – if True then return plot in format for inlining in HTML document
- class auto_process_ngs.qc.reporting.QCProject(project, qc_dir=None)
Gather information about the QC associated with a project
Collects data about the QC for an AnalysisProject and makes it available via the following properties:
project: the AnalysisProject instance
qc_dir: the directory to examine for QC outputs
run_metadata: AnalysisDirMetadata instance with metadata from the parent run (if present)
processing_software: dictionary with information on software used to process the initial data
Properties that shortcut to properties of the parent AnalysisProject:
name: project name
dirn: path to associated directory
comments: comments associated with the project
info: shortcut to the project’s AnalysisProjectMetadata instance
qc_info: shortcut to the QC directory’s AnalysisProjectQCDirInfo instance
fastq_attrs: class to use for extracting data from Fastq file names
Properties based on artefacts detected in the QC directory:
fastqs: sorted list of Fastq names
reads: list of reads (e.g. ‘r1’, ‘r2’, ‘i1’ etc)
samples: sorted list of sample names extracted from Fastqs
bams: sorted list of BAM file names
multiplexed_samples: sorted list of sample names for multiplexed samples (e.g. 10x CellPlex)
organisms: sorted list of organism names
outputs: list of QC output categories detected (see below for valid values)
output_files: list of absolute paths to QC output files
software: dictionary with information on the QC software packages
stats: AttributeDictionary with useful stats from across the project
The ‘stats’ property has the following attributes:
max_seqs: maximum number of sequences across all Fastq files
min_sequence_length: minimum sequence length across all Fastq files
max_sequence_length: maximum sequence length across all Fastq files
max_sequence_length_read[READ]: maximum sequence length across all READ Fastqs (where READ is ‘r1’, ‘r2’ etc)
min_sequence_length_read[READ]: minimum sequence length across all READ Fastqs (where READ is ‘r1’, ‘r2’ etc)
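To make the meaning of these attributes concrete, the sketch below derives the same quantities from a small list of per-Fastq records. The input format (dictionaries with ‘read’, ‘nseqs’, ‘min_len’ and ‘max_len’ keys), the function name and the numbers are hypothetical; the class gathers its data from the QC outputs rather than from a structure like this.

```python
def project_stats_sketch(fastq_data):
    """Summarise sequence counts and lengths across a set of Fastqs"""
    stats = {
        "max_seqs": max(fq["nseqs"] for fq in fastq_data),
        "min_sequence_length": min(fq["min_len"] for fq in fastq_data),
        "max_sequence_length": max(fq["max_len"] for fq in fastq_data),
        "min_sequence_length_read": {},
        "max_sequence_length_read": {},
    }
    per_read_min = stats["min_sequence_length_read"]
    per_read_max = stats["max_sequence_length_read"]
    for fq in fastq_data:
        read = fq["read"]
        # Track the extremes separately for each read id (r1, r2, ...)
        per_read_min[read] = min(fq["min_len"],
                                 per_read_min.get(read, fq["min_len"]))
        per_read_max[read] = max(fq["max_len"],
                                 per_read_max.get(read, fq["max_len"]))
    return stats


example = [
    {"read": "r1", "nseqs": 1200, "min_len": 26, "max_len": 28},
    {"read": "r2", "nseqs": 1200, "min_len": 35, "max_len": 98},
    {"read": "r1", "nseqs": 900, "min_len": 28, "max_len": 28},
    {"read": "r2", "nseqs": 900, "min_len": 40, "max_len": 91},
]
stats = project_stats_sketch(example)
print(stats["max_seqs"], stats["max_sequence_length_read"]["r2"])
```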
Valid values of the ‘outputs’ property are taken from the QCOutputs class.
General properties about the project:
is_single_cell: True if the project has single cell data (10xGenomics, ICELL8 etc)
- Parameters:
project (AnalysisProject) – project to report QC for
qc_dir (str) – path to the QC output dir; relative path will be treated as a subdirectory of the project
- property comments
Comments associated with the project
- property dirn
Path to project directory
- property id
Identifier for the project
This is a string of the form:
<RUN_ID>:<PROJECT_NAME>
e.g.
MINISEQ_201120#22:PJB
If the run id can’t be determined then the name of the parent directory is used instead.
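The identifier format and its fallback can be sketched as below. The helper name and arguments are hypothetical, and the sketch assumes that "the parent directory" means the directory containing the project directory.

```python
import os


def project_id_sketch(run_id, project_name, project_dir):
    """Build an identifier of the form <RUN_ID>:<PROJECT_NAME>"""
    if run_id is None:
        # Assumption: fall back to the name of the directory that
        # contains the project directory
        parent = os.path.dirname(os.path.abspath(project_dir))
        run_id = os.path.basename(parent)
    return "%s:%s" % (run_id, project_name)


print(project_id_sketch("MINISEQ_201120#22", "PJB", "PJB"))
```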
- property is_single_cell
Check whether project has single cell data
- property name
Name of project
- property run_id
Identifier for parent run
This is the standard identifier constructed from the platform, datestamp and facility run number (e.g. MINISEQ_201120#22). If an identifier can’t be constructed then None is returned.
- software_info(pkg, exclude_processing=False)
Get information on software package
- Parameters:
pkg (str) – name of software package to get information about
exclude_processing (bool) – if True then don’t fall back to processing software information if package is not found (default: False, do fall back to checking processing software)
- Returns:
software version information, or None if no information is stored.
- Return type:
String
- class auto_process_ngs.qc.reporting.QCReport(projects, title=None, qc_dir=None, report_attrs=None, summary_fields=None, relpath=None, data_dir=None, suppress_warning=False)
Create a QC report document for one or more projects
Example usage:
>>> report = QCReport(project)
>>> report.write("qc_report.html")
To control the fields written to the summary table, specify a list of field names via the ‘summary_fields’ argument. Valid field names are:
sample: sample name
fastq: Fastq name
fastqs: Fastq R1/R2 names
reads: number of reads
read_lengths: length of reads
read_lengths_distributions: mini-plots of read length distributions
read_counts: mini-plots of fractions of masked/padded/total reads
adapter_content: mini-plots summarising adapter content for all reads
read_lengths_dist_[read]: length dist mini-plot for [read] (r1,r2,…)
fastqc_[read]: FastQC mini-plot for [read]
boxplot_[read]: FastQC per-base-quality mini-boxplot for [read]
screens_[read]: FastQScreen mini-plots for [read]
strandedness: ‘forward’, ‘reverse’ or ‘unstranded’ for pair
cellranger_count: ‘cellranger count’ outputs for each sample
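Field names written with a ‘[read]’ placeholder stand for one concrete name per read (for example fastqc_r1 and fastqc_r2). When building a ‘summary_fields’ list by hand, a small helper along the following lines could expand the placeholders; the helper is hypothetical and is not part of the module.

```python
def expand_read_fields_sketch(templates, reads):
    """Expand '[read]' placeholders into one field name per read id"""
    fields = []
    for template in templates:
        if "[read]" in template:
            # One concrete field per read, in the order given
            fields.extend(template.replace("[read]", r) for r in reads)
        else:
            fields.append(template)
    return fields


summary_fields = expand_read_fields_sketch(
    ["sample", "fastqs", "reads", "fastqc_[read]", "boxplot_[read]"],
    ["r1", "r2"])
print(summary_fields)
```

The resulting list could then be supplied as the ‘summary_fields’ argument.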
To control the elements written to the reports for each Fastq pair, specify a list of element names via the ‘report_attrs’ argument. Valid element names are:
fastqc: FastQC report
fastq_screen: FastQScreen report
program_versions: program versions
- Parameters:
projects (AnalysisProject) – list of projects to report QC for
title (str) – title for the report (defaults to “QC report: <PROJECT_NAME>”)
qc_dir (str) – path to the QC output dir; relative path will be treated as a subdirectory of the project
report_attrs (list) – list of elements to report for each Fastq pair
summary_fields (list) – list of fields to report for each sample in the summary table
relpath (str) – if set then make link paths relative to ‘relpath’
data_dir (str) – if set then copy external data files to this directory and make link paths to these copies; relative path will be treated as a subdirectory of the project
suppress_warning (bool) – if True then don’t show the warning message even when there are missing metrics (default: show the warning if there are missing metrics)
- add_multiplex_analysis_table(project, fields, section)
Create a new table for summarising 10x multiplexing analyses
- add_single_library_analysis_table(package, project, fields, section)
Create a new table for summarising 10x single library analyses
- add_summary_table(project, fields, section)
Create a new table for summarising samples from a project
Associated CSS classes are ‘summary’ and ‘fastq_summary’
- fetch_qc_dir(project)
Return path to QC dir for reporting
If a ‘data directory’ has been defined for this report then QC artefacts will have been copied to a project-specific subdirectory of that directory; otherwise QC artefacts will be in the QC directory of the original project.
- report_additional_metrics(project, section)
Report additional QC metrics
Creates a summary table in the specified section to report additional metrics to those in the main summary section (e.g. adapters, FastQC summary, read length distributions etc)
- report_comments(project)
Report the comments associated with a project
Adds the comments from the project metadata as a list to the comments section.
- Parameters:
project (QCProject) – project to report
- report_extra_outputs(project, section)
Report extra/external outputs
Populates the specified section with a list of links to the extra/external outputs referenced in the ‘extra_outputs.tsv’ file in the QC directory of the project.
- report_genebody_coverage(project, section)
Add RSeQC gene body coverage reports to a document section
- report_insert_size_metrics(project, section)
Add links to insert size metrics to a document section
- report_metadata(project, tbl, items)
Report project metadata to a table
Adds entries for project metadata items to the specified table
- report_multiplexing_analyses(project, sample, multiplexing_analysis_table, fields)
Report the multiplexing analyses for a sample
Writes lines to the multiplexing analysis summary table for each analysis found that is associated with the specified sample.
- report_multiqc(project, section)
Add link to MultiQC report to a document section
- report_processing_software(project)
Report the software versions used in the processing
Adds entries for the software versions to the “processing software” table in the report
- report_qc_software(project)
Report the software versions used in the QC
Adds entries for the software versions to the “qc_software” table in the report
- report_sample(project, sample, report_attrs, summary_table, summary_fields)
Report the QC for a sample
Reports the QC for the sample and Fastqs to the summary table and appends a section with detailed reports to the document.
- Parameters:
project (QCProject) – project to report
sample (str) – name of sample to report
report_attrs (list) – list of elements to report for each set of Fastqs
summary_table (Table) – summary table to report each sample in
summary_fields (list) – list of fields to report for each sample in the summary table
- report_single_library_analyses(package, project, sample, single_library_analysis_table, fields)
Report the single library analyses for a sample
Writes lines to the single library analysis summary table for each analysis found that is associated with the specified sample.
- report_status()
Set the visibility of the “warnings” section
- class auto_process_ngs.qc.reporting.SampleQCReporter(project, sample, is_seq_data=True, qc_dir=None, fastq_attrs=<class 'auto_process_ngs.analysis.AnalysisFastq'>)
Utility class for reporting the QC for a sample
Provides the following properties:
sample: name of the sample
fastqs: list of the Fastqs associated with the sample
reads: list of read ids e.g. [‘r1’,’r2’]
fastq_groups: list of FastqGroupQCReporter instances from grouped Fastqs associated with the sample
cellranger_count: list of CellrangerCount instances associated with the sample
multiome_libraries: MultiomeLibraries instance with data for 10xGenomics libraries for the project (or None, if the libraries file doesn’t exist)
Provides the following methods:
report_fastq_groups: add reports for Fastq groups in the sample to a document section
update_summary_table: add lines to summary table for the sample
update_single_library_table: add lines to the single library analysis summary table
update_multiplexing_analysis_table: add lines to the multiplexing analysis summary table
- get_10x_value(field, cellranger_data, metrics, web_summary, relpath=None)
Return the value for the specified field
The following fields can be reported for each sample:
10x_cells
10x_reads_per_cell
10x_genes_per_cell
10x_frac_reads_in_cell
10x_fragments_per_cell
10x_fragments_overlapping_targets
10x_fragments_overlapping_peaks
10x_tss_enrichment_score
10x_atac_fragments_per_cell
10x_gex_genes_per_cell
10x_genes_detected
10x_umis_per_cell
10x_pipeline
10x_reference
10x_web_summary
linked_sample
- Parameters:
field (str) – name of the field to report; if the field is not recognised then KeyError is raised
cellranger_data (object) – parent CellrangerCount object for the sample
metrics (MetricsSummary) – summary metrics for the sample
web_summary (str) – path to the web_summary.html report for the sample
relpath (str) – if set then make link paths relative to ‘relpath’
- get_value(field, relpath=None)
Return the value for the specified field
The following fields can be reported for each Fastq pair:
sample
cellranger_count
- Parameters:
field (str) – name of the field to report; if the field is not recognised then KeyError is raised
relpath (str) – if set then make link paths relative to ‘relpath’
- report_fastq_groups(sample_report, attrs, relpath=None)
Add reports for all Fastq groups to a document section
Creates new subsections in ‘sample_report’ for each Fastq group, to report data on each Fastq file in the group.
The ‘attributes’ that can be reported for each Fastq are those available for the ‘report’ method of the ‘FastqGroupQCReporter’ class.
By default all attributes are reported.
- Parameters:
sample_report (Section) – section to add the reports to
attrs (list) – optional list of custom ‘attributes’ to report
relpath (str) – if set then make link paths relative to ‘relpath’
- update_multiplexing_analysis_table(multiplexing_analysis_table, fields=None, relpath=None)
Add lines to a table reporting multiplexing analyses
Creates new lines in ‘multiplexing_analysis_table’ for the sample (one line per multiplexed analysis), adding content for each specified field.
Valid fields are any supported by the ‘get_10x_value’ method.
- Parameters:
multiplexing_analysis_table (Table) – table to update
fields (list) – list of custom fields to report
relpath (str) – if set then make link paths relative to ‘relpath’
- Returns:
True if report didn’t contain any issues, False otherwise.
- Return type:
Boolean
- update_single_library_table(package, single_library_table, fields=None, relpath=None)
Add lines to a table reporting single library analyses
Creates new lines in ‘single_library_table’ for the sample (one line per single library analysis group), adding content for each specified field.
Valid fields are any supported by the ‘get_10x_value’ method.
- Parameters:
package (str) – 10x package to report on
single_library_table (Table) – table to update
fields (list) – list of custom fields to report
relpath (str) – if set then make link paths relative to ‘relpath’
- Returns:
True if report didn’t contain any issues, False otherwise.
- Return type:
Boolean
- update_summary_table(summary_table, fields=None, sample_report=None, relpath=None)
Add lines to a summary table reporting a sample
Creates new lines in ‘summary_table’ for the sample (one line per Fastq group), adding content for each specified field.
See the ‘get_value’ method for a list of valid fields.
- Parameters:
summary_table (Table) – table to update
fields (list) – list of custom fields to report
relpath (str) – if set then make link paths relative to ‘relpath’
- Returns:
True if report didn’t contain any issues, False otherwise.
- Return type:
Boolean
- class auto_process_ngs.qc.reporting.ToggleButton(toggle_section, show_text='Show', hide_text='Hide', css_classes=None, help_text=None, hidden_by_default=True)
Utility class for creating a ‘toggle button’
A ‘toggle button’ is an HTML button that is linked to a document section such that successive button clicks toggle the section’s visibility, between ‘visible’ and ‘hidden’ states.
The toggle functionality is provided by a JavaScript function which changes the section’s ‘display’ attribute to ‘block’ (to make it visible) and ‘none’ (to hide it).
The button will always have the ‘toggle_button’ CSS class associated with it, in addition to any others specified.
Example usage:
>>> d = Document()
>>> s = Section('Toggle section')
>>> b = ToggleButton(s)
>>> d.add(b,s)
- Parameters:
toggle_section (Section) – section to toggle the visibility of
show_text (str) – text to show on the button when the section is in the hidden state
hide_text (str) – text to show on the button when the section is in the visible state
help_text (str) – text to show when the mouse is over the button
css_classes (list) – additional CSS classes to associate with the button
hidden_by_default (bool) – if True then the section will be hidden by default
- html()
Generate HTML version of the toggle button
- Returns:
HTML representation of the button.
- Return type:
String
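The kind of HTML that such a button might produce can be sketched without the document classes. This is purely illustrative: the JavaScript function name toggleBlock, the id scheme and the exact markup are assumptions, and the real html() output may look quite different. Only the ‘toggle_button’ CSS class and the block/none switching come from the description above.

```python
import html
import json

# Hypothetical JavaScript helper that flips the section's 'display'
# between 'block' and 'none' and swaps the button text to match
TOGGLE_JS = """function toggleBlock(section_id, button_id, show, hide) {
  var s = document.getElementById(section_id);
  var b = document.getElementById(button_id);
  var visible = (s.style.display === "block");
  s.style.display = visible ? "none" : "block";
  b.innerHTML = visible ? show : hide;
}"""


def toggle_button_html_sketch(section_id, show_text="Show", hide_text="Hide",
                              css_classes=None, help_text=None,
                              hidden_by_default=True):
    """Return HTML for a button toggling the section with id 'section_id'"""
    button_id = "toggle_" + section_id
    # 'toggle_button' is always present, followed by any extra classes
    classes = ["toggle_button"] + list(css_classes or [])
    js_args = ", ".join(json.dumps(arg) for arg in
                        (section_id, button_id, show_text, hide_text))
    attrs = {
        "id": button_id,
        "class": " ".join(classes),
        "onclick": "toggleBlock(%s)" % js_args,
    }
    if help_text:
        attrs["title"] = help_text
    attr_str = " ".join('%s="%s"' % (name, html.escape(value, quote=True))
                        for name, value in attrs.items())
    # Initial label matches the initial visibility of the section
    label = show_text if hidden_by_default else hide_text
    return "<button %s>%s</button>" % (attr_str, html.escape(label))


print(toggle_button_html_sketch("toggle_section", css_classes=["extra"]))
```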
- auto_process_ngs.qc.reporting.pretty_print_reads(n)
Print the number of reads with commas at each thousand
For example:
>>> pretty_print_reads(10409789)
10,409,789
- Parameters:
n (int) – number of reads
- Returns:
representation with commas for every thousand.
- Return type:
String
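The same formatting can be reproduced with Python's built-in ‘,’ format specifier. The function below is a minimal sketch with a hypothetical name, not the module's implementation.

```python
def pretty_print_reads_sketch(n):
    """Return n as a string with commas at each thousand"""
    # The ',' format specifier uses a comma as the thousands separator
    return format(int(n), ",")


print(pretty_print_reads_sketch(10409789))  # prints 10,409,789
```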
- auto_process_ngs.qc.reporting.report(projects, title=None, filename=None, qc_dir=None, report_attrs=None, summary_fields=None, relative_links=False, use_data_dir=False, make_zip=False, suppress_warning=False)
Report the QC for a project
- Parameters:
projects (list) – AnalysisProject instances to report QC for
title (str) – optional, specify title for the report (defaults to ‘<PROJECT_NAME>: QC report’)
filename (str) – optional, specify path and name for the output report file (defaults to ‘<PROJECT_NAME>.qc_report.html’)
qc_dir (str) – path to the QC output dir
report_attrs (list) – optional, list of elements to report for each Fastq pair
summary_fields (list) – optional, list of fields to report for each sample in the summary table
relative_links (boolean) – optional, if set to True then use relative paths for links in the report (default is to use absolute paths)
use_data_dir (boolean) – if True then copy QC artefacts to a data directory parallel to the output report file
make_zip (boolean) – if True then also create a ZIP archive of the QC report and outputs (default is not to create the ZIP archive)
suppress_warning (bool) – if True then don’t show the warning message even when there are missing metrics (default: show the warning if there are missing metrics)
- Returns:
filename of the output HTML report.
- Return type:
String
- auto_process_ngs.qc.reporting.sanitize_name(s, new_char='_')
Replace ‘unsafe’ characters in HTML link targets
Replaces ‘unsafe’ characters in a string used as a link target in an HTML document with underscore characters.
- Parameters:
s (str) – string to sanitize
new_char (str) – string to replace ‘unsafe’ characters with (default: ‘_’)
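A replacement of this kind can be sketched with a regular expression. The set of characters treated as ‘unsafe’ is not stated here, so the sketch assumes that anything other than letters, digits, underscores and hyphens is unsafe; the function name is hypothetical.

```python
import re


def sanitize_name_sketch(s, new_char="_"):
    """Replace 'unsafe' characters in s with new_char"""
    # Assumption: only letters, digits, '_' and '-' count as 'safe'.
    # A function is used as the replacement so that new_char is always
    # inserted literally
    return re.sub(r"[^A-Za-z0-9_-]", lambda match: new_char, s)


print(sanitize_name_sketch("MINISEQ_201120#22:PJB"))  # prints MINISEQ_201120_22_PJB
```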