auto_process_ngs.tenx.metrics
Utility classes for handling the metric summary files produced by various 10x Genomics pipelines:
MetricsSummary
GexSummary
AtacSummary
MultiomeSummary
MultiplexSummary
- class auto_process_ngs.tenx.metrics.AtacSummary(f)
Extract data from summary.csv file for scATAC-seq
Utility class for extracting data from a ‘summary.csv’ file output from running ‘cellranger-atac count’.
The file consists of two lines: the first is a header line, the second consists of corresponding data values.
The following properties are available:
cells_detected
annotated_cells
median_fragments_per_cell
frac_fragments_overlapping_targets
- property annotated_cells
Return the number of annotated cells
Only supported for Cellranger ATAC < 2.0.0; raises AttributeError otherwise.
- property cells_detected
Return the number of cells detected
Only supported for Cellranger ATAC < 2.0.0; raises AttributeError otherwise.
- property estimated_number_of_cells
Return the estimated number of cells
Only supported for Cellranger ATAC < 2.0.0; raises AttributeError otherwise.
- property frac_fragments_overlapping_peaks
Return the fraction of fragments overlapping peaks
- property frac_fragments_overlapping_targets
Return the fraction of fragments overlapping targets
- property median_fragments_per_cell
Return the median fragments per cell
- property tss_enrichment_score
Return the TSS enrichment score
- property version
Return the pipeline version
- class auto_process_ngs.tenx.metrics.GexSummary(f)
Extract data from metrics_summary.csv file for scRNA-seq
Utility class for extracting data from a ‘metrics_summary.csv’ file output from running ‘cellranger count’.
The file consists of two lines: the first is a header line, the second consists of corresponding data values.
The following properties are available:
estimated_number_of_cells
mean_reads_per_cell
median_genes_per_cell
frac_reads_in_cells
- property estimated_number_of_cells
Return the estimated number of cells
- property frac_reads_in_cells
Return the fraction of reads in cells
- property mean_reads_per_cell
Return the mean reads per cell
- property median_genes_per_cell
Return the median genes per cell
- class auto_process_ngs.tenx.metrics.MetricsSummary(f, multiline=False)
Base class for extracting data from cellranger* count *summary.csv files
By default the files consist of two lines: the first is a header line, the second consists of corresponding data values. There is also a multi-line variation (e.g. from cellranger multi) with a header line followed by multiple lines of data (use the ‘multiline’ argument to indicate that this is the expected format).
In addition: in some variants (e.g. ‘metrics_summary.csv’), integer data values are formatted to use commas to separate thousands (e.g. 2,272) and values which contain commas are enclosed in double quotes.
For example:
Estimated Number of Cells,Mean Reads per Cell,…
"2,272","107,875","1,282","245,093,084",98.3%,…
This class extracts the data values and where possible converts them to integers.
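The parsing behaviour described above can be sketched with the standard csv module. This is an illustration under stated assumptions rather than the class's actual implementation: the function name parse_two_line_summary is hypothetical, the sample content is made up, and values that cannot be converted to integers are simply kept as strings.

```python
import csv
import io


def parse_two_line_summary(text):
    """Parse a header line plus one data line into a dict (hypothetical helper)."""
    # csv.reader handles double-quoted values that contain commas
    rows = list(csv.reader(io.StringIO(text)))
    if len(rows) != 2:
        raise ValueError("expected exactly 2 lines, got %d" % len(rows))
    header, values = rows
    data = {}
    for name, value in zip(header, values):
        try:
            # Strip thousands separators before integer conversion
            data[name] = int(value.replace(",", ""))
        except ValueError:
            # Not an integer (e.g. a percentage): keep the original string
            data[name] = value
    return data


# Made-up sample content in the 'metrics_summary.csv' style
sample = (
    "Estimated Number of Cells,Mean Reads per Cell,Fraction Reads in Cells\n"
    '"2,272","107,875",98.3%\n'
)
metrics = parse_two_line_summary(sample)
print(metrics["Estimated Number of Cells"])
```

The quoted value "2,272" is read as a single field and then converted, while 98.3% fails integer conversion and is returned unchanged.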
- fetch(field)
Fetch data associated with an arbitrary field
- exception auto_process_ngs.tenx.metrics.MissingMetricError
Custom exception raised when a metric is not found
- class auto_process_ngs.tenx.metrics.MultiomeSummary(f)
Extract data from summary.csv file for multiome GEX-ATAC
Utility class for extracting data from a ‘summary.csv’ file output from running ‘cellranger-arc count’.
The file consists of two lines: the first is a header line, the second consists of corresponding data values.
The following properties are available:
estimated_number_of_cells
atac_median_high_quality_fragments_per_cell
gex_median_genes_per_cell
- property atac_median_high_quality_fragments_per_cell
Return the median high-quality fragments per cell for ATAC data
- property estimated_number_of_cells
Return the estimated number of cells
- property gex_median_genes_per_cell
Return the median genes per cell for GEX data
- class auto_process_ngs.tenx.metrics.MultiplexSummary(f)
Extract data from metrics_summary.csv file for CellPlex
Utility class for extracting data from a ‘metrics_summary.csv’ file output from running ‘cellranger multi’.
The file consists of a header line followed by multiple data lines, with one set of values per line.
The following properties are available:
cells
mean_reads_per_cell
median_reads_per_cell
median_genes_per_cell
total_genes_detected
median_umi_counts_per_cell
NB the returned values for these properties are all from the gene expression data.
Values for other library types can be fetched directly by using the ‘fetch’ method and specifying the library type, for example:
>>> MultiplexSummary(f).fetch('Cells','Antibody Capture')
- property cells
Returns the number of cells
- fetch(name, library_type='Gene Expression')
Fetch data associated with an arbitrary field
By default data associated with ‘Gene Expression’ are returned; data associated with other library types can be fetched by specifying a different value for the ‘library_type’ argument (either ‘Multiplexing Capture’ or ‘Antibody Capture’).
- Parameters:
name (str) – name of the metric
library_type (str) – library type to fetch the metric for (default: ‘Gene Expression’)
- Raises:
MissingMetricError – if metric not defined
KeyError – if library type not found
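The lookup and error behaviour described for fetch can be sketched as follows. This is a self-contained illustration, not the library's code: the function fetch_metric, the local MissingMetricError stand-in, and the simplified three-column layout (‘Library Type’, ‘Metric Name’, ‘Metric Value’) of the sample data are all assumptions made for the example.

```python
import csv
import io


class MissingMetricError(Exception):
    """Local stand-in for the library's exception of the same name."""


def fetch_metric(text, name, library_type="Gene Expression"):
    """Look up a metric for a library type in multi-line CSV text (hypothetical)."""
    by_library = {}
    for row in csv.DictReader(io.StringIO(text)):
        # Group metric values by library type, then by metric name
        metrics = by_library.setdefault(row["Library Type"], {})
        metrics[row["Metric Name"]] = row["Metric Value"]
    metrics = by_library[library_type]  # KeyError if library type is absent
    if name not in metrics:
        raise MissingMetricError(
            "'%s' not defined for '%s'" % (name, library_type)
        )
    value = metrics[name]
    try:
        # Convert values such as "1,200" to integers where possible
        return int(value.replace(",", ""))
    except ValueError:
        return value


# Made-up sample data using the assumed simplified layout
sample = (
    "Library Type,Metric Name,Metric Value\n"
    'Gene Expression,Cells,"1,200"\n'
    'Gene Expression,Median genes per cell,"2,345"\n'
    'Antibody Capture,Cells,"1,150"\n'
)
print(fetch_metric(sample, "Cells"))
print(fetch_metric(sample, "Cells", "Antibody Capture"))
```

Keeping the two failure modes separate, KeyError for an unknown library type and MissingMetricError for an undefined metric, lets callers distinguish a missing library from a missing value.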
- property mean_reads_per_cell
Returns the mean reads per cell
- property median_genes_per_cell
Returns the median genes per cell
- property median_reads_per_cell
Returns the median reads per cell
- property median_umi_counts_per_cell
Returns the median UMI counts per cell
- property total_genes_detected
Returns the total genes detected