auto_process_ngs.tenx.metrics

Utility classes for handling the metric summary files produced by various 10x Genomics pipelines:

  • MetricsSummary

  • GexSummary

  • AtacSummary

  • MultiomeSummary

  • MultiplexSummary

class auto_process_ngs.tenx.metrics.AtacSummary(f)

Extract data from summary.csv file for scATAC-seq

Utility class for extracting data from a ‘summary.csv’ file output from running ‘cellranger-atac count’.

The file consists of two lines: the first is a header line, the second consists of corresponding data values.

The following properties are available:

  • cells_detected

  • annotated_cells

  • median_fragments_per_cell

  • frac_fragments_overlapping_targets

property annotated_cells

Return the number of annotated cells

Only supported for Cellranger ATAC < 2.0.0; raises AttributeError otherwise.

property cells_detected

Return the number of cells detected

Only supported for Cellranger ATAC < 2.0.0; raises AttributeError otherwise.

property estimated_number_of_cells

Return the estimated number of cells

Only supported for Cellranger ATAC >= 2.0.0; raises AttributeError otherwise.
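
Because the cell count properties above raise AttributeError when the pipeline version does not support them, calling code may want to try each in turn. The following is a minimal sketch of that pattern, not part of the module: the helper name `number_of_cells` and the `StubSummary` class are hypothetical, and the stub only stands in for an AtacSummary-like object so that the example runs on its own.

```python
def number_of_cells(summary):
    """Return a cell count from an AtacSummary-like object.

    Tries each candidate property and skips any that raise
    AttributeError (i.e. unsupported for the pipeline version).
    """
    for name in ("estimated_number_of_cells", "cells_detected"):
        try:
            return getattr(summary, name)
        except AttributeError:
            continue
    raise AttributeError("no supported cell count property found")


class StubSummary:
    """Hypothetical stand-in where only one property is supported."""

    @property
    def estimated_number_of_cells(self):
        # Mimic an unsupported property
        raise AttributeError("not supported for this version")

    @property
    def cells_detected(self):
        # Made-up value for illustration only
        return 5000


count = number_of_cells(StubSummary())
```

The same helper could be passed a real AtacSummary instance, assuming its properties behave as documented here.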

property frac_fragments_overlapping_peaks

Return the fraction of fragments overlapping peaks

property frac_fragments_overlapping_targets

Return the fraction of fragments overlapping targets

property median_fragments_per_cell

Return the median fragments per cell

property tss_enrichment_score

Return the TSS enrichment score

property version

Return the pipeline version

class auto_process_ngs.tenx.metrics.GexSummary(f)

Extract data from metrics_summary.csv file for scRNA-seq

Utility class for extracting data from a ‘metrics_summary.csv’ file output from running ‘cellranger count’.

The file consists of two lines: the first is a header line, the second consists of corresponding data values.

The following properties are available:

  • estimated_number_of_cells

  • mean_reads_per_cell

  • median_genes_per_cell

  • frac_reads_in_cells

property estimated_number_of_cells

Return the estimated number of cells

property frac_reads_in_cells

Return the fraction of reads in cells

property mean_reads_per_cell

Return the mean reads per cell

property median_genes_per_cell

Return the median genes per cell

class auto_process_ngs.tenx.metrics.MetricsSummary(f, multiline=False)

Base class for extracting data from cellranger* count *summary.csv files

By default the files consist of two lines: the first is a header line, the second consists of corresponding data values. There is also a multi-line variation (e.g. from cellranger multi) with a header line followed by multiple lines of data (use the ‘multiline’ argument to indicate this is the expected format).

In addition, in some variants (e.g. ‘metrics_summary.csv’), integer data values use commas as thousands separators (e.g. 2,272), and values which contain commas are enclosed in double quotes.

For example:

Estimated Number of Cells,Mean Reads per Cell,…
"2,272","107,875","1,282","245,093,084",98.3%,…

This class extracts the data values and where possible converts them to integers.
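
The handling described above can be illustrated with the standard library alone. This is a rough sketch under stated assumptions, not the class's actual implementation: the function name `parse_two_line_summary` is hypothetical, and the third column name in the sample data is invented for illustration.

```python
import csv
import io


def parse_two_line_summary(text):
    """Parse a header line plus one data line into a dict.

    Quoted values such as "2,272" are read as single fields by the
    csv module; integers are converted where possible.
    """
    rows = list(csv.reader(io.StringIO(text)))
    header, values = rows[0], rows[1]
    data = {}
    for name, value in zip(header, values):
        try:
            # Drop thousands separators before converting to int
            data[name] = int(value.replace(",", ""))
        except ValueError:
            # Keep non-integer values (e.g. percentages) as strings
            data[name] = value
    return data


# Sample content; "Some Percentage Metric" is a made-up column name
example_two_line = (
    "Estimated Number of Cells,Mean Reads per Cell,Some Percentage Metric\n"
    '"2,272","107,875",98.3%\n'
)
metrics = parse_two_line_summary(example_two_line)
```

Here `metrics["Estimated Number of Cells"]` is the integer 2272, while the percentage is left as the string "98.3%".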

fetch(field)

Fetch data associated with an arbitrary field

exception auto_process_ngs.tenx.metrics.MissingMetricError

Custom exception class raised when metrics are not found

class auto_process_ngs.tenx.metrics.MultiomeSummary(f)

Extract data from summary.csv file for multiome GEX-ATAC

Utility class for extracting data from a ‘summary.csv’ file output from running ‘cellranger-arc count’.

The file consists of two lines: the first is a header line, the second consists of corresponding data values.

The following properties are available:

  • estimated_number_of_cells

  • atac_median_high_quality_fragments_per_cell

  • gex_median_genes_per_cell

property atac_median_high_quality_fragments_per_cell

Return the median high-quality fragments per cell for ATAC data

property estimated_number_of_cells

Return the estimated number of cells

property gex_median_genes_per_cell

Return the median genes per cell for GEX data

class auto_process_ngs.tenx.metrics.MultiplexSummary(f)

Extract data from metrics_summary.csv file for CellPlex

Utility class for extracting data from a ‘metrics_summary.csv’ file output from running ‘cellranger multi’.

The file consists of a header line followed by multiple data lines, with one set of values per line.
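
As a rough illustration of a multi-line layout, the sketch below looks up a metric by name and library type using only the standard library. The column names ('Library Type', 'Metric Name', 'Metric Value'), the sample values, and the function name `fetch_metric` are all assumptions made for this example; they are not confirmed by this documentation and the real file may contain additional columns.

```python
import csv
import io


def fetch_metric(text, name, library_type="Gene Expression"):
    """Return the value of a named metric for one library type.

    Assumes one metric per data line, identified by the
    (hypothetical) 'Library Type' and 'Metric Name' columns.
    """
    for row in csv.DictReader(io.StringIO(text)):
        if row["Metric Name"] == name and row["Library Type"] == library_type:
            value = row["Metric Value"]
            try:
                # Drop thousands separators before converting to int
                return int(value.replace(",", ""))
            except ValueError:
                return value
    raise KeyError(f"{name!r} not found for library type {library_type!r}")


# Made-up sample content for illustration only
example_multi = (
    "Library Type,Metric Name,Metric Value\n"
    'Gene Expression,Cells,"1,200"\n'
    'Antibody Capture,Cells,"1,150"\n'
)
gex_cells = fetch_metric(example_multi, "Cells")
ab_cells = fetch_metric(example_multi, "Cells", "Antibody Capture")
```

Defaulting `library_type` to 'Gene Expression' mirrors the default documented for the class's own ‘fetch’ method.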

The following properties are available:

  • cells

  • mean_reads_per_cell

  • median_reads_per_cell

  • median_genes_per_cell

  • total_genes_detected

  • median_umi_counts_per_cell

NB the returned values for these properties are all from the gene expression data.

Values for other library types can be fetched directly by using the ‘fetch’ method and specifying the library type, for example:

>>> MultiplexSummary(f).fetch('Cells','Antibody Capture')

property cells

Returns the number of cells

fetch(name, library_type='Gene Expression')

Fetch data associated with an arbitrary field

By default data associated with ‘Gene Expression’ are returned; data associated with other library types can be fetched by specifying a different value for the ‘library_type’ argument (either ‘Multiplexing Capture’ or ‘Antibody Capture’).

Parameters:
  • name (str) – name of the metric

  • library_type (str) – library type to fetch the metric for (default: ‘Gene Expression’)

Raises:

property mean_reads_per_cell

Returns the mean reads per cell

property median_genes_per_cell

Returns the median genes per cell

property median_reads_per_cell

Returns the median reads per cell

property median_umi_counts_per_cell

Returns the median UMI counts per cell

property total_genes_detected

Returns the total genes detected