auto_process_ngs.tenx.cellplex

Utilities for working with 10x Genomics single cell multiplexing (CellPlex) pipelines:

  • CellrangerMultiConfigCsv

class auto_process_ngs.tenx.cellplex.CellrangerMultiConfigCsv(filen)

Class to handle cellranger multi ‘config.csv’ files

See https://support.10xgenomics.com/single-cell-gene-expression/software/pipelines/latest/using/multi#cellranger-multi

Provides the following properties:

  • sample_names: list of multiplexed sample names

  • sections: list of the sections in the config

  • reference_data_path: path to the reference dataset

  • probe_set_path: path to the probe set

  • feature_reference_path: path to the feature reference

  • vdj_reference_path: path to the V(D)J-compatible reference

  • gex_libraries: list of Fastq IDs associated with GEX data

  • feature_types: list of feature types defined in the config (lower case)

Provides the following methods:

  • sample: returns information on a specific multiplexed sample

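  • gex_library: returns information on a specific GEX library

  • libraries: returns the library names associated with a specified feature type

  • library: returns information on a specific library of a given feature type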

  • fastq_dirs: mapping of library names to the associated Fastq directory paths (exposed as a property rather than a method; see below)

  • pretty_print_samples: returns a string with a ‘nice’ description of the multiplexed sample names
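A minimal usage sketch built only from the properties and methods listed above. The helper name `summarise_config` and the file name `10x_multi_config.csv` are hypothetical. The helper accepts any object exposing the documented interface, so it can be tried without a real config file; the commented lines show how an instance would presumably be created with auto_process_ngs installed.

```python
def summarise_config(config):
    """Build a short text summary from a CellrangerMultiConfigCsv-like object.

    Hypothetical helper: relies only on 'reference_data_path',
    'gex_libraries', 'sample_names' and 'sample()' as documented above.
    """
    lines = ["Reference: %s" % config.reference_data_path]
    lines.append("GEX libraries: %s" % ", ".join(config.gex_libraries))
    for name in config.sample_names:
        info = config.sample(name)
        # 'cmo' is documented as a list of CMO ids
        lines.append("%s: %s" % (name, "|".join(info["cmo"])))
    return "\n".join(lines)

# With auto_process_ngs installed, usage would look something like:
#
# from auto_process_ngs.tenx.cellplex import CellrangerMultiConfigCsv
# config = CellrangerMultiConfigCsv("10x_multi_config.csv")
# print(summarise_config(config))
```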

property fastq_dirs

Return mapping of library names to Fastq directories

property feature_reference_path

Return the path to the feature reference file from config.csv

property feature_types

Return list of feature types defined in config file

Feature type names are returned converted to lower case.
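The lower-case conversion can be illustrated with a small stand-alone sketch. The function name `normalise_feature_types` is hypothetical, and the removal of duplicates is an assumption about what "list of feature types" means here, not something the documentation confirms.

```python
def normalise_feature_types(raw_feature_types):
    """Return lower-cased feature types, keeping first-seen order.

    Assumption: duplicates are collapsed so each feature type appears once.
    """
    seen = []
    for feature_type in raw_feature_types:
        feature_type = feature_type.strip().lower()
        if feature_type and feature_type not in seen:
            seen.append(feature_type)
    return seen

print(normalise_feature_types(
    ["Gene Expression", "Multiplexing Capture", "Gene Expression"]))
# → ['gene expression', 'multiplexing capture']
```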

property gex_libraries

Return the library names associated with GEX data from config.csv

Libraries are listed in the ‘[libraries]’ section.

gex_library(name)

Return dictionary of values associated with GEX library

Parameters:

name (str) – name of the GEX library of interest

libraries(feature_type)

Return library names associated with specified feature type

library(feature_type, name)

Return dictionary of values associated with library

Keys include:

  • ‘fastqs’ (path to Fastqs)

  • ‘lanes’ (associated lanes)

  • ‘library_id’ (physical library ID)

  • ‘feature_type’ (e.g. ‘Gene Expression’)

  • ‘subsample_rate’ (the associated subsampling rate)

Parameters:

  • feature_type (str) – feature type of the library of interest (e.g. ‘Gene Expression’)

  • name (str) – name of the library of interest
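To make the shape of these dictionaries concrete, the following stand-alone sketch reads a ‘[libraries]’ section and builds one dictionary per library using the keys listed above. It is not the class's implementation: the function names, the example paths and IDs, the mapping from config.csv column names to keys, and the case-insensitive feature type match are all assumptions made for illustration.

```python
import csv

# Assumed mapping from config.csv column names to the documented keys
COLUMN_TO_KEY = {
    "fastqs": "fastqs",
    "lanes": "lanes",
    "physical_library_id": "library_id",
    "feature_types": "feature_type",
    "subsample_rate": "subsample_rate",
}

def read_libraries(text):
    """Return {fastq_id: values} for rows of the '[libraries]' section."""
    libraries = {}
    header = None
    in_libraries = False
    for row in csv.reader(text.splitlines()):
        if not row or not row[0].strip():
            continue  # skip blank lines
        first = row[0].strip()
        if first.startswith("["):
            # Section header: only '[libraries]' is of interest here
            in_libraries = (first == "[libraries]")
            header = None
            continue
        if not in_libraries:
            continue
        if header is None:
            header = [col.strip() for col in row]  # column header row
            continue
        values = dict(zip(header, (v.strip() for v in row)))
        libraries[values["fastq_id"]] = {
            key: values.get(col, "") for col, key in COLUMN_TO_KEY.items()
        }
    return libraries

def libraries_for(libraries, feature_type):
    """Return names of libraries whose feature type matches (ignoring case)."""
    wanted = feature_type.lower()
    return [name for name, lib in libraries.items()
            if lib["feature_type"].lower() == wanted]

# Hypothetical example data
text = """\
[libraries]
fastq_id,fastqs,lanes,physical_library_id,feature_types,subsample_rate
EX1_GEX,/data/fastqs,any,EX1,Gene Expression,
EX1_CML,/data/fastqs,any,EX1,Multiplexing Capture,
"""
libs = read_libraries(text)
print(libraries_for(libs, "Gene Expression"))  # → ['EX1_GEX']
print(libs["EX1_GEX"]["library_id"])           # → EX1
```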

pretty_print_samples()

Return string describing the multiplexed sample names

Wraps a call to the ‘pretty_print_names’ function.

Returns:

Pretty description of the multiplexed sample names.

Return type:

str

property probe_set_path

Return the path to the probe set file from config.csv

property reference_data_path

Return the path to the reference dataset from config.csv

sample(sample_name)

Return dictionary of values associated with multiplexed sample

Keys include ‘cmo’ (list of CMO ids) and ‘description’ (description text) associated with the sample in the ‘[samples]’ section of the config.csv file.

Parameters:

sample_name (str) – name of the sample of interest
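As an illustration of the returned structure, this stand-alone sketch parses a ‘[samples]’ section into dictionaries with ‘cmo’ and ‘description’ keys. It is a simplified stand-in rather than the class's own parser; the function name and example data are hypothetical, and it assumes the `sample_id,cmo_ids,description` column order with multiple CMO ids separated by `|`.

```python
import csv

def read_samples(text):
    """Map sample names to {'cmo': [...], 'description': ...} dictionaries."""
    samples = {}
    in_samples = False
    for row in csv.reader(text.splitlines()):
        if not row or not row[0].strip():
            continue  # skip blank lines
        first = row[0].strip()
        if first.startswith("["):
            # Section header: only '[samples]' is of interest here
            in_samples = (first == "[samples]")
            continue
        if not in_samples or first == "sample_id":
            continue  # ignore other sections and the column header row
        cmo_ids = row[1].strip() if len(row) > 1 else ""
        description = row[2].strip() if len(row) > 2 else ""
        samples[first] = {
            "cmo": cmo_ids.split("|") if cmo_ids else [],
            "description": description,
        }
    return samples

# Hypothetical example data
text = """\
[samples]
sample_id,cmo_ids,description
SampleA,CMO301,Control
SampleB,CMO302|CMO303,Treated
"""
print(read_samples(text)["SampleB"]["cmo"])  # → ['CMO302', 'CMO303']
```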

property sample_names

Return the multiplexed sample names from config.csv

Samples are listed in the ‘[samples]’ section.

property sections

Return the list of sections in the config.csv file
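For orientation, a config.csv file is divided into bracketed sections such as ‘[libraries]’ and ‘[samples]’. The sketch below shows one way section names could be collected from such a file. The function name, the example paths and IDs, and the handling of trailing commas are assumptions for illustration, not the class's actual behaviour.

```python
def list_sections(text):
    """Return the section names found in config.csv-style text, in order."""
    sections = []
    for line in text.splitlines():
        # Spreadsheet exports can leave trailing commas after a header
        line = line.strip().rstrip(",")
        if line.startswith("[") and line.endswith("]"):
            sections.append(line[1:-1])
    return sections

# Hypothetical example data
EXAMPLE_CONFIG = """\
[gene-expression]
reference,/data/refdata-gex

[libraries]
fastq_id,fastqs,feature_types
EX1_GEX,/data/fastqs,Gene Expression
EX1_CML,/data/fastqs,Multiplexing Capture

[samples]
sample_id,cmo_ids,description
SampleA,CMO301,Control
SampleB,CMO302|CMO303,Treated
"""

print(list_sections(EXAMPLE_CONFIG))
# → ['gene-expression', 'libraries', 'samples']
```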

property vdj_reference_path

Return the path to the V(D)J reference file from config.csv