auto_process_ngs.applications
Collects information about sequencing applications (combinations of platform and library type).
Defines the APPLICATIONS list, which contains dictionaries defining known
applications, and the identify_application() function, which can be used
to identify the appropriate application for a given platform and library type.
Each application is defined as a dictionary with the following keys:
platforms: list of platform names (with optional wildcards) which correspond to the application; use [“*”] to indicate all platforms, or an empty list to indicate that the platform must not be set
libraries: list of library types (with optional wildcards) which correspond to the application; use [“*”] to indicate all library types, or an empty list to indicate that the library type must not be set
extensions: list of library type extensions which can be appended to the library type via the plus symbol (“+”)
alternative_extensions: dictionary of alternative names for library type extensions mapped to the “canonical” names in the ‘extensions’ list
fastq_generation: name of the Fastq generation protocol to use for this application
qc_protocol: name of the QC protocol to use for this application
setup: dictionary with information about actions that should be performed as part of setting up analysis project directories for this application; contains the following keys:
- templates: list of template names to use for this application; can be one or more of: “10x_multi_config”, “10x_multiome_libraries”.
- directories: list of subdirectory names which will be created in the analysis project directory (for example “Visium_images”)
assays: optional list of assays to associate with this application
tags: optional list of tags to associate with this application; tags can be one or more of: “10x”, “bio_rad”, “parse”, “single_cell”, “spatial”, “legacy”. Tags are used for automated documentation generation.
The minimum required keys for each application are platforms, libraries,
fastq_generation, and qc_protocol.
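As an illustration of the layout described above, the following sketch builds a hypothetical application definition and checks it for the four required keys. The dictionary values, the protocol names, and the helper missing_required_keys are assumptions made for this example; they are not taken from the APPLICATIONS list or from the module itself.

```python
# Hypothetical application definition, for illustration only: the
# values are assumptions and do not come from APPLICATIONS.
example_application = {
    "platforms": ["*"],
    "libraries": ["Example scRNA-seq"],
    "extensions": ["CSP"],
    "alternative_extensions": {"CellSurfaceProtein": "CSP"},
    "fastq_generation": "example_fastq_protocol",
    "qc_protocol": "example_qc_protocol",
    "setup": {
        "templates": ["10x_multi_config"],
        "directories": ["Visium_images"],
    },
    "tags": ["10x", "single_cell"],
}

# The minimum required keys named in the module documentation
REQUIRED_KEYS = ("platforms", "libraries", "fastq_generation", "qc_protocol")


def missing_required_keys(application):
    """Return the required keys absent from a definition (hypothetical helper)."""
    return [key for key in REQUIRED_KEYS if key not in application]
```

A definition that omits any of the four required keys would be reported by the helper; the optional keys (extensions, setup, assays, tags and so on) are not checked.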
The module also defines the following user-facing functions:
identify_application: returns the dictionary defining the application which matches a given platform and library type
fetch_application_data: returns a list of application definitions matching specified tags
The following functions are also defined for internal use:
match_application: determines whether a given platform and library type match a given application definition
score_match: returns a score for a platform/library combination
- auto_process_ngs.applications.fetch_application_data(tags, applications=None, expand=False)
Fetch application data matching specified tags
- Parameters:
tags (list) – list of tags to match; tags starting with ‘!’ are treated as negative tags
applications (list) – list of application definitions to filter; if None then the default APPLICATIONS list is used
expand (bool) – if True then applications with multiple platforms/libraries are expanded so there is one entry per platform/library combination
- Returns:
list of application definitions matching the specified tags.
- Return type:
list
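The tag filtering and expansion described above could be sketched roughly as follows. This is not the module's implementation: the names filter_by_tags and expand_application are hypothetical, and the sketch assumes that an application matches when it carries every positive tag and none of the negative ones, which the documentation does not spell out.

```python
from itertools import product


def filter_by_tags(applications, tags):
    """Keep applications with all positive tags and no negative ones.

    Assumption: positive tags are combined with AND; the real
    function may combine them differently.
    """
    wanted = {t for t in tags if not t.startswith("!")}
    unwanted = {t[1:] for t in tags if t.startswith("!")}
    matches = []
    for app in applications:
        app_tags = set(app.get("tags", []))
        if wanted <= app_tags and not (unwanted & app_tags):
            matches.append(app)
    return matches


def expand_application(app):
    """Return one copy of 'app' per platform/library combination.

    Assumption: both lists are non-empty; an empty list would yield
    no combinations in this sketch.
    """
    return [
        dict(app, platforms=[platform], libraries=[library])
        for platform, library in product(app["platforms"], app["libraries"])
    ]
```

With these assumptions, a call such as filter_by_tags(apps, ["10x", "!spatial"]) keeps the applications tagged “10x” that are not also tagged “spatial”.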
- auto_process_ngs.applications.identify_application(platform_name, library_type)
Returns information about an application
Applications are combinations of platforms and libraries.
- Parameters:
platform_name (str) – name of the platform
library_type (str) – name of the library
- Returns:
application-specific information
- Return type:
dict
- auto_process_ngs.applications.match_application(application_info, platform_name, library_type)
Determine if platform and library type match the supplied application
Given information about an application (supplied as a dictionary with elements
platforms and libraries), determines whether the supplied platform and library match that information.
FIXME: doesn’t currently include matching against the library extensions
- Parameters:
application_info (dict) – information about the application
platform_name (str) – name of the platform
library_type (str) – name of the library
- Returns:
if the platform and library match the application then returns a tuple of the form (application, list of platform matches, list of library matches); otherwise returns None
- Return type:
tuple
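A minimal sketch of this kind of wildcard matching, using the standard library's fnmatch module, is shown below. It is an illustration under stated assumptions rather than the module's own code: the names matching_patterns and sketch_match are hypothetical, an empty pattern list is assumed to match only an unset value, and an unset value is assumed never to match a non-empty pattern list.

```python
from fnmatch import fnmatchcase


def matching_patterns(patterns, name):
    """Return the patterns that match 'name', or None if nothing matches."""
    if not patterns:
        # Assumption: an empty list means the value must not be set
        return [] if not name else None
    if not name:
        # Assumption: an unset value cannot match any pattern
        return None
    matched = [p for p in patterns if fnmatchcase(name, p)]
    return matched or None


def sketch_match(application_info, platform_name, library_type):
    """Return (application, platform matches, library matches) or None."""
    platforms = matching_patterns(application_info["platforms"], platform_name)
    libraries = matching_patterns(application_info["libraries"], library_type)
    if platforms is None or libraries is None:
        return None
    return (application_info, platforms, libraries)
```

Like the documented function, this sketch compares the library type as given and does not account for library extensions.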
- auto_process_ngs.applications.score_match(platforms, libraries)
Return a score for a platform/library combination
The score is calculated as the sum of the minimum number of wildcard characters (‘*’) in the platform and library lists. A lower score indicates a more specific match.
- Parameters:
platforms (list) – list of platforms
libraries (list) – list of libraries
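One reading of the scoring rule above is: take the smallest wildcard count found among the platform patterns, do the same for the library patterns, and add the two. The sketch below follows that reading; the name sketch_score and the treatment of empty lists as contributing zero are assumptions.

```python
def sketch_score(platforms, libraries):
    """Sum the minimum '*' counts over the platform and library lists."""
    # Assumption: an empty list contributes 0 to the score
    platform_score = min((p.count("*") for p in platforms), default=0)
    library_score = min((l.count("*") for l in libraries), default=0)
    return platform_score + library_score
```

Under this reading an exact pairing such as (["miseq"], ["RNA-seq"]) scores 0, while (["*"], ["*"]) scores 2, so the exact pairing would be preferred as the more specific match.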
- auto_process_ngs.applications.split_library_type(library_type)
Splits a library type into its components
Library types are expected to consist of a “base” library type followed by zero or more optional “extensions”, each of which is identified by a preceding ‘+’ character.
For example: “GEX” has a base library type with no extensions; “GEX+CSP” has the base type “GEX” with extensions [“CSP”]; and “GEX+CSP+VDJ” has the base type “GEX” with extensions [“CSP”, “VDJ”].
This function returns a tuple of the form:
(BASE, EXTENSIONS)
- Parameters:
library_type (str) – name of the library
- Returns:
tuple with two elements: the first is the base library type, the second is a list of the extensions.
- Return type:
tuple
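The splitting behaviour described above can be reproduced with a plain string split on the ‘+’ character. The function name sketch_split is hypothetical, and the real function may handle edge cases (such as whitespace or repeated ‘+’ characters) differently.

```python
def sketch_split(library_type):
    """Split 'BASE+EXT1+EXT2' into (BASE, [EXT1, EXT2])."""
    # The first component is the base type; the rest are extensions
    base, *extensions = library_type.split("+")
    return (base, extensions)
```

Applied to the examples above, “GEX” gives (“GEX”, []) and “GEX+CSP+VDJ” gives (“GEX”, [“CSP”, “VDJ”]).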