auto_process_ngs.settings

Classes and functions for handling the collection of configuration settings for automated processing.

The settings are stored in a ‘.ini’-formatted file (by default called ‘auto_process.ini’; this file can be created by making a copy of the ‘auto_process.ini.sample’ file).

The simplest usage example is:

>>> from auto_process_ngs.settings import Settings
>>> s = Settings()

The values of the configuration parameters can then be accessed using e.g.

>>> s.general.max_concurrent_jobs
4

To print the values of all parameters use

>>> s.report_settings()

To import values from a file with a non-standard name use e.g.

>>> s = Settings('my_auto_process.ini')

The ‘locate_settings_file’ function is used implicitly to locate the settings file if none is given; it can also automatically create a settings file if none is found but there is a sample version on the search path.

To update values once the settings have been read in do e.g.

>>> s.set('general.max_concurrent_jobs',4)

To update the configuration file use the save method e.g.

>>> s.save()

class auto_process_ngs.settings.Settings(settings_file=None, resolve_undefined=True)

Load parameter values from an external config file

The input file should be in ‘.ini’ format and contain sections and values consistent with the sample ‘auto_process.ini’ file.

add_section(section)

Add a new section

Parameters:

section (str) – an identifier of the form SECTION[:SUBSECTION] which specifies the section to add

fetch_value(param)

Return the value stored against a parameter

NB parameters are referenced by the names that appear in the config file (rather than the internal representation within the Settings instance).

Parameters:

param (str) – parameter name of the form SECTION[:SUBSECTION][.ATTR]
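The SECTION[:SUBSECTION][.ATTR] naming scheme can be illustrated with a small parser. This is a minimal sketch and not the library's own code: the helper name split_param_name is hypothetical, and it assumes that section and subsection names never contain a '.' character.

```python
def split_param_name(param):
    """Split 'SECTION[:SUBSECTION][.ATTR]' into its three parts.

    Hypothetical helper for illustration only. Parts that are not
    present are returned as None. Assumes names contain no '.'.
    """
    # Everything after the first '.' is the attribute name
    head, _, attr = param.partition(".")
    # Everything after the first ':' is the subsection name
    section, _, subsection = head.partition(":")
    return (section, subsection or None, attr or None)


print(split_param_name("general.max_concurrent_jobs"))
print(split_param_name("organism:Human.star_index"))
print(split_param_name("platform:miseq"))
```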

get_bcl_converter_config(section, config, attr_dict=None)

Retrieve BCL conversion configuration options from .ini file

Given the name of a section (e.g. ‘bcl_conversion’, ‘platform:miseq’), fetch the BCL converter settings and return in an AttributeDictionary object.

The options that can be extracted are:

  • bcl_converter

  • nprocessors

  • no_lane_splitting

  • create_empty_fastqs

There are also some legacy options:

  • default_version

  • bcl2fastq

Parameters:
  • section (str) – name of the section to retrieve the settings from

  • config (Config) – Config object with settings loaded

  • attr_dict (AttributeDictionary) – optional, existing AttributeDictionary which will be added to

Returns:

dictionary of option:value pairs.

Return type:

AttributeDictionary

get_destination_config(section, config)

Retrieve ‘destination’ configuration options from .ini file

Given the name of a section (e.g. ‘destination:webserver’), fetch the associated data transfer settings and return in an AttributeDictionary object.

The options that can be extracted are:

  • directory (compulsory, str)

  • subdir (optional, str, default ‘None’)

  • zip_fastqs (optional, boolean, default ‘False’)

  • max_zip_size (optional, str, default ‘None’)

  • readme_template (optional, str, default ‘None’)

  • url (optional, str, default ‘None’)

  • include_downloader (optional, boolean, default ‘False’)

  • include_qc_report (optional, boolean, default ‘False’)

  • hard_links (optional, boolean, default ‘False’)

Parameters:
  • section (str) – name of the section to retrieve the settings from

  • config (Config) – Config object with settings loaded

Returns:

dictionary of option:value pairs.

Return type:

AttributeDictionary
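The option list above (types and defaults) can be sketched with the standard library's configparser. This is an illustration only, not the method's actual implementation: read_destination is a hypothetical stand-in that returns a plain dict rather than an AttributeDictionary, raising KeyError for a missing 'directory' is an assumed behaviour, and the path in the example is made up.

```python
import configparser

# Option name -> (type, default), following the documented list
DESTINATION_OPTIONS = {
    "subdir": (str, None),
    "zip_fastqs": (bool, False),
    "max_zip_size": (str, None),
    "readme_template": (str, None),
    "url": (str, None),
    "include_downloader": (bool, False),
    "include_qc_report": (bool, False),
    "hard_links": (bool, False),
}


def read_destination(section, config):
    """Hypothetical stand-in: collect destination options into a dict."""
    # 'directory' is compulsory; failing with KeyError is an assumption
    if not config.has_option(section, "directory"):
        raise KeyError("'directory' missing from [%s]" % section)
    values = {"directory": config.get(section, "directory")}
    for name, (type_, default) in DESTINATION_OPTIONS.items():
        if type_ is bool:
            values[name] = config.getboolean(section, name, fallback=default)
        else:
            values[name] = config.get(section, name, fallback=default)
    return values


config = configparser.ConfigParser()
config.read_string("""
[destination:webserver]
directory = /mnt/www/data
zip_fastqs = true
""")
dest = read_destination("destination:webserver", config)
print(dest["directory"], dest["zip_fastqs"], dest["subdir"])
```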

get_organism_config(section=None, config=None)

Retrieve ‘organism’ configuration options from .ini file

Given the name of a section (e.g. ‘organism:Human’), fetch the data associated with the organism and return in an AttributeDictionary object.

The items that can be extracted are:

  • star_index (str, path to STAR index)

  • bowtie_index (str, path to Bowtie index)

  • annotation_bed (str, path to BED file with annotation)

  • annotation_gtf (str, path to GTF file with annotation)

  • cellranger_reference (str)

  • cellranger_premrna_reference (str)

  • cellranger_atac_reference (str)

  • cellranger_arc_reference (str)

  • cellranger_probe_set (str)

Parameters:
  • section (str) – name of the section to retrieve the settings from

  • config (Config) – Config object with settings loaded

Returns:

dictionary of option:value pairs.

Return type:

AttributeDictionary

get_sequencer_config(section, config)

Retrieve ‘sequencer’ configuration options from .ini file

Given the name of a section (e.g. ‘sequencer:SN7001250’), fetch the data associated with the sequencer instrument and return in an AttributeDictionary object.

The items that can be extracted are:

  • platform (compulsory, str)

  • model (str, default ‘None’)

Parameters:
  • section (str) – name of the section to retrieve the settings from

  • config (Config) – Config object with settings loaded

Returns:

dictionary of option:value pairs.

Return type:

AttributeDictionary

has_subsections(section)

Check if section contains subsections

Parameters:

section (str) – name of the section to check

list_params(pattern=None, exclude_undefined=False)

Return (yield) all the stored parameters

NB parameters are referenced by the names that appear in the config file (rather than the internal representation within the Settings instance).

Parameters:
  • pattern (str) – optional glob-style pattern; if supplied then only parameters matching the pattern will be returned

  • exclude_undefined (bool) – if True then undefined parameters (i.e. those with null values) will not be returned

Yields:

String – parameter names of the form SECTION[:SUBSECTION][.ATTR]
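Glob-style filtering of parameter names can be sketched with the standard library's fnmatch module. This is a minimal illustration of the 'pattern' argument only, not the method's implementation: match_params is a hypothetical helper, and apart from 'general.max_concurrent_jobs' the parameter names are examples built from the section and option names mentioned in this document.

```python
from fnmatch import fnmatchcase


def match_params(names, pattern=None):
    """Yield the names that match a glob-style pattern (all if None)."""
    for name in names:
        if pattern is None or fnmatchcase(name, pattern):
            yield name


names = [
    "general.max_concurrent_jobs",
    "organism:Human.star_index",
    "organism:Human.bowtie_index",
    "sequencer:SN7001250.platform",
]
for name in match_params(names, "organism:*"):
    print(name)
```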

report_settings(exclude_undefined=False)

Report the settings read from the config file

Parameters:

exclude_undefined (bool) – if True then parameters with null values will not be shown (default: show all parameters)

Returns:

report of the settings.

Return type:

String

resolve_undefined_params()

Set non-null values for all parameters which are null

Resolution has three stages:

  • Fallback parameters: if a parameter is undefined and has a defined fallback that value is assigned

  • Default parameters: if a parameter is undefined after fallbacks are exhausted and has a defined default then that value is assigned

  • Default runner: any undefined runner parameters are assigned the value of the default runner, if set

Any parameters that are still undefined at this point are then assigned the value of ‘None’.
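The three stages can be sketched on plain dictionaries. This is an illustration under stated assumptions rather than the method's actual code: resolve_undefined is a hypothetical function, a fallback is assumed to be the name of another parameter whose value is copied, runner parameters are assumed to be those whose names start with 'runners.', and all names and values in the example other than 'general.max_concurrent_jobs' are made up.

```python
def resolve_undefined(params, fallbacks, defaults, default_runner=None):
    """Return a copy of 'params' with undefined (None) values resolved.

    params:    {name: value}, where None means undefined
    fallbacks: {name: name of the parameter to copy a value from}
    defaults:  {name: default value}
    """
    resolved = dict(params)
    # Stage 1: fallbacks, following chains until a value is found
    for name in list(resolved):
        target, seen = name, {name}
        while resolved[name] is None and target in fallbacks:
            target = fallbacks[target]
            if target in seen:
                break  # guard against circular fallbacks
            seen.add(target)
            resolved[name] = resolved.get(target)
    # Stage 2: defaults for anything still undefined
    for name, value in defaults.items():
        if name in resolved and resolved[name] is None:
            resolved[name] = value
    # Stage 3: default runner for undefined runner parameters
    if default_runner is not None:
        for name in resolved:
            if name.startswith("runners.") and resolved[name] is None:
                resolved[name] = default_runner
    # Anything left over keeps the value None
    return resolved


params = {
    "general.max_concurrent_jobs": 4,
    "general.poll_interval": None,
    "qc.max_jobs": None,
    "runners.qc": None,
    "general.comment": None,
}
resolved = resolve_undefined(
    params,
    fallbacks={"qc.max_jobs": "general.max_concurrent_jobs"},
    defaults={"general.poll_interval": 5},
    default_runner="SimpleJobRunner",
)
for name, value in resolved.items():
    print(name, "=", value)
```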

save(out_file=None, exclude_undefined=True)

Save the current configuration to the config file

If no config file was specified on initialisation then this method doesn’t do anything.

Parameters:
  • out_file (str) – specify output file (default: overwrite initial config file)

  • exclude_undefined (bool) – if True then parameters with null values will not be written to the output config file (default)

set(param, value)

Update a configuration parameter value

NB parameters are referenced by the names that appear in the config file (rather than the internal representation within the Settings instance).

Parameters:
  • param (str) – an identifier of the form SECTION[:SUBSECTION].ATTR which specifies the parameter to update

  • value (str) – the new value of the parameter

auto_process_ngs.settings.fetch_reference_data(s, name)

Fetch specific reference data for all organisms

Given the name of a reference data item (e.g. ‘star_index’), extracts the relevant data for all organisms from the supplied settings object and returns it as a dictionary.

Parameters:
  • s (Settings) – populated Settings instance

  • name (str) – name of the reference data items required (e.g. ‘star_index’)

Returns:

keys are organism names and values are the corresponding reference data.

Return type:

Dictionary
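The shape of the returned dictionary can be sketched with a nested dict standing in for the Settings object. This is illustrative only: collect_reference_data is a hypothetical function, skipping organisms that lack the requested item is an assumed behaviour, and the organism names and paths are made up.

```python
def collect_reference_data(organisms, name):
    """Map organism name -> value of the reference data item 'name'.

    organisms: {organism name: {item name: value}}, a stand-in for
    the organism data held by a populated Settings instance.
    """
    data = {}
    for organism, items in organisms.items():
        value = items.get(name)
        if value is not None:
            # Assumption: organisms without the item are left out
            data[organism] = value
    return data


organisms = {
    "Human": {"star_index": "/data/hg38/star", "bowtie_index": "/data/hg38/bowtie"},
    "Mouse": {"star_index": "/data/mm10/star"},
}
print(collect_reference_data(organisms, "star_index"))
print(collect_reference_data(organisms, "bowtie_index"))
```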

auto_process_ngs.settings.get_config_dir()

Return location of config directory

Returns the path to the ‘config’ directory, or None if it doesn’t exist.

auto_process_ngs.settings.get_install_dir()

Return location of top-level directory of installation

This is a directory one or more levels above the location of this module which contains a ‘config’ subdir with an ‘auto_process.ini.sample’ file, for example: if this file is located in

/opt/auto_process/lib/python3.6/site-packages/auto_process_ngs

then each level will be searched until a matching ‘config’ dir is located.

If it can’t be located then the directory of this module is returned.
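The upward search described above can be sketched as follows. This is a minimal illustration and not the function's actual code: find_install_dir is a hypothetical helper that takes the module directory as an argument, and the directory layout in the demonstration is created in a temporary location.

```python
import os
import tempfile


def find_install_dir(module_dir):
    """Search the parents of 'module_dir' for a 'config' subdir
    containing 'auto_process.ini.sample'; return 'module_dir' itself
    if no match is found.
    """
    start = os.path.abspath(module_dir)
    path = os.path.dirname(start)
    while True:
        sample = os.path.join(path, "config", "auto_process.ini.sample")
        if os.path.isfile(sample):
            return path
        parent = os.path.dirname(path)
        if parent == path:
            # Reached the filesystem root without finding a match
            return start
        path = parent


with tempfile.TemporaryDirectory() as top:
    module_dir = os.path.join(top, "lib", "site-packages", "auto_process_ngs")
    os.makedirs(module_dir)
    os.makedirs(os.path.join(top, "config"))
    open(os.path.join(top, "config", "auto_process.ini.sample"), "w").close()
    found = find_install_dir(module_dir)
    matches_top = (found == os.path.abspath(top))
print(matches_top)
```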

auto_process_ngs.settings.locate_settings_file(name='auto_process.ini', create_from_sample=False)

Locate configuration settings file

Look for a configuration settings file (default name ‘auto_process.ini’). The search path is:

  1. file specified by the AUTO_PROCESS_CONF environment variable (if it exists)

  2. current directory

  3. ‘config’ subdir of installation location

  4. top-level installation location

The first file with a matching name is returned.

If no matching file is located but one of the locations contains a file with the correct name ending in ‘.sample’, and if the ‘create_from_sample’ argument is set, then use this to make a settings file in the same location.

Returns the path to a settings file, or None if one isn’t found.
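The search order and the 'create_from_sample' behaviour can be sketched as below. This is an illustration under stated assumptions, not the function's implementation: locate_settings is a hypothetical helper, the 'search_dirs' and 'environ' parameters are introduced only for the example (the caller would pass the current directory, the 'config' subdir, and the installation directory, in that order), and copying the sample with shutil.copy is an assumed way of creating the new file.

```python
import os
import shutil
import tempfile


def locate_settings(name="auto_process.ini", search_dirs=(),
                    create_from_sample=False, environ=None):
    """Return the path to a settings file, or None if none is found."""
    environ = os.environ if environ is None else environ
    # 1. File named by the AUTO_PROCESS_CONF environment variable
    env_file = environ.get("AUTO_PROCESS_CONF")
    if env_file and os.path.isfile(env_file):
        return env_file
    # 2-4. First directory holding a file with a matching name
    for dirn in search_dirs:
        candidate = os.path.join(dirn, name)
        if os.path.isfile(candidate):
            return candidate
    # No match: optionally make a settings file from a '.sample' copy
    if create_from_sample:
        for dirn in search_dirs:
            sample = os.path.join(dirn, name + ".sample")
            if os.path.isfile(sample):
                target = os.path.join(dirn, name)
                shutil.copy(sample, target)
                return target
    return None


with tempfile.TemporaryDirectory() as dirn:
    open(os.path.join(dirn, "auto_process.ini.sample"), "w").close()
    before = locate_settings(search_dirs=[dirn], environ={})
    after = locate_settings(search_dirs=[dirn], create_from_sample=True,
                            environ={})
    created = os.path.basename(after)
    found_again = locate_settings(search_dirs=[dirn], environ={}) == after
print(before, created, found_again)
```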

auto_process_ngs.settings.show_dictionary(d, indent='   ', exclude_value=None)

Print the contents of a dictionary

Parameters:
  • d (dict) – dictionary instance to show

  • exclude_value (object) – optional, if not ‘None’ then don’t include entries which match this value