auto_process_ngs.settings

Classes and functions for handling the collection of configuration settings for automated processing from ‘ini’-style files.

The core class is called GenericSettings and provides the base functionality for defining and handling arbitrary configuration files. This should be subclassed to handle specific configurations, by defining parameters in different sections along with their types and default values (as required).

For example, a simple .ini file could look like:

[general]
max_cores = 12
#max_jobs = 24
#poll_interval = 0.5

A simple settings subclass to handle this might look like:

class SimpleSettings(GenericSettings):
    def __init__(self, settings_file):
        GenericSettings.__init__(
            self,
            settings = {
                "general": { "max_cores": int,
                             "max_jobs": int,
                             "poll_interval": float },
            },
            defaults = {
                "general.max_cores": 8,
                "general.max_jobs": 24,
                "general.poll_interval": 0.5 },
            settings_file = settings_file)

The values of the parameters can then be accessed either via a chain of attributes, for example:

>>> s.general.max_cores
12

or via keys, for example:

>>> s["general"]["max_cores"]
12
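
The combination of type functions, defaults and the two access styles can be illustrated without the package itself. The following is a minimal sketch only, built on the standard library configparser module; AttrDict and load_values are hypothetical names introduced for illustration and are not part of GenericSettings:

```python
import configparser


class AttrDict(dict):
    """Hypothetical dict whose keys can also be read as attributes."""

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name) from None


def load_values(ini_text, settings, defaults):
    """Read ini text, convert declared parameters and apply defaults."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    values = AttrDict()
    for section, params in settings.items():
        values[section] = AttrDict()
        for name, param_type in params.items():
            if parser.has_option(section, name):
                # Convert the raw string using the declared type function
                values[section][name] = param_type(parser.get(section, name))
            else:
                # Commented-out or missing parameters take the default
                values[section][name] = defaults.get(f"{section}.{name}")
    return values


s = load_values(
    "[general]\nmax_cores = 12\n#max_jobs = 24\n#poll_interval = 0.5\n",
    settings={"general": {"max_cores": int,
                          "max_jobs": int,
                          "poll_interval": float}},
    defaults={"general.max_cores": 8,
              "general.max_jobs": 24,
              "general.poll_interval": 0.5},
)
print(s.general.max_cores, s["general"]["max_jobs"])
```

In this sketch max_cores is read from the file, while the commented-out max_jobs and poll_interval take their default values.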

To update values programmatically after they have been set, use the set method with the fully-qualified parameter name, for example:

>>> s.set("general.max_cores", 8)

To print the values of all parameters use the report_settings method:

>>> s.report_settings()

and to write the settings back to file use the save method:

>>> s.save()

More advanced configuration settings can also be defined, including named and user-defined subsections, and user-defined parameters.

Named subsections, for example:

[general:cores]
max_cores = 12

[general:jobs]
#max_jobs = 24
#poll_interval = 0.5

could be handled by:

class SimpleSettings(GenericSettings):
    def __init__(self, settings_file):
        GenericSettings.__init__(
            self,
            settings = {
                "general:cores": { "max_cores": int },
                "general:jobs": { "max_jobs": int,
                                  "poll_interval": float },
            },
            defaults = {
                "general:cores.max_cores": 8,
                "general:jobs.max_jobs": 24,
                "general:jobs.poll_interval": 0.5 },
            settings_file = settings_file)

User-defined subsections can be managed using a wild card for the section name, for example:

[organism:human]
fasta = /data/hg38.fasta

[organism:mouse]
fasta = /data/mm10.fasta

could be handled by:

class SimpleSettings(GenericSettings):
    def __init__(self, settings_file):
        GenericSettings.__init__(
            self,
            settings = {
                "organism:*": { "fasta": str },
            },
            settings_file = settings_file)

Alternatively a section can include user-defined parameters in a similar way:

[templates]
default = "a,b,c"
mine = "d,e,f"

could be handled by:

class SimpleSettings(GenericSettings):
    def __init__(self, settings_file):
        GenericSettings.__init__(
            self,
            settings = {
                "templates": { "*": str },
            },
            settings_file = settings_file)
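
One way to picture how wildcard section and parameter names might be matched against the names found in a file is glob-style matching. This is a sketch under that assumption, using the standard library fnmatch module; find_type is a hypothetical helper and does not describe how GenericSettings performs the lookup internally:

```python
from fnmatch import fnmatchcase

# Same structure as the 'settings' argument in the examples above
settings = {
    "organism:*": {"fasta": str},
    "templates": {"*": str},
}


def find_type(settings, section, name):
    """Return the type function declared for section/name, or None."""
    for section_pattern, params in settings.items():
        if not fnmatchcase(section, section_pattern):
            continue
        for name_pattern, param_type in params.items():
            if fnmatchcase(name, name_pattern):
                return param_type
    # Nothing declared for this section/parameter combination
    return None
```

With these definitions, find_type(settings, "organism:mouse", "fasta") and find_type(settings, "templates", "mine") both return str, while an undeclared parameter such as "organism:mouse" with "gtf" gives None.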

Optionally it is also possible to specify “fallbacks” (parameters from which other unassigned parameters should take their values) and legacy parameters (deprecated parameters which can be used to supply values to other parameters, as a mechanism for backwards compatibility).

The Settings class is built on the GenericSettings class specifically to handle the auto_process_ngs configuration file (by default called ‘auto_process.ini’).

The simplest usage example is:

>>> s = Settings()

The ‘locate_settings_file’ function is used implicitly to locate the settings file if none is given; it can also automatically create a settings file if none is found.

Two ‘type’ functions are also included:

  • ‘jobrunner’: returns a JobRunner instance

  • ‘path’: returns a string with variables expanded

class auto_process_ngs.settings.GenericSettings(settings, defaults={}, fallbacks={}, legacy={}, expand_vars=[], aliases={}, settings_file=None, resolve_undefined=True)

Base class for handling .ini configuration files

Parameters:
  • settings (dict) – mapping of section names to dictionaries defining parameters names and types

  • defaults (dict) – dictionary of fully qualified parameter names mapped to default settings

  • fallbacks (dict) – dictionary of fully qualified parameter names mapped to fallback parameters

  • legacy (dict) – dictionary of fully qualified parameter names mapped to “legacy” fallback parameter names

  • expand_vars (list) – list of fully qualified parameter names where the values can be expanded by substituting environment variables

  • aliases (dict) – dictionary defining “aliases” for section names

  • settings_file (str) – path to an .ini format file to load values from

  • resolve_undefined (bool) – if True (default) then assign values to “null” parameters by checking fallback parameters and default values

fetch_value(param)

Return the value stored against a parameter

NB parameters are referenced by the names that appear in the config file (rather than the internal representation within the Settings instance).

Parameters:

param (str) – parameter name of the form SECTION[:SUBSECTION][.NAME]
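
The SECTION[:SUBSECTION][.NAME] form can be broken into its components as in the sketch below. split_param is a hypothetical helper for illustration, and it assumes that section and subsection names contain neither ‘.’ nor ‘:’:

```python
def split_param(param):
    """Split 'SECTION[:SUBSECTION][.NAME]' into its three components.

    Components that are not present are returned as None.
    """
    # Everything after the first '.' is the parameter name
    section, _, name = param.partition(".")
    # Anything after ':' in the section part is the subsection
    section, _, subsection = section.partition(":")
    return (section, subsection or None, name or None)
```

For example, split_param("organism:human.fasta") gives ("organism", "human", "fasta") and split_param("general") gives ("general", None, None).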

has_subsections(section)

Check if section contains subsections

Parameters:

section (str) – name of the section to check

list_params(pattern=None, exclude_undefined=False)

Return (yield) all the stored parameters

NB parameters are referenced by the names that appear in the config file (rather than the internal representation within the Settings instance).

Parameters:
  • pattern (str) – optional glob-style pattern; if supplied then only parameters matching the pattern will be returned

  • exclude_undefined (bool) – if True then undefined parameters (i.e. those with null values) will not be returned

Yields:

String – parameter names of the form SECTION[:SUBSECTION][.NAME]
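
The effect of the glob-style pattern argument can be sketched with a small generator. iter_params and the nested values dictionary are hypothetical stand-ins for the stored settings, and the matching is assumed to follow the rules of the standard library fnmatch module:

```python
from fnmatch import fnmatchcase


def iter_params(values, pattern=None):
    """Yield 'SECTION[:SUBSECTION].NAME' names, optionally filtered."""
    for section, params in values.items():
        for name in params:
            param = f"{section}.{name}"
            # Skip names that don't match the optional glob pattern
            if pattern is not None and not fnmatchcase(param, pattern):
                continue
            yield param


# Hypothetical stored values, keyed by section then parameter name
values = {
    "general": {"max_cores": 12, "max_jobs": None},
    "organism:human": {"fasta": "/data/hg38.fasta"},
}
```

Here list(iter_params(values, pattern="organism:*")) contains only "organism:human.fasta", whereas calling it without a pattern yields all three parameter names.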

report_settings(exclude_undefined=False)

Report the settings read from the config file

Parameters:

exclude_undefined (bool) – if True then parameters with null values will not be shown (default: show all parameters)

Returns:

report of the settings.

Return type:

String

resolve_undefined_params()

Set non-null values for all parameters which are null

Resolution comprises the following stages:

  • Legacy parameters: if a parameter is unset (i.e. null) and has a “legacy” fallback defined, then it will be set to the value of the legacy parameter;

  • Fallback parameters: if a parameter is unset and has a standard fallback defined, then it will be set to the value of the fallback;

  • Default parameters: if a parameter is unset after the fallbacks are exhausted but has a defined default, then it will be set to the default value.

A final round of checking fallback parameters is then performed, in case fallbacks that were previously unset have been assigned a default value.

Any parameters that are still undefined at this point are then assigned the value of ‘None’.
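
The order of the stages above can be sketched on a flat dictionary of fully qualified parameter names. This is an illustration of the described order only, not the method's actual code; resolve_undefined and the parameter names in the example are hypothetical, and null values are represented by None:

```python
def resolve_undefined(values, legacy=None, fallbacks=None, defaults=None):
    """Return a copy of 'values' with null (None) entries resolved."""
    resolved = dict(values)

    def copy_from(sources):
        # 'sources' maps a parameter name to the parameter it borrows from
        for param, source in (sources or {}).items():
            if resolved.get(param) is None and resolved.get(source) is not None:
                resolved[param] = resolved[source]

    copy_from(legacy)      # 1. legacy parameters
    copy_from(fallbacks)   # 2. fallback parameters
    for param, value in (defaults or {}).items():
        if resolved.get(param) is None:
            resolved[param] = value   # 3. default values
    copy_from(fallbacks)   # final round: fallbacks may now hold defaults
    # Anything still unresolved keeps the value None
    return resolved


result = resolve_undefined(
    {"new.runner": None, "old.runner": "local",
     "qc.cores": None, "general.max_cores": None,
     "general.notes": None},
    legacy={"new.runner": "old.runner"},
    fallbacks={"qc.cores": "general.max_cores"},
    defaults={"general.max_cores": 8},
)
```

In this example qc.cores can only be resolved in the final round, because its fallback general.max_cores is itself unset until the defaults have been applied.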

save(out_file=None, exclude_undefined=True)

Save the current configuration to the config file

If no config file was specified on initialisation then this method doesn’t do anything.

Parameters:
  • out_file (str) – specify output file (default: overwrite initial config file)

  • exclude_undefined (bool) – if True then parameters with null values will not be written to the output config file (default)

section_name(name)

Return the internal name for a section

set(param, value)

Update a configuration parameter value

NB parameters are referenced by the names that appear in the config file (rather than the internal representation within the Settings instance).

Parameters:
  • param (str) – an identifier of the form SECTION[:SUBSECTION].NAME which specifies the parameter to update

  • value (str) – the new value of the parameter

property settings_file

Return path to settings file as a string

transform_parameter(param, source_pattern, target_pattern)

Transforms parameter by mapping from a source to a target pattern

Given a parameter, map this to another parameter using a source pattern (e.g. section.*) and a target pattern (e.g. new_section:*.value).

The value in the * wildcard position in the target is replaced by that in the wildcard position from the source (e.g. transforming section.name to new_section:name.value).

Parameters:
  • param (str) – fully-qualified source parameter name to transform

  • source_pattern (str) – pattern for source parameter

  • target_pattern (str) – pattern to use for transformation

Returns:

transformed parameter name.

Return type:

String
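
The mapping described for this method can be sketched with a regular expression. map_parameter is a hypothetical stand-alone function; it assumes exactly one * wildcard in each pattern, and returning None for a non-matching parameter is an assumption of the sketch rather than documented behaviour:

```python
import re


def map_parameter(param, source_pattern, target_pattern):
    """Map 'param' from a source pattern onto a target pattern."""
    if source_pattern.count("*") != 1 or target_pattern.count("*") != 1:
        raise ValueError("patterns must contain exactly one '*'")
    # Turn the source pattern into a regex capturing the wildcard text
    prefix, _, suffix = source_pattern.partition("*")
    regex = re.escape(prefix) + "(.+)" + re.escape(suffix)
    match = re.fullmatch(regex, param)
    if match is None:
        # Assumption: parameters that don't match the source give None
        return None
    # Substitute the captured text into the target's wildcard position
    return target_pattern.replace("*", match.group(1), 1)
```

For example, map_parameter("section.name", "section.*", "new_section:*.value") returns "new_section:name.value".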

update_value(value, param_type=None)

Update raw value by stripping quotes and converting to type

‘None’ values are converted to “null”; other non-null values will have any surrounding quotes removed and then are converted to type.

Boolean values will also have ‘yes’ and ‘True’ converted to True and ‘no’ and ‘False’ converted to False.

Parameters:
  • value (object) – raw value

  • param_type (function) – type conversion function
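
The conversion rules above can be sketched as a stand-alone function. convert_value is a hypothetical name, the “null” value is represented here by None, and only the boolean strings listed above are special-cased:

```python
def convert_value(value, param_type=None):
    """Strip surrounding quotes from a raw value and convert its type."""
    if value is None:
        # Null values are passed through untouched
        return None
    if isinstance(value, str):
        value = value.strip()
        # Remove one matching pair of surrounding quotes, if present
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "\"'":
            value = value[1:-1]
    if param_type is bool and isinstance(value, str):
        # bool("no") would be True, so these strings need explicit handling
        if value in ("yes", "True"):
            return True
        if value in ("no", "False"):
            return False
    if param_type is not None:
        return param_type(value)
    return value
```

For example, convert_value('"a,b,c"', str) gives the unquoted string a,b,c and convert_value("no", bool) gives False.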

class auto_process_ngs.settings.Settings(settings_file=None, resolve_undefined=True)

Handle local settings for auto_process parameters

Defines a set of configuration parameters and provides an interface for loading and accessing local settings defined in an .ini file.

If a config file isn’t explicitly specified then the instance will attempt to locate one by searching various locations (as defined within the locate_auto_process_settings_file function), first using the name auto_process.ini, then with the legacy name settings.ini.

If a config file still cannot be found then the parameters will be set to any default values defined within the Settings class.

Parameters:
  • settings_file (str) – optional, path to .ini file to load parameters from

  • resolve_undefined (bool) – if True (default) then assign values to “null” parameters by checking fallback parameters and default values

auto_process_ngs.settings.fetch_reference_data(s, name)

Fetch specific reference data for all organisms

Given the name of a reference data item (e.g. ‘star_index’), extracts the relevant data for all organisms from the supplied settings object and returns it as a dictionary.

Parameters:
  • s (Settings) – populated Settings instance

  • name (str) – name of the reference data items required (e.g. ‘star_index’)

Returns:

keys are organism names and values are the corresponding reference data.

Return type:

Dictionary

auto_process_ngs.settings.get_config_dir(file_path, sentry_file=None)

Return location of config directory

Given the path to a Python module file (typically from the __file__ variable within a module), attempts to return the path to the config directory (optionally the config directory must also contain a file with the supplied sentry_file name).

Returns:

path to the config directory (or None if it isn’t found).

Return type:

String

Parameters:
  • file_path (str) – path to installed file for calling module

  • sentry_file (str) – optional, name of file that must be found in the config directory

auto_process_ngs.settings.get_install_dir(file_path, sentry_file=None)

Return location of top-level dir of an installation

Given the path to a Python module file (typically from the __file__ variable within a module), attempts to return the path to the top-level installation directory.

This is assumed to be a directory one or more levels above the location of the supplied file path, which contains a config subdir (optionally the config directory must also contain a file with the supplied sentry_file name).

For example: if the file for this module is located in:

/opt/auto_process/lib/python3.6/site-packages/auto_process_ngs

then the function can be called with this module using:

>>> d = get_install_dir(__file__, sentry_file="auto_process.ini.sample")

and each level will be searched until a matching config directory is located which contains an auto_process.ini.sample file.

If a match can’t be found then the directory of this module is returned.

Parameters:
  • file_path (str) – path to installed file for calling module

  • sentry_file (str) – optional, name of file that must be found in the config directory
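
The upward search described above can be sketched as follows. find_install_dir is a hypothetical stand-alone function; the subdirectory name config is taken from the description, and returning the directory of the supplied file when nothing matches is how the sketch interprets the documented fallback:

```python
import os


def find_install_dir(file_path, sentry_file=None):
    """Walk upwards from 'file_path' looking for a 'config' subdirectory."""
    start = os.path.dirname(os.path.abspath(file_path))
    path = start
    while True:
        config_dir = os.path.join(path, "config")
        if os.path.isdir(config_dir):
            # Optionally require the sentry file inside the config dir
            if sentry_file is None or os.path.isfile(
                os.path.join(config_dir, sentry_file)
            ):
                return path
        parent = os.path.dirname(path)
        if parent == path:
            # Reached the filesystem root without finding a match
            return start
        path = parent
```

Supplying a sentry file name makes the search stricter, since an unrelated config directory higher up the tree will not be mistaken for the installation's own.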

auto_process_ngs.settings.jobrunner(runner)

Implement a ‘type’ function for JobRunner instances

Parameters:

runner (object) – JobRunner instance or string definition

Returns:

instance created from the supplied definition.

Return type:

JobRunner

auto_process_ngs.settings.locate_auto_process_settings_file(name='auto_process.ini', create_from_sample=False)

Locate configuration settings file for ‘auto_process_ngs’

Wraps the ‘locate_settings_file’ function with environment variables and search path specific to the ‘auto_process_ngs’ package.

Parameters:
  • name (str) – base name of configuration file (default: “auto_process.ini”)

  • create_from_sample (bool) – if True then create a new configuration file from sample version, if found (default: False, don’t create new file)

Returns:

path to the configuration file (None if one isn’t found).

Return type:

String

auto_process_ngs.settings.locate_settings_file(name, env_vars=None, paths=None, create_from_sample=False)

Locate configuration settings file

Look for a configuration settings file with the specified name, first checking locations pointed at by the specified environment variables listed in ‘env_vars’ followed by the specified paths listed in ‘paths’, and returning the first match.

Environment variables and paths are checked in the order supplied.

If no match is found but one of the paths includes a sample configuration file (indicated by having the suffix ‘.sample’ appended to the name) and ‘create_from_sample’ is True, then a new configuration file will be created in that location based on the contents of the sample file.

Example usage:

>>> cf = locate_settings_file("auto_process.ini",
...                           env_vars=["AUTO_PROCESS_CONF"],
...                           paths=[os.getcwd(),
...                                  get_config_dir(__file__),
...                                  get_install_dir(__file__)])

Parameters:
  • name (str) – name of .ini file to locate

  • env_vars (list) – list of environment variables to check for config files

  • paths (list) – list of directories to search for config files (default: current directory)

  • create_from_sample (bool) – if True then attempt to create a new settings file from a sample file, if located (default: False)

Returns:

path to the configuration file (None if one isn’t found).

Return type:

String
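
The search order described above can be sketched as follows. find_settings_file is a hypothetical stand-alone function, not the package's implementation; in particular, treating an environment variable as naming either the file itself or a directory that contains it is an assumption of the sketch:

```python
import os
import shutil


def find_settings_file(name, env_vars=None, paths=None,
                       create_from_sample=False):
    """Return the first matching settings file, or None."""
    # 1. Environment variables, in the order supplied
    for var in env_vars or []:
        location = os.environ.get(var)
        if not location:
            continue
        # Assumption: the variable may point at a file or a directory
        if os.path.isfile(location):
            return location
        candidate = os.path.join(location, name)
        if os.path.isfile(candidate):
            return candidate
    # 2. Search paths, in the order supplied (default: current directory)
    search_paths = paths if paths is not None else [os.getcwd()]
    for path in search_paths:
        candidate = os.path.join(path, name)
        if os.path.isfile(candidate):
            return candidate
    # 3. Optionally create a new file from the first sample found
    if create_from_sample:
        for path in search_paths:
            sample = os.path.join(path, name + ".sample")
            if os.path.isfile(sample):
                target = os.path.join(path, name)
                shutil.copy(sample, target)
                return target
    return None
```

Note that in this sketch a sample file is only consulted after every environment variable and path has been checked for an existing settings file.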

auto_process_ngs.settings.path(p)

Implement a ‘type’ function for paths

Converts to a string and expands any variables in the resulting path

Parameters:

p (str) – path definition (can include shell variables)

Returns:

version of input path with any shell variables expanded to their values.

Return type:

String
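
A ‘type’ function of this kind can be sketched with the standard library os.path.expandvars function. expand_path and the DEMO_DATA_DIR variable are hypothetical names used for illustration; whether the real function performs any further expansion (such as a leading ~) is not stated here, so the sketch does not attempt it:

```python
import os


def expand_path(p):
    """Convert to a string and expand variables such as $HOME."""
    return os.path.expandvars(str(p))


# Hypothetical variable, set here only for the demonstration
os.environ["DEMO_DATA_DIR"] = "/data"
expanded = expand_path("$DEMO_DATA_DIR/hg38.fasta")
```

Variables that are not defined in the environment are left unchanged by os.path.expandvars.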

auto_process_ngs.settings.show_dictionary(d, indent='   ', exclude_value=None, exclude_wildcards=True)

Print the contents of a dictionary

Parameters:
  • d (dict) – dictionary instance to show

  • exclude_value (object) – optional, if not ‘None’ then don’t include entries which match this value

  • exclude_wildcards (bool) – optional, if True (default) then don’t include keys which are wildcards (i.e. “*”)