auto_process_ngs.settings
Classes and functions for handling the collection of configuration settings for automated processing from ‘ini’-style files.
The core class is called GenericSettings and provides the base
functionality for defining and handling arbitrary configuration files.
This should be subclassed to handle specific configurations, by
defining parameters in different sections along with their types and
default values (as required).
For example, a simple .ini file could look like:
[general]
max_cores = 12
#max_jobs = 24
#poll_interval = 0.5
A simple settings subclass to handle this might look like:
class SimpleSettings(GenericSettings):
    def __init__(self, settings_file):
        GenericSettings.__init__(
            self,
            settings = {
                "general": { "max_cores": int,
                             "max_jobs": int,
                             "poll_interval": float },
            },
            defaults = {
                "general.max_cores": 8,
                "general.max_jobs": 24,
                "general.poll_interval": 0.5 },
            settings_file = settings_file)
The values of the parameters can then be accessed either via a chain of attributes, for example:
>>> s.general.max_cores
12
or via keys, for example:
>>> s["general"]["max_cores"]
12
To update values programmatically after they have been set, use the
set method with the fully-qualified parameter name, for example:
>>> s.set("general.max_cores", 8)
To print the values of all parameters use the report_settings
method:
>>> s.report_settings()
and to write the settings back to file use the save method:
>>> s.save()
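To make the roles of the types and defaults concrete, the following is a minimal stand-alone sketch that uses only the standard library configparser module. It is not how GenericSettings is implemented; the load_settings helper and the SCHEMA, DEFAULTS and INI_TEXT names are hypothetical and exist only for illustration.

```python
import configparser

# Hypothetical schema and defaults mirroring the SimpleSettings example;
# this is a stand-alone illustration, not GenericSettings itself.
SCHEMA = {"general": {"max_cores": int, "max_jobs": int, "poll_interval": float}}
DEFAULTS = {
    "general.max_cores": 8,
    "general.max_jobs": 24,
    "general.poll_interval": 0.5,
}

INI_TEXT = """
[general]
max_cores = 12
#max_jobs = 24
#poll_interval = 0.5
"""


def load_settings(text, schema, defaults):
    """Return {section: {name: value}} with types and defaults applied."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    values = {}
    for section, params in schema.items():
        values[section] = {}
        for name, convert in params.items():
            if parser.has_option(section, name):
                # Present in the file: convert the raw string to the declared type
                values[section][name] = convert(parser.get(section, name))
            else:
                # Commented out or missing: use the default (None if there is none)
                values[section][name] = defaults.get(f"{section}.{name}")
    return values


settings = load_settings(INI_TEXT, SCHEMA, DEFAULTS)
print(settings["general"]["max_cores"])      # value read from the file
print(settings["general"]["poll_interval"])  # value taken from the defaults
```

In this sketch the value given in the file takes precedence, and the commented-out parameters pick up their defaults.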
More advanced configuration settings can also be defined, including named and user-defined subsections, and user-defined parameters.
Named subsections, for example:
[general:cores]
max_cores = 12
[general:jobs]
#max_jobs = 24
#poll_interval = 0.5
could be handled by:
class SimpleSettings(GenericSettings):
    def __init__(self, settings_file):
        GenericSettings.__init__(
            self,
            settings = {
                "general:cores": { "max_cores": int },
                "general:jobs": { "max_jobs": int,
                                  "poll_interval": float },
            },
            defaults = {
                "general:cores.max_cores": 8,
                "general:jobs.max_jobs": 24,
                "general:jobs.poll_interval": 0.5 },
            settings_file = settings_file)
User-defined subsections can be managed using a wild card for the section name, for example:
[organism:human]
fasta = /data/hg38.fasta
[organism:mouse]
fasta = /data/mm10.fasta
could be handled by:
class SimpleSettings(GenericSettings):
    def __init__(self, settings_file):
        GenericSettings.__init__(
            self,
            settings = {
                "organism:*": { "fasta": str },
            },
            settings_file = settings_file)
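The idea of matching section names against a wildcard can be illustrated with a small stand-alone sketch built on configparser and fnmatch. This is only an approximation of the behaviour described above, not the GenericSettings implementation; load_wildcard_sections is a hypothetical helper.

```python
import configparser
from fnmatch import fnmatchcase

# Hypothetical schema with a wildcard section name, as in the example above
SCHEMA = {"organism:*": {"fasta": str}}

INI_TEXT = """
[organism:human]
fasta = /data/hg38.fasta

[organism:mouse]
fasta = /data/mm10.fasta
"""


def load_wildcard_sections(text, schema):
    """Collect every section in the file whose name matches a schema pattern."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    values = {}
    for section in parser.sections():
        for pattern, params in schema.items():
            if fnmatchcase(section, pattern):
                # Apply the declared type to each parameter found in the section
                values[section] = {
                    name: convert(parser.get(section, name))
                    for name, convert in params.items()
                    if parser.has_option(section, name)
                }
    return values


organisms = load_wildcard_sections(INI_TEXT, SCHEMA)
for section, params in organisms.items():
    print(section, params["fasta"])
```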
Alternatively a section can include user-defined parameters in a similar way:
[templates]
default = "a,b,c"
mine = "d,e,f"
class SimpleSettings(GenericSettings):
    def __init__(self, settings_file):
        GenericSettings.__init__(
            self,
            settings = {
                "templates": { "*": str },
            },
            settings_file = settings_file)
Optionally it is also possible to specify “fallbacks” (parameters from which other unassigned parameters should take their values) and legacy parameters (deprecated parameters which can be used to supply values to other parameters, as a mechanism for backwards compatibility).
The Settings class is built using the GenericSettings class
specifically to handle the configuration file for auto_process_ngs
(by default called ‘auto_process.ini’).
The simplest usage example is:
>>> s = Settings()
The ‘locate_settings_file’ function is used implicitly to locate the settings file if none is given; it can also automatically create a settings file if none is found.
Two ‘type’ functions are also included:
‘jobrunner’: returns a JobRunner instance
‘path’: returns a string with variables expanded
- class auto_process_ngs.settings.GenericSettings(settings, defaults={}, fallbacks={}, legacy={}, expand_vars=[], aliases={}, settings_file=None, resolve_undefined=True)
Base class for handling .ini configuration files
- Parameters:
settings (dict) – mapping of section names to dictionaries defining parameters names and types
defaults (dict) – dictionary of fully qualified parameter names mapped to default settings
fallbacks (dict) – dictionary of fully qualified parameter names mapped to fallback parameters
legacy (dict) – dictionary of fully qualified parameter names mapped to “legacy” fallback parameter names
expand_vars (list) – list of fully qualified parameter names where the values can be expanded by substituting environment variables
aliases (dict) – dictionary defining “aliases” for section names
settings_file (str) – path to an .ini format file to load values from
resolve_undefined (bool) – if True (default) then assign values to “null” parameters by checking fallback parameters and default values
- fetch_value(param)
Return the value stored against a parameter
NB parameters are referenced by the names that appear in the config file (rather than the internal representation within the Settings instance).
- Parameters:
param (str) – parameter name of the form SECTION[:SUBSECTION][.NAME]
- has_subsections(section)
Check if section contains subsections
- Parameters:
section (str) – name of the section to check
- list_params(pattern=None, exclude_undefined=False)
Return (yield) all the stored parameters
NB parameters are referenced by the names that appear in the config file (rather than the internal representation within the Settings instance).
- Parameters:
pattern (str) – optional glob-style pattern; if supplied then only parameters matching the pattern will be returned
exclude_undefined (bool) – if True then undefined parameters (i.e. those with null values) will not be returned
- Yields:
String – parameter names of the form SECTION[:SUBSECTION][.NAME]
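As an illustration of glob-style filtering on fully qualified names, here is a minimal sketch using the standard fnmatch module. The iter_param_names generator and the VALUES dictionary are hypothetical; the real method operates on the internal state of the settings instance, which is not reproduced here.

```python
from fnmatch import fnmatchcase

# Hypothetical stored values, keyed by section and then by parameter name
VALUES = {
    "general": {"max_cores": 12, "max_jobs": 24},
    "organism:human": {"fasta": "/data/hg38.fasta"},
    "organism:mouse": {"fasta": "/data/mm10.fasta"},
}


def iter_param_names(values, pattern=None):
    """Yield names of the form SECTION[:SUBSECTION].NAME, optionally filtered."""
    for section, params in values.items():
        for name in params:
            full_name = f"{section}.{name}"
            # With no pattern every name is yielded; otherwise only glob matches
            if pattern is None or fnmatchcase(full_name, pattern):
                yield full_name


organism_params = list(iter_param_names(VALUES, pattern="organism:*"))
print(organism_params)
```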
- report_settings(exclude_undefined=False)
Report the settings read from the config file
- Parameters:
exclude_undefined (bool) – if True then parameters with null values will not be shown (default: show all parameters)
- Returns:
report of the settings.
- Return type:
String
- resolve_undefined_params()
Set non-null values for all parameters which are null
Resolution comprises the following stages:
Legacy parameters: if a parameter is unset (i.e. null) and has a “legacy” fallback defined, then it will be set to the value of the legacy parameter;
Fallback parameters: if a parameter is unset and has a standard fallback defined, then it will be set to the value of the fallback;
Default parameters: if a parameter is unset after the fallbacks are exhausted but has a defined default, then it will be set to the default value.
A final round of checking fallback parameters is then performed, in case fallbacks that were previously unset have been assigned a default value.
Any parameters that are still undefined at this point are then assigned the value of ‘None’.
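The resolution order described above can be sketched with plain dictionaries. This is an illustrative approximation only: it uses None for unset values throughout (so the final "assign None" stage needs no extra work), and the resolve_undefined function and all parameter names in the example are hypothetical.

```python
def resolve_undefined(values, legacy, fallbacks, defaults):
    """Return a copy of 'values' with unset (None) parameters resolved.

    values:    {fully qualified name: value or None}
    legacy:    {parameter: legacy parameter to copy from}
    fallbacks: {parameter: fallback parameter to copy from}
    defaults:  {parameter: default value}
    """
    resolved = dict(values)

    def copy_from(mapping):
        # Fill an unset parameter from its source, if the source has a value
        for param, source in mapping.items():
            if resolved.get(param) is None and resolved.get(source) is not None:
                resolved[param] = resolved[source]

    copy_from(legacy)       # stage 1: legacy parameters
    copy_from(fallbacks)    # stage 2: standard fallbacks
    for param, default in defaults.items():
        if resolved.get(param) is None:
            resolved[param] = default   # stage 3: defaults
    copy_from(fallbacks)    # final round: fallbacks that have just gained a default
    return resolved


# Hypothetical parameter names, for illustration only
values = {
    "general.max_cores": None,
    "general.max_concurrent_jobs": 4,
    "general.max_jobs": None,
    "qc.nprocessors": None,
    "qc.report": None,
}
resolved = resolve_undefined(
    values,
    legacy={"general.max_jobs": "general.max_concurrent_jobs"},
    fallbacks={"qc.nprocessors": "general.max_cores"},
    defaults={"general.max_cores": 8},
)
print(resolved)
```

Here qc.nprocessors is only filled in during the final round, because its fallback (general.max_cores) has no value until the defaults are applied.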
- save(out_file=None, exclude_undefined=True)
Save the current configuration to the config file
If no config file was specified on initialisation then this method doesn’t do anything.
- Parameters:
out_file (str) – specify output file (default: overwrite initial config file)
exclude_undefined (bool) – if True then parameters with null values will not be written to the output config file (default)
- section_name(name)
Return the internal name for a section
- set(param, value)
Update a configuration parameter value
NB parameters are referenced by the names that appear in the config file (rather than the internal representation within the Settings instance).
- Parameters:
param (str) – an identifier of the form SECTION[:SUBSECTION].NAME which specifies the parameter to update
value (str) – the new value of the parameter
- property settings_file
Return path to settings file as a string
- transform_parameter(param, source_pattern, target_pattern)
Transforms parameter by mapping from a source to a target pattern
Given a parameter, map this to another parameter using a source pattern (e.g. section.*) and a target pattern (e.g. new_section:*.value).
The value in the * wildcard position in the target is replaced by that in the wildcard position from the source (e.g. transforming section.name to new_section:name.value).
- Parameters:
param (str) – fully-qualified source parameter name to transform
source_pattern (str) – pattern for source parameter
target_pattern (str) – pattern to use for transformation
- Returns:
transformed parameter name.
- Return type:
String
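The mapping described for transform_parameter can be approximated with a regular expression. The sketch below is not the library's implementation: it assumes exactly one * wildcard in each pattern, and returning None when the parameter does not match the source pattern is an assumption made for this example.

```python
import re


def transform_parameter(param, source_pattern, target_pattern):
    """Map 'param' from the source pattern onto the target pattern."""
    # Turn the source pattern into a regex that captures whatever '*' matches
    regex = re.escape(source_pattern).replace(r"\*", "(.+)")
    match = re.fullmatch(regex, param)
    if match is None:
        # Assumption: no match means there is nothing to transform
        return None
    # Substitute the captured text into the wildcard position of the target
    return target_pattern.replace("*", match.group(1), 1)


print(transform_parameter("section.name", "section.*", "new_section:*.value"))
```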
- update_value(value, param_type=None)
Update raw value by stripping quotes and converting to type
‘None’ values are converted to “null”; other non-null values will have any surrounding quotes removed and then are converted to type.
Boolean values will also have ‘yes’ and ‘True’ converted to True and ‘no’ and ‘False’ converted to False.
- Parameters:
value (object) – raw value
param_type (function) – type conversion function
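The conversion rules described for update_value can be sketched as follows. This is a hedged illustration rather than the actual method: None stands in for the library's "null" value, and raising ValueError for an unrecognised boolean string is an assumption.

```python
# Boolean strings named in the description above
_BOOLEANS = {"yes": True, "True": True, "no": False, "False": False}


def update_value(value, param_type=None):
    """Strip surrounding quotes from a raw value and convert it to a type."""
    if value is None:
        return None  # stand-in for the "null" value
    if isinstance(value, str):
        value = value.strip()
        # Remove one pair of matching surrounding quotes, if present
        if len(value) >= 2 and value[0] == value[-1] and value[0] in ("'", '"'):
            value = value[1:-1]
    if param_type is None:
        return value
    if param_type is bool and isinstance(value, str):
        if value not in _BOOLEANS:
            # Assumption: unrecognised boolean strings are treated as errors
            raise ValueError(f"not a recognised boolean: {value!r}")
        return _BOOLEANS[value]
    return param_type(value)


print(update_value('"a,b,c"', str))
print(update_value("12", int))
print(update_value("yes", bool))
```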
- class auto_process_ngs.settings.Settings(settings_file=None, resolve_undefined=True)
Handle local settings for auto_process parameters
Defines a set of configuration parameters and provides an interface for loading and accessing local settings defined in an .ini file.
If a config file isn’t explicitly specified then the instance will attempt to locate one by searching various locations (as defined within the locate_auto_process_settings_file function), first using the name auto_process.ini, then with the legacy name settings.ini.
If a config file still cannot be found then the parameters will be set to any default values defined within the Settings class.
- Parameters:
settings_file (str) – optional, path to .ini file to load parameters from
resolve_undefined (bool) – if True (default) then assign values to “null” parameters by checking fallback parameters and default values
- auto_process_ngs.settings.fetch_reference_data(s, name)
Fetch specific reference data for all organisms
Given the name of a reference data item (e.g. ‘star_index’), extracts the relevant data for all organisms from the supplied settings object and returns it as a dictionary.
- Parameters:
s (Settings) – populated Settings instance
name (str) – name of the reference data items required (e.g. ‘star_index’)
- Returns:
keys are organism names and values are the corresponding reference data.
- Return type:
Dictionary
- auto_process_ngs.settings.get_config_dir(file_path, sentry_file=None)
Return location of config directory
Given the path to a Python module file (typically from the __file__ variable within a module), attempts to return the path to the config directory (optionally the config directory must also contain a file with the supplied sentry_file name).
- Parameters:
file_path (str) – path to installed file for calling module
sentry_file (str) – optional, name of file that must be found in the config directory
- Returns:
path to the config directory (or None if it isn’t found).
- Return type:
String
- auto_process_ngs.settings.get_install_dir(file_path, sentry_file=None)
Return location of top-level dir of an installation
Given the path to a Python module file (typically from the __file__ variable within a module), attempts to return the path to the top-level installation directory.
This is assumed to be a directory one or more levels above the location of the supplied file path, which contains a config subdir (optionally the config directory must also contain a file with the supplied sentry_file name).
For example: if the file for this module is located in:
/opt/auto_process/lib/python3.6/site-packages/auto_process_ngs
then the function can be called with this module using:
>>> d = get_install_dir(__file__, sentry_file="auto_process.ini.sample")
and each level will be searched until a matching config directory is located which contains an auto_process.ini.sample file.
If a match can’t be found then the directory of this module is returned.
- Parameters:
file_path (str) – path to installed file for calling module
sentry_file (str) – optional, name of file that must be found in the config directory
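The upward search described above can be sketched with pathlib. This is an illustration under stated assumptions, not the package's code: find_install_dir is a hypothetical name, and the search is assumed to start at the directory containing the supplied file.

```python
import tempfile
from pathlib import Path


def find_install_dir(file_path, sentry_file=None):
    """Walk upwards from 'file_path' looking for a directory with a 'config' subdir."""
    start = Path(file_path).resolve().parent
    for candidate in (start, *start.parents):
        config_dir = candidate / "config"
        if not config_dir.is_dir():
            continue
        # Optionally require the sentry file to be present in the config dir
        if sentry_file is None or (config_dir / sentry_file).is_file():
            return str(candidate)
    # No match: fall back to the directory of the supplied file
    return str(start)


# Build a throwaway installation layout to exercise the search
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp).resolve()
    module = root / "lib" / "pkg" / "settings.py"
    module.parent.mkdir(parents=True)
    module.touch()
    (root / "config").mkdir()
    (root / "config" / "auto_process.ini.sample").touch()
    found = find_install_dir(module, sentry_file="auto_process.ini.sample")
    print(found == str(root))
```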
- auto_process_ngs.settings.jobrunner(runner)
Implement a ‘type’ function for JobRunner instances
- Parameters:
runner (object) – JobRunner instance or string definition
- Returns:
instance created from the supplied definition.
- Return type:
JobRunner
- auto_process_ngs.settings.locate_auto_process_settings_file(name='auto_process.ini', create_from_sample=False)
Locate configuration settings file for ‘auto_process_ngs’
Wraps the ‘locate_settings_file’ function with environment variables and search path specific to the ‘auto_process_ngs’ package.
- Parameters:
name (str) – base name of configuration file (default: “auto_process.ini”)
create_from_sample (bool) – if True then create a new configuration file from sample version, if found (default: False, don’t create new file)
- Returns:
path to the configuration file (None if one isn’t found).
- Return type:
String
- auto_process_ngs.settings.locate_settings_file(name, env_vars=None, paths=None, create_from_sample=False)
Locate configuration settings file
Look for a configuration settings file with the specified name, first checking locations pointed at by the specified environment variables listed in ‘env_vars’ followed by the specified paths listed in ‘paths’, and returning the first match.
Environment variables and paths are checked in the order supplied.
If no match is found but one of the paths includes a sample configuration file (indicated by having the suffix ‘.sample’ appended to the name) and ‘create_from_sample’ is True, then a new configuration file will be created in that location based on the contents of the sample file.
Example usage:
>>> cf = locate_settings_file("auto_process.ini",
...                           env_vars=["AUTO_PROCESS_CONF"],
...                           paths=[os.getcwd(),
...                                  get_config_dir(),
...                                  get_install_dir()])
- Parameters:
name (str) – name of .ini file to locate
env_vars (list) – list of environment variables to check for config files
paths (list) – list of directories to search for config files (default: current directory)
create_from_sample (bool) – if True then attempt to create a new settings file from a sample file, if located (default: False)
- Returns:
path to the configuration file (None if one isn’t found).
- Return type:
String
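The search order described for locate_settings_file can be sketched using only the standard library. This is an approximation and not the package's implementation: find_settings_file is a hypothetical name, and treating each environment variable as holding either a file path or a directory is an assumption.

```python
import os
import shutil
import tempfile


def find_settings_file(name, env_vars=None, paths=None, create_from_sample=False):
    """Return the first matching settings file, or None if there isn't one."""
    # 1. Locations pointed at by environment variables, in the order supplied
    for var in env_vars or []:
        location = os.environ.get(var)
        if not location:
            continue
        if os.path.isfile(location):
            return location
        candidate = os.path.join(location, name)
        if os.path.isfile(candidate):
            return candidate
    # 2. Explicit search paths (default: current directory)
    search_paths = paths if paths is not None else [os.getcwd()]
    for directory in search_paths:
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate):
            return candidate
    # 3. Optionally create a new file from a '.sample' version
    if create_from_sample:
        for directory in search_paths:
            sample = os.path.join(directory, name + ".sample")
            if os.path.isfile(sample):
                target = os.path.join(directory, name)
                shutil.copyfile(sample, target)
                return target
    return None


with tempfile.TemporaryDirectory() as tmp:
    with open(os.path.join(tmp, "auto_process.ini.sample"), "w") as fp:
        fp.write("[general]\nmax_cores = 12\n")
    # Without create_from_sample nothing is found
    missing = find_settings_file("auto_process.ini", paths=[tmp])
    # With it, the sample is copied into place and the new path returned
    created = find_settings_file("auto_process.ini", paths=[tmp],
                                 create_from_sample=True)
    created_ok = created == os.path.join(tmp, "auto_process.ini")
    print(missing, created_ok)
```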
- auto_process_ngs.settings.path(p)
Implement a ‘type’ function for paths
Converts to a string and expands any variables in the resulting path
- Parameters:
p (str) – path definition (can include shell variables)
- Returns:
version of input path with any shell variables expanded to their values.
- Return type:
String
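A 'type' function of this kind can be sketched with os.path.expandvars from the standard library. The expand_path name and the DATA_DIR variable are hypothetical, and whether the real function performs any further expansion is not stated here, so this shows only the variable substitution.

```python
import os


def expand_path(p):
    """Convert to a string and expand any environment variables."""
    return os.path.expandvars(str(p))


# Hypothetical variable set purely for the demonstration
os.environ["DATA_DIR"] = "/data"
expanded = expand_path("$DATA_DIR/hg38.fasta")
print(expanded)
```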
- auto_process_ngs.settings.show_dictionary(d, indent=' ', exclude_value=None, exclude_wildcards=True)
Print the contents of a dictionary
- Parameters:
d (dict) – dictionary instance to show
exclude_value (object) – optional, if not ‘None’ then don’t include entries which match this value
exclude_wildcards (bool) – optional, if True (default) then don’t include keys which are wildcards (i.e. “*”)