auto_process_ngs.cli.run_qc

Runs the QC pipeline standalone on an arbitrary set of Fastq files

class auto_process_ngs.cli.run_qc.InfoAction(option_strings, settings, nargs=None, *args, **kws)

Custom parser action for the --info option

Example usage:

>>> p.add_argument('--info',action=InfoAction,settings=settings)

where ‘settings’ should be a populated ‘Settings’ instance.

When invoked the action will display information on protocols, organisms and other configuration settings, and then exit.
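The "display, then exit" behaviour described above can be sketched with a plain argparse custom action. This is a minimal illustration, not the InfoAction source: the class name ShowInfoAction, the build_parser helper and the use of a plain dict in place of a 'Settings' instance are all assumptions made for the example.

```python
import argparse


class ShowInfoAction(argparse.Action):
    """Hypothetical stand-in for InfoAction: print settings, then exit."""

    def __init__(self, option_strings, dest, settings=None, nargs=None, **kws):
        # nargs=0 makes the option a flag that consumes no values
        super().__init__(option_strings, dest, nargs=0, **kws)
        # Extra keyword passed through from add_argument()
        self.settings = settings

    def __call__(self, parser, namespace, values, option_string=None):
        # Display the stored settings and stop parsing immediately
        for key, value in sorted(self.settings.items()):
            print(f"{key}: {value}")
        parser.exit()


def build_parser(settings):
    # 'settings' is a plain dict here (assumption for the sketch)
    p = argparse.ArgumentParser(prog="example")
    p.add_argument("--info", action=ShowInfoAction, settings=settings,
                   help="show configuration and exit")
    return p
```

Calling build_parser({...}).parse_args(["--info"]) prints the settings and raises SystemExit, which is how argparse's own --version option behaves.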

auto_process_ngs.cli.run_qc.add_10x_options(p)

Cellranger/10x Genomics options

auto_process_ngs.cli.run_qc.add_advanced_options(p, use_legacy_screen_names)

Advanced options

auto_process_ngs.cli.run_qc.add_conda_options(p, enable_conda, conda_env_dir)

Conda options

auto_process_ngs.cli.run_qc.add_debug_options(p)

Debugging options

auto_process_ngs.cli.run_qc.add_deprecated_options(p)

Deprecated options

auto_process_ngs.cli.run_qc.add_job_control_options(p, max_cores, max_jobs, max_batches)

Job control options

auto_process_ngs.cli.run_qc.add_metadata_options(p)

Metadata options

auto_process_ngs.cli.run_qc.add_pipeline_options(p, fastq_subset_size, default_nthreads)

QC pipeline options

auto_process_ngs.cli.run_qc.add_reference_data_options(p)

Reference data options

auto_process_ngs.cli.run_qc.add_reporting_options(p)

Reporting options
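The add_*_options helpers listed above each take a parser 'p', and some also take default values. One plausible shape for such a helper is sketched below; the function name, the group title and the --example-max-cores flag are hypothetical and are not options of run_qc.

```python
import argparse


def add_example_job_options(p, max_cores):
    """Hypothetical helper: attach a group of related options to parser `p`."""
    group = p.add_argument_group("Example job control options")
    # The default is supplied by the caller rather than hard-coded here
    group.add_argument("--example-max-cores", type=int, default=max_cores,
                       help="maximum cores to use (default: %(default)s)")


p = argparse.ArgumentParser(prog="example")
add_example_job_options(p, max_cores=4)
args = p.parse_args(["--example-max-cores", "8"])
print(args.example_max_cores)
```

Passing defaults in as arguments keeps the helper independent of where those defaults come from, for example a configuration file.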

auto_process_ngs.cli.run_qc.announce(title)

Print arbitrary string as a title

Prints the supplied string as a title, e.g.

>>> announce("Hello!")
... ======
... Hello!
... ======

Parameters:

title (str) – string to print

Returns:

None
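A minimal sketch that reproduces the output shown in the example, assuming the rule above and below the title is a row of '=' characters matching the title's length (the name announce_sketch is hypothetical):

```python
def announce_sketch(title):
    """Print `title` framed above and below by a rule of equal length."""
    rule = "=" * len(title)
    print(rule)
    print(title)
    print(rule)


announce_sketch("Hello!")
```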

auto_process_ngs.cli.run_qc.cleanup_atexit(tmp_project_dir)

Perform clean up actions on exit

Removes the temporary project directory created for running the QC
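Clean-up of this kind is commonly wired up with the standard atexit module. The sketch below assumes that approach; the helper name remove_tmp_dir and the directory prefix are illustrative, and the real function may do more than remove the directory.

```python
import atexit
import os
import shutil
import tempfile


def remove_tmp_dir(tmp_dir):
    """Remove the temporary directory if it still exists."""
    if tmp_dir and os.path.isdir(tmp_dir):
        shutil.rmtree(tmp_dir)


# Create a scratch directory and schedule its removal at interpreter exit
tmp_project_dir = tempfile.mkdtemp(prefix="qc_example_")
atexit.register(remove_tmp_dir, tmp_project_dir)
```

Checking that the directory exists before removing it makes the handler safe to call more than once.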

auto_process_ngs.cli.run_qc.display_info(s)

Displays information about the current configuration

The information includes the available QC protocols, organisms and FastqScreen conf files.

Parameters:

s (Settings) – populated Settings instance

auto_process_ngs.cli.run_qc.get_execution_environment()

Fetch information on the local execution environment

Interrogates the local system to get information on number of cores, memory etc.

It returns a dictionary-like object with the following elements:

  • ‘cpu_count’: total number of CPUs

  • ‘total_mem’: total amount of memory (Gb)

  • ‘nslots’: value of the ‘NSLOTS’ env variable

  • ‘max_cores’: maximum available cores

  • ‘max_mem’: maximum available memory (Gb)

  • ‘mem_per_core’: memory per core (Gb)

Available cores is the value of ‘NSLOTS’ if it is set, otherwise the total number of CPUs. Available memory is the share of total memory proportional to the number of available cores. Memory per core is the total memory divided by the total number of CPUs.

Returns:

elements are ‘cpu_count’, ‘total_mem’, ‘nslots’, ‘max_cores’, ‘max_mem’ and ‘mem_per_core’

Return type:

AttributeDictionary
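The arithmetic described for the derived elements can be sketched as a pure function. This is one reading of the description, not the function's source: the name summarise_resources is hypothetical, a plain dict stands in for AttributeDictionary, and the real function obtains cpu_count, total_mem and NSLOTS from the system rather than taking them as arguments.

```python
def summarise_resources(cpu_count, total_mem, nslots=None):
    """Derive available cores and memory from the raw system values."""
    # Available cores: NSLOTS if set, otherwise all CPUs
    max_cores = nslots if nslots else cpu_count
    # Memory per core: total memory shared evenly across all CPUs
    mem_per_core = total_mem / cpu_count
    # Available memory: total memory scaled by the fraction of cores available
    max_mem = total_mem * max_cores / cpu_count
    return {
        "cpu_count": cpu_count,
        "total_mem": total_mem,
        "nslots": nslots,
        "max_cores": max_cores,
        "max_mem": max_mem,
        "mem_per_core": mem_per_core,
    }


# 16 CPUs and 64 Gb in total, with NSLOTS=4
env = summarise_resources(cpu_count=16, total_mem=64, nslots=4)
print(env["max_cores"], env["max_mem"], env["mem_per_core"])
```

With these inputs a quarter of the CPUs are available, so max_mem is a quarter of total memory (16 Gb) and mem_per_core is 4 Gb.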

auto_process_ngs.cli.run_qc.process_inputs(input_list)

Process the inputs and return Fastqs etc

The inputs can be one of:

  • a subdirectory in a project

  • a project directory

  • a non-project directory with Fastqs

  • a ‘raw’ list of Fastqs

The function attempts to determine which type the inputs are, generate a list of Fastq files, and locate any related filesystem objects (for example a “parent” project directory).

It returns a dictionary-like object with the following elements:

  • ‘fastqs’: a list of Fastq files

  • ‘dir_path’: the directory supplied as an input, if any

  • ‘info_file’: path to an AnalysisProject metadata file

  • ‘extra_files’: any additional QC-related config files

Returns:

elements are ‘fastqs’, ‘dir_path’, ‘info_file’ and ‘extra_files’

Return type:

AttributeDictionary
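Part of the input handling, telling a single directory apart from a 'raw' list of Fastqs, can be sketched as follows. The function name collect_fastqs and the list of recognised extensions are assumptions. The sketch does not detect projects or locate the ‘info_file’ and ‘extra_files’ elements, which the real function does.

```python
import os

# Assumed set of Fastq file extensions (illustrative only)
FASTQ_EXTENSIONS = (".fastq", ".fastq.gz", ".fq", ".fq.gz")


def collect_fastqs(input_list):
    """Return (fastqs, dir_path) for a single directory or a list of files."""
    if len(input_list) == 1 and os.path.isdir(input_list[0]):
        # A single directory: gather the Fastqs it contains
        dir_path = os.path.abspath(input_list[0])
        fastqs = sorted(
            os.path.join(dir_path, name)
            for name in os.listdir(dir_path)
            if name.endswith(FASTQ_EXTENSIONS)
        )
        return fastqs, dir_path
    # Otherwise treat the inputs as a 'raw' list of Fastq files
    fastqs = [os.path.abspath(f) for f in input_list
              if f.endswith(FASTQ_EXTENSIONS)]
    return fastqs, None
```

Returning None for the directory in the raw-list case mirrors the "if any" wording used for ‘dir_path’ above.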