`auto_process_ngs.mock`

Provides classes for mocking up examples of inputs and outputs for various parts of the process pipeline (including example directory structures), as well as mock executables, to be used in testing.

The core classes are:

MockAnalysisDir: create mock auto-process analysis directories
MockAnalysisProject: create mock auto-process project directories

These can be used to configure and create mock directories mimicking “minimal” versions of analysis directories and projects.

Additional mock artefacts (e.g. QC outputs, barcode analysis etc) can be added to the mock directories once they have been created, using the “updater” classes:

UpdateAnalysisDir: add artefacts to an analysis directory
UpdateAnalysisProject: add artefacts to an analysis project

There is also a convenience factory class which provides methods to quickly make “default” analysis directories for testing:

MockAnalysisDirFactory

It is also possible to make mock executables which mimic some of the external software required for parts of the pipeline:

MockBcl2fastq2Exe
MockBclConvertExe
Mock10xPackageExe
MockFastqScreen
MockFastQC
MockFastqStrandPy
MockGtf2bed
MockSeqtk
MockStar
MockSamtools
MockPicard
MockRSeQC
MockQualimap
MockMultiQC
MockConda
MockBowtieBuild
MockBowtie2Build

There is also a wrapper for the ‘Mock10xPackageExe’ class which is maintained for backwards compatibility:

MockCellrangerExe

There are supporting standalone functions for mocking outputs:

make_mock_bcl2fastq2_output: create mock output from bcl2fastq
make_mock_analysis_project: create a mock analysis project directory

class auto_process_ngs.mock.DirectoryUpdater(base_dir)

Base class for updating mock directories

Provides the following methods:

add_subdir: adds arbitrary subdirectory
add_file: adds arbitrary file

add_file(filen, content=None)

Add an arbitrary file to the base dir

Parameters:

filen (str) – path of file to add
content (str) – if supplied then will be written as content of new file

add_subdir(dirn)

Add an arbitrary directory to the base dir

Parameters:

dirn (str) – path of directory to add
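
As an illustration of the behaviour described for add_file and add_subdir, the following is a minimal standard-library sketch. The class name SketchDirectoryUpdater is hypothetical, and treating paths as relative to the base directory is an assumption; this is not the library's implementation.

```python
import os
import tempfile


class SketchDirectoryUpdater:
    """Hypothetical stand-in for illustration (not the library class)."""

    def __init__(self, base_dir):
        self._base_dir = os.path.abspath(base_dir)

    def add_subdir(self, dirn):
        # Assumption: 'dirn' is interpreted relative to the base directory
        path = os.path.join(self._base_dir, dirn)
        os.makedirs(path, exist_ok=True)
        return path

    def add_file(self, filen, content=None):
        path = os.path.join(self._base_dir, filen)
        # Make sure the parent directory exists before writing
        os.makedirs(os.path.dirname(path), exist_ok=True)
        with open(path, "wt") as fp:
            # Without content an empty file is created
            if content is not None:
                fp.write(content)
        return path


with tempfile.TemporaryDirectory() as base_dir:
    updater = SketchDirectoryUpdater(base_dir)
    subdir = updater.add_subdir("logs")
    readme = updater.add_file("docs/README.txt", content="placeholder\n")
    subdir_exists = os.path.isdir(subdir)
    with open(readme) as fp:
        readme_content = fp.read()
```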

class auto_process_ngs.mock.Mock10xPackageExe(path, exit_code=0, platform=None, assert_bases_mask=None, assert_include_introns=None, assert_chemistry=None, assert_force_cells=None, assert_filter_single_index=None, assert_filter_dual_index=None, assert_rc_i2_override=None, reads=None, multiome_data=None, multi_outputs=None, version=None)

Create mock 10xGenomics pipeline executable

This class can be used to create a mock 10xGenomics pipeline executable, which in turn can be used in place of the actual pipeline software (e.g. cellranger) for testing purposes.

To create a mock executable, use the ‘create’ static method, e.g.

>>> Mock10xPackageExe.create("/tmpbin/cellranger")

The resulting executable will generate mock outputs when run on actual or mock Illumina sequencer output directories (mock versions can be produced using the ‘mock.IlluminaRun’ class in the genomics-bcftbx package).

The executable can be configured on creation to produce different error conditions when run:

the exit code can be set to an arbitrary value via the exit_code argument
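
The general technique behind a mock executable with a configurable exit code can be sketched with the standard library alone. The helper create_mock_exe below is hypothetical and writes a trivial shell script (so a POSIX system with /bin/sh is assumed); it does not reproduce how Mock10xPackageExe builds its executable or what outputs it generates.

```python
import os
import stat
import subprocess
import tempfile


def create_mock_exe(path, exit_code=0):
    """Write a tiny script at 'path' that finishes with 'exit_code'."""
    # Mirror the documented rule: the target itself must not already exist
    if os.path.exists(path):
        raise OSError(f"{path}: already exists")
    with open(path, "wt") as fp:
        fp.write("#!/bin/sh\n"
                 "echo mock executable invoked\n"
                 f"exit {exit_code}\n")
    # Add execute permission for the owner
    os.chmod(path, os.stat(path).st_mode | stat.S_IXUSR)
    return path


with tempfile.TemporaryDirectory() as bin_dir:
    exe = create_mock_exe(os.path.join(bin_dir, "cellranger"), exit_code=3)
    # Run the mock by its full path and capture what it reports
    result = subprocess.run([exe], capture_output=True, text=True)
    returncode = result.returncode
    stdout = result.stdout
```

In a test, the directory holding the mock would typically be prepended to PATH so that the code under test picks it up in place of the real program.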

static create(path, exit_code=0, missing_fastqs=None, platform=None, assert_bases_mask=None, assert_include_introns=None, assert_chemistry=None, assert_force_cells=None, assert_filter_single_index=None, assert_filter_dual_index=None, assert_rc_i2_override=None, reads=None, multiome_data=None, multi_outputs=None, version=None)

Create a “mock” 10xGenomics package executable

Parameters:

path (str) – path to the new executable to create. The final executable must not exist, however the directory it will be created in must.
exit_code (int) – exit code that the mock executable should complete with
missing_fastqs (list) – list of Fastq names that will not be created
platform (str) – platform for primary data (if it cannot be determined from the directory/instrument name)
assert_bases_mask (str) – if set then check that the supplied bases mask matches this value
assert_include_introns (bool) – if set to True/False then check that the ‘--include-introns’ option was/wasn’t set (ignored if set to None)
assert_chemistry (str) – if set then check that the supplied chemistry specification matches this value
assert_force_cells (int) – if set then check that the ‘--force-cells’ option was specified with this value
assert_filter_single_index (bool) – if set to True/False then check that the ‘--filter-single-index’ option was/wasn’t supplied (ignored if set to None)
assert_filter_dual_index (bool) – if set to True/False then check that the ‘--filter-dual-index’ option was/wasn’t supplied (ignored if set to None)
assert_rc_i2_override (str) – check that the ‘--rc-i2-override’ option was supplied and set to the supplied value (ignored if set to None)
reads (list) – list of ‘reads’ that will be created
multiome_data (str) – either ‘GEX’ or ‘ATAC’ (when mocking ‘cellranger-arc’)
multi_outputs (str) – set type of outputs for ‘cellranger multi’ (either ‘cellplex’ or ‘flex’)
version (str) – version of package to report

main(args): Internal: provides mock 10xGenomics package functionality

class auto_process_ngs.mock.MockAnalysisDir(run_name, platform, unaligned_dir='bcl2fastq', fmt='bcl2fastq2', bases_mask='auto', paired_end=True, lanes=None, no_lane_splitting=False, no_undetermined=False, top_dir=None, params=None, metadata=None, readme=None, project_metadata=None, include_stats_files=False)

Utility class for creating mock auto-process analysis directories

The MockAnalysisDir class allows artificial analysis directories to be defined, created and populated, and then destroyed.

These artificial directories are intended to be used for testing purposes.

Two styles of analysis directories can be produced: ‘casava’-style aims to mimic that produced from the CASAVA and bcl2fastq 1.8 processing software; ‘bcl2fastq2’ mimics that from the bcl2fastq 2.* software.

Basic example usage:

>>> mockdir = MockAnalysisDir('130904_PJB_XXXXX','miseq',fmt='casava')
>>> mockdir.add_fastq_batch('PJB','PJB1','PJB1_GCCAAT',lanes=[1,])
>>> ...
>>> mockdir.create()

This will make a CASAVA-style directory structure like:

130904_PJB_XXXXX/
    metadata.info
    bcl2fastq/
        Project_PJB/
            Sample_PJB1/
                PJB1_GCCAAT_L001_R1_001.fastq.gz
                PJB1_GCCAAT_L001_R2_001.fastq.gz
    PJB/
        README.info
        fastqs/
            PJB1_GCCAAT_L001_R1_001.fastq.gz
            PJB1_GCCAAT_L001_R2_001.fastq.gz
    …

To delete the physical directory structure when finished:

>>> mockdir.remove()

create(no_project_dirs=False)

Build and populate the directory structure

Creates the directory structure on disk which has been defined within the MockAnalysisDir object.

Invoke the ‘remove’ method to delete the directory structure.

The contents of the MockAnalysisDir object can be modified after the directory structure has been created, but changes will not be reflected on disk. Instead it is necessary to first remove the directory structure, and then re-invoke the create method.

‘create’ raises an OSError exception if any part of the directory structure already exists.

Parameters:

no_project_dirs (bool) – if True then don’t create analysis project subdirectories (these are created by default)
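
The define, create, remove and re-create cycle described above can be illustrated with a small standard-library sketch. SketchMockDir and its add_file method are hypothetical names used only for this illustration; the sketch shows the lifecycle, not the directory layout or the internals of MockAnalysisDir.

```python
import os
import shutil
import tempfile


class SketchMockDir:
    """Hypothetical: contents are defined in memory, then built on disk."""

    def __init__(self, name, top_dir):
        self.dirn = os.path.join(top_dir, name)
        self._files = []

    def add_file(self, relpath):
        # Only recorded in memory; nothing is written until create()
        self._files.append(relpath)

    def create(self):
        # os.mkdir raises FileExistsError (an OSError) if the directory exists
        os.mkdir(self.dirn)
        for relpath in self._files:
            path = os.path.join(self.dirn, relpath)
            os.makedirs(os.path.dirname(path), exist_ok=True)
            open(path, "wt").close()

    def remove(self):
        shutil.rmtree(self.dirn)


with tempfile.TemporaryDirectory() as top_dir:
    mock = SketchMockDir("130904_PJB_XXXXX", top_dir)
    mock.add_file("metadata.info")
    mock.create()
    # A change made after create() is not reflected on disk
    mock.add_file("PJB/README.info")
    late_file = os.path.join(mock.dirn, "PJB", "README.info")
    late_file_on_disk = os.path.exists(late_file)
    # Calling create() again fails because the directory already exists
    try:
        mock.create()
        second_create_failed = False
    except OSError:
        second_create_failed = True
    # Removing and re-creating picks up the change
    mock.remove()
    mock.create()
    late_file_after_recreate = os.path.exists(late_file)
    mock.remove()
```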

class auto_process_ngs.mock.MockAnalysisDirFactory

Collection of convenient pre-populated test cases

classmethod bcl2fastq2(run_name, platform, paired_end=True, unaligned_dir='bcl2fastq', no_lane_splitting=True, reads=None, top_dir=None, params=None, metadata=None, project_metadata=None, bases_mask='auto', include_stats_files=False): Basic analysis dir from bcl2fastq v2

classmethod casava(run_name, platform, paired_end=True, unaligned_dir='bcl2fastq', params=None, metadata=None, top_dir=None): Basic analysis dir from CASAVA/bcl2fastq v1.8

class auto_process_ngs.mock.MockAnalysisProject(name, fastq_names=None, fastq_dir=None, metadata={})

Utility class for creating mock auto-process project directories

Example usage:

>>> m = MockAnalysisProject('PJB',('PJB1_S1_R1_001.fastq.gz',
...                                'PJB1_S1_R2_001.fastq.gz'))
>>> m.create()

add_fastq(fq): Add a Fastq file to the project

create(top_dir=None, readme=True, scriptcode=True, populate_fastqs=True)

Build and populate the directory structure

Parameters:

top_dir (str) – path to directory to create project directory underneath (default is pwd)
readme (boolean) – if True then write a README file
scriptcode (boolean) – if True then write a ScriptCode subdirectory
populate_fastqs (boolean) – if True then write content to the Fastq files
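
To illustrate what populating a Fastq file can involve, the sketch below writes either an empty placeholder or a single minimal Fastq record, gzipped when the file name ends in ‘.gz’. The function write_mock_fastq and the record contents are assumptions made for this example; the content actually written by MockAnalysisProject may differ.

```python
import gzip
import os
import tempfile


def write_mock_fastq(path, populate=True):
    """Hypothetical helper: create a placeholder or minimal Fastq file."""
    if not populate:
        # Empty placeholder file
        open(path, "wb").close()
        return path
    # One four-line record; sequence and quality strings have equal length
    record = "@MOCK:1:FLOWCELL:1:1:1:1 1:N:0:1\nACGTACGT\n+\nFFFFFFFF\n"
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "wt") as fp:
        fp.write(record)
    return path


with tempfile.TemporaryDirectory() as fastq_dir:
    populated = write_mock_fastq(
        os.path.join(fastq_dir, "PJB1_S1_R1_001.fastq.gz"))
    empty = write_mock_fastq(
        os.path.join(fastq_dir, "PJB1_S1_R2_001.fastq.gz"), populate=False)
    # Read the populated file back through gzip
    with gzip.open(populated, "rt") as fp:
        lines = fp.read().splitlines()
    empty_size = os.path.getsize(empty)
```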

class auto_process_ngs.mock.MockBcl2fastq2Exe(exit_code=0, missing_fastqs=None, platform=None, assert_bases_mask=None, assert_no_lane_splitting=None, assert_create_fastq_for_index_read=None, assert_minimum_trimmed_read_length=None, assert_mask_short_adapter_reads=None, assert_adapter=None, assert_adapter2=None, assert_find_adapters_with_sliding_window=None, version=None)

Create mock bcl2fastq2 executable

This class can be used to create a mock bcl2fastq executable, which in turn can be used in place of the actual bcl2fastq software for testing purposes.

To create a mock executable, use the ‘create’ static method, e.g.

>>> MockBcl2fastq2Exe.create("/tmpbin/bcl2fastq")

The resulting executable will generate mock outputs when run on actual or mock Illumina sequencer output directories (mock versions can be produced using the ‘mock.IlluminaRun’ class in the genomics-bcftbx package).

The executable can be configured on creation to produce different error conditions when run:

the exit code can be set to an arbitrary value via the exit_code argument
Fastqs can be removed from the output by specifying their names in the missing_fastqs argument

The executable can also be configured to check supplied values:

the bases mask can be checked via the assert_bases_mask argument
lane splitting can be checked via the assert_no_lane_splitting argument
creation of Fastqs for index reads can be checked via the assert_create_fastq_for_index_read argument
adapter trimming and masking can be checked via the assert_minimum_trimmed_read_length and assert_mask_short_adapter_reads arguments
adapter sequences can be checked via the assert_adapter and assert_adapter2 arguments
sliding window algorithm for adapter trimming can be checked via assert_find_adapters_with_sliding_window
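
The idea of a mock executable checking the values it was invoked with can be sketched as follows. The function mock_bcl2fastq_main is hypothetical, handles only two options, and raises AssertionError on a mismatch; how the real mock reports a failed check is not documented here, so that behaviour is an assumption.

```python
import argparse


def mock_bcl2fastq_main(args, exit_code=0, assert_bases_mask=None,
                        assert_no_lane_splitting=None):
    """Hypothetical mock 'main': parse args, check expectations."""
    p = argparse.ArgumentParser(prog="bcl2fastq")
    p.add_argument("--use-bases-mask", dest="bases_mask")
    p.add_argument("--no-lane-splitting", action="store_true")
    # Ignore any options this sketch doesn't know about
    opts, _unused = p.parse_known_args(args)
    # A check is only made when an expected value was supplied
    if assert_bases_mask is not None:
        if opts.bases_mask != assert_bases_mask:
            raise AssertionError(
                f"bases mask: got {opts.bases_mask!r}, "
                f"expected {assert_bases_mask!r}")
    if assert_no_lane_splitting is not None:
        if opts.no_lane_splitting != assert_no_lane_splitting:
            raise AssertionError("--no-lane-splitting mismatch")
    return exit_code


# Matching invocation: the configured exit code is returned
status = mock_bcl2fastq_main(
    ["--use-bases-mask", "y101,I8,y101", "--no-lane-splitting"],
    assert_bases_mask="y101,I8,y101",
    assert_no_lane_splitting=True)

# Mismatching invocation: the check fails
try:
    mock_bcl2fastq_main(["--use-bases-mask", "y76,I8"],
                        assert_bases_mask="y101,I8,y101")
    mismatch_detected = False
except AssertionError:
    mismatch_detected = True
```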

static create(path, exit_code=0, missing_fastqs=None, platform=None, assert_bases_mask=None, assert_no_lane_splitting=None, assert_create_fastq_for_index_read=None, assert_minimum_trimmed_read_length=None, assert_mask_short_adapter_reads=None, assert_adapter=None, assert_adapter2=None, assert_find_adapters_with_sliding_window=None, version='2.20.0.422')

Create a “mock” bcl2fastq executable

Parameters:

path (str) – path to the new executable to create. The final executable must not exist, however the directory it will be created in must.
exit_code (int) – exit code that the mock executable should complete with
missing_fastqs (list) – list of Fastq names that will not be created
platform (str) – platform for primary data (if it cannot be determined from the directory/instrument name)
assert_bases_mask (str) – if set then assert that bases mask matches the supplied string
assert_no_lane_splitting (bool) – if set then assert that --no-lane-splitting matches the supplied boolean value
assert_create_fastq_for_index_read (bool) – if set then assert that --create-fastq-for-index-read matches the supplied boolean value
assert_minimum_trimmed_read_length (int) – if set then assert that --minimum-trimmed-read-length matches the supplied value
assert_mask_short_adapter_reads (int) – if set then assert that --mask-short-adapter-reads matches the supplied value
assert_adapter (str) – if set then assert that the adapter sequence in the sample sheet matches the supplied value
assert_adapter2 (str) – if set then assert that the adapter sequence for read2 in the sample sheet matches the supplied value
assert_find_adapters_with_sliding_window (bool) – if set then assert that --find-adapters-with-sliding-window matches the supplied boolean value
version (str) – version of bcl2fastq2 to imitate

main(args): Internal: provides mock bcl2fastq2 functionality

class auto_process_ngs.mock.MockBclConvertExe(exit_code=0, missing_fastqs=None, platform=None, assert_override_cycles=None, assert_no_lane_splitting=None, assert_create_fastq_for_index_read=None, assert_minimum_trimmed_read_length=None, assert_mask_short_reads=None, assert_adapter1=None, assert_adapter2=None, version=None)

Create mock bcl-convert executable

This class can be used to create a mock bcl-convert executable, which in turn can be used in place of the actual BCLConvert software for testing purposes.

To create a mock executable, use the ‘create’ static method, e.g.

>>> MockBclConvertExe.create("/tmpbin/bcl-convert")

The resulting executable will generate mock outputs when run on actual or mock Illumina sequencer output directories (mock versions can be produced using the ‘mock.IlluminaRun’ class in the genomics-bcftbx package).

The executable can be configured on creation to produce different error conditions when run:

the exit code can be set to an arbitrary value via the exit_code argument
Fastqs can be removed from the output by specifying their names in the missing_fastqs argument

The executable can also be configured to check supplied values:

the masking of cycles can be checked via the assert_override_cycles argument
lane splitting can be checked via the assert_no_lane_splitting argument
creation of Fastqs for index reads can be checked via the assert_create_fastq_for_index_read argument
adapter trimming and masking can be checked via the assert_minimum_trimmed_read_length and assert_mask_short_reads arguments
adapter sequences can be checked via the assert_adapter1 and assert_adapter2 arguments

static create(path, exit_code=0, missing_fastqs=None, platform=None, assert_override_cycles=None, assert_no_lane_splitting=None, assert_create_fastq_for_index_read=None, assert_minimum_trimmed_read_length=None, assert_mask_short_reads=None, assert_adapter1=None, assert_adapter2=None, version='3.7.5')

Create a “mock” bcl-convert executable

Parameters:

path (str) – path to the new executable to create. The final executable must not exist, however the directory it will be created in must.
exit_code (int) – exit code that the mock executable should complete with
missing_fastqs (list) – list of Fastq names that will not be created
platform (str) – platform for primary data (if it cannot be determined from the directory/instrument name)
assert_override_cycles (str) – if set then assert that the ‘OverrideCycles’ setting matches the supplied string
assert_no_lane_splitting (bool) – if set then assert that --no-lane-splitting matches the supplied boolean value
assert_create_fastq_for_index_read (bool) – if set then assert that --create-fastq-for-index-read matches the supplied boolean value
assert_minimum_trimmed_read_length (int) – if set then assert that --minimum-trimmed-read-length matches the supplied value
assert_mask_short_reads (int) – if set then assert that --mask-short-adapter-reads matches the supplied value
assert_adapter1 (str) – if set then assert that the adapter sequence in the sample sheet matches the supplied value
assert_adapter2 (str) – if set then assert that the adapter sequence for read2 in the sample sheet matches the supplied value
version (str) – version of BCLConvert to imitate

main(args): Internal: provides mock bcl-convert functionality

class auto_process_ngs.mock.MockBowtie2Build(path, exit_code=0)

Create mock bowtie2-build

This class can be used to create a mock bowtie2-build executable, which in turn can be used in place of an actual executable for testing purposes.

To create a mock executable, use the ‘create’ static method, e.g.

>>> MockBowtie2Build.create("/tmpbin/bowtie2-build")

The resulting executable will generate mock outputs when run on the appropriate files (ignoring their contents).

The executable can be configured on creation to produce different error conditions when run:

the exit code can be set to an arbitrary value via the exit_code argument

static create(path, exit_code=0)

Create a “mock” bowtie2-build executable

Parameters:

path (str) – path to the new executable to create. The final executable must not exist, however the directory it will be created in must
exit_code (int) – exit code that the mock executable should complete with

main(args): Internal: provides mock bowtie-build functionality

class auto_process_ngs.mock.MockBowtieBuild(path, exit_code=0)

Create mock bowtie-build

This class can be used to create a mock bowtie-build executable, which in turn can be used in place of an actual executable for testing purposes.

To create a mock executable, use the ‘create’ static method, e.g.

>>> MockBowtieBuild.create("/tmpbin/bowtie-build")

The resulting executable will generate mock outputs when run on the appropriate files (ignoring their contents).

The executable can be configured on creation to produce different error conditions when run:

the exit code can be set to an arbitrary value via the exit_code argument

static create(path, exit_code=0)

Create a “mock” bowtie-build executable

Parameters:

path (str) – path to the new executable to create. The final executable must not exist, however the directory it will be created in must
exit_code (int) – exit code that the mock executable should complete with

main(args): Internal: provides mock bowtie-build functionality

class auto_process_ngs.mock.MockCellrangerExe(path, exit_code=0, platform=None, assert_bases_mask=None, assert_include_introns=None, assert_chemistry=None, assert_force_cells=None, assert_filter_single_index=None, assert_filter_dual_index=None, assert_rc_i2_override=None, reads=None, multiome_data=None, multi_outputs=None, version=None)

Wrapper for Mock10xPackageExe

Maintained for backwards-compatibility

class auto_process_ngs.mock.MockConda

Create mock conda installation

This class can be used to create a mock conda installation consisting of:

bin subdirectory with mock conda executable and ‘activate’ script
envs subdirectory

This can be used in place of an actual conda installation for testing purposes.

To create a mock installation, use the ‘create’ static method, e.g.

>>> MockConda.create("/tmpbin/conda")

The resulting conda executable supports --version and the create command, and will generate mock outputs for both.

The executable can be configured on creation to produce different error conditions when run:

the exit code can be set to an arbitrary value via the exit_code argument
the ‘create’ command can be forced to fail for all inputs by setting the create_fails argument
the reported version can be set via the version argument

static create(path, version='4.10.3', create_fails=False, activate_fails=False, exit_code=0)

Create a “mock” conda installation

Parameters:

path (str) – path to the top-level directory for the mock conda installation (which must not exist, however the directory it will be created in must be present).
version (str) – version that mock conda will claim to be
create_fails (bool) – if True then the ‘create’ subcommand of the mock conda executable will fail.
activate_fails (bool) – if True then the ‘activate’ script in the mock conda installation will return with value 1 (i.e. an error code).
exit_code (int) – exit code that the mock executable should complete with
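
The layout described above (a bin subdirectory holding a conda executable and an ‘activate’ script, plus an envs subdirectory) can be sketched with the standard library. The function create_mock_conda is hypothetical, and the script bodies are placeholders written for this example (a POSIX shell is assumed); they are not the scripts generated by MockConda.

```python
import os
import subprocess
import tempfile


def create_mock_conda(path, version="4.10.3"):
    """Hypothetical: build bin/conda, bin/activate and envs/ under 'path'."""
    # os.mkdir fails if 'path' exists or if its parent directory is missing
    os.mkdir(path)
    bin_dir = os.path.join(path, "bin")
    os.mkdir(bin_dir)
    os.mkdir(os.path.join(path, "envs"))
    conda = os.path.join(bin_dir, "conda")
    with open(conda, "wt") as fp:
        fp.write("#!/bin/sh\n"
                 f'if [ "$1" = "--version" ]; then echo "conda {version}"; fi\n'
                 "exit 0\n")
    os.chmod(conda, 0o755)
    # Placeholder 'activate' script that does nothing
    activate = os.path.join(bin_dir, "activate")
    with open(activate, "wt") as fp:
        fp.write("#!/bin/sh\n# placeholder\n")
    os.chmod(activate, 0o755)
    return conda


with tempfile.TemporaryDirectory() as top_dir:
    conda_dir = os.path.join(top_dir, "conda")
    conda = create_mock_conda(conda_dir, version="4.10.3")
    envs_exists = os.path.isdir(os.path.join(conda_dir, "envs"))
    result = subprocess.run([conda, "--version"],
                            capture_output=True, text=True)
    reported_version = result.stdout.strip()
```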

class auto_process_ngs.mock.MockFastQC(version=None, no_outputs=False, exit_code=0)

Create mock fastqc

This class can be used to create a mock fastqc executable, which in turn can be used in place of the actual fastqc program for testing purposes.

To create a mock script, use the ‘create’ static method, e.g.

>>> MockFastQC.create("/tmpbin/fastqc")

The resulting executable will generate mock outputs when run on Fastq files (ignoring their content).

The executable can be configured on creation to produce different error conditions when run:

the exit code can be set to an arbitrary value via the exit_code argument
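
As a rough picture of what generating mock outputs while ignoring the Fastq content can mean, the sketch below writes empty files using FastQC's usual output names (‘<name>_fastqc.html’ and ‘<name>_fastqc.zip’). The function mock_fastqc is hypothetical; the outputs written by MockFastQC itself may contain more than this.

```python
import os
import tempfile


def mock_fastqc(fastq, out_dir, no_outputs=False, exit_code=0):
    """Hypothetical mock: write empty FastQC-style outputs for one Fastq."""
    # Derive the output base name from the Fastq file name only;
    # the Fastq is never opened
    name = os.path.basename(fastq)
    for ext in (".gz", ".fastq", ".fq"):
        if name.endswith(ext):
            name = name[:-len(ext)]
    written = []
    if not no_outputs:
        for suffix in ("_fastqc.html", "_fastqc.zip"):
            path = os.path.join(out_dir, name + suffix)
            open(path, "wt").close()
            written.append(os.path.basename(path))
    return exit_code, written


with tempfile.TemporaryDirectory() as out_dir:
    # Normal run: both outputs are written
    ok_status, ok_outputs = mock_fastqc("PJB1_S1_R1_001.fastq.gz", out_dir)
    # Simulated failure: no outputs and a non-zero exit code
    bad_status, bad_outputs = mock_fastqc(
        "PJB1_S1_R2_001.fastq.gz", out_dir, no_outputs=True, exit_code=1)
    files_on_disk = sorted(os.listdir(out_dir))
```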

static create(path, version=None, no_outputs=False, exit_code=0)

Create a “mock” fastqc executable

Parameters:

path (str) – path to the new executable to create. The final executable must not exist, however the directory it will be created in must.
version (str) – explicit version string
no_outputs (bool) – if True then don’t create mock outputs for FastQC
exit_code (int) – exit code that the mock executable should complete with

main(args): Internal: provides mock fastqc functionality

class auto_process_ngs.mock.MockFastqScreen(version=None, no_outputs=False, exit_code=0)

Create mock fastq_screen

This class can be used to create a mock fastq_screen executable, which in turn can be used in place of the actual fastq_screen program for testing purposes.

To create a mock executable, use the ‘create’ static method, e.g.

>>> MockFastqScreen.create("/tmpbin/fastq_screen")

The resulting executable will generate mock outputs when run on a Fastq file (ignoring its content).

The executable can be configured on creation to produce different error conditions when run:

the exit code can be set to an arbitrary value via the exit_code argument
the outputs can be suppressed by setting the no_outputs argument to True

static create(path, version=None, no_outputs=False, exit_code=0)

Create a “mock” fastq_screen executable

Parameters:

path (str) – path to the new executable to create. The final executable must not exist, however the directory it will be created in must.
version (str) – explicit version string
no_outputs (bool) – if True then don’t create outputs (default: False, do create outputs)
exit_code (int) – exit code that the mock executable should complete with

main(args): Internal: provides mock fastq_screen functionality

class auto_process_ngs.mock.MockFastqStrandPy(no_outputs=False, exit_code=0)

Create mock fastq_strand.py executable

This class can be used to create a mock fastq_strand.py executable, which in turn can be used in place of the actual fastq_strand.py executable for testing purposes.

To create a mock executable, use the ‘create’ static method, e.g.

>>> MockFastqStrandPy.create("/tmpbin/fastq_strand.py")

The resulting executable will generate mock outputs when run on a pair of Fastq files (ignoring their contents).

The executable can be configured on creation to produce different error conditions when run:

the exit code can be set to an arbitrary value via the exit_code argument
the outputs can be suppressed by setting the no_outputs argument to True

static create(path, no_outputs=False, exit_code=0)

Create a “mock” fastq_strand.py executable

Parameters:

path (str) – path to the new executable to create. The final executable must not exist, however the directory it will be created in must.
no_outputs (bool) – if True then don’t create any of the expected outputs
exit_code (int) – exit code that the mock executable should complete with

main(args): Internal: provides mock fastq_strand.py functionality

class auto_process_ngs.mock.MockGtf2bed(version=None, no_outputs=False, exit_code=0)

Create mock ‘gtf2bed’ (from bedops)

This class can be used to create a mock gtf2bed executable, which in turn can be used in place of the actual gtf2bed program for testing purposes.

To create a mock script, use the ‘create’ static method, e.g.

>>> MockGtf2bed.create("/tmpbin/gtf2bed")

The resulting executable will generate mock outputs when run on GTF files (ignoring their content).

The executable can be configured on creation to produce different error conditions when run:

the exit code can be set to an arbitrary value via the exit_code argument

The following flags can also be used:

no_outputs configures the mock executable not to write any output

static create(path, version=None, no_outputs=False, exit_code=0)

Create a mock ‘gtf2bed’ utility

Parameters:

path (str) – path to the new executable to create. The final executable must not exist, however the directory it will be created in must.
version (str) – explicit version string
no_outputs (bool) – if True then don’t create mock outputs
exit_code (int) – exit code that the mock executable should complete with

main(args): Internal: provides mock gtf2bed functionality

class auto_process_ngs.mock.MockMultiQC(version=None, no_outputs=False, exit_code=0)

Create mock MultiQC executable

This class can be used to create a mock multiqc executable, which in turn can be used in place of the actual multiqc executable for testing purposes.

To create a mock executable, use the ‘create’ static method, e.g.

>>> MockMultiQC.create("/tmpbin/multiqc")

The resulting executable will generate mock outputs when run on a directory (ignoring its contents).

The executable can be configured on creation to produce different error conditions when run:

the exit code can be set to an arbitrary value via the exit_code argument
the outputs can be suppressed by setting the no_outputs argument to True

static create(path, version=None, no_outputs=False, exit_code=0)

Create a “mock” multiqc executable

Parameters:

path (str) – path to the new executable to create. The final executable must not exist, however the directory it will be created in must.
version (str) – explicit version string
no_outputs (bool) – if True then don’t create any of the expected outputs
exit_code (int) – exit code that the mock executable should complete with

main(args): Internal: provides mock multiqc functionality

class auto_process_ngs.mock.MockPicard(path, exit_code=0)

Create mock Picard tools

This class can be used to create a mock Picard tools executable, which in turn can be used in place of an actual executable for testing purposes.

To create a mock executable, use the ‘create’ static method, e.g.

>>> MockPicard.create("/tmpbin/picard")

The resulting executable will generate mock outputs when run on a directory (ignoring its contents).

The executable can be configured on creation to produce different error conditions when run:

the exit code can be set to an arbitrary value via the exit_code argument

static create(path, exit_code=0)

Create a “mock” Picard executable

Parameters:

path (str) – path to the new executable to create. The final executable must not exist, however the directory it will be created in must
exit_code (int) – exit code that the mock executable should complete with

main(args): Internal: provides mock Picard tools functionality

class auto_process_ngs.mock.MockQualimap(path, exit_code=0)

Create mock Qualimap

This class can be used to create a mock Qualimap executable, which in turn can be used in place of an actual executable for testing purposes.

To create a mock executable, use the ‘create’ static method, e.g.

>>> MockQualimap.create("/tmpbin/qualimap")

The resulting executable will generate mock outputs when run on a directory (ignoring its contents).

The executable can be configured on creation to produce different error conditions when run:

the exit code can be set to an arbitrary value via the exit_code argument

static create(path, exit_code=0)

Create a “mock” Qualimap executable

Parameters:

path (str) – path to the new executable to create. The final executable must not exist, however the directory it will be created in must
exit_code (int) – exit code that the mock executable should complete with

main(args): Internal: provides mock Qualimap functionality

class auto_process_ngs.mock.MockRSeQC(path, exit_code=0)

Create mock RSeQC components

This class can be used to create a mock RSeQC component (e.g. infer_experiment.py), which in turn can be used in place of an actual executable for testing purposes.

To create a mock executable, use the ‘create’ static method, e.g.

>>> MockRSeQC.create("/tmpbin/infer_experiment.py")

The resulting executable will generate mock outputs when run on a directory (ignoring its contents).

The executable can be configured on creation to produce different error conditions when run:

the exit code can be set to an arbitrary value via the exit_code argument

static create(path, exit_code=0)

Create a “mock” RSeQC component executable

Parameters:

path (str) – path to the new executable to create. The final executable must not exist, however the directory it will be created in must
exit_code (int) – exit code that the mock executable should complete with

main(args): Internal: provides mock RSeQC functionality

class auto_process_ngs.mock.MockSamtools(path, exit_code=0)

Create mock samtools installation

This class can be used to create a mock samtools executable, which in turn can be used in place of the actual samtools package, for testing purposes.

To create a mock executable, use the ‘create’ static method, e.g.

>>> MockSamtools.create("/tmpbin/samtools")

The resulting executable will generate mock outputs according to the supplied command line.

The executable can be configured on creation to produce different error conditions when run:

the exit code can be set to an arbitrary value via the exit_code argument

static create(path, exit_code=0)

Create a “mock” samtools executable

Parameters:

path (str) – path to the new executable to create. The final executable must not exist, however the directory it will be created in must.
exit_code (int) – exit code that the mock executable should complete with

main(args): Internal: provides mock samtools functionality

class auto_process_ngs.mock.MockSeqtk(version=None, no_outputs=False, exit_code=0)

Create mock ‘seqtk’

This class can be used to create a mock seqtk executable, which in turn can be used in place of the actual seqtk program for testing purposes.

To create a mock script, use the ‘create’ static method, e.g.

>>> MockSeqtk.create("/tmpbin/seqtk")

The resulting executable will generate mock outputs when run on Fastq files (ignoring their content).

The executable can be configured on creation to produce different error conditions when run:

the exit code can be set to an arbitrary value via the exit_code argument

The following flags can also be used:

no_outputs configures the mock executable not to write any output

static create(path, version=None, no_outputs=False, exit_code=0)

Create a mock ‘seqtk’ utility

Parameters:

path (str) – path to the new executable to create. The final executable must not exist, however the directory it will be created in must.
version (str) – explicit version string
no_outputs (bool) – if True then don’t create mock outputs
exit_code (int) – exit code that the mock executable should complete with

main(args): Internal: provides mock seqtk functionality

class auto_process_ngs.mock.MockStar(path, version=None, unmapped_output=False, no_outputs=False, exit_code=0)

Create mock STAR executable

This class can be used to create a mock STAR executable, which in turn can be used in place of the actual STAR executable for testing purposes.

To create a mock executable, use the ‘create’ static method, e.g.

>>> MockStar.create("/tmpbin/star")

The resulting executable will generate mock outputs when run on a directory (ignoring its contents).

The executable can be configured on creation to produce different error conditions when run:

the exit code can be set to an arbitrary value via the exit_code argument
mock unmapped output can be produced via the unmapped_output argument
the outputs can be suppressed by setting the no_outputs argument to True

static create(path, version=None, unmapped_output=False, no_outputs=False, exit_code=0)

Create a “mock” star executable

Parameters:

path (str) – path to the new executable to create. The final executable must not exist, however the directory it will be created in must.
version (str) – explicit version string
unmapped_output (bool) – if True then produce mock “unmapped” output
no_outputs (bool) – if True then don’t create any of the expected outputs
exit_code (int) – exit code that the mock executable should complete with

main(args): Internal: provides mock STAR functionality

class auto_process_ngs.mock.UpdateAnalysisDir(ap)

Utility class to add mock artefacts to an AnalysisDir

Provides the following methods:

add_processing_report
add_barcode_analysis
add_10x_mkfastq_qc_output
add_cellranger_qc_output

Example usage:

>>> m = MockAnalysisDirFactory.bcl2fastq2(
...     '160621_M00879_0087_000000000-AGEW9',
...     'miseq')
>>> m.create()
>>> ap = AutoProcess(m.dirn)
>>> UpdateAnalysisDir(ap).add_processing_report()

add_10x_mkfastq_qc_output(pkg, lanes=None)

Add mock 10xGenomics mkfastq QC report

Parameters:

pkg (str) – 10xGenomics package (e.g. ‘cellranger’, ‘cellranger-atac’ etc)
lanes (str) – optional, specify lane numbers for the report

add_barcode_analysis(barcode_analysis_dir='barcode_analysis')

Add mock barcode analysis outputs

Parameters:

barcode_analysis_dir (str) – name of barcode analysis subdirectory (default: ‘barcode_analysis’)

add_cellranger_qc_output(lanes=None)

Add mock cellranger QC report

Parameters:

lanes (str) – optional, specify lane numbers for the report

add_processing_report(name='processing_qc.html')

Add a ‘processing_qc.html’ file

Parameters:

name (str) – optionally, specify a non-standard report name (default: ‘processing_qc.html’)
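
The "updater" idea used by this class can be sketched in isolation as a small object that wraps a directory and adds files to it. The `SketchUpdater` class below is a hypothetical stand-in written for illustration: it is not the library's code, and the placeholder report content is an assumption.

```python
import os
import tempfile


class SketchUpdater:
    """Minimal stand-in for an 'updater' (hypothetical, illustration only)."""

    def __init__(self, base_dir):
        self.base_dir = base_dir

    def add_file(self, filen, content=None):
        """Create a file under the base directory, optionally with content."""
        path = os.path.join(self.base_dir, filen)
        os.makedirs(os.path.dirname(path), exist_ok=True)
        with open(path, "wt") as fp:
            if content is not None:
                fp.write(content)
        return path

    def add_processing_report(self, name="processing_qc.html"):
        """Add a placeholder report file with the given name."""
        return self.add_file(name, content="<html></html>\n")


with tempfile.TemporaryDirectory() as analysis_dir:
    updater = SketchUpdater(analysis_dir)
    default_report = updater.add_processing_report()
    custom_report = updater.add_processing_report(name="custom_qc.html")
    created = sorted(os.listdir(analysis_dir))
```

Keeping artefact creation separate from directory creation means a test can start from a minimal directory and add only the pieces it needs.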

class auto_process_ngs.mock.UpdateAnalysisProject(project)

Utility class to add mock artefacts to an AnalysisProject

Provides the following methods:

add_fastq_set
add_qc_outputs
add_icell8_outputs
add_cellranger_count_outputs
add_cellranger_multi_outputs

Example usage:

>>> m = MockAnalysisProject("PJB",('PJB1_S1_R1_001.fastq.gz',
...                                'PJB1_S1_R2_001.fastq.gz'))
>>> m.create()
>>> p = AnalysisProject(m.name,m.name)
>>> UpdateAnalysisProject(p).add_qc_outputs()

add_cellranger_count_outputs(qc_dir=None, cellranger='cellranger', reference_data_path='/data/refdata-cellranger-1.2.0', prefix='cellranger_count', legacy=False)

Add mock ‘cellranger count’ outputs to project

Parameters:

qc_dir (str) – specify non-default QC output directory
cellranger (str) – specify the 10xGenomics software package to add outputs for (defaults to ‘cellranger’; alternatives include ‘cellranger-atac’)
reference_data_path (str) – optionally specify path for reference dataset (doesn’t need to exist)
prefix (str) – leading subdirectory for cellranger count outputs (defaults to ‘cellranger_count’; ignored if ‘legacy’ style outputs are generated)
legacy (bool) – if True then generate ‘legacy’ style cellranger outputs

add_cellranger_multi_outputs(config_csv=None, sample_names=None, reference_data_path=None, qc_dir=None, prefix='cellranger_multi')

Add mock ‘cellranger multi’ outputs to project

If a 10x multiplexing config file is supplied then the mock outputs are generated using the data within that file; otherwise the sample names and reference dataset path should be explicitly supplied.

Parameters:

config_csv (str) – path to a 10x multiplexing config file (if supplied then sample names and reference dataset path will be taken from this file)
sample_names (list) – optionally specify list of multiplexed sample names (ignored if ‘config_csv’ file is supplied)
reference_data_path (str) – optionally specify path to reference dataset (doesn’t need to exist; ignored if ‘config_csv’ file is supplied)
qc_dir (str) – specify non-default QC output directory
prefix (str) – leading subdirectory for cellranger multi outputs (defaults to ‘cellranger_multi’; ignored if ‘legacy’ style outputs are generated)
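
The precedence between the config file and the explicit arguments can be sketched as follows. This is an illustration under stated assumptions rather than the method's actual logic: `resolve_multi_inputs` is a hypothetical helper, it takes the config as text instead of a file path to stay self-contained, and it assumes a simplified config layout with a `reference` entry in a `[gene-expression]` section and sample names in the first column of a `[samples]` section.

```python
import csv
import io


def resolve_multi_inputs(config_text=None, sample_names=None,
                         reference_data_path=None):
    """Return (sample_names, reference_data_path) to use for mock outputs.

    Hypothetical helper: values from the config take precedence, and the
    explicit arguments are only used when no config is given.
    """
    if config_text is None:
        return list(sample_names or []), reference_data_path
    section = None
    samples = []
    reference = None
    for row in csv.reader(io.StringIO(config_text)):
        if not row or not row[0].strip():
            continue  # skip blank lines
        first = row[0].strip()
        if first.startswith("["):
            # Section header, e.g. "[samples]"
            section = first.strip("[]").lower()
        elif section == "gene-expression" and first == "reference":
            reference = row[1].strip()
        elif section == "samples" and first != "sample_id":
            # Skip the column header line, keep the sample names
            samples.append(first)
    return samples, reference


# Made-up config content for illustration
example_config = """[gene-expression]
reference,/data/refdata-example

[samples]
sample_id,cmo_ids,description
PJB_CML1,CMO301,CML1
PJB_CML2,CMO302,CML2
"""

from_config = resolve_multi_inputs(config_text=example_config,
                                   sample_names=["ignored"])
explicit = resolve_multi_inputs(sample_names=["S1", "S2"],
                                reference_data_path="/data/ref")
```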

add_fastq_set(fastq_set, fastqs)

Add an additional fastq set

Parameters:

fastq_set (str) – name of the new Fastq set/subdirectory
fastqs (list) – list of Fastq filenames to create in the new set

add_icell8_outputs(): Add mock ICell8 outputs to the project

add_qc_outputs(fastq_set=None, qc_dir=None, protocol='standardPE', include_fastq_strand=True, include_seqlens=True, include_multiqc=True, include_report=True, include_zip_file=True, legacy_screens=False, legacy_zip_name=False)

Add mock QC outputs

Parameters:

fastq_set (str) – specify non-default Fastq set to make QC outputs for
qc_dir (str) – specify non-default QC output directory
protocol (str) – specify non-default QC protocol to use
include_fastq_strand (bool) – if True then add mock fastq_strand.py outputs
include_seqlens (bool) – if True then add mock sequence length outputs
include_multiqc (bool) – if True then add mock MultiQC outputs
include_report (bool) – if True then add mock QC report outputs
include_zip_file (bool) – if True then add mock ZIP archive for QC report
legacy_screens (bool) – if True then use old-style ‘illumina_qc.sh’ naming conventions for FastqScreen outputs
legacy_zip_name (bool) – if True then use old-style naming convention for ZIP file with QC outputs

auto_process_ngs.mock.make_mock_analysis_project(name='PJB', top_dir=None, protocol=None, paired_end=True, qc_dir='qc', fastq_dir='fastqs', fastq_names=None, sample_names=None, seq_data_samples=None, screens=('model_organisms', 'other_organisms', 'rRNA'), include_fastqc=True, include_fastq_screen=True, include_strandedness=True, include_seqlens=True, include_rseqc_infer_experiment=False, include_rseqc_genebody_coverage=False, include_picard_insert_size_metrics=False, include_qualimap_rnaseq=False, include_multiqc=True, include_cellranger_count=False, include_cellranger_multi=False, cellranger_pipelines=('cellranger',), cellranger_samples=None, cellranger_multi_samples=None, cellranger_version=None, legacy_screens=False, legacy_cellranger_outs=False)

Create a mock Analysis Project directory with QC artefacts

Parameters:

name (str) – name for the mock project
top_dir (str) – path to the directory to create the mock project directory under
protocol (str) – QC protocol to emulate
paired_end (bool) – whether the mock project should be paired-end (the default)
fastq_dir (str) – optional, set a non-standard directory for the Fastq files
fastq_names (list) – optional, explicit list of Fastq names
sample_names (list) – optional, explicit list of sample names
seq_data_samples (list) – list with subset of sample names which include sequence (i.e. biological) data
screens (list) – optional, list of non-standard FastqScreen panel names
include_fastqc (bool) – include outputs from FastQC
include_fastq_screen (bool) – include outputs from FastqScreen
include_strandedness (bool) – include strandedness outputs
include_seqlens (bool) – include sequence length metrics
include_rseqc_infer_experiment (bool) – include RSeQC infer_experiment.py outputs
include_rseqc_genebody_coverage (bool) – include RSeQC geneBody_coverage.py outputs
include_picard_insert_size_metrics (bool) – include Picard CollectInsertSizeMetrics outputs
include_qualimap_rnaseq (bool) – include Qualimap rnaseq outputs
include_multiqc (bool) – include MultiQC outputs
include_cellranger_count (bool) – include ‘cellranger count’ outputs
include_cellranger_multi (bool) – include ‘cellranger multi’ outputs
cellranger_pipelines (list) – list of 10xGenomics pipelines to make mock outputs for (e.g. ‘cellranger’, ‘cellranger-atac’ etc)
cellranger_samples (list) – list of sample names to produce ‘cellranger count’ outputs for
cellranger_multi_samples (list) – list of sample names to produce ‘cellranger multi’ outputs for
cellranger_version (str) – if set then specifies version of Cellranger to mimic
legacy_screens (bool) – if True then use legacy naming convention for FastqScreen outputs
legacy_cellranger_outs (bool) – if True then use legacy naming convention for 10xGenomics pipeline outputs

Returns:

path to the mock analysis project that was created.

Return type:

String
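
To give a feel for what a function of this kind does, the sketch below builds a bare-bones project-like directory and returns its path. It is not the library's implementation: `sketch_mock_project` is hypothetical, it creates only empty Fastq files and an empty QC directory (no QC artefacts), and the Fastq naming pattern is an assumption based on the example names used elsewhere on this page.

```python
import os
import tempfile


def sketch_mock_project(top_dir, name="PJB", paired_end=True,
                        qc_dir="qc", fastq_dir="fastqs",
                        sample_names=("PJB1", "PJB2")):
    """Create a minimal project-like directory and return its path.

    Hypothetical stand-in for illustration only.
    """
    project_dir = os.path.join(top_dir, name)
    os.makedirs(os.path.join(project_dir, fastq_dir))
    os.makedirs(os.path.join(project_dir, qc_dir))
    reads = ("R1", "R2") if paired_end else ("R1",)
    for index, sample in enumerate(sample_names, start=1):
        for read in reads:
            fastq = f"{sample}_S{index}_{read}_001.fastq.gz"
            # Empty placeholder file; no sequence data is written
            open(os.path.join(project_dir, fastq_dir, fastq), "wb").close()
    return project_dir


with tempfile.TemporaryDirectory() as top_dir:
    project = sketch_mock_project(top_dir, paired_end=False)
    project_name = os.path.basename(project)
    fastqs = sorted(os.listdir(os.path.join(project, "fastqs")))
    has_qc_dir = os.path.isdir(os.path.join(project, "qc"))
```

Returning the path, as the documented function does, lets a test pass the new directory straight to the code being tested and remove it afterwards.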

auto_process_ngs.mock.make_mock_bcl2fastq2_output(out_dir, lanes, sample_sheet=None, reads=None, no_lane_splitting=False, exclude_fastqs=None, create_fastq_for_index_read=False, paired_end=False, force_sample_dir=False)

Creates a file and directory structure mimicking output from bcl2fastq2

Parameters:

out_dir (str) – path to output directory
lanes (iterable) – list of lanes to create output for
sample_sheet (str) – path to sample sheet file
reads (iterable) – list of ‘reads’ to create (e.g. (‘R1’,‘R2’)); defaults to (‘R1’) if not specified
no_lane_splitting (bool) – if True then combine mock Fastq files across lanes instead of producing separate files for each lane (mimics the --no-lane-splitting option in bcl2fastq)
exclude_fastqs (iterable) – specifies a list of Fastq files to exclude from the outputs
create_fastq_for_index_read (bool) – whether to also include ‘I1’ etc Fastqs for index reads (ignored if ‘reads’ argument is set)
paired_end (bool) – whether to also include ‘R2’ and ‘I2’ Fastqs (ignored if ‘reads’ argument is set)
force_sample_dir (bool) – whether to force insertion of a ‘sample name’ directory for IEM4 sample sheets where sample name and ID are the same
