auto_process_ngs.icell8.pipeline
icell8.pipeline.py
Pipeline components for processing the outputs from the ICELL8 platform.
Pipeline classes:
ICell8QCFilter
ICell8FinalReporting
Pipeline command classes:
ICell8Statistics
SplitAndFilterFastqPair
BatchFastqs
ConcatFastqs
TrimFastqPair
FilterPolyGReads
ContaminantFilterFastqPair
Pipeline task classes:
SetupDirectories
CollectFiles
PairFastqs
GetICell8Stats
GetICell8PolyGStats
SplitFastqsIntoBatches
FilterICell8Fastqs
TrimReads
GetReadsWithPolyGRegions
FilterContaminatedReads
SplitByBarcodes
GroupFastqsByBarcode
GroupFastqsBySample
MergeBarcodeFastqs
MergeSampleFastqs
CheckICell8Barcodes
ConvertStatsToXLSX
ReportProcessing
UpdateProjectData
CleanupDirectory
Functions:
tmp_dir
convert_to_xlsx
- class auto_process_ngs.icell8.pipeline.BatchFastqs(*args, **kws)
Split reads from Fastqs into batches using (z)cat/split
Given a list of Fastq files, combines them and then splits into batches of a specified number of reads by running a combination of ‘(z)cat’ and ‘split’ commands.
Fastqs can be gzipped, but must have the same read number (i.e. R1 or R2).
- cmd()
Build the command
Must be implemented by the subclass and return a Command instance
- init(fastqs, batch_dir, basename, batch_size=5000000)
Create a new BatchFastqs instance
- Parameters:
fastqs (list) – list of input Fastq files
batch_dir (str) – destination directory to write output files to
basename (str) – basename for output Fastqs
batch_size (int) – number of reads per output FASTQ (in batch mode) (optional)
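The batching behaviour described above can be illustrated with a minimal pure-Python sketch. It is not the class's implementation (which builds a '(z)cat'/'split' command); the function name `batch_fastqs`, the output naming `<BASENAME>.B###.fastq` and the zero-based batch numbering are assumptions made for illustration only.

```python
import gzip
import os


def batch_fastqs(fastqs, batch_dir, basename, batch_size=5000000):
    """Pool reads from 'fastqs' and write them out in batches.

    Hypothetical helper: mimics '(z)cat ... | split' in pure Python.
    """
    os.makedirs(batch_dir, exist_ok=True)
    lines_per_batch = 4 * batch_size  # one Fastq record spans 4 lines
    outputs = []
    out = None
    nlines = 0
    for fq in fastqs:
        # Read gzipped and plain Fastqs transparently (cf. zcat vs cat)
        opener = gzip.open if fq.endswith(".gz") else open
        with opener(fq, "rt") as fp:
            for line in fp:
                if nlines % lines_per_batch == 0:
                    # Current batch is full (or this is the first line)
                    if out is not None:
                        out.close()
                    # Assumed naming scheme, numbered from zero
                    name = "%s.B%03d.fastq" % (basename, len(outputs))
                    outputs.append(os.path.join(batch_dir, name))
                    out = open(outputs[-1], "wt")
                out.write(line)
                nlines += 1
    if out is not None:
        out.close()
    return outputs
```

Because the reads are pooled before splitting, a batch can contain reads from more than one input file, which is why all inputs must share the same read number.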
- class auto_process_ngs.icell8.pipeline.CheckICell8Barcodes(_name, *args, **kws)
Check the barcodes are consistent
This is a sanity check: ensure that the inline barcodes for all reads in the R1 Fastq of each barcode Fastq pair match the assigned barcode.
- finish()
Perform actions on task completion
Performs any actions that are required on completion of the task, such as moving or copying data, and setting the values of any output parameters.
Must be implemented by the subclass
- init(fastqs)
Initialise the CheckICell8Barcodes task
- Parameters:
fastqs (list) – Fastq files to check
- setup()
Set up commands to be performed by the task
Must be implemented by the subclass
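The consistency check can be sketched as follows. This is an illustration only, not the task's code: it assumes that the assigned barcode can be read from a file name of the form `<BASENAME>.<BARCODE>.r1.fastq`, that the inline barcode sits at the start of each R1 sequence, and that the Fastq is uncompressed. The function name `count_barcode_mismatches` is hypothetical.

```python
import os


def count_barcode_mismatches(r1_fastq):
    """Count R1 reads whose inline barcode differs from the assigned one."""
    # Assumed naming: <BASENAME>.<BARCODE>.r1.fastq
    assigned = os.path.basename(r1_fastq).split(".")[-3]
    mismatches = 0
    with open(r1_fastq, "rt") as fp:
        for line_number, line in enumerate(fp):
            # The sequence is the second line of each 4-line Fastq record
            if line_number % 4 == 1 and not line.startswith(assigned):
                mismatches += 1
    return mismatches
```

A result of zero means every read in the file starts with the barcode the file was assigned to.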
- class auto_process_ngs.icell8.pipeline.CleanupDirectory(_name, *args, **kws)
Remove a directory and all its contents
- init(dirn)
Initialise the CleanupDirectory task
- Parameters:
dirn (str) – path to the directory to remove
- setup()
Set up commands to be performed by the task
Must be implemented by the subclass
- class auto_process_ngs.icell8.pipeline.CollectFiles(_name, *args, **kws)
Collect list of files matching glob pattern
This is a utility task that can be used to collect a list of files in a directory which matches a ‘glob’-style pattern.
It is intended to offer an alternative to the FileCollector class, when it is desirable to farm out the file collection to an external process (e.g. when there are very large numbers of files to examine).
- finish()
Perform actions on task completion
Performs any actions that are required on completion of the task, such as moving or copying data, and setting the values of any output parameters.
Must be implemented by the subclass
- init(dirn, pattern)
Initialise the CollectFiles task
- Parameters:
dirn (str) – path to the directory holding the files to be collected
pattern (str) – glob-style pattern to match to file names
- Outputs:
files: list of collected files
- setup()
Set up commands to be performed by the task
Must be implemented by the subclass
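What the task collects can be approximated with the standard library's `glob` module. The sketch below runs in-process, whereas the task exists precisely to hand this work to an external process; the function name `collect_files` and the sorted return order are assumptions.

```python
import glob
import os


def collect_files(dirn, pattern):
    """Return a sorted list of paths in 'dirn' matching a glob pattern."""
    return sorted(glob.glob(os.path.join(dirn, pattern)))
```

For example, `collect_files("fastqs", "*.r1.fastq")` would list the R1 Fastqs in a directory called `fastqs` (a made-up directory name).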
- class auto_process_ngs.icell8.pipeline.ConcatFastqs(*args, **kws)
Concatenate reads from multiple Fastqs into a single file
Given a list of Fastq files, combines them into a single Fastq using the ‘cat’ utility.
If the output FASTQ names end with .gz then they will be automatically compressed with gzip after concatenation.
Input FASTQs must not be gzipped, and must all have the same read number (i.e. R1 or R2).
- cmd()
Build the command
Must be implemented by the subclass and return a Command instance
- init(fastqs, concat_dir, fastq_out)
Create a new ConcatFastqs instance
- Parameters:
fastqs (list) – list of input FASTQ files
concat_dir (str) – destination directory to write output file to
fastq_out (str) – name of output Fastq file
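A rough pure-Python equivalent of the concatenation is sketched below. It is not the command that the class builds: the function name `concat_fastqs` is hypothetical, and the sketch compresses while writing rather than running gzip after concatenation.

```python
import gzip
import os
import shutil


def concat_fastqs(fastqs, concat_dir, fastq_out):
    """Concatenate uncompressed Fastqs into a single output file."""
    os.makedirs(concat_dir, exist_ok=True)
    out_path = os.path.join(concat_dir, fastq_out)
    # Compress the output only if its name ends with .gz
    opener = gzip.open if fastq_out.endswith(".gz") else open
    with opener(out_path, "wb") as out:
        for fq in fastqs:
            # Inputs are assumed to be plain (not gzipped) Fastqs
            with open(fq, "rb") as fp:
                shutil.copyfileobj(fp, out)
    return out_path
```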
- class auto_process_ngs.icell8.pipeline.ContaminantFilterFastqPair(*args, **kws)
Build command to run ‘icell8_contamination_filter.py’ utility
- cmd()
Build the command
Must be implemented by the subclass and return a Command instance
- init(fastq_pair, filter_dir, mammalian_conf, contaminants_conf, aligner=None, threads=None)
Create a new ContaminantFilterFastqPair instance
- Parameters:
fastq_pair (list) – R1/R2 FASTQ file pair
filter_dir (str) – destination directory to write output files to
mammalian_conf (str) – path to FastqScreen .conf file with mammalian genome indexes
contaminants_conf (str) – path to FastqScreen .conf file with contaminant genome indexes
aligner (str) – explicitly specify name of aligner to use with FastqScreen (e.g. ‘bowtie2’) (optional)
threads (int) – explicitly specify number of threads to run FastqScreen using (optional)
- class auto_process_ngs.icell8.pipeline.ConvertStatsToXLSX(_name, *args, **kws)
Convert the stats file to XLSX format
- init(stats_file, xlsx_file)
Initialise the ConvertStatsToXLSX task
- Parameters:
stats_file (str) – path to input stats file
xlsx_file (str) – path to output XLSX file
- Outputs:
xlsx_file: path to the output XLSX file
- setup()
Set up commands to be performed by the task
Must be implemented by the subclass
- class auto_process_ngs.icell8.pipeline.FilterContaminatedReads(_name, *args, **kws)
Filter ‘contaminated’ reads from Fastq files
Given a set of Fastqs, arrange into R1/R2 file pairs and run ‘fastq_screen’ on the R2 reads against panels of ‘mammalian’ and ‘contaminant’ organisms.
Read pairs where there is an exclusive match to the contaminants (i.e. without any match to the mammalian genomes) are excluded.
- finish()
Perform actions on task completion
Performs any actions that are required on completion of the task, such as moving or copying data, and setting the values of any output parameters.
Must be implemented by the subclass
- init(fastq_pairs, filter_dir, mammalian_conf, contaminants_conf, aligner=None, threads=None)
Initialise the FilterContaminatedReads task
- Parameters:
fastq_pairs (list) – input Fastq R1/R2 pairs
filter_dir (str) – destination directory to write output files to
mammalian_conf (str) – path to FastqScreen .conf file with mammalian genome indexes
contaminants_conf (str) – path to FastqScreen .conf file with contaminant genome indexes
aligner (str) – explicitly specify name of aligner to use with FastqScreen (e.g. ‘bowtie2’) (optional)
threads (int) – explicitly specify number of threads to run FastqScreen using (default: taken from job runner)
- Outputs:
pattern (str): glob-style pattern matching output Fastq file names
fastqs (FileCollector): output Fastq files
- setup()
Set up commands to be performed by the task
Must be implemented by the subclass
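The exclusion rule (drop a read pair only when R2 matches the contaminants and none of the mammalian genomes) can be sketched independently of 'fastq_screen'. The function below is hypothetical and takes the screening results as plain sets of read IDs, which is an assumption about how such results might be represented.

```python
def filter_contaminated(read_ids, mammalian_hits, contaminant_hits):
    """Return the read IDs that survive contaminant filtering.

    mammalian_hits / contaminant_hits: sets of read IDs whose R2
    read matched the respective panel (assumed representation).
    """
    kept = []
    for read_id in read_ids:
        exclusive_contaminant = (read_id in contaminant_hits
                                 and read_id not in mammalian_hits)
        if not exclusive_contaminant:
            kept.append(read_id)
    return kept
```

Under this rule, reads matching both panels and reads matching neither panel are retained.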
- class auto_process_ngs.icell8.pipeline.FilterICell8Fastqs(_name, *args, **kws)
Perform read assignment and optional quality filtering
For each input R1/R2 Fastq file pair:
if a well list is supplied then check that the ICell8 barcode matches one in the well list
if filtering is turned on then remove reads where the ICell8 barcode and/or UMI fail to meet the minimum quality standard across all bases
- finish()
Perform actions on task completion
Performs any actions that are required on completion of the task, such as moving or copying data, and setting the values of any output parameters.
Must be implemented by the subclass
- init(fastq_pairs, filter_dir, well_list=None, mode='none', discard_unknown_barcodes=False, quality_filter=False)
Initialise the FilterICell8Fastqs task
- Parameters:
fastq_pairs (list) – input FASTQ R1/R2 file pairs
filter_dir (str) – destination directory to write output files to
well_list (str) – ‘well list’ file to use (optional)
mode (str) – mode to run the utility in
discard_unknown_barcodes (bool) – if True then discard read pairs where the barcode doesn’t match one of those in the well list file (nb well list file must also be supplied in this case) (all reads are kept by default)
quality_filter (bool) – if True then also do filtering based on barcode- and UMI-quality (no filtering is performed by default)
Outputs:
The returned object has the following properties:
fastqs: object with properties which point to iterators listing output Fastqs (see below)
patterns: object with properties which are glob-style patterns matching output Fastqs (see below)
The output Fastqs are:
assigned: Fastqs with reads assigned to known barcodes
unassigned: Fastqs with reads not assigned to known barcodes
failed_barcodes: Fastqs with reads which failed the barcode quality check
failed_umis: Fastqs with reads which failed the UMI quality check
For example:
output.patterns.assigned = glob pattern to match Fastqs with assigned reads
output.fastqs.unassigned = iterator listing Fastqs with unassigned reads
- setup()
Set up commands to be performed by the task
Must be implemented by the subclass
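The four output categories can be illustrated with a minimal classification sketch. It is not the utility's code: the barcode length, UMI length, minimum quality and Phred+33 encoding below are assumed values chosen for illustration, and `classify_read` is a hypothetical name.

```python
BARCODE_LEN = 11  # assumed inline barcode length at the start of R1
UMI_LEN = 10      # assumed UMI length following the barcode
MIN_QUAL = 10     # assumed minimum Phred score for every base


def classify_read(r1_seq, r1_qual, well_list_barcodes=None,
                  quality_filter=False):
    """Return the output category for a read pair, based on its R1 read."""
    barcode = r1_seq[:BARCODE_LEN]
    if well_list_barcodes is not None and barcode not in well_list_barcodes:
        return "unassigned"
    if quality_filter:
        # Decode quality characters, assuming Phred+33 encoding
        quals = [ord(c) - 33 for c in r1_qual]
        if any(q < MIN_QUAL for q in quals[:BARCODE_LEN]):
            return "failed_barcodes"
        umi_quals = quals[BARCODE_LEN:BARCODE_LEN + UMI_LEN]
        if any(q < MIN_QUAL for q in umi_quals):
            return "failed_umis"
    return "assigned"
```

In this sketch a single base below the threshold is enough to fail the barcode or UMI check, mirroring the "across all bases" wording above.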
- class auto_process_ngs.icell8.pipeline.FilterPolyGReads(*args, **kws)
Run ‘cutadapt’ to fetch reads with poly-G regions
- cmd()
Build the command
Must be implemented by the subclass and return a Command instance
- init(fastq_pair, out_dir)
Create a new FilterPolyGReads instance
- Parameters:
fastq_pair (list) – R1/R2 FASTQ file pair
out_dir (str) – destination directory to write output files to
- class auto_process_ngs.icell8.pipeline.GetICell8PolyGStats(_name, *args, **kws)
Generate statistics for ICell8 poly-G detection
Subclass of the GetICell8Stats task; generates and appends an additional column expressing poly-G read counts as a percentage of total filtered read counts for each barcode.
- finish()
Perform actions on task completion
Performs any actions that are required on completion of the task, such as moving or copying data, and setting the values of any output parameters.
Must be implemented by the subclass
- class auto_process_ngs.icell8.pipeline.GetICell8Stats(_name, *args, **kws)
Generate statistics for ICell8 processing stage
Counts the reads and distinct UMIs per barcode for reads pooled from the set of supplied Fastqs and writes these to columns in a tab-delimited output file.
If the output file doesn’t exist then it will be created. If ‘append’ isn’t specified then an existing file will be deleted and its contents lost.
The barcodes are either taken from the supplied well list file, or from the first column of the output file (if it exists).
If ‘unassigned’ is specified then stats will also be collected on reads which don’t match any barcode.
By default the counts are written to columns called Nreads and Distinct_UMIs; a suffix can be specified to distinguish the counts from those from different stages.
If the columns already exist in the file when appending then they will be overwritten.
- init(fastqs, stats_file, well_list=None, suffix=None, unassigned=False, append=False, nprocs=None, temp_dir=None)
Initialise the GetICell8Stats task
- Parameters:
fastqs (list) – list of Fastqs to get stats from
stats_file (str) – path to stats file
well_list (str) – path to a well list file to take the barcodes from (optional)
suffix (str) – suffix to append to the output column names (optional)
unassigned (bool) – if True then also collect stats for read pairs that don’t match any of the expected barcodes from the well list or existing stats file (by default unassigned stats are not collected)
append (bool) – if True then append columns to existing output file (by default creates new output file)
nprocs (int) – number of cores available for stats (default: taken from job runner)
- Outputs:
stats_file (str): path to the output stats file
- setup()
Set up commands to be performed by the task
Must be implemented by the subclass
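The per-barcode counting can be sketched as below. This is a simplified illustration rather than the 'icell8_stats.py' logic: reads are supplied as in-memory (barcode, UMI) tuples, the first column name `Barcode` is an assumption, the suffix is assumed to be appended directly to the column names, and the function names are hypothetical.

```python
import csv


def icell8_stats(reads, barcodes, suffix=""):
    """Count reads and distinct UMIs for each expected barcode.

    reads: iterable of (barcode, umi) tuples (assumed representation).
    """
    nreads = dict.fromkeys(barcodes, 0)
    umis = {b: set() for b in barcodes}
    for barcode, umi in reads:
        if barcode in nreads:  # reads with unexpected barcodes are skipped
            nreads[barcode] += 1
            umis[barcode].add(umi)
    header = ["Barcode", "Nreads" + suffix, "Distinct_UMIs" + suffix]
    rows = [[b, nreads[b], len(umis[b])] for b in barcodes]
    return header, rows


def write_stats(stats_file, header, rows):
    """Write the counts to a tab-delimited file."""
    with open(stats_file, "wt", newline="") as fp:
        writer = csv.writer(fp, delimiter="\t", lineterminator="\n")
        writer.writerow(header)
        writer.writerows(rows)
```

Barcodes with no reads still get a row with zero counts, so the table always has one line per expected barcode.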
- class auto_process_ngs.icell8.pipeline.GetReadsWithPolyGRegions(_name, *args, **kws)
Run ‘cutadapt’ to identify reads with poly-G regions
Given a set of Fastq R1/R2 pairs, identifies read pairs for which R2 appears to contain poly-G regions (all other read pairs are discarded).
- finish()
Perform actions on task completion
Performs any actions that are required on completion of the task, such as moving or copying data, and setting the values of any output parameters.
Must be implemented by the subclass
- init(fastq_pairs, poly_g_regions_dir)
Initialise the GetReadsWithPolyGRegions task
- Parameters:
fastq_pairs (list) – input Fastq R1/R2 pairs
poly_g_regions_dir (str) – destination directory to write output files to
- Outputs:
pattern (str): glob-style pattern matching output Fastq file names
fastqs (FileCollector): output Fastq files
- setup()
Set up commands to be performed by the task
Must be implemented by the subclass
- class auto_process_ngs.icell8.pipeline.GroupFastqsByBarcode(_name, *args, **kws)
Group a list of Fastqs by associated barcode
Given a set of Fastqs, groups them into lists where each list belongs to the same barcode.
- finish()
Perform actions on task completion
Performs any actions that are required on completion of the task, such as moving or copying data, and setting the values of any output parameters.
Must be implemented by the subclass
- init(fastqs)
Initialise the GroupFastqsByBarcode task
- Parameters:
fastqs (list) – input Fastq files
- Outputs:
fastq_groups (dict): dictionary where keys are barcodes and values are lists of associated Fastqs
- setup()
Set up commands to be performed by the task
Must be implemented by the subclass
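The shape of the `fastq_groups` output can be illustrated with a short sketch. It assumes that the barcode can be taken from file names of the form `<BASENAME>.<BARCODE>.r[1|2].fastq`; how the task itself determines the barcode is not stated here, and the function name is hypothetical.

```python
import os
from collections import defaultdict


def group_fastqs_by_barcode(fastqs):
    """Group Fastq paths into lists keyed by barcode."""
    groups = defaultdict(list)
    for fq in fastqs:
        # Assumed naming: <BASENAME>.<BARCODE>.r[1|2].fastq
        barcode = os.path.basename(fq).split(".")[-3]
        groups[barcode].append(fq)
    return dict(groups)
```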
- class auto_process_ngs.icell8.pipeline.GroupFastqsBySample(_name, *args, **kws)
Group a list of Fastqs by associated sample
Given a set of Fastqs, groups them into lists where each list belongs to the same sample.
- finish()
Perform actions on task completion
Performs any actions that are required on completion of the task, such as moving or copying data, and setting the values of any output parameters.
Must be implemented by the subclass
- init(fastqs, well_list_file)
Initialise the GroupFastqsBySample task
- Parameters:
fastqs (list) – input Fastq files
well_list_file (str) – ‘well list’ file to get sample names and barcodes from
- Outputs:
fastq_groups (dict): dictionary where keys are samples and values are lists of associated Fastq pairs as (R1,R2) tuples
- setup()
Set up commands to be performed by the task
Must be implemented by the subclass
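Grouping by sample needs a barcode-to-sample lookup from the well list. The sketch below assumes a tab-delimited well list with a header row containing `Barcode` and `Sample` columns, Fastqs that are already arranged into (R1,R2) pairs, and barcodes encoded in the file names as `<BASENAME>.<BARCODE>.r[1|2].fastq`. All of these, and both function names, are assumptions made for illustration.

```python
import csv
import os
from collections import defaultdict


def read_well_list(well_list_file):
    """Return a barcode -> sample mapping from a well list file."""
    with open(well_list_file, "rt", newline="") as fp:
        reader = csv.DictReader(fp, delimiter="\t")
        # Column names 'Barcode' and 'Sample' are assumed
        return {row["Barcode"]: row["Sample"] for row in reader}


def group_fastqs_by_sample(fastq_pairs, barcode_to_sample):
    """Group (R1,R2) Fastq pairs into lists keyed by sample name."""
    groups = defaultdict(list)
    for r1, r2 in fastq_pairs:
        # Assumed naming: <BASENAME>.<BARCODE>.r[1|2].fastq
        barcode = os.path.basename(r1).split(".")[-3]
        groups[barcode_to_sample[barcode]].append((r1, r2))
    return dict(groups)
```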
- class auto_process_ngs.icell8.pipeline.ICell8FinalReporting(outdir, project=None)
Perform final reporting from ICELL8 pipeline
Pipeline to perform final reporting of the ICELL8 processing:
Write the final processing report
Set the primary fastq dir to ‘fastqs.samples’ (if in an analysis project directory)
- class auto_process_ngs.icell8.pipeline.ICell8QCFilter(outdir, fastqs, well_list_file, mammalian_conf, contaminants_conf, batch_size, stats_dir='stats', barcode_fastqs_dir='fastqs.barcodes', sample_fastqs_dir='fastqs.samples', basename=None, aligner=None, do_contaminant_filter=True, do_quality_filter=False, do_clean_up=True, nprocessors=None)
Run QC filtering on ICELL8 data
Pipeline to perform QC filtering on ICELL8 Fastq data:
Split reads into batches
Filter out reads which don’t match any of the barcode sequences
Optionally: filter out reads which don’t meet the barcode or UMI quality criteria
Trim and quality filter reads with cutadapt
Estimate numbers of reads with poly-G regions
Optionally: perform contaminant filtering
Assemble reads into Fastqs by barcode and by sample name
Also generates statistics for numbers of reads and UMIs for each barcode at each stage.
- run(*args, **kws)
Run the tasks in the pipeline
Takes the same arguments as the Pipeline base class and performs post-termination clean-up of the temporary directory.
- class auto_process_ngs.icell8.pipeline.ICell8Statistics(*args, **kws)
Build command to run the ‘icell8_stats.py’ utility
- cmd()
Build the command
Must be implemented by the subclass and return a Command instance
- init(fastqs, stats_file, well_list=None, suffix=None, unassigned=False, append=False, nprocs=1, temp_dir=None)
Create new ICell8Statistics instance
- Parameters:
fastqs (list) – list of FASTQ file names
stats_file (str) – path to output file
well_list (str) – path to ‘well list’ file (optional)
suffix (str) – suffix to append to columns with read and UMI counts (optional)
unassigned (bool) – if True then also collect stats for read pairs that don’t match any of the expected barcodes from the well list or existing stats file (by default unassigned stats are not collected)
append (bool) – if True then append columns to existing output file (by default creates new output file)
nprocs (int) – number of cores available for stats (default: 1)
- class auto_process_ngs.icell8.pipeline.MergeBarcodeFastqs(_name, *args, **kws)
Given a set of Fastq files with filtered reads, arrange into R1/R2 pairs then pool read pairs belonging to the same ICell8 barcode.
Also concatenate R1/R2 Fastq pairs for unassigned read pairs, and read pairs which failed the barcode and UMI quality filters.
- finish()
Perform actions on task completion
Performs any actions that are required on completion of the task, such as moving or copying data, and setting the values of any output parameters.
Must be implemented by the subclass
- init(fastq_groups, unassigned_fastq_pairs, failed_barcode_fastq_pairs, failed_umi_fastq_pairs, merge_dir, basename, batch_size=25)
Initialise the MergeBarcodeFastqs task
- Parameters:
fastq_groups (dict) – input groups of Fastq files (grouped by barcode)
unassigned_fastq_pairs (list) – Fastq R1/R2 pairs with reads not assigned to ICELL8 barcodes
failed_barcode_fastq_pairs (list) – Fastq R1/R2 pairs with reads failing barcode quality check
failed_umi_fastq_pairs (list) – Fastq R1/R2 pairs with reads failing UMI quality check
merge_dir (str) – destination directory to write output files to
basename (str) – basename to use for output FASTQ files
batch_size (int) – number of barcodes to group together into one command for merging (larger batches = fewer jobs, but each job takes longer) (default=25)
Outputs:
The returned object has the following properties:
fastqs: object with properties which point to iterators listing output Fastqs (see below)
patterns: object with properties which are glob-style patterns matching output Fastqs (see below)
The output Fastqs are:
assigned: Fastqs with reads assigned to known barcodes
unassigned: Fastqs with reads not assigned to known barcodes
failed_barcodes: Fastqs with reads which failed the barcode quality check
failed_umis: Fastqs with reads which failed the UMI quality check
For example:
output.patterns.assigned = glob pattern to match Fastqs with assigned reads
output.fastqs.unassigned = iterator listing Fastqs with unassigned reads
- setup()
Set up commands to be performed by the task
Must be implemented by the subclass
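The effect of `batch_size` on the number of merge jobs can be shown with a small chunking sketch. The function name `batch_barcodes` and the sorting of barcodes before batching are assumptions; the task's own batching logic may differ.

```python
def batch_barcodes(barcodes, batch_size=25):
    """Split barcodes into batches of at most 'batch_size' each."""
    barcodes = sorted(barcodes)
    return [barcodes[i:i + batch_size]
            for i in range(0, len(barcodes), batch_size)]
```

With 60 barcodes and the default batch size this gives three batches (25, 25 and 10 barcodes), i.e. three merge commands instead of 60.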
- class auto_process_ngs.icell8.pipeline.MergeSampleFastqs(_name, *args, **kws)
Given a set of Fastq files grouped by ICELL8 sample and arranged into R1/R2 file pairs, pool reads into new Fastq files according to the sample names.
- finish()
Perform actions on task completion
Performs any actions that are required on completion of the task, such as moving or copying data, and setting the values of any output parameters.
Must be implemented by the subclass
- init(fastq_groups, merge_dir)
Initialise the MergeSampleFastqs task
- Parameters:
fastq_groups (dict) – input groups of Fastq R1/R2 file pairs (grouped by sample)
merge_dir (str) – destination directory to write output files to
Outputs:
pattern: glob-style pattern matching output Fastq file names
fastqs: FileCollector listing output Fastq files
- setup()
Set up commands to be performed by the task
Must be implemented by the subclass
- class auto_process_ngs.icell8.pipeline.PairFastqs(_name, *args, **kws)
Arrange Fastqs into R1/R2 pairs
This is a utility task that can be used to arrange a list of Fastq files into R1/R2 pairs according to their contents. Essentially it is a wrapper for the ‘pair_fastqs’ function.
- finish()
Perform actions on task completion
Performs any actions that are required on completion of the task, such as moving or copying data, and setting the values of any output parameters.
Must be implemented by the subclass
- init(fastqs)
Initialise the PairFastqs task
- Parameters:
fastqs (list) – list of paths to the Fastq files to be paired
- Outputs:
fastq_pairs: list of tuples with R1/R2 Fastq pairs
unpaired: list of unpaired Fastqs
- setup()
Set up commands to be performed by the task
Must be implemented by the subclass
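Pairing by content can be sketched by comparing the first read header of each file. This is not the 'pair_fastqs' function itself: the sketch assumes uncompressed Fastqs with Illumina-style headers (`@<read id> <read number>:...`), and the names `first_read_id` and `pair_fastqs_sketch` are hypothetical.

```python
def first_read_id(fastq):
    """Return (read ID, read number) from the first header in a Fastq."""
    with open(fastq, "rt") as fp:
        header = fp.readline().strip()
    # Assumed Illumina-style header: '@<id> <read number>:<flags...>'
    read_id, _, tail = header.partition(" ")
    return read_id, tail.split(":")[0]


def pair_fastqs_sketch(fastqs):
    """Arrange Fastqs into (R1,R2) pairs; also return unpaired files."""
    r1s, r2s = {}, {}
    for fq in fastqs:
        read_id, read_number = first_read_id(fq)
        if read_number == "1":
            r1s[read_id] = fq
        else:
            r2s[read_id] = fq
    # An R1 and an R2 file are a pair if their first read IDs agree
    pairs = [(r1s[i], r2s[i]) for i in sorted(r1s) if i in r2s]
    paired = {fq for pair in pairs for fq in pair}
    unpaired = [fq for fq in fastqs if fq not in paired]
    return pairs, unpaired
```

The two return values correspond to the `fastq_pairs` and `unpaired` outputs listed above.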
- class auto_process_ngs.icell8.pipeline.ReportProcessing(_name, *args, **kws)
Generate an HTML report on the processing
Runs the icell8_report.py script to generate the report.
- init(dirn, stats_file=None, out_file=None, name=None)
Initialise the ReportProcessing task
- Parameters:
dirn (str) – directory with the ICell8 pipeline outputs
stats_file (str) – name of stats file
out_file (str) – name of output report file (default: ‘icell8_processing.html’)
name (str) – title of report
- Outputs:
report_html: path to the output HTML report
- setup()
Set up commands to be performed by the task
Must be implemented by the subclass
- class auto_process_ngs.icell8.pipeline.SetupDirectories(_name, *args, **kws)
Create directories
Given a list of directories, check for and if necessary create each one
- init(dirs)
Initialise the SetupDirectories task
- Parameters:
dirs (list) – list of directories to ensure are present
- setup()
Set up commands to be performed by the task
Must be implemented by the subclass
- class auto_process_ngs.icell8.pipeline.SplitAndFilterFastqPair(*args, **kws)
Build command to run the ‘split_icell8_fastqs.py’ utility
- cmd()
Build the command
Must be implemented by the subclass and return a Command instance
- init(fastq_pair, out_dir, well_list=None, basename=None, mode='none', discard_unknown_barcodes=False, quality_filter=False, compress=False)
Create a new SplitAndFilterFastqPair instance
- Parameters:
fastq_pair (list) – R1/R2 FASTQ file pair
out_dir (str) – destination directory to write output files to
well_list (str) – ‘well list’ file to use (optional)
basename (str) – basename to use for output FASTQ files (optional)
mode (str) – mode to run the utility in
discard_unknown_barcodes (bool) – if True then discard read pairs where the barcode doesn’t match one of those in the well list file (nb well list file must also be supplied in this case) (all reads are kept by default)
quality_filter (bool) – if True then also do filtering based on barcode- and UMI-quality (no filtering is performed by default)
compress (bool) – if True then gzip the output files (FASTQs are uncompressed by default)
- class auto_process_ngs.icell8.pipeline.SplitByBarcodes(_name, *args, **kws)
Given a set of Fastq R1/R2 file pairs, group reads into new Fastq file pairs by ICell8 barcode.
Output Fastqs are named <BASENAME>.<BARCODE>.r[1|2].fastq.
- finish()
Perform actions on task completion
Performs any actions that are required on completion of the task, such as moving or copying data, and setting the values of any output parameters.
Must be implemented by the subclass
- init(fastq_pairs, barcodes_dir)
Initialise the SplitByBarcodes task
- Parameters:
fastq_pairs (list) – input Fastq R1/R2 pairs
barcodes_dir (str) – destination directory to write output files to
- Outputs:
pattern (str): glob-style pattern matching output Fastq file names
fastqs (FileCollector): output Fastq files
- setup()
Set up commands to be performed by the task
Must be implemented by the subclass
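The regrouping of read pairs by barcode can be sketched for a single uncompressed R1/R2 pair. This is an illustration under stated assumptions, not the task's code: the barcode is taken to be the first 11 bases of each R1 read (an assumed length), and `read_fastq` and `split_by_barcode` are hypothetical names. Output names follow the documented `<BASENAME>.<BARCODE>.r[1|2].fastq` scheme.

```python
import os

BARCODE_LEN = 11  # assumed inline barcode length at the start of R1


def read_fastq(path):
    """Yield 4-line Fastq records from an uncompressed file."""
    with open(path, "rt") as fp:
        while True:
            record = [fp.readline() for _ in range(4)]
            if not record[0]:
                break
            yield record


def split_by_barcode(r1_fastq, r2_fastq, barcodes_dir, basename):
    """Write read pairs to per-barcode Fastq pairs; return output paths."""
    os.makedirs(barcodes_dir, exist_ok=True)
    outputs = set()
    for rec1, rec2 in zip(read_fastq(r1_fastq), read_fastq(r2_fastq)):
        barcode = rec1[1][:BARCODE_LEN]
        for read_number, rec in ((1, rec1), (2, rec2)):
            name = "%s.%s.r%d.fastq" % (basename, barcode, read_number)
            path = os.path.join(barcodes_dir, name)
            # Reopening in append mode per read is slow but keeps the
            # sketch simple; stale files from earlier runs would be
            # appended to, so the output directory should start empty
            with open(path, "at") as out:
                out.writelines(rec)
            outputs.add(path)
    return sorted(outputs)
```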
- class auto_process_ngs.icell8.pipeline.SplitFastqsIntoBatches(_name, *args, **kws)
Split reads from Fastq pairs into batches
Divides reads from the supplied Fastq R1/R2 file pairs into new Fastq pairs consisting of “batches”, each with a specified number of read pairs.
The output Fastqs will be named <BASENAME>.B###.r[1|2].fastq (where ### is the batch number).
- finish()
Perform actions on task completion
Performs any actions that are required on completion of the task, such as moving or copying data, and setting the values of any output parameters.
Must be implemented by the subclass
- init(fastq_pairs, batch_dir, basename, batch_size=5000000)
Initialise the SplitFastqsIntoBatches task
- Parameters:
fastq_pairs (list) – list of input Fastq R1/R2 file pairs
batch_dir (str) – destination directory to write output files to
basename (str) – basename for output Fastqs
batch_size (int) – number of reads per output FASTQ (in batch mode) (optional)
- Outputs:
pattern: glob-style pattern matching output Fastq file names
fastqs: FileCollector listing output Fastq files
- setup()
Set up commands to be performed by the task
Must be implemented by the subclass
- class auto_process_ngs.icell8.pipeline.TrimFastqPair(*args, **kws)
Build command to run ‘cutadapt’ with ICell8 settings
- cmd()
Build the command
Must be implemented by the subclass and return a Command instance
- init(fastq_pair, trim_dir)
Create a new TrimFastqPair instance
- Parameters:
fastq_pair (list) – R1/R2 FASTQ file pair
trim_dir (str) – destination directory to write output files to
- class auto_process_ngs.icell8.pipeline.TrimReads(_name, *args, **kws)
Run ‘cutadapt’ with ICell8 settings
Given a set of Fastq R1/R2 file pairs, performs the following operations on the R2 reads:
Remove sequencing primers
Remove poly-A/T and poly-N sequences
Apply quality filter of Q <= 25
Remove short reads (<= 20 bases) post-trimming
If an R2 read fails any of the filters then the read pair is rejected.
Output Fastqs contain the filtered and trimmed reads only.
- finish()
Perform actions on task completion
Performs any actions that are required on completion of the task, such as moving or copying data, and setting the values of any output parameters.
Must be implemented by the subclass
- init(fastq_pairs, trim_dir)
Initialise the TrimReads task
- Parameters:
fastq_pairs (list) – input Fastq R1/R2 pairs
trim_dir (str) – destination directory to write output files to
- Outputs:
pattern (str): glob-style pattern matching output Fastq file names
fastqs (FileCollector): output Fastq files
- setup()
Set up commands to be performed by the task
Must be implemented by the subclass
- class auto_process_ngs.icell8.pipeline.UpdateProjectData(_name, *args, **kws)
Updates data (e.g. primary Fastq set) in a project
- init(project_dir, primary_fastq_dir)
Initialise the UpdateProjectData task
- Parameters:
project_dir (str) – path to the project directory
primary_fastq_dir (str) – name of the Fastq subdirectory to make the primary Fastq set
- setup()
Set up commands to be performed by the task
Must be implemented by the subclass
- auto_process_ngs.icell8.pipeline.convert_to_xlsx(tsv_file, xlsx_file, title=None, freeze_header=False)
Convert a tab-delimited file to an XLSX file
- Parameters:
tsv_file (str) – path to the input TSV file
xlsx_file (str) – path to the output XLSX file
title (str) – optional, name to give the worksheet in the output XLSX file (defaults to the input file name)
freeze_header (bool) – optional, if True then ‘freezes’ the first line of the XLSX file (default is not to freeze the first line)
- auto_process_ngs.icell8.pipeline.tmp_dir(d)
Create a temp dir for directory ‘d’