auto_process_ngs.icell8.atac
Utility functions for handling single-cell ATAC-seq data from the ICELL8 platform.
Functions:
report: write a timestamped message
reverse_complement: get reverse complement of a sequence
update_fastq_read_index: rewrite index sequence in Fastq read header
split_fastq: split Fastq into batches
assign_reads: assign reads to samples from batched ICELL8 ATAC Fastqs
concat_fastqs: concatenate Fastqs for a sample across batches
- auto_process_ngs.icell8.atac.assign_reads(args)
Assign reads to samples from batched ICELL8 ATAC Fastqs
Intended to be invoked via ‘map’ or similar function
Arguments are supplied in a single list which should contain the following items:
R1 Fastq: path to R1 Fastq file
R2 Fastq: path to R2 Fastq file
I1 Fastq: path to I1 Fastq file
I2 Fastq: path to I2 Fastq file
well list: path to the well list file
mode: either ‘samples’ or ‘barcodes’
swap_i1_and_i2: boolean indicating whether I1 and I2 Fastqs should be swapped for matching
reverse_complement: one of None, ‘i1’, ‘i2’ or ‘both’
rewrite_fastq_headers: boolean indicating whether to write the matching ICELL8 barcodes into the Fastq read headers on output
working_dir: working directory to write batches to
unassigned: ‘sample name’ to associate with unassigned reads (used as a basename for the output file)
In ‘samples’ mode assignment is done to samples only; in ‘barcodes’ mode assignment is done to samples and barcodes.
- Parameters:
args (list) – list containing the arguments supplied to the read assigner
- Returns:
tuple consisting of (batch id, barcode_counts, unassigned_barcodes_file).
- Return type:
Tuple
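Because the arguments are packed into a single list, it can help to build that list with a small helper. The following is a minimal sketch only: `build_assign_reads_args` is a hypothetical helper (not part of the module), the default value for `unassigned` is invented for illustration, and it assumes the positional order matches the order in which the items are documented above.

```python
def build_assign_reads_args(r1, r2, i1, i2, well_list, working_dir,
                            mode="samples", swap_i1_and_i2=False,
                            reverse_complement=None,
                            rewrite_fastq_headers=False,
                            unassigned="Unassigned"):
    """Pack arguments into one list, in the documented order (assumed)."""
    if mode not in ("samples", "barcodes"):
        raise ValueError("mode must be 'samples' or 'barcodes'")
    if reverse_complement not in (None, "i1", "i2", "both"):
        raise ValueError("reverse_complement must be None, 'i1', 'i2' or 'both'")
    return [r1, r2, i1, i2, well_list, mode, swap_i1_and_i2,
            reverse_complement, rewrite_fastq_headers, working_dir,
            unassigned]

# One argument list per batch; the file names here are hypothetical
arg_sets = [
    build_assign_reads_args("B%03d_R1.fastq" % b, "B%03d_R2.fastq" % b,
                            "B%03d_I1.fastq" % b, "B%03d_I2.fastq" % b,
                            "well_list.txt", "work", mode="barcodes")
    for b in range(3)
]

# The lists could then be handed to a pool, for example:
#   from multiprocessing import Pool
#   from auto_process_ngs.icell8.atac import assign_reads
#   with Pool(4) as pool:
#       results = pool.map(assign_reads, arg_sets)
```

Each element of `results` would then be the (batch id, barcode_counts, unassigned_barcodes_file) tuple described above.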
- auto_process_ngs.icell8.atac.concat_fastqs(args)
Concatenate Fastqs for a sample across batches
Intended to be invoked via ‘map’ or similar function
Arguments are supplied in a single list which should contain the following items:
sample: name of sample to concatenate Fastqs for
index: integer index to assign to the sample in output file name
barcode: (optional) barcode to concatenate Fastqs for (set to None when concatenating across samples)
lane: (optional) lane number for output Fastq (set to None to stop lane number appearing)
read: read identifier e.g. ‘R1’ or ‘I2’
batches: list of batch IDs to concatenate across
working_dir: working directory where batches are located
final_dir: directory to write concatenated Fastq to
- Parameters:
args (list) – list containing the arguments supplied to the Fastq concatenator
- Returns:
path of concatenated Fastq.
- Return type:
String
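The concatenation step can be pictured with the following minimal sketch. It is not the module's implementation: the batch file naming (`B<batch>.<sample>.<read>.fastq`), the output file naming, and the use of integer batch IDs are all assumptions made for illustration, and the barcode is only used in the output name.

```python
import os
import shutil
import tempfile

def concat_fastqs_sketch(args):
    """Concatenate per-batch Fastqs for one sample and read (illustrative)."""
    sample, index, barcode, lane, read, batches, working_dir, final_dir = args
    # Build a hypothetical output name; barcode and lane are optional
    parts = [sample]
    if barcode is not None:
        parts.append(barcode)
    parts.append("S%d" % index)
    if lane is not None:
        parts.append("L%03d" % lane)
    parts.extend([read, "001"])
    out_path = os.path.join(final_dir, "_".join(parts) + ".fastq")
    with open(out_path, "wb") as fo:
        for batch in batches:
            # Hypothetical per-batch file name
            src = os.path.join(working_dir,
                               "B%03d.%s.%s.fastq" % (batch, sample, read))
            if not os.path.exists(src):
                continue  # this batch had no reads for the sample
            with open(src, "rb") as fi:
                shutil.copyfileobj(fi, fo)
    return out_path

# Usage: two batches exist for the sample, a third is missing and is skipped
with tempfile.TemporaryDirectory() as tmp:
    for batch in (0, 1):
        with open(os.path.join(tmp, "B%03d.SampleA.R1.fastq" % batch), "wt") as fo:
            fo.write("@read%d\nACGT\n+\nFFFF\n" % batch)
    merged = concat_fastqs_sketch(
        ["SampleA", 1, None, None, "R1", [0, 1, 2], tmp, tmp])
    merged_name = os.path.basename(merged)
    with open(merged) as fi:
        merged_lines = fi.read().splitlines()
```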
- auto_process_ngs.icell8.atac.report(msg, fp=sys.stdout)
Write timestamped message
- Parameters:
msg (string) – text to be reported
fp (file) – stream to report to (defaults to stdout)
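A timestamped reporter of this kind might look like the sketch below. The timestamp format and the name `report_sketch` are assumptions; the actual format used by `report` is not stated here.

```python
import io
import sys
import time

def report_sketch(msg, fp=sys.stdout):
    """Write msg to fp, prefixed with the current local time (format assumed)."""
    stamp = time.strftime("%Y-%m-%d %H:%M:%S")
    fp.write("[%s] %s\n" % (stamp, msg))

# Usage: capture the message in a buffer instead of writing to stdout
buf = io.StringIO()
report_sketch("Assigning reads", fp=buf)
```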
- auto_process_ngs.icell8.atac.reverse_complement(s)
Return reverse complement of a sequence
- Parameters:
s (str) – sequence to be reverse complemented
- Returns:
reverse complement of input sequence
- Return type:
String
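For reference, reverse complementing can be sketched in a couple of lines. This is an illustration rather than the module's code; handling of `N` and of lower-case bases is an assumption.

```python
# Translation table mapping each base to its complement (N maps to itself)
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")

def reverse_complement_sketch(s):
    """Complement every base, then reverse the sequence."""
    return s.translate(_COMPLEMENT)[::-1]

print(reverse_complement_sketch("AACGTN"))  # prints NACGTT
```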
- auto_process_ngs.icell8.atac.split_fastq(args)
Split Fastq into batches
Intended to be invoked via ‘map’ or similar function
Arguments are supplied in a single list which should contain the following items:
Fastq: path to Fastq file to split
batch_size: size of each batch
working_dir: working directory to write batches to
- Parameters:
args (list) – list containing the arguments supplied to the splitter
- Returns:
list of batched Fastqs
- Return type:
List
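Batching a Fastq can be sketched as follows. This is not the module's implementation: it assumes that `batch_size` counts reads (four lines each), that the input is uncompressed, and that batch files are named `<name>.B<nnn>.fastq`, none of which is confirmed above.

```python
import os
import tempfile
from itertools import islice

def split_fastq_sketch(args):
    """Split a Fastq into files of at most batch_size reads (illustrative)."""
    fastq, batch_size, working_dir = args
    stem = os.path.basename(fastq)
    if stem.endswith(".fastq"):
        stem = stem[:-len(".fastq")]
    batches = []
    with open(fastq, "rt") as fp:
        while True:
            # Each read occupies four lines
            chunk = list(islice(fp, 4 * batch_size))
            if not chunk:
                break
            out = os.path.join(working_dir,
                               "%s.B%03d.fastq" % (stem, len(batches)))
            with open(out, "wt") as fo:
                fo.writelines(chunk)
            batches.append(out)
    return batches

# Usage: five reads in batches of two gives three batch files
with tempfile.TemporaryDirectory() as tmp:
    fq = os.path.join(tmp, "example_R1.fastq")
    with open(fq, "wt") as fo:
        for i in range(5):
            fo.write("@read%d\nACGT\n+\nFFFF\n" % i)
    batch_files = split_fastq_sketch([fq, 2, tmp])
    batch_names = [os.path.basename(b) for b in batch_files]
    with open(batch_files[-1]) as fi:
        last_batch_lines = fi.read().splitlines()
```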
- auto_process_ngs.icell8.atac.update_fastq_read_index(read, index_sequence)
Update the index sequence (aka barcode) in a Fastq read
- Parameters:
read (list) – Fastq read to be updated, as a list of lines (with the first element/line being the sequence identifier line)
index_sequence (str) – the index sequence to put into the read header
- Returns:
the updated Fastq read, as a list of lines.
- Return type:
List
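The header rewrite can be illustrated with the sketch below. It assumes Illumina-style headers in which the index sequence is the final colon-separated field of the description (for example `1:N:0:ACGT`). The real function may handle other layouts, and `update_fastq_read_index_sketch` is a hypothetical name.

```python
def update_fastq_read_index_sketch(read, index_sequence):
    """Return a copy of the read with the header's index sequence replaced."""
    header = read[0].rstrip("\n")
    newline = "\n" if read[0].endswith("\n") else ""
    fields = header.split(None, 1)
    if len(fields) != 2 or ":" not in fields[1]:
        raise ValueError("header has no index field: %r" % header)
    # Keep everything up to the last colon, then append the new index
    prefix = fields[1].rpartition(":")[0]
    new_header = "%s %s:%s%s" % (fields[0], prefix, index_sequence, newline)
    return [new_header] + list(read[1:])

# Usage with a made-up read
example_read = [
    "@INSTR:70:FLOWCELL:1:11101:22672:1659 1:N:0:ACGT",
    "GATTACA",
    "+",
    "FFFFFFF",
]
updated = update_fastq_read_index_sketch(example_read, "TTGCAA+GGTACC")
print(updated[0])
# prints @INSTR:70:FLOWCELL:1:11101:22672:1659 1:N:0:TTGCAA+GGTACC
```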