dinopy.fastq_writer module
- class dinopy.fastq_writer.FastqWriter(target, force_overwrite=False, append=False)
Create a new FastqWriter for writing reads to disk in fastq format.
Manages opening and closing of files. This works best inside a with statement (see Examples), but the open and close methods of the writer can also be called directly. This is useful when the number of files to open depends on the input data.
- Parameters:
target (str, bytes, file or sys.stdout) – Path of the file to write, an already opened file object, or sys.stdout. If a path ends with the suffix .gz, a gzipped file is created.
force_overwrite (bool) – If set to True, an existing file will be overwritten. (Default: False)
append (bool) – If set to True, an existing file is not overwritten; reads are appended to the end of the file. (Default: False)
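The .gz suffix rule for target can be illustrated with the standard library alone. This is a minimal sketch, not dinopy's implementation: open_target is a hypothetical helper that picks gzip.open or the built-in open based on the suffix, and the round trip confirms that the compressed file holds the same record.

```python
import gzip
import os
import tempfile


def open_target(path, mode="wb"):
    """Hypothetical helper: pick gzip or plain output from the path suffix."""
    if path.endswith(".gz"):
        return gzip.open(path, mode)
    return open(path, mode)


record = b"@sequence_id\nTTTTTTTTGGANNNNN\n+\n#+++3#+/-.1/1/.<\n"
tmp_dir = tempfile.mkdtemp()
gz_path = os.path.join(tmp_dir, "somefile.fastq.gz")

# The suffix selects gzip compression; the calling code stays the same.
with open_target(gz_path) as handle:
    handle.write(record)

# Reading the file back through gzip returns the original record.
with gzip.open(gz_path, "rb") as handle:
    round_trip = handle.read()

# The raw bytes on disk start with the gzip magic number.
with open(gz_path, "rb") as handle:
    magic = handle.read(2)
```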
- Raises:
ValueError – If the filename is invalid.
ValueError – If contradicting parameters are passed (force_overwrite=True and append=True).
TypeError – If target is neither a file, nor a path nor stdout.
IOError – If target is a file opened in the wrong mode.
IOError – If the target file already exists and neither force_overwrite nor append is set.
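The interplay of force_overwrite and append described above can be sketched as a small decision function. This is an illustration under stated assumptions, not dinopy's code: resolve_mode is a hypothetical helper, and the returned modes ("wb", "ab") are assumptions about how such a writer might open its file.

```python
import os
import tempfile


def resolve_mode(path, force_overwrite=False, append=False):
    """Hypothetical helper: map the two flags to a file mode or an error."""
    if force_overwrite and append:
        raise ValueError("force_overwrite and append contradict each other")
    if append:
        return "ab"  # keep existing content, add reads at the end
    if os.path.exists(path) and not force_overwrite:
        raise IOError("target exists; set force_overwrite or append")
    return "wb"  # new file, or explicit overwrite


tmp_dir = tempfile.mkdtemp()
existing = os.path.join(tmp_dir, "existing.fastq")
with open(existing, "wb"):
    pass  # create an empty file to exercise the "already exists" cases

mode_new = resolve_mode(os.path.join(tmp_dir, "new.fastq"))
mode_overwrite = resolve_mode(existing, force_overwrite=True)
mode_append = resolve_mode(existing, append=True)
```

Calling resolve_mode(existing) with neither flag set raises IOError, and setting both flags raises ValueError, mirroring the Raises entries above.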
Methods intended for public use are:
write(): Write one read to the opened file.
write_reads(): Write the given reads to file, where reads must be an Iterable over either (sequence, sequence_id, quality_values) or (sequence, sequence_id) tuples.
Examples
Writing reads from a list:
reads = [("TTTTTTTTGGANNNNN", b"sequence_id", b"#+++3#+/-.1/1/.<")]
with dinopy.FastqWriter("somefile.fastq") as fqw:
    fqw.write_reads(reads, dtype=str)
Results in:
@sequence_id
TTTTTTTTGGANNNNN
+
#+++3#+/-.1/1/.<
Writing a single read:
with dinopy.FastqWriter("somefile.fastq.gz") as fqw:
    fqw.write(b"TTTTTTTTGGANNNNN", b"sequence_id", b"#+++3#+/-.1/1/.<")
Results in:
@sequence_id
TTTTTTTTGGANNNNN
+
#+++3#+/-.1/1/.<
Using a FastqWriter without a with statement. Make sure the file is closed after you have finished writing:
fqw = dinopy.FastqWriter("somefile.fastq")
fqw.open()
fqw.write(b"TTTTTTTTGGANNNNN", b"sequence_id", None, dtype=bytes)
fqw.close()
Results in:
@sequence_id
TTTTTTTTGGANNNNN
Using a variable number of writers:
# create a dict of writers
writers = {name: dinopy.FastqWriter(path) for name, path in zip(specimen, input_filepaths)}
# open all writers (iterate over the dict values, not its keys)
for writer in writers.values():
    writer.open()
for read in reads:
    # pick a writer / output file according to some properties of the read
    # and write the read using the picked writer.
    picked_writer = pick(read, writers)
    # read is assumed to be a (sequence, name, quality_values) tuple
    picked_writer.write(*read)
# close all writers
for writer in writers.values():
    writer.close()
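When the number of outputs is only known at run time, contextlib.ExitStack from the standard library can close every handle even if an error occurs midway. The sketch below uses plain binary files as stand-ins; whether FastqWriter objects can be registered the same way is an assumption based on the with examples above. Routing by first base is an arbitrary, hypothetical criterion.

```python
import contextlib
import os
import tempfile

reads = [
    (b"ACGT", b"read_1", b"IIII"),
    (b"TTGA", b"read_2", b"IIII"),
    (b"AAAA", b"read_3", b"IIII"),
]

tmp_dir = tempfile.mkdtemp()
# One output path per first base (hypothetical routing criterion).
paths = {
    base: os.path.join(tmp_dir, base.decode() + ".fastq")
    for base in (b"A", b"T")
}

with contextlib.ExitStack() as stack:
    # enter_context registers each handle; all are closed when the block exits.
    handles = {
        base: stack.enter_context(open(path, "wb"))
        for base, path in paths.items()
    }
    for seq, name, quality in reads:
        handle = handles[seq[:1]]  # slicing bytes keeps the bytes type
        handle.write(b"@" + name + b"\n" + seq + b"\n+\n" + quality + b"\n")
```

Compared with explicit open and close loops, this keeps cleanup in one place and avoids leaking open files when writing fails partway through.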
- write(self, seq, bytes name, bytes quality_values=None, type dtype=bytes)
Write a single read to file.
- Parameters:
seq (dtype) – Sequence of the read
name (bytes) – Name line for the read
quality_values (bytes) – Quality values of the read.
dtype (type) – Type of the sequence(s) (See dtype; Default: bytes)
- Raises:
IOError – If no file has been opened, i.e. the FastqWriter was neither used in a with statement nor opened by calling open().
InvalidDtypeError – If an invalid encoding for the sequence has been given.
Example
Write a single read to file:
with dinopy.FastqWriter("somefile.fastq") as fqw:
    fqw.write(b"TTTTTTTTGGANNNNN", b"sequence_id", b"#+++3#+/-.1/1/.<")
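To make the layout of a written record concrete, the following sketch assembles the bytes for a single read. format_record is a hypothetical helper, not part of dinopy. It mirrors the outputs shown in the class-level examples: four lines when quality values are given, and only the name and sequence lines when quality_values is None.

```python
def format_record(seq, name, quality_values=None):
    """Hypothetical helper: build the bytes of one FASTQ-style record."""
    lines = [b"@" + name, seq]
    if quality_values is not None:
        # The separator line and the quality line are only added
        # when quality values are available.
        lines += [b"+", quality_values]
    return b"\n".join(lines) + b"\n"


with_quality = format_record(
    b"TTTTTTTTGGANNNNN", b"sequence_id", b"#+++3#+/-.1/1/.<"
)
without_quality = format_record(b"TTTTTTTTGGANNNNN", b"sequence_id")
```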
- write_reads(self, reads, bool quality_values=True, type dtype=bytes)
Write multiple reads to file.
- Parameters:
reads (Iterable) – Containing reads, i.e. tuples of sequence, name and (optionally) quality values
quality_values (bool) – If set to True (Default) quality values are written to file.
dtype (type) – Type of the sequence(s) (See dtype; Default: bytes)
- Raises:
IOError – If no file has been opened, i.e. the writer was neither used in a with statement nor opened by calling the open method explicitly.
Example
Write a list of reads to file:
reads = [("TTTTTTTTGGANNNNN", b"sequence_id", b"#+++3#+/-.1/1/.<")]
with dinopy.FastqWriter("somefile.fastq") as fqw:
    fqw.write_reads(reads, dtype=str)
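How the reads iterable, the quality_values flag and dtype interact can be sketched with an in-memory buffer. This is an illustration only: write_reads_sketch is a hypothetical stand-in, and treating dtype=str as "encode the sequence as ASCII" is an assumption drawn from the example above, where the sequence is a str and the name is bytes.

```python
import io


def write_reads_sketch(handle, reads, quality_values=True, dtype=bytes):
    """Hypothetical stand-in: accepts (seq, name) or (seq, name, quality) tuples."""
    for read in reads:
        seq, name = read[0], read[1]
        quality = read[2] if len(read) > 2 else None
        if dtype is str:
            seq = seq.encode("ascii")  # assumed meaning of dtype=str
        handle.write(b"@" + name + b"\n" + seq + b"\n")
        # Quality lines are written only if requested and present.
        if quality_values and quality is not None:
            handle.write(b"+\n" + quality + b"\n")


reads = [("TTTTTTTTGGANNNNN", b"sequence_id", b"#+++3#+/-.1/1/.<")]

full = io.BytesIO()
write_reads_sketch(full, reads, dtype=str)

no_quality = io.BytesIO()
write_reads_sketch(no_quality, reads, quality_values=False, dtype=str)

pairs = io.BytesIO()
write_reads_sketch(pairs, [(b"ACGT", b"second_id")])
```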