dextract - pull information needed for assembly from source HDF5
files made by PacBio RS II sequencer
dextract [-vfaq] [-o[<path>]]
[-l<int(500)>] [-s<int(750)>]
<input:bax_h5> ...
Dextract takes a series of .bax.h5 or .subreads.[bs]am files as
input and, depending on the option flag settings, produces:
- (-f) a .fasta file containing subread sequences, each with a
"standard" PacBio header consisting of the movie name, well
number, pulse range, and read quality value.
- (-a) a FASTA-format .arrow file containing the pulse width stream for each
subread, with a header that contains the movie name and the 4 channel SNR
values.
- (-q) a FASTQ-like .quiva file containing, for each subread, the same header
as the .fasta file above, save that it starts with an @-sign, followed by
the 5 quality value streams used by Quiver, one per line, where the order
of the streams is: deletion QVs, deletion Tags, insertion QVs, merge QVs,
and last substitution QVs.
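The .quiva layout described above (an @-prefixed header followed by five
stream lines in a fixed order) can be read with a few lines of Python. This
is a minimal sketch, not part of dextract: the names QuivaRecord and
read_quiva are hypothetical, the sample record is made-up placeholder text
rather than real sequencer output, and the code assumes each record occupies
exactly six consecutive lines with no blank lines between records.

```python
import io
from typing import Iterator, NamedTuple, TextIO


class QuivaRecord(NamedTuple):
    """One subread entry of a .quiva file (hypothetical helper type)."""
    header: str            # header text without the leading '@'
    deletion_qv: str       # stream 1: deletion QVs
    deletion_tag: str      # stream 2: deletion Tags
    insertion_qv: str      # stream 3: insertion QVs
    merge_qv: str          # stream 4: merge QVs
    substitution_qv: str   # stream 5: substitution QVs


def read_quiva(handle: TextIO) -> Iterator[QuivaRecord]:
    """Yield records, assuming 6 lines each: a header plus 5 streams."""
    lines = handle.read().splitlines()
    if len(lines) % 6 != 0:
        raise ValueError("expected a multiple of 6 lines")
    for i in range(0, len(lines), 6):
        header, *streams = lines[i:i + 6]
        if not header.startswith("@"):
            raise ValueError(f"header must start with '@': {header!r}")
        yield QuivaRecord(header[1:], *streams)


# Placeholder record: the header and stream contents are invented.
SAMPLE = "@example_header\nabcd\nNNNN\nefgh\nijkl\nmnop\n"
records = list(read_quiva(io.StringIO(SAMPLE)))
print(records[0].header, records[0].substitution_qv)
```

Chunking by line count rather than scanning for '@' avoids misreading a
quality stream that happens to begin with an @ character.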
If the -v option is set, the program reports the
processing of each PacBio input file; otherwise it runs silently. If
none of the -f, -a, or -q flags is set, then -f is assumed by default.
The destination of the extracted information is controlled by the -o
parameter as follows:
- If -o is absent, then for each input file X.bax.h5 or X.subreads.[bs]am,
dextract will produce X.fasta, X.arrow, and/or X.quiva as per the option
flags.
- If -o is present and followed by a path Y, then the concatenation of the
output for the input files is placed in Y.fasta, Y.arrow, and/or Y.quiva
as per the option flags.
- If -o is present but with no following path, then the output is sent to
the standard output (to enable a UNIX pipe if desired). In this case only
one of the flags -f, -a, or -q can be set.
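The three destination rules, together with the default of -f when no output
flag is given, can be summarized as a small Python function. This is only a
sketch of the rules as stated here, not dextract's own code: the name
planned_outputs is hypothetical, as is the convention of passing o=None for
an absent -o, o="" for -o with no path, and o="Y" for -o followed by Y. It
also assumes that X keeps whatever directory prefix the input name carries,
which the text above does not specify.

```python
import re
from typing import List, Optional

# File suffix produced by each output flag.
SUFFIX = {"f": ".fasta", "a": ".arrow", "q": ".quiva"}

# Input extensions named in the description: .bax.h5 or .subreads.[bs]am
INPUT_EXT = re.compile(r"\.(bax\.h5|subreads\.[bs]am)$")


def planned_outputs(inputs: List[str], flags: str = "",
                    o: Optional[str] = None) -> List[str]:
    """Return the destinations the rules imply (hypothetical helper)."""
    # If none of -f, -a, -q is set, -f is assumed.
    selected = [f for f in "faq" if f in flags] or ["f"]

    if o == "":
        # -o with no path: standard output, one output flag at most.
        if len(selected) != 1:
            raise ValueError("only one of -f, -a, -q may be set with bare -o")
        return ["<stdout>"]

    if o is not None:
        # -o followed by a path Y: one concatenated file per flag.
        return [o + SUFFIX[f] for f in selected]

    # -o absent: one file per input per flag, named after the input root X.
    outputs = []
    for name in inputs:
        root, hits = INPUT_EXT.subn("", name)
        if hits == 0:
            raise ValueError(f"unrecognized input extension: {name}")
        outputs.extend(root + SUFFIX[f] for f in selected)
    return outputs


print(planned_outputs(["A.bax.h5", "B.subreads.bam"], "fq"))
print(planned_outputs(["A.bax.h5"], "aq", o="Y"))
```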
The full documentation for dextract is maintained as a
Texinfo manual. If the info and dextract programs are
properly installed at your site, the command
- info dextract
should give you access to the complete manual.