This document describes the VCFLIB API as it is used by the vcflib modules and the Python FFI. vcflib follows the VCF standard (http://samtools.github.io/hts-specs/VCFv4.1.pdf).
VCFLIB contains a lot of functionality, but the basic pattern of reading a VCF file and fetching information record by record is straightforward and appears in all modules. A recent example can be found in vcfwave.

    VariantCallFile variantFile;
    if (optind < argc) {
        string filename = argv[optind];
        variantFile.open(filename);
    } else {
        variantFile.open(std::cin);
    }
    if (!variantFile.is_open()) {
        return 1;
    }

The following parses the records and prints the first two fields of each one (sequence name and position):

    Variant var(variantFile);
    while (variantFile.getNextVariant(var)) {
        cout << var.sequenceName << " " << var.position << endl;
    }

In the file Variant.h (https://github.com/vcflib/vcflib/blob/master/src/Variant.h) the Variant class is defined with fields/accessors, such as

    string sequenceName;
    long position;
    long zeroBasedPosition(void) const;
    string id;
    string ref;
    vector<string> alt;      // a list of all the alternate alleles present at this locus
    vector<string> alleles;  // a list of all alleles (ref + alt) at this locus

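To make the relationship between these fields concrete, here is a minimal standalone sketch. `MiniVariant` is a hypothetical stand-in, not the vcflib class. It assumes that `position` holds the 1-based VCF POS value, that the zero-based coordinate is simply `position - 1`, and that `alleles` is the reference allele followed by the alternates in file order.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for a few of the Variant fields listed above.
// This is NOT the vcflib class, only an illustration of how the fields relate.
struct MiniVariant {
    std::string sequenceName;
    long position = 0;             // assumed 1-based, as in the VCF POS column
    std::string ref;
    std::vector<std::string> alt;  // alternate alleles in file order

    // Assumed behaviour: the zero-based coordinate is POS - 1.
    long zeroBasedPosition() const { return position - 1; }

    // Assumed layout: ref first, followed by every alt allele.
    std::vector<std::string> alleles() const {
        std::vector<std::string> all;
        all.reserve(alt.size() + 1);
        all.push_back(ref);
        all.insert(all.end(), alt.begin(), alt.end());
        return all;
    }
};
```

Under these assumptions, a record at position 100 with ref `A` and alts `C` and `AT` has a zero-based position of 99 and three alleles, with the reference allele at index 0.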
See the record-reading example above for how to parse a file. Note that the INFO entries are split into key/value fields, stored in `info', and flags that are true, stored in `infoFlags'. The C++ map data structure does not preserve the order of the info fields, so the original order is tracked in an infoKeys vector.
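The split between valued fields, flags, and key order can be sketched as follows. This is a standalone illustration, not vcflib's parser: `InfoFields`, `splitOn`, and `parseInfo` are hypothetical names, and the container types are an assumption modelled on the description above.

```cpp
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical container mirroring the three members described above.
struct InfoFields {
    std::map<std::string, std::vector<std::string>> info;  // key=value[,value...]
    std::map<std::string, bool> infoFlags;                 // bare keys (flags)
    std::vector<std::string> infoKeys;                     // keys in file order
};

// Split text on a single delimiter character.
std::vector<std::string> splitOn(const std::string& text, char delim) {
    std::vector<std::string> parts;
    std::stringstream ss(text);
    std::string item;
    while (std::getline(ss, item, delim)) parts.push_back(item);
    return parts;
}

// Parse one INFO column such as "NS=3;DB;AF=0.5,0.1".
InfoFields parseInfo(const std::string& column) {
    InfoFields out;
    if (column == ".") return out;  // "." means no INFO entries
    for (const std::string& entry : splitOn(column, ';')) {
        if (entry.empty()) continue;
        std::size_t eq = entry.find('=');
        if (eq == std::string::npos) {
            // No '=' sign: the entry is a flag that is simply present.
            out.infoFlags[entry] = true;
            out.infoKeys.push_back(entry);
        } else {
            // key=value[,value...]: store the comma-separated values.
            std::string key = entry.substr(0, eq);
            out.info[key] = splitOn(entry.substr(eq + 1), ',');
            out.infoKeys.push_back(key);
        }
    }
    return out;
}
```

For the input `NS=3;DB;AF=0.5,0.1`, iterating over the `info` map visits `AF` before `NS` because `std::map` sorts its keys, while `infoKeys` keeps the file order `NS`, `DB`, `AF`. That difference is why a separate key vector is needed to write the fields back in their original order.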
The default string outputter of the Variant class outputs a VCF record using the fields that are defined:

    Variant var(variantFile);
    while (variantFile.getNextVariant(var)) {
        cout << var << endl;
    }

will output VCF.
The Python FFI follows this API though some accessors may be renamed. See pythonffi.cpp and pyvcflib.md.