vcf2genomicsdb - VCF data converter for GenomicsDB
vcf2genomicsdb [options] <loader_json_config_file>
--help, -h
--version
--progress, -p
- Show import progress
- Specify the minimum time between progress messages with
--progress=<interval> or -p<interval>, where <interval> is a
floating-point number. The default unit is seconds; append s, m, or h
to the number to specify seconds, minutes, or hours explicitly
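The interval syntax described above can be illustrated with a small Python
sketch. This is not the tool's own parser; parse_progress_interval is a
hypothetical helper that only mirrors the documented rules (a floating-point
number, seconds by default, optional s, m, or h suffix).

```python
def parse_progress_interval(text: str) -> float:
    """Convert an interval such as '30', '1.5m', or '2h' to seconds.

    Hypothetical helper that follows the documented syntax only;
    the real tool may accept or reject inputs differently.
    """
    units = {"s": 1.0, "m": 60.0, "h": 3600.0}
    text = text.strip()
    if not text:
        raise ValueError("empty interval")
    suffix = text[-1]
    if suffix in units:
        number, factor = text[:-1], units[suffix]
    else:
        # No suffix: the documented default unit is seconds.
        number, factor = text, 1.0
    return float(number) * factor
```

For example, parse_progress_interval("1.5m") corresponds to the option
--progress=1.5m and yields 90 seconds.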
--tmp-directory, -T
- Specify the directory for temporary files created during the import
process (default: /tmp)
--rank, -r
- Manually assign the MPI rank of the process; the rank determines which
partition the process operates on
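When MPI is not used, one process can be started per partition with an
explicit rank. The Python sketch below only builds the command lines; it
assumes that ranks start at 0 and map to the loader JSON partitions in order,
which this help text does not state. import_commands is a hypothetical helper.

```python
import shlex
from typing import List


def import_commands(loader_json: str, num_partitions: int) -> List[List[str]]:
    """Build one vcf2genomicsdb command line per partition.

    Assumption: rank i selects the i-th partition, counting from 0.
    """
    return [
        ["vcf2genomicsdb", f"--rank={rank}", loader_json]
        for rank in range(num_partitions)
    ]


if __name__ == "__main__":
    # Print the commands instead of running them.
    for cmd in import_commands("loader.json", 2):
        print(shlex.join(cmd))
```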
--split-files
- Split the files specified by the callset mapping JSON file according to
the column partitions in the loader JSON
- Resulting files are placed in the same directory as the originals
- Default behavior is to generate split files only for the partition
corresponding to the rank
- Modifiers to --split-files:
  - --split-all-partitions
    Overrides the --split-files default behavior and instead creates split
    files for all partitions
  - --split-files-results-directory
    Specify where to place split files; overrides the default behavior of
    placing them in the same directory as the originals
  - --split-output-filename
    Create a split file for one column partition and one VCF
    e.g. vcf2genomicsdb <loader.json> --rank=<rank>
         --split-files --split-output-filename=<output_path>
         <input.vcf.gz>
  - --split-callset-mapping-file
    Create callset mapping files containing the paths to the generated split
    files, one callset per partition
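
The --split-output-filename form shown above can be scripted to split one VCF
for each partition in turn. The following Python sketch builds those command
lines using only the argument order from the example. The output naming scheme
(partition<rank>_<name>) and the 0-based rank numbering are assumptions made
for illustration, and split_commands_for_vcf is a hypothetical helper.

```python
import posixpath
from typing import List


def split_commands_for_vcf(
    loader_json: str, input_vcf: str, num_partitions: int, out_dir: str
) -> List[List[str]]:
    """Build one split command per partition for a single VCF file."""
    base = posixpath.basename(input_vcf)
    commands = []
    for rank in range(num_partitions):
        # Hypothetical naming scheme; any unique output path would do.
        output_path = posixpath.join(out_dir, f"partition{rank}_{base}")
        commands.append([
            "vcf2genomicsdb",
            loader_json,
            f"--rank={rank}",
            "--split-files",
            f"--split-output-filename={output_path}",
            input_vcf,
        ])
    return commands
```

Each returned list can be passed to subprocess.run once the paths have been
checked.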