hyphy - Hypothesis testing using Phylogenies (pthreads
version)
usage: hyphy or HYPHYMPI [-h] [--help] [-c] [-d] [-i] [-p]
[BASEPATH=directory path] [CPU=integer] [LIBPATH=library path]
[USEPATH=library path] [<standard analysis name> or <path to hyphy
batch file>] [--keyword value ...] [positional arguments ...]
Execute a HyPhy analysis, either interactively, or in batch mode
optional flags:
- -h --help
- show this help message and exit
- -c
- calculator mode; causes HyPhy to drop into an expression evaluation mode
until 'exit' is typed
- -d
- debug mode; causes HyPhy to drop into an expression evaluation mode upon
script error
- -i
- interactive mode; causes HyPhy to always prompt the user for analysis
options, even when defaults are available
- -p
- postprocessor mode; drops HyPhy into an interactive mode where general
post-processing scripts can be selected upon analysis completion
- -m
- write diagnostic messages to messages.log
- BASEPATH=directory path
- defines the base directory for all path operations (default is pwd)
- CPU=integer
- if compiled with OpenMP multithreading support, requests this many
threads; HyPhy could use fewer than this but never more; default is the
number of CPU cores (as computed by OpenMP) on the system
- LIBPATH=directory path
- defines the directory where HyPhy library files are located (default
installed location is /usr/local/lib/hyphy or as configured during
CMake installation)
- USEPATH=directory path
- specifies the optional working and relative path directory (default is
BASEPATH)
- ENV=expression
- set HBL environment variables via explicit statements, for example
ENV='DEBUG_MESSAGES=1;WRITE_LOGS=1'
- batch file to run
- if specified, execute this file, otherwise drop into an interactive
mode
- analysis arguments
- if batch file is present, all remaining positional arguments are
interpreted as inputs to analysis prompts
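As a hedged illustration of how the launcher options above can be combined on one command line, the following shell sketch only assembles and prints a command; it does not run HyPhy. The helper name hyphy_cmd, the thread count and the directory are assumptions made for illustration; the CPU=, BASEPATH= and ENV= forms are the ones listed above.

```shell
#!/bin/sh
# Hypothetical helper: print (do not run) a hyphy command line that sets
# the thread count, the base directory and two HBL environment variables.
hyphy_cmd() {
    threads=$1    # value for CPU=
    base=$2       # value for BASEPATH=
    batch=$3      # standard analysis name or path to a batch file
    printf "hyphy CPU=%s BASEPATH=%s ENV='DEBUG_MESSAGES=1;WRITE_LOGS=1' %s\n" \
        "$threads" "$base" "$batch"
}

# Illustrative values only.
hyphy_cmd 4 /tmp/work busted
```

Printing the command first makes it easy to check the quoting of the ENV= expression before running it for real.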
optional keyword arguments (can appear anywhere); will be consumed
by the requested analysis
- --keyword value
- will be passed to the analysis (which uses KeywordArgument directives);
multiple values for the same keyword are treated as an array of values
for multiple selectors
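The repeated-keyword form described above can be sketched in shell as follows. This is a minimal illustration under stated assumptions: repeat_keyword is a hypothetical helper, and "selector" is a placeholder keyword rather than one defined by any particular analysis. The function only prints the arguments and does not call HyPhy.

```shell
#!/bin/sh
# Hypothetical helper: expand one keyword and several values into
# repeated "--keyword value" pairs, one pair per value.
repeat_keyword() {
    key=$1
    shift
    out=""
    for value in "$@"; do
        out="$out --$key $value"
    done
    # Strip the leading space before printing.
    printf '%s\n' "${out# }"
}

# "selector" is a placeholder keyword, not a real analysis option.
repeat_keyword selector first second
```

The printed pairs could then be appended to a hyphy command line; which keywords accept several values depends on the analysis, so check its --help output.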
usage examples:
Run HyPhy interactively and choose an analysis from the prompts
- hyphy -i
Run a standard analysis with default options and one required user
argument
- hyphy busted --alignment path/to/file
Run a standard analysis with additional keyword arguments
- hyphy busted --alignment path/to/file --srv No
See which arguments are understood by a standard analysis
- hyphy busted --help
Run a custom analysis and pass it some arguments
- hyphy path/to/hyphy.script argument1 'argument 2'
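Building on the busted example above, a rough sketch of a dry run over several alignments might look like this. The function name busted_commands and the file names are assumptions for illustration; the sketch prints one command per file and executes nothing. Paths containing spaces would need extra quoting before the printed lines are run.

```shell
#!/bin/sh
# Hypothetical helper: print one BUSTED command per alignment path.
# Dry run only; no analysis is started.
busted_commands() {
    for aln in "$@"; do
        printf 'hyphy busted --alignment %s --srv No\n' "$aln"
    done
}

# Illustrative file names.
busted_commands data/a.fas data/b.fas
```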
Available standard keyword analyses (located in
/home/nilesh/packages/hyphy/hyphy/res/)
- meme
- [MEME] Test for episodic site-level selection using MEME (Mixed Effects
Model of Evolution).
- mh
- Merge two datafiles by combining sites (horizontal merge).
- mv
- Merge two datafiles by combining sequences (vertical merge).
- mcc
- Compare mean within-clade branch length or pairwise divergence between two
or more non-nested clades in a tree.
- mclk
- Test for the presence of a global molecular clock on the tree using its
root (the resulting clock tree is unrooted, but one of the root branches
can be divided in such a way as to enforce the clock).
- mgvsgy
- Compare the fits of MG94 and GY94 models (crossed with an arbitrary
nucleotide bias) on codon data.
- mt
- Select an evolutionary model for nucleotide data, using the methods of
'ModelTest' - a program by David Posada and Keith Crandall.
- fel
- [FEL] Test for pervasive site-level selection using FEL (Fixed Effects
Likelihood).
- fubar
- [FUBAR] Test for pervasive site-level selection using FUBAR (Fast
Unconstrained Bayesian AppRoximation for inferring selection).
- fade
- [FADE] Test a protein alignment for directional selection towards specific
amino acids along a specified set of test branches using FADE (a FUBAR
Approach to Directional Evolution).
- faa
- Fit a multiple fitness class model to amino acid data.
- fmm
- Fit a model that permits double (and triple) instantaneous
nucleotide substitutions.
- fst
- Compute various measures of F_ST and (optionally) perform permutation
tests.
- slac
- [SLAC] Test for pervasive site-level selection using SLAC (Single
Likelihood Ancestor Counting).
- sm
- Perform a classic and structured Slatkin-Maddison test for the number of
migrations.
- sns
- Parse a codon alignment for ambiguous codons and output a complete
list/resolutions/syn and ns counts by sequence/position
- sw
- Perform a sliding window analysis of sequence data.
- sa
- Perform a phylogeny reconstruction for nucleotide, protein or codon data
with user-selectable models using the method of sequential addition.
- sbl
- Search an alignment for a single breakpoint.
- spl
- Plot genetic distances (similarity) of one sequence against all others in
an alignment, using a sliding window. Optionally, determine NJ-based
clustering and bootstrap support in every window. This is a HyPhy
adaptation of the excellent (but Windows-only) tool SimPlot (and/or
VarPlot) written by Stuart Ray
(http://sray.med.som.jhmi.edu/SCRoftware/simplot/)
- busted
- [BUSTED] Test for episodic gene-wide selection using BUSTED (Branch-site
Unrestricted Statistical Test of Episodic Diversification).
- bgm
- [BGM] Apply Bayesian Graphical Model inference to substitution histories
at individual sites.
- bva
- Run a selection analysis using a general discrete bivariate (dN AND dS)
distribution; the appropriate number of rate classes is determined
automatically.
- brp
- Interpret bivariate codon rate analysis results.
- bsel
- Split a tree into two clades (compartments) and a separating branch and
test for equality of dN/dS between compartments and for selection along
the separating branch using a series of Likelihood Ratio Tests.
- bst
- Use the improved branch-site REL method of Yang and Nielsen (2005) to look
for episodic selection in sequences.
- bt
- Test whether a branch (or branches) in the tree evolves under different dN
and dS than the rest of the tree.
- absrel
- [aBSREL] Test for lineage-specific evolution using the branch-site method
aBS-REL (Adaptive Branch-Site Random Effects Likelihood).
- acd
- Analyse codon data with a variety of standard models using a given
tree.
- ad
- Analyse nucleotide or amino acid data with a variety of standard models
using a given tree.
- adn
- Analyse di-nucleotide data with a variety of standard models using a given
tree.
- afd
- Analyse nucleotide data with a variety of standard models using a given
tree, estimating equilibrium frequencies as parameters.
- ana
- Run a selection analysis.
- ai
- Peter Simmonds' Association Index (AI).
- relax
- [RELAX] Test for relaxation of selection pressure along a specified set of
test branches using RELAX (a random effects test of selection
relaxation).
- red
- Replace sufficiently close sequences with their MRCA.
- rpc
- Interpret analysis results.
- rmv
- Remove sequences with stop codons from the data.
- rble
- Use a series of random effects branch-site models to perform robust
model-averaged branch length estimation under a codon model with episodic
selection.
- rclk
- Test for the presence of a global molecular clock on the tree. The tree is
rooted at every possible branch.
- rr
- Use a relative rate test on three species and a variety of standard
models.
- rrt
- Use a relative ratio test on 2 datasets and a variety of standard
models.
- prime
- [PRIME]
- protein
- Compare the fit of several amino-acid substitution models to an alignment
using AIC and c-AIC.
- prr
- Using the model and the outgroup provided by the user, perform relative
rate tests with all possible pairs of species from the data file.
- prrti
- Given a list of files (and optionally genetic code tables), perform
relative ratio tests on all possible pairs of the data files.
- pdf
- Read sequence data, select a contiguous subset of sites and save it to
another datafile.
- phb
- Run an example file from our book chapter in 'The Phylogenetic Handbook'
(2nd edition).
- psm
- Test for positive selection using the approach of Nielsen and Yang, by
sampling global dN/dS from an array of distributions, and using Bayesian
posterior to identify the sites with dN/dS>1. Multiple subsets of one
data set with shared dN/dS.
- parris
- A PARtitioning approach for Robust Inference of Selection (written by K.
Scheffler)
- contrast-fel
- [Contrast-FEL] Perform a site-level test for differences in
selective pressures between predetermined sets of branches.
- conv
- Translate an in-frame codon alignment to proteins.
- corr
- Assess the correlation between phylogenetic and compartment segregation
using generalized correlation coefficients and permutation tests.
- cod
- Compare all 203 reversible nucleotide models composed with MG94 to extend
them to codon data, and perform LRT and AIC model selection.
- cmp
- Use a series of LR tests to decide if dN and dS rate distributions are the
same or different between two codon alignments.
- caln
- Align coding sequences to reference (assuming star topology) using a
codon-based dynamic programming algorithm (good for fixing multiple
frameshifts). Designed with within-patient HIV sequences in mind.
- clg
- Remove 'gappy' sites from alignments based on a user-specified gap
threshold.
- cln
- Convert sequence names to HyPhy valid identifiers if needed and replace
stop codons with gaps in codon data if any are present.
- clsr
- Partition sequences into clusters based on a distance matrix.
- clst
- Apply clustering methods for phylogeny reconstruction
(UPGMA,WPGMA,complete or minimal linkage) to nucleotide, protein and codon
data, using MLE of pairwise distances with user-selectable models. These
methods produce trees with global molecular clock.
- leisr
- Infer relative evolutionary rates on a nucleotide or protein alignment, in
a spirit similar to Rate4Site (PMID: 12169533).
- lz
- Compute Lempel-Ziv complexity and entropy of (possibly unaligned)
sequences
- lclk
- Test for the presence of a local molecular clock. Every subtree of the
given tree is subjected to the clock constraint, while the remainder of
the tree is free of the clock constraint.
- lht
- A Likelihood Ratio Test to detect conflicting phylogenetic signal
(Huelsenbeck and Bull, 1996). [Contributed by Olivier Fedrigo].
- tc
- Test whether a group of sequences in a sample cluster together
- ts
- Perform an exhaustive tree space search for nucleotide or protein data
with user-selectable models. Should only be used for data sets with fewer
than 10 taxa!
- dtr
- Read sequence data (#,PHYLIP,NEXUS) and convert to a different format
- dist
- Generate a pairwise sequence distance matrix in PHYLIP format.
- kh
- Perform a Kishino-Hasegawa test on two competing phylogenies
- ub
- Obtain an upper bound on the likelihood score of a dataset.
- nuc
- Compare all 203 reversible nucleotide models and perform LRT and AIC model
selection.
- nj
- Perform a phylogeny reconstruction for nucleotide, protein or codon data
with user-selectable models using the method of neighbor joining.
- ny
- Test for positive selection using the approach of Nielsen and Yang, by
sampling global dN/dS from an array of distributions, and using Bayesian
posterior to identify the sites with dN/dS>1.
- gard
- [GARD] Screen an alignment using GARD (requires an MPI environment).
This manpage was written by Nilesh Patra for the Debian distribution and
can be used for any other usage of the program.