From: Don Armstrong
Date: Thu, 21 Jan 2016 17:31:15 +0000 (-0600)
Subject: import mrbayes
X-Git-Tag: mrbayes/3.2.6
X-Git-Url: https://git.donarmstrong.com/?p=mrbayes.git;a=commitdiff_plain;h=c79073ab0910f9a0e35e8a9c1f3cecf6c88a3606

import mrbayes
---
c79073ab0910f9a0e35e8a9c1f3cecf6c88a3606
diff --git a/documentation/Manual_MrBayes_v3.2.pdf b/documentation/Manual_MrBayes_v3.2.pdf
new file mode 100644
index 0000000..91a1e62
Binary files /dev/null and b/documentation/Manual_MrBayes_v3.2.pdf differ
diff --git a/documentation/commref_mb3.2.txt b/documentation/commref_mb3.2.txt
new file mode 100644
index 0000000..481dc3e
--- /dev/null
+++ b/documentation/commref_mb3.2.txt
@@ -0,0 +1,3305 @@
+
+
+
+
+                   Command Reference for MrBayes ver. 3.2.6
+
+                 (c) John P. Huelsenbeck, Fredrik Ronquist
+                             and Maxim Teslenko
+
+
+   ***************************************************************************
+   *                                                                         *
+   *  1. Command summary                                                     *
+   *                                                                         *
+   ***************************************************************************
+
+   ---------------------------------------------------------------------------
+   Commands that are available from the command
+   line or from a MrBayes block include:
+
+   About            -- Describes the program
+   Acknowledgments  -- Shows program acknowledgments
+   Calibrate        -- Assigns dates to terminals or interior nodes
+   Charset          -- Assigns a group of sites to a set
+   Charstat         -- Shows status of characters
+   Citations        -- Citation of program, models, and methods
+   Comparetree      -- Compares the trees from two tree files
+   Constraint       -- Defines a constraint on tree topology
+   Ctype            -- Assigns ordering for the characters
+   Databreaks       -- Defines data breaks for autodiscrete gamma model
+   Delete           -- Deletes taxa from the analysis
+   Disclaimer       -- Describes program disclaimer
+   Exclude          -- Excludes sites from the analysis
+   Execute          -- Executes a file
+   Help             -- Provides detailed description of commands
+   Include          -- Includes sites
+   Link             -- Links parameters across character partitions
+   Log              -- Logs screen output to a file
+   Lset             -- Sets the parameters of the likelihood model
+   Manual           -- Prints a command reference to a text file
+   Mcmc             -- Starts Markov chain Monte Carlo analysis
+   Mcmcp            -- Sets parameters of a chain (without starting analysis)
+   Outgroup         -- Changes outgroup taxon
+   Pairs            -- Defines nucleotide pairs (doublets) for stem models
+   Partition        -- Assigns a character partition
+   Plot             -- Plots parameters from MCMC analysis
+   Prset            -- Sets the priors for the parameters
+   Propset          -- Sets proposal probabilities and tuning parameters
+   Quit             -- Quits the program
+   Report           -- Controls how model parameters are reported
+   Restore          -- Restores taxa
+   Set              -- Sets run conditions and defines active data partition
+   Showbeagle       -- Show available BEAGLE resources
+   Showmatrix       -- Shows current character matrix
+   Showmcmctrees    -- Shows trees used in mcmc analysis
+   Showmodel        -- Shows model settings
+   Showmoves        -- Shows moves for current model
+   Showparams       -- Shows parameters in current model
+   Showusertrees    -- Shows user-defined trees
+   Speciespartition -- Defines a partition of tips into species
+   Ss               -- Starts stepping-stone sampling
+   Ssp              -- Sets parameters of stepping-stone analysis (without starting)
+   Startvals        -- Sets starting values of parameters
+   Sump             -- Summarizes parameters from MCMC analysis
+   Sumss            -- Summarizes parameters from stepping-stone analysis
+   Sumt             -- Summarizes trees from MCMC analysis
+   Taxastat         -- Shows status of taxa
+   Taxset           -- Assigns a group of taxa to a set
+   Unlink           -- Unlinks parameters across character partitions
+   Version          -- Shows program version
+
+   Commands that should be in a NEXUS file (data
+   block, trees block or taxa block) include:
+
+   Begin            -- Denotes beginning of block in file
+   Dimensions       -- Defines size of character matrix
+   End              -- Denotes end of a block in file
+   Endblock         -- Alternative way of denoting end of a block
+   Format           -- Defines character format in data block
+   Matrix           -- Defines matrix of characters in data block
+   Taxlabels        -- Defines taxon labels
+   Translate        -- Defines alternative names for taxa
+   Tree             -- Defines a tree
+
+   Note that this program supports the use of the shortest unambiguous
+   spelling of the above commands (e.g., "exe" instead of "execute").
+   ---------------------------------------------------------------------------
+
+   ***************************************************************************
+   *                                                                         *
+   *  2. MrBayes commands                                                    *
+   *                                                                         *
+   ***************************************************************************
+
+   ---------------------------------------------------------------------------
+   About
+
+   This command provides some general information about the program.
+   ---------------------------------------------------------------------------
+   ---------------------------------------------------------------------------
+   Acknowledgments
+
+   This command shows the authors' acknowledgments.
+   ---------------------------------------------------------------------------
+   ---------------------------------------------------------------------------
+   Calibrate
+
+   This command dates a terminal or interior node in the tree. The format is
+
+      calibrate <node_name> = <age_prior>
+
+   where <node_name> is the name of a defined interior constraint node or the
+   name of a terminal node (tip) and <age_prior> is a prior probability distribu-
+   tion on the age of the node. The latter can either be a fixed date or a date
+   drawn from one of the available prior probability distributions. In general,
+   the available prior probability distributions are parameterized in terms of
+   the expected mean age of the distribution to make them easier to use. Some
+   distributions put a positive probability on all ages above 0.0, while others
+   include a minimum-age constraint and sometimes a maximum-age constraint.
+   The available distributions and their parameters are:
+
+      calibrate <node_name> = fixed(<age>)
+      calibrate <node_name> = uniform(<min_age>,<max_age>)
+      calibrate <node_name> = offsetexponential(<min_age>,<mean_age>)
+      calibrate <node_name> = truncatednormal(<min_age>,<mean_age>,<stdev>)
+      calibrate <node_name> = lognormal(<mean_age>,<stdev>)
+      calibrate <node_name> = offsetlognormal(<min_age>,<mean_age>,<stdev>)
+      calibrate <node_name> = gamma(<mean_age>,<stdev>)
+      calibrate <node_name> = offsetgamma(<min_age>,<mean_age>,<stdev>)
+
+   Note that mean_age is always the mean age and stdev the standard deviation of
+   the distribution measured in user-defined time units. This way of specifying
+   the distribution parameters is often different from the parameterization used
+   elsewhere in the program. For instance, the standard parameters of the gamma
+   distribution used by MrBayes are shape (alpha) and rate (beta). If you want
+   to use the standard parameterization, the conversions are as follows:
+
+      exponential distribution: mean    = 1 / rate
+      gamma distribution:       mean    = alpha / beta
+                                st.dev. = square_root (alpha / beta^2)
+      lognormal distribution:   mean    = exp (mean_log + st.dev._log^2/2)
+                                st.dev. = square_root ((exp (st.dev._log^2) - 1)
+                                          * (exp (2*mean_log + st.dev._log^2)))
+
+   The truncated normal distribution is an exception in that the mean_age and
+   stdev parameters are the mean and standard deviation of the underlying non-
+   truncated normal distribution. The truncation will cause the modified distri-
+   bution to have a higher mean and lower standard deviation. The magnitude of
+   that effect depends on how much of the tail of the distribution is removed.
+
+   Note that prior to version 3.2.2, MrBayes used the standard rate parameter-
+   ization of the offset exponential. This should not cause a problem in most
+   cases because the old parameterization will result in an error in more recent
+   versions of MrBayes, and the likely source of the error is given in the error
+   message.
+
+   For a practical example, assume that we had three fossil terminals named
+   'FossilA', 'FossilB', and 'FossilC'.
+   Assume further that we want to fix the
+   age of FossilA to 100.0 million years, we think that FossilB is somewhere
+   between 100.0 and 200.0 million years old, and that FossilC is at least 300.0
+   million years old, possibly older but relatively unlikely to be more than
+   400.0 million years old. Then we might use the commands:
+
+      calibrate FossilA = fixed(100) FossilB = uniform(100,200)
+      calibrate FossilC = offsetexponential(300,400)
+
+   Note that it is possible to give more than one calibration for each
+   'calibrate' statement. Thus, 'calibrate FossilA=<prior> FossilB=<prior>'
+   would be a valid statement.
+
+   To actually use the calibrations to obtain dated trees, you also need to set
+   a clock model using relevant 'brlenspr' and 'nodeagepr' options of the 'prset'
+   command. You may also want to examine the 'clockvarpr' and 'clockratepr' op-
+   tions. Furthermore, you need to activate the relevant constraint(s) using
+   'topologypr', if you use any dated interior nodes in the tree.
+
+   You may wish to remove a calibration from an interior or terminal node that
+   has previously been calibrated. You can do that using
+
+      calibrate <node_name> = unconstrained
+
+
+   ---------------------------------------------------------------------------
+   ---------------------------------------------------------------------------
+   Charset
+
+   This command defines a character set. The format for the charset command
+   is
+
+      charset <name> = <character numbers>
+
+   For example, "charset first_pos = 1-720\3" defines a character set
+   called "first_pos" that includes every third site from 1 to 720.
+   The character set name cannot have any spaces in it. The backslash (\)
+   is a nifty way of telling the program to assign every third (or
+   second, or fifth, or whatever) character to the character set.
+   This option is best used not from the command line, but rather as a
+   line in the mrbayes block of a file. Note that you can use "." to
+   stand in for the last character (e.g., charset 1-.\3).
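To see which sites a character specification selects, the range syntax described for Charset can be expanded by hand. The sketch below is only an illustration written in Python, not code from MrBayes: the function name `expand_charset` and its `nchar` argument are hypothetical, and it assumes that "." means the last character and that "\n" keeps every n-th character counted from the start of the range.

```python
import re


def expand_charset(spec: str, nchar: int) -> list[int]:
    """Expand a spec such as '1-720\\3' into a list of 1-based site numbers.

    Hypothetical helper for illustration; nchar is the number of characters
    in the matrix and is what '.' is taken to stand for.
    """
    # Allow spaces around the dash, as in "54 - 67".
    spec = re.sub(r"\s*-\s*", "-", spec.strip())
    sites = []
    for token in spec.split():
        step = 1
        if "\\" in token:
            # "1-720\3" -> range part "1-720" and step 3
            token, step_text = token.split("\\")
            step = int(step_text)
        first, _, last = token.partition("-")
        start = nchar if first == "." else int(first)
        if not last:
            stop = start              # a single site, e.g. "22"
        elif last == ".":
            stop = nchar              # open-ended range, e.g. "3-."
        else:
            stop = int(last)
        sites.extend(range(start, stop + 1, step))
    return sites


first_pos = expand_charset("1-720\\3", 720)
print(first_pos[:4], len(first_pos))
```

Under these assumptions, "1-720\3" selects sites 1, 4, 7, and so on up to 718, which is the first codon position when the alignment starts in frame.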
+   ---------------------------------------------------------------------------
+   ---------------------------------------------------------------------------
+   Charstat
+
+   This command shows the status of all the characters. The correct usage
+   is
+
+      charstat
+
+   After typing "charstat", the character number, whether it is excluded
+   or included, and the partition identity are shown. The output is paused
+   every 100 characters. This pause can be turned off by setting autoclose
+   to "yes" (set autoclose=yes).
+   ---------------------------------------------------------------------------
+   ---------------------------------------------------------------------------
+   Citations
+
+   This command shows a thorough list of citations you may consider using
+   when publishing the results of a MrBayes analysis.
+   ---------------------------------------------------------------------------
+   ---------------------------------------------------------------------------
+   Comparetree
+
+   This command compares the trees in two files, called "filename1" and
+   "filename2". It will output a bivariate plot of the split frequencies
+   as well as plots of the tree distance as a function of the generation. The
+   plots can be used to get a quick indication of whether two runs have con-
+   verged onto the same set of trees. The "Comparetree" command will also
+   produce a ".pairs" file and a ".dists" file (these file endings are added
+   to the end of the "Outputname"). The ".pairs" file contains the paired
+   split frequencies from the two tree samples; the ".dists" file contains the
+   tree distance values.
+
+   Note that the "Sumt" command provides a different set of convergence diag-
+   nostics tools that you may also want to explore. Unlike "Comparetree",
+   "Sumt" can compare more than two tree samples and will calculate consensus
+   trees and split frequencies from the pooled samples.
+
+   Options:
+
+   Relburnin     -- If this option is set to 'Yes', then a proportion of the
+                    samples will be discarded as burnin when calculating summary
+                    statistics. The proportion to be discarded is set with
+                    Burninfrac (see below). When the Relburnin option is set to
+                    'No', then a specific number of samples is discarded instead.
+                    This number is set by Burnin (see below). Note that the
+                    burnin setting is shared with the 'mcmc', 'sumt', 'sump' and
+                    'plot' commands.
+   Burnin        -- Determines the number of samples (not generations) that will
+                    be discarded when summary statistics are calculated. The
+                    value of this option is only relevant when Relburnin is set
+                    to 'No'.
+   BurninFrac    -- Determines the fraction of samples that will be discarded
+                    when summary statistics are calculated. The value of this
+                    option is only relevant when Relburnin is set to 'Yes'.
+                    Example: A value for this option of 0.25 means that 25% of
+                    the samples will be discarded.
+   Minpartfreq   -- The minimum probability of partitions to include in summary
+                    statistics.
+   Filename1     -- The name of the first tree file to compare.
+   Filename2     -- The name of the second tree file to compare.
+   Outputname    -- Name of the file to which 'comparetree' results will be
+                    printed.
+
+   Current settings:
+
+   Parameter       Options                  Current Setting
+   --------------------------------------------------------
+   Relburnin       Yes/No                   Yes
+   Burnin          <number>                 0
+   Burninfrac      <number>                 0.25
+   Minpartfreq     <number>                 0.00
+   Filename1       <name>                   temp.t
+   Filename2       <name>                   temp.t
+   Outputname      <name>                   temp.comp
+
+   ---------------------------------------------------------------------------
+   ---------------------------------------------------------------------------
+   Constraint
+
+   This command defines a tree constraint. The format for the constraint
+   command is
+
+      constraint <constraint name> [hard|negative|partial] = <taxon list> [:<taxon list>]
+
+   There are three types of constraint implemented in MrBayes.
+   The type of the
+   constraint is specified by using one of the three keywords 'hard', 'negative',
+   or 'partial' right after the name of the constraint. If no type is specified,
+   then the constraint is assumed to be 'hard'.
+
+   In a rooted tree, a 'hard' constraint forces the taxa in the list to form a
+   monophyletic group. In an unrooted tree, the taxon split that separates the
+   taxa in the list from other taxa is forced to be present. The interpretation
+   of this depends on whether the tree is rooted on a taxon outside the list or
+   a taxon in the list. If the outgroup is excluded, the taxa in the list are
+   assumed to form a monophyletic group, but if the outgroup is included, the
+   taxa that are not in the list are forced together.
+
+   A 'negative' constraint bans all the trees that have the listed taxa in the
+   same subtree. In other words, it is the opposite of a hard constraint.
+
+   A 'partial' or backbone constraint is defined in terms of two sets of taxa
+   separated by a colon character. The constraint forces all taxa in the first
+   list to form a monophyletic group that does not include any taxon in the
+   second list. Taxa that are not included in either list can be placed in any
+   position on the tree, either inside or outside the constrained group. In an
+   unrooted tree, the two taxon lists can be switched with each other with no
+   effect. For a rooted tree, it is the taxa in the first list that have to be
+   monophyletic, that is, these taxa must share a common ancestor not shared with
+   any taxon in the second list. The taxa in the second list may or may not fall
+   in a monophyletic group depending on the rooting of the tree.
+
+   A list of taxa can be specified using a taxset, taxon names, taxon numbers, or
+   any combination of the above, separated by spaces. The constraint is treated
+   as an absolute requirement of trees, that is, trees that are not compatible
+   with the constraint have zero prior (and hence zero posterior) probability.
+
+   If you are interested in inferring ancestral states for a particular node,
+   you need to 'hard' constrain that node first using the 'constraint' command.
+   The same applies if you wish to calibrate an interior node in a dated
+   analysis. For more information on how to infer ancestral states, see the help
+   for the 'report' command. For more on dating, see the 'calibrate' command.
+
+   It is important to note that simply defining a constraint using this
+   command is not sufficient for the program to actually implement the
+   constraint in an analysis. You must also enforce the constraints using
+   'prset topologypr = constraints (<list of constraints>)'. For more infor-
+   mation on this, see the help on the 'prset' command.
+
+   Examples:
+
+      constraint myclade = Homo Pan Gorilla
+
+   Defines a hard constraint forcing Homo, Pan, and Gorilla to form a mono-
+   phyletic group or a split that does not include any other taxa.
+
+      constraint forbiddenclade negative = Homo Pan Gorilla
+
+   Defines a negative constraint that associates all trees where Homo, Pan, and
+   Gorilla form a monophyletic group with zero posterior probability. In other
+   words, such trees will not be sampled during MCMC.
+
+      constraint backbone partial = Homo Gorilla : Mus
+
+   Defines a partial constraint that keeps Mus outside of the clade defined by
+   the most recent common ancestor of Homo and Gorilla. Other taxa are allowed to
+   sit anywhere in the tree. Note that this particular constraint is meaningless
+   in unrooted trees. MrBayes does not assume anything about the position of the
+   outgroup unless it is explicitly included in the partial constraint. Therefore
+   a partial constraint must have at least two taxa on each side of the ':' to be
+   useful in analyses of unrooted trees. The case is different for rooted trees,
+   where it is sufficient for a partial constraint to have more than one taxon
+   before the ':', as in the example given above, to constrain tree space.
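The three constraint types can be made concrete for rooted trees with a small check. This is a sketch in Python for illustration only, not how MrBayes implements constraints: trees are represented as nested tuples of taxon names, the helper names `clades`, `mrca_clade` and `satisfies` are hypothetical, and the unrooted case discussed above is not covered.

```python
def clades(tree):
    """Return (tips of this subtree, tip sets of every node inside it)."""
    if isinstance(tree, str):          # a tip is just its taxon name
        tips = frozenset([tree])
        return tips, [tips]
    tips, found = frozenset(), []
    for child in tree:
        child_tips, child_found = clades(child)
        tips |= child_tips
        found.extend(child_found)
    found.append(tips)
    return tips, found


def mrca_clade(tree, taxa):
    """Tip set below the most recent common ancestor of the given taxa."""
    taxa = frozenset(taxa)
    _, all_clades = clades(tree)
    # Clades containing all the taxa are nested, so the smallest is the MRCA's.
    return min((c for c in all_clades if taxa <= c), key=len)


def satisfies(tree, kind, taxa, excluded=()):
    """Check a rooted tree against a hard, negative or partial constraint."""
    clade = mrca_clade(tree, taxa)
    if kind == "hard":                 # listed taxa form exactly one clade
        return clade == frozenset(taxa)
    if kind == "negative":             # listed taxa must not form a clade
        return clade != frozenset(taxa)
    if kind == "partial":              # MRCA clade holds no excluded taxon
        return clade.isdisjoint(excluded)
    raise ValueError(f"unknown constraint type: {kind}")


tree = ((("Homo", "Pan"), "Gorilla"), "Mus")
print(satisfies(tree, "hard", ["Homo", "Pan", "Gorilla"]))
print(satisfies(tree, "partial", ["Homo", "Gorilla"], excluded=["Mus"]))
```

Note that in the partial case Pan, which is in neither list, is free to fall inside the constrained clade, as the text describes.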
+
+   To define a more complex constraint tree, simply combine constraints into a
+   list when issuing the 'prset topologypr' command.
+
+
+   --------------------------------------------------------------------------
+   ---------------------------------------------------------------------------
+   Ctype
+
+   This command sets the character ordering for standard-type data. The
+   correct usage is:
+
+      ctype <ordering>:<characters>
+
+   The available options for the <ordering> specifier are:
+
+      unordered    -- Movement directly from one state to another is
+                      allowed in an instant of time.
+      ordered      -- Movement is only allowed between adjacent states.
+                      For example, perhaps only between 0 <-> 1 and 1 <-> 2
+                      for a three state character ordered as 0 - 1 - 2.
+      irreversible -- Rates of change for losses are 0.
+
+   The characters to which the ordering is applied are specified in a manner
+   that is identical to commands such as "include" or "exclude". For
+   example,
+
+      ctype ordered: 10 23 45
+
+   defines characters 10, 23, and 45 to be of type ordered. Similarly,
+
+      ctype irreversible: 54 - 67 71-92
+
+   defines characters 54 to 67 and characters 71 to 92 to be of type
+   irreversible. You can use the "." to denote the last character, and
+   "all" to denote all of the characters. Finally, you can use the
+   specifier "\" to apply the ordering to every n-th character or
+   you can use predefined charsets to specify the character.
+
+   Only one ordering can be used on any specific application of ctype.
+   If you want to apply different orderings to different characters, then
+   you need to use ctype multiple times. For example,
+
+      ctype ordered: 1-50
+      ctype irreversible: 51-100
+
+   sets characters 1 to 50 to be ordered and characters 51 to 100 to be
+   irreversible.
+
+   The ctype command is only sensible with morphological (here called
+   "standard") characters. The program ignores attempts to apply char-
+   acter orderings to other types of characters, such as DNA characters.
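The difference between the three orderings can be pictured as the set of instantaneous state changes each one permits. The following Python sketch is an illustration under stated assumptions rather than MrBayes code: the function `allowed_changes` is hypothetical, states are numbered 0 to nstates-1, and a "loss" under the irreversible ordering is assumed to mean any change from a higher-numbered to a lower-numbered state.

```python
def allowed_changes(ordering: str, nstates: int) -> set[tuple[int, int]]:
    """Return the (from_state, to_state) pairs with a non-zero rate.

    Hypothetical helper; 'irreversible' is interpreted as forbidding every
    change to a lower-numbered state, which is an assumption of this sketch.
    """
    pairs = {(i, j) for i in range(nstates) for j in range(nstates) if i != j}
    if ordering == "unordered":
        return pairs                                    # any direct change
    if ordering == "ordered":
        return {(i, j) for i, j in pairs if abs(i - j) == 1}
    if ordering == "irreversible":
        return {(i, j) for i, j in pairs if j > i}      # gains only
    raise ValueError(f"unknown ordering: {ordering}")


for name in ("unordered", "ordered", "irreversible"):
    print(name, sorted(allowed_changes(name, 3)))
```

For the three-state character ordered as 0 - 1 - 2 in the text, the ordered case leaves only 0 <-> 1 and 1 <-> 2, so a change from 0 to 2 has to pass through state 1.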
+   ---------------------------------------------------------------------------
+   ---------------------------------------------------------------------------
+   Databreaks
+
+   This command is used to specify breaks in your input data matrix. Your
+   data may be a mixture of genes or a mixture of different types of data.
+   Some of the models implemented by MrBayes account for nonindependence at
+   adjacent characters. The autocorrelated gamma model, for example, allows
+   rates at adjacent sites to be correlated. However, there is no way for
+   such a model to tell whether two sites, adjacent in the matrix, are
+   actually separated by many kilobases or megabases in the genome. The
+   databreaks command allows you to specify such breaks. The correct
+   usage is:
+
+      databreaks <break 1> <break 2> <break 3> ...
+
+   For example, say you have a data matrix of 3204 characters that include
+   nucleotide data from three genes. The first gene covers characters 1 to
+   970, the second gene covers characters 971 to 2567, and the third gene
+   covers characters 2568 to 3204. Also, let's assume that the genes are
+   not directly adjacent to one another in the genome, as might be likely
+   if you have mitochondrial sequences. In this case, you can specify
+   breaks between the genes using:
+
+      databreaks 970 2567;
+
+   The first break, between genes one and two, is after character 970 and
+   the second break, between genes two and three, is after character 2567.
+   ---------------------------------------------------------------------------
+   ---------------------------------------------------------------------------
+   Delete
+
+   This command deletes taxa from the analysis. The correct usage is:
+
+      delete <name and/or number and/or taxset> ...
+
+   A list of the taxon names or taxon numbers (labelled 1 to ntax in the order
+   in the matrix) or taxset(s) can be used. For example, the following:
+
+      delete 1 2 Homo_sapiens
+
+   deletes taxa 1, 2, and the taxon labelled Homo_sapiens from the analysis.
+   You can also use "all" to delete all of the taxa.
+   For example,
+
+      delete all
+
+   deletes all of the taxa from the analysis. Of course, a phylogenetic anal-
+   ysis that does not include any taxa is fairly uninteresting.
+   ---------------------------------------------------------------------------
+   ---------------------------------------------------------------------------
+   Disclaimer
+
+   This command shows the disclaimer for the program. In short, the disclaimer
+   states that the authors are not responsible for any silly things you may do
+   to your computer or any unforeseen but possibly nasty things the computer
+   program may inadvertently do to you.
+   ---------------------------------------------------------------------------
+   ---------------------------------------------------------------------------
+   Exclude
+
+   This command excludes characters from the analysis. The correct usage is
+
+      exclude <number> <number> <number>
+
+   or
+
+      exclude <number> - <number>
+
+   or
+
+      exclude <charset>
+
+   or some combination thereof. Moreover, you can use the specifier "\" to
+   exclude every nth character. For example, the following
+
+      exclude 1-100\3
+
+   would exclude every third character from 1 to 100. As a specific example,
+
+      exclude 2 3 10-14 22
+
+   excludes sites 2, 3, 10, 11, 12, 13, 14, and 22 from the analysis. Also,
+
+      exclude all
+
+   excludes all of the characters from the analysis. Excluding all characters
+   does not leave you much information for inferring phylogeny.
+   ---------------------------------------------------------------------------
+   ---------------------------------------------------------------------------
+   Execute
+
+   This command executes a file called <file name>. The correct usage is:
+
+      execute <file name>
+
+   For example,
+
+      execute replicase.nex
+
+   would execute the file named "replicase.nex". This file must be in the
+   same directory as the executable.
+   ---------------------------------------------------------------------------
+   ---------------------------------------------------------------------------
+   Help
+
+   This command provides useful information on the use of this program. The
+   correct usage is
+
+      help
+
+   which gives a list of all available commands with a brief description of
+   each or
+
+      help <command>
+
+   which gives detailed information on the use of <command>.
+   ---------------------------------------------------------------------------
+   ---------------------------------------------------------------------------
+   Include
+
+   This command includes characters that were previously excluded from the
+   analysis. The correct usage is
+
+      include <number> <number> <number>
+
+   or
+
+      include <number> - <number>
+
+   or
+
+      include <charset>
+
+   or some combination thereof. Moreover, you can use the specifier "\" to
+   include every nth character. For example, the following
+
+      include 1-100\3
+
+   would include every third character from 1 to 100. As a specific example,
+
+      include 2 3 10-14 22
+
+   includes sites 2, 3, 10, 11, 12, 13, 14, and 22 in the analysis. Also,
+
+      include all
+
+   includes all of the characters in the analysis. Including all of the
+   characters (even if many of them are bad) is a very total-evidence-like
+   thing to do. Doing this will make a certain group of people very happy.
+   On the other hand, simply using this program would make those same people
+   unhappy.
+   ---------------------------------------------------------------------------
+   ---------------------------------------------------------------------------
+   Link
+
+   This command links model parameters across partitions of the data.
+   The correct usage is:
+
+      link <parameter name> = (<all> or <partition list>)
+
+   The list of parameters that can be linked includes:
+
+      Tratio          -- Transition/transversion rate ratio
+      Revmat          -- Substitution rates of GTR model
+      Omega           -- Nonsynonymous/synonymous rate ratio
+      Statefreq       -- Character state frequencies
+      Shape           -- Gamma/LNorm shape parameter
+      Pinvar          -- Proportion of invariable sites
+      Correlation     -- Correlation parameter of autodiscrete gamma
+      Ratemultiplier  -- Rate multiplier for partitions
+      Switchrates     -- Switching rates for covarion model
+      Topology        -- Topology of tree
+      Brlens          -- Branch lengths of tree
+      Speciationrate  -- Speciation rates for birth-death process
+      Extinctionrate  -- Extinction rates for birth-death process
+      Popsize         -- Population size for coalescence process
+      Growthrate      -- Growth rate of coalescence process
+      Aamodel         -- Aminoacid rate matrix
+      Cpprate         -- Rate of Compound Poisson Process (CPP)
+      Cppmultdev      -- Standard dev. of CPP rate multipliers (log scale)
+      Cppevents       -- CPP events
+      TK02var         -- Variance increase in TK02 relaxed clock model
+      Igrvar          -- Variance increase in IGR relaxed clock model
+      Mixedvar        -- Variance increase in Mixed relaxed clock model
+
+   For example,
+
+      link shape=(all)
+
+   links the gamma/lnorm shape parameter across all partitions of the data.
+   You can use "showmodel" to see the current linking status of the
+   characters. For more information on this command, see the help menu
+   for link's converse, unlink ("help unlink").
+   ---------------------------------------------------------------------------
+   ---------------------------------------------------------------------------
+   Log
+
+   This command allows output to the screen to also be output to a file.
+   The usage is:
+
+      log start/stop filename=<name> append/replace
+
+   The options are:
+
+   Start/Stop     -- Starts or stops logging of output to file.
+   Append/Replace -- Either append to or replace existing file.
+   Filename       -- Name of log file (currently, the name of the log
+                     file is "log.out").
+   ---------------------------------------------------------------------------
+   ---------------------------------------------------------------------------
+   Lset
+
+   This command sets the parameters of the likelihood model. The likelihood
+   function is the probability of observing the data conditional on the phylo-
+   genetic model. In order to calculate the likelihood, you must assume a
+   model of character change. This command lets you tailor the biological
+   assumptions made in the phylogenetic model. The correct usage is
+
+      lset =