Command Reference for MrBayes ver. 3.2.6

(c) John P. Huelsenbeck, Fredrik Ronquist and Maxim Teslenko

   ***************************************************************************
   *                                                                         *
   *  1. Command summary                                                     *
   *                                                                         *
   ***************************************************************************

   ---------------------------------------------------------------------------
   Commands that are available from the command line
   or from a MrBayes block include:

   About            -- Describes the program
   Acknowledgments  -- Shows program acknowledgments
   Calibrate        -- Assigns dates to terminals or interior nodes
   Charset          -- Assigns a group of sites to a set
   Charstat         -- Shows status of characters
   Citations        -- Citation of program, models, and methods
   Comparetree      -- Compares the trees from two tree files
   Constraint       -- Defines a constraint on tree topology
   Ctype            -- Assigns ordering for the characters
   Databreaks       -- Defines data breaks for autodiscrete gamma model
   Delete           -- Deletes taxa from the analysis
   Disclaimer       -- Describes program disclaimer
   Exclude          -- Excludes sites from the analysis
   Execute          -- Executes a file
   Help             -- Provides detailed description of commands
   Include          -- Includes sites
   Link             -- Links parameters across character partitions
   Log              -- Logs screen output to a file
   Lset             -- Sets the parameters of the likelihood model
   Manual           -- Prints a command reference to a text file
   Mcmc             -- Starts Markov chain Monte Carlo analysis
   Mcmcp            -- Sets parameters of a chain (without starting analysis)
   Outgroup         -- Changes outgroup taxon
   Pairs            -- Defines nucleotide pairs (doublets) for stem models
   Partition        -- Assigns a character partition
   Plot             -- Plots parameters from MCMC analysis
   Prset            -- Sets the priors for the parameters
   Propset          -- Sets proposal probabilities and tuning parameters
   Quit             -- Quits the program
   Report           -- Controls how model parameters are reported
   Restore          -- Restores taxa
   Set              -- Sets run conditions and defines active data partition
   Showbeagle       -- Show available BEAGLE resources
   Showmatrix       -- Shows current character matrix
   Showmcmctrees    -- Shows trees used in mcmc analysis
   Showmodel        -- Shows model settings
   Showmoves        -- Shows moves for current model
   Showparams       -- Shows parameters in current model
   Showusertrees    -- Shows user-defined trees
   Speciespartition -- Defines a partition of tips into species
   Ss               -- Starts stepping-stone sampling
   Ssp              -- Sets parameters of stepping-stone analysis (without starting)
   Startvals        -- Sets starting values of parameters
   Sump             -- Summarizes parameters from MCMC analysis
   Sumss            -- Summarizes parameters from stepping-stone analysis
   Sumt             -- Summarizes trees from MCMC analysis
   Taxastat         -- Shows status of taxa
   Taxset           -- Assigns a group of taxa to a set
   Unlink           -- Unlinks parameters across character partitions
   Version          -- Shows program version

   Commands that should be in a NEXUS file (data
   block, trees block or taxa block) include:

   Begin            -- Denotes beginning of block in file
   Dimensions       -- Defines size of character matrix
   End              -- Denotes end of a block in file
   Endblock         -- Alternative way of denoting end of a block
   Format           -- Defines character format in data block
   Matrix           -- Defines matrix of characters in data block
   Taxlabels        -- Defines taxon labels
   Translate        -- Defines alternative names for taxa
   Tree             -- Defines a tree

   Note that this program supports the use of the shortest unambiguous
   spelling of the above commands (e.g., "exe" instead of "execute").
   ---------------------------------------------------------------------------

   ***************************************************************************
   *                                                                         *
   *  2. MrBayes commands                                                    *
   *                                                                         *
   ***************************************************************************

   ---------------------------------------------------------------------------
   About

   This command provides some general information about the program.
   ---------------------------------------------------------------------------
   ---------------------------------------------------------------------------
   Acknowledgments

   This command shows the authors' acknowledgments.
   ---------------------------------------------------------------------------
   ---------------------------------------------------------------------------
   Calibrate

   This command dates a terminal or interior node in the tree. The format is

      calibrate <node_name> = <age_prior>

   where <node_name> is the name of a defined interior constraint node or the
   name of a terminal node (tip) and <age_prior> is a prior probability
   distribution on the age of the node. The latter can either be a fixed date
   or a date drawn from one of the available prior probability distributions.
   In general, the available prior probability distributions are parameterized
   in terms of the expected mean age of the distribution, which makes them
   easier for users to specify. Some distributions put a positive probability
   on all ages above 0.0, while others include a minimum-age constraint and
   sometimes a maximum-age constraint. The available distributions and their
   parameters are:

      calibrate <node_name> = fixed(<age>)
      calibrate <node_name> = uniform(<min_age>,<max_age>)
      calibrate <node_name> = offsetexponential(<min_age>,<mean_age>)
      calibrate <node_name> = truncatednormal(<min_age>,<mean_age>,<stdev>)
      calibrate <node_name> = lognormal(<mean_age>,<stdev>)
      calibrate <node_name> = offsetlognormal(<min_age>,<mean_age>,<stdev>)
      calibrate <node_name> = gamma(<mean_age>,<stdev>)
      calibrate <node_name> = offsetgamma(<min_age>,<mean_age>,<stdev>)

   Note that mean_age is always the mean age and stdev the standard deviation
   of the distribution measured in user-defined time units. This way of
   specifying the distribution parameters is often different from the
   parameterization used elsewhere in the program. For instance, the standard
   parameters of the gamma distribution used by MrBayes are shape (alpha) and
   rate (beta). If you want to use the standard parameterization, the
   conversions are as follows:

      exponential distribution: mean    = 1 / rate
      gamma distribution:       mean    = alpha / beta
                                st.dev. = square_root (alpha / beta^2)
      lognormal distribution:   mean    = exp (mean_log + st.dev._log^2/2)
                                st.dev. = square_root ((exp (st.dev._log^2) - 1) *
                                          (exp (2*mean_log + st.dev._log^2)))

   The truncated normal distribution is an exception in that the mean_age and
   stdev parameters are the mean and standard deviation of the underlying
   non-truncated normal distribution. The truncation will cause the modified
   distribution to have a higher mean and lower standard deviation. The
   magnitude of that effect depends on how much of the tail of the
   distribution is removed.

   Note that before version 3.2.2, MrBayes used the standard rate
   parameterization of the offset exponential. This should not cause a problem
   in most cases because the old parameterization will result in an error in
   more recent versions of MrBayes, and the likely source of the error is
   given in the error message.

   For a practical example, assume that we had three fossil terminals named
   'FossilA', 'FossilB', and 'FossilC'. Assume further that we want to fix the
   age of FossilA to 100.0 million years, we think that FossilB is somewhere
   between 100.0 and 200.0 million years old, and that FossilC is at least
   300.0 million years old, possibly older but relatively unlikely to be more
   than 400.0 million years old. Then we might use the commands:

      calibrate FossilA = fixed(100) FossilB = uniform(100,200)
      calibrate FossilC = offsetexponential(300,400)

   Note that it is possible to give more than one calibration for each
   'calibrate' statement. Thus, 'calibrate FossilA=<setting> FossilB=<setting>'
   would be a valid statement.

   To actually use the calibrations to obtain dated trees, you also need to
   set a clock model using the relevant 'brlenspr' and 'nodeagepr' options of
   the 'prset' command. You may also want to examine the 'clockvarpr' and
   'clockratepr' options. Furthermore, you need to activate the relevant
   constraint(s) using 'topologypr' if you use any dated interior nodes in the
   tree.

   You may wish to remove a calibration from an interior or terminal node
   that has previously been calibrated.
   You can do that using

      calibrate <node_name> = unconstrained

   ---------------------------------------------------------------------------
   ---------------------------------------------------------------------------
   Charset

   This command defines a character set. The format for the charset command
   is

      charset <name> = <character numbers>

   For example, "charset first_pos = 1-720\3" defines a character set
   called "first_pos" that includes every third site from 1 to 720.
   The character set name cannot have any spaces in it. The backslash (\)
   tells the program to assign every third (or second, or fifth, or
   whatever) character to the character set. This option is best used
   not from the command line, but rather as a line in the mrbayes block
   of a file. Note that you can use "." to stand in for the last
   character (e.g., charset <name> = 1-.\3).
   ---------------------------------------------------------------------------
   ---------------------------------------------------------------------------
   Charstat

   This command shows the status of all the characters. The correct usage
   is

      charstat

   After typing "charstat", the character number, whether it is excluded
   or included, and the partition identity are shown. The output is paused
   every 100 characters. This pause can be turned off by setting autoclose
   to "yes" (set autoclose=yes).
   ---------------------------------------------------------------------------
   ---------------------------------------------------------------------------
   Citations

   This command shows a thorough list of citations you may consider using
   when publishing the results of a MrBayes analysis.
   ---------------------------------------------------------------------------
   ---------------------------------------------------------------------------
   Comparetree

   This command compares the trees in two files, called "filename1" and
   "filename2". It will output a bivariate plot of the split frequencies
   as well as plots of the tree distance as a function of the generation.
   The plots can be used to get a quick indication of whether two runs have
   converged onto the same set of trees. The "Comparetree" command will also
   produce a ".pairs" file and a ".dists" file (these file endings are added
   to the end of the "Outputname"). The ".pairs" file contains the paired
   split frequencies from the two tree samples; the ".dists" file contains
   the tree distance values.

   Note that the "Sumt" command provides a different set of convergence
   diagnostics tools that you may also want to explore. Unlike "Comparetree",
   "Sumt" can compare more than two tree samples and will calculate consensus
   trees and split frequencies from the pooled samples.

   Options:

   Relburnin     -- If this option is set to 'Yes', then a proportion of the
                    samples will be discarded as burnin when calculating
                    summary statistics. The proportion to be discarded is set
                    with Burninfrac (see below). When the Relburnin option is
                    set to 'No', then a specific number of samples is
                    discarded instead. This number is set by Burnin (see
                    below). Note that the burnin setting is shared with the
                    'mcmc', 'sumt', 'sump' and 'plot' commands.
   Burnin        -- Determines the number of samples (not generations) that
                    will be discarded when summary statistics are calculated.
                    The value of this option is only relevant when Relburnin
                    is set to 'No'.
   Burninfrac    -- Determines the fraction of samples that will be discarded
                    when summary statistics are calculated. The value of this
                    option is only relevant when Relburnin is set to 'Yes'.
                    Example: A value for this option of 0.25 means that 25% of
                    the samples will be discarded.
   Minpartfreq   -- The minimum probability of partitions to include in
                    summary statistics.
   Filename1     -- The name of the first tree file to compare.
   Filename2     -- The name of the second tree file to compare.
   Outputname    -- Name of the file to which 'comparetree' results will be
                    printed.
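
As a rough illustration of how Relburnin, Burnin and Burninfrac interact, the following Python sketch computes how many samples would be discarded from a tree sample. It is not MrBayes code: the function name `samples_to_discard` is hypothetical, and rounding a fractional count down is an assumption made for the example, since the text above does not say how MrBayes rounds.

```python
def samples_to_discard(n_samples, relburnin=True, burninfrac=0.25, burnin=0):
    """Return the number of samples (not generations) treated as burnin.

    relburnin=True  -> discard a fraction of the samples (Burninfrac).
    relburnin=False -> discard a fixed number of samples (Burnin).
    Truncating the fractional count is an assumption of this sketch.
    """
    if n_samples < 0:
        raise ValueError("n_samples must be non-negative")
    if relburnin:
        if not 0.0 <= burninfrac < 1.0:
            raise ValueError("burninfrac must be in [0, 1)")
        return int(n_samples * burninfrac)
    if burnin < 0:
        raise ValueError("burnin must be non-negative")
    # A fixed burnin can never remove more samples than exist.
    return min(burnin, n_samples)


# Default settings (Relburnin=Yes, Burninfrac=0.25) on 1000 sampled trees.
print(samples_to_discard(1000))
# Relburnin=No with Burnin=100 discards exactly 100 samples.
print(samples_to_discard(1000, relburnin=False, burnin=100))
```

The point of the sketch is that only one of the two settings is consulted at a time: Burninfrac is ignored when Relburnin is 'No', and Burnin is ignored when it is 'Yes'.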

   Current settings:

   Parameter       Options                  Current Setting
   --------------------------------------------------------
   Relburnin       Yes/No                   Yes
   Burnin          <number>                 0
   Burninfrac      <number>                 0.25
   Minpartfreq     <number>                 0.00
   Filename1       <name>                   temp.t
   Filename2       <name>                   temp.t
   Outputname      <name>                   temp.comp
   ---------------------------------------------------------------------------
   ---------------------------------------------------------------------------
   Constraint

   This command defines a tree constraint. The format for the constraint
   command is

      constraint <constraint name> [hard|negative|partial] = <taxon list> [:<taxon list>]

   There are three types of constraint implemented in MrBayes. The type of
   the constraint is specified by using one of the three keywords 'hard',
   'negative', or 'partial' right after the name of the constraint. If no
   type is specified, then the constraint is assumed to be 'hard'.

   In a rooted tree, a 'hard' constraint forces the taxa in the list to form
   a monophyletic group. In an unrooted tree, the taxon split that separates
   the taxa in the list from other taxa is forced to be present. The
   interpretation of this depends on whether the tree is rooted on a taxon
   outside the list or a taxon in the list. If the outgroup is excluded, the
   taxa in the list are assumed to form a monophyletic group, but if the
   outgroup is included, the taxa that are not in the list are forced
   together.

   A 'negative' constraint bans all the trees that have the listed taxa in
   the same subtree. In other words, it is the opposite of a hard constraint.

   A 'partial' or backbone constraint is defined in terms of two sets of taxa
   separated by a colon character. The constraint forces all taxa in the
   first list to form a monophyletic group that does not include any taxon in
   the second list. Taxa that are not included in either list can be placed
   in any position on the tree, either inside or outside the constrained
   group. In an unrooted tree, the two taxon lists can be switched with each
   other with no effect.
   For a rooted tree, it is the taxa in the first list that have to be
   monophyletic, that is, these taxa must share a common ancestor not shared
   with any taxon in the second list. The taxa in the second list may or may
   not fall in a monophyletic group depending on the rooting of the tree.

   A list of taxa can be specified using a taxset, taxon names, taxon
   numbers, or any combination of the above, separated by spaces. The
   constraint is treated as an absolute requirement of trees, that is, trees
   that are not compatible with the constraint have zero prior (and hence
   zero posterior) probability.

   If you are interested in inferring ancestral states for a particular node,
   you need to 'hard' constrain that node first using the 'constraint'
   command. The same applies if you wish to calibrate an interior node in a
   dated analysis. For more information on how to infer ancestral states, see
   the help for the 'report' command. For more on dating, see the 'calibrate'
   command.

   It is important to note that simply defining a constraint using this
   command is not sufficient for the program to actually implement the
   constraint in an analysis. You must also enforce the constraints using
   'prset topologypr = constraints (<list of constraints>)'. For more
   information on this, see the help on the 'prset' command.

   Examples:

      constraint myclade = Homo Pan Gorilla

   Defines a hard constraint forcing Homo, Pan, and Gorilla to form a
   monophyletic group or a split that does not include any other taxa.

      constraint forbiddenclade negative = Homo Pan Gorilla

   Defines a negative constraint that associates all trees where Homo, Pan,
   and Gorilla form a monophyletic group with zero posterior probability. In
   other words, such trees will not be sampled during MCMC.

      constraint backbone partial = Homo Gorilla : Mus

   Defines a partial constraint that keeps Mus outside of the clade defined
   by the most recent common ancestor of Homo and Gorilla. Other taxa are
   allowed to sit anywhere in the tree.
   Note that this particular constraint is meaningless in unrooted trees.
   MrBayes does not assume anything about the position of the outgroup unless
   it is explicitly included in the partial constraint. Therefore a partial
   constraint must have at least two taxa on each side of the ':' to be
   useful in analyses of unrooted trees. The case is different for rooted
   trees, where it is sufficient for a partial constraint to have more than
   one taxon before the ':', as in the example given above, to constrain tree
   space.

   To define a more complex constraint tree, simply combine constraints into
   a list when issuing the 'prset topologypr' command.
   ---------------------------------------------------------------------------
   ---------------------------------------------------------------------------
   Ctype

   This command sets the character ordering for standard-type data. The
   correct usage is:

      ctype <ordering>:<characters>

   The available options for the <ordering> specifier are:

      unordered    -- Movement directly from one state to another is
                      allowed in an instant of time.
      ordered      -- Movement is only allowed between adjacent states.
                      For example, perhaps only between 0 <-> 1 and 1 <-> 2
                      for a three state character ordered as 0 - 1 - 2.
      irreversible -- Rates of change for losses are 0.

   The characters to which the ordering is applied are specified in a manner
   that is identical to commands such as "include" or "exclude". For
   example,

      ctype ordered: 10 23 45

   defines characters 10, 23, and 45 to be of type ordered. Similarly,

      ctype irreversible: 54 - 67 71-92

   defines characters 54 to 67 and characters 71 to 92 to be of type
   irreversible. You can use the "." to denote the last character, and
   "all" to denote all of the characters. Finally, you can use the
   specifier "\" to apply the ordering to every n-th character or
   you can use predefined charsets to specify the character.

   Only one ordering can be used on any specific application of ctype.
   If you want to apply different orderings to different characters, then
   you need to use ctype multiple times. For example,

      ctype ordered: 1-50
      ctype irreversible: 51-100

   sets characters 1 to 50 to be ordered and characters 51 to 100 to be
   irreversible.

   The ctype command is only sensible with morphological (here called
   "standard") characters. The program ignores attempts to apply character
   orderings to other types of characters, such as DNA characters.
   ---------------------------------------------------------------------------
   ---------------------------------------------------------------------------
   Databreaks

   This command is used to specify breaks in your input data matrix. Your
   data may be a mixture of genes or a mixture of different types of data.
   Some of the models implemented by MrBayes account for nonindependence at
   adjacent characters. The autocorrelated gamma model, for example, allows
   rates at adjacent sites to be correlated. However, there is no way for
   such a model to tell whether two sites, adjacent in the matrix, are
   actually separated by many kilobases or megabases in the genome. The
   databreaks command allows you to specify such breaks. The correct
   usage is:

      databreaks <break 1> <break 2> <break 3> ...

   For example, say you have a data matrix of 3204 characters that include
   nucleotide data from three genes. The first gene covers characters 1 to
   970, the second gene covers characters 971 to 2567, and the third gene
   covers characters 2568 to 3204. Also, let's assume that the genes are
   not directly adjacent to one another in the genome, as might be likely
   if you have mitochondrial sequences. In this case, you can specify
   breaks between the genes using:

      databreaks 970 2567;

   The first break, between genes one and two, is after character 970 and
   the second break, between genes two and three, is after character 2567.
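
The relationship between break positions and gene boundaries in the example above can be checked with a short Python sketch. This is only an illustration of the arithmetic, not MrBayes code; the function name `gene_segments` is hypothetical.

```python
def gene_segments(nchar, breaks):
    """Return the (first, last) character ranges implied by break positions.

    Each break is the number of the last character before the break, as in
    'databreaks 970 2567' for a matrix of 3204 characters.
    """
    segments = []
    start = 1
    for position in sorted(breaks):
        # A break must fall inside the matrix and after the previous break.
        if not start <= position < nchar:
            raise ValueError(f"invalid break position: {position}")
        segments.append((start, position))
        start = position + 1
    segments.append((start, nchar))
    return segments


print(gene_segments(3204, [970, 2567]))
# prints [(1, 970), (971, 2567), (2568, 3204)]
```

Two breaks therefore yield three contiguous segments, matching the three genes described in the text.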
   ---------------------------------------------------------------------------
   ---------------------------------------------------------------------------
   Delete

   This command deletes taxa from the analysis. The correct usage is:

      delete <name and/or number and/or taxset> ...

   A list of the taxon names or taxon numbers (labelled 1 to ntax in the
   order in the matrix) or taxset(s) can be used. For example, the
   following:

      delete 1 2 Homo_sapiens

   deletes taxa 1, 2, and the taxon labelled Homo_sapiens from the analysis.
   You can also use "all" to delete all of the taxa. For example,

      delete all

   deletes all of the taxa from the analysis. Of course, a phylogenetic
   analysis that does not include any taxa is fairly uninteresting.
   ---------------------------------------------------------------------------
   ---------------------------------------------------------------------------
   Disclaimer

   This command shows the disclaimer for the program. In short, the
   disclaimer states that the authors are not responsible for any silly
   things you may do to your computer or any unforeseen but possibly nasty
   things the computer program may inadvertently do to you.
   ---------------------------------------------------------------------------
   ---------------------------------------------------------------------------
   Exclude

   This command excludes characters from the analysis. The correct usage is

      exclude <number> <number> <number>

   or

      exclude <number> - <number>

   or

      exclude <charset>

   or some combination thereof. Moreover, you can use the specifier "\" to
   exclude every nth character. For example, the following

      exclude 1-100\3

   would exclude every third character from 1 to 100. As a specific example,

      exclude 2 3 10-14 22

   excludes sites 2, 3, 10, 11, 12, 13, 14, and 22 from the analysis. Also,

      exclude all

   excludes all of the characters from the analysis. Excluding all characters
   does not leave you much information for inferring phylogeny.
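
To make the character-list notation concrete, here is a minimal Python sketch that expands a list such as "2 3 10-14 22" or "1-100\3" into explicit character numbers. It is an illustration only and not the parser used by MrBayes: the function name `expand_char_list` is hypothetical, named charsets are not handled, and the sketch assumes that a "\n" stride starts counting from the first character of the range, consistent with the charset description ("every third site from 1 to 720").

```python
import re

# One token: a number or ".", an optional "-<end>", an optional "\<stride>".
_TOKEN = re.compile(r"^(\d+|\.)(?:-(\d+|\.))?(?:\\(\d+))?$")


def expand_char_list(spec, nchar):
    """Expand a character list into a sorted list of character numbers.

    nchar is the number of characters in the matrix; "." stands for the
    last character and "all" for every character.
    """
    def to_int(text):
        return nchar if text == "." else int(text)

    # Allow spaces around the dash, as in "54 - 67".
    spec = re.sub(r"\s*-\s*", "-", spec.strip())
    selected = set()
    for token in spec.split():
        if token.lower() == "all":
            selected.update(range(1, nchar + 1))
            continue
        match = _TOKEN.match(token)
        if match is None:
            raise ValueError(f"cannot parse token: {token!r}")
        first, last, step = match.groups()
        start = to_int(first)
        stop = to_int(last) if last is not None else start
        stride = int(step) if step is not None else 1
        if not (1 <= start <= stop <= nchar) or stride < 1:
            raise ValueError(f"range out of bounds: {token!r}")
        selected.update(range(start, stop + 1, stride))
    return sorted(selected)


print(expand_char_list("2 3 10-14 22", 100))
# prints [2, 3, 10, 11, 12, 13, 14, 22]
print(expand_char_list(r"1-100\3", 100)[:4])
# prints [1, 4, 7, 10]
```

The same notation is shared by "include", "ctype" and "charset", so the sketch applies equally to the examples given for those commands.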
   ---------------------------------------------------------------------------
   ---------------------------------------------------------------------------
   Execute

   This command executes a file called <file name>. The correct usage is:

      execute <file name>

   For example,

      execute replicase.nex

   would execute the file named "replicase.nex". This file must be in the
   same directory as the executable.
   ---------------------------------------------------------------------------
   ---------------------------------------------------------------------------
   Help

   This command provides useful information on the use of this program. The
   correct usage is

      help

   which gives a list of all available commands with a brief description of
   each or

      help <command>

   which gives detailed information on the use of <command>.
   ---------------------------------------------------------------------------
   ---------------------------------------------------------------------------
   Include

   This command includes characters that were previously excluded from the
   analysis. The correct usage is

      include <number> <number> <number>

   or

      include <number> - <number>

   or

      include <charset>

   or some combination thereof. Moreover, you can use the specifier "\" to
   include every nth character. For example, the following

      include 1-100\3

   would include every third character from 1 to 100. As a specific example,

      include 2 3 10-14 22

   includes sites 2, 3, 10, 11, 12, 13, 14, and 22 in the analysis. Also,

      include all

   includes all of the characters in the analysis. Including all of the
   characters (even if many of them are bad) is a very total-evidence-like
   thing to do. Doing this will make a certain group of people very happy.
   On the other hand, simply using this program would make those same people
   unhappy.
   ---------------------------------------------------------------------------
   ---------------------------------------------------------------------------
   Link

   This command links model parameters across partitions of the data.
   The correct usage is:

      link <parameter name> = (<all> or <partition list>)

   The list of parameters that can be linked includes:

      Tratio          -- Transition/transversion rate ratio
      Revmat          -- Substitution rates of GTR model
      Omega           -- Nonsynonymous/synonymous rate ratio
      Statefreq       -- Character state frequencies
      Shape           -- Gamma/LNorm shape parameter
      Pinvar          -- Proportion of invariable sites
      Correlation     -- Correlation parameter of autodiscrete gamma
      Ratemultiplier  -- Rate multiplier for partitions
      Switchrates     -- Switching rates for covarion model
      Topology        -- Topology of tree
      Brlens          -- Branch lengths of tree
      Speciationrate  -- Speciation rates for birth-death process
      Extinctionrate  -- Extinction rates for birth-death process
      Popsize         -- Population size for coalescence process
      Growthrate      -- Growth rate of coalescence process
      Aamodel         -- Aminoacid rate matrix
      Cpprate         -- Rate of Compound Poisson Process (CPP)
      Cppmultdev      -- Standard dev. of CPP rate multipliers (log scale)
      Cppevents       -- CPP events
      TK02var         -- Variance increase in TK02 relaxed clock model
      Igrvar          -- Variance increase in IGR relaxed clock model
      Mixedvar        -- Variance increase in Mixed relaxed clock model

   For example,

      link shape=(all)

   links the gamma/lnorm shape parameter across all partitions of the data.
   You can use "showmodel" to see the current linking status of the
   characters. For more information on this command, see the help menu
   for link's converse, unlink ("help unlink").
   ---------------------------------------------------------------------------
   ---------------------------------------------------------------------------
   Log

   This command allows output to the screen to also be output to a file.
   The usage is:

      log start/stop filename=<name> append/replace

   The options are:

   Start/Stop     -- Starts or stops logging of output to file.
   Append/Replace -- Either append to or replace existing file.
   Filename       -- Name of log file (currently, the name of the log
                     file is "log.out").
   ---------------------------------------------------------------------------
   ---------------------------------------------------------------------------
   Lset

   This command sets the parameters of the likelihood model. The likelihood
   function is the probability of observing the data conditional on the
   phylogenetic model. In order to calculate the likelihood, you must assume
   a model of character change. This command lets you tailor the biological
   assumptions made in the phylogenetic model. The correct usage is

      lset <parameter>=<option>