man/phymltest.Rd

   1 \name{phymltest}
   2 \alias{phymltest}
   3 \alias{print.phymltest}
   4 \alias{summary.phymltest}
   5 \alias{plot.phymltest}
   6 \title{Fits a Bunch of Models with PHYML}
   7 \usage{
   8 phymltest(seqfile, format = "interleaved", itree = NULL,
   9           exclude = NULL, execname, path2exec = NULL)
  10 \method{print}{phymltest}(x, ...)
  11 \method{summary}{phymltest}(object, ...)
  12 \method{plot}{phymltest}(x, main = NULL, col = "blue", ...)
  13 }
  14 \arguments{
  15   \item{seqfile}{a character string giving the name of the file that
  16     contains the DNA sequences to be analysed by PHYML.}
  17   \item{format}{a character string specifying the format of the DNA
  18     sequences: either \code{"interleaved"} (the default), or
  19     \code{"sequential"}.}
  20   \item{itree}{a character string giving the name of a file with a tree
  21     in Newick format to be used as an initial tree by PHYML. If
  22     \code{NULL} (the default), PHYML uses a ``BIONJ'' tree.}
  23   \item{exclude}{a vector of mode character giving the models to be
  24     excluded from the analysis. These must be among those below, and
  25     follow the same syntax.}
  26   \item{execname}{a character string specifying the name of the PHYML
  27     binary file. This argument can be left missing under Windows: the
  28     default name \code{"phyml_w32"} will then be used.}
  29   \item{path2exec}{a character string giving the path to the PHYML
  30     binary file. If \code{NULL} the file must be accessible to R (either
  31     it is in the computer path, or it is in R's working directory).}
  32   \item{x}{an object of class \code{"phymltest"}.}
  33   \item{object}{an object of class \code{"phymltest"}.}
  34   \item{main}{a title for the plot; if left \code{NULL}, a title is made
  35     with the name of the object (use \code{main = ""} to have no
  36     title).}
  37   \item{col}{a colour used for the segments showing the AIC values (blue
  38     by default).}
  39   \item{...}{further arguments passed to or from other methods.}
  40 }
  41 \description{
  42   This function calls the software PHYML and successively fits 28
  43   models of DNA evolution. The results are saved to disk, as PHYML
  44   usually does, and returned in R as a vector with the log-likelihood
  45   value of each model.
  46 }
  47 \details{
  48   The present function has been tested with version 2.4 of PHYML; it
  49   should also work with version 2.3, but it won't work with version 2.1.
  50
  51   Under unix-like systems, it seems necessary to run R from csh or a
  52   similar shell (sh might not work).
  53
  54   The user must take care to correctly set the three paths involved
  55   here: the path to PHYML's binary, the path to the sequence file, and
  56   the path to R's working directory. The function should work even if
  57   all three paths are different, and there should be no problem if
  58   they are all the same.
  59
  60   If the usual output files of PHYML already exist, they are not
  61   deleted and PHYML's results are appended.
  62
  63   The following syntax is used for the models:
  64
  65   "X[Y][Z]00[+I][+G]"
  66
  67   where "X" is the first letter of the name of the model's author, "Y"
  68   and "Z" are possibly the initials of co-authors, "00" is the year of
  69   publication of the model, and "+I" and "+G" indicate whether
  70   invariant sites and/or a gamma distribution of substitution rates
  71   have been specified. Thus, Kimura's model is denoted "K80" and not
  72   "K2P". The exception to this rule is the general time-reversible
  73   model, which is simply denoted "GTR".
  74
  75   The seven substitution models used are: "JC69", "K80", "F81", "F84",
  76   "HKY85", "TN93", and "GTR". These models are then altered by adding
  77   "+I" and/or "+G", resulting in four variants for each of them
  78   (e.g., "JC69", "JC69+I", "JC69+G", "JC69+I+G"). Some of these models
  79   are described in the help page of \code{\link{dist.dna}}.
  80
  81   When a gamma distribution of substitution rates is specified, four
  82   categories are used (which is PHYML's default behaviour), and the
  83   ``alpha'' parameter is estimated from the data.
  84
  85   For the models with different substitution rates for transitions
  86   and transversions, these rates are left free and estimated from the
  87   data (and not constrained with a ratio of 4 as in PHYML's default).
  88 }
  89 \note{
  90   It is important to note that the models fitted by this function are
  91   only a small fraction of the models possible with PHYML. For instance,
  92   it is possible to vary the number of categories in the (discretized)
  93   gamma distribution of substitution rates, and many parameters can be
  94   fixed by the user. The results from the present function should
  95   therefore be taken only as indicative of a best model.
  96 }
  97 \value{
  98   \code{phymltest} returns an object of class \code{"phymltest"}: a
  99   numeric vector with the models as names.
 100
 101   The \code{print} method prints an object of class \code{"phymltest"} as
 102   a matrix with the names of the models, the number of free parameters,
 103   the log-likelihood value, and the value of the Akaike information
 104   criterion (AIC = -2 * loglik + 2 * number of free parameters).
 105
 106   The \code{summary} method prints all the possible likelihood ratio
 107   tests for an object of class \code{"phymltest"}.
 108
 109   The \code{plot} method plots the values of AIC of an object of class
 110   \code{"phymltest"} on a vertical scale.
 111 }
 112 \references{
 113   Posada, D. and Crandall, K. A. (2001) Selecting the best-fit model of
 114   nucleotide substitution. \emph{Systematic Biology}, \bold{50},
 115   580--601.
 116
 117   Guindon, S. and Gascuel, O. (2003) A simple, fast, and accurate
 118   algorithm to estimate large phylogenies by maximum likelihood.
 119   \emph{Systematic Biology}, \bold{52}, 696--704.
 120   \url{http://atgc.lirmm.fr/phyml/}
 121 }
 122 \author{Emmanuel Paradis \email{Emmanuel.Paradis@mpl.ird.fr}}
 123 \seealso{
 124   \code{\link{read.tree}}, \code{\link{write.tree}},
 125   \code{\link{dist.dna}}
 126 }
 127 \examples{
 128 ### A `fake' example with random likelihood values: it does not
 129 ### make sense, but does not need PHYML and gives you a flavour
 130 ### of what the output looks like:
 131 x <- runif(28, -100, -50)
 132 names(x) <- .phymltest.model
 133 class(x) <- "phymltest"
 134 x
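### A minimal sketch of the AIC formula given in the Value section,
### applied by hand. The log-likelihoods and numbers of free
### parameters below are made up for illustration (hypothetical
### models M1 to M3); they are not PHYML output:
loglik <- c(M1 = -1850.2, M2 = -1845.7, M3 = -1845.1)
nfp <- c(M1 = 1, M2 = 2, M3 = 3) # assumed parameter counts
aic <- -2 * loglik + 2 * nfp
aic
names(which.min(aic)) # model with the lowest AIC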
 135 summary(x)
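### A hedged sketch of a single likelihood ratio test between two
### nested models, the kind of comparison reported by summary().
### The values are hypothetical, and the usual chi-squared
### approximation (df = difference in number of free parameters)
### is assumed here; it is not necessarily the exact computation
### done by summary():
lnL0 <- -1850.2; nfp0 <- 1 # simpler (null) model, made-up values
lnL1 <- -1845.7; nfp1 <- 2 # more complex model, made-up values
lrt <- 2 * (lnL1 - lnL0)   # likelihood ratio statistic
pval <- pchisq(lrt, df = nfp1 - nfp0, lower.tail = FALSE)
pval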
 136 plot(x)
 137 plot(x, main = "", col = "red")
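### A sketch of how the 28 model names follow from the syntax
### described in Details: seven base models, each with four
### variants. The object names base, suffix, and models are only
### illustrative, and the ordering shown is an assumption:
base <- c("JC69", "K80", "F81", "F84", "HKY85", "TN93", "GTR")
suffix <- c("", "+I", "+G", "+I+G")
models <- paste(rep(base, each = 4), suffix, sep = "")
length(models)
models[1:4]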
 138 ### This example needs PHYML: copy/paste or type the
 139 ### following commands if you want to try them, changing
 140 ### setwd() and the options of phymltest() if necessary
 141 \dontrun{
 142 setwd("D:/phyml_v2.4/exe") # under Windows
 143 data(woodmouse)
 144 write.dna(woodmouse, "woodmouse.txt")
 145 X <- phymltest("woodmouse.txt")
 146 X
 147 summary(X)
 148 plot(X)
 149 }
 150 }
 151 \keyword{models}