man/chronopl.Rd

\name{chronopl}
\alias{chronopl}
\title{Molecular Dating With Penalized Likelihood}
\description{
  This function estimates the node ages of a tree using a
  semi-parametric method based on penalized likelihood (Sanderson
  2002). The branch lengths of the input tree are interpreted as mean
  numbers of substitutions (i.e., per site).
}
\usage{
chronopl(phy, lambda, age.min = 1, age.max = NULL,
         node = "root", S = 1, tol = 1e-8,
         CV = FALSE, eval.max = 500, iter.max = 500, ...)
}
\arguments{
  \item{phy}{an object of class \code{"phylo"}.}
  \item{lambda}{value of the smoothing parameter.}
  \item{age.min}{numeric values specifying the fixed node ages (if
    \code{age.max = NULL}) or the youngest bound of the nodes known to
    be within an interval.}
  \item{age.max}{numeric values specifying the oldest bound of the nodes
    known to be within an interval.}
  \item{node}{the numbers of the nodes whose ages are given by
    \code{age.min}; \code{"root"} is a short-cut for the root.}
  \item{S}{the number of sites in the sequences; leave the default if
    branch lengths are in mean number of substitutions.}
  \item{tol}{the value below which branch lengths are considered
    effectively zero.}
  \item{CV}{whether to perform cross-validation.}
  \item{eval.max}{the maximal number of evaluations of the penalized
    likelihood function.}
  \item{iter.max}{the maximal number of iterations of the optimization
    algorithm.}
  \item{\dots}{further arguments passed to control \code{nlminb}.}
}
\details{
  The idea of this method is to use a trade-off between a parametric
  formulation where each branch has its own rate, and a nonparametric
  term where changes in rates are minimized between contiguous
  branches. A smoothing parameter (lambda) controls this trade-off. If
  lambda = 0, then the parametric component dominates and rates vary as
  much as possible among branches, whereas for increasing values of
  lambda, the variation becomes smoother and tends towards a clock-like
  model (same rate for all branches).
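  As a sketch in the notation of Sanderson (2002), and not necessarily
  the exact expression evaluated internally, the criterion being
  maximised can be written:

  \deqn{\Psi = \log L - \lambda \Phi}{Psi = log L - lambda * Phi},

  where \eqn{L}{L} is the likelihood of the branch-specific rates and
  node ages, and \eqn{\Phi}{Phi} is a roughness penalty that grows with
  the squared differences in rate between contiguous branches. With
  \eqn{\lambda = 0}{lambda = 0} the penalty vanishes, and as
  \eqn{\lambda}{lambda} grows any difference in rate becomes
  increasingly costly.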

  \code{lambda} must be given. The known ages are given in
  \code{age.min}, and the corresponding node numbers in \code{node}.
  These two arguments must be of the same length. By default,
  an age of 1 is assumed for the root, and the ages of the other nodes
  are estimated.

  If \code{age.max = NULL} (the default), it is assumed that
  \code{age.min} gives exactly known ages. Otherwise, \code{age.max} and
  \code{age.min} must be of the same length and give the intervals for
  each node. Some nodes may be known exactly while the others are
  known within some bounds: the values will be identical in both
  arguments for the former (e.g., \code{age.min = c(10, 5), age.max =
    c(10, 6), node = c(15, 18)} means that the age of node 15 is 10
  units of time, and the age of node 18 is between 5 and 6).

  If two nodes are linked (i.e., one is the ancestor of the other) and
  have the same values of \code{age.min} and \code{age.max} (say, 10 and
  15) this will result in an error because the midpoints of these values
  are used as initial times (here 12.5) giving initial branch length(s)
  equal to zero. The easiest way to solve this is to change the given
  values slightly, for instance use \code{age.max = 14.9} for the
  younger node, or \code{age.max = 15.1} for the older one (or similarly
  for \code{age.min}).
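
  As a worked illustration of the case above, assuming for simplicity
  that the two nodes are joined by a single branch, bounds of 10 and 15
  on both nodes give an initial branch length of:

  \deqn{\frac{10 + 15}{2} - \frac{10 + 15}{2} = 12.5 - 12.5 = 0}{(10 + 15)/2 - (10 + 15)/2 = 12.5 - 12.5 = 0},

  whereas setting \code{age.max = 14.9} for the younger node gives:

  \deqn{\frac{10 + 15}{2} - \frac{10 + 14.9}{2} = 12.5 - 12.45 = 0.05}{(10 + 15)/2 - (10 + 14.9)/2 = 12.5 - 12.45 = 0.05},

  which is strictly positive.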

  The input tree may have multichotomies. If some internal branches are
  of zero length, they are collapsed (with a warning), and the returned
  tree will have fewer nodes than the input one. The presence of
  zero-length terminal branches results in an error since it makes
  little sense to have zero-rate branches.

  The cross-validation used here is different from the one proposed by
  Sanderson (2002). Here, each tip is dropped successively and the
  analysis is repeated with the reduced tree: the estimated dates for
  the remaining nodes are compared with the estimates from the full
  data. For the \eqn{i}{i}th tip the following is calculated:

  \deqn{\sum_{j=1}^{n-2}{\frac{(t_j - t_j^{-i})^2}{t_j}}}{SUM[j = 1, ..., n-2] (tj - tj[-i])^2/tj},

  where \eqn{t_j}{tj} is the estimated date for the \eqn{j}{j}th node
  with the full phylogeny, \eqn{t_j^{-i}}{tj[-i]} is the estimated date
  for the \eqn{j}{j}th node after removing tip \eqn{i}{i} from the tree,
  and \eqn{n}{n} is the number of tips.

  The present version uses \code{\link[stats]{nlminb}} to optimise
  the penalized likelihood function: see its help page for details on
  parameters controlling the optimisation procedure.
}
\value{
  an object of class \code{"phylo"} with branch lengths as estimated by
  the function. There are three or four further attributes:

  \item{ploglik}{the maximum penalized log-likelihood.}
  \item{rates}{the estimated rates for each branch.}
  \item{message}{the message returned by \code{nlminb} indicating
    whether the optimisation converged.}
  \item{D2}{the influence of each observation on overall date
    estimates (if \code{CV = TRUE}).}
}
\references{
  Sanderson, M. J. (2002) Estimating absolute rates of molecular
  evolution and divergence times: a penalized likelihood
  approach. \emph{Molecular Biology and Evolution}, \bold{19},
  101--109.
}
\author{Emmanuel Paradis}
\seealso{
  \code{\link{chronoMPL}}
}
\keyword{models}