man/CADM.global.Rd

\name{CADM.global}
\alias{CADM}
\alias{CADM.global}
\alias{CADM.post}
\title{ Congruence among distance matrices }
\description{
Function \code{\link{CADM.global}} computes the coefficient of concordance among several distance matrices and tests it through a permutation test.

Function \code{\link{CADM.post}} carries out a posteriori permutation tests of the contributions of individual distance matrices to the overall concordance of the group.

Use in phylogenetic analysis: to identify congruence among distance matrices (D) representing different genes or different types of data. Congruent D matrices correspond to data tables that can be used together in a combined phylogenetic or other type of multivariate analysis.
}
\usage{
CADM.global(Dmat, nmat, n, nperm=99, make.sym=TRUE, weights=NULL,
            silent=FALSE)
CADM.post  (Dmat, nmat, n, nperm=99, make.sym=TRUE, weights=NULL,
            mult="holm", mantel=FALSE, silent=FALSE)
}

\arguments{
  \item{Dmat}{ A text file listing the distance matrices one after the other, with or without blank lines in-between. Each matrix is in the form of a square distance matrix with 0's on the diagonal. }
  \item{nmat}{ Number of distance matrices in file Dmat. }
  \item{n}{ Number of objects in each distance matrix. All matrices must have the same number of objects. }
  \item{nperm}{ Number of permutations for the tests of significance. }
  \item{make.sym}{ TRUE: turn asymmetric matrices into symmetric matrices by averaging the two triangular portions. FALSE: analyse asymmetric matrices as they are. }
  \item{weights}{ A vector of positive weights for the distance matrices. Example: weights = c(1,2,3). NULL (default): all matrices have the same weight in the calculation of W. }
  \item{mult}{ Method for correcting P-values in multiple testing. The methods are "holm" (default), "sidak", and "bonferroni". The Bonferroni correction is overly conservative and is not recommended; it is included to allow comparisons with the other methods. }
  \item{mantel}{ TRUE: Mantel statistics will be computed from ranked distances, as well as permutational P-values. FALSE (default): Mantel statistics and tests will not be computed. }
  \item{silent}{ TRUE: informative messages will not be printed, but stopping messages will. This option is useful for simulation work. FALSE: informative messages will be printed. }
}
\details{
\code{Dmat} must contain two or more distance matrices, listed one after the other, all of the same size, and corresponding to the same objects in the same order. Raw data tables can be transformed into distance matrices before comparison with other such distance matrices, or with data that have been obtained as distance matrices, e.g. serological or DNA hybridization data. The distances will be transformed to ranks before computation of the coefficient of concordance and other statistics.

\code{CADM.global} tests the global null hypothesis that all matrices are incongruent. If the global null is rejected, function \code{CADM.post} can be used to identify the concordant (H0 rejected) and discordant matrices (H0 not rejected) in the group. If a distance matrix has a negative value for the \code{Mantel.mean} statistic, that matrix clearly does not belong to the group. Remove that matrix (if there is more than one, first remove the matrix that has the most strongly negative value for \code{Mantel.mean}) and run the analysis again.

The corrections used for multiple testing are applied to the list of P-values (P) produced in the a posteriori tests; they take into account the number of tests (k) carried out simultaneously (number of matrices, parameter \code{nmat}).

The Holm correction is computed after ordering the P-values in a list with the smallest value to the left. Compute adjusted P-values as:

\deqn{P_{corr} = (k-i+1)*P}{P_corr = (k-i+1)*P}

where i is the position in the ordered list. Final step: from left to right, if an adjusted \eqn{P_{corr}}{P_corr} in the ordered list is smaller than the one occurring at its left, make the smaller one equal to the larger one.

The Sidak correction is:

\deqn{P_{corr} = 1 - (1 - P)^k}{P_corr = 1 - (1 - P)^k}

The Bonferroni correction is:

\deqn{P_{corr} = k*P}{P_corr = k*P}
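
The three corrections can be sketched in a few lines of R. This is an illustration only, not the code used inside \code{CADM.post}: the vector \code{P} and the other object names are hypothetical, and capping the adjusted values at 1 is an assumption borrowed from function \code{p.adjust} of package stats, against which the Holm and Bonferroni results are checked.

\preformatted{
P <- c(0.020, 0.010, 0.500, 0.021)  # hypothetical uncorrected P-values
k <- length(P)                      # number of simultaneous tests

# Holm: sort ascending, multiply the i-th smallest P by (k - i + 1),
# then make the sequence non-decreasing from left to right
ord  <- order(P)
holm <- cummax((k - seq_len(k) + 1) * P[ord])
holm <- pmin(1, holm)[order(ord)]   # cap at 1 (assumption), restore input order

sidak      <- 1 - (1 - P)^k         # Sidak
bonferroni <- pmin(1, k * P)        # Bonferroni, capped at 1 (assumption)

# Cross-check against stats::p.adjust
stopifnot(isTRUE(all.equal(holm, p.adjust(P, "holm"))),
          isTRUE(all.equal(bonferroni, p.adjust(P, "bonferroni"))))
}

In this example the third ordered Holm value (2 * 0.021) is smaller than the second (3 * 0.020), so the final step raises it to the value at its left.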
}

\value{

\code{CADM.global} produces a small table containing the W, Chi2, and Prob.perm statistics described in the following list.
\code{CADM.post} produces a table stored in element \code{A_posteriori_tests}, containing Mantel.mean, Prob, and Corrected.prob statistics in rows; the columns correspond to the k distance matrices under study, labeled Dmat.1 to Dmat.k.
If parameter \code{mantel} is TRUE, tables of Mantel statistics and P-values are computed among the matrices.

  \item{W }{Kendall's coefficient of concordance, W (Kendall and Babington Smith 1939; see also Legendre 2010). }
  \item{Chi2 }{Friedman's chi-square statistic (Friedman 1937) used in the permutation test of W. }
  \item{Prob.perm }{Permutational probability. }

  \item{Mantel.mean }{Mean of the Mantel correlations, computed on rank-transformed distances, between the distance matrix under test and all the other matrices in the study. }
  \item{Prob }{Permutational probabilities, uncorrected. }
  \item{Corrected.prob }{Permutational probabilities corrected using the method selected in parameter \code{mult}. }

  \item{Mantel.cor }{Matrix of Mantel correlations, computed on rank-transformed distances, among the distance matrices. }
  \item{Mantel.prob }{One-tailed P-values associated with the Mantel correlations of the previous table. The probabilities are computed in the right-hand tail. H0 is tested against the alternative one-tailed hypothesis that the Mantel correlation under test is positive. No correction is made for multiple testing. }
}

\references{
Campbell, V., P. Legendre and F.-J. Lapointe. 2009. Assessing congruence among ultrametric distance matrices. Journal of Classification 26: 103-117.

Campbell, V., P. Legendre and F.-J. Lapointe. 2011. The performance of the Congruence Among Distance Matrices (CADM) test in phylogenetic analysis. BMC Evolutionary Biology 11: 64. http://www.biomedcentral.com/1471-2148/11/64.

Friedman, M. 1937. The use of ranks to avoid the assumption of normality implicit in the analysis of variance. Journal of the American Statistical Association 32: 675-701.

Kendall, M. G. and B. Babington Smith. 1939. The problem of m rankings. Annals of Mathematical Statistics 10: 275-287.

Lapointe, F.-J., J. A. W. Kirsch and J. M. Hutcheon. 1999. Total evidence, consensus, and bat phylogeny: a distance-based approach. Molecular Phylogenetics and Evolution 11: 55-66.

Legendre, P. 2010. Coefficient of concordance. Pp. 164-169 in: Encyclopedia of Research Design, Vol. 1. N. J. Salkind, ed. SAGE Publications, Inc., Los Angeles.

Legendre, P. and F.-J. Lapointe. 2004. Assessing congruence among distance matrices: single malt Scotch whiskies revisited. Australian and New Zealand Journal of Statistics 46: 615-629.

Legendre, P. et F.-J. Lapointe. 2005. Congruence entre matrices de distance. P. 178-181 in: Makarenkov, V., G. Cucumel et F.-J. Lapointe [eds] Comptes rendus des 12emes Rencontres de la Societe Francophone de Classification, Montreal, 30 mai - 1er juin 2005.

Siegel, S. and N. J. Castellan, Jr. 1988. Nonparametric statistics for the behavioral sciences. 2nd edition. McGraw-Hill, New York.
}

\author{ Pierre Legendre, Universite de Montreal }

\examples{

# Examples 1 and 2: 5 genetic distance matrices computed from simulated DNA
# sequences representing 50 taxa having evolved along additive trees with
# identical evolutionary parameters (GTR + Gamma + I). Distance matrices were
# computed from the DNA sequence matrices using a p distance corrected with the
# same parameters as those used to simulate the DNA sequences. See Campbell et
# al. (2009) for details.

# Example 1: five independent additive trees. Data provided by V. Campbell.

data(mat5Mrand)
res.global <- CADM.global(mat5Mrand, 5, 50)

# Example 2: three partly similar trees, two independent trees.
# Data provided by V. Campbell.

data(mat5M3ID)
res.global <- CADM.global(mat5M3ID, 5, 50)
res.post   <- CADM.post(mat5M3ID, 5, 50, mantel=TRUE)

# Example 3: three matrices respectively representing Serological
# (asymmetric), DNA hybridization (asymmetric) and Anatomical (symmetric)
# distances among 9 families. Data from Lapointe et al. (1999).

data(mat3)
res.global <- CADM.global(mat3, 3, 9, nperm=999)
res.post   <- CADM.post(mat3, 3, 9, nperm=999, mantel=TRUE)
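
# Sketch of the symmetrization described for make.sym=TRUE (averaging the
# two triangular portions), shown on a small hypothetical asymmetric matrix.
# This only illustrates the idea; it is not the internal code of the
# functions, and the object names D.asym and D.sym are made up here.

D.asym <- matrix(c(0, 2, 5,
                   4, 0, 1,
                   3, 7, 0), 3, 3, byrow=TRUE)
D.sym <- (D.asym + t(D.asym)) / 2   # average of d(i,j) and d(j,i)
stopifnot(isSymmetric(D.sym))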

# Example 4, showing how to bind two D matrices (cophenetic matrices
# in this example) into a file using rbind(), then run the global test.

a <- rtree(5)
b <- rtree(5)
A <- cophenetic(a)
B <- cophenetic(b)
x <- rownames(A)
B <- B[x, x]
M <- rbind(A, B)
CADM.global(M, 2, 5)
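
# Illustrative sketch (not the internal code of CADM.global or CADM.post):
# how W, Chi2 and Mantel.mean relate to the rank-transformed distances.
# The simulated matrices and all object names below are hypothetical.
# Friedman's statistic is assumed to take its usual form, Chi2 = p*(m-1)*W,
# and equal weights are assumed for all matrices.

set.seed(1)
n.obj <- 6                               # number of objects
p <- 3                                   # number of distance matrices
# one column per matrix: the n.obj*(n.obj-1)/2 distances of its lower triangle
D <- sapply(seq_len(p),
            function(i) as.vector(dist(matrix(rnorm(n.obj * 2), n.obj, 2))))
R <- apply(D, 2, rank)                   # rank the distances within each matrix
m <- nrow(R)                             # number of ranked distances
S <- sum((rowSums(R) - mean(rowSums(R)))^2)
ties <- sum(apply(R, 2, function(r) sum(table(r)^3 - table(r))))  # 0 if no ties
W <- 12 * S / (p^2 * (m^3 - m) - p * ties)   # Kendall's W with tie correction
Chi2 <- p * (m - 1) * W
rho <- cor(R)                            # correlations computed on ranks
Mantel.mean <- (rowSums(rho) - 1) / (p - 1)  # mean correlation with the others
# W is a linear function of the mean pairwise rank correlation
stopifnot(isTRUE(all.equal(W, ((p - 1) * mean(rho[lower.tri(rho)]) + 1) / p)))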
}

\keyword{ multivariate }
\keyword{ nonparametric }