\alias{read.nexus.data}
Read Character Data In NEXUS Format
This function reads a file with sequences in the NEXUS format.
\item{file}{a file name specified by either a variable of mode
character, or a double-quoted string.}
This parser tries to read data from a file written in a
\emph{restricted} NEXUS format (see examples below).
Please see files \file{data.nex} and \file{taxacharacters.nex} for
examples of formats that will work.
Some noticeable exceptions from the NEXUS standard (non-exhaustive
list):
\item{\bold{I}}{Comments must be either on separate lines or at the
end of lines. Examples:\cr
\code{[Comment]} \bold{--- OK}\cr
\code{Taxon ACGTACG [Comment]} \bold{--- OK}\cr
\code{[Comment line 1}

\code{Comment line 2]} \bold{--- NOT OK!}\cr
\code{Tax[Comment]on ACG[Comment]T} \bold{--- NOT OK!}}
\item{\bold{II}}{No spaces (or comments) are allowed in the
sequences. Examples:\cr
\code{name ACGT} \bold{--- OK}\cr
\code{name AC GT} \bold{--- NOT OK!}}
\item{\bold{III}}{No spaces are allowed in taxon names, not even if
names are in single quotes. That is, single-quoted names are not
treated as such by the parser. Examples:\cr
\code{Genus_species} \bold{--- OK}\cr
\code{'Genus_species'} \bold{--- OK}\cr
\code{'Genus species'} \bold{--- NOT OK!}}
\item{\bold{IV}}{The trailing \code{end} that closes the
\code{matrix} must be on a separate line. Examples:\cr
\code{end;} \bold{--- OK}\cr
\code{end;} \bold{--- OK}\cr
\code{taxon AACCCGT; end;} \bold{--- NOT OK!}}
\item{\bold{V}}{Multistate characters are not allowed. That is,
NEXUS allows you to specify multiple character states at a
character position either as an uncertainty, \code{(XY)}, or as an
actual appearance of multiple states, \code{\{XY\}}. This
information is not handled by the parser. Examples:\cr
\code{taxon 0011?110} \bold{--- OK}\cr
\code{taxon 0011{01}110} \bold{--- NOT OK!}\cr
\code{taxon 0011(01)110} \bold{--- NOT OK!}}
\item{\bold{VI}}{The number of taxa must be on the same line as
\code{ntax}. The same applies to \code{nchar}. Examples:\cr
\code{ntax = 12} \bold{--- OK}\cr
\code{ntax =}

\code{12} \bold{--- NOT OK!}}
\item{\bold{VII}}{The word \dQuote{matrix} cannot occur anywhere in
the file before the actual \code{matrix} command, unless it is in
a comment. Examples:\cr
\code{BEGIN CHARACTERS;}

\code{TITLE 'Data in file "03a-cytochromeB.nex"';}

\code{DIMENSIONS NCHAR=382;}

\code{FORMAT DATATYPE=Protein GAP=- MISSING=?;}

\code{["This is The Matrix"]} \bold{--- OK}

\code{BEGIN CHARACTERS;}

\code{TITLE 'Matrix in file "03a-cytochromeB.nex"';} \bold{--- NOT OK!}

\code{DIMENSIONS NCHAR=382;}

\code{FORMAT DATATYPE=Protein GAP=- MISSING=?;}
A list of sequences each made of a single vector of mode character
where each element is a (phylogenetic) character state.
Maddison, D. R., Swofford, D. L. and Maddison, W. P. (1997) NEXUS: an
extensible file format for systematic information. \emph{Systematic
Biology}, \bold{46}, 590--621.
\author{Johan Nylander \email{nylander@scs.fsu.edu}}
\code{\link{read.nexus}}, \code{\link{write.nexus}},
\code{\link{write.nexus.data}}
## Use read.nexus.data to read a file in NEXUS format into object x
\dontrun{x <- read.nexus.data("file.nex")}
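## A minimal sketch, not part of the original examples: write a small file
## that follows the restrictions listed above (comment on its own line, no
## spaces in names or sequences, ntax and nchar on the same line as their
## values, closing semicolon and END on separate lines), then read it back.
## The taxon names and sequences are made up for illustration, and the ape
## package is assumed to be installed.
library(ape)
nex <- tempfile(fileext = ".nex")
writeLines(c(
  "#NEXUS",
  "[A comment on a separate line]",
  "BEGIN DATA;",
  "DIMENSIONS NTAX=3 NCHAR=8;",
  "FORMAT DATATYPE=DNA MISSING=? GAP=-;",
  "MATRIX",
  "Taxon_A ACGTACGT",
  "Taxon_B ACGTACGA",
  "Taxon_C ACG-AC?T",
  ";",
  "END;"
), nex)
x <- read.nexus.data(nex)
## x should be a list with one character vector per taxon
names(x)
str(x)
unlink(nex)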