2 * FastTree -- inferring approximately-maximum-likelihood trees for large
3 * multiple sequence alignments.
6 * http://www.microbesonline.org/fasttree/
8 * Thanks to Jim Hester of the Cleveland Clinic Foundation for
9 * providing the first parallel (OpenMP) code, Siavash Mirarab of
10 * UT Austin for implementing the WAG option, Samuel Shepard
11 * at the CDC for suggesting and helping with the -quote option, and
12 * Aaron Darling (University of Technology, Sydney) for numerical changes
13 * for wide alignments of closely-related sequences.
15 * Copyright (C) 2008-2015 The Regents of the University of California
16 * All rights reserved.
18 * This program is free software; you can redistribute it and/or modify
19 * it under the terms of the GNU General Public License as published by
20 * the Free Software Foundation; either version 2 of the License, or
21 * (at your option) any later version.
23 * This program is distributed in the hope that it will be useful,
24 * but WITHOUT ANY WARRANTY; without even the implied warranty of
25 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
26 * GNU General Public License for more details.
28 * You should have received a copy of the GNU General Public License along
29 * with this program; if not, write to the Free Software Foundation, Inc.,
30 * 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
31 * or visit http://www.gnu.org/copyleft/gpl.html
35 * NEITHER THE UNITED STATES NOR THE UNITED STATES DEPARTMENT OF ENERGY,
36 * NOR ANY OF THEIR EMPLOYEES, MAKES ANY WARRANTY, EXPRESS OR IMPLIED,
37 * OR ASSUMES ANY LEGAL LIABILITY OR RESPONSIBILITY FOR THE ACCURACY,
38 * COMPLETENESS, OR USEFULNESS OF ANY INFORMATION, APPARATUS, PRODUCT,
39 * OR PROCESS DISCLOSED, OR REPRESENTS THAT ITS USE WOULD NOT INFRINGE
40 * PRIVATELY OWNED RIGHTS.
44 * To compile FastTree, do:
45 * gcc -Wall -O3 -finline-functions -funroll-loops -o FastTree -lm FastTree.c
46 * Use -DNO_SSE to turn off use of SSE3 instructions
47 * (should not be necessary because compiler should not set __SSE__ if
48 * not available, and modern mallocs should return 16-byte-aligned values)
49 * Use -DOPENMP -fopenmp to use multiple threads (note, old versions of gcc
50 * may not support -fopenmp)
51 * Use -DTRACK_MEMORY if you want detailed reports of memory usage,
52 * but results are not correct above 4GB because mallinfo stores int values.
53 * It also makes FastTree run significantly slower.
55 * To get usage guidance, do:
58 * FastTree uses profiles instead of a distance matrix, and computes
59 * support values for each split from the profiles of the 4 nodes
60 * around the split. It stores a profile for each node and a average
61 * profile over all active nodes (the "out-profile" for computing the
62 * total sum of distance to other nodes). The neighbor joining phase
63 * requires O(N*L*a) space, where N is the number of sequences, L is
64 * the alignment width, and a is the alphabet size. The top-hits
65 * heuristic requires an additional O(N sqrt(N)) memory. After
66 * neighbor-joining, FastTree improves the topology with
67 * nearest-neighbor interchanges (NNIs) and subtree-prune-regraft
68 * moves (SPRs), which does not have a significant additional memory
69 * requirement. (We need only store "up-profiles" on the path from our
70 * current traversal point to the root.) These take O(NLa) time per
71 * round, and with default settings, O(N log(N) L a) time total.
72 * FastTree further improves the topology with maximum-likelihood
73 * NNIs, using similar data structures and complexity, but with a
74 * higher constant factor, and now the "profiles" are actually
75 * posterior distributions for that subtree. Finally, FastTree
76 * resamples the site likelihoods around each NNI and uses
77 * the Shimodaira Hasegawa test to estimate the reliability of each split.
79 * Overview of the neighbor-joining phase:
81 * Although FastTree uses a log correction on profile distances to
82 * account for multiple substitutions when doing NNIs and SPRs, the
83 * operations on the profiles themselves involve "additive" distances
84 * -- either %different (for nucleotide) or by using an amino acid
85 * similarity matrix (for proteins). If we are using %different as
86 * our distance matrix then
88 * Profile_distance(A,B) = 1 - sum over characters of freq(A)*freq(B)
90 * and we can average this value over positions. Positions with gaps
91 * are weighted by %ungapped(A) * %ungapped(B).
93 * If we are using an amino acid dissimilarity matrix D(i,j) then at
96 * Profile_distance(A,B) = sum(i,j) freq(A==i) * freq(B==j) * D(i,j)
97 * = sum(k) Ak * Bk * Lambda(k)
99 * where k iterates over 20 eigenvectors, Lambda(k) is the eigenvalue,
100 * and if A==i, then Ak is the kth column of the inverse of the
101 * eigenvector matrix.
103 * The exhaustive approach (-slow) takes O(N**3*L*a) time, but
104 * this can be reduced to as little as O(N**(3/2)*log(N)*L*a) time
105 * by using heuristics.
107 * It uses a combination of three heuristics: a visible set similar to
108 * that of FastTree (Elias & Lagergren 2005), a local hill-climbing
109 * search for a better join (as in relaxed neighbor-joining, Evans et
110 * al. 2006), and a top-hit list to reduce the search space (see
113 * The "visible" set stores, for each node, the best join for that
114 * node, as identified at some point in the past
116 * If top-hits are not being used, then the neighbor-joining phase can
119 * Compute the out-profile by averaging the leaves
120 * Compute the out-distance of each leaf quickly, using the out-profile
121 * Compute the visible set (or approximate it using top-hits, see below)
122 * Until we're down to 3 active nodes:
123 * Find the best join in the visible set
124 * (This involves recomputing the neighbor-joining criterion,
125 * as out-distances and #active nodes may have changed)
126 * Follow a chain of best hits (again recomputing the criterion)
127 * until we find a locally best join, as in relaxed neighbor joining
128 * Create a profile of the parent node, either using simple averages (default)
129 * or using weighted joining as in BIONJ (if -bionj was specified)
130 * Update the out-profile and the out-distances
131 * Update the visible set:
132 * find the best join for the new joined node
133 * replace hits to the joined children with hits to the parent
134 * if we stumble across a join for the new node that is better
135 * than the corresponding entry in the visible set, "reset"
138 * For each iteration, this method does
139 * O(N) work to find the best hit in the visible set
140 * O(L*N*a*log(N)) work to do the local search, where log(N)
141 * is a pessimistic estimate of the number of iterations. In
142 * practice, we average <1 iteration for 2,000 sequences.
143 * With -fastest, this step is omitted.
144 * O(N*a) work to compute the joined profile and update the out-profile
145 * O(L*N*a) work to update the out-distances
146 * O(L*N*a) work to compare the joined profile to the other nodes
147 * (to find the new entry in the visible set)
149 * and there are N-3 iterations, so it takes O(N**2 * L * log(N) * a) time.
151 * The profile distances give exactly the same result as matrix
152 * distances in neighbor-joining or BIONJ would if there are no gaps
153 * in the alignment. If there are gaps, then it is an
154 * approximation. To get the same result we also store a "diameter"
155 * for each node (diameter is 0 for leaves).
157 * In the simpler case (NJ rather than BIONJ), when we join A and B to
158 * give a new node AB,
160 * Profile(AB) = (A+B)/2
161 * Profile_distance(AB,C) = (Profile_distance(A,C)+Profile_distance(B,C))/2
162 * because the formulas above are linear
164 * And according to the neighor-joining rule,
165 * d(AB,C) = (d(A,C)+d(B,C)-d(A,B))/2
167 * and we can achieve the same value by writing
168 * diameter(AB) = pd(A,B)/2
170 * d(A,B) = pd(A,B) - diameter(A) - diameter(B)
173 * d(AB,C) = (d(A,C)+d(B,C)-d(A,B))/2
174 * = (pd(A,C)-diam(A)-diam(C)+pd(B,C)-diam(B)-diam(C)-d(A,B)+diam(A)+diam(B))/2
175 * = (pd(A,C)+pd(B,C))/2 - diam(C) - pd(A,B)
176 * = pd(AB,C) - diam(AB) - diam(C)
178 * If we are using BIONJ, with weight lambda for the join:
179 * Profile(AB) = lambda*A + (1-lambda)*B
180 * then a similar argument gives
181 * diam(AB) = lambda*diam(A) + (1-lambda)*diam(B) + lambda*d(A,AB) + (1-lambda)*d(B,AB),
183 * where, as in neighbor joining,
184 * d(A,AB) = d(A,B) + (total out_distance(A) - total out_distance(B))/(n-2)
186 * A similar recursion formula works for the "variance" matrix of BIONJ,
187 * var(AB,C) = lambda*var(A,C) + (1-lambda)*var(B,C) - lambda*(1-lambda)*var(A,B)
189 * var(A,B) = pv(A,B) - vd(A) - vd(B), where
191 * vd(A) = 0 for leaves
192 * vd(AB) = lambda*vd(A) + (1-lambda)*vd(B) + lambda*(1-lambda)*var(A,B)
194 * The top-hist heuristic to reduce the work below O(N**2*L) stores a top-hit
195 * list of size m=sqrt(N) for each active node.
197 * The list can be initialized for all the leaves in sub (N**2 * L) time as follows:
198 * Pick a "seed" sequence and compare it to all others
199 * Store the top m hits of the seed as its top-hit list
200 * Take "close" hits of the seed(within the top m, and see the "close" parameter),
201 * and assume that their top m hits lie within the top 2*m hits of the seed.
202 * So, compare them to the seed's neighors (if they do not already
203 * have a top hit list) and set their top hits.
205 * This method does O(N*L) work for each seed, or O(N**(3/2)*L) work total.
207 * To avoid doing O(N*L) work at each iteration, we need to avoid
208 * updating the visible set and the out-distances. So, we use "stale"
209 * out-distances, and when searching the visible set for the best hit,
210 * we only inspect the top m=sqrt(N) entries. We then update those
211 * out-distances (up to 2*m*L*a work) and then find the best hit.
213 * To avoid searching the entire visible set, FastTree keeps
214 * and updates a list of the top sqrt(N) entries in the visible set.
215 * This costs O(sqrt(N)) time per join to find the best entry and to
216 * update, or (N sqrt(N)) time overall.
218 * Similarly, when doing the local hill-climbing, we avoid O(N*L) work
219 * by only considering the top-hits for the current node. So this adds
220 * O(m*a*log(N)) work per iteration.
222 * When we join two nodes, we compute profiles and update the
223 * out-profile as before. We need to compute the best hits of the node
224 * -- we merge the lists for the children and select the best up-to-m
225 * hits. If the top hit list contains a stale node we replace it with
226 * its parent. If we still have <m/2 entries, we do a "refresh".
228 * In a "refresh", similar to the fast top-hit computation above, we
229 * compare the "seed", in this case the new joined node, to all other
230 * nodes. We compare its close neighbors (the top m hits) to all
231 * neighbors (the top 2*m hits) and update the top-hit lists of all
232 * neighbors (by merging to give a list of 3*m entries and then
233 * selecting the best m entries).
235 * Finally, during these processes we update the visible sets for
236 * other nodes with better hits if we find them, and we set the
237 * visible entry for the new joined node to the best entry in its
238 * top-hit list. (And whenever we update a visible entry, we
239 * do O(sqrt(N)) work to update the top-visible list.)
240 * These udpates are not common so they do not alter the
241 * O(N sqrt(N) log(N) L a) total running time for the joining phase.
243 * Second-level top hits
245 * With -fastest or with -2nd, FastTree uses an additional "2nd-level" top hits
246 * heuristic to reduce the running time for the top-hits phase to
247 * O(N**1.25 L) and for the neighbor-joining phase to O(N**1.25 L a).
248 * This also reduces the memory usage for the top-hits lists to
249 * O(N**1.25), which is important for alignments with a million
250 * sequences. The key idea is to store just q = sqrt(m) top hits for
253 * Given the neighbors of A -- either for a seed or for a neighbor
254 * from the top-hits heuristic, if B is within the top q hits of A, we
255 * set top-hits(B) from the top 3*q top-hits of A. And, we record that
256 * A is the "source" of the hits for B, so if we run low on hits for
257 * B, instead of doing a full refresh, we can do top-hits(B) :=
258 * top-hits(B) union top-hits(active_ancestor(A)).
259 * During a refresh, these "2nd-level" top hits are updated just as
260 * normal, but the source is maintained and only q entries are stored,
261 * until we near the end of the neighbor joining phase (until the
262 * root as 2*m children or less).
264 * Parallel execution with OpenMP
266 * If you compile FastTree with OpenMP support, it will take
267 * advantage of multiple CPUs on one machine. It will parallelize:
270 * Comparing one node to many others during the NJ phase (the simplest kind of join)
272 * Optimizing likelihoods for 3 alternate topologies during ML NNIs and ML supports
273 * (only 3 threads can be used)
275 * This accounts for most of the O(N L a) or slower steps except for
276 * minimum-evolution NNIs (which are fast anyway), minimum-evolution SPRs,
277 * selecting per-site rates, and optimizing branch lengths outside of ML NNIs.
279 * Parallelizing the top hits phase may lead to a slight change in the tree,
280 * as some top hits are computed from different (and potentially less optimal source).
281 * This means that results on repeated runs may not be 100% identical.
282 * However, this should not have any significant effect on tree quality
283 * after the NNIs and SPRs.
285 * The OpenMP code also turns off the star-topology test during ML
286 * NNIs, which may lead to slight improvements in likelihood.
295 #include <sys/time.h>
299 /* malloc.h apparently doesn't exist on MacOS */
303 /* Compile with -DOPENMP to turn on multithreading */
308 /* By default, tries to compile with SSE instructions for greater speed.
309 But if compiled with -DUSE_DOUBLE, uses double precision instead of single-precision
310 floating point (2x memory required), does not use SSE, and allows much shorter
314 #if !defined(NO_SSE) && !defined(USE_DOUBLE)
321 #define SSE_STRING "Double precision (No SSE3)"
322 typedef double numeric_t;
323 #define ScanNumericSpec "%lf"
325 typedef float numeric_t;
326 #define ScanNumericSpec "%f"
330 #define SSE_STRING "SSE3"
331 #define ALIGNED __attribute__((aligned(16)))
332 #define IS_ALIGNED(X) ((((unsigned long) new) & 15L) == 0L)
333 #include <xmmintrin.h>
338 #define IS_ALIGNED(X) 1
341 #define SSE_STRING "No SSE3"
344 #endif /* USE_SSE3 */
346 #define FT_VERSION "2.1.8"
349 " FastTree protein_alignment > tree\n"
350 " FastTree < protein_alignment > tree\n"
351 " FastTree -out tree protein_alignment\n"
352 " FastTree -nt nucleotide_alignment > tree\n"
353 " FastTree -nt -gtr < nucleotide_alignment > tree\n"
354 " FastTree < nucleotide_alignment > tree\n"
355 "FastTree accepts alignments in fasta or phylip interleaved formats\n"
357 "Common options (must be before the alignment file):\n"
358 " -quiet to suppress reporting information\n"
359 " -nopr to suppress progress indicator\n"
360 " -log logfile -- save intermediate trees, settings, and model details\n"
361 " -fastest -- speed up the neighbor joining phase & reduce memory usage\n"
362 " (recommended for >50,000 sequences)\n"
363 " -n <number> to analyze multiple alignments (phylip format only)\n"
364 " (use for global bootstrap, with seqboot and CompareToBootstrap.pl)\n"
365 " -nosupport to not compute support values\n"
366 " -intree newick_file to set the starting tree(s)\n"
367 " -intree1 newick_file to use this starting tree for all the alignments\n"
368 " (for faster global bootstrap on huge alignments)\n"
369 " -pseudo to use pseudocounts (recommended for highly gapped sequences)\n"
370 " -gtr -- generalized time-reversible model (nucleotide alignments only)\n"
371 " -wag -- Whelan-And-Goldman 2001 model (amino acid alignments only)\n"
372 " -quote -- allow spaces and other restricted characters (but not ' ) in\n"
373 " sequence names and quote names in the output tree (fasta input only;\n"
374 " FastTree will not be able to read these trees back in)\n"
375 " -noml to turn off maximum-likelihood\n"
376 " -nome to turn off minimum-evolution NNIs and SPRs\n"
377 " (recommended if running additional ML NNIs with -intree)\n"
378 " -nome -mllen with -intree to optimize branch lengths for a fixed topology\n"
379 " -cat # to specify the number of rate categories of sites (default 20)\n"
380 " or -nocat to use constant rates\n"
381 " -gamma -- after optimizing the tree under the CAT approximation,\n"
382 " rescale the lengths to optimize the Gamma20 likelihood\n"
383 " -constraints constraintAlignment to constrain the topology search\n"
384 " constraintAlignment should have 1s or 0s to indicates splits\n"
385 " -expert -- see more options\n"
386 "For more information, see http://www.microbesonline.org/fasttree/\n";
389 "FastTree [-nt] [-n 100] [-quote] [-pseudo | -pseudo 1.0]\n"
390 " [-boot 1000 | -nosupport]\n"
391 " [-intree starting_trees_file | -intree1 starting_tree_file]\n"
392 " [-quiet | -nopr]\n"
393 " [-nni 10] [-spr 2] [-noml | -mllen | -mlnni 10]\n"
394 " [-mlacc 2] [-cat 20 | -nocat] [-gamma]\n"
395 " [-slow | -fastest] [-2nd | -no2nd] [-slownni] [-seed 1253] \n"
396 " [-top | -notop] [-topm 1.0 [-close 0.75] [-refresh 0.8]]\n"
397 " [-matrix Matrix | -nomatrix] [-nj | -bionj]\n"
398 " [-wag] [-nt] [-gtr] [-gtrrates ac ag at cg ct gt] [-gtrfreq A C G T]\n"
399 " [ -constraints constraintAlignment [ -constraintWeight 100.0 ] ]\n"
401 " [ alignment_file ]\n"
402 " [ -out output_newick_file | > newick_tree]\n"
406 "FastTree [-nt] [-matrix Matrix | -nomatrix] [-rawdist] -makematrix [alignment]\n"
407 " [-n 100] > phylip_distance_matrix\n"
409 " FastTree supports fasta or phylip interleaved alignments\n"
410 " By default FastTree expects protein alignments, use -nt for nucleotides\n"
411 " FastTree reads standard input if no alignment file is given\n"
413 "Input/output options:\n"
414 " -n -- read in multiple alignments in. This only\n"
415 " works with phylip interleaved format. For example, you can\n"
416 " use it with the output from phylip's seqboot. If you use -n, FastTree\n"
417 " will write 1 tree per line to standard output.\n"
418 " -intree newickfile -- read the starting tree in from newickfile.\n"
419 " Any branch lengths in the starting trees are ignored.\n"
420 " -intree with -n will read a separate starting tree for each alignment.\n"
421 " -intree1 newickfile -- read the same starting tree for each alignment\n"
422 " -quiet -- do not write to standard error during normal operation (no progress\n"
423 " indicator, no options summary, no likelihood values, etc.)\n"
424 " -nopr -- do not write the progress indicator to stderr\n"
425 " -log logfile -- save intermediate trees so you can extract\n"
426 " the trees and restart long-running jobs if they crash\n"
427 " -log also reports the per-site rates (1 means slowest category)\n"
428 " -quote -- quote sequence names in the output and allow spaces, commas,\n"
429 " parentheses, and colons in them but not ' characters (fasta files only)\n"
432 " Default: For protein sequences, log-corrected distances and an\n"
433 " amino acid dissimilarity matrix derived from BLOSUM45\n"
434 " or for nucleotide sequences, Jukes-Cantor distances\n"
435 " To specify a different matrix, use -matrix FilePrefix or -nomatrix\n"
436 " Use -rawdist to turn the log-correction off\n"
437 " or to use %different instead of Jukes-Cantor\n"
439 " -pseudo [weight] -- Use pseudocounts to estimate distances between\n"
440 " sequences with little or no overlap. (Off by default.) Recommended\n"
441 " if analyzing the alignment has sequences with little or no overlap.\n"
442 " If the weight is not specified, it is 1.0\n"
444 "Topology refinement:\n"
445 " By default, FastTree tries to improve the tree with up to 4*log2(N)\n"
446 " rounds of minimum-evolution nearest-neighbor interchanges (NNI),\n"
447 " where N is the number of unique sequences, 2 rounds of\n"
448 " subtree-prune-regraft (SPR) moves (also min. evo.), and\n"
449 " up to 2*log(N) rounds of maximum-likelihood NNIs.\n"
450 " Use -nni to set the number of rounds of min. evo. NNIs,\n"
451 " and -spr to set the rounds of SPRs.\n"
452 " Use -noml to turn off both min-evo NNIs and SPRs (useful if refining\n"
453 " an approximately maximum-likelihood tree with further NNIs)\n"
454 " Use -sprlength set the maximum length of a SPR move (default 10)\n"
455 " Use -mlnni to set the number of rounds of maximum-likelihood NNIs\n"
456 " Use -mlacc 2 or -mlacc 3 to always optimize all 5 branches at each NNI,\n"
457 " and to optimize all 5 branches in 2 or 3 rounds\n"
458 " Use -mllen to optimize branch lengths without ML NNIs\n"
459 " Use -mllen -nome with -intree to optimize branch lengths on a fixed topology\n"
460 " Use -slownni to turn off heuristics to avoid constant subtrees (affects both\n"
463 "Maximum likelihood model options:\n"
464 " -wag -- Whelan-And-Goldman 2001 model instead of (default) Jones-Taylor-Thorton 1992 model (a.a. only)\n"
465 " -gtr -- generalized time-reversible instead of (default) Jukes-Cantor (nt only)\n"
466 " -cat # -- specify the number of rate categories of sites (default 20)\n"
467 " -nocat -- no CAT model (just 1 category)\n"
468 " -gamma -- after the final round of optimizing branch lengths with the CAT model,\n"
469 " report the likelihood under the discrete gamma model with the same\n"
470 " number of categories. FastTree uses the same branch lengths but\n"
471 " optimizes the gamma shape parameter and the scale of the lengths.\n"
472 " The final tree will have rescaled lengths. Used with -log, this\n"
473 " also generates per-site likelihoods for use with CONSEL, see\n"
474 " GammaLogToPaup.pl and documentation on the FastTree web site.\n"
476 "Support value options:\n"
477 " By default, FastTree computes local support values by resampling the site\n"
478 " likelihoods 1,000 times and the Shimodaira Hasegawa test. If you specify -nome,\n"
479 " it will compute minimum-evolution bootstrap supports instead\n"
480 " In either case, the support values are proportions ranging from 0 to 1\n"
482 " Use -nosupport to turn off support values or -boot 100 to use just 100 resamples\n"
483 " Use -seed to initialize the random number generator\n"
485 "Searching for the best join:\n"
486 " By default, FastTree combines the 'visible set' of fast neighbor-joining with\n"
487 " local hill-climbing as in relaxed neighbor-joining\n"
488 " -slow -- exhaustive search (like NJ or BIONJ, but different gap handling)\n"
489 " -slow takes half an hour instead of 8 seconds for 1,250 proteins\n"
490 " -fastest -- search the visible set (the top hit for each node) only\n"
491 " Unlike the original fast neighbor-joining, -fastest updates visible(C)\n"
492 " after joining A and B if join(AB,C) is better than join(C,visible(C))\n"
493 " -fastest also updates out-distances in a very lazy way,\n"
494 " -fastest sets -2nd on as well, use -fastest -no2nd to avoid this\n"
496 "Top-hit heuristics:\n"
497 " By default, FastTree uses a top-hit list to speed up search\n"
498 " Use -notop (or -slow) to turn this feature off\n"
499 " and compare all leaves to each other,\n"
500 " and all new joined nodes to each other\n"
501 " -topm 1.0 -- set the top-hit list size to parameter*sqrt(N)\n"
502 " FastTree estimates the top m hits of a leaf from the\n"
503 " top 2*m hits of a 'close' neighbor, where close is\n"
504 " defined as d(seed,close) < 0.75 * d(seed, hit of rank 2*m),\n"
505 " and updates the top-hits as joins proceed\n"
506 " -close 0.75 -- modify the close heuristic, lower is more conservative\n"
507 " -refresh 0.8 -- compare a joined node to all other nodes if its\n"
508 " top-hit list is less than 80% of the desired length,\n"
509 " or if the age of the top-hit list is log2(m) or greater\n"
510 " -2nd or -no2nd to turn 2nd-level top hits heuristic on or off\n"
511 " This reduces memory usage and running time but may lead to\n"
512 " marginal reductions in tree quality.\n"
513 " (By default, -fastest turns on -2nd.)\n"
516 " -nj: regular (unweighted) neighbor-joining (default)\n"
517 " -bionj: weighted joins as in BIONJ\n"
518 " FastTree will also weight joins during NNIs\n"
520 "Constrained topology search options:\n"
521 " -constraints alignmentfile -- an alignment with values of 0, 1, and -\n"
522 " Not all sequences need be present. A column of 0s and 1s defines a\n"
523 " constrained split. Some constraints may be violated\n"
524 " (see 'violating constraints:' in standard error).\n"
525 " -constraintWeight -- how strongly to weight the constraints. A value of 1\n"
526 " means a penalty of 1 in tree length for violating a constraint\n"
529 "For more information, see http://www.microbesonline.org/fasttree/\n"
530 " or the comments in the source code\n";
536 /* Note -- sequence lines longer than BUFFER_SIZE are
537 allowed, but FASTA header lines must be within this limit */
538 #define BUFFER_SIZE 5000
539 #define MIN(X,Y) ((X) < (Y) ? (X) : (Y))
540 #define MAX(X,Y) ((X) > (Y) ? (X) : (Y))
547 int nSaved; /* actual allocated size of names and seqs */
550 /* For each position in a profile, we have a weight (% non-gapped) and a
551 frequency vector. (If using a matrix, the frequency vector is in eigenspace).
552 We also store codes for simple profile positions (all gaps or only 1 value)
553 If weight[pos] > 0 && codes[pos] == NOCODE then we store the vector
554 vectors itself is sets of nCodes long, so the vector for the ith nonconstant position
555 starts at &vectors[nCodes*i]
557 To speed up comparison of outprofile to a sequence or other simple profile, we also
558 (for outprofiles) store codeDist[iPos*nCodes+k] = dist(k,profile[iPos])
560 For constraints, we store a vector of nOn and nOff
561 If not using constraints, those will be NULL
564 /* alignment profile */
566 unsigned char *codes;
567 numeric_t *vectors; /* NULL if no non-constant positions, e.g. for leaves */
569 numeric_t *codeDist; /* Optional -- distance to each code at each position */
571 /* constraint profile */
576 /* A visible node is a pair of nodes i, j such that j is the best hit of i,
577 using the neighbor-joining criterion, at the time the comparison was made,
578 or approximately so since then.
580 Note that variance = dist because in BIONJ, constant factors of variance do not matter,
581 and because we weight ungapped sequences higher naturally when averaging profiles,
582 so we do not take this into account in the computation of "lambda" for BIONJ.
584 For the top-hit list heuristic, if the top hit list becomes "too short",
585 we store invalid entries with i=j=-1 and dist/criterion very high.
589 numeric_t weight; /* Total product of weights (maximum value is nPos)
590 This is needed for weighted joins and for pseudocounts,
591 but not in most other places.
592 For example, it is not maintained by the top hits code */
593 numeric_t dist; /* The uncorrected distance (includes diameter correction) */
594 numeric_t criterion; /* changes when we update the out-profile or change nActive */
603 /* Distances between amino acids */
604 numeric_t distances[MAXCODES][MAXCODES];
606 /* Inverse of the eigenvalue matrix, for rotating a frequency vector
607 into eigenspace so that profile similarity computations are
608 O(alphabet) not O(alphabet*alphabet) time.
610 numeric_t eigeninv[MAXCODES][MAXCODES];
611 numeric_t eigenval[MAXCODES]; /* eigenvalues */
614 /* eigentot=eigeninv times the all-1s frequency vector
615 useful for normalizing rotated frequency vectors
617 numeric_t eigentot[MAXCODES];
619 /* codeFreq is the transpose of the eigeninv matrix is
620 the rotated frequency vector for each code */
621 numeric_t codeFreq[MAXCODES][MAXCODES];
622 numeric_t gapFreq[MAXCODES];
626 /* A transition matrix gives the instantaneous rate of change of frequencies
630 and which is not a symmetric matrix because of
631 non-uniform stationary frequencies stat, so that
633 M(i,j) is instantaneous rate of j -> i, not of i -> j
635 S = diag(sqrt(stat)) is a correction so that
636 M' = S**-1 M S is symmetric
637 Let W L W**-1 = M' be an eigendecomposition of M'
638 Because M' is symmetric, W can be a rotation, and W**-1 = t(W)
640 M = V L V**-1 is an eigendecomposition of M
641 Note V**-1 = W**-1 S**-1 = t(W) S**-1
643 Evolution by time t is given by
645 exp(M*t) = V exp(L*t) V**-1
646 P(A & B | t) = B . exp(M*t) . (A * stat)
647 note this is *not* the same as P(A->B | t)
649 and we can reduce some of the computations from O(a**2) to O(a) time,
650 where a is the alphabet size, by storing frequency vectors as
651 t(V) . f = t(W) . t(S) . f
654 P(f0 & f1 | t) = f1 . exp(M*t) . f0 * (f0 . stat) = sum(r0j * r1j * exp(l_j*t))
655 where r0 and r1 are the transformed vectors
657 Posterior distribution of P given children f0 and f1 is given by
658 P(i | f0, f1, t0, t1) = stat * P(i->f0 | t0) * P(i->f1 | t1)
659 = P(i & f0 | t0) * P(i & f1 | t1) / stat
660 ~ (V . exp(t0*L) . r0) * (V . exp(t1*L) . r1) / stat
662 When normalize this posterior distribution (to sum to 1), divide by stat,
663 and transform by t(V) -- this is the "profile" of internal nodes
665 To eliminate the O(N**2) step of transforming by t(V), if the posterior
666 distribution of an amino acid is near 1 then we can approximate it by
667 P(i) ~= (i==A) * w + nearP(i) * (1-w), where
668 w is fit so that P(i==A) is correct
669 nearP = Posterior(i | i, i, 0.1, 0.1) [0.1 is an arbitrary choice]
670 and we confirm that the approximation works well before we use it.
672 Given this parameter w we can set
673 rotated_posterior = rotation(w * (i==A)/stat + (1-w) * nearP/stat)
674 = codeFreq(A) * w/stat(A) + nearFreq(A) * (1-w)
677 numeric_t stat[MAXCODES]; /* The stationary distribution */
678 numeric_t statinv[MAXCODES]; /* 1/stat */
679 /* the eigenmatrix, with the eigenvectors as columns and rotations of individual
680 characters as rows. Also includes a NOCODE entry for gaps */
681 numeric_t codeFreq[NOCODE+1][MAXCODES];
682 numeric_t eigeninv[MAXCODES][MAXCODES]; /* Inverse of eigenmatrix */
683 numeric_t eigeninvT[MAXCODES][MAXCODES]; /* transpose of eigeninv */
684 numeric_t eigenval[MAXCODES]; /* Eigenvalues */
685 /* These are for approximate posteriors (off by default) */
686 numeric_t nearP[MAXCODES][MAXCODES]; /* nearP[i][j] = P(parent=j | both children are i, both lengths are 0.1 */
687 numeric_t nearFreq[MAXCODES][MAXCODES]; /* rotation of nearP/stat */
688 } transition_matrix_t;
692 numeric_t *rates; /* 1 per rate category */
693 unsigned int *ratecat; /* 1 category per position */
700 char **seqs; /* the aligment sequences array (not reallocated) */
701 distance_matrix_t *distance_matrix; /* a pointer (not reallocated), or NULL if using %identity distance */
702 transition_matrix_t *transmat; /* a pointer (is allocated), or NULL for Jukes-Cantor */
703 /* Topological constraints are represented for each sequence as binary characters
704 with values of '0', '1', or '-' (for missing data)
705 Sequences that have no constraint may have a NULL string
708 char **constraintSeqs;
710 /* The profile data structures */
711 int maxnode; /* The next index to allocate */
712 int maxnodes; /* Space allocated in data structures below */
713 profile_t **profiles; /* Profiles of leaves and intermediate nodes */
714 numeric_t *diameter; /* To correct for distance "up" from children (if any) */
715 numeric_t *varDiameter; /* To correct variances for distance "up" */
716 numeric_t *selfdist; /* Saved for use in some formulas */
717 numeric_t *selfweight; /* Saved for use in some formulas */
719 /* Average profile of all active nodes, the "outprofile"
720 * If all inputs are ungapped, this has weight 1 (not nSequences) at each position
721 * The frequencies all sum to one (or that is implied by the eigen-representation)
723 profile_t *outprofile;
726 /* We sometimes use stale out-distances, so we remember what nActive was */
727 numeric_t *outDistances; /* Sum of distances to other active (parent==-1) nodes */
728 int *nOutDistActive; /* What nActive was when this outDistance was computed */
730 /* the inferred tree */
731 int root; /* index of the root. Unlike other internal nodes, it has 3 children */
732 int *parent; /* -1 or index of parent */
734 numeric_t *branchlength; /* Distance to parent */
735 numeric_t *support; /* 1 for high-confidence nodes */
737 /* auxilliary data for maximum likelihood (defaults to 1 category of rate=1.0) */
741 /* Uniquify sequences in an alignment -- map from indices
742 in the alignment to unique indicies in a NJ_t
747 int *uniqueFirst; /* iUnique -> iAln */
748 int *alnNext; /* iAln -> next, or -1 */
749 int *alnToUniq; /* iAln -> iUnique, or -1 if another was the exemplar */
750 char **uniqueSeq; /* indexed by iUniq -- points to strings allocated elsewhere */
753 /* Describes which switch to do */
754 typedef enum {ABvsCD,ACvsBD,ADvsBC} nni_t;
756 /* A list of these describes a chain of NNI moves in a rooted tree,
757 making up, in total, an SPR move
761 double deltaLength; /* change in tree length for this step (lower is better) */
764 /* Keep track of hits for the top-hits heuristic without wasting memory
766 If j is an inactive node, this may be replaced by that node's parent (and dist recomputed)
774 int nHits; /* the allocated and desired size; some of them may be empty */
776 int hitSource; /* where to refresh hits from if a 2nd-level top-hit list, or -1 */
777 int age; /* number of joins since a refresh */
781 int m; /* size of a full top hits list, usually sqrt(N) */
782 int q; /* size of a 2nd-level top hits, usually sqrt(m) */
784 top_hits_list_t *top_hits_lists; /* one per node */
785 hit_t *visible; /* the "visible" (very best) hit for each node */
787 /* The top-visible set is a subset, usually of size m, of the visible set --
788 it is the set of joins to select from
789 Each entry is either a node whose visible set entry has a good (low) criterion,
790 or -1 for empty, or is an obsolete node (which is effectively the same).
791 Whenever we update the visible set, should also call UpdateTopVisible()
792 which ensures that none of the topvisible set are stale (that is, they
793 all point to an active node).
795 int nTopVisible; /* nTopVisible = m * topvisibleMult */
798 int topvisibleAge; /* joins since the top-visible list was recomputed */
801 /* 1 lock to read or write any top hits list, no thread grabs more than one */
806 /* Global variables */
809 int showProgress = 1;
812 bool useTopHits2nd = false; /* use the second-level top hits heuristic? */
814 double tophitsMult = 1.0; /* 0 means compare nodes to all other nodes */
815 double tophitsClose = -1.0; /* Parameter for how close is close; also used as a coverage req. */
816 double topvisibleMult = 1.5; /* nTopVisible = m * topvisibleMult; 1 or 2 did not make much difference
817 in either running time or accuracy so I chose a compromise. */
819 double tophitsRefresh = 0.8; /* Refresh if fraction of top-hit-length drops to this */
820 double tophits2Mult = 1.0; /* Second-level top heuristic -- only with -fastest */
821 int tophits2Safety = 3; /* Safety factor for second level of top-hits heuristic */
822 double tophits2Refresh = 0.6; /* Refresh 2nd-level top hits if drops down to this fraction of length */
824 double staleOutLimit = 0.01; /* nActive changes by at most this amount before we recompute
825 an out-distance. (Only applies if using the top-hits heuristic) */
826 double fResetOutProfile = 0.02; /* Recompute out profile from scratch if nActive has changed
827 by more than this proportion, and */
828 int nResetOutProfile = 200; /* nActive has also changed more than this amount */
829 int nCodes=20; /* 20 if protein, 4 if nucleotide */
830 bool useMatrix=true; /* If false, use %different as the uncorrected distance */
831 bool logdist = true; /* If true, do a log-correction (scoredist-like or Jukes-Cantor)
832 but only during NNIs and support values, not during neighbor-joining */
833 double pseudoWeight = 0.0; /* The weight of pseudocounts to avoid artificial long branches when
834 nearby sequences in the tree have little or no overlap
835 (off by default). The prior distance is based on
836 all overlapping positions among the quartet or triplet under
837 consideration. The log correction takes place after the
838 pseudocount is used. */
839 double constraintWeight = 100.0;/* Cost of violation of a topological constraint in evolutionary distance
841 double MEMinDelta = 1.0e-4; /* Changes of less than this in tree-length are discounted for
842 purposes of identifying fixed subtrees */
844 bool gammaLogLk = false; /* compute gamma likelihood without reoptimizing branch lengths? */
846 /* Maximum likelihood options and constants */
847 /* These are used to rescale likelihood values and avoid taking a logarithm at each position */
848 const double LkUnderflow = 1.0e-4;
849 const double LkUnderflowInv = 1.0e4;
850 const double LogLkUnderflow = 9.21034037197618; /* -log(LkUnderflowInv) */
851 const double Log2 = 0.693147180559945;
852 /* These are used to limit the optimization of branch lengths.
853 Also very short branch lengths can create numerical problems.
854 In version 2.1.7, the minimum branch lengths (MLMinBranchLength and MLMinRelBranchLength)
855 were increased to prevent numerical problems in rare cases.
856 In version 2.1.8, to provide useful branch lengths for genome-wide alignments,
857 the minimum branch lengths were dramatically decreased if USE_DOUBLE is defined.
860 const double MLMinBranchLengthTolerance = 1.0e-4; /* absolute tolerance for optimizing branch lengths */
861 const double MLFTolBranchLength = 0.001; /* fractional tolerance for optimizing branch lengths */
862 const double MLMinBranchLength = 5.0e-4; /* minimum value for branch length */
863 const double MLMinRelBranchLength = 2.5e-4; /* minimum of rate * length */
864 const double fPostTotalTolerance = 1.0e-10; /* posterior vector must sum to at least this before rescaling */
866 const double MLMinBranchLengthTolerance = 1.0e-9;
867 const double MLFTolBranchLength = 0.001;
868 const double MLMinBranchLength = 5.0e-9;
869 const double MLMinRelBranchLength = 2.5e-9;
870 const double fPostTotalTolerance = 1.0e-20;
873 int mlAccuracy = 1; /* Rounds of optimization of branch lengths; 1 means do 2nd round only if close */
874 double closeLogLkLimit = 5.0; /* If partial optimization of an NNI looks like it would decrease the log likelihood
875 by this much or more then do not optimize it further */
876 double treeLogLkDelta = 0.1; /* Give up if tree log-lk changes by less than this; NNIs that change
877 likelihood by less than this also are considered unimportant
878 by some heuristics */
879 bool exactML = true; /* Exact or approximate posterior distributions for a.a.s */
880 double approxMLminf = 0.95; /* Only try to approximate posterior distributions if max. value is at least this high */
881 double approxMLminratio = 2/3.0;/* Ratio of approximated/true posterior values must be at least this high */
882 double approxMLnearT = 0.2; /* 2nd component of near-constant posterior distribution uses this time scale */
883 const int nDefaultRateCats = 20;
885 /* Performance and memory usage */
886 long profileOps = 0; /* Full profile-based distance operations */
887 long outprofileOps = 0; /* How many of profileOps are comparisons to outprofile */
888 long seqOps = 0; /* Faster leaf-based distance operations */
889 long profileAvgOps = 0; /* Number of profile-average steps */
890 long nHillBetter = 0; /* Number of hill-climbing steps */
891 long nCloseUsed = 0; /* Number of "close" neighbors we avoid full search for */
892 long nClose2Used = 0; /* Number of "close" neighbors we use 2nd-level top hits for */
893 long nRefreshTopHits = 0; /* Number of full-blown searches (interior nodes) */
894 long nVisibleUpdate = 0; /* Number of updates of the visible set */
895 long nNNI = 0; /* Number of NNI changes performed */
896 long nSPR = 0; /* Number of SPR changes performed */
897 long nML_NNI = 0; /* Number of max-lik. NNI changes performed */
898 long nSuboptimalSplits = 0; /* # of splits that are rejected given final tree (during bootstrap) */
899 long nSuboptimalConstrained = 0; /* Bad splits that are due to constraints */
900 long nConstraintViolations = 0; /* Number of constraint violations */
901 long nProfileFreqAlloc = 0;
902 long nProfileFreqAvoid = 0;
904 long mymallocUsed = 0; /* useful allocations by mymalloc */
905 long maxmallocHeap = 0; /* Maximum of mi.arena+mi.hblkhd from mallinfo (actual mem usage) */
906 long nLkCompute = 0; /* # of likelihood computations for pairs of probability vectors */
907 long nPosteriorCompute = 0; /* # of computations of posterior probabilities */
908 long nAAPosteriorExact = 0; /* # of times compute exact AA posterior */
909 long nAAPosteriorRough = 0; /* # of times use rough approximation */
910 long nStarTests = 0; /* # of times we use star test to avoid testing an NNI */
912 /* Protein character set */
913 unsigned char *codesStringAA = (unsigned char*) "ARNDCQEGHILKMFPSTWYV";
914 unsigned char *codesStringNT = (unsigned char*) "ACGT";
915 unsigned char *codesString = NULL;
917 distance_matrix_t *ReadDistanceMatrix(char *prefix);
918 void SetupDistanceMatrix(/*IN/OUT*/distance_matrix_t *); /* set eigentot, codeFreq, gapFreq */
919 void ReadMatrix(char *filename, /*OUT*/numeric_t codes[MAXCODES][MAXCODES], bool check_codes);
920 void ReadVector(char *filename, /*OUT*/numeric_t codes[MAXCODES]);
921 alignment_t *ReadAlignment(/*READ*/FILE *fp, bool bQuote); /* Returns a list of strings (exits on failure) */
922 alignment_t *FreeAlignment(alignment_t *); /* returns NULL */
923 void FreeAlignmentSeqs(/*IN/OUT*/alignment_t *);
925 /* Takes as input the transpose of the matrix V, with i -> j
926 This routine takes care of setting the diagonals
928 transition_matrix_t *CreateTransitionMatrix(/*IN*/double matrix[MAXCODES][MAXCODES],
929 /*IN*/double stat[MAXCODES]);
930 transition_matrix_t *CreateGTR(double *gtrrates/*ac,ag,at,cg,ct,gt*/, double *gtrfreq/*ACGT*/);
932 /* For converting profiles from 1 rotation to another, or converts NULL to NULL */
933 distance_matrix_t *TransMatToDistanceMat(transition_matrix_t *transmat);
935 /* Allocates memory, initializes leaf profiles */
936 NJ_t *InitNJ(char **sequences, int nSeqs, int nPos,
937 /*IN OPTIONAL*/char **constraintSeqs, int nConstraints,
938 /*IN OPTIONAL*/distance_matrix_t *,
939 /*IN OPTIONAL*/transition_matrix_t *);
941 NJ_t *FreeNJ(NJ_t *NJ); /* returns NULL */
942 void FastNJ(/*IN/OUT*/NJ_t *NJ); /* Does the joins */
943 void ReliabilityNJ(/*IN/OUT*/NJ_t *NJ, int nBootstrap); /* Estimates the reliability of the joins */
945 /* nni_stats_t is meaningless for leaves and root, so all of those entries
946 will just be high (for age) or 0 (for delta)
949 int age; /* number of rounds since this node was modified by an NNI */
950 int subtreeAge; /* number of rounds since self or descendent had a significant improvement */
951 double delta; /* improvement in score for this node (or 0 if no change) */
952 double support; /* improvement of score for self over better of alternatives */
955 /* One round of nearest-neighbor interchanges according to the
956 minimum-evolution or approximate maximum-likelihood criterion.
957 If doing maximum likelihood then this modifies the branch lengths.
958 age is the # of rounds since a node was NNId
959 Returns the # of topological changes performed
961 int NNI(/*IN/OUT*/NJ_t *NJ, int iRound, int nRounds, bool useML,
962 /*IN/OUT*/nni_stats_t *stats,
963 /*OUT*/double *maxDeltaCriterion);
964 nni_stats_t *InitNNIStats(NJ_t *NJ);
965 nni_stats_t *FreeNNIStats(nni_stats_t *, NJ_t *NJ); /* returns NULL */
967 /* One round of subtree-prune-regraft moves (minimum evolution) */
968 void SPR(/*IN/OUT*/NJ_t *NJ, int maxSPRLength, int iRound, int nRounds);
970 /* Recomputes all branch lengths by minimum evolution criterion*/
971 void UpdateBranchLengths(/*IN/OUT*/NJ_t *NJ);
973 /* Recomputes all branch lengths and, optionally, internal profiles */
974 double TreeLength(/*IN/OUT*/NJ_t *NJ, bool recomputeProfiles);
978 int nConstraintViolations;
981 /* How much length would be reduce or likelihood would be increased by the
982 best NNI we find (the worst "miss") */
983 double dWorstDeltaUnconstrained;
984 double dWorstDeltaConstrained;
987 void TestSplitsMinEvo(NJ_t *NJ, /*OUT*/SplitCount_t *splitcount);
989 /* Sets SH-like support values if nBootstrap>0 */
990 void TestSplitsML(/*IN/OUT*/NJ_t *NJ, /*OUT*/SplitCount_t *splitcount, int nBootstrap);
992 /* Pick columns for resampling, stored as returned_vector[iBoot*nPos + j] */
993 int *ResampleColumns(int nPos, int nBootstrap);
995 /* Use out-profile and NJ->totdiam to recompute out-distance for node iNode
996 Only does this computation if the out-distance is "stale" (nOutDistActive[iNode] != nActive)
997 Note "IN/UPDATE" for NJ always means that we may update out-distances but otherwise
1000 void SetOutDistance(/*IN/UPDATE*/NJ_t *NJ, int iNode, int nActive);
1002 /* Always sets join->criterion; may update NJ->outDistance and NJ->nOutDistActive,
1003 assumes join's weight and distance are already set,
1004 and that the constraint penalty (if any) is included in the distance
1006 void SetCriterion(/*IN/UPDATE*/NJ_t *NJ, int nActive, /*IN/OUT*/besthit_t *join);
1008 /* Computes weight and distance (which includes the constraint penalty)
1009 and then sets the criterion (maybe update out-distances)
1011 void SetDistCriterion(/*IN/UPDATE*/NJ_t *NJ, int nActive, /*IN/OUT*/besthit_t *join);
1013 /* If join->i or join->j are inactive nodes, replaces them with their active ancestors.
1014 After doing this, if i == j, or either is -1, sets weight to 0 and dist and criterion to 1e20
1015 and returns false (not a valid join)
1016 Otherwise, if i or j changed, recomputes the distance and criterion.
1017 Note that if i and j are unchanged then the criterion could be stale
1018 If bUpdateDist is false, and i or j change, then it just sets dist to a negative number
1020 bool UpdateBestHit(/*IN/UPDATE*/NJ_t *NJ, int nActive, /*IN/OUT*/besthit_t *join,
1023 /* This recomputes the criterion, or returns false if the visible node
1024 is no longer active.
1026 bool GetVisible(/*IN/UPDATE*/NJ_t *NJ, int nActive, /*IN/OUT*/top_hits_t *tophits,
1027 int iNode, /*OUT*/besthit_t *visible);
1029 int ActiveAncestor(/*IN*/NJ_t *NJ, int node);
1031 /* Compute the constraint penalty for a join. This is added to the "distance"
1033 int JoinConstraintPenalty(/*IN*/NJ_t *NJ, int node1, int node2);
1034 int JoinConstraintPenaltyPiece(NJ_t *NJ, int node1, int node2, int iConstraint);
1036 /* Helper function for computing the number of constraints violated by
1037 a split, represented as counts of on and off on each side */
1038 int SplitConstraintPenalty(int nOn1, int nOff1, int nOn2, int nOff2);
1040 /* Reports the (min. evo.) support for the (1,2) vs. (3,4) split
1041 col[iBoot*nPos+j] is column j for bootstrap iBoot
1043 double SplitSupport(profile_t *p1, profile_t *p2, profile_t *p3, profile_t *p4,
1044 /*OPTIONAL*/distance_matrix_t *dmat,
1049 /* Returns SH-like support given resampling spec. (in col) and site likelihods
1050 for the three quartets
1052 double SHSupport(int nPos, int nBoostrap, int *col, double loglk[3], double *site_likelihoods[3]);
1054 profile_t *SeqToProfile(/*IN/OUT*/NJ_t *NJ,
1055 char *seq, int nPos,
1056 /*OPTIONAL*/char *constraintSeqs, int nConstraints,
1058 unsigned long counts[256]);
1060 /* ProfileDist and SeqDist only set the dist and weight fields
1061 If using an outprofile, use the second argument of ProfileDist
1062 for better performance.
1064 These produce uncorrected distances.
1066 void ProfileDist(profile_t *profile1, profile_t *profile2, int nPos,
1067 /*OPTIONAL*/distance_matrix_t *distance_matrix,
1068 /*OUT*/besthit_t *hit);
1069 void SeqDist(unsigned char *codes1, unsigned char *codes2, int nPos,
1070 /*OPTIONAL*/distance_matrix_t *distance_matrix,
1071 /*OUT*/besthit_t *hit);
1073 /* Computes all pairs of profile distances, applies pseudocounts
1074 if pseudoWeight > 0, and applies log-correction if logdist is true.
1075 The lower index is compared to the higher index, e.g. for profiles
1076 A,B,C,D the comparison will be as in quartet_pair_t
1078 typedef enum {qAB,qAC,qAD,qBC,qBD,qCD} quartet_pair_t;
1079 void CorrectedPairDistances(profile_t **profiles, int nProfiles,
1080 /*OPTIONAL*/distance_matrix_t *distance_matrix,
1082 /*OUT*/double *distances);
1084 /* output is indexed by nni_t
1085 To ensure good behavior while evaluating a subtree-prune-regraft move as a series
1086 of nearest-neighbor interchanges, this uses a distance-ish model of constraints,
1087 as given by PairConstraintDistance(), rather than
1088 counting the number of violated splits (which is what FastTree does
1089 during neighbor-joining).
1090 Thus, penalty values may well be >0 even if no constraints are violated, but the
1091 relative scores for the three NNIs will be correct.
1093 void QuartetConstraintPenalties(profile_t *profiles[4], int nConstraints, /*OUT*/double d[3]);
1095 double PairConstraintDistance(int nOn1, int nOff1, int nOn2, int nOff2);
1097 /* the split is consistent with the constraint if any of the profiles have no data
1098 or if three of the profiles have the same uniform value (all on or all off)
1099 or if AB|CD = 00|11 or 11|00 (all uniform)
1101 bool SplitViolatesConstraint(profile_t *profiles[4], int iConstraint);
1103 /* If false, no values were set because this constraint was not relevant.
1104 output is for the 3 splits
1106 bool QuartetConstraintPenaltiesPiece(profile_t *profiles[4], int iConstraint, /*OUT*/double penalty[3]);
1108 /* Apply Jukes-Cantor or scoredist-like log(1-d) transform
1109 to correct the distance for multiple substitutions.
1111 double LogCorrect(double distance);
1113 /* AverageProfile is used to do a weighted combination of nodes
1114 when doing a join. If weight is negative, then the value is ignored and the profiles
1115 are averaged. The weight is *not* adjusted for the gap content of the nodes.
1116 Also, the weight does not affect the representation of the constraints
1118 profile_t *AverageProfile(profile_t *profile1, profile_t *profile2,
1119 int nPos, int nConstraints,
1120 distance_matrix_t *distance_matrix,
1123 /* PosteriorProfile() is like AverageProfile() but it computes posterior probabilities
1124 rather than an average
1126 profile_t *PosteriorProfile(profile_t *profile1, profile_t *profile2,
1127 double len1, double len2,
1128 /*OPTIONAL*/transition_matrix_t *transmat,
1130 int nPos, int nConstraints);
1132 /* Set a node's profile from its children.
1133 Deletes the previous profile if it exists
1134 Use -1.0 for a balanced join
1135 Fails unless the node has two children (e.g., no leaves or root)
1137 void SetProfile(/*IN/OUT*/NJ_t *NJ, int node, double weight1);
1139 /* OutProfile does an unweighted combination of nodes to create the
1140 out-profile. It always sets code to NOCODE so that UpdateOutProfile
1143 profile_t *OutProfile(profile_t **profiles, int nProfiles,
1144 int nPos, int nConstraints,
1145 distance_matrix_t *distance_matrix);
1147 void UpdateOutProfile(/*UPDATE*/profile_t *out, profile_t *old1, profile_t *old2,
1148 profile_t *new, int nActiveOld,
1149 int nPos, int nConstraints,
1150 distance_matrix_t *distance_matrix);
1152 profile_t *NewProfile(int nPos, int nConstraints); /* returned has no vectors */
1153 profile_t *FreeProfile(profile_t *profile, int nPos, int nConstraints); /* returns NULL */
1155 void AllocRateCategories(/*IN/OUT*/rates_t *rates, int nRateCategories, int nPos);
1157 /* f1 can be NULL if code1 != NOCODE, and similarly for f2
1158 Or, if (say) weight1 was 0, then can have code1==NOCODE *and* f1==NULL
1159 In that case, returns an arbitrary large number.
1161 double ProfileDistPiece(unsigned int code1, unsigned int code2,
1162 numeric_t *f1, numeric_t *f2,
1163 /*OPTIONAL*/distance_matrix_t *dmat,
1164 /*OPTIONAL*/numeric_t *codeDist2);
1166 /* Adds (or subtracts, if weight is negative) fIn/codeIn from fOut
1167 fOut is assumed to exist (as from an outprofile)
1168 do not call unless weight of input profile > 0
1170 void AddToFreq(/*IN/OUT*/numeric_t *fOut, double weight,
1171 unsigned int codeIn, /*OPTIONAL*/numeric_t *fIn,
1172 /*OPTIONAL*/distance_matrix_t *dmat);
1174 /* Divide the vector (of length nCodes) by a constant
1175 so that the total (unrotated) frequency is 1.0 */
1176 void NormalizeFreq(/*IN/OUT*/numeric_t *freq, distance_matrix_t *distance_matrix);
1178 /* Allocate, if necessary, and recompute the codeDist*/
1179 void SetCodeDist(/*IN/OUT*/profile_t *profile, int nPos, distance_matrix_t *dmat);
1181 /* The allhits list contains the distances of the node to all other active nodes
1182 This is useful for the "reset" improvement to the visible set
1183 Note that the following routines do not handle the tophits heuristic
1184 and assume that out-distances are up to date.
1186 void SetBestHit(int node, NJ_t *NJ, int nActive,
1187 /*OUT*/besthit_t *bestjoin,
1188 /*OUT OPTIONAL*/besthit_t *allhits);
1189 void ExhaustiveNJSearch(NJ_t *NJ, int nActive, /*OUT*/besthit_t *bestjoin);
1191 /* Searches the visible set */
1192 void FastNJSearch(NJ_t *NJ, int nActive, /*UPDATE*/besthit_t *visible, /*OUT*/besthit_t *bestjoin);
1194 /* Subroutines for handling the tophits heuristic */
1196 top_hits_t *InitTopHits(NJ_t *NJ, int m);
1197 top_hits_t *FreeTopHits(top_hits_t *tophits); /* returns NULL */
1199 /* Before we do any joins -- sets tophits and visible
1200 NJ may be modified by setting out-distances
1202 void SetAllLeafTopHits(/*IN/UPDATE*/NJ_t *NJ, /*IN/OUT*/top_hits_t *tophits);
1204 /* Find the best join to do. */
1205 void TopHitNJSearch(/*IN/UPDATE*/NJ_t *NJ,
1207 /*IN/OUT*/top_hits_t *tophits,
1208 /*OUT*/besthit_t *bestjoin);
1210 /* Returns the best hit within top hits
1211 NJ may be modified because it updates out-distances if they are too stale
1212 Does *not* update visible set
1214 void GetBestFromTopHits(int iNode, /*IN/UPDATE*/NJ_t *NJ, int nActive,
1215 /*IN*/top_hits_t *tophits,
1216 /*OUT*/besthit_t *bestjoin);
1218 /* visible set is modifiable so that we can reset it more globally when we do
1219 a "refresh", but we also set the visible set for newnode and do any
1220 "reset" updates too. And, we update many outdistances.
1222 void TopHitJoin(int newnode,
1223 /*IN/UPDATE*/NJ_t *NJ, int nActive,
1224 /*IN/OUT*/top_hits_t *tophits);
1226 /* Sort the input besthits by criterion
1227 and save the best nOut hits as a new array in top_hits_lists
1228 Does not update criterion or out-distances
1229 Ignores (silently removes) hit to self
1230 Saved list may be shorter than requested if there are insufficient entries
1232 void SortSaveBestHits(int iNode, /*IN/SORT*/besthit_t *besthits,
1234 /*IN/OUT*/top_hits_t *tophits);
1236 /* Given candidate hits from one node, "transfer" them to another node:
1237 Stores them in a new place in the same order
1238 searches up to active nodes if hits involve non-active nodes
1239 If update flag is set, it also recomputes distance and criterion
1240 (and ensures that out-distances are updated); otherwise
1241 it sets dist to -1e20 and criterion to 1e20
1244 void TransferBestHits(/*IN/UPDATE*/NJ_t *NJ, int nActive,
1246 /*IN*/besthit_t *oldhits,
1248 /*OUT*/besthit_t *newhits,
1249 bool updateDistance);
1251 /* Create best hit objects from 1 or more hits. Do not update out-distances or set criteria */
1252 void HitsToBestHits(/*IN*/hit_t *hits, int nHits, int iNode, /*OUT*/besthit_t *newhits);
1253 besthit_t HitToBestHit(int i, hit_t hit);
1255 /* Given a set of besthit entries,
1256 look for improvements to the visible set of the j entries.
1257 Updates out-distances as it goes.
1258 Also replaces stale nodes with this node, because a join is usually
1259 how this happens (i.e. it does not need to walk up to ancestors).
1260 Note this calls UpdateTopVisible() on any change
1262 void UpdateVisible(/*IN/UPDATE*/NJ_t *NJ, int nActive,
1263 /*IN*/besthit_t *tophitsNode,
1265 /*IN/OUT*/top_hits_t *tophits);
1267 /* Update the top-visible list to perhaps include this hit (O(sqrt(N)) time) */
1268 void UpdateTopVisible(/*IN*/NJ_t * NJ, int nActive,
1269 int iNode, /*IN*/hit_t *hit,
1270 /*IN/OUT*/top_hits_t *tophits);
1272 /* Recompute the top-visible subset of the visible set */
1273 void ResetTopVisible(/*IN/UPDATE*/NJ_t *NJ,
1275 /*IN/OUT*/top_hits_t *tophits);
1277 /* Make a shorter list with only unique entries.
1278 Replaces any "dead" hits to nodes that have parents with their active ancestors
1279 and ignores any that become dead.
1280 Updates all criteria.
1281 Combined gets sorted by i & j
1282 The returned list is allocated to nCombined even though only *nUniqueOut entries are filled
1284 besthit_t *UniqueBestHits(/*IN/UPDATE*/NJ_t *NJ, int nActive,
1285 /*IN/SORT*/besthit_t *combined, int nCombined,
1286 /*OUT*/int *nUniqueOut);
1288 nni_t ChooseNNI(profile_t *profiles[4],
1289 /*OPTIONAL*/distance_matrix_t *dmat,
1290 int nPos, int nConstraints,
1291 /*OUT*/double criteria[3]); /* The three internal branch lengths or log likelihoods*/
1293 /* length[] is ordered as described by quartet_length_t, but after we do the swap
1294 of B with C (to give AC|BD) or B with D (to get AD|BC), if that is the returned choice
1295 bFast means do not consider NNIs if AB|CD is noticeably better than the star topology
1296 (as implemented by MLQuartetOptimize).
1297 If there are constraints, then the constraint penalty is included in criteria[]
1299 nni_t MLQuartetNNI(profile_t *profiles[4],
1300 /*OPTIONAL*/transition_matrix_t *transmat, rates_t *rates,
1301 int nPos, int nConstraints,
1302 /*OUT*/double criteria[3], /* The three potential quartet log-likelihoods */
1303 /*IN/OUT*/numeric_t length[5],
1306 void OptimizeAllBranchLengths(/*IN/OUT*/NJ_t *NJ);
1307 double TreeLogLk(/*IN*/NJ_t *NJ, /*OPTIONAL OUT*/double *site_loglk);
1308 double MLQuartetLogLk(profile_t *pA, profile_t *pB, profile_t *pC, profile_t *pD,
1309 int nPos, /*OPTIONAL*/transition_matrix_t *transmat, rates_t *rates,
1310 /*IN*/double branch_lengths[5],
1311 /*OPTIONAL OUT*/double *site_likelihoods);
1313 /* Given a topology and branch lengths, estimate rates & recompute profiles */
1314 void SetMLRates(/*IN/OUT*/NJ_t *NJ, int nRateCategories);
1316 /* Returns a set of nRateCategories potential rates; the caller must free it */
1317 numeric_t *MLSiteRates(int nRateCategories);
1319 /* returns site_loglk so that
1320 site_loglk[nPos*iRate + j] is the log likelihood of site j with rate iRate
1321 The caller must free it.
1323 double *MLSiteLikelihoodsByRate(/*IN*/NJ_t *NJ, /*IN*/numeric_t *rates, int nRateCategories);
1326 double mult; /* multiplier for the rates / divisor for the tree-length */
1334 double GammaLogLk(/*IN*/siteratelk_t *s, /*OPTIONAL OUT*/double *gamma_loglk_sites);
1336 /* Input site_loglk must be for each rate. Note that FastTree does not reoptimize
1337 the branch lengths under the Gamma model -- it optimizes the overall scale.
1338 Reports the gamma log likelihhod (and logs site likelihoods if fpLog is set),
1339 and reports the rescaling value.
1341 double RescaleGammaLogLk(int nPos, int nRateCats,
1342 /*IN*/numeric_t *rates, /*IN*/double *site_loglk,
1343 /*OPTIONAL*/FILE *fpLog);
1345 /* P(value<=x) for the gamma distribution with shape parameter alpha and scale 1/alpha */
1346 double PGamma(double x, double alpha);
1348 /* Given a topology and branch lengths, optimize GTR rates and quickly reoptimize branch lengths
1349 If gtrfreq is NULL, then empirical frequencies are used
1351 void SetMLGtr(/*IN/OUT*/NJ_t *NJ, /*OPTIONAL IN*/double *gtrfreq, /*OPTIONAL WRITE*/FILE *fpLog);
1353 /* P(A & B | len) = P(B | A, len) * P(A)
1354 If site_likelihoods is present, multiplies those values by the site likelihood at each point
1355 (Note it does not handle underflow)
1357 double PairLogLk(/*IN*/profile_t *p1, /*IN*/profile_t *p2, double length,
1358 int nPos, /*OPTIONAL*/transition_matrix_t *transmat, rates_t *rates,
1359 /*OPTIONAL IN/OUT*/double *site_likelihoods);
1361 /* Branch lengths for 4-taxon tree ((A,B),C,D); I means internal */
1362 typedef enum {LEN_A,LEN_B,LEN_C,LEN_D,LEN_I} quartet_length_t;
1366 transition_matrix_t *transmat;
1368 int nEval; /* number of likelihood evaluations */
1369 /* The pair to optimize */
1374 double PairNegLogLk(double x, void *data); /* data must be a quartet_opt_t */
1380 int iRate; /* which rate to set x from */
1383 /* Returns -log_likelihood for the tree with the given rates
1384 data must be a gtr_opt_t and x is used to set rate iRate
1385 Does not recompute profiles -- assumes that the caller will
1387 double GTRNegLogLk(double x, void *data);
1389 /* Returns the resulting log likelihood. Optionally returns whether other
1390 topologies should be abandoned, based on the difference between AB|CD and
1391 the "star topology" (AB|CD with a branch length of MLMinBranchLength) exceeding
1393 If bStarTest is passed in, it only optimized the internal branch if
1394 the star test is true. Otherwise, it optimized all 5 branch lengths
1397 double MLQuartetOptimize(profile_t *pA, profile_t *pB, profile_t *pC, profile_t *pD,
1398 int nPos, /*OPTIONAL*/transition_matrix_t *transmat, rates_t *rates,
1399 /*IN/OUT*/double branch_lengths[5],
1400 /*OPTIONAL OUT*/bool *pStarTest,
1401 /*OPTIONAL OUT*/double *site_likelihoods);
1403 /* Returns the resulting log likelihood */
1404 double MLPairOptimize(profile_t *pA, profile_t *pB,
1405 int nPos, /*OPTIONAL*/transition_matrix_t *transmat, rates_t *rates,
1406 /*IN/OUT*/double *branch_length);
1408 /* Returns the number of steps considered, with the actual steps in steps[]
1409 Modifies the tree by this chain of NNIs
1411 int FindSPRSteps(/*IN/OUT*/NJ_t *NJ,
1413 int parent, /* sibling or parent of node to NNI to start the chain */
1414 /*IN/OUT*/profile_t **upProfiles,
1415 /*OUT*/spr_step_t *steps,
1419 /* Undo a single NNI */
1420 void UnwindSPRStep(/*IN/OUT*/NJ_t *NJ,
1421 /*IN*/spr_step_t *step,
1422 /*IN/OUT*/profile_t **upProfiles);
1425 /* Update the profile of node and its ancestor, and delete nearby out-profiles */
1426 void UpdateForNNI(/*IN/OUT*/NJ_t *NJ, int node, /*IN/OUT*/profile_t **upProfiles, bool useML);
1428 /* Sets NJ->parent[newchild] and replaces oldchild with newchild
1429 in the list of children of parent
1431 void ReplaceChild(/*IN/OUT*/NJ_t *NJ, int parent, int oldchild, int newchild);
1433 int CompareHitsByCriterion(const void *c1, const void *c2);
1434 int CompareHitsByIJ(const void *c1, const void *c2);
1436 int NGaps(NJ_t *NJ, int node); /* only handles leaf sequences */
1438 /* node is the parent of AB, sibling of C
1439 node cannot be root or a leaf
1440 If node is the child of root, then D is the other sibling of node,
1441 and the 4th profile is D's profile.
1442 Otherwise, D is the parent of node, and we use its upprofile
1443 Call this with profiles=NULL to get the nodes, without fetching or
1446 void SetupABCD(NJ_t *NJ, int node,
1447 /* the 4 profiles for ABCD; the last one is an upprofile */
1448 /*OPTIONAL OUT*/profile_t *profiles[4],
1449 /*OPTIONAL IN/OUT*/profile_t **upProfiles,
1450 /*OUT*/int nodeABCD[4],
1453 int Sibling(NJ_t *NJ, int node); /* At root, no unique sibling so returns -1 */
1454 void RootSiblings(NJ_t *NJ, int node, /*OUT*/int sibs[2]);
1456 /* JC probability of nucleotide not changing, for each rate category */
1457 double *PSameVector(double length, rates_t *rates);
1459 /* JC probability of nucleotide not changing, for each rate category */
1460 double *PDiffVector(double *pSame, rates_t *rates);
1462 /* expeigen[iRate*nCodes + j] = exp(length * rate iRate * eigenvalue j) */
1463 numeric_t *ExpEigenRates(double length, transition_matrix_t *transmat, rates_t *rates);
1465 /* Print a progress report if more than 0.1 second has gone by since the progress report */
1466 /* Format should include 0-4 %d references and no newlines */
1467 void ProgressReport(char *format, int iArg1, int iArg2, int iArg3, int iArg4);
1468 void LogTree(char *format, int round, /*OPTIONAL WRITE*/FILE *fp, NJ_t *NJ, char **names, uniquify_t *unique, bool bQuote);
1469 void LogMLRates(/*OPTIONAL WRITE*/FILE *fpLog, NJ_t *NJ);
1471 void *mymalloc(size_t sz); /* Prints "Out of memory" and exits on failure */
1472 void *myfree(void *, size_t sz); /* Always returns NULL */
1474 /* One-dimensional minimization using brent's function, with
1475 a fractional and an absolute tolerance */
1476 double onedimenmin(double xmin, double xguess, double xmax, double (*f)(double,void*), void *data,
1477 double ftol, double atol,
1478 /*OUT*/double *fx, /*OUT*/double *f2x);
1480 double brent(double ax, double bx, double cx, double (*f)(double, void *), void *data,
1481 double ftol, double atol,
1482 double *foptx, double *f2optx, double fax, double fbx, double fcx);
1484 /* Vector operations, either using SSE3 or not
1485 Code assumes that vectors are a multiple of 4 in size
1487 void vector_multiply(/*IN*/numeric_t *f1, /*IN*/numeric_t *f2, int n, /*OUT*/numeric_t *fOut);
1488 numeric_t vector_multiply_sum(/*IN*/numeric_t *f1, /*IN*/numeric_t *f2, int n);
1489 void vector_add_mult(/*IN/OUT*/numeric_t *f, /*IN*/numeric_t *add, numeric_t weight, int n);
1491 /* multiply the transpose of a matrix by a vector */
1492 void matrixt_by_vector4(/*IN*/numeric_t mat[4][MAXCODES], /*IN*/numeric_t vec[4], /*OUT*/numeric_t out[4]);
1494 /* sum(f1*fBy)*sum(f2*fBy) */
1495 numeric_t vector_dot_product_rot(/*IN*/numeric_t *f1, /*IN*/numeric_t *f2, /*IN*/numeric_t* fBy, int n);
1498 numeric_t vector_multiply3_sum(/*IN*/numeric_t *f1, /*IN*/numeric_t *f2, /*IN*/numeric_t* f3, int n);
1500 numeric_t vector_sum(/*IN*/numeric_t *f1, int n);
1501 void vector_multiply_by(/*IN/OUT*/numeric_t *f, /*IN*/numeric_t fBy, int n);
1503 double clockDiff(/*IN*/struct timeval *clock_start);
1504 int timeval_subtract (/*OUT*/struct timeval *result, /*IN*/struct timeval *x, /*IN*/struct timeval *y);
1506 char *OpenMPString(void);
1508 void ran_start(long seed);
1509 double knuth_rand(); /* Random number between 0 and 1 */
1510 void tred2 (double *a, const int n, const int np, double *d, double *e);
1511 double pythag(double a, double b);
1512 void tqli(double *d, double *e, int n, int np, double *z);
1514 /* Like mymalloc; duplicates the input (returns NULL if given NULL) */
1515 void *mymemdup(void *data, size_t sz);
1516 void *myrealloc(void *data, size_t szOld, size_t szNew, bool bCopy);
1518 double pnorm(double z); /* Probability(value <=z) */
1520 /* Hashtable functions */
1524 int nCount; /* number of times this entry was seen */
1525 int first; /* index of first entry with this value */
1530 /* hashvalue -> bucket. Or look in bucket + 1, +2, etc., till you hit a NULL string */
1531 hashbucket_t *buckets;
1533 typedef int hashiterator_t;
1535 hashstrings_t *MakeHashtable(char **strings, int nStrings);
1536 hashstrings_t *FreeHashtable(hashstrings_t* hash); /*returns NULL*/
1537 hashiterator_t FindMatch(hashstrings_t *hash, char *string);
1539 /* Return NULL if we have run out of values */
1540 char *GetHashString(hashstrings_t *hash, hashiterator_t hi);
1541 int HashCount(hashstrings_t *hash, hashiterator_t hi);
1542 int HashFirst(hashstrings_t *hash, hashiterator_t hi);
1544 void PrintNJ(/*WRITE*/FILE *, NJ_t *NJ, char **names, uniquify_t *unique, bool bShowSupport, bool bQuoteNames);
1546 /* Print topology using node indices as node names */
1547 void PrintNJInternal(/*WRITE*/FILE *, NJ_t *NJ, bool useLen);
1549 uniquify_t *UniquifyAln(/*IN*/alignment_t *aln);
1550 uniquify_t *FreeUniquify(uniquify_t *); /* returns NULL */
1552 /* Convert a constraint alignment to a list of sequences. The returned array is indexed
1553 by iUnique and points to values in the input alignment
1555 char **AlnToConstraints(alignment_t *constraints, uniquify_t *unique, hashstrings_t *hashnames);
1557 /* ReadTree ignores non-unique leaves after the first instance.
1558 At the end, it prunes the tree to ignore empty children and it
1559 unroots the tree if necessary.
1561 void ReadTree(/*IN/OUT*/NJ_t *NJ,
1562 /*IN*/uniquify_t *unique,
1563 /*IN*/hashstrings_t *hashnames,
1564 /*READ*/FILE *fpInTree);
1565 char *ReadTreeToken(/*READ*/FILE *fp); /* returns a static array, or NULL on EOF */
1566 void ReadTreeAddChild(int parent, int child, /*IN/OUT*/int *parents, /*IN/OUT*/children_t *children);
1567 /* Do not add the leaf if we already set this unique-set to another parent */
1568 void ReadTreeMaybeAddLeaf(int parent, char *name,
1569 hashstrings_t *hashnames, uniquify_t *unique,
1570 /*IN/OUT*/int *parents, /*IN/OUT*/children_t *children);
1571 void ReadTreeRemove(/*IN/OUT*/int *parents, /*IN/OUT*/children_t *children, int node);
1573 /* Routines to support tree traversal and prevent visiting a node >1 time
1574 (esp. if topology changes).
1576 typedef bool *traversal_t;
1577 traversal_t InitTraversal(NJ_t*);
1578 void SkipTraversalInto(int node, /*IN/OUT*/traversal_t traversal);
1579 traversal_t FreeTraversal(traversal_t, NJ_t*); /*returns NULL*/
1581 /* returns new node, or -1 if nothing left to do. Use root for the first call.
1582 Will return every node and then root.
1583 Uses postorder tree traversal (depth-first search going down to leaves first)
1584 Keeps track of which nodes are visited, so even after an NNI that swaps a
1585 visited child with an unvisited uncle, the next call will visit the
1586 was-uncle-now-child. (However, after SPR moves, there is no such guarantee.)
1588 If pUp is not NULL, then, if going "back up" through a previously visited node
1589 (presumably due to an NNI), then it will return the node another time,
1592 int TraversePostorder(int lastnode, NJ_t *NJ, /*IN/OUT*/traversal_t,
1593 /*OUT OPTIONAL*/bool *pUp);
1595 /* Routines to support storing up-profiles during tree traversal
1596 Eventually these should be smart enough to do weighted joins and
1597 to minimize memory usage
1599 profile_t **UpProfiles(NJ_t *NJ);
1600 profile_t *GetUpProfile(/*IN/OUT*/profile_t **upProfiles, NJ_t *NJ, int node, bool useML);
1601 profile_t *DeleteUpProfile(/*IN/OUT*/profile_t **upProfiles, NJ_t *NJ, int node); /* returns NULL */
1602 profile_t **FreeUpProfiles(profile_t **upProfiles, NJ_t *NJ); /* returns NULL */
1604 /* Recomputes the profile for a node, presumably to reflect topology changes
1605 If bionj is set, does a weighted join -- which requires using upProfiles
1606 If useML is set, computes the posterior probability instead of averaging
1608 void RecomputeProfile(/*IN/OUT*/NJ_t *NJ, /*IN/OUT*/profile_t **upProfiles, int node, bool useML);
1610 /* Recompute profiles going up from the leaves, using the provided distance matrix
1611 and unweighted joins
1613 void RecomputeProfiles(/*IN/OUT*/NJ_t *NJ, /*OPTIONAL*/distance_matrix_t *dmat);
1615 void RecomputeMLProfiles(/*IN/OUT*/NJ_t *NJ);
1617 /* If bionj is set, computes the weight to be given to A when computing the
1618 profile for the ancestor of A and B. C and D are the other profiles in the quartet
1619 If bionj is not set, returns -1 (which means unweighted in AverageProfile).
1620 (A and B are the first two profiles in the array)
1622 double QuartetWeight(profile_t *profiles[4], distance_matrix_t *dmat, int nPos);
1624 /* Returns a list of nodes, starting with node and ending with root */
1625 int *PathToRoot(NJ_t *NJ, int node, /*OUT*/int *depth);
1626 int *FreePath(int *path, NJ_t *NJ); /* returns NULL */
1628 /* The default amino acid distance matrix, derived from the BLOSUM45 similarity matrix */
1629 distance_matrix_t matrixBLOSUM45;
1631 /* The default amino acid transition matrix (Jones Taylor Thorton 1992) */
1632 double matrixJTT92[MAXCODES][MAXCODES];
1633 double statJTT92[MAXCODES];
1635 /* The WAG amino acid transition matrix (Whelan-And-Goldman 2001) */
1636 double matrixWAG01[MAXCODES][MAXCODES];
1637 double statWAG01[MAXCODES];
1640 int main(int argc, char **argv) {
1641 int nAlign = 1; /* number of alignments to read */
1643 char *matrixPrefix = NULL;
1644 distance_matrix_t *distance_matrix = NULL;
1645 bool make_matrix = false;
1646 char *constraintsFile = NULL;
1647 char *intreeFile = NULL;
1648 bool intree1 = false; /* the same starting tree each round */
1649 int nni = -1; /* number of rounds of NNI, defaults to 4*log2(n) */
1650 int spr = 2; /* number of rounds of SPR */
1651 int maxSPRLength = 10; /* maximum distance to move a node */
1652 int MLnni = -1; /* number of rounds of ML NNI, defaults to 2*log2(n) */
1653 bool MLlen = false; /* optimize branch lengths; no topology changes */
1654 int nBootstrap = 1000; /* If set, number of replicates of local bootstrap to do */
1655 int nRateCats = nDefaultRateCats;
1656 char *logfile = NULL;
1657 bool bUseGtr = false;
1658 bool bUseWag = false;
1659 bool bUseGtrRates = false;
1660 double gtrrates[6] = {1,1,1,1,1,1};
1661 bool bUseGtrFreq = false;
1662 double gtrfreq[4] = {0.25,0.25,0.25,0.25};
1663 bool bQuote = false;
1664 FILE *fpOut = stdout;
1666 if (isatty(STDIN_FILENO) && argc == 1) {
1667 fprintf(stderr,"Usage for FastTree version %s %s%s:\n%s",
1668 FT_VERSION, SSE_STRING, OpenMPString(), usage);
1669 #if (defined _WIN32 || defined WIN32 || defined WIN64 || defined _WIN64)
1670 fprintf(stderr, "Windows users: Please remember to run this inside a command shell\n");
1671 fprintf(stderr,"Hit return to continue\n");
1676 for (iArg = 1; iArg < argc; iArg++) {
1677 if (strcmp(argv[iArg],"-makematrix") == 0) {
1679 } else if (strcmp(argv[iArg],"-logdist") == 0) {
1680 fprintf(stderr, "Warning: logdist is now on by default and obsolete\n");
1681 } else if (strcmp(argv[iArg],"-rawdist") == 0) {
1683 } else if (strcmp(argv[iArg],"-verbose") == 0 && iArg < argc-1) {
1684 verbose = atoi(argv[++iArg]);
1685 } else if (strcmp(argv[iArg],"-quiet") == 0) {
1688 } else if (strcmp(argv[iArg],"-nopr") == 0) {
1690 } else if (strcmp(argv[iArg],"-slow") == 0) {
1692 } else if (strcmp(argv[iArg],"-fastest") == 0) {
1694 tophitsRefresh = 0.5;
1695 useTopHits2nd = true;
1696 } else if (strcmp(argv[iArg],"-2nd") == 0) {
1697 useTopHits2nd = true;
1698 } else if (strcmp(argv[iArg],"-no2nd") == 0) {
1699 useTopHits2nd = false;
1700 } else if (strcmp(argv[iArg],"-slownni") == 0) {
1702 } else if (strcmp(argv[iArg], "-matrix") == 0 && iArg < argc-1) {
1704 matrixPrefix = argv[iArg];
1705 } else if (strcmp(argv[iArg], "-nomatrix") == 0) {
1707 } else if (strcmp(argv[iArg], "-n") == 0 && iArg < argc-1) {
1709 nAlign = atoi(argv[iArg]);
1711 fprintf(stderr, "-n argument for #input alignments must be > 0 not %s\n", argv[iArg]);
1714 } else if (strcmp(argv[iArg], "-quote") == 0) {
1716 } else if (strcmp(argv[iArg], "-nt") == 0) {
1718 } else if (strcmp(argv[iArg], "-intree") == 0 && iArg < argc-1) {
1720 intreeFile = argv[iArg];
1721 } else if (strcmp(argv[iArg], "-intree1") == 0 && iArg < argc-1) {
1723 intreeFile = argv[iArg];
1725 } else if (strcmp(argv[iArg], "-nj") == 0) {
1727 } else if (strcmp(argv[iArg], "-bionj") == 0) {
1729 } else if (strcmp(argv[iArg], "-boot") == 0 && iArg < argc-1) {
1731 nBootstrap = atoi(argv[iArg]);
1732 } else if (strcmp(argv[iArg], "-noboot") == 0 || strcmp(argv[iArg], "-nosupport") == 0) {
1734 } else if (strcmp(argv[iArg], "-seed") == 0 && iArg < argc-1) {
1736 long seed = atol(argv[iArg]);
1738 } else if (strcmp(argv[iArg],"-top") == 0) {
1739 if(tophitsMult < 0.01)
1741 } else if (strcmp(argv[iArg],"-notop") == 0) {
1743 } else if (strcmp(argv[iArg], "-topm") == 0 && iArg < argc-1) {
1745 tophitsMult = atof(argv[iArg]);
1746 } else if (strcmp(argv[iArg], "-close") == 0 && iArg < argc-1) {
1748 tophitsClose = atof(argv[iArg]);
1749 if (tophitsMult <= 0) {
1750 fprintf(stderr, "Cannot use -close unless -top is set above 0\n");
1753 if (tophitsClose <= 0 || tophitsClose >= 1) {
1754 fprintf(stderr, "-close argument must be between 0 and 1\n");
1757 } else if (strcmp(argv[iArg], "-refresh") == 0 && iArg < argc-1) {
1759 tophitsRefresh = atof(argv[iArg]);
1760 if (tophitsMult <= 0) {
1761 fprintf(stderr, "Cannot use -refresh unless -top is set above 0\n");
1764 if (tophitsRefresh <= 0 || tophitsRefresh >= 1) {
1765 fprintf(stderr, "-refresh argument must be between 0 and 1\n");
1768 } else if (strcmp(argv[iArg],"-nni") == 0 && iArg < argc-1) {
1770 nni = atoi(argv[iArg]);
1773 } else if (strcmp(argv[iArg],"-spr") == 0 && iArg < argc-1) {
1775 spr = atoi(argv[iArg]);
1776 } else if (strcmp(argv[iArg],"-sprlength") == 0 && iArg < argc-1) {
1778 maxSPRLength = atoi(argv[iArg]);
1779 } else if (strcmp(argv[iArg],"-mlnni") == 0 && iArg < argc-1) {
1781 MLnni = atoi(argv[iArg]);
1782 } else if (strcmp(argv[iArg],"-noml") == 0) {
1784 } else if (strcmp(argv[iArg],"-mllen") == 0) {
1787 } else if (strcmp(argv[iArg],"-nome") == 0) {
1790 } else if (strcmp(argv[iArg],"-help") == 0) {
1791 fprintf(stderr,"FastTree %s %s%s:\n%s", FT_VERSION, SSE_STRING, OpenMPString(), usage);
1793 } else if (strcmp(argv[iArg],"-expert") == 0) {
1794 fprintf(stderr, "Detailed usage for FastTree %s %s%s:\n%s",
1795 FT_VERSION, SSE_STRING, OpenMPString(), expertUsage);
1797 } else if (strcmp(argv[iArg],"-pseudo") == 0) {
1798 if (iArg < argc-1 && isdigit(argv[iArg+1][0])) {
1800 pseudoWeight = atof(argv[iArg]);
1801 if (pseudoWeight < 0.0) {
1802 fprintf(stderr,"Illegal argument to -pseudo: %s\n", argv[iArg]);
1808 } else if (strcmp(argv[iArg],"-constraints") == 0 && iArg < argc-1) {
1810 constraintsFile = argv[iArg];
1811 } else if (strcmp(argv[iArg],"-constraintWeight") == 0 && iArg < argc-1) {
1813 constraintWeight = atof(argv[iArg]);
1814 if (constraintWeight <= 0.0) {
1815 fprintf(stderr, "Illegal argument to -constraintWeight (must be greater than zero): %s\n", argv[iArg]);
1818 } else if (strcmp(argv[iArg],"-mlacc") == 0 && iArg < argc-1) {
1820 mlAccuracy = atoi(argv[iArg]);
1821 if (mlAccuracy < 1) {
1822 fprintf(stderr, "Illlegal -mlacc argument: %s\n", argv[iArg]);
1825 } else if (strcmp(argv[iArg],"-exactml") == 0 || strcmp(argv[iArg],"-mlexact") == 0) {
1826 fprintf(stderr,"-exactml is not required -- exact posteriors is the default now\n");
1827 } else if (strcmp(argv[iArg],"-approxml") == 0 || strcmp(argv[iArg],"-mlapprox") == 0) {
1829 } else if (strcmp(argv[iArg],"-cat") == 0 && iArg < argc-1) {
1831 nRateCats = atoi(argv[iArg]);
1832 if (nRateCats < 1) {
1833 fprintf(stderr, "Illlegal argument to -ncat (must be greater than zero): %s\n", argv[iArg]);
1836 } else if (strcmp(argv[iArg],"-nocat") == 0) {
1838 } else if (strcmp(argv[iArg], "-wag") == 0) {
1840 } else if (strcmp(argv[iArg], "-gtr") == 0) {
1842 } else if (strcmp(argv[iArg], "-gtrrates") == 0 && iArg < argc-6) {
1844 bUseGtrRates = true;
1846 for (i = 0; i < 6; i++) {
1847 gtrrates[i] = atof(argv[++iArg]);
1848 if (gtrrates[i] < 1e-5) {
1849 fprintf(stderr, "Illegal or too small value of GTR rate: %s\n", argv[iArg]);
1853 } else if (strcmp(argv[iArg],"-gtrfreq") == 0 && iArg < argc-4) {
1858 for (i = 0; i < 4; i++) {
1859 gtrfreq[i] = atof(argv[++iArg]);
1861 if (gtrfreq[i] < 1e-5) {
1862 fprintf(stderr, "Illegal or too small value of GTR frequency: %s\n", argv[iArg]);
1866 if (fabs(1.0-sum) > 0.01) {
1867 fprintf(stderr, "-gtrfreq values do not sum to 1\n");
1870 for (i = 0; i < 4; i++)
1872 } else if (strcmp(argv[iArg],"-log") == 0 && iArg < argc-1) {
1874 logfile = argv[iArg];
1875 } else if (strcmp(argv[iArg],"-gamma") == 0) {
1877 } else if (strcmp(argv[iArg],"-out") == 0 && iArg < argc-1) {
1879 fpOut = fopen(argv[iArg],"w");
1881 fprintf(stderr,"Cannot write to %s\n",argv[iArg]);
1884 } else if (argv[iArg][0] == '-') {
1885 fprintf(stderr, "Unknown or incorrect use of option %s\n%s", argv[iArg], usage);
1891 fprintf(stderr, "%s", usage);
1895 codesString = nCodes == 20 ? codesStringAA : codesStringNT;
1896 if (nCodes == 4 && matrixPrefix == NULL)
1897 useMatrix = false; /* no default nucleotide matrix */
1899 char *fileName = iArg == (argc-1) ? argv[argc-1] : NULL;
1901 if (slow && fastest) {
1902 fprintf(stderr,"Cannot be both slow and fastest\n");
1905 if (slow && tophitsMult > 0) {
1910 if (logfile != NULL) {
1911 fpLog = fopen(logfile, "w");
1912 if (fpLog == NULL) {
1913 fprintf(stderr, "Cannot write to: %s\n", logfile);
1916 fprintf(fpLog, "Command:");
1918 for (i=0; i < argc; i++)
1919 fprintf(fpLog, " %s", argv[i]);
1920 fprintf(fpLog,"\n");
1925 FILE *fps[2] = {NULL,NULL};
1928 fps[nFPs++] = stderr;
1930 fps[nFPs++] = fpLog;
1932 if (!make_matrix) { /* Report settings */
1933 char tophitString[100] = "no";
1934 char tophitsCloseStr[100] = "default";
1935 if(tophitsClose > 0) sprintf(tophitsCloseStr,"%.2f",tophitsClose);
1936 if(tophitsMult>0) sprintf(tophitString,"%.2f*sqrtN close=%s refresh=%.2f",
1937 tophitsMult, tophitsCloseStr, tophitsRefresh);
1938 char supportString[100] = "none";
1940 if (MLnni != 0 || MLlen)
1941 sprintf(supportString, "SH-like %d", nBootstrap);
1943 sprintf(supportString,"Local boot %d",nBootstrap);
1945 char nniString[100] = "(no NNI)";
1947 sprintf(nniString, "+NNI (%d rounds)", nni);
1949 strcpy(nniString, "+NNI");
1950 char sprString[100] = "(no SPR)";
1952 sprintf(sprString, "+SPR (%d rounds range %d)", spr, maxSPRLength);
1953 char mlnniString[100] = "(no ML-NNI)";
1955 sprintf(mlnniString, "+ML-NNI (%d rounds)", MLnni);
1956 else if (MLnni == -1)
1957 sprintf(mlnniString, "+ML-NNI");
1959 sprintf(mlnniString, "+ML branch lengths");
1960 if ((MLlen || MLnni != 0) && !exactML)
1961 strcat(mlnniString, " approx");
1963 sprintf(mlnniString+strlen(mlnniString), " opt-each=%d",mlAccuracy);
1965 for (i = 0; i < nFPs; i++) {
1967 fprintf(fp,"FastTree Version %s %s%s\nAlignment: %s",
1968 FT_VERSION, SSE_STRING, OpenMPString(), fileName != NULL ? fileName : "standard input");
1970 fprintf(fp, " (%d alignments)", nAlign);
1971 fprintf(fp,"\n%s distances: %s Joins: %s Support: %s\n",
1972 nCodes == 20 ? "Amino acid" : "Nucleotide",
1973 matrixPrefix ? matrixPrefix : (useMatrix? "BLOSUM45"
1974 : (nCodes==4 && logdist ? "Jukes-Cantor" : "%different")),
1975 bionj ? "weighted" : "balanced" ,
1977 if (intreeFile == NULL)
1978 fprintf(fp, "Search: %s%s %s %s %s\nTopHits: %s\n",
1979 slow?"Exhaustive (slow)" : (fastest ? "Fastest" : "Normal"),
1980 useTopHits2nd ? "+2nd" : "",
1981 nniString, sprString, mlnniString,
1984 fprintf(fp, "Start at tree from %s %s %s\n", intreeFile, nniString, sprString);
1986 if (MLnni != 0 || MLlen) {
1987 fprintf(fp, "ML Model: %s,",
1988 (nCodes == 4) ? (bUseGtr ? "Generalized Time-Reversible" : "Jukes-Cantor") : (bUseWag ? "Whelan-And-Goldman" : "Jones-Taylor-Thorton"));
1990 fprintf(fp, " No rate variation across sites");
1992 fprintf(fp, " CAT approximation with %d rate categories", nRateCats);
1994 if (nCodes == 4 && bUseGtrRates)
1995 fprintf(fp, "GTR rates(ac ag at cg ct gt) %.4f %.4f %.4f %.4f %.4f %.4f\n",
1996 gtrrates[0],gtrrates[1],gtrrates[2],gtrrates[3],gtrrates[4],gtrrates[5]);
1997 if (nCodes == 4 && bUseGtrFreq)
1998 fprintf(fp, "GTR frequencies(A C G T) %.4f %.4f %.4f %.4f\n",
1999 gtrfreq[0],gtrfreq[1],gtrfreq[2],gtrfreq[3]);
2001 if (constraintsFile != NULL)
2002 fprintf(fp, "Constraints: %s Weight: %.3f\n", constraintsFile, constraintWeight);
2003 if (pseudoWeight > 0)
2004 fprintf(fp, "Pseudocount weight for comparing sequences with little overlap: %.3lf\n",pseudoWeight);
2008 if (matrixPrefix != NULL) {
2010 fprintf(stderr,"Cannot use both -matrix and -nomatrix arguments!");
2013 distance_matrix = ReadDistanceMatrix(matrixPrefix);
2014 } else if (useMatrix) { /* use default matrix */
2016 distance_matrix = &matrixBLOSUM45;
2017 SetupDistanceMatrix(distance_matrix);
2019 distance_matrix = NULL;
2023 FILE *fpIn = fileName != NULL ? fopen(fileName, "r") : stdin;
2025 fprintf(stderr, "Cannot read %s\n", fileName);
2028 FILE *fpConstraints = NULL;
2029 if (constraintsFile != NULL) {
2030 fpConstraints = fopen(constraintsFile, "r");
2031 if (fpConstraints == NULL) {
2032 fprintf(stderr, "Cannot read %s\n", constraintsFile);
2037 FILE *fpInTree = NULL;
2038 if (intreeFile != NULL) {
2039 fpInTree = fopen(intreeFile,"r");
2040 if (fpInTree == NULL) {
2041 fprintf(stderr, "Cannot read %s\n", intreeFile);
2046 for(iAln = 0; iAln < nAlign; iAln++) {
2047 alignment_t *aln = ReadAlignment(fpIn, bQuote);
2048 if (aln->nSeq < 1) {
2049 fprintf(stderr, "No alignment sequences\n");
2053 fprintf(fpLog, "Read %d sequences, %d positions\n", aln->nSeq, aln->nPos);
2057 struct timeval clock_start;
2058 gettimeofday(&clock_start,NULL);
2059 ProgressReport("Read alignment",0,0,0,0);
2061 /* Check that all names in alignment are unique */
2062 hashstrings_t *hashnames = MakeHashtable(aln->names, aln->nSeq);
2064 for (i=0; i<aln->nSeq; i++) {
2065 hashiterator_t hi = FindMatch(hashnames,aln->names[i]);
2066 if (HashCount(hashnames,hi) != 1) {
2067 fprintf(stderr,"Non-unique name '%s' in the alignment\n",aln->names[i]);
2072 /* Make a list of unique sequences -- note some lists are bigger than required */
2073 ProgressReport("Hashed the names",0,0,0,0);
2075 NJ_t *NJ = InitNJ(aln->seqs, aln->nSeq, aln->nPos,
2076 /*constraintSeqs*/NULL, /*nConstraints*/0,
2077 distance_matrix, /*transmat*/NULL);
2078 printf(" %d\n",aln->nSeq);
2080 for(i = 0; i < NJ->nSeq; i++) {
2081 printf("%s",aln->names[i]);
2082 for (j = 0; j < NJ->nSeq; j++) {
2084 SeqDist(NJ->profiles[i]->codes,NJ->profiles[j]->codes,NJ->nPos,NJ->distance_matrix,/*OUT*/&hit);
2086 hit.dist = LogCorrect(hit.dist);
2087 /* Make sure -0 prints as 0 */
2088 printf(" %f", hit.dist <= 0.0 ? 0.0 : hit.dist);
2101 nRefreshTopHits = 0;
2105 nProfileFreqAlloc = 0;
2106 nProfileFreqAvoid = 0;
2111 nPosteriorCompute = 0;
2112 nAAPosteriorExact = 0;
2113 nAAPosteriorRough = 0;
2116 uniquify_t *unique = UniquifyAln(aln);
2117 ProgressReport("Identified unique sequences",0,0,0,0);
2119 /* read constraints */
2120 alignment_t *constraints = NULL;
2121 char **uniqConstraints = NULL;
2122 if (constraintsFile != NULL) {
2123 constraints = ReadAlignment(fpConstraints, bQuote);
2124 if (constraints->nSeq < 4) {
2125 fprintf(stderr, "Warning: constraints file with less than 4 sequences ignored:\nalignment #%d in %s\n",
2126 iAln+1, constraintsFile);
2127 constraints = FreeAlignment(constraints);
2129 uniqConstraints = AlnToConstraints(constraints, unique, hashnames);
2130 ProgressReport("Read the constraints",0,0,0,0);
2132 } /* end load constraints */
2134 transition_matrix_t *transmat = NULL;
2136 transmat = bUseWag? CreateTransitionMatrix(matrixWAG01,statWAG01) : CreateTransitionMatrix(matrixJTT92,statJTT92);
2137 } else if (nCodes == 4 && bUseGtr && (bUseGtrRates || bUseGtrFreq)) {
2138 transmat = CreateGTR(gtrrates,gtrfreq);
2140 NJ_t *NJ = InitNJ(unique->uniqueSeq, unique->nUnique, aln->nPos,
2142 uniqConstraints != NULL ? constraints->nPos : 0, /* nConstraints */
2145 if (verbose>2) fprintf(stderr, "read %s seqs %d (%d unique) positions %d nameLast %s seqLast %s\n",
2146 fileName ? fileName : "standard input",
2147 aln->nSeq, unique->nUnique, aln->nPos, aln->names[aln->nSeq-1], aln->seqs[aln->nSeq-1]);
2148 FreeAlignmentSeqs(/*IN/OUT*/aln); /*no longer needed*/
2149 if (fpInTree != NULL) {
2151 fseek(fpInTree, 0L, SEEK_SET);
2152 ReadTree(/*IN/OUT*/NJ, /*IN*/unique, /*IN*/hashnames, /*READ*/fpInTree);
2154 fprintf(stderr, "Read tree from %s\n", intreeFile);
2156 PrintNJ(stderr, NJ, aln->names, unique, /*support*/false, bQuote);
2160 LogTree("NJ", 0, fpLog, NJ, aln->names, unique, bQuote);
2162 /* profile-frequencies for the "up-profiles" in ReliabilityNJ take only diameter(Tree)*L*a
2163 space not N*L*a space, because we can free them as we go.
2164 And up-profile by their nature tend to be complicated.
2165 So save the profile-frequency memory allocation counters now to exclude later results.
2168 long svProfileFreqAlloc = nProfileFreqAlloc;
2169 long svProfileFreqAvoid = nProfileFreqAvoid;
2171 int nniToDo = nni == -1 ? (int)(0.5 + 4.0 * log(NJ->nSeq)/log(2)) : nni;
2172 int sprRemaining = spr;
2173 int MLnniToDo = (MLnni != -1) ? MLnni : (int)(0.5 + 2.0*log(NJ->nSeq)/log(2));
2175 if (fpInTree == NULL)
2176 fprintf(stderr, "Initial topology in %.2f seconds\n", clockDiff(&clock_start));
2177 if (spr > 0 || nniToDo > 0 || MLnniToDo > 0)
2178 fprintf(stderr,"Refining topology: %d rounds ME-NNIs, %d rounds ME-SPRs, %d rounds ML-NNIs\n", nniToDo, spr, MLnniToDo);
2183 bool bConverged = false;
2184 nni_stats_t *nni_stats = InitNNIStats(NJ);
2185 for (i=0; i < nniToDo; i++) {
2188 int nChange = NNI(/*IN/OUT*/NJ, i, nniToDo, /*use ml*/false, /*IN/OUT*/nni_stats, /*OUT*/&maxDelta);
2189 LogTree("ME_NNI%d",i+1, fpLog, NJ, aln->names, unique, bQuote);
2193 fprintf(stderr, "Min_evolution NNIs converged at round %d -- skipping some rounds\n", i+1);
2195 fprintf(fpLog, "Min_evolution NNIs converged at round %d -- skipping some rounds\n", i+1);
2199 /* Interleave SPRs with NNIs (typically 1/3rd NNI, SPR, 1/3rd NNI, SPR, 1/3rd NNI */
2200 if (sprRemaining > 0 && (nniToDo/(spr+1) > 0 && ((i+1) % (nniToDo/(spr+1))) == 0)) {
2201 SPR(/*IN/OUT*/NJ, maxSPRLength, spr-sprRemaining, spr);
2202 LogTree("ME_SPR%d",spr-sprRemaining+1, fpLog, NJ, aln->names, unique, bQuote);
2204 /* Restart the NNIs -- set all ages to 0, etc. */
2206 nni_stats = FreeNNIStats(nni_stats, NJ);
2207 nni_stats = InitNNIStats(NJ);
2210 nni_stats = FreeNNIStats(nni_stats, NJ);
2212 while(sprRemaining > 0) { /* do any remaining SPR rounds */
2213 SPR(/*IN/OUT*/NJ, maxSPRLength, spr-sprRemaining, spr);
2214 LogTree("ME_SPR%d",spr-sprRemaining+1, fpLog, NJ, aln->names, unique, bQuote);
2218 /* In minimum-evolution mode, update branch lengths, even if no NNIs or SPRs,
2219 so that they are log-corrected, do not include penalties from constraints,
2220 and avoid errors due to approximation of out-distances.
2221 If doing maximum-likelihood NNIs, then we'll also use these
2222 to get estimates of starting distances for quartets, etc.
2224 UpdateBranchLengths(/*IN/OUT*/NJ);
2225 LogTree("ME_Lengths",0, fpLog, NJ, aln->names, unique, bQuote);
2227 double total_len = 0;
2229 for (iNode = 0; iNode < NJ->maxnode; iNode++)
2230 total_len += fabs(NJ->branchlength[iNode]);
2233 fprintf(stderr, "Total branch-length %.3f after %.2f sec\n",
2234 total_len, clockDiff(&clock_start));
2238 fprintf(fpLog, "Total branch-length %.3f after %.2f sec\n",
2239 total_len, clockDiff(&clock_start));
2245 struct mallinfo mi = mallinfo();
2246 fprintf(stderr, "Memory @ end of ME phase: %.2f MB (%.1f byte/pos) useful %.2f expected %.2f\n",
2247 (mi.arena+mi.hblkhd)/1.0e6, (mi.arena+mi.hblkhd)/(double)(NJ->nSeq*(double)NJ->nPos),
2248 mi.uordblks/1.0e6, mymallocUsed/1e6);
2252 SplitCount_t splitcount = {0,0,0,0,0.0,0.0};
2254 if (MLnniToDo > 0 || MLlen) {
2255 bool warn_len = total_len/NJ->maxnode < 0.001 && MLMinBranchLengthTolerance > 1.0/aln->nPos;
2256 bool warn = warn_len || (total_len/NJ->maxnode < 0.001 && aln->nPos >= 10000);
2258 fprintf(stderr, "\nWARNING! This alignment consists of closely-related and very-long sequences.\n");
2261 "This version of FastTree may not report reasonable branch lengths!\n"
2263 "Consider changing MLMinBranchLengthTolerance.\n"
2265 "Consider recompiling FastTree with -DUSE_DOUBLE.\n"
2267 "For more information, visit\n"
2268 "http://www.microbesonline.org/fasttree/#BranchLen\n\n");
2270 fprintf(stderr, "WARNING! FastTree (or other standard maximum-likelihood tools)\n"
2271 "may not be appropriate for aligments of very closely-related sequences\n"
2272 "like this one, as FastTree does not account for recombination or gene conversion\n\n");
2274 /* Do maximum-likelihood computations */
2275 /* Convert profiles to use the transition matrix */
2276 distance_matrix_t *tmatAsDist = TransMatToDistanceMat(/*OPTIONAL*/NJ->transmat);
2277 RecomputeProfiles(NJ, /*OPTIONAL*/tmatAsDist);
2278 tmatAsDist = myfree(tmatAsDist, sizeof(distance_matrix_t));
2279 double lastloglk = -1e20;
2280 nni_stats_t *nni_stats = InitNNIStats(NJ);
2281 bool resetGtr = nCodes == 4 && bUseGtr && !bUseGtrRates;
2285 int maxRound = (int)(0.5 + log(NJ->nSeq)/log(2));
2286 double dLastLogLk = -1e20;
2287 for (iRound = 1; iRound <= maxRound; iRound++) {
2289 numeric_t *oldlength = (numeric_t*)mymalloc(sizeof(numeric_t)*NJ->maxnodes);
2290 for (node = 0; node < NJ->maxnode; node++)
2291 oldlength[node] = NJ->branchlength[node];
2292 OptimizeAllBranchLengths(/*IN/OUT*/NJ);
2293 LogTree("ML_Lengths",iRound, fpLog, NJ, aln->names, unique, bQuote);
2294 double dMaxChange = 0; /* biggest change in branch length */
2295 for (node = 0; node < NJ->maxnode; node++) {
2296 double d = fabs(oldlength[node] - NJ->branchlength[node]);
2300 oldlength = myfree(oldlength, sizeof(numeric_t)*NJ->maxnodes);
2301 double loglk = TreeLogLk(NJ, /*site_likelihoods*/NULL);
2302 bool bConverged = iRound > 1 && (dMaxChange < 0.001 || loglk < (dLastLogLk+treeLogLkDelta));
2304 fprintf(stderr, "%d rounds ML lengths: LogLk %s= %.3lf Max-change %.4lf%s Time %.2f\n",
2306 exactML || nCodes != 20 ? "" : "~",
2309 bConverged ? " (converged)" : "",
2310 clockDiff(&clock_start));
2312 fprintf(fpLog, "TreeLogLk\tLength%d\t%.4lf\tMaxChange\t%.4lf\n",
2313 iRound, loglk, dMaxChange);
2316 SetMLGtr(/*IN/OUT*/NJ, bUseGtrFreq ? gtrfreq : NULL, fpLog);
2317 SetMLRates(/*IN/OUT*/NJ, nRateCats);
2318 LogMLRates(fpLog, NJ);
2325 if (MLnniToDo > 0) {
2326 /* This may help us converge faster, and is fast */
2327 OptimizeAllBranchLengths(/*IN/OUT*/NJ);
2328 LogTree("ML_Lengths%d",1, fpLog, NJ, aln->names, unique, bQuote);
2333 bool bConverged = false;
2334 for (iMLnni = 0; iMLnni < MLnniToDo; iMLnni++) {
2335 int changes = NNI(/*IN/OUT*/NJ, iMLnni, MLnniToDo, /*use ml*/true, /*IN/OUT*/nni_stats, /*OUT*/&maxDelta);
2336 LogTree("ML_NNI%d",iMLnni+1, fpLog, NJ, aln->names, unique, bQuote);
2337 double loglk = TreeLogLk(NJ, /*site_likelihoods*/NULL);
2338 bool bConvergedHere = (iMLnni > 0) && ((loglk < lastloglk + treeLogLkDelta) || maxDelta < treeLogLkDelta);
2340 fprintf(stderr, "ML-NNI round %d: LogLk %s= %.3f NNIs %d max delta %.2f Time %.2f%s\n",
2342 exactML || nCodes != 20 ? "" : "~",
2343 loglk, changes, maxDelta, clockDiff(&clock_start),
2344 bConverged ? " (final)" : "");
2346 fprintf(fpLog, "TreeLogLk\tML_NNI%d\t%.4lf\tMaxChange\t%.4lf\n", iMLnni+1, loglk, maxDelta);
2348 break; /* we did our extra round */
2351 if (bConverged || iMLnni == MLnniToDo-2) {
2352 /* last round uses high-accuracy seettings -- reset NNI stats to tone down heuristics */
2353 nni_stats = FreeNNIStats(nni_stats, NJ);
2354 nni_stats = InitNNIStats(NJ);
2356 fprintf(stderr, "Turning off heuristics for final round of ML NNIs%s\n",
2357 bConvergedHere? " (converged)" : "");
2359 fprintf(fpLog, "Turning off heuristics for final round of ML NNIs%s\n",
2360 bConvergedHere? " (converged)" : "");
2363 if (iMLnni == 0 && NJ->rates.nRateCategories == 1) {
2365 SetMLGtr(/*IN/OUT*/NJ, bUseGtrFreq ? gtrfreq : NULL, fpLog);
2366 SetMLRates(/*IN/OUT*/NJ, nRateCats);
2367 LogMLRates(fpLog, NJ);
2370 nni_stats = FreeNNIStats(nni_stats, NJ);
2372 /* This does not take long and improves the results */
2373 if (MLnniToDo > 0) {
2374 OptimizeAllBranchLengths(/*IN/OUT*/NJ);
2375 LogTree("ML_Lengths%d",2, fpLog, NJ, aln->names, unique, bQuote);
2376 if (verbose || fpLog) {
2377 double loglk = TreeLogLk(NJ, /*site_likelihoods*/NULL);
2379 fprintf(stderr, "Optimize all lengths: LogLk %s= %.3f Time %.2f\n",
2380 exactML || nCodes != 20 ? "" : "~",
2382 clockDiff(&clock_start));
2384 fprintf(fpLog, "TreeLogLk\tML_Lengths%d\t%.4f\n", 2, loglk);
2390 /* Count bad splits and compute SH-like supports if desired */
2391 if ((MLnniToDo > 0 && !fastest) || nBootstrap > 0)
2392 TestSplitsML(NJ, /*OUT*/&splitcount, nBootstrap);
2394 /* Compute gamma-based likelihood? */
2395 if (gammaLogLk && nRateCats > 1) {
2396 numeric_t *rates = MLSiteRates(nRateCats);
2397 double *site_loglk = MLSiteLikelihoodsByRate(NJ, rates, nRateCats);
2398 double scale = RescaleGammaLogLk(NJ->nPos, nRateCats, rates, /*IN*/site_loglk, /*OPTIONAL*/fpLog);
2399 rates = myfree(rates, sizeof(numeric_t) * nRateCats);
2400 site_loglk = myfree(site_loglk, sizeof(double) * nRateCats * NJ->nPos);
2402 for (i = 0; i < NJ->maxnodes; i++)
2403 NJ->branchlength[i] *= scale;
2406 /* Minimum evolution supports */
2407 TestSplitsMinEvo(NJ, /*OUT*/&splitcount);
2409 ReliabilityNJ(NJ, nBootstrap);
2412 for (i = 0; i < nFPs; i++) {
2414 fprintf(fp, "Total time: %.2f seconds Unique: %d/%d Bad splits: %d/%d",
2415 clockDiff(&clock_start),
2416 NJ->nSeq, aln->nSeq,
2417 splitcount.nBadSplits, splitcount.nSplits);
2418 if (splitcount.dWorstDeltaUnconstrained > 0)
2419 fprintf(fp, " Worst %sdelta-%s %.3f",
2420 uniqConstraints != NULL ? "unconstrained " : "",
2421 (MLnniToDo > 0 || MLlen) ? "LogLk" : "Len",
2422 splitcount.dWorstDeltaUnconstrained);
2424 if (NJ->nSeq > 3 && NJ->nConstraints > 0) {
2425 fprintf(fp, "Violating constraints: %d both bad: %d",
2426 splitcount.nConstraintViolations, splitcount.nBadBoth);
2427 if (splitcount.dWorstDeltaConstrained > 0)
2428 fprintf(fp, " Worst delta-%s due to constraints: %.3f",
2429 (MLnniToDo > 0 || MLlen) ? "LogLk" : "Len",
2430 splitcount.dWorstDeltaConstrained);
2433 if (verbose > 1 || fp == fpLog) {
2434 double dN2 = NJ->nSeq*(double)NJ->nSeq;
2435 fprintf(fp, "Dist/N**2: by-profile %.3f (out %.3f) by-leaf %.3f avg-prof %.3f\n",
2436 profileOps/dN2, outprofileOps/dN2, seqOps/dN2, profileAvgOps/dN2);
2437 if (nCloseUsed>0 || nClose2Used > 0 || nRefreshTopHits>0)
2438 fprintf(fp, "Top hits: close neighbors %ld/%d 2nd-level %ld refreshes %ld",
2439 nCloseUsed, NJ->nSeq, nClose2Used, nRefreshTopHits);
2440 if(!slow) fprintf(fp, " Hill-climb: %ld Update-best: %ld\n", nHillBetter, nVisibleUpdate);
2441 if (nniToDo > 0 || spr > 0 || MLnniToDo > 0)
2442 fprintf(fp, "NNI: %ld SPR: %ld ML-NNI: %ld\n", nNNI, nSPR, nML_NNI);
2443 if (MLnniToDo > 0) {
2444 fprintf(fp, "Max-lk operations: lk %ld posterior %ld", nLkCompute, nPosteriorCompute);
2445 if (nAAPosteriorExact > 0 || nAAPosteriorRough > 0)
2446 fprintf(fp, " approximate-posteriors %.2f%%",
2447 (100.0*nAAPosteriorRough)/(double)(nAAPosteriorExact+nAAPosteriorRough));
2449 fprintf(fp, " star-only %ld", nStarTests);
2454 fprintf(fp, "Memory: %.2f MB (%.1f byte/pos) ",
2455 maxmallocHeap/1.0e6, maxmallocHeap/(double)(aln->nSeq*(double)aln->nPos));
2456 /* Only report numbers from before we do reliability estimates */
2457 fprintf(fp, "profile-freq-alloc %ld avoided %.2f%%\n",
2459 svProfileFreqAvoid > 0 ?
2460 100.0*svProfileFreqAvoid/(double)(svProfileFreqAlloc+svProfileFreqAvoid)
2465 PrintNJ(fpOut, NJ, aln->names, unique, /*support*/nBootstrap > 0, bQuote);
2468 fprintf(fpLog,"TreeCompleted\n");
2472 if (uniqConstraints != NULL)
2473 uniqConstraints = myfree(uniqConstraints, sizeof(char*) * unique->nUnique);
2474 constraints = FreeAlignment(constraints);
2475 unique = FreeUniquify(unique);
2476 } /* end build tree */
2477 hashnames = FreeHashtable(hashnames);
2478 aln = FreeAlignment(aln);
2479 } /* end loop over alignments */
2482 if (fpOut != stdout) fclose(fpOut);
2486 void ProgressReport(char *format, int i1, int i2, int i3, int i4) {
2487 static bool time_set = false;
2488 static struct timeval time_last;
2489 static struct timeval time_begin;
2494 static struct timeval time_now;
2495 gettimeofday(&time_now,NULL);
2497 time_begin = time_last = time_now;
2500 static struct timeval elapsed;
2501 timeval_subtract(&elapsed,&time_now,&time_last);
2503 if (elapsed.tv_sec > 1 || elapsed.tv_usec > 100*1000 || verbose > 1) {
2504 timeval_subtract(&elapsed,&time_now,&time_begin);
2505 fprintf(stderr, "%7i.%2.2i seconds: ", (int)elapsed.tv_sec, (int)(elapsed.tv_usec/10000));
2506 fprintf(stderr, format, i1, i2, i3, i4);
2507 if (verbose > 1 || !isatty(STDERR_FILENO)) {
2508 fprintf(stderr, "\n");
2510 fprintf(stderr, " \r");
2513 time_last = time_now;
2517 void LogMLRates(/*OPTIONAL WRITE*/FILE *fpLog, NJ_t *NJ) {
2518 if (fpLog != NULL) {
2519 rates_t *rates = &NJ->rates;
2520 fprintf(fpLog, "NCategories\t%d\nRates",rates->nRateCategories);
2521 assert(rates->nRateCategories > 0);
2523 for (iRate = 0; iRate < rates->nRateCategories; iRate++)
2524 fprintf(fpLog, " %f", rates->rates[iRate]);
2525 fprintf(fpLog,"\nSiteCategories");
2527 for (iPos = 0; iPos < NJ->nPos; iPos++) {
2528 iRate = rates->ratecat[iPos];
2529 fprintf(fpLog," %d",iRate+1);
2531 fprintf(fpLog,"\n");
2536 void LogTree(char *format, int i, /*OPTIONAL WRITE*/FILE *fpLog, NJ_t *NJ, char **names, uniquify_t *unique, bool bQuote) {
2538 fprintf(fpLog, format, i);
2539 fprintf(fpLog, "\t");
2540 PrintNJ(fpLog, NJ, names, unique, /*support*/false, bQuote);
2545 NJ_t *InitNJ(char **sequences, int nSeq, int nPos,
2546 /*OPTIONAL*/char **constraintSeqs, int nConstraints,
2547 /*OPTIONAL*/distance_matrix_t *distance_matrix,
2548 /*OPTIONAL*/transition_matrix_t *transmat) {
2551 NJ_t *NJ = (NJ_t*)mymalloc(sizeof(NJ_t));
2552 NJ->root = -1; /* set at end of FastNJ() */
2553 NJ->maxnode = NJ->nSeq = nSeq;
2555 NJ->maxnodes = 2*nSeq;
2556 NJ->seqs = sequences;
2557 NJ->distance_matrix = distance_matrix;
2558 NJ->transmat = transmat;
2559 NJ->nConstraints = nConstraints;
2560 NJ->constraintSeqs = constraintSeqs;
2562 NJ->profiles = (profile_t **)mymalloc(sizeof(profile_t*) * NJ->maxnodes);
2564 unsigned long counts[256];
2566 for (i = 0; i < 256; i++)
2568 for (iNode = 0; iNode < NJ->nSeq; iNode++) {
2569 NJ->profiles[iNode] = SeqToProfile(NJ, NJ->seqs[iNode], nPos,
2570 constraintSeqs != NULL ? constraintSeqs[iNode] : NULL,
2575 unsigned long totCount = 0;
2576 for (i = 0; i < 256; i++)
2577 totCount += counts[i];
2579 /* warnings about unknown characters */
2580 for (i = 0; i < 256; i++) {
2581 if (counts[i] == 0 || i == '.' || i == '-')
2583 unsigned char *codesP;
2584 bool bMatched = false;
2585 for (codesP = codesString; *codesP != '\0'; codesP++) {
2586 if (*codesP == i || tolower(*codesP) == i) {
2592 fprintf(stderr, "Ignored unknown character %c (seen %lu times)\n", i, counts[i]);
2596 /* warnings about the counts */
2597 double fACGTUN = (counts['A'] + counts['C'] + counts['G'] + counts['T'] + counts['U'] + counts['N']
2598 + counts['a'] + counts['c'] + counts['g'] + counts['t'] + counts['u'] + counts['n'])
2599 / (double)(totCount - counts['-'] - counts['.']);
2600 if (nCodes == 4 && fACGTUN < 0.9)
2601 fprintf(stderr, "WARNING! ONLY %.1f%% NUCLEOTIDE CHARACTERS -- IS THIS REALLY A NUCLEOTIDE ALIGNMENT?\n",
2603 else if (nCodes == 20 && fACGTUN >= 0.9)
2604 fprintf(stderr, "WARNING! %.1f%% NUCLEOTIDE CHARACTERS -- IS THIS REALLY A PROTEIN ALIGNMENT?\n",
2607 if(verbose>10) fprintf(stderr,"Made sequence profiles\n");
2608 for (iNode = NJ->nSeq; iNode < NJ->maxnodes; iNode++)
2609 NJ->profiles[iNode] = NULL; /* not yet exists */
2611 NJ->outprofile = OutProfile(NJ->profiles, NJ->nSeq,
2612 NJ->nPos, NJ->nConstraints,
2613 NJ->distance_matrix);
2614 if(verbose>10) fprintf(stderr,"Made out-profile\n");
2618 NJ->diameter = (numeric_t *)mymalloc(sizeof(numeric_t)*NJ->maxnodes);
2619 for (iNode = 0; iNode < NJ->maxnodes; iNode++) NJ->diameter[iNode] = 0;
2621 NJ->varDiameter = (numeric_t *)mymalloc(sizeof(numeric_t)*NJ->maxnodes);
2622 for (iNode = 0; iNode < NJ->maxnodes; iNode++) NJ->varDiameter[iNode] = 0;
2624 NJ->selfdist = (numeric_t *)mymalloc(sizeof(numeric_t)*NJ->maxnodes);
2625 for (iNode = 0; iNode < NJ->maxnodes; iNode++) NJ->selfdist[iNode] = 0;
2627 NJ->selfweight = (numeric_t *)mymalloc(sizeof(numeric_t)*NJ->maxnodes);
2628 for (iNode = 0; iNode < NJ->nSeq; iNode++)
2629 NJ->selfweight[iNode] = NJ->nPos - NGaps(NJ,iNode);
2631 NJ->outDistances = (numeric_t *)mymalloc(sizeof(numeric_t)*NJ->maxnodes);
2632 NJ->nOutDistActive = (int *)mymalloc(sizeof(int)*NJ->maxnodes);
2633 for (iNode = 0; iNode < NJ->maxnodes; iNode++)
2634 NJ->nOutDistActive[iNode] = NJ->nSeq * 10; /* unreasonably high value */
2635 NJ->parent = NULL; /* so SetOutDistance ignores it */
2636 for (iNode = 0; iNode < NJ->nSeq; iNode++)
2637 SetOutDistance(/*IN/UPDATE*/NJ, iNode, /*nActive*/NJ->nSeq);
2640 for (iNode = 0; iNode < 4 && iNode < NJ->nSeq; iNode++)
2641 fprintf(stderr, "Node %d outdist %f\n", iNode, NJ->outDistances[iNode]);
2644 NJ->parent = (int *)mymalloc(sizeof(int)*NJ->maxnodes);
2645 for (iNode = 0; iNode < NJ->maxnodes; iNode++) NJ->parent[iNode] = -1;
2647 NJ->branchlength = (numeric_t *)mymalloc(sizeof(numeric_t)*NJ->maxnodes); /* distance to parent */
2648 for (iNode = 0; iNode < NJ->maxnodes; iNode++) NJ->branchlength[iNode] = 0;
2650 NJ->support = (numeric_t *)mymalloc(sizeof(numeric_t)*NJ->maxnodes);
2651 for (iNode = 0; iNode < NJ->maxnodes; iNode++) NJ->support[iNode] = -1.0;
2653 NJ->child = (children_t*)mymalloc(sizeof(children_t)*NJ->maxnodes);
2654 for (iNode= 0; iNode < NJ->maxnode; iNode++) NJ->child[iNode].nChild = 0;
2656 NJ->rates.nRateCategories = 0;
2657 NJ->rates.rates = NULL;
2658 NJ->rates.ratecat = NULL;
2659 AllocRateCategories(&NJ->rates, 1, NJ->nPos);
2663 NJ_t *FreeNJ(NJ_t *NJ) {
2668 for (i=0; i < NJ->maxnode; i++)
2669 NJ->profiles[i] = FreeProfile(NJ->profiles[i], NJ->nPos, NJ->nConstraints);
2670 NJ->profiles = myfree(NJ->profiles, sizeof(profile_t*) * NJ->maxnodes);
2671 NJ->outprofile = FreeProfile(NJ->outprofile, NJ->nPos, NJ->nConstraints);
2672 NJ->diameter = myfree(NJ->diameter, sizeof(numeric_t)*NJ->maxnodes);
2673 NJ->varDiameter = myfree(NJ->varDiameter, sizeof(numeric_t)*NJ->maxnodes);
2674 NJ->selfdist = myfree(NJ->selfdist, sizeof(numeric_t)*NJ->maxnodes);
2675 NJ->selfweight = myfree(NJ->selfweight, sizeof(numeric_t)*NJ->maxnodes);
2676 NJ->outDistances = myfree(NJ->outDistances, sizeof(numeric_t)*NJ->maxnodes);
2677 NJ->nOutDistActive = myfree(NJ->nOutDistActive, sizeof(int)*NJ->maxnodes);
2678 NJ->parent = myfree(NJ->parent, sizeof(int)*NJ->maxnodes);
2679 NJ->branchlength = myfree(NJ->branchlength, sizeof(numeric_t)*NJ->maxnodes);
2680 NJ->support = myfree(NJ->support, sizeof(numeric_t)*NJ->maxnodes);
2681 NJ->child = myfree(NJ->child, sizeof(children_t)*NJ->maxnodes);
2682 NJ->transmat = myfree(NJ->transmat, sizeof(transition_matrix_t));
2683 AllocRateCategories(&NJ->rates, 0, NJ->nPos);
2684 return(myfree(NJ, sizeof(NJ_t)));
2687 /* Allocate or reallocate the rate categories, and set every position
2688 to category 0 and every category's rate to 1.0
2689 If nRateCategories=0, just deallocate
2691 void AllocRateCategories(/*IN/OUT*/rates_t *rates, int nRateCategories, int nPos) {
2692 assert(nRateCategories >= 0);
2693 rates->rates = myfree(rates->rates, sizeof(numeric_t)*rates->nRateCategories);
2694 rates->ratecat = myfree(rates->ratecat, sizeof(unsigned int)*nPos);
2695 rates->nRateCategories = nRateCategories;
2696 if (rates->nRateCategories > 0) {
2697 rates->rates = (numeric_t*)mymalloc(sizeof(numeric_t)*rates->nRateCategories);
2699 for (i = 0; i < nRateCategories; i++)
2700 rates->rates[i] = 1.0;
2701 rates->ratecat = (unsigned int *)mymalloc(sizeof(unsigned int)*nPos);
2702 for (i = 0; i < nPos; i++)
2703 rates->ratecat[i] = 0;
2707 void FastNJ(NJ_t *NJ) {
2710 assert(NJ->nSeq >= 1);
2712 NJ->root = NJ->maxnode++;
2713 NJ->child[NJ->root].nChild = NJ->nSeq;
2714 for (iNode = 0; iNode < NJ->nSeq; iNode++) {
2715 NJ->parent[iNode] = NJ->root;
2716 NJ->child[NJ->root].child[iNode] = iNode;
2718 if (NJ->nSeq == 1) {
2719 NJ->branchlength[0] = 0;
2721 assert (NJ->nSeq == 2);
2723 SeqDist(NJ->profiles[0]->codes,NJ->profiles[1]->codes,NJ->nPos,NJ->distance_matrix,/*OUT*/&hit);
2724 NJ->branchlength[0] = hit.dist/2.0;
2725 NJ->branchlength[1] = hit.dist/2.0;
2730 /* else 3 or more sequences */
2732 /* The visible set stores the best hit of each node (unless using top hits, in which case
2733 it is handled by the top hits routines) */
2734 besthit_t *visible = NULL; /* Not used if doing top hits */
2735 besthit_t *besthitNew = NULL; /* All hits of new node -- not used if doing top-hits */
2737 /* The top-hits lists, with the key parameter m = length of each top-hit list */
2738 top_hits_t *tophits = NULL;
2739 int m = 0; /* maximum length of a top-hits list */
2740 if (tophitsMult > 0) {
2741 m = (int)(0.5 + tophitsMult*sqrt(NJ->nSeq));
2742 if(m<4 || 2*m >= NJ->nSeq) {
2744 if(verbose>1) fprintf(stderr,"Too few leaves, turning off top-hits\n");
2746 if(verbose>2) fprintf(stderr,"Top-hit-list size = %d of %d\n", m, NJ->nSeq);
2749 assert(!(slow && m>0));
2751 /* Initialize top-hits or visible set */
2753 tophits = InitTopHits(NJ, m);
2754 SetAllLeafTopHits(/*IN/UPDATE*/NJ, /*OUT*/tophits);
2755 ResetTopVisible(/*IN/UPDATE*/NJ, /*nActive*/NJ->nSeq, /*IN/OUT*/tophits);
2757 visible = (besthit_t*)mymalloc(sizeof(besthit_t)*NJ->maxnodes);
2758 besthitNew = (besthit_t*)mymalloc(sizeof(besthit_t)*NJ->maxnodes);
2759 for (iNode = 0; iNode < NJ->nSeq; iNode++)
2760 SetBestHit(iNode, NJ, /*nActive*/NJ->nSeq, /*OUT*/&visible[iNode], /*OUT IGNORED*/NULL);
2763 /* Iterate over joins */
2764 int nActiveOutProfileReset = NJ->nSeq;
2766 for (nActive = NJ->nSeq; nActive > 3; nActive--) {
2767 int nJoinsDone = NJ->nSeq - nActive;
2768 if (nJoinsDone > 0 && (nJoinsDone % 100) == 0)
2769 ProgressReport("Joined %6d of %6d", nJoinsDone, NJ->nSeq-3, 0, 0);
2771 besthit_t join; /* the join to do */
2773 ExhaustiveNJSearch(NJ,nActive,/*OUT*/&join);
2775 TopHitNJSearch(/*IN/UPDATE*/NJ, nActive, /*IN/OUT*/tophits, /*OUT*/&join);
2777 FastNJSearch(NJ, nActive, /*IN/OUT*/visible, /*OUT*/&join);
2781 double penalty = constraintWeight
2782 * (double)JoinConstraintPenalty(NJ, join.i, join.j);
2783 if (penalty > 0.001) {
2784 fprintf(stderr, "Constraint violation during neighbor-joining %d %d into %d penalty %.3f\n",
2785 join.i, join.j, NJ->maxnode, penalty);
2787 for (iC = 0; iC < NJ->nConstraints; iC++) {
2788 int local = JoinConstraintPenaltyPiece(NJ, join.i, join.j, iC);
2790 fprintf(stderr, "Constraint %d piece %d %d/%d %d/%d %d/%d\n", iC, local,
2791 NJ->profiles[join.i]->nOn[iC],
2792 NJ->profiles[join.i]->nOff[iC],
2793 NJ->profiles[join.j]->nOn[iC],
2794 NJ->profiles[join.j]->nOff[iC],
2795 NJ->outprofile->nOn[iC] - NJ->profiles[join.i]->nOn[iC] - NJ->profiles[join.j]->nOn[iC],
2796 NJ->outprofile->nOff[iC] - NJ->profiles[join.i]->nOff[iC] - NJ->profiles[join.j]->nOff[iC]);
2801 /* because of the stale out-distance heuristic, make sure that these are up-to-date */
2802 SetOutDistance(NJ, join.i, nActive);
2803 SetOutDistance(NJ, join.j, nActive);
2804 /* Make sure weight is set and criterion is up to date */
2805 SetDistCriterion(NJ, nActive, /*IN/OUT*/&join);
2806 assert(NJ->nOutDistActive[join.i] == nActive);
2807 assert(NJ->nOutDistActive[join.j] == nActive);
2809 int newnode = NJ->maxnode++;
2810 NJ->parent[join.i] = newnode;
2811 NJ->parent[join.j] = newnode;
2812 NJ->child[newnode].nChild = 2;
2813 NJ->child[newnode].child[0] = join.i < join.j ? join.i : join.j;
2814 NJ->child[newnode].child[1] = join.i > join.j ? join.i : join.j;
2816 double rawIJ = join.dist + NJ->diameter[join.i] + NJ->diameter[join.j];
2817 double distIJ = join.dist;
2819 double deltaDist = (NJ->outDistances[join.i]-NJ->outDistances[join.j])/(double)(nActive-2);
2820 NJ->branchlength[join.i] = (distIJ + deltaDist)/2;
2821 NJ->branchlength[join.j] = (distIJ - deltaDist)/2;
2823 double bionjWeight = 0.5; /* IJ = bionjWeight*I + (1-bionjWeight)*J */
2824 double varIJ = rawIJ - NJ->varDiameter[join.i] - NJ->varDiameter[join.j];
2826 if (bionj && join.weight > 0.01 && varIJ > 0.001) {
2827 /* Set bionjWeight according to the BIONJ formula, where
2828 the variance matrix is approximated by
2830 Vij = ProfileVar(i,j) - varDiameter(i) - varDiameter(j)
2831 ProfileVar(i,j) = distance(i,j) = top(i,j)/weight(i,j)
2833 (The node's distance diameter does not affect the variances.)
2835 The BIONJ formula is equation 9 from Gascuel 1997:
2837 bionjWeight = 1/2 + sum(k!=i,j) (Vjk - Vik) / ((nActive-2)*Vij)
2838 sum(k!=i,j) (Vjk - Vik) = sum(k!=i,j) Vik - varDiameter(j) + varDiameter(i)
2839 = sum(k!=i,j) ProfileVar(j,k) - sum(k!=i,j) ProfileVar(i,k) + (nActive-2)*(varDiameter(i)-varDiameter(j))
2841 sum(k!=i,j) ProfileVar(i,k)
2842 ~= (sum(k!=i,j) distance(i,k) * weight(i,k))/(mean(k!=i,j) weight(i,k))
2843 ~= (N-2) * top(i, Out-i-j) / weight(i, Out-i-j)
2845 weight(i, Out-i-j) = N*weight(i,Out) - weight(i,i) - weight(i,j)
2846 top(i, Out-i-j) = N*top(i,Out) - top(i,i) - top(i,j)
2850 ProfileDist(NJ->profiles[join.i],NJ->outprofile,NJ->nPos,NJ->distance_matrix,/*OUT*/&outI);
2851 ProfileDist(NJ->profiles[join.j],NJ->outprofile,NJ->nPos,NJ->distance_matrix,/*OUT*/&outJ);
2854 double varIWeight = (nActive * outI.weight - NJ->selfweight[join.i] - join.weight);
2855 double varJWeight = (nActive * outJ.weight - NJ->selfweight[join.j] - join.weight);
2857 double varITop = outI.dist * outI.weight * nActive
2858 - NJ->selfdist[join.i] * NJ->selfweight[join.i] - rawIJ * join.weight;
2859 double varJTop = outJ.dist * outJ.weight * nActive
2860 - NJ->selfdist[join.j] * NJ->selfweight[join.j] - rawIJ * join.weight;
2862 double deltaProfileVarOut = (nActive-2) * (varJTop/varJWeight - varITop/varIWeight);
2863 double deltaVarDiam = (nActive-2)*(NJ->varDiameter[join.i] - NJ->varDiameter[join.j]);
2864 if (varJWeight > 0.01 && varIWeight > 0.01)
2865 bionjWeight = 0.5 + (deltaProfileVarOut+deltaVarDiam)/(2*(nActive-2)*varIJ);
2866 if(bionjWeight<0) bionjWeight=0;
2867 if(bionjWeight>1) bionjWeight=1;
2868 if (verbose>2) fprintf(stderr,"dVarO %f dVarDiam %f varIJ %f from dist %f weight %f (pos %d) bionjWeight %f %f\n",
2869 deltaProfileVarOut, deltaVarDiam,
2870 varIJ, join.dist, join.weight, NJ->nPos,
2871 bionjWeight, 1-bionjWeight);
2872 if (verbose>3 && (newnode%5) == 0) {
2873 /* Compare weight estimated from outprofiles from weight made by summing over other nodes */
2874 double deltaProfileVarTot = 0;
2875 for (iNode = 0; iNode < newnode; iNode++) {
2876 if (NJ->parent[iNode] < 0) { /* excludes join.i, join.j */
2878 ProfileDist(NJ->profiles[join.i],NJ->profiles[iNode],NJ->nPos,NJ->distance_matrix,/*OUT*/&di);
2879 ProfileDist(NJ->profiles[join.j],NJ->profiles[iNode],NJ->nPos,NJ->distance_matrix,/*OUT*/&dj);
2880 deltaProfileVarTot += dj.dist - di.dist;
2883 double lambdaTot = 0.5 + (deltaProfileVarTot+deltaVarDiam)/(2*(nActive-2)*varIJ);
2884 if (lambdaTot < 0) lambdaTot = 0;
2885 if (lambdaTot > 1) lambdaTot = 1;
2886 if (fabs(bionjWeight-lambdaTot) > 0.01 || verbose > 4)
2887 fprintf(stderr, "deltaProfileVar actual %.6f estimated %.6f lambda actual %.3f estimated %.3f\n",
2888 deltaProfileVarTot,deltaProfileVarOut,lambdaTot,bionjWeight);
2891 if (verbose > 2) fprintf(stderr, "Join\t%d\t%d\t%.6f\tlambda\t%.6f\tselfw\t%.3f\t%.3f\tnew\t%d\n",
2892 join.i < join.j ? join.i : join.j,
2893 join.i < join.j ? join.j : join.i,
2894 join.criterion, bionjWeight,
2895 NJ->selfweight[join.i < join.j ? join.i : join.j],
2896 NJ->selfweight[join.i < join.j ? join.j : join.i],
2899 NJ->diameter[newnode] = bionjWeight * (NJ->branchlength[join.i] + NJ->diameter[join.i])
2900 + (1-bionjWeight) * (NJ->branchlength[join.j] + NJ->diameter[join.j]);
2901 NJ->varDiameter[newnode] = bionjWeight * NJ->varDiameter[join.i]
2902 + (1-bionjWeight) * NJ->varDiameter[join.j]
2903 + bionjWeight * (1-bionjWeight) * varIJ;
2905 NJ->profiles[newnode] = AverageProfile(NJ->profiles[join.i],NJ->profiles[join.j],
2906 NJ->nPos, NJ->nConstraints,
2907 NJ->distance_matrix,
2908 bionj ? bionjWeight : /*noweight*/-1.0);
2910 /* Update out-distances and total diameters */
2911 int changedActiveOutProfile = nActiveOutProfileReset - (nActive-1);
2912 if (changedActiveOutProfile >= nResetOutProfile
2913 && changedActiveOutProfile >= fResetOutProfile * nActiveOutProfileReset) {
2914 /* Recompute the outprofile from scratch to avoid roundoff error */
2915 profile_t **activeProfiles = (profile_t**)mymalloc(sizeof(profile_t*)*(nActive-1));
2918 for (iNode=0;iNode<NJ->maxnode;iNode++) {
2919 if (NJ->parent[iNode]<0) {
2920 assert(nSaved < nActive-1);
2921 activeProfiles[nSaved++] = NJ->profiles[iNode];
2922 NJ->totdiam += NJ->diameter[iNode];
2925 assert(nSaved==nActive-1);
2926 FreeProfile(NJ->outprofile, NJ->nPos, NJ->nConstraints);
2927 if(verbose>2) fprintf(stderr,"Recomputing outprofile %d %d\n",nActiveOutProfileReset,nActive-1);
2928 NJ->outprofile = OutProfile(activeProfiles, nSaved,
2929 NJ->nPos, NJ->nConstraints,
2930 NJ->distance_matrix);
2931 activeProfiles = myfree(activeProfiles, sizeof(profile_t*)*(nActive-1));
2932 nActiveOutProfileReset = nActive-1;
2934 UpdateOutProfile(/*OUT*/NJ->outprofile,
2935 NJ->profiles[join.i], NJ->profiles[join.j], NJ->profiles[newnode],
2937 NJ->nPos, NJ->nConstraints,
2938 NJ->distance_matrix);
2939 NJ->totdiam += NJ->diameter[newnode] - NJ->diameter[join.i] - NJ->diameter[join.j];
2942 /* Store self-dist for use in other computations */
2944 ProfileDist(NJ->profiles[newnode],NJ->profiles[newnode],NJ->nPos,NJ->distance_matrix,/*OUT*/&selfdist);
2945 NJ->selfdist[newnode] = selfdist.dist;
2946 NJ->selfweight[newnode] = selfdist.weight;
2948 /* Find the best hit of the joined node IJ */
2950 TopHitJoin(newnode, /*IN/UPDATE*/NJ, nActive-1, /*IN/OUT*/tophits);
2952 /* Not using top-hits, so we update all out-distances */
2953 for (iNode = 0; iNode < NJ->maxnode; iNode++) {
2954 if (NJ->parent[iNode] < 0) {
2955 /* True nActive is now nActive-1 */
2956 SetOutDistance(/*IN/UPDATE*/NJ, iNode, nActive-1);
2960 if(visible != NULL) {
2961 SetBestHit(newnode, NJ, nActive-1, /*OUT*/&visible[newnode], /*OUT OPTIONAL*/besthitNew);
2963 fprintf(stderr,"Visible %d %d %f %f\n",
2964 visible[newnode].i, visible[newnode].j,
2965 visible[newnode].dist, visible[newnode].criterion);
2966 if (besthitNew != NULL) {
2967 /* Use distances to new node to update visible set entries that are non-optimal */
2968 for (iNode = 0; iNode < NJ->maxnode; iNode++) {
2969 if (NJ->parent[iNode] >= 0 || iNode == newnode)
2971 int iOldVisible = visible[iNode].j;
2972 assert(iOldVisible>=0);
2973 assert(visible[iNode].i == iNode);
2975 /* Update the criterion; use nActive-1 because haven't decremented nActive yet */
2976 if (NJ->parent[iOldVisible] < 0)
2977 SetCriterion(/*IN/OUT*/NJ, nActive-1, &visible[iNode]);
2979 if (NJ->parent[iOldVisible] >= 0
2980 || besthitNew[iNode].criterion < visible[iNode].criterion) {
2981 if(verbose>3) fprintf(stderr,"Visible %d reset from %d to %d (%f vs. %f)\n",
2983 newnode, visible[iNode].criterion, besthitNew[iNode].criterion);
2984 if(NJ->parent[iOldVisible] < 0) nVisibleUpdate++;
2985 visible[iNode].j = newnode;
2986 visible[iNode].dist = besthitNew[iNode].dist;
2987 visible[iNode].criterion = besthitNew[iNode].criterion;
2989 } /* end loop over all nodes */
2990 } /* end if recording all hits of new node */
2991 } /* end if keeping a visible set */
2992 } /* end else (m==0) */
2993 } /* end loop over nActive */
2997 struct mallinfo mi = mallinfo();
2998 fprintf(stderr, "Memory @ end of FastNJ(): %.2f MB (%.1f byte/pos) useful %.2f expected %.2f\n",
2999 (mi.arena+mi.hblkhd)/1.0e6, (mi.arena+mi.hblkhd)/(double)(NJ->nSeq*(double)NJ->nPos),
3000 mi.uordblks/1.0e6, mymallocUsed/1e6);
3004 /* We no longer need the tophits, visible set, etc. */
3005 if (visible != NULL) visible = myfree(visible,sizeof(besthit_t)*NJ->maxnodes);
3006 if (besthitNew != NULL) besthitNew = myfree(besthitNew,sizeof(besthit_t)*NJ->maxnodes);
3007 tophits = FreeTopHits(tophits);
3009 /* Add a root for the 3 remaining nodes */
3012 for (iNode = 0; iNode < NJ->maxnode; iNode++) {
3013 if (NJ->parent[iNode] < 0) {
3015 top[nTop++] = iNode;
3020 NJ->root = NJ->maxnode++;
3021 NJ->child[NJ->root].nChild = 3;
3022 for (nTop = 0; nTop < 3; nTop++) {
3023 NJ->parent[top[nTop]] = NJ->root;
3024 NJ->child[NJ->root].child[nTop] = top[nTop];
3027 besthit_t dist01, dist02, dist12;
3028 ProfileDist(NJ->profiles[top[0]], NJ->profiles[top[1]], NJ->nPos, NJ->distance_matrix, /*OUT*/&dist01);
3029 ProfileDist(NJ->profiles[top[0]], NJ->profiles[top[2]], NJ->nPos, NJ->distance_matrix, /*OUT*/&dist02);
3030 ProfileDist(NJ->profiles[top[1]], NJ->profiles[top[2]], NJ->nPos, NJ->distance_matrix, /*OUT*/&dist12);
3032 double d01 = dist01.dist - NJ->diameter[top[0]] - NJ->diameter[top[1]];
3033 double d02 = dist02.dist - NJ->diameter[top[0]] - NJ->diameter[top[2]];
3034 double d12 = dist12.dist - NJ->diameter[top[1]] - NJ->diameter[top[2]];
3035 NJ->branchlength[top[0]] = (d01 + d02 - d12)/2;
3036 NJ->branchlength[top[1]] = (d01 + d12 - d02)/2;
3037 NJ->branchlength[top[2]] = (d02 + d12 - d01)/2;
3039 /* Check how accurate the outprofile is */
3041 profile_t *p[3] = {NJ->profiles[top[0]], NJ->profiles[top[1]], NJ->profiles[top[2]]};
3042 profile_t *out = OutProfile(p, 3, NJ->nPos, NJ->nConstraints, NJ->distance_matrix);
3044 double freqerror = 0;
3045 double weighterror = 0;
3046 for (i=0;i<NJ->nPos;i++) {
3047 weighterror += fabs(out->weights[i] - NJ->outprofile->weights[i]);
3049 for(k=0;k<nCodes;k++)
3050 freqerror += fabs(out->vectors[nCodes*i+k] - NJ->outprofile->vectors[nCodes*i+k]);
3052 fprintf(stderr,"Roundoff error in outprofile@end: WeightError %f FreqError %f\n", weighterror, freqerror);
3053 FreeProfile(out, NJ->nPos, NJ->nConstraints);
3058 void ExhaustiveNJSearch(NJ_t *NJ, int nActive, /*OUT*/besthit_t *join) {
3063 join->criterion = 1e20;
3064 double bestCriterion = 1e20;
3067 for (i = 0; i < NJ->maxnode-1; i++) {
3068 if (NJ->parent[i] < 0) {
3069 for (j = i+1; j < NJ->maxnode; j++) {
3070 if (NJ->parent[j] < 0) {
3074 SetDistCriterion(NJ, nActive, /*IN/OUT*/&hit);
3075 if (hit.criterion < bestCriterion) {
3077 bestCriterion = hit.criterion;
3083 assert (join->i >= 0 && join->j >= 0);
3086 void FastNJSearch(NJ_t *NJ, int nActive, /*IN/OUT*/besthit_t *besthits, /*OUT*/besthit_t *join) {
3091 join->criterion = 1e20;
3093 for (iNode = 0; iNode < NJ->maxnode; iNode++) {
3094 int jNode = besthits[iNode].j;
3095 if (NJ->parent[iNode] < 0 && NJ->parent[jNode] < 0) { /* both i and j still active */
3096 /* recompute criterion to reflect the current out-distances */
3097 SetCriterion(NJ, nActive, /*IN/OUT*/&besthits[iNode]);
3098 if (besthits[iNode].criterion < join->criterion)
3099 *join = besthits[iNode];
3107 assert(join->i >= 0 && join->j >= 0);
3108 SetBestHit(join->i, NJ, nActive, /*OUT*/&besthits[join->i], /*OUT IGNORED*/NULL);
3109 if (besthits[join->i].j != join->j) {
3112 fprintf(stderr,"BetterI\t%d\t%d\t%d\t%d\t%f\t%f\n",
3113 join->i,join->j,besthits[join->i].i,besthits[join->i].j,
3114 join->criterion,besthits[join->i].criterion);
3117 /* Save the best hit either way, because the out-distance has probably changed
3118 since we started the computation. */
3119 join->j = besthits[join->i].j;
3120 join->weight = besthits[join->i].weight;
3121 join->dist = besthits[join->i].dist;
3122 join->criterion = besthits[join->i].criterion;
3124 SetBestHit(join->j, NJ, nActive, /*OUT*/&besthits[join->j], /*OUT IGNORE*/NULL);
3125 if (besthits[join->j].j != join->i) {
3128 fprintf(stderr,"BetterJ\t%d\t%d\t%d\t%d\t%f\t%f\n",
3129 join->i,join->j,besthits[join->j].i,besthits[join->j].j,
3130 join->criterion,besthits[join->j].criterion);
3131 join->i = besthits[join->j].j;
3132 join->weight = besthits[join->j].weight;
3133 join->dist = besthits[join->j].dist;
3134 join->criterion = besthits[join->j].criterion;
3136 if(changed) nHillBetter++;
3141 /* A token is one of ():;, or an alphanumeric string without whitespace
3142 Any whitespace between tokens is ignored */
3143 char *ReadTreeToken(FILE *fp) {
3144 static char buf[BUFFER_SIZE];
3147 for (c = fgetc(fp); c != EOF; c = fgetc(fp)) {
3148 if (c == '(' || c == ')' || c == ':' || c == ';' || c == ',') {
3149 /* standalone token */
3159 } else if (isspace(c)) {
3164 /* else ignore whitespace at beginning of token */
3166 /* not whitespace or standalone token */
3168 if (len >= BUFFER_SIZE) {
3169 buf[BUFFER_SIZE-1] = '\0';
3170 fprintf(stderr, "Token too long in tree file, token begins with\n%s\n", buf);
3176 /* return the token we have so far */
3184 void ReadTreeError(char *err, char *token) {
3185 fprintf(stderr, "Tree parse error: unexpected token '%s' -- %s\n",
3186 token == NULL ? "(End of file)" : token,
3191 void ReadTreeAddChild(int parent, int child, /*IN/OUT*/int *parents, /*IN/OUT*/children_t *children) {
3192 assert(parent >= 0);
3194 assert(parents[child] < 0);
3195 assert(children[parent].nChild < 3);
3196 parents[child] = parent;
3197 children[parent].child[children[parent].nChild++] = child;
3200 void ReadTreeMaybeAddLeaf(int parent, char *name,
3201 hashstrings_t *hashnames, uniquify_t *unique,
3202 /*IN/OUT*/int *parents, /*IN/OUT*/children_t *children) {
3203 hashiterator_t hi = FindMatch(hashnames,name);
3204 if (HashCount(hashnames,hi) != 1)
3205 ReadTreeError("not recognized as a sequence name", name);
3207 int iSeqNonunique = HashFirst(hashnames,hi);
3208 assert(iSeqNonunique >= 0 && iSeqNonunique < unique->nSeq);
3209 int iSeqUnique = unique->alnToUniq[iSeqNonunique];
3210 assert(iSeqUnique >= 0 && iSeqUnique < unique->nUnique);
3211 /* Either record this leaves' parent (if it is -1) or ignore this leaf (if already seen) */
3212 if (parents[iSeqUnique] < 0) {
3213 ReadTreeAddChild(parent, iSeqUnique, /*IN/OUT*/parents, /*IN/OUT*/children);
3215 fprintf(stderr, "Found leaf uniq%d name %s child of %d\n", iSeqUnique, name, parent);
3218 fprintf(stderr, "Skipped redundant leaf uniq%d name %s\n", iSeqUnique, name);
3222 void ReadTreeRemove(/*IN/OUT*/int *parents, /*IN/OUT*/children_t *children, int node) {
3224 fprintf(stderr,"Removing node %d parent %d\n", node, parents[node]);
3225 assert(parents[node] >= 0);
3226 int parent = parents[node];
3228 children_t *pc = &children[parent];
3230 for (oldn = 0; oldn < pc->nChild; oldn++) {
3231 if (pc->child[oldn] == node)
3234 assert(oldn < pc->nChild);
3236 /* move successor nodes back in child list and shorten list */
3238 for (i = oldn; i < pc->nChild-1; i++)
3239 pc->child[i] = pc->child[i+1];
3242 /* add its children to parent's child list */
3243 children_t *nc = &children[node];
3244 if (nc->nChild > 0) {
3245 assert(nc->nChild<=2);
3246 assert(pc->nChild < 3);
3247 assert(pc->nChild + nc->nChild <= 3);
3249 for (j = 0; j < nc->nChild; j++) {
3251 fprintf(stderr,"Repointing parent %d to child %d\n", parent, nc->child[j]);
3252 pc->child[pc->nChild++] = nc->child[j];
3253 parents[nc->child[j]] = parent;
3259 void ReadTree(/*IN/OUT*/NJ_t *NJ,
3260 /*IN*/uniquify_t *unique,
3261 /*IN*/hashstrings_t *hashnames,
3262 /*READ*/FILE *fpInTree) {
3263 assert(NJ->nSeq == unique->nUnique);
3264 /* First, do a preliminary parse of the tree to with non-unique leaves ignored
3265 We need to store this separately from NJ because it may have too many internal nodes
3266 (matching sequences show up once in the NJ but could be in multiple places in the tree)
3267 Will use iUnique as the index of nodes, as in the NJ structure
3269 int maxnodes = unique->nSeq*2;
3270 int maxnode = unique->nSeq;
3271 int *parent = (int*)mymalloc(sizeof(int)*maxnodes);
3272 children_t *children = (children_t *)mymalloc(sizeof(children_t)*maxnodes);
3273 int root = maxnode++;
3275 for (i = 0; i < maxnodes; i++) {
3277 children[i].nChild = 0;
3280 /* The stack is the current path to the root, with the root at the first (top) position */
3282 int *stack = (int*)mymalloc(sizeof(int)*maxnodes);
3288 token = ReadTreeToken(fpInTree);
3289 if (token == NULL || *token != '(')
3290 ReadTreeError("No '(' at start", token);
3291 /* nDown is still 0 because we have created the root */
3293 while ((token = ReadTreeToken(fpInTree)) != NULL) {
3294 if (nDown > 0) { /* In a stream of parentheses */
3297 else if (*token == ',' || *token == ';' || *token == ':' || *token == ')')
3298 ReadTreeError("while reading parentheses", token);
3300 /* Add intermediate nodes if nDown was > 1 (for nDown=1, the only new node is the leaf) */
3301 while (nDown-- > 0) {
3302 int new = maxnode++;
3303 assert(new < maxnodes);
3304 ReadTreeAddChild(stack[stack_size-1], new, /*IN/OUT*/parent, /*IN/OUT*/children);
3306 fprintf(stderr, "Added internal child %d of %d, stack size increase to %d\n",
3307 new, stack[stack_size-1],stack_size+1);
3308 stack[stack_size++] = new;
3309 assert(stack_size < maxnodes);
3311 ReadTreeMaybeAddLeaf(stack[stack_size-1], token,
3313 /*IN/OUT*/parent, /*IN/OUT*/children);
3315 } else if (nUp > 0) {
3316 if (*token == ';') { /* end the tree? */
3317 if (nUp != stack_size)
3318 ReadTreeError("unbalanced parentheses", token);
3321 } else if (*token == ')')
3323 else if (*token == '(')
3324 ReadTreeError("unexpected '(' after ')'", token);
3325 else if (*token == ':') {
3326 token = ReadTreeToken(fpInTree);
3327 /* Read the branch length and ignore it */
3328 if (token == NULL || (*token != '-' && !isdigit(*token)))
3329 ReadTreeError("not recognized as a branch length", token);
3330 } else if (*token == ',') {
3331 /* Go back up the stack the correct #times */
3335 fprintf(stderr, "Up to nUp=%d stack size %d at %d\n",
3336 nUp, stack_size, stack[stack_size-1]);
3337 if (stack_size <= 0)
3338 ReadTreeError("too many ')'", token);
3341 } else if (*token == '-' || isdigit(*token))
3342 ; /* ignore bootstrap value */
3344 fprintf(stderr, "Warning while parsing tree: non-numeric label %s for internal node\n",
3346 } else if (*token == '(') {
3348 } else if (*token == ')') {
3350 } else if (*token == ':') {
3351 token = ReadTreeToken(fpInTree);
3352 if (token == NULL || (*token != '-' && !isdigit(*token)))
3353 ReadTreeError("not recognized as a branch length", token);
3354 } else if (*token == ',') {
3356 } else if (*token == ';')
3357 ReadTreeError("unexpected token", token);
3359 ReadTreeMaybeAddLeaf(stack[stack_size-1], token,
3361 /*IN/OUT*/parent, /*IN/OUT*/children);
3364 /* Verify that all sequences were seen */
3365 for (i = 0; i < unique->nUnique; i++) {
3366 if (parent[i] < 0) {
3367 fprintf(stderr, "Alignment sequence %d (unique %d) absent from input tree\n"
3368 "The starting tree (the argument to -intree) must include all sequences in the alignment!\n",
3369 unique->uniqueFirst[i], i);
3374 /* Simplify the tree -- remove all internal nodes with < 2 children
3375 Keep trying until no nodes get removed
3380 /* Here stack is the list of nodes we haven't visited yet while doing
3384 while (stack_size > 0) {
3385 int node = stack[--stack_size];
3386 if (node >= unique->nUnique) { /* internal node */
3387 if (children[node].nChild <= 1) {
3389 ReadTreeRemove(/*IN/OUT*/parent,/*IN/OUT*/children,node);
3391 } else if (node == root && children[node].nChild == 1) {
3392 int newroot = children[node].child[0];
3393 parent[newroot] = -1;
3394 children[root].nChild = 0;
3397 fprintf(stderr,"Changed root from %d to %d\n",root,newroot);
3399 stack[stack_size++] = newroot;
3403 for (j = 0; j < children[node].nChild; j++) {
3404 assert(stack_size < maxnodes);
3405 stack[stack_size++] = children[node].child[j];
3407 fprintf(stderr,"Added %d to stack\n", stack[stack_size-1]);
3412 } while (nRemoved > 0);
3414 /* Simplify the root node to 3 children if it has 2 */
3415 if (children[root].nChild == 2) {
3416 for (i = 0; i < 2; i++) {
3417 int child = children[root].child[i];
3418 assert(child >= 0 && child < maxnodes);
3419 if (children[child].nChild == 2) {
3420 ReadTreeRemove(parent,children,child); /* replace root -> child -> A,B with root->A,B */
3426 for (i = 0; i < maxnodes; i++)
3428 fprintf(stderr,"Simplfied node %d has parent %d nchild %d\n",
3429 i, parent[i], children[i].nChild);
3431 /* Map the remaining internal nodes to NJ nodes */
3432 int *map = (int*)mymalloc(sizeof(int)*maxnodes);
3433 for (i = 0; i < unique->nUnique; i++)
3435 for (i = unique->nUnique; i < maxnodes; i++)
3439 while (stack_size > 0) {
3440 int node = stack[--stack_size];
3441 if (node >= unique->nUnique) { /* internal node */
3442 assert(node == root || children[node].nChild > 1);
3443 map[node] = NJ->maxnode++;
3444 for (i = 0; i < children[node].nChild; i++) {
3445 assert(stack_size < maxnodes);
3446 stack[stack_size++] = children[node].child[i];
3450 for (i = 0; i < maxnodes; i++)
3452 fprintf(stderr,"Map %d to %d (parent %d nchild %d)\n",
3453 i, map[i], parent[i], children[i].nChild);
3455 /* Set NJ->parent, NJ->children, NJ->root */
3456 NJ->root = map[root];
3458 for (node = 0; node < maxnodes; node++) {
3459 int njnode = map[node];
3461 NJ->child[njnode].nChild = children[node].nChild;
3462 for (i = 0; i < children[node].nChild; i++) {
3463 assert(children[node].child[i] >= 0 && children[node].child[i] < maxnodes);
3464 NJ->child[njnode].child[i] = map[children[node].child[i]];
3466 if (parent[node] >= 0)
3467 NJ->parent[njnode] = map[parent[node]];
3471 /* Make sure that parent/child relationships match */
3472 for (i = 0; i < NJ->maxnode; i++) {
3473 children_t *c = &NJ->child[i];
3475 for (j = 0; j < c->nChild;j++)
3476 assert(c->child[j] >= 0 && c->child[j] < NJ->maxnode && NJ->parent[c->child[j]] == i);
3478 assert(NJ->parent[NJ->root] < 0);
3480 map = myfree(map,sizeof(int)*maxnodes);
3481 stack = myfree(stack,sizeof(int)*maxnodes);
3482 children = myfree(children,sizeof(children_t)*maxnodes);
3483 parent = myfree(parent,sizeof(int)*maxnodes);
3485 /* Compute profiles as balanced -- the NNI stage will recompute these
3488 traversal_t traversal = InitTraversal(NJ);
3490 while((node = TraversePostorder(node, NJ, /*IN/OUT*/traversal, /*pUp*/NULL)) >= 0) {
3491 if (node >= NJ->nSeq && node != NJ->root)
3492 SetProfile(/*IN/OUT*/NJ, node, /*noweight*/-1.0);
3494 traversal = FreeTraversal(traversal,NJ);
3497 /* Print topology using node indices as node names */
3498 void PrintNJInternal(FILE *fp, NJ_t *NJ, bool useLen) {
3502 typedef struct { int node; int end; } stack_t;
3503 stack_t *stack = (stack_t *)mymalloc(sizeof(stack_t)*NJ->maxnodes);
3505 stack[0].node = NJ->root;
3508 while(stackSize>0) {
3509 stack_t *last = &stack[stackSize-1];
3511 /* Save last, as we are about to overwrite it */
3512 int node = last->node;
3513 int end = last->end;
3515 if (node < NJ->nSeq) {
3516 if (NJ->child[NJ->parent[node]].child[0] != node) fputs(",",fp);
3517 fprintf(fp, "%d", node);
3519 fprintf(fp, ":%.4f", NJ->branchlength[node]);
3521 fprintf(fp, ")%d", node);
3523 fprintf(fp, ":%.4f", NJ->branchlength[node]);
3525 if (node != NJ->root && NJ->child[NJ->parent[node]].child[0] != node) fprintf(fp, ",");
3528 stack[stackSize-1].node = node;
3529 stack[stackSize-1].end = 1;
3530 children_t *c = &NJ->child[node];
3531 /* put children on in reverse order because we use the last one first */
3533 for (i = c->nChild-1; i >=0; i--) {
3535 stack[stackSize-1].node = c->child[i];
3536 stack[stackSize-1].end = 0;
3541 stack = myfree(stack, sizeof(stack_t)*NJ->maxnodes);
3544 void PrintNJ(FILE *fp, NJ_t *NJ, char **names, uniquify_t *unique, bool bShowSupport, bool bQuote) {
3545 /* And print the tree: depth first search
3546 * The stack contains
3547 * list of remaining children with their depth
3548 * parent node, with a flag of -1 so I know to print right-paren
3550 if (NJ->nSeq==1 && unique->alnNext[unique->uniqueFirst[0]] >= 0) {
3551 /* Special case -- otherwise we end up with double parens */
3552 int first = unique->uniqueFirst[0];
3553 assert(first >= 0 && first < unique->nSeq);
3554 fprintf(fp, bQuote ? "('%s':0.0" : "(%s:0.0", names[first]);
3555 int iName = unique->alnNext[first];
3556 while (iName >= 0) {
3557 assert(iName < unique->nSeq);
3558 fprintf(fp, bQuote ? ",'%s':0.0" : ",%s:0.0", names[iName]);
3559 iName = unique->alnNext[iName];
3565 typedef struct { int node; int end; } stack_t;
3566 stack_t *stack = (stack_t *)mymalloc(sizeof(stack_t)*NJ->maxnodes);
3568 stack[0].node = NJ->root;
3571 while(stackSize>0) {
3572 stack_t *last = &stack[stackSize-1];
3574 /* Save last, as we are about to overwrite it */
3575 int node = last->node;
3576 int end = last->end;
3578 if (node < NJ->nSeq) {
3579 if (NJ->child[NJ->parent[node]].child[0] != node) fputs(",",fp);
3580 int first = unique->uniqueFirst[node];
3581 assert(first >= 0 && first < unique->nSeq);
3582 /* Print the name, or the subtree of duplicate names */
3583 if (unique->alnNext[first] == -1) {
3584 fprintf(fp, bQuote ? "'%s'" : "%s", names[first]);
3586 fprintf(fp, bQuote ? "('%s':0.0" : "(%s:0.0", names[first]);
3587 int iName = unique->alnNext[first];
3588 while (iName >= 0) {
3589 assert(iName < unique->nSeq);
3590 fprintf(fp, bQuote ? ",'%s':0.0" : ",%s:0.0", names[iName]);
3591 iName = unique->alnNext[iName];
3595 /* Print the branch length */
3597 #define FP_FORMAT "%.9f"
3599 #define FP_FORMAT "%.5f"
3601 fprintf(fp, ":" FP_FORMAT, NJ->branchlength[node]);
3603 if (node == NJ->root)
3605 else if (bShowSupport)
3606 fprintf(fp, ")%.3f:" FP_FORMAT, NJ->support[node], NJ->branchlength[node]);
3608 fprintf(fp, "):" FP_FORMAT, NJ->branchlength[node]);
3610 if (node != NJ->root && NJ->child[NJ->parent[node]].child[0] != node) fprintf(fp, ",");
3613 stack[stackSize-1].node = node;
3614 stack[stackSize-1].end = 1;
3615 children_t *c = &NJ->child[node];
3616 /* put children on in reverse order because we use the last one first */
3618 for (i = c->nChild-1; i >=0; i--) {
3620 stack[stackSize-1].node = c->child[i];
3621 stack[stackSize-1].end = 0;
3626 stack = myfree(stack, sizeof(stack_t)*NJ->maxnodes);
3629 alignment_t *ReadAlignment(/*IN*/FILE *fp, bool bQuote) {
3630 /* bQuote supports the -quote option */
3633 char **names = NULL;
3635 char buf[BUFFER_SIZE] = "";
3636 if (fgets(buf,sizeof(buf),fp) == NULL) {
3637 fprintf(stderr, "Error reading header line\n");
3641 if (buf[0] == '>') {
3642 /* FASTA, truncate names at any of these */
3643 char *nameStop = bQuote ? "'\t\r\n" : "(),: \t\r\n";
3644 char *seqSkip = " \t\r\n"; /* skip these characters in the sequence */
3645 seqs = (char**)mymalloc(sizeof(char*) * nSaved);
3646 names = (char**)mymalloc(sizeof(char*) * nSaved);
3649 /* loop over lines */
3650 if (buf[0] == '>') {
3651 /* truncate the name */
3653 for (p = buf+1; *p != '\0'; p++) {
3654 for (q = nameStop; *q != '\0'; q++) {
3660 if (*p == '\0') break;
3663 /* allocate space for another sequence */
3665 if (nSeq > nSaved) {
3666 int nNewSaved = nSaved*2;
3667 seqs = myrealloc(seqs,sizeof(char*)*nSaved,sizeof(char*)*nNewSaved, /*copy*/false);
3668 names = myrealloc(names,sizeof(char*)*nSaved,sizeof(char*)*nNewSaved, /*copy*/false);
3671 names[nSeq-1] = (char*)mymemdup(buf+1,strlen(buf));
3672 seqs[nSeq-1] = NULL;
3674 /* count non-space characters and append to sequence */
3677 for (p=buf; *p != '\0'; p++) {
3678 for (q=seqSkip; *q != '\0'; q++) {
3685 int nOld = (seqs[nSeq-1] == NULL) ? 0 : strlen(seqs[nSeq-1]);
3686 seqs[nSeq-1] = (char*)myrealloc(seqs[nSeq-1], nOld, nOld+nKeep+1, /*copy*/false);
3687 if (nOld+nKeep > nPos)
3688 nPos = nOld + nKeep;
3689 char *out = seqs[nSeq-1] + nOld;
3690 for (p=buf; *p != '\0'; p++) {
3691 for (q=seqSkip; *q != '\0'; q++) {
3700 assert(out-seqs[nSeq-1] == nKeep + nOld);
3703 } while(fgets(buf,sizeof(buf),fp) != NULL);
3705 if (seqs[nSeq-1] == NULL) {
3706 fprintf(stderr, "No sequence data for last entry %s\n",names[nSeq-1]);
3709 names = myrealloc(names,sizeof(char*)*nSaved,sizeof(char*)*nSeq, /*copy*/false);
3710 seqs = myrealloc(seqs,sizeof(char*)*nSaved,sizeof(char*)*nSeq, /*copy*/false);
3712 /* PHYLIP interleaved-like format
3713 Allow arbitrary length names, require spaces between names and sequences
3714 Allow multiple alignments, either separated by a single empty line (e.g. seqboot output)
3717 if (buf[0] == '\n' || buf[0] == '\r') {
3718 if (fgets(buf,sizeof(buf),fp) == NULL) {
3719 fprintf(stderr, "Empty header line followed by EOF\n");
3723 if (sscanf(buf, "%d%d", &nSeq, &nPos) != 2
3724 || nSeq < 1 || nPos < 1) {
3725 fprintf(stderr, "Error parsing header line:%s\n", buf);
3728 names = (char **)mymalloc(sizeof(char*) * nSeq);
3729 seqs = (char **)mymalloc(sizeof(char*) * nSeq);
3733 for (i = 0; i < nSeq; i++) {
3735 seqs[i] = (char *)mymalloc(nPos+1); /* null-terminate */
3740 while(fgets(buf,sizeof(buf),fp)) {
3741 if ((buf[0] == '\n' || buf[0] == '\r') && (iSeq == nSeq || iSeq == 0)) {
3744 int j = 0; /* character just past end of name */
3745 if (buf[0] == ' ') {
3746 if (names[iSeq] == NULL) {
3747 fprintf(stderr, "No name in phylip line %s", buf);
3751 while (buf[j] != '\n' && buf[j] != '\0' && buf[j] != ' ')
3753 if (buf[j] != ' ' || j == 0) {
3754 fprintf(stderr, "No sequence in phylip line %s", buf);
3758 fprintf(stderr, "No empty line between sequence blocks (is the sequence count wrong?)\n");
3761 if (names[iSeq] == NULL) {
3763 names[iSeq] = (char *)mymalloc(j+1);
3765 for (k = 0; k < j; k++) names[iSeq][k] = buf[k];
3766 names[iSeq][j] = '\0';
3768 /* check the name */
3771 for (k = 0; k < j; k++) {
3772 if (names[iSeq][k] != buf[k]) {
3777 if (!match || names[iSeq][j] != '\0') {
3778 fprintf(stderr, "Wrong name in phylip line %s\nExpected %s\n", buf, names[iSeq]);
3783 int seqlen = strlen(seqs[iSeq]);
3784 for (; buf[j] != '\n' && buf[j] != '\0'; j++) {
3785 if (buf[j] != ' ') {
3786 if (seqlen >= nPos) {
3787 fprintf(stderr, "Too many characters (expected %d) for sequence named %s\nSo far have:\n%s\n",
3788 nPos, names[iSeq], seqs[iSeq]);
3791 seqs[iSeq][seqlen++] = toupper(buf[j]);
3794 seqs[iSeq][seqlen] = '\0'; /* null-terminate */
3795 if(verbose>10) fprintf(stderr,"Read iSeq %d name %s seqsofar %s\n", iSeq, names[iSeq], seqs[iSeq]);
3797 if (iSeq == nSeq && strlen(seqs[0]) == nPos)
3798 break; /* finished alignment */
3799 } /* end else non-empty phylip line */
3801 if (iSeq != nSeq && iSeq != 0) {
3802 fprintf(stderr, "Wrong number of sequences: expected %d\n", nSeq);
3806 /* Check lengths of sequences */
3808 for (i = 0; i < nSeq; i++) {
3809 int seqlen = strlen(seqs[i]);
3810 if (seqlen != nPos) {
3811 fprintf(stderr, "Wrong number of characters for %s: expected %d but have %d instead.\n"
3812 "This sequence may be truncated, or another sequence may be too long.\n",
3813 names[i], nPos, seqlen);
3817 /* Replace "." with "-" and warn if we find any */
3818 /* If nucleotide sequences, replace U with T and N with X */
3819 bool findDot = false;
3820 for (i = 0; i < nSeq; i++) {
3822 for (p = seqs[i]; *p != '\0'; p++) {
3827 if (nCodes == 4 && *p == 'U')
3829 if (nCodes == 4 && *p == 'N')
3834 fprintf(stderr, "Warning! Found \".\" character(s). These are treated as gaps\n");
3837 fprintf(stderr, "Error reading input file\n");
3841 alignment_t *align = (alignment_t*)mymalloc(sizeof(alignment_t));
3844 align->names = names;
3846 align->nSaved = nSaved;
3850 void FreeAlignmentSeqs(/*IN/OUT*/alignment_t *aln) {
3851 assert(aln != NULL);
3853 for (i = 0; i < aln->nSeq; i++)
3854 aln->seqs[i] = myfree(aln->seqs[i], aln->nPos+1);
3857 alignment_t *FreeAlignment(alignment_t *aln) {
3861 for (i = 0; i < aln->nSeq; i++) {
3862 aln->names[i] = myfree(aln->names[i],strlen(aln->names[i])+1);
3863 aln->seqs[i] = myfree(aln->seqs[i], aln->nPos+1);
3865 aln->names = myfree(aln->names, sizeof(char*)*aln->nSaved);
3866 aln->seqs = myfree(aln->seqs, sizeof(char*)*aln->nSaved);
3867 myfree(aln, sizeof(alignment_t));
3871 char **AlnToConstraints(alignment_t *constraints, uniquify_t *unique, hashstrings_t *hashnames) {
3872 /* look up constraints as names and map to unique-space */
3873 char ** uniqConstraints = (char**)mymalloc(sizeof(char*) * unique->nUnique);
3875 for (i = 0; i < unique->nUnique; i++)
3876 uniqConstraints[i] = NULL;
3877 for (i = 0; i < constraints->nSeq; i++) {
3878 char *name = constraints->names[i];
3879 char *constraintSeq = constraints->seqs[i];
3880 hashiterator_t hi = FindMatch(hashnames,name);
3881 if (HashCount(hashnames,hi) != 1) {
3882 fprintf(stderr, "Sequence %s from constraints file is not in the alignment\n", name);
3885 int iSeqNonunique = HashFirst(hashnames,hi);
3886 assert(iSeqNonunique >= 0 && iSeqNonunique < unique->nSeq);
3887 int iSeqUnique = unique->alnToUniq[iSeqNonunique];
3888 assert(iSeqUnique >= 0 && iSeqUnique < unique->nUnique);
3889 if (uniqConstraints[iSeqUnique] != NULL) {
3890 /* Already set a constraint for this group of sequences!
3891 Warn that we are ignoring this one unless the constraints match */
3892 if (strcmp(uniqConstraints[iSeqUnique],constraintSeq) != 0) {
3894 "Warning: ignoring constraints for %s:\n%s\n"
3895 "Another sequence has the same sequence but different constraints\n",
3896 name, constraintSeq);
3899 uniqConstraints[iSeqUnique] = constraintSeq;
3902 return(uniqConstraints);
3906 profile_t *SeqToProfile(/*IN/OUT*/NJ_t *NJ,
3907 char *seq, int nPos,
3908 /*OPTIONAL*/char *constraintSeq, int nConstraints,
3910 unsigned long counts[256]) {
3911 static unsigned char charToCode[256];
3912 static int codeSet = 0;
3916 for (c = 0; c < 256; c++) {
3917 charToCode[c] = nCodes;
3919 for (i = 0; codesString[i]; i++) {
3920 charToCode[codesString[i]] = i;
3921 charToCode[tolower(codesString[i])] = i;
3923 charToCode['-'] = NOCODE;
3927 assert(strlen(seq) == nPos);
3928 profile_t *profile = NewProfile(nPos,nConstraints);
3930 for (i = 0; i < nPos; i++) {
3931 unsigned int character = (unsigned int) seq[i];
3932 counts[character]++;
3933 c = charToCode[character];
3934 if(verbose>10 && i < 2) fprintf(stderr,"pos %d char %c code %d\n", i, seq[i], c);
3935 /* treat unknowns as gaps */
3936 if (c == nCodes || c == NOCODE) {
3937 profile->codes[i] = NOCODE;
3938 profile->weights[i] = 0.0;
3940 profile->codes[i] = c;
3941 profile->weights[i] = 1.0;
3944 if (nConstraints > 0) {
3945 for (i = 0; i < nConstraints; i++) {
3946 profile->nOn[i] = 0;
3947 profile->nOff[i] = 0;
3950 if (constraintSeq != NULL) {
3951 assert(strlen(constraintSeq) == nConstraints);
3952 for (i = 0; i < nConstraints; i++) {
3953 if (constraintSeq[i] == '1') {
3954 profile->nOn[i] = 1;
3955 } else if (constraintSeq[i] == '0') {
3956 profile->nOff[i] = 1;
3957 } else if (constraintSeq[i] != '-') {
3959 fprintf(stderr, "Constraint characters in unique sequence %d replaced with gap:", iNode+1);
3962 fprintf(stderr, " %c%d", constraintSeq[i], i+1);
3963 /* For the benefit of ConstraintSequencePenalty -- this is a bit of a hack, as
3964 this modifies the value read from the alignment
3966 constraintSeq[i] = '-';
3970 fprintf(stderr, "\n");
3976 void SeqDist(unsigned char *codes1, unsigned char *codes2, int nPos,
3977 distance_matrix_t *dmat,
3978 /*OUT*/besthit_t *hit) {
3979 double top = 0; /* summed over positions */
3984 for (i = 0; i < nPos; i++) {
3985 if (codes1[i] != NOCODE && codes2[i] != NOCODE) {
3987 if (codes1[i] != codes2[i]) nDiff++;
3990 top = (double)nDiff;
3992 for (i = 0; i < nPos; i++) {
3993 if (codes1[i] != NOCODE && codes2[i] != NOCODE) {
3995 top += dmat->distances[(unsigned int)codes1[i]][(unsigned int)codes2[i]];
3999 hit->weight = (double)nUse;
4000 hit->dist = nUse > 0 ? top/(double)nUse : 1.0;
4004 void CorrectedPairDistances(profile_t **profiles, int nProfiles,
4005 /*OPTIONAL*/distance_matrix_t *distance_matrix,
4007 /*OUT*/double *distances) {
4008 assert(distances != NULL);
4009 assert(profiles != NULL);
4010 assert(nProfiles>1 && nProfiles <= 4);
4014 for (iHit=0, i=0; i < nProfiles; i++) {
4015 for (j=i+1; j < nProfiles; j++, iHit++) {
4016 ProfileDist(profiles[i],profiles[j],nPos,distance_matrix,/*OUT*/&hit[iHit]);
4017 distances[iHit] = hit[iHit].dist;
4020 if (pseudoWeight > 0) {
4021 /* Estimate the prior distance */
4024 for (iHit=0; iHit < (nProfiles*(nProfiles-1))/2; iHit++) {
4025 dTop += hit[iHit].dist * hit[iHit].weight;
4026 dBottom += hit[iHit].weight;
4028 double prior = (dBottom > 0.01) ? dTop/dBottom : 3.0;
4029 for (iHit=0; iHit < (nProfiles*(nProfiles-1))/2; iHit++)
4030 distances[iHit] = (distances[iHit] * hit[iHit].weight + prior * pseudoWeight)
4031 / (hit[iHit].weight + pseudoWeight);
4034 for (iHit=0; iHit < (nProfiles*(nProfiles-1))/2; iHit++)
4035 distances[iHit] = LogCorrect(distances[iHit]);
4039 /* During the neighbor-joining phase, a join only violates our constraints if
4040 node1, node2, and other are all represented in the constraint
4041 and if one of the 3 is split and the other two do not agree
4043 int JoinConstraintPenalty(/*IN*/NJ_t *NJ, int node1, int node2) {
4044 if (NJ->nConstraints == 0)
4048 for (iC = 0; iC < NJ->nConstraints; iC++)
4049 penalty += JoinConstraintPenaltyPiece(NJ, node1, node2, iC);
4053 int JoinConstraintPenaltyPiece(NJ_t *NJ, int node1, int node2, int iC) {
4054 profile_t *pOut = NJ->outprofile;
4055 profile_t *p1 = NJ->profiles[node1];
4056 profile_t *p2 = NJ->profiles[node2];
4057 int nOn1 = p1->nOn[iC];
4058 int nOff1 = p1->nOff[iC];
4059 int nOn2 = p2->nOn[iC];
4060 int nOff2 = p2->nOff[iC];
4061 int nOnOut = pOut->nOn[iC] - nOn1 - nOn2;
4062 int nOffOut = pOut->nOff[iC] - nOff1 - nOff2;
4064 if ((nOn1+nOff1) > 0 && (nOn2+nOff2) > 0 && (nOnOut+nOffOut) > 0) {
4065 /* code is -1 for split, 0 for off, 1 for on */
4066 int code1 = (nOn1 > 0 && nOff1 > 0) ? -1 : (nOn1 > 0 ? 1 : 0);
4067 int code2 = (nOn2 > 0 && nOff2 > 0) ? -1 : (nOn2 > 0 ? 1 : 0);
4068 int code3 = (nOnOut > 0 && nOffOut) > 0 ? -1 : (nOnOut > 0 ? 1 : 0);
4069 int nSplit = (code1 == -1 ? 1 : 0) + (code2 == -1 ? 1 : 0) + (code3 == -1 ? 1 : 0);
4070 int nOn = (code1 == 1 ? 1 : 0) + (code2 == 1 ? 1 : 0) + (code3 == 1 ? 1 : 0);
4071 if (nSplit == 1 && nOn == 1)
4072 return(SplitConstraintPenalty(nOn1+nOn2, nOff1+nOff2, nOnOut, nOffOut));
4078 void QuartetConstraintPenalties(profile_t *profiles[4], int nConstraints, /*OUT*/double penalty[3]) {
4080 for (i=0; i < 3; i++)
4082 if(nConstraints == 0)
4085 for (iC = 0; iC < nConstraints; iC++) {
4087 if (QuartetConstraintPenaltiesPiece(profiles, iC, /*OUT*/part)) {
4089 penalty[i] += part[i];
4092 && (fabs(part[ABvsCD]-part[ACvsBD]) > 0.001 || fabs(part[ABvsCD]-part[ADvsBC]) > 0.001))
4093 fprintf(stderr, "Constraint Penalties at %d: ABvsCD %.3f ACvsBD %.3f ADvsBC %.3f %d/%d %d/%d %d/%d %d/%d\n",
4094 iC, part[ABvsCD], part[ACvsBD], part[ADvsBC],
4095 profiles[0]->nOn[iC], profiles[0]->nOff[iC],
4096 profiles[1]->nOn[iC], profiles[1]->nOff[iC],
4097 profiles[2]->nOn[iC], profiles[2]->nOff[iC],
4098 profiles[3]->nOn[iC], profiles[3]->nOff[iC]);
4102 fprintf(stderr, "Total Constraint Penalties: ABvsCD %.3f ACvsBD %.3f ADvsBC %.3f\n",
4103 penalty[ABvsCD], penalty[ACvsBD], penalty[ADvsBC]);
4106 double PairConstraintDistance(int nOn1, int nOff1, int nOn2, int nOff2) {
4107 double f1 = nOn1/(double)(nOn1+nOff1);
4108 double f2 = nOn2/(double)(nOn2+nOff2);
4109 /* 1 - f1 * f2 - (1-f1)*(1-f2) = 1 - f1 * f2 - 1 + f1 + f2 - f1 * f2 */
4110 return(f1 + f2 - 2.0 * f1 * f2);
4113 bool QuartetConstraintPenaltiesPiece(profile_t *profiles[4], int iC, /*OUT*/double piece[3]) {
4121 for (i=0; i < 4; i++) {
4122 nOn[i] = profiles[i]->nOn[iC];
4123 nOff[i] = profiles[i]->nOff[iC];
4124 if (nOn[i] + nOff[i] == 0)
4125 return(false); /* ignore */
4126 else if (nOn[i] > 0 && nOff[i] > 0)
4128 else if (nOn[i] > 0)
4133 /* If just one of them is split or on the other side and the others all agree, also ignore */
4134 if (nPlus >= 3 || nMinus >= 3)
4136 piece[ABvsCD] = constraintWeight
4137 * (PairConstraintDistance(nOn[0],nOff[0],nOn[1],nOff[1])
4138 + PairConstraintDistance(nOn[2],nOff[2],nOn[3],nOff[3]));
4139 piece[ACvsBD] = constraintWeight
4140 * (PairConstraintDistance(nOn[0],nOff[0],nOn[2],nOff[2])
4141 + PairConstraintDistance(nOn[1],nOff[1],nOn[3],nOff[3]));
4142 piece[ADvsBC] = constraintWeight
4143 * (PairConstraintDistance(nOn[0],nOff[0],nOn[3],nOff[3])
4144 + PairConstraintDistance(nOn[2],nOff[2],nOn[1],nOff[1]));
4148 /* Minimum number of constrained leaves that need to be moved
4149 to satisfy the constraint (or 0 if constraint is satisfied)
4150 Defining it this way should ensure that SPR moves that break
4151 constraints get a penalty
4153 int SplitConstraintPenalty(int nOn1, int nOff1, int nOn2, int nOff2) {
4154 return(nOn1 + nOff2 < nOn2 + nOff1 ?
4155 (nOn1 < nOff2 ? nOn1 : nOff2)
4156 : (nOn2 < nOff1 ? nOn2 : nOff1));
4159 bool SplitViolatesConstraint(profile_t *profiles[4], int iConstraint) {
4161 int codes[4]; /* 0 for off, 1 for on, -1 for split (quit if not constrained at all) */
4162 for (i = 0; i < 4; i++) {
4163 if (profiles[i]->nOn[iConstraint] + profiles[i]->nOff[iConstraint] == 0)
4165 else if (profiles[i]->nOn[iConstraint] > 0 && profiles[i]->nOff[iConstraint] == 0)
4167 else if (profiles[i]->nOn[iConstraint] == 0 && profiles[i]->nOff[iConstraint] > 0)
4174 for (i = 0; i < 4; i++) {
4177 else if (codes[i] == 1)
4180 /* 3 on one side means no violation, even if other is code -1
4181 otherwise must have code != -1 and agreement on the split
4183 if (n0 >= 3 || n1 >= 3)
4185 if (n0==2 && n1==2 && codes[0] == codes[1] && codes[2] == codes[3])
4190 double LogCorrect(double dist) {
4191 const double maxscore = 3.0;
4192 if (nCodes == 4 && !useMatrix) { /* Jukes-Cantor */
4193 dist = dist < 0.74 ? -0.75*log(1.0 - dist * 4.0/3.0) : maxscore;
4194 } else { /* scoredist-like */
4195 dist = dist < 0.99 ? -1.3*log(1.0 - dist) : maxscore;
4197 return (dist < maxscore ? dist : maxscore);
4200 /* A helper function -- f1 and f2 can be NULL if the corresponding code != NOCODE
4202 double ProfileDistPiece(unsigned int code1, unsigned int code2,
4203 numeric_t *f1, numeric_t *f2,
4204 /*OPTIONAL*/distance_matrix_t *dmat,
4205 /*OPTIONAL*/numeric_t *codeDist2) {
4207 if (code1 != NOCODE && code2 != NOCODE) { /* code1 vs code2 */
4208 return(dmat->distances[code1][code2]);
4209 } else if (codeDist2 != NULL && code1 != NOCODE) { /* code1 vs. codeDist2 */
4210 return(codeDist2[code1]);
4211 } else { /* f1 vs f2 */
4213 if(code1 == NOCODE) return(10.0);
4214 f1 = &dmat->codeFreq[code1][0];
4217 if(code2 == NOCODE) return(10.0);
4218 f2 = &dmat->codeFreq[code2][0];
4220 return(vector_multiply3_sum(f1,f2,dmat->eigenval,nCodes));
4224 if (code1 != NOCODE) {
4225 if (code2 != NOCODE) {
4226 return(code1 == code2 ? 0.0 : 1.0); /* code1 vs code2 */
4228 if(f2 == NULL) return(10.0);
4229 return(1.0 - f2[code1]); /* code1 vs. f2 */
4232 if (code2 != NOCODE) {
4233 if(f1 == NULL) return(10.0);
4234 return(1.0 - f1[code2]); /* f1 vs code2 */
4235 } else { /* f1 vs. f2 */
4236 if (f1 == NULL || f2 == NULL) return(10.0);
4239 for (k = 0; k < nCodes; k++) {
4240 piece -= f1[k] * f2[k];
4249 /* E.g. GET_FREQ(profile,iPos,iVector)
4250 Gets the next element of the vectors (and updates iVector), or
4251 returns NULL if we didn't store a vector
4253 #define GET_FREQ(P,I,IVECTOR) \
4254 (P->weights[I] > 0 && P->codes[I] == NOCODE ? &P->vectors[nCodes*(IVECTOR++)] : NULL)
4256 void ProfileDist(profile_t *profile1, profile_t *profile2, int nPos,
4257 /*OPTIONAL*/distance_matrix_t *dmat,
4258 /*OUT*/besthit_t *hit) {
4264 for (i = 0; i < nPos; i++) {
4265 numeric_t *f1 = GET_FREQ(profile1,i,/*IN/OUT*/iFreq1);
4266 numeric_t *f2 = GET_FREQ(profile2,i,/*IN/OUT*/iFreq2);
4267 if (profile1->weights[i] > 0 && profile2->weights[i] > 0) {
4268 double weight = profile1->weights[i] * profile2->weights[i];
4270 double piece = ProfileDistPiece(profile1->codes[i],profile2->codes[i],f1,f2,dmat,
4271 profile2->codeDist ? &profile2->codeDist[i*nCodes] : NULL);
4272 top += weight * piece;
4275 assert(iFreq1 == profile1->nVectors);
4276 assert(iFreq2 == profile2->nVectors);
4277 hit->weight = denom > 0 ? denom : 0.01; /* 0.01 is an arbitrarily low value of weight (normally >>1) */
4278 hit->dist = denom > 0 ? top/denom : 1;
4282 /* This should not be called if the update weight is 0, as
4283 in that case code==NOCODE and in=NULL is possible, and then
4286 void AddToFreq(/*IN/OUT*/numeric_t *fOut,
4288 unsigned int codeIn, /*OPTIONAL*/numeric_t *fIn,
4289 /*OPTIONAL*/distance_matrix_t *dmat) {
4290 assert(fOut != NULL);
4292 vector_add_mult(fOut, fIn, weight, nCodes);
4294 assert(codeIn != NOCODE);
4295 vector_add_mult(fOut, dmat->codeFreq[codeIn], weight, nCodes);
4297 assert(codeIn != NOCODE);
4298 fOut[codeIn] += weight;
4302 void SetProfile(/*IN/OUT*/NJ_t *NJ, int node, double weight1) {
4303 children_t *c = &NJ->child[node];
4304 assert(c->nChild == 2);
4305 assert(NJ->profiles[c->child[0]] != NULL);
4306 assert(NJ->profiles[c->child[1]] != NULL);
4307 if (NJ->profiles[node] != NULL)
4308 FreeProfile(NJ->profiles[node], NJ->nPos, NJ->nConstraints);
4309 NJ->profiles[node] = AverageProfile(NJ->profiles[c->child[0]],
4310 NJ->profiles[c->child[1]],
4311 NJ->nPos, NJ->nConstraints,
4312 NJ->distance_matrix,
4316 /* bionjWeight is the weight of the first sequence (between 0 and 1),
4317 or -1 to do the average.
4319 profile_t *AverageProfile(profile_t *profile1, profile_t *profile2,
4320 int nPos, int nConstraints,
4321 distance_matrix_t *dmat,
4322 double bionjWeight) {
4324 if (bionjWeight < 0) {
4328 /* First, set codes and weights and see how big vectors will be */
4329 profile_t *out = NewProfile(nPos, nConstraints);
4331 for (i = 0; i < nPos; i++) {
4332 out->weights[i] = bionjWeight * profile1->weights[i]
4333 + (1-bionjWeight) * profile2->weights[i];
4334 out->codes[i] = NOCODE;
4335 if (out->weights[i] > 0) {
4336 if (profile1->weights[i] > 0 && profile1->codes[i] != NOCODE
4337 && (profile2->weights[i] <= 0 || profile1->codes[i] == profile2->codes[i])) {
4338 out->codes[i] = profile1->codes[i];
4339 } else if (profile1->weights[i] <= 0
4340 && profile2->weights[i] > 0
4341 && profile2->codes[i] != NOCODE) {
4342 out->codes[i] = profile2->codes[i];
4344 if (out->codes[i] == NOCODE) out->nVectors++;
4348 /* Allocate and set the vectors */
4349 out->vectors = (numeric_t*)mymalloc(sizeof(numeric_t)*nCodes*out->nVectors);
4350 for (i = 0; i < nCodes * out->nVectors; i++) out->vectors[i] = 0;
4351 nProfileFreqAlloc += out->nVectors;
4352 nProfileFreqAvoid += nPos - out->nVectors;
4356 for (i=0; i < nPos; i++) {
4357 numeric_t *f = GET_FREQ(out,i,/*IN/OUT*/iFreqOut);
4358 numeric_t *f1 = GET_FREQ(profile1,i,/*IN/OUT*/iFreq1);
4359 numeric_t *f2 = GET_FREQ(profile2,i,/*IN/OUT*/iFreq2);
4361 if (profile1->weights[i] > 0)
4362 AddToFreq(/*IN/OUT*/f, profile1->weights[i] * bionjWeight,
4363 profile1->codes[i], f1, dmat);
4364 if (profile2->weights[i] > 0)
4365 AddToFreq(/*IN/OUT*/f, profile2->weights[i] * (1.0-bionjWeight),
4366 profile2->codes[i], f2, dmat);
4367 NormalizeFreq(/*IN/OUT*/f, dmat);
4368 } /* end if computing f */
4369 if (verbose > 10 && i < 5) {
4370 fprintf(stderr,"Average profiles: pos %d in-w1 %f in-w2 %f bionjWeight %f to weight %f code %d\n",
4371 i, profile1->weights[i], profile2->weights[i], bionjWeight,
4372 out->weights[i], out->codes[i]);
4375 for (k = 0; k < nCodes; k++)
4376 fprintf(stderr, "\t%c:%f", codesString[k], f ? f[k] : -1.0);
4377 fprintf(stderr,"\n");
4380 } /* end loop over positions */
4381 assert(iFreq1 == profile1->nVectors);
4382 assert(iFreq2 == profile2->nVectors);
4383 assert(iFreqOut == out->nVectors);
4385 /* compute total constraints */
4386 for (i = 0; i < nConstraints; i++) {
4387 out->nOn[i] = profile1->nOn[i] + profile2->nOn[i];
4388 out->nOff[i] = profile1->nOff[i] + profile2->nOff[i];
4394 /* Make the (unrotated) frequencies sum to 1
4395 Simply dividing by total_weight is not ideal because of roundoff error
4396 So compute total_freq instead
4398 void NormalizeFreq(/*IN/OUT*/numeric_t *freq, distance_matrix_t *dmat) {
4399 double total_freq = 0;
4402 /* The total frequency is dot_product(true_frequencies, 1)
4403 So we rotate the 1 vector by eigeninv (stored in eigentot)
4405 total_freq = vector_multiply_sum(freq, dmat->eigentot, nCodes);
4407 for (k = 0; k < nCodes; k++)
4408 total_freq += freq[k];
4410 if (total_freq > fPostTotalTolerance) {
4411 numeric_t inverse_weight = 1.0/total_freq;
4412 vector_multiply_by(/*IN/OUT*/freq, inverse_weight, nCodes);
4414 /* This can happen if we are in a very low-weight region, e.g. if a mostly-gap position gets weighted down
4415 repeatedly; just set them all to arbitrary but legal values */
4417 for (k = 0; k < nCodes; k++)
4418 freq[k] = 1.0/nCodes;
4420 for (k = 0; k < nCodes; k++)
4421 freq[k] = dmat->codeFreq[0][k];/*XXX gapFreq[k];*/
4426 /* OutProfile() computes the out-profile */
4427 profile_t *OutProfile(profile_t **profiles, int nProfiles,
4428 int nPos, int nConstraints,
4429 distance_matrix_t *dmat) {
4430 int i; /* position */
4431 int in; /* profile */
4432 profile_t *out = NewProfile(nPos, nConstraints);
4434 double inweight = 1.0/(double)nProfiles; /* The maximal output weight is 1.0 */
4436 /* First, set weights -- code is always NOCODE, prevent weight=0 */
4437 for (i = 0; i < nPos; i++) {
4438 out->weights[i] = 0;
4439 for (in = 0; in < nProfiles; in++)
4440 out->weights[i] += profiles[in]->weights[i] * inweight;
4441 if (out->weights[i] <= 0) out->weights[i] = 1e-20; /* always store a vector */
4443 out->codes[i] = NOCODE; /* outprofile is normally complicated */
4446 /* Initialize the frequencies to 0 */
4447 out->vectors = (numeric_t*)mymalloc(sizeof(numeric_t)*nCodes*out->nVectors);
4448 for (i = 0; i < nCodes*out->nVectors; i++)
4449 out->vectors[i] = 0;
4451 /* Add up the weights, going through each sequence in turn */
4452 for (in = 0; in < nProfiles; in++) {
4455 for (i = 0; i < nPos; i++) {
4456 numeric_t *fIn = GET_FREQ(profiles[in],i,/*IN/OUT*/iFreqIn);
4457 numeric_t *fOut = GET_FREQ(out,i,/*IN/OUT*/iFreqOut);
4458 if (profiles[in]->weights[i] > 0)
4459 AddToFreq(/*IN/OUT*/fOut, profiles[in]->weights[i],
4460 profiles[in]->codes[i], fIn, dmat);
4462 assert(iFreqOut == out->nVectors);
4463 assert(iFreqIn == profiles[in]->nVectors);
4466 /* And normalize the frequencies to sum to 1 */
4468 for (i = 0; i < nPos; i++) {
4469 numeric_t *fOut = GET_FREQ(out,i,/*IN/OUT*/iFreqOut);
4471 NormalizeFreq(/*IN/OUT*/fOut, dmat);
4473 assert(iFreqOut == out->nVectors);
4474 if (verbose > 10) fprintf(stderr,"Average %d profiles\n", nProfiles);
4476 SetCodeDist(/*IN/OUT*/out, nPos, dmat);
4478 /* Compute constraints */
4479 for (i = 0; i < nConstraints; i++) {
4482 for (in = 0; in < nProfiles; in++) {
4483 out->nOn[i] += profiles[in]->nOn[i];
4484 out->nOff[i] += profiles[in]->nOff[i];
4490 void UpdateOutProfile(/*IN/OUT*/profile_t *out, profile_t *old1, profile_t *old2,
4491 profile_t *new, int nActiveOld,
4492 int nPos, int nConstraints,
4493 distance_matrix_t *dmat) {
4499 assert(nActiveOld > 0);
4501 for (i = 0; i < nPos; i++) {
4502 numeric_t *fOut = GET_FREQ(out,i,/*IN/OUT*/iFreqOut);
4503 numeric_t *fOld1 = GET_FREQ(old1,i,/*IN/OUT*/iFreq1);
4504 numeric_t *fOld2 = GET_FREQ(old2,i,/*IN/OUT*/iFreq2);
4505 numeric_t *fNew = GET_FREQ(new,i,/*IN/OUT*/iFreqNew);
4507 assert(out->codes[i] == NOCODE && fOut != NULL); /* No no-vector optimization for outprofiles */
4508 if (verbose > 3 && i < 3) {
4509 fprintf(stderr,"Updating out-profile position %d weight %f (mult %f)\n",
4510 i, out->weights[i], out->weights[i]*nActiveOld);
4512 double originalMult = out->weights[i]*nActiveOld;
4513 double newMult = originalMult + new->weights[i] - old1->weights[i] - old2->weights[i];
4514 out->weights[i] = newMult/(nActiveOld-1);
4515 if (out->weights[i] <= 0) out->weights[i] = 1e-20; /* always use the vector */
4517 for (k = 0; k < nCodes; k++) fOut[k] *= originalMult;
4519 if (old1->weights[i] > 0)
4520 AddToFreq(/*IN/OUT*/fOut, -old1->weights[i], old1->codes[i], fOld1, dmat);
4521 if (old2->weights[i] > 0)
4522 AddToFreq(/*IN/OUT*/fOut, -old2->weights[i], old2->codes[i], fOld2, dmat);
4523 if (new->weights[i] > 0)
4524 AddToFreq(/*IN/OUT*/fOut, new->weights[i], new->codes[i], fNew, dmat);
4526 /* And renormalize */
4527 NormalizeFreq(/*IN/OUT*/fOut, dmat);
4529 if (verbose > 2 && i < 3) {
4530 fprintf(stderr,"Updated out-profile position %d weight %f (mult %f)",
4531 i, out->weights[i], out->weights[i]*nActiveOld);
4532 if(out->weights[i] > 0)
4533 for (k=0;k<nCodes;k++)
4534 fprintf(stderr, " %c:%f", dmat?'?':codesString[k], fOut[k]);
4535 fprintf(stderr,"\n");
4538 assert(iFreqOut == out->nVectors);
4539 assert(iFreq1 == old1->nVectors);
4540 assert(iFreq2 == old2->nVectors);
4541 assert(iFreqNew == new->nVectors);
4543 SetCodeDist(/*IN/OUT*/out,nPos,dmat);
4545 /* update constraints -- note in practice this should be a no-op */
4546 for (i = 0; i < nConstraints; i++) {
4547 out->nOn[i] += new->nOn[i] - old1->nOn[i] - old2->nOn[i];
4548 out->nOff[i] += new->nOff[i] - old1->nOff[i] - old2->nOff[i];
4552 void SetCodeDist(/*IN/OUT*/profile_t *profile, int nPos,
4553 distance_matrix_t *dmat) {
4554 if (profile->codeDist == NULL)
4555 profile->codeDist = (numeric_t*)mymalloc(sizeof(numeric_t)*nPos*nCodes);
4558 for (i = 0; i < nPos; i++) {
4559 numeric_t *f = GET_FREQ(profile,i,/*IN/OUT*/iFreq);
4562 for (k = 0; k < nCodes; k++)
4563 profile->codeDist[i*nCodes+k] = ProfileDistPiece(/*code1*/profile->codes[i], /*code2*/k,
4564 /*f1*/f, /*f2*/NULL,
4567 assert(iFreq==profile->nVectors);
4571 void SetBestHit(int node, NJ_t *NJ, int nActive,
4572 /*OUT*/besthit_t *bestjoin, /*OUT OPTIONAL*/besthit_t *allhits) {
4573 assert(NJ->parent[node] < 0);
4577 bestjoin->dist = 1e20;
4578 bestjoin->criterion = 1e20;
4584 /* Note -- if we are already in a parallel region, this will be ignored */
4585 #pragma omp parallel for schedule(dynamic, 50)
4587 for (j = 0; j < NJ->maxnode; j++) {
4588 besthit_t *sv = allhits != NULL ? &allhits[j] : &tmp;
4591 if (NJ->parent[j] >= 0) {
4592 sv->i = -1; /* illegal/empty join */
4594 sv->criterion = sv->dist = 1e20;
4597 /* Note that we compute self-distances (allow j==node) because the top-hit heuristic
4598 expects self to be within its top hits, but we exclude those from the bestjoin
4601 SetDistCriterion(NJ, nActive, /*IN/OUT*/sv);
4602 if (sv->criterion < bestjoin->criterion && node != j)
4606 fprintf(stderr, "SetBestHit %d %d %f %f\n", bestjoin->i, bestjoin->j, bestjoin->dist, bestjoin->criterion);
4610 void ReadMatrix(char *filename, /*OUT*/numeric_t codes[MAXCODES][MAXCODES], bool checkCodes) {
4611 char buf[BUFFER_SIZE] = "";
4612 FILE *fp = fopen(filename, "r");
4614 fprintf(stderr, "Cannot read %s\n",filename);
4617 if (fgets(buf,sizeof(buf),fp) == NULL) {
4618 fprintf(stderr, "Error reading header line for %s:\n%s\n", filename, buf);
4624 for (iBufPos=0,i=0;i<nCodes;i++,iBufPos++) {
4625 if(buf[iBufPos] != codesString[i]) {
4626 fprintf(stderr,"Header line\n%s\nin file %s does not have expected code %c # %d in %s\n",
4627 buf, filename, codesString[i], i, codesString);
4631 if(buf[iBufPos] != '\n' && buf[iBufPos] != '\r' && buf[iBufPos] != '\0' && buf[iBufPos] != '\t') {
4632 fprintf(stderr, "Header line in %s should be tab-delimited\n", filename);
4635 if (buf[iBufPos] == '\0' && i < nCodes-1) {
4636 fprintf(stderr, "Header line in %s ends prematurely\n",filename);
4639 } /* end loop over codes */
4640 /* Should be at end, but allow \n because of potential DOS \r\n */
4641 if(buf[iBufPos] != '\0' && buf[iBufPos] != '\n' && buf[iBufPos] != '\r') {
4642 fprintf(stderr, "Header line in %s has too many entries\n", filename);
4647 for (iLine = 0; iLine < nCodes; iLine++) {
4649 if (fgets(buf,sizeof(buf),fp) == NULL) {
4650 fprintf(stderr, "Cannot read line %d from file %s\n", iLine+2, filename);
4653 char *field = strtok(buf,"\t\r\n");
4654 field = strtok(NULL, "\t"); /* ignore first column */
4656 for (iColumn = 0; iColumn < nCodes && field != NULL; iColumn++, field = strtok(NULL,"\t")) {
4657 if(sscanf(field,ScanNumericSpec,&codes[iLine][iColumn]) != 1) {
4658 fprintf(stderr,"Cannot parse field %s in file %s\n", field, filename);
4665 void ReadVector(char *filename, /*OUT*/numeric_t codes[MAXCODES]) {
4666 FILE *fp = fopen(filename,"r");
4668 fprintf(stderr, "Cannot read %s\n",filename);
4672 for (i = 0; i < nCodes; i++) {
4673 if (fscanf(fp,ScanNumericSpec,&codes[i]) != 1) {
4674 fprintf(stderr,"Cannot read %d entry of %s\n",i+1,filename);
4678 if (fclose(fp) != 0) {
4679 fprintf(stderr, "Error reading %s\n",filename);
4684 distance_matrix_t *ReadDistanceMatrix(char *prefix) {
4685 char buffer[BUFFER_SIZE];
4686 distance_matrix_t *dmat = (distance_matrix_t*)mymalloc(sizeof(distance_matrix_t));
4688 if(strlen(prefix) > BUFFER_SIZE-20) {
4689 fprintf(stderr,"Filename %s too long\n", prefix);
4693 strcpy(buffer, prefix);
4694 strcat(buffer, ".distances");
4695 ReadMatrix(buffer, /*OUT*/dmat->distances, /*checkCodes*/true);
4697 strcpy(buffer, prefix);
4698 strcat(buffer, ".inverses");
4699 ReadMatrix(buffer, /*OUT*/dmat->eigeninv, /*checkCodes*/false);
4701 strcpy(buffer, prefix);
4702 strcat(buffer, ".eigenvalues");
4703 ReadVector(buffer, /*OUT*/dmat->eigenval);
4705 if(verbose>1) fprintf(stderr, "Read distance matrix from %s\n",prefix);
4706 SetupDistanceMatrix(/*IN/OUT*/dmat);
4710 void SetupDistanceMatrix(/*IN/OUT*/distance_matrix_t *dmat) {
4711 /* Check that the eigenvalues and eigen-inverse are consistent with the
4712 distance matrix and that the matrix is symmetric */
4714 for (i = 0; i < nCodes; i++) {
4715 for (j = 0; j < nCodes; j++) {
4716 if(fabs(dmat->distances[i][j]-dmat->distances[j][i]) > 1e-6) {
4717 fprintf(stderr,"Distance matrix not symmetric for %d,%d: %f vs %f\n",
4719 dmat->distances[i][j],
4720 dmat->distances[j][i]);
4724 for (k = 0; k < nCodes; k++)
4725 total += dmat->eigenval[k] * dmat->eigeninv[k][i] * dmat->eigeninv[k][j];
4726 if(fabs(total - dmat->distances[i][j]) > 1e-6) {
4727 fprintf(stderr,"Distance matrix entry %d,%d should be %f but eigen-representation gives %f\n",
4728 i+1,j+1,dmat->distances[i][j],total);
4734 /* And compute eigentot */
4735 for (k = 0; k < nCodes; k++) {
4736 dmat->eigentot[k] = 0.;
4738 for (j = 0; j < nCodes; j++)
4739 dmat->eigentot[k] += dmat->eigeninv[k][j];
4742 /* And compute codeFreq */
4744 for(code = 0; code < nCodes; code++) {
4745 for (k = 0; k < nCodes; k++) {
4746 dmat->codeFreq[code][k] = dmat->eigeninv[k][code];
4750 for(code = 0; code < nCodes; code++) {
4751 double gapFreq = 0.0;
4752 for (k = 0; k < nCodes; k++)
4753 gapFreq += dmat->codeFreq[k][code];
4754 dmat->gapFreq[code] = gapFreq / nCodes;
4757 if(verbose>10) fprintf(stderr, "Made codeFreq\n");
4760 nni_t ChooseNNI(profile_t *profiles[4],
4761 /*OPTIONAL*/distance_matrix_t *dmat,
4762 int nPos, int nConstraints,
4763 /*OUT*/double criteria[3]) {
4765 CorrectedPairDistances(profiles, 4, dmat, nPos, /*OUT*/d);
4766 double penalty[3]; /* indexed as nni_t */
4767 QuartetConstraintPenalties(profiles, nConstraints, /*OUT*/penalty);
4768 criteria[ABvsCD] = d[qAB] + d[qCD] + penalty[ABvsCD];
4769 criteria[ACvsBD] = d[qAC] + d[qBD] + penalty[ACvsBD];
4770 criteria[ADvsBC] = d[qAD] + d[qBC] + penalty[ADvsBC];
4772 nni_t choice = ABvsCD;
4773 if (criteria[ACvsBD] < criteria[ABvsCD] && criteria[ACvsBD] <= criteria[ADvsBC]) {
4775 } else if (criteria[ADvsBC] < criteria[ABvsCD] && criteria[ADvsBC] <= criteria[ACvsBD]) {
4778 if (verbose > 1 && penalty[choice] > penalty[ABvsCD] + 1e-6) {
4779 fprintf(stderr, "Worsen constraint: from %.3f to %.3f distance %.3f to %.3f: ",
4780 penalty[ABvsCD], penalty[choice],
4781 criteria[ABvsCD], choice == ACvsBD ? criteria[ACvsBD] : criteria[ADvsBC]);
4783 for (iC = 0; iC < nConstraints; iC++) {
4785 if (QuartetConstraintPenaltiesPiece(profiles, iC, /*OUT*/ppart)) {
4786 double old_penalty = ppart[ABvsCD];
4787 double new_penalty = ppart[choice];
4788 if (new_penalty > old_penalty + 1e-6)
4789 fprintf(stderr, " %d (%d/%d %d/%d %d/%d %d/%d)", iC,
4790 profiles[0]->nOn[iC], profiles[0]->nOff[iC],
4791 profiles[1]->nOn[iC], profiles[1]->nOff[iC],
4792 profiles[2]->nOn[iC], profiles[2]->nOff[iC],
4793 profiles[3]->nOn[iC], profiles[3]->nOff[iC]);
4796 fprintf(stderr,"\n");
4799 fprintf(stderr, "NNI scores ABvsCD %.5f ACvsBD %.5f ADvsBC %.5f choice %s\n",
4800 criteria[ABvsCD], criteria[ACvsBD], criteria[ADvsBC],
4801 choice == ABvsCD ? "AB|CD" : (choice == ACvsBD ? "AC|BD" : "AD|BC"));
4805 profile_t *PosteriorProfile(profile_t *p1, profile_t *p2,
4806 double len1, double len2,
4807 /*OPTIONAL*/transition_matrix_t *transmat,
4809 int nPos, int nConstraints) {
4810 if (len1 < MLMinBranchLength)
4811 len1 = MLMinBranchLength;
4812 if (len2 < MLMinBranchLength)
4813 len2 = MLMinBranchLength;
4816 profile_t *out = NewProfile(nPos, nConstraints);
4817 for (i = 0; i < nPos; i++) {
4818 out->codes[i] = NOCODE;
4819 out->weights[i] = 1.0;
4821 out->nVectors = nPos;
4822 out->vectors = (numeric_t*)mymalloc(sizeof(numeric_t)*nCodes*out->nVectors);
4823 for (i = 0; i < nCodes * out->nVectors; i++) out->vectors[i] = 0;
4827 numeric_t *expeigenRates1 = NULL, *expeigenRates2 = NULL;
4829 if (transmat != NULL) {
4830 expeigenRates1 = ExpEigenRates(len1, transmat, rates);
4831 expeigenRates2 = ExpEigenRates(len2, transmat, rates);
4834 if (transmat == NULL) { /* Jukes-Cantor */
4835 assert(nCodes == 4);
4837 double *PSame1 = PSameVector(len1, rates);
4838 double *PDiff1 = PDiffVector(PSame1, rates);
4839 double *PSame2 = PSameVector(len2, rates);
4840 double *PDiff2 = PDiffVector(PSame2, rates);
4842 numeric_t mix1[4], mix2[4];
4844 for (i=0; i < nPos; i++) {
4845 int iRate = rates->ratecat[i];
4846 double w1 = p1->weights[i];
4847 double w2 = p2->weights[i];
4848 int code1 = p1->codes[i];
4849 int code2 = p2->codes[i];
4850 numeric_t *f1 = GET_FREQ(p1,i,/*IN/OUT*/iFreq1);
4851 numeric_t *f2 = GET_FREQ(p2,i,/*IN/OUT*/iFreq2);
4853 /* First try to store a simple profile */
4854 if (f1 == NULL && f2 == NULL) {
4855 if (code1 == NOCODE && code2 == NOCODE) {
4856 out->codes[i] = NOCODE;
4857 out->weights[i] = 0.0;
4859 } else if (code1 == NOCODE) {
4860 /* Posterior(parent | character & gap, len1, len2) = Posterior(parent | character, len1)
4861 = PSame() for matching characters and 1-PSame() for the rest
4862 = (pSame - pDiff) * character + (1-(pSame-pDiff)) * gap
4864 out->codes[i] = code2;
4865 out->weights[i] = w2 * (PSame2[iRate] - PDiff2[iRate]);
4867 } else if (code2 == NOCODE) {
4868 out->codes[i] = code1;
4869 out->weights[i] = w1 * (PSame1[iRate] - PDiff1[iRate]);
4871 } else if (code1 == code2) {
4872 out->codes[i] = code1;
4873 double f12code = (w1*PSame1[iRate] + (1-w1)*0.25) * (w2*PSame2[iRate] + (1-w2)*0.25);
4874 double f12other = (w1*PDiff1[iRate] + (1-w1)*0.25) * (w2*PDiff2[iRate] + (1-w2)*0.25);
4875 /* posterior probability of code1/code2 after scaling */
4876 double pcode = f12code/(f12code+3*f12other);
4877 /* Now f = w * (code ? 1 : 0) + (1-w) * 0.25, so to get pcode we need
4878 fcode = 1/4 + w1*3/4 or w = (f-1/4)*4/3
4880 out->weights[i] = (pcode - 0.25) * 4.0/3.0;
4881 /* This can be zero because of numerical problems, I think */
4882 if (out->weights[i] < 1e-6) {
4884 fprintf(stderr, "Replaced weight %f with %f from w1 %f w2 %f PSame %f %f f12code %f f12other %f\n",
4885 out->weights[i], 1e-6,
4887 PSame1[iRate], PSame2[iRate],
4889 out->weights[i] = 1e-6;
4894 /* if we did not compute a simple profile, then do the full computation and
4895 store the full vector
4898 for (j = 0; j < 4; j++)
4899 mix1[j] = (1-w1)*0.25;
4905 for (j = 0; j < 4; j++)
4906 mix2[j] = (1-w2)*0.25;
4911 out->codes[i] = NOCODE;
4912 out->weights[i] = 1.0;
4913 numeric_t *f = GET_FREQ(out,i,/*IN/OUT*/iFreqOut);
4915 for (j = 0; j < 4; j++) {
4916 f[j] = (f1[j] * PSame1[iRate] + (1.0-f1[j]) * PDiff1[iRate])
4917 * (f2[j] * PSame2[iRate] + (1.0-f2[j]) * PDiff2[iRate]);
4920 double lkABInv = 1.0/lkAB;
4921 for (j = 0; j < 4; j++)
4924 PSame1 = myfree(PSame1, sizeof(double) * rates->nRateCategories);
4925 PSame2 = myfree(PSame2, sizeof(double) * rates->nRateCategories);
4926 PDiff1 = myfree(PDiff1, sizeof(double) * rates->nRateCategories);
4927 PDiff2 = myfree(PDiff2, sizeof(double) * rates->nRateCategories);
4928 } else if (nCodes == 4) { /* matrix model on nucleotides */
4929 numeric_t *fGap = &transmat->codeFreq[NOCODE][0];
4930 numeric_t f1mix[4], f2mix[4];
4932 for (i=0; i < nPos; i++) {
4933 if (p1->codes[i] == NOCODE && p2->codes[i] == NOCODE
4934 && p1->weights[i] == 0 && p2->weights[i] == 0) {
4935 /* aligning gap with gap -- just output a gap
4936 out->codes[i] is already set to NOCODE so need not set that */
4937 out->weights[i] = 0;
4940 int iRate = rates->ratecat[i];
4941 numeric_t *expeigen1 = &expeigenRates1[iRate*4];
4942 numeric_t *expeigen2 = &expeigenRates2[iRate*4];
4943 numeric_t *f1 = GET_FREQ(p1,i,/*IN/OUT*/iFreq1);
4944 numeric_t *f2 = GET_FREQ(p2,i,/*IN/OUT*/iFreq2);
4945 numeric_t *fOut = GET_FREQ(out,i,/*IN/OUT*/iFreqOut);
4946 assert(fOut != NULL);
4949 f1 = &transmat->codeFreq[p1->codes[i]][0]; /* codeFreq includes an entry for NOCODE */
4950 double w = p1->weights[i];
4951 if (w > 0.0 && w < 1.0) {
4952 for (j = 0; j < 4; j++)
4953 f1mix[j] = w * f1[j] + (1.0-w) * fGap[j];
4958 f2 = &transmat->codeFreq[p2->codes[i]][0];
4959 double w = p2->weights[i];
4960 if (w > 0.0 && w < 1.0) {
4961 for (j = 0; j < 4; j++)
4962 f2mix[j] = w * f2[j] + (1.0-w) * fGap[j];
4966 numeric_t fMult1[4] ALIGNED; /* rotated1 * expeigen1 */
4967 numeric_t fMult2[4] ALIGNED; /* rotated2 * expeigen2 */
4968 #if 0 /* SSE3 is slower */
4969 vector_multiply(f1, expeigen1, 4, /*OUT*/fMult1);
4970 vector_multiply(f2, expeigen2, 4, /*OUT*/fMult2);
4972 for (j = 0; j < 4; j++) {
4973 fMult1[j] = f1[j]*expeigen1[j];
4974 fMult2[j] = f2[j]*expeigen2[j];
4977 numeric_t fPost[4] ALIGNED; /* in unrotated space */
4978 for (j = 0; j < 4; j++) {
4979 #if 0 /* SSE3 is slower */
4980 fPost[j] = vector_dot_product_rot(fMult1, fMult2, &transmat->codeFreq[j][0], 4)
4981 * transmat->statinv[j]; */
4985 for (k = 0; k < 4; k++) {
4986 out1 += fMult1[k] * transmat->codeFreq[j][k];
4987 out2 += fMult2[k] * transmat->codeFreq[j][k];
4989 fPost[j] = out1*out2*transmat->statinv[j];
4992 double fPostTot = 0;
4993 for (j = 0; j < 4; j++)
4994 fPostTot += fPost[j];
4995 assert(fPostTot > fPostTotalTolerance);
4996 double fPostInv = 1.0/fPostTot;
4997 #if 0 /* SSE3 is slower */
4998 vector_multiply_by(fPost, fPostInv, 4);
5000 for (j = 0; j < 4; j++)
5001 fPost[j] *= fPostInv;
5004 /* and finally, divide by stat again & rotate to give the new frequencies */
5005 matrixt_by_vector4(transmat->eigeninvT, fPost, /*OUT*/fOut);
5006 } /* end loop over position i */
5007 } else if (nCodes == 20) { /* matrix model on amino acids */
5008 numeric_t *fGap = &transmat->codeFreq[NOCODE][0];
5009 numeric_t f1mix[20] ALIGNED;
5010 numeric_t f2mix[20] ALIGNED;
5012 for (i=0; i < nPos; i++) {
5013 if (p1->codes[i] == NOCODE && p2->codes[i] == NOCODE
5014 && p1->weights[i] == 0 && p2->weights[i] == 0) {
5015 /* aligning gap with gap -- just output a gap
5016 out->codes[i] is already set to NOCODE so need not set that */
5017 out->weights[i] = 0;
5020 int iRate = rates->ratecat[i];
5021 numeric_t *expeigen1 = &expeigenRates1[iRate*20];
5022 numeric_t *expeigen2 = &expeigenRates2[iRate*20];
5023 numeric_t *f1 = GET_FREQ(p1,i,/*IN/OUT*/iFreq1);
5024 numeric_t *f2 = GET_FREQ(p2,i,/*IN/OUT*/iFreq2);
5025 numeric_t *fOut = GET_FREQ(out,i,/*IN/OUT*/iFreqOut);
5026 assert(fOut != NULL);
5029 f1 = &transmat->codeFreq[p1->codes[i]][0]; /* codeFreq includes an entry for NOCODE */
5030 double w = p1->weights[i];
5031 if (w > 0.0 && w < 1.0) {
5032 for (j = 0; j < 20; j++)
5033 f1mix[j] = w * f1[j] + (1.0-w) * fGap[j];
5038 f2 = &transmat->codeFreq[p2->codes[i]][0];
5039 double w = p2->weights[i];
5040 if (w > 0.0 && w < 1.0) {
5041 for (j = 0; j < 20; j++)
5042 f2mix[j] = w * f2[j] + (1.0-w) * fGap[j];
5046 numeric_t fMult1[20] ALIGNED; /* rotated1 * expeigen1 */
5047 numeric_t fMult2[20] ALIGNED; /* rotated2 * expeigen2 */
5048 vector_multiply(f1, expeigen1, 20, /*OUT*/fMult1);
5049 vector_multiply(f2, expeigen2, 20, /*OUT*/fMult2);
5050 numeric_t fPost[20] ALIGNED; /* in unrotated space */
5051 for (j = 0; j < 20; j++) {
5052 numeric_t value = vector_dot_product_rot(fMult1, fMult2, &transmat->codeFreq[j][0], 20)
5053 * transmat->statinv[j];
5054 /* Added this logic try to avoid rare numerical problems */
5055 fPost[j] = value >= 0 ? value : 0;
5057 double fPostTot = vector_sum(fPost, 20);
5058 assert(fPostTot > fPostTotalTolerance);
5059 double fPostInv = 1.0/fPostTot;
5060 vector_multiply_by(/*IN/OUT*/fPost, fPostInv, 20);
5061 int ch = -1; /* the dominant character, if any */
5063 for (j = 0; j < 20; j++) {
5064 if (fPost[j] >= approxMLminf) {
5071 /* now, see if we can use the approximation
5072 fPost ~= (1 or 0) * w + nearP * (1-w)
5073 to avoid rotating */
5076 w = (fPost[ch] - transmat->nearP[ch][ch]) / (1.0 - transmat->nearP[ch][ch]);
5077 for (j = 0; j < 20; j++) {
5079 double fRough = (1.0-w) * transmat->nearP[ch][j];
5080 if (fRough < fPost[j] * approxMLminratio) {
5081 ch = -1; /* give up on the approximation */
5088 nAAPosteriorRough++;
5089 double wInvStat = w * transmat->statinv[ch];
5090 for (j = 0; j < 20; j++)
5091 fOut[j] = wInvStat * transmat->codeFreq[ch][j] + (1.0-w) * transmat->nearFreq[ch][j];
5093 /* and finally, divide by stat again & rotate to give the new frequencies */
5094 nAAPosteriorExact++;
5095 for (j = 0; j < 20; j++)
5096 fOut[j] = vector_multiply_sum(fPost, &transmat->eigeninv[j][0], 20);
5098 } /* end loop over position i */
5100 assert(0); /* illegal nCodes */
5103 if (transmat != NULL) {
5104 expeigenRates1 = myfree(expeigenRates1, sizeof(numeric_t) * rates->nRateCategories * nCodes);
5105 expeigenRates2 = myfree(expeigenRates2, sizeof(numeric_t) * rates->nRateCategories * nCodes);
5108 /* Reallocate out->vectors to be the right size */
5109 out->nVectors = iFreqOut;
5110 if (out->nVectors == 0)
5111 out->vectors = (numeric_t*)myfree(out->vectors, sizeof(numeric_t)*nCodes*nPos);
5113 out->vectors = (numeric_t*)myrealloc(out->vectors,
5114 /*OLDSIZE*/sizeof(numeric_t)*nCodes*nPos,
5115 /*NEWSIZE*/sizeof(numeric_t)*nCodes*out->nVectors,
5116 /*copy*/true); /* try to save space */
5117 nProfileFreqAlloc += out->nVectors;
5118 nProfileFreqAvoid += nPos - out->nVectors;
5120 /* compute total constraints */
5121 for (i = 0; i < nConstraints; i++) {
5122 out->nOn[i] = p1->nOn[i] + p2->nOn[i];
5123 out->nOff[i] = p1->nOff[i] + p2->nOff[i];
5125 nPosteriorCompute++;
5129 double *PSameVector(double length, rates_t *rates) {
5130 double *pSame = mymalloc(sizeof(double) * rates->nRateCategories);
5132 for (iRate = 0; iRate < rates->nRateCategories; iRate++)
5133 pSame[iRate] = 0.25 + 0.75 * exp((-4.0/3.0) * fabs(length*rates->rates[iRate]));
5137 double *PDiffVector(double *pSame, rates_t *rates) {
5138 double *pDiff = mymalloc(sizeof(double) * rates->nRateCategories);
5140 for (iRate = 0; iRate < rates->nRateCategories; iRate++)
5141 pDiff[iRate] = (1.0 - pSame[iRate])/3.0;
5145 numeric_t *ExpEigenRates(double length, transition_matrix_t *transmat, rates_t *rates) {
5146 numeric_t *expeigen = mymalloc(sizeof(numeric_t) * nCodes * rates->nRateCategories);
5148 for (iRate = 0; iRate < rates->nRateCategories; iRate++) {
5149 for (j = 0; j < nCodes; j++) {
5150 double relLen = length * rates->rates[iRate];
5151 /* very short branch lengths lead to numerical problems so prevent them */
5152 if (relLen < MLMinRelBranchLength)
5153 relLen = MLMinRelBranchLength;
5154 expeigen[iRate*nCodes + j] = exp(relLen * transmat->eigenval[j]);
5160 double PairLogLk(profile_t *pA, profile_t *pB, double length, int nPos,
5161 /*OPTIONAL*/transition_matrix_t *transmat,
5163 /*OPTIONAL IN/OUT*/double *site_likelihoods) {
5165 double loglk = 0.0; /* stores underflow of lk during the loop over positions */
5167 assert(rates != NULL && rates->nRateCategories > 0);
5168 numeric_t *expeigenRates = NULL;
5169 if (transmat != NULL)
5170 expeigenRates = ExpEigenRates(length, transmat, rates);
5172 if (transmat == NULL) { /* Jukes-Cantor */
5173 assert (nCodes == 4);
5174 double *pSame = PSameVector(length, rates);
5175 double *pDiff = PDiffVector(pSame, rates);
5179 for (i = 0; i < nPos; i++) {
5180 int iRate = rates->ratecat[i];
5181 double wA = pA->weights[i];
5182 double wB = pB->weights[i];
5183 int codeA = pA->codes[i];
5184 int codeB = pB->codes[i];
5185 numeric_t *fA = GET_FREQ(pA,i,/*IN/OUT*/iFreqA);
5186 numeric_t *fB = GET_FREQ(pB,i,/*IN/OUT*/iFreqB);
5189 if (fA == NULL && fB == NULL) {
5190 if (codeA == NOCODE) { /* A is all gaps */
5191 /* gap to gap is sum(j) 0.25 * (0.25 * pSame + 0.75 * pDiff) = sum(i) 0.25*0.25 = 0.25
5192 gap to any character gives the same result
5195 } else if (codeB == NOCODE) { /* B is all gaps */
5197 } else if (codeA == codeB) { /* A and B match */
5198 lkAB = pSame[iRate] * wA*wB + 0.25 * (1-wA*wB);
5199 } else { /* codeA != codeB */
5200 lkAB = pDiff[iRate] * wA*wB + 0.25 * (1-wA*wB);
5202 } else if (fA == NULL) {
5203 /* Compare codeA to profile of B */
5204 if (codeA == NOCODE)
5207 lkAB = wA * (pDiff[iRate] + fB[codeA] * (pSame[iRate]-pDiff[iRate])) + (1.0-wA) * 0.25;
5208 /* because lkAB = wA * P(codeA->B) + (1-wA) * 0.25
5209 P(codeA -> B) = sum(j) P(B==j) * (j==codeA ? pSame : pDiff)
5210 = sum(j) P(B==j) * pDiff +
5211 = pDiff + P(B==codeA) * (pSame-pDiff)
5213 } else if (fB == NULL) { /* Compare codeB to profile of A */
5214 if (codeB == NOCODE)
5217 lkAB = wB * (pDiff[iRate] + fA[codeB] * (pSame[iRate]-pDiff[iRate])) + (1.0-wB) * 0.25;
5218 } else { /* both are full profiles */
5219 for (j = 0; j < 4; j++)
5220 lkAB += fB[j] * (fA[j] * pSame[iRate] + (1-fA[j])* pDiff[iRate]); /* P(A|B) */
5224 while (lk < LkUnderflow) {
5225 lk *= LkUnderflowInv;
5226 loglk -= LogLkUnderflow;
5228 if (site_likelihoods != NULL)
5229 site_likelihoods[i] *= lkAB;
5231 pSame = myfree(pSame, sizeof(double) * rates->nRateCategories);
5232 pDiff = myfree(pDiff, sizeof(double) * rates->nRateCategories);
5233 } else if (nCodes == 4) { /* matrix model on nucleotides */
5236 numeric_t fAmix[4], fBmix[4];
5237 numeric_t *fGap = &transmat->codeFreq[NOCODE][0];
5239 for (i = 0; i < nPos; i++) {
5240 int iRate = rates->ratecat[i];
5241 numeric_t *expeigen = &expeigenRates[iRate*4];
5242 double wA = pA->weights[i];
5243 double wB = pB->weights[i];
5244 if (wA == 0 && wB == 0 && pA->codes[i] == NOCODE && pB->codes[i] == NOCODE) {
5245 /* Likelihood of A vs B is 1, so nothing changes
5246 Do not need to advance iFreqA or iFreqB */
5249 numeric_t *fA = GET_FREQ(pA,i,/*IN/OUT*/iFreqA);
5250 numeric_t *fB = GET_FREQ(pB,i,/*IN/OUT*/iFreqB);
5252 fA = &transmat->codeFreq[pA->codes[i]][0];
5253 if (wA > 0.0 && wA < 1.0) {
5254 for (j = 0; j < 4; j++)
5255 fAmix[j] = wA*fA[j] + (1.0-wA)*fGap[j];
5259 fB = &transmat->codeFreq[pB->codes[i]][0];
5260 if (wB > 0.0 && wB < 1.0) {
5261 for (j = 0; j < 4; j++)
5262 fBmix[j] = wB*fB[j] + (1.0-wB)*fGap[j];
5265 /* SSE3 instructions do not speed this step up:
5266 numeric_t lkAB = vector_multiply3_sum(expeigen, fA, fB); */
5268 for (j = 0; j < 4; j++)
5269 lkAB += expeigen[j]*fA[j]*fB[j];
5271 if (site_likelihoods != NULL)
5272 site_likelihoods[i] *= lkAB;
5274 while (lk < LkUnderflow) {
5275 lk *= LkUnderflowInv;
5276 loglk -= LogLkUnderflow;
5278 while (lk > LkUnderflowInv) {
5280 loglk += LogLkUnderflow;
5283 } else if (nCodes == 20) { /* matrix model on amino acids */
5286 numeric_t fAmix[20], fBmix[20];
5287 numeric_t *fGap = &transmat->codeFreq[NOCODE][0];
5289 for (i = 0; i < nPos; i++) {
5290 int iRate = rates->ratecat[i];
5291 numeric_t *expeigen = &expeigenRates[iRate*20];
5292 double wA = pA->weights[i];
5293 double wB = pB->weights[i];
5294 if (wA == 0 && wB == 0 && pA->codes[i] == NOCODE && pB->codes[i] == NOCODE) {
5295 /* Likelihood of A vs B is 1, so nothing changes
5296 Do not need to advance iFreqA or iFreqB */
5299 numeric_t *fA = GET_FREQ(pA,i,/*IN/OUT*/iFreqA);
5300 numeric_t *fB = GET_FREQ(pB,i,/*IN/OUT*/iFreqB);
5302 fA = &transmat->codeFreq[pA->codes[i]][0];
5303 if (wA > 0.0 && wA < 1.0) {
5304 for (j = 0; j < 20; j++)
5305 fAmix[j] = wA*fA[j] + (1.0-wA)*fGap[j];
5309 fB = &transmat->codeFreq[pB->codes[i]][0];
5310 if (wB > 0.0 && wB < 1.0) {
5311 for (j = 0; j < 20; j++)
5312 fBmix[j] = wB*fB[j] + (1.0-wB)*fGap[j];
5315 numeric_t lkAB = vector_multiply3_sum(expeigen, fA, fB, 20);
5317 /* If this happens, it indicates a numerical problem that needs to be addressed elsewhere,
5318 so report all the details */
5319 fprintf(stderr, "# FastTree.c::PairLogLk -- numerical problem!\n");
5320 fprintf(stderr, "# This block is intended for loading into R\n");
5322 fprintf(stderr, "lkAB = %.8g\n", lkAB);
5323 fprintf(stderr, "Branch_length= %.8g\nalignment_position=%d\nnCodes=%d\nrate_category=%d\nrate=%.8g\n",
5324 length, i, nCodes, iRate, rates->rates[iRate]);
5325 fprintf(stderr, "wA=%.8g\nwB=%.8g\n", wA, wB);
5326 fprintf(stderr, "codeA = %d\ncodeB = %d\n", pA->codes[i], pB->codes[i]);
5328 fprintf(stderr, "fA = c(");
5329 for (j = 0; j < nCodes; j++) fprintf(stderr, "%s %.8g", j==0?"":",", fA[j]);
5330 fprintf(stderr,")\n");
5332 fprintf(stderr, "fB = c(");
5333 for (j = 0; j < nCodes; j++) fprintf(stderr, "%s %.8g", j==0?"":",", fB[j]);
5334 fprintf(stderr,")\n");
5336 fprintf(stderr, "stat = c(");
5337 for (j = 0; j < nCodes; j++) fprintf(stderr, "%s %.8g", j==0?"":",", transmat->stat[j]);
5338 fprintf(stderr,")\n");
5340 fprintf(stderr, "eigenval = c(");
5341 for (j = 0; j < nCodes; j++) fprintf(stderr, "%s %.8g", j==0?"":",", transmat->eigenval[j]);
5342 fprintf(stderr,")\n");
5344 fprintf(stderr, "expeigen = c(");
5345 for (j = 0; j < nCodes; j++) fprintf(stderr, "%s %.8g", j==0?"":",", expeigen[j]);
5346 fprintf(stderr,")\n");
5349 fprintf(stderr, "codeFreq = c(");
5350 for (j = 0; j < nCodes; j++) for(k = 0; k < nCodes; k++) fprintf(stderr, "%s %.8g", j==0 && k==0?"":",",
5351 transmat->codeFreq[j][k]);
5352 fprintf(stderr,")\n");
5354 fprintf(stderr, "eigeninv = c(");
5355 for (j = 0; j < nCodes; j++) for(k = 0; k < nCodes; k++) fprintf(stderr, "%s %.8g", j==0 && k==0?"":",",
5356 transmat->eigeninv[j][k]);
5357 fprintf(stderr,")\n");
5359 fprintf(stderr, "# Transform into matrices and compute un-rotated vectors for profiles A and B\n");
5360 fprintf(stderr, "codeFreq = matrix(codeFreq,nrow=20);\n");
5361 fprintf(stderr, "eigeninv = matrix(eigeninv,nrow=20);\n");
5362 fputs("unrotA = stat * (eigeninv %*% fA)\n", stderr);
5363 fputs("unrotB = stat * (eigeninv %*% fB)\n", stderr);
5364 fprintf(stderr,"# End of R block\n");
5367 if (site_likelihoods != NULL)
5368 site_likelihoods[i] *= lkAB;
5370 while (lk < LkUnderflow) {
5371 lk *= LkUnderflowInv;
5372 loglk -= LogLkUnderflow;
5374 while (lk > LkUnderflowInv) {
5376 loglk += LogLkUnderflow;
5380 assert(0); /* illegal nCodes */
5382 if (transmat != NULL)
5383 expeigenRates = myfree(expeigenRates, sizeof(numeric_t) * rates->nRateCategories * 20);
5389 double MLQuartetLogLk(profile_t *pA, profile_t *pB, profile_t *pC, profile_t *pD,
5390 int nPos, /*OPTIONAL*/transition_matrix_t *transmat, rates_t *rates,
5391 /*IN*/double branch_lengths[5],
5392 /*OPTIONAL OUT*/double *site_likelihoods) {
5393 profile_t *pAB = PosteriorProfile(pA, pB,
5394 branch_lengths[0], branch_lengths[1],
5397 nPos, /*nConstraints*/0);
5398 profile_t *pCD = PosteriorProfile(pC, pD,
5399 branch_lengths[2], branch_lengths[3],
5402 nPos, /*nConstraints*/0);
5403 if (site_likelihoods != NULL) {
5405 for (i = 0; i < nPos; i++)
5406 site_likelihoods[i] = 1.0;
5408 /* Roughly, P(A,B,C,D) = P(A) P(B|A) P(D|C) P(AB | CD) */
5409 double loglk = PairLogLk(pA, pB, branch_lengths[0]+branch_lengths[1],
5410 nPos, transmat, rates, /*OPTIONAL IN/OUT*/site_likelihoods)
5411 + PairLogLk(pC, pD, branch_lengths[2]+branch_lengths[3],
5412 nPos, transmat, rates, /*OPTIONAL IN/OUT*/site_likelihoods)
5413 + PairLogLk(pAB, pCD, branch_lengths[4],
5414 nPos, transmat, rates, /*OPTIONAL IN/OUT*/site_likelihoods);
5415 pAB = FreeProfile(pAB, nPos, /*nConstraints*/0);
5416 pCD = FreeProfile(pCD, nPos, /*nConstraints*/0);
5420 double PairNegLogLk(double x, void *data) {
5421 quartet_opt_t *qo = (quartet_opt_t *)data;
5423 assert(qo->pair1 != NULL && qo->pair2 != NULL);
5425 double loglk = PairLogLk(qo->pair1, qo->pair2, x, qo->nPos, qo->transmat, qo->rates, /*site_lk*/NULL);
5426 assert(loglk < 1e100);
5428 fprintf(stderr, "PairLogLk(%.4f) = %.4f\n", x, loglk);
5432 double MLQuartetOptimize(profile_t *pA, profile_t *pB, profile_t *pC, profile_t *pD,
5433 int nPos, /*OPTIONAL*/transition_matrix_t *transmat, rates_t *rates,
5434 /*IN/OUT*/double branch_lengths[5],
5435 /*OPTIONAL OUT*/bool *pStarTest,
5436 /*OPTIONAL OUT*/double *site_likelihoods) {
5438 double start_length[5];
5439 for (j = 0; j < 5; j++) {
5440 start_length[j] = branch_lengths[j];
5441 if (branch_lengths[j] < MLMinBranchLength)
5442 branch_lengths[j] = MLMinBranchLength;
5444 quartet_opt_t qopt = { nPos, transmat, rates, /*nEval*/0,
5445 /*pair1*/NULL, /*pair2*/NULL };
5446 double f2x, negloglk;
5448 if (pStarTest != NULL)
5451 /* First optimize internal branch, then branch to A, B, C, D, in turn
5452 May use star test to quit after internal branch
5454 profile_t *pAB = PosteriorProfile(pA, pB,
5455 branch_lengths[LEN_A], branch_lengths[LEN_B],
5456 transmat, rates, nPos, /*nConstraints*/0);
5457 profile_t *pCD = PosteriorProfile(pC, pD,
5458 branch_lengths[LEN_C], branch_lengths[LEN_D],
5459 transmat, rates, nPos, /*nConstraints*/0);
5462 branch_lengths[LEN_I] = onedimenmin(/*xmin*/MLMinBranchLength,
5463 /*xguess*/branch_lengths[LEN_I],
5467 /*ftol*/MLFTolBranchLength,
5468 /*atol*/MLMinBranchLengthTolerance,
5472 if (pStarTest != NULL) {
5473 assert(site_likelihoods == NULL);
5474 double loglkStar = -PairNegLogLk(MLMinBranchLength, &qopt);
5475 if (loglkStar < -negloglk - closeLogLkLimit) {
5477 double off = PairLogLk(pA, pB,
5478 branch_lengths[LEN_A] + branch_lengths[LEN_B],
5479 qopt.nPos, qopt.transmat, qopt.rates, /*site_lk*/NULL)
5481 branch_lengths[LEN_C] + branch_lengths[LEN_D],
5482 qopt.nPos, qopt.transmat, qopt.rates, /*site_lk*/NULL);
5483 pAB = FreeProfile(pAB, nPos, /*nConstraints*/0);
5484 pCD = FreeProfile(pCD, nPos, /*nConstraints*/0);
5485 return (-negloglk + off);
5488 pAB = FreeProfile(pAB, nPos, /*nConstraints*/0);
5489 profile_t *pBCD = PosteriorProfile(pB, pCD,
5490 branch_lengths[LEN_B], branch_lengths[LEN_I],
5491 transmat, rates, nPos, /*nConstraints*/0);
5494 branch_lengths[LEN_A] = onedimenmin(/*xmin*/MLMinBranchLength,
5495 /*xguess*/branch_lengths[LEN_A],
5499 /*ftol*/MLFTolBranchLength,
5500 /*atol*/MLMinBranchLengthTolerance,
5503 pBCD = FreeProfile(pBCD, nPos, /*nConstraints*/0);
5504 profile_t *pACD = PosteriorProfile(pA, pCD,
5505 branch_lengths[LEN_A], branch_lengths[LEN_I],
5506 transmat, rates, nPos, /*nConstraints*/0);
5509 branch_lengths[LEN_B] = onedimenmin(/*xmin*/MLMinBranchLength,
5510 /*xguess*/branch_lengths[LEN_B],
5514 /*ftol*/MLFTolBranchLength,
5515 /*atol*/MLMinBranchLengthTolerance,
5518 pACD = FreeProfile(pACD, nPos, /*nConstraints*/0);
5519 pCD = FreeProfile(pCD, nPos, /*nConstraints*/0);
5520 pAB = PosteriorProfile(pA, pB,
5521 branch_lengths[LEN_A], branch_lengths[LEN_B],
5522 transmat, rates, nPos, /*nConstraints*/0);
5523 profile_t *pABD = PosteriorProfile(pAB, pD,
5524 branch_lengths[LEN_I], branch_lengths[LEN_D],
5525 transmat, rates, nPos, /*nConstraints*/0);
5528 branch_lengths[LEN_C] = onedimenmin(/*xmin*/MLMinBranchLength,
5529 /*xguess*/branch_lengths[LEN_C],
5533 /*ftol*/MLFTolBranchLength,
5534 /*atol*/MLMinBranchLengthTolerance,
5537 pABD = FreeProfile(pABD, nPos, /*nConstraints*/0);
5538 profile_t *pABC = PosteriorProfile(pAB, pC,
5539 branch_lengths[LEN_I], branch_lengths[LEN_C],
5540 transmat, rates, nPos, /*nConstraints*/0);
5543 branch_lengths[LEN_D] = onedimenmin(/*xmin*/MLMinBranchLength,
5544 /*xguess*/branch_lengths[LEN_D],
5548 /*ftol*/MLFTolBranchLength,
5549 /*atol*/MLMinBranchLengthTolerance,
5553 /* Compute the total quartet likelihood
5554 PairLogLk(ABC,D) + PairLogLk(AB,C) + PairLogLk(A,B)
5556 double loglkABCvsD = -negloglk;
5557 if (site_likelihoods) {
5558 for (j = 0; j < nPos; j++)
5559 site_likelihoods[j] = 1.0;
5560 PairLogLk(pABC, pD, branch_lengths[LEN_D],
5561 qopt.nPos, qopt.transmat, qopt.rates, /*IN/OUT*/site_likelihoods);
5563 double quartetloglk = loglkABCvsD
5564 + PairLogLk(pAB, pC, branch_lengths[LEN_I] + branch_lengths[LEN_C],
5565 qopt.nPos, qopt.transmat, qopt.rates,
5566 /*IN/OUT*/site_likelihoods)
5567 + PairLogLk(pA, pB, branch_lengths[LEN_A] + branch_lengths[LEN_B],
5568 qopt.nPos, qopt.transmat, qopt.rates,
5569 /*IN/OUT*/site_likelihoods);
5571 pABC = FreeProfile(pABC, nPos, /*nConstraints*/0);
5572 pAB = FreeProfile(pAB, nPos, /*nConstraints*/0);
5575 double loglkStart = MLQuartetLogLk(pA, pB, pC, pD, nPos, transmat, rates, start_length, /*site_lk*/NULL);
5576 fprintf(stderr, "Optimize loglk from %.5f to %.5f eval %d lengths from\n"
5577 " %.5f %.5f %.5f %.5f %.5f to\n"
5578 " %.5f %.5f %.5f %.5f %.5f\n",
5579 loglkStart, quartetloglk, qopt.nEval,
5580 start_length[0], start_length[1], start_length[2], start_length[3], start_length[4],
5581 branch_lengths[0], branch_lengths[1], branch_lengths[2], branch_lengths[3], branch_lengths[4]);
5583 return(quartetloglk);
5586 nni_t MLQuartetNNI(profile_t *profiles[4],
5587 /*OPTIONAL*/transition_matrix_t *transmat,
5589 int nPos, int nConstraints,
5590 /*OUT*/double criteria[3], /* The three potential quartet log-likelihoods */
5591 /*IN/OUT*/numeric_t len[5],
5595 double lenABvsCD[5] = {len[LEN_A], len[LEN_B], len[LEN_C], len[LEN_D], len[LEN_I]};
5596 double lenACvsBD[5] = {len[LEN_A], len[LEN_C], len[LEN_B], len[LEN_D], len[LEN_I]}; /* Swap B & C */
5597 double lenADvsBC[5] = {len[LEN_A], len[LEN_D], len[LEN_C], len[LEN_B], len[LEN_I]}; /* Swap B & D */
5598 bool bConsiderAC = true;
5599 bool bConsiderAD = true;
5601 int nRounds = mlAccuracy < 2 ? 2 : mlAccuracy;
5603 QuartetConstraintPenalties(profiles, nConstraints, /*OUT*/penalty);
5604 if (penalty[ABvsCD] > penalty[ACvsBD] || penalty[ABvsCD] > penalty[ADvsBC])
5607 bFast = false; /* turn off star topology test */
5610 for (iRound = 0; iRound < nRounds; iRound++) {
5611 bool bStarTest = false;
5614 #pragma omp parallel
5615 #pragma omp sections
5622 criteria[ABvsCD] = MLQuartetOptimize(profiles[0], profiles[1], profiles[2], profiles[3],
5623 nPos, transmat, rates,
5624 /*IN/OUT*/lenABvsCD,
5625 bFast ? &bStarTest : NULL,
5626 /*site_likelihoods*/NULL)
5627 - penalty[ABvsCD]; /* subtract penalty b/c we are trying to maximize log lk */
5635 criteria[ACvsBD] = -1e20;
5636 criteria[ADvsBC] = -1e20;
5637 len[LEN_I] = lenABvsCD[LEN_I];
5643 criteria[ACvsBD] = MLQuartetOptimize(profiles[0], profiles[2], profiles[1], profiles[3],
5644 nPos, transmat, rates,
5645 /*IN/OUT*/lenACvsBD, NULL, /*site_likelihoods*/NULL)
5654 criteria[ADvsBC] = MLQuartetOptimize(profiles[0], profiles[3], profiles[2], profiles[1],
5655 nPos, transmat, rates,
5656 /*IN/OUT*/lenADvsBC, NULL, /*site_likelihoods*/NULL)
5660 } /* end parallel sections */
5661 if (mlAccuracy < 2) {
5662 /* If clearly worse then ABvsCD, or have short internal branch length and worse, then
5664 if (criteria[ACvsBD] < criteria[ABvsCD] - closeLogLkLimit
5665 || (lenACvsBD[LEN_I] <= 2.0*MLMinBranchLength && criteria[ACvsBD] < criteria[ABvsCD]))
5666 bConsiderAC = false;
5667 if (criteria[ADvsBC] < criteria[ABvsCD] - closeLogLkLimit
5668 || (lenADvsBC[LEN_I] <= 2.0*MLMinBranchLength && criteria[ADvsBC] < criteria[ABvsCD]))
5669 bConsiderAD = false;
5670 if (!bConsiderAC && !bConsiderAD)
5672 /* If clearly better than either alternative, then give up
5673 (Comparison is probably biased in favor of ABvsCD anyway) */
5674 if (criteria[ACvsBD] > criteria[ABvsCD] + closeLogLkLimit
5675 && criteria[ACvsBD] > criteria[ADvsBC] + closeLogLkLimit)
5677 if (criteria[ADvsBC] > criteria[ABvsCD] + closeLogLkLimit
5678 && criteria[ADvsBC] > criteria[ACvsBD] + closeLogLkLimit)
5681 } /* end loop over rounds */
5684 fprintf(stderr, "Optimized quartet for %d rounds: ABvsCD %.5f ACvsBD %.5f ADvsBC %.5f\n",
5685 iRound, criteria[ABvsCD], criteria[ACvsBD], criteria[ADvsBC]);
5687 if (criteria[ACvsBD] > criteria[ABvsCD] && criteria[ACvsBD] > criteria[ADvsBC]) {
5688 for (i = 0; i < 5; i++) len[i] = lenACvsBD[i];
5690 } else if (criteria[ADvsBC] > criteria[ABvsCD] && criteria[ADvsBC] > criteria[ACvsBD]) {
5691 for (i = 0; i < 5; i++) len[i] = lenADvsBC[i];
5694 for (i = 0; i < 5; i++) len[i] = lenABvsCD[i];
5699 double TreeLength(/*IN/OUT*/NJ_t *NJ, bool recomputeProfiles) {
5700 if (recomputeProfiles) {
5701 traversal_t traversal2 = InitTraversal(NJ);
5703 while((j = TraversePostorder(j, NJ, /*IN/OUT*/traversal2, /*pUp*/NULL)) >= 0) {
5704 /* nothing to do for leaves or root */
5705 if (j >= NJ->nSeq && j != NJ->root)
5706 SetProfile(/*IN/OUT*/NJ, j, /*noweight*/-1.0);
5708 traversal2 = FreeTraversal(traversal2,NJ);
5710 UpdateBranchLengths(/*IN/OUT*/NJ);
5711 double total_len = 0;
5713 for (iNode = 0; iNode < NJ->maxnode; iNode++)
5714 total_len += NJ->branchlength[iNode];
5718 double TreeLogLk(/*IN*/NJ_t *NJ, /*OPTIONAL OUT*/double *site_loglk) {
5723 double *site_likelihood = NULL;
5724 if (site_loglk != NULL) {
5725 site_likelihood = mymalloc(sizeof(double)*NJ->nPos);
5726 for (i = 0; i < NJ->nPos; i++) {
5727 site_likelihood[i] = 1.0;
5728 site_loglk[i] = 0.0;
5731 traversal_t traversal = InitTraversal(NJ);
5732 int node = NJ->root;
5733 while((node = TraversePostorder(node, NJ, /*IN/OUT*/traversal, /*pUp*/NULL)) >= 0) {
5734 int nChild = NJ->child[node].nChild;
5737 assert(nChild >= 2);
5738 int *children = NJ->child[node].child;
5739 double loglkchild = PairLogLk(NJ->profiles[children[0]], NJ->profiles[children[1]],
5740 NJ->branchlength[children[0]]+NJ->branchlength[children[1]],
5741 NJ->nPos, NJ->transmat, &NJ->rates, /*IN/OUT*/site_likelihood);
5742 loglk += loglkchild;
5743 if (site_likelihood != NULL) {
5744 /* prevent underflows */
5745 for (i = 0; i < NJ->nPos; i++) {
5746 while(site_likelihood[i] < LkUnderflow) {
5747 site_likelihood[i] *= LkUnderflowInv;
5748 site_loglk[i] -= LogLkUnderflow;
5753 fprintf(stderr, "At %d: LogLk(%d:%.4f,%d:%.4f) = %.3f\n",
5755 children[0], NJ->branchlength[children[0]],
5756 children[1], NJ->branchlength[children[1]],
5758 if (NJ->child[node].nChild == 3) {
5759 assert(node == NJ->root);
5760 /* Infer the common parent of the 1st two to define the third... */
5761 profile_t *pAB = PosteriorProfile(NJ->profiles[children[0]],
5762 NJ->profiles[children[1]],
5763 NJ->branchlength[children[0]],
5764 NJ->branchlength[children[1]],
5765 NJ->transmat, &NJ->rates,
5766 NJ->nPos, /*nConstraints*/0);
5767 double loglkup = PairLogLk(pAB, NJ->profiles[children[2]],
5768 NJ->branchlength[children[2]],
5769 NJ->nPos, NJ->transmat, &NJ->rates,
5770 /*IN/OUT*/site_likelihood);
5773 fprintf(stderr, "At root %d: LogLk((%d/%d),%d:%.3f) = %.3f\n",
5774 node, children[0], children[1], children[2],
5775 NJ->branchlength[children[2]],
5777 pAB = FreeProfile(pAB, NJ->nPos, NJ->nConstraints);
5780 traversal = FreeTraversal(traversal,NJ);
5781 if (site_likelihood != NULL) {
5782 for (i = 0; i < NJ->nPos; i++) {
5783 site_loglk[i] += log(site_likelihood[i]);
5785 site_likelihood = myfree(site_likelihood, sizeof(double)*NJ->nPos);
5788 /* For Jukes-Cantor, with a tree of size 4, if the children of the root are
5789 (A,B), C, and D, then
5790 P(ABCD) = P(A) P(B|A) P(C|AB) P(D|ABC)
5792 Above we compute P(B|A) P(C|AB) P(D|ABC) -- note P(B|A) is at the child of root
5793 and P(C|AB) P(D|ABC) is at root.
5795 Similarly if the children of the root are C, D, and (A,B), then
5796 P(ABCD) = P(C|D) P(A|B) P(AB|CD) P(D), and above we compute that except for P(D)
5798 So we need to multiply by P(A) = 0.25, so we pay log(4) at each position
5799 (if ungapped). Each gapped position in any sequence reduces the payment by log(4)
5801 For JTT or GTR, we are computing P(A & B) and the posterior profiles are scaled to take
5802 the prior into account, so we do not need any correction.
5803 codeFreq[NOCODE] is scaled x higher so that P(-) = 1 not P(-)=1/nCodes, so gaps
5804 do not need to be corrected either.
5807 if (nCodes == 4 && NJ->transmat == NULL) {
5809 double logNCodes = log((double)nCodes);
5810 for (i = 0; i < NJ->nPos; i++) {
5811 int nGapsThisPos = 0;
5812 for (node = 0; node < NJ->nSeq; node++) {
5813 unsigned char *codes = NJ->profiles[node]->codes;
5814 if (codes[i] == NOCODE)
5817 nGaps += nGapsThisPos;
5818 if (site_loglk != NULL) {
5819 site_loglk[i] += nGapsThisPos * logNCodes;
5820 if (nCodes == 4 && NJ->transmat == NULL)
5821 site_loglk[i] -= logNCodes;
5824 loglk -= NJ->nPos * logNCodes;
5825 loglk += nGaps * logNCodes; /* do not pay for gaps -- only Jukes-Cantor */
5830 void SetMLGtr(/*IN/OUT*/NJ_t *NJ, /*OPTIONAL IN*/double *freq_in, /*OPTIONAL WRITE*/FILE *fpLog) {
5835 if (freq_in != NULL) {
5837 gtr.freq[i]=freq_in[i];
5839 int n[4] = {1,1,1,1}; /* pseudocounts */
5840 for (i=0; i<NJ->nSeq; i++) {
5841 unsigned char *codes = NJ->profiles[i]->codes;
5843 for (iPos=0; iPos<NJ->nPos; iPos++)
5844 if (codes[iPos] < 4)
5847 int sum = n[0]+n[1]+n[2]+n[3];
5849 gtr.freq[i] = n[i]/(double)sum;
5853 int nRounds = mlAccuracy < 2 ? 2 : mlAccuracy;
5854 for (i = 0; i < nRounds; i++) {
5855 for (gtr.iRate = 0; gtr.iRate < 6; gtr.iRate++) {
5856 ProgressReport("Optimizing GTR model, step %d of %d", i*6+gtr.iRate+1, 12, 0, 0);
5857 double negloglk, f2x;
5858 gtr.rates[gtr.iRate] = onedimenmin(/*xmin*/0.05,
5859 /*xguess*/gtr.rates[gtr.iRate],
5869 /* normalize gtr so last rate is 1 -- specifying that rate separately is useful for optimization only */
5870 for (i = 0; i < 5; i++)
5871 gtr.rates[i] /= gtr.rates[5];
5874 fprintf(stderr, "GTR Frequencies: %.4f %.4f %.4f %.4f\n", gtr.freq[0], gtr.freq[1], gtr.freq[2], gtr.freq[3]);
5875 fprintf(stderr, "GTR rates(ac ag at cg ct gt) %.4f %.4f %.4f %.4f %.4f %.4f\n",
5876 gtr.rates[0],gtr.rates[1],gtr.rates[2],gtr.rates[3],gtr.rates[4],gtr.rates[5]);
5878 if (fpLog != NULL) {
5879 fprintf(fpLog, "GTRFreq\t%.4f\t%.4f\t%.4f\t%.4f\n", gtr.freq[0], gtr.freq[1], gtr.freq[2], gtr.freq[3]);
5880 fprintf(fpLog, "GTRRates\t%.4f\t%.4f\t%.4f\t%.4f\t%.4f\t%.4f\n",
5881 gtr.rates[0],gtr.rates[1],gtr.rates[2],gtr.rates[3],gtr.rates[4],gtr.rates[5]);
5883 myfree(NJ->transmat, sizeof(transition_matrix_t));
5884 NJ->transmat = CreateGTR(gtr.rates, gtr.freq);
5885 RecomputeMLProfiles(/*IN/OUT*/NJ);
5886 OptimizeAllBranchLengths(/*IN/OUT*/NJ);
5889 double GTRNegLogLk(double x, void *data) {
5890 gtr_opt_t *gtr = (gtr_opt_t*)data;
5891 assert(nCodes == 4);
5892 assert(gtr->NJ != NULL);
5893 assert(gtr->iRate >= 0 && gtr->iRate < 6);
5895 transition_matrix_t *old = gtr->NJ->transmat;
5898 for (i = 0; i < 6; i++)
5899 rates[i] = gtr->rates[i];
5900 rates[gtr->iRate] = x;
5902 gtr->NJ->transmat = CreateGTR(rates, gtr->freq);
5903 RecomputeMLProfiles(/*IN/OUT*/gtr->NJ);
5904 double loglk = TreeLogLk(gtr->NJ, /*site_loglk*/NULL);
5905 myfree(gtr->NJ->transmat, sizeof(transition_matrix_t));
5906 gtr->NJ->transmat = old;
5907 /* Do not recompute profiles -- assume the caller will do that */
5909 fprintf(stderr, "GTR LogLk(%.5f %.5f %.5f %.5f %.5f %.5f) = %f\n",
5910 rates[0], rates[1], rates[2], rates[3], rates[4], rates[5], loglk);
5914 /* Caller must free the resulting vector of n rates */
5915 numeric_t *MLSiteRates(int nRateCategories) {
5916 /* Even spacing from 1/nRate to nRate */
5917 double logNCat = log((double)nRateCategories);
5918 double logMinRate = -logNCat;
5919 double logMaxRate = logNCat;
5920 double logd = (logMaxRate-logMinRate)/(double)(nRateCategories-1);
5922 numeric_t *rates = mymalloc(sizeof(numeric_t)*nRateCategories);
5924 for (i = 0; i < nRateCategories; i++)
5925 rates[i] = exp(logMinRate + logd*(double)i);
5929 double *MLSiteLikelihoodsByRate(/*IN*/NJ_t *NJ, /*IN*/numeric_t *rates, int nRateCategories) {
5930 double *site_loglk = mymalloc(sizeof(double)*NJ->nPos*nRateCategories);
5932 /* save the original rates */
5933 assert(NJ->rates.nRateCategories > 0);
5934 numeric_t *oldRates = NJ->rates.rates;
5935 NJ->rates.rates = mymalloc(sizeof(numeric_t) * NJ->rates.nRateCategories);
5937 /* Compute site likelihood for each rate */
5940 for (iRate = 0; iRate < nRateCategories; iRate++) {
5942 for (i = 0; i < NJ->rates.nRateCategories; i++)
5943 NJ->rates.rates[i] = rates[iRate];
5944 RecomputeMLProfiles(/*IN/OUT*/NJ);
5945 double loglk = TreeLogLk(NJ, /*OUT*/&site_loglk[NJ->nPos*iRate]);
5946 ProgressReport("Site likelihoods with rate category %d of %d", iRate+1, nRateCategories, 0, 0);
5948 fprintf(stderr, "Rate %.3f Loglk %.3f SiteLogLk", rates[iRate], loglk);
5949 for (iPos = 0; iPos < NJ->nPos; iPos++)
5950 fprintf(stderr,"\t%.3f", site_loglk[NJ->nPos*iRate + iPos]);
5951 fprintf(stderr,"\n");
5955 /* restore original rates and profiles */
5956 myfree(NJ->rates.rates, sizeof(numeric_t) * NJ->rates.nRateCategories);
5957 NJ->rates.rates = oldRates;
5958 RecomputeMLProfiles(/*IN/OUT*/NJ);
5963 void SetMLRates(/*IN/OUT*/NJ_t *NJ, int nRateCategories) {
5964 assert(nRateCategories > 0);
5965 AllocRateCategories(/*IN/OUT*/&NJ->rates, 1, NJ->nPos); /* set to 1 category of rate 1 */
5966 if (nRateCategories == 1) {
5967 RecomputeMLProfiles(/*IN/OUT*/NJ);
5970 numeric_t *rates = MLSiteRates(nRateCategories);
5971 double *site_loglk = MLSiteLikelihoodsByRate(/*IN*/NJ, /*IN*/rates, nRateCategories);
5973 /* Select best rate for each site, correcting for the prior
5974 For a prior, use a gamma distribution with shape parameter 3, scale 1/3, so
5975 Prior(rate) ~ rate**2 * exp(-3*rate)
5976 log Prior(rate) = C + 2 * log(rate) - 3 * rate
5978 double sumRates = 0;
5981 for (iPos = 0; iPos < NJ->nPos; iPos++) {
5983 double dBest = -1e20;
5984 for (iRate = 0; iRate < nRateCategories; iRate++) {
5985 double site_loglk_with_prior = site_loglk[NJ->nPos*iRate + iPos]
5986 + 2.0 * log(rates[iRate]) - 3.0 * rates[iRate];
5987 if (site_loglk_with_prior > dBest) {
5989 dBest = site_loglk_with_prior;
5993 fprintf(stderr, "Selected rate category %d rate %.3f for position %d\n",
5994 iBest, rates[iBest], iPos+1);
5995 NJ->rates.ratecat[iPos] = iBest;
5996 sumRates += rates[iBest];
5998 site_loglk = myfree(site_loglk, sizeof(double)*NJ->nPos*nRateCategories);
6000 /* Force the rates to average to 1 */
6001 double avgRate = sumRates/NJ->nPos;
6002 for (iRate = 0; iRate < nRateCategories; iRate++)
6003 rates[iRate] /= avgRate;
6005 /* Save the rates */
6006 NJ->rates.rates = myfree(NJ->rates.rates, sizeof(numeric_t) * NJ->rates.nRateCategories);
6007 NJ->rates.rates = rates;
6008 NJ->rates.nRateCategories = nRateCategories;
6010 /* Update profiles based on rates */
6011 RecomputeMLProfiles(/*IN/OUT*/NJ);
6014 fprintf(stderr, "Switched to using %d rate categories (CAT approximation)\n", nRateCategories);
6015 fprintf(stderr, "Rate categories were divided by %.3f so that average rate = 1.0\n", avgRate);
6016 fprintf(stderr, "CAT-based log-likelihoods may not be comparable across runs\n");
6018 fprintf(stderr, "Use -gamma for approximate but comparable Gamma(20) log-likelihoods\n");
6022 double GammaLogLk(/*IN*/siteratelk_t *s, /*OPTIONAL OUT*/double *gamma_loglk_sites) {
6024 double *dRate = mymalloc(sizeof(double) * s->nRateCats);
6025 for (iRate = 0; iRate < s->nRateCats; iRate++) {
6026 /* The probability density for each rate is approximated by the total
6027 density between the midpoints */
6028 double pMin = iRate == 0 ? 0.0 :
6029 PGamma(s->mult * (s->rates[iRate-1] + s->rates[iRate])/2.0, s->alpha);
6030 double pMax = iRate == s->nRateCats-1 ? 1.0 :
6031 PGamma(s->mult * (s->rates[iRate]+s->rates[iRate+1])/2.0, s->alpha);
6032 dRate[iRate] = pMax-pMin;
6036 for (iPos = 0; iPos < s->nPos; iPos++) {
6037 /* Prevent underflow on large trees by comparing to maximum loglk */
6038 double maxloglk = -1e20;
6039 for (iRate = 0; iRate < s->nRateCats; iRate++) {
6040 double site_loglk = s->site_loglk[s->nPos*iRate + iPos];
6041 if (site_loglk > maxloglk)
6042 maxloglk = site_loglk;
6044 double rellk = 0; /* likelihood scaled by exp(maxloglk) */
6045 for (iRate = 0; iRate < s->nRateCats; iRate++) {
6046 double lk = exp(s->site_loglk[s->nPos*iRate + iPos] - maxloglk);
6047 rellk += lk * dRate[iRate];
6049 double loglk_site = maxloglk + log(rellk);
6050 loglk += loglk_site;
6051 if (gamma_loglk_sites != NULL)
6052 gamma_loglk_sites[iPos] = loglk_site;
6054 dRate = myfree(dRate, sizeof(double)*s->nRateCats);
6058 double OptAlpha(double alpha, void *data) {
6059 siteratelk_t *s = (siteratelk_t *)data;
6061 return(-GammaLogLk(s, NULL));
6064 double OptMult(double mult, void *data) {
6065 siteratelk_t *s = (siteratelk_t *)data;
6067 return(-GammaLogLk(s, NULL));
6070 /* Input site_loglk must be for each rate */
6071 double RescaleGammaLogLk(int nPos, int nRateCats, /*IN*/numeric_t *rates, /*IN*/double *site_loglk,
6072 /*OPTIONAL*/FILE *fpLog) {
6073 siteratelk_t s = { /*mult*/1.0, /*alpha*/1.0, nPos, nRateCats, rates, site_loglk };
6076 fx = -GammaLogLk(&s, NULL);
6078 fprintf(stderr, "Optimizing alpha, starting at loglk %.3f\n", -fx);
6079 for (i = 0; i < 10; i++) {
6080 ProgressReport("Optimizing alpha round %d", i+1, 0, 0, 0);
6082 s.alpha = onedimenmin(0.01, s.alpha, 10.0, OptAlpha, &s, 0.001, 0.001, &fx, &f2x);
6084 fprintf(stderr, "Optimize alpha round %d to %.3f lk %.3f\n", i+1, s.alpha, -fx);
6085 s.mult = onedimenmin(0.01, s.mult, 10.0, OptMult, &s, 0.001, 0.001, &fx, &f2x);
6087 fprintf(stderr, "Optimize mult round %d to %.3f lk %.3f\n", i+1, s.mult, -fx);
6088 if (fx > start - 0.001) {
6090 fprintf(stderr, "Optimizing alpha & mult converged\n");
6095 double *gamma_loglk_sites = mymalloc(sizeof(double) * nPos);
6096 double gammaLogLk = GammaLogLk(&s, /*OUT*/gamma_loglk_sites);
6098 fprintf(stderr, "Gamma(%d) LogLk = %.3f alpha = %.3f rescaling lengths by %.3f\n",
6099 nRateCats, gammaLogLk, s.alpha, 1/s.mult);
6103 fprintf(fpLog, "Gamma%dLogLk\t%.3f\tApproximate\tAlpha\t%.3f\tRescale\t%.3f\n",
6104 nRateCats, gammaLogLk, s.alpha, 1/s.mult);
6105 fprintf(fpLog, "Gamma%d\tSite\tLogLk", nRateCats);
6106 for (iRate = 0; iRate < nRateCats; iRate++)
6107 fprintf(fpLog, "\tr=%.3f", rates[iRate]/s.mult);
6108 fprintf(fpLog,"\n");
6109 for (iPos = 0; iPos < nPos; iPos++) {
6110 fprintf(fpLog, "Gamma%d\t%d\t%.3f", nRateCats, iPos, gamma_loglk_sites[iPos]);
6111 for (iRate = 0; iRate < nRateCats; iRate++)
6112 fprintf(fpLog, "\t%.3f", site_loglk[nPos*iRate + iPos]);
6113 fprintf(fpLog,"\n");
6116 gamma_loglk_sites = myfree(gamma_loglk_sites, sizeof(double) * nPos);
6120 double MLPairOptimize(profile_t *pA, profile_t *pB,
6121 int nPos, /*OPTIONAL*/transition_matrix_t *transmat, rates_t *rates,
6122 /*IN/OUT*/double *branch_length) {
6123 quartet_opt_t qopt = { nPos, transmat, rates,
6124 /*nEval*/0, /*pair1*/pA, /*pair2*/pB };
6125 double f2x,negloglk;
6126 *branch_length = onedimenmin(/*xmin*/MLMinBranchLength,
6127 /*xguess*/*branch_length,
6131 /*ftol*/MLFTolBranchLength,
6132 /*atol*/MLMinBranchLengthTolerance,
6135 return(-negloglk); /* the log likelihood */
6138 void OptimizeAllBranchLengths(/*IN/OUT*/NJ_t *NJ) {
6141 if (NJ->nSeq == 2) {
6142 int parent = NJ->root;
6143 assert(NJ->child[parent].nChild==2);
6144 int nodes[2] = { NJ->child[parent].child[0], NJ->child[parent].child[1] };
6145 double length = 1.0;
6146 (void)MLPairOptimize(NJ->profiles[nodes[0]], NJ->profiles[nodes[1]],
6147 NJ->nPos, NJ->transmat, &NJ->rates, /*IN/OUT*/&length);
6148 NJ->branchlength[nodes[0]] = length/2.0;
6149 NJ->branchlength[nodes[1]] = length/2.0;
6153 traversal_t traversal = InitTraversal(NJ);
6154 profile_t **upProfiles = UpProfiles(NJ);
6155 int node = NJ->root;
6157 while((node = TraversePostorder(node, NJ, /*IN/OUT*/traversal, /*pUp*/NULL)) >= 0) {
6158 int nChild = NJ->child[node].nChild;
6160 if ((iDone % 100) == 0)
6161 ProgressReport("ML Lengths %d of %d splits", iDone+1, NJ->maxnode - NJ->nSeq, 0, 0);
6164 /* optimize the branch lengths between self, parent, and children,
6167 assert(nChild == 2 || nChild == 3);
6168 int nodes[3] = { NJ->child[node].child[0],
6169 NJ->child[node].child[1],
6170 nChild == 3 ? NJ->child[node].child[2] : node };
6171 profile_t *profiles[3] = { NJ->profiles[nodes[0]],
6172 NJ->profiles[nodes[1]],
6173 nChild == 3 ? NJ->profiles[nodes[2]]
6174 : GetUpProfile(/*IN/OUT*/upProfiles, NJ, node, /*useML*/true) };
6176 for (iter = 0; iter < 2; iter++) {
6178 for (i = 0; i < 3; i++) {
6179 profile_t *pA = profiles[i];
6182 profile_t *pB = PosteriorProfile(profiles[b1], profiles[b2],
6183 NJ->branchlength[nodes[b1]],
6184 NJ->branchlength[nodes[b2]],
6185 NJ->transmat, &NJ->rates, NJ->nPos, /*nConstraints*/0);
6186 double len = NJ->branchlength[nodes[i]];
6187 if (len < MLMinBranchLength)
6188 len = MLMinBranchLength;
6189 (void)MLPairOptimize(pA, pB, NJ->nPos, NJ->transmat, &NJ->rates, /*IN/OUT*/&len);
6190 NJ->branchlength[nodes[i]] = len;
6191 pB = FreeProfile(pB, NJ->nPos, /*nConstraints*/0);
6193 fprintf(stderr, "Optimize length for %d to %.3f\n",
6194 nodes[i], NJ->branchlength[nodes[i]]);
6197 if (node != NJ->root) {
6198 RecomputeProfile(/*IN/OUT*/NJ, /*IN/OUT*/upProfiles, node, /*useML*/true);
6199 DeleteUpProfile(upProfiles, NJ, node);
6203 traversal = FreeTraversal(traversal,NJ);
6204 upProfiles = FreeUpProfiles(upProfiles,NJ);
6207 void RecomputeMLProfiles(/*IN/OUT*/NJ_t *NJ) {
6208 traversal_t traversal = InitTraversal(NJ);
6209 int node = NJ->root;
6210 while((node = TraversePostorder(node, NJ, /*IN/OUT*/traversal, /*pUp*/NULL)) >= 0) {
6211 if (NJ->child[node].nChild == 2) {
6212 NJ->profiles[node] = FreeProfile(NJ->profiles[node], NJ->nPos, NJ->nConstraints);
6213 int *children = NJ->child[node].child;
6214 NJ->profiles[node] = PosteriorProfile(NJ->profiles[children[0]], NJ->profiles[children[1]],
6215 NJ->branchlength[children[0]], NJ->branchlength[children[1]],
6216 NJ->transmat, &NJ->rates, NJ->nPos, NJ->nConstraints);
6219 traversal = FreeTraversal(traversal, NJ);
6222 void RecomputeProfiles(/*IN/OUT*/NJ_t *NJ, /*OPTIONAL*/distance_matrix_t *dmat) {
6223 traversal_t traversal = InitTraversal(NJ);
6224 int node = NJ->root;
6225 while((node = TraversePostorder(node, NJ, /*IN/OUT*/traversal, /*pUp*/NULL)) >= 0) {
6226 if (NJ->child[node].nChild == 2) {
6227 int *child = NJ->child[node].child;
6228 NJ->profiles[node] = FreeProfile(NJ->profiles[node], NJ->nPos, NJ->nConstraints);
6229 NJ->profiles[node] = AverageProfile(NJ->profiles[child[0]], NJ->profiles[child[1]],
6230 NJ->nPos, NJ->nConstraints,
6231 dmat, /*unweighted*/-1.0);
6234 traversal = FreeTraversal(traversal,NJ);
6237 int NNI(/*IN/OUT*/NJ_t *NJ, int iRound, int nRounds, bool useML,
6238 /*IN/OUT*/nni_stats_t *stats,
6239 /*OUT*/double *dMaxDelta) {
6240 /* For each non-root node N, with children A,B, sibling C, and uncle D,
6241 we compare the current topology AB|CD to the alternate topologies
6242 AC|BD and AD|BC, by using the 4 relevant profiles.
6244 If useML is true, it uses quartet maximum likelihood, and it
6245 updates branch lengths as it goes.
6247 If useML is false, it uses the minimum-evolution criterion with
6248 log-corrected distances on profiles. (If logdist is false, then
6249 the log correction is not done.) If useML is false, then NNI()
6250 does NOT modify the branch lengths.
6252 Regardless of whether it changes the topology, it recomputes the
6253 profile for the node, using the pairwise distances and BIONJ-like
6254 weightings (if bionj is set). The parent's profile has changed,
6255 but recomputing it is not necessary because we will visit it
6256 before we need it (we use postorder, so we may visit the sibling
6257 and its children before we visit the parent, but we never
6258 consider an ancestor's profile, so that is OK). When we change
6259 the parent's profile, this alters the uncle's up-profile, so we
6260 remove that. Finally, if the topology has changed, we remove the
6261 up-profiles of the nodes.
6263 If we do an NNI during post-order traversal, the result is a bit
6264 tricky. E.g. if we are at node N, and have visited its children A
6265 and B but not its uncle C, and we do an NNI that swaps B & C,
6266 then the post-order traversal will visit C, and its children, but
6267 then on the way back up, it will skip N, as it has already
6268 visited it. So, the profile of N will not be recomputed: any
6269 changes beneath C will not be reflected in the profile of N, and
6270 the profile of N will be slightly stale. This will be corrected
6271 on the next round of NNIs.
6273 double supportThreshold = useML ? treeLogLkDelta : MEMinDelta;
6276 int nNNIThisRound = 0;
6279 return(0); /* nothing to do */
6281 fprintf(stderr, "Beginning round %d of NNIs with ml? %d\n", iRound, useML?1:0);
6282 PrintNJInternal(/*WRITE*/stderr, NJ, /*useLen*/useML && iRound > 0 ? 1 : 0);
6284 /* For each node the upProfile or NULL */
6285 profile_t **upProfiles = UpProfiles(NJ);
6287 traversal_t traversal = InitTraversal(NJ);
6289 /* Identify nodes we can skip traversing into */
6292 for (node = 0; node < NJ->maxnode; node++) {
6293 if (node != NJ->root
6295 && stats[node].age >= 2
6296 && stats[node].subtreeAge >= 2
6297 && stats[node].support > supportThreshold) {
6299 SetupABCD(NJ, node, NULL, NULL, /*OUT*/nodeABCD, useML);
6300 for (i = 0; i < 4; i++)
6301 if (stats[nodeABCD[i]].age == 0 && stats[nodeABCD[i]].support > supportThreshold)
6304 SkipTraversalInto(node, /*IN/OUT*/traversal);
6306 fprintf(stderr, "Skipping subtree at %d: child %d %d parent %d age %d subtreeAge %d support %.3f\n",
6307 node, nodeABCD[0], nodeABCD[1], NJ->parent[node],
6308 stats[node].age, stats[node].subtreeAge, stats[node].support);
6317 while((node = TraversePostorder(node, NJ, /*IN/OUT*/traversal, &bUp)) >= 0) {
6318 if (node < NJ->nSeq || node == NJ->root)
6319 continue; /* nothing to do for leaves or root */
6322 fprintf(stderr, "Going up back to node %d\n", node);
6323 /* No longer needed */
6324 for (i = 0; i < NJ->child[node].nChild; i++)
6325 DeleteUpProfile(upProfiles, NJ, NJ->child[node].child[i]);
6326 DeleteUpProfile(upProfiles, NJ, node);
6327 RecomputeProfile(/*IN/OUT*/NJ, /*IN/OUT*/upProfiles, node, useML);
6330 if ((iDone % 100) == 0) {
6332 sprintf(buf, "%s NNI round %%d of %%d, %%d of %%d splits", useML ? "ML" : "ME");
6334 sprintf(buf+strlen(buf), ", %d changes", nNNIThisRound);
6335 if (nNNIThisRound > 0)
6336 sprintf(buf+strlen(buf), " (max delta %.3f)", *dMaxDelta);
6337 ProgressReport(buf, iRound+1, nRounds, iDone+1, NJ->maxnode - NJ->nSeq);
6341 profile_t *profiles[4];
6343 /* Note -- during the first round of ML NNIs, we use the min-evo-based branch lengths,
6344 which may be suboptimal */
6345 SetupABCD(NJ, node, /*OUT*/profiles, /*IN/OUT*/upProfiles, /*OUT*/nodeABCD, useML);
6347 /* Given our 4 profiles, consider doing a swap */
6348 int nodeA = nodeABCD[0];
6349 int nodeB = nodeABCD[1];
6350 int nodeC = nodeABCD[2];
6351 int nodeD = nodeABCD[3];
6353 nni_t choice = ABvsCD;
6356 fprintf(stderr,"Considering NNI around %d: Swap A=%d B=%d C=%d D=up(%d) or parent %d\n",
6357 node, nodeA, nodeB, nodeC, nodeD, NJ->parent[node]);
6358 if (verbose > 3 && useML) {
6359 double len[5] = { NJ->branchlength[nodeA], NJ->branchlength[nodeB], NJ->branchlength[nodeC], NJ->branchlength[nodeD],
6360 NJ->branchlength[node] };
6361 for (i=0; i < 5; i++)
6362 if (len[i] < MLMinBranchLength)
6363 len[i] = MLMinBranchLength;
6364 fprintf(stderr, "Starting quartet likelihood %.3f len %.3f %.3f %.3f %.3f %.3f\n",
6365 MLQuartetLogLk(profiles[0],profiles[1],profiles[2],profiles[3],NJ->nPos,NJ->transmat,&NJ->rates,len, /*site_lk*/NULL),
6366 len[0], len[1], len[2], len[3], len[4]);
6369 numeric_t newlength[5];
6372 for (i = 0; i < 4; i++)
6373 newlength[i] = NJ->branchlength[nodeABCD[i]];
6374 newlength[4] = NJ->branchlength[node];
6375 bool bFast = mlAccuracy < 2 && stats[node].age > 0;
6376 choice = MLQuartetNNI(profiles, NJ->transmat, &NJ->rates, NJ->nPos, NJ->nConstraints,
6377 /*OUT*/criteria, /*IN/OUT*/newlength, bFast);
6379 choice = ChooseNNI(profiles, NJ->distance_matrix, NJ->nPos, NJ->nConstraints,
6381 /* invert criteria so that higher is better, as in ML case, to simplify code below */
6382 for (i = 0; i < 3; i++)
6383 criteria[i] = -criteria[i];
6386 if (choice == ACvsBD) {
6388 ReplaceChild(/*IN/OUT*/NJ, node, nodeB, nodeC);
6389 ReplaceChild(/*IN/OUT*/NJ, NJ->parent[node], nodeC, nodeB);
6390 } else if (choice == ADvsBC) {
6392 ReplaceChild(/*IN/OUT*/NJ, node, nodeA, nodeC);
6393 ReplaceChild(/*IN/OUT*/NJ, NJ->parent[node], nodeC, nodeA);
6397 /* update branch length for the internal branch, and of any
6398 branches that lead to leaves, b/c those will not are not
6399 the internal branch for NNI and would not otherwise be set.
6401 if (choice == ADvsBC) {
6402 /* For ADvsBC, MLQuartetNNI swaps B with D, but we swap A with C */
6403 double length2[5] = { newlength[LEN_C], newlength[LEN_D],
6404 newlength[LEN_A], newlength[LEN_B],
6407 for (i = 0; i < 5; i++) newlength[i] = length2[i];
6408 /* and swap A and C */
6409 double tmp = newlength[LEN_A];
6410 newlength[LEN_A] = newlength[LEN_C];
6411 newlength[LEN_C] = tmp;
6412 } else if (choice == ACvsBD) {
6414 double tmp = newlength[LEN_B];
6415 newlength[LEN_B] = newlength[LEN_C];
6416 newlength[LEN_C] = tmp;
6419 NJ->branchlength[node] = newlength[LEN_I];
6420 NJ->branchlength[nodeA] = newlength[LEN_A];
6421 NJ->branchlength[nodeB] = newlength[LEN_B];
6422 NJ->branchlength[nodeC] = newlength[LEN_C];
6423 NJ->branchlength[nodeD] = newlength[LEN_D];
6426 if (verbose>2 && (choice != ABvsCD || verbose > 2))
6427 fprintf(stderr,"NNI around %d: Swap A=%d B=%d C=%d D=out(C) -- choose %s %s %.4f\n",
6428 node, nodeA, nodeB, nodeC,
6429 choice == ACvsBD ? "AC|BD" : (choice == ABvsCD ? "AB|CD" : "AD|BC"),
6430 useML ? "delta-loglk" : "-deltaLen",
6431 criteria[choice] - criteria[ABvsCD]);
6432 if(verbose >= 3 && slow && useML)
6433 fprintf(stderr, "Old tree lk -- %.4f\n", TreeLogLk(NJ, /*site_likelihoods*/NULL));
6435 /* update stats, *dMaxDelta, etc. */
6436 if (choice == ABvsCD) {
6444 stats[node].age = 0;
6445 stats[nodeA].age = 0;
6446 stats[nodeB].age = 0;
6447 stats[nodeC].age = 0;
6448 stats[nodeD].age = 0;
6450 stats[node].delta = criteria[choice] - criteria[ABvsCD]; /* 0 if ABvsCD */
6451 if (stats[node].delta > *dMaxDelta)
6452 *dMaxDelta = stats[node].delta;
6454 /* support is improvement of score for self over better of alternatives */
6455 stats[node].support = 1e20;
6456 for (i = 0; i < 3; i++)
6457 if (choice != i && criteria[choice]-criteria[i] < stats[node].support)
6458 stats[node].support = criteria[choice]-criteria[i];
6460 /* subtreeAge is the number of rounds since self or descendent had a significant improvement */
6461 if (stats[node].delta > supportThreshold)
6462 stats[node].subtreeAge = 0;
6464 stats[node].subtreeAge++;
6465 for (i = 0; i < 2; i++) {
6466 int child = NJ->child[node].child[i];
6467 if (stats[node].subtreeAge > stats[child].subtreeAge)
6468 stats[node].subtreeAge = stats[child].subtreeAge;
6472 /* update profiles and free up unneeded up-profiles */
6473 if (choice == ABvsCD) {
6474 /* No longer needed */
6475 DeleteUpProfile(upProfiles, NJ, nodeA);
6476 DeleteUpProfile(upProfiles, NJ, nodeB);
6477 DeleteUpProfile(upProfiles, NJ, nodeC);
6478 RecomputeProfile(/*IN/OUT*/NJ, /*IN/OUT*/upProfiles, node, useML);
6480 UpdateForNNI(NJ, node, upProfiles, useML);
6482 UpdateForNNI(NJ, node, upProfiles, useML);
6484 if(verbose > 2 && slow && useML) {
6485 /* Note we recomputed profiles back up to root already if slow */
6486 PrintNJInternal(/*WRITE*/stderr, NJ, /*useLen*/true);
6487 fprintf(stderr, "New tree lk -- %.4f\n", TreeLogLk(NJ, /*site_likelihoods*/NULL));
6489 } /* end postorder traversal */
6490 traversal = FreeTraversal(traversal,NJ);
6493 for (i = 0; i < NJ->maxnodes; i++)
6494 if (upProfiles[i] != NULL)
6496 fprintf(stderr, "N up profiles at end of NNI: %d\n", nUp);
6498 upProfiles = FreeUpProfiles(upProfiles,NJ);
6499 return(nNNIThisRound);
6502 nni_stats_t *InitNNIStats(NJ_t *NJ) {
6503 nni_stats_t *stats = mymalloc(sizeof(nni_stats_t)*NJ->maxnode);
6504 const int LargeAge = 1000000;
6506 for (i = 0; i < NJ->maxnode; i++) {
6508 stats[i].support = 0;
6509 if (i == NJ->root || i < NJ->nSeq) {
6510 stats[i].age = LargeAge;
6511 stats[i].subtreeAge = LargeAge;
6514 stats[i].subtreeAge = 0;
6520 nni_stats_t *FreeNNIStats(nni_stats_t *stats, NJ_t *NJ) {
6521 return(myfree(stats, sizeof(nni_stats_t)*NJ->maxnode));
6524 int FindSPRSteps(/*IN/OUT*/NJ_t *NJ,
6525 int nodeMove, /* the node to move multiple times */
6526 int nodeAround, /* sibling or parent of node to NNI to start the chain */
6527 /*IN/OUT*/profile_t **upProfiles,
6528 /*OUT*/spr_step_t *steps,
6532 for (iStep = 0; iStep < maxSteps; iStep++) {
6533 if (NJ->child[nodeAround].nChild != 2)
6534 break; /* no further to go */
6536 /* Consider the NNIs around nodeAround */
6537 profile_t *profiles[4];
6539 SetupABCD(NJ, nodeAround, /*OUT*/profiles, /*IN/OUT*/upProfiles, /*OUT*/nodeABCD, /*useML*/false);
6541 (void) ChooseNNI(profiles, NJ->distance_matrix, NJ->nPos, NJ->nConstraints,
6544 /* Do & save the swap */
6545 spr_step_t *step = &steps[iStep];
6546 if (iStep == 0 ? bFirstAC : criteria[ACvsBD] < criteria[ADvsBC]) {
6547 /* swap B & C to put AC together */
6548 step->deltaLength = criteria[ACvsBD] - criteria[ABvsCD];
6549 step->nodes[0] = nodeABCD[1];
6550 step->nodes[1] = nodeABCD[2];
6552 /* swap AC to put AD together */
6553 step->deltaLength = criteria[ADvsBC] - criteria[ABvsCD];
6554 step->nodes[0] = nodeABCD[0];
6555 step->nodes[1] = nodeABCD[2];
6559 fprintf(stderr, "SPR chain step %d for %d around %d swap %d %d deltaLen %.5f\n",
6560 iStep+1, nodeAround, nodeMove, step->nodes[0], step->nodes[1], step->deltaLength);
6562 PrintNJInternal(stderr, NJ, /*useLen*/false);
6564 ReplaceChild(/*IN/OUT*/NJ, nodeAround, step->nodes[0], step->nodes[1]);
6565 ReplaceChild(/*IN/OUT*/NJ, NJ->parent[nodeAround], step->nodes[1], step->nodes[0]);
6566 UpdateForNNI(/*IN/OUT*/NJ, nodeAround, /*IN/OUT*/upProfiles, /*useML*/false);
6568 /* set the new nodeAround -- either parent(nodeMove) or sibling(nodeMove) --
6569 so that it different from current nodeAround
6571 int newAround[2] = { NJ->parent[nodeMove], Sibling(NJ, nodeMove) };
6572 if (NJ->parent[nodeMove] == NJ->root)
6573 RootSiblings(NJ, nodeMove, /*OUT*/newAround);
6574 assert(newAround[0] == nodeAround || newAround[1] == nodeAround);
6575 assert(newAround[0] != newAround[1]);
6576 nodeAround = newAround[newAround[0] == nodeAround ? 1 : 0];
6581 void UnwindSPRStep(/*IN/OUT*/NJ_t *NJ,
6582 /*IN*/spr_step_t *step,
6583 /*IN/OUT*/profile_t **upProfiles) {
6586 for (i = 0; i < 2; i++) {
6587 assert(step->nodes[i] >= 0 && step->nodes[i] < NJ->maxnodes);
6588 parents[i] = NJ->parent[step->nodes[i]];
6589 assert(parents[i] >= 0);
6591 assert(parents[0] != parents[1]);
6592 ReplaceChild(/*IN/OUT*/NJ, parents[0], step->nodes[0], step->nodes[1]);
6593 ReplaceChild(/*IN/OUT*/NJ, parents[1], step->nodes[1], step->nodes[0]);
6595 if (NJ->parent[parents[0]] == parents[1]) {
6598 assert(NJ->parent[parents[1]] == parents[0]);
6601 UpdateForNNI(/*IN/OUT*/NJ, parents[iYounger], /*IN/OUT*/upProfiles, /*useML*/false);
6604 /* Update the profile of node and its ancestor, and delete nearby out-profiles */
6605 void UpdateForNNI(/*IN/OUT*/NJ_t *NJ, int node, /*IN/OUT*/profile_t **upProfiles,
6609 /* exhaustive update */
6610 for (i = 0; i < NJ->maxnodes; i++)
6611 DeleteUpProfile(upProfiles, NJ, i);
6613 /* update profiles back to root */
6615 for (ancestor = node; ancestor >= 0; ancestor = NJ->parent[ancestor])
6616 RecomputeProfile(/*IN/OUT*/NJ, upProfiles, ancestor, useML);
6618 /* remove any up-profiles made while doing that*/
6619 for (i = 0; i < NJ->maxnodes; i++)
6620 DeleteUpProfile(upProfiles, NJ, i);
6622 /* if fast, only update around self
6623 note that upProfile(parent) is still OK after an NNI, but
6624 up-profiles of uncles may not be
6626 DeleteUpProfile(upProfiles, NJ, node);
6627 for (i = 0; i < NJ->child[node].nChild; i++)
6628 DeleteUpProfile(upProfiles, NJ, NJ->child[node].child[i]);
6629 assert(node != NJ->root);
6630 int parent = NJ->parent[node];
6631 int neighbors[2] = { parent, Sibling(NJ, node) };
6632 if (parent == NJ->root)
6633 RootSiblings(NJ, node, /*OUT*/neighbors);
6634 DeleteUpProfile(upProfiles, NJ, neighbors[0]);
6635 DeleteUpProfile(upProfiles, NJ, neighbors[1]);
6636 int uncle = Sibling(NJ, parent);
6638 DeleteUpProfile(upProfiles, NJ, uncle);
6639 RecomputeProfile(/*IN/OUT*/NJ, upProfiles, node, useML);
6640 RecomputeProfile(/*IN/OUT*/NJ, upProfiles, parent, useML);
6644 void SPR(/*IN/OUT*/NJ_t *NJ, int maxSPRLength, int iRound, int nRounds) {
6645 /* Given a non-root node N with children A,B, sibling C, and uncle D,
6646 we can try to move A by doing three types of moves (4 choices):
6647 "down" -- swap A with a child of B (if B is not a leaf) [2 choices]
6648 "over" -- swap B with C
6649 "up" -- swap A with D
6650 We follow down moves with down moves, over moves with down moves, and
6651 up moves with either up or over moves. (Other choices are just backing
6652 up and hence useless.)
6654 As with NNIs, we keep track of up-profiles as we go. However, some of the regular
6655 profiles may also become "stale" so it is a bit trickier.
6657 We store the traversal before we do SPRs to avoid any possible infinite loop
6659 double last_tot_len = 0.0;
6660 if (NJ->nSeq <= 3 || maxSPRLength < 1)
6663 last_tot_len = TreeLength(NJ, /*recomputeLengths*/true);
6664 int *nodeList = mymalloc(sizeof(int) * NJ->maxnodes);
6665 int nodeListLen = 0;
6666 traversal_t traversal = InitTraversal(NJ);
6667 int node = NJ->root;
6668 while((node = TraversePostorder(node, NJ, /*IN/OUT*/traversal, /*pUp*/NULL)) >= 0) {
6669 nodeList[nodeListLen++] = node;
6671 assert(nodeListLen == NJ->maxnode);
6672 traversal = FreeTraversal(traversal,NJ);
6674 profile_t **upProfiles = UpProfiles(NJ);
6675 spr_step_t *steps = mymalloc(sizeof(spr_step_t) * maxSPRLength); /* current chain of SPRs */
6678 for (i = 0; i < nodeListLen; i++) {
6681 ProgressReport("SPR round %3d of %3d, %d of %d nodes",
6682 iRound+1, nRounds, i+1, nodeListLen);
6683 if (node == NJ->root)
6684 continue; /* nothing to do for root */
6685 /* The nodes to NNI around */
6686 int nodeAround[2] = { NJ->parent[node], Sibling(NJ, node) };
6687 if (NJ->parent[node] == NJ->root) {
6688 /* NNI around both siblings instead */
6689 RootSiblings(NJ, node, /*OUT*/nodeAround);
6691 bool bChanged = false;
6693 for (iAround = 0; iAround < 2 && bChanged == false; iAround++) {
6695 for (ACFirst = 0; ACFirst < 2 && bChanged == false; ACFirst++) {
6697 PrintNJInternal(stderr, NJ, /*useLen*/false);
6698 int chainLength = FindSPRSteps(/*IN/OUT*/NJ, node, nodeAround[iAround],
6699 upProfiles, /*OUT*/steps, maxSPRLength, (bool)ACFirst);
6700 double dMinDelta = 0.0;
6702 double dTotDelta = 0.0;
6704 for (iC = 0; iC < chainLength; iC++) {
6705 dTotDelta += steps[iC].deltaLength;
6706 if (dTotDelta < dMinDelta) {
6707 dMinDelta = dTotDelta;
6713 fprintf(stderr, "SPR %s %d around %d chainLength %d of %d deltaLength %.5f swaps:",
6714 iCBest >= 0 ? "move" : "abandoned",
6715 node,nodeAround[iAround],iCBest+1,chainLength,dMinDelta);
6716 for (iC = 0; iC < chainLength; iC++)
6717 fprintf(stderr, " (%d,%d)%.4f", steps[iC].nodes[0], steps[iC].nodes[1], steps[iC].deltaLength);
6718 fprintf(stderr,"\n");
6720 for (iC = chainLength - 1; iC > iCBest; iC--)
6721 UnwindSPRStep(/*IN/OUT*/NJ, /*IN*/&steps[iC], /*IN/OUT*/upProfiles);
6723 PrintNJInternal(stderr, NJ, /*useLen*/false);
6724 while (slow && iCBest >= 0) {
6725 double expected_tot_len = last_tot_len + dMinDelta;
6726 double new_tot_len = TreeLength(NJ, /*recompute*/true);
6728 fprintf(stderr, "Total branch-length is now %.4f was %.4f expected %.4f\n",
6729 new_tot_len, last_tot_len, expected_tot_len);
6730 if (new_tot_len < last_tot_len) {
6731 last_tot_len = new_tot_len;
6732 break; /* no rewinding necessary */
6735 fprintf(stderr, "Rewinding SPR to %d\n",iCBest);
6736 UnwindSPRStep(/*IN/OUT*/NJ, /*IN*/&steps[iCBest], /*IN/OUT*/upProfiles);
6737 dMinDelta -= steps[iCBest].deltaLength;
6742 } /* loop over which step to take at 1st NNI */
6743 } /* loop over which node to pivot around */
6746 nSPR++; /* the SPR move is OK */
6747 /* make sure all the profiles are OK */
6749 for (j = 0; j < NJ->maxnodes; j++)
6750 DeleteUpProfile(upProfiles, NJ, j);
6752 for (ancestor = NJ->parent[node]; ancestor >= 0; ancestor = NJ->parent[ancestor])
6753 RecomputeProfile(/*IN/OUT*/NJ, upProfiles, ancestor, /*useML*/false);
6755 } /* end loop over subtrees to prune & regraft */
6756 steps = myfree(steps, sizeof(spr_step_t) * maxSPRLength);
6757 upProfiles = FreeUpProfiles(upProfiles,NJ);
6758 nodeList = myfree(nodeList, sizeof(int) * NJ->maxnodes);
6761 void RecomputeProfile(/*IN/OUT*/NJ_t *NJ, /*IN/OUT*/profile_t **upProfiles, int node,
6763 if (node < NJ->nSeq || node == NJ->root)
6764 return; /* no profile to compute */
6765 assert(NJ->child[node].nChild==2);
6767 profile_t *profiles[4];
6768 double weight = 0.5;
6769 if (useML || !bionj) {
6770 profiles[0] = NJ->profiles[NJ->child[node].child[0]];
6771 profiles[1] = NJ->profiles[NJ->child[node].child[1]];
6774 SetupABCD(NJ, node, /*OUT*/profiles, /*IN/OUT*/upProfiles, /*OUT*/nodeABCD, useML);
6775 weight = QuartetWeight(profiles, NJ->distance_matrix, NJ->nPos);
6779 fprintf(stderr, "Recompute %d from %d %d lengths %.4f %.4f\n",
6781 NJ->child[node].child[0],
6782 NJ->child[node].child[1],
6783 NJ->branchlength[NJ->child[node].child[0]],
6784 NJ->branchlength[NJ->child[node].child[1]]);
6786 fprintf(stderr, "Recompute %d from %d %d weight %.3f\n",
6787 node, NJ->child[node].child[0], NJ->child[node].child[1], weight);
6790 NJ->profiles[node] = FreeProfile(NJ->profiles[node], NJ->nPos, NJ->nConstraints);
6792 NJ->profiles[node] = PosteriorProfile(profiles[0], profiles[1],
6793 NJ->branchlength[NJ->child[node].child[0]],
6794 NJ->branchlength[NJ->child[node].child[1]],
6795 NJ->transmat, &NJ->rates, NJ->nPos, NJ->nConstraints);
6797 NJ->profiles[node] = AverageProfile(profiles[0], profiles[1],
6798 NJ->nPos, NJ->nConstraints,
6799 NJ->distance_matrix, weight);
6803 /* The BIONJ-like formula for the weight of A when building a profile for AB is
6804 1/2 + (avgD(B,CD) - avgD(A,CD))/(2*d(A,B))
6806 double QuartetWeight(profile_t *profiles[4], distance_matrix_t *dmat, int nPos) {
6808 return(-1.0); /* even weighting */
6810 CorrectedPairDistances(profiles, 4, dmat, nPos, /*OUT*/d);
6813 double weight = 0.5 + ((d[qBC]+d[qBD])-(d[qAC]+d[qAD]))/(4*d[qAB]);
6821 /* Resets the children entry of parent and also the parent entry of newchild */
6822 void ReplaceChild(/*IN/OUT*/NJ_t *NJ, int parent, int oldchild, int newchild) {
6823 NJ->parent[newchild] = parent;
6826 for (iChild = 0; iChild < NJ->child[parent].nChild; iChild++) {
6827 if (NJ->child[parent].child[iChild] == oldchild) {
6828 NJ->child[parent].child[iChild] = newchild;
6835 /* Recomputes all branch lengths
6837 For internal branches such as (A,B) vs. (C,D), uses the formula
6839 length(AB|CD) = (d(A,C)+d(A,D)+d(B,C)+d(B,D))/4 - d(A,B)/2 - d(C,D)/2
6841 (where all distances are profile distances - diameters).
6843 For external branches (e.g. to leaves) A vs. (B,C), use the formula
6845 length(A|BC) = (d(A,B)+d(A,C)-d(B,C))/2
6847 void UpdateBranchLengths(/*IN/OUT*/NJ_t *NJ) {
6850 else if (NJ->nSeq == 2) {
6851 int root = NJ->root;
6852 int nodeA = NJ->child[root].child[0];
6853 int nodeB = NJ->child[root].child[1];
6855 ProfileDist(NJ->profiles[nodeA],NJ->profiles[nodeB],
6856 NJ->nPos, NJ->distance_matrix, /*OUT*/&h);
6858 h.dist = LogCorrect(h.dist);
6859 NJ->branchlength[nodeA] = h.dist/2.0;
6860 NJ->branchlength[nodeB] = h.dist/2.0;
6864 profile_t **upProfiles = UpProfiles(NJ);
6865 traversal_t traversal = InitTraversal(NJ);
6866 int node = NJ->root;
6868 while((node = TraversePostorder(node, NJ, /*IN/OUT*/traversal, /*pUp*/NULL)) >= 0) {
6869 /* reset branch length of node (distance to its parent) */
6870 if (node == NJ->root)
6871 continue; /* no branch length to set */
6872 if (node < NJ->nSeq) { /* a leaf */
6873 profile_t *profileA = NJ->profiles[node];
6874 profile_t *profileB = NULL;
6875 profile_t *profileC = NULL;
6877 int sib = Sibling(NJ,node);
6878 if (sib == -1) { /* at root, have 2 siblings */
6880 RootSiblings(NJ, node, /*OUT*/sibs);
6881 profileB = NJ->profiles[sibs[0]];
6882 profileC = NJ->profiles[sibs[1]];
6884 profileB = NJ->profiles[sib];
6885 profileC = GetUpProfile(/*IN/OUT*/upProfiles, NJ, NJ->parent[node], /*useML*/false);
6887 profile_t *profiles[3] = {profileA,profileB,profileC};
6888 double d[3]; /*AB,AC,BC*/
6889 CorrectedPairDistances(profiles, 3, NJ->distance_matrix, NJ->nPos, /*OUT*/d);
6890 /* d(A,BC) = (dAB+dAC-dBC)/2 */
6891 NJ->branchlength[node] = (d[0]+d[1]-d[2])/2.0;
6893 profile_t *profiles[4];
6895 SetupABCD(NJ, node, /*OUT*/profiles, /*IN/OUT*/upProfiles, /*OUT*/nodeABCD, /*useML*/false);
6897 CorrectedPairDistances(profiles, 4, NJ->distance_matrix, NJ->nPos, /*OUT*/d);
6898 NJ->branchlength[node] = (d[qAC]+d[qAD]+d[qBC]+d[qBD])/4.0 - (d[qAB]+d[qCD])/2.0;
6900 /* no longer needed */
6901 DeleteUpProfile(upProfiles, NJ, nodeABCD[0]);
6902 DeleteUpProfile(upProfiles, NJ, nodeABCD[1]);
6905 traversal = FreeTraversal(traversal,NJ);
6906 upProfiles = FreeUpProfiles(upProfiles,NJ);
6909 /* Pick columns for resampling, stored as returned_vector[iBoot*nPos + j] */
6910 int *ResampleColumns(int nPos, int nBootstrap) {
6911 long lPos = nPos; /* to prevent overflow on very long alignments when multiplying nPos * nBootstrap */
6912 int *col = (int*)mymalloc(sizeof(int)*lPos*(size_t)nBootstrap);
6914 for (i = 0; i < nBootstrap; i++) {
6916 for (j = 0; j < nPos; j++) {
6917 int pos = (int)(knuth_rand() * nPos);
6920 else if (pos == nPos)
6922 col[i*lPos + j] = pos;
6926 for (i=0; i < 3 && i < nBootstrap; i++) {
6927 fprintf(stderr,"Boot%d",i);
6929 for (j = 0; j < nPos; j++) {
6930 fprintf(stderr,"\t%d",col[i*lPos+j]);
6932 fprintf(stderr,"\n");
6938 void ReliabilityNJ(/*IN/OUT*/NJ_t *NJ, int nBootstrap) {
6939 /* For each non-root node N, with children A,B, parent P, sibling C, and grandparent G,
6940 we test the reliability of the split (A,B) versus rest by comparing the profiles
6941 of A, B, C, and the "up-profile" of P.
6943 Each node's upProfile is the average of its sibling's (down)-profile + its parent's up-profile
6944 (If node's parent is the root, then there are two siblings and we don't need an up-profile)
6946 To save memory, we do depth-first-search down from the root, and we only keep
6947 up-profiles for nodes in the active path.
6949 if (NJ->nSeq <= 3 || nBootstrap <= 0)
6950 return; /* nothing to do */
6951 int *col = ResampleColumns(NJ->nPos, nBootstrap);
6953 profile_t **upProfiles = UpProfiles(NJ);
6954 traversal_t traversal = InitTraversal(NJ);
6955 int node = NJ->root;
6957 while((node = TraversePostorder(node, NJ, /*IN/OUT*/traversal, /*pUp*/NULL)) >= 0) {
6958 if (node < NJ->nSeq || node == NJ->root)
6959 continue; /* nothing to do for leaves or root */
6961 if(iNodesDone > 0 && (iNodesDone % 100) == 0)
6962 ProgressReport("Local bootstrap for %6d of %6d internal splits", iNodesDone, NJ->nSeq-3, 0, 0);
6965 profile_t *profiles[4];
6967 SetupABCD(NJ, node, /*OUT*/profiles, /*IN/OUT*/upProfiles, /*OUT*/nodeABCD, /*useML*/false);
6969 NJ->support[node] = SplitSupport(profiles[0], profiles[1], profiles[2], profiles[3],
6970 NJ->distance_matrix,
6975 /* no longer needed */
6976 DeleteUpProfile(upProfiles, NJ, nodeABCD[0]);
6977 DeleteUpProfile(upProfiles, NJ, nodeABCD[1]);
6978 DeleteUpProfile(upProfiles, NJ, nodeABCD[2]);
6980 traversal = FreeTraversal(traversal,NJ);
6981 upProfiles = FreeUpProfiles(upProfiles,NJ);
6982 col = myfree(col, sizeof(int)*((size_t)NJ->nPos)*nBootstrap);
6985 profile_t *NewProfile(int nPos, int nConstraints) {
6986 profile_t *profile = (profile_t *)mymalloc(sizeof(profile_t));
6987 profile->weights = mymalloc(sizeof(numeric_t)*nPos);
6988 profile->codes = mymalloc(sizeof(unsigned char)*nPos);
6989 profile->vectors = NULL;
6990 profile->nVectors = 0;
6991 profile->codeDist = NULL;
6992 if (nConstraints == 0) {
6993 profile->nOn = NULL;
6994 profile->nOff = NULL;
6996 profile->nOn = mymalloc(sizeof(int)*nConstraints);
6997 profile->nOff = mymalloc(sizeof(int)*nConstraints);
7002 profile_t *FreeProfile(profile_t *profile, int nPos, int nConstraints) {
7003 if(profile==NULL) return(NULL);
7004 myfree(profile->codes, nPos);
7005 myfree(profile->weights, nPos);
7006 myfree(profile->vectors, sizeof(numeric_t)*nCodes*profile->nVectors);
7007 myfree(profile->codeDist, sizeof(numeric_t)*nCodes*nPos);
7008 if (nConstraints > 0) {
7009 myfree(profile->nOn, sizeof(int)*nConstraints);
7010 myfree(profile->nOff, sizeof(int)*nConstraints);
7012 return(myfree(profile, sizeof(profile_t)));
7015 void SetupABCD(NJ_t *NJ, int node,
7016 /* the 4 profiles; the last one is an outprofile */
7017 /*OPTIONAL OUT*/profile_t *profiles[4],
7018 /*OPTIONAL IN/OUT*/profile_t **upProfiles,
7019 /*OUT*/int nodeABCD[4],
7021 int parent = NJ->parent[node];
7022 assert(parent >= 0);
7023 assert(NJ->child[node].nChild == 2);
7024 nodeABCD[0] = NJ->child[node].child[0]; /*A*/
7025 nodeABCD[1] = NJ->child[node].child[1]; /*B*/
7027 profile_t *profile4 = NULL;
7028 if (parent == NJ->root) {
7030 RootSiblings(NJ, node, /*OUT*/sibs);
7031 nodeABCD[2] = sibs[0];
7032 nodeABCD[3] = sibs[1];
7033 if (profiles == NULL)
7035 profile4 = NJ->profiles[sibs[1]];
7037 nodeABCD[2] = Sibling(NJ,node);
7038 assert(nodeABCD[2] >= 0);
7039 nodeABCD[3] = parent;
7040 if (profiles == NULL)
7042 profile4 = GetUpProfile(upProfiles,NJ,parent,useML);
7044 assert(upProfiles != NULL);
7046 for (i = 0; i < 3; i++)
7047 profiles[i] = NJ->profiles[nodeABCD[i]];
7048 profiles[3] = profile4;
7052 int Sibling(NJ_t *NJ, int node) {
7053 int parent = NJ->parent[node];
7054 if (parent < 0 || parent == NJ->root)
7057 for(iChild=0;iChild<NJ->child[parent].nChild;iChild++) {
7058 if(NJ->child[parent].child[iChild] != node)
7059 return (NJ->child[parent].child[iChild]);
7065 void RootSiblings(NJ_t *NJ, int node, /*OUT*/int sibs[2]) {
7066 assert(NJ->parent[node] == NJ->root);
7067 assert(NJ->child[NJ->root].nChild == 3);
7071 for(iChild=0; iChild < NJ->child[NJ->root].nChild; iChild++) {
7072 int child = NJ->child[NJ->root].child[iChild];
7073 if (child != node) sibs[nSibs++] = child;
7078 void TestSplitsML(/*IN/OUT*/NJ_t *NJ, /*OUT*/SplitCount_t *splitcount, int nBootstrap) {
7079 const double tolerance = 1e-6;
7080 splitcount->nBadSplits = 0;
7081 splitcount->nConstraintViolations = 0;
7082 splitcount->nBadBoth = 0;
7083 splitcount->nSplits = 0;
7084 splitcount->dWorstDeltaUnconstrained = 0;
7085 splitcount->dWorstDeltaConstrained = 0;
7087 profile_t **upProfiles = UpProfiles(NJ);
7088 traversal_t traversal = InitTraversal(NJ);
7089 int node = NJ->root;
7091 int *col = nBootstrap > 0 ? ResampleColumns(NJ->nPos, nBootstrap) : NULL;
7092 double *site_likelihoods[3];
7094 for (choice = 0; choice < 3; choice++)
7095 site_likelihoods[choice] = mymalloc(sizeof(double)*NJ->nPos);
7098 while((node = TraversePostorder(node, NJ, /*IN/OUT*/traversal, /*pUp*/NULL)) >= 0) {
7099 if (node < NJ->nSeq || node == NJ->root)
7100 continue; /* nothing to do for leaves or root */
7102 if(iNodesDone > 0 && (iNodesDone % 100) == 0)
7103 ProgressReport("ML split tests for %6d of %6d internal splits", iNodesDone, NJ->nSeq-3, 0, 0);
7106 profile_t *profiles[4];
7108 SetupABCD(NJ, node, /*OUT*/profiles, /*IN/OUT*/upProfiles, /*OUT*/nodeABCD, /*useML*/true);
7112 for (i = 0; i < 4; i++)
7113 len[i] = NJ->branchlength[nodeABCD[i]];
7114 len[4] = NJ->branchlength[node];
7115 double lenABvsCD[5] = {len[LEN_A], len[LEN_B], len[LEN_C], len[LEN_D], len[LEN_I]};
7116 double lenACvsBD[5] = {len[LEN_A], len[LEN_C], len[LEN_B], len[LEN_D], len[LEN_I]}; /* Swap B & C */
7117 double lenADvsBC[5] = {len[LEN_A], len[LEN_D], len[LEN_C], len[LEN_B], len[LEN_I]}; /* Swap B & D */
7121 #pragma omp parallel
7122 #pragma omp sections
7129 /* Lengths are already optimized for ABvsCD */
7130 loglk[ABvsCD] = MLQuartetLogLk(profiles[0], profiles[1], profiles[2], profiles[3],
7131 NJ->nPos, NJ->transmat, &NJ->rates, /*IN/OUT*/lenABvsCD,
7132 /*OUT*/site_likelihoods[ABvsCD]);
7139 loglk[ACvsBD] = MLQuartetOptimize(profiles[0], profiles[2], profiles[1], profiles[3],
7140 NJ->nPos, NJ->transmat, &NJ->rates, /*IN/OUT*/lenACvsBD, /*pStarTest*/NULL,
7141 /*OUT*/site_likelihoods[ACvsBD]);
7148 loglk[ADvsBC] = MLQuartetOptimize(profiles[0], profiles[3], profiles[2], profiles[1],
7149 NJ->nPos, NJ->transmat, &NJ->rates, /*IN/OUT*/lenADvsBC, /*pStarTest*/NULL,
7150 /*OUT*/site_likelihoods[ADvsBC]);
7155 /* do a second pass on the better alternative if it is close */
7156 if (loglk[ACvsBD] > loglk[ADvsBC]) {
7157 if (mlAccuracy > 1 || loglk[ACvsBD] > loglk[ABvsCD] - closeLogLkLimit) {
7158 loglk[ACvsBD] = MLQuartetOptimize(profiles[0], profiles[2], profiles[1], profiles[3],
7159 NJ->nPos, NJ->transmat, &NJ->rates, /*IN/OUT*/lenACvsBD, /*pStarTest*/NULL,
7160 /*OUT*/site_likelihoods[ACvsBD]);
7163 if (mlAccuracy > 1 || loglk[ADvsBC] > loglk[ABvsCD] - closeLogLkLimit) {
7164 loglk[ADvsBC] = MLQuartetOptimize(profiles[0], profiles[3], profiles[2], profiles[1],
7165 NJ->nPos, NJ->transmat, &NJ->rates, /*IN/OUT*/lenADvsBC, /*pStarTest*/NULL,
7166 /*OUT*/site_likelihoods[ADvsBC]);
7170 if (loglk[ABvsCD] >= loglk[ACvsBD] && loglk[ABvsCD] >= loglk[ADvsBC])
7172 else if (loglk[ACvsBD] >= loglk[ABvsCD] && loglk[ACvsBD] >= loglk[ADvsBC])
7176 bool badSplit = loglk[choice] > loglk[ABvsCD] + treeLogLkDelta; /* ignore small changes in likelihood */
7178 /* constraint penalties, indexed by nni_t (lower is better) */
7180 QuartetConstraintPenalties(profiles, NJ->nConstraints, /*OUT*/p);
7181 bool bBadConstr = p[ABvsCD] > p[ACvsBD] + tolerance || p[ABvsCD] > p[ADvsBC] + tolerance;
7182 bool violateConstraint = false;
7184 for (iC=0; iC < NJ->nConstraints; iC++) {
7185 if (SplitViolatesConstraint(profiles, iC)) {
7186 violateConstraint = true;
7190 splitcount->nSplits++;
7191 if (violateConstraint)
7192 splitcount->nConstraintViolations++;
7194 splitcount->nBadSplits++;
7195 if (badSplit && bBadConstr)
7196 splitcount->nBadBoth++;
7198 double delta = loglk[choice] - loglk[ABvsCD];
7199 /* If ABvsCD is favored over the more likely NNI by constraints,
7200 then this is probably a bad split because of the constraint */
7201 if (p[choice] > p[ABvsCD] + tolerance)
7202 splitcount->dWorstDeltaConstrained = MAX(delta, splitcount->dWorstDeltaConstrained);
7204 splitcount->dWorstDeltaUnconstrained = MAX(delta, splitcount->dWorstDeltaUnconstrained);
7207 NJ->support[node] = badSplit ? 0.0 : SHSupport(NJ->nPos, nBootstrap, col, loglk, site_likelihoods);
7209 /* No longer needed */
7210 DeleteUpProfile(upProfiles, NJ, nodeABCD[0]);
7211 DeleteUpProfile(upProfiles, NJ, nodeABCD[1]);
7212 DeleteUpProfile(upProfiles, NJ, nodeABCD[2]);
7214 traversal = FreeTraversal(traversal,NJ);
7215 upProfiles = FreeUpProfiles(upProfiles,NJ);
7217 col = myfree(col, sizeof(int)*((size_t)NJ->nPos)*nBootstrap);
7218 for (choice = 0; choice < 3; choice++)
7219 site_likelihoods[choice] = myfree(site_likelihoods[choice], sizeof(double)*NJ->nPos);
7223 void TestSplitsMinEvo(NJ_t *NJ, /*OUT*/SplitCount_t *splitcount) {
7224 const double tolerance = 1e-6;
7225 splitcount->nBadSplits = 0;
7226 splitcount->nConstraintViolations = 0;
7227 splitcount->nBadBoth = 0;
7228 splitcount->nSplits = 0;
7229 splitcount->dWorstDeltaUnconstrained = 0.0;
7230 splitcount->dWorstDeltaConstrained = 0.0;
7232 profile_t **upProfiles = UpProfiles(NJ);
7233 traversal_t traversal = InitTraversal(NJ);
7234 int node = NJ->root;
7236 while((node = TraversePostorder(node, NJ, /*IN/OUT*/traversal, /*pUp*/NULL)) >= 0) {
7237 if (node < NJ->nSeq || node == NJ->root)
7238 continue; /* nothing to do for leaves or root */
7240 profile_t *profiles[4];
7242 SetupABCD(NJ, node, /*OUT*/profiles, /*IN/OUT*/upProfiles, /*OUT*/nodeABCD, /*useML*/false);
7245 fprintf(stderr,"Testing Split around %d: A=%d B=%d C=%d D=up(%d) or node parent %d\n",
7246 node, nodeABCD[0], nodeABCD[1], nodeABCD[2], nodeABCD[3], NJ->parent[node]);
7248 double d[6]; /* distances, perhaps log-corrected distances, no constraint penalties */
7249 CorrectedPairDistances(profiles, 4, NJ->distance_matrix, NJ->nPos, /*OUT*/d);
7251 /* alignment-based scores for each split (lower is better) */
7252 double sABvsCD = d[qAB] + d[qCD];
7253 double sACvsBD = d[qAC] + d[qBD];
7254 double sADvsBC = d[qAD] + d[qBC];
7256 /* constraint penalties, indexed by nni_t (lower is better) */
7258 QuartetConstraintPenalties(profiles, NJ->nConstraints, /*OUT*/p);
7260 int nConstraintsViolated = 0;
7262 for (iC=0; iC < NJ->nConstraints; iC++) {
7263 if (SplitViolatesConstraint(profiles, iC)) {
7264 nConstraintsViolated++;
7266 double penalty[3] = {0.0,0.0,0.0};
7267 (void)QuartetConstraintPenaltiesPiece(profiles, iC, /*OUT*/penalty);
7268 fprintf(stderr, "Violate constraint %d at %d (children %d %d) penalties %.3f %.3f %.3f %d/%d %d/%d %d/%d %d/%d\n",
7269 iC, node, NJ->child[node].child[0], NJ->child[node].child[1],
7270 penalty[ABvsCD], penalty[ACvsBD], penalty[ADvsBC],
7271 profiles[0]->nOn[iC], profiles[0]->nOff[iC],
7272 profiles[1]->nOn[iC], profiles[1]->nOff[iC],
7273 profiles[2]->nOn[iC], profiles[2]->nOff[iC],
7274 profiles[3]->nOn[iC], profiles[3]->nOff[iC]);
7279 double delta = sABvsCD - MIN(sACvsBD,sADvsBC);
7280 bool bBadDist = delta > tolerance;
7281 bool bBadConstr = p[ABvsCD] > p[ACvsBD] + tolerance || p[ABvsCD] > p[ADvsBC] + tolerance;
7283 splitcount->nSplits++;
7285 nni_t choice = sACvsBD < sADvsBC ? ACvsBD : ADvsBC;
7286 /* If ABvsCD is favored over the shorter NNI by constraints,
7287 then this is probably a bad split because of the constraint */
7288 if (p[choice] > p[ABvsCD] + tolerance)
7289 splitcount->dWorstDeltaConstrained = MAX(delta, splitcount->dWorstDeltaConstrained);
7291 splitcount->dWorstDeltaUnconstrained = MAX(delta, splitcount->dWorstDeltaUnconstrained);
7294 if (nConstraintsViolated > 0)
7295 splitcount->nConstraintViolations++; /* count splits with any violations, not #constraints in a splits */
7297 splitcount->nBadSplits++;
7298 if (bBadDist && bBadConstr)
7299 splitcount->nBadBoth++;
7300 if (bBadConstr && verbose > 2) {
7301 /* Which NNI would be better */
7302 double dist_advantage = 0;
7303 double constraint_penalty = 0;
7304 if (p[ACvsBD] < p[ADvsBC]) {
7305 dist_advantage = sACvsBD - sABvsCD;
7306 constraint_penalty = p[ABvsCD] - p[ACvsBD];
7308 dist_advantage = sADvsBC - sABvsCD;
7309 constraint_penalty = p[ABvsCD] - p[ADvsBC];
7311 fprintf(stderr, "Violate constraints %d distance_advantage %.3f constraint_penalty %.3f (children %d %d):",
7312 node, dist_advantage, constraint_penalty,
7313 NJ->child[node].child[0], NJ->child[node].child[1]);
7314 /* list the constraints with a penalty, meaning that ABCD all have non-zero
7315 values and that AB|CD worse than others */
7316 for (iC = 0; iC < NJ->nConstraints; iC++) {
7318 if (QuartetConstraintPenaltiesPiece(profiles, iC, /*OUT*/ppart)) {
7319 if (ppart[qAB] + ppart[qCD] > ppart[qAD] + ppart[qBC] + tolerance
7320 || ppart[qAB] + ppart[qCD] > ppart[qAC] + ppart[qBD] + tolerance)
7321 fprintf(stderr, " %d (%d/%d %d/%d %d/%d %d/%d)", iC,
7322 profiles[0]->nOn[iC], profiles[0]->nOff[iC],
7323 profiles[1]->nOn[iC], profiles[1]->nOff[iC],
7324 profiles[2]->nOn[iC], profiles[2]->nOff[iC],
7325 profiles[3]->nOn[iC], profiles[3]->nOff[iC]);
7328 fprintf(stderr, "\n");
7331 /* no longer needed */
7332 DeleteUpProfile(upProfiles, NJ, nodeABCD[0]);
7333 DeleteUpProfile(upProfiles, NJ, nodeABCD[1]);
7335 traversal = FreeTraversal(traversal,NJ);
7336 upProfiles = FreeUpProfiles(upProfiles,NJ);
7339 /* Computes support for (A,B),(C,D) compared to that for (A,C),(B,D) and (A,D),(B,C) */
7340 double SplitSupport(profile_t *pA, profile_t *pB, profile_t *pC, profile_t *pD,
7341 /*OPTIONAL*/distance_matrix_t *dmat,
7346 long lPos = nPos; /* to avoid overflow when multiplying */
7348 /* Note distpieces are weighted */
7349 double *distpieces[6];
7351 for (j = 0; j < 6; j++) {
7352 distpieces[j] = (double*)mymalloc(sizeof(double)*nPos);
7353 weights[j] = (double*)mymalloc(sizeof(double)*nPos);
7360 for (i = 0; i < nPos; i++) {
7361 numeric_t *fA = GET_FREQ(pA, i, /*IN/OUT*/iFreqA);
7362 numeric_t *fB = GET_FREQ(pB, i, /*IN/OUT*/iFreqB);
7363 numeric_t *fC = GET_FREQ(pC, i, /*IN/OUT*/iFreqC);
7364 numeric_t *fD = GET_FREQ(pD, i, /*IN/OUT*/iFreqD);
7366 weights[qAB][i] = pA->weights[i] * pB->weights[i];
7367 weights[qAC][i] = pA->weights[i] * pC->weights[i];
7368 weights[qAD][i] = pA->weights[i] * pD->weights[i];
7369 weights[qBC][i] = pB->weights[i] * pC->weights[i];
7370 weights[qBD][i] = pB->weights[i] * pD->weights[i];
7371 weights[qCD][i] = pC->weights[i] * pD->weights[i];
7373 distpieces[qAB][i] = weights[qAB][i] * ProfileDistPiece(pA->codes[i], pB->codes[i], fA, fB, dmat, NULL);
7374 distpieces[qAC][i] = weights[qAC][i] * ProfileDistPiece(pA->codes[i], pC->codes[i], fA, fC, dmat, NULL);
7375 distpieces[qAD][i] = weights[qAD][i] * ProfileDistPiece(pA->codes[i], pD->codes[i], fA, fD, dmat, NULL);
7376 distpieces[qBC][i] = weights[qBC][i] * ProfileDistPiece(pB->codes[i], pC->codes[i], fB, fC, dmat, NULL);
7377 distpieces[qBD][i] = weights[qBD][i] * ProfileDistPiece(pB->codes[i], pD->codes[i], fB, fD, dmat, NULL);
7378 distpieces[qCD][i] = weights[qCD][i] * ProfileDistPiece(pC->codes[i], pD->codes[i], fC, fD, dmat, NULL);
7380 assert(iFreqA == pA->nVectors);
7381 assert(iFreqB == pB->nVectors);
7382 assert(iFreqC == pC->nVectors);
7383 assert(iFreqD == pD->nVectors);
7385 double totpieces[6];
7386 double totweights[6];
7388 for (j = 0; j < 6; j++) {
7390 totweights[j] = 0.0;
7391 for (i = 0; i < nPos; i++) {
7392 totpieces[j] += distpieces[j][i];
7393 totweights[j] += weights[j][i];
7395 dists[j] = totweights[j] > 0.01 ? totpieces[j]/totweights[j] : 3.0;
7397 dists[j] = LogCorrect(dists[j]);
7400 /* Support1 = Support(AB|CD over AC|BD) = d(A,C)+d(B,D)-d(A,B)-d(C,D)
7401 Support2 = Support(AB|CD over AD|BC) = d(A,D)+d(B,C)-d(A,B)-d(C,D)
7403 double support1 = dists[qAC] + dists[qBD] - dists[qAB] - dists[qCD];
7404 double support2 = dists[qAD] + dists[qBC] - dists[qAB] - dists[qCD];
7406 if (support1 < 0 || support2 < 0) {
7407 nSuboptimalSplits++; /* Another split seems superior */
7410 assert(nBootstrap > 0);
7414 for (iBoot=0;iBoot<nBootstrap;iBoot++) {
7415 int *colw = &col[lPos*iBoot];
7417 for (j = 0; j < 6; j++) {
7420 double *d = distpieces[j];
7421 double *w = weights[j];
7422 for (i=0; i<nPos; i++) {
7427 dists[j] = totw > 0.01 ? totp/totw : 3.0;
7429 dists[j] = LogCorrect(dists[j]);
7431 support1 = dists[qAC] + dists[qBD] - dists[qAB] - dists[qCD];
7432 support2 = dists[qAD] + dists[qBC] - dists[qAB] - dists[qCD];
7433 if (support1 > 0 && support2 > 0)
7435 } /* end loop over bootstrap replicates */
7437 for (j = 0; j < 6; j++) {
7438 distpieces[j] = myfree(distpieces[j], sizeof(double)*nPos);
7439 weights[j] = myfree(weights[j], sizeof(double)*nPos);
7441 return( nSupport/(double)nBootstrap );
7444 double SHSupport(int nPos, int nBootstrap, int *col, double loglk[3], double *site_likelihoods[3]) {
7445 long lPos = nPos; /* to avoid overflow when multiplying */
7446 assert(nBootstrap>0);
7447 double delta1 = loglk[0]-loglk[1];
7448 double delta2 = loglk[0]-loglk[2];
7449 double delta = delta1 < delta2 ? delta1 : delta2;
7451 double *siteloglk[3];
7453 for (i = 0; i < 3; i++) {
7454 siteloglk[i] = mymalloc(sizeof(double)*nPos);
7455 for (j = 0; j < nPos; j++)
7456 siteloglk[i][j] = log(site_likelihoods[i][j]);
7461 for (iBoot = 0; iBoot < nBootstrap; iBoot++) {
7462 double resampled[3];
7463 for (i = 0; i < 3; i++)
7464 resampled[i] = -loglk[i];
7465 for (j = 0; j < nPos; j++) {
7466 int pos = col[iBoot*lPos+j];
7467 for (i = 0; i < 3; i++)
7468 resampled[i] += siteloglk[i][pos];
7471 for (i = 1; i < 3; i++)
7472 if (resampled[i] > resampled[iBest])
7474 double resample1 = resampled[iBest] - resampled[(iBest+1)%3];
7475 double resample2 = resampled[iBest] - resampled[(iBest+2)%3];
7476 double resampleDelta = resample1 < resample2 ? resample1 : resample2;
7477 if (resampleDelta < delta)
7481 siteloglk[i] = myfree(siteloglk[i], sizeof(double)*nPos);
7482 return(nSupport/(double)nBootstrap);
7486 void SetDistCriterion(/*IN/OUT*/NJ_t *NJ, int nActive, /*IN/OUT*/besthit_t *hit) {
7487 if (hit->i < NJ->nSeq && hit->j < NJ->nSeq) {
7488 SeqDist(NJ->profiles[hit->i]->codes,
7489 NJ->profiles[hit->j]->codes,
7490 NJ->nPos, NJ->distance_matrix, /*OUT*/hit);
7492 ProfileDist(NJ->profiles[hit->i],
7493 NJ->profiles[hit->j],
7494 NJ->nPos, NJ->distance_matrix, /*OUT*/hit);
7495 hit->dist -= (NJ->diameter[hit->i] + NJ->diameter[hit->j]);
7497 hit->dist += constraintWeight
7498 * (double)JoinConstraintPenalty(NJ, hit->i, hit->j);
7499 SetCriterion(NJ,nActive,/*IN/OUT*/hit);
7502 void SetCriterion(/*IN/UPDATE*/NJ_t *NJ, int nActive, /*IN/OUT*/besthit_t *join) {
7505 || NJ->parent[join->i] >= 0
7506 || NJ->parent[join->j] >= 0)
7508 assert(NJ->nOutDistActive[join->i] >= nActive);
7509 assert(NJ->nOutDistActive[join->j] >= nActive);
7511 int nDiffAllow = tophitsMult > 0 ? (int)(nActive*staleOutLimit) : 0;
7512 if (NJ->nOutDistActive[join->i] - nActive > nDiffAllow)
7513 SetOutDistance(NJ, join->i, nActive);
7514 if (NJ->nOutDistActive[join->j] - nActive > nDiffAllow)
7515 SetOutDistance(NJ, join->j, nActive);
7516 double outI = NJ->outDistances[join->i];
7517 if (NJ->nOutDistActive[join->i] != nActive)
7518 outI *= (nActive-1)/(double)(NJ->nOutDistActive[join->i]-1);
7519 double outJ = NJ->outDistances[join->j];
7520 if (NJ->nOutDistActive[join->j] != nActive)
7521 outJ *= (nActive-1)/(double)(NJ->nOutDistActive[join->j]-1);
7522 join->criterion = join->dist - (outI+outJ)/(double)(nActive-2);
7523 if (verbose > 2 && nActive <= 5) {
7524 fprintf(stderr, "Set Criterion to join %d %d with nActive=%d dist+penalty %.3f criterion %.3f\n",
7525 join->i, join->j, nActive, join->dist, join->criterion);
7529 void SetOutDistance(NJ_t *NJ, int iNode, int nActive) {
7530 if (NJ->nOutDistActive[iNode] == nActive)
7533 /* May be called by InitNJ before we have parents */
7534 assert(iNode>=0 && (NJ->parent == NULL || NJ->parent[iNode]<0));
7536 ProfileDist(NJ->profiles[iNode], NJ->outprofile, NJ->nPos, NJ->distance_matrix, &dist);
7539 /* out(A) = sum(X!=A) d(A,X)
7540 = sum(X!=A) (profiledist(A,X) - diam(A) - diam(X))
7541 = sum(X!=A) profiledist(A,X) - (N-1)*diam(A) - (totdiam - diam(A))
7543 in the absence of gaps:
7544 profiledist(A,out) = mean profiledist(A, all active nodes)
7545 sum(X!=A) profiledist(A,X) = N * profiledist(A,out) - profiledist(A,A)
7547 With gaps, we need to take the weights of the comparisons into account, where
7548 w(Ai) is the weight of position i in profile A:
7549 w(A,B) = sum_i w(Ai) * w(Bi)
7550 d(A,B) = sum_i w(Ai) * w(Bi) * d(Ai,Bi) / w(A,B)
7552 sum(X!=A) profiledist(A,X) ~= (N-1) * profiledist(A, Out w/o A)
7553 profiledist(A, Out w/o A) = sum_X!=A sum_i d(Ai,Xi) * w(Ai) * w(Bi) / ( sum_X!=A sum_i w(Ai) * w(Bi) )
7554 d(A, Out) = sum_A sum_i d(Ai,Xi) * w(Ai) * w(Bi) / ( sum_X sum_i w(Ai) * w(Bi) )
7557 profiledist(A,out w/o A) = (top of d(A,Out) - top of d(A,A)) / (weight of d(A,Out) - weight of d(A,A))
7559 with another correction of nActive because the weight of the out-profile is the average
7560 weight not the total weight.
7562 double top = (nActive-1)
7563 * (dist.dist * dist.weight * nActive - NJ->selfweight[iNode] * NJ->selfdist[iNode]);
7564 double bottom = (dist.weight * nActive - NJ->selfweight[iNode]);
7565 double pdistOutWithoutA = top/bottom;
7566 NJ->outDistances[iNode] = bottom > 0.01 ?
7567 pdistOutWithoutA - NJ->diameter[iNode] * (nActive-1) - (NJ->totdiam - NJ->diameter[iNode])
7569 NJ->nOutDistActive[iNode] = nActive;
7571 if(verbose>3 && iNode < 5)
7572 fprintf(stderr,"NewOutDist for %d %f from dist %f selfd %f diam %f totdiam %f newActive %d\n",
7573 iNode, NJ->outDistances[iNode], dist.dist, NJ->selfdist[iNode], NJ->diameter[iNode],
7574 NJ->totdiam, nActive);
7575 if (verbose>6 && (iNode % 10) == 0) {
7576 /* Compute the actual out-distance and compare */
7578 double total_pd = 0.0;
7580 for (j=0;j<NJ->maxnode;j++) {
7581 if (j!=iNode && (NJ->parent==NULL || NJ->parent[j]<0)) {
7583 ProfileDist(NJ->profiles[iNode], NJ->profiles[j], NJ->nPos, NJ->distance_matrix, /*OUT*/&bh);
7584 total_pd += bh.dist;
7585 total += bh.dist - (NJ->diameter[iNode] + NJ->diameter[j]);
7588 fprintf(stderr,"OutDist for Node %d %f truth %f profiled %f truth %f pd_err %f\n",
7589 iNode, NJ->outDistances[iNode], total, pdistOutWithoutA, total_pd,fabs(pdistOutWithoutA-total_pd));
7593 top_hits_t *FreeTopHits(top_hits_t *tophits) {
7594 if (tophits == NULL)
7597 for (iNode = 0; iNode < tophits->maxnodes; iNode++) {
7598 top_hits_list_t *l = &tophits->top_hits_lists[iNode];
7599 if (l->hits != NULL)
7600 l->hits = myfree(l->hits, sizeof(hit_t) * l->nHits);
7602 tophits->top_hits_lists = myfree(tophits->top_hits_lists, sizeof(top_hits_list_t) * tophits->maxnodes);
7603 tophits->visible = myfree(tophits->visible, sizeof(hit_t*) * tophits->maxnodes);
7604 tophits->topvisible = myfree(tophits->topvisible, sizeof(int) * tophits->nTopVisible);
7606 for (iNode = 0; iNode < tophits->maxnodes; iNode++)
7607 omp_destroy_lock(&tophits->locks[iNode]);
7608 tophits->locks = myfree(tophits->locks, sizeof(omp_lock_t) * tophits->maxnodes);
7610 return(myfree(tophits, sizeof(top_hits_t)));
7613 top_hits_t *InitTopHits(NJ_t *NJ, int m) {
7616 top_hits_t *tophits = mymalloc(sizeof(top_hits_t));
7618 tophits->q = (int)(0.5 + tophits2Mult * sqrt(tophits->m));
7619 if (!useTopHits2nd || tophits->q >= tophits->m)
7621 tophits->maxnodes = NJ->maxnodes;
7622 tophits->top_hits_lists = mymalloc(sizeof(top_hits_list_t) * tophits->maxnodes);
7623 tophits->visible = mymalloc(sizeof(hit_t) * tophits->maxnodes);
7624 tophits->nTopVisible = (int)(0.5 + topvisibleMult*m);
7625 tophits->topvisible = mymalloc(sizeof(int) * tophits->nTopVisible);
7627 tophits->locks = mymalloc(sizeof(omp_lock_t) * tophits->maxnodes);
7628 for (iNode = 0; iNode < tophits->maxnodes; iNode++)
7629 omp_init_lock(&tophits->locks[iNode]);
7632 for (i = 0; i < tophits->nTopVisible; i++)
7633 tophits->topvisible[i] = -1; /* empty */
7634 tophits->topvisibleAge = 0;
7636 for (iNode = 0; iNode < tophits->maxnodes; iNode++) {
7637 top_hits_list_t *l = &tophits->top_hits_lists[iNode];
7642 hit_t *v = &tophits->visible[iNode];
7649 /* Helper function for sorting in SetAllLeafTopHits,
7650 and the global variables it needs
7652 NJ_t *CompareSeedNJ = NULL;
7653 int *CompareSeedGaps = NULL;
7654 int CompareSeeds(const void *c1, const void *c2) {
7655 int seed1 = *(int *)c1;
7656 int seed2 = *(int *)c2;
7657 int gapdiff = CompareSeedGaps[seed1] - CompareSeedGaps[seed2];
7658 if (gapdiff != 0) return(gapdiff); /* fewer gaps is better */
7659 double outdiff = CompareSeedNJ->outDistances[seed1] - CompareSeedNJ->outDistances[seed2];
7660 if(outdiff < 0) return(-1); /* closer to more nodes is better */
7661 if(outdiff > 0) return(1);
7665 /* Using the seed heuristic and the close global variable */
7666 void SetAllLeafTopHits(/*IN/UPDATE*/NJ_t *NJ, /*IN/OUT*/top_hits_t *tophits) {
7667 double close = tophitsClose;
7669 if (fastest && NJ->nSeq >= 50000) {
7672 double logN = log((double)NJ->nSeq)/log(2.0);
7673 close = logN/(logN+2.0);
7676 /* Sort the potential seeds, by a combination of nGaps and NJ->outDistances
7677 We don't store nGaps so we need to compute that
7679 int *nGaps = (int*)mymalloc(sizeof(int)*NJ->nSeq);
7681 for(iNode=0; iNode<NJ->nSeq; iNode++) {
7682 nGaps[iNode] = (int)(0.5 + NJ->nPos - NJ->selfweight[iNode]);
7684 int *seeds = (int*)mymalloc(sizeof(int)*NJ->nSeq);
7685 for (iNode=0; iNode<NJ->nSeq; iNode++) seeds[iNode] = iNode;
7687 CompareSeedGaps = nGaps;
7688 qsort(/*IN/OUT*/seeds, NJ->nSeq, sizeof(int), CompareSeeds);
7689 CompareSeedNJ = NULL;
7690 CompareSeedGaps = NULL;
7692 /* For each seed, save its top 2*m hits and then look for close neighbors */
7693 assert(2 * tophits->m <= NJ->nSeq);
7695 int nHasTopHits = 0;
7697 #pragma omp parallel for schedule(dynamic, 50)
7699 for(iSeed=0; iSeed < NJ->nSeq; iSeed++) {
7700 int seed = seeds[iSeed];
7701 if (iSeed > 0 && (iSeed % 100) == 0) {
7703 #pragma omp critical
7705 ProgressReport("Top hits for %6d of %6d seqs (at seed %6d)",
7706 nHasTopHits, NJ->nSeq,
7709 if (tophits->top_hits_lists[seed].nHits > 0) {
7710 if(verbose>2) fprintf(stderr, "Skipping seed %d\n", seed);
7714 besthit_t *besthitsSeed = (besthit_t*)mymalloc(sizeof(besthit_t)*NJ->nSeq);
7715 besthit_t *besthitsNeighbor = (besthit_t*)mymalloc(sizeof(besthit_t) * 2 * tophits->m);
7718 if(verbose>2) fprintf(stderr,"Trying seed %d\n", seed);
7719 SetBestHit(seed, NJ, /*nActive*/NJ->nSeq, /*OUT*/&bestjoin, /*OUT*/besthitsSeed);
7721 /* sort & save top hits of self. besthitsSeed is now sorted. */
7722 SortSaveBestHits(seed, /*IN/SORT*/besthitsSeed, /*IN-SIZE*/NJ->nSeq,
7723 /*OUT-SIZE*/tophits->m, /*IN/OUT*/tophits);
7726 /* find "close" neighbors and compute their top hits */
7727 double neardist = besthitsSeed[2 * tophits->m - 1].dist * close;
7728 /* must have at least average weight, rem higher is better
7729 and allow a bit more than average, e.g. if we are looking for within 30% away,
7730 20% more gaps than usual seems OK
7731 Alternatively, have a coverage requirement in case neighbor is short
7732 If fastest, consider the top q/2 hits to be close neighbors, regardless
7734 double nearweight = 0;
7736 for (iClose = 0; iClose < 2 * tophits->m; iClose++)
7737 nearweight += besthitsSeed[iClose].weight;
7738 nearweight = nearweight/(2.0 * tophits->m); /* average */
7739 nearweight *= (1.0-2.0*neardist/3.0);
7740 double nearcover = 1.0 - neardist/2.0;
7742 if(verbose>2) fprintf(stderr,"Distance limit for close neighbors %f weight %f ungapped %d\n",
7743 neardist, nearweight, NJ->nPos-nGaps[seed]);
7744 for (iClose = 0; iClose < tophits->m; iClose++) {
7745 besthit_t *closehit = &besthitsSeed[iClose];
7746 int closeNode = closehit->j;
7747 if (tophits->top_hits_lists[closeNode].nHits > 0)
7750 /* If within close-distance, or identical, use as close neighbor */
7751 bool close = closehit->dist <= neardist
7752 && (closehit->weight >= nearweight
7753 || closehit->weight >= (NJ->nPos-nGaps[closeNode])*nearcover);
7754 bool identical = closehit->dist < 1e-6
7755 && fabs(closehit->weight - (NJ->nPos - nGaps[seed])) < 1e-5
7756 && fabs(closehit->weight - (NJ->nPos - nGaps[closeNode])) < 1e-5;
7757 if (useTopHits2nd && iClose < tophits->q && (close || identical)) {
7760 int nUse = MIN(tophits->q * tophits2Safety, 2 * tophits->m);
7761 besthit_t *besthitsClose = mymalloc(sizeof(besthit_t) * nUse);
7762 TransferBestHits(NJ, /*nActive*/NJ->nSeq,
7764 /*IN*/besthitsSeed, /*SIZE*/nUse,
7765 /*OUT*/besthitsClose,
7766 /*updateDistance*/true);
7767 SortSaveBestHits(closeNode, /*IN/SORT*/besthitsClose,
7768 /*IN-SIZE*/nUse, /*OUT-SIZE*/tophits->q,
7770 tophits->top_hits_lists[closeNode].hitSource = seed;
7771 besthitsClose = myfree(besthitsClose, sizeof(besthit_t) * nUse);
7772 } else if (close || identical || (fastest && iClose < (tophits->q+1)/2)) {
7775 if(verbose>2) fprintf(stderr, "Near neighbor %d (rank %d weight %f ungapped %d %d)\n",
7776 closeNode, iClose, besthitsSeed[iClose].weight,
7777 NJ->nPos-nGaps[seed],
7778 NJ->nPos-nGaps[closeNode]);
7780 /* compute top 2*m hits */
7781 TransferBestHits(NJ, /*nActive*/NJ->nSeq,
7783 /*IN*/besthitsSeed, /*SIZE*/2 * tophits->m,
7784 /*OUT*/besthitsNeighbor,
7785 /*updateDistance*/true);
7786 SortSaveBestHits(closeNode, /*IN/SORT*/besthitsNeighbor,
7787 /*IN-SIZE*/2 * tophits->m, /*OUT-SIZE*/tophits->m,
7790 /* And then try for a second level of transfer. We assume we
7791 are in a good area, because of the 1st
7792 level of transfer, and in a small neighborhood, because q is
7793 small (32 for 1 million sequences), so we do not make any close checks.
7796 for (iClose2 = 0; iClose2 < tophits->q && iClose2 < 2 * tophits->m; iClose2++) {
7797 int closeNode2 = besthitsNeighbor[iClose2].j;
7798 assert(closeNode2 >= 0);
7799 if (tophits->top_hits_lists[closeNode2].hits == NULL) {
7802 int nUse = MIN(tophits->q * tophits2Safety, 2 * tophits->m);
7803 besthit_t *besthitsClose2 = mymalloc(sizeof(besthit_t) * nUse);
7804 TransferBestHits(NJ, /*nActive*/NJ->nSeq,
7806 /*IN*/besthitsNeighbor, /*SIZE*/nUse,
7807 /*OUT*/besthitsClose2,
7808 /*updateDistance*/true);
7809 SortSaveBestHits(closeNode2, /*IN/SORT*/besthitsClose2,
7810 /*IN-SIZE*/nUse, /*OUT-SIZE*/tophits->q,
7812 tophits->top_hits_lists[closeNode2].hitSource = closeNode;
7813 besthitsClose2 = myfree(besthitsClose2, sizeof(besthit_t) * nUse);
7814 } /* end if should do 2nd-level transfer */
7817 } /* end loop over close candidates */
7818 besthitsSeed = myfree(besthitsSeed, sizeof(besthit_t)*NJ->nSeq);
7819 besthitsNeighbor = myfree(besthitsNeighbor, sizeof(besthit_t) * 2 * tophits->m);
7820 } /* end loop over seeds */
7822 for (iNode=0; iNode<NJ->nSeq; iNode++) {
7823 top_hits_list_t *l = &tophits->top_hits_lists[iNode];
7824 assert(l->hits != NULL);
7825 assert(l->hits[0].j >= 0);
7826 assert(l->hits[0].j < NJ->nSeq);
7827 assert(l->hits[0].j != iNode);
7828 tophits->visible[iNode] = l->hits[0];
7831 if (verbose >= 2) fprintf(stderr, "#Close neighbors among leaves: 1st-level %ld 2nd-level %ld seeds %ld\n",
7832 nCloseUsed, nClose2Used, NJ->nSeq-nCloseUsed-nClose2Used);
7833 nGaps = myfree(nGaps, sizeof(int)*NJ->nSeq);
7834 seeds = myfree(seeds, sizeof(int)*NJ->nSeq);
7836 /* Now add a "checking phase" where we ensure that the q or 2*sqrt(m) hits
7837 of i are represented in j (if they should be)
7840 int nCheck = tophits->q > 0 ? tophits->q : (int)(0.5 + 2.0*sqrt(tophits->m));
7841 for (iNode = 0; iNode < NJ->nSeq; iNode++) {
7842 if ((iNode % 100) == 0)
7843 ProgressReport("Checking top hits for %6d of %6d seqs",
7844 iNode+1, NJ->nSeq, 0, 0);
7845 top_hits_list_t *lNode = &tophits->top_hits_lists[iNode];
7847 for (iHit = 0; iHit < nCheck && iHit < lNode->nHits; iHit++) {
7848 besthit_t bh = HitToBestHit(iNode, lNode->hits[iHit]);
7849 SetCriterion(NJ, /*nActive*/NJ->nSeq, /*IN/OUT*/&bh);
7850 top_hits_list_t *lTarget = &tophits->top_hits_lists[bh.j];
7852 /* If this criterion is worse than the nCheck-1 entry of the target,
7853 then skip the check.
7854 This logic is based on assuming that the list is sorted,
7855 which is true initially but may not be true later.
7856 Still, is a good heuristic.
7859 assert(nCheck <= lTarget->nHits);
7860 besthit_t bhCheck = HitToBestHit(bh.j, lTarget->hits[nCheck-1]);
7861 SetCriterion(NJ, /*nActive*/NJ->nSeq, /*IN/OUT*/&bhCheck);
7862 if (bhCheck.criterion < bh.criterion)
7863 continue; /* no check needed */
7865 /* Check if this is present in the top-hit list */
7867 bool bFound = false;
7868 for (iHit2 = 0; iHit2 < lTarget->nHits && !bFound; iHit2++)
7869 if (lTarget->hits[iHit2].j == iNode)
7872 /* Find the hit with the worst criterion and replace it with this one */
7874 double dWorstCriterion = -1e20;
7875 for (iHit2 = 0; iHit2 < lTarget->nHits; iHit2++) {
7876 besthit_t bh2 = HitToBestHit(bh.j, lTarget->hits[iHit2]);
7877 SetCriterion(NJ, /*nActive*/NJ->nSeq, /*IN/OUT*/&bh2);
7878 if (bh2.criterion > dWorstCriterion) {
7880 dWorstCriterion = bh2.criterion;
7883 if (dWorstCriterion > bh.criterion) {
7884 assert(iWorst >= 0);
7885 lTarget->hits[iWorst].j = iNode;
7886 lTarget->hits[iWorst].dist = bh.dist;
7888 /* and perhaps update visible */
7890 bool bSuccess = GetVisible(NJ, /*nActive*/NJ->nSeq, tophits, bh.j, /*OUT*/&v);
7892 if (bh.criterion < v.criterion)
7893 tophits->visible[bh.j] = lTarget->hits[iWorst];
7900 fprintf(stderr, "Replaced %ld top hit entries\n", lReplace);
7903 /* Updates out-distances but does not reset or update visible set */
7904 void GetBestFromTopHits(int iNode,
7905 /*IN/UPDATE*/NJ_t *NJ,
7907 /*IN*/top_hits_t *tophits,
7908 /*OUT*/besthit_t *bestjoin) {
7910 assert(NJ->parent[iNode] < 0);
7911 top_hits_list_t *l = &tophits->top_hits_lists[iNode];
7912 assert(l->nHits > 0);
7913 assert(l->hits != NULL);
7916 SetOutDistance(NJ, iNode, nActive); /* ensure out-distances are not stale */
7920 bestjoin->dist = 1e20;
7921 bestjoin->criterion = 1e20;
7924 for(iBest=0; iBest < l->nHits; iBest++) {
7925 besthit_t bh = HitToBestHit(iNode, l->hits[iBest]);
7926 if (UpdateBestHit(/*IN/UPDATE*/NJ, nActive, /*IN/OUT*/&bh, /*update dist*/true)) {
7927 SetCriterion(/*IN/OUT*/NJ, nActive, /*IN/OUT*/&bh); /* make sure criterion is correct */
7928 if (bh.criterion < bestjoin->criterion)
7932 assert(bestjoin->j >= 0); /* a hit was found */
7933 assert(bestjoin->i == iNode);
7936 int ActiveAncestor(/*IN*/NJ_t *NJ, int iNode) {
7939 while(NJ->parent[iNode] >= 0)
7940 iNode = NJ->parent[iNode];
7944 bool UpdateBestHit(/*IN/UPDATE*/NJ_t *NJ, int nActive, /*IN/OUT*/besthit_t *hit,
7946 int i = ActiveAncestor(/*IN*/NJ, hit->i);
7947 int j = ActiveAncestor(/*IN*/NJ, hit->j);
7948 if (i < 0 || j < 0 || i == j) {
7953 hit->criterion = 1e20;
7956 if (i != hit->i || j != hit->j) {
7960 SetDistCriterion(/*IN/UPDATE*/NJ, nActive, /*IN/OUT*/hit);
7963 hit->criterion = 1e20;
7969 bool GetVisible(/*IN/UPDATE*/NJ_t *NJ, int nActive,
7970 /*IN/OUT*/top_hits_t *tophits,
7971 int iNode, /*OUT*/besthit_t *visible) {
7972 if (iNode < 0 || NJ->parent[iNode] >= 0)
7974 hit_t *v = &tophits->visible[iNode];
7975 if (v->j < 0 || NJ->parent[v->j] >= 0)
7977 *visible = HitToBestHit(iNode, *v);
7978 SetCriterion(/*IN/UPDATE*/NJ, nActive, /*IN/OUT*/visible);
7982 besthit_t *UniqueBestHits(/*IN/UPDATE*/NJ_t *NJ, int nActive,
7983 /*IN/SORT*/besthit_t *combined, int nCombined,
7984 /*OUT*/int *nUniqueOut) {
7986 for (iHit = 0; iHit < nCombined; iHit++) {
7987 besthit_t *hit = &combined[iHit];
7988 UpdateBestHit(/*IN/UPDATE*/NJ, nActive, /*IN/OUT*/hit, /*update*/false);
7990 qsort(/*IN/OUT*/combined, nCombined, sizeof(besthit_t), CompareHitsByIJ);
7992 besthit_t *uniqueList = (besthit_t*)mymalloc(sizeof(besthit_t)*nCombined);
7994 int iSavedLast = -1;
7996 /* First build the new list */
7997 for (iHit = 0; iHit < nCombined; iHit++) {
7998 besthit_t *hit = &combined[iHit];
7999 if (hit->i < 0 || hit->j < 0)
8001 if (iSavedLast >= 0) {
8002 /* toss out duplicates */
8003 besthit_t *saved = &combined[iSavedLast];
8004 if (saved->i == hit->i && saved->j == hit->j)
8007 assert(nUnique < nCombined);
8008 assert(hit->j >= 0 && NJ->parent[hit->j] < 0);
8009 uniqueList[nUnique++] = *hit;
8012 *nUniqueOut = nUnique;
8014 /* Then do any updates to the criterion or the distances in parallel */
8016 #pragma omp parallel for schedule(dynamic, 50)
8018 for (iHit = 0; iHit < nUnique; iHit++) {
8019 besthit_t *hit = &uniqueList[iHit];
8020 if (hit->dist < 0.0)
8021 SetDistCriterion(/*IN/UPDATE*/NJ, nActive, /*IN/OUT*/hit);
8023 SetCriterion(/*IN/UPDATE*/NJ, nActive, /*IN/OUT*/hit);
8029 Create a top hit list for the new node, either
8030 from children (if there are enough best hits left) or by a "refresh"
8031 Also set visible set for newnode
8032 Also update visible set for other nodes if we stumble across a "better" hit
8035 void TopHitJoin(int newnode,
8036 /*IN/UPDATE*/NJ_t *NJ,
8038 /*IN/OUT*/top_hits_t *tophits) {
8039 long startProfileOps = profileOps;
8040 long startOutProfileOps = outprofileOps;
8041 assert(NJ->child[newnode].nChild == 2);
8042 top_hits_list_t *lNew = &tophits->top_hits_lists[newnode];
8043 assert(lNew->hits == NULL);
8047 top_hits_list_t *lChild[2];
8048 for (i = 0; i< 2; i++) {
8049 lChild[i] = &tophits->top_hits_lists[NJ->child[newnode].child[i]];
8050 assert(lChild[i]->hits != NULL && lChild[i]->nHits > 0);
8052 int nCombined = lChild[0]->nHits + lChild[1]->nHits;
8053 besthit_t *combinedList = (besthit_t*)mymalloc(sizeof(besthit_t)*nCombined);
8054 HitsToBestHits(lChild[0]->hits, lChild[0]->nHits, NJ->child[newnode].child[0],
8055 /*OUT*/combinedList);
8056 HitsToBestHits(lChild[1]->hits, lChild[1]->nHits, NJ->child[newnode].child[1],
8057 /*OUT*/combinedList + lChild[0]->nHits);
8059 /* UniqueBestHits() replaces children (used in the calls to HitsToBestHits)
8060 with active ancestors, so all distances & criteria will be recomputed */
8061 besthit_t *uniqueList = UniqueBestHits(/*IN/UPDATE*/NJ, nActive,
8062 /*IN/SORT*/combinedList,
8065 int nUniqueAlloc = nCombined;
8066 combinedList = myfree(combinedList, sizeof(besthit_t)*nCombined);
8068 /* Forget the top-hit lists of the joined nodes */
8069 for (i = 0; i < 2; i++) {
8070 lChild[i]->hits = myfree(lChild[i]->hits, sizeof(hit_t) * lChild[i]->nHits);
8071 lChild[i]->nHits = 0;
8074 /* Use the average age, rounded up, by 1 Versions 2.0 and earlier
8075 used the maximum age, which leads to more refreshes without
8076 improving the accuracy of the NJ phase. Intuitively, if one of
8077 them was just refreshed then another refresh is unlikely to help.
8079 lNew->age = (lChild[0]->age+lChild[1]->age+1)/2 + 1;
8081 /* If top hit ages always match (perfectly balanced), then a
8082 limit of log2(m) would mean a refresh after
8083 m joins, which is about what we want.
8085 int tophitAgeLimit = MAX(1, (int)(0.5 + log((double)tophits->m)/log(2.0)));
8087 /* Either use the merged list as candidate top hits, or
8088 move from 2nd level to 1st level, or do a refresh
8089 UniqueBestHits eliminates hits to self, so if nUnique==nActive-1,
8090 we've already done the exhaustive search.
8092 Either way, we set tophits, visible(newnode), update visible of its top hits,
8093 and modify topvisible: if we do a refresh, then we reset it, otherwise we update
8095 bool bSecondLevel = lChild[0]->hitSource >= 0 && lChild[1]->hitSource >= 0;
8096 bool bUseUnique = nUnique==nActive-1
8097 || (lNew->age <= tophitAgeLimit
8098 && nUnique >= (bSecondLevel ? (int)(0.5 + tophits2Refresh * tophits->q)
8099 : (int)(0.5 + tophits->m * tophitsRefresh) ));
8100 if (bUseUnique && verbose > 2)
8101 fprintf(stderr,"Top hits for %d from combined %d nActive=%d tophitsage %d %s\n",
8102 newnode,nUnique,nActive,lNew->age,
8103 bSecondLevel ? "2ndlevel" : "1stlevel");
8107 && lNew->age <= tophitAgeLimit) {
8108 int source = ActiveAncestor(NJ, lChild[0]->hitSource);
8109 if (source == newnode)
8110 source = ActiveAncestor(NJ, lChild[1]->hitSource);
8111 /* In parallel mode, it is possible that we would select a node as the
8112 hit-source and then over-write that top hit with a short list.
8113 So we need this sanity check.
8115 if (source != newnode
8117 && tophits->top_hits_lists[source].hitSource < 0) {
8119 /* switch from 2nd-level to 1st-level top hits -- compute top hits list
8120 of node from what we have so far plus the active source plus its top hits */
8121 top_hits_list_t *lSource = &tophits->top_hits_lists[source];
8122 assert(lSource->hitSource < 0);
8123 assert(lSource->nHits > 0);
8124 int nMerge = 1 + lSource->nHits + nUnique;
8125 besthit_t *mergeList = mymalloc(sizeof(besthit_t) * nMerge);
8126 memcpy(/*to*/mergeList, /*from*/uniqueList, nUnique * sizeof(besthit_t));
8128 int iMerge = nUnique;
8129 mergeList[iMerge].i = newnode;
8130 mergeList[iMerge].j = source;
8131 SetDistCriterion(NJ, nActive, /*IN/OUT*/&mergeList[iMerge]);
8133 HitsToBestHits(lSource->hits, lSource->nHits, newnode, /*OUT*/mergeList+iMerge);
8134 for (i = 0; i < lSource->nHits; i++) {
8135 SetDistCriterion(NJ, nActive, /*IN/OUT*/&mergeList[iMerge]);
8138 assert(iMerge == nMerge);
8140 uniqueList = myfree(uniqueList, nUniqueAlloc * sizeof(besthit_t));
8141 uniqueList = UniqueBestHits(/*IN/UPDATE*/NJ, nActive,
8142 /*IN/SORT*/mergeList,
8145 nUniqueAlloc = nMerge;
8146 mergeList = myfree(mergeList, sizeof(besthit_t)*nMerge);
8148 assert(nUnique > 0);
8149 bUseUnique = nUnique >= (int)(0.5 + tophits->m * tophitsRefresh);
8150 bSecondLevel = false;
8152 if (bUseUnique && verbose > 2)
8153 fprintf(stderr, "Top hits for %d from children and source %d's %d hits, nUnique %d\n",
8154 newnode, source, lSource->nHits, nUnique);
8160 /* pick arbitrarily */
8161 lNew->hitSource = lChild[0]->hitSource;
8163 int nSave = MIN(nUnique, bSecondLevel ? tophits->q : tophits->m);
8166 fprintf(stderr, "Combined %d ops so far %ld\n", nUnique, profileOps - startProfileOps);
8167 SortSaveBestHits(newnode, /*IN/SORT*/uniqueList, /*nIn*/nUnique,
8168 /*nOut*/nSave, /*IN/OUT*/tophits);
8169 assert(lNew->hits != NULL); /* set by sort/save */
8170 tophits->visible[newnode] = lNew->hits[0];
8171 UpdateTopVisible(/*IN*/NJ, nActive, newnode, &tophits->visible[newnode],
8173 UpdateVisible(/*IN/UPDATE*/NJ, nActive, /*IN*/uniqueList, nSave, /*IN/OUT*/tophits);
8175 /* need to refresh: set top hits for node and for its top hits */
8176 if(verbose > 2) fprintf(stderr,"Top hits for %d by refresh (%d unique age %d) nActive=%d\n",
8177 newnode,nUnique,lNew->age,nActive);
8182 /* ensure all out-distances are up to date ahead of time
8183 to avoid any data overwriting issues.
8186 #pragma omp parallel for schedule(dynamic, 50)
8188 for (iNode = 0; iNode < NJ->maxnode; iNode++) {
8189 if (NJ->parent[iNode] < 0) {
8195 SetCriterion(/*IN/UPDATE*/NJ, nActive, &bh);
8197 SetOutDistance(/*IN/UDPATE*/NJ, iNode, nActive);
8202 /* exhaustively get the best 2*m hits for newnode, set visible, and save the top m */
8203 besthit_t *allhits = (besthit_t*)mymalloc(sizeof(besthit_t)*NJ->maxnode);
8204 assert(2 * tophits->m <= NJ->maxnode);
8206 SetBestHit(newnode, NJ, nActive, /*OUT*/&bh, /*OUT*/allhits);
8207 qsort(/*IN/OUT*/allhits, NJ->maxnode, sizeof(besthit_t), CompareHitsByCriterion);
8208 SortSaveBestHits(newnode, /*IN/SORT*/allhits, /*nIn*/NJ->maxnode,
8209 /*nOut*/tophits->m, /*IN/OUT*/tophits);
8211 /* Do not need to call UpdateVisible because we set visible below */
8213 /* And use the top 2*m entries to expand other best-hit lists, but only for top m */
8216 #pragma omp parallel for schedule(dynamic, 50)
8218 for (iHit=0; iHit < tophits->m; iHit++) {
8219 if (allhits[iHit].i < 0) continue;
8220 int iNode = allhits[iHit].j;
8222 if (NJ->parent[iNode] >= 0) continue;
8223 top_hits_list_t *l = &tophits->top_hits_lists[iNode];
8224 int nHitsOld = l->nHits;
8225 assert(nHitsOld <= tophits->m);
8228 /* Merge: old hits into 0->nHitsOld and hits from iNode above that */
8229 besthit_t *bothList = (besthit_t*)mymalloc(sizeof(besthit_t) * 3 * tophits->m);
8230 HitsToBestHits(/*IN*/l->hits, nHitsOld, iNode, /*OUT*/bothList); /* does not compute criterion */
8231 for (i = 0; i < nHitsOld; i++)
8232 SetCriterion(/*IN/UPDATE*/NJ, nActive, /*IN/OUT*/&bothList[i]);
8233 if (nActive <= 2 * tophits->m)
8234 l->hitSource = -1; /* abandon the 2nd-level top-hits heuristic */
8235 int nNewHits = l->hitSource >= 0 ? tophits->q : tophits->m;
8236 assert(nNewHits > 0);
8238 TransferBestHits(/*IN/UPDATE*/NJ, nActive, iNode,
8239 /*IN*/allhits, /*nOldHits*/2 * nNewHits,
8240 /*OUT*/&bothList[nHitsOld],
8241 /*updateDist*/false); /* rely on UniqueBestHits to update dist and/or criterion */
8243 besthit_t *uniqueList2 = UniqueBestHits(/*IN/UPDATE*/NJ, nActive,
8244 /*IN/SORT*/bothList, nHitsOld + 2 * nNewHits,
8246 assert(nUnique2 > 0);
8247 bothList = myfree(bothList,3 * tophits->m * sizeof(besthit_t));
8249 /* Note this will overwrite l, but we saved nHitsOld */
8250 SortSaveBestHits(iNode, /*IN/SORT*/uniqueList2, /*nIn*/nUnique2,
8251 /*nOut*/nNewHits, /*IN/OUT*/tophits);
8252 /* will update topvisible below */
8253 tophits->visible[iNode] = tophits->top_hits_lists[iNode].hits[0];
8254 uniqueList2 = myfree(uniqueList2, (nHitsOld + 2 * tophits->m) * sizeof(besthit_t));
8257 ResetTopVisible(/*IN/UPDATE*/NJ, nActive, /*IN/OUT*/tophits); /* outside of the parallel phase */
8258 allhits = myfree(allhits,sizeof(besthit_t)*NJ->maxnode);
8260 uniqueList = myfree(uniqueList, nUniqueAlloc * sizeof(besthit_t));
8262 fprintf(stderr, "New top-hit list for %d profile-ops %ld (out-ops %ld): source %d age %d members ",
8264 profileOps - startProfileOps,
8265 outprofileOps - startOutProfileOps,
8266 lNew->hitSource, lNew->age);
8269 for (i = 0; i < lNew->nHits; i++)
8270 fprintf(stderr, " %d", lNew->hits[i].j);
8271 fprintf(stderr,"\n");
8275 void UpdateVisible(/*IN/UPDATE*/NJ_t *NJ, int nActive,
8276 /*IN*/besthit_t *tophitsNode,
8278 /*IN/OUT*/top_hits_t *tophits) {
8281 for(iHit = 0; iHit < nTopHits; iHit++) {
8282 besthit_t *hit = &tophitsNode[iHit];
8283 if (hit->i < 0) continue; /* possible empty entries */
8284 assert(NJ->parent[hit->i] < 0);
8285 assert(hit->j >= 0 && NJ->parent[hit->j] < 0);
8287 bool bSuccess = GetVisible(/*IN/UPDATE*/NJ, nActive, /*IN/OUT*/tophits, hit->j, /*OUT*/&visible);
8288 if (!bSuccess || hit->criterion < visible.criterion) {
8291 hit_t *v = &tophits->visible[hit->j];
8293 v->dist = hit->dist;
8294 UpdateTopVisible(NJ, nActive, hit->j, v, /*IN/OUT*/tophits);
8295 if(verbose>5) fprintf(stderr,"NewVisible %d %d %f\n",
8296 hit->j,v->j,v->dist);
8298 } /* end loop over hits */
8301 /* Update the top-visible list to perhaps include visible[iNode] */
8302 void UpdateTopVisible(/*IN*/NJ_t * NJ, int nActive,
8303 int iIn, /*IN*/hit_t *hit,
8304 /*IN/OUT*/top_hits_t *tophits) {
8305 assert(tophits != NULL);
8306 bool bIn = false; /* placed in the list */
8309 /* First, if the list is not full, put it in somewhere */
8310 for (i = 0; i < tophits->nTopVisible && !bIn; i++) {
8311 int iNode = tophits->topvisible[i];
8313 /* this node is already in the top hit list */
8315 } else if (iNode < 0 || NJ->parent[iNode] >= 0) {
8316 /* found an empty spot */
8318 tophits->topvisible[i] = iIn;
8323 double dCriterionWorst = -1e20;
8325 /* Search for the worst hit */
8326 for (i = 0; i < tophits->nTopVisible && !bIn; i++) {
8327 int iNode = tophits->topvisible[i];
8328 assert(iNode >= 0 && NJ->parent[iNode] < 0 && iNode != iIn);
8330 if (!GetVisible(/*IN/UPDATE*/NJ, nActive, /*IN/OUT*/tophits, iNode, /*OUT*/&visible)) {
8331 /* found an empty spot */
8332 tophits->topvisible[i] = iIn;
8334 } else if (visible.i == hit->j && visible.j == iIn) {
8335 /* the reverse hit is already in the top hit list */
8337 } else if (visible.criterion >= dCriterionWorst) {
8339 dCriterionWorst = visible.criterion;
8344 if (!bIn && iPosWorst >= 0) {
8345 besthit_t visible = HitToBestHit(iIn, *hit);
8346 SetCriterion(/*IN/UPDATE*/NJ, nActive, /*IN/OUT*/&visible);
8347 if (visible.criterion < dCriterionWorst) {
8349 int iOld = tophits->topvisible[iPosWorst];
8350 fprintf(stderr, "TopVisible replace %d=>%d with %d=>%d\n",
8351 iOld, tophits->visible[iOld].j, visible.i, visible.j);
8353 tophits->topvisible[iPosWorst] = iIn;
8358 fprintf(stderr, "Updated TopVisible: ");
8359 for (i = 0; i < tophits->nTopVisible; i++) {
8360 int iNode = tophits->topvisible[i];
8361 if (iNode >= 0 && NJ->parent[iNode] < 0) {
8362 besthit_t bh = HitToBestHit(iNode, tophits->visible[iNode]);
8363 SetDistCriterion(NJ, nActive, &bh);
8364 fprintf(stderr, " %d=>%d:%.4f", bh.i, bh.j, bh.criterion);
8367 fprintf(stderr,"\n");
8371 /* Recompute the topvisible list */
8372 void ResetTopVisible(/*IN/UPDATE*/NJ_t *NJ,
8374 /*IN/OUT*/top_hits_t *tophits) {
8375 besthit_t *visibleSorted = mymalloc(sizeof(besthit_t)*nActive);
8376 int nVisible = 0; /* #entries in visibleSorted */
8378 for (iNode = 0; iNode < NJ->maxnode; iNode++) {
8379 /* skip joins involving stale nodes */
8380 if (NJ->parent[iNode] >= 0)
8383 if (GetVisible(/*IN/UPDATE*/NJ, nActive, /*IN/OUT*/tophits, iNode, /*OUT*/&v)) {
8384 assert(nVisible < nActive);
8385 visibleSorted[nVisible++] = v;
8388 assert(nVisible > 0);
8390 qsort(/*IN/OUT*/visibleSorted,nVisible,sizeof(besthit_t),CompareHitsByCriterion);
8392 /* Only keep the top m items, and try to avoid duplicating i->j with j->i
8393 Note that visible(i) -> j does not necessarily imply visible(j) -> i,
8394 so we store what the pairing was (or -1 for not used yet)
8396 int *inTopVisible = malloc(sizeof(int) * NJ->maxnodes);
8398 for (i = 0; i < NJ->maxnodes; i++)
8399 inTopVisible[i] = -1;
8402 fprintf(stderr, "top-hit search: nActive %d nVisible %d considering up to %d items\n",
8403 nActive, nVisible, tophits->m);
8405 /* save the sorted indices in topvisible */
8407 for (i = 0; i < nVisible && iSave < tophits->nTopVisible; i++) {
8408 besthit_t *v = &visibleSorted[i];
8409 if (inTopVisible[v->i] != v->j) { /* not seen already */
8410 tophits->topvisible[iSave++] = v->i;
8411 inTopVisible[v->i] = v->j;
8412 inTopVisible[v->j] = v->i;
8415 while(iSave < tophits->nTopVisible)
8416 tophits->topvisible[iSave++] = -1;
8417 myfree(visibleSorted, sizeof(besthit_t)*nActive);
8418 myfree(inTopVisible, sizeof(int) * NJ->maxnodes);
8419 tophits->topvisibleAge = 0;
8421 fprintf(stderr, "Reset TopVisible: ");
8422 for (i = 0; i < tophits->nTopVisible; i++) {
8423 int iNode = tophits->topvisible[i];
8426 fprintf(stderr, " %d=>%d", iNode, tophits->visible[iNode].j);
8428 fprintf(stderr,"\n");
8433 Find best hit to do in O(N*log(N) + m*L*log(N)) time, by
8434 copying and sorting the visible list
8435 updating out-distances for the top (up to m) candidates
8436 selecting the best hit
8438 local hill-climbing for a better join,
8439 using best-hit lists only, and updating
8440 all out-distances in every best-hit list
8442 void TopHitNJSearch(/*IN/UPDATE*/NJ_t *NJ, int nActive,
8443 /*IN/OUT*/top_hits_t *tophits,
8444 /*OUT*/besthit_t *join) {
8445 /* first, do we have at least m/2 candidates in topvisible?
8446 And remember the best one */
8448 int iNodeBestCandidate = -1;
8449 double dBestCriterion = 1e20;
8452 for (i = 0; i < tophits->nTopVisible; i++) {
8453 int iNode = tophits->topvisible[i];
8455 if (GetVisible(/*IN/UPDATE*/NJ, nActive, /*IN/OUT*/tophits, iNode, /*OUT*/&visible)) {
8457 if (iNodeBestCandidate < 0 || visible.criterion < dBestCriterion) {
8458 iNodeBestCandidate = iNode;
8459 dBestCriterion = visible.criterion;
8464 tophits->topvisibleAge++;
8465 /* Note we may have only nActive/2 joins b/c we try to store them once */
8466 if (2 * tophits->topvisibleAge > tophits->m
8467 || (3*nCandidate < tophits->nTopVisible && 3*nCandidate < nActive)) {
8468 /* recompute top visible */
8470 fprintf(stderr, "Resetting the top-visible list at nActive=%d\n",nActive);
8472 /* If age is low, then our visible set is becoming too sparse, because we have
8473 recently recomputed the top visible subset. This is very rare but can happen
8474 with -fastest. A quick-and-dirty solution is to walk up
8475 the parents to get additional entries in top hit lists. To ensure that the
8476 visible set becomes full, pick an arbitrary node if walking up terminates at self.
8478 if (tophits->topvisibleAge <= 2) {
8480 fprintf(stderr, "Expanding visible set by walking up to active nodes at nActive=%d\n", nActive);
8482 for (iNode = 0; iNode < NJ->maxnode; iNode++) {
8483 if (NJ->parent[iNode] >= 0)
8485 hit_t *v = &tophits->visible[iNode];
8486 int newj = ActiveAncestor(NJ, v->j);
8487 if (newj >= 0 && newj != v->j) {
8488 if (newj == iNode) {
8489 /* pick arbitrarily */
8491 while (NJ->parent[newj] >= 0 || newj == iNode)
8494 assert(newj >= 0 && newj < NJ->maxnodes
8496 && NJ->parent[newj] < 0);
8498 /* Set v to point to newj */
8499 besthit_t bh = { iNode, newj, -1e20, -1e20, -1e20 };
8500 SetDistCriterion(NJ, nActive, /*IN/OUT*/&bh);
8506 ResetTopVisible(/*IN/UPDATE*/NJ, nActive, /*IN/OUT*/tophits);
8507 /* and recurse to try again */
8508 TopHitNJSearch(NJ, nActive, tophits, join);
8512 fprintf(stderr, "Top-visible list size %d (nActive %d m %d)\n",
8513 nCandidate, nActive, tophits->m);
8514 assert(iNodeBestCandidate >= 0 && NJ->parent[iNodeBestCandidate] < 0);
8515 bool bSuccess = GetVisible(NJ, nActive, tophits, iNodeBestCandidate, /*OUT*/join);
8517 assert(join->i >= 0 && NJ->parent[join->i] < 0);
8518 assert(join->j >= 0 && NJ->parent[join->j] < 0);
8528 GetBestFromTopHits(join->i, NJ, nActive, tophits, /*OUT*/&bestI);
8529 assert(bestI.i == join->i);
8530 if (bestI.j != join->j && bestI.criterion < join->criterion) {
8533 fprintf(stderr,"BetterI\t%d\t%d\t%d\t%d\t%f\t%f\n",
8534 join->i,join->j,bestI.i,bestI.j,
8535 join->criterion,bestI.criterion);
8540 GetBestFromTopHits(join->j, NJ, nActive, tophits, /*OUT*/&bestJ);
8541 assert(bestJ.i == join->j);
8542 if (bestJ.j != join->i && bestJ.criterion < join->criterion) {
8545 fprintf(stderr,"BetterJ\t%d\t%d\t%d\t%d\t%f\t%f\n",
8546 join->i,join->j,bestJ.i,bestJ.j,
8547 join->criterion,bestJ.criterion);
8550 if(changed) nHillBetter++;
8554 int NGaps(/*IN*/NJ_t *NJ, int iNode) {
8555 assert(iNode < NJ->nSeq);
8558 for(p=0; p<NJ->nPos; p++) {
8559 if (NJ->profiles[iNode]->codes[p] == NOCODE)
8565 int CompareHitsByCriterion(const void *c1, const void *c2) {
8566 const besthit_t *hit1 = (besthit_t*)c1;
8567 const besthit_t *hit2 = (besthit_t*)c2;
8568 if (hit1->criterion < hit2->criterion) return(-1);
8569 if (hit1->criterion > hit2->criterion) return(1);
8573 int CompareHitsByIJ(const void *c1, const void *c2) {
8574 const besthit_t *hit1 = (besthit_t*)c1;
8575 const besthit_t *hit2 = (besthit_t*)c2;
8576 return hit1->i != hit2->i ? hit1->i - hit2->i : hit1->j - hit2->j;
8579 void SortSaveBestHits(int iNode, /*IN/SORT*/besthit_t *besthits,
8581 /*IN/OUT*/top_hits_t *tophits) {
8584 top_hits_list_t *l = &tophits->top_hits_lists[iNode];
8586 qsort(/*IN/OUT*/besthits,nIn,sizeof(besthit_t),CompareHitsByCriterion);
8588 /* First count how many we will save
8589 Not sure if removing duplicates is actually necessary.
8594 for (iBest = 0; iBest < nIn && nSave < nOut; iBest++) {
8595 if (besthits[iBest].i < 0)
8597 assert(besthits[iBest].i == iNode);
8598 int j = besthits[iBest].j;
8599 if (j != iNode && j != jLast && j >= 0) {
8608 omp_set_lock(&tophits->locks[iNode]);
8610 if (l->hits != NULL) {
8611 l->hits = myfree(l->hits, l->nHits * sizeof(hit_t));
8614 l->hits = mymalloc(sizeof(hit_t) * nSave);
8618 for (iBest = 0; iBest < nIn && iSave < nSave; iBest++) {
8619 int j = besthits[iBest].j;
8620 if (j != iNode && j != jLast && j >= 0) {
8621 l->hits[iSave].j = j;
8622 l->hits[iSave].dist = besthits[iBest].dist;
8628 omp_unset_lock(&tophits->locks[iNode]);
8630 assert(iSave == nSave);
8633 void TransferBestHits(/*IN/UPDATE*/NJ_t *NJ,
8636 /*IN*/besthit_t *oldhits,
8638 /*OUT*/besthit_t *newhits,
8639 bool updateDistances) {
8641 assert(NJ->parent[iNode] < 0);
8644 for(iBest = 0; iBest < nOldHits; iBest++) {
8645 besthit_t *old = &oldhits[iBest];
8646 besthit_t *new = &newhits[iBest];
8648 new->j = ActiveAncestor(/*IN*/NJ, old->j);
8649 new->dist = old->dist; /* may get reset below */
8650 new->weight = old->weight;
8651 new->criterion = old->criterion;
8653 if(new->j < 0 || new->j == iNode) {
8656 new->criterion = 1e20;
8657 } else if (new->i != old->i || new->j != old->j) {
8658 if (updateDistances)
8659 SetDistCriterion(/*IN/UPDATE*/NJ, nActive, /*IN/OUT*/new);
8662 new->criterion = 1e20;
8665 if (updateDistances)
8666 SetCriterion(/*IN/UPDATE*/NJ, nActive, /*IN/OUT*/new);
8668 new->criterion = 1e20; /* leave dist alone */
8673 void HitsToBestHits(/*IN*/hit_t *hits, int nHits, int iNode, /*OUT*/besthit_t *newhits) {
8675 for (i = 0; i < nHits; i++) {
8676 hit_t *hit = &hits[i];
8677 besthit_t *bh = &newhits[i];
8680 bh->dist = hit->dist;
8681 bh->criterion = 1e20;
8682 bh->weight = -1; /* not the true value -- we compute these directly when needed */
8686 besthit_t HitToBestHit(int i, hit_t hit) {
8691 bh.criterion = 1e20;
8696 char *OpenMPString(void) {
8698 static char buf[100];
8699 sprintf(buf, ", OpenMP (%d threads)", omp_get_max_threads());
8706 /* Algorithm 26.2.17 from Abromowitz and Stegun, Handbook of Mathematical Functions
8707 Absolute accuracy of only about 1e-7, which is enough for us
8709 double pnorm(double x)
8711 double b1 = 0.319381530;
8712 double b2 = -0.356563782;
8713 double b3 = 1.781477937;
8714 double b4 = -1.821255978;
8715 double b5 = 1.330274429;
8716 double p = 0.2316419;
8717 double c = 0.39894228;
8720 double t = 1.0 / ( 1.0 + p * x );
8721 return (1.0 - c * exp( -x * x / 2.0 ) * t *
8722 ( t *( t * ( t * ( t * b5 + b4 ) + b3 ) + b2 ) + b1 ));
8725 double t = 1.0 / ( 1.0 - p * x );
8726 return ( c * exp( -x * x / 2.0 ) * t *
8727 ( t *( t * ( t * ( t * b5 + b4 ) + b3 ) + b2 ) + b1 ));
8730 void *mymalloc(size_t sz) {
8731 if (sz == 0) return(NULL);
8732 void *new = malloc(sz);
8734 fprintf(stderr, "Out of memory\n");
8740 struct mallinfo mi = mallinfo();
8741 if (mi.arena+mi.hblkhd > maxmallocHeap)
8742 maxmallocHeap = mi.arena+mi.hblkhd;
8744 /* gcc malloc should always return 16-byte-aligned values... */
8745 assert(IS_ALIGNED(new));
8749 void *mymemdup(void *data, size_t sz) {
8750 if(data==NULL) return(NULL);
8751 void *new = mymalloc(sz);
8752 memcpy(/*to*/new, /*from*/data, sz);
8756 void *myrealloc(void *data, size_t szOld, size_t szNew, bool bCopy) {
8757 if (data == NULL && szOld == 0)
8758 return(mymalloc(szNew));
8759 if (data == NULL || szOld == 0 || szNew == 0) {
8760 fprintf(stderr,"Empty myrealloc\n");
8767 /* Try to reduce memory fragmentation by allocating anew and copying
8768 Seems to help in practice */
8769 new = mymemdup(data, szNew);
8770 myfree(data, szOld);
8772 new = realloc(data,szNew);
8774 fprintf(stderr, "Out of memory\n");
8777 assert(IS_ALIGNED(new));
8778 szAllAlloc += (szNew-szOld);
8779 mymallocUsed += (szNew-szOld);
8781 struct mallinfo mi = mallinfo();
8782 if (mi.arena+mi.hblkhd > maxmallocHeap)
8783 maxmallocHeap = mi.arena+mi.hblkhd;
8789 void *myfree(void *p, size_t sz) {
8790 if(p==NULL) return(NULL);
8796 /******************************************************************************/
8797 /* Minimization of a 1-dimensional function by Brent's method (Numerical Recipes)
8798 * Borrowed from Tree-Puzzle 5.1 util.c under GPL
8799 * Modified by M.N.P to pass in the accessory data for the optimization function,
8800 * to use 2x bounds around the starting guess and expand them if necessary,
8801 * and to use both a fractional and an absolute tolerance
8805 #define CGOLD 0.3819660
8806 #define TINY 1.0e-20
8807 #define ZEPS 1.0e-10
8808 #define SHFT(a,b,c,d) (a)=(b);(b)=(c);(c)=(d);
8809 #define SIGN(a,b) ((b) >= 0.0 ? fabs(a) : -fabs(a))
8811 /* Brents method in one dimension */
8812 double brent(double ax, double bx, double cx, double (*f)(double, void *), void *data,
8813 double ftol, double atol,
8814 double *foptx, double *f2optx, double fax, double fbx, double fcx)
8817 double a,b,d=0,etemp,fu,fv,fw,fx,p,q,r,tol1,tol2,u,v,w,x,xm;
8821 a=(ax < cx ? ax : cx);
8822 b=(ax > cx ? ax : cx);
8836 for (iter=1;iter<=ITMAX;iter++) {
8839 tol2=2.0*(tol1+ZEPS);
8840 if (fabs(x-xm) <= (tol2-0.5*(b-a))
8841 || fabs(a-b) < atol) {
8846 *f2optx = 2.0*(fv*xw + fx*wv + fw*vx)/
8847 (v*v*xw + x*x*wv + w*w*vx);
8850 if (fabs(e) > tol1) {
8855 if (q > 0.0) p = -p;
8859 if (fabs(p) >= fabs(0.5*q*etemp) || p <= q*(a-x) || p >= q*(b-x))
8860 d=CGOLD*(e=(x >= xm ? a-x : b-x));
8864 if (u-a < tol2 || b-u < tol2)
8868 d=CGOLD*(e=(x >= xm ? a-x : b-x));
8870 u=(fabs(d) >= tol1 ? x+d : x+SIGN(tol1,d));
8873 if (u >= x) a=x; else b=x;
8877 if (u < x) a=u; else b=u;
8878 if (fu <= fw || w == x) {
8883 } else if (fu <= fv || v == x || v == w) {
8893 *f2optx = 2.0*(fv*xw + fx*wv + fw*vx)/
8894 (v*v*xw + x*x*wv + w*w*vx);
8903 /* one-dimensional minimization - as input a lower and an upper limit and a trial
8904 value for the minimum is needed: xmin < xguess < xmax
8905 the function and a fractional tolerance has to be specified
8906 onedimenmin returns the optimal x value and the value of the function
8907 and its second derivative at this point
8909 double onedimenmin(double xmin, double xguess, double xmax, double (*f)(double,void*), void *data,
8910 double ftol, double atol,
8911 /*OUT*/double *fx, /*OUT*/double *f2x)
8913 double optx, ax, bx, cx, fa, fb, fc;
8915 /* first attempt to bracketize minimum */
8916 if (xguess == xmin) {
8920 } else if (xguess <= 2.0 * xmin) {
8934 fprintf(stderr, "onedimenmin lo %.4f guess %.4f hi %.4f range %.4f %.4f\n",
8935 ax, bx, cx, xmin, xmax);
8936 /* ideally this range includes the true minimum, i.e.,
8938 if not, we gradually expand the boundaries until it does,
8939 or we near the boundary of the allowed range and use that
8944 while(fa < fb && ax > xmin) {
8946 if (ax < 2.0*xmin) /* give up on shrinking the region */
8950 while(fc < fb && cx < xmax) {
8952 if (cx > xmax * 0.95)
8956 optx = brent(ax, bx, cx, f, data, ftol, atol, fx, f2x, fa, fb, fc);
8959 fprintf(stderr, "onedimenmin reaches optimum f(%.4f) = %.4f f2x %.4f\n", optx, *fx, *f2x);
8960 return optx; /* return optimal x */
8963 /* Numerical code for the gamma distribution is modified from the PhyML 3 code
8964 (GNU public license) of Stephane Guindon
8967 double LnGamma (double alpha)
8969 /* returns ln(gamma(alpha)) for alpha>0, accurate to 10 decimal places.
8970 Stirling's formula is used for the central polynomial part of the procedure.
8971 Pike MC & Hill ID (1966) Algorithm 291: Logarithm of the gamma function.
8972 Communications of the Association for Computing Machinery, 9:684
8974 double x=alpha, f=0, z;
8978 x=z; f=-(double)log(f);
8981 return f + (x-0.5)*(double)log(x) - x + .918938533204673
8982 + (((-.000595238095238*z+.000793650793651)*z-.002777777777778)*z
8983 +.083333333333333)/x;
8986 double IncompleteGamma(double x, double alpha, double ln_gamma_alpha)
8988 /* returns the incomplete gamma ratio I(x,alpha) where x is the upper
8989 limit of the integration and alpha is the shape parameter.
8990 returns (-1) if in error
8991 ln_gamma_alpha = ln(Gamma(alpha)), is almost redundant.
8992 (1) series expansion if (alpha>x || x<=1)
8993 (2) continued fraction otherwise
8995 Bhattacharjee GP (1970) The incomplete gamma integral. Applied Statistics,
8999 double p=alpha, g=ln_gamma_alpha;
9000 double accurate=1e-8, overflow=1e30;
9001 double factor, gin=0, rn=0, a=0,b=0,an=0,dif=0, term=0, pn[6];
9003 if (x==0) return (0);
9004 if (x<0 || p<=0) return (-1);
9006 factor=(double)exp(p*(double)log(x)-x-g);
9007 if (x>1 && x>=p) goto l30;
9008 /* (1) series expansion */
9009 gin=1; term=1; rn=p;
9012 term*=x/rn; gin+=term;
9014 if (term > accurate) goto l20;
9018 /* (2) continued fraction */
9019 a=1-p; b=a+x+1; term=0;
9020 pn[0]=1; pn[1]=x; pn[2]=x+1; pn[3]=x*b;
9023 a++; b+=2; term++; an=a*term;
9024 for (i=0; i<2; i++) pn[i+4]=b*pn[i+2]-an*pn[i];
9025 if (pn[5] == 0) goto l35;
9026 rn=pn[4]/pn[5]; dif=fabs(gin-rn);
9027 if (dif>accurate) goto l34;
9028 if (dif<=accurate*rn) goto l42;
9032 for (i=0; i<4; i++) pn[i]=pn[i+2];
9033 if (fabs(pn[4]) < overflow) goto l32;
9034 for (i=0; i<4; i++) pn[i]/=overflow;
9043 double PGamma(double x, double alpha)
9045 /* scale = 1/alpha */
9046 return IncompleteGamma(x*alpha,alpha,LnGamma(alpha));
9049 /* helper function to subtract timval structures */
9050 /* Subtract the `struct timeval' values X and Y,
9051 storing the result in RESULT.
9052 Return 1 if the difference is negative, otherwise 0. */
9053 int timeval_subtract (struct timeval *result, struct timeval *x, struct timeval *y)
9055 /* Perform the carry for the later subtraction by updating y. */
9056 if (x->tv_usec < y->tv_usec) {
9057 int nsec = (y->tv_usec - x->tv_usec) / 1000000 + 1;
9058 y->tv_usec -= 1000000 * nsec;
9061 if (x->tv_usec - y->tv_usec > 1000000) {
9062 int nsec = (x->tv_usec - y->tv_usec) / 1000000;
9063 y->tv_usec += 1000000 * nsec;
9067 /* Compute the time remaining to wait.
9068 tv_usec is certainly positive. */
9069 result->tv_sec = x->tv_sec - y->tv_sec;
9070 result->tv_usec = x->tv_usec - y->tv_usec;
9072 /* Return 1 if result is negative. */
9073 return x->tv_sec < y->tv_sec;
9076 double clockDiff(/*IN*/struct timeval *clock_start) {
9077 struct timeval time_now, elapsed;
9078 gettimeofday(/*OUT*/&time_now,NULL);
9079 timeval_subtract(/*OUT*/&elapsed,/*IN*/&time_now,/*IN*/clock_start);
9080 return(elapsed.tv_sec + elapsed.tv_usec*1e-6);
9084 /* The random number generator is taken from D E Knuth
9085 http://www-cs-faculty.stanford.edu/~knuth/taocp.html
9088 /* This program by D E Knuth is in the public domain and freely copyable.
9089 * It is explained in Seminumerical Algorithms, 3rd edition, Section 3.6
9090 * (or in the errata to the 2nd edition --- see
9091 * http://www-cs-faculty.stanford.edu/~knuth/taocp.html
9092 * in the changes to Volume 2 on pages 171 and following). */
9094 /* N.B. The MODIFICATIONS introduced in the 9th printing (2002) are
9095 included here; there's no backwards compatibility with the original. */
9097 /* This version also adopts Brendan McKay's suggestion to
9098 accommodate naive users who forget to call ran_start(seed). */
9100 /* If you find any bugs, please report them immediately to
9101 * taocp@cs.stanford.edu
9102 * (and you will be rewarded if the bug is genuine). Thanks! */
9104 /************ see the book for explanations and caveats! *******************/
9105 /************ in particular, you need two's complement arithmetic **********/
9107 #define KK 100 /* the long lag */
9108 #define LL 37 /* the short lag */
9109 #define MM (1L<<30) /* the modulus */
9110 #define mod_diff(x,y) (((x)-(y))&(MM-1)) /* subtraction mod MM */
9112 long ran_x[KK]; /* the generator state */
9115 void ran_array(long aa[],int n)
9117 void ran_array(aa,n) /* put n new random numbers in aa */
9118 long *aa; /* destination */
9119 int n; /* array length (must be at least KK) */
9123 for (j=0;j<KK;j++) aa[j]=ran_x[j];
9124 for (;j<n;j++) aa[j]=mod_diff(aa[j-KK],aa[j-LL]);
9125 for (i=0;i<LL;i++,j++) ran_x[i]=mod_diff(aa[j-KK],aa[j-LL]);
9126 for (;i<KK;i++,j++) ran_x[i]=mod_diff(aa[j-KK],ran_x[i-LL]);
9129 /* the following routines are from exercise 3.6--15 */
9130 /* after calling ran_start, get new randoms by, e.g., "x=ran_arr_next()" */
9132 #define QUALITY 1009 /* recommended quality level for high-res use */
9133 long ran_arr_buf[QUALITY];
9134 long ran_arr_dummy=-1, ran_arr_started=-1;
9135 long *ran_arr_ptr=&ran_arr_dummy; /* the next random number, or -1 */
9137 #define TT 70 /* guaranteed separation between streams */
9138 #define is_odd(x) ((x)&1) /* units bit of x */
9141 void ran_start(long seed)
9143 void ran_start(seed) /* do this before using ran_array */
9144 long seed; /* selector for different streams */
9148 long x[KK+KK-1]; /* the preparation buffer */
9149 register long ss=(seed+2)&(MM-2);
9150 for (j=0;j<KK;j++) {
9151 x[j]=ss; /* bootstrap the buffer */
9152 ss<<=1; if (ss>=MM) ss-=MM-2; /* cyclic shift 29 bits */
9154 x[1]++; /* make x[1] (and only x[1]) odd */
9155 for (ss=seed&(MM-1),t=TT-1; t; ) {
9156 for (j=KK-1;j>0;j--) x[j+j]=x[j], x[j+j-1]=0; /* "square" */
9157 for (j=KK+KK-2;j>=KK;j--)
9158 x[j-(KK-LL)]=mod_diff(x[j-(KK-LL)],x[j]),
9159 x[j-KK]=mod_diff(x[j-KK],x[j]);
9160 if (is_odd(ss)) { /* "multiply by z" */
9161 for (j=KK;j>0;j--) x[j]=x[j-1];
9162 x[0]=x[KK]; /* shift the buffer cyclically */
9163 x[LL]=mod_diff(x[LL],x[KK]);
9165 if (ss) ss>>=1; else t--;
9167 for (j=0;j<LL;j++) ran_x[j+KK-LL]=x[j];
9168 for (;j<KK;j++) ran_x[j-LL]=x[j];
9169 for (j=0;j<10;j++) ran_array(x,KK+KK-1); /* warm things up */
9170 ran_arr_ptr=&ran_arr_started;
9173 #define ran_arr_next() (*ran_arr_ptr>=0? *ran_arr_ptr++: ran_arr_cycle())
9174 long ran_arr_cycle()
9176 if (ran_arr_ptr==&ran_arr_dummy)
9177 ran_start(314159L); /* the user forgot to initialize */
9178 ran_array(ran_arr_buf,QUALITY);
9180 ran_arr_ptr=ran_arr_buf+1;
9181 return ran_arr_buf[0];
9184 /* end of code from Knuth */
9186 double knuth_rand() {
9187 return(9.31322574615479e-10 * ran_arr_next()); /* multiply by 2**-30 */
9190 hashstrings_t *MakeHashtable(char **strings, int nStrings) {
9191 hashstrings_t *hash = (hashstrings_t*)mymalloc(sizeof(hashstrings_t));
9192 hash->nBuckets = 8*nStrings;
9193 hash->buckets = (hashbucket_t*)mymalloc(sizeof(hashbucket_t) * hash->nBuckets);
9195 for (i=0; i < hash->nBuckets; i++) {
9196 hash->buckets[i].string = NULL;
9197 hash->buckets[i].nCount = 0;
9198 hash->buckets[i].first = -1;
9200 for (i=0; i < nStrings; i++) {
9201 hashiterator_t hi = FindMatch(hash, strings[i]);
9202 if (hash->buckets[hi].string == NULL) {
9203 /* save a unique entry */
9204 assert(hash->buckets[hi].nCount == 0);
9205 hash->buckets[hi].string = strings[i];
9206 hash->buckets[hi].nCount = 1;
9207 hash->buckets[hi].first = i;
9209 /* record a duplicate entry */
9210 assert(hash->buckets[hi].string != NULL);
9211 assert(strcmp(hash->buckets[hi].string, strings[i]) == 0);
9212 assert(hash->buckets[hi].first >= 0);
9213 hash->buckets[hi].nCount++;
9219 hashstrings_t *FreeHashtable(hashstrings_t* hash) {
9221 myfree(hash->buckets, sizeof(hashbucket_t) * hash->nBuckets);
9222 myfree(hash, sizeof(hashstrings_t));
9227 #define MAXADLER 65521
9228 hashiterator_t FindMatch(hashstrings_t *hash, char *string) {
9229 /* Adler-32 checksum */
9230 unsigned int hashA = 1;
9231 unsigned int hashB = 0;
9233 for (p = string; *p != '\0'; p++) {
9234 hashA = ((unsigned int)*p + hashA);
9235 hashB = hashA+hashB;
9239 hashiterator_t hi = (hashB*65536+hashA) % hash->nBuckets;
9240 while(hash->buckets[hi].string != NULL
9241 && strcmp(hash->buckets[hi].string, string) != 0) {
9243 if (hi >= hash->nBuckets)
9249 char *GetHashString(hashstrings_t *hash, hashiterator_t hi) {
9250 return(hash->buckets[hi].string);
9253 int HashCount(hashstrings_t *hash, hashiterator_t hi) {
9254 return(hash->buckets[hi].nCount);
9257 int HashFirst(hashstrings_t *hash, hashiterator_t hi) {
9258 return(hash->buckets[hi].first);
9261 uniquify_t *UniquifyAln(alignment_t *aln) {
9263 char **uniqueSeq = (char**)mymalloc(aln->nSeq * sizeof(char*)); /* iUnique -> seq */
9264 int *uniqueFirst = (int*)mymalloc(aln->nSeq * sizeof(int)); /* iUnique -> iFirst in aln */
9265 int *alnNext = (int*)mymalloc(aln->nSeq * sizeof(int)); /* i in aln -> next, or -1 */
9266 int *alnToUniq = (int*)mymalloc(aln->nSeq * sizeof(int)); /* i in aln -> iUnique; many -> -1 */
9269 for (i = 0; i < aln->nSeq; i++) {
9270 uniqueSeq[i] = NULL;
9271 uniqueFirst[i] = -1;
9275 hashstrings_t *hashseqs = MakeHashtable(aln->seqs, aln->nSeq);
9276 for (i=0; i<aln->nSeq; i++) {
9277 hashiterator_t hi = FindMatch(hashseqs,aln->seqs[i]);
9278 int first = HashFirst(hashseqs,hi);
9280 uniqueSeq[nUniqueSeq] = aln->seqs[i];
9281 uniqueFirst[nUniqueSeq] = i;
9282 alnToUniq[i] = nUniqueSeq;
9286 while (alnNext[last] != -1)
9287 last = alnNext[last];
9290 assert(alnToUniq[last] >= 0 && alnToUniq[last] < nUniqueSeq);
9291 alnToUniq[i] = alnToUniq[last];
9294 assert(nUniqueSeq>0);
9295 hashseqs = FreeHashtable(hashseqs);
9297 uniquify_t *uniquify = (uniquify_t*)mymalloc(sizeof(uniquify_t));
9298 uniquify->nSeq = aln->nSeq;
9299 uniquify->nUnique = nUniqueSeq;
9300 uniquify->uniqueFirst = uniqueFirst;
9301 uniquify->alnNext = alnNext;
9302 uniquify->alnToUniq = alnToUniq;
9303 uniquify->uniqueSeq = uniqueSeq;
9307 uniquify_t *FreeUniquify(uniquify_t *unique) {
9308 if (unique != NULL) {
9309 myfree(unique->uniqueFirst, sizeof(int)*unique->nSeq);
9310 myfree(unique->alnNext, sizeof(int)*unique->nSeq);
9311 myfree(unique->alnToUniq, sizeof(int)*unique->nSeq);
9312 myfree(unique->uniqueSeq, sizeof(char*)*unique->nSeq);
9313 myfree(unique,sizeof(uniquify_t));
9319 traversal_t InitTraversal(NJ_t *NJ) {
9320 traversal_t worked = (bool*)mymalloc(sizeof(bool)*NJ->maxnodes);
9322 for (i=0; i<NJ->maxnodes; i++)
9327 void SkipTraversalInto(int node, /*IN/OUT*/traversal_t traversal) {
9328 traversal[node] = true;
9331 int TraversePostorder(int node, NJ_t *NJ, /*IN/OUT*/traversal_t traversal,
9332 /*OPTIONAL OUT*/bool *pUp) {
9338 /* move to a child if possible */
9341 for (iChild=0; iChild < NJ->child[node].nChild; iChild++) {
9342 int child = NJ->child[node].child[iChild];
9343 if (!traversal[child]) {
9350 continue; /* keep moving down */
9351 if (!traversal[node]) {
9352 traversal[node] = true;
9355 /* If we've already done this node, need to move up */
9356 if (node == NJ->root)
9357 return(-1); /* nowhere to go -- done traversing */
9358 node = NJ->parent[node];
9359 /* If we go up to someplace that was already marked as visited, this is due
9360 to a change in topology, so return it marked as "up" */
9361 if (pUp && traversal[node]) {
9368 traversal_t FreeTraversal(traversal_t traversal, NJ_t *NJ) {
9369 myfree(traversal, sizeof(bool)*NJ->maxnodes);
9373 profile_t **UpProfiles(NJ_t *NJ) {
9374 profile_t **upProfiles = (profile_t**)mymalloc(sizeof(profile_t*)*NJ->maxnodes);
9376 for (i=0; i<NJ->maxnodes; i++) upProfiles[i] = NULL;
9380 profile_t *GetUpProfile(/*IN/OUT*/profile_t **upProfiles, NJ_t *NJ, int outnode, bool useML) {
9381 assert(outnode != NJ->root && outnode >= NJ->nSeq); /* not for root or leaves */
9382 if (upProfiles[outnode] != NULL)
9383 return(upProfiles[outnode]);
9386 int *pathToRoot = PathToRoot(NJ, outnode, /*OUT*/&depth);
9388 /* depth-1 is root */
9389 for (i = depth-2; i>=0; i--) {
9390 int node = pathToRoot[i];
9392 if (upProfiles[node] == NULL) {
9393 /* Note -- SetupABCD may call GetUpProfile, but it should do it farther
9394 up in the path to the root
9396 profile_t *profiles[4];
9398 SetupABCD(NJ, node, /*OUT*/profiles, /*IN/OUT*/upProfiles, /*OUT*/nodeABCD, useML);
9400 /* If node is a child of root, then the 4th profile is of the 2nd root-sibling of node
9401 Otherwise, the 4th profile is the up-profile of the parent of node, and that
9402 is the branch-length we need
9404 double lenC = NJ->branchlength[nodeABCD[2]];
9405 double lenD = NJ->branchlength[nodeABCD[3]];
9407 fprintf(stderr, "Computing UpProfile for node %d with lenC %.4f lenD %.4f pair-loglk %.3f\n",
9409 PairLogLk(profiles[2],profiles[3],lenC+lenD,NJ->nPos,NJ->transmat,&NJ->rates, /*site_lk*/NULL));
9410 PrintNJInternal(stderr, NJ, /*useLen*/true);
9412 upProfiles[node] = PosteriorProfile(/*C*/profiles[2], /*D*/profiles[3],
9414 NJ->transmat, &NJ->rates, NJ->nPos, NJ->nConstraints);
9416 profile_t *profilesCDAB[4] = { profiles[2], profiles[3], profiles[0], profiles[1] };
9417 double weight = QuartetWeight(profilesCDAB, NJ->distance_matrix, NJ->nPos);
9419 fprintf(stderr, "Compute upprofile of %d from %d and parents (vs. children %d %d) with weight %.3f\n",
9420 node, nodeABCD[2], nodeABCD[0], nodeABCD[1], weight);
9421 upProfiles[node] = AverageProfile(profiles[2], profiles[3],
9422 NJ->nPos, NJ->nConstraints,
9423 NJ->distance_matrix,
9428 FreePath(pathToRoot,NJ);
9429 assert(upProfiles[outnode] != NULL);
9430 return(upProfiles[outnode]);
9433 profile_t *DeleteUpProfile(/*IN/OUT*/profile_t **upProfiles, NJ_t *NJ, int node) {
9434 assert(node>=0 && node < NJ->maxnodes);
9435 if (upProfiles[node] != NULL)
9436 upProfiles[node] = FreeProfile(upProfiles[node], NJ->nPos, NJ->nConstraints); /* returns NULL */
9440 profile_t **FreeUpProfiles(profile_t **upProfiles, NJ_t *NJ) {
9443 for (i=0; i < NJ->maxnodes; i++) {
9444 if (upProfiles[i] != NULL)
9446 DeleteUpProfile(upProfiles, NJ, i);
9448 myfree(upProfiles, sizeof(profile_t*)*NJ->maxnodes);
9450 fprintf(stderr,"FreeUpProfiles -- freed %d\n", nUsed);
9454 int *PathToRoot(NJ_t *NJ, int node, /*OUT*/int *outDepth) {
9455 int *pathToRoot = (int*)mymalloc(sizeof(int)*NJ->maxnodes);
9457 int ancestor = node;
9458 while(ancestor >= 0) {
9459 pathToRoot[depth] = ancestor;
9460 ancestor = NJ->parent[ancestor];
9467 int *FreePath(int *path, NJ_t *NJ) {
9468 myfree(path, sizeof(int)*NJ->maxnodes);
9472 transition_matrix_t *CreateGTR(double *r/*ac ag at cg ct gt*/, double *f/*acgt*/) {
9473 double matrix[4][MAXCODES];
9476 /* Place rates onto a symmetric matrix, but correct by f(target), so that
9477 stationary distribution f[] is maintained
9478 Leave diagonals as 0 (CreateTransitionMatrix will fix them)
9481 for (i = 0; i < nCodes; i++) {
9483 for (j = i+1; j < nCodes; j++) {
9484 double rate = r[imat++];
9486 /* Want t(matrix) * f to be 0 */
9487 matrix[i][j] = rate * f[i];
9488 matrix[j][i] = rate * f[j];
9491 /* Compute average mutation rate */
9492 double total_rate = 0;
9493 for (i = 0; i < nCodes; i++)
9494 for (j = 0; j < nCodes; j++)
9495 total_rate += f[i] * matrix[i][j];
9496 assert(total_rate > 1e-6);
9497 double inv = 1.0/total_rate;
9498 for (i = 0; i < nCodes; i++)
9499 for (j = 0; j < nCodes; j++)
9500 matrix[i][j] *= inv;
9501 return(CreateTransitionMatrix(matrix,f));
9504 transition_matrix_t *CreateTransitionMatrix(/*IN*/double matrix[MAXCODES][MAXCODES],
9505 /*IN*/double stat[MAXCODES]) {
9507 transition_matrix_t *transmat = mymalloc(sizeof(transition_matrix_t));
9508 double sqrtstat[20];
9509 for (i = 0; i < nCodes; i++) {
9510 transmat->stat[i] = stat[i];
9511 transmat->statinv[i] = 1.0/stat[i];
9512 sqrtstat[i] = sqrt(stat[i]);
9515 double sym[20*20]; /* symmetrized matrix M' */
9516 /* set diagonals so columns sums are 0 before symmetrization */
9517 for (i = 0; i < nCodes; i++)
9518 for (j = 0; j < nCodes; j++)
9519 sym[nCodes*i+j] = matrix[i][j];
9520 for (j = 0; j < nCodes; j++) {
9522 sym[nCodes*j+j] = 0;
9523 for (i = 0; i < nCodes; i++)
9524 sum += sym[nCodes*i+j];
9525 sym[nCodes*j+j] = -sum;
9527 /* M' = S**-1 M S */
9528 for (i = 0; i < nCodes; i++)
9529 for (j = 0; j < nCodes; j++)
9530 sym[nCodes*i+j] *= sqrtstat[j]/sqrtstat[i];
9532 /* eigen decomposition of M' -- note that eigenW is the transpose of what we want,
9533 which is eigenvectors in columns */
9534 double eigenW[20*20], eval[20], e[20];
9535 for (i = 0; i < nCodes*nCodes; i++)
9537 tred2(eigenW, nCodes, nCodes, eval, e);
9538 tqli(eval, e, nCodes , nCodes, eigenW);
9540 /* save eigenvalues */
9541 for (i = 0; i < nCodes; i++)
9542 transmat->eigenval[i] = eval[i];
9544 /* compute eigen decomposition of M into t(codeFreq): V = S*W */
9545 /* compute inverse of V in eigeninv: V**-1 = t(W) S**-1 */
9546 for (i = 0; i < nCodes; i++) {
9547 for (j = 0; j < nCodes; j++) {
9548 transmat->eigeninv[i][j] = eigenW[nCodes*i+j] / sqrtstat[j];
9549 transmat->eigeninvT[j][i] = transmat->eigeninv[i][j];
9552 for (i = 0; i < nCodes; i++)
9553 for (j = 0; j < nCodes; j++)
9554 transmat->codeFreq[i][j] = eigenW[j*nCodes+i] * sqrtstat[i];
9555 /* codeFreq[NOCODE] is the rotation of (1,1,...) not (1/nCodes,1/nCodes,...), which
9556 gives correct posterior probabilities
9558 for (j = 0; j < nCodes; j++) {
9559 transmat->codeFreq[NOCODE][j] = 0.0;
9560 for (i = 0; i < nCodes; i++)
9561 transmat->codeFreq[NOCODE][j] += transmat->codeFreq[i][j];
9563 /* save some posterior probabilities for approximating later:
9564 first, we compute P(B | A, t) for t = approxMLnearT, by using
9565 V * exp(L*t) * V**-1 */
9566 double expvalues[MAXCODES];
9567 for (i = 0; i < nCodes; i++)
9568 expvalues[i] = exp(approxMLnearT * transmat->eigenval[i]);
9569 double LVinv[MAXCODES][MAXCODES]; /* exp(L*t) * V**-1 */
9570 for (i = 0; i < nCodes; i++) {
9571 for (j = 0; j < nCodes; j++)
9572 LVinv[i][j] = transmat->eigeninv[i][j] * expvalues[i];
9574 /* matrix transform for converting A -> B given t: transt[i][j] = P(j->i | t) */
9575 double transt[MAXCODES][MAXCODES];
9576 for (i = 0; i < nCodes; i++) {
9577 for (j = 0; j < nCodes; j++) {
9579 for (k = 0; k < nCodes; k++)
9580 transt[i][j] += transmat->codeFreq[i][k] * LVinv[k][j];
9583 /* nearP[i][j] = P(parent = j | both children are i) = P(j | i,i) ~ stat(j) * P(j->i | t)**2 */
9584 for (i = 0; i < nCodes; i++) {
9585 double nearP[MAXCODES];
9587 for (j = 0; j < nCodes; j++) {
9588 assert(transt[j][i] > 0);
9589 assert(transmat->stat[j] > 0);
9590 nearP[j] = transmat->stat[j] * transt[i][j] * transt[i][j];
9594 for (j = 0; j < nCodes; j++)
9595 nearP[j] *= 1.0/tot;
9596 /* save nearP in transmat->nearP[i][] */
9597 for (j = 0; j < nCodes; j++)
9598 transmat->nearP[i][j] = nearP[j];
9599 /* multiply by 1/stat and rotate nearP */
9600 for (j = 0; j < nCodes; j++)
9601 nearP[j] /= transmat->stat[j];
9602 for (j = 0; j < nCodes; j++) {
9604 for (k = 0; k < nCodes; k++)
9605 rot += nearP[k] * transmat->codeFreq[i][j];
9606 transmat->nearFreq[i][j] = rot;
9613 distance_matrix_t *TransMatToDistanceMat(transition_matrix_t *transmat) {
9614 if (transmat == NULL)
9616 distance_matrix_t *dmat = mymalloc(sizeof(distance_matrix_t));
9618 for (i=0; i<nCodes; i++) {
9619 for (j=0; j<nCodes; j++) {
9620 dmat->distances[i][j] = 0; /* never actually used */
9621 dmat->eigeninv[i][j] = transmat->eigeninv[i][j];
9622 dmat->codeFreq[i][j] = transmat->codeFreq[i][j];
9625 /* eigentot . rotated-vector is the total frequency of the unrotated vector
9626 (used to normalize in NormalizeFreq()
9627 For transition matrices, we rotate by transpose of eigenvectors, so
9628 we need to multiply by the inverse matrix by 1....1 to get this vector,
9629 or in other words, sum the columns
9631 for(i = 0; i<nCodes; i++) {
9632 dmat->eigentot[i] = 0.0;
9633 for (j = 0; j<nCodes; j++)
9634 dmat->eigentot[i] += transmat->eigeninv[i][j];
9639 /* Numerical recipes code for eigen decomposition (actually taken from RAxML rev_functions.c) */
9640 void tred2 (double *a, const int n, const int np, double *d, double *e)
9642 #define a(i,j) a[(j-1)*np + (i-1)]
9646 double f, g, h, hh, scale;
9647 for (i = n; i > 1; i--) {
9652 for ( k = 1; k <= l; k++ )
9653 scale += fabs(a(i,k));
9657 for (k = 1; k <= l; k++) {
9659 h += a(i,k) * a(i,k);
9668 for (j = 1; j <=l ; j++) {
9669 a(j,i) = a(i,j) / h;
9671 for (k = 1; k <= j; k++)
9673 for (k = j+1; k <= l; k++)
9679 for (j = 1; j <= l; j++) {
9683 for (k = 1; k <= j; k++)
9684 a(j,k) -= f*e(k) + g*a(i,k);
9693 for (i = 1; i <= n; i++) {
9696 for (j = 1; j <=l; j++) {
9698 for (k = 1; k <= l; k++)
9700 for (k=1; k <=l; k++)
9701 a(k,j) -= g * a(k,i);
9706 for (j=1; j<=l; j++)
9707 a(i,j) = a(j,i) = 0;
9716 double pythag(double a, double b) {
9717 double absa = fabs(a), absb = fabs(b);
9718 return (absa > absb) ?
9719 absa * sqrt(1+ (absb/absa)*(absb/absa)) :
9722 absb * sqrt(1+ (absa/absb)*(absa/absb));
9725 void tqli(double *d, double *e, int n, int np, double *z)
9727 #define z(i,j) z[(j-1)*np + (i-1)]
9731 int i = 0, iter = 0, k = 0, l = 0, m = 0;
9732 double b = 0, c = 0, dd = 0, f = 0, g = 0, p = 0, r = 0, s = 0;
9738 for (l = 1; l <= n; l++)
9743 for (m = l; (m < n); m++)
9745 dd = fabs(d(m))+fabs(d(m+1));
9747 if (fabs(e(m))+dd == dd)
9756 g = (d(l+1)-d(l))/(2*e(l));
9758 g = d(m)-d(l)+e(l)/(g+(g<0?-r:r));
9763 for (i = m-1; i>=l; i--)
9780 r = (d(i)-g)*s + 2*c*b;
9784 for (k=1; k <= n; k++)
9787 z(k,i+1) = s * z(k,i) + c*f;
9788 z(k,i) = c * z(k,i) - s*f;
9807 inline float mm_sum(register __m128 sum) {
9809 /* stupider but faster */
9811 _mm_store_ps(f,sum);
9812 return(f[0]+f[1]+f[2]+f[3]);
9814 /* first we get sum[0]+sum[1], sum[2]+sum[3] by selecting 0/1 and 2/3 */
9815 sum = _mm_add_ps(sum,_mm_shuffle_ps(sum,sum,_MM_SHUFFLE(0,1,2,3)));
9816 /* then get sum[0]+sum[1]+sum[2]+sum[3] by selecting 0/1 and 0/1 */
9817 sum = _mm_add_ps(sum,_mm_shuffle_ps(sum,sum,_MM_SHUFFLE(0,1,0,1)));
9819 _mm_store_ss(&f, sum); /* save the lowest word */
9825 void vector_multiply(/*IN*/numeric_t *f1, /*IN*/numeric_t *f2, int n, /*OUT*/numeric_t *fOut) {
9828 for (i = 0; i < n; i += 4) {
9830 a = _mm_load_ps(f1+i);
9831 b = _mm_load_ps(f2+i);
9832 c = _mm_mul_ps(a, b);
9833 _mm_store_ps(fOut+i,c);
9837 for (i = 0; i < n; i++)
9838 fOut[i] = f1[i]*f2[i];
9842 numeric_t vector_multiply_sum(/*IN*/numeric_t *f1, /*IN*/numeric_t *f2, int n) {
9845 return(f1[0]*f2[0]+f1[1]*f2[1]+f1[2]*f2[2]+f1[3]*f2[3]);
9846 __m128 sum = _mm_setzero_ps();
9848 for (i = 0; i < n; i += 4) {
9850 a = _mm_load_ps(f1+i);
9851 b = _mm_load_ps(f2+i);
9852 c = _mm_mul_ps(a, b);
9853 sum = _mm_add_ps(c, sum);
9855 return(mm_sum(sum));
9858 numeric_t out = 0.0;
9859 for (i=0; i < n; i++)
9866 numeric_t vector_multiply3_sum(/*IN*/numeric_t *f1, /*IN*/numeric_t *f2, /*IN*/numeric_t* f3, int n) {
9868 __m128 sum = _mm_setzero_ps();
9870 for (i = 0; i < n; i += 4) {
9872 a1 = _mm_load_ps(f1+i);
9873 a2 = _mm_load_ps(f2+i);
9874 a3 = _mm_load_ps(f3+i);
9875 sum = _mm_add_ps(_mm_mul_ps(_mm_mul_ps(a1,a2),a3),sum);
9877 return(mm_sum(sum));
9880 numeric_t sum = 0.0;
9881 for (i = 0; i < n; i++)
9882 sum += f1[i]*f2[i]*f3[i];
9887 numeric_t vector_dot_product_rot(/*IN*/numeric_t *f1, /*IN*/numeric_t *f2, /*IN*/numeric_t *fBy, int n) {
9889 __m128 sum1 = _mm_setzero_ps();
9890 __m128 sum2 = _mm_setzero_ps();
9892 for (i = 0; i < n; i += 4) {
9894 a1 = _mm_load_ps(f1+i);
9895 a2 = _mm_load_ps(f2+i);
9896 aBy = _mm_load_ps(fBy+i);
9897 sum1 = _mm_add_ps(_mm_mul_ps(a1, aBy), sum1);
9898 sum2 = _mm_add_ps(_mm_mul_ps(a2, aBy), sum2);
9900 return(mm_sum(sum1)*mm_sum(sum2));
9903 numeric_t out1 = 0.0;
9904 numeric_t out2 = 0.0;
9905 for (i=0; i < n; i++) {
9906 out1 += f1[i]*fBy[i];
9907 out2 += f2[i]*fBy[i];
9913 numeric_t vector_sum(/*IN*/numeric_t *f1, int n) {
9916 return(f1[0]+f1[1]+f1[2]+f1[3]);
9917 __m128 sum = _mm_setzero_ps();
9919 for (i = 0; i < n; i+=4) {
9921 a = _mm_load_ps(f1+i);
9922 sum = _mm_add_ps(a, sum);
9924 return(mm_sum(sum));
9926 numeric_t out = 0.0;
9928 for (i = 0; i < n; i++)
9934 void vector_multiply_by(/*IN/OUT*/numeric_t *f, /*IN*/numeric_t fBy, int n) {
9937 __m128 c = _mm_set1_ps(fBy);
9938 for (i = 0; i < n; i += 4) {
9940 a = _mm_load_ps(f+i);
9941 b = _mm_mul_ps(a,c);
9942 _mm_store_ps(f+i,b);
9945 for (i = 0; i < n; i++)
9950 void vector_add_mult(/*IN/OUT*/numeric_t *fTot, /*IN*/numeric_t *fAdd, numeric_t weight, int n) {
9953 __m128 w = _mm_set1_ps(weight);
9954 for (i = 0; i < n; i += 4) {
9956 tot = _mm_load_ps(fTot+i);
9957 add = _mm_load_ps(fAdd+i);
9958 _mm_store_ps(fTot+i, _mm_add_ps(tot, _mm_mul_ps(add,w)));
9962 for (i = 0; i < n; i++)
9963 fTot[i] += fAdd[i] * weight;
9967 void matrixt_by_vector4(/*IN*/numeric_t mat[4][MAXCODES], /*IN*/numeric_t vec[4], /*OUT*/numeric_t out[4]) {
9969 /*__m128 v = _mm_load_ps(vec);*/
9970 __m128 o = _mm_setzero_ps();
9972 /* result is a sum of vectors: sum(k) v[k] * mat[k][] */
9973 for (j = 0; j < 4; j++) {
9974 __m128 m = _mm_load_ps(&mat[j][0]);
9975 __m128 vj = _mm_load1_ps(&vec[j]); /* is it faster to shuffle v? */
9976 o = _mm_add_ps(o, _mm_mul_ps(vj,m));
9978 _mm_store_ps(out, o);
9981 for (j = 0; j < 4; j++) {
9983 for (k = 0; k < 4; k++)
9984 sum += vec[k] * mat[k][j];
9990 distance_matrix_t matrixBLOSUM45 =
9994 {0, 1.31097856157468, 1.06573001937323, 1.2682782988532, 0.90471293383305, 1.05855446876905, 1.05232790675508, 0.769574440593014, 1.27579668305679, 0.964604099952603, 0.987178199640556, 1.05007594438157, 1.05464162250736, 1.1985987403937, 0.967404475245526, 0.700490199584332, 0.880060189098976, 1.09748548316685, 1.28141710375267, 0.800038509951648},
9995 {1.31097856157468, 0, 0.8010890222701, 0.953340718498495, 1.36011107208122, 0.631543775840481, 0.791014908659279, 1.15694899265629, 0.761152570032029, 1.45014917711188, 1.17792001455227, 0.394661075648738, 0.998807558909651, 1.135143404599, 1.15432562628921, 1.05309036790541, 1.05010474413616, 1.03938321130789, 0.963216908696184, 1.20274751778601},
9996 {1.06573001937323, 0.8010890222701, 0, 0.488217214273568, 1.10567116937273, 0.814970207038261, 0.810176440932339, 0.746487413974582, 0.61876156253224, 1.17886558630004, 1.52003670190022, 0.808442678243754, 1.2889025816028, 1.16264109995678, 1.18228799147301, 0.679475681649858, 0.853658619686283, 1.68988558988005, 1.24297493464833, 1.55207513886163},
9997 {1.2682782988532, 0.953340718498495, 0.488217214273568, 0, 1.31581050011876, 0.769778474953791, 0.482077627352988, 0.888361752320536, 0.736360849050364, 1.76756333403346, 1.43574761894039, 0.763612910719347, 1.53386612356483, 1.74323672079854, 0.886347403928663, 0.808614044804528, 1.01590147813779, 1.59617804551619, 1.1740494822217, 1.46600946033173},
9998 {0.90471293383305, 1.36011107208122, 1.10567116937273, 1.31581050011876, 0, 1.3836789310481, 1.37553994252576, 1.26740695314856, 1.32361065635259, 1.26087264215993, 1.02417540515351, 1.37259631233791, 1.09416720447891, 0.986982088723923, 1.59321190226694, 0.915638787768407, 0.913042853922533, 1.80744143643002, 1.3294417177004, 0.830022143283238},
9999 {1.05855446876905, 0.631543775840481, 0.814970207038261, 0.769778474953791, 1.3836789310481, 0, 0.506942797642807, 1.17699648087288, 0.614595446514896, 1.17092829494457, 1.19833088638994, 0.637341078675405, 0.806490842729072, 1.83315144709714, 0.932064479113502, 0.850321696813199, 1.06830084665916, 1.05739353225849, 0.979907428113788, 1.5416250309563},
10000 {1.05232790675508, 0.791014908659279, 0.810176440932339, 0.482077627352988, 1.37553994252576, 0.506942797642807, 0, 1.17007322676118, 0.769786956320484, 1.46659942462342, 1.19128214039009, 0.633592151371708, 1.27269395724349, 1.44641491621774, 0.735428579892476, 0.845319988414402, 1.06201695511881, 1.324395996498, 1.22734387448031, 1.53255698189437},
10001 {0.769574440593014, 1.15694899265629, 0.746487413974582, 0.888361752320536, 1.26740695314856, 1.17699648087288, 1.17007322676118, 0, 1.1259007054424, 1.7025415585924, 1.38293205218175, 1.16756929156758, 1.17264582493965, 1.33271035269688, 1.07564768421292, 0.778868281341681, 1.23287107008366, 0.968539655354582, 1.42479529031801, 1.41208067821187},
10002 {1.27579668305679, 0.761152570032029, 0.61876156253224, 0.736360849050364, 1.32361065635259, 0.614595446514896, 0.769786956320484, 1.1259007054424, 0, 1.4112324673522, 1.14630894167097, 0.967795284542623, 0.771479459384692, 1.10468029976148, 1.12334774065132, 1.02482926701639, 1.28754326478771, 1.27439749294131, 0.468683841672724, 1.47469999960758},
10003 {0.964604099952603, 1.45014917711188, 1.17886558630004, 1.76756333403346, 1.26087264215993, 1.17092829494457, 1.46659942462342, 1.7025415585924, 1.4112324673522, 0, 0.433350517223017, 1.463460928818, 0.462965544381851, 0.66291968000662, 1.07010201755441, 1.23000200130049, 0.973485453109068, 0.963546200571036, 0.708724769805536, 0.351200119909572},
10004 {0.987178199640556, 1.17792001455227, 1.52003670190022, 1.43574761894039, 1.02417540515351, 1.19833088638994, 1.19128214039009, 1.38293205218175, 1.14630894167097, 0.433350517223017, 0, 1.49770950074319, 0.473800072611076, 0.538473125003292, 1.37979627224964, 1.5859723170438, 0.996267398224516, 0.986095542821092, 0.725310666139274, 0.570542199221932},
10005 {1.05007594438157, 0.394661075648738, 0.808442678243754, 0.763612910719347, 1.37259631233791, 0.637341078675405, 0.633592151371708, 1.16756929156758, 0.967795284542623, 1.463460928818, 1.49770950074319, 0, 1.0079761868248, 1.44331961488922, 0.924599080166146, 1.06275728888356, 1.05974425835993, 1.04892430642749, 0.972058829603409, 1.21378822764856},
10006 {1.05464162250736, 0.998807558909651, 1.2889025816028, 1.53386612356483, 1.09416720447891, 0.806490842729072, 1.27269395724349, 1.17264582493965, 0.771479459384692, 0.462965544381851, 0.473800072611076, 1.0079761868248, 0, 0.72479754849538, 1.1699868662153, 1.34481214251794, 1.06435197383538, 1.05348497728858, 0.774878150710318, 0.609532859331199},
10007 {1.1985987403937, 1.135143404599, 1.16264109995678, 1.74323672079854, 0.986982088723923, 1.83315144709714, 1.44641491621774, 1.33271035269688, 1.10468029976148, 0.66291968000662, 0.538473125003292, 1.44331961488922, 0.72479754849538, 0, 1.32968844979665, 1.21307373491949, 0.960087571600877, 0.475142555482979, 0.349485367759138, 0.692733248746636},
10008 {0.967404475245526, 1.15432562628921, 1.18228799147301, 0.886347403928663, 1.59321190226694, 0.932064479113502, 0.735428579892476, 1.07564768421292, 1.12334774065132, 1.07010201755441, 1.37979627224964, 0.924599080166146, 1.1699868662153, 1.32968844979665, 0, 0.979087429691819, 0.97631161216338, 1.21751652292503, 1.42156458605332, 1.40887880416009},
10009 {0.700490199584332, 1.05309036790541, 0.679475681649858, 0.808614044804528, 0.915638787768407, 0.850321696813199, 0.845319988414402, 0.778868281341681, 1.02482926701639, 1.23000200130049, 1.5859723170438, 1.06275728888356, 1.34481214251794, 1.21307373491949, 0.979087429691819, 0, 0.56109848274013, 1.76318885009194, 1.29689226231656, 1.02015839286433},
10010 {0.880060189098976, 1.05010474413616, 0.853658619686283, 1.01590147813779, 0.913042853922533, 1.06830084665916, 1.06201695511881, 1.23287107008366, 1.28754326478771, 0.973485453109068, 0.996267398224516, 1.05974425835993, 1.06435197383538, 0.960087571600877, 0.97631161216338, 0.56109848274013, 0, 1.39547634461879, 1.02642577026706, 0.807404666228614},
10011 {1.09748548316685, 1.03938321130789, 1.68988558988005, 1.59617804551619, 1.80744143643002, 1.05739353225849, 1.324395996498, 0.968539655354582, 1.27439749294131, 0.963546200571036, 0.986095542821092, 1.04892430642749, 1.05348497728858, 0.475142555482979, 1.21751652292503, 1.76318885009194, 1.39547634461879, 0, 0.320002937404137, 1.268589159299},
10012 {1.28141710375267, 0.963216908696184, 1.24297493464833, 1.1740494822217, 1.3294417177004, 0.979907428113788, 1.22734387448031, 1.42479529031801, 0.468683841672724, 0.708724769805536, 0.725310666139274, 0.972058829603409, 0.774878150710318, 0.349485367759138, 1.42156458605332, 1.29689226231656, 1.02642577026706, 0.320002937404137, 0, 0.933095433689795},
10013 {0.800038509951648, 1.20274751778601, 1.55207513886163, 1.46600946033173, 0.830022143283238, 1.5416250309563, 1.53255698189437, 1.41208067821187, 1.47469999960758, 0.351200119909572, 0.570542199221932, 1.21378822764856, 0.609532859331199, 0.692733248746636, 1.40887880416009, 1.02015839286433, 0.807404666228614, 1.268589159299, 0.933095433689795, 0}
10017 {-0.216311217101265, -0.215171653035930, -0.217000020881064, -0.232890860601250, -0.25403526530177, -0.211569372858927, -0.218073620637049, -0.240585637190076, -0.214507049619293, -0.228476323330312, -0.223235445346107, -0.216116483840334, -0.206903836810903, -0.223553828183343, -0.236937609127783, -0.217652789023588, -0.211982652566286, -0.245995223308316, -0.206187718714279, -0.227670670439422},
10018 {-0.0843931919568687, -0.0342164464991033, 0.393702284928246, -0.166018266253027, 0.0500896782860136, -0.262731388032538, 0.030139964190519, -0.253997503551094, -0.0932603349591988, -0.32884667697173, 0.199966846276877, -0.117543453869516, 0.196248237055757, -0.456448703853250, 0.139286961076387, 0.241166801918811, -0.0783508285295053, 0.377438091416498, 0.109499076984234, 0.128581669647144},
10019 {-0.0690428674271772, 0.0133858672878363, -0.208289917312908, 0.161232925220819, 0.0735806288007248, -0.316269599838174, -0.0640708424745702, -0.117078801507436, 0.360805085405857, 0.336899760384943, 0.0332447078185156, 0.132954055834276, 0.00595209121998118, -0.157755611190327, -0.199839273133436, 0.193688928807663, 0.0970290928040946, 0.374683975138541, -0.478110944870958, -0.243290196936098},
10020 {0.117284581850481, 0.310399467781876, -0.143513477698805, 0.088808130300351, 0.105747812943691, -0.373871701179853, 0.189069306295134, 0.133258225034741, -0.213043549687694, 0.301303731259140, -0.182085224761849, -0.161971915020789, 0.229301173581378, -0.293586313243755, -0.0260480060747498, -0.0217953684540699, 0.0202675755458796, -0.160134624443657, 0.431950096999465, -0.329885160320501},
10021 {0.256496969244703, 0.0907408349583135, 0.0135731083898029, 0.477557831930769, -0.0727379669280703, 0.101732675207959, -0.147293025369251, -0.348325291603251, -0.255678082078362, -0.187092643740172, -0.177164064346593, -0.225921480146133, 0.422318841046522, 0.319959853469398, -0.0623652546300045, 0.0824203908606883, -0.102057926881110, 0.120728407576411, -0.156845807891241, -0.123528163091204},
10022 {-0.00906668858975576, -0.0814722888231236, -0.0762715085459023, 0.055819989938286, -0.0540516675257271, -0.0070589302769034, -0.315813159989213, -0.0103527463419808, -0.194634331372293, -0.0185860407566822, 0.50134169352609, 0.384531812730061, -0.0405008616742061, 0.0781033650669525, 0.069334900096687, 0.396455180448549, -0.204065801866462, -0.215272089630713, 0.171046818996465, -0.396393364716348},
10023 {0.201971098571663, 0.489747667606921, 0.00226258734592836, 0.0969514005747054, 0.0853921636903791, 0.0862068740282345, -0.465412154271164, -0.130516676347786, 0.165513616974634, 0.0712238027886633, 0.140746943067963, -0.325919272273406, -0.421213488261598, -0.163508199065965, 0.269695802810568, -0.110296405171437, -0.106834099902202, 0.00509414588152415, 0.00909215239544615, 0.0500401865589727},
10024 {0.515854176692456, -0.087468413428258, 0.102796468891449, -0.06046105990993, -0.212014383772414, -0.259853648383794, -0.0997372883043333, -0.109934574535736, 0.284891018406112, -0.250578342940183, 0.142174204994568, 0.210384918947619, 0.118803190788946, -0.0268434355996836, 0.0103721198836548, -0.355555176478458, 0.428042332431476, -0.150610175411631, 0.0464090887952940, -0.140238796382057},
10025 {-0.239392215229762, -0.315483492656425, 0.100205194952396, 0.197830195325302, 0.40178804665223, 0.195809461460298, -0.407817115321684, 0.0226836686147386, -0.169780276210306, 0.0818161585952184, -0.172886230584939, 0.174982644851064, 0.0868786992159535, -0.198450519980824, 0.168581078329968, -0.361514336004068, 0.238668430084722, 0.165494019791904, 0.110437707249228, -0.169592003035203},
10026 {-0.313151735678025, 0.10757884850664, -0.49249098807229, 0.0993472335619114, -0.148695715250836, 0.0573801136941699, -0.190040373500722, 0.254848437434773, 0.134147888304352, -0.352719341442756, 0.0839609323513986, -0.207904182300122, 0.253940523323376, -0.109832138553288, 0.0980084518687944, 0.209026594443723, 0.406236051871548, -0.0521120230935943, 0.0554108014592302, 0.134681046631955},
10027 {-0.102905214421384, 0.235803606800009, 0.213414976431981, -0.253606415825635, 0.00945656859370683, 0.259551282655855, 0.159527348902192, 0.083218761193016, -0.286815935191867, 0.0135069477264877, 0.336758103107357, -0.271707359524149, -0.0400009875851839, 0.0871186292716414, -0.171506310409388, -0.0954276577211755, 0.393467571460712, 0.111732846649458, -0.239886066474217, -0.426474828195231},
10028 {-0.0130795552324104, 0.0758967690968058, -0.165099404017689, -0.46035152559912, 0.409888158016031, -0.0235053940299396, 0.0699393201709723, -0.161320910316996, 0.226111732196825, -0.177811841258496, -0.219073917645916, -0.00703219376737286, 0.162831878334912, 0.271670554900684, 0.451033612762052, 0.0820942662443393, -0.0904983490498446, -0.0587000279313978, -0.0938852980928252, -0.306078621571843},
10029 {0.345092040577428, -0.257721588971295, -0.301689123771848, -0.0875212184538126, 0.161012613069275, 0.385104899829821, 0.118355290985046, -0.241723794416731, 0.083201920119646, -0.0809095291508749, -0.0820275390511991, -0.115569770103317, -0.250105681098033, -0.164197583037664, -0.299481453795592, 0.255906951902366, 0.129042051416371, 0.203761730442746, 0.347550071284268, -0.109264854744020},
10030 {0.056345924962239, 0.072536751679082, 0.303127492633681, -0.368877185781648, -0.343024497082421, 0.206879529669083, -0.413012709639426, 0.078538816203612, 0.103382383425097, 0.288319996147499, -0.392663258459423, 0.0319588502083897, 0.220316797792669, -0.0563686494606947, -0.0869286063283735, 0.323677017794391, 0.0984875197088935, -0.0303289828821742, 0.0450197853450979, -0.0261771221270139},
10031 {-0.253701638374729, -0.148922815783583, 0.111794052194159, 0.157313977830326, -0.269846001260543, -0.222989872703583, 0.115441028189268, -0.350456582262355, -0.0409581422905941, 0.174078744248002, -0.130673397086811, -0.123963802708056, -0.351609207081548, 0.281548012920868, 0.340382662112428, 0.180262131025562, 0.3895263830793, 0.0121546812430960, 0.214830943227063, -0.0617782909660214},
10032 {-0.025854479416026, 0.480654788977767, -0.138024550829229, -0.130191670810919, 0.107816875829919, -0.111243997319276, -0.0679814460571245, -0.183167991080677, -0.363355166018786, -0.183934891092050, -0.216097125080962, 0.520240628803255, -0.179616013606479, 0.0664131536100941, -0.178350708111064, 0.0352047611606709, 0.223857228692892, 0.128363679623513, -0.000403433628490731, 0.224972110977704},
10033 {0.159207394033448, -0.0371517305736114, -0.294302634912281, -0.0866954375908417, -0.259998567870054, 0.284966673982689, 0.205356416771391, -0.257613708650298, -0.264820519037270, 0.293359248624603, 0.0997476397434102, 0.151390539497369, 0.165571346773648, -0.347569523551258, 0.43792310820533, -0.0723248163210163, 0.0379214984816955, -0.0542758730251438, -0.258020301801603, 0.128680501102363},
10034 {0.316853842351797, -0.153950010941153, -0.13387065213508, -0.0702971390607613, -0.202558481846057, -0.172941438694837, -0.068882524588574, 0.524738203063889, -0.271670479920716, -0.112864756695310, -0.146831636946145, -0.0352336188578041, -0.211108490884767, 0.097857111349555, 0.276459740956662, 0.0231297536754823, -0.0773173324868396, 0.487208384389438, -0.0734191389266824, -0.113198765573319},
10035 {-0.274285525741087, 0.227334266052039, -0.0973746625709059, -0.00965256583655389, -0.402438444750043, 0.198586229519026, 0.0958135064575833, -0.108934376958686, 0.253641732094319, -0.0551918478254021, 0.0243640218331436, 0.181936272247179, 0.090952738347629, 0.0603352483029044, -0.0043821671755761, -0.347720824658591, -0.267879988539971, 0.403804652116592, 0.337654323971186, -0.241509293972297},
10036 {-0.0197089518344238, 0.139681034626696, 0.251980475788267, 0.341846624362846, -0.075141195125153, 0.2184951591319, 0.268870823491343, 0.150392399018138, 0.134592404015057, -0.337050200539163, -0.313109373497998, 0.201993318439135, -0.217140733851970, -0.337622749083808, 0.135253284365068, 0.181729249828045, -0.00627813335422765, -0.197218833324039, -0.194060005031698, -0.303055888528004}
10040 20.29131, 0.5045685, 0.2769945, 0.1551147, 0.03235484, -0.04127639, -0.3516426, -0.469973, -0.5835191, -0.6913107, -0.7207972, -0.7907875, -0.9524307, -1.095310, -1.402153, -1.424179, -1.936704, -2.037965, -3.273561, -5.488734
10042 /*eigentot and codeFreq left out, these are initialized elsewhere*/
10045 /* The JTT92 matrix, D. T. Jones, W. R. Taylor, & J. M. Thorton, CABIOS 8:275 (1992)
10046 Derived from the PhyML source code (models.c) by filling in the other side of the symmetric matrix,
10047 scaling the entries by the stationary rate (to give the rate of a->b not b|a), to set the diagonals
10048 so the rows sum to 0, to rescale the matrix so that the implied rate of evolution is 1.
10049 The resulting matrix is the transpose (I think).
10054 for (i=0; i<20; i++) for (j=0; j<i; j++) daa[j*20+i] = daa[i*20+j];
10055 for (i = 0; i < 20; i++) for (j = 0; j < 20; j++) daa[i*20+j] *= pi[j] / 100.0;
10056 double mr = 0; /* mean rate */
10057 for (i = 0; i < 20; i++) {
10059 for (j = 0; j < 20; j++)
10060 sum += daa[i*20+j];
10061 daa[i*20+i] = -sum;
10064 for (i = 0; i < 20*20; i++)
10069 double statJTT92[MAXCODES] = {0.07674789,0.05169087,0.04264509,0.05154407,0.01980301,0.04075195,0.06182989,0.07315199,0.02294399,0.05376110,0.09190390,0.05867583,0.02382594,0.04012589,0.05090097,0.06876503,0.05856501,0.01426057,0.03210196,0.06600504};
10070 double matrixJTT92[MAXCODES][MAXCODES] = {
10071 { -1.247831,0.044229,0.041179,0.061769,0.042704,0.043467,0.08007,0.136501,0.02059,0.027453,0.022877,0.02669,0.041179,0.011439,0.14794,0.288253,0.362223,0.006863,0.008388,0.227247 },
10072 { 0.029789,-1.025965,0.023112,0.008218,0.058038,0.159218,0.014895,0.070364,0.168463,0.011299,0.019517,0.33179,0.022599,0.002568,0.038007,0.051874,0.032871,0.064714,0.010272,0.008731 },
10073 { 0.022881,0.019068,-1.280568,0.223727,0.014407,0.03644,0.024576,0.034322,0.165676,0.019915,0.005085,0.11144,0.012712,0.004237,0.006356,0.213134,0.098304,0.00339,0.029661,0.00678 },
10074 { 0.041484,0.008194,0.270413,-1.044903,0.005121,0.025095,0.392816,0.066579,0.05736,0.005634,0.003585,0.013316,0.007682,0.002049,0.007682,0.030217,0.019462,0.002049,0.023559,0.015877 },
10075 { 0.011019,0.022234,0.00669,0.001968,-0.56571,0.001771,0.000984,0.011609,0.013577,0.003345,0.004526,0.001377,0.0061,0.015348,0.002755,0.043878,0.008264,0.022628,0.041124,0.012199 },
10076 { 0.02308,0.125524,0.034823,0.019841,0.003644,-1.04415,0.130788,0.010528,0.241735,0.003644,0.029154,0.118235,0.017411,0.00162,0.066406,0.021461,0.020651,0.007288,0.009718,0.008098 },
10077 { 0.064507,0.017816,0.035632,0.471205,0.003072,0.198435,-0.944343,0.073107,0.015973,0.007372,0.005529,0.111197,0.011058,0.003072,0.011058,0.01843,0.019659,0.006143,0.0043,0.027646 },
10078 { 0.130105,0.099578,0.058874,0.09449,0.042884,0.018898,0.086495,-0.647831,0.016717,0.004361,0.004361,0.019625,0.010176,0.003634,0.017444,0.146096,0.023986,0.039976,0.005815,0.034162 },
10079 { 0.006155,0.074775,0.089138,0.025533,0.01573,0.1361,0.005927,0.005243,-1.135695,0.003648,0.012767,0.010259,0.007523,0.009119,0.026217,0.016642,0.010487,0.001824,0.130629,0.002508 },
10080 { 0.01923,0.011752,0.025106,0.005876,0.009081,0.004808,0.00641,0.003205,0.008547,-1.273602,0.122326,0.011218,0.25587,0.047542,0.005342,0.021367,0.130873,0.004808,0.017094,0.513342 },
10081 { 0.027395,0.0347,0.010958,0.006392,0.021003,0.065748,0.008219,0.005479,0.051137,0.209115,-0.668139,0.012784,0.354309,0.226465,0.093143,0.053877,0.022829,0.047485,0.021916,0.16437 },
10082 { 0.020405,0.376625,0.153332,0.015158,0.004081,0.170239,0.105525,0.015741,0.026235,0.012243,0.008162,-0.900734,0.037896,0.002332,0.012243,0.027401,0.06005,0.00583,0.004664,0.008162 },
10083 { 0.012784,0.010416,0.007102,0.003551,0.007339,0.01018,0.004261,0.003314,0.007812,0.113397,0.091854,0.015388,-1.182051,0.01018,0.003788,0.006865,0.053503,0.005682,0.004261,0.076466 },
10084 { 0.00598,0.001993,0.003987,0.001595,0.031098,0.001595,0.001993,0.001993,0.015948,0.035484,0.098877,0.001595,0.017144,-0.637182,0.006778,0.03668,0.004784,0.021131,0.213701,0.024719 },
10085 { 0.098117,0.037426,0.007586,0.007586,0.007081,0.082944,0.009104,0.012138,0.058162,0.005058,0.051587,0.010621,0.008092,0.008598,-0.727675,0.144141,0.059679,0.003035,0.005058,0.011632 },
10086 { 0.258271,0.069009,0.343678,0.040312,0.152366,0.036213,0.020498,0.137334,0.049878,0.02733,0.040312,0.032113,0.019814,0.06286,0.194728,-1.447863,0.325913,0.023914,0.043045,0.025964 },
10087 { 0.276406,0.037242,0.135003,0.022112,0.02444,0.029677,0.018621,0.019203,0.026768,0.142567,0.014548,0.059936,0.131511,0.006983,0.068665,0.27757,-1.335389,0.006983,0.01222,0.065174 },
10088 { 0.001275,0.017854,0.001134,0.000567,0.016295,0.002551,0.001417,0.007793,0.001134,0.001275,0.007368,0.001417,0.003401,0.00751,0.00085,0.004959,0.0017,-0.312785,0.010061,0.003542 },
10089 { 0.003509,0.006379,0.022328,0.014673,0.066664,0.007655,0.002233,0.002552,0.182769,0.010207,0.007655,0.002552,0.005741,0.170967,0.00319,0.020095,0.006698,0.022647,-0.605978,0.005103 },
10090 { 0.195438,0.011149,0.010493,0.020331,0.040662,0.013117,0.029512,0.030824,0.007214,0.630254,0.11805,0.009182,0.211834,0.040662,0.015084,0.024922,0.073453,0.016396,0.010493,-1.241722 }
10093 double statWAG01[MAXCODES] = {0.0866279,0.043972, 0.0390894,0.0570451,0.0193078,0.0367281,0.0580589,0.0832518,0.0244314,0.048466, 0.086209, 0.0620286,0.0195027,0.0384319,0.0457631,0.0695179,0.0610127,0.0143859,0.0352742,0.0708956};
10094 double matrixWAG01[MAXCODES][MAXCODES] = {
10095 {-1.117151, 0.050147, 0.046354, 0.067188, 0.093376, 0.082607, 0.143908, 0.128804, 0.028817, 0.017577, 0.036177, 0.082395, 0.081234, 0.019138, 0.130789, 0.306463, 0.192846, 0.010286, 0.021887, 0.182381},
10096 {0.025455, -0.974318, 0.029321, 0.006798, 0.024376, 0.140086, 0.020267, 0.026982, 0.098628, 0.008629, 0.022967, 0.246964, 0.031527, 0.004740, 0.031358, 0.056495, 0.025586, 0.053714, 0.017607, 0.011623},
10097 {0.020916, 0.026065, -1.452438, 0.222741, 0.010882, 0.063328, 0.038859, 0.046176, 0.162306, 0.022737, 0.005396, 0.123567, 0.008132, 0.003945, 0.008003, 0.163042, 0.083283, 0.002950, 0.044553, 0.008051},
10098 {0.044244, 0.008819, 0.325058, -0.989665, 0.001814, 0.036927, 0.369645, 0.051822, 0.055719, 0.002361, 0.005077, 0.028729, 0.006212, 0.002798, 0.025384, 0.064166, 0.022443, 0.007769, 0.019500, 0.009120},
10099 {0.020812, 0.010703, 0.005375, 0.000614, -0.487357, 0.002002, 0.000433, 0.006214, 0.005045, 0.003448, 0.007787, 0.001500, 0.007913, 0.008065, 0.002217, 0.028525, 0.010395, 0.014531, 0.011020, 0.020307},
10100 {0.035023, 0.117008, 0.059502, 0.023775, 0.003809, -1.379785, 0.210830, 0.012722, 0.165524, 0.004391, 0.033516, 0.150135, 0.059565, 0.003852, 0.035978, 0.039660, 0.033070, 0.008316, 0.008777, 0.011613},
10101 {0.096449, 0.026759, 0.057716, 0.376214, 0.001301, 0.333275, -1.236894, 0.034593, 0.034734, 0.007763, 0.009400, 0.157479, 0.019202, 0.004944, 0.041578, 0.042955, 0.050134, 0.009540, 0.011961, 0.035874},
10102 {0.123784, 0.051085, 0.098345, 0.075630, 0.026795, 0.028838, 0.049604, -0.497615, 0.021792, 0.002661, 0.005356, 0.032639, 0.015212, 0.004363, 0.021282, 0.117240, 0.019732, 0.029444, 0.009052, 0.016361},
10103 {0.008127, 0.054799, 0.101443, 0.023863, 0.006384, 0.110105, 0.014616, 0.006395, -0.992342, 0.003543, 0.012807, 0.022832, 0.010363, 0.017420, 0.017851, 0.018979, 0.012136, 0.006733, 0.099319, 0.003035},
10104 {0.009834, 0.009511, 0.028192, 0.002006, 0.008654, 0.005794, 0.006480, 0.001549, 0.007029, -1.233162, 0.161294, 0.016472, 0.216559, 0.053891, 0.005083, 0.016249, 0.074170, 0.010808, 0.021372, 0.397837},
10105 {0.036002, 0.045028, 0.011900, 0.007673, 0.034769, 0.078669, 0.013957, 0.005547, 0.045190, 0.286902, -0.726011, 0.023303, 0.439180, 0.191376, 0.037625, 0.031191, 0.029552, 0.060196, 0.036066, 0.162890},
10106 {0.058998, 0.348377, 0.196082, 0.031239, 0.004820, 0.253558, 0.168246, 0.024319, 0.057967, 0.021081, 0.016767, -1.124580, 0.060821, 0.005783, 0.036254, 0.062960, 0.090292, 0.008952, 0.008675, 0.019884},
10107 {0.018288, 0.013983, 0.004057, 0.002124, 0.007993, 0.031629, 0.006450, 0.003564, 0.008272, 0.087143, 0.099354, 0.019123, -1.322098, 0.024370, 0.003507, 0.010109, 0.031033, 0.010556, 0.008769, 0.042133},
10108 {0.008490, 0.004143, 0.003879, 0.001885, 0.016054, 0.004030, 0.003273, 0.002014, 0.027402, 0.042734, 0.085315, 0.003583, 0.048024, -0.713669, 0.006512, 0.022020, 0.006934, 0.061698, 0.260332, 0.026213},
10109 {0.069092, 0.032635, 0.009370, 0.020364, 0.005255, 0.044829, 0.032773, 0.011698, 0.033438, 0.004799, 0.019973, 0.026747, 0.008229, 0.007754, -0.605590, 0.077484, 0.038202, 0.006695, 0.010376, 0.015124},
10110 {0.245933, 0.089317, 0.289960, 0.078196, 0.102703, 0.075066, 0.051432, 0.097899, 0.054003, 0.023306, 0.025152, 0.070562, 0.036035, 0.039831, 0.117705, -1.392239, 0.319421, 0.038212, 0.057419, 0.016981},
10111 {0.135823, 0.035501, 0.129992, 0.024004, 0.032848, 0.054936, 0.052685, 0.014461, 0.030308, 0.093371, 0.020915, 0.088814, 0.097083, 0.011008, 0.050931, 0.280341, -1.154973, 0.007099, 0.018643, 0.088894},
10112 {0.001708, 0.017573, 0.001086, 0.001959, 0.010826, 0.003257, 0.002364, 0.005088, 0.003964, 0.003208, 0.010045, 0.002076, 0.007786, 0.023095, 0.002105, 0.007908, 0.001674, -0.466694, 0.037525, 0.005516},
10113 {0.008912, 0.014125, 0.040205, 0.012058, 0.020133, 0.008430, 0.007267, 0.003836, 0.143398, 0.015555, 0.014757, 0.004934, 0.015861, 0.238943, 0.007998, 0.029135, 0.010779, 0.092011, -0.726275, 0.011652},
10114 {0.149259, 0.018739, 0.014602, 0.011335, 0.074565, 0.022417, 0.043805, 0.013932, 0.008807, 0.581952, 0.133956, 0.022726, 0.153161, 0.048356, 0.023429, 0.017317, 0.103293, 0.027186, 0.023418, -1.085487},