2 * FastTree -- inferring approximately-maximum-likelihood trees for large
3 * multiple sequence alignments.
6 * http://www.microbesonline.org/fasttree/
8 * Thanks to Jim Hester of the Cleveland Clinic Foundation for
9 * providing the first parallel (OpenMP) code, Siavash Mirarab of
10 * UT Austin for implementing the WAG option, Samuel Shepard
11 * at the CDC for suggesting and helping with the -quote option, and
12 * Aaron Darling (University of Technology, Sydney) for numerical changes
13 * for wide alignments of closely-related sequences.
15 * Copyright (C) 2008-2015 The Regents of the University of California
16 * All rights reserved.
18 * This program is free software; you can redistribute it and/or modify
19 * it under the terms of the GNU General Public License as published by
20 * the Free Software Foundation; either version 2 of the License, or
21 * (at your option) any later version.
23 * This program is distributed in the hope that it will be useful,
24 * but WITHOUT ANY WARRANTY; without even the implied warranty of
25 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
26 * GNU General Public License for more details.
28 * You should have received a copy of the GNU General Public License along
29 * with this program; if not, write to the Free Software Foundation, Inc.,
30 * 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
31 * or visit http://www.gnu.org/copyleft/gpl.html
35 * NEITHER THE UNITED STATES NOR THE UNITED STATES DEPARTMENT OF ENERGY,
36 * NOR ANY OF THEIR EMPLOYEES, MAKES ANY WARRANTY, EXPRESS OR IMPLIED,
37 * OR ASSUMES ANY LEGAL LIABILITY OR RESPONSIBILITY FOR THE ACCURACY,
38 * COMPLETENESS, OR USEFULNESS OF ANY INFORMATION, APPARATUS, PRODUCT,
39 * OR PROCESS DISCLOSED, OR REPRESENTS THAT ITS USE WOULD NOT INFRINGE
40 * PRIVATELY OWNED RIGHTS.
44 * To compile FastTree, do:
45 * gcc -Wall -O3 -finline-functions -funroll-loops -o FastTree -lm FastTree.c
46 * Use -DNO_SSE to turn off use of SSE3 instructions
47 * (should not be necessary because compiler should not set __SSE__ if
48 * not available, and modern mallocs should return 16-byte-aligned values)
49 * Use -DOPENMP -fopenmp to use multiple threads (note, old versions of gcc
50 * may not support -fopenmp)
51 * Use -DTRACK_MEMORY if you want detailed reports of memory usage,
52 * but results are not correct above 4GB because mallinfo stores int values.
53 * It also makes FastTree run significantly slower.
55 * To get usage guidance, do:
58 * FastTree uses profiles instead of a distance matrix, and computes
59 * support values for each split from the profiles of the 4 nodes
60 * around the split. It stores a profile for each node and a average
61 * profile over all active nodes (the "out-profile" for computing the
62 * total sum of distance to other nodes). The neighbor joining phase
63 * requires O(N*L*a) space, where N is the number of sequences, L is
64 * the alignment width, and a is the alphabet size. The top-hits
65 * heuristic requires an additional O(N sqrt(N)) memory. After
66 * neighbor-joining, FastTree improves the topology with
67 * nearest-neighbor interchanges (NNIs) and subtree-prune-regraft
68 * moves (SPRs), which does not have a significant additional memory
69 * requirement. (We need only store "up-profiles" on the path from our
70 * current traversal point to the root.) These take O(NLa) time per
71 * round, and with default settings, O(N log(N) L a) time total.
72 * FastTree further improves the topology with maximum-likelihood
73 * NNIs, using similar data structures and complexity, but with a
74 * higher constant factor, and now the "profiles" are actually
75 * posterior distributions for that subtree. Finally, FastTree
76 * resamples the site likelihoods around each NNI and uses
77 * the Shimodaira Hasegawa test to estimate the reliability of each split.
79 * Overview of the neighbor-joining phase:
81 * Although FastTree uses a log correction on profile distances to
82 * account for multiple substitutions when doing NNIs and SPRs, the
83 * operations on the profiles themselves involve "additive" distances
84 * -- either %different (for nucleotide) or by using an amino acid
85 * similarity matrix (for proteins). If we are using %different as
86 * our distance matrix then
88 * Profile_distance(A,B) = 1 - sum over characters of freq(A)*freq(B)
90 * and we can average this value over positions. Positions with gaps
91 * are weighted by %ungapped(A) * %ungapped(B).
93 * If we are using an amino acid dissimilarity matrix D(i,j) then at
96 * Profile_distance(A,B) = sum(i,j) freq(A==i) * freq(B==j) * D(i,j)
97 * = sum(k) Ak * Bk * Lambda(k)
99 * where k iterates over 20 eigenvectors, Lambda(k) is the eigenvalue,
100 * and if A==i, then Ak is the kth column of the inverse of the
101 * eigenvector matrix.
103 * The exhaustive approach (-slow) takes O(N**3*L*a) time, but
104 * this can be reduced to as little as O(N**(3/2)*log(N)*L*a) time
105 * by using heuristics.
107 * It uses a combination of three heuristics: a visible set similar to
108 * that of FastTree (Elias & Lagergren 2005), a local hill-climbing
109 * search for a better join (as in relaxed neighbor-joining, Evans et
110 * al. 2006), and a top-hit list to reduce the search space (see
113 * The "visible" set stores, for each node, the best join for that
114 * node, as identified at some point in the past
116 * If top-hits are not being used, then the neighbor-joining phase can
119 * Compute the out-profile by averaging the leaves
120 * Compute the out-distance of each leaf quickly, using the out-profile
121 * Compute the visible set (or approximate it using top-hits, see below)
122 * Until we're down to 3 active nodes:
123 * Find the best join in the visible set
124 * (This involves recomputing the neighbor-joining criterion,
125 * as out-distances and #active nodes may have changed)
126 * Follow a chain of best hits (again recomputing the criterion)
127 * until we find a locally best join, as in relaxed neighbor joining
128 * Create a profile of the parent node, either using simple averages (default)
129 * or using weighted joining as in BIONJ (if -bionj was specified)
130 * Update the out-profile and the out-distances
131 * Update the visible set:
132 * find the best join for the new joined node
133 * replace hits to the joined children with hits to the parent
134 * if we stumble across a join for the new node that is better
135 * than the corresponding entry in the visible set, "reset"
138 * For each iteration, this method does
139 * O(N) work to find the best hit in the visible set
140 * O(L*N*a*log(N)) work to do the local search, where log(N)
141 * is a pessimistic estimate of the number of iterations. In
142 * practice, we average <1 iteration for 2,000 sequences.
143 * With -fastest, this step is omitted.
144 * O(N*a) work to compute the joined profile and update the out-profile
145 * O(L*N*a) work to update the out-distances
146 * O(L*N*a) work to compare the joined profile to the other nodes
147 * (to find the new entry in the visible set)
149 * and there are N-3 iterations, so it takes O(N**2 * L * log(N) * a) time.
151 * The profile distances give exactly the same result as matrix
152 * distances in neighbor-joining or BIONJ would if there are no gaps
153 * in the alignment. If there are gaps, then it is an
154 * approximation. To get the same result we also store a "diameter"
155 * for each node (diameter is 0 for leaves).
157 * In the simpler case (NJ rather than BIONJ), when we join A and B to
158 * give a new node AB,
160 * Profile(AB) = (A+B)/2
161 * Profile_distance(AB,C) = (Profile_distance(A,C)+Profile_distance(B,C))/2
162 * because the formulas above are linear
164 * And according to the neighor-joining rule,
165 * d(AB,C) = (d(A,C)+d(B,C)-d(A,B))/2
167 * and we can achieve the same value by writing
168 * diameter(AB) = pd(A,B)/2
170 * d(A,B) = pd(A,B) - diameter(A) - diameter(B)
173 * d(AB,C) = (d(A,C)+d(B,C)-d(A,B))/2
174 * = (pd(A,C)-diam(A)-diam(C)+pd(B,C)-diam(B)-diam(C)-d(A,B)+diam(A)+diam(B))/2
175 * = (pd(A,C)+pd(B,C))/2 - diam(C) - pd(A,B)
176 * = pd(AB,C) - diam(AB) - diam(C)
178 * If we are using BIONJ, with weight lambda for the join:
179 * Profile(AB) = lambda*A + (1-lambda)*B
180 * then a similar argument gives
181 * diam(AB) = lambda*diam(A) + (1-lambda)*diam(B) + lambda*d(A,AB) + (1-lambda)*d(B,AB),
183 * where, as in neighbor joining,
184 * d(A,AB) = d(A,B) + (total out_distance(A) - total out_distance(B))/(n-2)
186 * A similar recursion formula works for the "variance" matrix of BIONJ,
187 * var(AB,C) = lambda*var(A,C) + (1-lambda)*var(B,C) - lambda*(1-lambda)*var(A,B)
189 * var(A,B) = pv(A,B) - vd(A) - vd(B), where
191 * vd(A) = 0 for leaves
192 * vd(AB) = lambda*vd(A) + (1-lambda)*vd(B) + lambda*(1-lambda)*var(A,B)
194 * The top-hist heuristic to reduce the work below O(N**2*L) stores a top-hit
195 * list of size m=sqrt(N) for each active node.
197 * The list can be initialized for all the leaves in sub (N**2 * L) time as follows:
198 * Pick a "seed" sequence and compare it to all others
199 * Store the top m hits of the seed as its top-hit list
200 * Take "close" hits of the seed(within the top m, and see the "close" parameter),
201 * and assume that their top m hits lie within the top 2*m hits of the seed.
202 * So, compare them to the seed's neighors (if they do not already
203 * have a top hit list) and set their top hits.
205 * This method does O(N*L) work for each seed, or O(N**(3/2)*L) work total.
207 * To avoid doing O(N*L) work at each iteration, we need to avoid
208 * updating the visible set and the out-distances. So, we use "stale"
209 * out-distances, and when searching the visible set for the best hit,
210 * we only inspect the top m=sqrt(N) entries. We then update those
211 * out-distances (up to 2*m*L*a work) and then find the best hit.
213 * To avoid searching the entire visible set, FastTree keeps
214 * and updates a list of the top sqrt(N) entries in the visible set.
215 * This costs O(sqrt(N)) time per join to find the best entry and to
216 * update, or (N sqrt(N)) time overall.
218 * Similarly, when doing the local hill-climbing, we avoid O(N*L) work
219 * by only considering the top-hits for the current node. So this adds
220 * O(m*a*log(N)) work per iteration.
222 * When we join two nodes, we compute profiles and update the
223 * out-profile as before. We need to compute the best hits of the node
224 * -- we merge the lists for the children and select the best up-to-m
225 * hits. If the top hit list contains a stale node we replace it with
226 * its parent. If we still have <m/2 entries, we do a "refresh".
228 * In a "refresh", similar to the fast top-hit computation above, we
229 * compare the "seed", in this case the new joined node, to all other
230 * nodes. We compare its close neighbors (the top m hits) to all
231 * neighbors (the top 2*m hits) and update the top-hit lists of all
232 * neighbors (by merging to give a list of 3*m entries and then
233 * selecting the best m entries).
235 * Finally, during these processes we update the visible sets for
236 * other nodes with better hits if we find them, and we set the
237 * visible entry for the new joined node to the best entry in its
238 * top-hit list. (And whenever we update a visible entry, we
239 * do O(sqrt(N)) work to update the top-visible list.)
240 * These udpates are not common so they do not alter the
241 * O(N sqrt(N) log(N) L a) total running time for the joining phase.
243 * Second-level top hits
245 * With -fastest or with -2nd, FastTree uses an additional "2nd-level" top hits
246 * heuristic to reduce the running time for the top-hits phase to
247 * O(N**1.25 L) and for the neighbor-joining phase to O(N**1.25 L a).
248 * This also reduces the memory usage for the top-hits lists to
249 * O(N**1.25), which is important for alignments with a million
250 * sequences. The key idea is to store just q = sqrt(m) top hits for
253 * Given the neighbors of A -- either for a seed or for a neighbor
254 * from the top-hits heuristic, if B is within the top q hits of A, we
255 * set top-hits(B) from the top 3*q top-hits of A. And, we record that
256 * A is the "source" of the hits for B, so if we run low on hits for
257 * B, instead of doing a full refresh, we can do top-hits(B) :=
258 * top-hits(B) union top-hits(active_ancestor(A)).
259 * During a refresh, these "2nd-level" top hits are updated just as
260 * normal, but the source is maintained and only q entries are stored,
261 * until we near the end of the neighbor joining phase (until the
262 * root as 2*m children or less).
264 * Parallel execution with OpenMP
266 * If you compile FastTree with OpenMP support, it will take
267 * advantage of multiple CPUs on one machine. It will parallelize:
270 * Comparing one node to many others during the NJ phase (the simplest kind of join)
272 * Optimizing likelihoods for 3 alternate topologies during ML NNIs and ML supports
273 * (only 3 threads can be used)
275 * This accounts for most of the O(N L a) or slower steps except for
276 * minimum-evolution NNIs (which are fast anyway), minimum-evolution SPRs,
277 * selecting per-site rates, and optimizing branch lengths outside of ML NNIs.
279 * Parallelizing the top hits phase may lead to a slight change in the tree,
280 * as some top hits are computed from different (and potentially less optimal source).
281 * This means that results on repeated runs may not be 100% identical.
282 * However, this should not have any significant effect on tree quality
283 * after the NNIs and SPRs.
285 * The OpenMP code also turns off the star-topology test during ML
286 * NNIs, which may lead to slight improvements in likelihood.
295 #include <sys/time.h>
299 /* malloc.h apparently doesn't exist on MacOS */
303 /* Compile with -DOPENMP to turn on multithreading */
308 /* By default, tries to compile with SSE instructions for greater speed.
309 But if compiled with -DUSE_DOUBLE, uses double precision instead of single-precision
310 floating point (2x memory required), does not use SSE, and allows much shorter
314 #if !defined(NO_SSE) && !defined(USE_DOUBLE)
321 #define SSE_STRING "Double precision (No SSE3)"
322 typedef double numeric_t;
323 #define ScanNumericSpec "%lf"
325 typedef float numeric_t;
326 #define ScanNumericSpec "%f"
330 #define SSE_STRING "SSE3"
331 #define ALIGNED __attribute__((aligned(16)))
332 #define IS_ALIGNED(X) ((((unsigned long) new) & 15L) == 0L)
333 #include <xmmintrin.h>
338 #define IS_ALIGNED(X) 1
341 #define SSE_STRING "No SSE3"
344 #endif /* USE_SSE3 */
346 #define FT_VERSION "2.1.8"
349 " FastTree protein_alignment > tree\n"
350 " FastTree < protein_alignment > tree\n"
351 " FastTree -out tree protein_alignment\n"
352 " FastTree -nt nucleotide_alignment > tree\n"
353 " FastTree -nt -gtr < nucleotide_alignment > tree\n"
354 " FastTree < nucleotide_alignment > tree\n"
355 "FastTree accepts alignments in fasta or phylip interleaved formats\n"
357 "Common options (must be before the alignment file):\n"
358 " -quiet to suppress reporting information\n"
359 " -nopr to suppress progress indicator\n"
360 " -log logfile -- save intermediate trees, settings, and model details\n"
361 " -fastest -- speed up the neighbor joining phase & reduce memory usage\n"
362 " (recommended for >50,000 sequences)\n"
363 " -n <number> to analyze multiple alignments (phylip format only)\n"
364 " (use for global bootstrap, with seqboot and CompareToBootstrap.pl)\n"
365 " -nosupport to not compute support values\n"
366 " -intree newick_file to set the starting tree(s)\n"
367 " -intree1 newick_file to use this starting tree for all the alignments\n"
368 " (for faster global bootstrap on huge alignments)\n"
369 " -pseudo to use pseudocounts (recommended for highly gapped sequences)\n"
370 " -gtr -- generalized time-reversible model (nucleotide alignments only)\n"
371 " -wag -- Whelan-And-Goldman 2001 model (amino acid alignments only)\n"
372 " -quote -- allow spaces and other restricted characters (but not ' ) in\n"
373 " sequence names and quote names in the output tree (fasta input only;\n"
374 " FastTree will not be able to read these trees back in)\n"
375 " -noml to turn off maximum-likelihood\n"
376 " -nome to turn off minimum-evolution NNIs and SPRs\n"
377 " (recommended if running additional ML NNIs with -intree)\n"
378 " -nome -mllen with -intree to optimize branch lengths for a fixed topology\n"
379 " -cat # to specify the number of rate categories of sites (default 20)\n"
380 " or -nocat to use constant rates\n"
381 " -gamma -- after optimizing the tree under the CAT approximation,\n"
382 " rescale the lengths to optimize the Gamma20 likelihood\n"
383 " -constraints constraintAlignment to constrain the topology search\n"
384 " constraintAlignment should have 1s or 0s to indicates splits\n"
385 " -expert -- see more options\n"
386 "For more information, see http://www.microbesonline.org/fasttree/\n";
389 "FastTree [-nt] [-n 100] [-quote] [-pseudo | -pseudo 1.0]\n"
390 " [-boot 1000 | -nosupport]\n"
391 " [-intree starting_trees_file | -intree1 starting_tree_file]\n"
392 " [-quiet | -nopr]\n"
393 " [-nni 10] [-spr 2] [-noml | -mllen | -mlnni 10]\n"
394 " [-mlacc 2] [-cat 20 | -nocat] [-gamma]\n"
395 " [-slow | -fastest] [-2nd | -no2nd] [-slownni] [-seed 1253] \n"
396 " [-top | -notop] [-topm 1.0 [-close 0.75] [-refresh 0.8]]\n"
397 " [-matrix Matrix | -nomatrix] [-nj | -bionj]\n"
398 " [-wag] [-nt] [-gtr] [-gtrrates ac ag at cg ct gt] [-gtrfreq A C G T]\n"
399 " [ -constraints constraintAlignment [ -constraintWeight 100.0 ] ]\n"
401 " [ alignment_file ]\n"
402 " [ -out output_newick_file | > newick_tree]\n"
406 "FastTree [-nt] [-matrix Matrix | -nomatrix] [-rawdist] -makematrix [alignment]\n"
407 " [-n 100] > phylip_distance_matrix\n"
409 " FastTree supports fasta or phylip interleaved alignments\n"
410 " By default FastTree expects protein alignments, use -nt for nucleotides\n"
411 " FastTree reads standard input if no alignment file is given\n"
413 "Input/output options:\n"
414 " -n -- read in multiple alignments in. This only\n"
415 " works with phylip interleaved format. For example, you can\n"
416 " use it with the output from phylip's seqboot. If you use -n, FastTree\n"
417 " will write 1 tree per line to standard output.\n"
418 " -intree newickfile -- read the starting tree in from newickfile.\n"
419 " Any branch lengths in the starting trees are ignored.\n"
420 " -intree with -n will read a separate starting tree for each alignment.\n"
421 " -intree1 newickfile -- read the same starting tree for each alignment\n"
422 " -quiet -- do not write to standard error during normal operation (no progress\n"
423 " indicator, no options summary, no likelihood values, etc.)\n"
424 " -nopr -- do not write the progress indicator to stderr\n"
425 " -log logfile -- save intermediate trees so you can extract\n"
426 " the trees and restart long-running jobs if they crash\n"
427 " -log also reports the per-site rates (1 means slowest category)\n"
428 " -quote -- quote sequence names in the output and allow spaces, commas,\n"
429 " parentheses, and colons in them but not ' characters (fasta files only)\n"
432 " Default: For protein sequences, log-corrected distances and an\n"
433 " amino acid dissimilarity matrix derived from BLOSUM45\n"
434 " or for nucleotide sequences, Jukes-Cantor distances\n"
435 " To specify a different matrix, use -matrix FilePrefix or -nomatrix\n"
436 " Use -rawdist to turn the log-correction off\n"
437 " or to use %different instead of Jukes-Cantor\n"
439 " -pseudo [weight] -- Use pseudocounts to estimate distances between\n"
440 " sequences with little or no overlap. (Off by default.) Recommended\n"
441 " if analyzing the alignment has sequences with little or no overlap.\n"
442 " If the weight is not specified, it is 1.0\n"
444 "Topology refinement:\n"
445 " By default, FastTree tries to improve the tree with up to 4*log2(N)\n"
446 " rounds of minimum-evolution nearest-neighbor interchanges (NNI),\n"
447 " where N is the number of unique sequences, 2 rounds of\n"
448 " subtree-prune-regraft (SPR) moves (also min. evo.), and\n"
449 " up to 2*log(N) rounds of maximum-likelihood NNIs.\n"
450 " Use -nni to set the number of rounds of min. evo. NNIs,\n"
451 " and -spr to set the rounds of SPRs.\n"
452 " Use -noml to turn off both min-evo NNIs and SPRs (useful if refining\n"
453 " an approximately maximum-likelihood tree with further NNIs)\n"
454 " Use -sprlength set the maximum length of a SPR move (default 10)\n"
455 " Use -mlnni to set the number of rounds of maximum-likelihood NNIs\n"
456 " Use -mlacc 2 or -mlacc 3 to always optimize all 5 branches at each NNI,\n"
457 " and to optimize all 5 branches in 2 or 3 rounds\n"
458 " Use -mllen to optimize branch lengths without ML NNIs\n"
459 " Use -mllen -nome with -intree to optimize branch lengths on a fixed topology\n"
460 " Use -slownni to turn off heuristics to avoid constant subtrees (affects both\n"
463 "Maximum likelihood model options:\n"
464 " -wag -- Whelan-And-Goldman 2001 model instead of (default) Jones-Taylor-Thorton 1992 model (a.a. only)\n"
465 " -gtr -- generalized time-reversible instead of (default) Jukes-Cantor (nt only)\n"
466 " -cat # -- specify the number of rate categories of sites (default 20)\n"
467 " -nocat -- no CAT model (just 1 category)\n"
468 " -gamma -- after the final round of optimizing branch lengths with the CAT model,\n"
469 " report the likelihood under the discrete gamma model with the same\n"
470 " number of categories. FastTree uses the same branch lengths but\n"
471 " optimizes the gamma shape parameter and the scale of the lengths.\n"
472 " The final tree will have rescaled lengths. Used with -log, this\n"
473 " also generates per-site likelihoods for use with CONSEL, see\n"
474 " GammaLogToPaup.pl and documentation on the FastTree web site.\n"
476 "Support value options:\n"
477 " By default, FastTree computes local support values by resampling the site\n"
478 " likelihoods 1,000 times and the Shimodaira Hasegawa test. If you specify -nome,\n"
479 " it will compute minimum-evolution bootstrap supports instead\n"
480 " In either case, the support values are proportions ranging from 0 to 1\n"
482 " Use -nosupport to turn off support values or -boot 100 to use just 100 resamples\n"
483 " Use -seed to initialize the random number generator\n"
485 "Searching for the best join:\n"
486 " By default, FastTree combines the 'visible set' of fast neighbor-joining with\n"
487 " local hill-climbing as in relaxed neighbor-joining\n"
488 " -slow -- exhaustive search (like NJ or BIONJ, but different gap handling)\n"
489 " -slow takes half an hour instead of 8 seconds for 1,250 proteins\n"
490 " -fastest -- search the visible set (the top hit for each node) only\n"
491 " Unlike the original fast neighbor-joining, -fastest updates visible(C)\n"
492 " after joining A and B if join(AB,C) is better than join(C,visible(C))\n"
493 " -fastest also updates out-distances in a very lazy way,\n"
494 " -fastest sets -2nd on as well, use -fastest -no2nd to avoid this\n"
496 "Top-hit heuristics:\n"
497 " By default, FastTree uses a top-hit list to speed up search\n"
498 " Use -notop (or -slow) to turn this feature off\n"
499 " and compare all leaves to each other,\n"
500 " and all new joined nodes to each other\n"
501 " -topm 1.0 -- set the top-hit list size to parameter*sqrt(N)\n"
502 " FastTree estimates the top m hits of a leaf from the\n"
503 " top 2*m hits of a 'close' neighbor, where close is\n"
504 " defined as d(seed,close) < 0.75 * d(seed, hit of rank 2*m),\n"
505 " and updates the top-hits as joins proceed\n"
506 " -close 0.75 -- modify the close heuristic, lower is more conservative\n"
507 " -refresh 0.8 -- compare a joined node to all other nodes if its\n"
508 " top-hit list is less than 80% of the desired length,\n"
509 " or if the age of the top-hit list is log2(m) or greater\n"
510 " -2nd or -no2nd to turn 2nd-level top hits heuristic on or off\n"
511 " This reduces memory usage and running time but may lead to\n"
512 " marginal reductions in tree quality.\n"
513 " (By default, -fastest turns on -2nd.)\n"
516 " -nj: regular (unweighted) neighbor-joining (default)\n"
517 " -bionj: weighted joins as in BIONJ\n"
518 " FastTree will also weight joins during NNIs\n"
520 "Constrained topology search options:\n"
521 " -constraints alignmentfile -- an alignment with values of 0, 1, and -\n"
522 " Not all sequences need be present. A column of 0s and 1s defines a\n"
523 " constrained split. Some constraints may be violated\n"
524 " (see 'violating constraints:' in standard error).\n"
525 " -constraintWeight -- how strongly to weight the constraints. A value of 1\n"
526 " means a penalty of 1 in tree length for violating a constraint\n"
529 "For more information, see http://www.microbesonline.org/fasttree/\n"
530 " or the comments in the source code\n";
536 /* Note -- sequence lines longer than BUFFER_SIZE are
537 allowed, but FASTA header lines must be within this limit */
538 #define BUFFER_SIZE 5000
539 #define MIN(X,Y) ((X) < (Y) ? (X) : (Y))
540 #define MAX(X,Y) ((X) > (Y) ? (X) : (Y))
547 int nSaved; /* actual allocated size of names and seqs */
550 /* For each position in a profile, we have a weight (% non-gapped) and a
551 frequency vector. (If using a matrix, the frequency vector is in eigenspace).
552 We also store codes for simple profile positions (all gaps or only 1 value)
553 If weight[pos] > 0 && codes[pos] == NOCODE then we store the vector
554 vectors itself is sets of nCodes long, so the vector for the ith nonconstant position
555 starts at &vectors[nCodes*i]
557 To speed up comparison of outprofile to a sequence or other simple profile, we also
558 (for outprofiles) store codeDist[iPos*nCodes+k] = dist(k,profile[iPos])
560 For constraints, we store a vector of nOn and nOff
561 If not using constraints, those will be NULL
564 /* alignment profile */
566 unsigned char *codes;
567 numeric_t *vectors; /* NULL if no non-constant positions, e.g. for leaves */
569 numeric_t *codeDist; /* Optional -- distance to each code at each position */
571 /* constraint profile */
576 /* A visible node is a pair of nodes i, j such that j is the best hit of i,
577 using the neighbor-joining criterion, at the time the comparison was made,
578 or approximately so since then.
580 Note that variance = dist because in BIONJ, constant factors of variance do not matter,
581 and because we weight ungapped sequences higher naturally when averaging profiles,
582 so we do not take this into account in the computation of "lambda" for BIONJ.
584 For the top-hit list heuristic, if the top hit list becomes "too short",
585 we store invalid entries with i=j=-1 and dist/criterion very high.
589 numeric_t weight; /* Total product of weights (maximum value is nPos)
590 This is needed for weighted joins and for pseudocounts,
591 but not in most other places.
592 For example, it is not maintained by the top hits code */
593 numeric_t dist; /* The uncorrected distance (includes diameter correction) */
594 numeric_t criterion; /* changes when we update the out-profile or change nActive */
603 /* Distances between amino acids */
604 numeric_t distances[MAXCODES][MAXCODES];
606 /* Inverse of the eigenvalue matrix, for rotating a frequency vector
607 into eigenspace so that profile similarity computations are
608 O(alphabet) not O(alphabet*alphabet) time.
610 numeric_t eigeninv[MAXCODES][MAXCODES];
611 numeric_t eigenval[MAXCODES]; /* eigenvalues */
614 /* eigentot=eigeninv times the all-1s frequency vector
615 useful for normalizing rotated frequency vectors
617 numeric_t eigentot[MAXCODES];
619 /* codeFreq is the transpose of the eigeninv matrix is
620 the rotated frequency vector for each code */
621 numeric_t codeFreq[MAXCODES][MAXCODES];
622 numeric_t gapFreq[MAXCODES];
626 /* A transition matrix gives the instantaneous rate of change of frequencies
630 and which is not a symmetric matrix because of
631 non-uniform stationary frequencies stat, so that
633 M(i,j) is instantaneous rate of j -> i, not of i -> j
635 S = diag(sqrt(stat)) is a correction so that
636 M' = S**-1 M S is symmetric
637 Let W L W**-1 = M' be an eigendecomposition of M'
638 Because M' is symmetric, W can be a rotation, and W**-1 = t(W)
640 M = V L V**-1 is an eigendecomposition of M
641 Note V**-1 = W**-1 S**-1 = t(W) S**-1
643 Evolution by time t is given by
645 exp(M*t) = V exp(L*t) V**-1
646 P(A & B | t) = B . exp(M*t) . (A * stat)
647 note this is *not* the same as P(A->B | t)
649 and we can reduce some of the computations from O(a**2) to O(a) time,
650 where a is the alphabet size, by storing frequency vectors as
651 t(V) . f = t(W) . t(S) . f
654 P(f0 & f1 | t) = f1 . exp(M*t) . f0 * (f0 . stat) = sum(r0j * r1j * exp(l_j*t))
655 where r0 and r1 are the transformed vectors
657 Posterior distribution of P given children f0 and f1 is given by
658 P(i | f0, f1, t0, t1) = stat * P(i->f0 | t0) * P(i->f1 | t1)
659 = P(i & f0 | t0) * P(i & f1 | t1) / stat
660 ~ (V . exp(t0*L) . r0) * (V . exp(t1*L) . r1) / stat
662 When normalize this posterior distribution (to sum to 1), divide by stat,
663 and transform by t(V) -- this is the "profile" of internal nodes
665 To eliminate the O(N**2) step of transforming by t(V), if the posterior
666 distribution of an amino acid is near 1 then we can approximate it by
667 P(i) ~= (i==A) * w + nearP(i) * (1-w), where
668 w is fit so that P(i==A) is correct
669 nearP = Posterior(i | i, i, 0.1, 0.1) [0.1 is an arbitrary choice]
670 and we confirm that the approximation works well before we use it.
672 Given this parameter w we can set
673 rotated_posterior = rotation(w * (i==A)/stat + (1-w) * nearP/stat)
674 = codeFreq(A) * w/stat(A) + nearFreq(A) * (1-w)
677 numeric_t stat[MAXCODES]; /* The stationary distribution */
678 numeric_t statinv[MAXCODES]; /* 1/stat */
679 /* the eigenmatrix, with the eigenvectors as columns and rotations of individual
680 characters as rows. Also includes a NOCODE entry for gaps */
681 numeric_t codeFreq[NOCODE+1][MAXCODES];
682 numeric_t eigeninv[MAXCODES][MAXCODES]; /* Inverse of eigenmatrix */
683 numeric_t eigeninvT[MAXCODES][MAXCODES]; /* transpose of eigeninv */
684 numeric_t eigenval[MAXCODES]; /* Eigenvalues */
685 /* These are for approximate posteriors (off by default) */
686 numeric_t nearP[MAXCODES][MAXCODES]; /* nearP[i][j] = P(parent=j | both children are i, both lengths are 0.1 */
687 numeric_t nearFreq[MAXCODES][MAXCODES]; /* rotation of nearP/stat */
688 } transition_matrix_t;
692 numeric_t *rates; /* 1 per rate category */
693 unsigned int *ratecat; /* 1 category per position */
700 char **seqs; /* the aligment sequences array (not reallocated) */
701 distance_matrix_t *distance_matrix; /* a pointer (not reallocated), or NULL if using %identity distance */
702 transition_matrix_t *transmat; /* a pointer (is allocated), or NULL for Jukes-Cantor */
703 /* Topological constraints are represented for each sequence as binary characters
704 with values of '0', '1', or '-' (for missing data)
705 Sequences that have no constraint may have a NULL string
708 char **constraintSeqs;
710 /* The profile data structures */
711 int maxnode; /* The next index to allocate */
712 int maxnodes; /* Space allocated in data structures below */
713 profile_t **profiles; /* Profiles of leaves and intermediate nodes */
714 numeric_t *diameter; /* To correct for distance "up" from children (if any) */
715 numeric_t *varDiameter; /* To correct variances for distance "up" */
716 numeric_t *selfdist; /* Saved for use in some formulas */
717 numeric_t *selfweight; /* Saved for use in some formulas */
719 /* Average profile of all active nodes, the "outprofile"
720 * If all inputs are ungapped, this has weight 1 (not nSequences) at each position
721 * The frequencies all sum to one (or that is implied by the eigen-representation)
723 profile_t *outprofile;
726 /* We sometimes use stale out-distances, so we remember what nActive was */
727 numeric_t *outDistances; /* Sum of distances to other active (parent==-1) nodes */
728 int *nOutDistActive; /* What nActive was when this outDistance was computed */
730 /* the inferred tree */
731 int root; /* index of the root. Unlike other internal nodes, it has 3 children */
732 int *parent; /* -1 or index of parent */
734 numeric_t *branchlength; /* Distance to parent */
735 numeric_t *support; /* 1 for high-confidence nodes */
737 /* auxilliary data for maximum likelihood (defaults to 1 category of rate=1.0) */
741 /* Uniquify sequences in an alignment -- map from indices
742 in the alignment to unique indicies in a NJ_t
747 int *uniqueFirst; /* iUnique -> iAln */
748 int *alnNext; /* iAln -> next, or -1 */
749 int *alnToUniq; /* iAln -> iUnique, or -1 if another was the exemplar */
750 char **uniqueSeq; /* indexed by iUniq -- points to strings allocated elsewhere */
753 /* Describes which switch to do */
754 typedef enum {ABvsCD,ACvsBD,ADvsBC} nni_t;
756 /* A list of these describes a chain of NNI moves in a rooted tree,
757 making up, in total, an SPR move
761 double deltaLength; /* change in tree length for this step (lower is better) */
764 /* Keep track of hits for the top-hits heuristic without wasting memory
766 If j is an inactive node, this may be replaced by that node's parent (and dist recomputed)
774 int nHits; /* the allocated and desired size; some of them may be empty */
776 int hitSource; /* where to refresh hits from if a 2nd-level top-hit list, or -1 */
777 int age; /* number of joins since a refresh */
781 int m; /* size of a full top hits list, usually sqrt(N) */
782 int q; /* size of a 2nd-level top hits, usually sqrt(m) */
784 top_hits_list_t *top_hits_lists; /* one per node */
785 hit_t *visible; /* the "visible" (very best) hit for each node */
787 /* The top-visible set is a subset, usually of size m, of the visible set --
788 it is the set of joins to select from
789 Each entry is either a node whose visible set entry has a good (low) criterion,
790 or -1 for empty, or is an obsolete node (which is effectively the same).
791 Whenever we update the visible set, should also call UpdateTopVisible()
792 which ensures that none of the topvisible set are stale (that is, they
793 all point to an active node).
795 int nTopVisible; /* nTopVisible = m * topvisibleMult */
798 int topvisibleAge; /* joins since the top-visible list was recomputed */
801 /* 1 lock to read or write any top hits list, no thread grabs more than one */
806 /* Global variables */
809 int showProgress = 1;
812 bool useTopHits2nd = false; /* use the second-level top hits heuristic? */
814 double tophitsMult = 1.0; /* 0 means compare nodes to all other nodes */
815 double tophitsClose = -1.0; /* Parameter for how close is close; also used as a coverage req. */
816 double topvisibleMult = 1.5; /* nTopVisible = m * topvisibleMult; 1 or 2 did not make much difference
817 in either running time or accuracy so I chose a compromise. */
819 double tophitsRefresh = 0.8; /* Refresh if fraction of top-hit-length drops to this */
820 double tophits2Mult = 1.0; /* Second-level top heuristic -- only with -fastest */
821 int tophits2Safety = 3; /* Safety factor for second level of top-hits heuristic */
822 double tophits2Refresh = 0.6; /* Refresh 2nd-level top hits if drops down to this fraction of length */
824 double staleOutLimit = 0.01; /* nActive changes by at most this amount before we recompute
825 an out-distance. (Only applies if using the top-hits heuristic) */
826 double fResetOutProfile = 0.02; /* Recompute out profile from scratch if nActive has changed
827 by more than this proportion, and */
828 int nResetOutProfile = 200; /* nActive has also changed more than this amount */
829 int nCodes=20; /* 20 if protein, 4 if nucleotide */
830 bool useMatrix=true; /* If false, use %different as the uncorrected distance */
831 bool logdist = true; /* If true, do a log-correction (scoredist-like or Jukes-Cantor)
832 but only during NNIs and support values, not during neighbor-joining */
833 double pseudoWeight = 0.0; /* The weight of pseudocounts to avoid artificial long branches when
834 nearby sequences in the tree have little or no overlap
835 (off by default). The prior distance is based on
836 all overlapping positions among the quartet or triplet under
837 consideration. The log correction takes place after the
838 pseudocount is used. */
839 double constraintWeight = 100.0;/* Cost of violation of a topological constraint in evolutionary distance
841 double MEMinDelta = 1.0e-4; /* Changes of less than this in tree-length are discounted for
842 purposes of identifying fixed subtrees */
844 bool gammaLogLk = false; /* compute gamma likelihood without reoptimizing branch lengths? */
846 /* Maximum likelihood options and constants */
847 /* These are used to rescale likelihood values and avoid taking a logarithm at each position */
848 const double LkUnderflow = 1.0e-4;
849 const double LkUnderflowInv = 1.0e4;
850 const double LogLkUnderflow = 9.21034037197618; /* -log(LkUnderflowInv) */
851 const double Log2 = 0.693147180559945;
852 /* These are used to limit the optimization of branch lengths.
853 Also very short branch lengths can create numerical problems.
854 In version 2.1.7, the minimum branch lengths (MLMinBranchLength and MLMinRelBranchLength)
855 were increased to prevent numerical problems in rare cases.
856 In version 2.1.8, to provide useful branch lengths for genome-wide alignments,
857 the minimum branch lengths were dramatically decreased if USE_DOUBLE is defined.
860 const double MLMinBranchLengthTolerance = 1.0e-4; /* absolute tolerance for optimizing branch lengths */
861 const double MLFTolBranchLength = 0.001; /* fractional tolerance for optimizing branch lengths */
862 const double MLMinBranchLength = 5.0e-4; /* minimum value for branch length */
863 const double MLMinRelBranchLength = 2.5e-4; /* minimum of rate * length */
864 const double fPostTotalTolerance = 1.0e-10; /* posterior vector must sum to at least this before rescaling */
866 const double MLMinBranchLengthTolerance = 1.0e-9;
867 const double MLFTolBranchLength = 0.001;
868 const double MLMinBranchLength = 5.0e-9;
869 const double MLMinRelBranchLength = 2.5e-9;
870 const double fPostTotalTolerance = 1.0e-20;
873 int mlAccuracy = 1; /* Rounds of optimization of branch lengths; 1 means do 2nd round only if close */
874 double closeLogLkLimit = 5.0; /* If partial optimization of an NNI looks like it would decrease the log likelihood
875 by this much or more then do not optimize it further */
876 double treeLogLkDelta = 0.1; /* Give up if tree log-lk changes by less than this; NNIs that change
877 likelihood by less than this also are considered unimportant
878 by some heuristics */
879 bool exactML = true; /* Exact or approximate posterior distributions for a.a.s */
880 double approxMLminf = 0.95; /* Only try to approximate posterior distributions if max. value is at least this high */
881 double approxMLminratio = 2/3.0;/* Ratio of approximated/true posterior values must be at least this high */
882 double approxMLnearT = 0.2; /* 2nd component of near-constant posterior distribution uses this time scale */
883 const int nDefaultRateCats = 20;
885 /* Performance and memory usage */
886 long profileOps = 0; /* Full profile-based distance operations */
887 long outprofileOps = 0; /* How many of profileOps are comparisons to outprofile */
888 long seqOps = 0; /* Faster leaf-based distance operations */
889 long profileAvgOps = 0; /* Number of profile-average steps */
890 long nHillBetter = 0; /* Number of hill-climbing steps */
891 long nCloseUsed = 0; /* Number of "close" neighbors we avoid full search for */
892 long nClose2Used = 0; /* Number of "close" neighbors we use 2nd-level top hits for */
893 long nRefreshTopHits = 0; /* Number of full-blown searches (interior nodes) */
894 long nVisibleUpdate = 0; /* Number of updates of the visible set */
895 long nNNI = 0; /* Number of NNI changes performed */
896 long nSPR = 0; /* Number of SPR changes performed */
897 long nML_NNI = 0; /* Number of max-lik. NNI changes performed */
898 long nSuboptimalSplits = 0; /* # of splits that are rejected given final tree (during bootstrap) */
899 long nSuboptimalConstrained = 0; /* Bad splits that are due to constraints */
900 long nConstraintViolations = 0; /* Number of constraint violations */
901 long nProfileFreqAlloc = 0;
902 long nProfileFreqAvoid = 0;
904 long mymallocUsed = 0; /* useful allocations by mymalloc */
905 long maxmallocHeap = 0; /* Maximum of mi.arena+mi.hblkhd from mallinfo (actual mem usage) */
906 long nLkCompute = 0; /* # of likelihood computations for pairs of probability vectors */
907 long nPosteriorCompute = 0; /* # of computations of posterior probabilities */
908 long nAAPosteriorExact = 0; /* # of times compute exact AA posterior */
909 long nAAPosteriorRough = 0; /* # of times use rough approximation */
910 long nStarTests = 0; /* # of times we use star test to avoid testing an NNI */
912 /* Protein character set */
913 unsigned char *codesStringAA = (unsigned char*) "ARNDCQEGHILKMFPSTWYV";
914 unsigned char *codesStringNT = (unsigned char*) "ACGT";
915 unsigned char *codesString = NULL;
917 distance_matrix_t *ReadDistanceMatrix(char *prefix);
918 void SetupDistanceMatrix(/*IN/OUT*/distance_matrix_t *); /* set eigentot, codeFreq, gapFreq */
919 void ReadMatrix(char *filename, /*OUT*/numeric_t codes[MAXCODES][MAXCODES], bool check_codes);
920 void ReadVector(char *filename, /*OUT*/numeric_t codes[MAXCODES]);
921 alignment_t *ReadAlignment(/*READ*/FILE *fp, bool bQuote); /* Returns a list of strings (exits on failure) */
922 alignment_t *FreeAlignment(alignment_t *); /* returns NULL */
923 void FreeAlignmentSeqs(/*IN/OUT*/alignment_t *);
925 /* Takes as input the transpose of the matrix V, with i -> j
926 This routine takes care of setting the diagonals
928 transition_matrix_t *CreateTransitionMatrix(/*IN*/double matrix[MAXCODES][MAXCODES],
929 /*IN*/double stat[MAXCODES]);
930 transition_matrix_t *CreateGTR(double *gtrrates/*ac,ag,at,cg,ct,gt*/, double *gtrfreq/*ACGT*/);
932 /* For converting profiles from 1 rotation to another, or converts NULL to NULL */
933 distance_matrix_t *TransMatToDistanceMat(transition_matrix_t *transmat);
935 /* Allocates memory, initializes leaf profiles */
936 NJ_t *InitNJ(char **sequences, int nSeqs, int nPos,
937 /*IN OPTIONAL*/char **constraintSeqs, int nConstraints,
938 /*IN OPTIONAL*/distance_matrix_t *,
939 /*IN OPTIONAL*/transition_matrix_t *);
941 NJ_t *FreeNJ(NJ_t *NJ); /* returns NULL */
942 void FastNJ(/*IN/OUT*/NJ_t *NJ); /* Does the joins */
943 void ReliabilityNJ(/*IN/OUT*/NJ_t *NJ, int nBootstrap); /* Estimates the reliability of the joins */
945 /* nni_stats_t is meaningless for leaves and root, so all of those entries
946 will just be high (for age) or 0 (for delta)
949 int age; /* number of rounds since this node was modified by an NNI */
950 int subtreeAge; /* number of rounds since self or descendent had a significant improvement */
951 double delta; /* improvement in score for this node (or 0 if no change) */
952 double support; /* improvement of score for self over better of alternatives */
955 /* One round of nearest-neighbor interchanges according to the
956 minimum-evolution or approximate maximum-likelihood criterion.
957 If doing maximum likelihood then this modifies the branch lengths.
958 age is the # of rounds since a node was NNId
959 Returns the # of topological changes performed
961 int NNI(/*IN/OUT*/NJ_t *NJ, int iRound, int nRounds, bool useML,
962 /*IN/OUT*/nni_stats_t *stats,
963 /*OUT*/double *maxDeltaCriterion);
964 nni_stats_t *InitNNIStats(NJ_t *NJ);
965 nni_stats_t *FreeNNIStats(nni_stats_t *, NJ_t *NJ); /* returns NULL */
967 /* One round of subtree-prune-regraft moves (minimum evolution) */
968 void SPR(/*IN/OUT*/NJ_t *NJ, int maxSPRLength, int iRound, int nRounds);
970 /* Recomputes all branch lengths by minimum evolution criterion*/
971 void UpdateBranchLengths(/*IN/OUT*/NJ_t *NJ);
973 /* Recomputes all branch lengths and, optionally, internal profiles */
974 double TreeLength(/*IN/OUT*/NJ_t *NJ, bool recomputeProfiles);
978 int nConstraintViolations;
981 /* How much length would be reduce or likelihood would be increased by the
982 best NNI we find (the worst "miss") */
983 double dWorstDeltaUnconstrained;
984 double dWorstDeltaConstrained;
987 void TestSplitsMinEvo(NJ_t *NJ, /*OUT*/SplitCount_t *splitcount);
989 /* Sets SH-like support values if nBootstrap>0 */
990 void TestSplitsML(/*IN/OUT*/NJ_t *NJ, /*OUT*/SplitCount_t *splitcount, int nBootstrap);
992 /* Pick columns for resampling, stored as returned_vector[iBoot*nPos + j] */
993 int *ResampleColumns(int nPos, int nBootstrap);
995 /* Use out-profile and NJ->totdiam to recompute out-distance for node iNode
996 Only does this computation if the out-distance is "stale" (nOutDistActive[iNode] != nActive)
997 Note "IN/UPDATE" for NJ always means that we may update out-distances but otherwise
1000 void SetOutDistance(/*IN/UPDATE*/NJ_t *NJ, int iNode, int nActive);
1002 /* Always sets join->criterion; may update NJ->outDistance and NJ->nOutDistActive,
1003 assumes join's weight and distance are already set,
1004 and that the constraint penalty (if any) is included in the distance
1006 void SetCriterion(/*IN/UPDATE*/NJ_t *NJ, int nActive, /*IN/OUT*/besthit_t *join);
1008 /* Computes weight and distance (which includes the constraint penalty)
1009 and then sets the criterion (maybe update out-distances)
1011 void SetDistCriterion(/*IN/UPDATE*/NJ_t *NJ, int nActive, /*IN/OUT*/besthit_t *join);
1013 /* If join->i or join->j are inactive nodes, replaces them with their active ancestors.
1014 After doing this, if i == j, or either is -1, sets weight to 0 and dist and criterion to 1e20
1015 and returns false (not a valid join)
1016 Otherwise, if i or j changed, recomputes the distance and criterion.
1017 Note that if i and j are unchanged then the criterion could be stale
1018 If bUpdateDist is false, and i or j change, then it just sets dist to a negative number
1020 bool UpdateBestHit(/*IN/UPDATE*/NJ_t *NJ, int nActive, /*IN/OUT*/besthit_t *join,
1023 /* This recomputes the criterion, or returns false if the visible node
1024 is no longer active.
1026 bool GetVisible(/*IN/UPDATE*/NJ_t *NJ, int nActive, /*IN/OUT*/top_hits_t *tophits,
1027 int iNode, /*OUT*/besthit_t *visible);
1029 int ActiveAncestor(/*IN*/NJ_t *NJ, int node);
1031 /* Compute the constraint penalty for a join. This is added to the "distance"
1033 int JoinConstraintPenalty(/*IN*/NJ_t *NJ, int node1, int node2);
1034 int JoinConstraintPenaltyPiece(NJ_t *NJ, int node1, int node2, int iConstraint);
1036 /* Helper function for computing the number of constraints violated by
1037 a split, represented as counts of on and off on each side */
1038 int SplitConstraintPenalty(int nOn1, int nOff1, int nOn2, int nOff2);
1040 /* Reports the (min. evo.) support for the (1,2) vs. (3,4) split
1041 col[iBoot*nPos+j] is column j for bootstrap iBoot
1043 double SplitSupport(profile_t *p1, profile_t *p2, profile_t *p3, profile_t *p4,
1044 /*OPTIONAL*/distance_matrix_t *dmat,
1049 /* Returns SH-like support given resampling spec. (in col) and site likelihods
1050 for the three quartets
1052 double SHSupport(int nPos, int nBoostrap, int *col, double loglk[3], double *site_likelihoods[3]);
1054 profile_t *SeqToProfile(/*IN/OUT*/NJ_t *NJ,
1055 char *seq, int nPos,
1056 /*OPTIONAL*/char *constraintSeqs, int nConstraints,
1058 unsigned long counts[256]);
1060 /* ProfileDist and SeqDist only set the dist and weight fields
1061 If using an outprofile, use the second argument of ProfileDist
1062 for better performance.
1064 These produce uncorrected distances.
1066 void ProfileDist(profile_t *profile1, profile_t *profile2, int nPos,
1067 /*OPTIONAL*/distance_matrix_t *distance_matrix,
1068 /*OUT*/besthit_t *hit);
1069 void SeqDist(unsigned char *codes1, unsigned char *codes2, int nPos,
1070 /*OPTIONAL*/distance_matrix_t *distance_matrix,
1071 /*OUT*/besthit_t *hit);
1073 /* Computes all pairs of profile distances, applies pseudocounts
1074 if pseudoWeight > 0, and applies log-correction if logdist is true.
1075 The lower index is compared to the higher index, e.g. for profiles
1076 A,B,C,D the comparison will be as in quartet_pair_t
1078 typedef enum {qAB,qAC,qAD,qBC,qBD,qCD} quartet_pair_t;
1079 void CorrectedPairDistances(profile_t **profiles, int nProfiles,
1080 /*OPTIONAL*/distance_matrix_t *distance_matrix,
1082 /*OUT*/double *distances);
1084 /* output is indexed by nni_t
1085 To ensure good behavior while evaluating a subtree-prune-regraft move as a series
1086 of nearest-neighbor interchanges, this uses a distance-ish model of constraints,
1087 as given by PairConstraintDistance(), rather than
1088 counting the number of violated splits (which is what FastTree does
1089 during neighbor-joining).
1090 Thus, penalty values may well be >0 even if no constraints are violated, but the
1091 relative scores for the three NNIs will be correct.
1093 void QuartetConstraintPenalties(profile_t *profiles[4], int nConstraints, /*OUT*/double d[3]);
1095 double PairConstraintDistance(int nOn1, int nOff1, int nOn2, int nOff2);
1097 /* the split is consistent with the constraint if any of the profiles have no data
1098 or if three of the profiles have the same uniform value (all on or all off)
1099 or if AB|CD = 00|11 or 11|00 (all uniform)
1101 bool SplitViolatesConstraint(profile_t *profiles[4], int iConstraint);
1103 /* If false, no values were set because this constraint was not relevant.
1104 output is for the 3 splits
1106 bool QuartetConstraintPenaltiesPiece(profile_t *profiles[4], int iConstraint, /*OUT*/double penalty[3]);
1108 /* Apply Jukes-Cantor or scoredist-like log(1-d) transform
1109 to correct the distance for multiple substitutions.
1111 double LogCorrect(double distance);
1113 /* AverageProfile is used to do a weighted combination of nodes
1114 when doing a join. If weight is negative, then the value is ignored and the profiles
1115 are averaged. The weight is *not* adjusted for the gap content of the nodes.
1116 Also, the weight does not affect the representation of the constraints
1118 profile_t *AverageProfile(profile_t *profile1, profile_t *profile2,
1119 int nPos, int nConstraints,
1120 distance_matrix_t *distance_matrix,
1123 /* PosteriorProfile() is like AverageProfile() but it computes posterior probabilities
1124 rather than an average
1126 profile_t *PosteriorProfile(profile_t *profile1, profile_t *profile2,
1127 double len1, double len2,
1128 /*OPTIONAL*/transition_matrix_t *transmat,
1130 int nPos, int nConstraints);
1132 /* Set a node's profile from its children.
1133 Deletes the previous profile if it exists
1134 Use -1.0 for a balanced join
1135 Fails unless the node has two children (e.g., no leaves or root)
1137 void SetProfile(/*IN/OUT*/NJ_t *NJ, int node, double weight1);
1139 /* OutProfile does an unweighted combination of nodes to create the
1140 out-profile. It always sets code to NOCODE so that UpdateOutProfile
1143 profile_t *OutProfile(profile_t **profiles, int nProfiles,
1144 int nPos, int nConstraints,
1145 distance_matrix_t *distance_matrix);
1147 void UpdateOutProfile(/*UPDATE*/profile_t *out, profile_t *old1, profile_t *old2,
1148 profile_t *new, int nActiveOld,
1149 int nPos, int nConstraints,
1150 distance_matrix_t *distance_matrix);
1152 profile_t *NewProfile(int nPos, int nConstraints); /* returned has no vectors */
1153 profile_t *FreeProfile(profile_t *profile, int nPos, int nConstraints); /* returns NULL */
1155 void AllocRateCategories(/*IN/OUT*/rates_t *rates, int nRateCategories, int nPos);
1157 /* f1 can be NULL if code1 != NOCODE, and similarly for f2
1158 Or, if (say) weight1 was 0, then can have code1==NOCODE *and* f1==NULL
1159 In that case, returns an arbitrary large number.
1161 double ProfileDistPiece(unsigned int code1, unsigned int code2,
1162 numeric_t *f1, numeric_t *f2,
1163 /*OPTIONAL*/distance_matrix_t *dmat,
1164 /*OPTIONAL*/numeric_t *codeDist2);
1166 /* Adds (or subtracts, if weight is negative) fIn/codeIn from fOut
1167 fOut is assumed to exist (as from an outprofile)
1168 do not call unless weight of input profile > 0
1170 void AddToFreq(/*IN/OUT*/numeric_t *fOut, double weight,
1171 unsigned int codeIn, /*OPTIONAL*/numeric_t *fIn,
1172 /*OPTIONAL*/distance_matrix_t *dmat);
1174 /* Divide the vector (of length nCodes) by a constant
1175 so that the total (unrotated) frequency is 1.0 */
1176 void NormalizeFreq(/*IN/OUT*/numeric_t *freq, distance_matrix_t *distance_matrix);
1178 /* Allocate, if necessary, and recompute the codeDist*/
1179 void SetCodeDist(/*IN/OUT*/profile_t *profile, int nPos, distance_matrix_t *dmat);
1181 /* The allhits list contains the distances of the node to all other active nodes
1182 This is useful for the "reset" improvement to the visible set
1183 Note that the following routines do not handle the tophits heuristic
1184 and assume that out-distances are up to date.
1186 void SetBestHit(int node, NJ_t *NJ, int nActive,
1187 /*OUT*/besthit_t *bestjoin,
1188 /*OUT OPTIONAL*/besthit_t *allhits);
1189 void ExhaustiveNJSearch(NJ_t *NJ, int nActive, /*OUT*/besthit_t *bestjoin);
1191 /* Searches the visible set */
1192 void FastNJSearch(NJ_t *NJ, int nActive, /*UPDATE*/besthit_t *visible, /*OUT*/besthit_t *bestjoin);
1194 /* Subroutines for handling the tophits heuristic */
1196 top_hits_t *InitTopHits(NJ_t *NJ, int m);
1197 top_hits_t *FreeTopHits(top_hits_t *tophits); /* returns NULL */
1199 /* Before we do any joins -- sets tophits and visible
1200 NJ may be modified by setting out-distances
1202 void SetAllLeafTopHits(/*IN/UPDATE*/NJ_t *NJ, /*IN/OUT*/top_hits_t *tophits);
1204 /* Find the best join to do. */
1205 void TopHitNJSearch(/*IN/UPDATE*/NJ_t *NJ,
1207 /*IN/OUT*/top_hits_t *tophits,
1208 /*OUT*/besthit_t *bestjoin);
1210 /* Returns the best hit within top hits
1211 NJ may be modified because it updates out-distances if they are too stale
1212 Does *not* update visible set
1214 void GetBestFromTopHits(int iNode, /*IN/UPDATE*/NJ_t *NJ, int nActive,
1215 /*IN*/top_hits_t *tophits,
1216 /*OUT*/besthit_t *bestjoin);
1218 /* visible set is modifiable so that we can reset it more globally when we do
1219 a "refresh", but we also set the visible set for newnode and do any
1220 "reset" updates too. And, we update many outdistances.
1222 void TopHitJoin(int newnode,
1223 /*IN/UPDATE*/NJ_t *NJ, int nActive,
1224 /*IN/OUT*/top_hits_t *tophits);
1226 /* Sort the input besthits by criterion
1227 and save the best nOut hits as a new array in top_hits_lists
1228 Does not update criterion or out-distances
1229 Ignores (silently removes) hit to self
1230 Saved list may be shorter than requested if there are insufficient entries
1232 void SortSaveBestHits(int iNode, /*IN/SORT*/besthit_t *besthits,
1234 /*IN/OUT*/top_hits_t *tophits);
1236 /* Given candidate hits from one node, "transfer" them to another node:
1237 Stores them in a new place in the same order
1238 searches up to active nodes if hits involve non-active nodes
1239 If update flag is set, it also recomputes distance and criterion
1240 (and ensures that out-distances are updated); otherwise
1241 it sets dist to -1e20 and criterion to 1e20
1244 void TransferBestHits(/*IN/UPDATE*/NJ_t *NJ, int nActive,
1246 /*IN*/besthit_t *oldhits,
1248 /*OUT*/besthit_t *newhits,
1249 bool updateDistance);
1251 /* Create best hit objects from 1 or more hits. Do not update out-distances or set criteria */
1252 void HitsToBestHits(/*IN*/hit_t *hits, int nHits, int iNode, /*OUT*/besthit_t *newhits);
1253 besthit_t HitToBestHit(int i, hit_t hit);
1255 /* Given a set of besthit entries,
1256 look for improvements to the visible set of the j entries.
1257 Updates out-distances as it goes.
1258 Also replaces stale nodes with this node, because a join is usually
1259 how this happens (i.e. it does not need to walk up to ancestors).
1260 Note this calls UpdateTopVisible() on any change
1262 void UpdateVisible(/*IN/UPDATE*/NJ_t *NJ, int nActive,
1263 /*IN*/besthit_t *tophitsNode,
1265 /*IN/OUT*/top_hits_t *tophits);
1267 /* Update the top-visible list to perhaps include this hit (O(sqrt(N)) time) */
1268 void UpdateTopVisible(/*IN*/NJ_t * NJ, int nActive,
1269 int iNode, /*IN*/hit_t *hit,
1270 /*IN/OUT*/top_hits_t *tophits);
1272 /* Recompute the top-visible subset of the visible set */
1273 void ResetTopVisible(/*IN/UPDATE*/NJ_t *NJ,
1275 /*IN/OUT*/top_hits_t *tophits);
1277 /* Make a shorter list with only unique entries.
1278 Replaces any "dead" hits to nodes that have parents with their active ancestors
1279 and ignores any that become dead.
1280 Updates all criteria.
1281 Combined gets sorted by i & j
1282 The returned list is allocated to nCombined even though only *nUniqueOut entries are filled
1284 besthit_t *UniqueBestHits(/*IN/UPDATE*/NJ_t *NJ, int nActive,
1285 /*IN/SORT*/besthit_t *combined, int nCombined,
1286 /*OUT*/int *nUniqueOut);
1288 nni_t ChooseNNI(profile_t *profiles[4],
1289 /*OPTIONAL*/distance_matrix_t *dmat,
1290 int nPos, int nConstraints,
1291 /*OUT*/double criteria[3]); /* The three internal branch lengths or log likelihoods*/
1293 /* length[] is ordered as described by quartet_length_t, but after we do the swap
1294 of B with C (to give AC|BD) or B with D (to get AD|BC), if that is the returned choice
1295 bFast means do not consider NNIs if AB|CD is noticeably better than the star topology
1296 (as implemented by MLQuartetOptimize).
1297 If there are constraints, then the constraint penalty is included in criteria[]
1299 nni_t MLQuartetNNI(profile_t *profiles[4],
1300 /*OPTIONAL*/transition_matrix_t *transmat, rates_t *rates,
1301 int nPos, int nConstraints,
1302 /*OUT*/double criteria[3], /* The three potential quartet log-likelihoods */
1303 /*IN/OUT*/numeric_t length[5],
1306 void OptimizeAllBranchLengths(/*IN/OUT*/NJ_t *NJ);
1307 double TreeLogLk(/*IN*/NJ_t *NJ, /*OPTIONAL OUT*/double *site_loglk);
1308 double MLQuartetLogLk(profile_t *pA, profile_t *pB, profile_t *pC, profile_t *pD,
1309 int nPos, /*OPTIONAL*/transition_matrix_t *transmat, rates_t *rates,
1310 /*IN*/double branch_lengths[5],
1311 /*OPTIONAL OUT*/double *site_likelihoods);
1313 /* Given a topology and branch lengths, estimate rates & recompute profiles */
1314 void SetMLRates(/*IN/OUT*/NJ_t *NJ, int nRateCategories);
1316 /* Returns a set of nRateCategories potential rates; the caller must free it */
1317 numeric_t *MLSiteRates(int nRateCategories);
1319 /* returns site_loglk so that
1320 site_loglk[nPos*iRate + j] is the log likelihood of site j with rate iRate
1321 The caller must free it.
1323 double *MLSiteLikelihoodsByRate(/*IN*/NJ_t *NJ, /*IN*/numeric_t *rates, int nRateCategories);
1326 double mult; /* multiplier for the rates / divisor for the tree-length */
1334 double GammaLogLk(/*IN*/siteratelk_t *s, /*OPTIONAL OUT*/double *gamma_loglk_sites);
1336 /* Input site_loglk must be for each rate. Note that FastTree does not reoptimize
1337 the branch lengths under the Gamma model -- it optimizes the overall scale.
1338 Reports the gamma log likelihhod (and logs site likelihoods if fpLog is set),
1339 and reports the rescaling value.
1341 double RescaleGammaLogLk(int nPos, int nRateCats,
1342 /*IN*/numeric_t *rates, /*IN*/double *site_loglk,
1343 /*OPTIONAL*/FILE *fpLog);
1345 /* P(value<=x) for the gamma distribution with shape parameter alpha and scale 1/alpha */
1346 double PGamma(double x, double alpha);
1348 /* Given a topology and branch lengths, optimize GTR rates and quickly reoptimize branch lengths
1349 If gtrfreq is NULL, then empirical frequencies are used
1351 void SetMLGtr(/*IN/OUT*/NJ_t *NJ, /*OPTIONAL IN*/double *gtrfreq, /*OPTIONAL WRITE*/FILE *fpLog);
1353 /* P(A & B | len) = P(B | A, len) * P(A)
1354 If site_likelihoods is present, multiplies those values by the site likelihood at each point
1355 (Note it does not handle underflow)
1357 double PairLogLk(/*IN*/profile_t *p1, /*IN*/profile_t *p2, double length,
1358 int nPos, /*OPTIONAL*/transition_matrix_t *transmat, rates_t *rates,
1359 /*OPTIONAL IN/OUT*/double *site_likelihoods);
1361 /* Branch lengths for 4-taxon tree ((A,B),C,D); I means internal */
1362 typedef enum {LEN_A,LEN_B,LEN_C,LEN_D,LEN_I} quartet_length_t;
1366 transition_matrix_t *transmat;
1368 int nEval; /* number of likelihood evaluations */
1369 /* The pair to optimize */
1374 double PairNegLogLk(double x, void *data); /* data must be a quartet_opt_t */
1380 int iRate; /* which rate to set x from */
1383 /* Returns -log_likelihood for the tree with the given rates
1384 data must be a gtr_opt_t and x is used to set rate iRate
1385 Does not recompute profiles -- assumes that the caller will
1387 double GTRNegLogLk(double x, void *data);
1389 /* Returns the resulting log likelihood. Optionally returns whether other
1390 topologies should be abandoned, based on the difference between AB|CD and
1391 the "star topology" (AB|CD with a branch length of MLMinBranchLength) exceeding
1393 If bStarTest is passed in, it only optimized the internal branch if
1394 the star test is true. Otherwise, it optimized all 5 branch lengths
1397 double MLQuartetOptimize(profile_t *pA, profile_t *pB, profile_t *pC, profile_t *pD,
1398 int nPos, /*OPTIONAL*/transition_matrix_t *transmat, rates_t *rates,
1399 /*IN/OUT*/double branch_lengths[5],
1400 /*OPTIONAL OUT*/bool *pStarTest,
1401 /*OPTIONAL OUT*/double *site_likelihoods);
1403 /* Returns the resulting log likelihood */
1404 double MLPairOptimize(profile_t *pA, profile_t *pB,
1405 int nPos, /*OPTIONAL*/transition_matrix_t *transmat, rates_t *rates,
1406 /*IN/OUT*/double *branch_length);
1408 /* Returns the number of steps considered, with the actual steps in steps[]
1409 Modifies the tree by this chain of NNIs
1411 int FindSPRSteps(/*IN/OUT*/NJ_t *NJ,
1413 int parent, /* sibling or parent of node to NNI to start the chain */
1414 /*IN/OUT*/profile_t **upProfiles,
1415 /*OUT*/spr_step_t *steps,
1419 /* Undo a single NNI */
1420 void UnwindSPRStep(/*IN/OUT*/NJ_t *NJ,
1421 /*IN*/spr_step_t *step,
1422 /*IN/OUT*/profile_t **upProfiles);
1425 /* Update the profile of node and its ancestor, and delete nearby out-profiles */
1426 void UpdateForNNI(/*IN/OUT*/NJ_t *NJ, int node, /*IN/OUT*/profile_t **upProfiles, bool useML);
1428 /* Sets NJ->parent[newchild] and replaces oldchild with newchild
1429 in the list of children of parent
1431 void ReplaceChild(/*IN/OUT*/NJ_t *NJ, int parent, int oldchild, int newchild);
1433 int CompareHitsByCriterion(const void *c1, const void *c2);
1434 int CompareHitsByIJ(const void *c1, const void *c2);
1436 int NGaps(NJ_t *NJ, int node); /* only handles leaf sequences */
1438 /* node is the parent of AB, sibling of C
1439 node cannot be root or a leaf
1440 If node is the child of root, then D is the other sibling of node,
1441 and the 4th profile is D's profile.
1442 Otherwise, D is the parent of node, and we use its upprofile
1443 Call this with profiles=NULL to get the nodes, without fetching or
1446 void SetupABCD(NJ_t *NJ, int node,
1447 /* the 4 profiles for ABCD; the last one is an upprofile */
1448 /*OPTIONAL OUT*/profile_t *profiles[4],
1449 /*OPTIONAL IN/OUT*/profile_t **upProfiles,
1450 /*OUT*/int nodeABCD[4],
1453 int Sibling(NJ_t *NJ, int node); /* At root, no unique sibling so returns -1 */
1454 void RootSiblings(NJ_t *NJ, int node, /*OUT*/int sibs[2]);
1456 /* JC probability of nucleotide not changing, for each rate category */
1457 double *PSameVector(double length, rates_t *rates);
1459 /* JC probability of nucleotide not changing, for each rate category */
1460 double *PDiffVector(double *pSame, rates_t *rates);
1462 /* expeigen[iRate*nCodes + j] = exp(length * rate iRate * eigenvalue j) */
1463 numeric_t *ExpEigenRates(double length, transition_matrix_t *transmat, rates_t *rates);
1465 /* Print a progress report if more than 0.1 second has gone by since the progress report */
1466 /* Format should include 0-4 %d references and no newlines */
1467 void ProgressReport(char *format, int iArg1, int iArg2, int iArg3, int iArg4);
1468 void LogTree(char *format, int round, /*OPTIONAL WRITE*/FILE *fp, NJ_t *NJ, char **names, uniquify_t *unique, bool bQuote);
1469 void LogMLRates(/*OPTIONAL WRITE*/FILE *fpLog, NJ_t *NJ);
1471 void *mymalloc(size_t sz); /* Prints "Out of memory" and exits on failure */
1472 void *myfree(void *, size_t sz); /* Always returns NULL */
1474 /* One-dimensional minimization using brent's function, with
1475 a fractional and an absolute tolerance */
1476 double onedimenmin(double xmin, double xguess, double xmax, double (*f)(double,void*), void *data,
1477 double ftol, double atol,
1478 /*OUT*/double *fx, /*OUT*/double *f2x);
1480 double brent(double ax, double bx, double cx, double (*f)(double, void *), void *data,
1481 double ftol, double atol,
1482 double *foptx, double *f2optx, double fax, double fbx, double fcx);
1484 /* Vector operations, either using SSE3 or not
1485 Code assumes that vectors are a multiple of 4 in size
1487 void vector_multiply(/*IN*/numeric_t *f1, /*IN*/numeric_t *f2, int n, /*OUT*/numeric_t *fOut);
1488 numeric_t vector_multiply_sum(/*IN*/numeric_t *f1, /*IN*/numeric_t *f2, int n);
1489 void vector_add_mult(/*IN/OUT*/numeric_t *f, /*IN*/numeric_t *add, numeric_t weight, int n);
1491 /* multiply the transpose of a matrix by a vector */
1492 void matrixt_by_vector4(/*IN*/numeric_t mat[4][MAXCODES], /*IN*/numeric_t vec[4], /*OUT*/numeric_t out[4]);
1494 /* sum(f1*fBy)*sum(f2*fBy) */
1495 numeric_t vector_dot_product_rot(/*IN*/numeric_t *f1, /*IN*/numeric_t *f2, /*IN*/numeric_t* fBy, int n);
1498 numeric_t vector_multiply3_sum(/*IN*/numeric_t *f1, /*IN*/numeric_t *f2, /*IN*/numeric_t* f3, int n);
1500 numeric_t vector_sum(/*IN*/numeric_t *f1, int n);
1501 void vector_multiply_by(/*IN/OUT*/numeric_t *f, /*IN*/numeric_t fBy, int n);
1503 double clockDiff(/*IN*/struct timeval *clock_start);
1504 int timeval_subtract (/*OUT*/struct timeval *result, /*IN*/struct timeval *x, /*IN*/struct timeval *y);
1506 char *OpenMPString(void);
1508 void ran_start(long seed);
1509 double knuth_rand(); /* Random number between 0 and 1 */
1510 void tred2 (double *a, const int n, const int np, double *d, double *e);
1511 double pythag(double a, double b);
1512 void tqli(double *d, double *e, int n, int np, double *z);
1514 /* Like mymalloc; duplicates the input (returns NULL if given NULL) */
1515 void *mymemdup(void *data, size_t sz);
1516 void *myrealloc(void *data, size_t szOld, size_t szNew, bool bCopy);
1518 double pnorm(double z); /* Probability(value <=z) */
1520 /* Hashtable functions */
1524 int nCount; /* number of times this entry was seen */
1525 int first; /* index of first entry with this value */
1530 /* hashvalue -> bucket. Or look in bucket + 1, +2, etc., till you hit a NULL string */
1531 hashbucket_t *buckets;
1533 typedef int hashiterator_t;
1535 hashstrings_t *MakeHashtable(char **strings, int nStrings);
1536 hashstrings_t *FreeHashtable(hashstrings_t* hash); /*returns NULL*/
1537 hashiterator_t FindMatch(hashstrings_t *hash, char *string);
1539 /* Return NULL if we have run out of values */
1540 char *GetHashString(hashstrings_t *hash, hashiterator_t hi);
1541 int HashCount(hashstrings_t *hash, hashiterator_t hi);
1542 int HashFirst(hashstrings_t *hash, hashiterator_t hi);
1544 void PrintNJ(/*WRITE*/FILE *, NJ_t *NJ, char **names, uniquify_t *unique, bool bShowSupport, bool bQuoteNames);
1546 /* Print topology using node indices as node names */
1547 void PrintNJInternal(/*WRITE*/FILE *, NJ_t *NJ, bool useLen);
1549 uniquify_t *UniquifyAln(/*IN*/alignment_t *aln);
1550 uniquify_t *FreeUniquify(uniquify_t *); /* returns NULL */
1552 /* Convert a constraint alignment to a list of sequences. The returned array is indexed
1553 by iUnique and points to values in the input alignment
1555 char **AlnToConstraints(alignment_t *constraints, uniquify_t *unique, hashstrings_t *hashnames);
1557 /* ReadTree ignores non-unique leaves after the first instance.
1558 At the end, it prunes the tree to ignore empty children and it
1559 unroots the tree if necessary.
1561 void ReadTree(/*IN/OUT*/NJ_t *NJ,
1562 /*IN*/uniquify_t *unique,
1563 /*IN*/hashstrings_t *hashnames,
1564 /*READ*/FILE *fpInTree);
1565 char *ReadTreeToken(/*READ*/FILE *fp); /* returns a static array, or NULL on EOF */
1566 void ReadTreeAddChild(int parent, int child, /*IN/OUT*/int *parents, /*IN/OUT*/children_t *children);
1567 /* Do not add the leaf if we already set this unique-set to another parent */
1568 void ReadTreeMaybeAddLeaf(int parent, char *name,
1569 hashstrings_t *hashnames, uniquify_t *unique,
1570 /*IN/OUT*/int *parents, /*IN/OUT*/children_t *children);
1571 void ReadTreeRemove(/*IN/OUT*/int *parents, /*IN/OUT*/children_t *children, int node);
1573 /* Routines to support tree traversal and prevent visiting a node >1 time
1574 (esp. if topology changes).
1576 typedef bool *traversal_t;
1577 traversal_t InitTraversal(NJ_t*);
1578 void SkipTraversalInto(int node, /*IN/OUT*/traversal_t traversal);
1579 traversal_t FreeTraversal(traversal_t, NJ_t*); /*returns NULL*/
1581 /* returns new node, or -1 if nothing left to do. Use root for the first call.
1582 Will return every node and then root.
1583 Uses postorder tree traversal (depth-first search going down to leaves first)
1584 Keeps track of which nodes are visited, so even after an NNI that swaps a
1585 visited child with an unvisited uncle, the next call will visit the
1586 was-uncle-now-child. (However, after SPR moves, there is no such guarantee.)
1588 If pUp is not NULL, then, if going "back up" through a previously visited node
1589 (presumably due to an NNI), then it will return the node another time,
1592 int TraversePostorder(int lastnode, NJ_t *NJ, /*IN/OUT*/traversal_t,
1593 /*OUT OPTIONAL*/bool *pUp);
1595 /* Routines to support storing up-profiles during tree traversal
1596 Eventually these should be smart enough to do weighted joins and
1597 to minimize memory usage
1599 profile_t **UpProfiles(NJ_t *NJ);
1600 profile_t *GetUpProfile(/*IN/OUT*/profile_t **upProfiles, NJ_t *NJ, int node, bool useML);
1601 profile_t *DeleteUpProfile(/*IN/OUT*/profile_t **upProfiles, NJ_t *NJ, int node); /* returns NULL */
1602 profile_t **FreeUpProfiles(profile_t **upProfiles, NJ_t *NJ); /* returns NULL */
1604 /* Recomputes the profile for a node, presumably to reflect topology changes
1605 If bionj is set, does a weighted join -- which requires using upProfiles
1606 If useML is set, computes the posterior probability instead of averaging
1608 void RecomputeProfile(/*IN/OUT*/NJ_t *NJ, /*IN/OUT*/profile_t **upProfiles, int node, bool useML);
1610 /* Recompute profiles going up from the leaves, using the provided distance matrix
1611 and unweighted joins
1613 void RecomputeProfiles(/*IN/OUT*/NJ_t *NJ, /*OPTIONAL*/distance_matrix_t *dmat);
1615 void RecomputeMLProfiles(/*IN/OUT*/NJ_t *NJ);
1617 /* If bionj is set, computes the weight to be given to A when computing the
1618 profile for the ancestor of A and B. C and D are the other profiles in the quartet
1619 If bionj is not set, returns -1 (which means unweighted in AverageProfile).
1620 (A and B are the first two profiles in the array)
1622 double QuartetWeight(profile_t *profiles[4], distance_matrix_t *dmat, int nPos);
1624 /* Returns a list of nodes, starting with node and ending with root */
1625 int *PathToRoot(NJ_t *NJ, int node, /*OUT*/int *depth);
1626 int *FreePath(int *path, NJ_t *NJ); /* returns NULL */
1628 /* The default amino acid distance matrix, derived from the BLOSUM45 similarity matrix */
1629 distance_matrix_t matrixBLOSUM45;
1631 /* The default amino acid transition matrix (Jones Taylor Thorton 1992) */
1632 double matrixJTT92[MAXCODES][MAXCODES];
1633 double statJTT92[MAXCODES];
1635 /* The WAG amino acid transition matrix (Whelan-And-Goldman 2001) */
1636 double matrixWAG01[MAXCODES][MAXCODES];
1637 double statWAG01[MAXCODES];
1640 int main(int argc, char **argv) {
1641 int nAlign = 1; /* number of alignments to read */
1643 char *matrixPrefix = NULL;
1644 distance_matrix_t *distance_matrix = NULL;
1645 bool make_matrix = false;
1646 char *constraintsFile = NULL;
1647 char *intreeFile = NULL;
1648 bool intree1 = false; /* the same starting tree each round */
1649 int nni = -1; /* number of rounds of NNI, defaults to 4*log2(n) */
1650 int spr = 2; /* number of rounds of SPR */
1651 int maxSPRLength = 10; /* maximum distance to move a node */
1652 int MLnni = -1; /* number of rounds of ML NNI, defaults to 2*log2(n) */
1653 bool MLlen = false; /* optimize branch lengths; no topology changes */
1654 int nBootstrap = 1000; /* If set, number of replicates of local bootstrap to do */
1655 int nRateCats = nDefaultRateCats;
1656 char *logfile = NULL;
1657 bool bUseGtr = false;
1658 bool bUseWag = false;
1659 bool bUseGtrRates = false;
1660 double gtrrates[6] = {1,1,1,1,1,1};
1661 bool bUseGtrFreq = false;
1662 double gtrfreq[4] = {0.25,0.25,0.25,0.25};
1663 bool bQuote = false;
1664 FILE *fpOut = stdout;
1666 if (isatty(STDIN_FILENO) && argc == 1) {
1667 fprintf(stderr,"Usage for FastTree version %s %s%s:\n%s",
1668 FT_VERSION, SSE_STRING, OpenMPString(), usage);
1669 #if (defined _WIN32 || defined WIN32 || defined WIN64 || defined _WIN64)
1670 fprintf(stderr, "Windows users: Please remember to run this inside a command shell\n");
1671 fprintf(stderr,"Hit return to continue\n");
1676 for (iArg = 1; iArg < argc; iArg++) {
1677 if (strcmp(argv[iArg],"-makematrix") == 0) {
1679 } else if (strcmp(argv[iArg],"-logdist") == 0) {
1680 fprintf(stderr, "Warning: logdist is now on by default and obsolete\n");
1681 } else if (strcmp(argv[iArg],"-rawdist") == 0) {
1683 } else if (strcmp(argv[iArg],"-verbose") == 0 && iArg < argc-1) {
1684 verbose = atoi(argv[++iArg]);
1685 } else if (strcmp(argv[iArg],"-quiet") == 0) {
1688 } else if (strcmp(argv[iArg],"-nopr") == 0) {
1690 } else if (strcmp(argv[iArg],"-slow") == 0) {
1692 } else if (strcmp(argv[iArg],"-fastest") == 0) {
1694 tophitsRefresh = 0.5;
1695 useTopHits2nd = true;
1696 } else if (strcmp(argv[iArg],"-2nd") == 0) {
1697 useTopHits2nd = true;
1698 } else if (strcmp(argv[iArg],"-no2nd") == 0) {
1699 useTopHits2nd = false;
1700 } else if (strcmp(argv[iArg],"-slownni") == 0) {
1702 } else if (strcmp(argv[iArg], "-matrix") == 0 && iArg < argc-1) {
1704 matrixPrefix = argv[iArg];
1705 } else if (strcmp(argv[iArg], "-nomatrix") == 0) {
1707 } else if (strcmp(argv[iArg], "-n") == 0 && iArg < argc-1) {
1709 nAlign = atoi(argv[iArg]);
1711 fprintf(stderr, "-n argument for #input alignments must be > 0 not %s\n", argv[iArg]);
1714 } else if (strcmp(argv[iArg], "-quote") == 0) {
1716 } else if (strcmp(argv[iArg], "-nt") == 0) {
1718 } else if (strcmp(argv[iArg], "-intree") == 0 && iArg < argc-1) {
1720 intreeFile = argv[iArg];
1721 } else if (strcmp(argv[iArg], "-intree1") == 0 && iArg < argc-1) {
1723 intreeFile = argv[iArg];
1725 } else if (strcmp(argv[iArg], "-nj") == 0) {
1727 } else if (strcmp(argv[iArg], "-bionj") == 0) {
1729 } else if (strcmp(argv[iArg], "-boot") == 0 && iArg < argc-1) {
1731 nBootstrap = atoi(argv[iArg]);
1732 } else if (strcmp(argv[iArg], "-noboot") == 0 || strcmp(argv[iArg], "-nosupport") == 0) {
1734 } else if (strcmp(argv[iArg], "-seed") == 0 && iArg < argc-1) {
1736 long seed = atol(argv[iArg]);
1738 } else if (strcmp(argv[iArg],"-top") == 0) {
1739 if(tophitsMult < 0.01)
1741 } else if (strcmp(argv[iArg],"-notop") == 0) {
1743 } else if (strcmp(argv[iArg], "-topm") == 0 && iArg < argc-1) {
1745 tophitsMult = atof(argv[iArg]);
1746 } else if (strcmp(argv[iArg], "-close") == 0 && iArg < argc-1) {
1748 tophitsClose = atof(argv[iArg]);
1749 if (tophitsMult <= 0) {
1750 fprintf(stderr, "Cannot use -close unless -top is set above 0\n");
1753 if (tophitsClose <= 0 || tophitsClose >= 1) {
1754 fprintf(stderr, "-close argument must be between 0 and 1\n");
1757 } else if (strcmp(argv[iArg], "-refresh") == 0 && iArg < argc-1) {
1759 tophitsRefresh = atof(argv[iArg]);
1760 if (tophitsMult <= 0) {
1761 fprintf(stderr, "Cannot use -refresh unless -top is set above 0\n");
1764 if (tophitsRefresh <= 0 || tophitsRefresh >= 1) {
1765 fprintf(stderr, "-refresh argument must be between 0 and 1\n");
1768 } else if (strcmp(argv[iArg],"-nni") == 0 && iArg < argc-1) {
1770 nni = atoi(argv[iArg]);
1773 } else if (strcmp(argv[iArg],"-spr") == 0 && iArg < argc-1) {
1775 spr = atoi(argv[iArg]);
1776 } else if (strcmp(argv[iArg],"-sprlength") == 0 && iArg < argc-1) {
1778 maxSPRLength = atoi(argv[iArg]);
1779 } else if (strcmp(argv[iArg],"-mlnni") == 0 && iArg < argc-1) {
1781 MLnni = atoi(argv[iArg]);
1782 } else if (strcmp(argv[iArg],"-noml") == 0) {
1784 } else if (strcmp(argv[iArg],"-mllen") == 0) {
1787 } else if (strcmp(argv[iArg],"-nome") == 0) {
1790 } else if (strcmp(argv[iArg],"-help") == 0) {
1791 fprintf(stderr,"FastTree %s %s%s:\n%s", FT_VERSION, SSE_STRING, OpenMPString(), usage);
1793 } else if (strcmp(argv[iArg],"-expert") == 0) {
1794 fprintf(stderr, "Detailed usage for FastTree %s %s%s:\n%s",
1795 FT_VERSION, SSE_STRING, OpenMPString(), expertUsage);
1797 } else if (strcmp(argv[iArg],"-pseudo") == 0) {
1798 if (iArg < argc-1 && isdigit(argv[iArg+1][0])) {
1800 pseudoWeight = atof(argv[iArg]);
1801 if (pseudoWeight < 0.0) {
1802 fprintf(stderr,"Illegal argument to -pseudo: %s\n", argv[iArg]);
1808 } else if (strcmp(argv[iArg],"-constraints") == 0 && iArg < argc-1) {
1810 constraintsFile = argv[iArg];
1811 } else if (strcmp(argv[iArg],"-constraintWeight") == 0 && iArg < argc-1) {
1813 constraintWeight = atof(argv[iArg]);
1814 if (constraintWeight <= 0.0) {
1815 fprintf(stderr, "Illegal argument to -constraintWeight (must be greater than zero): %s\n", argv[iArg]);
1818 } else if (strcmp(argv[iArg],"-mlacc") == 0 && iArg < argc-1) {
1820 mlAccuracy = atoi(argv[iArg]);
1821 if (mlAccuracy < 1) {
1822 fprintf(stderr, "Illlegal -mlacc argument: %s\n", argv[iArg]);
1825 } else if (strcmp(argv[iArg],"-exactml") == 0 || strcmp(argv[iArg],"-mlexact") == 0) {
1826 fprintf(stderr,"-exactml is not required -- exact posteriors is the default now\n");
1827 } else if (strcmp(argv[iArg],"-approxml") == 0 || strcmp(argv[iArg],"-mlapprox") == 0) {
1829 } else if (strcmp(argv[iArg],"-cat") == 0 && iArg < argc-1) {
1831 nRateCats = atoi(argv[iArg]);
1832 if (nRateCats < 1) {
1833 fprintf(stderr, "Illlegal argument to -ncat (must be greater than zero): %s\n", argv[iArg]);
1836 } else if (strcmp(argv[iArg],"-nocat") == 0) {
1838 } else if (strcmp(argv[iArg], "-wag") == 0) {
1840 } else if (strcmp(argv[iArg], "-gtr") == 0) {
1842 } else if (strcmp(argv[iArg], "-gtrrates") == 0 && iArg < argc-6) {
1844 bUseGtrRates = true;
1846 for (i = 0; i < 6; i++) {
1847 gtrrates[i] = atof(argv[++iArg]);
1848 if (gtrrates[i] < 1e-5) {
1849 fprintf(stderr, "Illegal or too small value of GTR rate: %s\n", argv[iArg]);
1853 } else if (strcmp(argv[iArg],"-gtrfreq") == 0 && iArg < argc-4) {
1858 for (i = 0; i < 4; i++) {
1859 gtrfreq[i] = atof(argv[++iArg]);
1861 if (gtrfreq[i] < 1e-5) {
1862 fprintf(stderr, "Illegal or too small value of GTR frequency: %s\n", argv[iArg]);
1866 if (fabs(1.0-sum) > 0.01) {
1867 fprintf(stderr, "-gtrfreq values do not sum to 1\n");
1870 for (i = 0; i < 4; i++)
1872 } else if (strcmp(argv[iArg],"-log") == 0 && iArg < argc-1) {
1874 logfile = argv[iArg];
1875 } else if (strcmp(argv[iArg],"-gamma") == 0) {
1877 } else if (strcmp(argv[iArg],"-out") == 0 && iArg < argc-1) {
1879 fpOut = fopen(argv[iArg],"w");
1881 fprintf(stderr,"Cannot write to %s\n",argv[iArg]);
1884 } else if (argv[iArg][0] == '-') {
1885 fprintf(stderr, "Unknown or incorrect use of option %s\n%s", argv[iArg], usage);
1891 fprintf(stderr, "%s", usage);
1895 codesString = nCodes == 20 ? codesStringAA : codesStringNT;
1896 if (nCodes == 4 && matrixPrefix == NULL)
1897 useMatrix = false; /* no default nucleotide matrix */
1899 char *fileName = iArg == (argc-1) ? argv[argc-1] : NULL;
1901 if (slow && fastest) {
1902 fprintf(stderr,"Cannot be both slow and fastest\n");
1905 if (slow && tophitsMult > 0) {
1910 if (logfile != NULL) {
1911 fpLog = fopen(logfile, "w");
1912 if (fpLog == NULL) {
1913 fprintf(stderr, "Cannot write to: %s\n", logfile);
1916 fprintf(fpLog, "Command:");
1918 for (i=0; i < argc; i++)
1919 fprintf(fpLog, " %s", argv[i]);
1920 fprintf(fpLog,"\n");
1925 FILE *fps[2] = {NULL,NULL};
1928 fps[nFPs++] = stderr;
1930 fps[nFPs++] = fpLog;
1932 if (!make_matrix) { /* Report settings */
1933 char tophitString[100] = "no";
1934 char tophitsCloseStr[100] = "default";
1935 if(tophitsClose > 0) sprintf(tophitsCloseStr,"%.2f",tophitsClose);
1936 if(tophitsMult>0) sprintf(tophitString,"%.2f*sqrtN close=%s refresh=%.2f",
1937 tophitsMult, tophitsCloseStr, tophitsRefresh);
1938 char supportString[100] = "none";
1940 if (MLnni != 0 || MLlen)
1941 sprintf(supportString, "SH-like %d", nBootstrap);
1943 sprintf(supportString,"Local boot %d",nBootstrap);
1945 char nniString[100] = "(no NNI)";
1947 sprintf(nniString, "+NNI (%d rounds)", nni);
1949 strcpy(nniString, "+NNI");
1950 char sprString[100] = "(no SPR)";
1952 sprintf(sprString, "+SPR (%d rounds range %d)", spr, maxSPRLength);
1953 char mlnniString[100] = "(no ML-NNI)";
1955 sprintf(mlnniString, "+ML-NNI (%d rounds)", MLnni);
1956 else if (MLnni == -1)
1957 sprintf(mlnniString, "+ML-NNI");
1959 sprintf(mlnniString, "+ML branch lengths");
1960 if ((MLlen || MLnni != 0) && !exactML)
1961 strcat(mlnniString, " approx");
1963 sprintf(mlnniString+strlen(mlnniString), " opt-each=%d",mlAccuracy);
1965 for (i = 0; i < nFPs; i++) {
1967 fprintf(fp,"FastTree Version %s %s%s\nAlignment: %s",
1968 FT_VERSION, SSE_STRING, OpenMPString(), fileName != NULL ? fileName : "standard input");
1970 fprintf(fp, " (%d alignments)", nAlign);
1971 fprintf(fp,"\n%s distances: %s Joins: %s Support: %s\n",
1972 nCodes == 20 ? "Amino acid" : "Nucleotide",
1973 matrixPrefix ? matrixPrefix : (useMatrix? "BLOSUM45"
1974 : (nCodes==4 && logdist ? "Jukes-Cantor" : "%different")),
1975 bionj ? "weighted" : "balanced" ,
1977 if (intreeFile == NULL)
1978 fprintf(fp, "Search: %s%s %s %s %s\nTopHits: %s\n",
1979 slow?"Exhaustive (slow)" : (fastest ? "Fastest" : "Normal"),
1980 useTopHits2nd ? "+2nd" : "",
1981 nniString, sprString, mlnniString,
1984 fprintf(fp, "Start at tree from %s %s %s\n", intreeFile, nniString, sprString);
1986 if (MLnni != 0 || MLlen) {
1987 fprintf(fp, "ML Model: %s,",
1988 (nCodes == 4) ? (bUseGtr ? "Generalized Time-Reversible" : "Jukes-Cantor") : (bUseWag ? "Whelan-And-Goldman" : "Jones-Taylor-Thorton"));
1990 fprintf(fp, " No rate variation across sites");
1992 fprintf(fp, " CAT approximation with %d rate categories", nRateCats);
1994 if (nCodes == 4 && bUseGtrRates)
1995 fprintf(fp, "GTR rates(ac ag at cg ct gt) %.4f %.4f %.4f %.4f %.4f %.4f\n",
1996 gtrrates[0],gtrrates[1],gtrrates[2],gtrrates[3],gtrrates[4],gtrrates[5]);
1997 if (nCodes == 4 && bUseGtrFreq)
1998 fprintf(fp, "GTR frequencies(A C G T) %.4f %.4f %.4f %.4f\n",
1999 gtrfreq[0],gtrfreq[1],gtrfreq[2],gtrfreq[3]);
2001 if (constraintsFile != NULL)
2002 fprintf(fp, "Constraints: %s Weight: %.3f\n", constraintsFile, constraintWeight);
2003 if (pseudoWeight > 0)
2004 fprintf(fp, "Pseudocount weight for comparing sequences with little overlap: %.3lf\n",pseudoWeight);
2008 if (matrixPrefix != NULL) {
2010 fprintf(stderr,"Cannot use both -matrix and -nomatrix arguments!");
2013 distance_matrix = ReadDistanceMatrix(matrixPrefix);
2014 } else if (useMatrix) { /* use default matrix */
2016 distance_matrix = &matrixBLOSUM45;
2017 SetupDistanceMatrix(distance_matrix);
2019 distance_matrix = NULL;
2023 FILE *fpIn = fileName != NULL ? fopen(fileName, "r") : stdin;
2025 fprintf(stderr, "Cannot read %s\n", fileName);
2028 FILE *fpConstraints = NULL;
2029 if (constraintsFile != NULL) {
2030 fpConstraints = fopen(constraintsFile, "r");
2031 if (fpConstraints == NULL) {
2032 fprintf(stderr, "Cannot read %s\n", constraintsFile);
2037 FILE *fpInTree = NULL;
2038 if (intreeFile != NULL) {
2039 fpInTree = fopen(intreeFile,"r");
2040 if (fpInTree == NULL) {
2041 fprintf(stderr, "Cannot read %s\n", intreeFile);
2046 for(iAln = 0; iAln < nAlign; iAln++) {
2047 alignment_t *aln = ReadAlignment(fpIn, bQuote);
2048 if (aln->nSeq < 1) {
2049 fprintf(stderr, "No alignment sequences\n");
2053 fprintf(fpLog, "Read %d sequences, %d positions\n", aln->nSeq, aln->nPos);
2057 struct timeval clock_start;
2058 gettimeofday(&clock_start,NULL);
2059 ProgressReport("Read alignment",0,0,0,0);
2061 /* Check that all names in alignment are unique */
2062 hashstrings_t *hashnames = MakeHashtable(aln->names, aln->nSeq);
2064 for (i=0; i<aln->nSeq; i++) {
2065 hashiterator_t hi = FindMatch(hashnames,aln->names[i]);
2066 if (HashCount(hashnames,hi) != 1) {
2067 fprintf(stderr,"Non-unique name '%s' in the alignment\n",aln->names[i]);
2072 /* Make a list of unique sequences -- note some lists are bigger than required */
2073 ProgressReport("Hashed the names",0,0,0,0);
2075 NJ_t *NJ = InitNJ(aln->seqs, aln->nSeq, aln->nPos,
2076 /*constraintSeqs*/NULL, /*nConstraints*/0,
2077 distance_matrix, /*transmat*/NULL);
2078 printf(" %d\n",aln->nSeq);
2080 for(i = 0; i < NJ->nSeq; i++) {
2081 printf("%s",aln->names[i]);
2082 for (j = 0; j < NJ->nSeq; j++) {
2084 SeqDist(NJ->profiles[i]->codes,NJ->profiles[j]->codes,NJ->nPos,NJ->distance_matrix,/*OUT*/&hit);
2086 hit.dist = LogCorrect(hit.dist);
2087 /* Make sure -0 prints as 0 */
2088 printf(" %f", hit.dist <= 0.0 ? 0.0 : hit.dist);
2101 nRefreshTopHits = 0;
2105 nProfileFreqAlloc = 0;
2106 nProfileFreqAvoid = 0;
2111 nPosteriorCompute = 0;
2112 nAAPosteriorExact = 0;
2113 nAAPosteriorRough = 0;
2116 uniquify_t *unique = UniquifyAln(aln);
2117 ProgressReport("Identified unique sequences",0,0,0,0);
2119 /* read constraints */
2120 alignment_t *constraints = NULL;
2121 char **uniqConstraints = NULL;
2122 if (constraintsFile != NULL) {
2123 constraints = ReadAlignment(fpConstraints, bQuote);
2124 if (constraints->nSeq < 4) {
2125 fprintf(stderr, "Warning: constraints file with less than 4 sequences ignored:\nalignment #%d in %s\n",
2126 iAln+1, constraintsFile);
2127 constraints = FreeAlignment(constraints);
2129 uniqConstraints = AlnToConstraints(constraints, unique, hashnames);
2130 ProgressReport("Read the constraints",0,0,0,0);
2132 } /* end load constraints */
2134 transition_matrix_t *transmat = NULL;
2136 transmat = bUseWag? CreateTransitionMatrix(matrixWAG01,statWAG01) : CreateTransitionMatrix(matrixJTT92,statJTT92);
2137 } else if (nCodes == 4 && bUseGtr && (bUseGtrRates || bUseGtrFreq)) {
2138 transmat = CreateGTR(gtrrates,gtrfreq);
2140 NJ_t *NJ = InitNJ(unique->uniqueSeq, unique->nUnique, aln->nPos,
2142 uniqConstraints != NULL ? constraints->nPos : 0, /* nConstraints */
2145 if (verbose>2) fprintf(stderr, "read %s seqs %d (%d unique) positions %d nameLast %s seqLast %s\n",
2146 fileName ? fileName : "standard input",
2147 aln->nSeq, unique->nUnique, aln->nPos, aln->names[aln->nSeq-1], aln->seqs[aln->nSeq-1]);
2148 FreeAlignmentSeqs(/*IN/OUT*/aln); /*no longer needed*/
2149 if (fpInTree != NULL) {
2151 fseek(fpInTree, 0L, SEEK_SET);
2152 ReadTree(/*IN/OUT*/NJ, /*IN*/unique, /*IN*/hashnames, /*READ*/fpInTree);
2154 fprintf(stderr, "Read tree from %s\n", intreeFile);
2156 PrintNJ(stderr, NJ, aln->names, unique, /*support*/false, bQuote);
2160 LogTree("NJ", 0, fpLog, NJ, aln->names, unique, bQuote);
2162 /* profile-frequencies for the "up-profiles" in ReliabilityNJ take only diameter(Tree)*L*a
2163 space not N*L*a space, because we can free them as we go.
2164 And up-profile by their nature tend to be complicated.
2165 So save the profile-frequency memory allocation counters now to exclude later results.
2168 long svProfileFreqAlloc = nProfileFreqAlloc;
2169 long svProfileFreqAvoid = nProfileFreqAvoid;
2171 int nniToDo = nni == -1 ? (int)(0.5 + 4.0 * log(NJ->nSeq)/log(2)) : nni;
2172 int sprRemaining = spr;
2173 int MLnniToDo = (MLnni != -1) ? MLnni : (int)(0.5 + 2.0*log(NJ->nSeq)/log(2));
2175 if (fpInTree == NULL)
2176 fprintf(stderr, "Initial topology in %.2f seconds\n", clockDiff(&clock_start));
2177 if (spr > 0 || nniToDo > 0 || MLnniToDo > 0)
2178 fprintf(stderr,"Refining topology: %d rounds ME-NNIs, %d rounds ME-SPRs, %d rounds ML-NNIs\n", nniToDo, spr, MLnniToDo);
2183 bool bConverged = false;
2184 nni_stats_t *nni_stats = InitNNIStats(NJ);
2185 for (i=0; i < nniToDo; i++) {
2188 int nChange = NNI(/*IN/OUT*/NJ, i, nniToDo, /*use ml*/false, /*IN/OUT*/nni_stats, /*OUT*/&maxDelta);
2189 LogTree("ME_NNI%d",i+1, fpLog, NJ, aln->names, unique, bQuote);
2193 fprintf(stderr, "Min_evolution NNIs converged at round %d -- skipping some rounds\n", i+1);
2195 fprintf(fpLog, "Min_evolution NNIs converged at round %d -- skipping some rounds\n", i+1);
2199 /* Interleave SPRs with NNIs (typically 1/3rd NNI, SPR, 1/3rd NNI, SPR, 1/3rd NNI */
2200 if (sprRemaining > 0 && (nniToDo/(spr+1) > 0 && ((i+1) % (nniToDo/(spr+1))) == 0)) {
2201 SPR(/*IN/OUT*/NJ, maxSPRLength, spr-sprRemaining, spr);
2202 LogTree("ME_SPR%d",spr-sprRemaining+1, fpLog, NJ, aln->names, unique, bQuote);
2204 /* Restart the NNIs -- set all ages to 0, etc. */
2206 nni_stats = FreeNNIStats(nni_stats, NJ);
2207 nni_stats = InitNNIStats(NJ);
2210 nni_stats = FreeNNIStats(nni_stats, NJ);
2212 while(sprRemaining > 0) { /* do any remaining SPR rounds */
2213 SPR(/*IN/OUT*/NJ, maxSPRLength, spr-sprRemaining, spr);
2214 LogTree("ME_SPR%d",spr-sprRemaining+1, fpLog, NJ, aln->names, unique, bQuote);
2218 /* In minimum-evolution mode, update branch lengths, even if no NNIs or SPRs,
2219 so that they are log-corrected, do not include penalties from constraints,
2220 and avoid errors due to approximation of out-distances.
2221 If doing maximum-likelihood NNIs, then we'll also use these
2222 to get estimates of starting distances for quartets, etc.
2224 UpdateBranchLengths(/*IN/OUT*/NJ);
2225 LogTree("ME_Lengths",0, fpLog, NJ, aln->names, unique, bQuote);
2227 double total_len = 0;
2229 for (iNode = 0; iNode < NJ->maxnode; iNode++)
2230 total_len += fabs(NJ->branchlength[iNode]);
2233 fprintf(stderr, "Total branch-length %.3f after %.2f sec\n",
2234 total_len, clockDiff(&clock_start));
2238 fprintf(fpLog, "Total branch-length %.3f after %.2f sec\n",
2239 total_len, clockDiff(&clock_start));
2245 struct mallinfo mi = mallinfo();
2246 fprintf(stderr, "Memory @ end of ME phase: %.2f MB (%.1f byte/pos) useful %.2f expected %.2f\n",
2247 (mi.arena+mi.hblkhd)/1.0e6, (mi.arena+mi.hblkhd)/(double)(NJ->nSeq*(double)NJ->nPos),
2248 mi.uordblks/1.0e6, mymallocUsed/1e6);
2252 SplitCount_t splitcount = {0,0,0,0,0.0,0.0};
2254 if (MLnniToDo > 0 || MLlen) {
2255 bool warn_len = total_len/NJ->maxnode < 0.001 && MLMinBranchLengthTolerance > 1.0/aln->nPos;
2256 bool warn = warn_len || (total_len/NJ->maxnode < 0.001 && aln->nPos >= 10000);
2258 fprintf(stderr, "\nWARNING! This alignment consists of closely-related and very-long sequences.\n");
2261 "This version of FastTree may not report reasonable branch lengths!\n"
2263 "Consider changing MLMinBranchLengthTolerance.\n"
2265 "Consider recompiling FastTree with -DUSE_DOUBLE.\n"
2267 "For more information, visit\n"
2268 "http://www.microbesonline.org/fasttree/#BranchLen\n\n");
2270 fprintf(stderr, "WARNING! FastTree (or other standard maximum-likelihood tools)\n"
2271 "may not be appropriate for aligments of very closely-related sequences\n"
2272 "like this one, as FastTree does not account for recombination or gene conversion\n\n");
2274 /* Do maximum-likelihood computations */
2275 /* Convert profiles to use the transition matrix */
2276 distance_matrix_t *tmatAsDist = TransMatToDistanceMat(/*OPTIONAL*/NJ->transmat);
2277 RecomputeProfiles(NJ, /*OPTIONAL*/tmatAsDist);
2278 tmatAsDist = myfree(tmatAsDist, sizeof(distance_matrix_t));
2279 double lastloglk = -1e20;
2280 nni_stats_t *nni_stats = InitNNIStats(NJ);
2281 bool resetGtr = nCodes == 4 && bUseGtr && !bUseGtrRates;
2285 int maxRound = (int)(0.5 + log(NJ->nSeq)/log(2));
2286 double dLastLogLk = -1e20;
2287 for (iRound = 1; iRound <= maxRound; iRound++) {
2289 numeric_t *oldlength = (numeric_t*)mymalloc(sizeof(numeric_t)*NJ->maxnodes);
2290 for (node = 0; node < NJ->maxnode; node++)
2291 oldlength[node] = NJ->branchlength[node];
2292 OptimizeAllBranchLengths(/*IN/OUT*/NJ);
2293 LogTree("ML_Lengths",iRound, fpLog, NJ, aln->names, unique, bQuote);
2294 double dMaxChange = 0; /* biggest change in branch length */
2295 for (node = 0; node < NJ->maxnode; node++) {
2296 double d = fabs(oldlength[node] - NJ->branchlength[node]);
2300 oldlength = myfree(oldlength, sizeof(numeric_t)*NJ->maxnodes);
2301 double loglk = TreeLogLk(NJ, /*site_likelihoods*/NULL);
2302 bool bConverged = iRound > 1 && (dMaxChange < 0.001 || loglk < (dLastLogLk+treeLogLkDelta));
2304 fprintf(stderr, "%d rounds ML lengths: LogLk %s= %.3lf Max-change %.4lf%s Time %.2f\n",
2306 exactML || nCodes != 20 ? "" : "~",
2309 bConverged ? " (converged)" : "",
2310 clockDiff(&clock_start));
2312 fprintf(fpLog, "TreeLogLk\tLength%d\t%.4lf\tMaxChange\t%.4lf\n",
2313 iRound, loglk, dMaxChange);
2316 SetMLGtr(/*IN/OUT*/NJ, bUseGtrFreq ? gtrfreq : NULL, fpLog);
2317 SetMLRates(/*IN/OUT*/NJ, nRateCats);
2318 LogMLRates(fpLog, NJ);
2325 if (MLnniToDo > 0) {
2326 /* This may help us converge faster, and is fast */
2327 OptimizeAllBranchLengths(/*IN/OUT*/NJ);
2328 LogTree("ML_Lengths%d",1, fpLog, NJ, aln->names, unique, bQuote);
2333 bool bConverged = false;
2334 for (iMLnni = 0; iMLnni < MLnniToDo; iMLnni++) {
2335 int changes = NNI(/*IN/OUT*/NJ, iMLnni, MLnniToDo, /*use ml*/true, /*IN/OUT*/nni_stats, /*OUT*/&maxDelta);
2336 LogTree("ML_NNI%d",iMLnni+1, fpLog, NJ, aln->names, unique, bQuote);
2337 double loglk = TreeLogLk(NJ, /*site_likelihoods*/NULL);
2338 bool bConvergedHere = (iMLnni > 0) && ((loglk < lastloglk + treeLogLkDelta) || maxDelta < treeLogLkDelta);
2340 fprintf(stderr, "ML-NNI round %d: LogLk %s= %.3f NNIs %d max delta %.2f Time %.2f%s\n",
2342 exactML || nCodes != 20 ? "" : "~",
2343 loglk, changes, maxDelta, clockDiff(&clock_start),
2344 bConverged ? " (final)" : "");
2346 fprintf(fpLog, "TreeLogLk\tML_NNI%d\t%.4lf\tMaxChange\t%.4lf\n", iMLnni+1, loglk, maxDelta);
2348 break; /* we did our extra round */
2351 if (bConverged || iMLnni == MLnniToDo-2) {
2352 /* last round uses high-accuracy seettings -- reset NNI stats to tone down heuristics */
2353 nni_stats = FreeNNIStats(nni_stats, NJ);
2354 nni_stats = InitNNIStats(NJ);
2356 fprintf(stderr, "Turning off heuristics for final round of ML NNIs%s\n",
2357 bConvergedHere? " (converged)" : "");
2359 fprintf(fpLog, "Turning off heuristics for final round of ML NNIs%s\n",
2360 bConvergedHere? " (converged)" : "");
2363 if (iMLnni == 0 && NJ->rates.nRateCategories == 1) {
2365 SetMLGtr(/*IN/OUT*/NJ, bUseGtrFreq ? gtrfreq : NULL, fpLog);
2366 SetMLRates(/*IN/OUT*/NJ, nRateCats);
2367 LogMLRates(fpLog, NJ);
2370 nni_stats = FreeNNIStats(nni_stats, NJ);
2372 /* This does not take long and improves the results */
2373 if (MLnniToDo > 0) {
2374 OptimizeAllBranchLengths(/*IN/OUT*/NJ);
2375 LogTree("ML_Lengths%d",2, fpLog, NJ, aln->names, unique, bQuote);
2376 if (verbose || fpLog) {
2377 double loglk = TreeLogLk(NJ, /*site_likelihoods*/NULL);
2379 fprintf(stderr, "Optimize all lengths: LogLk %s= %.3f Time %.2f\n",
2380 exactML || nCodes != 20 ? "" : "~",
2382 clockDiff(&clock_start));
2384 fprintf(fpLog, "TreeLogLk\tML_Lengths%d\t%.4f\n", 2, loglk);
2390 /* Count bad splits and compute SH-like supports if desired */
2391 if ((MLnniToDo > 0 && !fastest) || nBootstrap > 0)
2392 TestSplitsML(NJ, /*OUT*/&splitcount, nBootstrap);
2394 /* Compute gamma-based likelihood? */
2395 if (gammaLogLk && nRateCats > 1) {
2396 numeric_t *rates = MLSiteRates(nRateCats);
2397 double *site_loglk = MLSiteLikelihoodsByRate(NJ, rates, nRateCats);
2398 double scale = RescaleGammaLogLk(NJ->nPos, nRateCats, rates, /*IN*/site_loglk, /*OPTIONAL*/fpLog);
2399 rates = myfree(rates, sizeof(numeric_t) * nRateCats);
2400 site_loglk = myfree(site_loglk, sizeof(double) * nRateCats * NJ->nPos);
2402 for (i = 0; i < NJ->maxnodes; i++)
2403 NJ->branchlength[i] *= scale;
2406 /* Minimum evolution supports */
2407 TestSplitsMinEvo(NJ, /*OUT*/&splitcount);
2409 ReliabilityNJ(NJ, nBootstrap);
2412 for (i = 0; i < nFPs; i++) {
2414 fprintf(fp, "Total time: %.2f seconds Unique: %d/%d Bad splits: %d/%d",
2415 clockDiff(&clock_start),
2416 NJ->nSeq, aln->nSeq,
2417 splitcount.nBadSplits, splitcount.nSplits);
2418 if (splitcount.dWorstDeltaUnconstrained > 0)
2419 fprintf(fp, " Worst %sdelta-%s %.3f",
2420 uniqConstraints != NULL ? "unconstrained " : "",
2421 (MLnniToDo > 0 || MLlen) ? "LogLk" : "Len",
2422 splitcount.dWorstDeltaUnconstrained);
2424 if (NJ->nSeq > 3 && NJ->nConstraints > 0) {
2425 fprintf(fp, "Violating constraints: %d both bad: %d",
2426 splitcount.nConstraintViolations, splitcount.nBadBoth);
2427 if (splitcount.dWorstDeltaConstrained > 0)
2428 fprintf(fp, " Worst delta-%s due to constraints: %.3f",
2429 (MLnniToDo > 0 || MLlen) ? "LogLk" : "Len",
2430 splitcount.dWorstDeltaConstrained);
2433 if (verbose > 1 || fp == fpLog) {
2434 double dN2 = NJ->nSeq*(double)NJ->nSeq;
2435 fprintf(fp, "Dist/N**2: by-profile %.3f (out %.3f) by-leaf %.3f avg-prof %.3f\n",
2436 profileOps/dN2, outprofileOps/dN2, seqOps/dN2, profileAvgOps/dN2);
2437 if (nCloseUsed>0 || nClose2Used > 0 || nRefreshTopHits>0)
2438 fprintf(fp, "Top hits: close neighbors %ld/%d 2nd-level %ld refreshes %ld",
2439 nCloseUsed, NJ->nSeq, nClose2Used, nRefreshTopHits);
2440 if(!slow) fprintf(fp, " Hill-climb: %ld Update-best: %ld\n", nHillBetter, nVisibleUpdate);
2441 if (nniToDo > 0 || spr > 0 || MLnniToDo > 0)
2442 fprintf(fp, "NNI: %ld SPR: %ld ML-NNI: %ld\n", nNNI, nSPR, nML_NNI);
2443 if (MLnniToDo > 0) {
2444 fprintf(fp, "Max-lk operations: lk %ld posterior %ld", nLkCompute, nPosteriorCompute);
2445 if (nAAPosteriorExact > 0 || nAAPosteriorRough > 0)
2446 fprintf(fp, " approximate-posteriors %.2f%%",
2447 (100.0*nAAPosteriorRough)/(double)(nAAPosteriorExact+nAAPosteriorRough));
2449 fprintf(fp, " star-only %ld", nStarTests);
2454 fprintf(fp, "Memory: %.2f MB (%.1f byte/pos) ",
2455 maxmallocHeap/1.0e6, maxmallocHeap/(double)(aln->nSeq*(double)aln->nPos));
2456 /* Only report numbers from before we do reliability estimates */
2457 fprintf(fp, "profile-freq-alloc %ld avoided %.2f%%\n",
2459 svProfileFreqAvoid > 0 ?
2460 100.0*svProfileFreqAvoid/(double)(svProfileFreqAlloc+svProfileFreqAvoid)
2465 PrintNJ(fpOut, NJ, aln->names, unique, /*support*/nBootstrap > 0, bQuote);
2468 fprintf(fpLog,"TreeCompleted\n");
2472 if (uniqConstraints != NULL)
2473 uniqConstraints = myfree(uniqConstraints, sizeof(char*) * unique->nUnique);
2474 constraints = FreeAlignment(constraints);
2475 unique = FreeUniquify(unique);
2476 } /* end build tree */
2477 hashnames = FreeHashtable(hashnames);
2478 aln = FreeAlignment(aln);
2479 } /* end loop over alignments */
2482 if (fpOut != stdout) fclose(fpOut);
2486 void ProgressReport(char *format, int i1, int i2, int i3, int i4) {
2487 static bool time_set = false;
2488 static struct timeval time_last;
2489 static struct timeval time_begin;
2494 static struct timeval time_now;
2495 gettimeofday(&time_now,NULL);
2497 time_begin = time_last = time_now;
2500 static struct timeval elapsed;
2501 timeval_subtract(&elapsed,&time_now,&time_last);
2503 if (elapsed.tv_sec > 1 || elapsed.tv_usec > 100*1000 || verbose > 1) {
2504 timeval_subtract(&elapsed,&time_now,&time_begin);
2505 fprintf(stderr, "%7i.%2.2i seconds: ", (int)elapsed.tv_sec, (int)(elapsed.tv_usec/10000));
2506 fprintf(stderr, format, i1, i2, i3, i4);
2507 if (verbose > 1 || !isatty(STDERR_FILENO)) {
2508 fprintf(stderr, "\n");
2510 fprintf(stderr, " \r");
2513 time_last = time_now;
2517 void LogMLRates(/*OPTIONAL WRITE*/FILE *fpLog, NJ_t *NJ) {
2518 if (fpLog != NULL) {
2519 rates_t *rates = &NJ->rates;
2520 fprintf(fpLog, "NCategories\t%d\nRates",rates->nRateCategories);
2521 assert(rates->nRateCategories > 0);
2523 for (iRate = 0; iRate < rates->nRateCategories; iRate++)
2524 fprintf(fpLog, " %f", rates->rates[iRate]);
2525 fprintf(fpLog,"\nSiteCategories");
2527 for (iPos = 0; iPos < NJ->nPos; iPos++) {
2528 iRate = rates->ratecat[iPos];
2529 fprintf(fpLog," %d",iRate+1);
2531 fprintf(fpLog,"\n");
2536 void LogTree(char *format, int i, /*OPTIONAL WRITE*/FILE *fpLog, NJ_t *NJ, char **names, uniquify_t *unique, bool bQuote) {
2538 fprintf(fpLog, format, i);
2539 fprintf(fpLog, "\t");
2540 PrintNJ(fpLog, NJ, names, unique, /*support*/false, bQuote);
2545 NJ_t *InitNJ(char **sequences, int nSeq, int nPos,
2546 /*OPTIONAL*/char **constraintSeqs, int nConstraints,
2547 /*OPTIONAL*/distance_matrix_t *distance_matrix,
2548 /*OPTIONAL*/transition_matrix_t *transmat) {
2551 NJ_t *NJ = (NJ_t*)mymalloc(sizeof(NJ_t));
2552 NJ->root = -1; /* set at end of FastNJ() */
2553 NJ->maxnode = NJ->nSeq = nSeq;
2555 NJ->maxnodes = 2*nSeq;
2556 NJ->seqs = sequences;
2557 NJ->distance_matrix = distance_matrix;
2558 NJ->transmat = transmat;
2559 NJ->nConstraints = nConstraints;
2560 NJ->constraintSeqs = constraintSeqs;
2562 NJ->profiles = (profile_t **)mymalloc(sizeof(profile_t*) * NJ->maxnodes);
2564 unsigned long counts[256];
2566 for (i = 0; i < 256; i++)
2568 for (iNode = 0; iNode < NJ->nSeq; iNode++) {
2569 NJ->profiles[iNode] = SeqToProfile(NJ, NJ->seqs[iNode], nPos,
2570 constraintSeqs != NULL ? constraintSeqs[iNode] : NULL,
2575 unsigned long totCount = 0;
2576 for (i = 0; i < 256; i++)
2577 totCount += counts[i];
2579 /* warnings about unknown characters */
2580 for (i = 0; i < 256; i++) {
2581 if (counts[i] == 0 || i == '.' || i == '-')
2583 unsigned char *codesP;
2584 bool bMatched = false;
2585 for (codesP = codesString; *codesP != '\0'; codesP++) {
2586 if (*codesP == i || tolower(*codesP) == i) {
2592 fprintf(stderr, "Ignored unknown character %c (seen %lu times)\n", i, counts[i]);
2596 /* warnings about the counts */
2597 double fACGTUN = (counts['A'] + counts['C'] + counts['G'] + counts['T'] + counts['U'] + counts['N']
2598 + counts['a'] + counts['c'] + counts['g'] + counts['t'] + counts['u'] + counts['n'])
2599 / (double)(totCount - counts['-'] - counts['.']);
2600 if (nCodes == 4 && fACGTUN < 0.9)
2601 fprintf(stderr, "WARNING! ONLY %.1f%% NUCLEOTIDE CHARACTERS -- IS THIS REALLY A NUCLEOTIDE ALIGNMENT?\n",
2603 else if (nCodes == 20 && fACGTUN >= 0.9)
2604 fprintf(stderr, "WARNING! %.1f%% NUCLEOTIDE CHARACTERS -- IS THIS REALLY A PROTEIN ALIGNMENT?\n",
2607 if(verbose>10) fprintf(stderr,"Made sequence profiles\n");
2608 for (iNode = NJ->nSeq; iNode < NJ->maxnodes; iNode++)
2609 NJ->profiles[iNode] = NULL; /* not yet exists */
2611 NJ->outprofile = OutProfile(NJ->profiles, NJ->nSeq,
2612 NJ->nPos, NJ->nConstraints,
2613 NJ->distance_matrix);
2614 if(verbose>10) fprintf(stderr,"Made out-profile\n");
2618 NJ->diameter = (numeric_t *)mymalloc(sizeof(numeric_t)*NJ->maxnodes);
2619 for (iNode = 0; iNode < NJ->maxnodes; iNode++) NJ->diameter[iNode] = 0;
2621 NJ->varDiameter = (numeric_t *)mymalloc(sizeof(numeric_t)*NJ->maxnodes);
2622 for (iNode = 0; iNode < NJ->maxnodes; iNode++) NJ->varDiameter[iNode] = 0;
2624 NJ->selfdist = (numeric_t *)mymalloc(sizeof(numeric_t)*NJ->maxnodes);
2625 for (iNode = 0; iNode < NJ->maxnodes; iNode++) NJ->selfdist[iNode] = 0;
2627 NJ->selfweight = (numeric_t *)mymalloc(sizeof(numeric_t)*NJ->maxnodes);
2628 for (iNode = 0; iNode < NJ->nSeq; iNode++)
2629 NJ->selfweight[iNode] = NJ->nPos - NGaps(NJ,iNode);
2631 NJ->outDistances = (numeric_t *)mymalloc(sizeof(numeric_t)*NJ->maxnodes);
2632 NJ->nOutDistActive = (int *)mymalloc(sizeof(int)*NJ->maxnodes);
2633 for (iNode = 0; iNode < NJ->maxnodes; iNode++)
2634 NJ->nOutDistActive[iNode] = NJ->nSeq * 10; /* unreasonably high value */
2635 NJ->parent = NULL; /* so SetOutDistance ignores it */
2636 for (iNode = 0; iNode < NJ->nSeq; iNode++)
2637 SetOutDistance(/*IN/UPDATE*/NJ, iNode, /*nActive*/NJ->nSeq);
2640 for (iNode = 0; iNode < 4 && iNode < NJ->nSeq; iNode++)
2641 fprintf(stderr, "Node %d outdist %f\n", iNode, NJ->outDistances[iNode]);
2644 NJ->parent = (int *)mymalloc(sizeof(int)*NJ->maxnodes);
2645 for (iNode = 0; iNode < NJ->maxnodes; iNode++) NJ->parent[iNode] = -1;
2647 NJ->branchlength = (numeric_t *)mymalloc(sizeof(numeric_t)*NJ->maxnodes); /* distance to parent */
2648 for (iNode = 0; iNode < NJ->maxnodes; iNode++) NJ->branchlength[iNode] = 0;
2650 NJ->support = (numeric_t *)mymalloc(sizeof(numeric_t)*NJ->maxnodes);
2651 for (iNode = 0; iNode < NJ->maxnodes; iNode++) NJ->support[iNode] = -1.0;
2653 NJ->child = (children_t*)mymalloc(sizeof(children_t)*NJ->maxnodes);
2654 for (iNode= 0; iNode < NJ->maxnode; iNode++) NJ->child[iNode].nChild = 0;
2656 NJ->rates.nRateCategories = 0;
2657 NJ->rates.rates = NULL;
2658 NJ->rates.ratecat = NULL;
2659 AllocRateCategories(&NJ->rates, 1, NJ->nPos);
2663 NJ_t *FreeNJ(NJ_t *NJ) {
2668 for (i=0; i < NJ->maxnode; i++)
2669 NJ->profiles[i] = FreeProfile(NJ->profiles[i], NJ->nPos, NJ->nConstraints);
2670 NJ->profiles = myfree(NJ->profiles, sizeof(profile_t*) * NJ->maxnodes);
2671 NJ->outprofile = FreeProfile(NJ->outprofile, NJ->nPos, NJ->nConstraints);
2672 NJ->diameter = myfree(NJ->diameter, sizeof(numeric_t)*NJ->maxnodes);
2673 NJ->varDiameter = myfree(NJ->varDiameter, sizeof(numeric_t)*NJ->maxnodes);
2674 NJ->selfdist = myfree(NJ->selfdist, sizeof(numeric_t)*NJ->maxnodes);
2675 NJ->selfweight = myfree(NJ->selfweight, sizeof(numeric_t)*NJ->maxnodes);
2676 NJ->outDistances = myfree(NJ->outDistances, sizeof(numeric_t)*NJ->maxnodes);
2677 NJ->nOutDistActive = myfree(NJ->nOutDistActive, sizeof(int)*NJ->maxnodes);
2678 NJ->parent = myfree(NJ->parent, sizeof(int)*NJ->maxnodes);
2679 NJ->branchlength = myfree(NJ->branchlength, sizeof(numeric_t)*NJ->maxnodes);
2680 NJ->support = myfree(NJ->support, sizeof(numeric_t)*NJ->maxnodes);
2681 NJ->child = myfree(NJ->child, sizeof(children_t)*NJ->maxnodes);
2682 NJ->transmat = myfree(NJ->transmat, sizeof(transition_matrix_t));
2683 AllocRateCategories(&NJ->rates, 0, NJ->nPos);
2684 return(myfree(NJ, sizeof(NJ_t)));
2687 /* Allocate or reallocate the rate categories, and set every position
2688 to category 0 and every category's rate to 1.0
2689 If nRateCategories=0, just deallocate
2691 void AllocRateCategories(/*IN/OUT*/rates_t *rates, int nRateCategories, int nPos) {
2692 assert(nRateCategories >= 0);
2693 rates->rates = myfree(rates->rates, sizeof(numeric_t)*rates->nRateCategories);
2694 rates->ratecat = myfree(rates->ratecat, sizeof(unsigned int)*nPos);
2695 rates->nRateCategories = nRateCategories;
2696 if (rates->nRateCategories > 0) {
2697 rates->rates = (numeric_t*)mymalloc(sizeof(numeric_t)*rates->nRateCategories);
2699 for (i = 0; i < nRateCategories; i++)
2700 rates->rates[i] = 1.0;
2701 rates->ratecat = (unsigned int *)mymalloc(sizeof(unsigned int)*nPos);
2702 for (i = 0; i < nPos; i++)
2703 rates->ratecat[i] = 0;
2707 void FastNJ(NJ_t *NJ) {
2710 assert(NJ->nSeq >= 1);
2712 NJ->root = NJ->maxnode++;
2713 NJ->child[NJ->root].nChild = NJ->nSeq;
2714 for (iNode = 0; iNode < NJ->nSeq; iNode++) {
2715 NJ->parent[iNode] = NJ->root;
2716 NJ->child[NJ->root].child[iNode] = iNode;
2718 if (NJ->nSeq == 1) {
2719 NJ->branchlength[0] = 0;
2721 assert (NJ->nSeq == 2);
2723 SeqDist(NJ->profiles[0]->codes,NJ->profiles[1]->codes,NJ->nPos,NJ->distance_matrix,/*OUT*/&hit);
2724 NJ->branchlength[0] = hit.dist/2.0;
2725 NJ->branchlength[1] = hit.dist/2.0;
2730 /* else 3 or more sequences */
2732 /* The visible set stores the best hit of each node (unless using top hits, in which case
2733 it is handled by the top hits routines) */
2734 besthit_t *visible = NULL; /* Not used if doing top hits */
2735 besthit_t *besthitNew = NULL; /* All hits of new node -- not used if doing top-hits */
2737 /* The top-hits lists, with the key parameter m = length of each top-hit list */
2738 top_hits_t *tophits = NULL;
2739 int m = 0; /* maximum length of a top-hits list */
2740 if (tophitsMult > 0) {
2741 m = (int)(0.5 + tophitsMult*sqrt(NJ->nSeq));
2742 if(m<4 || 2*m >= NJ->nSeq) {
2744 if(verbose>1) fprintf(stderr,"Too few leaves, turning off top-hits\n");
2746 if(verbose>2) fprintf(stderr,"Top-hit-list size = %d of %d\n", m, NJ->nSeq);
2749 assert(!(slow && m>0));
2751 /* Initialize top-hits or visible set */
2753 tophits = InitTopHits(NJ, m);
2754 SetAllLeafTopHits(/*IN/UPDATE*/NJ, /*OUT*/tophits);
2755 ResetTopVisible(/*IN/UPDATE*/NJ, /*nActive*/NJ->nSeq, /*IN/OUT*/tophits);
2757 visible = (besthit_t*)mymalloc(sizeof(besthit_t)*NJ->maxnodes);
2758 besthitNew = (besthit_t*)mymalloc(sizeof(besthit_t)*NJ->maxnodes);
2759 for (iNode = 0; iNode < NJ->nSeq; iNode++)
2760 SetBestHit(iNode, NJ, /*nActive*/NJ->nSeq, /*OUT*/&visible[iNode], /*OUT IGNORED*/NULL);
2763 /* Iterate over joins */
2764 int nActiveOutProfileReset = NJ->nSeq;
2766 for (nActive = NJ->nSeq; nActive > 3; nActive--) {
2767 int nJoinsDone = NJ->nSeq - nActive;
2768 if (nJoinsDone > 0 && (nJoinsDone % 100) == 0)
2769 ProgressReport("Joined %6d of %6d", nJoinsDone, NJ->nSeq-3, 0, 0);
2771 besthit_t join; /* the join to do */
2773 ExhaustiveNJSearch(NJ,nActive,/*OUT*/&join);
2775 TopHitNJSearch(/*IN/UPDATE*/NJ, nActive, /*IN/OUT*/tophits, /*OUT*/&join);
2777 FastNJSearch(NJ, nActive, /*IN/OUT*/visible, /*OUT*/&join);
2781 double penalty = constraintWeight
2782 * (double)JoinConstraintPenalty(NJ, join.i, join.j);
2783 if (penalty > 0.001) {
2784 fprintf(stderr, "Constraint violation during neighbor-joining %d %d into %d penalty %.3f\n",
2785 join.i, join.j, NJ->maxnode, penalty);
2787 for (iC = 0; iC < NJ->nConstraints; iC++) {
2788 int local = JoinConstraintPenaltyPiece(NJ, join.i, join.j, iC);
2790 fprintf(stderr, "Constraint %d piece %d %d/%d %d/%d %d/%d\n", iC, local,
2791 NJ->profiles[join.i]->nOn[iC],
2792 NJ->profiles[join.i]->nOff[iC],
2793 NJ->profiles[join.j]->nOn[iC],
2794 NJ->profiles[join.j]->nOff[iC],
2795 NJ->outprofile->nOn[iC] - NJ->profiles[join.i]->nOn[iC] - NJ->profiles[join.j]->nOn[iC],
2796 NJ->outprofile->nOff[iC] - NJ->profiles[join.i]->nOff[iC] - NJ->profiles[join.j]->nOff[iC]);
2801 /* because of the stale out-distance heuristic, make sure that these are up-to-date */
2802 SetOutDistance(NJ, join.i, nActive);
2803 SetOutDistance(NJ, join.j, nActive);
2804 /* Make sure weight is set and criterion is up to date */
2805 SetDistCriterion(NJ, nActive, /*IN/OUT*/&join);
2806 assert(NJ->nOutDistActive[join.i] == nActive);
2807 assert(NJ->nOutDistActive[join.j] == nActive);
2809 int newnode = NJ->maxnode++;
2810 NJ->parent[join.i] = newnode;
2811 NJ->parent[join.j] = newnode;
2812 NJ->child[newnode].nChild = 2;
2813 NJ->child[newnode].child[0] = join.i < join.j ? join.i : join.j;
2814 NJ->child[newnode].child[1] = join.i > join.j ? join.i : join.j;
2816 double rawIJ = join.dist + NJ->diameter[join.i] + NJ->diameter[join.j];
2817 double distIJ = join.dist;
2819 double deltaDist = (NJ->outDistances[join.i]-NJ->outDistances[join.j])/(double)(nActive-2);
2820 NJ->branchlength[join.i] = (distIJ + deltaDist)/2;
2821 NJ->branchlength[join.j] = (distIJ - deltaDist)/2;
2823 double bionjWeight = 0.5; /* IJ = bionjWeight*I + (1-bionjWeight)*J */
2824 double varIJ = rawIJ - NJ->varDiameter[join.i] - NJ->varDiameter[join.j];
2826 if (bionj && join.weight > 0.01 && varIJ > 0.001) {
2827 /* Set bionjWeight according to the BIONJ formula, where
2828 the variance matrix is approximated by
2830 Vij = ProfileVar(i,j) - varDiameter(i) - varDiameter(j)
2831 ProfileVar(i,j) = distance(i,j) = top(i,j)/weight(i,j)
2833 (The node's distance diameter does not affect the variances.)
2835 The BIONJ formula is equation 9 from Gascuel 1997:
2837 bionjWeight = 1/2 + sum(k!=i,j) (Vjk - Vik) / ((nActive-2)*Vij)
2838 sum(k!=i,j) (Vjk - Vik) = sum(k!=i,j) Vik - varDiameter(j) + varDiameter(i)
2839 = sum(k!=i,j) ProfileVar(j,k) - sum(k!=i,j) ProfileVar(i,k) + (nActive-2)*(varDiameter(i)-varDiameter(j))
2841 sum(k!=i,j) ProfileVar(i,k)
2842 ~= (sum(k!=i,j) distance(i,k) * weight(i,k))/(mean(k!=i,j) weight(i,k))
2843 ~= (N-2) * top(i, Out-i-j) / weight(i, Out-i-j)
2845 weight(i, Out-i-j) = N*weight(i,Out) - weight(i,i) - weight(i,j)
2846 top(i, Out-i-j) = N*top(i,Out) - top(i,i) - top(i,j)
2850 ProfileDist(NJ->profiles[join.i],NJ->outprofile,NJ->nPos,NJ->distance_matrix,/*OUT*/&outI);
2851 ProfileDist(NJ->profiles[join.j],NJ->outprofile,NJ->nPos,NJ->distance_matrix,/*OUT*/&outJ);
2854 double varIWeight = (nActive * outI.weight - NJ->selfweight[join.i] - join.weight);
2855 double varJWeight = (nActive * outJ.weight - NJ->selfweight[join.j] - join.weight);
2857 double varITop = outI.dist * outI.weight * nActive
2858 - NJ->selfdist[join.i] * NJ->selfweight[join.i] - rawIJ * join.weight;
2859 double varJTop = outJ.dist * outJ.weight * nActive
2860 - NJ->selfdist[join.j] * NJ->selfweight[join.j] - rawIJ * join.weight;
2862 double deltaProfileVarOut = (nActive-2) * (varJTop/varJWeight - varITop/varIWeight);
2863 double deltaVarDiam = (nActive-2)*(NJ->varDiameter[join.i] - NJ->varDiameter[join.j]);
2864 if (varJWeight > 0.01 && varIWeight > 0.01)
2865 bionjWeight = 0.5 + (deltaProfileVarOut+deltaVarDiam)/(2*(nActive-2)*varIJ);
2866 if(bionjWeight<0) bionjWeight=0;
2867 if(bionjWeight>1) bionjWeight=1;
2868 if (verbose>2) fprintf(stderr,"dVarO %f dVarDiam %f varIJ %f from dist %f weight %f (pos %d) bionjWeight %f %f\n",
2869 deltaProfileVarOut, deltaVarDiam,
2870 varIJ, join.dist, join.weight, NJ->nPos,
2871 bionjWeight, 1-bionjWeight);
2872 if (verbose>3 && (newnode%5) == 0) {
2873 /* Compare weight estimated from outprofiles from weight made by summing over other nodes */
2874 double deltaProfileVarTot = 0;
2875 for (iNode = 0; iNode < newnode; iNode++) {
2876 if (NJ->parent[iNode] < 0) { /* excludes join.i, join.j */
2878 ProfileDist(NJ->profiles[join.i],NJ->profiles[iNode],NJ->nPos,NJ->distance_matrix,/*OUT*/&di);
2879 ProfileDist(NJ->profiles[join.j],NJ->profiles[iNode],NJ->nPos,NJ->distance_matrix,/*OUT*/&dj);
2880 deltaProfileVarTot += dj.dist - di.dist;
2883 double lambdaTot = 0.5 + (deltaProfileVarTot+deltaVarDiam)/(2*(nActive-2)*varIJ);
2884 if (lambdaTot < 0) lambdaTot = 0;
2885 if (lambdaTot > 1) lambdaTot = 1;
2886 if (fabs(bionjWeight-lambdaTot) > 0.01 || verbose > 4)
2887 fprintf(stderr, "deltaProfileVar actual %.6f estimated %.6f lambda actual %.3f estimated %.3f\n",
2888 deltaProfileVarTot,deltaProfileVarOut,lambdaTot,bionjWeight);
2891 if (verbose > 2) fprintf(stderr, "Join\t%d\t%d\t%.6f\tlambda\t%.6f\tselfw\t%.3f\t%.3f\tnew\t%d\n",
2892 join.i < join.j ? join.i : join.j,
2893 join.i < join.j ? join.j : join.i,
2894 join.criterion, bionjWeight,
2895 NJ->selfweight[join.i < join.j ? join.i : join.j],
2896 NJ->selfweight[join.i < join.j ? join.j : join.i],
2899 NJ->diameter[newnode] = bionjWeight * (NJ->branchlength[join.i] + NJ->diameter[join.i])
2900 + (1-bionjWeight) * (NJ->branchlength[join.j] + NJ->diameter[join.j]);
2901 NJ->varDiameter[newnode] = bionjWeight * NJ->varDiameter[join.i]
2902 + (1-bionjWeight) * NJ->varDiameter[join.j]
2903 + bionjWeight * (1-bionjWeight) * varIJ;
2905 NJ->profiles[newnode] = AverageProfile(NJ->profiles[join.i],NJ->profiles[join.j],
2906 NJ->nPos, NJ->nConstraints,
2907 NJ->distance_matrix,
2908 bionj ? bionjWeight : /*noweight*/-1.0);
2910 /* Update out-distances and total diameters */
2911 int changedActiveOutProfile = nActiveOutProfileReset - (nActive-1);
2912 if (changedActiveOutProfile >= nResetOutProfile
2913 && changedActiveOutProfile >= fResetOutProfile * nActiveOutProfileReset) {
2914 /* Recompute the outprofile from scratch to avoid roundoff error */
2915 profile_t **activeProfiles = (profile_t**)mymalloc(sizeof(profile_t*)*(nActive-1));
2918 for (iNode=0;iNode<NJ->maxnode;iNode++) {
2919 if (NJ->parent[iNode]<0) {
2920 assert(nSaved < nActive-1);
2921 activeProfiles[nSaved++] = NJ->profiles[iNode];
2922 NJ->totdiam += NJ->diameter[iNode];
2925 assert(nSaved==nActive-1);
2926 FreeProfile(NJ->outprofile, NJ->nPos, NJ->nConstraints);
2927 if(verbose>2) fprintf(stderr,"Recomputing outprofile %d %d\n",nActiveOutProfileReset,nActive-1);
2928 NJ->outprofile = OutProfile(activeProfiles, nSaved,
2929 NJ->nPos, NJ->nConstraints,
2930 NJ->distance_matrix);
2931 activeProfiles = myfree(activeProfiles, sizeof(profile_t*)*(nActive-1));
2932 nActiveOutProfileReset = nActive-1;
2934 UpdateOutProfile(/*OUT*/NJ->outprofile,
2935 NJ->profiles[join.i], NJ->profiles[join.j], NJ->profiles[newnode],
2937 NJ->nPos, NJ->nConstraints,
2938 NJ->distance_matrix);
2939 NJ->totdiam += NJ->diameter[newnode] - NJ->diameter[join.i] - NJ->diameter[join.j];
2942 /* Store self-dist for use in other computations */
2944 ProfileDist(NJ->profiles[newnode],NJ->profiles[newnode],NJ->nPos,NJ->distance_matrix,/*OUT*/&selfdist);
2945 NJ->selfdist[newnode] = selfdist.dist;
2946 NJ->selfweight[newnode] = selfdist.weight;
2948 /* Find the best hit of the joined node IJ */
2950 TopHitJoin(newnode, /*IN/UPDATE*/NJ, nActive-1, /*IN/OUT*/tophits);
2952 /* Not using top-hits, so we update all out-distances */
2953 for (iNode = 0; iNode < NJ->maxnode; iNode++) {
2954 if (NJ->parent[iNode] < 0) {
2955 /* True nActive is now nActive-1 */
2956 SetOutDistance(/*IN/UPDATE*/NJ, iNode, nActive-1);
2960 if(visible != NULL) {
2961 SetBestHit(newnode, NJ, nActive-1, /*OUT*/&visible[newnode], /*OUT OPTIONAL*/besthitNew);
2963 fprintf(stderr,"Visible %d %d %f %f\n",
2964 visible[newnode].i, visible[newnode].j,
2965 visible[newnode].dist, visible[newnode].criterion);
2966 if (besthitNew != NULL) {
2967 /* Use distances to new node to update visible set entries that are non-optimal */
2968 for (iNode = 0; iNode < NJ->maxnode; iNode++) {
2969 if (NJ->parent[iNode] >= 0 || iNode == newnode)
2971 int iOldVisible = visible[iNode].j;
2972 assert(iOldVisible>=0);
2973 assert(visible[iNode].i == iNode);
2975 /* Update the criterion; use nActive-1 because haven't decremented nActive yet */
2976 if (NJ->parent[iOldVisible] < 0)
2977 SetCriterion(/*IN/OUT*/NJ, nActive-1, &visible[iNode]);
2979 if (NJ->parent[iOldVisible] >= 0
2980 || besthitNew[iNode].criterion < visible[iNode].criterion) {
2981 if(verbose>3) fprintf(stderr,"Visible %d reset from %d to %d (%f vs. %f)\n",
2983 newnode, visible[iNode].criterion, besthitNew[iNode].criterion);
2984 if(NJ->parent[iOldVisible] < 0) nVisibleUpdate++;
2985 visible[iNode].j = newnode;
2986 visible[iNode].dist = besthitNew[iNode].dist;
2987 visible[iNode].criterion = besthitNew[iNode].criterion;
2989 } /* end loop over all nodes */
2990 } /* end if recording all hits of new node */
2991 } /* end if keeping a visible set */
2992 } /* end else (m==0) */
2993 } /* end loop over nActive */
2997 struct mallinfo mi = mallinfo();
2998 fprintf(stderr, "Memory @ end of FastNJ(): %.2f MB (%.1f byte/pos) useful %.2f expected %.2f\n",
2999 (mi.arena+mi.hblkhd)/1.0e6, (mi.arena+mi.hblkhd)/(double)(NJ->nSeq*(double)NJ->nPos),
3000 mi.uordblks/1.0e6, mymallocUsed/1e6);
3004 /* We no longer need the tophits, visible set, etc. */
3005 if (visible != NULL) visible = myfree(visible,sizeof(besthit_t)*NJ->maxnodes);
3006 if (besthitNew != NULL) besthitNew = myfree(besthitNew,sizeof(besthit_t)*NJ->maxnodes);
3007 tophits = FreeTopHits(tophits);
3009 /* Add a root for the 3 remaining nodes */
3012 for (iNode = 0; iNode < NJ->maxnode; iNode++) {
3013 if (NJ->parent[iNode] < 0) {
3015 top[nTop++] = iNode;
3020 NJ->root = NJ->maxnode++;
3021 NJ->child[NJ->root].nChild = 3;
3022 for (nTop = 0; nTop < 3; nTop++) {
3023 NJ->parent[top[nTop]] = NJ->root;
3024 NJ->child[NJ->root].child[nTop] = top[nTop];
3027 besthit_t dist01, dist02, dist12;
3028 ProfileDist(NJ->profiles[top[0]], NJ->profiles[top[1]], NJ->nPos, NJ->distance_matrix, /*OUT*/&dist01);
3029 ProfileDist(NJ->profiles[top[0]], NJ->profiles[top[2]], NJ->nPos, NJ->distance_matrix, /*OUT*/&dist02);
3030 ProfileDist(NJ->profiles[top[1]], NJ->profiles[top[2]], NJ->nPos, NJ->distance_matrix, /*OUT*/&dist12);
3032 double d01 = dist01.dist - NJ->diameter[top[0]] - NJ->diameter[top[1]];
3033 double d02 = dist02.dist - NJ->diameter[top[0]] - NJ->diameter[top[2]];
3034 double d12 = dist12.dist - NJ->diameter[top[1]] - NJ->diameter[top[2]];
3035 NJ->branchlength[top[0]] = (d01 + d02 - d12)/2;
3036 NJ->branchlength[top[1]] = (d01 + d12 - d02)/2;
3037 NJ->branchlength[top[2]] = (d02 + d12 - d01)/2;
3039 /* Check how accurate the outprofile is */
3041 profile_t *p[3] = {NJ->profiles[top[0]], NJ->profiles[top[1]], NJ->profiles[top[2]]};
3042 profile_t *out = OutProfile(p, 3, NJ->nPos, NJ->nConstraints, NJ->distance_matrix);
3044 double freqerror = 0;
3045 double weighterror = 0;
3046 for (i=0;i<NJ->nPos;i++) {
3047 weighterror += fabs(out->weights[i] - NJ->outprofile->weights[i]);
3049 for(k=0;k<nCodes;k++)
3050 freqerror += fabs(out->vectors[nCodes*i+k] - NJ->outprofile->vectors[nCodes*i+k]);
3052 fprintf(stderr,"Roundoff error in outprofile@end: WeightError %f FreqError %f\n", weighterror, freqerror);
3053 FreeProfile(out, NJ->nPos, NJ->nConstraints);
3058 void ExhaustiveNJSearch(NJ_t *NJ, int nActive, /*OUT*/besthit_t *join) {
3063 join->criterion = 1e20;
3064 double bestCriterion = 1e20;
3067 for (i = 0; i < NJ->maxnode-1; i++) {
3068 if (NJ->parent[i] < 0) {
3069 for (j = i+1; j < NJ->maxnode; j++) {
3070 if (NJ->parent[j] < 0) {
3074 SetDistCriterion(NJ, nActive, /*IN/OUT*/&hit);
3075 if (hit.criterion < bestCriterion) {
3077 bestCriterion = hit.criterion;
3083 assert (join->i >= 0 && join->j >= 0);
3086 void FastNJSearch(NJ_t *NJ, int nActive, /*IN/OUT*/besthit_t *besthits, /*OUT*/besthit_t *join) {
3091 join->criterion = 1e20;
3093 for (iNode = 0; iNode < NJ->maxnode; iNode++) {
3094 int jNode = besthits[iNode].j;
3095 if (NJ->parent[iNode] < 0 && NJ->parent[jNode] < 0) { /* both i and j still active */
3096 /* recompute criterion to reflect the current out-distances */
3097 SetCriterion(NJ, nActive, /*IN/OUT*/&besthits[iNode]);
3098 if (besthits[iNode].criterion < join->criterion)
3099 *join = besthits[iNode];
3107 assert(join->i >= 0 && join->j >= 0);
3108 SetBestHit(join->i, NJ, nActive, /*OUT*/&besthits[join->i], /*OUT IGNORED*/NULL);
3109 if (besthits[join->i].j != join->j) {
3112 fprintf(stderr,"BetterI\t%d\t%d\t%d\t%d\t%f\t%f\n",
3113 join->i,join->j,besthits[join->i].i,besthits[join->i].j,
3114 join->criterion,besthits[join->i].criterion);
3117 /* Save the best hit either way, because the out-distance has probably changed
3118 since we started the computation. */
3119 join->j = besthits[join->i].j;
3120 join->weight = besthits[join->i].weight;
3121 join->dist = besthits[join->i].dist;
3122 join->criterion = besthits[join->i].criterion;
3124 SetBestHit(join->j, NJ, nActive, /*OUT*/&besthits[join->j], /*OUT IGNORE*/NULL);
3125 if (besthits[join->j].j != join->i) {
3128 fprintf(stderr,"BetterJ\t%d\t%d\t%d\t%d\t%f\t%f\n",
3129 join->i,join->j,besthits[join->j].i,besthits[join->j].j,
3130 join->criterion,besthits[join->j].criterion);
3131 join->i = besthits[join->j].j;
3132 join->weight = besthits[join->j].weight;
3133 join->dist = besthits[join->j].dist;
3134 join->criterion = besthits[join->j].criterion;
3136 if(changed) nHillBetter++;
3141 /* A token is one of ():;, or an alphanumeric string without whitespace
3142 Any whitespace between tokens is ignored */
3143 char *ReadTreeToken(FILE *fp) {
3144 static char buf[BUFFER_SIZE];
3147 for (c = fgetc(fp); c != EOF; c = fgetc(fp)) {
3148 if (c == '(' || c == ')' || c == ':' || c == ';' || c == ',') {
3149 /* standalone token */
3159 } else if (isspace(c)) {
3164 /* else ignore whitespace at beginning of token */
3166 /* not whitespace or standalone token */
3168 if (len >= BUFFER_SIZE) {
3169 buf[BUFFER_SIZE-1] = '\0';
3170 fprintf(stderr, "Token too long in tree file, token begins with\n%s\n", buf);
3176 /* return the token we have so far */
3184 void ReadTreeError(char *err, char *token) {
3185 fprintf(stderr, "Tree parse error: unexpected token '%s' -- %s\n",
3186 token == NULL ? "(End of file)" : token,
3191 void ReadTreeAddChild(int parent, int child, /*IN/OUT*/int *parents, /*IN/OUT*/children_t *children) {
3192 assert(parent >= 0);
3194 assert(parents[child] < 0);
3195 assert(children[parent].nChild < 3);
3196 parents[child] = parent;
3197 children[parent].child[children[parent].nChild++] = child;
3200 void ReadTreeMaybeAddLeaf(int parent, char *name,
3201 hashstrings_t *hashnames, uniquify_t *unique,
3202 /*IN/OUT*/int *parents, /*IN/OUT*/children_t *children) {
3203 hashiterator_t hi = FindMatch(hashnames,name);
3204 if (HashCount(hashnames,hi) != 1)
3205 ReadTreeError("not recognized as a sequence name", name);
3207 int iSeqNonunique = HashFirst(hashnames,hi);
3208 assert(iSeqNonunique >= 0 && iSeqNonunique < unique->nSeq);
3209 int iSeqUnique = unique->alnToUniq[iSeqNonunique];
3210 assert(iSeqUnique >= 0 && iSeqUnique < unique->nUnique);
3211 /* Either record this leaves' parent (if it is -1) or ignore this leaf (if already seen) */
3212 if (parents[iSeqUnique] < 0) {
3213 ReadTreeAddChild(parent, iSeqUnique, /*IN/OUT*/parents, /*IN/OUT*/children);
3215 fprintf(stderr, "Found leaf uniq%d name %s child of %d\n", iSeqUnique, name, parent);
3218 fprintf(stderr, "Skipped redundant leaf uniq%d name %s\n", iSeqUnique, name);
3222 void ReadTreeRemove(/*IN/OUT*/int *parents, /*IN/OUT*/children_t *children, int node) {
3224 fprintf(stderr,"Removing node %d parent %d\n", node, parents[node]);
3225 assert(parents[node] >= 0);
3226 int parent = parents[node];
3228 children_t *pc = &children[parent];
3230 for (oldn = 0; oldn < pc->nChild; oldn++) {
3231 if (pc->child[oldn] == node)
3234 assert(oldn < pc->nChild);
3236 /* move successor nodes back in child list and shorten list */
3238 for (i = oldn; i < pc->nChild-1; i++)
3239 pc->child[i] = pc->child[i+1];
3242 /* add its children to parent's child list */
3243 children_t *nc = &children[node];
3244 if (nc->nChild > 0) {
3245 assert(nc->nChild<=2);
3246 assert(pc->nChild < 3);
3247 assert(pc->nChild + nc->nChild <= 3);
3249 for (j = 0; j < nc->nChild; j++) {
3251 fprintf(stderr,"Repointing parent %d to child %d\n", parent, nc->child[j]);
3252 pc->child[pc->nChild++] = nc->child[j];
3253 parents[nc->child[j]] = parent;
3259 void ReadTree(/*IN/OUT*/NJ_t *NJ,
3260 /*IN*/uniquify_t *unique,
3261 /*IN*/hashstrings_t *hashnames,
3262 /*READ*/FILE *fpInTree) {
3263 assert(NJ->nSeq == unique->nUnique);
3264 /* First, do a preliminary parse of the tree to with non-unique leaves ignored
3265 We need to store this separately from NJ because it may have too many internal nodes
3266 (matching sequences show up once in the NJ but could be in multiple places in the tree)
3267 Will use iUnique as the index of nodes, as in the NJ structure
3269 int maxnodes = unique->nSeq*2;
3270 int maxnode = unique->nSeq;
3271 int *parent = (int*)mymalloc(sizeof(int)*maxnodes);
3272 children_t *children = (children_t *)mymalloc(sizeof(children_t)*maxnodes);
3273 int root = maxnode++;
3275 for (i = 0; i < maxnodes; i++) {
3277 children[i].nChild = 0;
3280 /* The stack is the current path to the root, with the root at the first (top) position */
3282 int *stack = (int*)mymalloc(sizeof(int)*maxnodes);
3288 token = ReadTreeToken(fpInTree);
3289 if (token == NULL || *token != '(')
3290 ReadTreeError("No '(' at start", token);
3291 /* nDown is still 0 because we have created the root */
3293 while ((token = ReadTreeToken(fpInTree)) != NULL) {
3294 if (nDown > 0) { /* In a stream of parentheses */
3297 else if (*token == ',' || *token == ';' || *token == ':' || *token == ')')
3298 ReadTreeError("while reading parentheses", token);
3300 /* Add intermediate nodes if nDown was > 1 (for nDown=1, the only new node is the leaf) */
3301 while (nDown-- > 0) {
3302 int new = maxnode++;
3303 assert(new < maxnodes);
3304 ReadTreeAddChild(stack[stack_size-1], new, /*IN/OUT*/parent, /*IN/OUT*/children);
3306 fprintf(stderr, "Added internal child %d of %d, stack size increase to %d\n",
3307 new, stack[stack_size-1],stack_size+1);
3308 stack[stack_size++] = new;
3309 assert(stack_size < maxnodes);
3311 ReadTreeMaybeAddLeaf(stack[stack_size-1], token,
3313 /*IN/OUT*/parent, /*IN/OUT*/children);
3315 } else if (nUp > 0) {
3316 if (*token == ';') { /* end the tree? */
3317 if (nUp != stack_size)
3318 ReadTreeError("unbalanced parentheses", token);
3321 } else if (*token == ')')
3323 else if (*token == '(')
3324 ReadTreeError("unexpected '(' after ')'", token);
3325 else if (*token == ':') {
3326 token = ReadTreeToken(fpInTree);
3327 /* Read the branch length and ignore it */
3328 if (token == NULL || (*token != '-' && !isdigit(*token)))
3329 ReadTreeError("not recognized as a branch length", token);
3330 } else if (*token == ',') {
3331 /* Go back up the stack the correct #times */
3335 fprintf(stderr, "Up to nUp=%d stack size %d at %d\n",
3336 nUp, stack_size, stack[stack_size-1]);
3337 if (stack_size <= 0)
3338 ReadTreeError("too many ')'", token);
3341 } else if (*token == '-' || isdigit(*token))
3342 ; /* ignore bootstrap value */
3344 fprintf(stderr, "Warning while parsing tree: non-numeric label %s for internal node\n",
3346 } else if (*token == '(') {
3348 } else if (*token == ')') {
3350 } else if (*token == ':') {
3351 token = ReadTreeToken(fpInTree);
3352 if (token == NULL || (*token != '-' && !isdigit(*token)))
3353 ReadTreeError("not recognized as a branch length", token);
3354 } else if (*token == ',') {
3356 } else if (*token == ';')
3357 ReadTreeError("unexpected token", token);
3359 ReadTreeMaybeAddLeaf(stack[stack_size-1], token,
3361 /*IN/OUT*/parent, /*IN/OUT*/children);
3364 /* Verify that all sequences were seen */
3365 for (i = 0; i < unique->nUnique; i++) {
3366 if (parent[i] < 0) {
3367 fprintf(stderr, "Alignment sequence %d (unique %d) absent from input tree\n"
3368 "The starting tree (the argument to -intree) must include all sequences in the alignment!\n",
3369 unique->uniqueFirst[i], i);
3374 /* Simplify the tree -- remove all internal nodes with < 2 children
3375 Keep trying until no nodes get removed
3380 /* Here stack is the list of nodes we haven't visited yet while doing
3384 while (stack_size > 0) {
3385 int node = stack[--stack_size];
3386 if (node >= unique->nUnique) { /* internal node */
3387 if (children[node].nChild <= 1) {
3389 ReadTreeRemove(/*IN/OUT*/parent,/*IN/OUT*/children,node);
3391 } else if (node == root && children[node].nChild == 1) {
3392 int newroot = children[node].child[0];
3393 parent[newroot] = -1;
3394 children[root].nChild = 0;
3397 fprintf(stderr,"Changed root from %d to %d\n",root,newroot);
3399 stack[stack_size++] = newroot;
3403 for (j = 0; j < children[node].nChild; j++) {
3404 assert(stack_size < maxnodes);
3405 stack[stack_size++] = children[node].child[j];
3407 fprintf(stderr,"Added %d to stack\n", stack[stack_size-1]);
3412 } while (nRemoved > 0);
3414 /* Simplify the root node to 3 children if it has 2 */
3415 if (children[root].nChild == 2) {
3416 for (i = 0; i < 2; i++) {
3417 int child = children[root].child[i];
3418 assert(child >= 0 && child < maxnodes);
3419 if (children[child].nChild == 2) {
3420 ReadTreeRemove(parent,children,child); /* replace root -> child -> A,B with root->A,B */
3426 for (i = 0; i < maxnodes; i++)
3428 fprintf(stderr,"Simplfied node %d has parent %d nchild %d\n",
3429 i, parent[i], children[i].nChild);
3431 /* Map the remaining internal nodes to NJ nodes */
3432 int *map = (int*)mymalloc(sizeof(int)*maxnodes);
3433 for (i = 0; i < unique->nUnique; i++)
3435 for (i = unique->nUnique; i < maxnodes; i++)
3439 while (stack_size > 0) {
3440 int node = stack[--stack_size];
3441 if (node >= unique->nUnique) { /* internal node */
3442 assert(node == root || children[node].nChild > 1);
3443 map[node] = NJ->maxnode++;
3444 for (i = 0; i < children[node].nChild; i++) {
3445 assert(stack_size < maxnodes);
3446 stack[stack_size++] = children[node].child[i];
3450 for (i = 0; i < maxnodes; i++)
3452 fprintf(stderr,"Map %d to %d (parent %d nchild %d)\n",
3453 i, map[i], parent[i], children[i].nChild);
3455 /* Set NJ->parent, NJ->children, NJ->root */
3456 NJ->root = map[root];
3458 for (node = 0; node < maxnodes; node++) {
3459 int njnode = map[node];
3461 NJ->child[njnode].nChild = children[node].nChild;
3462 for (i = 0; i < children[node].nChild; i++) {
3463 assert(children[node].child[i] >= 0 && children[node].child[i] < maxnodes);
3464 NJ->child[njnode].child[i] = map[children[node].child[i]];
3466 if (parent[node] >= 0)
3467 NJ->parent[njnode] = map[parent[node]];
3471 /* Make sure that parent/child relationships match */
3472 for (i = 0; i < NJ->maxnode; i++) {
3473 children_t *c = &NJ->child[i];
3475 for (j = 0; j < c->nChild;j++)
3476 assert(c->child[j] >= 0 && c->child[j] < NJ->maxnode && NJ->parent[c->child[j]] == i);
3478 assert(NJ->parent[NJ->root] < 0);
3480 map = myfree(map,sizeof(int)*maxnodes);
3481 stack = myfree(stack,sizeof(int)*maxnodes);
3482 children = myfree(children,sizeof(children_t)*maxnodes);
3483 parent = myfree(parent,sizeof(int)*maxnodes);
3485 /* Compute profiles as balanced -- the NNI stage will recompute these
3488 traversal_t traversal = InitTraversal(NJ);
3490 while((node = TraversePostorder(node, NJ, /*IN/OUT*/traversal, /*pUp*/NULL)) >= 0) {
3491 if (node >= NJ->nSeq && node != NJ->root)
3492 SetProfile(/*IN/OUT*/NJ, node, /*noweight*/-1.0);
3494 traversal = FreeTraversal(traversal,NJ);
3497 /* Print topology using node indices as node names */
3498 void PrintNJInternal(FILE *fp, NJ_t *NJ, bool useLen) {
3502 typedef struct { int node; int end; } stack_t;
3503 stack_t *stack = (stack_t *)mymalloc(sizeof(stack_t)*NJ->maxnodes);
3505 stack[0].node = NJ->root;
3508 while(stackSize>0) {
3509 stack_t *last = &stack[stackSize-1];
3511 /* Save last, as we are about to overwrite it */
3512 int node = last->node;
3513 int end = last->end;
3515 if (node < NJ->nSeq) {
3516 if (NJ->child[NJ->parent[node]].child[0] != node) fputs(",",fp);
3517 fprintf(fp, "%d", node);
3519 fprintf(fp, ":%.4f", NJ->branchlength[node]);
3521 fprintf(fp, ")%d", node);
3523 fprintf(fp, ":%.4f", NJ->branchlength[node]);
3525 if (node != NJ->root && NJ->child[NJ->parent[node]].child[0] != node) fprintf(fp, ",");
3528 stack[stackSize-1].node = node;
3529 stack[stackSize-1].end = 1;
3530 children_t *c = &NJ->child[node];
3531 /* put children on in reverse order because we use the last one first */
3533 for (i = c->nChild-1; i >=0; i--) {
3535 stack[stackSize-1].node = c->child[i];
3536 stack[stackSize-1].end = 0;
3541 stack = myfree(stack, sizeof(stack_t)*NJ->maxnodes);
3544 void PrintNJ(FILE *fp, NJ_t *NJ, char **names, uniquify_t *unique, bool bShowSupport, bool bQuote) {
3545 /* And print the tree: depth first search
3546 * The stack contains
3547 * list of remaining children with their depth
3548 * parent node, with a flag of -1 so I know to print right-paren
3550 if (NJ->nSeq==1 && unique->alnNext[unique->uniqueFirst[0]] >= 0) {
3551 /* Special case -- otherwise we end up with double parens */
3552 int first = unique->uniqueFirst[0];
3553 assert(first >= 0 && first < unique->nSeq);
3554 fprintf(fp, bQuote ? "('%s':0.0" : "(%s:0.0", names[first]);
3555 int iName = unique->alnNext[first];
3556 while (iName >= 0) {
3557 assert(iName < unique->nSeq);
3558 fprintf(fp, bQuote ? ",'%s':0.0" : ",%s:0.0", names[iName]);
3559 iName = unique->alnNext[iName];
3565 typedef struct { int node; int end; } stack_t;
3566 stack_t *stack = (stack_t *)mymalloc(sizeof(stack_t)*NJ->maxnodes);
3568 stack[0].node = NJ->root;
3571 while(stackSize>0) {
3572 stack_t *last = &stack[stackSize-1];
3574 /* Save last, as we are about to overwrite it */
3575 int node = last->node;
3576 int end = last->end;
3578 if (node < NJ->nSeq) {
3579 if (NJ->child[NJ->parent[node]].child[0] != node) fputs(",",fp);
3580 int first = unique->uniqueFirst[node];
3581 assert(first >= 0 && first < unique->nSeq);
3582 /* Print the name, or the subtree of duplicate names */
3583 if (unique->alnNext[first] == -1) {
3584 fprintf(fp, bQuote ? "'%s'" : "%s", names[first]);
3586 fprintf(fp, bQuote ? "('%s':0.0" : "(%s:0.0", names[first]);
3587 int iName = unique->alnNext[first];
3588 while (iName >= 0) {
3589 assert(iName < unique->nSeq);
3590 fprintf(fp, bQuote ? ",'%s':0.0" : ",%s:0.0", names[iName]);
3591 iName = unique->alnNext[iName];
3595 /* Print the branch length */
3597 #define FP_FORMAT "%.9f"
3599 #define FP_FORMAT "%.5f"
3601 fprintf(fp, ":" FP_FORMAT, NJ->branchlength[node]);
3603 if (node == NJ->root)
3605 else if (bShowSupport)
3606 fprintf(fp, ")%.3f:" FP_FORMAT, NJ->support[node], NJ->branchlength[node]);
3608 fprintf(fp, "):" FP_FORMAT, NJ->branchlength[node]);
3610 if (node != NJ->root && NJ->child[NJ->parent[node]].child[0] != node) fprintf(fp, ",");
3613 stack[stackSize-1].node = node;
3614 stack[stackSize-1].end = 1;
3615 children_t *c = &NJ->child[node];
3616 /* put children on in reverse order because we use the last one first */
3618 for (i = c->nChild-1; i >=0; i--) {
3620 stack[stackSize-1].node = c->child[i];
3621 stack[stackSize-1].end = 0;
3626 stack = myfree(stack, sizeof(stack_t)*NJ->maxnodes);
3629 alignment_t *ReadAlignment(/*IN*/FILE *fp, bool bQuote) {
3630 /* bQuote supports the -quote option */
3633 char **names = NULL;
3635 char buf[BUFFER_SIZE] = "";
3636 if (fgets(buf,sizeof(buf),fp) == NULL) {
3637 fprintf(stderr, "Error reading header line\n");
3641 if (buf[0] == '>') {
3642 /* FASTA, truncate names at any of these */
3643 char *nameStop = bQuote ? "'\t\r\n" : "(),: \t\r\n";
3644 char *seqSkip = " \t\r\n"; /* skip these characters in the sequence */
3645 seqs = (char**)mymalloc(sizeof(char*) * nSaved);
3646 names = (char**)mymalloc(sizeof(char*) * nSaved);
3649 /* loop over lines */
3650 if (buf[0] == '>') {
3651 /* truncate the name */
3653 for (p = buf+1; *p != '\0'; p++) {
3654 for (q = nameStop; *q != '\0'; q++) {
3660 if (*p == '\0') break;
3663 /* allocate space for another sequence */
3665 if (nSeq > nSaved) {
3666 int nNewSaved = nSaved*2;
3667 seqs = myrealloc(seqs,sizeof(char*)*nSaved,sizeof(char*)*nNewSaved, /*copy*/false);
3668 names = myrealloc(names,sizeof(char*)*nSaved,sizeof(char*)*nNewSaved, /*copy*/false);
3671 names[nSeq-1] = (char*)mymemdup(buf+1,strlen(buf));
3672 seqs[nSeq-1] = NULL;
3674 /* count non-space characters and append to sequence */
3677 for (p=buf; *p != '\0'; p++) {
3678 for (q=seqSkip; *q != '\0'; q++) {
3685 int nOld = (seqs[nSeq-1] == NULL) ? 0 : strlen(seqs[nSeq-1]);
3686 seqs[nSeq-1] = (char*)myrealloc(seqs[nSeq-1], nOld, nOld+nKeep+1, /*copy*/false);
3687 if (nOld+nKeep > nPos)
3688 nPos = nOld + nKeep;
3689 char *out = seqs[nSeq-1] + nOld;
3690 for (p=buf; *p != '\0'; p++) {
3691 for (q=seqSkip; *q != '\0'; q++) {
3700 assert(out-seqs[nSeq-1] == nKeep + nOld);
3703 } while(fgets(buf,sizeof(buf),fp) != NULL);
3705 if (seqs[nSeq-1] == NULL) {
3706 fprintf(stderr, "No sequence data for last entry %s\n",names[nSeq-1]);
3709 names = myrealloc(names,sizeof(char*)*nSaved,sizeof(char*)*nSeq, /*copy*/false);
3710 seqs = myrealloc(seqs,sizeof(char*)*nSaved,sizeof(char*)*nSeq, /*copy*/false);
3712 /* PHYLIP interleaved-like format
3713 Allow arbitrary length names, require spaces between names and sequences
3714 Allow multiple alignments, either separated by a single empty line (e.g. seqboot output)
3717 if (buf[0] == '\n' || buf[0] == '\r') {
3718 if (fgets(buf,sizeof(buf),fp) == NULL) {
3719 fprintf(stderr, "Empty header line followed by EOF\n");
3723 if (sscanf(buf, "%d%d", &nSeq, &nPos) != 2
3724 || nSeq < 1 || nPos < 1) {
3725 fprintf(stderr, "Error parsing header line:%s\n", buf);
3728 names = (char **)mymalloc(sizeof(char*) * nSeq);
3729 seqs = (char **)mymalloc(sizeof(char*) * nSeq);
3733 for (i = 0; i < nSeq; i++) {
3735 seqs[i] = (char *)mymalloc(nPos+1); /* null-terminate */
3740 while(fgets(buf,sizeof(buf),fp)) {
3741 if ((buf[0] == '\n' || buf[0] == '\r') && (iSeq == nSeq || iSeq == 0)) {
3744 int j = 0; /* character just past end of name */
3745 if (buf[0] == ' ') {
3746 if (names[iSeq] == NULL) {
3747 fprintf(stderr, "No name in phylip line %s", buf);
3751 while (buf[j] != '\n' && buf[j] != '\0' && buf[j] != ' ')
3753 if (buf[j] != ' ' || j == 0) {
3754 fprintf(stderr, "No sequence in phylip line %s", buf);
3758 fprintf(stderr, "No empty line between sequence blocks (is the sequence count wrong?)\n");
3761 if (names[iSeq] == NULL) {
3763 names[iSeq] = (char *)mymalloc(j+1);
3765 for (k = 0; k < j; k++) names[iSeq][k] = buf[k];
3766 names[iSeq][j] = '\0';
3768 /* check the name */
3771 for (k = 0; k < j; k++) {
3772 if (names[iSeq][k] != buf[k]) {
3777 if (!match || names[iSeq][j] != '\0') {
3778 fprintf(stderr, "Wrong name in phylip line %s\nExpected %s\n", buf, names[iSeq]);
3783 int seqlen = strlen(seqs[iSeq]);
3784 for (; buf[j] != '\n' && buf[j] != '\0'; j++) {
3785 if (buf[j] != ' ') {
3786 if (seqlen >= nPos) {
3787 fprintf(stderr, "Too many characters (expected %d) for sequence named %s\nSo far have:\n%s\n",
3788 nPos, names[iSeq], seqs[iSeq]);
3791 seqs[iSeq][seqlen++] = toupper(buf[j]);
3794 seqs[iSeq][seqlen] = '\0'; /* null-terminate */
3795 if(verbose>10) fprintf(stderr,"Read iSeq %d name %s seqsofar %s\n", iSeq, names[iSeq], seqs[iSeq]);
3797 if (iSeq == nSeq && strlen(seqs[0]) == nPos)
3798 break; /* finished alignment */
3799 } /* end else non-empty phylip line */
3801 if (iSeq != nSeq && iSeq != 0) {
3802 fprintf(stderr, "Wrong number of sequences: expected %d\n", nSeq);
3806 /* Check lengths of sequences */
3808 for (i = 0; i < nSeq; i++) {
3809 int seqlen = strlen(seqs[i]);
3810 if (seqlen != nPos) {
3811 fprintf(stderr, "Wrong number of characters for %s: expected %d but have %d instead.\n"
3812 "This sequence may be truncated, or another sequence may be too long.\n",
3813 names[i], nPos, seqlen);
3817 /* Replace "." with "-" and warn if we find any */
3818 /* If nucleotide sequences, replace U with T and N with X */
3819 bool findDot = false;
3820 for (i = 0; i < nSeq; i++) {
3822 for (p = seqs[i]; *p != '\0'; p++) {
3827 if (nCodes == 4 && *p == 'U')
3829 if (nCodes == 4 && *p == 'N')
3834 fprintf(stderr, "Warning! Found \".\" character(s). These are treated as gaps\n");
3837 fprintf(stderr, "Error reading input file\n");
3841 alignment_t *align = (alignment_t*)mymalloc(sizeof(alignment_t));
3844 align->names = names;
3846 align->nSaved = nSaved;
3850 void FreeAlignmentSeqs(/*IN/OUT*/alignment_t *aln) {
3851 assert(aln != NULL);
3853 for (i = 0; i < aln->nSeq; i++)
3854 aln->seqs[i] = myfree(aln->seqs[i], aln->nPos+1);
3857 alignment_t *FreeAlignment(alignment_t *aln) {
3861 for (i = 0; i < aln->nSeq; i++) {
3862 aln->names[i] = myfree(aln->names[i],strlen(aln->names[i])+1);
3863 aln->seqs[i] = myfree(aln->seqs[i], aln->nPos+1);
3865 aln->names = myfree(aln->names, sizeof(char*)*aln->nSaved);
3866 aln->seqs = myfree(aln->seqs, sizeof(char*)*aln->nSaved);
3867 myfree(aln, sizeof(alignment_t));
3871 char **AlnToConstraints(alignment_t *constraints, uniquify_t *unique, hashstrings_t *hashnames) {
3872 /* look up constraints as names and map to unique-space */
3873 char ** uniqConstraints = (char**)mymalloc(sizeof(char*) * unique->nUnique);
3875 for (i = 0; i < unique->nUnique; i++)
3876 uniqConstraints[i] = NULL;
3877 for (i = 0; i < constraints->nSeq; i++) {
3878 char *name = constraints->names[i];
3879 char *constraintSeq = constraints->seqs[i];
3880 hashiterator_t hi = FindMatch(hashnames,name);
3881 if (HashCount(hashnames,hi) != 1) {
3882 fprintf(stderr, "Sequence %s from constraints file is not in the alignment\n", name);
3885 int iSeqNonunique = HashFirst(hashnames,hi);
3886 assert(iSeqNonunique >= 0 && iSeqNonunique < unique->nSeq);
3887 int iSeqUnique = unique->alnToUniq[iSeqNonunique];
3888 assert(iSeqUnique >= 0 && iSeqUnique < unique->nUnique);
3889 if (uniqConstraints[iSeqUnique] != NULL) {
3890 /* Already set a constraint for this group of sequences!
3891 Warn that we are ignoring this one unless the constraints match */
3892 if (strcmp(uniqConstraints[iSeqUnique],constraintSeq) != 0) {
3894 "Warning: ignoring constraints for %s:\n%s\n"
3895 "Another sequence has the same sequence but different constraints\n",
3896 name, constraintSeq);
3899 uniqConstraints[iSeqUnique] = constraintSeq;
3902 return(uniqConstraints);
3906 profile_t *SeqToProfile(/*IN/OUT*/NJ_t *NJ,
3907 char *seq, int nPos,
3908 /*OPTIONAL*/char *constraintSeq, int nConstraints,
3910 unsigned long counts[256]) {
3911 static unsigned char charToCode[256];
3912 static int codeSet = 0;
3916 for (c = 0; c < 256; c++) {
3917 charToCode[c] = nCodes;
3919 for (i = 0; codesString[i]; i++) {
3920 charToCode[codesString[i]] = i;
3921 charToCode[tolower(codesString[i])] = i;
3923 charToCode['-'] = NOCODE;
3927 assert(strlen(seq) == nPos);
3928 profile_t *profile = NewProfile(nPos,nConstraints);
3930 for (i = 0; i < nPos; i++) {
3931 unsigned int character = (unsigned int) seq[i];
3932 counts[character]++;
3933 c = charToCode[character];
3934 if(verbose>10 && i < 2) fprintf(stderr,"pos %d char %c code %d\n", i, seq[i], c);
3935 /* treat unknowns as gaps */
3936 if (c == nCodes || c == NOCODE) {
3937 profile->codes[i] = NOCODE;
3938 profile->weights[i] = 0.0;
3940 profile->codes[i] = c;
3941 profile->weights[i] = 1.0;
3944 if (nConstraints > 0) {
3945 for (i = 0; i < nConstraints; i++) {
3946 profile->nOn[i] = 0;
3947 profile->nOff[i] = 0;
3950 if (constraintSeq != NULL) {
3951 assert(strlen(constraintSeq) == nConstraints);
3952 for (i = 0; i < nConstraints; i++) {
3953 if (constraintSeq[i] == '1') {
3954 profile->nOn[i] = 1;
3955 } else if (constraintSeq[i] == '0') {
3956 profile->nOff[i] = 1;
3957 } else if (constraintSeq[i] != '-') {
3959 fprintf(stderr, "Constraint characters in unique sequence %d replaced with gap:", iNode+1);
3962 fprintf(stderr, " %c%d", constraintSeq[i], i+1);
3963 /* For the benefit of ConstraintSequencePenalty -- this is a bit of a hack, as
3964 this modifies the value read from the alignment
3966 constraintSeq[i] = '-';
3970 fprintf(stderr, "\n");
3976 void SeqDist(unsigned char *codes1, unsigned char *codes2, int nPos,
3977 distance_matrix_t *dmat,
3978 /*OUT*/besthit_t *hit) {
3979 double top = 0; /* summed over positions */
3984 for (i = 0; i < nPos; i++) {
3985 if (codes1[i] != NOCODE && codes2[i] != NOCODE) {
3987 if (codes1[i] != codes2[i]) nDiff++;
3990 top = (double)nDiff;
3992 for (i = 0; i < nPos; i++) {
3993 if (codes1[i] != NOCODE && codes2[i] != NOCODE) {
3995 top += dmat->distances[(unsigned int)codes1[i]][(unsigned int)codes2[i]];
3999 hit->weight = (double)nUse;
4000 hit->dist = nUse > 0 ? top/(double)nUse : 1.0;
4004 void CorrectedPairDistances(profile_t **profiles, int nProfiles,
4005 /*OPTIONAL*/distance_matrix_t *distance_matrix,
4007 /*OUT*/double *distances) {
4008 assert(distances != NULL);
4009 assert(profiles != NULL);
4010 assert(nProfiles>1 && nProfiles <= 4);
4014 for (iHit=0, i=0; i < nProfiles; i++) {
4015 for (j=i+1; j < nProfiles; j++, iHit++) {
4016 ProfileDist(profiles[i],profiles[j],nPos,distance_matrix,/*OUT*/&hit[iHit]);
4017 distances[iHit] = hit[iHit].dist;
4020 if (pseudoWeight > 0) {
4021 /* Estimate the prior distance */
4024 for (iHit=0; iHit < (nProfiles*(nProfiles-1))/2; iHit++) {
4025 dTop += hit[iHit].dist * hit[iHit].weight;
4026 dBottom += hit[iHit].weight;
4028 double prior = (dBottom > 0.01) ? dTop/dBottom : 3.0;
4029 for (iHit=0; iHit < (nProfiles*(nProfiles-1))/2; iHit++)
4030 distances[iHit] = (distances[iHit] * hit[iHit].weight + prior * pseudoWeight)
4031 / (hit[iHit].weight + pseudoWeight);
4034 for (iHit=0; iHit < (nProfiles*(nProfiles-1))/2; iHit++)
4035 distances[iHit] = LogCorrect(distances[iHit]);
4039 /* During the neighbor-joining phase, a join only violates our constraints if
4040 node1, node2, and other are all represented in the constraint
4041 and if one of the 3 is split and the other two do not agree
4043 int JoinConstraintPenalty(/*IN*/NJ_t *NJ, int node1, int node2) {
4044 if (NJ->nConstraints == 0)
4048 for (iC = 0; iC < NJ->nConstraints; iC++)
4049 penalty += JoinConstraintPenaltyPiece(NJ, node1, node2, iC);
4053 int JoinConstraintPenaltyPiece(NJ_t *NJ, int node1, int node2, int iC) {
4054 profile_t *pOut = NJ->outprofile;
4055 profile_t *p1 = NJ->profiles[node1];
4056 profile_t *p2 = NJ->profiles[node2];
4057 int nOn1 = p1->nOn[iC];
4058 int nOff1 = p1->nOff[iC];
4059 int nOn2 = p2->nOn[iC];
4060 int nOff2 = p2->nOff[iC];
4061 int nOnOut = pOut->nOn[iC] - nOn1 - nOn2;
4062 int nOffOut = pOut->nOff[iC] - nOff1 - nOff2;
4064 if ((nOn1+nOff1) > 0 && (nOn2+nOff2) > 0 && (nOnOut+nOffOut) > 0) {
4065 /* code is -1 for split, 0 for off, 1 for on */
4066 int code1 = (nOn1 > 0 && nOff1 > 0) ? -1 : (nOn1 > 0 ? 1 : 0);
4067 int code2 = (nOn2 > 0 && nOff2 > 0) ? -1 : (nOn2 > 0 ? 1 : 0);
4068 int code3 = (nOnOut > 0 && nOffOut) > 0 ? -1 : (nOnOut > 0 ? 1 : 0);
4069 int nSplit = (code1 == -1 ? 1 : 0) + (code2 == -1 ? 1 : 0) + (code3 == -1 ? 1 : 0);
4070 int nOn = (code1 == 1 ? 1 : 0) + (code2 == 1 ? 1 : 0) + (code3 == 1 ? 1 : 0);
4071 if (nSplit == 1 && nOn == 1)
4072 return(SplitConstraintPenalty(nOn1+nOn2, nOff1+nOff2, nOnOut, nOffOut));
4078 void QuartetConstraintPenalties(profile_t *profiles[4], int nConstraints, /*OUT*/double penalty[3]) {
4080 for (i=0; i < 3; i++)
4082 if(nConstraints == 0)
4085 for (iC = 0; iC < nConstraints; iC++) {
4087 if (QuartetConstraintPenaltiesPiece(profiles, iC, /*OUT*/part)) {
4089 penalty[i] += part[i];
4092 && (fabs(part[ABvsCD]-part[ACvsBD]) > 0.001 || fabs(part[ABvsCD]-part[ADvsBC]) > 0.001))
4093 fprintf(stderr, "Constraint Penalties at %d: ABvsCD %.3f ACvsBD %.3f ADvsBC %.3f %d/%d %d/%d %d/%d %d/%d\n",
4094 iC, part[ABvsCD], part[ACvsBD], part[ADvsBC],
4095 profiles[0]->nOn[iC], profiles[0]->nOff[iC],
4096 profiles[1]->nOn[iC], profiles[1]->nOff[iC],
4097 profiles[2]->nOn[iC], profiles[2]->nOff[iC],
4098 profiles[3]->nOn[iC], profiles[3]->nOff[iC]);
4102 fprintf(stderr, "Total Constraint Penalties: ABvsCD %.3f ACvsBD %.3f ADvsBC %.3f\n",
4103 penalty[ABvsCD], penalty[ACvsBD], penalty[ADvsBC]);
4106 double PairConstraintDistance(int nOn1, int nOff1, int nOn2, int nOff2) {
4107 double f1 = nOn1/(double)(nOn1+nOff1);
4108 double f2 = nOn2/(double)(nOn2+nOff2);
4109 /* 1 - f1 * f2 - (1-f1)*(1-f2) = 1 - f1 * f2 - 1 + f1 + f2 - f1 * f2 */
4110 return(f1 + f2 - 2.0 * f1 * f2);
4113 bool QuartetConstraintPenaltiesPiece(profile_t *profiles[4], int iC, /*OUT*/double piece[3]) {
4121 for (i=0; i < 4; i++) {
4122 nOn[i] = profiles[i]->nOn[iC];
4123 nOff[i] = profiles[i]->nOff[iC];
4124 if (nOn[i] + nOff[i] == 0)
4125 return(false); /* ignore */
4126 else if (nOn[i] > 0 && nOff[i] > 0)
4128 else if (nOn[i] > 0)
4133 /* If just one of them is split or on the other side and the others all agree, also ignore */
4134 if (nPlus >= 3 || nMinus >= 3)
4136 piece[ABvsCD] = constraintWeight
4137 * (PairConstraintDistance(nOn[0],nOff[0],nOn[1],nOff[1])
4138 + PairConstraintDistance(nOn[2],nOff[2],nOn[3],nOff[3]));
4139 piece[ACvsBD] = constraintWeight
4140 * (PairConstraintDistance(nOn[0],nOff[0],nOn[2],nOff[2])
4141 + PairConstraintDistance(nOn[1],nOff[1],nOn[3],nOff[3]));
4142 piece[ADvsBC] = constraintWeight
4143 * (PairConstraintDistance(nOn[0],nOff[0],nOn[3],nOff[3])
4144 + PairConstraintDistance(nOn[2],nOff[2],nOn[1],nOff[1]));
4148 /* Minimum number of constrained leaves that need to be moved
4149 to satisfy the constraint (or 0 if constraint is satisfied)
4150 Defining it this way should ensure that SPR moves that break
4151 constraints get a penalty
4153 int SplitConstraintPenalty(int nOn1, int nOff1, int nOn2, int nOff2) {
4154 return(nOn1 + nOff2 < nOn2 + nOff1 ?
4155 (nOn1 < nOff2 ? nOn1 : nOff2)
4156 : (nOn2 < nOff1 ? nOn2 : nOff1));
4159 bool SplitViolatesConstraint(profile_t *profiles[4], int iConstraint) {
4161 int codes[4]; /* 0 for off, 1 for on, -1 for split (quit if not constrained at all) */
4162 for (i = 0; i < 4; i++) {
4163 if (profiles[i]->nOn[iConstraint] + profiles[i]->nOff[iConstraint] == 0)
4165 else if (profiles[i]->nOn[iConstraint] > 0 && profiles[i]->nOff[iConstraint] == 0)
4167 else if (profiles[i]->nOn[iConstraint] == 0 && profiles[i]->nOff[iConstraint] > 0)
4174 for (i = 0; i < 4; i++) {
4177 else if (codes[i] == 1)
4180 /* 3 on one side means no violation, even if other is code -1
4181 otherwise must have code != -1 and agreement on the split
4183 if (n0 >= 3 || n1 >= 3)
4185 if (n0==2 && n1==2 && codes[0] == codes[1] && codes[2] == codes[3])
4190 double LogCorrect(double dist) {
4191 const double maxscore = 3.0;
4192 if (nCodes == 4 && !useMatrix) { /* Jukes-Cantor */
4193 dist = dist < 0.74 ? -0.75*log(1.0 - dist * 4.0/3.0) : maxscore;
4194 } else { /* scoredist-like */
4195 dist = dist < 0.99 ? -1.3*log(1.0 - dist) : maxscore;
4197 return (dist < maxscore ? dist : maxscore);
4200 /* A helper function -- f1 and f2 can be NULL if the corresponding code != NOCODE
4202 double ProfileDistPiece(unsigned int code1, unsigned int code2,
4203 numeric_t *f1, numeric_t *f2,
4204 /*OPTIONAL*/distance_matrix_t *dmat,
4205 /*OPTIONAL*/numeric_t *codeDist2) {
4207 if (code1 != NOCODE && code2 != NOCODE) { /* code1 vs code2 */
4208 return(dmat->distances[code1][code2]);
4209 } else if (codeDist2 != NULL && code1 != NOCODE) { /* code1 vs. codeDist2 */
4210 return(codeDist2[code1]);
4211 } else { /* f1 vs f2 */
4213 if(code1 == NOCODE) return(10.0);
4214 f1 = &dmat->codeFreq[code1][0];
4217 if(code2 == NOCODE) return(10.0);
4218 f2 = &dmat->codeFreq[code2][0];
4220 return(vector_multiply3_sum(f1,f2,dmat->eigenval,nCodes));
4224 if (code1 != NOCODE) {
4225 if (code2 != NOCODE) {
4226 return(code1 == code2 ? 0.0 : 1.0); /* code1 vs code2 */
4228 if(f2 == NULL) return(10.0);
4229 return(1.0 - f2[code1]); /* code1 vs. f2 */
4232 if (code2 != NOCODE) {
4233 if(f1 == NULL) return(10.0);
4234 return(1.0 - f1[code2]); /* f1 vs code2 */
4235 } else { /* f1 vs. f2 */
4236 if (f1 == NULL || f2 == NULL) return(10.0);
4239 for (k = 0; k < nCodes; k++) {
4240 piece -= f1[k] * f2[k];
4249 /* E.g. GET_FREQ(profile,iPos,iVector)
4250 Gets the next element of the vectors (and updates iVector), or
4251 returns NULL if we didn't store a vector
4253 #define GET_FREQ(P,I,IVECTOR) \
4254 (P->weights[I] > 0 && P->codes[I] == NOCODE ? &P->vectors[nCodes*(IVECTOR++)] : NULL)
4256 void ProfileDist(profile_t *profile1, profile_t *profile2, int nPos,
4257 /*OPTIONAL*/distance_matrix_t *dmat,
4258 /*OUT*/besthit_t *hit) {
4264 for (i = 0; i < nPos; i++) {
4265 numeric_t *f1 = GET_FREQ(profile1,i,/*IN/OUT*/iFreq1);
4266 numeric_t *f2 = GET_FREQ(profile2,i,/*IN/OUT*/iFreq2);
4267 if (profile1->weights[i] > 0 && profile2->weights[i] > 0) {
4268 double weight = profile1->weights[i] * profile2->weights[i];
4270 double piece = ProfileDistPiece(profile1->codes[i],profile2->codes[i],f1,f2,dmat,
4271 profile2->codeDist ? &profile2->codeDist[i*nCodes] : NULL);
4272 top += weight * piece;
4275 assert(iFreq1 == profile1->nVectors);
4276 assert(iFreq2 == profile2->nVectors);
4277 hit->weight = denom > 0 ? denom : 0.01; /* 0.01 is an arbitrarily low value of weight (normally >>1) */
4278 hit->dist = denom > 0 ? top/denom : 1;
4282 /* This should not be called if the update weight is 0, as
4283 in that case code==NOCODE and in=NULL is possible, and then
4286 void AddToFreq(/*IN/OUT*/numeric_t *fOut,
4288 unsigned int codeIn, /*OPTIONAL*/numeric_t *fIn,
4289 /*OPTIONAL*/distance_matrix_t *dmat) {
4290 assert(fOut != NULL);
4292 vector_add_mult(fOut, fIn, weight, nCodes);
4294 assert(codeIn != NOCODE);
4295 vector_add_mult(fOut, dmat->codeFreq[codeIn], weight, nCodes);
4297 assert(codeIn != NOCODE);
4298 fOut[codeIn] += weight;
4302 void SetProfile(/*IN/OUT*/NJ_t *NJ, int node, double weight1) {
4303 children_t *c = &NJ->child[node];
4304 assert(c->nChild == 2);
4305 assert(NJ->profiles[c->child[0]] != NULL);
4306 assert(NJ->profiles[c->child[1]] != NULL);
4307 if (NJ->profiles[node] != NULL)
4308 FreeProfile(NJ->profiles[node], NJ->nPos, NJ->nConstraints);
4309 NJ->profiles[node] = AverageProfile(NJ->profiles[c->child[0]],
4310 NJ->profiles[c->child[1]],
4311 NJ->nPos, NJ->nConstraints,
4312 NJ->distance_matrix,
4316 /* bionjWeight is the weight of the first sequence (between 0 and 1),
4317 or -1 to do the average.
4319 profile_t *AverageProfile(profile_t *profile1, profile_t *profile2,
4320 int nPos, int nConstraints,
4321 distance_matrix_t *dmat,
4322 double bionjWeight) {
4324 if (bionjWeight < 0) {
4328 /* First, set codes and weights and see how big vectors will be */
4329 profile_t *out = NewProfile(nPos, nConstraints);
4331 for (i = 0; i < nPos; i++) {
4332 out->weights[i] = bionjWeight * profile1->weights[i]
4333 + (1-bionjWeight) * profile2->weights[i];
4334 out->codes[i] = NOCODE;
4335 if (out->weights[i] > 0) {
4336 if (profile1->weights[i] > 0 && profile1->codes[i] != NOCODE
4337 && (profile2->weights[i] <= 0 || profile1->codes[i] == profile2->codes[i])) {
4338 out->codes[i] = profile1->codes[i];
4339 } else if (profile1->weights[i] <= 0
4340 && profile2->weights[i] > 0
4341 && profile2->codes[i] != NOCODE) {
4342 out->codes[i] = profile2->codes[i];
4344 if (out->codes[i] == NOCODE) out->nVectors++;
4348 /* Allocate and set the vectors */
4349 out->vectors = (numeric_t*)mymalloc(sizeof(numeric_t)*nCodes*out->nVectors);
4350 for (i = 0; i < nCodes * out->nVectors; i++) out->vectors[i] = 0;
4351 nProfileFreqAlloc += out->nVectors;
4352 nProfileFreqAvoid += nPos - out->nVectors;
4356 for (i=0; i < nPos; i++) {
4357 numeric_t *f = GET_FREQ(out,i,/*IN/OUT*/iFreqOut);
4358 numeric_t *f1 = GET_FREQ(profile1,i,/*IN/OUT*/iFreq1);
4359 numeric_t *f2 = GET_FREQ(profile2,i,/*IN/OUT*/iFreq2);
4361 if (profile1->weights[i] > 0)
4362 AddToFreq(/*IN/OUT*/f, profile1->weights[i] * bionjWeight,
4363 profile1->codes[i], f1, dmat);
4364 if (profile2->weights[i] > 0)
4365 AddToFreq(/*IN/OUT*/f, profile2->weights[i] * (1.0-bionjWeight),
4366 profile2->codes[i], f2, dmat);
4367 NormalizeFreq(/*IN/OUT*/f, dmat);
4368 } /* end if computing f */
4369 if (verbose > 10 && i < 5) {
4370 fprintf(stderr,"Average profiles: pos %d in-w1 %f in-w2 %f bionjWeight %f to weight %f code %d\n",
4371 i, profile1->weights[i], profile2->weights[i], bionjWeight,
4372 out->weights[i], out->codes[i]);
4375 for (k = 0; k < nCodes; k++)
4376 fprintf(stderr, "\t%c:%f", codesString[k], f ? f[k] : -1.0);
4377 fprintf(stderr,"\n");
4380 } /* end loop over positions */
4381 assert(iFreq1 == profile1->nVectors);
4382 assert(iFreq2 == profile2->nVectors);
4383 assert(iFreqOut == out->nVectors);
4385 /* compute total constraints */
4386 for (i = 0; i < nConstraints; i++) {
4387 out->nOn[i] = profile1->nOn[i] + profile2->nOn[i];
4388 out->nOff[i] = profile1->nOff[i] + profile2->nOff[i];
4394 /* Make the (unrotated) frequencies sum to 1
4395 Simply dividing by total_weight is not ideal because of roundoff error
4396 So compute total_freq instead
4398 void NormalizeFreq(/*IN/OUT*/numeric_t *freq, distance_matrix_t *dmat) {
4399 double total_freq = 0;
4402 /* The total frequency is dot_product(true_frequencies, 1)
4403 So we rotate the 1 vector by eigeninv (stored in eigentot)
4405 total_freq = vector_multiply_sum(freq, dmat->eigentot, nCodes);
4407 for (k = 0; k < nCodes; k++)
4408 total_freq += freq[k];
4410 if (total_freq > fPostTotalTolerance) {
4411 numeric_t inverse_weight = 1.0/total_freq;
4412 vector_multiply_by(/*IN/OUT*/freq, inverse_weight, nCodes);
4414 /* This can happen if we are in a very low-weight region, e.g. if a mostly-gap position gets weighted down
4415 repeatedly; just set them all to arbitrary but legal values */
4417 for (k = 0; k < nCodes; k++)
4418 freq[k] = 1.0/nCodes;
4420 for (k = 0; k < nCodes; k++)
4421 freq[k] = dmat->codeFreq[0][k];/*XXX gapFreq[k];*/
4426 /* OutProfile() computes the out-profile */
4427 profile_t *OutProfile(profile_t **profiles, int nProfiles,
4428 int nPos, int nConstraints,
4429 distance_matrix_t *dmat) {
4430 int i; /* position */
4431 int in; /* profile */
4432 profile_t *out = NewProfile(nPos, nConstraints);
4434 double inweight = 1.0/(double)nProfiles; /* The maximal output weight is 1.0 */
4436 /* First, set weights -- code is always NOCODE, prevent weight=0 */
4437 for (i = 0; i < nPos; i++) {
4438 out->weights[i] = 0;
4439 for (in = 0; in < nProfiles; in++)
4440 out->weights[i] += profiles[in]->weights[i] * inweight;
4441 if (out->weights[i] <= 0) out->weights[i] = 1e-20; /* always store a vector */
4443 out->codes[i] = NOCODE; /* outprofile is normally complicated */
4446 /* Initialize the frequencies to 0 */
4447 out->vectors = (numeric_t*)mymalloc(sizeof(numeric_t)*nCodes*out->nVectors);
4448 for (i = 0; i < nCodes*out->nVectors; i++)
4449 out->vectors[i] = 0;
4451 /* Add up the weights, going through each sequence in turn */
4452 for (in = 0; in < nProfiles; in++) {
4455 for (i = 0; i < nPos; i++) {
4456 numeric_t *fIn = GET_FREQ(profiles[in],i,/*IN/OUT*/iFreqIn);
4457 numeric_t *fOut = GET_FREQ(out,i,/*IN/OUT*/iFreqOut);
4458 if (profiles[in]->weights[i] > 0)
4459 AddToFreq(/*IN/OUT*/fOut, profiles[in]->weights[i],
4460 profiles[in]->codes[i], fIn, dmat);
4462 assert(iFreqOut == out->nVectors);
4463 assert(iFreqIn == profiles[in]->nVectors);
4466 /* And normalize the frequencies to sum to 1 */
4468 for (i = 0; i < nPos; i++) {
4469 numeric_t *fOut = GET_FREQ(out,i,/*IN/OUT*/iFreqOut);
4471 NormalizeFreq(/*IN/OUT*/fOut, dmat);
4473 assert(iFreqOut == out->nVectors);
4474 if (verbose > 10) fprintf(stderr,"Average %d profiles\n", nProfiles);
4476 SetCodeDist(/*IN/OUT*/out, nPos, dmat);
4478 /* Compute constraints */
4479 for (i = 0; i < nConstraints; i++) {
4482 for (in = 0; in < nProfiles; in++) {
4483 out->nOn[i] += profiles[in]->nOn[i];
4484 out->nOff[i] += profiles[in]->nOff[i];
4490 void UpdateOutProfile(/*IN/OUT*/profile_t *out, profile_t *old1, profile_t *old2,
4491 profile_t *new, int nActiveOld,
4492 int nPos, int nConstraints,
4493 distance_matrix_t *dmat) {
4499 assert(nActiveOld > 0);
4501 for (i = 0; i < nPos; i++) {
4502 numeric_t *fOut = GET_FREQ(out,i,/*IN/OUT*/iFreqOut);
4503 numeric_t *fOld1 = GET_FREQ(old1,i,/*IN/OUT*/iFreq1);
4504 numeric_t *fOld2 = GET_FREQ(old2,i,/*IN/OUT*/iFreq2);
4505 numeric_t *fNew = GET_FREQ(new,i,/*IN/OUT*/iFreqNew);
4507 assert(out->codes[i] == NOCODE && fOut != NULL); /* No no-vector optimization for outprofiles */
4508 if (verbose > 3 && i < 3) {
4509 fprintf(stderr,"Updating out-profile position %d weight %f (mult %f)\n",
4510 i, out->weights[i], out->weights[i]*nActiveOld);
4512 double originalMult = out->weights[i]*nActiveOld;
4513 double newMult = originalMult + new->weights[i] - old1->weights[i] - old2->weights[i];
4514 out->weights[i] = newMult/(nActiveOld-1);
4515 if (out->weights[i] <= 0) out->weights[i] = 1e-20; /* always use the vector */
4517 for (k = 0; k < nCodes; k++) fOut[k] *= originalMult;
4519 if (old1->weights[i] > 0)
4520 AddToFreq(/*IN/OUT*/fOut, -old1->weights[i], old1->codes[i], fOld1, dmat);
4521 if (old2->weights[i] > 0)
4522 AddToFreq(/*IN/OUT*/fOut, -old2->weights[i], old2->codes[i], fOld2, dmat);
4523 if (new->weights[i] > 0)
4524 AddToFreq(/*IN/OUT*/fOut, new->weights[i], new->codes[i], fNew, dmat);
4526 /* And renormalize */
4527 NormalizeFreq(/*IN/OUT*/fOut, dmat);
4529 if (verbose > 2 && i < 3) {
4530 fprintf(stderr,"Updated out-profile position %d weight %f (mult %f)",
4531 i, out->weights[i], out->weights[i]*nActiveOld);
4532 if(out->weights[i] > 0)
4533 for (k=0;k<nCodes;k++)
4534 fprintf(stderr, " %c:%f", dmat?'?':codesString[k], fOut[k]);
4535 fprintf(stderr,"\n");
4538 assert(iFreqOut == out->nVectors);
4539 assert(iFreq1 == old1->nVectors);
4540 assert(iFreq2 == old2->nVectors);
4541 assert(iFreqNew == new->nVectors);
4543 SetCodeDist(/*IN/OUT*/out,nPos,dmat);
4545 /* update constraints -- note in practice this should be a no-op */
4546 for (i = 0; i < nConstraints; i++) {
4547 out->nOn[i] += new->nOn[i] - old1->nOn[i] - old2->nOn[i];
4548 out->nOff[i] += new->nOff[i] - old1->nOff[i] - old2->nOff[i];
4552 void SetCodeDist(/*IN/OUT*/profile_t *profile, int nPos,
4553 distance_matrix_t *dmat) {
4554 if (profile->codeDist == NULL)
4555 profile->codeDist = (numeric_t*)mymalloc(sizeof(numeric_t)*nPos*nCodes);
4558 for (i = 0; i < nPos; i++) {
4559 numeric_t *f = GET_FREQ(profile,i,/*IN/OUT*/iFreq);
4562 for (k = 0; k < nCodes; k++)
4563 profile->codeDist[i*nCodes+k] = ProfileDistPiece(/*code1*/profile->codes[i], /*code2*/k,
4564 /*f1*/f, /*f2*/NULL,
4567 assert(iFreq==profile->nVectors);
4571 void SetBestHit(int node, NJ_t *NJ, int nActive,
4572 /*OUT*/besthit_t *bestjoin, /*OUT OPTIONAL*/besthit_t *allhits) {
4573 assert(NJ->parent[node] < 0);
4577 bestjoin->dist = 1e20;
4578 bestjoin->criterion = 1e20;
4584 /* Note -- if we are already in a parallel region, this will be ignored */
4585 #pragma omp parallel for schedule(dynamic, 50)
4587 for (j = 0; j < NJ->maxnode; j++) {
4588 besthit_t *sv = allhits != NULL ? &allhits[j] : &tmp;
4591 if (NJ->parent[j] >= 0) {
4592 sv->i = -1; /* illegal/empty join */
4594 sv->criterion = sv->dist = 1e20;
4597 /* Note that we compute self-distances (allow j==node) because the top-hit heuristic
4598 expects self to be within its top hits, but we exclude those from the bestjoin
4601 SetDistCriterion(NJ, nActive, /*IN/OUT*/sv);
4602 if (sv->criterion < bestjoin->criterion && node != j)
4606 fprintf(stderr, "SetBestHit %d %d %f %f\n", bestjoin->i, bestjoin->j, bestjoin->dist, bestjoin->criterion);
4610 void ReadMatrix(char *filename, /*OUT*/numeric_t codes[MAXCODES][MAXCODES], bool checkCodes) {
4611 char buf[BUFFER_SIZE] = "";
4612 FILE *fp = fopen(filename, "r");
4614 fprintf(stderr, "Cannot read %s\n",filename);
4617 if (fgets(buf,sizeof(buf),fp) == NULL) {
4618 fprintf(stderr, "Error reading header line for %s:\n%s\n", filename, buf);
4624 for (iBufPos=0,i=0;i<nCodes;i++,iBufPos++) {
4625 if(buf[iBufPos] != codesString[i]) {
4626 fprintf(stderr,"Header line\n%s\nin file %s does not have expected code %c # %d in %s\n",
4627 buf, filename, codesString[i], i, codesString);
4631 if(buf[iBufPos] != '\n' && buf[iBufPos] != '\r' && buf[iBufPos] != '\0' && buf[iBufPos] != '\t') {
4632 fprintf(stderr, "Header line in %s should be tab-delimited\n", filename);
4635 if (buf[iBufPos] == '\0' && i < nCodes-1) {
4636 fprintf(stderr, "Header line in %s ends prematurely\n",filename);
4639 } /* end loop over codes */
4640 /* Should be at end, but allow \n because of potential DOS \r\n */
4641 if(buf[iBufPos] != '\0' && buf[iBufPos] != '\n' && buf[iBufPos] != '\r') {
4642 fprintf(stderr, "Header line in %s has too many entries\n", filename);
4647 for (iLine = 0; iLine < nCodes; iLine++) {
4649 if (fgets(buf,sizeof(buf),fp) == NULL) {
4650 fprintf(stderr, "Cannot read line %d from file %s\n", iLine+2, filename);
4653 char *field = strtok(buf,"\t\r\n");
4654 field = strtok(NULL, "\t"); /* ignore first column */
4656 for (iColumn = 0; iColumn < nCodes && field != NULL; iColumn++, field = strtok(NULL,"\t")) {
4657 if(sscanf(field,ScanNumericSpec,&codes[iLine][iColumn]) != 1) {
4658 fprintf(stderr,"Cannot parse field %s in file %s\n", field, filename);
4665 void ReadVector(char *filename, /*OUT*/numeric_t codes[MAXCODES]) {
4666 FILE *fp = fopen(filename,"r");
4668 fprintf(stderr, "Cannot read %s\n",filename);
4672 for (i = 0; i < nCodes; i++) {
4673 if (fscanf(fp,ScanNumericSpec,&codes[i]) != 1) {
4674 fprintf(stderr,"Cannot read %d entry of %s\n",i+1,filename);
4678 if (fclose(fp) != 0) {
4679 fprintf(stderr, "Error reading %s\n",filename);
4684 distance_matrix_t *ReadDistanceMatrix(char *prefix) {
4685 char buffer[BUFFER_SIZE];
4686 distance_matrix_t *dmat = (distance_matrix_t*)mymalloc(sizeof(distance_matrix_t));
4688 if(strlen(prefix) > BUFFER_SIZE-20) {
4689 fprintf(stderr,"Filename %s too long\n", prefix);
4693 strcpy(buffer, prefix);
4694 strcat(buffer, ".distances");
4695 ReadMatrix(buffer, /*OUT*/dmat->distances, /*checkCodes*/true);
4697 strcpy(buffer, prefix);
4698 strcat(buffer, ".inverses");
4699 ReadMatrix(buffer, /*OUT*/dmat->eigeninv, /*checkCodes*/false);
4701 strcpy(buffer, prefix);
4702 strcat(buffer, ".eigenvalues");
4703 ReadVector(buffer, /*OUT*/dmat->eigenval);
4705 if(verbose>1) fprintf(stderr, "Read distance matrix from %s\n",prefix);
4706 SetupDistanceMatrix(/*IN/OUT*/dmat);
4710 void SetupDistanceMatrix(/*IN/OUT*/distance_matrix_t *dmat) {
4711 /* Check that the eigenvalues and eigen-inverse are consistent with the
4712 distance matrix and that the matrix is symmetric */
4714 for (i = 0; i < nCodes; i++) {
4715 for (j = 0; j < nCodes; j++) {
4716 if(fabs(dmat->distances[i][j]-dmat->distances[j][i]) > 1e-6) {
4717 fprintf(stderr,"Distance matrix not symmetric for %d,%d: %f vs %f\n",
4719 dmat->distances[i][j],
4720 dmat->distances[j][i]);
4724 for (k = 0; k < nCodes; k++)
4725 total += dmat->eigenval[k] * dmat->eigeninv[k][i] * dmat->eigeninv[k][j];
4726 if(fabs(total - dmat->distances[i][j]) > 1e-6) {
4727 fprintf(stderr,"Distance matrix entry %d,%d should be %f but eigen-representation gives %f\n",
4728 i+1,j+1,dmat->distances[i][j],total);
4734 /* And compute eigentot */
4735 for (k = 0; k < nCodes; k++) {
4736 dmat->eigentot[k] = 0.;
4738 for (j = 0; j < nCodes; j++)
4739 dmat->eigentot[k] += dmat->eigeninv[k][j];
4742 /* And compute codeFreq */
4744 for(code = 0; code < nCodes; code++) {
4745 for (k = 0; k < nCodes; k++) {
4746 dmat->codeFreq[code][k] = dmat->eigeninv[k][code];
4750 for(code = 0; code < nCodes; code++) {
4751 double gapFreq = 0.0;
4752 for (k = 0; k < nCodes; k++)
4753 gapFreq += dmat->codeFreq[k][code];
4754 dmat->gapFreq[code] = gapFreq / nCodes;
4757 if(verbose>10) fprintf(stderr, "Made codeFreq\n");
4760 nni_t ChooseNNI(profile_t *profiles[4],
4761 /*OPTIONAL*/distance_matrix_t *dmat,
4762 int nPos, int nConstraints,
4763 /*OUT*/double criteria[3]) {
4765 CorrectedPairDistances(profiles, 4, dmat, nPos, /*OUT*/d);
4766 double penalty[3]; /* indexed as nni_t */
4767 QuartetConstraintPenalties(profiles, nConstraints, /*OUT*/penalty);
4768 criteria[ABvsCD] = d[qAB] + d[qCD] + penalty[ABvsCD];
4769 criteria[ACvsBD] = d[qAC] + d[qBD] + penalty[ACvsBD];
4770 criteria[ADvsBC] = d[qAD] + d[qBC] + penalty[ADvsBC];
4772 nni_t choice = ABvsCD;
4773 if (criteria[ACvsBD] < criteria[ABvsCD] && criteria[ACvsBD] <= criteria[ADvsBC]) {
4775 } else if (criteria[ADvsBC] < criteria[ABvsCD] && criteria[ADvsBC] <= criteria[ACvsBD]) {
4778 if (verbose > 1 && penalty[choice] > penalty[ABvsCD] + 1e-6) {
4779 fprintf(stderr, "Worsen constraint: from %.3f to %.3f distance %.3f to %.3f: ",
4780 penalty[ABvsCD], penalty[choice],
4781 criteria[ABvsCD], choice == ACvsBD ? criteria[ACvsBD] : criteria[ADvsBC]);
4783 for (iC = 0; iC < nConstraints; iC++) {
4785 if (QuartetConstraintPenaltiesPiece(profiles, iC, /*OUT*/ppart)) {
4786 double old_penalty = ppart[ABvsCD];
4787 double new_penalty = ppart[choice];
4788 if (new_penalty > old_penalty + 1e-6)
4789 fprintf(stderr, " %d (%d/%d %d/%d %d/%d %d/%d)", iC,
4790 profiles[0]->nOn[iC], profiles[0]->nOff[iC],
4791 profiles[1]->nOn[iC], profiles[1]->nOff[iC],
4792 profiles[2]->nOn[iC], profiles[2]->nOff[iC],
4793 profiles[3]->nOn[iC], profiles[3]->nOff[iC]);
4796 fprintf(stderr,"\n");
4799 fprintf(stderr, "NNI scores ABvsCD %.5f ACvsBD %.5f ADvsBC %.5f choice %s\n",
4800 criteria[ABvsCD], criteria[ACvsBD], criteria[ADvsBC],
4801 choice == ABvsCD ? "AB|CD" : (choice == ACvsBD ? "AC|BD" : "AD|BC"));
4805 profile_t *PosteriorProfile(profile_t *p1, profile_t *p2,
4806 double len1, double len2,
4807 /*OPTIONAL*/transition_matrix_t *transmat,
4809 int nPos, int nConstraints) {
4810 if (len1 < MLMinBranchLength)
4811 len1 = MLMinBranchLength;
4812 if (len2 < MLMinBranchLength)
4813 len2 = MLMinBranchLength;
4816 profile_t *out = NewProfile(nPos, nConstraints);
4817 for (i = 0; i < nPos; i++) {
4818 out->codes[i] = NOCODE;
4819 out->weights[i] = 1.0;
4821 out->nVectors = nPos;
4822 out->vectors = (numeric_t*)mymalloc(sizeof(numeric_t)*nCodes*out->nVectors);
4823 for (i = 0; i < nCodes * out->nVectors; i++) out->vectors[i] = 0;
4827 numeric_t *expeigenRates1 = NULL, *expeigenRates2 = NULL;
4829 if (transmat != NULL) {
4830 expeigenRates1 = ExpEigenRates(len1, transmat, rates);
4831 expeigenRates2 = ExpEigenRates(len2, transmat, rates);
4834 if (transmat == NULL) { /* Jukes-Cantor */
4835 assert(nCodes == 4);
4837 numeric_t fAll[128][4];
4838 for (j = 0; j < 4; j++)
4839 for (k = 0; k < 4; k++)
4840 fAll[j][k] = (j==k) ? 1.0 : 0.0;
4841 for (k = 0; k < 4; k++)
4842 fAll[NOCODE][k] = 0.25;
4844 double *PSame1 = PSameVector(len1, rates);
4845 double *PDiff1 = PDiffVector(PSame1, rates);
4846 double *PSame2 = PSameVector(len2, rates);
4847 double *PDiff2 = PDiffVector(PSame2, rates);
4849 numeric_t mix1[4], mix2[4];
4851 for (i=0; i < nPos; i++) {
4852 int iRate = rates->ratecat[i];
4853 double w1 = p1->weights[i];
4854 double w2 = p2->weights[i];
4855 int code1 = p1->codes[i];
4856 int code2 = p2->codes[i];
4857 numeric_t *f1 = GET_FREQ(p1,i,/*IN/OUT*/iFreq1);
4858 numeric_t *f2 = GET_FREQ(p2,i,/*IN/OUT*/iFreq2);
4860 /* First try to store a simple profile */
4861 if (f1 == NULL && f2 == NULL) {
4862 if (code1 == NOCODE && code2 == NOCODE) {
4863 out->codes[i] = NOCODE;
4864 out->weights[i] = 0.0;
4866 } else if (code1 == NOCODE) {
4867 /* Posterior(parent | character & gap, len1, len2) = Posterior(parent | character, len1)
4868 = PSame() for matching characters and 1-PSame() for the rest
4869 = (pSame - pDiff) * character + (1-(pSame-pDiff)) * gap
4871 out->codes[i] = code2;
4872 out->weights[i] = w2 * (PSame2[iRate] - PDiff2[iRate]);
4874 } else if (code2 == NOCODE) {
4875 out->codes[i] = code1;
4876 out->weights[i] = w1 * (PSame1[iRate] - PDiff1[iRate]);
4878 } else if (code1 == code2) {
4879 out->codes[i] = code1;
4880 double f12code = (w1*PSame1[iRate] + (1-w1)*0.25) * (w2*PSame2[iRate] + (1-w2)*0.25);
4881 double f12other = (w1*PDiff1[iRate] + (1-w1)*0.25) * (w2*PDiff2[iRate] + (1-w2)*0.25);
4882 /* posterior probability of code1/code2 after scaling */
4883 double pcode = f12code/(f12code+3*f12other);
4884 /* Now f = w * (code ? 1 : 0) + (1-w) * 0.25, so to get pcode we need
4885 fcode = 1/4 + w1*3/4 or w = (f-1/4)*4/3
4887 out->weights[i] = (pcode - 0.25) * 4.0/3.0;
4888 /* This can be zero because of numerical problems, I think */
4889 if (out->weights[i] < 1e-6) {
4891 fprintf(stderr, "Replaced weight %f with %f from w1 %f w2 %f PSame %f %f f12code %f f12other %f\n",
4892 out->weights[i], 1e-6,
4894 PSame1[iRate], PSame2[iRate],
4896 out->weights[i] = 1e-6;
4901 /* if we did not compute a simple profile, then do the full computation and
4902 store the full vector
4905 for (j = 0; j < 4; j++)
4906 mix1[j] = (1-w1)*0.25;
4912 for (j = 0; j < 4; j++)
4913 mix2[j] = (1-w2)*0.25;
4918 out->codes[i] = NOCODE;
4919 out->weights[i] = 1.0;
4920 numeric_t *f = GET_FREQ(out,i,/*IN/OUT*/iFreqOut);
4922 for (j = 0; j < 4; j++) {
4923 f[j] = (f1[j] * PSame1[iRate] + (1.0-f1[j]) * PDiff1[iRate])
4924 * (f2[j] * PSame2[iRate] + (1.0-f2[j]) * PDiff2[iRate]);
4927 double lkABInv = 1.0/lkAB;
4928 for (j = 0; j < 4; j++)
4931 PSame1 = myfree(PSame1, sizeof(double) * rates->nRateCategories);
4932 PSame2 = myfree(PSame2, sizeof(double) * rates->nRateCategories);
4933 PDiff1 = myfree(PDiff1, sizeof(double) * rates->nRateCategories);
4934 PDiff2 = myfree(PDiff2, sizeof(double) * rates->nRateCategories);
4935 } else if (nCodes == 4) { /* matrix model on nucleotides */
4936 numeric_t *fGap = &transmat->codeFreq[NOCODE][0];
4937 numeric_t f1mix[4], f2mix[4];
4939 for (i=0; i < nPos; i++) {
4940 if (p1->codes[i] == NOCODE && p2->codes[i] == NOCODE
4941 && p1->weights[i] == 0 && p2->weights[i] == 0) {
4942 /* aligning gap with gap -- just output a gap
4943 out->codes[i] is already set to NOCODE so need not set that */
4944 out->weights[i] = 0;
4947 int iRate = rates->ratecat[i];
4948 numeric_t *expeigen1 = &expeigenRates1[iRate*4];
4949 numeric_t *expeigen2 = &expeigenRates2[iRate*4];
4950 numeric_t *f1 = GET_FREQ(p1,i,/*IN/OUT*/iFreq1);
4951 numeric_t *f2 = GET_FREQ(p2,i,/*IN/OUT*/iFreq2);
4952 numeric_t *fOut = GET_FREQ(out,i,/*IN/OUT*/iFreqOut);
4953 assert(fOut != NULL);
4956 f1 = &transmat->codeFreq[p1->codes[i]][0]; /* codeFreq includes an entry for NOCODE */
4957 double w = p1->weights[i];
4958 if (w > 0.0 && w < 1.0) {
4959 for (j = 0; j < 4; j++)
4960 f1mix[j] = w * f1[j] + (1.0-w) * fGap[j];
4965 f2 = &transmat->codeFreq[p2->codes[i]][0];
4966 double w = p2->weights[i];
4967 if (w > 0.0 && w < 1.0) {
4968 for (j = 0; j < 4; j++)
4969 f2mix[j] = w * f2[j] + (1.0-w) * fGap[j];
4973 numeric_t fMult1[4] ALIGNED; /* rotated1 * expeigen1 */
4974 numeric_t fMult2[4] ALIGNED; /* rotated2 * expeigen2 */
4975 #if 0 /* SSE3 is slower */
4976 vector_multiply(f1, expeigen1, 4, /*OUT*/fMult1);
4977 vector_multiply(f2, expeigen2, 4, /*OUT*/fMult2);
4979 for (j = 0; j < 4; j++) {
4980 fMult1[j] = f1[j]*expeigen1[j];
4981 fMult2[j] = f2[j]*expeigen2[j];
4984 numeric_t fPost[4] ALIGNED; /* in unrotated space */
4985 for (j = 0; j < 4; j++) {
4986 #if 0 /* SSE3 is slower */
4987 fPost[j] = vector_dot_product_rot(fMult1, fMult2, &transmat->codeFreq[j][0], 4)
4988 * transmat->statinv[j]; */
4992 for (k = 0; k < 4; k++) {
4993 out1 += fMult1[k] * transmat->codeFreq[j][k];
4994 out2 += fMult2[k] * transmat->codeFreq[j][k];
4996 fPost[j] = out1*out2*transmat->statinv[j];
4999 double fPostTot = 0;
5000 for (j = 0; j < 4; j++)
5001 fPostTot += fPost[j];
5002 assert(fPostTot > fPostTotalTolerance);
5003 double fPostInv = 1.0/fPostTot;
5004 #if 0 /* SSE3 is slower */
5005 vector_multiply_by(fPost, fPostInv, 4);
5007 for (j = 0; j < 4; j++)
5008 fPost[j] *= fPostInv;
5011 /* and finally, divide by stat again & rotate to give the new frequencies */
5012 matrixt_by_vector4(transmat->eigeninvT, fPost, /*OUT*/fOut);
5013 } /* end loop over position i */
5014 } else if (nCodes == 20) { /* matrix model on amino acids */
5015 numeric_t *fGap = &transmat->codeFreq[NOCODE][0];
5016 numeric_t f1mix[20] ALIGNED;
5017 numeric_t f2mix[20] ALIGNED;
5019 for (i=0; i < nPos; i++) {
5020 if (p1->codes[i] == NOCODE && p2->codes[i] == NOCODE
5021 && p1->weights[i] == 0 && p2->weights[i] == 0) {
5022 /* aligning gap with gap -- just output a gap
5023 out->codes[i] is already set to NOCODE so need not set that */
5024 out->weights[i] = 0;
5027 int iRate = rates->ratecat[i];
5028 numeric_t *expeigen1 = &expeigenRates1[iRate*20];
5029 numeric_t *expeigen2 = &expeigenRates2[iRate*20];
5030 numeric_t *f1 = GET_FREQ(p1,i,/*IN/OUT*/iFreq1);
5031 numeric_t *f2 = GET_FREQ(p2,i,/*IN/OUT*/iFreq2);
5032 numeric_t *fOut = GET_FREQ(out,i,/*IN/OUT*/iFreqOut);
5033 assert(fOut != NULL);
5036 f1 = &transmat->codeFreq[p1->codes[i]][0]; /* codeFreq includes an entry for NOCODE */
5037 double w = p1->weights[i];
5038 if (w > 0.0 && w < 1.0) {
5039 for (j = 0; j < 20; j++)
5040 f1mix[j] = w * f1[j] + (1.0-w) * fGap[j];
5045 f2 = &transmat->codeFreq[p2->codes[i]][0];
5046 double w = p2->weights[i];
5047 if (w > 0.0 && w < 1.0) {
5048 for (j = 0; j < 20; j++)
5049 f2mix[j] = w * f2[j] + (1.0-w) * fGap[j];
5053 numeric_t fMult1[20] ALIGNED; /* rotated1 * expeigen1 */
5054 numeric_t fMult2[20] ALIGNED; /* rotated2 * expeigen2 */
5055 vector_multiply(f1, expeigen1, 20, /*OUT*/fMult1);
5056 vector_multiply(f2, expeigen2, 20, /*OUT*/fMult2);
5057 numeric_t fPost[20] ALIGNED; /* in unrotated space */
5058 for (j = 0; j < 20; j++) {
5059 numeric_t value = vector_dot_product_rot(fMult1, fMult2, &transmat->codeFreq[j][0], 20)
5060 * transmat->statinv[j];
5061 /* Added this logic try to avoid rare numerical problems */
5062 fPost[j] = value >= 0 ? value : 0;
5064 double fPostTot = vector_sum(fPost, 20);
5065 assert(fPostTot > fPostTotalTolerance);
5066 double fPostInv = 1.0/fPostTot;
5067 vector_multiply_by(/*IN/OUT*/fPost, fPostInv, 20);
5068 int ch = -1; /* the dominant character, if any */
5070 for (j = 0; j < 20; j++) {
5071 if (fPost[j] >= approxMLminf) {
5078 /* now, see if we can use the approximation
5079 fPost ~= (1 or 0) * w + nearP * (1-w)
5080 to avoid rotating */
5083 w = (fPost[ch] - transmat->nearP[ch][ch]) / (1.0 - transmat->nearP[ch][ch]);
5084 for (j = 0; j < 20; j++) {
5086 double fRough = (1.0-w) * transmat->nearP[ch][j];
5087 if (fRough < fPost[j] * approxMLminratio) {
5088 ch = -1; /* give up on the approximation */
5095 nAAPosteriorRough++;
5096 double wInvStat = w * transmat->statinv[ch];
5097 for (j = 0; j < 20; j++)
5098 fOut[j] = wInvStat * transmat->codeFreq[ch][j] + (1.0-w) * transmat->nearFreq[ch][j];
5100 /* and finally, divide by stat again & rotate to give the new frequencies */
5101 nAAPosteriorExact++;
5102 for (j = 0; j < 20; j++)
5103 fOut[j] = vector_multiply_sum(fPost, &transmat->eigeninv[j][0], 20);
5105 } /* end loop over position i */
5107 assert(0); /* illegal nCodes */
5110 if (transmat != NULL) {
5111 expeigenRates1 = myfree(expeigenRates1, sizeof(numeric_t) * rates->nRateCategories * nCodes);
5112 expeigenRates2 = myfree(expeigenRates2, sizeof(numeric_t) * rates->nRateCategories * nCodes);
5115 /* Reallocate out->vectors to be the right size */
5116 out->nVectors = iFreqOut;
5117 if (out->nVectors == 0)
5118 out->vectors = (numeric_t*)myfree(out->vectors, sizeof(numeric_t)*nCodes*nPos);
5120 out->vectors = (numeric_t*)myrealloc(out->vectors,
5121 /*OLDSIZE*/sizeof(numeric_t)*nCodes*nPos,
5122 /*NEWSIZE*/sizeof(numeric_t)*nCodes*out->nVectors,
5123 /*copy*/true); /* try to save space */
5124 nProfileFreqAlloc += out->nVectors;
5125 nProfileFreqAvoid += nPos - out->nVectors;
5127 /* compute total constraints */
5128 for (i = 0; i < nConstraints; i++) {
5129 out->nOn[i] = p1->nOn[i] + p2->nOn[i];
5130 out->nOff[i] = p1->nOff[i] + p2->nOff[i];
5132 nPosteriorCompute++;
5136 double *PSameVector(double length, rates_t *rates) {
5137 double *pSame = mymalloc(sizeof(double) * rates->nRateCategories);
5139 for (iRate = 0; iRate < rates->nRateCategories; iRate++)
5140 pSame[iRate] = 0.25 + 0.75 * exp((-4.0/3.0) * fabs(length*rates->rates[iRate]));
5144 double *PDiffVector(double *pSame, rates_t *rates) {
5145 double *pDiff = mymalloc(sizeof(double) * rates->nRateCategories);
5147 for (iRate = 0; iRate < rates->nRateCategories; iRate++)
5148 pDiff[iRate] = (1.0 - pSame[iRate])/3.0;
5152 numeric_t *ExpEigenRates(double length, transition_matrix_t *transmat, rates_t *rates) {
5153 numeric_t *expeigen = mymalloc(sizeof(numeric_t) * nCodes * rates->nRateCategories);
5155 for (iRate = 0; iRate < rates->nRateCategories; iRate++) {
5156 for (j = 0; j < nCodes; j++) {
5157 double relLen = length * rates->rates[iRate];
5158 /* very short branch lengths lead to numerical problems so prevent them */
5159 if (relLen < MLMinRelBranchLength)
5160 relLen = MLMinRelBranchLength;
5161 expeigen[iRate*nCodes + j] = exp(relLen * transmat->eigenval[j]);
5167 double PairLogLk(profile_t *pA, profile_t *pB, double length, int nPos,
5168 /*OPTIONAL*/transition_matrix_t *transmat,
5170 /*OPTIONAL IN/OUT*/double *site_likelihoods) {
5172 double loglk = 0.0; /* stores underflow of lk during the loop over positions */
5174 assert(rates != NULL && rates->nRateCategories > 0);
5175 numeric_t *expeigenRates = NULL;
5176 if (transmat != NULL)
5177 expeigenRates = ExpEigenRates(length, transmat, rates);
5179 if (transmat == NULL) { /* Jukes-Cantor */
5180 assert (nCodes == 4);
5181 double *pSame = PSameVector(length, rates);
5182 double *pDiff = PDiffVector(pSame, rates);
5183 numeric_t fAll[128][4];
5184 for (j = 0; j < 4; j++)
5185 for (k = 0; k < 4; k++)
5186 fAll[j][k] = (j==k) ? 1.0 : 0.0;
5187 for (k = 0; k < 4; k++)
5188 fAll[NOCODE][k] = 0.25;
5192 for (i = 0; i < nPos; i++) {
5193 int iRate = rates->ratecat[i];
5194 double wA = pA->weights[i];
5195 double wB = pB->weights[i];
5196 int codeA = pA->codes[i];
5197 int codeB = pB->codes[i];
5198 numeric_t *fA = GET_FREQ(pA,i,/*IN/OUT*/iFreqA);
5199 numeric_t *fB = GET_FREQ(pB,i,/*IN/OUT*/iFreqB);
5202 if (fA == NULL && fB == NULL) {
5203 if (codeA == NOCODE) { /* A is all gaps */
5204 /* gap to gap is sum(j) 0.25 * (0.25 * pSame + 0.75 * pDiff) = sum(i) 0.25*0.25 = 0.25
5205 gap to any character gives the same result
5208 } else if (codeB == NOCODE) { /* B is all gaps */
5210 } else if (codeA == codeB) { /* A and B match */
5211 lkAB = pSame[iRate] * wA*wB + 0.25 * (1-wA*wB);
5212 } else { /* codeA != codeB */
5213 lkAB = pDiff[iRate] * wA*wB + 0.25 * (1-wA*wB);
5215 } else if (fA == NULL) {
5216 /* Compare codeA to profile of B */
5217 if (codeA == NOCODE)
5220 lkAB = wA * (pDiff[iRate] + fB[codeA] * (pSame[iRate]-pDiff[iRate])) + (1.0-wA) * 0.25;
5221 /* because lkAB = wA * P(codeA->B) + (1-wA) * 0.25
5222 P(codeA -> B) = sum(j) P(B==j) * (j==codeA ? pSame : pDiff)
5223 = sum(j) P(B==j) * pDiff +
5224 = pDiff + P(B==codeA) * (pSame-pDiff)
5226 } else if (fB == NULL) { /* Compare codeB to profile of A */
5227 if (codeB == NOCODE)
5230 lkAB = wB * (pDiff[iRate] + fA[codeB] * (pSame[iRate]-pDiff[iRate])) + (1.0-wB) * 0.25;
5231 } else { /* both are full profiles */
5232 for (j = 0; j < 4; j++)
5233 lkAB += fB[j] * (fA[j] * pSame[iRate] + (1-fA[j])* pDiff[iRate]); /* P(A|B) */
5237 while (lk < LkUnderflow) {
5238 lk *= LkUnderflowInv;
5239 loglk -= LogLkUnderflow;
5241 if (site_likelihoods != NULL)
5242 site_likelihoods[i] *= lkAB;
5244 pSame = myfree(pSame, sizeof(double) * rates->nRateCategories);
5245 pDiff = myfree(pDiff, sizeof(double) * rates->nRateCategories);
5246 } else if (nCodes == 4) { /* matrix model on nucleotides */
5249 numeric_t fAmix[4], fBmix[4];
5250 numeric_t *fGap = &transmat->codeFreq[NOCODE][0];
5252 for (i = 0; i < nPos; i++) {
5253 int iRate = rates->ratecat[i];
5254 numeric_t *expeigen = &expeigenRates[iRate*4];
5255 double wA = pA->weights[i];
5256 double wB = pB->weights[i];
5257 if (wA == 0 && wB == 0 && pA->codes[i] == NOCODE && pB->codes[i] == NOCODE) {
5258 /* Likelihood of A vs B is 1, so nothing changes
5259 Do not need to advance iFreqA or iFreqB */
5262 numeric_t *fA = GET_FREQ(pA,i,/*IN/OUT*/iFreqA);
5263 numeric_t *fB = GET_FREQ(pB,i,/*IN/OUT*/iFreqB);
5265 fA = &transmat->codeFreq[pA->codes[i]][0];
5266 if (wA > 0.0 && wA < 1.0) {
5267 for (j = 0; j < 4; j++)
5268 fAmix[j] = wA*fA[j] + (1.0-wA)*fGap[j];
5272 fB = &transmat->codeFreq[pB->codes[i]][0];
5273 if (wB > 0.0 && wB < 1.0) {
5274 for (j = 0; j < 4; j++)
5275 fBmix[j] = wB*fB[j] + (1.0-wB)*fGap[j];
5278 /* SSE3 instructions do not speed this step up:
5279 numeric_t lkAB = vector_multiply3_sum(expeigen, fA, fB); */
5281 for (j = 0; j < 4; j++)
5282 lkAB += expeigen[j]*fA[j]*fB[j];
5284 if (site_likelihoods != NULL)
5285 site_likelihoods[i] *= lkAB;
5287 while (lk < LkUnderflow) {
5288 lk *= LkUnderflowInv;
5289 loglk -= LogLkUnderflow;
5291 while (lk > LkUnderflowInv) {
5293 loglk += LogLkUnderflow;
5296 } else if (nCodes == 20) { /* matrix model on amino acids */
5299 numeric_t fAmix[20], fBmix[20];
5300 numeric_t *fGap = &transmat->codeFreq[NOCODE][0];
5302 for (i = 0; i < nPos; i++) {
5303 int iRate = rates->ratecat[i];
5304 numeric_t *expeigen = &expeigenRates[iRate*20];
5305 double wA = pA->weights[i];
5306 double wB = pB->weights[i];
5307 if (wA == 0 && wB == 0 && pA->codes[i] == NOCODE && pB->codes[i] == NOCODE) {
5308 /* Likelihood of A vs B is 1, so nothing changes
5309 Do not need to advance iFreqA or iFreqB */
5312 numeric_t *fA = GET_FREQ(pA,i,/*IN/OUT*/iFreqA);
5313 numeric_t *fB = GET_FREQ(pB,i,/*IN/OUT*/iFreqB);
5315 fA = &transmat->codeFreq[pA->codes[i]][0];
5316 if (wA > 0.0 && wA < 1.0) {
5317 for (j = 0; j < 20; j++)
5318 fAmix[j] = wA*fA[j] + (1.0-wA)*fGap[j];
5322 fB = &transmat->codeFreq[pB->codes[i]][0];
5323 if (wB > 0.0 && wB < 1.0) {
5324 for (j = 0; j < 20; j++)
5325 fBmix[j] = wB*fB[j] + (1.0-wB)*fGap[j];
5328 numeric_t lkAB = vector_multiply3_sum(expeigen, fA, fB, 20);
5330 /* If this happens, it indicates a numerical problem that needs to be addressed elsewhere,
5331 so report all the details */
5332 fprintf(stderr, "# FastTree.c::PairLogLk -- numerical problem!\n");
5333 fprintf(stderr, "# This block is intended for loading into R\n");
5335 fprintf(stderr, "lkAB = %.8g\n", lkAB);
5336 fprintf(stderr, "Branch_length= %.8g\nalignment_position=%d\nnCodes=%d\nrate_category=%d\nrate=%.8g\n",
5337 length, i, nCodes, iRate, rates->rates[iRate]);
5338 fprintf(stderr, "wA=%.8g\nwB=%.8g\n", wA, wB);
5339 fprintf(stderr, "codeA = %d\ncodeB = %d\n", pA->codes[i], pB->codes[i]);
5341 fprintf(stderr, "fA = c(");
5342 for (j = 0; j < nCodes; j++) fprintf(stderr, "%s %.8g", j==0?"":",", fA[j]);
5343 fprintf(stderr,")\n");
5345 fprintf(stderr, "fB = c(");
5346 for (j = 0; j < nCodes; j++) fprintf(stderr, "%s %.8g", j==0?"":",", fB[j]);
5347 fprintf(stderr,")\n");
5349 fprintf(stderr, "stat = c(");
5350 for (j = 0; j < nCodes; j++) fprintf(stderr, "%s %.8g", j==0?"":",", transmat->stat[j]);
5351 fprintf(stderr,")\n");
5353 fprintf(stderr, "eigenval = c(");
5354 for (j = 0; j < nCodes; j++) fprintf(stderr, "%s %.8g", j==0?"":",", transmat->eigenval[j]);
5355 fprintf(stderr,")\n");
5357 fprintf(stderr, "expeigen = c(");
5358 for (j = 0; j < nCodes; j++) fprintf(stderr, "%s %.8g", j==0?"":",", expeigen[j]);
5359 fprintf(stderr,")\n");
5362 fprintf(stderr, "codeFreq = c(");
5363 for (j = 0; j < nCodes; j++) for(k = 0; k < nCodes; k++) fprintf(stderr, "%s %.8g", j==0 && k==0?"":",",
5364 transmat->codeFreq[j][k]);
5365 fprintf(stderr,")\n");
5367 fprintf(stderr, "eigeninv = c(");
5368 for (j = 0; j < nCodes; j++) for(k = 0; k < nCodes; k++) fprintf(stderr, "%s %.8g", j==0 && k==0?"":",",
5369 transmat->eigeninv[j][k]);
5370 fprintf(stderr,")\n");
5372 fprintf(stderr, "# Transform into matrices and compute un-rotated vectors for profiles A and B\n");
5373 fprintf(stderr, "codeFreq = matrix(codeFreq,nrow=20);\n");
5374 fprintf(stderr, "eigeninv = matrix(eigeninv,nrow=20);\n");
5375 fputs("unrotA = stat * (eigeninv %*% fA)\n", stderr);
5376 fputs("unrotB = stat * (eigeninv %*% fB)\n", stderr);
5377 fprintf(stderr,"# End of R block\n");
5380 if (site_likelihoods != NULL)
5381 site_likelihoods[i] *= lkAB;
5383 while (lk < LkUnderflow) {
5384 lk *= LkUnderflowInv;
5385 loglk -= LogLkUnderflow;
5387 while (lk > LkUnderflowInv) {
5389 loglk += LogLkUnderflow;
5393 assert(0); /* illegal nCodes */
5395 if (transmat != NULL)
5396 expeigenRates = myfree(expeigenRates, sizeof(numeric_t) * rates->nRateCategories * 20);
5402 double MLQuartetLogLk(profile_t *pA, profile_t *pB, profile_t *pC, profile_t *pD,
5403 int nPos, /*OPTIONAL*/transition_matrix_t *transmat, rates_t *rates,
5404 /*IN*/double branch_lengths[5],
5405 /*OPTIONAL OUT*/double *site_likelihoods) {
5406 profile_t *pAB = PosteriorProfile(pA, pB,
5407 branch_lengths[0], branch_lengths[1],
5410 nPos, /*nConstraints*/0);
5411 profile_t *pCD = PosteriorProfile(pC, pD,
5412 branch_lengths[2], branch_lengths[3],
5415 nPos, /*nConstraints*/0);
5416 if (site_likelihoods != NULL) {
5418 for (i = 0; i < nPos; i++)
5419 site_likelihoods[i] = 1.0;
5421 /* Roughly, P(A,B,C,D) = P(A) P(B|A) P(D|C) P(AB | CD) */
5422 double loglk = PairLogLk(pA, pB, branch_lengths[0]+branch_lengths[1],
5423 nPos, transmat, rates, /*OPTIONAL IN/OUT*/site_likelihoods)
5424 + PairLogLk(pC, pD, branch_lengths[2]+branch_lengths[3],
5425 nPos, transmat, rates, /*OPTIONAL IN/OUT*/site_likelihoods)
5426 + PairLogLk(pAB, pCD, branch_lengths[4],
5427 nPos, transmat, rates, /*OPTIONAL IN/OUT*/site_likelihoods);
5428 pAB = FreeProfile(pAB, nPos, /*nConstraints*/0);
5429 pCD = FreeProfile(pCD, nPos, /*nConstraints*/0);
5433 double PairNegLogLk(double x, void *data) {
5434 quartet_opt_t *qo = (quartet_opt_t *)data;
5436 assert(qo->pair1 != NULL && qo->pair2 != NULL);
5438 double loglk = PairLogLk(qo->pair1, qo->pair2, x, qo->nPos, qo->transmat, qo->rates, /*site_lk*/NULL);
5439 assert(loglk < 1e100);
5441 fprintf(stderr, "PairLogLk(%.4f) = %.4f\n", x, loglk);
5445 double MLQuartetOptimize(profile_t *pA, profile_t *pB, profile_t *pC, profile_t *pD,
5446 int nPos, /*OPTIONAL*/transition_matrix_t *transmat, rates_t *rates,
5447 /*IN/OUT*/double branch_lengths[5],
5448 /*OPTIONAL OUT*/bool *pStarTest,
5449 /*OPTIONAL OUT*/double *site_likelihoods) {
5451 double start_length[5];
5452 for (j = 0; j < 5; j++) {
5453 start_length[j] = branch_lengths[j];
5454 if (branch_lengths[j] < MLMinBranchLength)
5455 branch_lengths[j] = MLMinBranchLength;
5457 quartet_opt_t qopt = { nPos, transmat, rates, /*nEval*/0,
5458 /*pair1*/NULL, /*pair2*/NULL };
5459 double f2x, negloglk;
5461 if (pStarTest != NULL)
5464 /* First optimize internal branch, then branch to A, B, C, D, in turn
5465 May use star test to quit after internal branch
5467 profile_t *pAB = PosteriorProfile(pA, pB,
5468 branch_lengths[LEN_A], branch_lengths[LEN_B],
5469 transmat, rates, nPos, /*nConstraints*/0);
5470 profile_t *pCD = PosteriorProfile(pC, pD,
5471 branch_lengths[LEN_C], branch_lengths[LEN_D],
5472 transmat, rates, nPos, /*nConstraints*/0);
5475 branch_lengths[LEN_I] = onedimenmin(/*xmin*/MLMinBranchLength,
5476 /*xguess*/branch_lengths[LEN_I],
5480 /*ftol*/MLFTolBranchLength,
5481 /*atol*/MLMinBranchLengthTolerance,
5485 if (pStarTest != NULL) {
5486 assert(site_likelihoods == NULL);
5487 double loglkStar = -PairNegLogLk(MLMinBranchLength, &qopt);
5488 if (loglkStar < -negloglk - closeLogLkLimit) {
5490 double off = PairLogLk(pA, pB,
5491 branch_lengths[LEN_A] + branch_lengths[LEN_B],
5492 qopt.nPos, qopt.transmat, qopt.rates, /*site_lk*/NULL)
5494 branch_lengths[LEN_C] + branch_lengths[LEN_D],
5495 qopt.nPos, qopt.transmat, qopt.rates, /*site_lk*/NULL);
5496 pAB = FreeProfile(pAB, nPos, /*nConstraints*/0);
5497 pCD = FreeProfile(pCD, nPos, /*nConstraints*/0);
5498 return (-negloglk + off);
5501 pAB = FreeProfile(pAB, nPos, /*nConstraints*/0);
5502 profile_t *pBCD = PosteriorProfile(pB, pCD,
5503 branch_lengths[LEN_B], branch_lengths[LEN_I],
5504 transmat, rates, nPos, /*nConstraints*/0);
5507 branch_lengths[LEN_A] = onedimenmin(/*xmin*/MLMinBranchLength,
5508 /*xguess*/branch_lengths[LEN_A],
5512 /*ftol*/MLFTolBranchLength,
5513 /*atol*/MLMinBranchLengthTolerance,
5516 pBCD = FreeProfile(pBCD, nPos, /*nConstraints*/0);
5517 profile_t *pACD = PosteriorProfile(pA, pCD,
5518 branch_lengths[LEN_A], branch_lengths[LEN_I],
5519 transmat, rates, nPos, /*nConstraints*/0);
5522 branch_lengths[LEN_B] = onedimenmin(/*xmin*/MLMinBranchLength,
5523 /*xguess*/branch_lengths[LEN_B],
5527 /*ftol*/MLFTolBranchLength,
5528 /*atol*/MLMinBranchLengthTolerance,
5531 pACD = FreeProfile(pACD, nPos, /*nConstraints*/0);
5532 pCD = FreeProfile(pCD, nPos, /*nConstraints*/0);
5533 pAB = PosteriorProfile(pA, pB,
5534 branch_lengths[LEN_A], branch_lengths[LEN_B],
5535 transmat, rates, nPos, /*nConstraints*/0);
5536 profile_t *pABD = PosteriorProfile(pAB, pD,
5537 branch_lengths[LEN_I], branch_lengths[LEN_D],
5538 transmat, rates, nPos, /*nConstraints*/0);
5541 branch_lengths[LEN_C] = onedimenmin(/*xmin*/MLMinBranchLength,
5542 /*xguess*/branch_lengths[LEN_C],
5546 /*ftol*/MLFTolBranchLength,
5547 /*atol*/MLMinBranchLengthTolerance,
5550 pABD = FreeProfile(pABD, nPos, /*nConstraints*/0);
5551 profile_t *pABC = PosteriorProfile(pAB, pC,
5552 branch_lengths[LEN_I], branch_lengths[LEN_C],
5553 transmat, rates, nPos, /*nConstraints*/0);
5556 branch_lengths[LEN_D] = onedimenmin(/*xmin*/MLMinBranchLength,
5557 /*xguess*/branch_lengths[LEN_D],
5561 /*ftol*/MLFTolBranchLength,
5562 /*atol*/MLMinBranchLengthTolerance,
5566 /* Compute the total quartet likelihood
5567 PairLogLk(ABC,D) + PairLogLk(AB,C) + PairLogLk(A,B)
5569 double loglkABCvsD = -negloglk;
5570 if (site_likelihoods) {
5571 for (j = 0; j < nPos; j++)
5572 site_likelihoods[j] = 1.0;
5573 PairLogLk(pABC, pD, branch_lengths[LEN_D],
5574 qopt.nPos, qopt.transmat, qopt.rates, /*IN/OUT*/site_likelihoods);
5576 double quartetloglk = loglkABCvsD
5577 + PairLogLk(pAB, pC, branch_lengths[LEN_I] + branch_lengths[LEN_C],
5578 qopt.nPos, qopt.transmat, qopt.rates,
5579 /*IN/OUT*/site_likelihoods)
5580 + PairLogLk(pA, pB, branch_lengths[LEN_A] + branch_lengths[LEN_B],
5581 qopt.nPos, qopt.transmat, qopt.rates,
5582 /*IN/OUT*/site_likelihoods);
5584 pABC = FreeProfile(pABC, nPos, /*nConstraints*/0);
5585 pAB = FreeProfile(pAB, nPos, /*nConstraints*/0);
5588 double loglkStart = MLQuartetLogLk(pA, pB, pC, pD, nPos, transmat, rates, start_length, /*site_lk*/NULL);
5589 fprintf(stderr, "Optimize loglk from %.5f to %.5f eval %d lengths from\n"
5590 " %.5f %.5f %.5f %.5f %.5f to\n"
5591 " %.5f %.5f %.5f %.5f %.5f\n",
5592 loglkStart, quartetloglk, qopt.nEval,
5593 start_length[0], start_length[1], start_length[2], start_length[3], start_length[4],
5594 branch_lengths[0], branch_lengths[1], branch_lengths[2], branch_lengths[3], branch_lengths[4]);
5596 return(quartetloglk);
5599 nni_t MLQuartetNNI(profile_t *profiles[4],
5600 /*OPTIONAL*/transition_matrix_t *transmat,
5602 int nPos, int nConstraints,
5603 /*OUT*/double criteria[3], /* The three potential quartet log-likelihoods */
5604 /*IN/OUT*/numeric_t len[5],
5608 double lenABvsCD[5] = {len[LEN_A], len[LEN_B], len[LEN_C], len[LEN_D], len[LEN_I]};
5609 double lenACvsBD[5] = {len[LEN_A], len[LEN_C], len[LEN_B], len[LEN_D], len[LEN_I]}; /* Swap B & C */
5610 double lenADvsBC[5] = {len[LEN_A], len[LEN_D], len[LEN_C], len[LEN_B], len[LEN_I]}; /* Swap B & D */
5611 bool bConsiderAC = true;
5612 bool bConsiderAD = true;
5614 int nRounds = mlAccuracy < 2 ? 2 : mlAccuracy;
5616 QuartetConstraintPenalties(profiles, nConstraints, /*OUT*/penalty);
5617 if (penalty[ABvsCD] > penalty[ACvsBD] || penalty[ABvsCD] > penalty[ADvsBC])
5620 bFast = false; /* turn off star topology test */
5623 for (iRound = 0; iRound < nRounds; iRound++) {
5624 bool bStarTest = false;
5627 #pragma omp parallel
5628 #pragma omp sections
5635 criteria[ABvsCD] = MLQuartetOptimize(profiles[0], profiles[1], profiles[2], profiles[3],
5636 nPos, transmat, rates,
5637 /*IN/OUT*/lenABvsCD,
5638 bFast ? &bStarTest : NULL,
5639 /*site_likelihoods*/NULL)
5640 - penalty[ABvsCD]; /* subtract penalty b/c we are trying to maximize log lk */
5648 criteria[ACvsBD] = -1e20;
5649 criteria[ADvsBC] = -1e20;
5650 len[LEN_I] = lenABvsCD[LEN_I];
5656 criteria[ACvsBD] = MLQuartetOptimize(profiles[0], profiles[2], profiles[1], profiles[3],
5657 nPos, transmat, rates,
5658 /*IN/OUT*/lenACvsBD, NULL, /*site_likelihoods*/NULL)
5667 criteria[ADvsBC] = MLQuartetOptimize(profiles[0], profiles[3], profiles[2], profiles[1],
5668 nPos, transmat, rates,
5669 /*IN/OUT*/lenADvsBC, NULL, /*site_likelihoods*/NULL)
5673 } /* end parallel sections */
5674 if (mlAccuracy < 2) {
5675 /* If clearly worse then ABvsCD, or have short internal branch length and worse, then
5677 if (criteria[ACvsBD] < criteria[ABvsCD] - closeLogLkLimit
5678 || (lenACvsBD[LEN_I] <= 2.0*MLMinBranchLength && criteria[ACvsBD] < criteria[ABvsCD]))
5679 bConsiderAC = false;
5680 if (criteria[ADvsBC] < criteria[ABvsCD] - closeLogLkLimit
5681 || (lenADvsBC[LEN_I] <= 2.0*MLMinBranchLength && criteria[ADvsBC] < criteria[ABvsCD]))
5682 bConsiderAD = false;
5683 if (!bConsiderAC && !bConsiderAD)
5685 /* If clearly better than either alternative, then give up
5686 (Comparison is probably biased in favor of ABvsCD anyway) */
5687 if (criteria[ACvsBD] > criteria[ABvsCD] + closeLogLkLimit
5688 && criteria[ACvsBD] > criteria[ADvsBC] + closeLogLkLimit)
5690 if (criteria[ADvsBC] > criteria[ABvsCD] + closeLogLkLimit
5691 && criteria[ADvsBC] > criteria[ACvsBD] + closeLogLkLimit)
5694 } /* end loop over rounds */
5697 fprintf(stderr, "Optimized quartet for %d rounds: ABvsCD %.5f ACvsBD %.5f ADvsBC %.5f\n",
5698 iRound, criteria[ABvsCD], criteria[ACvsBD], criteria[ADvsBC]);
5700 if (criteria[ACvsBD] > criteria[ABvsCD] && criteria[ACvsBD] > criteria[ADvsBC]) {
5701 for (i = 0; i < 5; i++) len[i] = lenACvsBD[i];
5703 } else if (criteria[ADvsBC] > criteria[ABvsCD] && criteria[ADvsBC] > criteria[ACvsBD]) {
5704 for (i = 0; i < 5; i++) len[i] = lenADvsBC[i];
5707 for (i = 0; i < 5; i++) len[i] = lenABvsCD[i];
5712 double TreeLength(/*IN/OUT*/NJ_t *NJ, bool recomputeProfiles) {
5713 if (recomputeProfiles) {
5714 traversal_t traversal2 = InitTraversal(NJ);
5716 while((j = TraversePostorder(j, NJ, /*IN/OUT*/traversal2, /*pUp*/NULL)) >= 0) {
5717 /* nothing to do for leaves or root */
5718 if (j >= NJ->nSeq && j != NJ->root)
5719 SetProfile(/*IN/OUT*/NJ, j, /*noweight*/-1.0);
5721 traversal2 = FreeTraversal(traversal2,NJ);
5723 UpdateBranchLengths(/*IN/OUT*/NJ);
5724 double total_len = 0;
5726 for (iNode = 0; iNode < NJ->maxnode; iNode++)
5727 total_len += NJ->branchlength[iNode];
5731 double TreeLogLk(/*IN*/NJ_t *NJ, /*OPTIONAL OUT*/double *site_loglk) {
5736 double *site_likelihood = NULL;
5737 if (site_loglk != NULL) {
5738 site_likelihood = mymalloc(sizeof(double)*NJ->nPos);
5739 for (i = 0; i < NJ->nPos; i++) {
5740 site_likelihood[i] = 1.0;
5741 site_loglk[i] = 0.0;
5744 traversal_t traversal = InitTraversal(NJ);
5745 int node = NJ->root;
5746 while((node = TraversePostorder(node, NJ, /*IN/OUT*/traversal, /*pUp*/NULL)) >= 0) {
5747 int nChild = NJ->child[node].nChild;
5750 assert(nChild >= 2);
5751 int *children = NJ->child[node].child;
5752 double loglkchild = PairLogLk(NJ->profiles[children[0]], NJ->profiles[children[1]],
5753 NJ->branchlength[children[0]]+NJ->branchlength[children[1]],
5754 NJ->nPos, NJ->transmat, &NJ->rates, /*IN/OUT*/site_likelihood);
5755 loglk += loglkchild;
5756 if (site_likelihood != NULL) {
5757 /* prevent underflows */
5758 for (i = 0; i < NJ->nPos; i++) {
5759 while(site_likelihood[i] < LkUnderflow) {
5760 site_likelihood[i] *= LkUnderflowInv;
5761 site_loglk[i] -= LogLkUnderflow;
5766 fprintf(stderr, "At %d: LogLk(%d:%.4f,%d:%.4f) = %.3f\n",
5768 children[0], NJ->branchlength[children[0]],
5769 children[1], NJ->branchlength[children[1]],
5771 if (NJ->child[node].nChild == 3) {
5772 assert(node == NJ->root);
5773 /* Infer the common parent of the 1st two to define the third... */
5774 profile_t *pAB = PosteriorProfile(NJ->profiles[children[0]],
5775 NJ->profiles[children[1]],
5776 NJ->branchlength[children[0]],
5777 NJ->branchlength[children[1]],
5778 NJ->transmat, &NJ->rates,
5779 NJ->nPos, /*nConstraints*/0);
5780 double loglkup = PairLogLk(pAB, NJ->profiles[children[2]],
5781 NJ->branchlength[children[2]],
5782 NJ->nPos, NJ->transmat, &NJ->rates,
5783 /*IN/OUT*/site_likelihood);
5786 fprintf(stderr, "At root %d: LogLk((%d/%d),%d:%.3f) = %.3f\n",
5787 node, children[0], children[1], children[2],
5788 NJ->branchlength[children[2]],
5790 pAB = FreeProfile(pAB, NJ->nPos, NJ->nConstraints);
5793 traversal = FreeTraversal(traversal,NJ);
5794 if (site_likelihood != NULL) {
5795 for (i = 0; i < NJ->nPos; i++) {
5796 site_loglk[i] += log(site_likelihood[i]);
5798 site_likelihood = myfree(site_likelihood, sizeof(double)*NJ->nPos);
5801 /* For Jukes-Cantor, with a tree of size 4, if the children of the root are
5802 (A,B), C, and D, then
5803 P(ABCD) = P(A) P(B|A) P(C|AB) P(D|ABC)
5805 Above we compute P(B|A) P(C|AB) P(D|ABC) -- note P(B|A) is at the child of root
5806 and P(C|AB) P(D|ABC) is at root.
5808 Similarly if the children of the root are C, D, and (A,B), then
5809 P(ABCD) = P(C|D) P(A|B) P(AB|CD) P(D), and above we compute that except for P(D)
5811 So we need to multiply by P(A) = 0.25, so we pay log(4) at each position
5812 (if ungapped). Each gapped position in any sequence reduces the payment by log(4)
5814 For JTT or GTR, we are computing P(A & B) and the posterior profiles are scaled to take
5815 the prior into account, so we do not need any correction.
5816 codeFreq[NOCODE] is scaled x higher so that P(-) = 1 not P(-)=1/nCodes, so gaps
5817 do not need to be corrected either.
5820 if (nCodes == 4 && NJ->transmat == NULL) {
5822 double logNCodes = log((double)nCodes);
5823 for (i = 0; i < NJ->nPos; i++) {
5824 int nGapsThisPos = 0;
5825 for (node = 0; node < NJ->nSeq; node++) {
5826 unsigned char *codes = NJ->profiles[node]->codes;
5827 if (codes[i] == NOCODE)
5830 nGaps += nGapsThisPos;
5831 if (site_loglk != NULL) {
5832 site_loglk[i] += nGapsThisPos * logNCodes;
5833 if (nCodes == 4 && NJ->transmat == NULL)
5834 site_loglk[i] -= logNCodes;
5837 loglk -= NJ->nPos * logNCodes;
5838 loglk += nGaps * logNCodes; /* do not pay for gaps -- only Jukes-Cantor */
5843 void SetMLGtr(/*IN/OUT*/NJ_t *NJ, /*OPTIONAL IN*/double *freq_in, /*OPTIONAL WRITE*/FILE *fpLog) {
5848 if (freq_in != NULL) {
5850 gtr.freq[i]=freq_in[i];
5852 int n[4] = {1,1,1,1}; /* pseudocounts */
5853 for (i=0; i<NJ->nSeq; i++) {
5854 unsigned char *codes = NJ->profiles[i]->codes;
5856 for (iPos=0; iPos<NJ->nPos; iPos++)
5857 if (codes[iPos] < 4)
5860 int sum = n[0]+n[1]+n[2]+n[3];
5862 gtr.freq[i] = n[i]/(double)sum;
5866 int nRounds = mlAccuracy < 2 ? 2 : mlAccuracy;
5867 for (i = 0; i < nRounds; i++) {
5868 for (gtr.iRate = 0; gtr.iRate < 6; gtr.iRate++) {
5869 ProgressReport("Optimizing GTR model, step %d of %d", i*6+gtr.iRate+1, 12, 0, 0);
5870 double negloglk, f2x;
5871 gtr.rates[gtr.iRate] = onedimenmin(/*xmin*/0.05,
5872 /*xguess*/gtr.rates[gtr.iRate],
5882 /* normalize gtr so last rate is 1 -- specifying that rate separately is useful for optimization only */
5883 for (i = 0; i < 5; i++)
5884 gtr.rates[i] /= gtr.rates[5];
5887 fprintf(stderr, "GTR Frequencies: %.4f %.4f %.4f %.4f\n", gtr.freq[0], gtr.freq[1], gtr.freq[2], gtr.freq[3]);
5888 fprintf(stderr, "GTR rates(ac ag at cg ct gt) %.4f %.4f %.4f %.4f %.4f %.4f\n",
5889 gtr.rates[0],gtr.rates[1],gtr.rates[2],gtr.rates[3],gtr.rates[4],gtr.rates[5]);
5891 if (fpLog != NULL) {
5892 fprintf(fpLog, "GTRFreq\t%.4f\t%.4f\t%.4f\t%.4f\n", gtr.freq[0], gtr.freq[1], gtr.freq[2], gtr.freq[3]);
5893 fprintf(fpLog, "GTRRates\t%.4f\t%.4f\t%.4f\t%.4f\t%.4f\t%.4f\n",
5894 gtr.rates[0],gtr.rates[1],gtr.rates[2],gtr.rates[3],gtr.rates[4],gtr.rates[5]);
5896 myfree(NJ->transmat, sizeof(transition_matrix_t));
5897 NJ->transmat = CreateGTR(gtr.rates, gtr.freq);
5898 RecomputeMLProfiles(/*IN/OUT*/NJ);
5899 OptimizeAllBranchLengths(/*IN/OUT*/NJ);
5902 double GTRNegLogLk(double x, void *data) {
5903 gtr_opt_t *gtr = (gtr_opt_t*)data;
5904 assert(nCodes == 4);
5905 assert(gtr->NJ != NULL);
5906 assert(gtr->iRate >= 0 && gtr->iRate < 6);
5908 transition_matrix_t *old = gtr->NJ->transmat;
5911 for (i = 0; i < 6; i++)
5912 rates[i] = gtr->rates[i];
5913 rates[gtr->iRate] = x;
5915 gtr->NJ->transmat = CreateGTR(rates, gtr->freq);
5916 RecomputeMLProfiles(/*IN/OUT*/gtr->NJ);
5917 double loglk = TreeLogLk(gtr->NJ, /*site_loglk*/NULL);
5918 myfree(gtr->NJ->transmat, sizeof(transition_matrix_t));
5919 gtr->NJ->transmat = old;
5920 /* Do not recompute profiles -- assume the caller will do that */
5922 fprintf(stderr, "GTR LogLk(%.5f %.5f %.5f %.5f %.5f %.5f) = %f\n",
5923 rates[0], rates[1], rates[2], rates[3], rates[4], rates[5], loglk);
5927 /* Caller must free the resulting vector of n rates */
5928 numeric_t *MLSiteRates(int nRateCategories) {
5929 /* Even spacing from 1/nRate to nRate */
5930 double logNCat = log((double)nRateCategories);
5931 double logMinRate = -logNCat;
5932 double logMaxRate = logNCat;
5933 double logd = (logMaxRate-logMinRate)/(double)(nRateCategories-1);
5935 numeric_t *rates = mymalloc(sizeof(numeric_t)*nRateCategories);
5937 for (i = 0; i < nRateCategories; i++)
5938 rates[i] = exp(logMinRate + logd*(double)i);
5942 double *MLSiteLikelihoodsByRate(/*IN*/NJ_t *NJ, /*IN*/numeric_t *rates, int nRateCategories) {
5943 double *site_loglk = mymalloc(sizeof(double)*NJ->nPos*nRateCategories);
5945 /* save the original rates */
5946 assert(NJ->rates.nRateCategories > 0);
5947 numeric_t *oldRates = NJ->rates.rates;
5948 NJ->rates.rates = mymalloc(sizeof(numeric_t) * NJ->rates.nRateCategories);
5950 /* Compute site likelihood for each rate */
5953 for (iRate = 0; iRate < nRateCategories; iRate++) {
5955 for (i = 0; i < NJ->rates.nRateCategories; i++)
5956 NJ->rates.rates[i] = rates[iRate];
5957 RecomputeMLProfiles(/*IN/OUT*/NJ);
5958 double loglk = TreeLogLk(NJ, /*OUT*/&site_loglk[NJ->nPos*iRate]);
5959 ProgressReport("Site likelihoods with rate category %d of %d", iRate+1, nRateCategories, 0, 0);
5961 fprintf(stderr, "Rate %.3f Loglk %.3f SiteLogLk", rates[iRate], loglk);
5962 for (iPos = 0; iPos < NJ->nPos; iPos++)
5963 fprintf(stderr,"\t%.3f", site_loglk[NJ->nPos*iRate + iPos]);
5964 fprintf(stderr,"\n");
5968 /* restore original rates and profiles */
5969 myfree(NJ->rates.rates, sizeof(numeric_t) * NJ->rates.nRateCategories);
5970 NJ->rates.rates = oldRates;
5971 RecomputeMLProfiles(/*IN/OUT*/NJ);
5976 void SetMLRates(/*IN/OUT*/NJ_t *NJ, int nRateCategories) {
5977 assert(nRateCategories > 0);
5978 AllocRateCategories(/*IN/OUT*/&NJ->rates, 1, NJ->nPos); /* set to 1 category of rate 1 */
5979 if (nRateCategories == 1) {
5980 RecomputeMLProfiles(/*IN/OUT*/NJ);
5983 numeric_t *rates = MLSiteRates(nRateCategories);
5984 double *site_loglk = MLSiteLikelihoodsByRate(/*IN*/NJ, /*IN*/rates, nRateCategories);
5986 /* Select best rate for each site, correcting for the prior
5987 For a prior, use a gamma distribution with shape parameter 3, scale 1/3, so
5988 Prior(rate) ~ rate**2 * exp(-3*rate)
5989 log Prior(rate) = C + 2 * log(rate) - 3 * rate
5991 double sumRates = 0;
5994 for (iPos = 0; iPos < NJ->nPos; iPos++) {
5996 double dBest = -1e20;
5997 for (iRate = 0; iRate < nRateCategories; iRate++) {
5998 double site_loglk_with_prior = site_loglk[NJ->nPos*iRate + iPos]
5999 + 2.0 * log(rates[iRate]) - 3.0 * rates[iRate];
6000 if (site_loglk_with_prior > dBest) {
6002 dBest = site_loglk_with_prior;
6006 fprintf(stderr, "Selected rate category %d rate %.3f for position %d\n",
6007 iBest, rates[iBest], iPos+1);
6008 NJ->rates.ratecat[iPos] = iBest;
6009 sumRates += rates[iBest];
6011 site_loglk = myfree(site_loglk, sizeof(double)*NJ->nPos*nRateCategories);
6013 /* Force the rates to average to 1 */
6014 double avgRate = sumRates/NJ->nPos;
6015 for (iRate = 0; iRate < nRateCategories; iRate++)
6016 rates[iRate] /= avgRate;
6018 /* Save the rates */
6019 NJ->rates.rates = myfree(NJ->rates.rates, sizeof(numeric_t) * NJ->rates.nRateCategories);
6020 NJ->rates.rates = rates;
6021 NJ->rates.nRateCategories = nRateCategories;
6023 /* Update profiles based on rates */
6024 RecomputeMLProfiles(/*IN/OUT*/NJ);
6027 fprintf(stderr, "Switched to using %d rate categories (CAT approximation)\n", nRateCategories);
6028 fprintf(stderr, "Rate categories were divided by %.3f so that average rate = 1.0\n", avgRate);
6029 fprintf(stderr, "CAT-based log-likelihoods may not be comparable across runs\n");
6031 fprintf(stderr, "Use -gamma for approximate but comparable Gamma(20) log-likelihoods\n");
6035 double GammaLogLk(/*IN*/siteratelk_t *s, /*OPTIONAL OUT*/double *gamma_loglk_sites) {
6037 double *dRate = mymalloc(sizeof(double) * s->nRateCats);
6038 for (iRate = 0; iRate < s->nRateCats; iRate++) {
6039 /* The probability density for each rate is approximated by the total
6040 density between the midpoints */
6041 double pMin = iRate == 0 ? 0.0 :
6042 PGamma(s->mult * (s->rates[iRate-1] + s->rates[iRate])/2.0, s->alpha);
6043 double pMax = iRate == s->nRateCats-1 ? 1.0 :
6044 PGamma(s->mult * (s->rates[iRate]+s->rates[iRate+1])/2.0, s->alpha);
6045 dRate[iRate] = pMax-pMin;
6049 for (iPos = 0; iPos < s->nPos; iPos++) {
6050 /* Prevent underflow on large trees by comparing to maximum loglk */
6051 double maxloglk = -1e20;
6052 for (iRate = 0; iRate < s->nRateCats; iRate++) {
6053 double site_loglk = s->site_loglk[s->nPos*iRate + iPos];
6054 if (site_loglk > maxloglk)
6055 maxloglk = site_loglk;
6057 double rellk = 0; /* likelihood scaled by exp(maxloglk) */
6058 for (iRate = 0; iRate < s->nRateCats; iRate++) {
6059 double lk = exp(s->site_loglk[s->nPos*iRate + iPos] - maxloglk);
6060 rellk += lk * dRate[iRate];
6062 double loglk_site = maxloglk + log(rellk);
6063 loglk += loglk_site;
6064 if (gamma_loglk_sites != NULL)
6065 gamma_loglk_sites[iPos] = loglk_site;
6067 dRate = myfree(dRate, sizeof(double)*s->nRateCats);
6071 double OptAlpha(double alpha, void *data) {
6072 siteratelk_t *s = (siteratelk_t *)data;
6074 return(-GammaLogLk(s, NULL));
6077 double OptMult(double mult, void *data) {
6078 siteratelk_t *s = (siteratelk_t *)data;
6080 return(-GammaLogLk(s, NULL));
6083 /* Input site_loglk must be for each rate */
6084 double RescaleGammaLogLk(int nPos, int nRateCats, /*IN*/numeric_t *rates, /*IN*/double *site_loglk,
6085 /*OPTIONAL*/FILE *fpLog) {
6086 siteratelk_t s = { /*mult*/1.0, /*alpha*/1.0, nPos, nRateCats, rates, site_loglk };
6089 fx = -GammaLogLk(&s, NULL);
6091 fprintf(stderr, "Optimizing alpha, starting at loglk %.3f\n", -fx);
6092 for (i = 0; i < 10; i++) {
6093 ProgressReport("Optimizing alpha round %d", i+1, 0, 0, 0);
6095 s.alpha = onedimenmin(0.01, s.alpha, 10.0, OptAlpha, &s, 0.001, 0.001, &fx, &f2x);
6097 fprintf(stderr, "Optimize alpha round %d to %.3f lk %.3f\n", i+1, s.alpha, -fx);
6098 s.mult = onedimenmin(0.01, s.mult, 10.0, OptMult, &s, 0.001, 0.001, &fx, &f2x);
6100 fprintf(stderr, "Optimize mult round %d to %.3f lk %.3f\n", i+1, s.mult, -fx);
6101 if (fx > start - 0.001) {
6103 fprintf(stderr, "Optimizing alpha & mult converged\n");
6108 double *gamma_loglk_sites = mymalloc(sizeof(double) * nPos);
6109 double gammaLogLk = GammaLogLk(&s, /*OUT*/gamma_loglk_sites);
6111 fprintf(stderr, "Gamma(%d) LogLk = %.3f alpha = %.3f rescaling lengths by %.3f\n",
6112 nRateCats, gammaLogLk, s.alpha, 1/s.mult);
6116 fprintf(fpLog, "Gamma%dLogLk\t%.3f\tApproximate\tAlpha\t%.3f\tRescale\t%.3f\n",
6117 nRateCats, gammaLogLk, s.alpha, 1/s.mult);
6118 fprintf(fpLog, "Gamma%d\tSite\tLogLk", nRateCats);
6119 for (iRate = 0; iRate < nRateCats; iRate++)
6120 fprintf(fpLog, "\tr=%.3f", rates[iRate]/s.mult);
6121 fprintf(fpLog,"\n");
6122 for (iPos = 0; iPos < nPos; iPos++) {
6123 fprintf(fpLog, "Gamma%d\t%d\t%.3f", nRateCats, iPos, gamma_loglk_sites[iPos]);
6124 for (iRate = 0; iRate < nRateCats; iRate++)
6125 fprintf(fpLog, "\t%.3f", site_loglk[nPos*iRate + iPos]);
6126 fprintf(fpLog,"\n");
6129 gamma_loglk_sites = myfree(gamma_loglk_sites, sizeof(double) * nPos);
6133 double MLPairOptimize(profile_t *pA, profile_t *pB,
6134 int nPos, /*OPTIONAL*/transition_matrix_t *transmat, rates_t *rates,
6135 /*IN/OUT*/double *branch_length) {
6138 for (j=0;j<5;j++) len5[j] = *branch_length;
6139 quartet_opt_t qopt = { nPos, transmat, rates,
6140 /*nEval*/0, /*pair1*/pA, /*pair2*/pB };
6141 double f2x,negloglk;
6142 *branch_length = onedimenmin(/*xmin*/MLMinBranchLength,
6143 /*xguess*/*branch_length,
6147 /*ftol*/MLFTolBranchLength,
6148 /*atol*/MLMinBranchLengthTolerance,
6151 return(-negloglk); /* the log likelihood */
6154 void OptimizeAllBranchLengths(/*IN/OUT*/NJ_t *NJ) {
6157 if (NJ->nSeq == 2) {
6158 int parent = NJ->root;
6159 assert(NJ->child[parent].nChild==2);
6160 int nodes[2] = { NJ->child[parent].child[0], NJ->child[parent].child[1] };
6161 double length = 1.0;
6162 (void)MLPairOptimize(NJ->profiles[nodes[0]], NJ->profiles[nodes[1]],
6163 NJ->nPos, NJ->transmat, &NJ->rates, /*IN/OUT*/&length);
6164 NJ->branchlength[nodes[0]] = length/2.0;
6165 NJ->branchlength[nodes[1]] = length/2.0;
6169 traversal_t traversal = InitTraversal(NJ);
6170 profile_t **upProfiles = UpProfiles(NJ);
6171 int node = NJ->root;
6173 while((node = TraversePostorder(node, NJ, /*IN/OUT*/traversal, /*pUp*/NULL)) >= 0) {
6174 int nChild = NJ->child[node].nChild;
6176 if ((iDone % 100) == 0)
6177 ProgressReport("ML Lengths %d of %d splits", iDone+1, NJ->maxnode - NJ->nSeq, 0, 0);
6180 /* optimize the branch lengths between self, parent, and children,
6183 assert(nChild == 2 || nChild == 3);
6184 int nodes[3] = { NJ->child[node].child[0],
6185 NJ->child[node].child[1],
6186 nChild == 3 ? NJ->child[node].child[2] : node };
6187 profile_t *profiles[3] = { NJ->profiles[nodes[0]],
6188 NJ->profiles[nodes[1]],
6189 nChild == 3 ? NJ->profiles[nodes[2]]
6190 : GetUpProfile(/*IN/OUT*/upProfiles, NJ, node, /*useML*/true) };
6192 for (iter = 0; iter < 2; iter++) {
6194 for (i = 0; i < 3; i++) {
6195 profile_t *pA = profiles[i];
6198 profile_t *pB = PosteriorProfile(profiles[b1], profiles[b2],
6199 NJ->branchlength[nodes[b1]],
6200 NJ->branchlength[nodes[b2]],
6201 NJ->transmat, &NJ->rates, NJ->nPos, /*nConstraints*/0);
6202 double len = NJ->branchlength[nodes[i]];
6203 if (len < MLMinBranchLength)
6204 len = MLMinBranchLength;
6205 (void)MLPairOptimize(pA, pB, NJ->nPos, NJ->transmat, &NJ->rates, /*IN/OUT*/&len);
6206 NJ->branchlength[nodes[i]] = len;
6207 pB = FreeProfile(pB, NJ->nPos, /*nConstraints*/0);
6209 fprintf(stderr, "Optimize length for %d to %.3f\n",
6210 nodes[i], NJ->branchlength[nodes[i]]);
6213 if (node != NJ->root) {
6214 RecomputeProfile(/*IN/OUT*/NJ, /*IN/OUT*/upProfiles, node, /*useML*/true);
6215 DeleteUpProfile(upProfiles, NJ, node);
6219 traversal = FreeTraversal(traversal,NJ);
6220 upProfiles = FreeUpProfiles(upProfiles,NJ);
6223 void RecomputeMLProfiles(/*IN/OUT*/NJ_t *NJ) {
6224 traversal_t traversal = InitTraversal(NJ);
6225 int node = NJ->root;
6226 while((node = TraversePostorder(node, NJ, /*IN/OUT*/traversal, /*pUp*/NULL)) >= 0) {
6227 if (NJ->child[node].nChild == 2) {
6228 NJ->profiles[node] = FreeProfile(NJ->profiles[node], NJ->nPos, NJ->nConstraints);
6229 int *children = NJ->child[node].child;
6230 NJ->profiles[node] = PosteriorProfile(NJ->profiles[children[0]], NJ->profiles[children[1]],
6231 NJ->branchlength[children[0]], NJ->branchlength[children[1]],
6232 NJ->transmat, &NJ->rates, NJ->nPos, NJ->nConstraints);
6235 traversal = FreeTraversal(traversal, NJ);
6238 void RecomputeProfiles(/*IN/OUT*/NJ_t *NJ, /*OPTIONAL*/distance_matrix_t *dmat) {
6239 traversal_t traversal = InitTraversal(NJ);
6240 int node = NJ->root;
6241 while((node = TraversePostorder(node, NJ, /*IN/OUT*/traversal, /*pUp*/NULL)) >= 0) {
6242 if (NJ->child[node].nChild == 2) {
6243 int *child = NJ->child[node].child;
6244 NJ->profiles[node] = FreeProfile(NJ->profiles[node], NJ->nPos, NJ->nConstraints);
6245 NJ->profiles[node] = AverageProfile(NJ->profiles[child[0]], NJ->profiles[child[1]],
6246 NJ->nPos, NJ->nConstraints,
6247 dmat, /*unweighted*/-1.0);
6250 traversal = FreeTraversal(traversal,NJ);
6253 int NNI(/*IN/OUT*/NJ_t *NJ, int iRound, int nRounds, bool useML,
6254 /*IN/OUT*/nni_stats_t *stats,
6255 /*OUT*/double *dMaxDelta) {
6256 /* For each non-root node N, with children A,B, sibling C, and uncle D,
6257 we compare the current topology AB|CD to the alternate topologies
6258 AC|BD and AD|BC, by using the 4 relevant profiles.
6260 If useML is true, it uses quartet maximum likelihood, and it
6261 updates branch lengths as it goes.
6263 If useML is false, it uses the minimum-evolution criterion with
6264 log-corrected distances on profiles. (If logdist is false, then
6265 the log correction is not done.) If useML is false, then NNI()
6266 does NOT modify the branch lengths.
6268 Regardless of whether it changes the topology, it recomputes the
6269 profile for the node, using the pairwise distances and BIONJ-like
6270 weightings (if bionj is set). The parent's profile has changed,
6271 but recomputing it is not necessary because we will visit it
6272 before we need it (we use postorder, so we may visit the sibling
6273 and its children before we visit the parent, but we never
6274 consider an ancestor's profile, so that is OK). When we change
6275 the parent's profile, this alters the uncle's up-profile, so we
6276 remove that. Finally, if the topology has changed, we remove the
6277 up-profiles of the nodes.
6279 If we do an NNI during post-order traversal, the result is a bit
6280 tricky. E.g. if we are at node N, and have visited its children A
6281 and B but not its uncle C, and we do an NNI that swaps B & C,
6282 then the post-order traversal will visit C, and its children, but
6283 then on the way back up, it will skip N, as it has already
6284 visited it. So, the profile of N will not be recomputed: any
6285 changes beneath C will not be reflected in the profile of N, and
6286 the profile of N will be slightly stale. This will be corrected
6287 on the next round of NNIs.
6289 double supportThreshold = useML ? treeLogLkDelta : MEMinDelta;
6292 int nNNIThisRound = 0;
6295 return(0); /* nothing to do */
6297 fprintf(stderr, "Beginning round %d of NNIs with ml? %d\n", iRound, useML?1:0);
6298 PrintNJInternal(/*WRITE*/stderr, NJ, /*useLen*/useML && iRound > 0 ? 1 : 0);
6300 /* For each node the upProfile or NULL */
6301 profile_t **upProfiles = UpProfiles(NJ);
6303 traversal_t traversal = InitTraversal(NJ);
6305 /* Identify nodes we can skip traversing into */
6308 for (node = 0; node < NJ->maxnode; node++) {
6309 if (node != NJ->root
6311 && stats[node].age >= 2
6312 && stats[node].subtreeAge >= 2
6313 && stats[node].support > supportThreshold) {
6315 SetupABCD(NJ, node, NULL, NULL, /*OUT*/nodeABCD, useML);
6316 for (i = 0; i < 4; i++)
6317 if (stats[nodeABCD[i]].age == 0 && stats[nodeABCD[i]].support > supportThreshold)
6320 SkipTraversalInto(node, /*IN/OUT*/traversal);
6322 fprintf(stderr, "Skipping subtree at %d: child %d %d parent %d age %d subtreeAge %d support %.3f\n",
6323 node, nodeABCD[0], nodeABCD[1], NJ->parent[node],
6324 stats[node].age, stats[node].subtreeAge, stats[node].support);
6333 while((node = TraversePostorder(node, NJ, /*IN/OUT*/traversal, &bUp)) >= 0) {
6334 if (node < NJ->nSeq || node == NJ->root)
6335 continue; /* nothing to do for leaves or root */
6338 fprintf(stderr, "Going up back to node %d\n", node);
6339 /* No longer needed */
6340 for (i = 0; i < NJ->child[node].nChild; i++)
6341 DeleteUpProfile(upProfiles, NJ, NJ->child[node].child[i]);
6342 DeleteUpProfile(upProfiles, NJ, node);
6343 RecomputeProfile(/*IN/OUT*/NJ, /*IN/OUT*/upProfiles, node, useML);
6346 if ((iDone % 100) == 0) {
6348 sprintf(buf, "%s NNI round %%d of %%d, %%d of %%d splits", useML ? "ML" : "ME");
6350 sprintf(buf+strlen(buf), ", %d changes", nNNIThisRound);
6351 if (nNNIThisRound > 0)
6352 sprintf(buf+strlen(buf), " (max delta %.3f)", *dMaxDelta);
6353 ProgressReport(buf, iRound+1, nRounds, iDone+1, NJ->maxnode - NJ->nSeq);
6357 profile_t *profiles[4];
6359 /* Note -- during the first round of ML NNIs, we use the min-evo-based branch lengths,
6360 which may be suboptimal */
6361 SetupABCD(NJ, node, /*OUT*/profiles, /*IN/OUT*/upProfiles, /*OUT*/nodeABCD, useML);
6363 /* Given our 4 profiles, consider doing a swap */
6364 int nodeA = nodeABCD[0];
6365 int nodeB = nodeABCD[1];
6366 int nodeC = nodeABCD[2];
6367 int nodeD = nodeABCD[3];
6369 nni_t choice = ABvsCD;
6372 fprintf(stderr,"Considering NNI around %d: Swap A=%d B=%d C=%d D=up(%d) or parent %d\n",
6373 node, nodeA, nodeB, nodeC, nodeD, NJ->parent[node]);
6374 if (verbose > 3 && useML) {
6375 double len[5] = { NJ->branchlength[nodeA], NJ->branchlength[nodeB], NJ->branchlength[nodeC], NJ->branchlength[nodeD],
6376 NJ->branchlength[node] };
6377 for (i=0; i < 5; i++)
6378 if (len[i] < MLMinBranchLength)
6379 len[i] = MLMinBranchLength;
6380 fprintf(stderr, "Starting quartet likelihood %.3f len %.3f %.3f %.3f %.3f %.3f\n",
6381 MLQuartetLogLk(profiles[0],profiles[1],profiles[2],profiles[3],NJ->nPos,NJ->transmat,&NJ->rates,len, /*site_lk*/NULL),
6382 len[0], len[1], len[2], len[3], len[4]);
6385 numeric_t newlength[5];
6388 for (i = 0; i < 4; i++)
6389 newlength[i] = NJ->branchlength[nodeABCD[i]];
6390 newlength[4] = NJ->branchlength[node];
6391 bool bFast = mlAccuracy < 2 && stats[node].age > 0;
6392 choice = MLQuartetNNI(profiles, NJ->transmat, &NJ->rates, NJ->nPos, NJ->nConstraints,
6393 /*OUT*/criteria, /*IN/OUT*/newlength, bFast);
6395 choice = ChooseNNI(profiles, NJ->distance_matrix, NJ->nPos, NJ->nConstraints,
6397 /* invert criteria so that higher is better, as in ML case, to simplify code below */
6398 for (i = 0; i < 3; i++)
6399 criteria[i] = -criteria[i];
6402 if (choice == ACvsBD) {
6404 ReplaceChild(/*IN/OUT*/NJ, node, nodeB, nodeC);
6405 ReplaceChild(/*IN/OUT*/NJ, NJ->parent[node], nodeC, nodeB);
6406 } else if (choice == ADvsBC) {
6408 ReplaceChild(/*IN/OUT*/NJ, node, nodeA, nodeC);
6409 ReplaceChild(/*IN/OUT*/NJ, NJ->parent[node], nodeC, nodeA);
6413 /* update branch length for the internal branch, and of any
6414 branches that lead to leaves, b/c those will not are not
6415 the internal branch for NNI and would not otherwise be set.
6417 if (choice == ADvsBC) {
6418 /* For ADvsBC, MLQuartetNNI swaps B with D, but we swap A with C */
6419 double length2[5] = { newlength[LEN_C], newlength[LEN_D],
6420 newlength[LEN_A], newlength[LEN_B],
6423 for (i = 0; i < 5; i++) newlength[i] = length2[i];
6424 /* and swap A and C */
6425 double tmp = newlength[LEN_A];
6426 newlength[LEN_A] = newlength[LEN_C];
6427 newlength[LEN_C] = tmp;
6428 } else if (choice == ACvsBD) {
6430 double tmp = newlength[LEN_B];
6431 newlength[LEN_B] = newlength[LEN_C];
6432 newlength[LEN_C] = tmp;
6435 NJ->branchlength[node] = newlength[LEN_I];
6436 NJ->branchlength[nodeA] = newlength[LEN_A];
6437 NJ->branchlength[nodeB] = newlength[LEN_B];
6438 NJ->branchlength[nodeC] = newlength[LEN_C];
6439 NJ->branchlength[nodeD] = newlength[LEN_D];
6442 if (verbose>2 && (choice != ABvsCD || verbose > 2))
6443 fprintf(stderr,"NNI around %d: Swap A=%d B=%d C=%d D=out(C) -- choose %s %s %.4f\n",
6444 node, nodeA, nodeB, nodeC,
6445 choice == ACvsBD ? "AC|BD" : (choice == ABvsCD ? "AB|CD" : "AD|BC"),
6446 useML ? "delta-loglk" : "-deltaLen",
6447 criteria[choice] - criteria[ABvsCD]);
6448 if(verbose >= 3 && slow && useML)
6449 fprintf(stderr, "Old tree lk -- %.4f\n", TreeLogLk(NJ, /*site_likelihoods*/NULL));
6451 /* update stats, *dMaxDelta, etc. */
6452 if (choice == ABvsCD) {
6460 stats[node].age = 0;
6461 stats[nodeA].age = 0;
6462 stats[nodeB].age = 0;
6463 stats[nodeC].age = 0;
6464 stats[nodeD].age = 0;
6466 stats[node].delta = criteria[choice] - criteria[ABvsCD]; /* 0 if ABvsCD */
6467 if (stats[node].delta > *dMaxDelta)
6468 *dMaxDelta = stats[node].delta;
6470 /* support is improvement of score for self over better of alternatives */
6471 stats[node].support = 1e20;
6472 for (i = 0; i < 3; i++)
6473 if (choice != i && criteria[choice]-criteria[i] < stats[node].support)
6474 stats[node].support = criteria[choice]-criteria[i];
6476 /* subtreeAge is the number of rounds since self or descendent had a significant improvement */
6477 if (stats[node].delta > supportThreshold)
6478 stats[node].subtreeAge = 0;
6480 stats[node].subtreeAge++;
6481 for (i = 0; i < 2; i++) {
6482 int child = NJ->child[node].child[i];
6483 if (stats[node].subtreeAge > stats[child].subtreeAge)
6484 stats[node].subtreeAge = stats[child].subtreeAge;
6488 /* update profiles and free up unneeded up-profiles */
6489 if (choice == ABvsCD) {
6490 /* No longer needed */
6491 DeleteUpProfile(upProfiles, NJ, nodeA);
6492 DeleteUpProfile(upProfiles, NJ, nodeB);
6493 DeleteUpProfile(upProfiles, NJ, nodeC);
6494 RecomputeProfile(/*IN/OUT*/NJ, /*IN/OUT*/upProfiles, node, useML);
6496 UpdateForNNI(NJ, node, upProfiles, useML);
6498 UpdateForNNI(NJ, node, upProfiles, useML);
6500 if(verbose > 2 && slow && useML) {
6501 /* Note we recomputed profiles back up to root already if slow */
6502 PrintNJInternal(/*WRITE*/stderr, NJ, /*useLen*/true);
6503 fprintf(stderr, "New tree lk -- %.4f\n", TreeLogLk(NJ, /*site_likelihoods*/NULL));
6505 } /* end postorder traversal */
6506 traversal = FreeTraversal(traversal,NJ);
6509 for (i = 0; i < NJ->maxnodes; i++)
6510 if (upProfiles[i] != NULL)
6512 fprintf(stderr, "N up profiles at end of NNI: %d\n", nUp);
6514 upProfiles = FreeUpProfiles(upProfiles,NJ);
6515 return(nNNIThisRound);
6518 nni_stats_t *InitNNIStats(NJ_t *NJ) {
6519 nni_stats_t *stats = mymalloc(sizeof(nni_stats_t)*NJ->maxnode);
6520 const int LargeAge = 1000000;
6522 for (i = 0; i < NJ->maxnode; i++) {
6524 stats[i].support = 0;
6525 if (i == NJ->root || i < NJ->nSeq) {
6526 stats[i].age = LargeAge;
6527 stats[i].subtreeAge = LargeAge;
6530 stats[i].subtreeAge = 0;
6536 nni_stats_t *FreeNNIStats(nni_stats_t *stats, NJ_t *NJ) {
6537 return(myfree(stats, sizeof(nni_stats_t)*NJ->maxnode));
6540 int FindSPRSteps(/*IN/OUT*/NJ_t *NJ,
6541 int nodeMove, /* the node to move multiple times */
6542 int nodeAround, /* sibling or parent of node to NNI to start the chain */
6543 /*IN/OUT*/profile_t **upProfiles,
6544 /*OUT*/spr_step_t *steps,
6548 for (iStep = 0; iStep < maxSteps; iStep++) {
6549 if (NJ->child[nodeAround].nChild != 2)
6550 break; /* no further to go */
6552 /* Consider the NNIs around nodeAround */
6553 profile_t *profiles[4];
6555 SetupABCD(NJ, nodeAround, /*OUT*/profiles, /*IN/OUT*/upProfiles, /*OUT*/nodeABCD, /*useML*/false);
6557 (void) ChooseNNI(profiles, NJ->distance_matrix, NJ->nPos, NJ->nConstraints,
6560 /* Do & save the swap */
6561 spr_step_t *step = &steps[iStep];
6562 if (iStep == 0 ? bFirstAC : criteria[ACvsBD] < criteria[ADvsBC]) {
6563 /* swap B & C to put AC together */
6564 step->deltaLength = criteria[ACvsBD] - criteria[ABvsCD];
6565 step->nodes[0] = nodeABCD[1];
6566 step->nodes[1] = nodeABCD[2];
6568 /* swap AC to put AD together */
6569 step->deltaLength = criteria[ADvsBC] - criteria[ABvsCD];
6570 step->nodes[0] = nodeABCD[0];
6571 step->nodes[1] = nodeABCD[2];
6575 fprintf(stderr, "SPR chain step %d for %d around %d swap %d %d deltaLen %.5f\n",
6576 iStep+1, nodeAround, nodeMove, step->nodes[0], step->nodes[1], step->deltaLength);
6578 PrintNJInternal(stderr, NJ, /*useLen*/false);
6580 ReplaceChild(/*IN/OUT*/NJ, nodeAround, step->nodes[0], step->nodes[1]);
6581 ReplaceChild(/*IN/OUT*/NJ, NJ->parent[nodeAround], step->nodes[1], step->nodes[0]);
6582 UpdateForNNI(/*IN/OUT*/NJ, nodeAround, /*IN/OUT*/upProfiles, /*useML*/false);
6584 /* set the new nodeAround -- either parent(nodeMove) or sibling(nodeMove) --
6585 so that it different from current nodeAround
6587 int newAround[2] = { NJ->parent[nodeMove], Sibling(NJ, nodeMove) };
6588 if (NJ->parent[nodeMove] == NJ->root)
6589 RootSiblings(NJ, nodeMove, /*OUT*/newAround);
6590 assert(newAround[0] == nodeAround || newAround[1] == nodeAround);
6591 assert(newAround[0] != newAround[1]);
6592 nodeAround = newAround[newAround[0] == nodeAround ? 1 : 0];
6597 void UnwindSPRStep(/*IN/OUT*/NJ_t *NJ,
6598 /*IN*/spr_step_t *step,
6599 /*IN/OUT*/profile_t **upProfiles) {
6602 for (i = 0; i < 2; i++) {
6603 assert(step->nodes[i] >= 0 && step->nodes[i] < NJ->maxnodes);
6604 parents[i] = NJ->parent[step->nodes[i]];
6605 assert(parents[i] >= 0);
6607 assert(parents[0] != parents[1]);
6608 ReplaceChild(/*IN/OUT*/NJ, parents[0], step->nodes[0], step->nodes[1]);
6609 ReplaceChild(/*IN/OUT*/NJ, parents[1], step->nodes[1], step->nodes[0]);
6611 if (NJ->parent[parents[0]] == parents[1]) {
6614 assert(NJ->parent[parents[1]] == parents[0]);
6617 UpdateForNNI(/*IN/OUT*/NJ, parents[iYounger], /*IN/OUT*/upProfiles, /*useML*/false);
6620 /* Update the profile of node and its ancestor, and delete nearby out-profiles */
6621 void UpdateForNNI(/*IN/OUT*/NJ_t *NJ, int node, /*IN/OUT*/profile_t **upProfiles,
6625 /* exhaustive update */
6626 for (i = 0; i < NJ->maxnodes; i++)
6627 DeleteUpProfile(upProfiles, NJ, i);
6629 /* update profiles back to root */
6631 for (ancestor = node; ancestor >= 0; ancestor = NJ->parent[ancestor])
6632 RecomputeProfile(/*IN/OUT*/NJ, upProfiles, ancestor, useML);
6634 /* remove any up-profiles made while doing that*/
6635 for (i = 0; i < NJ->maxnodes; i++)
6636 DeleteUpProfile(upProfiles, NJ, i);
6638 /* if fast, only update around self
6639 note that upProfile(parent) is still OK after an NNI, but
6640 up-profiles of uncles may not be
6642 DeleteUpProfile(upProfiles, NJ, node);
6643 for (i = 0; i < NJ->child[node].nChild; i++)
6644 DeleteUpProfile(upProfiles, NJ, NJ->child[node].child[i]);
6645 assert(node != NJ->root);
6646 int parent = NJ->parent[node];
6647 int neighbors[2] = { parent, Sibling(NJ, node) };
6648 if (parent == NJ->root)
6649 RootSiblings(NJ, node, /*OUT*/neighbors);
6650 DeleteUpProfile(upProfiles, NJ, neighbors[0]);
6651 DeleteUpProfile(upProfiles, NJ, neighbors[1]);
6652 int uncle = Sibling(NJ, parent);
6654 DeleteUpProfile(upProfiles, NJ, uncle);
6655 RecomputeProfile(/*IN/OUT*/NJ, upProfiles, node, useML);
6656 RecomputeProfile(/*IN/OUT*/NJ, upProfiles, parent, useML);
6660 void SPR(/*IN/OUT*/NJ_t *NJ, int maxSPRLength, int iRound, int nRounds) {
6661 /* Given a non-root node N with children A,B, sibling C, and uncle D,
6662 we can try to move A by doing three types of moves (4 choices):
6663 "down" -- swap A with a child of B (if B is not a leaf) [2 choices]
6664 "over" -- swap B with C
6665 "up" -- swap A with D
6666 We follow down moves with down moves, over moves with down moves, and
6667 up moves with either up or over moves. (Other choices are just backing
6668 up and hence useless.)
6670 As with NNIs, we keep track of up-profiles as we go. However, some of the regular
6671 profiles may also become "stale" so it is a bit trickier.
6673 We store the traversal before we do SPRs to avoid any possible infinite loop
6675 double last_tot_len = 0.0;
6676 if (NJ->nSeq <= 3 || maxSPRLength < 1)
6679 last_tot_len = TreeLength(NJ, /*recomputeLengths*/true);
6680 int *nodeList = mymalloc(sizeof(int) * NJ->maxnodes);
6681 int nodeListLen = 0;
6682 traversal_t traversal = InitTraversal(NJ);
6683 int node = NJ->root;
6684 while((node = TraversePostorder(node, NJ, /*IN/OUT*/traversal, /*pUp*/NULL)) >= 0) {
6685 nodeList[nodeListLen++] = node;
6687 assert(nodeListLen == NJ->maxnode);
6688 traversal = FreeTraversal(traversal,NJ);
6690 profile_t **upProfiles = UpProfiles(NJ);
6691 spr_step_t *steps = mymalloc(sizeof(spr_step_t) * maxSPRLength); /* current chain of SPRs */
6694 for (i = 0; i < nodeListLen; i++) {
6697 ProgressReport("SPR round %3d of %3d, %d of %d nodes",
6698 iRound+1, nRounds, i+1, nodeListLen);
6699 if (node == NJ->root)
6700 continue; /* nothing to do for root */
6701 /* The nodes to NNI around */
6702 int nodeAround[2] = { NJ->parent[node], Sibling(NJ, node) };
6703 if (NJ->parent[node] == NJ->root) {
6704 /* NNI around both siblings instead */
6705 RootSiblings(NJ, node, /*OUT*/nodeAround);
6707 bool bChanged = false;
6709 for (iAround = 0; iAround < 2 && bChanged == false; iAround++) {
6711 for (ACFirst = 0; ACFirst < 2 && bChanged == false; ACFirst++) {
6713 PrintNJInternal(stderr, NJ, /*useLen*/false);
6714 int chainLength = FindSPRSteps(/*IN/OUT*/NJ, node, nodeAround[iAround],
6715 upProfiles, /*OUT*/steps, maxSPRLength, (bool)ACFirst);
6716 double dMinDelta = 0.0;
6718 double dTotDelta = 0.0;
6720 for (iC = 0; iC < chainLength; iC++) {
6721 dTotDelta += steps[iC].deltaLength;
6722 if (dTotDelta < dMinDelta) {
6723 dMinDelta = dTotDelta;
6729 fprintf(stderr, "SPR %s %d around %d chainLength %d of %d deltaLength %.5f swaps:",
6730 iCBest >= 0 ? "move" : "abandoned",
6731 node,nodeAround[iAround],iCBest+1,chainLength,dMinDelta);
6732 for (iC = 0; iC < chainLength; iC++)
6733 fprintf(stderr, " (%d,%d)%.4f", steps[iC].nodes[0], steps[iC].nodes[1], steps[iC].deltaLength);
6734 fprintf(stderr,"\n");
6736 for (iC = chainLength - 1; iC > iCBest; iC--)
6737 UnwindSPRStep(/*IN/OUT*/NJ, /*IN*/&steps[iC], /*IN/OUT*/upProfiles);
6739 PrintNJInternal(stderr, NJ, /*useLen*/false);
6740 while (slow && iCBest >= 0) {
6741 double expected_tot_len = last_tot_len + dMinDelta;
6742 double new_tot_len = TreeLength(NJ, /*recompute*/true);
6744 fprintf(stderr, "Total branch-length is now %.4f was %.4f expected %.4f\n",
6745 new_tot_len, last_tot_len, expected_tot_len);
6746 if (new_tot_len < last_tot_len) {
6747 last_tot_len = new_tot_len;
6748 break; /* no rewinding necessary */
6751 fprintf(stderr, "Rewinding SPR to %d\n",iCBest);
6752 UnwindSPRStep(/*IN/OUT*/NJ, /*IN*/&steps[iCBest], /*IN/OUT*/upProfiles);
6753 dMinDelta -= steps[iCBest].deltaLength;
6758 } /* loop over which step to take at 1st NNI */
6759 } /* loop over which node to pivot around */
6762 nSPR++; /* the SPR move is OK */
6763 /* make sure all the profiles are OK */
6765 for (j = 0; j < NJ->maxnodes; j++)
6766 DeleteUpProfile(upProfiles, NJ, j);
6768 for (ancestor = NJ->parent[node]; ancestor >= 0; ancestor = NJ->parent[ancestor])
6769 RecomputeProfile(/*IN/OUT*/NJ, upProfiles, ancestor, /*useML*/false);
6771 } /* end loop over subtrees to prune & regraft */
6772 steps = myfree(steps, sizeof(spr_step_t) * maxSPRLength);
6773 upProfiles = FreeUpProfiles(upProfiles,NJ);
6774 nodeList = myfree(nodeList, sizeof(int) * NJ->maxnodes);
6777 void RecomputeProfile(/*IN/OUT*/NJ_t *NJ, /*IN/OUT*/profile_t **upProfiles, int node,
6779 if (node < NJ->nSeq || node == NJ->root)
6780 return; /* no profile to compute */
6781 assert(NJ->child[node].nChild==2);
6783 profile_t *profiles[4];
6784 double weight = 0.5;
6785 if (useML || !bionj) {
6786 profiles[0] = NJ->profiles[NJ->child[node].child[0]];
6787 profiles[1] = NJ->profiles[NJ->child[node].child[1]];
6790 SetupABCD(NJ, node, /*OUT*/profiles, /*IN/OUT*/upProfiles, /*OUT*/nodeABCD, useML);
6791 weight = QuartetWeight(profiles, NJ->distance_matrix, NJ->nPos);
6795 fprintf(stderr, "Recompute %d from %d %d lengths %.4f %.4f\n",
6797 NJ->child[node].child[0],
6798 NJ->child[node].child[1],
6799 NJ->branchlength[NJ->child[node].child[0]],
6800 NJ->branchlength[NJ->child[node].child[1]]);
6802 fprintf(stderr, "Recompute %d from %d %d weight %.3f\n",
6803 node, NJ->child[node].child[0], NJ->child[node].child[1], weight);
6806 NJ->profiles[node] = FreeProfile(NJ->profiles[node], NJ->nPos, NJ->nConstraints);
6808 NJ->profiles[node] = PosteriorProfile(profiles[0], profiles[1],
6809 NJ->branchlength[NJ->child[node].child[0]],
6810 NJ->branchlength[NJ->child[node].child[1]],
6811 NJ->transmat, &NJ->rates, NJ->nPos, NJ->nConstraints);
6813 NJ->profiles[node] = AverageProfile(profiles[0], profiles[1],
6814 NJ->nPos, NJ->nConstraints,
6815 NJ->distance_matrix, weight);
6819 /* The BIONJ-like formula for the weight of A when building a profile for AB is
6820 1/2 + (avgD(B,CD) - avgD(A,CD))/(2*d(A,B))
6822 double QuartetWeight(profile_t *profiles[4], distance_matrix_t *dmat, int nPos) {
6824 return(-1.0); /* even weighting */
6826 CorrectedPairDistances(profiles, 4, dmat, nPos, /*OUT*/d);
6829 double weight = 0.5 + ((d[qBC]+d[qBD])-(d[qAC]+d[qAD]))/(4*d[qAB]);
6837 /* Resets the children entry of parent and also the parent entry of newchild */
6838 void ReplaceChild(/*IN/OUT*/NJ_t *NJ, int parent, int oldchild, int newchild) {
6839 NJ->parent[newchild] = parent;
6842 for (iChild = 0; iChild < NJ->child[parent].nChild; iChild++) {
6843 if (NJ->child[parent].child[iChild] == oldchild) {
6844 NJ->child[parent].child[iChild] = newchild;
6851 /* Recomputes all branch lengths
6853 For internal branches such as (A,B) vs. (C,D), uses the formula
6855 length(AB|CD) = (d(A,C)+d(A,D)+d(B,C)+d(B,D))/4 - d(A,B)/2 - d(C,D)/2
6857 (where all distances are profile distances - diameters).
6859 For external branches (e.g. to leaves) A vs. (B,C), use the formula
6861 length(A|BC) = (d(A,B)+d(A,C)-d(B,C))/2
6863 void UpdateBranchLengths(/*IN/OUT*/NJ_t *NJ) {
6866 else if (NJ->nSeq == 2) {
6867 int root = NJ->root;
6868 int nodeA = NJ->child[root].child[0];
6869 int nodeB = NJ->child[root].child[1];
6871 ProfileDist(NJ->profiles[nodeA],NJ->profiles[nodeB],
6872 NJ->nPos, NJ->distance_matrix, /*OUT*/&h);
6874 h.dist = LogCorrect(h.dist);
6875 NJ->branchlength[nodeA] = h.dist/2.0;
6876 NJ->branchlength[nodeB] = h.dist/2.0;
6880 profile_t **upProfiles = UpProfiles(NJ);
6881 traversal_t traversal = InitTraversal(NJ);
6882 int node = NJ->root;
6884 while((node = TraversePostorder(node, NJ, /*IN/OUT*/traversal, /*pUp*/NULL)) >= 0) {
6885 /* reset branch length of node (distance to its parent) */
6886 if (node == NJ->root)
6887 continue; /* no branch length to set */
6888 if (node < NJ->nSeq) { /* a leaf */
6889 profile_t *profileA = NJ->profiles[node];
6890 profile_t *profileB = NULL;
6891 profile_t *profileC = NULL;
6893 int sib = Sibling(NJ,node);
6894 if (sib == -1) { /* at root, have 2 siblings */
6896 RootSiblings(NJ, node, /*OUT*/sibs);
6897 profileB = NJ->profiles[sibs[0]];
6898 profileC = NJ->profiles[sibs[1]];
6900 profileB = NJ->profiles[sib];
6901 profileC = GetUpProfile(/*IN/OUT*/upProfiles, NJ, NJ->parent[node], /*useML*/false);
6903 profile_t *profiles[3] = {profileA,profileB,profileC};
6904 double d[3]; /*AB,AC,BC*/
6905 CorrectedPairDistances(profiles, 3, NJ->distance_matrix, NJ->nPos, /*OUT*/d);
6906 /* d(A,BC) = (dAB+dAC-dBC)/2 */
6907 NJ->branchlength[node] = (d[0]+d[1]-d[2])/2.0;
6909 profile_t *profiles[4];
6911 SetupABCD(NJ, node, /*OUT*/profiles, /*IN/OUT*/upProfiles, /*OUT*/nodeABCD, /*useML*/false);
6913 CorrectedPairDistances(profiles, 4, NJ->distance_matrix, NJ->nPos, /*OUT*/d);
6914 NJ->branchlength[node] = (d[qAC]+d[qAD]+d[qBC]+d[qBD])/4.0 - (d[qAB]+d[qCD])/2.0;
6916 /* no longer needed */
6917 DeleteUpProfile(upProfiles, NJ, nodeABCD[0]);
6918 DeleteUpProfile(upProfiles, NJ, nodeABCD[1]);
6921 traversal = FreeTraversal(traversal,NJ);
6922 upProfiles = FreeUpProfiles(upProfiles,NJ);
6925 /* Pick columns for resampling, stored as returned_vector[iBoot*nPos + j] */
6926 int *ResampleColumns(int nPos, int nBootstrap) {
6927 long lPos = nPos; /* to prevent overflow on very long alignments when multiplying nPos * nBootstrap */
6928 int *col = (int*)mymalloc(sizeof(int)*lPos*(size_t)nBootstrap);
6930 for (i = 0; i < nBootstrap; i++) {
6932 for (j = 0; j < nPos; j++) {
6933 int pos = (int)(knuth_rand() * nPos);
6936 else if (pos == nPos)
6938 col[i*lPos + j] = pos;
6942 for (i=0; i < 3 && i < nBootstrap; i++) {
6943 fprintf(stderr,"Boot%d",i);
6945 for (j = 0; j < nPos; j++) {
6946 fprintf(stderr,"\t%d",col[i*lPos+j]);
6948 fprintf(stderr,"\n");
6954 void ReliabilityNJ(/*IN/OUT*/NJ_t *NJ, int nBootstrap) {
6955 /* For each non-root node N, with children A,B, parent P, sibling C, and grandparent G,
6956 we test the reliability of the split (A,B) versus rest by comparing the profiles
6957 of A, B, C, and the "up-profile" of P.
6959 Each node's upProfile is the average of its sibling's (down)-profile + its parent's up-profile
6960 (If node's parent is the root, then there are two siblings and we don't need an up-profile)
6962 To save memory, we do depth-first-search down from the root, and we only keep
6963 up-profiles for nodes in the active path.
6965 if (NJ->nSeq <= 3 || nBootstrap <= 0)
6966 return; /* nothing to do */
6967 int *col = ResampleColumns(NJ->nPos, nBootstrap);
6969 profile_t **upProfiles = UpProfiles(NJ);
6970 traversal_t traversal = InitTraversal(NJ);
6971 int node = NJ->root;
6973 while((node = TraversePostorder(node, NJ, /*IN/OUT*/traversal, /*pUp*/NULL)) >= 0) {
6974 if (node < NJ->nSeq || node == NJ->root)
6975 continue; /* nothing to do for leaves or root */
6977 if(iNodesDone > 0 && (iNodesDone % 100) == 0)
6978 ProgressReport("Local bootstrap for %6d of %6d internal splits", iNodesDone, NJ->nSeq-3, 0, 0);
6981 profile_t *profiles[4];
6983 SetupABCD(NJ, node, /*OUT*/profiles, /*IN/OUT*/upProfiles, /*OUT*/nodeABCD, /*useML*/false);
6985 NJ->support[node] = SplitSupport(profiles[0], profiles[1], profiles[2], profiles[3],
6986 NJ->distance_matrix,
6991 /* no longer needed */
6992 DeleteUpProfile(upProfiles, NJ, nodeABCD[0]);
6993 DeleteUpProfile(upProfiles, NJ, nodeABCD[1]);
6994 DeleteUpProfile(upProfiles, NJ, nodeABCD[2]);
6996 traversal = FreeTraversal(traversal,NJ);
6997 upProfiles = FreeUpProfiles(upProfiles,NJ);
6998 col = myfree(col, sizeof(int)*((size_t)NJ->nPos)*nBootstrap);
7001 profile_t *NewProfile(int nPos, int nConstraints) {
7002 profile_t *profile = (profile_t *)mymalloc(sizeof(profile_t));
7003 profile->weights = mymalloc(sizeof(numeric_t)*nPos);
7004 profile->codes = mymalloc(sizeof(unsigned char)*nPos);
7005 profile->vectors = NULL;
7006 profile->nVectors = 0;
7007 profile->codeDist = NULL;
7008 if (nConstraints == 0) {
7009 profile->nOn = NULL;
7010 profile->nOff = NULL;
7012 profile->nOn = mymalloc(sizeof(int)*nConstraints);
7013 profile->nOff = mymalloc(sizeof(int)*nConstraints);
7018 profile_t *FreeProfile(profile_t *profile, int nPos, int nConstraints) {
7019 if(profile==NULL) return(NULL);
7020 myfree(profile->codes, nPos);
7021 myfree(profile->weights, nPos);
7022 myfree(profile->vectors, sizeof(numeric_t)*nCodes*profile->nVectors);
7023 myfree(profile->codeDist, sizeof(numeric_t)*nCodes*nPos);
7024 if (nConstraints > 0) {
7025 myfree(profile->nOn, sizeof(int)*nConstraints);
7026 myfree(profile->nOff, sizeof(int)*nConstraints);
7028 return(myfree(profile, sizeof(profile_t)));
7031 void SetupABCD(NJ_t *NJ, int node,
7032 /* the 4 profiles; the last one is an outprofile */
7033 /*OPTIONAL OUT*/profile_t *profiles[4],
7034 /*OPTIONAL IN/OUT*/profile_t **upProfiles,
7035 /*OUT*/int nodeABCD[4],
7037 int parent = NJ->parent[node];
7038 assert(parent >= 0);
7039 assert(NJ->child[node].nChild == 2);
7040 nodeABCD[0] = NJ->child[node].child[0]; /*A*/
7041 nodeABCD[1] = NJ->child[node].child[1]; /*B*/
7043 profile_t *profile4 = NULL;
7044 if (parent == NJ->root) {
7046 RootSiblings(NJ, node, /*OUT*/sibs);
7047 nodeABCD[2] = sibs[0];
7048 nodeABCD[3] = sibs[1];
7049 if (profiles == NULL)
7051 profile4 = NJ->profiles[sibs[1]];
7053 nodeABCD[2] = Sibling(NJ,node);
7054 assert(nodeABCD[2] >= 0);
7055 nodeABCD[3] = parent;
7056 if (profiles == NULL)
7058 profile4 = GetUpProfile(upProfiles,NJ,parent,useML);
7060 assert(upProfiles != NULL);
7062 for (i = 0; i < 3; i++)
7063 profiles[i] = NJ->profiles[nodeABCD[i]];
7064 profiles[3] = profile4;
7068 int Sibling(NJ_t *NJ, int node) {
7069 int parent = NJ->parent[node];
7070 if (parent < 0 || parent == NJ->root)
7073 for(iChild=0;iChild<NJ->child[parent].nChild;iChild++) {
7074 if(NJ->child[parent].child[iChild] != node)
7075 return (NJ->child[parent].child[iChild]);
7081 void RootSiblings(NJ_t *NJ, int node, /*OUT*/int sibs[2]) {
7082 assert(NJ->parent[node] == NJ->root);
7083 assert(NJ->child[NJ->root].nChild == 3);
7087 for(iChild=0; iChild < NJ->child[NJ->root].nChild; iChild++) {
7088 int child = NJ->child[NJ->root].child[iChild];
7089 if (child != node) sibs[nSibs++] = child;
7094 void TestSplitsML(/*IN/OUT*/NJ_t *NJ, /*OUT*/SplitCount_t *splitcount, int nBootstrap) {
7095 const double tolerance = 1e-6;
7096 splitcount->nBadSplits = 0;
7097 splitcount->nConstraintViolations = 0;
7098 splitcount->nBadBoth = 0;
7099 splitcount->nSplits = 0;
7100 splitcount->dWorstDeltaUnconstrained = 0;
7101 splitcount->dWorstDeltaConstrained = 0;
7103 profile_t **upProfiles = UpProfiles(NJ);
7104 traversal_t traversal = InitTraversal(NJ);
7105 int node = NJ->root;
7107 int *col = nBootstrap > 0 ? ResampleColumns(NJ->nPos, nBootstrap) : NULL;
7108 double *site_likelihoods[3];
7110 for (choice = 0; choice < 3; choice++)
7111 site_likelihoods[choice] = mymalloc(sizeof(double)*NJ->nPos);
7114 while((node = TraversePostorder(node, NJ, /*IN/OUT*/traversal, /*pUp*/NULL)) >= 0) {
7115 if (node < NJ->nSeq || node == NJ->root)
7116 continue; /* nothing to do for leaves or root */
7118 if(iNodesDone > 0 && (iNodesDone % 100) == 0)
7119 ProgressReport("ML split tests for %6d of %6d internal splits", iNodesDone, NJ->nSeq-3, 0, 0);
7122 profile_t *profiles[4];
7124 SetupABCD(NJ, node, /*OUT*/profiles, /*IN/OUT*/upProfiles, /*OUT*/nodeABCD, /*useML*/true);
7128 for (i = 0; i < 4; i++)
7129 len[i] = NJ->branchlength[nodeABCD[i]];
7130 len[4] = NJ->branchlength[node];
7131 double lenABvsCD[5] = {len[LEN_A], len[LEN_B], len[LEN_C], len[LEN_D], len[LEN_I]};
7132 double lenACvsBD[5] = {len[LEN_A], len[LEN_C], len[LEN_B], len[LEN_D], len[LEN_I]}; /* Swap B & C */
7133 double lenADvsBC[5] = {len[LEN_A], len[LEN_D], len[LEN_C], len[LEN_B], len[LEN_I]}; /* Swap B & D */
7137 #pragma omp parallel
7138 #pragma omp sections
7145 /* Lengths are already optimized for ABvsCD */
7146 loglk[ABvsCD] = MLQuartetLogLk(profiles[0], profiles[1], profiles[2], profiles[3],
7147 NJ->nPos, NJ->transmat, &NJ->rates, /*IN/OUT*/lenABvsCD,
7148 /*OUT*/site_likelihoods[ABvsCD]);
7155 loglk[ACvsBD] = MLQuartetOptimize(profiles[0], profiles[2], profiles[1], profiles[3],
7156 NJ->nPos, NJ->transmat, &NJ->rates, /*IN/OUT*/lenACvsBD, /*pStarTest*/NULL,
7157 /*OUT*/site_likelihoods[ACvsBD]);
7164 loglk[ADvsBC] = MLQuartetOptimize(profiles[0], profiles[3], profiles[2], profiles[1],
7165 NJ->nPos, NJ->transmat, &NJ->rates, /*IN/OUT*/lenADvsBC, /*pStarTest*/NULL,
7166 /*OUT*/site_likelihoods[ADvsBC]);
7171 /* do a second pass on the better alternative if it is close */
7172 if (loglk[ACvsBD] > loglk[ADvsBC]) {
7173 if (mlAccuracy > 1 || loglk[ACvsBD] > loglk[ABvsCD] - closeLogLkLimit) {
7174 loglk[ACvsBD] = MLQuartetOptimize(profiles[0], profiles[2], profiles[1], profiles[3],
7175 NJ->nPos, NJ->transmat, &NJ->rates, /*IN/OUT*/lenACvsBD, /*pStarTest*/NULL,
7176 /*OUT*/site_likelihoods[ACvsBD]);
7179 if (mlAccuracy > 1 || loglk[ADvsBC] > loglk[ABvsCD] - closeLogLkLimit) {
7180 loglk[ADvsBC] = MLQuartetOptimize(profiles[0], profiles[3], profiles[2], profiles[1],
7181 NJ->nPos, NJ->transmat, &NJ->rates, /*IN/OUT*/lenADvsBC, /*pStarTest*/NULL,
7182 /*OUT*/site_likelihoods[ADvsBC]);
7186 if (loglk[ABvsCD] >= loglk[ACvsBD] && loglk[ABvsCD] >= loglk[ADvsBC])
7188 else if (loglk[ACvsBD] >= loglk[ABvsCD] && loglk[ACvsBD] >= loglk[ADvsBC])
7192 bool badSplit = loglk[choice] > loglk[ABvsCD] + treeLogLkDelta; /* ignore small changes in likelihood */
7194 /* constraint penalties, indexed by nni_t (lower is better) */
7196 QuartetConstraintPenalties(profiles, NJ->nConstraints, /*OUT*/p);
7197 bool bBadConstr = p[ABvsCD] > p[ACvsBD] + tolerance || p[ABvsCD] > p[ADvsBC] + tolerance;
7198 bool violateConstraint = false;
7200 for (iC=0; iC < NJ->nConstraints; iC++) {
7201 if (SplitViolatesConstraint(profiles, iC)) {
7202 violateConstraint = true;
7206 splitcount->nSplits++;
7207 if (violateConstraint)
7208 splitcount->nConstraintViolations++;
7210 splitcount->nBadSplits++;
7211 if (badSplit && bBadConstr)
7212 splitcount->nBadBoth++;
7214 double delta = loglk[choice] - loglk[ABvsCD];
7215 /* If ABvsCD is favored over the more likely NNI by constraints,
7216 then this is probably a bad split because of the constraint */
7217 if (p[choice] > p[ABvsCD] + tolerance)
7218 splitcount->dWorstDeltaConstrained = MAX(delta, splitcount->dWorstDeltaConstrained);
7220 splitcount->dWorstDeltaUnconstrained = MAX(delta, splitcount->dWorstDeltaUnconstrained);
7223 NJ->support[node] = badSplit ? 0.0 : SHSupport(NJ->nPos, nBootstrap, col, loglk, site_likelihoods);
7225 /* No longer needed */
7226 DeleteUpProfile(upProfiles, NJ, nodeABCD[0]);
7227 DeleteUpProfile(upProfiles, NJ, nodeABCD[1]);
7228 DeleteUpProfile(upProfiles, NJ, nodeABCD[2]);
7230 traversal = FreeTraversal(traversal,NJ);
7231 upProfiles = FreeUpProfiles(upProfiles,NJ);
7233 col = myfree(col, sizeof(int)*((size_t)NJ->nPos)*nBootstrap);
7234 for (choice = 0; choice < 3; choice++)
7235 site_likelihoods[choice] = myfree(site_likelihoods[choice], sizeof(double)*NJ->nPos);
7239 void TestSplitsMinEvo(NJ_t *NJ, /*OUT*/SplitCount_t *splitcount) {
7240 const double tolerance = 1e-6;
7241 splitcount->nBadSplits = 0;
7242 splitcount->nConstraintViolations = 0;
7243 splitcount->nBadBoth = 0;
7244 splitcount->nSplits = 0;
7245 splitcount->dWorstDeltaUnconstrained = 0.0;
7246 splitcount->dWorstDeltaConstrained = 0.0;
7248 profile_t **upProfiles = UpProfiles(NJ);
7249 traversal_t traversal = InitTraversal(NJ);
7250 int node = NJ->root;
7252 while((node = TraversePostorder(node, NJ, /*IN/OUT*/traversal, /*pUp*/NULL)) >= 0) {
7253 if (node < NJ->nSeq || node == NJ->root)
7254 continue; /* nothing to do for leaves or root */
7256 profile_t *profiles[4];
7258 SetupABCD(NJ, node, /*OUT*/profiles, /*IN/OUT*/upProfiles, /*OUT*/nodeABCD, /*useML*/false);
7261 fprintf(stderr,"Testing Split around %d: A=%d B=%d C=%d D=up(%d) or node parent %d\n",
7262 node, nodeABCD[0], nodeABCD[1], nodeABCD[2], nodeABCD[3], NJ->parent[node]);
7264 double d[6]; /* distances, perhaps log-corrected distances, no constraint penalties */
7265 CorrectedPairDistances(profiles, 4, NJ->distance_matrix, NJ->nPos, /*OUT*/d);
7267 /* alignment-based scores for each split (lower is better) */
7268 double sABvsCD = d[qAB] + d[qCD];
7269 double sACvsBD = d[qAC] + d[qBD];
7270 double sADvsBC = d[qAD] + d[qBC];
7272 /* constraint penalties, indexed by nni_t (lower is better) */
7274 QuartetConstraintPenalties(profiles, NJ->nConstraints, /*OUT*/p);
7276 int nConstraintsViolated = 0;
7278 for (iC=0; iC < NJ->nConstraints; iC++) {
7279 if (SplitViolatesConstraint(profiles, iC)) {
7280 nConstraintsViolated++;
7282 double penalty[3] = {0.0,0.0,0.0};
7283 (void)QuartetConstraintPenaltiesPiece(profiles, iC, /*OUT*/penalty);
7284 fprintf(stderr, "Violate constraint %d at %d (children %d %d) penalties %.3f %.3f %.3f %d/%d %d/%d %d/%d %d/%d\n",
7285 iC, node, NJ->child[node].child[0], NJ->child[node].child[1],
7286 penalty[ABvsCD], penalty[ACvsBD], penalty[ADvsBC],
7287 profiles[0]->nOn[iC], profiles[0]->nOff[iC],
7288 profiles[1]->nOn[iC], profiles[1]->nOff[iC],
7289 profiles[2]->nOn[iC], profiles[2]->nOff[iC],
7290 profiles[3]->nOn[iC], profiles[3]->nOff[iC]);
7295 double delta = sABvsCD - MIN(sACvsBD,sADvsBC);
7296 bool bBadDist = delta > tolerance;
7297 bool bBadConstr = p[ABvsCD] > p[ACvsBD] + tolerance || p[ABvsCD] > p[ADvsBC] + tolerance;
7299 splitcount->nSplits++;
7301 nni_t choice = sACvsBD < sADvsBC ? ACvsBD : ADvsBC;
7302 /* If ABvsCD is favored over the shorter NNI by constraints,
7303 then this is probably a bad split because of the constraint */
7304 if (p[choice] > p[ABvsCD] + tolerance)
7305 splitcount->dWorstDeltaConstrained = MAX(delta, splitcount->dWorstDeltaConstrained);
7307 splitcount->dWorstDeltaUnconstrained = MAX(delta, splitcount->dWorstDeltaUnconstrained);
7310 if (nConstraintsViolated > 0)
7311 splitcount->nConstraintViolations++; /* count splits with any violations, not #constraints in a splits */
7313 splitcount->nBadSplits++;
7314 if (bBadDist && bBadConstr)
7315 splitcount->nBadBoth++;
7316 if (bBadConstr && verbose > 2) {
7317 /* Which NNI would be better */
7318 double dist_advantage = 0;
7319 double constraint_penalty = 0;
7320 if (p[ACvsBD] < p[ADvsBC]) {
7321 dist_advantage = sACvsBD - sABvsCD;
7322 constraint_penalty = p[ABvsCD] - p[ACvsBD];
7324 dist_advantage = sADvsBC - sABvsCD;
7325 constraint_penalty = p[ABvsCD] - p[ADvsBC];
7327 fprintf(stderr, "Violate constraints %d distance_advantage %.3f constraint_penalty %.3f (children %d %d):",
7328 node, dist_advantage, constraint_penalty,
7329 NJ->child[node].child[0], NJ->child[node].child[1]);
7330 /* list the constraints with a penalty, meaning that ABCD all have non-zero
7331 values and that AB|CD worse than others */
7332 for (iC = 0; iC < NJ->nConstraints; iC++) {
7334 if (QuartetConstraintPenaltiesPiece(profiles, iC, /*OUT*/ppart)) {
7335 if (ppart[qAB] + ppart[qCD] > ppart[qAD] + ppart[qBC] + tolerance
7336 || ppart[qAB] + ppart[qCD] > ppart[qAC] + ppart[qBD] + tolerance)
7337 fprintf(stderr, " %d (%d/%d %d/%d %d/%d %d/%d)", iC,
7338 profiles[0]->nOn[iC], profiles[0]->nOff[iC],
7339 profiles[1]->nOn[iC], profiles[1]->nOff[iC],
7340 profiles[2]->nOn[iC], profiles[2]->nOff[iC],
7341 profiles[3]->nOn[iC], profiles[3]->nOff[iC]);
7344 fprintf(stderr, "\n");
7347 /* no longer needed */
7348 DeleteUpProfile(upProfiles, NJ, nodeABCD[0]);
7349 DeleteUpProfile(upProfiles, NJ, nodeABCD[1]);
7351 traversal = FreeTraversal(traversal,NJ);
7352 upProfiles = FreeUpProfiles(upProfiles,NJ);
7355 /* Computes support for (A,B),(C,D) compared to that for (A,C),(B,D) and (A,D),(B,C) */
7356 double SplitSupport(profile_t *pA, profile_t *pB, profile_t *pC, profile_t *pD,
7357 /*OPTIONAL*/distance_matrix_t *dmat,
7362 long lPos = nPos; /* to avoid overflow when multiplying */
7364 /* Note distpieces are weighted */
7365 double *distpieces[6];
7367 for (j = 0; j < 6; j++) {
7368 distpieces[j] = (double*)mymalloc(sizeof(double)*nPos);
7369 weights[j] = (double*)mymalloc(sizeof(double)*nPos);
7376 for (i = 0; i < nPos; i++) {
7377 numeric_t *fA = GET_FREQ(pA, i, /*IN/OUT*/iFreqA);
7378 numeric_t *fB = GET_FREQ(pB, i, /*IN/OUT*/iFreqB);
7379 numeric_t *fC = GET_FREQ(pC, i, /*IN/OUT*/iFreqC);
7380 numeric_t *fD = GET_FREQ(pD, i, /*IN/OUT*/iFreqD);
7382 weights[qAB][i] = pA->weights[i] * pB->weights[i];
7383 weights[qAC][i] = pA->weights[i] * pC->weights[i];
7384 weights[qAD][i] = pA->weights[i] * pD->weights[i];
7385 weights[qBC][i] = pB->weights[i] * pC->weights[i];
7386 weights[qBD][i] = pB->weights[i] * pD->weights[i];
7387 weights[qCD][i] = pC->weights[i] * pD->weights[i];
7389 distpieces[qAB][i] = weights[qAB][i] * ProfileDistPiece(pA->codes[i], pB->codes[i], fA, fB, dmat, NULL);
7390 distpieces[qAC][i] = weights[qAC][i] * ProfileDistPiece(pA->codes[i], pC->codes[i], fA, fC, dmat, NULL);
7391 distpieces[qAD][i] = weights[qAD][i] * ProfileDistPiece(pA->codes[i], pD->codes[i], fA, fD, dmat, NULL);
7392 distpieces[qBC][i] = weights[qBC][i] * ProfileDistPiece(pB->codes[i], pC->codes[i], fB, fC, dmat, NULL);
7393 distpieces[qBD][i] = weights[qBD][i] * ProfileDistPiece(pB->codes[i], pD->codes[i], fB, fD, dmat, NULL);
7394 distpieces[qCD][i] = weights[qCD][i] * ProfileDistPiece(pC->codes[i], pD->codes[i], fC, fD, dmat, NULL);
7396 assert(iFreqA == pA->nVectors);
7397 assert(iFreqB == pB->nVectors);
7398 assert(iFreqC == pC->nVectors);
7399 assert(iFreqD == pD->nVectors);
7401 double totpieces[6];
7402 double totweights[6];
7404 for (j = 0; j < 6; j++) {
7406 totweights[j] = 0.0;
7407 for (i = 0; i < nPos; i++) {
7408 totpieces[j] += distpieces[j][i];
7409 totweights[j] += weights[j][i];
7411 dists[j] = totweights[j] > 0.01 ? totpieces[j]/totweights[j] : 3.0;
7413 dists[j] = LogCorrect(dists[j]);
7416 /* Support1 = Support(AB|CD over AC|BD) = d(A,C)+d(B,D)-d(A,B)-d(C,D)
7417 Support2 = Support(AB|CD over AD|BC) = d(A,D)+d(B,C)-d(A,B)-d(C,D)
7419 double support1 = dists[qAC] + dists[qBD] - dists[qAB] - dists[qCD];
7420 double support2 = dists[qAD] + dists[qBC] - dists[qAB] - dists[qCD];
7422 if (support1 < 0 || support2 < 0) {
7423 nSuboptimalSplits++; /* Another split seems superior */
7426 assert(nBootstrap > 0);
7430 for (iBoot=0;iBoot<nBootstrap;iBoot++) {
7431 int *colw = &col[lPos*iBoot];
7433 for (j = 0; j < 6; j++) {
7436 double *d = distpieces[j];
7437 double *w = weights[j];
7438 for (i=0; i<nPos; i++) {
7443 dists[j] = totw > 0.01 ? totp/totw : 3.0;
7445 dists[j] = LogCorrect(dists[j]);
7447 support1 = dists[qAC] + dists[qBD] - dists[qAB] - dists[qCD];
7448 support2 = dists[qAD] + dists[qBC] - dists[qAB] - dists[qCD];
7449 if (support1 > 0 && support2 > 0)
7451 } /* end loop over bootstrap replicates */
7453 for (j = 0; j < 6; j++) {
7454 distpieces[j] = myfree(distpieces[j], sizeof(double)*nPos);
7455 weights[j] = myfree(weights[j], sizeof(double)*nPos);
7457 return( nSupport/(double)nBootstrap );
7460 double SHSupport(int nPos, int nBootstrap, int *col, double loglk[3], double *site_likelihoods[3]) {
7461 long lPos = nPos; /* to avoid overflow when multiplying */
7462 assert(nBootstrap>0);
7463 double delta1 = loglk[0]-loglk[1];
7464 double delta2 = loglk[0]-loglk[2];
7465 double delta = delta1 < delta2 ? delta1 : delta2;
7467 double *siteloglk[3];
7469 for (i = 0; i < 3; i++) {
7470 siteloglk[i] = mymalloc(sizeof(double)*nPos);
7471 for (j = 0; j < nPos; j++)
7472 siteloglk[i][j] = log(site_likelihoods[i][j]);
7477 for (iBoot = 0; iBoot < nBootstrap; iBoot++) {
7478 double resampled[3];
7479 for (i = 0; i < 3; i++)
7480 resampled[i] = -loglk[i];
7481 for (j = 0; j < nPos; j++) {
7482 int pos = col[iBoot*lPos+j];
7483 for (i = 0; i < 3; i++)
7484 resampled[i] += siteloglk[i][pos];
7487 for (i = 1; i < 3; i++)
7488 if (resampled[i] > resampled[iBest])
7490 double resample1 = resampled[iBest] - resampled[(iBest+1)%3];
7491 double resample2 = resampled[iBest] - resampled[(iBest+2)%3];
7492 double resampleDelta = resample1 < resample2 ? resample1 : resample2;
7493 if (resampleDelta < delta)
7497 siteloglk[i] = myfree(siteloglk[i], sizeof(double)*nPos);
7498 return(nSupport/(double)nBootstrap);
7502 void SetDistCriterion(/*IN/OUT*/NJ_t *NJ, int nActive, /*IN/OUT*/besthit_t *hit) {
7503 if (hit->i < NJ->nSeq && hit->j < NJ->nSeq) {
7504 SeqDist(NJ->profiles[hit->i]->codes,
7505 NJ->profiles[hit->j]->codes,
7506 NJ->nPos, NJ->distance_matrix, /*OUT*/hit);
7508 ProfileDist(NJ->profiles[hit->i],
7509 NJ->profiles[hit->j],
7510 NJ->nPos, NJ->distance_matrix, /*OUT*/hit);
7511 hit->dist -= (NJ->diameter[hit->i] + NJ->diameter[hit->j]);
7513 hit->dist += constraintWeight
7514 * (double)JoinConstraintPenalty(NJ, hit->i, hit->j);
7515 SetCriterion(NJ,nActive,/*IN/OUT*/hit);
7518 void SetCriterion(/*IN/UPDATE*/NJ_t *NJ, int nActive, /*IN/OUT*/besthit_t *join) {
7521 || NJ->parent[join->i] >= 0
7522 || NJ->parent[join->j] >= 0)
7524 assert(NJ->nOutDistActive[join->i] >= nActive);
7525 assert(NJ->nOutDistActive[join->j] >= nActive);
7527 int nDiffAllow = tophitsMult > 0 ? (int)(nActive*staleOutLimit) : 0;
7528 if (NJ->nOutDistActive[join->i] - nActive > nDiffAllow)
7529 SetOutDistance(NJ, join->i, nActive);
7530 if (NJ->nOutDistActive[join->j] - nActive > nDiffAllow)
7531 SetOutDistance(NJ, join->j, nActive);
7532 double outI = NJ->outDistances[join->i];
7533 if (NJ->nOutDistActive[join->i] != nActive)
7534 outI *= (nActive-1)/(double)(NJ->nOutDistActive[join->i]-1);
7535 double outJ = NJ->outDistances[join->j];
7536 if (NJ->nOutDistActive[join->j] != nActive)
7537 outJ *= (nActive-1)/(double)(NJ->nOutDistActive[join->j]-1);
7538 join->criterion = join->dist - (outI+outJ)/(double)(nActive-2);
7539 if (verbose > 2 && nActive <= 5) {
7540 fprintf(stderr, "Set Criterion to join %d %d with nActive=%d dist+penalty %.3f criterion %.3f\n",
7541 join->i, join->j, nActive, join->dist, join->criterion);
7545 void SetOutDistance(NJ_t *NJ, int iNode, int nActive) {
7546 if (NJ->nOutDistActive[iNode] == nActive)
7549 /* May be called by InitNJ before we have parents */
7550 assert(iNode>=0 && (NJ->parent == NULL || NJ->parent[iNode]<0));
7552 ProfileDist(NJ->profiles[iNode], NJ->outprofile, NJ->nPos, NJ->distance_matrix, &dist);
7555 /* out(A) = sum(X!=A) d(A,X)
7556 = sum(X!=A) (profiledist(A,X) - diam(A) - diam(X))
7557 = sum(X!=A) profiledist(A,X) - (N-1)*diam(A) - (totdiam - diam(A))
7559 in the absence of gaps:
7560 profiledist(A,out) = mean profiledist(A, all active nodes)
7561 sum(X!=A) profiledist(A,X) = N * profiledist(A,out) - profiledist(A,A)
7563 With gaps, we need to take the weights of the comparisons into account, where
7564 w(Ai) is the weight of position i in profile A:
7565 w(A,B) = sum_i w(Ai) * w(Bi)
7566 d(A,B) = sum_i w(Ai) * w(Bi) * d(Ai,Bi) / w(A,B)
7568 sum(X!=A) profiledist(A,X) ~= (N-1) * profiledist(A, Out w/o A)
7569 profiledist(A, Out w/o A) = sum_X!=A sum_i d(Ai,Xi) * w(Ai) * w(Bi) / ( sum_X!=A sum_i w(Ai) * w(Bi) )
7570 d(A, Out) = sum_A sum_i d(Ai,Xi) * w(Ai) * w(Bi) / ( sum_X sum_i w(Ai) * w(Bi) )
7573 profiledist(A,out w/o A) = (top of d(A,Out) - top of d(A,A)) / (weight of d(A,Out) - weight of d(A,A))
7575 with another correction of nActive because the weight of the out-profile is the average
7576 weight not the total weight.
7578 double top = (nActive-1)
7579 * (dist.dist * dist.weight * nActive - NJ->selfweight[iNode] * NJ->selfdist[iNode]);
7580 double bottom = (dist.weight * nActive - NJ->selfweight[iNode]);
7581 double pdistOutWithoutA = top/bottom;
7582 NJ->outDistances[iNode] = bottom > 0.01 ?
7583 pdistOutWithoutA - NJ->diameter[iNode] * (nActive-1) - (NJ->totdiam - NJ->diameter[iNode])
7585 NJ->nOutDistActive[iNode] = nActive;
7587 if(verbose>3 && iNode < 5)
7588 fprintf(stderr,"NewOutDist for %d %f from dist %f selfd %f diam %f totdiam %f newActive %d\n",
7589 iNode, NJ->outDistances[iNode], dist.dist, NJ->selfdist[iNode], NJ->diameter[iNode],
7590 NJ->totdiam, nActive);
7591 if (verbose>6 && (iNode % 10) == 0) {
7592 /* Compute the actual out-distance and compare */
7594 double total_pd = 0.0;
7596 for (j=0;j<NJ->maxnode;j++) {
7597 if (j!=iNode && (NJ->parent==NULL || NJ->parent[j]<0)) {
7599 ProfileDist(NJ->profiles[iNode], NJ->profiles[j], NJ->nPos, NJ->distance_matrix, /*OUT*/&bh);
7600 total_pd += bh.dist;
7601 total += bh.dist - (NJ->diameter[iNode] + NJ->diameter[j]);
7604 fprintf(stderr,"OutDist for Node %d %f truth %f profiled %f truth %f pd_err %f\n",
7605 iNode, NJ->outDistances[iNode], total, pdistOutWithoutA, total_pd,fabs(pdistOutWithoutA-total_pd));
7609 top_hits_t *FreeTopHits(top_hits_t *tophits) {
7610 if (tophits == NULL)
7613 for (iNode = 0; iNode < tophits->maxnodes; iNode++) {
7614 top_hits_list_t *l = &tophits->top_hits_lists[iNode];
7615 if (l->hits != NULL)
7616 l->hits = myfree(l->hits, sizeof(hit_t) * l->nHits);
7618 tophits->top_hits_lists = myfree(tophits->top_hits_lists, sizeof(top_hits_list_t) * tophits->maxnodes);
7619 tophits->visible = myfree(tophits->visible, sizeof(hit_t*) * tophits->maxnodes);
7620 tophits->topvisible = myfree(tophits->topvisible, sizeof(int) * tophits->nTopVisible);
7622 for (iNode = 0; iNode < tophits->maxnodes; iNode++)
7623 omp_destroy_lock(&tophits->locks[iNode]);
7624 tophits->locks = myfree(tophits->locks, sizeof(omp_lock_t) * tophits->maxnodes);
7626 return(myfree(tophits, sizeof(top_hits_t)));
7629 top_hits_t *InitTopHits(NJ_t *NJ, int m) {
7632 top_hits_t *tophits = mymalloc(sizeof(top_hits_t));
7634 tophits->q = (int)(0.5 + tophits2Mult * sqrt(tophits->m));
7635 if (!useTopHits2nd || tophits->q >= tophits->m)
7637 tophits->maxnodes = NJ->maxnodes;
7638 tophits->top_hits_lists = mymalloc(sizeof(top_hits_list_t) * tophits->maxnodes);
7639 tophits->visible = mymalloc(sizeof(hit_t) * tophits->maxnodes);
7640 tophits->nTopVisible = (int)(0.5 + topvisibleMult*m);
7641 tophits->topvisible = mymalloc(sizeof(int) * tophits->nTopVisible);
7643 tophits->locks = mymalloc(sizeof(omp_lock_t) * tophits->maxnodes);
7644 for (iNode = 0; iNode < tophits->maxnodes; iNode++)
7645 omp_init_lock(&tophits->locks[iNode]);
7648 for (i = 0; i < tophits->nTopVisible; i++)
7649 tophits->topvisible[i] = -1; /* empty */
7650 tophits->topvisibleAge = 0;
7652 for (iNode = 0; iNode < tophits->maxnodes; iNode++) {
7653 top_hits_list_t *l = &tophits->top_hits_lists[iNode];
7658 hit_t *v = &tophits->visible[iNode];
7665 /* Helper function for sorting in SetAllLeafTopHits,
7666 and the global variables it needs
7668 NJ_t *CompareSeedNJ = NULL;
7669 int *CompareSeedGaps = NULL;
7670 int CompareSeeds(const void *c1, const void *c2) {
7671 int seed1 = *(int *)c1;
7672 int seed2 = *(int *)c2;
7673 int gapdiff = CompareSeedGaps[seed1] - CompareSeedGaps[seed2];
7674 if (gapdiff != 0) return(gapdiff); /* fewer gaps is better */
7675 double outdiff = CompareSeedNJ->outDistances[seed1] - CompareSeedNJ->outDistances[seed2];
7676 if(outdiff < 0) return(-1); /* closer to more nodes is better */
7677 if(outdiff > 0) return(1);
7681 /* Using the seed heuristic and the close global variable */
7682 void SetAllLeafTopHits(/*IN/UPDATE*/NJ_t *NJ, /*IN/OUT*/top_hits_t *tophits) {
7683 double close = tophitsClose;
7685 if (fastest && NJ->nSeq >= 50000) {
7688 double logN = log((double)NJ->nSeq)/log(2.0);
7689 close = logN/(logN+2.0);
7692 /* Sort the potential seeds, by a combination of nGaps and NJ->outDistances
7693 We don't store nGaps so we need to compute that
7695 int *nGaps = (int*)mymalloc(sizeof(int)*NJ->nSeq);
7697 for(iNode=0; iNode<NJ->nSeq; iNode++) {
7698 nGaps[iNode] = (int)(0.5 + NJ->nPos - NJ->selfweight[iNode]);
7700 int *seeds = (int*)mymalloc(sizeof(int)*NJ->nSeq);
7701 for (iNode=0; iNode<NJ->nSeq; iNode++) seeds[iNode] = iNode;
7703 CompareSeedGaps = nGaps;
7704 qsort(/*IN/OUT*/seeds, NJ->nSeq, sizeof(int), CompareSeeds);
7705 CompareSeedNJ = NULL;
7706 CompareSeedGaps = NULL;
7708 /* For each seed, save its top 2*m hits and then look for close neighbors */
7709 assert(2 * tophits->m <= NJ->nSeq);
7711 int nHasTopHits = 0;
7713 #pragma omp parallel for schedule(dynamic, 50)
7715 for(iSeed=0; iSeed < NJ->nSeq; iSeed++) {
7716 int seed = seeds[iSeed];
7717 if (iSeed > 0 && (iSeed % 100) == 0) {
7719 #pragma omp critical
7721 ProgressReport("Top hits for %6d of %6d seqs (at seed %6d)",
7722 nHasTopHits, NJ->nSeq,
7725 if (tophits->top_hits_lists[seed].nHits > 0) {
7726 if(verbose>2) fprintf(stderr, "Skipping seed %d\n", seed);
7730 besthit_t *besthitsSeed = (besthit_t*)mymalloc(sizeof(besthit_t)*NJ->nSeq);
7731 besthit_t *besthitsNeighbor = (besthit_t*)mymalloc(sizeof(besthit_t) * 2 * tophits->m);
7734 if(verbose>2) fprintf(stderr,"Trying seed %d\n", seed);
7735 SetBestHit(seed, NJ, /*nActive*/NJ->nSeq, /*OUT*/&bestjoin, /*OUT*/besthitsSeed);
7737 /* sort & save top hits of self. besthitsSeed is now sorted. */
7738 SortSaveBestHits(seed, /*IN/SORT*/besthitsSeed, /*IN-SIZE*/NJ->nSeq,
7739 /*OUT-SIZE*/tophits->m, /*IN/OUT*/tophits);
7742 /* find "close" neighbors and compute their top hits */
7743 double neardist = besthitsSeed[2 * tophits->m - 1].dist * close;
7744 /* must have at least average weight, rem higher is better
7745 and allow a bit more than average, e.g. if we are looking for within 30% away,
7746 20% more gaps than usual seems OK
7747 Alternatively, have a coverage requirement in case neighbor is short
7748 If fastest, consider the top q/2 hits to be close neighbors, regardless
7750 double nearweight = 0;
7752 for (iClose = 0; iClose < 2 * tophits->m; iClose++)
7753 nearweight += besthitsSeed[iClose].weight;
7754 nearweight = nearweight/(2.0 * tophits->m); /* average */
7755 nearweight *= (1.0-2.0*neardist/3.0);
7756 double nearcover = 1.0 - neardist/2.0;
7758 if(verbose>2) fprintf(stderr,"Distance limit for close neighbors %f weight %f ungapped %d\n",
7759 neardist, nearweight, NJ->nPos-nGaps[seed]);
7760 for (iClose = 0; iClose < tophits->m; iClose++) {
7761 besthit_t *closehit = &besthitsSeed[iClose];
7762 int closeNode = closehit->j;
7763 if (tophits->top_hits_lists[closeNode].nHits > 0)
7766 /* If within close-distance, or identical, use as close neighbor */
7767 bool close = closehit->dist <= neardist
7768 && (closehit->weight >= nearweight
7769 || closehit->weight >= (NJ->nPos-nGaps[closeNode])*nearcover);
7770 bool identical = closehit->dist < 1e-6
7771 && fabs(closehit->weight - (NJ->nPos - nGaps[seed])) < 1e-5
7772 && fabs(closehit->weight - (NJ->nPos - nGaps[closeNode])) < 1e-5;
7773 if (useTopHits2nd && iClose < tophits->q && (close || identical)) {
7776 int nUse = MIN(tophits->q * tophits2Safety, 2 * tophits->m);
7777 besthit_t *besthitsClose = mymalloc(sizeof(besthit_t) * nUse);
7778 TransferBestHits(NJ, /*nActive*/NJ->nSeq,
7780 /*IN*/besthitsSeed, /*SIZE*/nUse,
7781 /*OUT*/besthitsClose,
7782 /*updateDistance*/true);
7783 SortSaveBestHits(closeNode, /*IN/SORT*/besthitsClose,
7784 /*IN-SIZE*/nUse, /*OUT-SIZE*/tophits->q,
7786 tophits->top_hits_lists[closeNode].hitSource = seed;
7787 besthitsClose = myfree(besthitsClose, sizeof(besthit_t) * nUse);
7788 } else if (close || identical || (fastest && iClose < (tophits->q+1)/2)) {
7791 if(verbose>2) fprintf(stderr, "Near neighbor %d (rank %d weight %f ungapped %d %d)\n",
7792 closeNode, iClose, besthitsSeed[iClose].weight,
7793 NJ->nPos-nGaps[seed],
7794 NJ->nPos-nGaps[closeNode]);
7796 /* compute top 2*m hits */
7797 TransferBestHits(NJ, /*nActive*/NJ->nSeq,
7799 /*IN*/besthitsSeed, /*SIZE*/2 * tophits->m,
7800 /*OUT*/besthitsNeighbor,
7801 /*updateDistance*/true);
7802 SortSaveBestHits(closeNode, /*IN/SORT*/besthitsNeighbor,
7803 /*IN-SIZE*/2 * tophits->m, /*OUT-SIZE*/tophits->m,
7806 /* And then try for a second level of transfer. We assume we
7807 are in a good area, because of the 1st
7808 level of transfer, and in a small neighborhood, because q is
7809 small (32 for 1 million sequences), so we do not make any close checks.
7812 for (iClose2 = 0; iClose2 < tophits->q && iClose2 < 2 * tophits->m; iClose2++) {
7813 int closeNode2 = besthitsNeighbor[iClose2].j;
7814 assert(closeNode2 >= 0);
7815 if (tophits->top_hits_lists[closeNode2].hits == NULL) {
7818 int nUse = MIN(tophits->q * tophits2Safety, 2 * tophits->m);
7819 besthit_t *besthitsClose2 = mymalloc(sizeof(besthit_t) * nUse);
7820 TransferBestHits(NJ, /*nActive*/NJ->nSeq,
7822 /*IN*/besthitsNeighbor, /*SIZE*/nUse,
7823 /*OUT*/besthitsClose2,
7824 /*updateDistance*/true);
7825 SortSaveBestHits(closeNode2, /*IN/SORT*/besthitsClose2,
7826 /*IN-SIZE*/nUse, /*OUT-SIZE*/tophits->q,
7828 tophits->top_hits_lists[closeNode2].hitSource = closeNode;
7829 besthitsClose2 = myfree(besthitsClose2, sizeof(besthit_t) * nUse);
7830 } /* end if should do 2nd-level transfer */
7833 } /* end loop over close candidates */
7834 besthitsSeed = myfree(besthitsSeed, sizeof(besthit_t)*NJ->nSeq);
7835 besthitsNeighbor = myfree(besthitsNeighbor, sizeof(besthit_t) * 2 * tophits->m);
7836 } /* end loop over seeds */
7838 for (iNode=0; iNode<NJ->nSeq; iNode++) {
7839 top_hits_list_t *l = &tophits->top_hits_lists[iNode];
7840 assert(l->hits != NULL);
7841 assert(l->hits[0].j >= 0);
7842 assert(l->hits[0].j < NJ->nSeq);
7843 assert(l->hits[0].j != iNode);
7844 tophits->visible[iNode] = l->hits[0];
7847 if (verbose >= 2) fprintf(stderr, "#Close neighbors among leaves: 1st-level %ld 2nd-level %ld seeds %ld\n",
7848 nCloseUsed, nClose2Used, NJ->nSeq-nCloseUsed-nClose2Used);
7849 nGaps = myfree(nGaps, sizeof(int)*NJ->nSeq);
7850 seeds = myfree(seeds, sizeof(int)*NJ->nSeq);
7852 /* Now add a "checking phase" where we ensure that the q or 2*sqrt(m) hits
7853 of i are represented in j (if they should be)
7856 int nCheck = tophits->q > 0 ? tophits->q : (int)(0.5 + 2.0*sqrt(tophits->m));
7857 for (iNode = 0; iNode < NJ->nSeq; iNode++) {
7858 if ((iNode % 100) == 0)
7859 ProgressReport("Checking top hits for %6d of %6d seqs",
7860 iNode+1, NJ->nSeq, 0, 0);
7861 top_hits_list_t *lNode = &tophits->top_hits_lists[iNode];
7863 for (iHit = 0; iHit < nCheck && iHit < lNode->nHits; iHit++) {
7864 besthit_t bh = HitToBestHit(iNode, lNode->hits[iHit]);
7865 SetCriterion(NJ, /*nActive*/NJ->nSeq, /*IN/OUT*/&bh);
7866 top_hits_list_t *lTarget = &tophits->top_hits_lists[bh.j];
7868 /* If this criterion is worse than the nCheck-1 entry of the target,
7869 then skip the check.
7870 This logic is based on assuming that the list is sorted,
7871 which is true initially but may not be true later.
7872 Still, is a good heuristic.
7875 assert(nCheck <= lTarget->nHits);
7876 besthit_t bhCheck = HitToBestHit(bh.j, lTarget->hits[nCheck-1]);
7877 SetCriterion(NJ, /*nActive*/NJ->nSeq, /*IN/OUT*/&bhCheck);
7878 if (bhCheck.criterion < bh.criterion)
7879 continue; /* no check needed */
7881 /* Check if this is present in the top-hit list */
7883 bool bFound = false;
7884 for (iHit2 = 0; iHit2 < lTarget->nHits && !bFound; iHit2++)
7885 if (lTarget->hits[iHit2].j == iNode)
7888 /* Find the hit with the worst criterion and replace it with this one */
7890 double dWorstCriterion = -1e20;
7891 for (iHit2 = 0; iHit2 < lTarget->nHits; iHit2++) {
7892 besthit_t bh2 = HitToBestHit(bh.j, lTarget->hits[iHit2]);
7893 SetCriterion(NJ, /*nActive*/NJ->nSeq, /*IN/OUT*/&bh2);
7894 if (bh2.criterion > dWorstCriterion) {
7896 dWorstCriterion = bh2.criterion;
7899 if (dWorstCriterion > bh.criterion) {
7900 assert(iWorst >= 0);
7901 lTarget->hits[iWorst].j = iNode;
7902 lTarget->hits[iWorst].dist = bh.dist;
7904 /* and perhaps update visible */
7906 bool bSuccess = GetVisible(NJ, /*nActive*/NJ->nSeq, tophits, bh.j, /*OUT*/&v);
7908 if (bh.criterion < v.criterion)
7909 tophits->visible[bh.j] = lTarget->hits[iWorst];
7916 fprintf(stderr, "Replaced %ld top hit entries\n", lReplace);
7919 /* Updates out-distances but does not reset or update visible set */
7920 void GetBestFromTopHits(int iNode,
7921 /*IN/UPDATE*/NJ_t *NJ,
7923 /*IN*/top_hits_t *tophits,
7924 /*OUT*/besthit_t *bestjoin) {
7926 assert(NJ->parent[iNode] < 0);
7927 top_hits_list_t *l = &tophits->top_hits_lists[iNode];
7928 assert(l->nHits > 0);
7929 assert(l->hits != NULL);
7932 SetOutDistance(NJ, iNode, nActive); /* ensure out-distances are not stale */
7936 bestjoin->dist = 1e20;
7937 bestjoin->criterion = 1e20;
7940 for(iBest=0; iBest < l->nHits; iBest++) {
7941 besthit_t bh = HitToBestHit(iNode, l->hits[iBest]);
7942 if (UpdateBestHit(/*IN/UPDATE*/NJ, nActive, /*IN/OUT*/&bh, /*update dist*/true)) {
7943 SetCriterion(/*IN/OUT*/NJ, nActive, /*IN/OUT*/&bh); /* make sure criterion is correct */
7944 if (bh.criterion < bestjoin->criterion)
7948 assert(bestjoin->j >= 0); /* a hit was found */
7949 assert(bestjoin->i == iNode);
7952 int ActiveAncestor(/*IN*/NJ_t *NJ, int iNode) {
7955 while(NJ->parent[iNode] >= 0)
7956 iNode = NJ->parent[iNode];
7960 bool UpdateBestHit(/*IN/UPDATE*/NJ_t *NJ, int nActive, /*IN/OUT*/besthit_t *hit,
7962 int i = ActiveAncestor(/*IN*/NJ, hit->i);
7963 int j = ActiveAncestor(/*IN*/NJ, hit->j);
7964 if (i < 0 || j < 0 || i == j) {
7969 hit->criterion = 1e20;
7972 if (i != hit->i || j != hit->j) {
7976 SetDistCriterion(/*IN/UPDATE*/NJ, nActive, /*IN/OUT*/hit);
7979 hit->criterion = 1e20;
7985 bool GetVisible(/*IN/UPDATE*/NJ_t *NJ, int nActive,
7986 /*IN/OUT*/top_hits_t *tophits,
7987 int iNode, /*OUT*/besthit_t *visible) {
7988 if (iNode < 0 || NJ->parent[iNode] >= 0)
7990 hit_t *v = &tophits->visible[iNode];
7991 if (v->j < 0 || NJ->parent[v->j] >= 0)
7993 *visible = HitToBestHit(iNode, *v);
7994 SetCriterion(/*IN/UPDATE*/NJ, nActive, /*IN/OUT*/visible);
7998 besthit_t *UniqueBestHits(/*IN/UPDATE*/NJ_t *NJ, int nActive,
7999 /*IN/SORT*/besthit_t *combined, int nCombined,
8000 /*OUT*/int *nUniqueOut) {
8002 for (iHit = 0; iHit < nCombined; iHit++) {
8003 besthit_t *hit = &combined[iHit];
8004 UpdateBestHit(/*IN/UPDATE*/NJ, nActive, /*IN/OUT*/hit, /*update*/false);
8006 qsort(/*IN/OUT*/combined, nCombined, sizeof(besthit_t), CompareHitsByIJ);
8008 besthit_t *uniqueList = (besthit_t*)mymalloc(sizeof(besthit_t)*nCombined);
8010 int iSavedLast = -1;
8012 /* First build the new list */
8013 for (iHit = 0; iHit < nCombined; iHit++) {
8014 besthit_t *hit = &combined[iHit];
8015 if (hit->i < 0 || hit->j < 0)
8017 if (iSavedLast >= 0) {
8018 /* toss out duplicates */
8019 besthit_t *saved = &combined[iSavedLast];
8020 if (saved->i == hit->i && saved->j == hit->j)
8023 assert(nUnique < nCombined);
8024 assert(hit->j >= 0 && NJ->parent[hit->j] < 0);
8025 uniqueList[nUnique++] = *hit;
8028 *nUniqueOut = nUnique;
8030 /* Then do any updates to the criterion or the distances in parallel */
8032 #pragma omp parallel for schedule(dynamic, 50)
8034 for (iHit = 0; iHit < nUnique; iHit++) {
8035 besthit_t *hit = &uniqueList[iHit];
8036 if (hit->dist < 0.0)
8037 SetDistCriterion(/*IN/UPDATE*/NJ, nActive, /*IN/OUT*/hit);
8039 SetCriterion(/*IN/UPDATE*/NJ, nActive, /*IN/OUT*/hit);
8045 Create a top hit list for the new node, either
8046 from children (if there are enough best hits left) or by a "refresh"
8047 Also set visible set for newnode
8048 Also update visible set for other nodes if we stumble across a "better" hit
8051 void TopHitJoin(int newnode,
8052 /*IN/UPDATE*/NJ_t *NJ,
8054 /*IN/OUT*/top_hits_t *tophits) {
8055 long startProfileOps = profileOps;
8056 long startOutProfileOps = outprofileOps;
8057 assert(NJ->child[newnode].nChild == 2);
8058 top_hits_list_t *lNew = &tophits->top_hits_lists[newnode];
8059 assert(lNew->hits == NULL);
8063 top_hits_list_t *lChild[2];
8064 for (i = 0; i< 2; i++) {
8065 lChild[i] = &tophits->top_hits_lists[NJ->child[newnode].child[i]];
8066 assert(lChild[i]->hits != NULL && lChild[i]->nHits > 0);
8068 int nCombined = lChild[0]->nHits + lChild[1]->nHits;
8069 besthit_t *combinedList = (besthit_t*)mymalloc(sizeof(besthit_t)*nCombined);
8070 HitsToBestHits(lChild[0]->hits, lChild[0]->nHits, NJ->child[newnode].child[0],
8071 /*OUT*/combinedList);
8072 HitsToBestHits(lChild[1]->hits, lChild[1]->nHits, NJ->child[newnode].child[1],
8073 /*OUT*/combinedList + lChild[0]->nHits);
8075 /* UniqueBestHits() replaces children (used in the calls to HitsToBestHits)
8076 with active ancestors, so all distances & criteria will be recomputed */
8077 besthit_t *uniqueList = UniqueBestHits(/*IN/UPDATE*/NJ, nActive,
8078 /*IN/SORT*/combinedList,
8081 int nUniqueAlloc = nCombined;
8082 combinedList = myfree(combinedList, sizeof(besthit_t)*nCombined);
8084 /* Forget the top-hit lists of the joined nodes */
8085 for (i = 0; i < 2; i++) {
8086 lChild[i]->hits = myfree(lChild[i]->hits, sizeof(hit_t) * lChild[i]->nHits);
8087 lChild[i]->nHits = 0;
8090 /* Use the average age, rounded up, by 1 Versions 2.0 and earlier
8091 used the maximum age, which leads to more refreshes without
8092 improving the accuracy of the NJ phase. Intuitively, if one of
8093 them was just refreshed then another refresh is unlikely to help.
8095 lNew->age = (lChild[0]->age+lChild[1]->age+1)/2 + 1;
8097 /* If top hit ages always match (perfectly balanced), then a
8098 limit of log2(m) would mean a refresh after
8099 m joins, which is about what we want.
8101 int tophitAgeLimit = MAX(1, (int)(0.5 + log((double)tophits->m)/log(2.0)));
8103 /* Either use the merged list as candidate top hits, or
8104 move from 2nd level to 1st level, or do a refresh
8105 UniqueBestHits eliminates hits to self, so if nUnique==nActive-1,
8106 we've already done the exhaustive search.
8108 Either way, we set tophits, visible(newnode), update visible of its top hits,
8109 and modify topvisible: if we do a refresh, then we reset it, otherwise we update
8111 bool bSecondLevel = lChild[0]->hitSource >= 0 && lChild[1]->hitSource >= 0;
8112 bool bUseUnique = nUnique==nActive-1
8113 || (lNew->age <= tophitAgeLimit
8114 && nUnique >= (bSecondLevel ? (int)(0.5 + tophits2Refresh * tophits->q)
8115 : (int)(0.5 + tophits->m * tophitsRefresh) ));
8116 if (bUseUnique && verbose > 2)
8117 fprintf(stderr,"Top hits for %d from combined %d nActive=%d tophitsage %d %s\n",
8118 newnode,nUnique,nActive,lNew->age,
8119 bSecondLevel ? "2ndlevel" : "1stlevel");
8123 && lNew->age <= tophitAgeLimit) {
8124 int source = ActiveAncestor(NJ, lChild[0]->hitSource);
8125 if (source == newnode)
8126 source = ActiveAncestor(NJ, lChild[1]->hitSource);
8127 /* In parallel mode, it is possible that we would select a node as the
8128 hit-source and then over-write that top hit with a short list.
8129 So we need this sanity check.
8131 if (source != newnode
8133 && tophits->top_hits_lists[source].hitSource < 0) {
8135 /* switch from 2nd-level to 1st-level top hits -- compute top hits list
8136 of node from what we have so far plus the active source plus its top hits */
8137 top_hits_list_t *lSource = &tophits->top_hits_lists[source];
8138 assert(lSource->hitSource < 0);
8139 assert(lSource->nHits > 0);
8140 int nMerge = 1 + lSource->nHits + nUnique;
8141 besthit_t *mergeList = mymalloc(sizeof(besthit_t) * nMerge);
8142 memcpy(/*to*/mergeList, /*from*/uniqueList, nUnique * sizeof(besthit_t));
8144 int iMerge = nUnique;
8145 mergeList[iMerge].i = newnode;
8146 mergeList[iMerge].j = source;
8147 SetDistCriterion(NJ, nActive, /*IN/OUT*/&mergeList[iMerge]);
8149 HitsToBestHits(lSource->hits, lSource->nHits, newnode, /*OUT*/mergeList+iMerge);
8150 for (i = 0; i < lSource->nHits; i++) {
8151 SetDistCriterion(NJ, nActive, /*IN/OUT*/&mergeList[iMerge]);
8154 assert(iMerge == nMerge);
8156 uniqueList = myfree(uniqueList, nUniqueAlloc * sizeof(besthit_t));
8157 uniqueList = UniqueBestHits(/*IN/UPDATE*/NJ, nActive,
8158 /*IN/SORT*/mergeList,
8161 nUniqueAlloc = nMerge;
8162 mergeList = myfree(mergeList, sizeof(besthit_t)*nMerge);
8164 assert(nUnique > 0);
8165 bUseUnique = nUnique >= (int)(0.5 + tophits->m * tophitsRefresh);
8166 bSecondLevel = false;
8168 if (bUseUnique && verbose > 2)
8169 fprintf(stderr, "Top hits for %d from children and source %d's %d hits, nUnique %d\n",
8170 newnode, source, lSource->nHits, nUnique);
8176 /* pick arbitrarily */
8177 lNew->hitSource = lChild[0]->hitSource;
8179 int nSave = MIN(nUnique, bSecondLevel ? tophits->q : tophits->m);
8182 fprintf(stderr, "Combined %d ops so far %ld\n", nUnique, profileOps - startProfileOps);
8183 SortSaveBestHits(newnode, /*IN/SORT*/uniqueList, /*nIn*/nUnique,
8184 /*nOut*/nSave, /*IN/OUT*/tophits);
8185 assert(lNew->hits != NULL); /* set by sort/save */
8186 tophits->visible[newnode] = lNew->hits[0];
8187 UpdateTopVisible(/*IN*/NJ, nActive, newnode, &tophits->visible[newnode],
8189 UpdateVisible(/*IN/UPDATE*/NJ, nActive, /*IN*/uniqueList, nSave, /*IN/OUT*/tophits);
8191 /* need to refresh: set top hits for node and for its top hits */
8192 if(verbose > 2) fprintf(stderr,"Top hits for %d by refresh (%d unique age %d) nActive=%d\n",
8193 newnode,nUnique,lNew->age,nActive);
8198 /* ensure all out-distances are up to date ahead of time
8199 to avoid any data overwriting issues.
8202 #pragma omp parallel for schedule(dynamic, 50)
8204 for (iNode = 0; iNode < NJ->maxnode; iNode++) {
8205 if (NJ->parent[iNode] < 0) {
8211 SetCriterion(/*IN/UPDATE*/NJ, nActive, &bh);
8213 SetOutDistance(/*IN/UDPATE*/NJ, iNode, nActive);
8218 /* exhaustively get the best 2*m hits for newnode, set visible, and save the top m */
8219 besthit_t *allhits = (besthit_t*)mymalloc(sizeof(besthit_t)*NJ->maxnode);
8220 assert(2 * tophits->m <= NJ->maxnode);
8222 SetBestHit(newnode, NJ, nActive, /*OUT*/&bh, /*OUT*/allhits);
8223 qsort(/*IN/OUT*/allhits, NJ->maxnode, sizeof(besthit_t), CompareHitsByCriterion);
8224 SortSaveBestHits(newnode, /*IN/SORT*/allhits, /*nIn*/NJ->maxnode,
8225 /*nOut*/tophits->m, /*IN/OUT*/tophits);
8227 /* Do not need to call UpdateVisible because we set visible below */
8229 /* And use the top 2*m entries to expand other best-hit lists, but only for top m */
8232 #pragma omp parallel for schedule(dynamic, 50)
8234 for (iHit=0; iHit < tophits->m; iHit++) {
8235 if (allhits[iHit].i < 0) continue;
8236 int iNode = allhits[iHit].j;
8238 if (NJ->parent[iNode] >= 0) continue;
8239 top_hits_list_t *l = &tophits->top_hits_lists[iNode];
8240 int nHitsOld = l->nHits;
8241 assert(nHitsOld <= tophits->m);
8244 /* Merge: old hits into 0->nHitsOld and hits from iNode above that */
8245 besthit_t *bothList = (besthit_t*)mymalloc(sizeof(besthit_t) * 3 * tophits->m);
8246 HitsToBestHits(/*IN*/l->hits, nHitsOld, iNode, /*OUT*/bothList); /* does not compute criterion */
8247 for (i = 0; i < nHitsOld; i++)
8248 SetCriterion(/*IN/UPDATE*/NJ, nActive, /*IN/OUT*/&bothList[i]);
8249 if (nActive <= 2 * tophits->m)
8250 l->hitSource = -1; /* abandon the 2nd-level top-hits heuristic */
8251 int nNewHits = l->hitSource >= 0 ? tophits->q : tophits->m;
8252 assert(nNewHits > 0);
8254 TransferBestHits(/*IN/UPDATE*/NJ, nActive, iNode,
8255 /*IN*/allhits, /*nOldHits*/2 * nNewHits,
8256 /*OUT*/&bothList[nHitsOld],
8257 /*updateDist*/false); /* rely on UniqueBestHits to update dist and/or criterion */
8259 besthit_t *uniqueList2 = UniqueBestHits(/*IN/UPDATE*/NJ, nActive,
8260 /*IN/SORT*/bothList, nHitsOld + 2 * nNewHits,
8262 assert(nUnique2 > 0);
8263 bothList = myfree(bothList,3 * tophits->m * sizeof(besthit_t));
8265 /* Note this will overwrite l, but we saved nHitsOld */
8266 SortSaveBestHits(iNode, /*IN/SORT*/uniqueList2, /*nIn*/nUnique2,
8267 /*nOut*/nNewHits, /*IN/OUT*/tophits);
8268 /* will update topvisible below */
8269 tophits->visible[iNode] = tophits->top_hits_lists[iNode].hits[0];
8270 uniqueList2 = myfree(uniqueList2, (nHitsOld + 2 * tophits->m) * sizeof(besthit_t));
8273 ResetTopVisible(/*IN/UPDATE*/NJ, nActive, /*IN/OUT*/tophits); /* outside of the parallel phase */
8274 allhits = myfree(allhits,sizeof(besthit_t)*NJ->maxnode);
8276 uniqueList = myfree(uniqueList, nUniqueAlloc * sizeof(besthit_t));
8278 fprintf(stderr, "New top-hit list for %d profile-ops %ld (out-ops %ld): source %d age %d members ",
8280 profileOps - startProfileOps,
8281 outprofileOps - startOutProfileOps,
8282 lNew->hitSource, lNew->age);
8285 for (i = 0; i < lNew->nHits; i++)
8286 fprintf(stderr, " %d", lNew->hits[i].j);
8287 fprintf(stderr,"\n");
8291 void UpdateVisible(/*IN/UPDATE*/NJ_t *NJ, int nActive,
8292 /*IN*/besthit_t *tophitsNode,
8294 /*IN/OUT*/top_hits_t *tophits) {
8297 for(iHit = 0; iHit < nTopHits; iHit++) {
8298 besthit_t *hit = &tophitsNode[iHit];
8299 if (hit->i < 0) continue; /* possible empty entries */
8300 assert(NJ->parent[hit->i] < 0);
8301 assert(hit->j >= 0 && NJ->parent[hit->j] < 0);
8303 bool bSuccess = GetVisible(/*IN/UPDATE*/NJ, nActive, /*IN/OUT*/tophits, hit->j, /*OUT*/&visible);
8304 if (!bSuccess || hit->criterion < visible.criterion) {
8307 hit_t *v = &tophits->visible[hit->j];
8309 v->dist = hit->dist;
8310 UpdateTopVisible(NJ, nActive, hit->j, v, /*IN/OUT*/tophits);
8311 if(verbose>5) fprintf(stderr,"NewVisible %d %d %f\n",
8312 hit->j,v->j,v->dist);
8314 } /* end loop over hits */
8317 /* Update the top-visible list to perhaps include visible[iNode] */
8318 void UpdateTopVisible(/*IN*/NJ_t * NJ, int nActive,
8319 int iIn, /*IN*/hit_t *hit,
8320 /*IN/OUT*/top_hits_t *tophits) {
8321 assert(tophits != NULL);
8322 bool bIn = false; /* placed in the list */
8325 /* First, if the list is not full, put it in somewhere */
8326 for (i = 0; i < tophits->nTopVisible && !bIn; i++) {
8327 int iNode = tophits->topvisible[i];
8329 /* this node is already in the top hit list */
8331 } else if (iNode < 0 || NJ->parent[iNode] >= 0) {
8332 /* found an empty spot */
8334 tophits->topvisible[i] = iIn;
8339 double dCriterionWorst = -1e20;
8341 /* Search for the worst hit */
8342 for (i = 0; i < tophits->nTopVisible && !bIn; i++) {
8343 int iNode = tophits->topvisible[i];
8344 assert(iNode >= 0 && NJ->parent[iNode] < 0 && iNode != iIn);
8346 if (!GetVisible(/*IN/UPDATE*/NJ, nActive, /*IN/OUT*/tophits, iNode, /*OUT*/&visible)) {
8347 /* found an empty spot */
8348 tophits->topvisible[i] = iIn;
8350 } else if (visible.i == hit->j && visible.j == iIn) {
8351 /* the reverse hit is already in the top hit list */
8353 } else if (visible.criterion >= dCriterionWorst) {
8355 dCriterionWorst = visible.criterion;
8360 if (!bIn && iPosWorst >= 0) {
8361 besthit_t visible = HitToBestHit(iIn, *hit);
8362 SetCriterion(/*IN/UPDATE*/NJ, nActive, /*IN/OUT*/&visible);
8363 if (visible.criterion < dCriterionWorst) {
8365 int iOld = tophits->topvisible[iPosWorst];
8366 fprintf(stderr, "TopVisible replace %d=>%d with %d=>%d\n",
8367 iOld, tophits->visible[iOld].j, visible.i, visible.j);
8369 tophits->topvisible[iPosWorst] = iIn;
8374 fprintf(stderr, "Updated TopVisible: ");
8375 for (i = 0; i < tophits->nTopVisible; i++) {
8376 int iNode = tophits->topvisible[i];
8377 if (iNode >= 0 && NJ->parent[iNode] < 0) {
8378 besthit_t bh = HitToBestHit(iNode, tophits->visible[iNode]);
8379 SetDistCriterion(NJ, nActive, &bh);
8380 fprintf(stderr, " %d=>%d:%.4f", bh.i, bh.j, bh.criterion);
8383 fprintf(stderr,"\n");
8387 /* Recompute the topvisible list */
8388 void ResetTopVisible(/*IN/UPDATE*/NJ_t *NJ,
8390 /*IN/OUT*/top_hits_t *tophits) {
8391 besthit_t *visibleSorted = mymalloc(sizeof(besthit_t)*nActive);
8392 int nVisible = 0; /* #entries in visibleSorted */
8394 for (iNode = 0; iNode < NJ->maxnode; iNode++) {
8395 /* skip joins involving stale nodes */
8396 if (NJ->parent[iNode] >= 0)
8399 if (GetVisible(/*IN/UPDATE*/NJ, nActive, /*IN/OUT*/tophits, iNode, /*OUT*/&v)) {
8400 assert(nVisible < nActive);
8401 visibleSorted[nVisible++] = v;
8404 assert(nVisible > 0);
8406 qsort(/*IN/OUT*/visibleSorted,nVisible,sizeof(besthit_t),CompareHitsByCriterion);
8408 /* Only keep the top m items, and try to avoid duplicating i->j with j->i
8409 Note that visible(i) -> j does not necessarily imply visible(j) -> i,
8410 so we store what the pairing was (or -1 for not used yet)
8412 int *inTopVisible = malloc(sizeof(int) * NJ->maxnodes);
8414 for (i = 0; i < NJ->maxnodes; i++)
8415 inTopVisible[i] = -1;
8418 fprintf(stderr, "top-hit search: nActive %d nVisible %d considering up to %d items\n",
8419 nActive, nVisible, tophits->m);
8421 /* save the sorted indices in topvisible */
8423 for (i = 0; i < nVisible && iSave < tophits->nTopVisible; i++) {
8424 besthit_t *v = &visibleSorted[i];
8425 if (inTopVisible[v->i] != v->j) { /* not seen already */
8426 tophits->topvisible[iSave++] = v->i;
8427 inTopVisible[v->i] = v->j;
8428 inTopVisible[v->j] = v->i;
8431 while(iSave < tophits->nTopVisible)
8432 tophits->topvisible[iSave++] = -1;
8433 myfree(visibleSorted, sizeof(besthit_t)*nActive);
8434 myfree(inTopVisible, sizeof(int) * NJ->maxnodes);
8435 tophits->topvisibleAge = 0;
8437 fprintf(stderr, "Reset TopVisible: ");
8438 for (i = 0; i < tophits->nTopVisible; i++) {
8439 int iNode = tophits->topvisible[i];
8442 fprintf(stderr, " %d=>%d", iNode, tophits->visible[iNode].j);
8444 fprintf(stderr,"\n");
8449 Find best hit to do in O(N*log(N) + m*L*log(N)) time, by
8450 copying and sorting the visible list
8451 updating out-distances for the top (up to m) candidates
8452 selecting the best hit
8454 local hill-climbing for a better join,
8455 using best-hit lists only, and updating
8456 all out-distances in every best-hit list
8458 void TopHitNJSearch(/*IN/UPDATE*/NJ_t *NJ, int nActive,
8459 /*IN/OUT*/top_hits_t *tophits,
8460 /*OUT*/besthit_t *join) {
8461 /* first, do we have at least m/2 candidates in topvisible?
8462 And remember the best one */
8464 int iNodeBestCandidate = -1;
8465 double dBestCriterion = 1e20;
8468 for (i = 0; i < tophits->nTopVisible; i++) {
8469 int iNode = tophits->topvisible[i];
8471 if (GetVisible(/*IN/UPDATE*/NJ, nActive, /*IN/OUT*/tophits, iNode, /*OUT*/&visible)) {
8473 if (iNodeBestCandidate < 0 || visible.criterion < dBestCriterion) {
8474 iNodeBestCandidate = iNode;
8475 dBestCriterion = visible.criterion;
8480 tophits->topvisibleAge++;
8481 /* Note we may have only nActive/2 joins b/c we try to store them once */
8482 if (2 * tophits->topvisibleAge > tophits->m
8483 || (3*nCandidate < tophits->nTopVisible && 3*nCandidate < nActive)) {
8484 /* recompute top visible */
8486 fprintf(stderr, "Resetting the top-visible list at nActive=%d\n",nActive);
8488 /* If age is low, then our visible set is becoming too sparse, because we have
8489 recently recomputed the top visible subset. This is very rare but can happen
8490 with -fastest. A quick-and-dirty solution is to walk up
8491 the parents to get additional entries in top hit lists. To ensure that the
8492 visible set becomes full, pick an arbitrary node if walking up terminates at self.
8494 if (tophits->topvisibleAge <= 2) {
8496 fprintf(stderr, "Expanding visible set by walking up to active nodes at nActive=%d\n", nActive);
8498 for (iNode = 0; iNode < NJ->maxnode; iNode++) {
8499 if (NJ->parent[iNode] >= 0)
8501 hit_t *v = &tophits->visible[iNode];
8502 int newj = ActiveAncestor(NJ, v->j);
8503 if (newj >= 0 && newj != v->j) {
8504 if (newj == iNode) {
8505 /* pick arbitrarily */
8507 while (NJ->parent[newj] >= 0 || newj == iNode)
8510 assert(newj >= 0 && newj < NJ->maxnodes
8512 && NJ->parent[newj] < 0);
8514 /* Set v to point to newj */
8515 besthit_t bh = { iNode, newj, -1e20, -1e20, -1e20 };
8516 SetDistCriterion(NJ, nActive, /*IN/OUT*/&bh);
8522 ResetTopVisible(/*IN/UPDATE*/NJ, nActive, /*IN/OUT*/tophits);
8523 /* and recurse to try again */
8524 TopHitNJSearch(NJ, nActive, tophits, join);
8528 fprintf(stderr, "Top-visible list size %d (nActive %d m %d)\n",
8529 nCandidate, nActive, tophits->m);
8530 assert(iNodeBestCandidate >= 0 && NJ->parent[iNodeBestCandidate] < 0);
8531 bool bSuccess = GetVisible(NJ, nActive, tophits, iNodeBestCandidate, /*OUT*/join);
8533 assert(join->i >= 0 && NJ->parent[join->i] < 0);
8534 assert(join->j >= 0 && NJ->parent[join->j] < 0);
8544 GetBestFromTopHits(join->i, NJ, nActive, tophits, /*OUT*/&bestI);
8545 assert(bestI.i == join->i);
8546 if (bestI.j != join->j && bestI.criterion < join->criterion) {
8549 fprintf(stderr,"BetterI\t%d\t%d\t%d\t%d\t%f\t%f\n",
8550 join->i,join->j,bestI.i,bestI.j,
8551 join->criterion,bestI.criterion);
8556 GetBestFromTopHits(join->j, NJ, nActive, tophits, /*OUT*/&bestJ);
8557 assert(bestJ.i == join->j);
8558 if (bestJ.j != join->i && bestJ.criterion < join->criterion) {
8561 fprintf(stderr,"BetterJ\t%d\t%d\t%d\t%d\t%f\t%f\n",
8562 join->i,join->j,bestJ.i,bestJ.j,
8563 join->criterion,bestJ.criterion);
8566 if(changed) nHillBetter++;
8570 int NGaps(/*IN*/NJ_t *NJ, int iNode) {
8571 assert(iNode < NJ->nSeq);
8574 for(p=0; p<NJ->nPos; p++) {
8575 if (NJ->profiles[iNode]->codes[p] == NOCODE)
8581 int CompareHitsByCriterion(const void *c1, const void *c2) {
8582 const besthit_t *hit1 = (besthit_t*)c1;
8583 const besthit_t *hit2 = (besthit_t*)c2;
8584 if (hit1->criterion < hit2->criterion) return(-1);
8585 if (hit1->criterion > hit2->criterion) return(1);
8589 int CompareHitsByIJ(const void *c1, const void *c2) {
8590 const besthit_t *hit1 = (besthit_t*)c1;
8591 const besthit_t *hit2 = (besthit_t*)c2;
8592 return hit1->i != hit2->i ? hit1->i - hit2->i : hit1->j - hit2->j;
8595 void SortSaveBestHits(int iNode, /*IN/SORT*/besthit_t *besthits,
8597 /*IN/OUT*/top_hits_t *tophits) {
8600 top_hits_list_t *l = &tophits->top_hits_lists[iNode];
8602 qsort(/*IN/OUT*/besthits,nIn,sizeof(besthit_t),CompareHitsByCriterion);
8604 /* First count how many we will save
8605 Not sure if removing duplicates is actually necessary.
8610 for (iBest = 0; iBest < nIn && nSave < nOut; iBest++) {
8611 if (besthits[iBest].i < 0)
8613 assert(besthits[iBest].i == iNode);
8614 int j = besthits[iBest].j;
8615 if (j != iNode && j != jLast && j >= 0) {
8624 omp_set_lock(&tophits->locks[iNode]);
8626 if (l->hits != NULL) {
8627 l->hits = myfree(l->hits, l->nHits * sizeof(hit_t));
8630 l->hits = mymalloc(sizeof(hit_t) * nSave);
8634 for (iBest = 0; iBest < nIn && iSave < nSave; iBest++) {
8635 int j = besthits[iBest].j;
8636 if (j != iNode && j != jLast && j >= 0) {
8637 l->hits[iSave].j = j;
8638 l->hits[iSave].dist = besthits[iBest].dist;
8644 omp_unset_lock(&tophits->locks[iNode]);
8646 assert(iSave == nSave);
8649 void TransferBestHits(/*IN/UPDATE*/NJ_t *NJ,
8652 /*IN*/besthit_t *oldhits,
8654 /*OUT*/besthit_t *newhits,
8655 bool updateDistances) {
8657 assert(NJ->parent[iNode] < 0);
8660 for(iBest = 0; iBest < nOldHits; iBest++) {
8661 besthit_t *old = &oldhits[iBest];
8662 besthit_t *new = &newhits[iBest];
8664 new->j = ActiveAncestor(/*IN*/NJ, old->j);
8665 new->dist = old->dist; /* may get reset below */
8666 new->weight = old->weight;
8667 new->criterion = old->criterion;
8669 if(new->j < 0 || new->j == iNode) {
8672 new->criterion = 1e20;
8673 } else if (new->i != old->i || new->j != old->j) {
8674 if (updateDistances)
8675 SetDistCriterion(/*IN/UPDATE*/NJ, nActive, /*IN/OUT*/new);
8678 new->criterion = 1e20;
8681 if (updateDistances)
8682 SetCriterion(/*IN/UPDATE*/NJ, nActive, /*IN/OUT*/new);
8684 new->criterion = 1e20; /* leave dist alone */
8689 void HitsToBestHits(/*IN*/hit_t *hits, int nHits, int iNode, /*OUT*/besthit_t *newhits) {
8691 for (i = 0; i < nHits; i++) {
8692 hit_t *hit = &hits[i];
8693 besthit_t *bh = &newhits[i];
8696 bh->dist = hit->dist;
8697 bh->criterion = 1e20;
8698 bh->weight = -1; /* not the true value -- we compute these directly when needed */
8702 besthit_t HitToBestHit(int i, hit_t hit) {
8707 bh.criterion = 1e20;
8712 char *OpenMPString(void) {
8714 static char buf[100];
8715 sprintf(buf, ", OpenMP (%d threads)", omp_get_max_threads());
8722 /* Algorithm 26.2.17 from Abromowitz and Stegun, Handbook of Mathematical Functions
8723 Absolute accuracy of only about 1e-7, which is enough for us
8725 double pnorm(double x)
8727 double b1 = 0.319381530;
8728 double b2 = -0.356563782;
8729 double b3 = 1.781477937;
8730 double b4 = -1.821255978;
8731 double b5 = 1.330274429;
8732 double p = 0.2316419;
8733 double c = 0.39894228;
8736 double t = 1.0 / ( 1.0 + p * x );
8737 return (1.0 - c * exp( -x * x / 2.0 ) * t *
8738 ( t *( t * ( t * ( t * b5 + b4 ) + b3 ) + b2 ) + b1 ));
8741 double t = 1.0 / ( 1.0 - p * x );
8742 return ( c * exp( -x * x / 2.0 ) * t *
8743 ( t *( t * ( t * ( t * b5 + b4 ) + b3 ) + b2 ) + b1 ));
8746 void *mymalloc(size_t sz) {
8747 if (sz == 0) return(NULL);
8748 void *new = malloc(sz);
8750 fprintf(stderr, "Out of memory\n");
8756 struct mallinfo mi = mallinfo();
8757 if (mi.arena+mi.hblkhd > maxmallocHeap)
8758 maxmallocHeap = mi.arena+mi.hblkhd;
8760 /* gcc malloc should always return 16-byte-aligned values... */
8761 assert(IS_ALIGNED(new));
8765 void *mymemdup(void *data, size_t sz) {
8766 if(data==NULL) return(NULL);
8767 void *new = mymalloc(sz);
8768 memcpy(/*to*/new, /*from*/data, sz);
8772 void *myrealloc(void *data, size_t szOld, size_t szNew, bool bCopy) {
8773 if (data == NULL && szOld == 0)
8774 return(mymalloc(szNew));
8775 if (data == NULL || szOld == 0 || szNew == 0) {
8776 fprintf(stderr,"Empty myrealloc\n");
8783 /* Try to reduce memory fragmentation by allocating anew and copying
8784 Seems to help in practice */
8785 new = mymemdup(data, szNew);
8786 myfree(data, szOld);
8788 new = realloc(data,szNew);
8790 fprintf(stderr, "Out of memory\n");
8793 assert(IS_ALIGNED(new));
8794 szAllAlloc += (szNew-szOld);
8795 mymallocUsed += (szNew-szOld);
8797 struct mallinfo mi = mallinfo();
8798 if (mi.arena+mi.hblkhd > maxmallocHeap)
8799 maxmallocHeap = mi.arena+mi.hblkhd;
8805 void *myfree(void *p, size_t sz) {
8806 if(p==NULL) return(NULL);
8812 /******************************************************************************/
8813 /* Minimization of a 1-dimensional function by Brent's method (Numerical Recipes)
8814 * Borrowed from Tree-Puzzle 5.1 util.c under GPL
8815 * Modified by M.N.P to pass in the accessory data for the optimization function,
8816 * to use 2x bounds around the starting guess and expand them if necessary,
8817 * and to use both a fractional and an absolute tolerance
8821 #define CGOLD 0.3819660
8822 #define TINY 1.0e-20
8823 #define ZEPS 1.0e-10
8824 #define SHFT(a,b,c,d) (a)=(b);(b)=(c);(c)=(d);
8825 #define SIGN(a,b) ((b) >= 0.0 ? fabs(a) : -fabs(a))
8827 /* Brents method in one dimension */
8828 double brent(double ax, double bx, double cx, double (*f)(double, void *), void *data,
8829 double ftol, double atol,
8830 double *foptx, double *f2optx, double fax, double fbx, double fcx)
8833 double a,b,d=0,etemp,fu,fv,fw,fx,p,q,r,tol1,tol2,u,v,w,x,xm;
8837 a=(ax < cx ? ax : cx);
8838 b=(ax > cx ? ax : cx);
8852 for (iter=1;iter<=ITMAX;iter++) {
8855 tol2=2.0*(tol1+ZEPS);
8856 if (fabs(x-xm) <= (tol2-0.5*(b-a))
8857 || fabs(a-b) < atol) {
8862 *f2optx = 2.0*(fv*xw + fx*wv + fw*vx)/
8863 (v*v*xw + x*x*wv + w*w*vx);
8866 if (fabs(e) > tol1) {
8871 if (q > 0.0) p = -p;
8875 if (fabs(p) >= fabs(0.5*q*etemp) || p <= q*(a-x) || p >= q*(b-x))
8876 d=CGOLD*(e=(x >= xm ? a-x : b-x));
8880 if (u-a < tol2 || b-u < tol2)
8884 d=CGOLD*(e=(x >= xm ? a-x : b-x));
8886 u=(fabs(d) >= tol1 ? x+d : x+SIGN(tol1,d));
8889 if (u >= x) a=x; else b=x;
8893 if (u < x) a=u; else b=u;
8894 if (fu <= fw || w == x) {
8899 } else if (fu <= fv || v == x || v == w) {
8909 *f2optx = 2.0*(fv*xw + fx*wv + fw*vx)/
8910 (v*v*xw + x*x*wv + w*w*vx);
8919 /* one-dimensional minimization - as input a lower and an upper limit and a trial
8920 value for the minimum is needed: xmin < xguess < xmax
8921 the function and a fractional tolerance has to be specified
8922 onedimenmin returns the optimal x value and the value of the function
8923 and its second derivative at this point
8925 double onedimenmin(double xmin, double xguess, double xmax, double (*f)(double,void*), void *data,
8926 double ftol, double atol,
8927 /*OUT*/double *fx, /*OUT*/double *f2x)
8929 double optx, ax, bx, cx, fa, fb, fc;
8931 /* first attempt to bracketize minimum */
8932 if (xguess == xmin) {
8936 } else if (xguess <= 2.0 * xmin) {
8950 fprintf(stderr, "onedimenmin lo %.4f guess %.4f hi %.4f range %.4f %.4f\n",
8951 ax, bx, cx, xmin, xmax);
8952 /* ideally this range includes the true minimum, i.e.,
8954 if not, we gradually expand the boundaries until it does,
8955 or we near the boundary of the allowed range and use that
8960 while(fa < fb && ax > xmin) {
8962 if (ax < 2.0*xmin) /* give up on shrinking the region */
8966 while(fc < fb && cx < xmax) {
8968 if (cx > xmax * 0.95)
8972 optx = brent(ax, bx, cx, f, data, ftol, atol, fx, f2x, fa, fb, fc);
8975 fprintf(stderr, "onedimenmin reaches optimum f(%.4f) = %.4f f2x %.4f\n", optx, *fx, *f2x);
8976 return optx; /* return optimal x */
8979 /* Numerical code for the gamma distribution is modified from the PhyML 3 code
8980 (GNU public license) of Stephane Guindon
8983 double LnGamma (double alpha)
8985 /* returns ln(gamma(alpha)) for alpha>0, accurate to 10 decimal places.
8986 Stirling's formula is used for the central polynomial part of the procedure.
8987 Pike MC & Hill ID (1966) Algorithm 291: Logarithm of the gamma function.
8988 Communications of the Association for Computing Machinery, 9:684
8990 double x=alpha, f=0, z;
8994 x=z; f=-(double)log(f);
8997 return f + (x-0.5)*(double)log(x) - x + .918938533204673
8998 + (((-.000595238095238*z+.000793650793651)*z-.002777777777778)*z
8999 +.083333333333333)/x;
9002 double IncompleteGamma(double x, double alpha, double ln_gamma_alpha)
9004 /* returns the incomplete gamma ratio I(x,alpha) where x is the upper
9005 limit of the integration and alpha is the shape parameter.
9006 returns (-1) if in error
9007 ln_gamma_alpha = ln(Gamma(alpha)), is almost redundant.
9008 (1) series expansion if (alpha>x || x<=1)
9009 (2) continued fraction otherwise
9011 Bhattacharjee GP (1970) The incomplete gamma integral. Applied Statistics,
9015 double p=alpha, g=ln_gamma_alpha;
9016 double accurate=1e-8, overflow=1e30;
9017 double factor, gin=0, rn=0, a=0,b=0,an=0,dif=0, term=0, pn[6];
9019 if (x==0) return (0);
9020 if (x<0 || p<=0) return (-1);
9022 factor=(double)exp(p*(double)log(x)-x-g);
9023 if (x>1 && x>=p) goto l30;
9024 /* (1) series expansion */
9025 gin=1; term=1; rn=p;
9028 term*=x/rn; gin+=term;
9030 if (term > accurate) goto l20;
9034 /* (2) continued fraction */
9035 a=1-p; b=a+x+1; term=0;
9036 pn[0]=1; pn[1]=x; pn[2]=x+1; pn[3]=x*b;
9039 a++; b+=2; term++; an=a*term;
9040 for (i=0; i<2; i++) pn[i+4]=b*pn[i+2]-an*pn[i];
9041 if (pn[5] == 0) goto l35;
9042 rn=pn[4]/pn[5]; dif=fabs(gin-rn);
9043 if (dif>accurate) goto l34;
9044 if (dif<=accurate*rn) goto l42;
9048 for (i=0; i<4; i++) pn[i]=pn[i+2];
9049 if (fabs(pn[4]) < overflow) goto l32;
9050 for (i=0; i<4; i++) pn[i]/=overflow;
9059 double PGamma(double x, double alpha)
9061 /* scale = 1/alpha */
9062 return IncompleteGamma(x*alpha,alpha,LnGamma(alpha));
9065 /* helper function to subtract timval structures */
9066 /* Subtract the `struct timeval' values X and Y,
9067 storing the result in RESULT.
9068 Return 1 if the difference is negative, otherwise 0. */
9069 int timeval_subtract (struct timeval *result, struct timeval *x, struct timeval *y)
9071 /* Perform the carry for the later subtraction by updating y. */
9072 if (x->tv_usec < y->tv_usec) {
9073 int nsec = (y->tv_usec - x->tv_usec) / 1000000 + 1;
9074 y->tv_usec -= 1000000 * nsec;
9077 if (x->tv_usec - y->tv_usec > 1000000) {
9078 int nsec = (x->tv_usec - y->tv_usec) / 1000000;
9079 y->tv_usec += 1000000 * nsec;
9083 /* Compute the time remaining to wait.
9084 tv_usec is certainly positive. */
9085 result->tv_sec = x->tv_sec - y->tv_sec;
9086 result->tv_usec = x->tv_usec - y->tv_usec;
9088 /* Return 1 if result is negative. */
9089 return x->tv_sec < y->tv_sec;
9092 double clockDiff(/*IN*/struct timeval *clock_start) {
9093 struct timeval time_now, elapsed;
9094 gettimeofday(/*OUT*/&time_now,NULL);
9095 timeval_subtract(/*OUT*/&elapsed,/*IN*/&time_now,/*IN*/clock_start);
9096 return(elapsed.tv_sec + elapsed.tv_usec*1e-6);
9100 /* The random number generator is taken from D E Knuth
9101 http://www-cs-faculty.stanford.edu/~knuth/taocp.html
9104 /* This program by D E Knuth is in the public domain and freely copyable.
9105 * It is explained in Seminumerical Algorithms, 3rd edition, Section 3.6
9106 * (or in the errata to the 2nd edition --- see
9107 * http://www-cs-faculty.stanford.edu/~knuth/taocp.html
9108 * in the changes to Volume 2 on pages 171 and following). */
9110 /* N.B. The MODIFICATIONS introduced in the 9th printing (2002) are
9111 included here; there's no backwards compatibility with the original. */
9113 /* This version also adopts Brendan McKay's suggestion to
9114 accommodate naive users who forget to call ran_start(seed). */
9116 /* If you find any bugs, please report them immediately to
9117 * taocp@cs.stanford.edu
9118 * (and you will be rewarded if the bug is genuine). Thanks! */
9120 /************ see the book for explanations and caveats! *******************/
9121 /************ in particular, you need two's complement arithmetic **********/
9123 #define KK 100 /* the long lag */
9124 #define LL 37 /* the short lag */
9125 #define MM (1L<<30) /* the modulus */
9126 #define mod_diff(x,y) (((x)-(y))&(MM-1)) /* subtraction mod MM */
9128 long ran_x[KK]; /* the generator state */
9131 void ran_array(long aa[],int n)
9133 void ran_array(aa,n) /* put n new random numbers in aa */
9134 long *aa; /* destination */
9135 int n; /* array length (must be at least KK) */
9139 for (j=0;j<KK;j++) aa[j]=ran_x[j];
9140 for (;j<n;j++) aa[j]=mod_diff(aa[j-KK],aa[j-LL]);
9141 for (i=0;i<LL;i++,j++) ran_x[i]=mod_diff(aa[j-KK],aa[j-LL]);
9142 for (;i<KK;i++,j++) ran_x[i]=mod_diff(aa[j-KK],ran_x[i-LL]);
9145 /* the following routines are from exercise 3.6--15 */
9146 /* after calling ran_start, get new randoms by, e.g., "x=ran_arr_next()" */
9148 #define QUALITY 1009 /* recommended quality level for high-res use */
9149 long ran_arr_buf[QUALITY];
9150 long ran_arr_dummy=-1, ran_arr_started=-1;
9151 long *ran_arr_ptr=&ran_arr_dummy; /* the next random number, or -1 */
9153 #define TT 70 /* guaranteed separation between streams */
9154 #define is_odd(x) ((x)&1) /* units bit of x */
9157 void ran_start(long seed)
9159 void ran_start(seed) /* do this before using ran_array */
9160 long seed; /* selector for different streams */
9164 long x[KK+KK-1]; /* the preparation buffer */
9165 register long ss=(seed+2)&(MM-2);
9166 for (j=0;j<KK;j++) {
9167 x[j]=ss; /* bootstrap the buffer */
9168 ss<<=1; if (ss>=MM) ss-=MM-2; /* cyclic shift 29 bits */
9170 x[1]++; /* make x[1] (and only x[1]) odd */
9171 for (ss=seed&(MM-1),t=TT-1; t; ) {
9172 for (j=KK-1;j>0;j--) x[j+j]=x[j], x[j+j-1]=0; /* "square" */
9173 for (j=KK+KK-2;j>=KK;j--)
9174 x[j-(KK-LL)]=mod_diff(x[j-(KK-LL)],x[j]),
9175 x[j-KK]=mod_diff(x[j-KK],x[j]);
9176 if (is_odd(ss)) { /* "multiply by z" */
9177 for (j=KK;j>0;j--) x[j]=x[j-1];
9178 x[0]=x[KK]; /* shift the buffer cyclically */
9179 x[LL]=mod_diff(x[LL],x[KK]);
9181 if (ss) ss>>=1; else t--;
9183 for (j=0;j<LL;j++) ran_x[j+KK-LL]=x[j];
9184 for (;j<KK;j++) ran_x[j-LL]=x[j];
9185 for (j=0;j<10;j++) ran_array(x,KK+KK-1); /* warm things up */
9186 ran_arr_ptr=&ran_arr_started;
9189 #define ran_arr_next() (*ran_arr_ptr>=0? *ran_arr_ptr++: ran_arr_cycle())
9190 long ran_arr_cycle()
9192 if (ran_arr_ptr==&ran_arr_dummy)
9193 ran_start(314159L); /* the user forgot to initialize */
9194 ran_array(ran_arr_buf,QUALITY);
9196 ran_arr_ptr=ran_arr_buf+1;
9197 return ran_arr_buf[0];
9200 /* end of code from Knuth */
9202 double knuth_rand() {
9203 return(9.31322574615479e-10 * ran_arr_next()); /* multiply by 2**-30 */
9206 hashstrings_t *MakeHashtable(char **strings, int nStrings) {
9207 hashstrings_t *hash = (hashstrings_t*)mymalloc(sizeof(hashstrings_t));
9208 hash->nBuckets = 8*nStrings;
9209 hash->buckets = (hashbucket_t*)mymalloc(sizeof(hashbucket_t) * hash->nBuckets);
9211 for (i=0; i < hash->nBuckets; i++) {
9212 hash->buckets[i].string = NULL;
9213 hash->buckets[i].nCount = 0;
9214 hash->buckets[i].first = -1;
9216 for (i=0; i < nStrings; i++) {
9217 hashiterator_t hi = FindMatch(hash, strings[i]);
9218 if (hash->buckets[hi].string == NULL) {
9219 /* save a unique entry */
9220 assert(hash->buckets[hi].nCount == 0);
9221 hash->buckets[hi].string = strings[i];
9222 hash->buckets[hi].nCount = 1;
9223 hash->buckets[hi].first = i;
9225 /* record a duplicate entry */
9226 assert(hash->buckets[hi].string != NULL);
9227 assert(strcmp(hash->buckets[hi].string, strings[i]) == 0);
9228 assert(hash->buckets[hi].first >= 0);
9229 hash->buckets[hi].nCount++;
9235 hashstrings_t *FreeHashtable(hashstrings_t* hash) {
9237 myfree(hash->buckets, sizeof(hashbucket_t) * hash->nBuckets);
9238 myfree(hash, sizeof(hashstrings_t));
9243 #define MAXADLER 65521
9244 hashiterator_t FindMatch(hashstrings_t *hash, char *string) {
9245 /* Adler-32 checksum */
9246 unsigned int hashA = 1;
9247 unsigned int hashB = 0;
9249 for (p = string; *p != '\0'; p++) {
9250 hashA = ((unsigned int)*p + hashA);
9251 hashB = hashA+hashB;
9255 hashiterator_t hi = (hashB*65536+hashA) % hash->nBuckets;
9256 while(hash->buckets[hi].string != NULL
9257 && strcmp(hash->buckets[hi].string, string) != 0) {
9259 if (hi >= hash->nBuckets)
9265 char *GetHashString(hashstrings_t *hash, hashiterator_t hi) {
9266 return(hash->buckets[hi].string);
9269 int HashCount(hashstrings_t *hash, hashiterator_t hi) {
9270 return(hash->buckets[hi].nCount);
9273 int HashFirst(hashstrings_t *hash, hashiterator_t hi) {
9274 return(hash->buckets[hi].first);
9277 uniquify_t *UniquifyAln(alignment_t *aln) {
9279 char **uniqueSeq = (char**)mymalloc(aln->nSeq * sizeof(char*)); /* iUnique -> seq */
9280 int *uniqueFirst = (int*)mymalloc(aln->nSeq * sizeof(int)); /* iUnique -> iFirst in aln */
9281 int *alnNext = (int*)mymalloc(aln->nSeq * sizeof(int)); /* i in aln -> next, or -1 */
9282 int *alnToUniq = (int*)mymalloc(aln->nSeq * sizeof(int)); /* i in aln -> iUnique; many -> -1 */
9285 for (i = 0; i < aln->nSeq; i++) {
9286 uniqueSeq[i] = NULL;
9287 uniqueFirst[i] = -1;
9291 hashstrings_t *hashseqs = MakeHashtable(aln->seqs, aln->nSeq);
9292 for (i=0; i<aln->nSeq; i++) {
9293 hashiterator_t hi = FindMatch(hashseqs,aln->seqs[i]);
9294 int first = HashFirst(hashseqs,hi);
9296 uniqueSeq[nUniqueSeq] = aln->seqs[i];
9297 uniqueFirst[nUniqueSeq] = i;
9298 alnToUniq[i] = nUniqueSeq;
9302 while (alnNext[last] != -1)
9303 last = alnNext[last];
9306 assert(alnToUniq[last] >= 0 && alnToUniq[last] < nUniqueSeq);
9307 alnToUniq[i] = alnToUniq[last];
9310 assert(nUniqueSeq>0);
9311 hashseqs = FreeHashtable(hashseqs);
9313 uniquify_t *uniquify = (uniquify_t*)mymalloc(sizeof(uniquify_t));
9314 uniquify->nSeq = aln->nSeq;
9315 uniquify->nUnique = nUniqueSeq;
9316 uniquify->uniqueFirst = uniqueFirst;
9317 uniquify->alnNext = alnNext;
9318 uniquify->alnToUniq = alnToUniq;
9319 uniquify->uniqueSeq = uniqueSeq;
9323 uniquify_t *FreeUniquify(uniquify_t *unique) {
9324 if (unique != NULL) {
9325 myfree(unique->uniqueFirst, sizeof(int)*unique->nSeq);
9326 myfree(unique->alnNext, sizeof(int)*unique->nSeq);
9327 myfree(unique->alnToUniq, sizeof(int)*unique->nSeq);
9328 myfree(unique->uniqueSeq, sizeof(char*)*unique->nSeq);
9329 myfree(unique,sizeof(uniquify_t));
9335 traversal_t InitTraversal(NJ_t *NJ) {
9336 traversal_t worked = (bool*)mymalloc(sizeof(bool)*NJ->maxnodes);
9338 for (i=0; i<NJ->maxnodes; i++)
9343 void SkipTraversalInto(int node, /*IN/OUT*/traversal_t traversal) {
9344 traversal[node] = true;
9347 int TraversePostorder(int node, NJ_t *NJ, /*IN/OUT*/traversal_t traversal,
9348 /*OPTIONAL OUT*/bool *pUp) {
9354 /* move to a child if possible */
9357 for (iChild=0; iChild < NJ->child[node].nChild; iChild++) {
9358 int child = NJ->child[node].child[iChild];
9359 if (!traversal[child]) {
9366 continue; /* keep moving down */
9367 if (!traversal[node]) {
9368 traversal[node] = true;
9371 /* If we've already done this node, need to move up */
9372 if (node == NJ->root)
9373 return(-1); /* nowhere to go -- done traversing */
9374 node = NJ->parent[node];
9375 /* If we go up to someplace that was already marked as visited, this is due
9376 to a change in topology, so return it marked as "up" */
9377 if (pUp && traversal[node]) {
9384 traversal_t FreeTraversal(traversal_t traversal, NJ_t *NJ) {
9385 myfree(traversal, sizeof(bool)*NJ->maxnodes);
9389 profile_t **UpProfiles(NJ_t *NJ) {
9390 profile_t **upProfiles = (profile_t**)mymalloc(sizeof(profile_t*)*NJ->maxnodes);
9392 for (i=0; i<NJ->maxnodes; i++) upProfiles[i] = NULL;
9396 profile_t *GetUpProfile(/*IN/OUT*/profile_t **upProfiles, NJ_t *NJ, int outnode, bool useML) {
9397 assert(outnode != NJ->root && outnode >= NJ->nSeq); /* not for root or leaves */
9398 if (upProfiles[outnode] != NULL)
9399 return(upProfiles[outnode]);
9402 int *pathToRoot = PathToRoot(NJ, outnode, /*OUT*/&depth);
9404 /* depth-1 is root */
9405 for (i = depth-2; i>=0; i--) {
9406 int node = pathToRoot[i];
9408 if (upProfiles[node] == NULL) {
9409 /* Note -- SetupABCD may call GetUpProfile, but it should do it farther
9410 up in the path to the root
9412 profile_t *profiles[4];
9414 SetupABCD(NJ, node, /*OUT*/profiles, /*IN/OUT*/upProfiles, /*OUT*/nodeABCD, useML);
9416 /* If node is a child of root, then the 4th profile is of the 2nd root-sibling of node
9417 Otherwise, the 4th profile is the up-profile of the parent of node, and that
9418 is the branch-length we need
9420 double lenC = NJ->branchlength[nodeABCD[2]];
9421 double lenD = NJ->branchlength[nodeABCD[3]];
9423 fprintf(stderr, "Computing UpProfile for node %d with lenC %.4f lenD %.4f pair-loglk %.3f\n",
9425 PairLogLk(profiles[2],profiles[3],lenC+lenD,NJ->nPos,NJ->transmat,&NJ->rates, /*site_lk*/NULL));
9426 PrintNJInternal(stderr, NJ, /*useLen*/true);
9428 upProfiles[node] = PosteriorProfile(/*C*/profiles[2], /*D*/profiles[3],
9430 NJ->transmat, &NJ->rates, NJ->nPos, NJ->nConstraints);
9432 profile_t *profilesCDAB[4] = { profiles[2], profiles[3], profiles[0], profiles[1] };
9433 double weight = QuartetWeight(profilesCDAB, NJ->distance_matrix, NJ->nPos);
9435 fprintf(stderr, "Compute upprofile of %d from %d and parents (vs. children %d %d) with weight %.3f\n",
9436 node, nodeABCD[2], nodeABCD[0], nodeABCD[1], weight);
9437 upProfiles[node] = AverageProfile(profiles[2], profiles[3],
9438 NJ->nPos, NJ->nConstraints,
9439 NJ->distance_matrix,
9444 FreePath(pathToRoot,NJ);
9445 assert(upProfiles[outnode] != NULL);
9446 return(upProfiles[outnode]);
9449 profile_t *DeleteUpProfile(/*IN/OUT*/profile_t **upProfiles, NJ_t *NJ, int node) {
9450 assert(node>=0 && node < NJ->maxnodes);
9451 if (upProfiles[node] != NULL)
9452 upProfiles[node] = FreeProfile(upProfiles[node], NJ->nPos, NJ->nConstraints); /* returns NULL */
9456 profile_t **FreeUpProfiles(profile_t **upProfiles, NJ_t *NJ) {
9459 for (i=0; i < NJ->maxnodes; i++) {
9460 if (upProfiles[i] != NULL)
9462 DeleteUpProfile(upProfiles, NJ, i);
9464 myfree(upProfiles, sizeof(profile_t*)*NJ->maxnodes);
9466 fprintf(stderr,"FreeUpProfiles -- freed %d\n", nUsed);
9470 int *PathToRoot(NJ_t *NJ, int node, /*OUT*/int *outDepth) {
9471 int *pathToRoot = (int*)mymalloc(sizeof(int)*NJ->maxnodes);
9473 int ancestor = node;
9474 while(ancestor >= 0) {
9475 pathToRoot[depth] = ancestor;
9476 ancestor = NJ->parent[ancestor];
9483 int *FreePath(int *path, NJ_t *NJ) {
9484 myfree(path, sizeof(int)*NJ->maxnodes);
9488 transition_matrix_t *CreateGTR(double *r/*ac ag at cg ct gt*/, double *f/*acgt*/) {
9489 double matrix[4][MAXCODES];
9492 /* Place rates onto a symmetric matrix, but correct by f(target), so that
9493 stationary distribution f[] is maintained
9494 Leave diagonals as 0 (CreateTransitionMatrix will fix them)
9497 for (i = 0; i < nCodes; i++) {
9499 for (j = i+1; j < nCodes; j++) {
9500 double rate = r[imat++];
9502 /* Want t(matrix) * f to be 0 */
9503 matrix[i][j] = rate * f[i];
9504 matrix[j][i] = rate * f[j];
9507 /* Compute average mutation rate */
9508 double total_rate = 0;
9509 for (i = 0; i < nCodes; i++)
9510 for (j = 0; j < nCodes; j++)
9511 total_rate += f[i] * matrix[i][j];
9512 assert(total_rate > 1e-6);
9513 double inv = 1.0/total_rate;
9514 for (i = 0; i < nCodes; i++)
9515 for (j = 0; j < nCodes; j++)
9516 matrix[i][j] *= inv;
9517 return(CreateTransitionMatrix(matrix,f));
9520 transition_matrix_t *CreateTransitionMatrix(/*IN*/double matrix[MAXCODES][MAXCODES],
9521 /*IN*/double stat[MAXCODES]) {
9523 transition_matrix_t *transmat = mymalloc(sizeof(transition_matrix_t));
9524 double sqrtstat[20];
9525 for (i = 0; i < nCodes; i++) {
9526 transmat->stat[i] = stat[i];
9527 transmat->statinv[i] = 1.0/stat[i];
9528 sqrtstat[i] = sqrt(stat[i]);
9531 double sym[20*20]; /* symmetrized matrix M' */
9532 /* set diagonals so columns sums are 0 before symmetrization */
9533 for (i = 0; i < nCodes; i++)
9534 for (j = 0; j < nCodes; j++)
9535 sym[nCodes*i+j] = matrix[i][j];
9536 for (j = 0; j < nCodes; j++) {
9538 sym[nCodes*j+j] = 0;
9539 for (i = 0; i < nCodes; i++)
9540 sum += sym[nCodes*i+j];
9541 sym[nCodes*j+j] = -sum;
9543 /* M' = S**-1 M S */
9544 for (i = 0; i < nCodes; i++)
9545 for (j = 0; j < nCodes; j++)
9546 sym[nCodes*i+j] *= sqrtstat[j]/sqrtstat[i];
9548 /* eigen decomposition of M' -- note that eigenW is the transpose of what we want,
9549 which is eigenvectors in columns */
9550 double eigenW[20*20], eval[20], e[20];
9551 for (i = 0; i < nCodes*nCodes; i++)
9553 tred2(eigenW, nCodes, nCodes, eval, e);
9554 tqli(eval, e, nCodes , nCodes, eigenW);
9556 /* save eigenvalues */
9557 for (i = 0; i < nCodes; i++)
9558 transmat->eigenval[i] = eval[i];
9560 /* compute eigen decomposition of M into t(codeFreq): V = S*W */
9561 /* compute inverse of V in eigeninv: V**-1 = t(W) S**-1 */
9562 for (i = 0; i < nCodes; i++) {
9563 for (j = 0; j < nCodes; j++) {
9564 transmat->eigeninv[i][j] = eigenW[nCodes*i+j] / sqrtstat[j];
9565 transmat->eigeninvT[j][i] = transmat->eigeninv[i][j];
9568 for (i = 0; i < nCodes; i++)
9569 for (j = 0; j < nCodes; j++)
9570 transmat->codeFreq[i][j] = eigenW[j*nCodes+i] * sqrtstat[i];
9571 /* codeFreq[NOCODE] is the rotation of (1,1,...) not (1/nCodes,1/nCodes,...), which
9572 gives correct posterior probabilities
9574 for (j = 0; j < nCodes; j++) {
9575 transmat->codeFreq[NOCODE][j] = 0.0;
9576 for (i = 0; i < nCodes; i++)
9577 transmat->codeFreq[NOCODE][j] += transmat->codeFreq[i][j];
9579 /* save some posterior probabilities for approximating later:
9580 first, we compute P(B | A, t) for t = approxMLnearT, by using
9581 V * exp(L*t) * V**-1 */
9582 double expvalues[MAXCODES];
9583 for (i = 0; i < nCodes; i++)
9584 expvalues[i] = exp(approxMLnearT * transmat->eigenval[i]);
9585 double LVinv[MAXCODES][MAXCODES]; /* exp(L*t) * V**-1 */
9586 for (i = 0; i < nCodes; i++) {
9587 for (j = 0; j < nCodes; j++)
9588 LVinv[i][j] = transmat->eigeninv[i][j] * expvalues[i];
9590 /* matrix transform for converting A -> B given t: transt[i][j] = P(j->i | t) */
9591 double transt[MAXCODES][MAXCODES];
9592 for (i = 0; i < nCodes; i++) {
9593 for (j = 0; j < nCodes; j++) {
9595 for (k = 0; k < nCodes; k++)
9596 transt[i][j] += transmat->codeFreq[i][k] * LVinv[k][j];
9599 /* nearP[i][j] = P(parent = j | both children are i) = P(j | i,i) ~ stat(j) * P(j->i | t)**2 */
9600 for (i = 0; i < nCodes; i++) {
9601 double nearP[MAXCODES];
9603 for (j = 0; j < nCodes; j++) {
9604 assert(transt[j][i] > 0);
9605 assert(transmat->stat[j] > 0);
9606 nearP[j] = transmat->stat[j] * transt[i][j] * transt[i][j];
9610 for (j = 0; j < nCodes; j++)
9611 nearP[j] *= 1.0/tot;
9612 /* save nearP in transmat->nearP[i][] */
9613 for (j = 0; j < nCodes; j++)
9614 transmat->nearP[i][j] = nearP[j];
9615 /* multiply by 1/stat and rotate nearP */
9616 for (j = 0; j < nCodes; j++)
9617 nearP[j] /= transmat->stat[j];
9618 for (j = 0; j < nCodes; j++) {
9620 for (k = 0; k < nCodes; k++)
9621 rot += nearP[k] * transmat->codeFreq[i][j];
9622 transmat->nearFreq[i][j] = rot;
9629 distance_matrix_t *TransMatToDistanceMat(transition_matrix_t *transmat) {
9630 if (transmat == NULL)
9632 distance_matrix_t *dmat = mymalloc(sizeof(distance_matrix_t));
9634 for (i=0; i<nCodes; i++) {
9635 for (j=0; j<nCodes; j++) {
9636 dmat->distances[i][j] = 0; /* never actually used */
9637 dmat->eigeninv[i][j] = transmat->eigeninv[i][j];
9638 dmat->codeFreq[i][j] = transmat->codeFreq[i][j];
9641 /* eigentot . rotated-vector is the total frequency of the unrotated vector
9642 (used to normalize in NormalizeFreq()
9643 For transition matrices, we rotate by transpose of eigenvectors, so
9644 we need to multiply by the inverse matrix by 1....1 to get this vector,
9645 or in other words, sum the columns
9647 for(i = 0; i<nCodes; i++) {
9648 dmat->eigentot[i] = 0.0;
9649 for (j = 0; j<nCodes; j++)
9650 dmat->eigentot[i] += transmat->eigeninv[i][j];
9655 /* Numerical recipes code for eigen decomposition (actually taken from RAxML rev_functions.c) */
9656 void tred2 (double *a, const int n, const int np, double *d, double *e)
9658 #define a(i,j) a[(j-1)*np + (i-1)]
9662 double f, g, h, hh, scale;
9663 for (i = n; i > 1; i--) {
9668 for ( k = 1; k <= l; k++ )
9669 scale += fabs(a(i,k));
9673 for (k = 1; k <= l; k++) {
9675 h += a(i,k) * a(i,k);
9684 for (j = 1; j <=l ; j++) {
9685 a(j,i) = a(i,j) / h;
9687 for (k = 1; k <= j; k++)
9689 for (k = j+1; k <= l; k++)
9695 for (j = 1; j <= l; j++) {
9699 for (k = 1; k <= j; k++)
9700 a(j,k) -= f*e(k) + g*a(i,k);
9709 for (i = 1; i <= n; i++) {
9712 for (j = 1; j <=l; j++) {
9714 for (k = 1; k <= l; k++)
9716 for (k=1; k <=l; k++)
9717 a(k,j) -= g * a(k,i);
9722 for (j=1; j<=l; j++)
9723 a(i,j) = a(j,i) = 0;
9732 double pythag(double a, double b) {
9733 double absa = fabs(a), absb = fabs(b);
9734 return (absa > absb) ?
9735 absa * sqrt(1+ (absb/absa)*(absb/absa)) :
9738 absb * sqrt(1+ (absa/absb)*(absa/absb));
9741 void tqli(double *d, double *e, int n, int np, double *z)
9743 #define z(i,j) z[(j-1)*np + (i-1)]
9747 int i = 0, iter = 0, k = 0, l = 0, m = 0;
9748 double b = 0, c = 0, dd = 0, f = 0, g = 0, p = 0, r = 0, s = 0;
9754 for (l = 1; l <= n; l++)
9759 for (m = l; (m < n); m++)
9761 dd = fabs(d(m))+fabs(d(m+1));
9763 if (fabs(e(m))+dd == dd)
9772 g = (d(l+1)-d(l))/(2*e(l));
9774 g = d(m)-d(l)+e(l)/(g+(g<0?-r:r));
9779 for (i = m-1; i>=l; i--)
9796 r = (d(i)-g)*s + 2*c*b;
9800 for (k=1; k <= n; k++)
9803 z(k,i+1) = s * z(k,i) + c*f;
9804 z(k,i) = c * z(k,i) - s*f;
9823 inline float mm_sum(register __m128 sum) {
9825 /* stupider but faster */
9827 _mm_store_ps(f,sum);
9828 return(f[0]+f[1]+f[2]+f[3]);
9830 /* first we get sum[0]+sum[1], sum[2]+sum[3] by selecting 0/1 and 2/3 */
9831 sum = _mm_add_ps(sum,_mm_shuffle_ps(sum,sum,_MM_SHUFFLE(0,1,2,3)));
9832 /* then get sum[0]+sum[1]+sum[2]+sum[3] by selecting 0/1 and 0/1 */
9833 sum = _mm_add_ps(sum,_mm_shuffle_ps(sum,sum,_MM_SHUFFLE(0,1,0,1)));
9835 _mm_store_ss(&f, sum); /* save the lowest word */
9841 void vector_multiply(/*IN*/numeric_t *f1, /*IN*/numeric_t *f2, int n, /*OUT*/numeric_t *fOut) {
9844 for (i = 0; i < n; i += 4) {
9846 a = _mm_load_ps(f1+i);
9847 b = _mm_load_ps(f2+i);
9848 c = _mm_mul_ps(a, b);
9849 _mm_store_ps(fOut+i,c);
9853 for (i = 0; i < n; i++)
9854 fOut[i] = f1[i]*f2[i];
9858 numeric_t vector_multiply_sum(/*IN*/numeric_t *f1, /*IN*/numeric_t *f2, int n) {
9861 return(f1[0]*f2[0]+f1[1]*f2[1]+f1[2]*f2[2]+f1[3]*f2[3]);
9862 __m128 sum = _mm_setzero_ps();
9864 for (i = 0; i < n; i += 4) {
9866 a = _mm_load_ps(f1+i);
9867 b = _mm_load_ps(f2+i);
9868 c = _mm_mul_ps(a, b);
9869 sum = _mm_add_ps(c, sum);
9871 return(mm_sum(sum));
9874 numeric_t out = 0.0;
9875 for (i=0; i < n; i++)
9882 numeric_t vector_multiply3_sum(/*IN*/numeric_t *f1, /*IN*/numeric_t *f2, /*IN*/numeric_t* f3, int n) {
9884 __m128 sum = _mm_setzero_ps();
9886 for (i = 0; i < n; i += 4) {
9888 a1 = _mm_load_ps(f1+i);
9889 a2 = _mm_load_ps(f2+i);
9890 a3 = _mm_load_ps(f3+i);
9891 sum = _mm_add_ps(_mm_mul_ps(_mm_mul_ps(a1,a2),a3),sum);
9893 return(mm_sum(sum));
9896 numeric_t sum = 0.0;
9897 for (i = 0; i < n; i++)
9898 sum += f1[i]*f2[i]*f3[i];
9903 numeric_t vector_dot_product_rot(/*IN*/numeric_t *f1, /*IN*/numeric_t *f2, /*IN*/numeric_t *fBy, int n) {
9905 __m128 sum1 = _mm_setzero_ps();
9906 __m128 sum2 = _mm_setzero_ps();
9908 for (i = 0; i < n; i += 4) {
9910 a1 = _mm_load_ps(f1+i);
9911 a2 = _mm_load_ps(f2+i);
9912 aBy = _mm_load_ps(fBy+i);
9913 sum1 = _mm_add_ps(_mm_mul_ps(a1, aBy), sum1);
9914 sum2 = _mm_add_ps(_mm_mul_ps(a2, aBy), sum2);
9916 return(mm_sum(sum1)*mm_sum(sum2));
9919 numeric_t out1 = 0.0;
9920 numeric_t out2 = 0.0;
9921 for (i=0; i < n; i++) {
9922 out1 += f1[i]*fBy[i];
9923 out2 += f2[i]*fBy[i];
9929 numeric_t vector_sum(/*IN*/numeric_t *f1, int n) {
9932 return(f1[0]+f1[1]+f1[2]+f1[3]);
9933 __m128 sum = _mm_setzero_ps();
9935 for (i = 0; i < n; i+=4) {
9937 a = _mm_load_ps(f1+i);
9938 sum = _mm_add_ps(a, sum);
9940 return(mm_sum(sum));
9942 numeric_t out = 0.0;
9944 for (i = 0; i < n; i++)
9950 void vector_multiply_by(/*IN/OUT*/numeric_t *f, /*IN*/numeric_t fBy, int n) {
9953 __m128 c = _mm_set1_ps(fBy);
9954 for (i = 0; i < n; i += 4) {
9956 a = _mm_load_ps(f+i);
9957 b = _mm_mul_ps(a,c);
9958 _mm_store_ps(f+i,b);
9961 for (i = 0; i < n; i++)
9966 void vector_add_mult(/*IN/OUT*/numeric_t *fTot, /*IN*/numeric_t *fAdd, numeric_t weight, int n) {
9969 __m128 w = _mm_set1_ps(weight);
9970 for (i = 0; i < n; i += 4) {
9972 tot = _mm_load_ps(fTot+i);
9973 add = _mm_load_ps(fAdd+i);
9974 _mm_store_ps(fTot+i, _mm_add_ps(tot, _mm_mul_ps(add,w)));
9978 for (i = 0; i < n; i++)
9979 fTot[i] += fAdd[i] * weight;
9983 void matrixt_by_vector4(/*IN*/numeric_t mat[4][MAXCODES], /*IN*/numeric_t vec[4], /*OUT*/numeric_t out[4]) {
9985 /*__m128 v = _mm_load_ps(vec);*/
9986 __m128 o = _mm_setzero_ps();
9988 /* result is a sum of vectors: sum(k) v[k] * mat[k][] */
9989 for (j = 0; j < 4; j++) {
9990 __m128 m = _mm_load_ps(&mat[j][0]);
9991 __m128 vj = _mm_load1_ps(&vec[j]); /* is it faster to shuffle v? */
9992 o = _mm_add_ps(o, _mm_mul_ps(vj,m));
9994 _mm_store_ps(out, o);
9997 for (j = 0; j < 4; j++) {
9999 for (k = 0; k < 4; k++)
10000 sum += vec[k] * mat[k][j];
10006 distance_matrix_t matrixBLOSUM45 =
10010 {0, 1.31097856157468, 1.06573001937323, 1.2682782988532, 0.90471293383305, 1.05855446876905, 1.05232790675508, 0.769574440593014, 1.27579668305679, 0.964604099952603, 0.987178199640556, 1.05007594438157, 1.05464162250736, 1.1985987403937, 0.967404475245526, 0.700490199584332, 0.880060189098976, 1.09748548316685, 1.28141710375267, 0.800038509951648},
10011 {1.31097856157468, 0, 0.8010890222701, 0.953340718498495, 1.36011107208122, 0.631543775840481, 0.791014908659279, 1.15694899265629, 0.761152570032029, 1.45014917711188, 1.17792001455227, 0.394661075648738, 0.998807558909651, 1.135143404599, 1.15432562628921, 1.05309036790541, 1.05010474413616, 1.03938321130789, 0.963216908696184, 1.20274751778601},
10012 {1.06573001937323, 0.8010890222701, 0, 0.488217214273568, 1.10567116937273, 0.814970207038261, 0.810176440932339, 0.746487413974582, 0.61876156253224, 1.17886558630004, 1.52003670190022, 0.808442678243754, 1.2889025816028, 1.16264109995678, 1.18228799147301, 0.679475681649858, 0.853658619686283, 1.68988558988005, 1.24297493464833, 1.55207513886163},
10013 {1.2682782988532, 0.953340718498495, 0.488217214273568, 0, 1.31581050011876, 0.769778474953791, 0.482077627352988, 0.888361752320536, 0.736360849050364, 1.76756333403346, 1.43574761894039, 0.763612910719347, 1.53386612356483, 1.74323672079854, 0.886347403928663, 0.808614044804528, 1.01590147813779, 1.59617804551619, 1.1740494822217, 1.46600946033173},
10014 {0.90471293383305, 1.36011107208122, 1.10567116937273, 1.31581050011876, 0, 1.3836789310481, 1.37553994252576, 1.26740695314856, 1.32361065635259, 1.26087264215993, 1.02417540515351, 1.37259631233791, 1.09416720447891, 0.986982088723923, 1.59321190226694, 0.915638787768407, 0.913042853922533, 1.80744143643002, 1.3294417177004, 0.830022143283238},
10015 {1.05855446876905, 0.631543775840481, 0.814970207038261, 0.769778474953791, 1.3836789310481, 0, 0.506942797642807, 1.17699648087288, 0.614595446514896, 1.17092829494457, 1.19833088638994, 0.637341078675405, 0.806490842729072, 1.83315144709714, 0.932064479113502, 0.850321696813199, 1.06830084665916, 1.05739353225849, 0.979907428113788, 1.5416250309563},
10016 {1.05232790675508, 0.791014908659279, 0.810176440932339, 0.482077627352988, 1.37553994252576, 0.506942797642807, 0, 1.17007322676118, 0.769786956320484, 1.46659942462342, 1.19128214039009, 0.633592151371708, 1.27269395724349, 1.44641491621774, 0.735428579892476, 0.845319988414402, 1.06201695511881, 1.324395996498, 1.22734387448031, 1.53255698189437},
10017 {0.769574440593014, 1.15694899265629, 0.746487413974582, 0.888361752320536, 1.26740695314856, 1.17699648087288, 1.17007322676118, 0, 1.1259007054424, 1.7025415585924, 1.38293205218175, 1.16756929156758, 1.17264582493965, 1.33271035269688, 1.07564768421292, 0.778868281341681, 1.23287107008366, 0.968539655354582, 1.42479529031801, 1.41208067821187},
10018 {1.27579668305679, 0.761152570032029, 0.61876156253224, 0.736360849050364, 1.32361065635259, 0.614595446514896, 0.769786956320484, 1.1259007054424, 0, 1.4112324673522, 1.14630894167097, 0.967795284542623, 0.771479459384692, 1.10468029976148, 1.12334774065132, 1.02482926701639, 1.28754326478771, 1.27439749294131, 0.468683841672724, 1.47469999960758},
10019 {0.964604099952603, 1.45014917711188, 1.17886558630004, 1.76756333403346, 1.26087264215993, 1.17092829494457, 1.46659942462342, 1.7025415585924, 1.4112324673522, 0, 0.433350517223017, 1.463460928818, 0.462965544381851, 0.66291968000662, 1.07010201755441, 1.23000200130049, 0.973485453109068, 0.963546200571036, 0.708724769805536, 0.351200119909572},
10020 {0.987178199640556, 1.17792001455227, 1.52003670190022, 1.43574761894039, 1.02417540515351, 1.19833088638994, 1.19128214039009, 1.38293205218175, 1.14630894167097, 0.433350517223017, 0, 1.49770950074319, 0.473800072611076, 0.538473125003292, 1.37979627224964, 1.5859723170438, 0.996267398224516, 0.986095542821092, 0.725310666139274, 0.570542199221932},
10021 {1.05007594438157, 0.394661075648738, 0.808442678243754, 0.763612910719347, 1.37259631233791, 0.637341078675405, 0.633592151371708, 1.16756929156758, 0.967795284542623, 1.463460928818, 1.49770950074319, 0, 1.0079761868248, 1.44331961488922, 0.924599080166146, 1.06275728888356, 1.05974425835993, 1.04892430642749, 0.972058829603409, 1.21378822764856},
10022 {1.05464162250736, 0.998807558909651, 1.2889025816028, 1.53386612356483, 1.09416720447891, 0.806490842729072, 1.27269395724349, 1.17264582493965, 0.771479459384692, 0.462965544381851, 0.473800072611076, 1.0079761868248, 0, 0.72479754849538, 1.1699868662153, 1.34481214251794, 1.06435197383538, 1.05348497728858, 0.774878150710318, 0.609532859331199},
10023 {1.1985987403937, 1.135143404599, 1.16264109995678, 1.74323672079854, 0.986982088723923, 1.83315144709714, 1.44641491621774, 1.33271035269688, 1.10468029976148, 0.66291968000662, 0.538473125003292, 1.44331961488922, 0.72479754849538, 0, 1.32968844979665, 1.21307373491949, 0.960087571600877, 0.475142555482979, 0.349485367759138, 0.692733248746636},
10024 {0.967404475245526, 1.15432562628921, 1.18228799147301, 0.886347403928663, 1.59321190226694, 0.932064479113502, 0.735428579892476, 1.07564768421292, 1.12334774065132, 1.07010201755441, 1.37979627224964, 0.924599080166146, 1.1699868662153, 1.32968844979665, 0, 0.979087429691819, 0.97631161216338, 1.21751652292503, 1.42156458605332, 1.40887880416009},
10025 {0.700490199584332, 1.05309036790541, 0.679475681649858, 0.808614044804528, 0.915638787768407, 0.850321696813199, 0.845319988414402, 0.778868281341681, 1.02482926701639, 1.23000200130049, 1.5859723170438, 1.06275728888356, 1.34481214251794, 1.21307373491949, 0.979087429691819, 0, 0.56109848274013, 1.76318885009194, 1.29689226231656, 1.02015839286433},
10026 {0.880060189098976, 1.05010474413616, 0.853658619686283, 1.01590147813779, 0.913042853922533, 1.06830084665916, 1.06201695511881, 1.23287107008366, 1.28754326478771, 0.973485453109068, 0.996267398224516, 1.05974425835993, 1.06435197383538, 0.960087571600877, 0.97631161216338, 0.56109848274013, 0, 1.39547634461879, 1.02642577026706, 0.807404666228614},
10027 {1.09748548316685, 1.03938321130789, 1.68988558988005, 1.59617804551619, 1.80744143643002, 1.05739353225849, 1.324395996498, 0.968539655354582, 1.27439749294131, 0.963546200571036, 0.986095542821092, 1.04892430642749, 1.05348497728858, 0.475142555482979, 1.21751652292503, 1.76318885009194, 1.39547634461879, 0, 0.320002937404137, 1.268589159299},
10028 {1.28141710375267, 0.963216908696184, 1.24297493464833, 1.1740494822217, 1.3294417177004, 0.979907428113788, 1.22734387448031, 1.42479529031801, 0.468683841672724, 0.708724769805536, 0.725310666139274, 0.972058829603409, 0.774878150710318, 0.349485367759138, 1.42156458605332, 1.29689226231656, 1.02642577026706, 0.320002937404137, 0, 0.933095433689795},
10029 {0.800038509951648, 1.20274751778601, 1.55207513886163, 1.46600946033173, 0.830022143283238, 1.5416250309563, 1.53255698189437, 1.41208067821187, 1.47469999960758, 0.351200119909572, 0.570542199221932, 1.21378822764856, 0.609532859331199, 0.692733248746636, 1.40887880416009, 1.02015839286433, 0.807404666228614, 1.268589159299, 0.933095433689795, 0}
10033 {-0.216311217101265, -0.215171653035930, -0.217000020881064, -0.232890860601250, -0.25403526530177, -0.211569372858927, -0.218073620637049, -0.240585637190076, -0.214507049619293, -0.228476323330312, -0.223235445346107, -0.216116483840334, -0.206903836810903, -0.223553828183343, -0.236937609127783, -0.217652789023588, -0.211982652566286, -0.245995223308316, -0.206187718714279, -0.227670670439422},
10034 {-0.0843931919568687, -0.0342164464991033, 0.393702284928246, -0.166018266253027, 0.0500896782860136, -0.262731388032538, 0.030139964190519, -0.253997503551094, -0.0932603349591988, -0.32884667697173, 0.199966846276877, -0.117543453869516, 0.196248237055757, -0.456448703853250, 0.139286961076387, 0.241166801918811, -0.0783508285295053, 0.377438091416498, 0.109499076984234, 0.128581669647144},
10035 {-0.0690428674271772, 0.0133858672878363, -0.208289917312908, 0.161232925220819, 0.0735806288007248, -0.316269599838174, -0.0640708424745702, -0.117078801507436, 0.360805085405857, 0.336899760384943, 0.0332447078185156, 0.132954055834276, 0.00595209121998118, -0.157755611190327, -0.199839273133436, 0.193688928807663, 0.0970290928040946, 0.374683975138541, -0.478110944870958, -0.243290196936098},
10036 {0.117284581850481, 0.310399467781876, -0.143513477698805, 0.088808130300351, 0.105747812943691, -0.373871701179853, 0.189069306295134, 0.133258225034741, -0.213043549687694, 0.301303731259140, -0.182085224761849, -0.161971915020789, 0.229301173581378, -0.293586313243755, -0.0260480060747498, -0.0217953684540699, 0.0202675755458796, -0.160134624443657, 0.431950096999465, -0.329885160320501},
10037 {0.256496969244703, 0.0907408349583135, 0.0135731083898029, 0.477557831930769, -0.0727379669280703, 0.101732675207959, -0.147293025369251, -0.348325291603251, -0.255678082078362, -0.187092643740172, -0.177164064346593, -0.225921480146133, 0.422318841046522, 0.319959853469398, -0.0623652546300045, 0.0824203908606883, -0.102057926881110, 0.120728407576411, -0.156845807891241, -0.123528163091204},
10038 {-0.00906668858975576, -0.0814722888231236, -0.0762715085459023, 0.055819989938286, -0.0540516675257271, -0.0070589302769034, -0.315813159989213, -0.0103527463419808, -0.194634331372293, -0.0185860407566822, 0.50134169352609, 0.384531812730061, -0.0405008616742061, 0.0781033650669525, 0.069334900096687, 0.396455180448549, -0.204065801866462, -0.215272089630713, 0.171046818996465, -0.396393364716348},
10039 {0.201971098571663, 0.489747667606921, 0.00226258734592836, 0.0969514005747054, 0.0853921636903791, 0.0862068740282345, -0.465412154271164, -0.130516676347786, 0.165513616974634, 0.0712238027886633, 0.140746943067963, -0.325919272273406, -0.421213488261598, -0.163508199065965, 0.269695802810568, -0.110296405171437, -0.106834099902202, 0.00509414588152415, 0.00909215239544615, 0.0500401865589727},
10040 {0.515854176692456, -0.087468413428258, 0.102796468891449, -0.06046105990993, -0.212014383772414, -0.259853648383794, -0.0997372883043333, -0.109934574535736, 0.284891018406112, -0.250578342940183, 0.142174204994568, 0.210384918947619, 0.118803190788946, -0.0268434355996836, 0.0103721198836548, -0.355555176478458, 0.428042332431476, -0.150610175411631, 0.0464090887952940, -0.140238796382057},
10041 {-0.239392215229762, -0.315483492656425, 0.100205194952396, 0.197830195325302, 0.40178804665223, 0.195809461460298, -0.407817115321684, 0.0226836686147386, -0.169780276210306, 0.0818161585952184, -0.172886230584939, 0.174982644851064, 0.0868786992159535, -0.198450519980824, 0.168581078329968, -0.361514336004068, 0.238668430084722, 0.165494019791904, 0.110437707249228, -0.169592003035203},
10042 {-0.313151735678025, 0.10757884850664, -0.49249098807229, 0.0993472335619114, -0.148695715250836, 0.0573801136941699, -0.190040373500722, 0.254848437434773, 0.134147888304352, -0.352719341442756, 0.0839609323513986, -0.207904182300122, 0.253940523323376, -0.109832138553288, 0.0980084518687944, 0.209026594443723, 0.406236051871548, -0.0521120230935943, 0.0554108014592302, 0.134681046631955},
10043 {-0.102905214421384, 0.235803606800009, 0.213414976431981, -0.253606415825635, 0.00945656859370683, 0.259551282655855, 0.159527348902192, 0.083218761193016, -0.286815935191867, 0.0135069477264877, 0.336758103107357, -0.271707359524149, -0.0400009875851839, 0.0871186292716414, -0.171506310409388, -0.0954276577211755, 0.393467571460712, 0.111732846649458, -0.239886066474217, -0.426474828195231},
10044 {-0.0130795552324104, 0.0758967690968058, -0.165099404017689, -0.46035152559912, 0.409888158016031, -0.0235053940299396, 0.0699393201709723, -0.161320910316996, 0.226111732196825, -0.177811841258496, -0.219073917645916, -0.00703219376737286, 0.162831878334912, 0.271670554900684, 0.451033612762052, 0.0820942662443393, -0.0904983490498446, -0.0587000279313978, -0.0938852980928252, -0.306078621571843},
10045 {0.345092040577428, -0.257721588971295, -0.301689123771848, -0.0875212184538126, 0.161012613069275, 0.385104899829821, 0.118355290985046, -0.241723794416731, 0.083201920119646, -0.0809095291508749, -0.0820275390511991, -0.115569770103317, -0.250105681098033, -0.164197583037664, -0.299481453795592, 0.255906951902366, 0.129042051416371, 0.203761730442746, 0.347550071284268, -0.109264854744020},
10046 {0.056345924962239, 0.072536751679082, 0.303127492633681, -0.368877185781648, -0.343024497082421, 0.206879529669083, -0.413012709639426, 0.078538816203612, 0.103382383425097, 0.288319996147499, -0.392663258459423, 0.0319588502083897, 0.220316797792669, -0.0563686494606947, -0.0869286063283735, 0.323677017794391, 0.0984875197088935, -0.0303289828821742, 0.0450197853450979, -0.0261771221270139},
10047 {-0.253701638374729, -0.148922815783583, 0.111794052194159, 0.157313977830326, -0.269846001260543, -0.222989872703583, 0.115441028189268, -0.350456582262355, -0.0409581422905941, 0.174078744248002, -0.130673397086811, -0.123963802708056, -0.351609207081548, 0.281548012920868, 0.340382662112428, 0.180262131025562, 0.3895263830793, 0.0121546812430960, 0.214830943227063, -0.0617782909660214},
10048 {-0.025854479416026, 0.480654788977767, -0.138024550829229, -0.130191670810919, 0.107816875829919, -0.111243997319276, -0.0679814460571245, -0.183167991080677, -0.363355166018786, -0.183934891092050, -0.216097125080962, 0.520240628803255, -0.179616013606479, 0.0664131536100941, -0.178350708111064, 0.0352047611606709, 0.223857228692892, 0.128363679623513, -0.000403433628490731, 0.224972110977704},
10049 {0.159207394033448, -0.0371517305736114, -0.294302634912281, -0.0866954375908417, -0.259998567870054, 0.284966673982689, 0.205356416771391, -0.257613708650298, -0.264820519037270, 0.293359248624603, 0.0997476397434102, 0.151390539497369, 0.165571346773648, -0.347569523551258, 0.43792310820533, -0.0723248163210163, 0.0379214984816955, -0.0542758730251438, -0.258020301801603, 0.128680501102363},
10050 {0.316853842351797, -0.153950010941153, -0.13387065213508, -0.0702971390607613, -0.202558481846057, -0.172941438694837, -0.068882524588574, 0.524738203063889, -0.271670479920716, -0.112864756695310, -0.146831636946145, -0.0352336188578041, -0.211108490884767, 0.097857111349555, 0.276459740956662, 0.0231297536754823, -0.0773173324868396, 0.487208384389438, -0.0734191389266824, -0.113198765573319},
10051 {-0.274285525741087, 0.227334266052039, -0.0973746625709059, -0.00965256583655389, -0.402438444750043, 0.198586229519026, 0.0958135064575833, -0.108934376958686, 0.253641732094319, -0.0551918478254021, 0.0243640218331436, 0.181936272247179, 0.090952738347629, 0.0603352483029044, -0.0043821671755761, -0.347720824658591, -0.267879988539971, 0.403804652116592, 0.337654323971186, -0.241509293972297},
10052 {-0.0197089518344238, 0.139681034626696, 0.251980475788267, 0.341846624362846, -0.075141195125153, 0.2184951591319, 0.268870823491343, 0.150392399018138, 0.134592404015057, -0.337050200539163, -0.313109373497998, 0.201993318439135, -0.217140733851970, -0.337622749083808, 0.135253284365068, 0.181729249828045, -0.00627813335422765, -0.197218833324039, -0.194060005031698, -0.303055888528004}
10056 20.29131, 0.5045685, 0.2769945, 0.1551147, 0.03235484, -0.04127639, -0.3516426, -0.469973, -0.5835191, -0.6913107, -0.7207972, -0.7907875, -0.9524307, -1.095310, -1.402153, -1.424179, -1.936704, -2.037965, -3.273561, -5.488734
10058 /*eigentot and codeFreq left out, these are initialized elsewhere*/
10061 /* The JTT92 matrix, D. T. Jones, W. R. Taylor, & J. M. Thorton, CABIOS 8:275 (1992)
10062 Derived from the PhyML source code (models.c) by filling in the other side of the symmetric matrix,
10063 scaling the entries by the stationary rate (to give the rate of a->b not b|a), to set the diagonals
10064 so the rows sum to 0, to rescale the matrix so that the implied rate of evolution is 1.
10065 The resulting matrix is the transpose (I think).
10070 for (i=0; i<20; i++) for (j=0; j<i; j++) daa[j*20+i] = daa[i*20+j];
10071 for (i = 0; i < 20; i++) for (j = 0; j < 20; j++) daa[i*20+j] *= pi[j] / 100.0;
10072 double mr = 0; /* mean rate */
10073 for (i = 0; i < 20; i++) {
10075 for (j = 0; j < 20; j++)
10076 sum += daa[i*20+j];
10077 daa[i*20+i] = -sum;
10080 for (i = 0; i < 20*20; i++)
10085 double statJTT92[MAXCODES] = {0.07674789,0.05169087,0.04264509,0.05154407,0.01980301,0.04075195,0.06182989,0.07315199,0.02294399,0.05376110,0.09190390,0.05867583,0.02382594,0.04012589,0.05090097,0.06876503,0.05856501,0.01426057,0.03210196,0.06600504};
10086 double matrixJTT92[MAXCODES][MAXCODES] = {
10087 { -1.247831,0.044229,0.041179,0.061769,0.042704,0.043467,0.08007,0.136501,0.02059,0.027453,0.022877,0.02669,0.041179,0.011439,0.14794,0.288253,0.362223,0.006863,0.008388,0.227247 },
10088 { 0.029789,-1.025965,0.023112,0.008218,0.058038,0.159218,0.014895,0.070364,0.168463,0.011299,0.019517,0.33179,0.022599,0.002568,0.038007,0.051874,0.032871,0.064714,0.010272,0.008731 },
10089 { 0.022881,0.019068,-1.280568,0.223727,0.014407,0.03644,0.024576,0.034322,0.165676,0.019915,0.005085,0.11144,0.012712,0.004237,0.006356,0.213134,0.098304,0.00339,0.029661,0.00678 },
10090 { 0.041484,0.008194,0.270413,-1.044903,0.005121,0.025095,0.392816,0.066579,0.05736,0.005634,0.003585,0.013316,0.007682,0.002049,0.007682,0.030217,0.019462,0.002049,0.023559,0.015877 },
10091 { 0.011019,0.022234,0.00669,0.001968,-0.56571,0.001771,0.000984,0.011609,0.013577,0.003345,0.004526,0.001377,0.0061,0.015348,0.002755,0.043878,0.008264,0.022628,0.041124,0.012199 },
10092 { 0.02308,0.125524,0.034823,0.019841,0.003644,-1.04415,0.130788,0.010528,0.241735,0.003644,0.029154,0.118235,0.017411,0.00162,0.066406,0.021461,0.020651,0.007288,0.009718,0.008098 },
10093 { 0.064507,0.017816,0.035632,0.471205,0.003072,0.198435,-0.944343,0.073107,0.015973,0.007372,0.005529,0.111197,0.011058,0.003072,0.011058,0.01843,0.019659,0.006143,0.0043,0.027646 },
10094 { 0.130105,0.099578,0.058874,0.09449,0.042884,0.018898,0.086495,-0.647831,0.016717,0.004361,0.004361,0.019625,0.010176,0.003634,0.017444,0.146096,0.023986,0.039976,0.005815,0.034162 },
10095 { 0.006155,0.074775,0.089138,0.025533,0.01573,0.1361,0.005927,0.005243,-1.135695,0.003648,0.012767,0.010259,0.007523,0.009119,0.026217,0.016642,0.010487,0.001824,0.130629,0.002508 },
10096 { 0.01923,0.011752,0.025106,0.005876,0.009081,0.004808,0.00641,0.003205,0.008547,-1.273602,0.122326,0.011218,0.25587,0.047542,0.005342,0.021367,0.130873,0.004808,0.017094,0.513342 },
10097 { 0.027395,0.0347,0.010958,0.006392,0.021003,0.065748,0.008219,0.005479,0.051137,0.209115,-0.668139,0.012784,0.354309,0.226465,0.093143,0.053877,0.022829,0.047485,0.021916,0.16437 },
10098 { 0.020405,0.376625,0.153332,0.015158,0.004081,0.170239,0.105525,0.015741,0.026235,0.012243,0.008162,-0.900734,0.037896,0.002332,0.012243,0.027401,0.06005,0.00583,0.004664,0.008162 },
10099 { 0.012784,0.010416,0.007102,0.003551,0.007339,0.01018,0.004261,0.003314,0.007812,0.113397,0.091854,0.015388,-1.182051,0.01018,0.003788,0.006865,0.053503,0.005682,0.004261,0.076466 },
10100 { 0.00598,0.001993,0.003987,0.001595,0.031098,0.001595,0.001993,0.001993,0.015948,0.035484,0.098877,0.001595,0.017144,-0.637182,0.006778,0.03668,0.004784,0.021131,0.213701,0.024719 },
10101 { 0.098117,0.037426,0.007586,0.007586,0.007081,0.082944,0.009104,0.012138,0.058162,0.005058,0.051587,0.010621,0.008092,0.008598,-0.727675,0.144141,0.059679,0.003035,0.005058,0.011632 },
10102 { 0.258271,0.069009,0.343678,0.040312,0.152366,0.036213,0.020498,0.137334,0.049878,0.02733,0.040312,0.032113,0.019814,0.06286,0.194728,-1.447863,0.325913,0.023914,0.043045,0.025964 },
10103 { 0.276406,0.037242,0.135003,0.022112,0.02444,0.029677,0.018621,0.019203,0.026768,0.142567,0.014548,0.059936,0.131511,0.006983,0.068665,0.27757,-1.335389,0.006983,0.01222,0.065174 },
10104 { 0.001275,0.017854,0.001134,0.000567,0.016295,0.002551,0.001417,0.007793,0.001134,0.001275,0.007368,0.001417,0.003401,0.00751,0.00085,0.004959,0.0017,-0.312785,0.010061,0.003542 },
10105 { 0.003509,0.006379,0.022328,0.014673,0.066664,0.007655,0.002233,0.002552,0.182769,0.010207,0.007655,0.002552,0.005741,0.170967,0.00319,0.020095,0.006698,0.022647,-0.605978,0.005103 },
10106 { 0.195438,0.011149,0.010493,0.020331,0.040662,0.013117,0.029512,0.030824,0.007214,0.630254,0.11805,0.009182,0.211834,0.040662,0.015084,0.024922,0.073453,0.016396,0.010493,-1.241722 }
10109 double statWAG01[MAXCODES] = {0.0866279,0.043972, 0.0390894,0.0570451,0.0193078,0.0367281,0.0580589,0.0832518,0.0244314,0.048466, 0.086209, 0.0620286,0.0195027,0.0384319,0.0457631,0.0695179,0.0610127,0.0143859,0.0352742,0.0708956};
10110 double matrixWAG01[MAXCODES][MAXCODES] = {
10111 {-1.117151, 0.050147, 0.046354, 0.067188, 0.093376, 0.082607, 0.143908, 0.128804, 0.028817, 0.017577, 0.036177, 0.082395, 0.081234, 0.019138, 0.130789, 0.306463, 0.192846, 0.010286, 0.021887, 0.182381},
10112 {0.025455, -0.974318, 0.029321, 0.006798, 0.024376, 0.140086, 0.020267, 0.026982, 0.098628, 0.008629, 0.022967, 0.246964, 0.031527, 0.004740, 0.031358, 0.056495, 0.025586, 0.053714, 0.017607, 0.011623},
10113 {0.020916, 0.026065, -1.452438, 0.222741, 0.010882, 0.063328, 0.038859, 0.046176, 0.162306, 0.022737, 0.005396, 0.123567, 0.008132, 0.003945, 0.008003, 0.163042, 0.083283, 0.002950, 0.044553, 0.008051},
10114 {0.044244, 0.008819, 0.325058, -0.989665, 0.001814, 0.036927, 0.369645, 0.051822, 0.055719, 0.002361, 0.005077, 0.028729, 0.006212, 0.002798, 0.025384, 0.064166, 0.022443, 0.007769, 0.019500, 0.009120},
10115 {0.020812, 0.010703, 0.005375, 0.000614, -0.487357, 0.002002, 0.000433, 0.006214, 0.005045, 0.003448, 0.007787, 0.001500, 0.007913, 0.008065, 0.002217, 0.028525, 0.010395, 0.014531, 0.011020, 0.020307},
10116 {0.035023, 0.117008, 0.059502, 0.023775, 0.003809, -1.379785, 0.210830, 0.012722, 0.165524, 0.004391, 0.033516, 0.150135, 0.059565, 0.003852, 0.035978, 0.039660, 0.033070, 0.008316, 0.008777, 0.011613},
10117 {0.096449, 0.026759, 0.057716, 0.376214, 0.001301, 0.333275, -1.236894, 0.034593, 0.034734, 0.007763, 0.009400, 0.157479, 0.019202, 0.004944, 0.041578, 0.042955, 0.050134, 0.009540, 0.011961, 0.035874},
10118 {0.123784, 0.051085, 0.098345, 0.075630, 0.026795, 0.028838, 0.049604, -0.497615, 0.021792, 0.002661, 0.005356, 0.032639, 0.015212, 0.004363, 0.021282, 0.117240, 0.019732, 0.029444, 0.009052, 0.016361},
10119 {0.008127, 0.054799, 0.101443, 0.023863, 0.006384, 0.110105, 0.014616, 0.006395, -0.992342, 0.003543, 0.012807, 0.022832, 0.010363, 0.017420, 0.017851, 0.018979, 0.012136, 0.006733, 0.099319, 0.003035},
10120 {0.009834, 0.009511, 0.028192, 0.002006, 0.008654, 0.005794, 0.006480, 0.001549, 0.007029, -1.233162, 0.161294, 0.016472, 0.216559, 0.053891, 0.005083, 0.016249, 0.074170, 0.010808, 0.021372, 0.397837},
10121 {0.036002, 0.045028, 0.011900, 0.007673, 0.034769, 0.078669, 0.013957, 0.005547, 0.045190, 0.286902, -0.726011, 0.023303, 0.439180, 0.191376, 0.037625, 0.031191, 0.029552, 0.060196, 0.036066, 0.162890},
10122 {0.058998, 0.348377, 0.196082, 0.031239, 0.004820, 0.253558, 0.168246, 0.024319, 0.057967, 0.021081, 0.016767, -1.124580, 0.060821, 0.005783, 0.036254, 0.062960, 0.090292, 0.008952, 0.008675, 0.019884},
10123 {0.018288, 0.013983, 0.004057, 0.002124, 0.007993, 0.031629, 0.006450, 0.003564, 0.008272, 0.087143, 0.099354, 0.019123, -1.322098, 0.024370, 0.003507, 0.010109, 0.031033, 0.010556, 0.008769, 0.042133},
10124 {0.008490, 0.004143, 0.003879, 0.001885, 0.016054, 0.004030, 0.003273, 0.002014, 0.027402, 0.042734, 0.085315, 0.003583, 0.048024, -0.713669, 0.006512, 0.022020, 0.006934, 0.061698, 0.260332, 0.026213},
10125 {0.069092, 0.032635, 0.009370, 0.020364, 0.005255, 0.044829, 0.032773, 0.011698, 0.033438, 0.004799, 0.019973, 0.026747, 0.008229, 0.007754, -0.605590, 0.077484, 0.038202, 0.006695, 0.010376, 0.015124},
10126 {0.245933, 0.089317, 0.289960, 0.078196, 0.102703, 0.075066, 0.051432, 0.097899, 0.054003, 0.023306, 0.025152, 0.070562, 0.036035, 0.039831, 0.117705, -1.392239, 0.319421, 0.038212, 0.057419, 0.016981},
10127 {0.135823, 0.035501, 0.129992, 0.024004, 0.032848, 0.054936, 0.052685, 0.014461, 0.030308, 0.093371, 0.020915, 0.088814, 0.097083, 0.011008, 0.050931, 0.280341, -1.154973, 0.007099, 0.018643, 0.088894},
10128 {0.001708, 0.017573, 0.001086, 0.001959, 0.010826, 0.003257, 0.002364, 0.005088, 0.003964, 0.003208, 0.010045, 0.002076, 0.007786, 0.023095, 0.002105, 0.007908, 0.001674, -0.466694, 0.037525, 0.005516},
10129 {0.008912, 0.014125, 0.040205, 0.012058, 0.020133, 0.008430, 0.007267, 0.003836, 0.143398, 0.015555, 0.014757, 0.004934, 0.015861, 0.238943, 0.007998, 0.029135, 0.010779, 0.092011, -0.726275, 0.011652},
10130 {0.149259, 0.018739, 0.014602, 0.011335, 0.074565, 0.022417, 0.043805, 0.013932, 0.008807, 0.581952, 0.133956, 0.022726, 0.153161, 0.048356, 0.023429, 0.017317, 0.103293, 0.027186, 0.023418, -1.085487},