X-Git-Url: https://git.donarmstrong.com/?a=blobdiff_plain;f=NEWS;h=82646ba812fa66f90c1ca4cf2473645f87809b31;hb=ac32543c12eac42f12e0c66b78c3d53211caeaab;hp=b8b8bc74def5e030795f642c38340a07d664b82e;hpb=bea524c4fd680119bdc118ec07ebe89722e2e697;p=samtools.git
diff --git a/NEWS b/NEWS
index b8b8bc7..82646ba 100644
--- a/NEWS
+++ b/NEWS
@@ -1,4 +1,325 @@
-Beta Release 0.1.3 (XX April, 2009)
+Beta Release 0.1.9 (27 October, 2010)
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+This release features the first major improvement to the samtools
+SNP caller. It comes with a revised MAQ error model, support for
+multi-sample SNP calling and the computation of base alignment quality
+(BAQ).
+
+The revised MAQ error model is based on the original model. It solves an
+issue of miscalling SNPs in repetitive regions. Although such SNPs can
+usually be filtered at a later step, they mess up unfiltered calls. This
+is a theoretical flaw in the original model. The revised MAQ model
+deprecates the original MAQ model and the simplified SOAPsnp model.
+
+Multi-sample SNP calling is separated into two steps. The first is done by
+samtools mpileup and the second by a new program, bcftools, which is
+included in the samtools source code tree. Multi-sample SNP calling also
+works for a single sample and has the advantage of enabling more powerful
+filtration. It is likely to deprecate pileup in the future once a proper
+indel calling method is implemented.
+
+BAQ is the Phred-scaled probability of a read base being wrongly
+aligned. Capping base quality by BAQ has been shown to be very effective
+in suppressing false SNPs caused by misalignments around indels or in
+low-complexity regions with an acceptable compromise on computation
+time. This strategy is highly recommended and can be used with other SNP
+callers as well.
+
+In addition to the three major improvements, other notable changes are:
+
+ * Changes to the pileup format. A reference skip (the N CIGAR operator)
+   is shown as '<' or '>' depending on the strand. Tview is also changed
+   accordingly.
+
+ * Accelerated pileup. The plain pileup is about 50% faster.
+
+ * Regional merge. The merge command now accepts a new option to merge
+   files in a specified region.
+
+ * Fixed a bug in bgzip and razip which causes source files to be
+   deleted even if option -c is applied.
+
+ * In APIs, propagate errors to downstream callers and make samtools
+   return non-zero values once errors occur.
+
+(0.1.9: 27 October 2010, r783)
+
+
+
+Beta Release 0.1.8 (11 July, 2010)
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Notable functional changes:
+
+ * Added the `reheader' command which replaces a BAM header with a new
+   header. This command is much faster than replacing the header by
+   BAM->SAM->BAM conversions.
+
+ * Added the `mpileup' command which computes the pileup of multiple
+   alignments.
+
+ * The `index' command now stores the number of mapped and unmapped
+   reads in the index file. This information can be retrieved quickly by
+   the new `idxstats' command.
+
+ * By default, pileup uses the SOAPsnp model for SNP calling. This
+   avoids the floating overflow in the MAQ model which leads to spurious
+   calls in repetitive regions, although these calls will be immediately
+   filtered by varFilter.
+
+ * The `tview' command now correctly handles CIGARs like 7I10M and
+   10M1P1I10M which caused assertion failures in earlier versions.
+
+ * Tview accepts a region like `=10,000' where `=' stands for the
+   current sequence name. This saves typing for long sequence names.
+
+ * Added the `-d' option to `pileup' which avoids slow indel calling
+   in ultradeep regions by subsampling reads locally.
+
+ * Added the `-R' option to `view' which retrieves alignments in read
+   groups listed in the specified file.
+
+Performance improvements:
+
+ * The BAM->SAM conversion is up to twice as fast, depending on the
+   characteristics of the input.
+
+ * Parsing SAM headers with a lot of reference sequences is now much
+   faster.
+
+ * The number of lseek() calls per query is reduced when the query
+   region contains no read alignments.
+
+Bug fixes:
+
+ * Fixed an issue in the indel caller that leads to miscalls of indels.
+   Note that this solution may not work well when the sequencing indel
+   error rate is higher than the rate of SNPs.
+
+ * Fixed another issue in the indel caller which may lead to incorrect
+   genotypes.
+
+ * Fixed a bug in `sort' when option `-o' is applied.
+
+ * Fixed a bug in `view -r'.
+
+APIs and other changes:
+
+ * Added iterator interfaces to random access and pileup. The callback
+   interfaces directly call the iterator interfaces.
+
+ * The BGZF blocks holding the BAM header are independent of alignment
+   BGZF blocks. Alignment records shorter than 64kB are guaranteed to be
+   fully contained in one BGZF block. This change is fully compatible
+   with the old version of samtools/picard.
+
+Changes in other utilities:
+
+ * Updated export2sam.pl by Chris Saunders.
+
+ * Improved the sam2vcf.pl script.
+
+ * Added a Python version of varfilter.py by Aylwyn Scally.
+
+(0.1.8: 11 July 2010, r613)
+
+
+
+Beta Release 0.1.7 (10 November, 2009)
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Notable changes:
+
+ * Improved the indel caller in complex scenarios, in particular for
+   long reads. The indel caller is now able to make reasonable indel
+   calls from Craig Venter capillary reads.
+
+ * Rewrote single-end duplicate removal with improved
+   performance. Paired-end reads are not touched.
+
+ * Duplicate removal is now library aware. Samtools removes potential
+   PCR/optical duplicates inside a library rather than across libraries.
+
+ * SAM header is now fully parsed, although this functionality is not
+   used in merging and so on.
+
+ * In samtools merge, optionally take the input file name as RG-ID and
+   attach the RG tag to each alignment.
+
+ * Added FTP support in the RAZF library. RAZF-compressed reference
+   sequences can be retrieved remotely.
+
+ * Improved network support for Win32.
+
+ * Samtools sort and merge are now stable.
+
+Changes in other utilities:
+
+ * Implemented sam2vcf.pl that converts the pileup format to the VCF
+   format.
+
+ * This release of samtools is known to work with the latest
+   Bio-Samtools Perl module.
+
+(0.1.7: 10 November 2009, r510)
+
+
+
+Beta Release 0.1.6 (2 September, 2009)
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Notable changes:
+
+ * In tview, do not show a blank screen when no reads are mapped to the
+   corresponding region.
+
+ * Implemented native HTTP support in the BGZF library. Samtools is now
+   able to directly open a BAM file on HTTP. HTTP proxy is also
+   supported via the "http_proxy" environment variable.
+
+ * Samtools is now compatible with the MinGW (win32) compiler and the
+   PDCurses library.
+
+ * The calmd (or fillmd) command now calculates the NM tag and replaces
+   MD tags if they are wrong.
+
+ * The view command now recognizes and optionally prints FLAG in HEX or
+   as strings to make a SAM file more friendly to human eyes. This is a
+   samtools-C extension, not implemented in Picard for the time
+   being. Please type `samtools view -?' for more information.
+
+ * BAM files now have an end-of-file (EOF) marker to facilitate
+   truncation detection. A warning will be given if an on-disk BAM file
+   does not have this marker. The warning will be seen on BAM files
+   generated by an older version of samtools. It does NO harm.
+
+ * New key bindings in tview: `r' to show read names and `s' to show
+   reference skip (N operation) as deletions.
+
+ * Fixed a bug in `samtools merge -n'.
+
+ * Samtools merge now optionally copies the header of a user specified
+   SAM file to the resultant BAM output.
+
+ * Samtools pileup/tview works with a CIGAR whose first or last
+   operation is an indel.
+
+ * Fixed a bug in bam_aux_get().
+
+
+Changes in other utilities:
+
+ * Fixed wrong FLAG in maq2sam.
+
+
+(0.1.6: 2 September 2009, r453)
+
+
+
+Beta Release 0.1.5 (7 July, 2009)
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Notable changes:
+
+ * Support opening a BAM alignment on FTP. Users can now use "tview" to
+   view alignments at the NCBI ftp site. Please read the manual for more
+   information.
+
+ * In the library, propagate errors rather than exit or raise an
+   assertion failure.
+
+ * Simplified the build system and fixed compiling errors caused by
+   zlib<1.2.2.1.
+
+ * Fixed an issue about lost header information when a SAM is imported
+   with "view -t".
+
+ * Implemented "samtools.pl varFilter" which filters both SNPs and short
+   indels. This command replaces "indelFilter".
+
+ * Implemented "samtools.pl pileup2fq" to generate FASTQ consensus from
+   pileup output.
+
+ * In pileup, cap mapping quality at 60. This helps filtering when
+   different aligners are in use.
+
+ * In pileup, allow outputting variant sites only.
+
+ * Made pileup generate correct calls in repetitive regions. At the same
+   time, I am considering implementing a simplified model in SOAPsnp,
+   although this has not happened yet.
+
+ * In view, added '-u' option to output BAM without compression. This
+   option is preferred when the output is piped to other commands.
+
+ * In view, added '-l' and '-r' to get the alignments for one library or
+   read group. The "@RG" header lines are now partially parsed.
+
+ * Do not include command line utilities in libbam.a.
+
+ * Fixed memory leaks in pileup and bam_view1().
+
+ * Made faidx more tolerant of empty lines right before or after FASTA >
+   lines.
+
+
+Changes in other utilities:
+
+ * Updated novo2sam.pl by Colin Hercus, the key developer of novoalign.
+
+
+This release involves several modifications to the key code base which
+may potentially introduce new bugs even though we have tried to minimize
+this by testing on several examples. Please let us know if you catch
+bugs.
+
+(0.1.5: 7 July 2009, r373)
+
+
+
+Beta Release 0.1.4 (21 May, 2009)
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Notable changes:
+
+ * Added the 'rmdupse' command: removing duplicates for SE reads.
+
+ * Fixed a critical bug in the indel caller: clipped alignments are not
+   processed correctly.
+
+ * Fixed a bug in tview: gapped alignments may be incorrectly
+   displayed.
+
+ * Unified the interface to BAM and SAM I/O. This is done by
+   implementing a wrapper on top of the old APIs and therefore old APIs
+   are still valid. The new I/O APIs also recognize the @SQ header
+   lines.
+
+ * Generate the MD tag.
+
+ * Generate "=" bases. However, the indel caller will not work when "="
+   bases are present.
+
+ * Enhanced support of color-read display (by Nils Homer).
+
+ * Implemented the GNU build system. However, currently the build
+   system does not generate libbam.a. We will improve this later. For
+   the time being, `make -f Makefile.generic' is preferred.
+
+ * Fixed a minor bug in pileup: the first read in a chromosome may be
+   skipped.
+
+ * Fixed bugs in bam_aux.c. These bugs do not affect other components as
+   they were not used previously.
+
+ * Output the 'SM' tag from maq2sam.
+
+(0.1.4: 21 May 2009, r297)
+
+
+
+Beta Release 0.1.3 (15 April, 2009)
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
 Notable changes in SAMtools:
@@ -15,6 +336,8 @@ Notable changes in SAMtools:
  * Fixed a bug in alignment retrieval: alignments bridging n*16384bp are
    not correctly retrieved sometimes.
 
+ * Fixed a bug in rmdup: segfault if unmapped reads are present.
+
  * Move indel_filter.pl to samtools.pl and improved the filtering by
    checking the actual number of alignments containing indels. The indel
    pileup line is also changed a little to make this filtration easier.
@@ -37,7 +360,9 @@ Changes in other utilities:
 
  * Various converters: improved functionality in general.
 
-(0.1.3: XX April 2009, rXXX)
+ * Updated the example SAM due to the previous bug in fixmate.
+
+(0.1.3: 15 April 2009, r227)
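
The 0.1.9 notes above describe BAQ as a Phred-scaled probability and recommend capping base quality by BAQ. The following is a minimal sketch of that arithmetic only. The function names (`phred`, `cap_by_baq`) and the sample probabilities are hypothetical and are not part of samtools. How samtools actually derives the misalignment probability is not shown here.

```python
import math


def phred(p_error: float) -> int:
    """Phred-scale an error probability: Q = -10 * log10(p), rounded."""
    if not 0.0 < p_error <= 1.0:
        raise ValueError("p_error must be in (0, 1]")
    return int(round(-10.0 * math.log10(p_error)))


def cap_by_baq(base_quals, baqs):
    """Cap each base quality by the BAQ at the same read position."""
    if len(base_quals) != len(baqs):
        raise ValueError("base_quals and baqs must have the same length")
    return [min(q, b) for q, b in zip(base_quals, baqs)]


# Hypothetical read: uniform base quality 30, with misalignment
# probability rising towards one end (e.g. next to an indel).
base_quals = [30, 30, 30, 30]
misalign_probs = [0.0001, 0.001, 0.1, 0.5]

baqs = [phred(p) for p in misalign_probs]
capped = cap_by_baq(base_quals, baqs)
print(baqs)
print(capped)
```

Taking the minimum means a base that is likely misaligned contributes little evidence to a SNP call, however confident the sequencer was about the base itself.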