Erik Garrison [Thu, 29 Nov 2012 17:34:46 +0000 (12:34 -0500)]
add input file list method to bamtools merge
By using -list users can specify a file list. Workaround for the case
when there are more input files than allowed by the maximum number of
command-line arguments.
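
A minimal sketch of how such a file list could be read, assuming one input path per line. The helper name ReadFileList is hypothetical and is not taken from the bamtools source.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: collects one input path per line, skipping empty lines.
std::vector<std::string> ReadFileList(std::istream& in) {
    std::vector<std::string> files;
    std::string line;
    while (std::getline(in, line)) {
        // Strip a trailing carriage return left by Windows-style line endings.
        if (!line.empty() && line[line.size() - 1] == '\r')
            line.erase(line.size() - 1);
        if (!line.empty())
            files.push_back(line);
    }
    return files;
}
```

The stream would typically be a std::ifstream opened on the path given to -list; the resulting vector replaces the filenames that would otherwise come from the command line.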
Derek Barnett [Mon, 19 Nov 2012 17:24:55 +0000 (12:24 -0500)]
Version 2.2.1
* Added:
BamAlignment::GetTagNames()
BamReader::GetConstSamHeader()
* Fixed:
'bamtools convert' - issues with unsigned char data in tags
empty quality string handling in BamWriter
derek [Tue, 27 Mar 2012 16:03:54 +0000 (12:03 -0400)]
Fixed: sorting order lost during merge step of sort tool, if input BAM
lacked SAM header
* Due to lack of SO tag in temp files. This tag is set just fine on
input BAMs containing SAM headers. However, when an input file lacked
one, especially the (required) VN number, the entire @HD line was
dropped.
* Forcing the current SAM version number, if none exists, on sort
output.
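
A rough sketch of the idea behind this fix, under stated assumptions: HdLine and FormatHdLine are hypothetical names, and the fallback version is passed in by the caller rather than taken from any bamtools constant.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the @HD fields of a SAM header.
struct HdLine {
    std::string version;    // VN tag
    std::string sortOrder;  // SO tag
};

// Builds the @HD text. If no version is present, the supplied fallback is used
// so that the line, and the SO tag it carries, is still written.
std::string FormatHdLine(HdLine hd, const std::string& fallbackVersion) {
    if (hd.version.empty())
        hd.version = fallbackVersion;
    std::string out = "@HD\tVN:" + hd.version;
    if (!hd.sortOrder.empty())
        out += "\tSO:" + hd.sortOrder;
    return out;
}
```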
derek [Thu, 10 Nov 2011 04:58:20 +0000 (23:58 -0500)]
Added generic I/O device to BamIndex side of things
* Remote BAM access (now w/ random access) seems to be working with the
simple test cases so far
* Major TODO: not yet implemented for Windows
derek [Mon, 7 Nov 2011 17:50:10 +0000 (12:50 -0500)]
Implemented basic TCP support layer
* buffered I/O
* design should support future expansion of protocols, proxies, etc
* so far, HTTP range requests working well (on plain HTML text tests,
not yet BAM-tested)
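
For illustration only, an HTTP range request asks for an inclusive byte span via the Range header. The helper below is a sketch; MakeRangeHeader is a hypothetical name and does not reflect how the TCP layer builds its requests.

```cpp
#include <cassert>
#include <cstdint>
#include <sstream>
#include <string>

// Hypothetical helper: formats a Range header for the inclusive byte span
// [first, last], terminated by CRLF as HTTP header lines are.
std::string MakeRangeHeader(uint64_t first, uint64_t last) {
    std::ostringstream out;
    out << "Range: bytes=" << first << '-' << last << "\r\n";
    return out.str();
}
```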
derek [Wed, 12 Oct 2011 20:30:59 +0000 (16:30 -0400)]
Major speedup in SamSequenceDictionary & SamReadGroupDictionary classes
* Please note that this does introduce a minor source-incompatibility,
only affecting those working directly with the provided Sam*Iterator
typedefs. The short answer is that the iterator now references a
std::pair instead of the 'plain old' data. Use the pair's "second" field
to access the desired SamSequence or SamReadGroup.
* Doxygen docs have been updated to reflect this and provide a bit more
explanation/examples (in docs folder run 'doxygen Doxyfile' to get the
updated API pages).
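
A sketch of what the change means for client loops. It assumes a std::map keyed by name purely for illustration; Sequence and SequenceMap are hypothetical stand-ins, not the actual bamtools types.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for SamSequence.
struct Sequence {
    std::string name;
    int length;
};

// Assumption: a map-like container, so iterators refer to std::pair.
typedef std::map<std::string, Sequence> SequenceMap;

int TotalLength(const SequenceMap& dict) {
    int total = 0;
    for (SequenceMap::const_iterator it = dict.begin(); it != dict.end(); ++it)
        total += it->second.length;  // previously this would have been it->length
    return total;
}
```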
derek [Tue, 11 Oct 2011 20:52:51 +0000 (16:52 -0400)]
Cleanup in SortTool
* Now using the new BamTools::Algorithms::Sort function objects
* Special handling of unmapped alignments should no longer be necessary
as the sorting function objects (also used by multi-reader merging
strategy) handle those cases.
derek [Fri, 7 Oct 2011 20:11:43 +0000 (16:11 -0400)]
Merge with earlier IODevice work
* This commit still has some console pollution. I need to work in the
recent Exception/ErrorString approach, but wanted to go ahead and do the
merge-conflict resolution now before diving into remote file support.
derek [Fri, 7 Oct 2011 19:12:57 +0000 (15:12 -0400)]
Removed STDERR pollution by API
* Accomplished this by introducing a GetErrorString() on most API
objects. When a method returns false, you can ignore it, parse the error
string to decide what to do next, prompt the user, make a sandwich,
whatever. But nothing should leak out to the console.
* Internally the error messages are passed by a new BamException class.
This new exception should not cross the library boundary. The exception
should be caught "under the hood" and its what() string should be
(possibly formatted and) stored as the error string in one of the high-
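
A minimal sketch of the pattern described in this commit, with assumptions labeled: Reader is a hypothetical API object, and std::runtime_error stands in for BamException, whose interface is not shown here.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical API object. The exception never leaves the class; callers only
// see a bool result and an error string.
class Reader {
public:
    bool Open(const std::string& filename) {
        try {
            OpenImpl(filename);
            return true;
        } catch (const std::runtime_error& e) {
            // Caught "under the hood": format and store, print nothing.
            m_errorString = std::string("Reader::Open: ") + e.what();
            return false;
        }
    }

    std::string GetErrorString() const { return m_errorString; }

private:
    void OpenImpl(const std::string& filename) {
        if (filename.empty())
            throw std::runtime_error("empty filename");
        // Real work would go here.
    }

    std::string m_errorString;
};
```

A caller checks the return value and, only if it is false, decides what to do with GetErrorString().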
In standard indexed BAM files with sparse coverage (our test case was a roughly 1M read RNAseq BAM file), queries made to intervals may not have any of the candidate offsets present in the index, as the BAM index only contains bins that have reads.
Without this bail out, we would get a crash. Returning false silently is the preferred behavior in our view as it allows our read logic to go to the next query and does not add noise to stderr.
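
The shape of the bail-out, as a simplified sketch. FirstCandidateOffset is a hypothetical name, and taking the first candidate is a simplification for illustration, not the actual selection logic.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper: returns false, without printing anything, when the
// index supplied no candidate offsets for the queried interval.
bool FirstCandidateOffset(const std::vector<int64_t>& offsets, int64_t& offset) {
    if (offsets.empty())
        return false;  // nothing indexed here; the caller moves on to the next query
    offset = offsets.front();
    return true;
}
```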
Removed 'core mode' concept from BamMultiReader internals
* Now char data is only generated if needed by multi-merger
implementation or on-demand by client call to
BamMultiReader::GetNextAlignment()
Basic internal implementation of BamFile & BamPipe
* BgzfStream now working on IBamIODevice instead of FILE*
* BamReaderPrivate now queries stream's IsOpen() method instead of
touching member variable directly
* Empty implementations of BamHttp & BamFtp
* Added global BT_ASSERT_X macro for convenience
Bug discovered. The chunkStop was not being read from the correct offset (rather, it was always being read as the first chunkStart value for the alignment chunks in that bin of the index).
The result of this is that chunkStop will never be >= minOffset (or maybe rarely, since it always equals the first chunkStart for the first chunk) and thus the linear index doesn't really help in reducing the number of seeks performed.
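
To make the offset arithmetic concrete, here is a sketch that assumes the standard BAM index layout, where each chunk is a pair of little-endian 64-bit values (start, then stop). Chunk, ReadLE64 and ReadChunks are hypothetical names, not the bamtools implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct Chunk {
    uint64_t start;
    uint64_t stop;
};

// Decodes a little-endian 64-bit value starting at byte position pos.
uint64_t ReadLE64(const std::vector<unsigned char>& buf, std::size_t pos) {
    uint64_t v = 0;
    for (int i = 7; i >= 0; --i)
        v = (v << 8) | buf[pos + i];
    return v;
}

// Reads numChunks (start, stop) pairs. Chunk i begins at byte 16 * i; its stop
// value sits 8 bytes after its own start, not at the first chunk's start.
std::vector<Chunk> ReadChunks(const std::vector<unsigned char>& buf, std::size_t numChunks) {
    std::vector<Chunk> chunks;
    if (buf.size() < numChunks * 16)
        return chunks;  // buffer too short; return nothing rather than overrun
    for (std::size_t i = 0; i < numChunks; ++i) {
        const std::size_t base = i * 16;
        Chunk c;
        c.start = ReadLE64(buf, base);
        c.stop = ReadLE64(buf, base + 8);
        chunks.push_back(c);
    }
    return chunks;
}
```

With each stop value read from its own position, the comparison against minOffset can actually discard chunks that end before the region of interest.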