-.TH samtools 1 "21 May 2009" "samtools-0.1.4" "Bioinformatics tools"
+.TH samtools 1 "6 July 2009" "samtools-0.1.5" "Bioinformatics tools"
.SH NAME
.PP
samtools - Utilities for the Sequence Alignment/Map (SAM) format
.SH DESCRIPTION
.PP
Samtools is a set of utilities that manipulate alignments in the BAM
-format. It imports from and exports to the SAM (Sequence
-Alignment/Map) format, does sorting, merging and indexing, and
-allows to retrieve reads in any regions swiftly.
+format. It imports from and exports to the SAM (Sequence Alignment/Map)
+format, does sorting, merging and indexing, and allows reads in any
+region to be retrieved swiftly.
+
+Samtools is designed to work on a stream. It regards an input file `-'
+as the standard input (stdin) and an output file `-' as the standard
+output (stdout). Several commands can thus be combined with Unix
+pipes. Samtools always outputs warning and error messages to the standard
+error output (stderr).
+
+Samtools is also able to open a BAM (not SAM) file on a remote FTP
+server if the BAM file name starts with `ftp://'. Samtools checks the
+current working directory for the index file and will download the index
+if it is absent. Samtools achieves random FTP file access with the `REST'
+FTP command. It does not retrieve the entire alignment file unless it is
+asked to do so.
.SH COMMANDS AND OPTIONS
+
.TP 10
.B import
samtools import <in.ref_list> <in.sam> <out.bam>
.TP
.B view
-samtools view [-bhHS] [-t in.refList] [-o output] [-f reqFlag] [-F
-skipFlag] [-q minMapQ] <in.bam> [region1 [...]]
+samtools view [-bhuHS] [-t in.refList] [-o output] [-f reqFlag] [-F
+skipFlag] [-q minMapQ] [-l library] [-r readGroup] <in.bam>|<in.sam> [region1 [...]]
Extract/print all or sub alignments in SAM or BAM format. If no region
is specified, all the alignments will be printed; otherwise only
-alignments overlapping with the specified regions will be output. An
+alignments overlapping the specified regions will be output. An
alignment may be given multiple times if it is overlapping several
regions. A region can be presented, for example, in the following
-format: `chr2', `chr2:1000000' or `chr2:1,000,000-2,000,000'.
+format: `chr2', `chr2:1000000' or `chr2:1,000,000-2,000,000'.
+Coordinates are 1-based.
.B OPTIONS:
.RS
.B -b
Output in the BAM format.
.TP
+.B -u
+Output uncompressed BAM. This option saves time spent on
+compression/decompression and is thus preferred when the output is piped
+to another samtools command.
+.TP
.B -h
Include the header in the output.
.TP
.TP
.B -q INT
Skip alignments with MAPQ smaller than INT [0]
+.TP
+.B -l STR
+Only output reads in library STR [null]
+.TP
+.B -r STR
+Only output reads in read group STR [null]
.RE
.TP
.TP
.B pileup
samtools pileup [-f in.ref.fasta] [-t in.ref_list] [-l in.site_list]
-[-iscg] [-T theta] [-N nHap] [-r pairDiffRate] <in.alignment>
+[-iscgS2] [-T theta] [-N nHap] [-r pairDiffRate] <in.bam>|<in.sam>
Print the alignment in the pileup format. In the pileup format, each
line represents a genomic position, consisting of chromosome name,
Print the mapping quality as the last column. This option makes the
output easier to parse, although this format is not space efficient.
+.TP
+.B -S
+The input file is in SAM.
+
.TP
.B -i
Only output pileup lines containing indels.
will be created if
absent.
+.TP
+.B -M INT
+Cap mapping quality at INT [60]
+
.TP
.B -t FILE
List of reference names ane sequence lengths, in the format described
.B -c
Call the consensus sequence using MAQ consensus model. Options
.B -T,
-.B -N
+.B -N,
+.B -I
and
.B -r
are only effective when
.B -c
+or
+.B -g
is in use.
.TP
.B -r FLOAT
Expected fraction of differences between a pair of haplotypes [0.001]
+.TP
+.B -I INT
+Phred-scaled probability of an indel in sequencing/prep [40]
+
.RE
.TP
Text alignment viewer (based on the ncurses library). In the viewer,
press `?' for help and press `g' to check the alignment start from a
-region in the format like `chr10:10,000,000'. Note that if the region
-showed on the screen contains no mapped reads, a blank screen will be
-seen. This is a known issue and will be improved later.
+region in the format like `chr10:10,000,000'.
.RE
.RE
-
-.SH SAM FORFAM
+.SH SAM FORMAT
SAM is TAB-delimited. Apart from the header lines, which are started
with the `@' symbol, each alignment line consists of: