=Biopiece: create_weight_matrix=

Create a residue composition weight matrix of an alignment in the stream.
[create_weight_matrix] calculates the frequency of each residue in every column of the aligned
sequences in the stream, reported either as absolute residue counts or as percentages.
... | create_weight_matrix [options]
[-p | --percent]                   - Output the result in percent - Default=absolute
[-I <file> | --stream_in=<file>]   - Read input from stream file  - Default=STDIN
[-O <file> | --stream_out=<file>]  - Write output to stream file  - Default=STDOUT
Consider the following alignment in the file `aln.fna` in FASTA format:
To create a weight matrix from the above alignment, read it in with [read_fasta] and pipe the
stream through [create_weight_matrix]:

read_fasta -i aln.fna | create_weight_matrix
The resulting five records will look like the first one below, which is not easy to read:
To make sense of the output, pipe the result through [write_tab] like this:

read_fasta -i aln.fna | create_weight_matrix | write_tab -x
- 4 4 3 2 1 0 0 0 0 0 0 0 0 0
A 1 0 0 1 4 2 1 3 1 0 0 5 0 0
C 0 1 1 0 0 0 4 0 0 3 2 0 4 1
G 0 0 1 0 0 3 0 0 1 2 3 0 0 0
T 0 0 0 2 0 0 0 2 3 0 0 0 1 4
The above weight matrix shows how often each residue type (1st column) occurs at each
position of the alignment. The `-` row counts gaps, and every column sums to 5, the number of aligned sequences.
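As a rough illustration of what such a matrix represents, the following Python sketch counts residues per column for a small alignment. It is not the Biopieces implementation: the function name `column_counts` is an assumption of this sketch, and the three sequences are made up for illustration rather than taken from `aln.fna`.

```python
from collections import Counter

def column_counts(seqs, alphabet="-ACGT"):
    """Count each residue of `alphabet` in every alignment column."""
    if len({len(s) for s in seqs}) > 1:
        raise ValueError("aligned sequences must have equal length")
    matrix = {res: [] for res in alphabet}
    # zip(*seqs) walks the alignment column by column.
    for column in zip(*seqs):
        counts = Counter(res.upper() for res in column)
        for res in alphabet:
            matrix[res].append(counts[res])
    return matrix

# Hypothetical three-sequence alignment (not the contents of aln.fna).
toy = ["--ACG", "-TACG", "ATAGG"]
for res, row in column_counts(toy).items():
    print(res, *row)
```

Each printed row starts with the residue type followed by one count per alignment column, which mirrors the row layout of the table above.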
To obtain the frequencies as percentages, use the `-p` switch of [create_weight_matrix]:

read_fasta -i aln.fna | create_weight_matrix -p | write_tab -x
- 80 80 60 40 20 0 0 0 0 0 0 0 0 0
A 20 0 0 20 80 40 20 60 20 0 0 100 0 0
C 0 20 20 0 0 0 80 0 0 60 40 0 80 20
G 0 0 20 0 0 60 0 0 20 40 60 0 0 0
T 0 0 0 40 0 0 0 40 60 0 0 0 20 80
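The relationship between the absolute and the percent tables can be checked with a short Python sketch: each percentage is the count divided by the column total (five sequences here), multiplied by 100. The counts are copied from the absolute matrix above. The helper name `to_percent` is an assumption of this sketch, and the use of `round` is a guess at presentation, since the rounding rule applied by [create_weight_matrix] is not stated here.

```python
def to_percent(counts):
    """Convert per-column counts to percentages of each column total."""
    n_cols = len(next(iter(counts.values())))
    totals = [sum(row[i] for row in counts.values()) for i in range(n_cols)]
    return {
        res: [round(100 * c / t) if t else 0 for c, t in zip(row, totals)]
        for res, row in counts.items()
    }

# Absolute counts taken from the example output above.
counts = {
    "-": [4, 4, 3, 2, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    "A": [1, 0, 0, 1, 4, 2, 1, 3, 1, 0, 0, 5, 0, 0],
    "C": [0, 1, 1, 0, 0, 0, 4, 0, 0, 3, 2, 0, 4, 1],
    "G": [0, 0, 1, 0, 0, 3, 0, 0, 1, 2, 3, 0, 0, 0],
    "T": [0, 0, 0, 2, 0, 0, 0, 2, 3, 0, 0, 0, 1, 4],
}
for res, row in to_percent(counts).items():
    print(res, *row)
```

For this alignment every column total is 5, so all percentages are multiples of 20 and rounding has no effect.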
Martin Asser Hansen - Copyright (C) - All rights reserved.

GNU General Public License version 2

http://www.gnu.org/copyleft/gpl.html

[create_weight_matrix] is part of the Biopieces framework.

http://code.google.com/p/biopieces/