wiki update to trunk

author martinahansen <martinahansen@74ccb610-7750-0410-82ae-013aeee3265d>

Mon, 30 Jun 2008 07:21:24 +0000 (07:21 +0000)

committer martinahansen <martinahansen@74ccb610-7750-0410-82ae-013aeee3265d>

Mon, 30 Jun 2008 07:21:24 +0000 (07:21 +0000)
diff --git a/bp_usage/00README b/bp_usage/00README

new file mode 100644 (file)

index 0000000..ba2ff78
--- /dev/null
+++ b/bp_usage/00README
@@ -0,0 +1,39 @@
+This directory contains the 'usage' files for all biopieces.
+They are written in Google's wiki format, described here:
+
+http://code.google.com/p/support/wiki/WikiSyntax
+
+Since the format is quite incomplete, only some of the syntax
+is supported by the biopieces until Google upgrades it. The
+reason for using this format is that it will render nicely
+on the Google biopieces wiki:
+
+http://code.google.com/p/biopieces/w/list
+
+... and at the same time, the biopiece *read_gwiki*
+
+will deal with simple ASCII markup.
+
+The features below are supported; if more are needed, look at
+the ~/code_perl/Maasha/Gwiki.pm module.
+
+= heading =
+== subheading ==
+=== Level 3 ===
+
+{{{
+verbatim block
+}}}
+
+
+*inline bold*
+
+_inline emphasis_
+
+`inline verbatim`
+
+---- (horizontal line)
+
+
+
+Martin A. Hansen,  June 2008
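The inline markup listed in the README can be reduced to plain text with a few substitutions. The following is a minimal sketch only: `strip_inline_markup` is a hypothetical helper that does not appear in this commit, and it drops the markers instead of applying terminal attributes as `Maasha::Gwiki` does.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the commit): reduce the inline markup
# listed in the README to plain text by dropping the marker characters.
sub strip_inline_markup
{
    my ( $line ) = @_;

    $line =~ tr/`//d;                 # `inline verbatim` -> inline verbatim
    $line =~ s/\[([^\]]+?)\]/$1/g;    # [link]            -> link
    $line =~ s/\*([^\*]+?)\*/$1/g;    # *inline bold*     -> inline bold
    $line =~ s/_([^_]+?)_/$1/g;       # _inline emphasis_ -> inline emphasis

    return $line;
}

print strip_inline_markup( "*inline bold* and _inline emphasis_ and `inline verbatim`" ), "\n";
# prints "inline bold and inline emphasis and inline verbatim"
```

One caveat of this approach: the emphasis pattern pairs any two underscores on a line, so a line naming two biopieces that each contain an underscore would lose both underscores along with the text between them being treated as emphasized.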
diff --git a/bp_usage/print_usage.wiki b/bp_usage/print_usage.wiki

new file mode 100644 (file)

index 0000000..15af0d1
--- /dev/null
+++ b/bp_usage/print_usage.wiki
@@ -0,0 +1,53 @@
+=Biopiece: print_usage=
+
+==Synopsis==
+
+Print a biopiece's usage information from a usage file.
+
+==Description==
+
+Biopiece usage is kept in separate files, one for each biopiece. The usage files are written in Google's wiki format and are thus independent of programming language. The usage files are kept in `~/biopieces/bp_usage/`.
+
+For more about Google's wiki format:
+
+http://code.google.com/p/support/wiki/WikiSyntax
+
+==Usage==
+
+{{{
+print_usage [options] -i <usage file>
+}}}
+
+==Options==
+
+{{{
+[-i <file>    | --data_in=<file>]  -  Read usage information from file.
+[-h           | --help]            -  Print verbose usage.
+}}}
+
+==Examples==
+
+{{{
+print_usage -i ~/biopieces/bp_usage/print_usage -h
+}}}
+
+==See also==
+
+
+==Author==
+
+Martin Asser Hansen - Copyright (C) - All rights reserved.
+mail@maasha.dk
+August 2007
+
+==License==
+
+GNU General Public License version 2
+
+http://www.gnu.org/copyleft/gpl.html
+
+==Help==
+
+*print_usage* is part of the Biopieces framework.
+
+http://code.google.com/p/biopieces/
diff --git a/bp_usage/read_fasta.wiki b/bp_usage/read_fasta.wiki

index 98a3742f377600989a331b434e8dddd3493d2971..45a065aa0d810ce4cc904fd8987a19a05f51276a 100644 (file)
--- a/bp_usage/read_fasta.wiki
+++ b/bp_usage/read_fasta.wiki
@@ -62,6 +62,11 @@ read_fasta -i '*.fna'
  }}}
  
  
+==See also==
+
+[read_align]
+[write_fasta]
+
  ==Author==
  
  Martin Asser Hansen - Copyright (C) - All rights reserved.
diff --git a/bp_usage/read_tab.wiki b/bp_usage/read_tab.wiki

new file mode 100644 (file)

index 0000000..4774a46
--- /dev/null
+++ b/bp_usage/read_tab.wiki
@@ -0,0 +1,154 @@
+=Biopiece: read_tab=
+
+==Synopsis==
+
+Read tabular data.
+
+==Description==
+
+Tabular input can be read with *read_tab* which will read in chosen rows and chosen columns (separated by a given delimiter) from a table in ASCII text format.
+
+==Usage==
+
+{{{
+read_tab [options] -i <table file(s)>
+}}}
+
+==Options==
+
+{{{
+[-i <file(s)> | --data_in=<file(s)>]  -  Read tabular data from file.
+[-d <regex>   | --delimit=<regex>]    -  Changes delimiter  -  Default='\s+'
+[-c <string>  | --cols=<string>]      -  Comma separated list of cols to read in that order.
+[-k <string>  | --keys=<string>]      -  Comma separated list of keys to use for each column.
+[-s <int>     | --skip=<int>]         -  Skip number of initial records.
+[-n <int>     | --num=<int>]          -  Limit number of records to read.
+[-I <file>    | --stream_in=<file>]   -  Read input stream from file  -  Default=STDIN
+[-O <file>    | --stream_out=<file>]  -  Write output stream to file  -  Default=STDOUT
+}}}
+
+==Examples==
+
+Consider the following table from the file `test.tab`:
+
+{{{
+Organism   Sequence    Count
+Human      ATACGTCAG   23524
+Dog        AGCATGAC    2442
+Mouse      GACTG       234
+Cat        AAATGCA     2342
+}}}
+
+Reading the entire table:
+
+{{{
+read_tab -i test.tab
+}}}
+
+The above command will result in 5 records, one for each row, where the keys V0, V1, V2 are the default keys for the columns:
+
+{{{
+V0: Organism
+V2: Count
+V1: Sequence
+---
+V0: Human
+V2: 23524
+V1: ATACGTCAG
+---
+V0: Dog
+V2: 2442
+V1: AGCATGAC
+---
+V0: Mouse
+V2: 234
+V1: GACTG
+---
+V0: Cat
+V2: 2342
+V1: AAATGCA
+---
+}}}
+
+However, the first line is a header line, which can be skipped using the `-s` switch to skip a specified number of lines before reading. So to get only the rows with data, do:
+
+{{{
+read_tab -i test.tab -s 1
+
+V0: Human
+V2: 23524
+V1: ATACGTCAG
+---
+V0: Dog
+V2: 2442
+V1: AGCATGAC
+---
+V0: Mouse
+V2: 234
+V1: GACTG
+---
+V0: Cat
+V2: 2342
+V1: AAATGCA
+---
+}}}
+
+It is possible to select a subset of columns to read by using the `-c` switch, which takes a comma separated list of column numbers (the first column is designated 0) as argument. So to read in only the sequence and the count, with the count before the sequence, do:
+
+{{{
+read_tab -i test.tab -s 1 -c 2,1
+
+V0: 23524
+V1: ATACGTCAG
+---
+V0: 2442
+V1: AGCATGAC
+---
+V0: 234
+V1: GACTG
+---
+V0: 2342
+V1: AAATGCA
+---
+}}}
+
+It is also possible to rename the columns with the `-k` switch:
+
+{{{
+read_tab -i test.tab -s 1 -c 2,1 -k COUNT,SEQ
+
+SEQ: ATACGTCAG
+COUNT: 23524
+---
+SEQ: AGCATGAC
+COUNT: 2442
+---
+SEQ: GACTG
+COUNT: 234
+---
+SEQ: AAATGCA
+COUNT: 2342
+---
+}}}
+
+==See also==
+
+[write_tab]
+
+==Author==
+
+Martin Asser Hansen - Copyright (C) - All rights reserved.
+mail@maasha.dk
+August 2007
+
+==License==
+
+GNU General Public License version 2
+
+http://www.gnu.org/copyleft/gpl.html
+
+==Help==
+
+*read_tab* is part of the Biopieces framework.
+
+http://code.google.com/p/biopieces/
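The column selection and key renaming described in the read_tab examples can be sketched as follows. This is an illustration under stated assumptions, not the actual read_tab implementation: `row_to_record` and its argument layout are hypothetical names introduced here.

```perl
use strict;
use warnings;

# Hypothetical sketch, not the read_tab implementation: turn one table
# row into a record (hash) in the way the usage examples describe.
sub row_to_record
{
    my ( $line,      # one row of the table
         $delimit,   # delimiter regex, e.g. '\s+'
         $cols,      # optional list of column numbers (first column is 0)
         $keys,      # optional list of keys, one per selected column
       ) = @_;

    my @fields = split /$delimit/, $line;

    # Keep only the requested columns, in the requested order.
    @fields = @fields[ @{ $cols } ] if $cols;

    my %record;

    for my $i ( 0 .. $#fields )
    {
        # Fall back to the default keys V0, V1, ... when no key is given.
        my $key = ( $keys and defined $keys->[ $i ] ) ? $keys->[ $i ] : "V$i";

        $record{ $key } = $fields[ $i ];
    }

    return \%record;
}

my $record = row_to_record( "Human      ATACGTCAG   23524", '\s+', [ 2, 1 ], [ "COUNT", "SEQ" ] );

print "$_: $record->{ $_ }\n" foreach sort keys %{ $record };
print "---\n";
```

Because a record is stored as a hash, the key order within a record is not significant, which is consistent with the example output above listing `V2` before `V1`.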
diff --git a/code_perl/Maasha/Gwiki.pm b/code_perl/Maasha/Gwiki.pm

new file mode 100644 (file)

index 0000000..3bfdc59
--- /dev/null
+++ b/code_perl/Maasha/Gwiki.pm
@@ -0,0 +1,216 @@
+package Maasha::Gwiki;
+
+# Copyright (C) 2008 Martin A. Hansen.
+
+# This program is free software; you can redistribute it and/or
+# modify it under the terms of the GNU General Public License
+# as published by the Free Software Foundation; either version 2
+# of the License, or (at your option) any later version.
+
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+
+# You should have received a copy of the GNU General Public License
+# along with this program; if not, write to the Free Software
+# Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.
+
+# http://www.gnu.org/copyleft/gpl.html
+
+
+# >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> DESCRIPTION <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
+
+
+# Routines for manipulation of Google's wiki format
+
+
+# >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>><<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
+
+
+use strict;
+use Data::Dumper;
+use Term::ANSIColor;
+use Maasha::Common;
+use vars qw ( @ISA @EXPORT );
+
+@ISA = qw( Exporter );
+
+
+# >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>><<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
+
+
+sub gwiki2ascii
+{
+    # Martin A. Hansen, June 2008.
+
+    # Convert Google style wiki to ASCII text lines.
+
+    my ( $wiki,  # wiki data structure 
+       ) = @_;
+
+    # Returns a list of lines.
+
+    my ( $block, $triple, $line, @lines );
+
+    foreach $block ( @{ $wiki } )
+    {
+        if ( $block->[ 0 ]->{ 'FORMAT' } eq "heading" )
+        {
+            push @lines, text_underline( text_bold( "\n$block->[ 0 ]->{ 'TEXT' }" ) );
+        }
+        elsif ( $block->[ 0 ]->{ 'FORMAT' } eq "subheading" )
+        {
+            push @lines, text_bold( "$block->[ 0 ]->{ 'TEXT' }" );
+        }
+        elsif ( $block->[ 0 ]->{ 'FORMAT' } eq "level_3" )
+        {
+            push @lines, "$block->[ 0 ]->{ 'TEXT' }";
+        }
+        elsif ( $block->[ 0 ]->{ 'FORMAT' } eq "verbatim" )
+        {
+            map { push @lines, "   $_->{ 'TEXT' }" } @{ $block };
+        }
+        elsif ( $block->[ 0 ]->{ 'FORMAT' } eq "paragraph" )
+        {
+            foreach $triple ( @{ $block } )
+            {
+                $line = $triple->{ 'TEXT' };
+
+                $line =~ s/^\s*//;
+                $line =~ s/\s*$//;
+                $line =~ s/\s+/ /g;
+                $line =~ tr/`//d;
+                $line =~ s/\[([^\]]+?)\]/$1/g;
+                $line =~ s/\*([^\*]+?)\*/&text_bold($1)/ge;
+                $line =~ s/_([^_]+?)_/&text_underline($1)/ge;
+
+                push @lines, $_ foreach &Maasha::Common::wrap_line( $line, 80 );
+            }
+        }
+    }
+
+    return wantarray ? @lines : \@lines;
+}
+
+
+sub gwiki_read
+{
+    # Martin A. Hansen, June 2008.
+
+    # Parses a subset of features from Google's wiki format
+    # into a data structure. The structure consists of a 
+    # list of blocks. Each block consists of one or more lines,
+    # represented as triples with the line text, section, and format option.
+
+    # http://code.google.com/p/support/wiki/WikiSyntax
+
+    my ( $file,   # file to parse
+       ) = @_;
+
+    # Returns data structure.
+
+    my ( $fh, @lines, $i, $c, $section, @block, @output );
+
+    $fh = &Maasha::Common::read_open( $file );
+
+    @lines = <$fh>;
+    
+    chomp @lines;
+
+    close $fh;
+
+    $i = 0;
+    $c = 0;
+
+    while ( $i < @lines )
+    {
+        undef @block;
+
+        if ( $lines[ $i ] =~ /^===\s*(.+)\s*===$/ )
+        {
+            $section = $1;
+
+            push @block, {
+                TEXT    => $section,
+                SECTION => $section,
+                FORMAT  => "level_3",
+            };
+        }
+        elsif ( $lines[ $i ] =~ /^==\s*(.+)\s*==$/ )
+        {
+            $section = $1;
+
+            push @block, {
+                TEXT    => $section,
+                SECTION => $section,
+                FORMAT  => "subheading",
+            };
+        }
+        elsif ( $lines[ $i ] =~ /^=\s*(.+)\s*=$/ )
+        {
+            $section = $1;
+
+            push @block, {
+                TEXT    => $section,
+                SECTION => $section,
+                FORMAT  => "heading",
+            };
+        }
+        elsif ( $lines[ $i ] =~ /^\{\{\{$/ )
+        {
+            $c = $i + 1;
+
+            while ( $c < @lines and $lines[ $c ] !~ /^\}\}\}$/ )
+            {
+                push @block, {
+                    TEXT    => $lines[ $c ],
+                    SECTION => $section,
+                    FORMAT  => "verbatim",
+                };
+
+                $c++;
+            }
+        }
+        else
+        {
+            push @block, {
+                TEXT    => $lines[ $i ],
+                SECTION => $section,
+                FORMAT  => "paragraph",
+            };
+        }
+
+        push @output, [ @block ] if @block;
+
+        if ( $c > $i ) {
+            $i = $c + 1;
+        } else {
+            $i++;
+        }
+    }
+
+    return wantarray ? @output : \@output;
+}
+
+
+sub text_bold
+{
+    my ( $txt,
+       ) = @_;
+
+    return colored ( $txt, "bold white" );
+}
+
+
+sub text_underline
+{
+    my ( $txt,
+       ) = @_;
+
+    return colored ( $txt, "underline" );
+}
+
+
+# >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>><<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
+
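The heading detection in `gwiki_read` depends on testing the most specific pattern first: a `==subheading==` line also matches the single `=` pattern, so the checks must run from three markers down to one. A minimal, self-contained sketch of that ordering follows. `classify_line` is a hypothetical helper, not part of the module, and unlike the module it uses a non-greedy capture so that whitespace before the closing markers is not kept in the heading text.

```perl
use strict;
use warnings;

# Hypothetical helper (not in Maasha::Gwiki): classify a single line using
# the same most-specific-first ordering as the if/elsif chain in gwiki_read.
# Returns a ( format, text ) pair.
sub classify_line
{
    my ( $line ) = @_;

    return ( "level_3",    $1 ) if $line =~ /^===\s*(.+?)\s*===$/;
    return ( "subheading", $1 ) if $line =~ /^==\s*(.+?)\s*==$/;
    return ( "heading",    $1 ) if $line =~ /^=\s*(.+?)\s*=$/;

    return ( "paragraph", $line );
}

foreach my $line ( "= heading =", "==Synopsis==", "=== Level 3 ===", "Read tabular data." )
{
    my ( $format, $text ) = classify_line( $line );

    print "$format: $text\n";
}
```

If the single `=` pattern were tested first, `==Synopsis==` would be classified as a heading with the text `=Synopsis=`.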