X-Git-Url: https://git.donarmstrong.com/?p=rsem.git;a=blobdiff_plain;f=README.md;h=77c0693af045ab7daa551055dff244905bca111f;hp=b46826a58209fda36bc59cff8e860a95707dfe25;hb=237bbdf363c9e42ee24e2fd63106dccf20d9bf2f;hpb=4a435fbee229af1dd5f76b4feb0ca71398ac8796

diff --git a/README.md b/README.md
index b46826a..77c0693 100644
--- a/README.md
+++ b/README.md
@@ -22,15 +22,21 @@ Table of Contents
 ## Introduction
 
 RSEM is a software package for estimating gene and isoform expression
-levels from RNA-Seq data. The new RSEM package (rsem-1.x) provides an
-user-friendly interface, supports threads for parallel computation of
-the EM algorithm, single-end and paired-end read data, quality scores,
-variable-length reads and RSPD estimation. It can also generate
-genomic-coordinate BAM files and UCSC wiggle files for
-visualization. In addition, it provides posterior mean and 95%
-credibility interval estimates for expression levels. For
-visualization, it can also generate transcript-coordinate BAM files
-and visualize them and also models learned.
+levels from RNA-Seq data. The RSEM package provides an user-friendly
+interface, supports threads for parallel computation of the EM
+algorithm, single-end and paired-end read data, quality scores,
+variable-length reads and RSPD estimation. In addition, it provides
+posterior mean and 95% credibility interval estimates for expression
+levels. For visualization, It can generate BAM and Wiggle files in
+both transcript-coordinate and genomic-coordinate. Genomic-coordinate
+files can be visualized by both UCSC Genome browser and Broad
+Institute's Integrative Genomics Viewer (IGV). Transcript-coordinate
+files can be visualized by IGV. RSEM also has its own scripts to
+generate transcript read depth plots in pdf format. The unique feature
+of RSEM is, the read depth plots can be stacked, with read depth
+contributed to unique reads shown in black and contributed to
+multi-reads shown in red. In addition, models learned from data can
+also be visualized. Last but not least, RSEM contains a simulator.
 
 ## Compilation & Installation
 
@@ -103,8 +109,10 @@ and provide the SAM or BAM file as an argument. When using an
 alternative aligner, you may also want to provide the '--no-bowtie'
 option to 'rsem-prepare-reference' so that the Bowtie indices are not
 built.
-Some aligners' (other than Bowtie) output might need to be converted
-so that RSEM can use. For conversion, please run
+RSEM requires all alignments of the same read group together. For
+paired-end reads, RSEM also requires the two mates of any alignment be
+adjacent. If the alternative aligner does not satisfy the first
+requirement, you can use 'convert-sam-for-rsem' for conversion. Please run
 
     convert-sam-for-rsem --help
 
@@ -132,24 +140,27 @@ unsorted BAM file, 'sample_name.genome.sorted.bam' and
 generated by the samtools included. All these files are in genomic
 coordinates.
 
-#### a) Generating a UCSC Wiggle file
+#### a) Generating a Wiggle file
 
 A wiggle plot representing the expected number of reads overlapping
-each position in the genome can be generated from the sorted genome
-BAM file output. To generate the wiggle plot, run the 'rsem-bam2wig'
-program on the 'sample_name.genome.sorted.bam' file.
+each position in the genome/transcript set can be generated from the
+sorted genome/transcript BAM file output. To generate the wiggle
+plot, run the 'rsem-bam2wig' program on the
+'sample_name.genome.sorted.bam'/'sample_name.transcript.sorted.bam' file.
 
 Usage:
 
-    rsem-bam2wig bam_input wig_output wiggle_name
+    rsem-bam2wig sorted_bam_input wig_output wiggle_name
 
-bam_input: sorted bam file
+sorted_bam_input: sorted bam file
 wig_output: output file name, e.g. output.wig
 wiggle_name: the name the user wants to use for this wiggle plot
 
-#### b) Loading a BAM and/or Wiggle file into the UCSC Genome Browser
+#### b) Loading a BAM and/or Wiggle file into the UCSC Genome Browser or Integrative Genomics Viewer(IGV)
 
-Refer to the [UCSC custom track help page](http://genome.ucsc.edu/goldenPath/help/customTrack.html).
+For UCSC genome browser, please refer to the [UCSC custom track help page](http://genome.ucsc.edu/goldenPath/help/customTrack.html).
+
+For integrative genomics viewer, please refer to the [IGV home page](http://www.broadinstitute.org/software/igv/home). Note: Although IGV can generate read depth plot from the BAM file given, it cannot recognize "ZW" tag RSEM puts. Therefore IGV counts each alignment as weight 1 instead of the expected weight for the plot it generates. So we recommend to use the wiggle file generated by RSEM for read depth visualization.
 
 #### c) Generating Transcript Wiggle Plots
 
@@ -255,6 +266,8 @@ map_file: transcript-to-gene-map file's name.
 RSEM uses the [Boost C++](http://www.boost.org) and
 [samtools](http://samtools.sourceforge.net) libraries.
 
+We thank earonesty for contributing patches.
+
 ## License
 
 RSEM is licensed under the [GNU General Public License v3](http://www.gnu.org/licenses/gpl-3.0.html).
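
Review note: the second hunk states two input requirements for alternative aligners (all alignments of a read grouped together, and paired-end mates adjacent). The sketch below shows one way a user might pre-check a SAM file against them. It is not part of RSEM and does not reproduce how RSEM or 'convert-sam-for-rsem' work internally; the function names are hypothetical, and reading "adjacent" as "a first-mate record and a second-mate record of the same read on consecutive lines" is an assumption. It relies only on the standard SAM columns (QNAME in column 1, FLAG in column 2) and the FLAG bits 0x1, 0x40 and 0x80.

```python
from itertools import groupby


def _records(sam_lines):
    """Split non-empty, non-header SAM lines into tab-separated fields."""
    return [line.rstrip("\n").split("\t")
            for line in sam_lines
            if line.strip() and not line.startswith("@")]


def read_names_are_grouped(sam_lines):
    """True if every read name (QNAME) occupies a single contiguous run."""
    seen = set()
    for qname, _ in groupby(_records(sam_lines), key=lambda rec: rec[0]):
        if qname in seen:  # the name reappeared after another read's records
            return False
        seen.add(qname)
    return True


def mates_are_adjacent(sam_lines):
    """True if each paired record (FLAG & 0x1) is immediately followed by its mate.

    Assumption: a valid pair is two consecutive records with the same QNAME,
    one flagged first-in-pair (0x40) and the other second-in-pair (0x80).
    """
    recs = _records(sam_lines)
    i = 0
    while i < len(recs):
        flag = int(recs[i][1])
        if not flag & 0x1:  # single-end record, nothing to pair
            i += 1
            continue
        if i + 1 >= len(recs):
            return False
        nxt = recs[i + 1]
        mate_bits = sorted([flag & 0xC0, int(nxt[1]) & 0xC0])
        if nxt[0] != recs[i][0] or mate_bits != [0x40, 0x80]:
            return False
        i += 2
    return True
```

If either check fails on an aligner's output, the hunk's advice applies: see `convert-sam-for-rsem --help` for the grouping requirement.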