user-friendly interface, supports threads for parallel computation of
the EM algorithm, single-end and paired-end read data, quality scores,
variable-length reads and RSPD estimation. It can also generate
-genomic-coordinate BAM files and UCSC wiggle files for visualization. In
-addition, it provides posterior mean and 95% credibility interval
-estimates for expression levels.
+genomic-coordinate BAM files and UCSC wiggle files for
+visualization. In addition, it provides posterior mean and 95%
+credibility interval estimates for expression levels. It can also
+generate transcript-coordinate BAM files, visualize them, and
+visualize the models learned.
## <a name="compilation"></a> Compilation & Installation
### III. Visualization
-RSEM contains a version of samtools in the 'sam' subdirectory. When
-users specify the --out-bam option RSEM will produce three files:
-'sample_name.bam', the unsorted BAM file, 'sample_name.sorted.bam' and
-'sample_name.sorted.bam.bai' the sorted BAM file and indices generated
-by the samtools included.
+RSEM contains a version of samtools in the 'sam' subdirectory. RSEM
+always produces three files in transcript coordinates:
+'sample_name.transcript.bam', the unsorted BAM file, and
+'sample_name.transcript.sorted.bam' and
+'sample_name.transcript.sorted.bam.bai', the sorted BAM file and its
+index, both generated by the included samtools. When users specify
+the --output-genome-bam option, RSEM also produces three files in
+genomic coordinates: 'sample_name.genome.bam', the unsorted BAM file,
+and 'sample_name.genome.sorted.bam' and
+'sample_name.genome.sorted.bam.bai', the sorted BAM file and its
+index, both generated by the included samtools.
#### a) Generating a UCSC Wiggle file
A wiggle plot representing the expected number of reads overlapping
-each position in the genome can be generated from the sorted BAM file
-output. To generate the wiggle plot, run the 'rsem-bam2wig' program on
-the 'sample_name.sorted.bam' file.
+each position in the genome can be generated from the sorted genome
+BAM file output. To generate the wiggle plot, run the 'rsem-bam2wig'
+program on the 'sample_name.genome.sorted.bam' file.
Usage:
Refer to the [UCSC custom track help page](http://genome.ucsc.edu/goldenPath/help/customTrack.html).
-#### c) Visualize the model learned by RSEM
+#### c) Generating Transcript Wiggle Plots
+
+To generate transcript wiggle plots, you should run the
+'rsem-plot-transcript-wiggles' program. Run
+
+ rsem-plot-transcript-wiggles --help
+
+to get usage information or visit the [rsem-plot-transcript-wiggles
+documentation page](http://deweylab.biostat.wisc.edu/rsem/rsem-plot-transcript-wiggles.html).
+
+#### d) Visualize the model learned by RSEM
RSEM provides an R script, 'rsem-plot-model', for visualizing the model learned.
Usage:
- rsem-plot-model sample_name outF
+ rsem-plot-model sample_name output_plot_file
sample_name: the name of the sample analyzed
-outF: the file name for plots generated from the model. It is a pdf file
+output_plot_file: the file name for plots generated from the model. It is a pdf file
The plots generated depend on read type and user configuration. They
may include fragment length distribution, mate length distribution,
## <a name="example"></a> Example
-Suppose we download the mouse genome from UCSC Genome Browser. We will
-use a reference_name of 'mm9'. We have a FASTQ-formatted file,
-'mmliver.fq', containing single-end reads from one sample, which we call
-'mmliver_single_quals'. We want to estimate expression values by using
-the single-end model with a fragment length distribution. We know that
-the fragment length distribution is approximated by a normal
-distribution with a mean of 150 and a standard deviation of 35. We wish
-to generate 95% credibility intervals in addition to maximum likelihood
-estimates. RSEM will be allowed 1G of memory for the credibility
-interval calculation. We will visualize the probabilistic read mappings
-generated by RSEM.
+Suppose we download the mouse genome from UCSC Genome Browser. We
+will use a reference_name of 'mm9'. We have a FASTQ-formatted file,
+'mmliver.fq', containing single-end reads from one sample, which we
+call 'mmliver_single_quals'. We want to estimate expression values by
+using the single-end model with a fragment length distribution. We
+know that the fragment length distribution is approximated by a normal
+distribution with a mean of 150 and a standard deviation of 35. We
+wish to generate 95% credibility intervals in addition to maximum
+likelihood estimates. RSEM will be allowed 1G of memory for the
+credibility interval calculation. We will visualize the probabilistic
+read mappings generated by RSEM on the UCSC Genome Browser. We will
+generate transcript wiggle plots for a list of genes, given in
+'gene_ids.txt', and save them in 'output.pdf'. We will visualize the
+models learned in 'mmliver_single_quals.models.pdf'.
The commands for this scenario are as follows:
rsem-prepare-reference --gtf mm9.gtf --mapping knownIsoforms.txt --bowtie-path /sw/bowtie /data/mm9 /ref/mm9
- rsem-calculate-expression --bowtie-path /sw/bowtie --phred64-quals --fragment-length-mean 150.0 --fragment-length-sd 35.0 -p 8 --out-bam --calc-ci --memory-allocate 1024 /data/mmliver.fq /ref/mm9 mmliver_single_quals
+ rsem-calculate-expression --bowtie-path /sw/bowtie --phred64-quals --fragment-length-mean 150.0 --fragment-length-sd 35.0 -p 8 --output-genome-bam --calc-ci --memory-allocate 1024 /data/mmliver.fq /ref/mm9 mmliver_single_quals
rsem-bam2wig mmliver_single_quals.genome.sorted.bam mmliver_single_quals.sorted.wig mmliver_single_quals
+ rsem-plot-transcript-wiggles --gene-list --show-unique mmliver_single_quals gene_ids.txt output.pdf
+ rsem-plot-model mmliver_single_quals mmliver_single_quals.models.pdf
## <a name="simulation"></a> Simulation