From: Bo Li
Date: Wed, 26 Jun 2013 22:28:09 +0000 (-0500)
Subject: Updated EBSeq from v1.1.5 to v1.1.6 and fixed a bug in 'rsem-generate-data-matrix...
X-Git-Url: https://git.donarmstrong.com/?p=rsem.git;a=commitdiff_plain;h=bf2e8ee71918a524c5dddbe34cefeec33f16de9f

Updated EBSeq from v1.1.5 to v1.1.6 and fixed a bug in 'rsem-generate-data-matrix', which can cause 'rsem-find-DE' to crash
---

diff --git a/EBSeq/EBSeq_1.1.5.tar.gz b/EBSeq/EBSeq_1.1.5.tar.gz
deleted file mode 100644
index e0bf8de..0000000
Binary files a/EBSeq/EBSeq_1.1.5.tar.gz and /dev/null differ
diff --git a/EBSeq/EBSeq_1.1.6.tar.gz b/EBSeq/EBSeq_1.1.6.tar.gz
new file mode 100644
index 0000000..f2b0d5e
Binary files /dev/null and b/EBSeq/EBSeq_1.1.6.tar.gz differ
diff --git a/EBSeq/makefile b/EBSeq/makefile
index 020a32a..ac97910 100644
--- a/EBSeq/makefile
+++ b/EBSeq/makefile
@@ -6,8 +6,8 @@ all : $(PROGRAMS)
 blockmodeling : blockmodeling_0.1.8.tar.gz
 	R CMD INSTALL -l "." blockmodeling_0.1.8.tar.gz
 
-EBSeq : blockmodeling EBSeq_1.1.5.tar.gz
-	R CMD INSTALL -l "." EBSeq_1.1.5.tar.gz
+EBSeq : blockmodeling EBSeq_1.1.6.tar.gz
+	R CMD INSTALL -l "." EBSeq_1.1.6.tar.gz
 
 rsem-for-ebseq-calculate-clustering-info : calcClusteringInfo.cpp
 	$(CC) -O3 -Wall calcClusteringInfo.cpp -o $@
diff --git a/README.md b/README.md
index 087493d..7b9a4b8 100644
--- a/README.md
+++ b/README.md
@@ -375,13 +375,13 @@ you.
 
 Usage:
 
-    rsem-find-DE data_matrix_file [--ngvector ngvector_file] number_sample_condition1 FDR_rate output_file
+    rsem-find-DE data_matrix_file [--ngvector ngvector_file] number_of_samples_in_condition_1 FDR_rate output_file
 
 This script calls EBSeq to find differentially expressed genes/transcripts in two conditions.
 
 data_matrix_file: m by n matrix containing expected counts, m is the number of transcripts/genes, n is the number of total samples.
 [--ngvector ngvector_file]: optional field. 'ngvector_file' is calculated by 'rsem-generate-ngvector'. Having this field is recommended for transcript data.
-number_sample_condition1: the number of samples in condition 1. A condition's samples must be adjacent. The left group of samples are defined as condition 1.
+number_of_samples_in_condition_1: the number of samples in condition 1. A condition's samples must be adjacent. The left group of samples are defined as condition 1.
 FDR_rate: false discovery rate.
 output_file: the output file. Three files will be generated: 'output_file', 'output_file.hard_threshold' and 'output_file.all'. The first file reports all DE genes/transcripts using a soft threshold (calculated by crit_func in EBSeq). The second file reports all DE genes/transcripts using a hard threshold (only report if PPEE <= fdr). The third file reports all genes/transcripts. The first file is recommended to be used as DE results because it generally contains more called genes/transcripts.
 
@@ -414,6 +414,7 @@ RSEM uses the [Boost C++](http://www.boost.org) and
 differential expression analysis.
 
 We thank earonesty for contributing patches.
+
 We thank Han Lin for suggesting possible fixes.
 
 ## License
diff --git a/WHAT_IS_NEW b/WHAT_IS_NEW
index 28108dc..86e2194 100644
--- a/WHAT_IS_NEW
+++ b/WHAT_IS_NEW
@@ -1,3 +1,10 @@
+RSEM v1.2.5
+
+- Updated EBSeq from v1.1.5 to v1.1.6
+- Fixed a bug in 'rsem-generate-data-matrix', which can cause 'rsem-find-DE' to crash
+
+--------------------------------------------------------------------------------------------
+
 RSEM v1.2.4
 
 - Fixed a bug that leads to poor parallelization performance in Mac OS systems
diff --git a/rsem-find-DE b/rsem-find-DE
index e9d65cf..a69ebd4 100755
--- a/rsem-find-DE
+++ b/rsem-find-DE
@@ -1,11 +1,11 @@
 #!/usr/bin/env Rscript
 
 printUsage <- function() {
-  cat("Usage: rsem-find-DE data_matrix_file [--ngvector ngvector_file] number_sample_condition1 FDR_rate output_file\n\n")
+  cat("Usage: rsem-find-DE data_matrix_file [--ngvector ngvector_file] number_of_samples_in_condition_1 FDR_rate output_file\n\n")
   cat("This script calls EBSeq to find differentially expressed genes/transcripts in two conditions.\n\n")
   cat("data_matrix_file: m by n matrix containing expected counts, m is the number of transcripts/genes, n is the number of total samples.\n")
   cat("[--ngvector ngvector_file]: optional field. 'ngvector_file' is calculated by 'rsem-generate-ngvector'. Having this field is recommended for transcript data.\n")
-  cat("number_sample_condition1: the number of samples in condition 1. A condition's samples must be adjacent. The left group of samples are defined as condition 1.\n")
+  cat("number_of_samples_in_condition_1: the number of samples in condition 1. A condition's samples must be adjacent. The left group of samples are defined as condition 1.\n")
   cat("FDR_rate: false discovery rate.\n")
   cat("output_file: the output file. Three files will be generated: 'output_file', 'output_file.hard_threshold' and 'output_file.all'. The first file reports all DE genes/transcripts using a soft threshold (calculated by crit_func in EBSeq). The second file reports all DE genes/transcripts using a hard threshold (only report if PPEE <= fdr). The third file reports all genes/transcripts. The first file is recommended to be used as DE results because it generally contains more called genes/transcripts.\n\n")
   cat("The results are written as a matrix with row and column names. The row names are the genes'/transcripts' ids. The column names are 'PPEE', 'PPDE', 'PostFC' and 'RealFC'.\n\n")
diff --git a/rsem-generate-data-matrix b/rsem-generate-data-matrix
index 951f415..b1b4f63 100755
--- a/rsem-generate-data-matrix
+++ b/rsem-generate-data-matrix
@@ -22,7 +22,7 @@ sub loadData {
     while ($line = <INPUT>) {
         chomp($line);
         my @fields = split(/\t/, $line);
-        push(@{$_[2]}, $fields[0]);
+        push(@{$_[2]}, "\"$fields[0]\"");
         push(@{$_[1]}, $fields[$offsite]);
     }
     close(INPUT);
@@ -59,7 +59,11 @@ for (my $i = 0; $i < $n; $i++) {
         exit(-1);
     }
 
-    @ecs = ($ARGV[$i], @ecs);
+    my $colname;
+    if (substr($ARGV[$i], 0, 2) eq "./") { $colname = substr($ARGV[$i], 2); }
+    else { $colname = $ARGV[$i]; }
+    $colname = "\"$colname\"";
+    @ecs = ($colname, @ecs);
     push(@matrix, \@ecs);
 }