% EBSeq/man/EBTest.Rd

\name{EBTest}
\alias{EBTest}
%- Also NEED an '\alias' for EACH other topic documented here.
\title{
Use the EM algorithm to calculate the posterior probability of being DE
}
\description{
Based on the NB-Beta empirical Bayes model, the EM algorithm is used to obtain the posterior probability of each transcript being differentially expressed (DE).
}
\usage{
EBTest(Data, NgVector=NULL, Vect5End=NULL, Vect3End=NULL, Conditions, sizeFactors, maxround, tau=NULL, CI=NULL, CIthre=NULL, Pool=F, NumBin=1000)
}
%- maybe also 'usage' for other objects documented here.
\arguments{

  \item{Data}{
A data matrix containing the expression values of each transcript (gene level or isoform level). Rows should be transcripts and columns should be samples.
}
  \item{NgVector}{
A vector containing the Ng value of each isoform. If an isoform belongs to a gene with 2 isoforms, its Ng should be 2. Ng can only be 1, 2 or 3. For gene level data, every element of NgVector should be 1. The vector length should equal the number of rows in Data.
}
  \item{Vect5End}{
A vector containing the 5' end information of each isoform. An element should be 1 if the isoform contains the 5' end and 0 otherwise. For gene level data, every element of Vect5End should be 1. The vector length should equal the number of rows in Data.
(Not recommended)
}
  \item{Vect3End}{
A vector containing the 3' end information of each isoform. An element should be 1 if the isoform contains the 3' end and 0 otherwise. For gene level data, every element of Vect3End should be 1. The vector length should equal the number of rows in Data.
(Not recommended)
}
  \item{Conditions}{
A vector indicating the condition each sample belongs to.
}


  \item{sizeFactors}{
The normalization factors.
They can be given as a vector of lane specific numbers,
or as a matrix of lane and transcript specific numbers.
}
  \item{maxround}{
Number of iterations. The suggested value is 5.
}

\item{tau}{
The tau value from the RSEM output. If the data has no replicates within condition,
EBSeq will use the CI of tau to capture the variation from mapping
uncertainty and to estimate the variance.
}
\item{CI}{
The CI of each tau from the RSEM output.
}
\item{CIthre}{
The CI threshold used by RSEM.
}
\item{Pool, NumBin}{
When the data has no replicates within condition, set Pool=TRUE in the
EBTest function to enable pooling.
With NumBin=1000, EBSeq groups the genes with similar means
together into 1,000 bins.
Assuming that no more than 50\% of the genes in the data set are DE,
the genes whose fold change (FC) lies between the 25\% and 75\% quantiles of all FCs are taken as the
candidate genes.
For each bin, the bin-wise variance estimate is the median of the
cross condition variance estimates of the candidate genes within that bin.
Candidate genes use their own cross condition variance estimates;
non-candidate genes use the bin-wise variance estimate of the bin they fall in.
}

}

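\details{
For each transcript gi within a condition, the model assumes:
\deqn{X_{g_i,s} | \mu_{g_i} \sim NB(r_{g_i,0} l_s, q_{g_i})}{X_gis | mu_gi ~ NB(r_gi0 * l_s, q_gi)}
\deqn{q_{g_i} | \alpha, \beta^{N_g, b_{g_i}} \sim Beta(\alpha, \beta^{N_g, b_{g_i}})}{q_gi | alpha, beta^(N_g, b_gi) ~ Beta(alpha, beta^(N_g, b_gi))}
where l_s is the size factor (sizeFactors) of sample s.

The function tests:
\deqn{H_0: q_{g_i}^{C1} = q_{g_i}^{C2}}{H0: q_giC1 = q_giC2}
\deqn{H_1: q_{g_i}^{C1} \neq q_{g_i}^{C2}}{H1: q_giC1 != q_giC2}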
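
The link between the fitted r and q values and the per-transcript means and variances can be written out. This is only a sketch: it assumes the NB distribution above is parametrized with q as the success probability, which this help page does not state.
\deqn{E(X_{g_i,s}) = r_{g_i,0} l_s (1 - q_{g_i}) / q_{g_i}}{E(X_gis) = r_gi0 * l_s * (1 - q_gi) / q_gi}
\deqn{Var(X_{g_i,s}) = r_{g_i,0} l_s (1 - q_{g_i}) / q_{g_i}^2}{Var(X_gis) = r_gi0 * l_s * (1 - q_gi) / q_gi^2}
Under that assumption q_gi equals the ratio of the mean to the variance of X_gis.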


}
\value{
\item{Alpha}{Fitted parameter alpha of the prior beta distribution. Rows are the values for each iteration.}
\item{Beta}{Fitted parameter beta of the prior beta distribution. Rows are the values for each iteration.}
\item{P, PFromZ}{The Bayes estimator of being DE. Rows are the values for each iteration.}
\item{Z, PoissonZ}{The posterior probability of being DE for each transcript. (May not be in the same order as the input.)}
\item{RList}{The fitted values of r for each transcript.}
\item{MeanList}{The mean of each transcript. (Across conditions.)}
\item{VarList}{The variance of each transcript. (Across conditions, using the expression values divided by their sizeFactors.)}
\item{QListi1}{The fitted q values of each transcript within condition 1.}
\item{QListi2}{The fitted q values of each transcript within condition 2.}
\item{C1Mean}{The mean of each transcript within condition 1.}
\item{C2Mean}{The mean of each transcript within condition 2.}
\item{C1EstVar}{The estimated variance of each transcript within condition 1.}
\item{C2EstVar}{The estimated variance of each transcript within condition 2.}
\item{PoolVar}{The variance of each transcript. (The pooled value of the within condition EstVar.)}
\item{DataList}{A list of the data grouped by Ng and bias.}
\item{PPDE}{The posterior probability of being DE for each transcript. (In the same order as the input.)}


}
\references{
}
\author{
Ning Leng
}
\note{
}


\seealso{
}
\examples{
# Simulate gene level data
GeneGenerate=GeneSimu(DVDconstant=4, DVDqt1=NULL, DVDqt2=NULL,
  Conditions=rep(c(1,2),each=5), NumofSample=10, NumofGene=10000,
  DEGeneProp=.1, Phiconstant=NULL, Phi.qt1=.25, Phi.qt2=.75,
  Meanconstant=NULL, OnlyData="Y")
GeneData=GeneGenerate$data
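
# Sketch of what a lane specific sizeFactors vector represents, written in
# base R on a toy matrix. It assumes a median-of-ratios scheme, which is a
# choice made for this illustration and is not claimed to match the
# normalization functions shipped with EBSeq. SketchCounts, SketchGeoMean,
# SketchSizes and SketchNormalized are hypothetical names.
SketchCounts=matrix(c(10,20,30,40, 20,40,60,80, 5,10,15,20), ncol=3)
# Geometric mean of each transcript across lanes
SketchGeoMean=exp(rowMeans(log(SketchCounts)))
# One factor per lane: the median ratio of the lane to the geometric means
SketchSizes=apply(SketchCounts, 2, function(x) median(x/SketchGeoMean))
# Dividing each lane by its factor puts the lanes on a common scale
SketchNormalized=sweep(SketchCounts, 2, SketchSizes, "/")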

# Run EBSeq
# sizeFactors can be obtained with MedianNorm, QuantileNorm or RankNorm
EBres=EBTest(Data=GeneData, NgVector=rep(1,10^4), Vect5End=rep(1,10^4),
  Vect3End=rep(1,10^4), Conditions=as.factor(rep(c(1,2),each=5)),
  sizeFactors=rep(1,10), maxround=5)
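
# A minimal sketch of reading the result, assuming a cutoff of 0.95 on the
# PPDE element described in the Value section. The cutoff is an illustrative
# choice and is not fixed by EBTest; GeneDEIndex is a hypothetical name.
GeneDEIndex=which(EBres$PPDE>=.95)
# Number of transcripts called DE at this cutoff
length(GeneDEIndex)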

# Isoform level
IsoGenerate=IsoSimu(DVDconstant=NULL, DVDqt1=.97, DVDqt2=.98,
  Conditions=as.factor(rep(c(1,2),each=5)), NumofSample=10,
  NumofIso=c(1000,2000,3000), DEIsoProp=.1, Phiconstant=NULL,
  Phi.qt1=.25, Phi.qt2=.75, OnlyData=T)

IsoMat=do.call(rbind,IsoGenerate$data)
IsoNames=rownames(IsoMat)

# IsosGeneNames is not defined in this example; it must hold the gene name
# of each isoform, in the same order as IsoNames
Ngvector=GetNg(IsoNames, IsosGeneNames)
IsoNgTrun=Ngvector$IsoformNgTrun

IsoEBres=EBTest(Data=IsoMat, NgVector=IsoNgTrun,
  Conditions=as.factor(rep(c(1,2),each=5)), sizeFactors=rep(1,10),
  maxround=5)
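
# Sketch of the pooling steps described for the Pool and NumBin arguments,
# written in base R on simulated data. It follows the documented steps only
# and is not the code EBTest runs. Assumptions made for this sketch: one
# sample per condition, equal-count bins built from the ranks of the means,
# a log2 fold change with an offset of 1, and the sample variance of the two
# values as the cross condition variance. All Sketch* names, NumGene,
# NumBinSketch, IsCandidate, BinID and BinVar are hypothetical.
set.seed(1)
NumGene=2000
NumBinSketch=20
SketchData=cbind(C1=rnbinom(NumGene, size=5, mu=200),
  C2=rnbinom(NumGene, size=5, mu=200))
SketchMean=rowMeans(SketchData)
SketchVar=apply(SketchData, 1, var)
SketchLogFC=log2((SketchData[,1]+1)/(SketchData[,2]+1))
# Candidate genes: FC between the 25 and 75 percent quantiles of all FCs
FCqt=quantile(SketchLogFC, c(.25,.75))
IsCandidate=SketchLogFC>=FCqt[[1]] & SketchLogFC<=FCqt[[2]]
# Group genes with similar means into bins
BinID=cut(rank(SketchMean, ties.method="first"), breaks=NumBinSketch,
  labels=FALSE)
# Bin-wise variance: median variance of the candidate genes in each bin
BinVar=tapply(SketchVar[IsCandidate],
  factor(BinID[IsCandidate], levels=1:NumBinSketch), median)
# Candidates keep their own variance; the others take the value of their bin
SketchPoolVar=ifelse(IsCandidate, SketchVar, BinVar[BinID])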

}
% Add one or more standard keywords, see file 'KEYWORDS' in the
% R documentation directory.
\keyword{ ~kwd1 }
\keyword{ ~kwd2 }% __ONLY ONE__ keyword per line