Meeting Abstract
10.4 Friday, Jan. 4 Accuracy of pooled RNA-seq KONCZAL, M.*; KOTEJA, P.; RADWAN, J.; STUGLIK, M.; BABIK, W.; Jagiellonian University in Krakow; Jagiellonian University in Krakow; Jagiellonian University in Krakow; Jagiellonian University in Krakow; Jagiellonian University in Krakow mateusz.konczal@uj.edu.pl
For non-model organisms without a reference genome, genome-wide information focusing on functionally relevant variation may be obtained through RNA-seq with a de novo assembled reference transcriptome. Sequencing itself has become relatively cheap, but library preparation for many samples remains prohibitively expensive. In such cases pooling appears to be an attractive but nontrivial approach. Inter-individual and inter-locus variation in expression level could cause inaccuracy in allele frequency (AF) estimation, a problem that does not affect pooled genome resequencing. To estimate the accuracy of pooled RNA-seq in predicting AF, we analyzed liver transcriptomes of 10 bank voles (Myodes glareolus). Each sample was sequenced both as an individually barcoded library and as part of a pool. The pool consisted of equal amounts of total RNA from each vole, combined prior to mRNA selection and library construction. On average, 16.8 million reads (100 bp paired-end) were obtained per individual. Reads were mapped to the de novo assembled reference transcriptome. High-quality genotypes were available for every vole at 33,000 SNPs. These genotypes allowed us to calculate the true AF in the sample. AFs estimated from the pool were compared to the true values. We observed a high correlation between true frequencies and those estimated from the pool (R² = 0.89). The mean estimation error was 21% of the true value and was independent of expression level, which indicates that the accuracy of AF estimation from pooled samples is relatively robust to variation in expression between individuals. However, we observed a strong negative correlation between minor AF and the calculated error, a problem that also affects genome studies. Our results indicate that the efficiency of pooled RNA-seq may be comparable to that of pooled genome resequencing.
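The comparison described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the pipeline used in the study: it assumes diploid genotypes coded as alternate-allele counts (0, 1, 2), a pooled AF taken as the fraction of reads carrying the alternate allele, and an error expressed relative to the true AF. All function names and the toy numbers are hypothetical.

```python
from statistics import mean


def true_allele_frequency(genotypes):
    """True AF from individual genotypes.

    Assumption: each genotype is the number of alternate alleles
    (0, 1 or 2) carried by one diploid individual.
    """
    return sum(genotypes) / (2 * len(genotypes))


def pooled_allele_frequency(ref_reads, alt_reads):
    """AF estimated from the pool as the share of alternate-allele reads."""
    total = ref_reads + alt_reads
    if total == 0:
        raise ValueError("SNP has no read coverage in the pool")
    return alt_reads / total


def relative_error(estimate, truth):
    """Absolute estimation error as a fraction of the true value."""
    return abs(estimate - truth) / truth


def r_squared(xs, ys):
    """Squared Pearson correlation between two equally long sequences."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy * sxy / (sxx * syy)


# Toy data (invented for illustration): per SNP, genotypes of 10
# individuals plus reference and alternate read counts in the pool.
toy_snps = [
    ([0, 1, 1, 0, 2, 0, 1, 0, 0, 1], 70, 30),
    ([2, 2, 1, 1, 2, 1, 2, 2, 1, 2], 22, 78),
    ([0, 0, 0, 1, 0, 0, 0, 0, 0, 0], 90, 10),
]

true_afs = [true_allele_frequency(g) for g, _, _ in toy_snps]
pool_afs = [pooled_allele_frequency(r, a) for _, r, a in toy_snps]
errors = [relative_error(p, t) for p, t in zip(pool_afs, true_afs)]

for t, p, e in zip(true_afs, pool_afs, errors):
    print(f"true AF={t:.2f}  pooled AF={p:.2f}  relative error={e:.0%}")
print(f"R^2 = {r_squared(true_afs, pool_afs):.3f}")
```

In the toy data the rarest allele shows the largest relative error, because the same absolute deviation is divided by a smaller true frequency. This is one way a negative relationship between minor AF and relative error can arise.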