Name: Ignacio Alvarez
Institution: Universidad de la República, Uruguay
E-mail: nachalca@iesta.edu.uy
Co-authors: Jarad Niemi

Abstract:

Diploid organisms have two copies of each gene, called alleles, that can be transcribed separately. The RNA abundance associated with a particular allele is known as allele-specific expression (ASE). When the two alleles differ by polymorphisms in transcribed regions, ASE can be studied using RNA-seq read count data.

ASE differs from regular RNA-seq expression in three ways: ASE cannot be assessed for every gene, measures of ASE can be biased towards one of the alleles (the reference allele), and ASE provides two measures of expression per gene for each biological sample, which leads to additional complications for single-gene models.

We present statistical methods for modeling ASE and detecting genes with differential allelic expression. We propose a hierarchical, overdispersed count regression model for ASE counts. The model accommodates gene-specific overdispersion, includes an internal measure of the reference allele bias, and uses random effects to model the gene-specific regression parameters. Fully Bayesian inference is obtained using the fbseq package, which implements a parallel strategy to keep computation times reasonable. Simulation studies and real data analysis suggest that the proposed model is a practical and powerful tool for the study of differential ASE.
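
To make the ideas of reference allele bias and gene-specific overdispersion concrete, the following is a minimal simulation sketch for a single gene. It is not the model fitted by fbseq: it assumes, purely for illustration, a beta-binomial distribution for the reference-allele read count, and every name in it (simulate_ref_counts, ref_bias, allelic_effect, concentration) is hypothetical.

```python
import math
import random


def simulate_ref_counts(n_samples, total_reads, ref_bias, allelic_effect,
                        concentration, rng):
    """Simulate reference-allele read counts for one gene.

    Illustrative assumption (not the abstract's model): counts follow a
    beta-binomial whose mean reference proportion is
    logistic(ref_bias + allelic_effect). Smaller `concentration` means
    more overdispersion relative to a plain binomial.
    """
    mean_prop = 1.0 / (1.0 + math.exp(-(ref_bias + allelic_effect)))
    a = mean_prop * concentration
    b = (1.0 - mean_prop) * concentration
    counts = []
    for _ in range(n_samples):
        # Sample-specific reference proportion (source of overdispersion).
        p = rng.betavariate(a, b)
        # Binomial draw written as a sum of Bernoulli trials.
        ref = sum(1 for _ in range(total_reads) if rng.random() < p)
        counts.append(ref)
    return counts


rng = random.Random(1)
n_samples, total_reads = 200, 50

# Gene with no true allelic imbalance, only a small reference bias.
balanced = simulate_ref_counts(n_samples, total_reads, ref_bias=0.1,
                               allelic_effect=0.0, concentration=20.0, rng=rng)
# Gene with the same bias plus a true allelic effect.
shifted = simulate_ref_counts(n_samples, total_reads, ref_bias=0.1,
                              allelic_effect=1.0, concentration=20.0, rng=rng)

mean_balanced = sum(balanced) / (n_samples * total_reads)
mean_shifted = sum(shifted) / (n_samples * total_reads)
print(f"mean reference proportion, balanced gene: {mean_balanced:.3f}")
print(f"mean reference proportion, shifted gene:  {mean_shifted:.3f}")
```

In this sketch the balanced gene still shows a reference proportion slightly above one half, which is why a model needs an explicit bias term to separate mapping bias from true allelic imbalance.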