ESLiM: Exon Splicing by Linear Modeling Analysis

(Method designed to find alternative splicing events in defined genes)

This page provides all the packages and files needed to use ESLiM (Exon Splicing by Linear Modeling Analysis), a method that allows robust analysis and estimation of gene and exon expression using exon-specific platforms (such as the GeneChip Human Exon ST from Affymetrix, Inc.). The method is implemented as an R package and is designed to find alternative splicing events affecting specific exons in any defined gene locus at a genome-wide level.

Recent advances in large-scale genomic and transcriptomic technologies have generated many datasets from expression studies on human genes under different conditions, including signal information at the exon level. Analyzing this type of data with accurate measurements of both whole-gene and individual-exon expression is essential for identifying alternative splicing events. Among the technologies most frequently applied in such studies are exon-specific expression microarray platforms, which have been used extensively, for example, in the ENCODE project (http://genome.ucsc.edu/encode/). Since there are few validated comparative analyses for identifying specific splicing events from data derived from this type of platform, we have developed ESLiM as a method to detect significant changes in exon use, and we have applied it to a reference dataset of 270 human genes (i.e. genes that have been proven to present alternative expression in different tissues). All the files needed to use the ESLiMc method, including an example of use, are available for download from the links below.

ESLiM R package. R package including the full code to run the method.
- File name: ESLiM_1.0.tar.gz

Example of Use. Full example to use and run the ESLiM R package.
- File name: ESLiM_Install_and_Use.R

ExonMapper CDF R package. This is a file that includes the mapping of all exons included in GeneChip Human Exon 1.0 ST platform to the corresponding human gene locus (as defined in the last version of Ensembl).
- File name: exonmapperhumanexon1.0cdf_3.0.tar.gz

ESLiMc GeneMapper CDF R package. This is a file that includes the human gene/probe mapping designed to calculate each whole-gene signal using only the probes that map to transcripts covering >=60% of the corresponding gene locus (applied in ESLiMc).
- File name: eslimcgenemapperhumanexon1.0cdf_3.0.tar.gz

Exon Annotation R data file. This is a file that provides the annotation of all human exons and of the genes where they are located (gene symbol, gene description and gene chromosomal location).
- File name: exons.human.Annotation.RData

Data set of microarrays. Compressed file including the raw data files of a dataset of 33 GeneChip Human Exon 1.0 ST exon microarrays (from Affymetrix, Inc.), containing 3 replicates of each of 11 different healthy human tissues.
- File name: affy_dataset.zip
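
As a rough sketch of how the files listed above might be set up in R, the snippet below installs the source tarballs from the local disk and unpacks the example data. It assumes that all files were downloaded into the current working directory; the helper install_local_tarballs is hypothetical and is not part of ESLiM. Because install.packages() is called with repos = NULL, dependencies are not resolved automatically and may need to be installed beforehand. The provided ESLiM_Install_and_Use.R script remains the reference for the actual workflow.

```r
# File names as listed above; the location (current working directory)
# is an assumption made for this sketch.
eslim_tarballs <- c(
  "exonmapperhumanexon1.0cdf_3.0.tar.gz",
  "eslimcgenemapperhumanexon1.0cdf_3.0.tar.gz",
  "ESLiM_1.0.tar.gz"
)

# Hypothetical helper (not part of ESLiM): install every tarball that is
# present on disk and report which ones were found.
install_local_tarballs <- function(files) {
  found <- file.exists(files)
  for (f in files[found]) {
    # repos = NULL + type = "source" installs from a local source package.
    install.packages(f, repos = NULL, type = "source")
  }
  invisible(setNames(found, files))
}

status <- install_local_tarballs(eslim_tarballs)
print(status)

# The exon annotation is a plain RData file, so load() is enough;
# load() returns the names of the objects it restored.
if (file.exists("exons.human.Annotation.RData")) {
  loaded_objects <- load("exons.human.Annotation.RData")
  print(loaded_objects)
}

# Unpack the 33-array example dataset into its own folder.
if (file.exists("affy_dataset.zip")) {
  unzip("affy_dataset.zip", exdir = "affy_dataset")
}
```

The named logical vector in status shows which tarballs were found, which makes a missing download easy to spot before running the example script.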




Example view of a gene locus illustrating the strategy used in the ESLiMc method. The figure presents the locus of the human gene regucalcin (RGN, Ensembl ID: ENSG00000130988, located on chromosome X, band p11.23), which includes 6 different transcripts, including RGN-001, RGN-002, RGN-003, RGN-201 and RGN-202. The transcripts cover different parts of the locus and include different exons (represented as rectangular boxes placed along the transcripts). The whole RGN gene locus includes 15 different exons. The filled colored boxes within the transcripts correspond to the protein-coding sequence regions. The "green shadows" mark the exons that are common to the long transcripts. The "pink shadows" mark two short transcripts that cover less than 60% of the gene locus. At the top of the figure, a black and white bar shows the genomic coordinates to indicate the position of each exon.
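
The 60% coverage rule described above can be illustrated with a small base R sketch. This is not the ESLiMc implementation: the coordinates and transcript names are invented for illustration (they are not the real RGN annotation), the function locus_coverage is hypothetical, and "coverage" is assumed here to mean the genomic span of a transcript (introns included) relative to the length of the gene locus.

```r
# Toy locus and transcripts. All coordinates are made up for illustration
# and do NOT correspond to the real RGN locus.
locus <- c(start = 1000, end = 11000)

transcripts <- data.frame(
  id    = c("tx_long_1", "tx_long_2", "tx_short_1", "tx_short_2"),
  start = c(1000, 1500, 1000, 8000),
  end   = c(11000, 10500, 4000, 11000),
  stringsAsFactors = FALSE
)

# Assumption: coverage = span of the transcript clipped to the locus,
# divided by the locus length (both counted in bases, inclusive).
locus_coverage <- function(tx_start, tx_end, locus) {
  overlap <- pmin(tx_end, locus[["end"]]) - pmax(tx_start, locus[["start"]]) + 1
  pmax(overlap, 0) / (locus[["end"]] - locus[["start"]] + 1)
}

transcripts$coverage <- locus_coverage(transcripts$start, transcripts$end, locus)

# Keep only transcripts covering at least 60% of the locus; under the rule
# described above, only probes mapping to these transcripts would
# contribute to the whole-gene signal.
transcripts$kept <- transcripts$coverage >= 0.60

print(transcripts)
```

In this toy example the two long transcripts pass the threshold and the two short ones are excluded, mirroring the "green" and "pink" groups in the figure.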