Statistics of the PROBE MAPPING
Statistics of the mapping to different TRANSCRIBED ENTITIES.
This page provides the statistics derived from the sequence
mapping of all the oligonucleotide probes from Affymetrix human expression
microarrays to different types of RNA. It separates probes that map to protein-coding RNAs
(i.e. exons that encode proteins and correspond to mature mRNAs) from probes that map to non-protein-coding RNAs
(ncRNAs) and probes not assigned (NA) to any RNA.
Probes that mapped only to introns were counted as putative ncRNAs.
The percentages for these mappings are provided for four widely used human expression array models.
Statistics of the mapping to transcripts and gene loci:
COVERAGE and EFFICIENCY.
This page provides the statistics derived from the sequence
mapping of all the oligonucleotide probes from Affymetrix expression microarrays
to all transcripts and genes of Ensembl (v57)
(including protein-coding genes and RNA genes) for human (Homo sapiens), mouse
(Mus musculus) and rat (Rattus norvegicus).
Accurate expression determination requires that microarray probes have minimal cross-hybridization
with other genes or other transcribed entities. Detailed information regarding the
coverage and efficiency of the probe mapping is included below.
Coverage is defined as the percentage of protein-coding gene loci or transcripts in the
corresponding Ensembl genome (human, mouse or rat) that are mapped by the
probes of a given microarray.
Efficiency is defined as the percentage of all the probes of a
given microarray that map to Ensembl genes or transcripts.
The term "unique mapped" indicates the gene loci or transcripts that are mapped
by a set of probes of a given microarray that do not cross-hybridize with
any other gene locus or transcript. Such probes are therefore not ambiguous.
Coverage data are presented in Tables 1 and 3, and
efficiency data are presented in Tables 2 and 4.
For both transcripts and genes, the statistics
are calculated only from probes that map to exons;
probes that map to introns are not considered in these calculations.
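The two definitions above reduce to simple ratios. The following is a minimal sketch, not the code used to build this page: the function names `coverage` and `efficiency` are hypothetical, and the input counts are the HG_U133A values reported in Tables 1 and 2.

```python
def coverage(mapped_entities: int, total_entities: int) -> float:
    """Percentage of protein-coding gene loci (or transcripts) mapped by the array."""
    return 100.0 * mapped_entities / total_entities


def efficiency(mapped_probes: int, total_probes: int) -> float:
    """Percentage of the array's probes that map to Ensembl genes or transcripts."""
    return 100.0 * mapped_probes / total_probes


# HG_U133A: 13415 of 21281 human protein-coding gene loci are mapped (Table 1),
# and 202043 of 241898 probes map to a gene or transcript (Table 2).
gene_coverage = coverage(13415, 21281)
probe_efficiency = efficiency(202043, 241898)

print(f"gene coverage:    {gene_coverage:.2f}%")
print(f"probe efficiency: {probe_efficiency:.2f}%")
```

Rounded to two decimals, these ratios agree with the 63.04% coverage and 83.52% efficiency listed for HG_U133A.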
TABLE 1. Number of protein-coding transcripts and gene loci mapped: MAPPING COVERAGE.
– This table presents a summary of the number and percentage of protein-coding transcripts and
gene loci that are mapped by the oligonucleotide probes present in the Affymetrix
expression microarrays.
– The table includes the data for the most widely used human microarrays based on 3'
expression (HG_U133 series) and the newer all-exon expression microarrays
(Gene_1.0 and Exon_1.0). The complete data for human, mouse and rat
microarrays are presented below, in TABLE 3.
– The numbers in TABLE 1 show that the coverage of known human protein-coding genes (21281
in Ensembl v57) has improved with successive models of expression
microarrays: HG_U133A 63.0%; HG_U133_Plus_2 89.0%; Gene_1.0 94.9%; Exon_1.0 98.7%.
– Transcript coverage improves progressively in the same way as gene coverage,
relative to the total number of known transcripts derived from protein-coding gene loci (100299 in Ensembl
v57).
– The statistics also include the transcripts that are mapped by unique probes and
can therefore be detected specifically; for example, 39.23% with the Exon_1.0 array. These
data can be very useful for studies of alternative splicing.
| Microarray | Transcripts, unique mapped (N) | Transcripts, unique mapped (%) | Transcripts, all mapped (N) | Transcripts, all mapped (%) | Gene loci, unique mapped (N) | Gene loci, unique mapped (%) | Gene loci, all mapped (N) | Gene loci, all mapped (%) | TOTAL Nº of transcripts | TOTAL Nº of gene loci |
| Human |
| HG_U133A | 5646 | 5.63% | 47376 | 47.23% | 12299 | 57.79% | 13415 | 63.04% | 100299 | 21281 |
| HG_U133_Plus_2 | 10561 | 10.53% | 68147 | 67.94% | 17724 | 83.29% | 18950 | 89.05% | 100299 | 21281 |
| Human_Gene_1.0 | 17117 | 17.07% | 97192 | 96.90% | 19213 | 90.28% | 20192 | 94.88% | 100299 | 21281 |
| Human_Exon_1.0 | 39350 | 39.23% | 99816 | 99.52% | 20238 | 95.10% | 21012 | 98.74% | 100299 | 21281 |

Go to full statistics in TABLE 3
TABLE 2. Number of oligo probes that map to transcripts and gene loci: MAPPING EFFICIENCY.
– This table presents the number and percentage of oligonucleotide probes of distinct
sequence present in each microarray that map to one or more transcripts and to one or more
gene loci. The columns headed >1 count the probes that map to more
than one transcript or more than one gene locus. Oligo probes that map to several
transcripts or loci can be considered AMBIGUOUS probes.
– The complete data for human, mouse and rat microarrays are presented
below, in TABLE 4.
– Among the four arrays in this table, the best mapping efficiency is 91.22%, obtained with
the Gene_1.0 array. For the HG_U133A array, 16.5% of the probes
do not map to gene loci of the current human genome version. Moreover, if only
specific probes (those mapping to a single gene locus) are considered (192213 probes for HG_U133A), the mapping
efficiency is only 79.5% (192213/241898). These calculations indicate that a
proportion of the probes (16 - 21%) can produce noise when the expression signal is
calculated in the standard way from the probesets assigned by Affymetrix. This problem is also
present in the newer Exon_1.0 arrays, which show the lowest efficiency, with only about 31%
of the probes mapping to exons. The manufacturer and the users should take this into
consideration, because many probes in the expression microarrays apparently do not
map to any known entity (exon, transcript or gene) and may produce substantial noise in the
expression calculations.
| Microarray | Transcripts, 1 (N probes) | Transcripts, 1 (%) | Transcripts, >1 (N probes) | Transcripts, >1 (%) | Gene loci, 1 (N probes) | Gene loci, 1 (%) | Gene loci, >1 (N probes) | Gene loci, >1 (%) | TOTAL nº of probes mapping | TOTAL nº of probes in the microarray | Mapping efficiency (%) |
| HG_U133A | 65103 | 32.22% | 136940 | 67.78% | 192213 | 95.13% | 9830 | 4.87% | 202043 | 241898 | 83.52% |
| HG_U133_Plus_2 | 149924 | 39.90% | 225817 | 60.10% | 360264 | 95.88% | 15477 | 4.12% | 375741 | 594532 | 63.20% |
| Human_Gene_1.0 | 294841 | 40.19% | 438868 | 59.81% | 673873 | 91.84% | 59836 | 8.16% | 733709 | 804372 | 91.22% |
| Human_Exon_1.0 | 619903 | 37.86% | 1017473 | 62.14% | 1543530 | 94.27% | 93846 | 5.73% | 1637376 | 5270588 | 31.07% |

Go to full statistics in TABLE 4
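The split of probes into the "1" and ">1" columns, and the figures quoted for HG_U133A in the notes to TABLE 2, can be illustrated with a short sketch. It assumes a probe-to-loci mapping is available as a dictionary; the function `classify_probes`, the probe names and the gene names in `toy_mapping` are hypothetical and serve only as an illustration. The HG_U133A counts are those given in TABLE 2.

```python
from collections import Counter


def classify_probes(probe_to_loci: dict[str, set[str]]) -> Counter:
    """Count probes mapping to no locus, exactly one locus, or several loci."""
    counts = Counter()
    for loci in probe_to_loci.values():
        if len(loci) == 0:
            counts["unmapped"] += 1
        elif len(loci) == 1:
            counts["specific"] += 1   # the "1" column
        else:
            counts["ambiguous"] += 1  # the ">1" column
    return counts


# Hypothetical toy data, only to show the classification.
toy_mapping = {
    "probe_1": {"GENE_A"},
    "probe_2": {"GENE_A", "GENE_B"},
    "probe_3": set(),
    "probe_4": {"GENE_C"},
}
counts = classify_probes(toy_mapping)

# HG_U133A gene-locus counts from TABLE 2.
specific, ambiguous, total = 192213, 9830, 241898
overall_efficiency = 100.0 * (specific + ambiguous) / total
specific_efficiency = 100.0 * specific / total
unmapped_share = 100.0 - overall_efficiency

print(counts)
print(f"{overall_efficiency:.2f}% {specific_efficiency:.2f}% {unmapped_share:.2f}%")
```

Under these assumptions, the share of HG_U133A probes that can contribute noise ranges from the unmapped share alone to the unmapped plus ambiguous share, which corresponds to the 16 - 21% range mentioned in the notes.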
TABLE 3. Number of protein-coding transcripts and gene loci mapped: MAPPING COVERAGE.
– This table presents the complete data about the transcripts and gene loci
that are mapped by the oligo probes of the Affymetrix expression microarrays built for human,
mouse and rat.

| Microarray | Transcripts, unique mapped (N) | Transcripts, unique mapped (%) | Transcripts, all mapped (N) | Transcripts, all mapped (%) | Gene loci, unique mapped (N) | Gene loci, unique mapped (%) | Gene loci, all mapped (N) | Gene loci, all mapped (%) | TOTAL Nº of transcripts | TOTAL Nº of gene loci |
| Human |
| HG_U133A | 5646 | 5.63% | 47376 | 47.23% | 12299 | 57.79% | 13415 | 63.04% | 100299 | 21281 |
| HG_U133A_2 | 5646 | 5.63% | 47376 | 47.23% | 12299 | 57.79% | 13415 | 63.04% | 100299 | 21281 |
| HG_U133B | 4270 | 4.26% | 24072 | 24.00% | 7433 | 34.93% | 8139 | 38.25% | 100299 | 21281 |
| HG_U133_Plus_2 | 10561 | 10.53% | 68147 | 67.94% | 17724 | 83.29% | 18950 | 89.05% | 100299 | 21281 |
| HG_U95A | 3626 | 3.62% | 31634 | 31.54% | 8545 | 40.15% | 9561 | 44.93% | 100299 | 21281 |
| HG_U95Av2 | 3627 | 3.62% | 31615 | 31.52% | 8546 | 40.16% | 9560 | 44.92% | 100299 | 21281 |
| HG_U95B | 2719 | 2.71% | 17303 | 17.25% | 5483 | 25.76% | 5955 | 27.98% | 100299 | 21281 |
| HG_U95C | 2041 | 2.03% | 14519 | 14.48% | 3996 | 18.78% | 5195 | 24.41% | 100299 | 21281 |
| HG_U95D | 1448 | 1.44% | 9211 | 9.18% | 2572 | 12.09% | 3433 | 16.13% | 100299 | 21281 |
| HG_U95E | 2178 | 2.17% | 13717 | 13.68% | 4004 | 18.81% | 4638 | 21.79% | 100299 | 21281 |
| HG_Focus | 3198 | 3.19% | 28725 | 28.64% | 8156 | 38.33% | 9017 | 42.37% | 100299 | 21281 |
| HC_G110 | 533 | 0.53% | 5914 | 5.90% | 1343 | 6.31% | 1845 | 8.67% | 100299 | 21281 |
| U133_X3P | 9953 | 9.92% | 63392 | 63.20% | 17583 | 82.62% | 18787 | 88.28% | 100299 | 21281 |
| Human_Gene_1.0 | 17117 | 17.07% | 97192 | 96.90% | 19213 | 90.28% | 20192 | 94.88% | 100299 | 21281 |
| Human_Exon_1.0 | 39350 | 39.23% | 99816 | 99.52% | 20238 | 95.10% | 21012 | 98.74% | 100299 | 21281 |
| Mouse |
| MG_U74A | 4429 | 6.29% | 19521 | 27.73% | 7371 | 32.32% | 8815 | 38.65% | 70406 | 22806 |
| MG_U74Av2 | 4870 | 6.92% | 20934 | 29.73% | 8123 | 35.62% | 9330 | 40.91% | 70406 | 22806 |
| MG_U74B | 3170 | 4.50% | 12574 | 17.86% | 5179 | 22.71% | 5711 | 25.04% | 70406 | 22806 |
| MG_U74Bv2 | 3843 | 5.46% | 15176 | 21.55% | 6311 | 27.67% | 6880 | 30.17% | 70406 | 22806 |
| MG_U74C | 1160 | 1.65% | 4502 | 6.39% | 1757 | 7.70% | 2498 | 10.95% | 70406 | 22806 |
| MG_U74Cv2 | 2165 | 3.08% | 7561 | 10.74% | 3399 | 14.90% | 3928 | 17.22% | 70406 | 22806 |
| Mouse430_2 | 11967 | 17.00% | 44667 | 63.44% | 17037 | 74.70% | 18402 | 80.69% | 70406 | 22806 |
| Mouse430A_2 | 7850 | 11.15% | 33714 | 47.89% | 12572 | 55.13% | 13795 | 60.49% | 70406 | 22806 |
| MOE430A | 7850 | 11.15% | 33714 | 47.89% | 12572 | 55.13% | 13795 | 60.49% | 70406 | 22806 |
| MOE430B | 4996 | 7.10% | 15321 | 21.76% | 6853 | 30.05% | 7379 | 32.36% | 70406 | 22806 |
| Mu11KsubA | 2621 | 3.72% | 12435 | 17.66% | 4530 | 19.86% | 6026 | 26.42% | 70406 | 22806 |
| Mu11KsubB | 1754 | 2.49% | 9477 | 13.46% | 3023 | 13.26% | 3873 | 16.98% | 70406 | 22806 |
| Mouse_Gene_1.0 | 19692 | 27.97% | 69162 | 98.23% | 21390 | 93.79% | 22354 | 98.02% | 70406 | 22806 |
| Mouse_Exon_1.0 | 39114 | 55.55% | 69962 | 99.37% | 21506 | 94.30% | 22412 | 98.27% | 70406 | 22806 |
| Rat |
| RG_U34A | 3426 | 10.39% | 8254 | 25.03% | 4406 | 19.21% | 5664 | 24.69% | 32971 | 22938 |
| RG_U34B | 2348 | 7.12% | 4871 | 14.77% | 3117 | 13.59% | 3453 | 15.05% | 32971 | 22938 |
| RG_U34C | 2622 | 7.95% | 5536 | 16.79% | 3499 | 15.25% | 3970 | 17.31% | 32971 | 22938 |
| Rat230_2 | 9253 | 28.06% | 19554 | 59.31% | 12065 | 52.60% | 13428 | 58.54% | 32971 | 22938 |
| RAE230A | 6828 | 20.71% | 14859 | 45.07% | 8986 | 39.18% | 10209 | 44.51% | 32971 | 22938 |
| RAE230B | 3114 | 9.44% | 6513 | 19.75% | 4201 | 18.31% | 4498 | 19.61% | 32971 | 22938 |
| RN_U34 | 545 | 1.65% | 1463 | 4.44% | 723 | 3.15% | 896 | 3.91% | 32971 | 22938 |
| RT_U34 | 434 | 1.32% | 982 | 2.98% | 536 | 2.34% | 735 | 3.20% | 32971 | 22938 |
| Rat_Gene_1.0 | 21469 | 65.11% | 32451 | 98.42% | 21787 | 94.98% | 22464 | 97.93% | 32971 | 22938 |
| Rat_Exon_1.0 | 22442 | 68.07% | 32463 | 98.46% | 21773 | 94.92% | 22483 | 98.02% | 32971 | 22938 |

TABLE 4. Number of oligo probes that map to transcripts and gene loci: MAPPING EFFICIENCY.
– This table presents the complete data about the oligonucleotide probes from the
expression microarrays for human, mouse and rat that map to transcripts and
gene loci of the corresponding genomes.

| Microarray | Transcripts, 1 (N probes) | Transcripts, 1 (%) | Transcripts, >1 (N probes) | Transcripts, >1 (%) | Gene loci, 1 (N probes) | Gene loci, 1 (%) | Gene loci, >1 (N probes) | Gene loci, >1 (%) | TOTAL nº of probes mapping | TOTAL nº of probes in the microarray | Mapping efficiency (%) |
| Human |
| HG_U133A | 65103 | 32.22% | 136940 | 67.78% | 192213 | 95.13% | 9830 | 4.87% | 202043 | 241898 | 83.52% |
| HG_U133A_2 | 65081 | 32.22% | 136932 | 67.78% | 192191 | 95.14% | 9822 | 4.86% | 202013 | 241837 | 83.53% |
| HG_U133B | 56958 | 45.56% | 68073 | 54.44% | 120712 | 96.55% | 4319 | 3.45% | 125031 | 248525 | 50.31% |
| HG_U133_Plus_2 | 149924 | 39.90% | 225817 | 60.10% | 360264 | 95.88% | 15477 | 4.12% | 375741 | 594532 | 63.20% |
| HG_U95A | 53670 | 31.85% | 114860 | 68.15% | 160229 | 95.07% | 8301 | 4.93% | 168530 | 197599 | 85.29% |
| HG_U95Av2 | 53685 | 31.86% | 114839 | 68.14% | 160219 | 95.07% | 8305 | 4.93% | 168524 | 197582 | 85.29% |
| HG_U95B | 46773 | 42.00% | 64599 | 58.00% | 108609 | 97.52% | 2763 | 2.48% | 111372 | 199191 | 55.91% |
| HG_U95C | 36330 | 43.66% | 46878 | 56.34% | 79842 | 95.95% | 3366 | 4.05% | 83208 | 200491 | 41.50% |
| HG_U95D | 24633 | 49.94% | 24694 | 50.06% | 47522 | 96.34% | 1805 | 3.66% | 49327 | 201274 | 24.51% |
| HG_U95E | 37240 | 44.70% | 46072 | 55.30% | 80104 | 96.15% | 3208 | 3.85% | 83312 | 201012 | 41.45% |
| HG_Focus | 29521 | 32.70% | 60753 | 67.30% | 85854 | 95.10% | 4420 | 4.90% | 90274 | 97810 | 92.30% |
| HC_G110 | 7548 | 28.70% | 18752 | 71.30% | 24687 | 93.87% | 1613 | 6.13% | 26300 | 30294 | 86.82% |
| U133_X3P | 159798 | 40.73% | 232564 | 59.27% | 374931 | 95.56% | 17431 | 4.44% | 392362 | 631714 | 62.11% |
| Human_Gene_1.0 | 294841 | 40.19% | 438868 | 59.81% | 673873 | 91.84% | 59836 | 8.16% | 733709 | 804372 | 91.22% |
| Human_Exon_1.0 | 619903 | 37.86% | 1017473 | 62.14% | 1543530 | 94.27% | 93846 | 5.73% | 1637376 | 5270588 | 31.07% |
| Mouse |
| MG_U74A | 63534 | 48.92% | 66344 | 51.08% | 122415 | 94.25% | 7463 | 5.75% | 129878 | 200843 | 64.67% |
| MG_U74Av2 | 70767 | 49.01% | 73635 | 50.99% | 136408 | 94.46% | 7994 | 5.54% | 144402 | 197037 | 73.29% |
| MG_U74B | 50527 | 51.59% | 47419 | 48.41% | 95976 | 97.99% | 1970 | 2.01% | 97946 | 201514 | 48.61% |
| MG_U74Bv2 | 61653 | 51.40% | 58306 | 48.60% | 117670 | 98.09% | 2289 | 1.91% | 119959 | 196971 | 60.90% |
| MG_U74C | 16164 | 59.45% | 11027 | 40.55% | 26505 | 97.48% | 686 | 2.52% | 27191 | 200299 | 13.58% |
| MG_U74Cv2 | 27675 | 57.30% | 20627 | 42.70% | 47329 | 97.99% | 973 | 2.01% | 48302 | 182488 | 26.47% |
| Mouse430_2 | 158665 | 51.51% | 149387 | 48.49% | 296966 | 96.40% | 11086 | 3.60% | 308052 | 490490 | 62.80% |
| Mouse430A_2 | 99246 | 48.08% | 107163 | 51.92% | 197309 | 95.59% | 9100 | 4.41% | 206409 | 245487 | 84.08% |
| MOE430A | 99246 | 48.08% | 107163 | 51.92% | 197309 | 95.59% | 9100 | 4.41% | 206409 | 245487 | 84.08% |
| MOE430B | 59861 | 58.13% | 43114 | 41.87% | 100780 | 97.87% | 2195 | 2.13% | 102975 | 247199 | 41.66% |
| Mu11KsubA | 48328 | 49.85% | 48628 | 50.15% | 91735 | 94.62% | 5221 | 5.38% | 96956 | 131205 | 73.90% |
| Mu11KsubB | 31148 | 44.66% | 38591 | 55.34% | 64613 | 92.65% | 5126 | 7.35% | 69739 | 118591 | 58.81% |
| Mouse_Gene_1.0 | 352685 | 50.36% | 347663 | 49.64% | 662327 | 94.57% | 38023 | 5.43% | 700348 | 833688 | 84.01% |
| Mouse_Exon_1.0 | 618129 | 47.47% | 684002 | 52.53% | 1251511 | 96.11% | 50622 | 3.89% | 1302131 | 4625878 | 28.15% |
| Rat |
| RG_U34A | 57918 | 64.96% | 31235 | 35.04% | 84557 | 94.84% | 4596 | 5.16% | 89153 | 140057 | 63.65% |
| RG_U34B | 33570 | 70.08% | 14332 | 29.92% | 47057 | 98.24% | 845 | 1.76% | 47902 | 140293 | 34.14% |
| RG_U34C | 37294 | 70.14% | 15880 | 29.86% | 51938 | 97.68% | 1236 | 2.32% | 53174 | 140252 | 37.91% |
| Rat230_2 | 101652 | 67.92% | 48018 | 32.08% | 145812 | 97.42% | 3858 | 2.58% | 149670 | 341442 | 43.83% |
| RAE230A | 69932 | 67.82% | 33180 | 32.18% | 99912 | 96.90% | 3200 | 3.10% | 103112 | 174975 | 58.93% |
| RAE230B | 32336 | 68.08% | 15160 | 31.92% | 46737 | 98.40% | 759 | 1.60% | 47496 | 168505 | 28.19% |
| RN_U34 | 9480 | 60.60% | 6164 | 39.40% | 15182 | 97.05% | 462 | 2.95% | 15644 | 21300 | 73.45% |
| RT_U34 | 9591 | 67.07% | 4708 | 32.93% | 13190 | 92.24% | 1109 | 7.76% | 14299 | 20407 | 70.07% |
| Rat_Gene_1.0 | 451267 | 70.94% | 184883 | 29.06% | 609761 | 95.85% | 26389 | 4.15% | 636150 | 793624 | 80.16% |
| Rat_Exon_1.0 | 572414 | 60.16% | 379114 | 39.84% | 919569 | 96.64% | 31959 | 3.36% | 951528 | 3997586 | 23.80% |