Statistics of the PROBE MAPPING
Statistics of the mapping to different TRANSCRIBED ENTITIES.
This page provides the statistics derived from mapping the sequences of all oligonucleotide probes on Affymetrix human expression microarrays to different types of RNA. Probes are separated into those mapping to protein-coding RNAs (i.e. exons that encode proteins and correspond to mature mRNAs), those mapping to non-protein-coding RNAs (ncRNAs), and those not assigned (NA) to any RNA. Probes that mapped only to introns were counted as putative ncRNAs. The percentages for these mappings are provided for four widely used human expression array models.
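
As an illustration only, the three-way split described above could be sketched as follows. The hit labels ("coding_exon", "ncRNA", "intron"), the function names and the toy data are hypothetical, and the rule that a protein-coding exon hit takes precedence over other hits is an assumption that this page does not state.

```python
from collections import Counter

def classify_probe(hit_types):
    """Assign one probe to 'protein-coding', 'ncRNA' or 'NA'.

    hit_types: iterable of hypothetical labels describing where the probe maps.
    Assumption: a protein-coding exon hit takes precedence over any other hit.
    """
    hits = set(hit_types)
    if "coding_exon" in hits:
        return "protein-coding"
    # Probes mapping only to introns are counted as putative ncRNAs.
    if "ncRNA" in hits or "intron" in hits:
        return "ncRNA"
    return "NA"

def category_percentages(probe_hits):
    """Return the percentage of probes falling in each category."""
    counts = Counter(classify_probe(h) for h in probe_hits.values())
    total = len(probe_hits)
    return {cat: 100 * counts[cat] / total
            for cat in ("protein-coding", "ncRNA", "NA")}

# Toy data, not taken from any real array.
toy_hits = {
    "probe_1": ["coding_exon"],
    "probe_2": ["intron"],
    "probe_3": ["ncRNA"],
    "probe_4": [],
}
toy_percentages = category_percentages(toy_hits)
print(toy_percentages)
```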

Statistics of the mapping to transcripts and gene loci: COVERAGE and EFFICIENCY.
This page provides the statistics derived from the sequence mapping of all the oligonucleotide probes from Affymetrix expression microarrays into all transcripts and genes of Ensembl (v57) (including protein-coding genes and RNA genes) for human (Homo sapiens), mouse (Mus musculus) and rat (Rattus norvegicus).

Accurate expression measurement requires that microarray probes show minimal cross-hybridization with other genes or other transcribed entities. Detailed information on the coverage and efficiency of the probe mapping is given below. Coverage is defined as the percentage of gene loci or transcripts, out of all protein-coding gene loci or transcripts in the Ensembl genome (human, mouse or rat), that are mapped by the probes of a given microarray. Efficiency is defined as the percentage of probes, out of all probes on a given microarray, that map to Ensembl genes or transcripts. The term "unique mapped" refers to gene loci or transcripts that are mapped by a set of probes that do not cross-hybridize with any other gene locus or transcript; such probes are therefore unambiguous.
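
The definitions of coverage and efficiency can be made concrete with a minimal sketch. The function name, its arguments and the toy data are hypothetical, and the sketch assumes that a gene locus counts as "unique mapped" when at least one probe maps to that locus and to no other; the actual pipeline behind this page may apply a different rule.

```python
def mapping_stats(probe_to_genes, all_genes, n_probes_on_array):
    """Compute coverage and efficiency (in %) for one microarray.

    probe_to_genes: dict mapping a probe ID to the gene loci it hits.
    all_genes: set of all protein-coding gene loci in the genome.
    n_probes_on_array: total number of probes on the array.
    """
    mapped_genes = set()   # loci hit by at least one probe
    unique_genes = set()   # loci hit by at least one unambiguous probe (assumption)
    n_mapping_probes = 0   # probes hitting at least one locus
    for genes in probe_to_genes.values():
        hits = set(genes) & all_genes
        if not hits:
            continue
        n_mapping_probes += 1
        mapped_genes |= hits
        if len(hits) == 1:
            unique_genes |= hits
    return {
        "coverage_all": 100 * len(mapped_genes) / len(all_genes),
        "coverage_unique": 100 * len(unique_genes) / len(all_genes),
        "efficiency": 100 * n_mapping_probes / n_probes_on_array,
    }

# Toy example: 4 gene loci and 5 probes (p5 maps nowhere and is absent from the dict).
toy_genes = {"G1", "G2", "G3", "G4"}
toy_mapping = {"p1": ["G1"], "p2": ["G1", "G2"], "p3": ["G3"], "p4": []}
toy_stats = mapping_stats(toy_mapping, toy_genes, n_probes_on_array=5)
print(toy_stats)
```

In this toy case three of four loci are mapped, two of them by an unambiguous probe, and three of five probes map somewhere. The same counting applies to transcripts if transcript IDs are used in place of gene loci.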

Coverage data are presented in Tables 1 and 3, and efficiency data in Tables 2 and 4. For both transcripts and genes, the statistics are calculated only from probes that map to exons; probes that map to introns are not considered in these calculations.

TABLE 1. Number of protein-coding transcripts and gene loci mapped: MAPPING COVERAGE.
– This table presents a summary of the number and percentage of protein-coding transcripts and gene loci that are mapped by the oligonucleotide probes present in the Affymetrix expression microarrays.
– The table includes the data for the most widely used human microarrays based on 3' expression (HG_U133 series) and the new all-exon expression microarrays (Gene_1.0 and Exon_1.0). The complete data for human, mouse and rat microarrays are presented below, in TABLE 3.
– The numbers in TABLE 1 show that the coverage of known human protein-coding gene loci (21281 in Ensembl v57) has improved with successive models of expression microarrays: HG_U133A 63.0 % ; HG_U133_Plus_2 89.0 % ; Human_Gene_1.0 94.9 % ; Human_Exon_1.0 98.7 %.
– Transcript coverage improves in the same progressive way as gene coverage, from 47.2 % (HG_U133A) to 99.5 % (Human_Exon_1.0) of the known transcripts derived from protein-coding gene loci (100299 in Ensembl v57).
– The statistics also include the transcripts that are mapped by unique probes and can therefore be detected specifically; for example, 39.23 % with the Human_Exon_1.0 array. These data can be very useful for studies of alternative splicing.
Columns, in order: Microarray; Transcripts, unique mapped (N, %); Transcripts, all mapped (N, %); Gene loci, unique mapped (N, %); Gene loci, all mapped (N, %); total no. of transcripts; total no. of gene loci.
Human
HG_U133A 5646 5.63% 47376 47.23% 12299 57.79% 13415 63.04% 100299 21281
HG_U133_Plus_2 10561 10.53% 68147 67.94% 17724 83.29% 18950 89.05% 100299 21281
Human_Gene_1.0 17117 17.07% 97192 96.90% 19213 90.28% 20192 94.88% 100299 21281
Human_Exon_1.0 39350 39.23% 99816 99.52% 20238 95.10% 21012 98.74% 100299 21281
Go to full statistics in TABLE 3
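
As a quick check, the "all mapped" coverage percentages in TABLE 1 can be recomputed from the counts and the Ensembl v57 totals. This is only an arithmetic sketch; the variable and function names are illustrative.

```python
# Totals for human protein-coding transcripts and gene loci (Ensembl v57), from TABLE 1.
TOTAL_TRANSCRIPTS = 100299
TOTAL_GENE_LOCI = 21281

# "All mapped" counts from TABLE 1: (transcripts, gene loci).
all_mapped = {
    "HG_U133A": (47376, 13415),
    "HG_U133_Plus_2": (68147, 18950),
    "Human_Gene_1.0": (97192, 20192),
    "Human_Exon_1.0": (99816, 21012),
}

def percent(part, total):
    """Percentage of total, rounded to two decimals as in the tables."""
    return round(100 * part / total, 2)

coverage = {
    name: (percent(n_tr, TOTAL_TRANSCRIPTS), percent(n_loci, TOTAL_GENE_LOCI))
    for name, (n_tr, n_loci) in all_mapped.items()
}

for name, (tr_pct, loci_pct) in coverage.items():
    print(f"{name}: transcripts {tr_pct} %, gene loci {loci_pct} %")
```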
TABLE 2. Number of oligo probes that map to transcripts and gene loci: MAPPING EFFICIENCY.
– This table presents the number and percentage of oligonucleotide probes of distinct sequence in each microarray that map to exactly one transcript (column 1) or to more than one transcript (column >1), and likewise for gene loci. The percentages in these columns are relative to the total number of mapping probes. Probes that map to several transcripts or loci can be considered AMBIGUOUS probes.
– The complete data for human, mouse and rat microarrays are presented below, in TABLE 4.
– Among the four arrays in this table, the best mapping efficiency is 91.22 %, obtained with the Human_Gene_1.0 array. For the HG_U133A array, 16.5 % of the probes do not map to any gene locus of the current human genome version. Moreover, if only specific probes are considered (i.e. the 192213 HG_U133A probes that map to a single gene locus), the mapping efficiency is only 79.5 % (192213/241898). These calculations indicate that a proportion of probes (16 - 21 % for HG_U133A) can produce noise when the standard expression signal is calculated from the probesets assigned by Affymetrix. The problem is also present in the new Exon_1.0 arrays, which show the lowest efficiency, with only about 31 % of the probes mapping to exons. Manufacturers and users should take this into consideration, because many probes in the expression microarrays apparently do not map to any known entity (exon, transcript or gene) and may introduce considerable noise into the expression calculations.

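Columns, in order: Microarray; probes mapping to 1 transcript (N, %); probes mapping to >1 transcript (N, %); probes mapping to 1 gene locus (N, %); probes mapping to >1 gene locus (N, %); total no. of probes mapping; total no. of probes in the microarray; mapping efficiency (%).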
HG_U133A 65103 32.22% 136940 67.78% 192213 95.13% 9830 4.87% 202043 241898 83.52%
HG_U133_Plus_2 149924 39.90% 225817 60.10% 360264 95.88% 15477 4.12% 375741 594532 63.20%
Human_Gene_1.0 294841 40.19% 438868 59.81% 673873 91.84% 59836 8.16% 733709 804372 91.22%
Human_Exon_1.0 619903 37.86% 1017473 62.14% 1543530 94.27% 93846 5.73% 1637376 5270588 31.07%
Go to full statistics in TABLE 4
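
The efficiency figures discussed for TABLE 2 can be recomputed from the probe counts. The sketch below is only an arithmetic check; the names are illustrative, and "specific efficiency" is used here as a label for the share of all probes that map to exactly one gene locus.

```python
# Counts from TABLE 2:
# (probes mapping to 1 gene locus, total probes mapping, total probes on the array).
probe_counts = {
    "HG_U133A": (192213, 202043, 241898),
    "HG_U133_Plus_2": (360264, 375741, 594532),
    "Human_Gene_1.0": (673873, 733709, 804372),
    "Human_Exon_1.0": (1543530, 1637376, 5270588),
}

def efficiencies(single_locus, mapping, on_array):
    """Return (mapping efficiency, specific efficiency) in percent."""
    return 100 * mapping / on_array, 100 * single_locus / on_array

efficiency = {name: efficiencies(*counts) for name, counts in probe_counts.items()}

for name, (overall, specific) in efficiency.items():
    print(f"{name}: efficiency {overall:.2f} %, specific probes only {specific:.2f} %")
```

For HG_U133A this reproduces the two values quoted above: about 83.5 % of probes map to some gene locus and about 79.5 % map to exactly one.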
TABLE 3. Number of protein-coding transcripts and gene loci mapped: MAPPING COVERAGE.
– This table presents the complete data about the transcripts and gene loci that are mapped by the oligo probes of the Affymetrix expression microarrays built for human, mouse and rat.
Columns, in order: Microarray; Transcripts, unique mapped (N, %); Transcripts, all mapped (N, %); Gene loci, unique mapped (N, %); Gene loci, all mapped (N, %); total no. of transcripts; total no. of gene loci.
Human
HG_U133A 5646 5.63% 47376 47.23% 12299 57.79% 13415 63.04% 100299 21281
HG_U133A_2 5646 5.63% 47376 47.23% 12299 57.79% 13415 63.04% 100299 21281
HG_U133B 4270 4.26% 24072 24.00% 7433 34.93% 8139 38.25% 100299 21281
HG_U133_Plus_2 10561 10.53% 68147 67.94% 17724 83.29% 18950 89.05% 100299 21281
HG_U95A 3626 3.62% 31634 31.54% 8545 40.15% 9561 44.93% 100299 21281
HG_U95Av2 3627 3.62% 31615 31.52% 8546 40.16% 9560 44.92% 100299 21281
HG_U95B 2719 2.71% 17303 17.25% 5483 25.76% 5955 27.98% 100299 21281
HG_U95C 2041 2.03% 14519 14.48% 3996 18.78% 5195 24.41% 100299 21281
HG_U95D 1448 1.44% 9211 9.18% 2572 12.09% 3433 16.13% 100299 21281
HG_U95E 2178 2.17% 13717 13.68% 4004 18.81% 4638 21.79% 100299 21281
HG_Focus 3198 3.19% 28725 28.64% 8156 38.33% 9017 42.37% 100299 21281
HC_G110 533 0.53% 5914 5.90% 1343 6.31% 1845 8.67% 100299 21281
U133_X3P 9953 9.92% 63392 63.20% 17583 82.62% 18787 88.28% 100299 21281
Human_Gene_1.0 17117 17.07% 97192 96.90% 19213 90.28% 20192 94.88% 100299 21281
Human_Exon_1.0 39350 39.23% 99816 99.52% 20238 95.10% 21012 98.74% 100299 21281
Mouse
MG_U74A 4429 6.29% 19521 27.73% 7371 32.32% 8815 38.65% 70406 22806
MG_U74Av2 4870 6.92% 20934 29.73% 8123 35.62% 9330 40.91% 70406 22806
MG_U74B 3170 4.50% 12574 17.86% 5179 22.71% 5711 25.04% 70406 22806
MG_U74Bv2 3843 5.46% 15176 21.55% 6311 27.67% 6880 30.17% 70406 22806
MG_U74C 1160 1.65% 4502 6.39% 1757 7.70% 2498 10.95% 70406 22806
MG_U74Cv2 2165 3.08% 7561 10.74% 3399 14.90% 3928 17.22% 70406 22806
Mouse430_2 11967 17.00% 44667 63.44% 17037 74.70% 18402 80.69% 70406 22806
Mouse430A_2 7850 11.15% 33714 47.89% 12572 55.13% 13795 60.49% 70406 22806
MOE430A 7850 11.15% 33714 47.89% 12572 55.13% 13795 60.49% 70406 22806
MOE430B 4996 7.10% 15321 21.76% 6853 30.05% 7379 32.36% 70406 22806
Mu11KsubA 2621 3.72% 12435 17.66% 4530 19.86% 6026 26.42% 70406 22806
Mu11KsubB 1754 2.49% 9477 13.46% 3023 13.26% 3873 16.98% 70406 22806
Mouse_Gene_1.0 19692 27.97% 69162 98.23% 21390 93.79% 22354 98.02% 70406 22806
Mouse_Exon_1.0 39114 55.55% 69962 99.37% 21506 94.30% 22412 98.27% 70406 22806
Rat
RG_U34A 3426 10.39% 8254 25.03% 4406 19.21% 5664 24.69% 32971 22938
RG_U34B 2348 7.12% 4871 14.77% 3117 13.59% 3453 15.05% 32971 22938
RG_U34C 2622 7.95% 5536 16.79% 3499 15.25% 3970 17.31% 32971 22938
Rat230_2 9253 28.06% 19554 59.31% 12065 52.60% 13428 58.54% 32971 22938
RAE230A 6828 20.71% 14859 45.07% 8986 39.18% 10209 44.51% 32971 22938
RAE230B 3114 9.44% 6513 19.75% 4201 18.31% 4498 19.61% 32971 22938
RN_U34 545 1.65% 1463 4.44% 723 3.15% 896 3.91% 32971 22938
RT_U34 434 1.32% 982 2.98% 536 2.34% 735 3.20% 32971 22938
Rat_Gene_1.0 21469 65.11% 32451 98.42% 21787 94.98% 22464 97.93% 32971 22938
Rat_Exon_1.0 22442 68.07% 32463 98.46% 21773 94.92% 22483 98.02% 32971 22938
TABLE 4. Number of oligo probes that map to transcripts and gene loci: MAPPING EFFICIENCY.
– This table presents the complete data about the oligonucleotide probes from the expression microarrays for human, mouse and rat that map to transcripts and gene loci of the corresponding genomes.

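Columns, in order: Microarray; probes mapping to 1 transcript (N, %); probes mapping to >1 transcript (N, %); probes mapping to 1 gene locus (N, %); probes mapping to >1 gene locus (N, %); total no. of probes mapping; total no. of probes in the microarray; mapping efficiency (%).
Human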
HG_U133A 65103 32.22% 136940 67.78% 192213 95.13% 9830 4.87% 202043 241898 83.52%
HG_U133A_2 65081 32.22% 136932 67.78% 192191 95.14% 9822 4.86% 202013 241837 83.53%
HG_U133B 56958 45.56% 68073 54.44% 120712 96.55% 4319 3.45% 125031 248525 50.31%
HG_U133_Plus_2 149924 39.90% 225817 60.10% 360264 95.88% 15477 4.12% 375741 594532 63.20%
HG_U95A 53670 31.85% 114860 68.15% 160229 95.07% 8301 4.93% 168530 197599 85.29%
HG_U95Av2 53685 31.86% 114839 68.14% 160219 95.07% 8305 4.93% 168524 197582 85.29%
HG_U95B 46773 42.00% 64599 58.00% 108609 97.52% 2763 2.48% 111372 199191 55.91%
HG_U95C 36330 43.66% 46878 56.34% 79842 95.95% 3366 4.05% 83208 200491 41.50%
HG_U95D 24633 49.94% 24694 50.06% 47522 96.34% 1805 3.66% 49327 201274 24.51%
HG_U95E 37240 44.70% 46072 55.30% 80104 96.15% 3208 3.85% 83312 201012 41.45%
HG_Focus 29521 32.70% 60753 67.30% 85854 95.10% 4420 4.90% 90274 97810 92.30%
HC_G110 7548 28.70% 18752 71.30% 24687 93.87% 1613 6.13% 26300 30294 86.82%
U133_X3P 159798 40.73% 232564 59.27% 374931 95.56% 17431 4.44% 392362 631714 62.11%
Human_Gene_1.0 294841 40.19% 438868 59.81% 673873 91.84% 59836 8.16% 733709 804372 91.22%
Human_Exon_1.0 619903 37.86% 1017473 62.14% 1543530 94.27% 93846 5.73% 1637376 5270588 31.07%
Mouse
MG_U74A 63534 48.92% 66344 51.08% 122415 94.25% 7463 5.75% 129878 200843 64.67%
MG_U74Av2 70767 49.01% 73635 50.99% 136408 94.46% 7994 5.54% 144402 197037 73.29%
MG_U74B 50527 51.59% 47419 48.41% 95976 97.99% 1970 2.01% 97946 201514 48.61%
MG_U74Bv2 61653 51.40% 58306 48.60% 117670 98.09% 2289 1.91% 119959 196971 60.90%
MG_U74C 16164 59.45% 11027 40.55% 26505 97.48% 686 2.52% 27191 200299 13.58%
MG_U74Cv2 27675 57.30% 20627 42.70% 47329 97.99% 973 2.01% 48302 182488 26.47%
Mouse430_2 158665 51.51% 149387 48.49% 296966 96.40% 11086 3.60% 308052 490490 62.80%
Mouse430A_2 99246 48.08% 107163 51.92% 197309 95.59% 9100 4.41% 206409 245487 84.08%
MOE430A 99246 48.08% 107163 51.92% 197309 95.59% 9100 4.41% 206409 245487 84.08%
MOE430B 59861 58.13% 43114 41.87% 100780 97.87% 2195 2.13% 102975 247199 41.66%
Mu11KsubA 48328 49.85% 48628 50.15% 91735 94.62% 5221 5.38% 96956 131205 73.90%
Mu11KsubB 31148 44.66% 38591 55.34% 64613 92.65% 5126 7.35% 69739 118591 58.81%
Mouse_Gene_1.0 352685 50.36% 347663 49.64% 662327 94.57% 38023 5.43% 700348 833688 84.01%
Mouse_Exon_1.0 618129 47.47% 684002 52.53% 1251511 96.11% 50622 3.89% 1302131 4625878 28.15%
Rat
RG_U34A 57918 64.96% 31235 35.04% 84557 94.84% 4596 5.16% 89153 140057 63.65%
RG_U34B 33570 70.08% 14332 29.92% 47057 98.24% 845 1.76% 47902 140293 34.14%
RG_U34C 37294 70.14% 15880 29.86% 51938 97.68% 1236 2.32% 53174 140252 37.91%
Rat230_2 101652 67.92% 48018 32.08% 145812 97.42% 3858 2.58% 149670 341442 43.83%
RAE230A 69932 67.82% 33180 32.18% 99912 96.90% 3200 3.10% 103112 174975 58.93%
RAE230B 32336 68.08% 15160 31.92% 46737 98.40% 759 1.60% 47496 168505 28.19%
RN_U34 9480 60.60% 6164 39.40% 15182 97.05% 462 2.95% 15644 21300 73.45%
RT_U34 9591 67.07% 4708 32.93% 13190 92.24% 1109 7.76% 14299 20407 70.07%
Rat_Gene_1.0 451267 70.94% 184883 29.06% 609761 95.85% 26389 4.15% 636150 793624 80.16%
Rat_Exon_1.0 572414 60.16% 379114 39.84% 919569 96.64% 31959 3.36% 951528 3997586 23.80%