How to read a Promethease Report.

If you ran Promethease and filled in your email address, you were sent an email with a link labeled 'View'. This is a direct link to the UI2 report, which is the most useful subreport.

The email also includes a 'Download' link to a .zip file. You must correctly unzip this file or you will be unable to see some of the reports inside the .zip.

When you open this .zip file, there is a single top level report.

A Promethease Report lists some basic information in a "header" at the top. The first two lines show the version of Promethease that was run (e.g., Version 0.1.161) and a time stamp when the report was generated. Keep track of this information for the following reason: SNPedia's content and the Promethease program are improving rapidly, so you will want to rerun the analysis periodically, perhaps as often as once a month. The last line of the header (e.g., "3148 genotypes annotated") lists the number of SNPs that Promethease analyzed using your raw SNP file as input. This number will depend on both the size and completeness of SNPedia itself and the platforms on which you've been tested.

If you paid, the first report is labeled 'UI version 2 interactive report'. This is the same report you would see by clicking on 'View' in the email.

Other report sections include

  • Medicines
  • Medical Conditions
  • Topics

which are all similar in how they group together related SNPs. They show an image like this

[Image: Medical Conditions bars.png]

This indicates that your Promethease report has 6 SNPs which are linked to Thyroid cancer, 1 SNP related to Torsion dystonia, and so on. The bar shows you how many of those SNPs are classified as having Good (green), Unknown (grey) or Bad (red) Repute.

Each of these sections can be expanded or contracted. Click on ...more... to expand a section.

If there are more than 10 SNPs the bar can be treated as a percentage (see Type-1 and Type-2 diabetes in the image). If there are fewer than 10 SNPs, the bar will be less than full width to visually indicate how many SNPs are present.
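The bar rule can be sketched in a few lines of Python. This is only an illustration, not Promethease's actual rendering code: the function name `bar_segments`, the treatment of exactly 10 SNPs as a full-width bar, and the use of fractions of full width are all assumptions.

```python
def bar_segments(good, unknown, bad, full_at=10):
    """Return (good, unknown, bad) segment widths as fractions of full width.

    Assumption: with `full_at` or more SNPs the bar fills the whole width and
    each segment is that Repute's share of the total. With fewer SNPs, each
    SNP takes 1/full_at of the width, so the bar stops short of full width.
    """
    total = good + unknown + bad
    if total == 0:
        return (0.0, 0.0, 0.0)
    scale = max(total, full_at)
    return (good / scale, unknown / scale, bad / scale)
```

For example, `bar_segments(3, 2, 1)` describes a bar that covers six tenths of the width, while `bar_segments(5, 5, 10)` describes a full-width bar that can be read as percentages.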

Most report sections follow a few more conventions.

If you click 'Show everything' in the upper right you can see just how big this really is. Most of the boring stuff is hiding in the bottom, while the interesting stuff is moving towards the top. Reload the page to put it back into the compact format.

Pages are sorted according to their magnitude. This is a subjective measure of interest. If you feel some of the information in your report is out of order please leave a note on the relevant pages. UI2 allows you to change the sorting order.

Pages without a magnitude are sorted according to population frequency. A number such as 6.7 indicates that 6.7% of the Caucasian/European (CEU) population shares this genotype.
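A rough sketch of that ordering, in Python, may help. It is not taken from Promethease itself: the dictionary fields, the `sort_key` function, and the entry names are invented for illustration, and placing entries with no frequency data last is an assumption.

```python
def sort_key(entry):
    """Entries with a magnitude come first (largest first); the rest by rarity."""
    magnitude = entry.get("magnitude")
    frequency = entry.get("frequency")
    if magnitude is not None:
        return (0, -magnitude)
    # Assumption: rarer genotypes sort earlier; unknown frequency sorts last.
    return (1, frequency if frequency is not None else float("inf"))

# Hypothetical entries; frequency is the percentage of CEU sharing the genotype.
entries = [
    {"name": "snp_a", "magnitude": None, "frequency": 46.7},
    {"name": "snp_b", "magnitude": 2.1, "frequency": 6.7},
    {"name": "snp_c", "magnitude": None, "frequency": 3.3},
    {"name": "snp_d", "magnitude": 3.0, "frequency": 20.0},
]
ordered = [e["name"] for e in sorted(entries, key=sort_key)]
```

Here `ordered` lists the two entries with a magnitude first (3.0 before 2.1), followed by the remaining entries from rarest to most common.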

Colors are used to make it easier to skim your report. Text on a blue background is not specific to your genotype; it is about this location. A green background indicates that your genotype is considered the normal form. Yellow covers several other cases such as an ambiguous flip or a genoset.

If your genotype has been assigned a large magnitude, or is known to be rare, the first box will be a bright red. If your genotype is common it will be nearly white.

6.7 rs1234(A;G)
Magnitude: 2.1
text about your genotype
this text is NOT about your genotype

The rs1234 links to the blue text on the rspage. But the genotype portion (A;G) links to the white text on the genotype page.
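If you want to split a label such as rs1234(A;G) into its two parts yourself, a minimal Python sketch follows. The function name `parse_genotype` is made up for this example, and the pattern only handles single-base alleles (A, C, G, T); any other allele notation would need a broader pattern.

```python
import re

# Assumption: an rs number followed by two single-base alleles, e.g. rs1234(A;G).
GENOTYPE_RE = re.compile(r"^(rs\d+)\(([ACGT]);([ACGT])\)$")

def parse_genotype(label):
    """Split 'rs1234(A;G)' into the rs identifier and its pair of alleles."""
    match = GENOTYPE_RE.match(label.strip())
    if match is None:
        raise ValueError(f"not a simple genotype label: {label!r}")
    rsid, first, second = match.groups()
    return rsid, (first, second)
```

The first element corresponds to the rs page (the blue text) and the full label to the genotype page.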

The top of your report may have some genosets, such as gs101 or gs115. These are conclusions based on fairly solid scientific evidence.

Below that your genotypes are ordered from rarest to most common. This helps to identify which of your snps are most worthy of your scrutiny.

A middle section collects snps which are either of unknown frequency in the general population, or which are of otherwise dubious value.

The bottom portion of the results groups them by disease. We're not big fans of this one, since the connection between SNPs and disease often has too many unknown intermediates in these early days of genotype:phenotype correlation, and so all of this is really just probability statistics and far from clear cause and effect. However we recognize that fundamentally people still want to see this, and perhaps it serves a certain purpose.

To the right are three boxes. The top box is a one line summary of your genotype. The text is taken from a box on the rs# page. This is often blank, indicating nothing is known yet. Below this is a second box (also quite often empty) with full details about your genotype. The third box has a light blue background and the text is specific to the rs#, rather than your actual genotype. Often there is information here just waiting to be propagated into genotype-specific locations.

As you continue to scroll down, you'll see the red becoming lighter. Rare SNPs are red, common ones are white.

A SNP is included with a disease if the SNPedia disease page mentions it. SNPs are sorted by population frequency within a disease. This allows you to see all 10+ Crohn's disease SNPs at the same time, even if the population frequency of your genotype(s) varies widely.

Your improvements to SNPedia will make the next Promethease report better, so please feel free to contribute both information and suggestions.

If the red box says 0.0, does that really mean this genotype has not been observed in the CEU population? And if so, where are those population data from?

As an example, look at the graph in the lower right of rs916977. This is a SNP in which all 3 genotypes are observed globally. But (A;A) was not observed in the CEU population. You can confirm this at NCBI. Given the frequency of heterozygotes in the CEU population, it's inevitable that plenty of (A;A) individuals exist among people of that ancestry; the sample just didn't happen to include one. CEU was only based on 120 individuals. You'll notice the step size goes from 0.0 to 1.7 to 3.4. CEU calls everything below 1.7 a zero, because the population size just isn't big enough. Notice that the '?' in the population box directs you to Help_(population_diversity) for more information about CEU.

When the red box says 0.0, does that mean less than .1 percent of people have that SNP?

Notice the step size. It jumps from 0.0 to 1.7 to 3.3 (or something like that). Most of those numbers are based on the HapMap project with 120 Caucasians, so the 'resolution' is pretty poor. There are plenty of examples where the heterozygous frequency is ~15% but the frequency for the homozygous minor is 0.0, which surely just shows some 'luck of the draw' with the sampling. Also, sometimes the numbers are just plain wrong. If you click on the right hand dbSNP link and scroll to the bottom you'll see the graphs for all populations. There are quite a few cases where dbSNP has some clearly wrong data but User:SNPediaBot couldn't be sure, and ends up with equally wrong numbers.
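To put a rough number on the 'luck of the draw', here is a small Python sketch. It assumes Hardy-Weinberg proportions, which the text above does not state, and it assumes 60 genotyped samples, a figure inferred from the 1.7 step (1/60 is about 1.7%) rather than taken from SNPedia. The function names are invented for the example.

```python
import math

def expected_homozygous_minor(het_freq):
    """Solve 2*q*(1-q) = het_freq for the minor allele frequency q; return q**2.

    Assumes Hardy-Weinberg proportions, so het_freq cannot exceed 0.5.
    """
    if not 0 <= het_freq <= 0.5:
        raise ValueError("heterozygous frequency must be between 0 and 0.5")
    q = (1 - math.sqrt(1 - 2 * het_freq)) / 2
    return q * q

def prob_none_observed(genotype_freq, sample_size):
    """Chance that a genotype of the given frequency is absent from the sample."""
    return (1 - genotype_freq) ** sample_size

hom = expected_homozygous_minor(0.15)   # heterozygotes at ~15%
p_zero = prob_none_observed(hom, 60)    # assumed sample of 60
```

Under these assumptions the homozygous minor genotype is expected in somewhat under 1% of people, and a sample of 60 would contain none of them about two times out of three. A reported 0.0 is therefore unsurprising even when the genotype really does occur.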

What if the lefthand box says "None?" Is that the same as 0.0 or does it mean the SNP has not been typed?

It means SNPedia has no population data. Which probably means NCBI has no HapMap population data. Which means this SNP wasn't part of the HapMap. This is common for many of the OMIM SNPs which are from rare disorders or mutations - too rare to have been found by HapMap.

No odds provided?

Some genotypes have been associated with traits, but actual risk estimates are not present, e.g.,

   6.8         rs3129934(T;T)        associated with type-1 diabetes

What does this mean? Just that there are not enough data or no one's comfortable assigning a risk estimate to it?

It's not about comfort; it's about effort and availability. All it means is that no one has yet added that information to SNPedia. If you're curious enough to go digging, please add what you find to SNPedia.

Are the commercial services prone to ambiguous flips? Have they addressed them? Is your recommendation I just ignore them?

Microarrays are prone to ambiguous flips. It doesn't mean the manufacturer doesn't know which way to disambiguate, it just means that SNPedia doesn't yet know. The notation we all use is fundamentally ambiguous. It needs to be compared to a reference standard which NCBI continually evolves. Vendors usually work against one standard, but this can change with later microarrays. Some microarrays even use a mix. Until SNPedia has extra information about the vendor orientations Promethease can't say unambiguously.

rs7566605 has 2 alleles C and G. If you flip the strand they become G and C. Here the NCBI shows that

  • Affy seems to have used rev/B (reverse bottom)
  • while most others fwd/T (forward top)
  • and others only indicate fwd but not T or B.

The notation is good for locating a position, but awkward for genotyping.
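The reason a C/G SNP like rs7566605 is ambiguous can be shown with a short Python sketch. The helper names `flip` and `is_ambiguous` are invented for this example and are not part of Promethease or any NCBI tool.

```python
# Each base pairs with its complement on the opposite strand.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

def flip(alleles):
    """Return the allele set as it would be reported on the opposite strand."""
    return {COMPLEMENT[a] for a in alleles}

def is_ambiguous(alleles):
    """True when flipping the strand gives back the same set of alleles.

    This happens for C/G and A/T SNPs, where the alleles alone cannot tell
    you which strand a vendor reported.
    """
    return set(alleles) == flip(alleles)
```

For rs7566605, `is_ambiguous({"C", "G"})` is true, because flipping C and G gives G and C. For an A/G SNP it is false: the flipped alleles are T and C, so a flipped report is easy to spot.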

For the moment the best solution is to manually determine which orientations were used for Affymetrix and Illumina relative to the current NCBI reference, and simply write it into the corresponding SNPedia page for others to read. 23andMe and deCODEme both use Illumina microarrays.

Does the appearance of a Haplogroup R1b1b2g section under “Topics” imply I am a R1b1b2g? 23andme has me as a Q - to be precise a Q1b. I had thought Q and R were disjoint.

The section under 'Topics' shows you all of the SNPs which are known to be related to Haplogroup R1b1b2g. Whether or not you have the forms that indicate Haplogroup R1b1b2g is a separate question.

Look at this report under 'Topics' and notice that his Haplogroup R1b1b2g is nothing special, but it looks as though his Y Haplogroup I topic is getting lots of good hits. In time we will have genosets to calculate haplogroups automatically. For now Promethease is just trying to show which SNPs are relevant to a particular topic.

I have 3.9x increased risk of wet ARMD. The genotype has a frequency of 46.7 yet it's still bright red. Is that because it was done in a Taiwanese and not a CEU population?

There are a heck of a lot of snps linked to ARMD. If you scan through that list you'll see some of them increase the odds by 10x or more. I wouldn't be too worried about any one snp especially at a 3.9x risk.

The one you're noticing is rs1136287, which has made it near the top of your report, but it's only one of 9 ARMD-related SNPs which were checked by the Affy 6.0 platform used by Navigenics. To see all 9, visit your report, then click on

  • Medical Conditions ...more...


  • Age related macular degeneration ...more... 9 snps

There you'll see all 9 SNPs. The top one is rs1136287(T;T) and you'll see it (used to) say 'Magnitude:3'. This is the specific answer to your question. You're seeing it at the top of your report and in bright red because 'we' have decided that this SNP is important enough to have a Magnitude score of 3.

At the bottom of that section you'll see 3 SNPs which are colored green. This is because your genotype is considered normal, and has been assigned a Magnitude score of 0.

Notably rs3775291(C;C) is found in 43.3% of Caucasians while the SNP you're asking about is even more common at 46.7%. So if it is more common, why isn't it considered normal? Why is it given a magnitude=3? I don't have a great answer. It is supported by only a single paper in a fairly small Taiwanese population. [PMID 18226801]

Given how well studied ARMD is, the lack of replication and the lack of a more dramatic risk score make it not terribly compelling. It seems I picked a score of 3 rather arbitrarily back on June 25, 2008.

Today I've revisited it for the first time, and spent more time scrutinizing it. I think 3 is definitely too high, and have chosen to bump it down to 2.2. The next time you or anyone else generates a promethease report they'll see this as a 2.2. Perhaps it should be lower, but I'll hope that others think about this and leave comments or change the magnitude. Over time these scores should converge to reasonable values, but we're still in the early days.