23andMe misreading rs41303343??
I have 23andMe data for 5 individuals, and all of them show the homozygous minor genotype, "(I;I)". This seems unlikely, since the minor allele frequency is about 3%. The individuals are related: 3 are siblings, and each of the other two is a first cousin of the siblings and of each other, so the 5 descend from 3 siblings of the previous generation and their 3 unrelated spouses. (2 tests were on the V3 chip and 3 on the V4 chip.)
Moreover, for one of these individuals I have a second set of data from a separate sample and reading via Genes For Good. This reading conflicts with the (I;I) = (T;T) reading of 23andMe; Genes For Good gives (D;D) = (-;-), the major allele.
To me this strongly suggests the 23andMe reports giving the minor allele are likely to be consistently wrong.
- Are you looking at rs41303343, or perhaps instead, an "i-SNP"? We see 23andMe listing the alleles called for rs41303343 as being either "-" or "A", so we don't see how you would wind up seeing (I;I) in any 23andMe data for this SNP, at least if it's rs41303343. Greg (talk) 18:18, 5 October 2016 (UTC)
- I believe I'm looking at rs41303343 via Promethease, SNPedia, 23andme browser, and 23andme raw data file. In the 23andme browser for a V3 user it shows:
CYP3A5 rs41303343 99250394 — or A A / A
- The raw data file line for the same person, and the line for one of the V4 kits, are:
rs41303343 7 99250393 II
rs41303343 7 99250393 II
- whereas the Genes For Good (23andme format) raw data file for the first person shows:
rs41303343 7 99250393 DD
- My (shaky?) interpretation, based in part on the dbSNP page, is that insertion of A is the minor allele, and that it is also known as insertion of T when detected on the complementary strand (shown as "-/T (REV)" in dbSNP). Maybe there is an issue with the genome build involved or with SNPedia's mapping of alleles, or ??? --Deoxydog (talk) 19:19, 6 October 2016 (UTC)
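- For anyone who wants to repeat the comparison above across whole files, here is a minimal sketch. It assumes the four-column 23andMe-style layout seen in the lines quoted above (rsid, chromosome, position, genotype, whitespace-separated) and skips "#" comment lines. The function names `parse_raw_lines` and `compare_calls` are made up for illustration and are not part of any 23andMe or Genes For Good tool.

```python
def parse_raw_lines(lines):
    """Map rsid -> genotype from 23andMe-style raw data lines.

    Assumes four whitespace-separated columns: rsid, chromosome,
    position, genotype. Comment lines starting with '#' are skipped.
    """
    calls = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) < 4:
            continue  # not a genotype line in the assumed layout
        rsid, _chrom, _pos, genotype = fields[:4]
        calls[rsid] = genotype
    return calls


def compare_calls(calls_a, calls_b):
    """Return {rsid: (genotype_a, genotype_b)} for shared rsids that disagree."""
    shared = calls_a.keys() & calls_b.keys()
    return {r: (calls_a[r], calls_b[r]) for r in shared if calls_a[r] != calls_b[r]}


# The two lines quoted in this thread for the same person.
from_23andme = parse_raw_lines(["rs41303343 7 99250393 II"])
from_genes_for_good = parse_raw_lines(["rs41303343 7 99250393 DD"])

discordant = compare_calls(from_23andme, from_genes_for_good)
print(discordant)
```

To run it on real files, pass an open file handle to `parse_raw_lines` in place of the lists; any rsid present in both files with differing calls ends up in `discordant`.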
- ExAC shows that the insertion allele is indeed the minor allele and is rare (~1%). And OpenSNP shows that the most commonly reported genotype is II (see this page). Assuming that most of these OpenSNP files are from 23andMe (which is possible to check), it sure looks as if 23andMe is indeed, incorrectly, representing the reference/common allele as "I". Greg (talk) 23:29, 6 October 2016 (UTC)
- I checked about ten "II" OpenSNP datasets and they were all 23andMe sets. I also checked about ten "DD" datasets: about 8 were from Ancestry.com, and 2 had both Ancestry and 23andMe datasets uploaded. This supports the idea that 23andMe is misreading or mislabelling this SNP. I also note that all the entries show as homozygous, either II or DD, whereas a rare allele should occur mainly in the heterozygous state.
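- To put rough numbers on the heterozygote point, here is a small sketch of the expected genotype frequencies under Hardy-Weinberg equilibrium. It assumes random mating in an unrelated population, so it does not apply directly to the five relatives discussed above, and it uses the two minor allele frequencies mentioned in this thread (about 3% and about 1%). The function name `hardy_weinberg` is just illustrative.

```python
def hardy_weinberg(maf):
    """Expected genotype frequencies for a biallelic site under
    Hardy-Weinberg equilibrium, given the minor allele frequency."""
    q = maf        # minor allele frequency
    p = 1.0 - q    # major allele frequency
    return {
        "major/major": p * p,
        "het": 2.0 * p * q,
        "minor/minor": q * q,
    }


# Minor allele frequencies quoted in this thread (approximate values).
for maf in (0.03, 0.01):
    freqs = hardy_weinberg(maf)
    ratio = freqs["het"] / freqs["minor/minor"]
    print(
        f"MAF {maf:.2f}: het {freqs['het']:.4f}, "
        f"minor/minor {freqs['minor/minor']:.4f}, "
        f"het per minor homozygote {ratio:.1f}"
    )
```

Under these assumptions, carriers of a rare allele are overwhelmingly heterozygous, so a set of reports containing only II and DD calls is hard to reconcile with "I" being the rare allele.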
- It seems like SNPedia should note this situation somehow. I propose not only adding text on this page, but also changing the (I;I) page, which currently redirects to (T;T) (which has bad Repute), so that it redirects to the (-;-) page instead. I'd add text to the (-;-) page too. I'd also add a redirect from (D;D), which Genes For Good produces (and perhaps Ancestry as well), to the (-;-) page. On the other hand, the true insertion minor allele does have a higher frequency in Africans (I seem to remember seeing 10%), and there are some true homozygous insertions, so maybe it would be better to leave (I;I) redirecting to the bad (T;T) and just add text there saying it's probably false if it came from 23andMe. Any thoughts on this? --Deoxydog (talk) 02:39, 8 October 2016 (UTC)