Bulk

From SNPedia

Reminder

The content in SNPedia is available under a Creative Commons Attribution-Noncommercial-Share Alike 3.0 United States License. Commercial licenses are available from info@snpedia.com.

Introduction

Based on the format, frequency and complexity of your particular needs, you may wish to consider these sources:

The European Bioinformatics Institute hosts a DAS (Distributed Annotation System) source for SNPedia: http://www.ebi.ac.uk/das-srv/easydas/bernat/das/SNPedia/features?segment=10:1,51319502

http://kokki.uku.fi/bioinformatics/varietas/ provides a web interface which includes SNPedia content

The file at http://www.snpedia.com/files/gbrowse/SNPedia.gff is updated semi-regularly and can be parsed to provide a reasonably current list of SNPs.
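A minimal sketch of parsing such a file in Python. It assumes the standard nine tab-separated GFF columns; whether SNPedia.gff follows that layout exactly has not been checked here, and the sample lines are made up for illustration. The names parse_gff and GFF_COLUMNS are hypothetical.

```python
# Standard GFF column names; that SNPedia.gff uses exactly these is an assumption.
GFF_COLUMNS = ["seqid", "source", "type", "start", "end",
               "score", "strand", "phase", "attributes"]

def parse_gff(text):
    """Return one dict per feature line, skipping comments and blank lines."""
    records = []
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) < len(GFF_COLUMNS):
            continue  # not a well-formed feature line
        record = dict(zip(GFF_COLUMNS, fields))
        record["start"] = int(record["start"])
        record["end"] = int(record["end"])
        records.append(record)
    return records

# Made-up sample lines for illustration only; not copied from the real file.
SAMPLE = (
    "##gff-version 3\n"
    "chr1\tsnpedia\tsnp\t100\t100\t.\t+\t.\tName=rs0000001\n"
    "chr2\tsnpedia\tsnp\t250\t250\t.\t+\t.\tName=rs0000002\n"
)

features = parse_gff(SAMPLE)
for feature in features:
    print(feature["seqid"], feature["start"], feature["attributes"])
```

To use it on the real file, download the file once, read it into a string, and pass that string to parse_gff. The attributes column is left as raw text because its format in the real file is not known here.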

Page History

Bots that try to pull every version of every page crush the server and will be banned long before they complete the full scrape.
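One way to keep a bot gentle on the server is to space out its requests. The sketch below is a generic throttle, not an SNPedia requirement: the page does not state a rate limit, so the interval is an assumption to tune, and the helper name throttled is hypothetical.

```python
import time

def throttled(items, min_interval=1.0, clock=time.monotonic, sleep=time.sleep):
    """Yield items no faster than one per min_interval seconds."""
    last = None
    for item in items:
        if last is not None:
            wait = min_interval - (clock() - last)
            if wait > 0:
                sleep(wait)
        last = clock()
        yield item

# Tiny interval so the demo finishes quickly; a real bot would wait much longer.
for title in throttled(["rs1815739", "rs4420638", "rs6152"], min_interval=0.01):
    print(title)  # a real bot would fetch the page here
```

The clock and sleep parameters exist only so the pacing can be checked without actually waiting.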

Programmers

Please aim your bots at http://bots.snpedia.com/api.php and see these two projects
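If you are not using one of the libraries, the endpoint can also be queried with plain GET requests. The sketch below only builds the URL for a standard MediaWiki categorymembers query; it does not contact the server. The function name category_members_url is hypothetical, and it is an assumption that the SNPedia installation accepts these standard parameters, including cmcontinue for paging.

```python
from urllib.parse import urlencode

API_URL = "http://bots.snpedia.com/api.php"  # endpoint named in the text above

def category_members_url(category, limit=500, continue_token=None):
    """Build a GET URL for one batch of category members."""
    params = {
        "action": "query",
        "list": "categorymembers",
        "cmtitle": "Category:" + category,
        "cmlimit": limit,
        "format": "json",
    }
    if continue_token is not None:
        # token taken from the previous response to request the next batch
        params["cmcontinue"] = continue_token
    return API_URL + "?" + urlencode(params)

print(category_members_url("Is_a_snp"))
```

Fetching the URL and following the continuation token until it disappears would give the full member list; that part is left out so the example stays offline.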

Perl

Please be sure to include the line

$bot->{api}->{use_http_get} = 1;

which is necessary to ensure GET instead of POST for some older versions of the library.

Semantic-MediaWiki-Bot is a new Semantic MediaWiki aware bot library.

Get all SNP names

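#!/usr/bin/env perl
use strict;
use warnings;
use MediaWiki::Bot;

# connect to the SNPedia bot endpoint
my $bot = MediaWiki::Bot->new({
   protocol => 'http',
   host => 'bots.snpedia.com',
   path => '/',
   });
$bot->{api}->{use_http_get} = 1;   # force GET instead of POST

# max => 0 asks for every member of the category
my @rsnums = $bot->get_pages_in_category('Category:Is_a_snp', {max=>0});
print join("\n",@rsnums),"\n";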

How can I grab the text from pages?

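#!/usr/bin/env perl
use strict;
use warnings;
use MediaWiki::Bot;

my $bot = MediaWiki::Bot->new({
   protocol => 'http',
   host => 'bots.snpedia.com',
   path => '/',
   });
$bot->{api}->{use_http_get} = 1;   # force GET instead of POST

# fetch and print the raw wiki text of each listed page
foreach my $rs ('rs1815739',
                'rs4420638',
                'rs6152') {
   my $text = $bot->get_text($rs);
   next unless defined $text;       # skip pages that could not be read
   print '=' x 20,"$rs\n";
   print $text;
}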


I need Genotypes and their Magnitude

#!/usr/bin/env perl
use strict;
use warnings;
use MediaWiki::Bot;

my $bot = MediaWiki::Bot->new({
   protocol => 'http',
   host => 'bots.snpedia.com',
   path => '/',
   });
$bot->{api}->{use_http_get} = 1;
my $text = $bot->get_text('rs1234');
print '=' x 20,"$text\n";
print "\n\nThe above text should prove that we can read from SNPedia\n";
print "Getting some more info from SNPedia\n";

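# max => 0 asks for every member of the category
my @genotype = $bot->get_pages_in_category('Category:Is a genotype', {max=>0});

foreach my $geno (@genotype) {
   my $genotext = $bot->get_text($geno);
   next unless defined $genotext;   # skip pages that could not be read
   # value of the "magnitude = ..." template field, if present
   my ($magnitude)     = $genotext =~ m/magnitude\s*=\s*([-+.\d]+)/;
   # first 3 to 30 characters after the template's closing braces
   my ($beginningtext) = $genotext =~ m/\}\}(.{3,30})/s;
   $beginningtext = $genotext unless $beginningtext;
   $beginningtext =~ tr/\n/ /;
   $magnitude = '' unless defined $magnitude;
   print "Magnitude\t${magnitude}\tfor\t${geno}\t${beginningtext}\n";
}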

Python

These examples use the python-wikitools library.

Get all SNP names

from wikitools import wiki, category

site = wiki.Wiki("http://bots.snpedia.com/api.php")    # open SNPedia
snps = category.Category(site, "Is_a_snp")
snpedia = []

# collect all SNP names (lower-cased) in a list and print each title
for article in snps.getAllMembersGen(namespaces=[0]):
    snpedia.append(article.title.lower())
    print(article.title)

Grab a single SNP-page in full text

You get back a string that contains the unformatted wiki-code:

from wikitools import wiki, page

site = wiki.Wiki("http://bots.snpedia.com/api.php")
snp = "rs7412"
pagehandle = page.Page(site, snp)    # look the page up by its title
snp_page = pagehandle.getWikiText()
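Once you have the wiki-code as a string, individual template fields can be pulled out with a regular expression. The sketch below mirrors the magnitude pattern from the Perl genotype example on this page. The sample text is hypothetical and only illustrates the shape of a template, and the helper name extract_magnitude is an assumption.

```python
import re

# Same idea as the Perl pattern on this page: a "magnitude = <number>" field.
MAGNITUDE_RE = re.compile(r"magnitude\s*=\s*([-+]?\d+(?:\.\d+)?)")

def extract_magnitude(wikitext):
    """Return the magnitude as a float, or None if the field is absent."""
    match = MAGNITUDE_RE.search(wikitext)
    return float(match.group(1)) if match else None

# Hypothetical wikitext for illustration; real pages may be laid out differently.
SAMPLE_PAGE = """{{Genotype
|rsid=0000001
|allele1=C
|allele2=T
|magnitude=2.5
|summary=example summary
}}
Free text follows the template."""

print(extract_magnitude(SAMPLE_PAGE))
print(extract_magnitude("no template here"))
```

Returning None for a missing field lets the caller tell "no magnitude recorded" apart from a magnitude of zero.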


Ruby

These examples use the mediawiki-gateway gem.

Please use version 0.5.0 or later because of http://github.com/jpatokal/mediawiki-gateway/issues/24

Grab all SNP-pages that contain a specific text and iterate over the content

This example grabs all genotype pages of a specific SNP:

require 'media_wiki'

snp = "Rs7412"
mw = MediaWiki::Gateway.new("http://bots.snpedia.com/api.php")
pages = mw.list(snp + "(")  # array of page titles that start with "Rs7412("
unless pages.nil?
  # iterate over the results and grab the full text of each page
  pages.each do |title|
    single_page = mw.get(title)
    puts single_page
  end
end