Examples
- Select all taxa from the UniProt taxonomy:
  SELECT ?taxon
  FROM <http://sparql.uniprot.org/taxonomy>
  WHERE {
    ?taxon a up:Taxon .
  }
- Select all bacterial taxa and their scientific name from the UniProt taxonomy:
  SELECT ?taxon ?name
  WHERE {
    ?taxon a up:Taxon .
    ?taxon up:scientificName ?name .
    # Taxon subclasses are materialized, do not use rdfs:subClassOf+
    ?taxon rdfs:subClassOf taxon:2 .
  }
- Select all UniProt entries, and their organism and amino acid sequences (including isoforms), for E. coli K12 and all its strains:
  SELECT ?protein ?organism ?isoform ?sequence
  WHERE {
    ?protein a up:Protein .
    ?protein up:organism ?organism .
    # Taxon subclasses are materialized, do not use rdfs:subClassOf+
    ?organism rdfs:subClassOf taxon:83333 .
    ?protein up:sequence ?isoform .
    ?isoform rdf:value ?sequence .
  }
- Select the UniProt entry with the mnemonic 'A4_HUMAN':
  SELECT ?protein
  WHERE {
    ?protein a up:Protein .
    ?protein up:mnemonic 'A4_HUMAN'
  }
- Select a mapping of UniProt to PDB entries using the UniProt cross-references to the PDB database:
  SELECT ?protein ?db
  WHERE {
    ?protein a up:Protein .
    ?protein rdfs:seeAlso ?db .
    ?db up:database <http://purl.uniprot.org/database/PDB>
  }
- Select all cross-references to external databases of the category '3D structure databases' of UniProt entries that are classified with the keyword 'Acetoin biosynthesis (KW-0005)':
  SELECT DISTINCT ?link
  WHERE {
    ?protein a up:Protein .
    ?protein up:classifiedWith keywords:5 .
    ?protein rdfs:seeAlso ?link .
    ?link up:database ?db .
    ?db up:category '3D structure databases'
  }
- Select reviewed UniProt entries (Swiss-Prot), and their recommended protein name, that have a preferred gene name that contains the text 'DNA':
  SELECT ?protein ?name
  WHERE {
    ?protein a up:Protein .
    ?protein up:reviewed true .
    ?protein up:recommendedName ?recommended .
    ?recommended up:fullName ?name .
    ?protein up:encodedBy ?gene .
    ?gene skos:prefLabel ?text .
    FILTER CONTAINS(?text, 'DNA')
  }
- Select the preferred gene name and disease annotation of all human UniProt entries that are known to be involved in a disease:
  SELECT ?name ?text
  WHERE {
    ?protein a up:Protein .
    ?protein up:organism taxon:9606 .
    ?protein up:encodedBy ?gene .
    ?gene skos:prefLabel ?name .
    ?protein up:annotation ?annotation .
    ?annotation a up:Disease_Annotation .
    ?annotation rdfs:comment ?text
  }
- Select all human UniProt entries with a sequence variant that leads to a 'loss of function':
  SELECT ?protein ?text
  WHERE {
    ?protein a up:Protein .
    ?protein up:organism taxon:9606 .
    ?protein up:annotation ?annotation .
    ?annotation a up:Natural_Variant_Annotation .
    ?annotation rdfs:comment ?text .
    FILTER (CONTAINS(?text, 'loss of function'))
  }
- Select all human UniProt entries with a sequence variant that leads to a tyrosine to phenylalanine substitution:
  SELECT ?protein ?annotation ?begin ?text
  WHERE {
    ?protein a up:Protein ;
      up:organism taxon:9606 ;
      up:annotation ?annotation .
    ?annotation a up:Natural_Variant_Annotation ;
      rdfs:comment ?text ;
      up:substitution ?substitution ;
      up:range/faldo:begin [ faldo:position ?begin ; faldo:reference ?sequence ] .
    ?sequence rdf:value ?value .
    BIND (substr(?value, ?begin, 1) as ?original) .
    FILTER(?original = 'Y' && ?substitution = 'F') .
  }
- Select all UniProt entries with annotated transmembrane regions and the regions' begin and end coordinates on the canonical sequence:
  SELECT ?protein ?begin ?end
  WHERE {
    ?protein a up:Protein .
    ?protein up:annotation ?annotation .
    ?annotation a up:Transmembrane_Annotation .
    ?annotation up:range ?range .
    ?range faldo:begin/faldo:position ?begin .
    ?range faldo:end/faldo:position ?end
  }
- Select all UniProt entries that were integrated on the 30th of November 2010:
  SELECT ?protein
  WHERE {
    ?protein a up:Protein .
    ?protein up:created '2010-11-30'^^xsd:date
  }
- Was any UniProt entry integrated on the 9th of January 2013?
  ASK
  WHERE {
    ?protein a up:Protein .
    ?protein up:created '2013-01-09'^^xsd:date
  }
- Construct new triples of the type 'HumanProtein' from all human UniProt entries:
  CONSTRUCT {
    ?protein a up:HumanProtein .
  }
  WHERE {
    ?protein a up:Protein .
    ?protein up:organism taxon:9606
  }
- Select all triples that relate to the EMBL CDS entry AAO89367.1:
  DESCRIBE <http://purl.uniprot.org/embl-cds/AAO89367.1>
- More examples
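The example queries use prefixed names such as up:, taxon: and faldo: without declaring them, so they are not self-contained when copied into another client. A minimal sketch, assuming the conventional UniProt, W3C and FALDO namespace IRIs listed below (none of them are stated on this page, and the helper name with_prefixes is hypothetical), of prepending the needed PREFIX declarations to a query:

```python
import re

# Assumed namespace IRIs (not given in the text above); verify before use.
PREFIXES = {
    "up": "http://purl.uniprot.org/core/",
    "taxon": "http://purl.uniprot.org/taxonomy/",
    "keywords": "http://purl.uniprot.org/keywords/",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "skos": "http://www.w3.org/2004/02/skos/core#",
    "xsd": "http://www.w3.org/2001/XMLSchema#",
    "faldo": "http://biohackathon.org/resource/faldo#",
}


def with_prefixes(query):
    """Prepend a PREFIX line for every known prefix that the query uses."""
    used = [
        name
        for name in PREFIXES
        # The prefix must not be preceded by a word character or a colon,
        # so "rdf:" is not detected inside "rdfs:" or inside longer names.
        if re.search(r"(?<![\w:])" + re.escape(name) + ":", query)
    ]
    header = "\n".join("PREFIX {}: <{}>".format(n, PREFIXES[n]) for n in used)
    return header + "\n" + query if header else query
```

The detection is a plain text match, so a prefix that appears only inside a comment or string literal would also be declared; an unused declaration is harmless in SPARQL.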
About
This SPARQL endpoint contains all UniProt data. It is free to access and supports the SPARQL 1.1 Standard.
There are 140,325,185,546 triples in this release (2023_05). The query timeout is 45 minutes. All triples are available in the default graph. There are 21 named graphs.
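Because the endpoint supports SPARQL 1.1, it can presumably be queried over HTTP with the SPARQL 1.1 Protocol. The following is a minimal sketch using only the Python standard library; the endpoint URL https://sparql.uniprot.org/sparql, the up: namespace IRI, and the helper names build_request and run_select are assumptions that are not stated above.

```python
import json
import urllib.parse
import urllib.request

# Assumed endpoint URL; it is not given in the text above.
ENDPOINT = "https://sparql.uniprot.org/sparql"

# One of the example queries, with an assumed namespace IRI for up:.
QUERY = """
PREFIX up: <http://purl.uniprot.org/core/>
SELECT ?protein
WHERE {
  ?protein a up:Protein .
  ?protein up:mnemonic 'A4_HUMAN'
}
"""


def build_request(query, endpoint=ENDPOINT):
    """Build a SPARQL Protocol GET request that asks for JSON results."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    return urllib.request.Request(
        url, headers={"Accept": "application/sparql-results+json"}
    )


def run_select(query, timeout=60.0):
    """Send a SELECT query and return its bindings (needs network access)."""
    # The client-side timeout here is independent of the server's query timeout.
    with urllib.request.urlopen(build_request(query), timeout=timeout) as resp:
        return json.load(resp)["results"]["bindings"]
```

Calling run_select(QUERY) would return a list of dictionaries, one per result row, keyed by variable name. Long-running queries may need a larger client-side timeout than the 60 seconds chosen here.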