uniprot_potential_isoforms: List all human UniProt entries and their computationally mapped potential isoforms.

PREFIX taxon: <http://purl.uniprot.org/taxonomy/>
PREFIX up: <http://purl.uniprot.org/core/>
SELECT ?entry ?sequence
WHERE {
  # We don't want to look into the UniParc graph, which would
  # confuse matters
  GRAPH <http://sparql.uniprot.org/uniprot> {
    # we need the UniProt entries that are human
    ?entry a up:Protein ;
      up:organism taxon:9606 ;
      # and we select the computationally mapped sequences
      up:potentialSequence ?sequence .
  }
}

uniprot_primary_accession: Extracting a UniProtKB primary accession from our IRIs is done with a bit of string manipulation. While UniProt primary accessions are unique within UniProtKB, they may be reused, by accident or intentionally, by other data sources. If we provided them as strings (not IRIs) and you used them in a query that way, you might accidentally retrieve completely wrong records.

PREFIX uniprotkb: <http://purl.uniprot.org/uniprot/>
PREFIX up: <http://purl.uniprot.org/core/>
SELECT ?primaryAccession ?protein
WHERE {
  ?protein a up:Protein .
  BIND(substr(str(?protein), strlen(str(uniprotkb:))+1) AS ?primaryAccession)
}
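The same string manipulation can be done on the client side once the IRIs have been retrieved. The following is a minimal Python sketch, not part of the UniProt examples; the function names `primary_accession` and `protein_iri` are hypothetical, and the namespace is the one bound to the `uniprotkb:` prefix in the query.

```python
# Namespace bound to the uniprotkb: prefix in the query.
UNIPROTKB_NAMESPACE = "http://purl.uniprot.org/uniprot/"


def primary_accession(protein_iri_value: str) -> str:
    """Return the accession part of a UniProtKB entry IRI (hypothetical helper)."""
    if not protein_iri_value.startswith(UNIPROTKB_NAMESPACE):
        raise ValueError(f"not a UniProtKB entry IRI: {protein_iri_value}")
    # Same idea as substr(str(?protein), strlen(str(uniprotkb:)) + 1):
    # skip the namespace and keep everything after it.
    return protein_iri_value[len(UNIPROTKB_NAMESPACE):]


def protein_iri(accession: str) -> str:
    """Rebuild the entry IRI, so that queries can keep using IRIs, not bare strings."""
    return UNIPROTKB_NAMESPACE + accession


print(primary_accession("http://purl.uniprot.org/uniprot/P05067"))
```

Going back from an accession to the IRI with `protein_iri` before querying keeps the lookup unambiguous, which is the point the description makes about reused accessions.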

uniprot_proteome_location_of_gene: List UniProt proteins together with the genetic replicon on which they are encoded, using the Proteome data.

PREFIX taxon: <http://purl.uniprot.org/taxonomy/>
PREFIX up: <http://purl.uniprot.org/core/>
SELECT DISTINCT ?protein ?proteomeData ?replicon ?proteome
WHERE {
  # reviewed entries (UniProtKB/Swiss-Prot)
  ?protein up:reviewed true .
  # restricted to Human taxid
  # (the same ?protein variable is used throughout so that the
  # patterns are joined on one entry)
  ?protein up:organism taxon:9606 .
  ?protein up:proteome ?proteomeData .
  # the proteome IRI precedes the "#", the replicon follows it
  BIND( strbefore( str(?proteomeData), "#" ) as ?proteome )
  BIND( strafter( str(?proteomeData), "#" ) as ?replicon )
}
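The two BIND expressions split the `up:proteome` value at its first "#". A minimal Python sketch of the same split follows; the function name `split_proteome_component` is hypothetical, and the example IRI only illustrates the assumed shape of a proteome component IRI.

```python
from typing import Tuple


def split_proteome_component(proteome_data_iri: str) -> Tuple[str, str]:
    """Split an IRI at its first '#', like strbefore/strafter in the query."""
    if "#" not in proteome_data_iri:
        # SPARQL strbefore and strafter both return an empty string
        # when the separator is absent.
        return "", ""
    proteome, _, replicon = proteome_data_iri.partition("#")
    return proteome, replicon


# Illustrative value (assumed shape); the fragment stays percent-encoded,
# because the query does not decode it either.
example = "http://purl.uniprot.org/proteomes/UP000005640#Chromosome%201"
proteome, replicon = split_proteome_component(example)
print(proteome)
print(replicon)
```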

uniprot_recomended_protein_full_name: The recommended protein full names for UniProtKB entries

PREFIX up: <http://purl.uniprot.org/core/>
SELECT ?protein ?fullName
WHERE {
  ?protein a up:Protein ;
    up:recommendedName ?recommendedName .
  ?recommendedName up:fullName ?fullName .
}

uniprot_recomended_protein_short_name: The recommended protein short names for UniProtKB entries

PREFIX up: <http://purl.uniprot.org/core/>
SELECT ?protein ?shortName
WHERE {
  ?protein a up:Protein ;
    up:recommendedName ?recommendedName .
  ?recommendedName up:shortName ?shortName .
}

uniprot_reviewed_or_not: List all UniProt proteins and whether they are reviewed (Swiss-Prot) or unreviewed (TrEMBL).

PREFIX up: <http://purl.uniprot.org/core/>
SELECT ?protein ?reviewed
WHERE {
  ?protein a up:Protein .
  ?protein up:reviewed ?reviewed .
}

uniprot_sequences_and_mark_which_is_cannonical_for_human: List all human UniProt entries and their sequences, marking whether each listed sequence is the canonical sequence of the matching entry.

PREFIX taxon: <http://purl.uniprot.org/taxonomy/>
PREFIX up: <http://purl.uniprot.org/core/>
SELECT ?entry ?sequence ?isCanonical
WHERE {
  # We don't want to look into the UniParc graph, which would
  # confuse matters
  GRAPH <http://sparql.uniprot.org/uniprot> {
    # we need the UniProt entries that are human
    ?entry a up:Protein ;
      up:organism taxon:9606 ;
      up:sequence ?sequence .
    # If the sequence is a "Simple_Sequence" it is likely to be the
    # canonical sequence
    OPTIONAL {
      ?sequence a up:Simple_Sequence .
      BIND(true AS ?likelyIsCanonical)
    }
    # unless we are dealing with an external isoform
    # see https://www.uniprot.org/help/canonical_and_isoforms
    OPTIONAL {
      FILTER(?likelyIsCanonical)
      ?sequence a up:External_Sequence .
      BIND(true AS ?isComplicated)
    }
    # If it is an external isoform, its ID would not match the
    # entry primary accession.
    # COALESCE turns an unbound flag into false, so that IF does not
    # raise an error and ?isCanonical is always bound.
    BIND(IF(COALESCE(?isComplicated, false),
            STRENDS(STR(?entry), STRBEFORE(SUBSTR(STR(?sequence), 34), '-')),
            COALESCE(?likelyIsCanonical, false)) AS ?isCanonical)
  }
}
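The decision made in the final BIND can be easier to follow outside SPARQL. The sketch below restates the intended logic in Python under stated assumptions: the function `is_canonical` and its `sequence_types` argument (the local names of the classes the sequence belongs to) are hypothetical, and the second example IRI is invented for illustration. The offset 34 in the query is 1-based, so it skips the 33 characters of the isoform namespace.

```python
from typing import Set

# Namespace of sequence IRIs; 33 characters long, which is why the query
# uses SUBSTR(STR(?sequence), 34) with SPARQL's 1-based indexing.
ISOFORM_NAMESPACE = "http://purl.uniprot.org/isoforms/"


def is_canonical(entry_iri: str, sequence_iri: str, sequence_types: Set[str]) -> bool:
    """Hypothetical restatement of the ?isCanonical expression."""
    likely_canonical = "Simple_Sequence" in sequence_types
    is_complicated = likely_canonical and "External_Sequence" in sequence_types
    if not is_complicated:
        return likely_canonical
    # Isoform identifier such as "P05067-1", then the part before the "-".
    isoform_id = sequence_iri[len(ISOFORM_NAMESPACE):]
    accession, dash, _ = isoform_id.partition("-")
    if not dash:
        accession = ""  # STRBEFORE returns "" when there is no "-"
    # An external isoform belongs to another entry, so the accessions differ.
    return entry_iri.endswith(accession)


entry = "http://purl.uniprot.org/uniprot/P05067"
print(is_canonical(entry, ISOFORM_NAMESPACE + "P05067-1", {"Simple_Sequence"}))
# Invented external isoform IRI, for illustration only.
print(is_canonical(entry, ISOFORM_NAMESPACE + "Q00000-2",
                   {"Simple_Sequence", "External_Sequence"}))
```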

uniprot_signature_match_start_end: List the start and end positions of all InterPro member database signature matches for a specific UniProt protein.

PREFIX faldo: <http://biohackathon.org/resource/faldo#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX up: <http://purl.uniprot.org/core/>
SELECT ?protein ?interproMemberDatabaseXref ?matchStart ?matchEnd
WHERE {
  GRAPH <http://sparql.uniprot.org/uniprot> {
    VALUES ?protein {<http://purl.uniprot.org/uniprot/P05067>} .
    ?protein rdfs:seeAlso ?sa .
  }
  GRAPH <http://sparql.uniprot.org/uniparc> {
    ?uniparc up:sequenceFor ?protein ;
      rdfs:seeAlso ?interproMemberDatabaseXref .
    # the same cross-reference variable is used here, so that each
    # match is reported with the signature it belongs to
    ?interproMemberDatabaseXref up:signatureSequenceMatch ?sam .
    ?sam faldo:begin ?sab ;
      faldo:end ?sae .
    ?sab faldo:position ?matchStart ;
      faldo:reference ?uniparc .
    ?sae faldo:position ?matchEnd ;
      faldo:reference ?uniparc .
  }
}

uniprot_transporter_in_liver: Find reviewed human transporter proteins in UniProtKB that are expressed in the liver (uses Bgee and UBERON).

PREFIX genex: <http://purl.org/genex#>
PREFIX lscr: <http://purl.org/lscr#>
PREFIX obo: <http://purl.obolibrary.org/obo/>
PREFIX orth: <http://purl.org/net/orth#>
PREFIX rh: <http://rdf.rhea-db.org/>
PREFIX taxon: <http://purl.uniprot.org/taxonomy/>
PREFIX uberon: <http://purl.obolibrary.org/obo/UBERON_>
PREFIX up: <http://purl.uniprot.org/core/>
SELECT ?rhea ?protein ?anat
WHERE {
  # reactions that Rhea marks as transport reactions
  GRAPH <https://sparql.rhea-db.org/rhea> {
    ?rhea rh:isTransport true .
  }
  # reviewed human entries annotated with such a reaction
  ?protein up:reviewed true .
  ?protein up:annotation ?ann .
  ?protein up:organism taxon:9606 .
  ?ann up:catalyticActivity ?ca .
  ?ca up:catalyzedReaction ?rhea .
  # UBERON:0002107 is the liver
  BIND(uberon:0002107 AS ?anat)
  SERVICE <https://www.bgee.org/sparql> {
    ?seq genex:isExpressedIn ?anat .
    ?seq lscr:xrefUniprot ?protein .
    ?seq orth:organism ?organism .
    ?organism obo:RO_0002162 taxon:9606 .
  }
}

uniprot_unamed_plasmids: Sometimes it is known that the gene encoding a UniProtKB protein is located on a plasmid, but the name of the plasmid is unknown.

PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX up: <http://purl.uniprot.org/core/>
SELECT ?protein ?plasmidOrOrganelle ?label
WHERE {
  ?protein a up:Protein ;
    up:encodedIn ?plasmidOrOrganelle .
  OPTIONAL {
    ?plasmidOrOrganelle rdfs:label ?label .
  }
}