Semantic resources project/PRO

We are developing the "PRO Resource" for packaging in Neurocommons and for use by the Science Collaboration Framework and SWAN software. The Protein Ontology (PRO), under development by PIR at Georgetown University, covers proteins and related terms (complexes, families, fragments, and isoforms). PRO provides a rich vocabulary of protein concepts, useful for a precise understanding of biological experiments and their results.

Our PRO resource consists of two parts: our contribution to the public PRO ontology and modeling effort, and our annotations of existing resources with links to the PRO ontology. For the first part: in representing antibodies and mouse disease models, we have found references to modified proteins and peptide fragments that have no corresponding terms in PRO. We are working with PRO to establish a social process for requesting identifiers in batches, and we are using that process to request new PRO identifiers for these references. The second part of our resource is the creation of links to PRO from other resources: antibodies, pathways, and mouse models for disease.

A complete set of resources, interlinked through common annotation with PRO terms, will allow those resources to be integrated in ways that are relevant to their biological meaning. For example, antibodies can be annotated as specific to a particular form or fragment of a protein instead of the protein itself, thereby avoiding false search results and confusion when the resource is used by a research scientist.

PRO Submission
We are working on draft file formats for batch submission to PRO, covering both new terms and new annotations.
 * Batch-submission File Formats

We are also designing a software "broker" to act as an intermediary for services that need to automatically generate temporary identifiers, with minimal information, for later submission to PRO.
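The broker idea can be sketched in a few lines of Python. This is only an illustration of the concept, not the broker's actual design: the class name `TermBroker`, the `TMP:` prefix, the seven-digit numbering, and the stored fields are all assumptions made for the example.

```python
from itertools import count


class TermBroker:
    """Hypothetical broker: hands out temporary IDs and tracks pending requests."""

    def __init__(self, prefix="TMP"):
        self.prefix = prefix          # assumed prefix for temporary identifiers
        self._counter = count(1)      # simple in-memory sequence
        self._terms = {}              # temp_id -> request record

    def request(self, name, parent_id=""):
        """Register a new term with minimal information; return its temporary ID."""
        temp_id = f"{self.prefix}:{next(self._counter):07d}"
        self._terms[temp_id] = {"name": name, "parent_id": parent_id, "pro_id": None}
        return temp_id

    def pending(self):
        """Temporary IDs that have not yet been given a permanent PRO identifier."""
        return [tid for tid, rec in self._terms.items() if rec["pro_id"] is None]

    def resolve(self, temp_id, pro_id):
        """Record the permanent identifier once PRO has assigned one."""
        self._terms[temp_id]["pro_id"] = pro_id
```

A client service would call `request()` when it meets a protein form with no PRO term, keep using the returned temporary ID, and later batch everything returned by `pending()` into a submission.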

Links

 * PAF.txt : Annotation file for PRO terms.
 * RACE PRO : Structured term submission form.
 * SourceForge Term Requests : bug-tracker style free-text submissions for new terms from PRO.

Meeting Notes

 * Darren Natale (November 11, 2009)

Searching PRO
The ability to index and search the PRO ontology is important for two of our tools: the antibody annotation software and the ontology broker.


 * Lucene Index for PRO
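To illustrate the kind of lookup such an index supports, here is a minimal in-memory inverted index in Python. It is a stand-in for the idea only and does not use Lucene or reflect the real index layout; the function names and the `EX:` identifiers are hypothetical.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lower-case the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(terms):
    """Map each token to the set of term IDs whose names or synonyms contain it."""
    index = defaultdict(set)
    for term_id, labels in terms.items():
        for label in labels:
            for token in tokenize(label):
                index[token].add(term_id)
    return index


def search(index, query):
    """Return the IDs of terms that match every token in the query."""
    tokens = tokenize(query)
    if not tokens:
        return set()
    hits = [index.get(token, set()) for token in tokens]
    return set.intersection(*hits)


# Hypothetical identifiers; the labels are taken from this page.
EXAMPLE_TERMS = {
    "EX:1": ["presenilin-1", "PSEN-1"],
    "EX:2": ["presenilin-2", "PSEN-2"],
    "EX:3": ["nicastrin"],
}
```

Both the antibody annotation software and the broker need this "free text in, candidate terms out" step; a real index would add ranking and synonym handling.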

Tau Protein
/Tau

APP
/APP & Cleavage Products

SWAN Proteins

 * A list of 125 proteins cited by SWAN 1.0 Discourse Elements: [[Media:Cited-proteins.txt|cited-proteins.txt]]

Heat Shock Proteins

 * Hsp104 UniProtKB P31539 - REF830863 on November 13, 2009
 * Hsp105/Hsp110 UniProtKB Q92598 - REF163374 on November 13, 2009

Via Cecilia, an outline of the revised PRO hierarchy for some portion of the heat shock protein family:



Version of HSP proteins with species specific terms but not families yet: [[Media:Pro-hsps-2010-03-19.obo]]
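An OBO file of this kind can be read with a few lines of Python. The sketch below handles only `[Term]` stanzas and simple `tag: value` lines, and it treats everything after `!` as a comment, which is a simplification. The sample text and its `EX:` identifiers are invented for illustration and are not taken from the file above.

```python
def parse_obo_terms(text):
    """Return one dict per [Term] stanza, mapping each tag to a list of values."""
    terms = []
    current = None
    for raw in text.splitlines():
        line = raw.split("!", 1)[0].strip()   # drop trailing "! comment" text
        if line.startswith("["):
            # Start of a new stanza; only [Term] stanzas are collected.
            current = {} if line == "[Term]" else None
            if current is not None:
                terms.append(current)
        elif current is not None and ": " in line:
            tag, value = line.split(": ", 1)
            current.setdefault(tag, []).append(value)
    return terms


# Invented sample data, for illustration only.
SAMPLE_OBO = """\
format-version: 1.2

[Term]
id: EX:0000001
name: heat shock protein (example parent)

[Term]
id: EX:0000002
name: Hsp104 (example child)
is_a: EX:0000001 ! heat shock protein (example parent)

[Typedef]
id: part_of
"""
```

Following the `is_a` values from child to parent is enough to rebuild a hierarchy outline from such a file.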

Protein Complexes
As an example to work from, here is a multi-protein complex of great interest in AD research. Below are the descriptions and UniProt IDs of its components, provided by our resident expert, Gwen Wong. Her descriptions show that we need to be able to represent dimers and the different permutations of this protein complex.

Elizabeth

The gamma-secretase complex is composed of a presenilin homodimer (PSEN1 or PSEN2), nicastrin (NCSTN), APH1 (APH1A or APH1B) and PEN2.


 * Human presenilin-1 (PSEN-1): http://www.uniprot.org/uniprot/P49768
 * Human presenilin-2 (PS2, PSEN-2): http://www.uniprot.org/uniprot/P49810

These are the catalytic aspartyl proteases of gamma-secretase.


 * Human PEN2: http://www.uniprot.org/uniprot/Q9NZ42 (a subunit of gamma-secretase)
 * Nicastrin: http://www.uniprot.org/uniprot/Q92542 (a subunit of gamma-secretase)
 * APH-1A: http://www.uniprot.org/uniprot/Q96BI3
 * APH-1B: http://www.uniprot.org/uniprot/Q8WW43

APH-1A and APH-1B are homologues (see the Bart De Strooper / Serneels hypothesis). APH-1A is important in embryonic development; APH-1B is relevant for AD. Both are components of gamma-secretase.
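The permutations of this complex can be enumerated mechanically. The Python sketch below takes the subunit alternatives and UniProt accessions from the description above and lists each possible composition; the data layout and the function name `complex_variants` are assumptions for illustration, not a PRO representation.

```python
from itertools import product

# Alternatives for each slot in the complex, with UniProt accessions from this page.
SUBUNIT_CHOICES = {
    "presenilin": {"PSEN1": "P49768", "PSEN2": "P49810"},
    "nicastrin": {"NCSTN": "Q92542"},
    "APH1": {"APH1A": "Q96BI3", "APH1B": "Q8WW43"},
    "PEN2": {"PEN2": "Q9NZ42"},
}

# Presenilin is described as a homodimer, so it is counted twice.
COPIES = {"presenilin": 2}


def complex_variants():
    """Yield each composition as a list of (subunit, accession, copies) tuples."""
    slots = list(SUBUNIT_CHOICES)
    for combo in product(*(SUBUNIT_CHOICES[slot] for slot in slots)):
        yield [
            (name, SUBUNIT_CHOICES[slot][name], COPIES.get(slot, 1))
            for slot, name in zip(slots, combo)
        ]


if __name__ == "__main__":
    for variant in complex_variants():
        print(" + ".join(f"{copies}x {name} ({acc})" for name, acc, copies in variant))
```

Two presenilin choices and two APH1 choices give four variants, each of which would need its own term if the permutations are to be distinguished.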

Gene Annotations
Gene Annotations