PRO Broker User Stories

= PRO Broker User Stories =

== Domain Expert ==
The "domain expert" uses the broker as an ontology-authoring tool. She wants to represent a portion of her expert knowledge as completely as possible, matching to existing terms where possible and submitting new terms everywhere else.

The broker should:
 * provide reviewable (OBO) output
 * check for consistency of representation against ontology
 * allow submission of new "subsidiary" terms (e.g. "exon", "site", "modified residue", etc.)
 * allow submission of new "primary" terms ("protein", "isoform")
 * provide a catalog of existing primary and subsidiary terms for review
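
As an illustration of what "reviewable (OBO) output" could look like, the sketch below renders one submitted term as an OBO `[Term]` stanza. The `Term` record, its field names, and the `TMP:` identifiers are assumptions made for this example; they are not part of any existing broker code, and the IDs are placeholders rather than real PRO accessions.

```python
from dataclasses import dataclass, field


@dataclass
class Term:
    """A hypothetical in-memory term record; field names are illustrative."""
    id: str
    name: str
    is_a: list = field(default_factory=list)      # IDs of parent terms
    has_part: list = field(default_factory=list)  # IDs of subsidiary terms


def to_obo_stanza(term):
    """Render one term as the lines of an OBO [Term] stanza."""
    lines = ["[Term]", f"id: {term.id}", f"name: {term.name}"]
    lines += [f"is_a: {parent}" for parent in term.is_a]
    lines += [f"relationship: has_part {part}" for part in term.has_part]
    return "\n".join(lines)


isoform = Term(
    id="TMP:0000001",          # placeholder ID, not a real accession
    name="tau isoform A",
    is_a=["TMP:0000000"],      # placeholder for the parent "tau" term
    has_part=["TMP:0000002"],  # placeholder for an exon or site term
)
print(to_obo_stanza(isoform))
```

A complete OBO file would also need a header and a `[Typedef]` stanza declaring `has_part`; both are omitted here to keep the sketch short.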

An example of the "domain expert" comes from our work with APP and (now) with Tau: either a scientist (such as Gwen) or an informed reader of the literature.

The domain expert uses the broker by
 * uploading the catalogs she has created of subsidiary and primary terms
 * reviewing matches of submitted terms against existing ontology terms
 * reviewing inconsistent submissions (and re-submitting revised terms where necessary)
 * examining the complete catalog of existing and submitted terms
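
One simple form the consistency check behind "reviewing inconsistent submissions" might take is sketched below. It covers structural problems only (undefined references, self-references, ID collisions), not the ontology reasoning the broker may eventually need. The dictionary shape and the `EX:` identifiers are assumptions made for this example.

```python
def find_inconsistencies(catalog, submission):
    """Return a list of human-readable problems found in a submission.

    Both arguments map a term ID to a dict with optional "is_a" and
    "has_part" lists of term IDs (a hypothetical shape for this sketch).
    """
    known = set(catalog) | set(submission)
    problems = []
    for term_id, term in submission.items():
        if term_id in catalog:
            problems.append(f"{term_id}: ID already exists in the ontology")
        for relation in ("is_a", "has_part"):
            for target in term.get(relation, []):
                if target == term_id:
                    problems.append(f"{term_id}: {relation} refers to itself")
                elif target not in known:
                    problems.append(
                        f"{term_id}: {relation} target {target} is undefined"
                    )
    return problems


catalog = {"EX:1": {}}  # placeholder for the existing ontology
submission = {
    "EX:2": {"is_a": ["EX:1"], "has_part": ["EX:9"]},  # EX:9 is never defined
    "EX:3": {"is_a": ["EX:3"]},                        # self-reference
}
for problem in find_inconsistencies(catalog, submission):
    print(problem)
```

The returned list could feed the review step directly: an empty list means the submission passed these checks, and each entry names the term the expert would need to revise.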

== Annotator ==
The "annotator" is engaged in annotating existing texts (papers, web pages, or other documents) with terms from the ontology for which the broker is a proxy. The annotator finds broker terms through string matching, using either single strings or tuples of strings. For example, the annotator may find the term "Tau" in a paper, and wish to annotate the word with the PRO term for the Tau protein. A more involved example would be the annotator finding the word "Tau" in one location, and "isoform A" in another -- these two phrases should be used *jointly* to discover an existing term from the broker, or to submit a new term if the text cannot be resolved to an existing one.

The annotator and the broker interact as follows:
 * The annotator submits strings, or collections of strings ("queries"), to the broker.
 * The broker responds with either an existing term from the ontology or an indication of an UNKNOWN term.
 * In the case of an UNKNOWN response, the annotator may choose to convert the query into a "new term request" by submitting the query itself, along with the text in which it was found, to the broker.
 * The broker stores an "unfilled" term request and returns a nonce ID to the annotator for provisional use.
 * The annotator later returns to the broker, queries the nonce ID, and replaces it with the "real" ID when available -- or deprecates use of the provisional identifier if the request cannot be filled.
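
The exchange above can be sketched as a small in-memory model. `ToyBroker`, its method names, the `TMP:` nonce prefix, and the `EX:` term IDs are all hypothetical; the sketch only illustrates the query, request, and resolve steps and makes no claim about how the real broker would store or match terms. Matching here is a case-insensitive comparison of the whole set of strings, which is one possible reading of "joint" lookup.

```python
import uuid
from dataclasses import dataclass
from typing import Optional

UNKNOWN = "UNKNOWN"


@dataclass
class TermRequest:
    """An unfilled request for a new term (hypothetical structure)."""
    query: frozenset                # normalized strings from the annotator
    context: str                    # text in which the strings were found
    real_id: Optional[str] = None   # set once the request has been filled
    rejected: bool = False          # set if the request cannot be filled


class ToyBroker:
    """In-memory stand-in for the broker; not the real service."""

    def __init__(self, known_terms):
        # Map each normalized set of strings to an ontology term ID.
        self._terms = {self._key(q): tid for q, tid in known_terms.items()}
        self._requests = {}

    @staticmethod
    def _key(strings):
        if isinstance(strings, str):
            strings = [strings]
        return frozenset(s.strip().lower() for s in strings)

    def query(self, strings):
        """Return an existing term ID, or UNKNOWN."""
        return self._terms.get(self._key(strings), UNKNOWN)

    def request_term(self, strings, context):
        """Store an unfilled request and return a nonce ID."""
        nonce = "TMP:" + uuid.uuid4().hex
        self._requests[nonce] = TermRequest(self._key(strings), context)
        return nonce

    def fill(self, nonce, real_id):
        """Ontology-side action: assign a real ID to a pending request."""
        request = self._requests[nonce]
        request.real_id = real_id
        self._terms[request.query] = real_id

    def reject(self, nonce):
        """Ontology-side action: mark a request as unfillable."""
        self._requests[nonce].rejected = True

    def resolve(self, nonce):
        """Return (status, id) for a previously issued nonce ID."""
        request = self._requests[nonce]
        if request.rejected:
            return ("rejected", None)
        if request.real_id is not None:
            return ("filled", request.real_id)
        return ("pending", nonce)


broker = ToyBroker({"Tau": "EX:TAU"})
assert broker.query("tau") == "EX:TAU"
query = ["Tau", "isoform A"]
if broker.query(query) == UNKNOWN:
    nonce = broker.request_term(query, context="sentence containing the phrases")
    print(broker.resolve(nonce))  # still pending until the ontology side acts
```

Returning the nonce itself while a request is pending lets the annotator keep using the provisional ID without special-casing; a "rejected" status is the signal to deprecate it.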

== Curator ==
The "curator" is a role between the annotator and the domain expert; she is involved in "annotating" a separate data source with existing terms (where possible) or with new, described terms otherwise. This role is analogous to the user of the antibody annotation tool we have developed for our antibody resource.

The curator finds a string, or other indicator, in the resource she is annotating. For example, she finds "human Tau R5H". She recognizes this to be a "human tau" protein, a subclass of "tau", with the sequence variant R5H (histidine at position 5, which would normally carry an arginine).
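
To make the reading of "R5H" concrete, the following sketch splits a single-residue substitution label into its expected residue, position, and observed residue, using the standard one-letter amino acid codes. The function name and the returned dictionary are assumptions for illustration; other variant notations (deletions, insertions, and so on) are out of scope here.

```python
import re

# Standard one-letter codes for the 20 common amino acids.
AMINO_ACIDS = {
    "A": "alanine", "R": "arginine", "N": "asparagine", "D": "aspartic acid",
    "C": "cysteine", "Q": "glutamine", "E": "glutamic acid", "G": "glycine",
    "H": "histidine", "I": "isoleucine", "L": "leucine", "K": "lysine",
    "M": "methionine", "F": "phenylalanine", "P": "proline", "S": "serine",
    "T": "threonine", "W": "tryptophan", "Y": "tyrosine", "V": "valine",
}

# Expected residue, 1-based position, observed residue (e.g. "R5H").
_SUBSTITUTION = re.compile(r"([A-Z])(\d+)([A-Z])")


def parse_substitution(label):
    """Parse a label such as "R5H" into its three components."""
    match = _SUBSTITUTION.fullmatch(label.strip())
    if match is None:
        raise ValueError(f"not a single-residue substitution: {label!r}")
    expected, position, observed = match.groups()
    for code in (expected, observed):
        if code not in AMINO_ACIDS:
            raise ValueError(f"unknown amino acid code: {code!r}")
    return {
        "position": int(position),
        "expected": AMINO_ACIDS[expected],
        "observed": AMINO_ACIDS[observed],
    }


print(parse_substitution("R5H"))
# {'position': 5, 'expected': 'arginine', 'observed': 'histidine'}
```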

The curator and the broker interact as follows:
 * The curator asks the broker for the term "human tau" and the part "R5H".
 * If the broker does not recognize either term, it allows the curator to submit provisional term requests for either one.
 * The curator then requests a new term defined by the intersection of these terms -- "human tau" and "has_part R5H".
 * The PRO broker responds with an existing ontology term ID (if such a term exists), or with a provisional nonce ID otherwise.
 * The curator later returns to the broker, queries the provisional IDs she was granted, and replaces them with "real" IDs where available -- or submits additional information for term requests, if the original request contained insufficient information to be filled.
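
A minimal way to picture the "intersection" request is an index keyed by the parent term plus the set of required parts, as sketched below. `DefinitionIndex`, its methods, and the `EX:`/`TMP:` identifiers are hypothetical, and exact matching on the parts set is a simplification: a real broker would presumably need reasoning to recognize equivalent definitions.

```python
import itertools


class DefinitionIndex:
    """Look up terms by definition: a parent term plus a set of parts."""

    def __init__(self):
        self._by_definition = {}
        self._provisional = set()
        self._counter = itertools.count(1)

    def register(self, term_id, parent, parts):
        """Record an existing ontology term under its definition."""
        self._by_definition[(parent, frozenset(parts))] = term_id

    def lookup_or_request(self, parent, parts):
        """Return (term_id, is_provisional) for the given definition.

        If no term has this definition, issue a provisional nonce ID and
        remember it, so repeated requests receive the same ID.
        """
        key = (parent, frozenset(parts))
        if key not in self._by_definition:
            nonce = f"TMP:{next(self._counter):07d}"
            self._by_definition[key] = nonce
            self._provisional.add(nonce)
        term_id = self._by_definition[key]
        return term_id, term_id in self._provisional


index = DefinitionIndex()
index.register("EX:0000042", parent="human tau", parts=["R5L"])  # placeholder ID

print(index.lookup_or_request("human tau", ["R5L"]))  # existing term
print(index.lookup_or_request("human tau", ["R5H"]))  # provisional nonce ID
```

Using a `frozenset` for the parts makes the lookup independent of the order in which the curator lists them.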

== Requirement Outline ==

 * User- and Ontology-facing interfaces
 * Request responses: existing, filled, additional information required, unable to be fulfilled
 * Reasoning: for consistency checking
 * Catalog of primary and subsidiary terms
 * Ontology generation: OBO (and OWL?)
 * Term text lookup -- through either single-string or multiple-string queries
 * Term ontology lookup -- through ontological definitions given as parts lists
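
The four request responses listed above could be modeled as an enumeration, paired with the follow-up action each one implies in the user stories. The class names, field names, and action strings below are assumptions for illustration only.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class RequestStatus(Enum):
    EXISTING = "existing"
    FILLED = "filled"
    NEEDS_INFO = "additional information required"
    UNFILLABLE = "unable to be fulfilled"


@dataclass(frozen=True)
class BrokerResponse:
    """A hypothetical response envelope returned by the broker."""
    status: RequestStatus
    term_id: Optional[str] = None  # present for EXISTING and FILLED
    message: str = ""              # free-text explanation, if any


def next_action(response):
    """Suggest what the user does next for a given response."""
    if response.status in (RequestStatus.EXISTING, RequestStatus.FILLED):
        return f"use {response.term_id}"
    if response.status is RequestStatus.NEEDS_INFO:
        return "submit additional information"
    return "deprecate the provisional ID"


print(next_action(BrokerResponse(RequestStatus.FILLED, term_id="EX:0000042")))
# use EX:0000042
```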