Semantic resources project/Meeting notes/2009-07-16

= Notes from July 16 meeting =

Taken by: Kaitlin Thaney

Attendees: Alan Ruttenberg, Elizabeth Wu, Jonathan Rees, Tim Clark, Kaitlin Thaney, Paolo Ciccarese

'''I. Curation Process / Web services'''

Discussion re: Curation Process


 * PC - explaining that SWAN doesn't do curation on external data sources because there is a single source right now. SWAN adopting simpler model right now instead of more comprehensive approach.
 * AR - Coordination constraint.
 * PC discusses need for curation status field, as a way to track
 * AR says curation status field as part of OBI - ontology indicating curation details
 * AR talking about namespaces, autonomously being able to generate identifiers, also including the host name.


 * Agreement as to what's meant by the word "namespace".
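
A minimal sketch of the namespace idea AR described, where each party generates identifiers on its own and the host name is part of the identifier. The host names, the `antibodies` path, the `AB` prefix, and the zero-padded numbering are all assumptions for illustration; none of them were settled in the meeting.

```python
from itertools import count


class IdentifierMinter:
    """Mints URIs of the form http://<host>/<namespace>/<prefix>_<number>.

    Because the host name is part of the URI, two parties can mint
    identifiers independently without coordinating on the numbers.
    """

    def __init__(self, host, namespace, prefix, width=7):
        self.base = f"http://{host}/{namespace}/"
        self.prefix = prefix
        self.width = width
        self._counter = count(1)  # local counter, private to this minter

    def mint(self):
        number = next(self._counter)
        return f"{self.base}{self.prefix}_{number:0{self.width}d}"


# Two hypothetical hosts minting in the same namespace path.
minter_a = IdentifierMinter("lab-a.example.org", "antibodies", "AB")
minter_b = IdentifierMinter("lab-b.example.org", "antibodies", "AB")
first_a = minter_a.mint()
first_b = minter_b.mint()
second_a = minter_a.mint()
```

Both minters start at the same local number, but the identifiers differ because the host name differs.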

Web service discussion


 * What it would entail: way to generate identifiers, generate subclasses in very short turnaround. For antibodies specifically, but likely can be used for other things.

 * Strategies: 1. feeds publication 2. cyclic requests.


 * AR - consensus reached, creation of this sort of Web service is appropriate for the scope of the Semantic Resources project. Next step - discuss with folks in OBI
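
A rough in-memory sketch of what such a service might do: accept a parent class and a label, return an identifier right away, and record the subclass relation. The class name `SubclassService`, the `example.org` namespace, and the triple layout are hypothetical; the actual design is still to be discussed with OBI.

```python
class SubclassService:
    """Toy stand-in for a service that creates subclasses on request."""

    def __init__(self, namespace):
        self.namespace = namespace
        self._next = 1
        self._by_key = {}   # (parent, normalized label) -> identifier
        self.triples = []   # (subject, predicate, object) tuples

    def request_subclass(self, parent_id, label):
        # Repeated requests for the same parent and label get the same identifier.
        key = (parent_id, label.strip().lower())
        if key in self._by_key:
            return self._by_key[key]
        new_id = f"{self.namespace}{self._next:07d}"
        self._next += 1
        self._by_key[key] = new_id
        self.triples.append((new_id, "rdfs:subClassOf", parent_id))
        self.triples.append((new_id, "rdfs:label", label.strip()))
        return new_id


service = SubclassService("http://example.org/terms/T_")
antibody = "http://example.org/terms/antibody"  # hypothetical parent class
first = service.request_subclass(antibody, "anti-tau antibody")
again = service.request_subclass(antibody, "Anti-Tau Antibody ")
```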

'''II. Review status'''

Antibodies


 * EW - who is the resource provider for antibody database?
 * AR - think some subgroup of people responsible as a clearinghouse. responsible for saying 1) yes, info is adequate and 2) yes we understand it. resource provider - another role, curators of different kinds, some professional (e.g., Don), others not (e.g., lab researchers), and manufacturers (e.g., Abcam) have catalogues of information - another source of primary information. many sources of information.
 * EW - does clearinghouse for antibodies exist right now?
 * AR - have OBI, which is a set of files, repository, etc. since it's open, there's no constraint.
 * TC - helpful to think of OBI as a centralized naming scheme for antibodies, instead of as a clearinghouse for antibody information
 * AR - aiming for an RDF ontologized resource for antibodies that includes structured info, with an identifier for each. the process we'll do for, say, Alzforum would be to take a subset of 1000 antibodies, curate by scripting, figure out some other qualities, take a pile and add to OBI - generate OBI IDs, put RDF in OBI repository. any services people want to build on top of that, they can. in Alzforum DB, you may just want to include identifiers, may want to add other services, package integration into triple stores. for integration, when speaking of that antibody, you'll want to use the OBI identifier so there's a common name and we know we're speaking about the same antibody.
 * Expected process with Abcam - Frank Gibson will be prototyping RDF / OWL for it, then (Alan) help review it. at some point once in decent shape, ask for wider review and it'll become a part of OBI. longer term, want to figure out how to have ongoing coordination. no one's really done that with ontologies to date. central identifiers, open data are key issues. there may be some issues with companies and aggregators as to what is detailed re: the identity of the antibody
 * JAR - can script to sniff around and search for duplicate identifiers. only ask for identifiers when needed.
 * PC - can AR provide links for OBI / OBO Foundry process to better understand?
 * AR to post on the mailing list / googlegroup.
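
The scripted curation AR described and JAR's duplicate check could look roughly like the sketch below: normalize each record, request a new identifier only when the antibody has not been seen before, and emit simple triples keyed by that identifier. The record fields (`vendor`, `catalog_number`), the `EX:` identifier format, and the `ex:` predicates are assumptions; real OBI identifiers and properties would come from the OBI process.

```python
from itertools import count


def normalize(record):
    """Comparison key built from fields assumed to identify an antibody."""
    return (record["vendor"].strip().lower(),
            record["catalog_number"].strip().upper())


def assign_identifiers(records, known, request_new_id):
    """Return {key: identifier}, asking for a new identifier only when needed."""
    assigned = dict(known)
    for record in records:
        key = normalize(record)
        if key not in assigned:
            assigned[key] = request_new_id()
    return assigned


def to_triples(record, identifier):
    """Describe one record as (subject, predicate, object) tuples."""
    return [
        (identifier, "rdfs:label", record["name"]),
        (identifier, "ex:vendor", record["vendor"].strip()),
        (identifier, "ex:catalogNumber", record["catalog_number"].strip().upper()),
    ]


# Hypothetical records; the second is a duplicate of the first.
records = [
    {"name": "anti-tau (hypothetical)", "vendor": "VendorA", "catalog_number": "x100"},
    {"name": "anti-tau duplicate entry", "vendor": "vendora ", "catalog_number": "X100"},
    {"name": "anti-amyloid (hypothetical)", "vendor": "VendorB", "catalog_number": "y200"},
]
_counter = count(1)
assigned = assign_identifiers(records, {}, lambda: f"EX:{next(_counter):07d}")
triples = to_triples(records[0], assigned[normalize(records[0])])
```

Only two identifiers are requested for the three records, since the first two normalize to the same key.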

PRO


 * Meeting scheduled on Aug 10. Darren Natale. 9:30 - 2 p.m. EDT
 * Goal: First meet, bread breaking conversation to work out mechanics of the coordination.
 * AR's bias going into it: try to get Gwen named as a contributor to PRO (from AlzForum), which gives her and the project more profile
 * Scope: Specifically focusing on neurodegenerative disease.

Discourse

Discussion about Dublin Core and representing "knowledge"


 * PC - some things coming up, like CITO which could be a good candidate for citations. problem with CITO - also covering other things like discourse relationships. even if SWAN can unplug a module and plug something else in, it's still carrying other pieces that aren't necessarily desirable in SWAN environment. we have 1 knowledgebase, more coming from SCF - want to be able to share KB, issue is provenance (module covers who did something and how: contributor, when made, when modified, versioning information, tracking of imported sources, software imported and version of that software). not reusing Dublin Core - since it is for digital documents and PC didn't think the relationships and properties were the same for this situation.
 * AR - long term goal of such things is to try and figure out where and how to get parties to collaborate on one thing and have that live somewhere. from PC point of view, wouldn't change anything in architecture except how represented in RDF.
 * TC - Sudeshna met with some Dublin Core people in Milan and they offered to work with us and extend Dublin Core, if that'd be beneficial.
 * PC - skeptical since Dublin Core is restricted to digital documents
 * JAR - didn't get that understanding. can agree that it may not suit SWAN's needs, but possibly not for those needs.
 * AR - examples of things that are not digital documents?
 * PC - create FOAF profile - records about a person. as the record itself is not considered a digital document, since document expresses say 5 people. it may be that the 5 people have different sources and document holds all those. problem - when you create a document, create a chunk of RDF - that's the document / record. then you have the knowledge in there and that has provenance itself. those are 2 separate provenances, and want to keep track of both.
 * PC asks: common usage for Dublin Core is for creator of document, what about creator of knowledge?
 * issue with term "creator". doesn't make sense for knowledge. way people use Dublin Core is not the same way as SWAN. so have separate module for provenance.
 * AR - typically see that in GO as "evidence".
 * JAR - PC needs something more precise than Dublin Core.
 * AR - Need to figure out way to name the thing BEING created. Perhaps get a handle on that, not necessarily an issue of the word "creator". anything that you can say in RDF can now be referred to as a thing - ends up looking like reification. in OWL specified way of putting annotations on things, annotations on those annotations and so on. thinks SWAN pain point is no mechanism yet to do this sort of thing. still need to figure out vocabulary.
 * AR - building application using OBI; for example, IEDB (epitopes) is using OBO vocabulary to do underlying representation of things they're curating.
 * TC - is OBI stable?
 * AR - coming close to release 1.0 this summer. 80% finished with the paper. whittling away the tracker items. will provide tools to people that they could use in their daily work in a lab without getting too detailed.
 * AR - other projects in this space: Gully Burns and KEfED. another purpose of OBI would be capturing knowledge from methods section using structure
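
The two separate provenances PC raised in this discussion (one for the document or record, one for the knowledge inside it) can be sketched as a small data model. This is only an illustration of the distinction, not SWAN's provenance module; the class names, field names, and example values are assumptions.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Provenance:
    creator: str
    created: str  # ISO date string


@dataclass(frozen=True)
class Statement:
    subject: str
    predicate: str
    obj: str
    provenance: Provenance  # who asserted this piece of knowledge


@dataclass
class Document:
    uri: str
    provenance: Provenance  # who assembled the record itself
    statements: list = field(default_factory=list)

    def knowledge_sources(self):
        """Distinct creators of the statements, independent of the document's creator."""
        return sorted({s.provenance.creator for s in self.statements})


# Hypothetical record assembled by a curator from two different sources.
doc = Document(
    uri="http://example.org/records/1",
    provenance=Provenance(creator="curator", created="2009-07-16"),
)
doc.statements.append(Statement("ex:person1", "foaf:name", "Person One",
                                Provenance("lab-1", "2009-01-05")))
doc.statements.append(Statement("ex:person2", "foaf:name", "Person Two",
                                Provenance("lab-2", "2009-03-20")))
sources = doc.knowledge_sources()
```

Attaching provenance to each statement is one way to name "the thing being created" separately from the record that carries it, which is close to what reification or OWL annotations would express in RDF.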

'''III. Brainstorming other resources'''
 * TC - would like to prioritize using mouse models, very important to Parkinson's and AD community, and stem cells. nominate animal models to work on after antibodies. something at the level of the JAX catalogue.
 * EW - have started collecting experimental protocols in Alzforum - have about 40. what protein, what method, what species ... protocol stored in a PDF.
 * AR - if want to make these available widely, be great and make sense to use OBI to do so, at least as a start.
 * TC brainstorming ...
   * Genes - related to PRO
   * Clinical Trials
   * Targets
   * Animal Models
   * Protocols
 * Classes of things people may create web services for:
   * PRO
   * Antibodies
   * Mouse Models
   * Discourse
   * Protocols
   * Experiments

'''IV. Action items'''


 * Paolo, Alan, and 1 or 2 people from OBI - start conversation re: Web service / needs.

Adjourned