Semantic resources project/Resources/Pathways

We have identified pathway information as a potential semantic resource for the Semantic Resources collaboration project. Existing databases and ontologies already support the storage and annotation of large numbers of pathways. The goal of this semantic resource will be to organize information about these existing pathways: pointers to existing pathways related to specific neurodegenerative diseases, metadata about the visual layout of pathways, and tools to link pathway data to underlying semantic resources such as the PRO ontology.

Ideas
We might like to represent:
 * the biological entities participating in the pathways
 * the interactions and their types (e.g. phosphorylates)
 * the cartoon layout: coordinates, shapes, sizes, colors, etc.
We might also like to:
 * overlay other types of information (such as expression data)
 * provide a service to convert KEGG-like pathways into a computable format (including mapping to PRO)
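
As a rough illustration of the ideas above, here is a minimal Python sketch of how entities, typed interactions, and overlay data might be held together. Every class and field name here (Entity, Interaction, Pathway, pro_id, overlays) is a hypothetical choice made for this example, the PRO identifier is a placeholder string rather than a real term, and layout is left out for brevity.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Entity:
    """A biological participant, optionally mapped to a PRO term."""
    entity_id: str
    label: str
    pro_id: Optional[str] = None  # placeholder string, not a real PRO identifier


@dataclass
class Interaction:
    """A typed, directed relation between two entities."""
    source: str
    target: str
    kind: str  # e.g. "phosphorylates"


@dataclass
class Pathway:
    name: str
    entities: Dict[str, Entity] = field(default_factory=dict)
    interactions: List[Interaction] = field(default_factory=list)
    # overlay name -> {entity_id: value}, e.g. expression levels
    overlays: Dict[str, Dict[str, float]] = field(default_factory=dict)

    def add_entity(self, entity: Entity) -> None:
        self.entities[entity.entity_id] = entity

    def add_interaction(self, source: str, target: str, kind: str) -> None:
        # Refuse interactions that mention entities we do not know about.
        for entity_id in (source, target):
            if entity_id not in self.entities:
                raise KeyError(f"unknown entity: {entity_id}")
        self.interactions.append(Interaction(source, target, kind))

    def unmapped_entities(self) -> List[str]:
        """Ids of entities that still lack a PRO mapping."""
        return sorted(
            e.entity_id for e in self.entities.values() if e.pro_id is None
        )


def build_example() -> Pathway:
    """Build a two-entity toy pathway with one interaction and one overlay."""
    pathway = Pathway(name="toy pathway")
    pathway.add_entity(Entity("e1", "kinase A", pro_id="PRO:PLACEHOLDER"))
    pathway.add_entity(Entity("e2", "substrate B"))
    pathway.add_interaction("e1", "e2", "phosphorylates")
    pathway.overlays["expression"] = {"e1": 2.5, "e2": 0.4}
    return pathway
```

Keeping overlays in a separate dictionary, keyed by entity id, means expression data can be added or dropped without touching the pathway structure itself. A method like unmapped_entities would give a quick list of what still needs a PRO mapping.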

I (TWD) don't think we're interested in re-creating any kind of pathway database or resource -- KEGG, WikiPathways, Reactome, etc. seem to have that wheel pretty much invented. But there might be a few places where we could have an impact as part of the Semantic Resources Project:
 * Coming up with a resource which describes which entries in those pathway databases are specifically related to neuroscience applications or to neurodegenerative disease; a sort of portal for Alzheimer's or Parkinson's researchers into those collections.
 * Representing the layout (graphically!) of the pathways that people have developed as a separate resource. This might plug into, or extend, the IAO (Information Artifact Ontology), I think, and would be a useful resource in its own right.

Data Sources

 * WikiPathways
 * Reactome
 * KEGG (in particular, KEGG PATHWAY)
 * Pathway Commons : a compilation of pathways from other sources
 * BioModels Database : EBI resource
 * Private/non-canonical curated datasets

Curated pathways for research areas of interest would be great - for example, stem cell hematopoiesis, Alzheimer's disease, Parkinson's disease. Researchers should be able to edit and conduct discussions on the pathways and agree/disagree with others.

Relevant Publications
Pathways can also be derived from informal diagrams in particular publications.

Pollio et al. "Increased expression of oligopeptidase THOP1 is a neuroprotective response to A-Beta toxicity."

File Formats
There are several formats to represent pathways:


 * BioPAX - an RDF/OWL-based standard for the exchange of biological pathways
 * GPML - GenMAPP Pathway Markup Language; it is a custom XML format compatible with pathway visualization and analysis tools such as Cytoscape, GenMAPP and PathVisio
 * SBML - Systems Biology Markup Language. There are two "levels" of SBML notation -- Level 1 and Level 2 -- for which there exist canonical specification documents (a Level 3 specification has since been released as well).
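
To give a feel for the SBML format, here is a minimal Python sketch that pulls species and reactions out of a small SBML document using only the standard library. The embedded document is a made-up toy model written in the general shape of SBML Level 2, and the function names (local_name, species_refs, read_sbml) are hypothetical. The code matches elements by local name so that it does not depend on the exact namespace URI of a particular level or version.

```python
import xml.etree.ElementTree as ET

# A made-up toy model in the general shape of an SBML Level 2 document.
SAMPLE_SBML = """<sbml xmlns="http://www.sbml.org/sbml/level2" level="2" version="1">
  <model id="toy_model">
    <listOfCompartments>
      <compartment id="cytosol"/>
    </listOfCompartments>
    <listOfSpecies>
      <species id="s1" name="kinase" compartment="cytosol"/>
      <species id="s2" name="substrate" compartment="cytosol"/>
    </listOfSpecies>
    <listOfReactions>
      <reaction id="r1">
        <listOfReactants>
          <speciesReference species="s1"/>
        </listOfReactants>
        <listOfProducts>
          <speciesReference species="s2"/>
        </listOfProducts>
      </reaction>
    </listOfReactions>
  </model>
</sbml>"""


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tags."""
    return tag.rsplit("}", 1)[-1]


def species_refs(reaction, list_name):
    """Collect species ids from one listOfReactants / listOfProducts child."""
    refs = []
    for child in reaction:
        if local_name(child.tag) == list_name:
            refs.extend(
                ref.get("species")
                for ref in child
                if local_name(ref.tag) == "speciesReference"
            )
    return refs


def read_sbml(text):
    """Return the SBML level, a species id -> name map, and the reactions."""
    root = ET.fromstring(text)
    species = {}
    reactions = []
    for elem in root.iter():
        name = local_name(elem.tag)
        if name == "species":
            species[elem.get("id")] = elem.get("name")
        elif name == "reaction":
            reactions.append((
                elem.get("id"),
                species_refs(elem, "listOfReactants"),
                species_refs(elem, "listOfProducts"),
            ))
    return {"level": root.get("level"), "species": species, "reactions": reactions}
```

This only reads the structural skeleton (which species exist and which reactions connect them). Kinetic laws, units, and annotations are ignored, and a dedicated SBML library would be the better choice for anything beyond a quick look.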

Software Parsing Libraries

 * JigCell SBML Parser - A Java library which parses the SBML XML format. Compatible with Java 1.4, which means it doesn't use generics -- that actually makes it harder to use and understand. Fairly opaque rendering of the SBML structure into Java objects.
 * Paxtools - Appears to be a Java library for dealing with BioPAX -- I haven't used it. Contains Lucene (?!) support, among other things. Does BioPAX have layout instructions in it?
 * CellDesigner - Java software for creating, editing, and laying out pathway diagrams. All the CellDesigner files I've dealt with so far are conformant SBML, but with CellDesigner-specific annotations inserted into the <annotation> tag for each species. I've written some custom Java code for pulling this information out in certain cases, and I should post it in a public place.
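
For illustration only (this is not the custom Java code mentioned above), here is a minimal Python sketch of how tool-specific annotations could be pulled out of the species in an SBML file. The sample document is a simplified, made-up stand-in for CellDesigner output: the element names inside the annotation and the namespace URI are assumptions and may not match what a given CellDesigner version actually writes. The code avoids depending on them by collecting every leaf element it finds under each species' annotation. The function names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Simplified, made-up stand-in for a CellDesigner-annotated SBML file.
# The annotation contents and the namespace URI are illustrative assumptions.
SAMPLE_ANNOTATED_SBML = """<sbml xmlns="http://www.sbml.org/sbml/level2"
      xmlns:celldesigner="http://www.sbml.org/2001/ns/celldesigner"
      level="2" version="1">
  <model id="toy_model">
    <listOfSpecies>
      <species id="s1" name="APP" compartment="default">
        <annotation>
          <celldesigner:extension>
            <celldesigner:positionToCompartment>inside</celldesigner:positionToCompartment>
            <celldesigner:speciesIdentity>
              <celldesigner:class>PROTEIN</celldesigner:class>
            </celldesigner:speciesIdentity>
          </celldesigner:extension>
        </annotation>
      </species>
      <species id="s2" name="unannotated" compartment="default"/>
    </listOfSpecies>
  </model>
</sbml>"""


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tags."""
    return tag.rsplit("}", 1)[-1]


def species_annotations(text):
    """Map each species id to the leaf values found under its annotation."""
    root = ET.fromstring(text)
    result = {}
    for elem in root.iter():
        if local_name(elem.tag) != "species":
            continue
        leaves = {}
        for child in elem:
            if local_name(child.tag) != "annotation":
                continue
            for node in child.iter():
                # Keep only leaf elements that carry non-blank text.
                if len(node) == 0 and node.text and node.text.strip():
                    leaves[local_name(node.tag)] = node.text.strip()
        result[elem.get("id")] = leaves
    return result
```

Flattening to local names is lossy (two leaves with the same name overwrite each other, and attribute values are dropped), so this is only a starting point for seeing what a file contains.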

Visual Pathway Layout

 * Term location (X, Y)
 * Visual groupings (subsets, complexes)
 * Boundaries and Compartments
 * Visual Rendering of connections?
 * Colors?
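
A minimal Python sketch of what a layout record covering the items above might look like. All names here (Glyph, Layout, bounding_box) are hypothetical, and the sketch assumes that (x, y) is the top-left corner of a rectangular glyph; a real format might use centre coordinates instead. The boundary of a visual grouping is derived from its members rather than stored.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Glyph:
    """Where and how one term is drawn; (x, y) is assumed to be the top-left corner."""
    term: str
    x: float
    y: float
    width: float = 80.0
    height: float = 40.0
    color: str = "#ffffff"


@dataclass
class Layout:
    glyphs: Dict[str, Glyph] = field(default_factory=dict)
    # group name (complex, subset, compartment) -> member term ids
    groups: Dict[str, List[str]] = field(default_factory=dict)

    def add(self, glyph: Glyph) -> None:
        self.glyphs[glyph.term] = glyph

    def bounding_box(
        self, group: str, padding: float = 0.0
    ) -> Tuple[float, float, float, float]:
        """Return (left, top, right, bottom) enclosing every glyph in the group."""
        members = [self.glyphs[term] for term in self.groups[group]]
        left = min(g.x for g in members) - padding
        top = min(g.y for g in members) - padding
        right = max(g.x + g.width for g in members) + padding
        bottom = max(g.y + g.height for g in members) + padding
        return (left, top, right, bottom)


def build_example_layout() -> Layout:
    """Two glyphs grouped into one hypothetical complex."""
    layout = Layout()
    layout.add(Glyph("A", x=10.0, y=20.0))
    layout.add(Glyph("B", x=120.0, y=50.0, color="#ffcc00"))
    layout.groups["complex1"] = ["A", "B"]
    return layout
```

With the toy layout above, `build_example_layout().bounding_box("complex1", padding=5.0)` gives `(5.0, 15.0, 205.0, 95.0)`. How connections between glyphs should be rendered is left open here, as it is in the list.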