IAP 2009 notes 2009-01-16


Case study
[[Media:Friday.pdf]]

Outline for today:
 * 1) Identify data/information sources
 * 2) Review permission for intended use
 * 3) Model the reality that the data are about (in OWL, using Protege 4)
 * 4) Convert sources to RDF (using LSW scripts; might have used XQuery or something else)
 * 5) Check model consistency, data well-formedness, and data/model consistency (using a reasoner, Pellet; Pellet can be invoked either from Protege or from LSW)
 * 6) Use the integration: SPARQL, Tabulator, Simile, SWAP, other semweb tools
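
As a rough illustration of step 4, here is a minimal Python sketch that turns tabular rows into N-Triples. It is not the LSW script used in class. The base URI, the record/prop URI layout, and the sample columns are all assumptions made up for this example, and it assumes ids and column names are already URI-safe.

```python
import csv
import io

# Hypothetical base URI, chosen only for illustration.
BASE = "http://example.org/iap2009/"

def escape_literal(text):
    """Escape a string for use inside an N-Triples literal."""
    return (text.replace("\\", "\\\\")
                .replace('"', '\\"')
                .replace("\n", "\\n")
                .replace("\r", "\\r"))

def rows_to_ntriples(csv_text, id_column):
    """One subject URI per row, one triple per other non-empty column.

    Assumes ids and column names are URI-safe (no spaces, etc.).
    """
    lines = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        subject = f"<{BASE}record/{row[id_column]}>"
        for column, value in row.items():
            if column == id_column or not value:
                continue  # skip the id itself and empty cells
            predicate = f"<{BASE}prop/{column}>"
            lines.append(f'{subject} {predicate} "{escape_literal(value)}" .')
    return lines

# Made-up sample data; the second row has no owner, so no owner triple is emitted.
sample = "id,street,owner\n1,Main St,Alice\n2,Main St,\n"
triples = rows_to_ntriples(sample, "id")
for t in triples:
    print(t)
```

Emitting plain literals for every column is the simplest choice; a real conversion would mint URIs for things like streets and owners so that the data line up with the OWL model from step 3.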

Ontologies strive to provide for ongoing integration by being "about" the things that the data are "about". This gives some protection against accidental properties of particular data sets. An ontology is not necessarily about the data itself; that would make it a data model, not a reality model. Ontologies are open-ended: they are defined with what one might know in the future in mind, rather than only what one happens to know at present.

Query: (see LSW transcript)

Find me, for some street, distinct lands on it.
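
The actual query is in the LSW transcript. The sketch below only illustrates the "distinct" part of the request over in-memory triples. It reads "lands" as land parcels, which is an assumption, and the on_street property and all identifiers are hypothetical. In SPARQL this would be a SELECT DISTINCT over a single triple pattern.

```python
# Hypothetical triples; "on_street" and the parcel/street names are illustrative only.
TRIPLES = [
    ("parcel:1", "on_street", "street:Main"),
    ("parcel:2", "on_street", "street:Main"),
    ("parcel:1", "on_street", "street:Main"),  # duplicate assertion, e.g. from a second source
    ("parcel:3", "on_street", "street:Elm"),
]

def distinct_parcels_on(street, triples):
    """Return the distinct subjects related to `street` by on_street, sorted."""
    return sorted({s for (s, p, o) in triples if p == "on_street" and o == street})

main_parcels = distinct_parcels_on("street:Main", TRIPLES)
print(main_parcels)
```

The set comprehension plays the role of DISTINCT: the duplicated parcel:1 assertion contributes only one result.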

Q: Community RDF/OWL creation? DBpedia, YAGO

Feed Wikipedia into OpenCalais.

DBpedia is evolving - it has moved from one triple store to another.


 * 1) Web-enabled
 * 2) Policy - e.g. open, a la Wikipedia
 * 3) "Self-description" - good documentation, linked references

Q: The average data producer isn't going to be able to do this community-based ontology development. What are they going to do?

A: It will take a long time.

Needs a killer app.

Google 'parallax freebase'

Discussion with Tim and MacKenzie about strategy - where this is going to go - the need for shared names, or the prospect of not needing them.

Clinic
Discussion of the Person/Person Name relation. Initially you might have multiple assertions U has_name N, with the same N but different Us. Do all of those Us refer to the same person, or not? Maybe yes, maybe no. Absent any constraint on has_name, many people might share the same name (i.e. the URIs might refer to different people). But if you add a single axiom

has_name inverse-functional true.

then you're able to infer that all of those URIs refer to the same person.
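
To make the inference concrete, here is a small Python sketch of what a reasoner effectively concludes from an inverse-functional has_name: any two URIs that share a name are merged into one identity group. This is a hand-rolled union-find, not Pellet, and the URIs and names are made up for illustration.

```python
def merge_by_inverse_functional(assertions):
    """Group subjects that share a value of an inverse-functional property.

    `assertions` is a list of (subject_uri, name) pairs for has_name.
    Returns a list of sets; each set holds URIs inferred to denote one individual.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[rb] = ra

    first_subject_for = {}
    for subject, name in assertions:
        find(subject)  # register the subject even if its name is unique
        if name in first_subject_for:
            union(first_subject_for[name], subject)  # same name => same individual
        else:
            first_subject_for[name] = subject

    groups = {}
    for subject in parent:
        groups.setdefault(find(subject), set()).add(subject)
    return list(groups.values())

# Hypothetical data: three URIs carry the same name.
assertions = [
    ("ex:u1", "Alice Smith"),
    ("ex:u2", "Alice Smith"),
    ("ex:u3", "Bob Jones"),
    ("ex:u4", "Alice Smith"),
]
groups = merge_by_inverse_functional(assertions)
print(sorted(sorted(g) for g in groups))
```

Without the axiom nothing is merged and each URI stays its own individual. With it, the merge is forced, which is exactly why the axiom is only safe if names really are unique in the domain being modeled.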

Google 'OpenLink RDF browser'

Question: What if, during development, I just want the URIs to be local - what do I do about the XML namespaces? Can I use "", or "/", or "./", or "#"?

TBD: JAR to upload the fusion XML files and the XQuery conversion script.