Information Artifact Ontology

About
The Information Artifact Ontology (IAO) is a new ontology of information entities, originally driven by the work of the OBI Digital Entity and Realizable Information Entity branch. The first workshop on the IEO took place in Boston at the MIT Stata Center on June 9, 2008. For more details see First IAO workshop.

Background
This effort is motivated by several experiences we have had in developing ontologies:


 * The OBI Digital Entity and Realizable Information Entity branch effort to define information entities necessary for the representation of Biomedical Investigations.
 * A small group discussing Semantic Web architecture
 * Discussions of bibliographic ontology and discourse stimulated by our work with SWAN

We initially considered calling our effort the information entity ontology. However, after our first meeting we decided to rename it the Information Artifact Ontology, thus narrowing the focus and ideally leaving open the issue of whether DNA molecules are carriers of information. This choice is motivated by the prime need at the moment, which is to support OBI and to support the annotation of publications, results, databases, etc., all of which are information artifacts.

To request a term be added to the ontology, fill out this form. View the current list.

The OBI ontology is available. The information-related terms are subclasses of "information entity". An online HTML browser of the ontology is also available (if you get a yellow banner, click on the blue link to continue): browse.

Examples of information artifacts
The following are information artifacts in this sense, as proposed by Barry:
 * serial number
 * batch number
 * grant number
 * person number
 * name
 * address
 * email address
 * URI
 * protocol
 * lab note
 * ontology
 * gene list
 * publication
 * result
 * license
 * document granting permission
 * contract
 * novel
 * textbook
 * newspaper
 * timetable
 * recipe
 * map
 * objective specification

Selected references
Post-workshop: Pre-workshop:
 * Features associated with information
 * How much are we anthropomorphizing when discussing what information is
 * Suggested reading: http://www.ei.sanken.osaka-u.ac.jp/pub/miz/Part3V3.pdf
 * Discussion within OBI
 * from a Denrie branch review
 * Noah Mendelsohn's use case from AWWSW, elaborated by Jonathan Rees
 * Bjoern's recent summary re: Programming, full thread
 * "identifiers"
 * Ordnance survey licenses
 * The Bibliographic Ontology owl/n3 (use Protégé 4 to view). Specification
 * Semantically Annotated LaTeX project, which defines a rhetorical ontology, an annotation ontology, and a document ontology
 * Object Reuse and Exchange abstract data model
 * Functional Requirements for Bibliographic Records (FRBR) readings
 * DOLCE-based information objects ontology and core legal ontology (see also wiki page)
 * Jonathan's IEO workshop notes
 * Ontology of Law - Recent presentation by Barry Smith
 * Browse Information entity in the Ontology for Biomedical Investigations.

Conversation leading up to the workshop
Finally, here's a short communication between Barry and me leading up to the workshop: my notes, with Barry responding.

[AR] Had some discussion with a few people today: Tom Knight, Chris Hanson, Jake Beal, Jonathan Rees. Tomorrow I'll chat with Gerry Sussman.

A variety of interesting issues arose. The first was the Shannon definition of information, which is at odds with our use of the term, I think. In particular, according to the Shannon measure, a random signal has the most information, as it is the least compressible. This is at odds with the sense of information we care about, in which a random signal has none.

[BS] This is the mass noun sense of 'information'. We are interested in the count noun 'information entity'.
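
The Shannon point in this exchange can be illustrated with a minimal sketch. It assumes two rough stand-ins for the Shannon measure: the empirical entropy of the byte distribution (a zeroth-order estimate that ignores ordering) and zlib compressibility. The function names and the sample text are illustrative only and were not part of the discussion.

```python
import math
import os
import zlib
from collections import Counter


def entropy_bits_per_byte(data: bytes) -> float:
    """Empirical Shannon entropy of the byte distribution, in bits per byte."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def compression_ratio(data: bytes) -> float:
    """Compressed size divided by original size (lower = more compressible)."""
    return len(zlib.compress(data, 9)) / len(data)


# A random signal: no meaning to a reader, yet close to the 8 bits/byte maximum.
random_signal = os.urandom(100_000)

# Hypothetical meaningful text, repeated to give the compressor something to work on.
meaningful_text = b"We the People of the United States, in Order to form a more perfect Union. " * 1000

for label, data in [("random", random_signal), ("text", meaningful_text)]:
    print(label, round(entropy_bits_per_byte(data), 3), round(compression_ratio(data), 4))
```

Under these assumptions the random bytes score highest on both measures even though they carry nothing a reader would call information, which is the mismatch AR describes.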

[AR] Other bits that came into play:

Role of the receiver of information. Is it information before some agent interprets it? When the information isn't originating from a person, this is a reasonable question. Otherwise there is too much information around.

[BS] This question becomes easier to answer if you look at information entities; the latter are analogous to independent continuant artifacts (like screwdrivers); hence in normal cases they are information entities even before being registered.

[AR] Role of the producer of information. Where does physics end and information start? Tom Knight argues for no boundary, but I don't buy it. This went off in the direction of asking whether intention is necessary for information to be produced. But then what about an instrument that produces information?

[BS] I believe that information entities require a certain restricted kind of provenance (as do artifacts like screwdrivers). Hence the physics is never enough. (Analogously: a Credit Card Number is not a mathematical object.)

[AR] One possibility: The information we are interested in always originates with a sentient, either by a person thinking/communicating, or by a machine that was designed to have a function to produce/communicate information.

[BS] Exactly

[AR] Other suggested words that might be better than 'information': Message, Enscription

[BS] No thanks. Though I am not wedded to 'information' either; but 'message' is much too narrow; and 'enscription' is too precious.

[AR] One thing that there seemed to be consensus about was what I call the multiplexing of information entities. So consider a fragment of the Constitution laid out on a printed page. One information content entity is the wording. Another might be the layout (viz. the layouts you can choose in Word with lorem ipsum.... blank content). Someone might create the line breaks so that the first letter of each line spells out another message. There was general consensus that this was three things.

[BS] Agreed.
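
The multiplexing example above can be made concrete with a small sketch. It assumes a plain-text page and treats the three information content entities as three independent readings of the same page. The function names and the sample sentence are hypothetical and are not taken from the workshop.

```python
def wording(page: str) -> str:
    """The word sequence, independent of line breaks and spacing."""
    return " ".join(page.split())


def layout(page: str) -> list[int]:
    """Line lengths only: the shape of the page with the content blanked out."""
    return [len(line) for line in page.splitlines()]


def acrostic(page: str) -> str:
    """The message spelled by the first letter of each non-empty line."""
    return "".join(line.lstrip()[0] for line in page.splitlines() if line.strip())


# Hypothetical sample text; the same wording laid out with two sets of line breaks.
page_a = "Having each law printed\nin plain words\ndeters confusion"
page_b = "Having each law\nprinted in plain\nwords deters confusion"

print(wording(page_a) == wording(page_b))  # same wording
print(layout(page_a), layout(page_b))      # different layouts
print(acrostic(page_a), acrostic(page_b))  # different first-letter messages
```

Re-breaking the lines leaves the wording unchanged while altering both the layout and the first-letter message, which is one way to see why the three were counted as separate things.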