Trolling Medline

You can grep over all of Medline, emitting the matches as RDF/XML, in about an hour.

The process below searches only the abstracts, not any of the other fields. The XQuery script is easily modified to look at other fields if desired.

Create your own XQuery script by copying and mutating convert/occurrences.xq (in the svn repository).

Make sure Saxon is available (in the ../build directory) by running "make saxon".

Invoke your script following the pattern in the Makefile for the "occurrences" rule:

    mkdir -p outputdirectory/DONE
    (COMMAND=convert/xquery-medline.sh \
        DONE=outputdirectory/DONE/ \
        PARAMS="/work/medline/source/ outputdirectory ../build convert/medline-regexp.xq" \
        time nice make -j 4 -f ../build/silly-makefile )

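Substitute the name of the destination directory for "outputdirectory" (it appears three times in the command) and, if the current directory is not the svn trunk, adjust the trunk-relative paths ../build and convert/xquery-medline.sh. The last entry in PARAMS is presumably the XQuery script to run, so point it at your own script.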

If this process gets interrupted that's fine - you can just restart it. The DONE directory keeps track of which Medline files have already been processed.
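The restart behaviour can be pictured with a minimal sketch of marker-file bookkeeping. This is an illustration only, not the contents of ../build/silly-makefile or convert/xquery-medline.sh; the function name process_all and its arguments are hypothetical.

```shell
#!/bin/sh
# Hypothetical sketch of marker-file resumption (not the real silly-makefile).
# Usage: process_all SOURCE_DIR DONE_DIR COMMAND [ARGS...]
# COMMAND is run once per input file, with the file path as its last argument.
process_all() {
    src=$1
    done_dir=$2
    shift 2
    mkdir -p "$done_dir"
    for f in "$src"/*; do
        # Ignore anything that is not a regular file (including an unmatched glob).
        [ -f "$f" ] || continue
        marker=$done_dir/$(basename "$f")
        # Skip inputs that already have a marker from an earlier run.
        if [ -e "$marker" ]; then
            continue
        fi
        # Create the marker only when the command succeeds, so a failed or
        # interrupted file is picked up again on the next run.
        "$@" "$f" && touch "$marker"
    done
}
```

Under this scheme, rerunning after an interruption redoes only the files that have no marker, and deleting the marker directory forces everything to be processed again.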

If you need to reprocess everything into the same output directory, remove the markers with "rm -r outputdirectory/DONE" and run the command again.

If you're likely to want to do it more than once (say, next year with a newer version of Medline), add a new rule to the Makefile by copying and modifying the rule for "occurrences".