These exercises develop some useful utility functions for working with different aspects of the Semantic Web.

Exercises

SW-1: Name conversion

Function names: camelize, hyphenate

Test cases: microdata-exs-tests.lisp.

A common problem in many languages is converting name strings from camelCase to hyphenated form and back. For example, JavaScript names are camelCase but CSS names are hyphenated, so the CSS property font-size has to be written as fontSize in JavaScript.

For us, this comes up when reading Semantic Web names, which use camelCase, e.g., JobPosting, when we want a Lisp symbol like job-posting.

Define (camelize string [capitalize]) to take a hyphenated string and return the camelCase equivalent. If the optional second argument is given and is true, then the first letter should be capitalized.

Define (hyphenate string [case]) to take a camelCase string and return the hyphenated equivalent. If the optional second argument is :upper (the default), then the result string should be all upper case. If it is :lower, it should be all lower case. Any other value is an error.

Only insert a hyphen when the case changes. Something like "getURL" should become "GET-URL" not "GET-U-R-L".
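The calls below restate the specifications above as concrete input/output pairs. They are an illustration drawn only from the examples in this text, not the contents of the test file, and they assume your own camelize and hyphenate are already loaded. The :mixed keyword is a made-up value used only to show the error case.

```lisp
;; Expected behavior, derived from the specifications above.
;; These forms assume your own CAMELIZE and HYPHENATE are already defined.
(camelize "font-size")            ; => "fontSize"
(camelize "job-posting" t)        ; => "JobPosting"
(hyphenate "JobPosting")          ; => "JOB-POSTING"  (:upper is the default)
(hyphenate "JobPosting" :lower)   ; => "job-posting"
(hyphenate "getURL")              ; => "GET-URL", not "GET-U-R-L"
(hyphenate "getURL" :mixed)       ; signals an error (:mixed is a hypothetical bad value)
```

The test cases in microdata-exs-tests.lisp remain the authoritative definition of correct behavior.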

Define hyphenate using clean basic Lisp, without explicit setf calls. Don't use a library function, such as the CL-JSON function simplified-camel-case-to-lisp. Don't copy the CL-JSON code, which uses loop and setf.

SW-2: Microdata Reader

Function names: read-microdata, read-microdata-url

Test cases: microdata-exs-tests.lisp.

The nice thing about microdata in HTML files is that there's almost nothing to it. You just have to look for itemscope, itemtype and itemprop attributes, and keep track of nesting.

Define (read-microdata string) to return a list of the objects described by the microdata in string. This is the function that's more easily tested.

Define (read-microdata-url url-string) to return the list of objects described by the microdata in the web page at url-string, including any nested objects.

AllegroServe has functions to read and parse HTML for you. Your job is to take parsed HTML with microdata in it and pull the objects out.
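As a starting point, fetching and parsing might look like the sketch below. It assumes Allegro CL, where the aserve and phtml modules load with require; in other Lisps the same packages usually come from Portable AllegroServe and cl-html-parse. The helper name fetch-parsed-html is hypothetical and not part of the exercise.

```lisp
;; Sketch only: assumes Allegro CL, where both modules load via REQUIRE.
(eval-when (:compile-toplevel :load-toplevel :execute)
  (require :aserve)    ; HTTP client, package NET.ASERVE.CLIENT
  (require :phtml))    ; HTML parser, package NET.HTML.PARSER

;; FETCH-PARSED-HTML is a hypothetical helper name, not part of the exercise.
(defun fetch-parsed-html (url-string)
  "Fetch URL-STRING and return its body parsed into a list of LHTML forms."
  ;; DO-HTTP-REQUEST returns the response body as its first value; the
  ;; remaining values (status code, headers, URI) are ignored in this sketch.
  (net.html.parser:parse-html
   (net.aserve.client:do-http-request url-string)))
```

In the parsed result, an element with attributes appears as ((:tag :attr "value" ...) child ...), so the microdata attributes sit in the head of each form. Print the parse of a small page first to see exactly how itemscope, itemtype and itemprop show up before writing the code that extracts objects. This sketch does not check the HTTP status code.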

To test on real pages with microdata, go to this listing of sites using microdata.

Warning: many links are broken, and some go to sites that have been hacked. Such is life on the bleeding edge. Also, look for pages using the Schema.org vocabulary. One site that produced at least one useful page is the Indianapolis Museum of Art.

Here's what your code would produce for the embedded object example:

((movie (:name "Avatar")
        (:director (person (:name "James Cameron")
                           (:birth-date "August 16, 1954")))
        (:genre "Science fiction") 
        (:trailer "Trailer")))

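Use the following scheme to map strings to symbols. As the example output above shows, item types become plain symbols (movie, person) and property names become keywords (:name, :birth-date), with camelCase converted to hyphens.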

If you're not sure about how some string should be parsed, look at the Live Microdata JSON output for your string.

Faculty: Chris Riesbeck
Time: Monday, Wednesday, Friday: 1pm - 2pm
Location: Annenberg G15
