As far as I can tell, there's no online documentation for the CLOCC XML Parser, which is available as part of the CLOCC package of Lisp utilities. The following is what I've gleaned from reading the code and documentation strings, and it should be enough to get you started.

Install the CLOCC XML parser

The Common Lisp Open Code Collection contains a number of useful utilities for Common Lisp. This page is only concerned with the XML parser, but feel free to explore the rest of the collection.

To install this code, you need an archive utility that knows how to handle tar-gzipped files. These are like Zip files on Windows but are more common on Unix systems. There are many free and inexpensive utilities for Windows, such as WinZip, ZipGenius, TugZip, Stuffit, etc., that know how to handle tarred gzip files. (Warning: FilZip did not know how to open this archive.)

How to use the CLOCC XML Parser

Assuming you've completed the installation steps above, all you need to do is load clocc.lisp and then the compiled version of xml.lisp.

The parser is defined in the cllib: package.

How to parse XML

Find a short XML file on your machine. XML files are used for many purposes. Java and .Net, for example, use them to hold configuration information.

Alternatively, find a short XML file on the web and copy it to your machine. There are plenty to choose from. Blogs, for example, use RSS, an XML format, to list new items. The Wikipedia entry on RSS gives an example of such an XML file.

Once you have a file with XML, you can parse it into Lisp with:

    (cllib:xml-read-from-file pathname)
    
    ;; Example:
    (cllib:xml-read-from-file "c:/cs325/code/test-bugs.xml")
    

The function returns a list of the top-level objects read from the XML file. See below for how to inspect these objects.
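As a small usage sketch, the helper below parses a file and prints each top-level object on its own line. The name show-top-level-objects is made up for this page and is not part of CLOCC. The code assumes xml.lisp is already loaded, and it relies only on xml-read-from-file returning a list, as described above.

```lisp
;; Assumes CLOCC's xml.lisp is already loaded.
;; SHOW-TOP-LEVEL-OBJECTS is a hypothetical helper, not part of CLOCC.
(defun show-top-level-objects (pathname)
  "Parse the XML file PATHNAME, print each top-level object, and return the list."
  (let ((objects (cllib:xml-read-from-file pathname)))
    (format t "~&Read ~D top-level object~:P from ~A~%"
            (length objects) pathname)
    (dolist (object objects)
      (print object))
    objects))
```

Returning the list as well as printing it means you can still bind the result to a variable and inspect it later.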

To parse XML directly from an input stream:

    (cllib:with-xml-input (variable stream)
      <code that calls (read variable) to extract XML forms>
      )
    

This puts an "XML wrapper" around the input stream, similar to the way Java uses wrappers around input streams. Each call to read will try to read one XML object, including nested objects, much the way read on a regular Lisp input stream will read a list.
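For instance, the wrapper can go around a stream opened with with-open-file. The sketch below reads a fixed number of XML objects from a file. The helper name xml-read-n-from-file and its count parameter are assumptions made for illustration, not part of CLOCC. What the first object turns out to be depends on the file; a leading `<?xml ...?>` declaration may come back as an object of its own.

```lisp
;; Assumes CLOCC's xml.lisp is already loaded.
;; XML-READ-N-FROM-FILE is a hypothetical helper, not part of CLOCC.
(defun xml-read-n-from-file (pathname count)
  "Read COUNT XML objects from the file PATHNAME and return them as a list."
  (with-open-file (stream pathname :direction :input)
    ;; Wrap the plain file stream so that READ returns XML objects.
    (cllib:with-xml-input (in stream)
      (loop repeat count
            collect (read in)))))
```

Asking for more objects than the file holds will most likely signal an end-of-file error, as read does on an ordinary stream, so start with a small count.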

When testing XML forms, it's often handy to parse XML stored in a string. Here's a function that will read one XML object from a string:

    (defun xml-read-from-string (string)
      (cllib:with-xml-input (in (make-string-input-stream string))
        (read in)))
    

If you don't want to type cllib: in front of the XML parser function names, do the following in the package cs325-user:

    (use-package :cllib)
    

If you do a (use-package :cllib), you will get warnings about some name conflicts. Just select the default response, "Unintern the conflicting symbol." Or just learn to type cllib: a lot.
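A middle road, sketched below with standard Common Lisp only, is to import just the parser symbols you need instead of using the whole package. This should sidestep most of the name conflicts, since only the listed symbols are brought in. The choice of these two symbols is an assumption based on the functions used on this page.

```lisp
;; Assumes CLOCC's xml.lisp is loaded, so the CLLIB package exists,
;; and that the CS325-USER package is already defined.
(in-package :cs325-user)

;; Bring in only the two parser names used on this page.
(import '(cllib:xml-read-from-file
          cllib:with-xml-input))

;; These names can now be written without the cllib: prefix, e.g.
;; (xml-read-from-file "c:/cs325/code/test-bugs.xml")
```

If you later need another parser function, add it to the import list.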

How to process XML objects

Reading from a CLOCC XML stream returns objects of type xml-obj. CLOCC defines several functions for getting data out of the object. To illustrate these functions, assume we've done the following:

    (setq xmlo
      (xml-read-from-string
        "<book id=\"book-1\"><title>Alas!</title><author>Anon</author></book>"))
    

The following functions will get data out of xmlo:

Using recursion, you can explore the entire XML tree with these functions.
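The recursive pattern itself can be sketched without committing to any particular accessor. The function below takes the "get the children of this node" step as a parameter, so it makes no claim about which CLOCC function does that job. The names collect-nodes, sample-children and *sample-tree* are made up for illustration, and the sample tree is a plain nested list standing in for a parsed object, not a real xml-obj.

```lisp
;; COLLECT-NODES is a hypothetical helper, not part of CLOCC.
(defun collect-nodes (root children-fn)
  "Return ROOT and all of its descendants as a list, in depth-first order.
CHILDREN-FN takes a node and returns a list of its children (NIL for a leaf)."
  (cons root
        (loop for child in (funcall children-fn root)
              append (collect-nodes child children-fn))))

;; Stand-in data: each list is (name child ...); strings are leaves.
(defparameter *sample-tree*
  '("book" ("title" "Alas!") ("author" "Anon")))

(defun sample-children (node)
  "Children of a stand-in node: the rest of a list, nothing for a string."
  (if (consp node) (rest node) '()))

;; Collect every string leaf in the stand-in tree.
(remove-if-not #'stringp
               (collect-nodes *sample-tree* #'sample-children))
;; => ("Alas!" "Anon")
```

To use this on real parser output, pass a function that returns the contents of an xml-obj and returns NIL for anything that is not an xml-obj, such as a string of text.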

Faculty: Chris Riesbeck
Time: MWF: 11:00am-11:50am
Location: Tech LR 5
