Modules and files

Introduction

When a large system is broken down into modules, inter-module dependencies will arise. In particular, one module may call functions and macros defined in one or more other modules. This raises several problems:

how to make sure those other modules are loaded when testing the module in question
how to make sure all modules are loaded in the correct order when building the whole system
how to make sure all modules are packaged up when moving the system to another machine

Ordering is particularly important with modules that define packages and macros. These must be loaded before any modules that use those packages or macros.

There are several methods in general use in Common Lisp for specifying how various modules of a large system go together:

a central "build" file with package definition and file loading commands
a central "build" file with a defsystem form
require and provide forms in each module

The first two methods are centralized approaches to system definition. The third method is decentralized. There are arguments to be made in favor of both approaches, and, in my opinion, no clear winner on those grounds.

The first method is simple, but not very flexible. It also doesn't help developers working with individual modules.
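As a rough illustration of the first method, a central build file might look like the sketch below. The file name build.lisp, the package definitions, the exported symbol make-table, and the helper load-system are all assumptions made up for this example; they are not part of the course code.

```lisp
;;; build.lisp -- a hypothetical central build file.

;; Package definitions come first, so that every file loaded later
;; can refer to them.
(defpackage :tables
  (:use :common-lisp)
  (:export #:make-table))          ; hypothetical exported name

(defpackage :rules
  (:use :common-lisp :tables))

;; Module files, listed in the order in which they must be loaded.
(defparameter *system-files* '("tables" "rules"))

(defun load-system (&optional (base (or *load-truename*
                                        *default-pathname-defaults*)))
  "Load each file in *SYSTEM-FILES*, in order, from the directory of BASE."
  (dolist (name *system-files*)
    (load (merge-pathnames name base))))
```

Everything about the system lives in this one file, which is why the approach is simple. It is also why someone working on rules.lisp alone gets no help: nothing in that file says it needs tables.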

The second method is a generalization of the first. defsystem is a form, usually a macro, for declaring how a system depends on its modules and how they depend on each other. Unfortunately, there is no standard defsystem for Common Lisp. ASDF is a popular open-source centralized module loading system.
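For comparison, here is a minimal sketch of what a system definition could look like in ASDF. The system name, the file names, and the dependencies are hypothetical, chosen to match the rules and tables modules used as examples in these notes; ASDF itself must already be loaded.

```lisp
;;; rules.asd -- a hypothetical ASDF system definition.
(asdf:defsystem "rules"
  :description "Hypothetical example system."
  :components ((:file "packages")
               (:file "tables" :depends-on ("packages"))
               (:file "rules" :depends-on ("packages" "tables"))))
```

With such a file where ASDF can find it, (asdf:load-system "rules") would compile and load the three files in dependency order.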

In one of the few places where I go against common practice, I use the third method for this course, because it's the simplest for end-users. When they want a file, they just load it, and any other files are loaded as needed.

Using require and provide

The basic idea of require and provide is simple.

(require module-name) loads the file associated with module-name unless that module is already loaded.
(provide module-name) registers a module as having been loaded.

A module name can be a string, e.g., "tables", or a keyword, e.g., :tables. It is never a file name, such as "tables.lsp".

We simply put the necessary require forms at the front of a file, and, optionally, one provide form at the end of the file. provide is only needed if the file name is different from the module name.

When the file is loaded, only those modules requested that are not already loaded will be loaded into the Lisp system.

For example, suppose the rules module needs the tables module. The following would go at the front of the file rules.lisp, before any code definitions:

(in-package ...) ;; whatever package the code should be in

(eval-when (:compile-toplevel :load-toplevel :execute)
  (require :tables)
  (use-package :tables))

The following could go at the end of the same file:

(provide "rules")

but it's not necessary because the module name and file name are the same. Some authors recommend putting provide at the front of the file, in order to

advertise up front what module the file provides, and
avoid endless loops if there are any circularities (file A requires file B, which requires file A)

I recommend putting the provide at the end, because

if the file fails to load completely, a provide at the front will fool Lisp into thinking that the module has been loaded
circular dependencies should be fixed, not supported

What require and provide do

(provide module-name) simply adds the module name to *modules*, a list of loaded modules.

(require module-name) checks to see if the module is already in *modules*, using string=. If it is, require does nothing. If not, require gets the file name associated with the module name and loads that file.

It's that last step that is both the strength and weakness of require:

By separating module names from file names, the require forms in each file can still work when you move the files to other machines.
Unfortunately, how module names are mapped to file names is not standardized.
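The bookkeeping described above can be checked at the REPL. The following is a small sketch; the module names and the helper module-loaded-p are made up for the demonstration. It also shows a consequence of the string= comparison: a keyword such as :demo-tables is normally recorded under its upper-case name, so it does not match the lower-case string "demo-tables".

```lisp
;; Module names here are made up for the demonstration.
(provide "demo-rules")     ; adds the string "demo-rules" to *modules*
(provide :demo-tables)     ; adds the symbol's name, normally "DEMO-TABLES"

(defun module-loaded-p (name)
  "True if NAME is in *MODULES*, compared with STRING= as REQUIRE does."
  (and (member (string name) *modules* :test #'string=) t))

(module-loaded-p "demo-rules")    ; true
(module-loaded-p :demo-tables)    ; true
(module-loaded-p "demo-tables")   ; false: STRING= is case-sensitive

;; Already provided, so REQUIRE has nothing to load and simply returns.
(require "demo-rules")
```

For this reason it is safest to pick one spelling for each module name, either always a keyword or always the same string, and use it in every require and provide.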

Require and relative pathnames

An alternative, for sets of files that go together, such as all the CS 325 course files, is to specify in each REQUIRE where to get the file(s) you need, using a pathname format that is relative to the file being loaded.

require takes an optional second argument. That argument should be either NIL, a pathname, or a list of pathnames. You might therefore write

(require :tables "tables.lisp")

Unfortunately, since the pathname is not a complete one, where Lisp will look for tables.lisp is not defined. It will depend on the setting of some "current working directory" internal variable that every Lisp has, but for which there's no Common Lisp standard. Furthermore, what if there's a compiled version of tables.lisp? We'd rather load that unless the source code is more recent.

Therefore, we use the following function, defined in CS325-user, to get a code file for a module:

(defun get-code-file (name)
  (let* ((source-path 
          (merge-pathnames name 
                           (or *compile-file-truename* *load-truename*)))
         (object-path (compile-file-pathname source-path))
         (source-file (probe-file source-path))
         (object-file (probe-file object-path)))
    (cond ((null object-file) source-file)
          ((null source-file) object-file)
          ((> (file-write-date source-file)
              (file-write-date object-file))
           source-file)
          (t object-file))))

This does the following:

It uses *COMPILE-FILE-TRUENAME* or *LOAD-TRUENAME* to fill out the path information, taking the missing parts from the file in which the REQUIRE is called. These variables are set by the compiler and loader, respectively, when compiling or loading a file.
It uses COMPILE-FILE-PATHNAME to create the name of the file that the compiler would generate. Typically, this is a file with an extension such as .fasl or similar (for "fast loader").
It uses PROBE-FILE to check for existence of the source and compiled files.
It checks the file modification dates and returns the newer file.
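The first two steps can be tried out without any files on disk. In this sketch, the variable names and the directory are made up; *current-file* stands in for the value that *LOAD-TRUENAME* would have while rules.lisp is being loaded.

```lisp
;; A made-up location standing in for *LOAD-TRUENAME*.
(defparameter *current-file*
  (make-pathname :directory '(:absolute "home" "demo" "cs325")
                 :name "rules"
                 :type "lisp"))

;; Step 1: the parts missing from "tables" (directory and type)
;; are filled in from the defaults.
(defparameter *source-path* (merge-pathnames "tables" *current-file*))

(pathname-directory *source-path*)   ; (:ABSOLUTE "home" "demo" "cs325")
(pathname-name *source-path*)        ; "tables"
(pathname-type *source-path*)        ; "lisp"

;; Step 2: the same file, but with the implementation's compiled-file type.
(defparameter *object-path* (compile-file-pathname *source-path*))

(pathname-name *object-path*)        ; "tables"
(pathname-type *object-path*)        ; implementation-dependent, e.g. "fasl"
```

Because the directory comes from the file currently being loaded or compiled, the whole set of files can be moved to another directory or machine without editing any REQUIRE form.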

To use this function in a CS325 code file:

(in-package :cs325-user)

(eval-when (:compile-toplevel :load-toplevel :execute)
  (require :tables #.(get-code-file "tables")))

The EVAL-WHEN form is used to tell the compiler and loader to execute this code before processing the rest of the file. The #. reader macro is to make sure that the pathname is generated before the compiler compiles the REQUIRE form.
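To see what #. does on its own, here is a tiny sketch, unrelated to the course files, that reads a form from a string. It assumes *READ-EVAL* has its default value of true; the variable name *form-read* is made up.

```lisp
;; Reading happens before compiling or evaluating. With #. the reader
;; evaluates the next form and puts the result into what it returns.
(defparameter *form-read*
  (read-from-string "(list 1 #.(+ 1 2))"))

*form-read*   ; (LIST 1 3): the addition is gone, only its value remains
```

In the same way, by the time the compiler sees the REQUIRE form, the call to get-code-file has already been replaced by the pathname it returned.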

Pathnames in Common Lisp

No matter which version of require you use, in order to tell it where to look, you'll need to specify file pathnames. A pathname is so called because it specifies not only the name of the file, but the path of directories and subdirectories that lead to that file.

There are two ways to write pathnames in Common Lisp. One is simple, the other is portable.

Pathname strings

The simplest and most common method is to write the file's pathname as a string. For example, on my Unix machine, I can write "~/lisp/require.lisp" to refer to the file require.lisp in the directory /home/riesbeck/lisp/.

String pathnames are simple. Unfortunately, they're not portable across machines. To refer to the same file on my Macintosh, I have to say "Riesbeck's HD:MCL 2.0.1:require.lisp". Not only is the path different, but the punctuation and available abbreviations are different.

Furthermore, an important thing to know about a directory path is whether it starts from "the top," or "where you are right now." The punctuation for specifying this is quite variable.

This is really not a big deal, as long as you stick to one machine, but there is an alternative.

make-pathname

The function make-pathname in Common Lisp can be used to build pathname structures (which usually print out in the form #P"...") from the basic pieces. Those pieces are:

the device, which is the disk drive name on Windows machines, e.g., C or D
the directory path, which is a list of directory names, each of which is a string. The list starts with either :absolute or :relative, depending on whether you want the path to start from the top or not.
the file name, which is a string
the file type, which is a string

Within these strings, punctuation does not mean anything special, e.g., a period does not indicate the start of the file type.

For example, here's how I would specify the pathname for require.lisp on my Macintosh:

(make-pathname :directory '(:absolute "Riesbeck's HD" "MCL 2.0.1")
               :name "require"
               :type "lisp")
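The pieces can be read back out of a pathname with the standard accessor functions. The sketch below uses a made-up Unix-style directory and made-up variable names, and also shows a directory list that starts with :relative.

```lisp
;; An absolute pathname built from its pieces (directory names are made up).
(defparameter *require-path*
  (make-pathname :directory '(:absolute "home" "demo" "lisp")
                 :name "require"
                 :type "lisp"))

(pathname-directory *require-path*)   ; (:ABSOLUTE "home" "demo" "lisp")
(pathname-name *require-path*)        ; "require"
(pathname-type *require-path*)        ; "lisp"

;; How the pathname looks as a string depends on the Lisp and the
;; operating system, e.g. "/home/demo/lisp/require.lisp" on Unix.
(namestring *require-path*)

;; A relative directory list starts from "where you are right now."
(defparameter *relative-path*
  (make-pathname :directory '(:relative "lisp")
                 :name "require"
                 :type "lisp"))

(pathname-directory *relative-path*)  ; (:RELATIVE "lisp")
```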

When make-pathname is important

make-pathname is particularly important if you are writing functions that generate pathnames.

For example, suppose you want your program to do the following:

when the program starts, it asks for the user's name
if a "log file" exists for that user, it loads it; the log file enables the system to restart the user where they last left off
otherwise, the program creates the log file
later, when the user quits, where they were in the program is saved in the log file for future use

We could create the log file name like this:

(defun make-log-file (user-name)
  (concatenate 'string
               "/home/demo/log-files/"
               user-name
               ".log"))

For example,

> (make-log-file "riesbeck")
"/home/demo/log-files/riesbeck.log"

Unfortunately, this code would only work on a Unix machine. We'd have to rewrite it for Macintoshes and DOS/Windows.

Furthermore, this code would have to be edited and recompiled every time we moved where we wanted the log files stored.

Here's the better way to do it:

(defvar *log-file-path*
  '(:absolute "home" "demo" "log-files"))

(defun make-log-file (user-name)
  (make-pathname
    :directory *log-file-path*
    :name user-name
    :type "log"))

Not only does this work on most machines, but we can redirect where log files are stored by simply resetting *log-file-path*.
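As a short sketch of that last point, the definitions are repeated here so the example runs on its own, and the replacement directories are made up. Because *log-file-path* is a special variable, it can be rebound temporarily with let or changed for the rest of the session with setf.

```lisp
(defvar *log-file-path*
  '(:absolute "home" "demo" "log-files"))

(defun make-log-file (user-name)
  (make-pathname :directory *log-file-path*
                 :name user-name
                 :type "log"))

;; Temporarily keep logs in a (made-up) directory relative to
;; "where you are right now."
(let ((*log-file-path* '(:relative "logs")))
  (make-log-file "riesbeck"))

;; Or move them for the rest of the session (directory is made up).
(setf *log-file-path* '(:absolute "var" "tmp" "logs"))
(make-log-file "riesbeck")
```

No function has to be edited or recompiled in either case; only the value of the variable changes.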