The DO iterative form in Lisp is very flexible, with a simple syntax and clean semantics. Many students however have trouble when they first begin trying to use it. The following method are intended to help you get from an iterative algorithm to a well-formed and efficient DO loop.
Why Do DO?
Here are two examples of simple loops, in DOLIST and DO form.
|Find first odd number|
(dolist (x lst) (when (oddp x) (return x)))
(do ((ll lst (cdr ll))) ((oddp (car ll)) (car ll)))
|Collect odd numbers|
(let ((odds nil)) (dolist (x lst odds) (when (oddp x) (push x odds))))
(do ((ll lst (cdr ll)) (odds nil (if (oddp (car ll) (cons (car ll) odds) odds)))) ((null ll)) odds))
Clearly there is nothing hard to read or understand about the DOLIST versions. It is also the case that there is nothing hard to read about the DO versions, if you know DO. If you don't, then the lack of symbols like WHEN and RETURN can be confusing.
So why do I push all students to master DO? Because
- These easy loops are not valid tests. You shouldn't use DO or
DOLIST for either. Common loop patterns have sequence functions
to do them more clearly than either form.
- Use FIND-IF- for the first example.
- Use REMOVE-IF-NOT for the second example.
- Even in these simple cases,
DOLIST, like its cousin DOTIMES, is too weak
to do the job alone. You need to add
- in the first case, a non-local RETURN exit, buried in a conditional
- in the second case, an outer LET and a side-effecting assignment, buried in a conditional
- With DO,
- only a conditional is needed
- there are no buried assignments statements
- there is no buried non-local exit
- the variables that are being changed are listed at the top-level
- the exit condition is listed at the top level
- there is no need to worry about the order of variable clauses
- As a result, when there is a tricky iteration case that standard sequence looping functions can't handle, DO will more likely produce robust, functional programming friendly looping than DOLIST or DOTIMES.
DOLIST and DOTIMES are great when you need to loop over one list or N times and do some side effect, like printing or updating an array. As soon as you need to manage several variables, or iterate in a non-standard way, use DO. And the best way to know how to write a clean DO in complicated cases, is to practice writing them in simple cases.
The Structure of DO
The syntax of DO is
(DO ((var1 init1 incr1) ; the variable clauses (var2 init2 incr2) ...) (exit-test exit-exp exit-exp ...) ; the exit clause body-exp ; the body body-exp ...)
The semantics is
- Evaluate all the init forms in the variable clauses.
- In parallel, bind all the var's to their init values.
- Evaluate exit-test.
- If it's true,
- Evaluate the exit-exp's
- Exit the loop, returning the value of the last exit-exp.
- Otherwise, evaluate the body-exp's.
- Evaluate all the incr forms.
- In parallel, bind all the var's to the corresponding incr values
Turning an Iterative Algorithm into a DO
First, you need to identify the following elements of your algorithm.
Identify the steppers
A stepper is a variable that controls how long the loop runs. It might be a variable stepping through a list or a counter stepping from 0 up or N down. Each stepper has
- a start value, e.g., 100 or some list
- an end test, e.g., i = 0 or l is NIL
- some formula for "stepping" the variable, e.g., decrementing i or setting l to its cdr
Many loops have only one stepper, but they might have several, e.g., a loop that steps through a list and also has a counter.
What are the accumulators?
An accumulator is a variable or variables into which you collect your answer. Each accumulator has
- a start value, e.g., 0 or NIL
- some formula for "adding" a value, e.g., adding a number, or cons'ing on a new element
- optionally, some test that has to be true for the value to be added
Most loops have at most one accumulator, but you can have more, e.g., a loop that collects odd numbers into one list and even numbers into another.
Normally, test results, e.g., finding an item with a particular property, should not be thought of as accumulators. There's a better way to handle them.
What are the exit tests?
An exit test says when the loop should stop. There are two kinds of exit tests. First, there are tests for the end points of the steppers, e.g.,
- (NULL L) for list stepping
- (>= I end) for counters going up
- (<= I 0) for counters going down
You always have at least one stepper exit test. You rarely have more, even with two or more steppers, because usually just one stepper, e.g., the list stepper, is in charge.
Second, there may be exit tests for special conditions, e.g., finding an odd number. If you have this situation, then make the exit test an OR. The stepper exit test comes first, then the special test, e.g.,
(OR (NULL L) (ODDP (FIRST L)))
Use exit tests instead of a pseudo-accumulator holding the result of some test.
What is the return value?
Every loop returns something. Typically, it will be the value of some accumulator. If you have a loop with no accumulator, i.e., you have a loop that exits when some special test is true, then the return value will be determined by whether the test was ever true or not.
Make sure you understand what the return value should be if the loop executes 0 times.
Putting the pieces together
Make a DO variable clause for each stepper. The init should be the start value, and the incr should be whatever expressions "bumps" the stepper appropriately. We'll get to where to put the end expression shortly. Some examples:
- (L input-list (REST L)) for list stepping
- (I start (1+ I)) for counters going up
- (I start (1- I)) for counters going down
Make a DO variable clause for each accumulator. The init should be the initial "empty" value. It should be the value you would return if the loop were to execute zero times. A common mistake here is forgetting about the empty loop case, but it's essential to building a clear and general DO.
The incr should be whatever expressions adds to the accumulator appropriately on every iteration. A common mistake here is writing an expression that returns the correct value sometimes but returns NIL in other cases, where, normally, NIL is not the right value.
- (RESULT NIL (CONS value RESULT) for collecting a list of values
- (SUM 0 (+ SUM value)) for accumulating a sum
- (RESULT NIL (IF test(CONS value RESULT) RESULT) for collecting a list of values satisfying a test
Now put your exit tests in the exit clause. If you have both stepper end tests and special end tests, then make the exit test an OR. Put the stepper exit test first, then the special test, e.g.,
(OR (NULL L) (ODDP (FIRST L)))
Finally, put in the exit expression in the exit clause. This is task-dependent, but for loops without a special exit test, the exit value is usually is one of the following forms:
- (NREVERSE RESULT) for a list accumulator
- SUM for a sum accumulator
For loops with a special exit test, the exit value is based on whether or not we reached the end of a stepper or the special condition. Typical return values with loops involving special exit tests, from simplest on, are:
- (NULL L) returns T when the stepper rather than the special exit test stopped the loop
- (NOT (NULL L)) returns T when the special exit test stopped the loop
- (IF (NULL L) val1 val2) returns val1 when the stepper stopped the loop and val2 when the special exit test stopped the loop
Note that the above method never generates a DO body. A DO body is only needed when your loop is not just accumulating a value or stopping on a special case. If instead your loop is being used to perform a side effect operation (printing, changing an array or table, ...), the side effect should go in the body, BUT it's probably the case a DOLIST or DOTIMES would be a better choice.