Final Project - Progress Report 2

Final Project - Progress Report 2
Forrest Sondahl
Multi-Agent Modeling (2005)

Summary:

I didn't make quite as much progress as I had hoped, since a problem came up. Here are the goals that I was hoping to accomplish this week:

expand the function and terminal sets considerably
deal with building and executing of commands and command blocks (will need to be special-cased somehow)
deal with command structures such as "ifelse", and "ifelse-value" (and running boolean expressions)
start work on performing the genetic cross-over operation between two trees.

I decided that (1) wasn't really a priority, and that the makeup of these genetic programs would vary considerably depending on the application they were designed for. So I'm putting it off until the structure for my maze solver is clearer. I did (2) and (3) -- (3) is where I ran into the problem, which will be explained in the details section. And sadly, I didn't get to (4) (with luck, I may have time this afternoon to start on it...). On the positive side, I overhauled the random tree generation code, so that it is much cleaner, and so that each each syntax element is chosen with a weighted probability.

(Random codeturtles are on the loose!)

Details:

Last time I discussed how I had written "careful-run-result-tree" and was going to write "careful-run-command-tree". The purpose of these procedures was to execute the code of the program in a trickle down progression, starting with leaf nodes, catching runtime errors as it goes along, and instead of having the code choke on the runtime errors, it would just report a default value (like 0, or false). Let me remind you that the alternative to my recursive "careful-run..." procedures is the "compile-tree" procedure, which turns the whole tree into a valid netlogo program string, which can be executed with a single "run" command. The downside to the "compile-tree" option, is that if a runtime error occurs on the second line of the program, then the rest of the program will not be run. I did write the "careful-run-command-tree" procedure, and it was working well until I came to the "if" statement. I realized then that I couldn't just recursively run "careful-run-command-tree" or "careful-run-result-tree" on all of the arguments of the command, unconditionally. Rather, because "if" is a control structure, it determines which code gets run. So I would need to first evaluate the boolean test value of the "if", and on the basis of that value, either run the codeblock, or not. The "ifelse" command was going to have a similar issue. And it gets worse, because if I wanted to allow for turtles to perform "repeat" loops, then that would require evaluating the number of times the loop should be run, and then running the codeblock that number of times. And what about other kinds of loops? And if I were to implement local variables, how would their scoping be treated across multiple calls to "run" and "run-result"?

[ Side note: I have mixed feelings on loops. On the one hand, they are a powerful tool in programming, so if I don't allow loops, then the programs code turtles can run will be considerably limited. On the other hand, loops can get out of control, and cause code turtles to run for long periods of time. I have been starting to think of the codeturtles code as a "behavior rule set" that will be run over and over again in each time step. So instead of trying to evolve a program which governs the whole of a codeturtle's life, I think it might be better to just evolve rules that codeturtles live by. This does not mean that the codeturtles cannot have any memory -- it is quite possible to set it up so that the codeturtle can change its turtle-owned variables in each time step, and refer back to them in the next. This breaks the turtles lifetime up into nice measurable timesteps, and all of the codeturtles will synchronize at the end of each ask [] block. This could also allows us to perform some sort of incremental fitness function, that sums up the fitness of a turtle across its whole lifetime, instead of trying to measure it solely at the end of a simulation. It also appears that the display is not updated during a "run" primitive command, so if we had the codeturtles run a whole program at one go, we wouldn't be able to watch them as they did it (not sure on this yet). All that said, because of the design choice that I ended up making, it is possible to have loops in the codeturtles code-- however, I may choose to leave them out of the syntax lists, or perhaps only to allow loops with 2 or 3 repititions. ]

The point was, to deal with control structures in my trickle-down run functions, I would be, to some small degree, creating an interpreter for the NetLogo language. For the most part, because of the "run" and "run-result" primitives, I had been able to avoid this. It seemed a bit unfortunate to be reinventing the wheel, for the purpose of supressing runtime errors, but I decided that as far as control structures went, I really might not need anything more than "if" and "ifelse", so I wrote their special case code, and denoted them as "TYPE_CONTROL" rather than "TYPE_COMMAND" in my code, leaving room to add more such control structures as needed.

This worked. However, I was about to discover a new flaw with my "careful-run..." strategy. Efficiency. There must be a fair amount of overhead involved on each call to "run" and "run-result", because I discovered that when I ran codetrees with the "careful-run..." recursive functions, they took more than twice as long to run, compared to if I compiled them and ran them whole. This was quite a let-down, since I'd spent a fair amount of time on this clever "run-carefully" system, to circumvent runtime errors. However, if we are to have a hope of genetic programming producing good results, then efficiency is key. There need to be many agents in the population, and we need to go through many generations before intelligent code-turtles can emerge. So I bit the bullet, and switched back to the "compile-tree" and run approach. But this means that again, we have to deal with runtime errors that the programs might generate. So I took a different approach. To get rid of the infamous and ubiquitous "division by zero" error, I changed it so that codeturtles were no longer able to use the " / " operator. Instead, I defined a new reporter called "safediv":

to-report safediv [ num div ]
report ifelse-value (div != 0) [ num / div ] [ 0 ]
end

"savediv" is the division operator, redefined such that X / 0 = 0. If you were a math major, as I was, then such a redefinition naturally causes trepidation. However, the fact that division in the mathematical sense does not work like this is quite unimportant. All we are trying to do, is provide the codeturtles with useful tools to build programs out of. And turtles aren't particularly interested in whether 1 / 0 yields infinity or 0 or anything else. Similarly, we can write wrapper functions for other syntax elements that might emit errors. e.g.

to-report safesqrt [ num ]
report sqrt (abs num)
end

In short, if we are careful, we can prevent code turtles from ever having errors. Thus we can use the "compile-tree" method, and gain the efficiency we need. (Even with compile tree, I have some concerns about the time it may take for large populations to run many generations -- slowness is an inherent problem with an interpreted language -- actually doubly interpreted in my case -- and it is just something that I may have to deal with. Hopefully my concerns will prove ungrounded.)

To finish up, let me quickly revisit the information from the last progress report, noting details that have now changed.

1) For the time being I have decided to drop the "VARIABLE" "AGENT" and "AGENTSET" types. They can always be added in later if need be, but most anything that could be done by them can be accomplished by writing custom procedures that are used by the codeturtles instead.

2) There are now just three lists of syntax elements, one for each of the return types: VOID, NUMERIC, and BOOLEAN. Also the format has changed (just for the syntax lists, not for the nodes of the codetrees that are created). It was redundant to have the RTYPE, since everything in the list had the same RTYPE. Also, now the last item of each list is a number, which defines the weight that each element has relative to the other syntax elements in the list. The weight determines the likelihood of being chosen when a random tree is built. So, for example, we now have:

set syntaxlist_numeric (list
(list "abs" TYPE_REPORTER RTYPE_NUMERIC 10)
(list "random" TYPE_REPORTER RTYPE_NUMERIC 5)
(list "+" TYPE_OPERATOR RTYPE_NUMERIC RTYPE_NUMERIC 10)
(list "-" TYPE_OPERATOR RTYPE_NUMERIC RTYPE_NUMERIC 10)
(list "*" TYPE_OPERATOR RTYPE_NUMERIC RTYPE_NUMERIC 10)
(list "safediv" TYPE_REPORTER RTYPE_NUMERIC RTYPE_NUMERIC 10)
(list "0" TYPE_TERMINAL 10)
(list "1" TYPE_TERMINAL 10)
(list "2" TYPE_TERMINAL 10)
(list "3" TYPE_TERMINAL 10)
(list "who" TYPE_TERMINAL 20)
)

On average "random" should appear in our initial codeturtles code only half as often as "abs" or "+" or "3", and "who" should appear twice as often as most of the elements, and four times as often as "random".

Random Notes:

I did finally have time to examine some existing NetLogo models that have a similar flavor to my own project -- Echo, and Hill's ARS-Genetics. I found both models intresting. I was happy to note, though, that my project is crossing uncharted waters -- the current models employ genetic algorithms, but none of them specifically employ "genetic programming."

Conclusion:

My code feels much cleaner now, except that I have two rather beautifully written recursive functions "careful-run..." that I am no longer using. I will probably discard them eventually, and then the code will be even shorter and cleaner, but it's hard to let go... ;) In any case, in the next week I need to buckle down and get to work on:

The three genetic operations (crossover, mutation, copying)
The mechanism by which generations occur
How to incorporate fitness functions (incremental vs. after done?)
Groundwork for the maze

Random maze generation? (probably too much work)
Handcode some mazes? Read them in from files?
Decide what functions and terminals codeturtles should use in the maze...

Back to Forrest's NetLogo Main Page.