Genetic Programming Demo

(By Forrest Sondahl, 2005)

The applet requires Java 1.4.1 or higher. It will not run on Windows 95 or Mac OS 8 or 9. Mac users must have OS X 10.2.6 or higher and use a browser that supports Java 1.4. (Safari works, IE does not. Mac OS X comes with Safari. Open Safari and set it as your default web browser under Safari/Preferences/General.) On other operating systems, you may obtain the latest Java plugin from Sun's Java site.

Created with NetLogo * View/download model file: Genetic Programming Demo.nlogo

Back to Forrest's Final Project Main Page.

 
WHAT IS IT?
 

This model is a simple demo of genetic programming, showing how to use the Genetic Programming Library for NetLogo. Turtles start at home, and go out seeking a red goal patch. Watch them converge toward the goal, and experiment with the genetic programming parameters. The code contains some comments that might be useful for building your own genetic programming model. Modelers are encouraged to use this model as a starting point for writing their own genetic programming models.

 
WHAT IS GENETIC PROGRAMMING?
 

In the world of biology, species evolve by means of natural selection and the interactions of DNA. Genetic Programming takes these ideas, and applies them in the field of computer science. Given a problem, the goal is to evolve a computer program that can solve the problem. It starts with a population of randomly generated computer programs. (The ingredients of these randomly generated computer programs are chosen based on the problem that is to be solved.) Each program is run, and its performance is measured by a "fitness function", which reflects how good each program is at solving the problem. Then programs are chosen from the population to "reproduce". The programs are chosen randomly, but with a weighting mechanism that makes it more likely that the more "fit" programs are chosen. (Analogously in the biological world, defective organisms can get lucky and pass on their DNA, but the more fit organisms have a better chance of doing so.) There are three forms of reproduction that occur:

* Cloning (the child program is identical to its parent)
* Mutation (the child program has some of its code replaced by randomly generated code)
* Crossover (two parent programs are chosen from the population, and the child program consists of a mixture of code from the two parents)

After a full new population of programs is formed, the old population is discarded, and each of the new programs is run and their fitness is measured. Reproduction occurs, and the cycle continues, until a program with a certain level of fitness is found -- namely, a program that is good enough to solve the problem given. The word "until" implies that such a state will be reached. This isn't necessarily true. For one thing, the problem posed might be an impossible one (e.g. "What is the answer to life the universe and everything?"). Or it could be solvable, but not with the ingredients that the programs are made from (e.g. "What is the solution to a quadratic equation?" with programs made only of "+" and "-" operators.) But even if the solution is within the realm of possible programs generated by the genetic programming process, success is by no means guaranteed. Genetic Programming is a stochastic, rather than deterministic, approach to solving problems. Currently there are, in fact, no proofs that genetic programming works -- merely empirical evidence showing that in some situations it does. The success of the genetic programming process is highly dependent on the choice of ingredients for the programs and a well designed "fitness function" that "leads" the population in the right direction toward the goal. Other important parameters include the size of the population, the length of each program's code, and the relative probability of each of the reproduction mechanisms. For more information on genetic programming, see these web sites:

http://www.geneticprogramming.com/Tutorial/index.html
http://www.genetic-programming.com/gpanimatedtutorial.html

 
HOW IT WORKS
 

In many NetLogo models, agents are given predetermined rules, and then the emergent behaviors that form through the interactions of these agents are studied. In contrast, this model is starts with a desired behavior (navigating to the red square) and works on trying to discover the code that the agents should execute to achieve this behavior, through use of Genetic Programming, as described in the section above.

In this model, the goal-seeking programs are represented by "codeturtles". Codeturtles each have a piece of NetLogo code assigned to them, and it's their job to perform it. Codeturtles will then be chosen, based on their fitness, to reproduce and create another generation of codeturtles.

You can find the ingredients that make up the codeturtles' netlogo code by looking in the "gpuser-setup-syntax" procedure. You can also get insight into what example code looks like by clicking the SHOW-BEST button.

The fitness function, which measures each codeturtles progress, is a simple one: "What is the geometric distance to the goal square?" The lower this number is, the more fit a turtle is.

 
HOW TO USE IT
 


1. Adjust the slider parameters (see below), or use the default settings.
2. Press the SETUP button.
3A. Press the GO button to begin the simulation
3B. Press the STEP button to go through one generation at a time.
4. Watch the View, to see the codeturtles execute their code.
5. Watch the FITNESS plot, to see how close the population is to the goal.
6. If a codeturtle successfully gets to the goal square, then the simulation will stop. To stop it earlier, press the GO button again.
7. Press the SHOW-BEST button to see the code for the current best (most fit) codeturtle.

Parameters:

POPULATION-SIZE: The number of codeturtles in each generation
INITIAL-CODE-MAX-DEPTH: The maximum depth of randomly generated code trees, that codeturtles start with. (It also affects the size of mutations that occur). In general, a larger number generally means longer programs are created.
BRANCH-CHANCE: Controls the amount of branching in the generation of random code trees. Again, a larger number generally means longer programs are created.

CLONE-CHANCE, MUTATE-CHANCE, CROSSOVER-CHANCE: These three sliders control the relative probability of each genetic operation occurring, with respect to the others.
(Examples: If the sliders are set to 10, 10, 10, then there is a 1/3 chance of each genetic operation happening. If the sliders are set to 10, 10, 20, then there is a 25% chance of cloning, 25% chance of mutation, and 50% chance of crossover. If the sum of these three sliders is 100, then each slider represents the percent chance of that genetic operation being chosen to create an offspring for the new generation.)

FIX-RANDOM-SEED?: If true, then RANDOMSEED is used to start the process. This allows a particular run to be reproduced exactly, and thus examined more closely, (provided that the parameters are the same). If false, then RANDOMSEED is not used.

RANDOMSEED: This is the number used to seed the random number generator, if FIX-RANDOM-SEED? is true, to allow for reproducible results.

Notes:
You cannot see the codeturtles while they are moving even if you slow down the speed slider, because of the way their code is executed, so what you see on the View is their positions after they have moved for that generation. They leave a colored trail behind them, which is cleared between generations. This shows the path that the turtles took.

The best fit codeturtle (or codeturtles, when there is a tie) in each generation is shown as a neutral face. If a codeturtle finds the goal, then it is shown as a smiley face.

 
THINGS TO NOTICE
 

The colors of the codeturtles have some meaning. They are initially randomly colored, but:
* When a codeturtle results from cloning, it has the same color as its parent.
* When a codeturtle results from crossover, it has a color averaging its two parents.
* When a codeturtle is mutated, it has a random color.

 
THINGS TO TRY
 

Try playing with the various sliders. See what difference they make.

 
EXTENDING THE MODEL
 

The purpose of this model is to provide a jumping off point for writing your own genetic programming models.

Genetic programming is a fairly versatile instrument, and can be used to solve many different types of problems.  Be creative, and good luck!

 
NETLOGO FEATURES
 

The NetLogo feature on which this whole model stands is the ability to take a string of text, and run it as NetLogo code. This is achieved through the "run" primitive.

Extensive use of recursion and lists has been employed, especially to deal with the tree structures which codeturtles use to store code. Since trees are not natively supported in NetLogo, they have been implemented as nested lists.

This model is built in two parts. The first part is the "GP Library for NetLogo", which consists of a framework of procedures that are useful for any model that is using genetic programming. The second part consists of procedures that are specific to this model. Since NetLogo doesn't support any formal concept of code libraries, this separation is largely achieved through positioning of the code, naming conventions, and comments.

 
RELATED MODELS
 

"Genetic Programming Maze Marchers" and "Genetic Programming Maze Rules"
-- two genetic programming approaches to solving mazes.

There are also several models out there that work with Genetic Algorithms, which are closely related to Genetic Programming. See:

"Echo" under Biology
"ARS-Genetics" and "BinaryGA" by Thomas Hills, in the User Community Models.

 
CREDITS AND REFERENCES
 

Author: Forrest Sondahl
Date: November 28, 2005
Project Web Page: http://cs.northwestern.edu/~fjs750/netlogo/final/

Part of the Genetic Programming Library for NetLogo project, which consists of a library of code that makes it easier to write genetic programming models, as well as several sample models that demonstrate the use of the library.
Created for the course CS 460 Multi-Agent Modeling, at Northwestern University.

Back to Forrest's Final Project Main Page.