Genetic
Programming Demo
(By Forrest Sondahl, 2005)
The applet requires Java 1.4.1 or higher. It will not run on Windows 95 or Mac OS 8 or 9. Mac users must have OS X 10.2.6 or higher and use a browser that supports Java 1.4. (Safari works, IE does not. Mac OS X comes with Safari. Open Safari and set it as your default web browser under Safari/Preferences/General.) On other operating systems, you may obtain the latest Java plugin from Sun's Java site.
Created with NetLogo
* View/download model file:
Genetic
Programming Demo.nlogo
Back to Forrest's Final Project
Main Page.
This model is a simple demo of genetic programming, showing how to
use
the Genetic Programming Library for NetLogo. Turtles start at home, and
go out seeking a red goal patch. Watch them converge toward the goal,
and experiment with the genetic programming parameters. The code
contains some comments that might be useful for building your own
genetic programming model. Modelers are encouraged to use this model as
a starting point for writing their own genetic programming models.
In the world of biology, species evolve by means of natural
selection
and the interactions of DNA. Genetic Programming takes these ideas, and
applies them in the field of computer science. Given a problem, the
goal is to evolve a computer program that can solve the problem. It
starts with a population of randomly generated computer programs. (The
ingredients of these randomly generated computer programs are chosen
based on the problem that is to be solved.) Each program is run, and
its performance is measured by a "fitness function", which reflects how
good each program is at solving the problem. Then programs are chosen
from the population to "reproduce". The programs are chosen randomly,
but with a weighting mechanism that makes it more likely that the more
"fit" programs are chosen. (Analogously in the biological world,
defective organisms can get lucky and pass on their DNA, but the more
fit organisms have a better chance of doing so.) There are three forms
of reproduction that occur:
* Cloning (the child program is identical to its parent)
* Mutation (the child program has some of its code replaced by randomly
generated code)
* Crossover (two parent programs are chosen from the population, and
the child program consists of a mixture of code from the two parents)
After a full new population of programs is formed, the old population
is discarded, and each of the new programs is run and their fitness is
measured. Reproduction occurs, and the cycle continues, until a program
with a certain level of fitness is found -- namely, a program that is
good enough to solve the problem given. The word "until" implies that
such a state will be reached. This isn't necessarily true. For one
thing, the problem posed might be an impossible one (e.g. "What is the
answer to life the universe and everything?"). Or it could be solvable,
but not with the ingredients that the programs are made from (e.g.
"What is the solution to a quadratic equation?" with programs made only
of "+" and "-" operators.) But even if the solution is within the realm
of possible programs generated by the genetic programming process,
success is by no means guaranteed. Genetic Programming is a stochastic,
rather than deterministic, approach to solving problems. Currently
there are, in fact, no proofs that genetic programming works -- merely
empirical evidence showing that in some situations it does. The success
of the genetic programming process is highly dependent on the choice of
ingredients for the programs and a well designed "fitness function"
that "leads" the population in the right direction toward the goal.
Other important parameters include the size of the population, the
length of each program's code, and the relative probability of each of
the reproduction mechanisms. For more information on genetic
programming, see these web sites:
http://www.geneticprogramming.com/Tutorial/index.html
http://www.genetic-programming.com/gpanimatedtutorial.html
In many NetLogo models, agents are given predetermined rules, and
then
the emergent behaviors that form through the interactions of these
agents are studied. In contrast, this model is starts with a desired
behavior (navigating to the red square) and works on trying to discover
the code that the agents should execute to achieve this behavior,
through use of Genetic Programming, as described in the section above.
In this model, the goal-seeking programs are represented by
"codeturtles". Codeturtles each have a piece of NetLogo code assigned
to them, and it's their job to perform it. Codeturtles will then be
chosen, based on their fitness, to reproduce and create another
generation of codeturtles.
You can find the ingredients that make up the codeturtles' netlogo code
by looking in the "gpuser-setup-syntax" procedure. You can also get
insight into what example code looks like by clicking the SHOW-BEST
button.
The fitness function, which measures each codeturtles progress, is a
simple one: "What is the geometric distance to the goal square?" The
lower this number is, the more fit a turtle is.
1. Adjust the slider parameters (see below), or use the default
settings.
2. Press the SETUP button.
3A. Press the GO button to begin the simulation
3B. Press the STEP button to go through one generation at a time.
4. Watch the View, to see the codeturtles execute their code.
5. Watch the FITNESS plot, to see how close the population is to the
goal.
6. If a codeturtle successfully gets to the goal square, then the
simulation will stop. To stop it earlier, press the GO button again.
7. Press the SHOW-BEST button to see the code for the current best
(most fit) codeturtle.
Parameters:
POPULATION-SIZE: The number of codeturtles in each generation
INITIAL-CODE-MAX-DEPTH: The maximum depth of randomly generated code
trees, that codeturtles start with. (It also affects the size of
mutations that occur). In general, a larger number generally means
longer programs are created.
BRANCH-CHANCE: Controls the amount of branching in the generation of
random code trees. Again, a larger number generally means longer
programs are created.
CLONE-CHANCE, MUTATE-CHANCE, CROSSOVER-CHANCE: These three sliders
control the relative probability of each genetic operation occurring,
with respect to the others.
(Examples: If the sliders are set to 10, 10, 10, then there is a 1/3
chance of each genetic operation happening. If the sliders are set to
10, 10, 20, then there is a 25% chance of cloning, 25% chance of
mutation, and 50% chance of crossover. If the sum of these three
sliders is 100, then each slider represents the percent chance of that
genetic operation being chosen to create an offspring for the new
generation.)
FIX-RANDOM-SEED?: If true, then RANDOMSEED is used to start the
process. This allows a particular run to be reproduced exactly, and
thus examined more closely, (provided that the parameters are the
same). If false, then RANDOMSEED is not used.
RANDOMSEED: This is the number used to seed the random number
generator, if FIX-RANDOM-SEED? is true, to allow for reproducible
results.
Notes:
You cannot see the codeturtles while they are moving even if you slow
down the speed slider, because of the way their code is executed, so
what you see on the View is their positions after they have moved for
that generation. They leave a colored trail behind them, which is
cleared between generations. This shows the path that the turtles took.
The best fit codeturtle (or codeturtles, when there is a tie) in each
generation is shown as a neutral face. If a codeturtle finds the goal,
then it is shown as a smiley face.
The colors of the codeturtles have some meaning. They are initially
randomly colored, but:
* When a codeturtle results from cloning, it has the same color as its
parent.
* When a codeturtle results from crossover, it has a color averaging
its two parents.
* When a codeturtle is mutated, it has a random color.
Try playing with the various sliders. See what difference they make.
The purpose of this model is to provide a jumping off point for
writing
your own genetic programming models.
Genetic programming is a fairly versatile instrument, and can be used
to solve many different types of problems. Be creative, and good
luck!
The NetLogo feature on which this whole model stands is the ability
to
take a string of text, and run it as NetLogo code. This is achieved
through the "run" primitive.
Extensive use of recursion and lists has been employed, especially to
deal with the tree structures which codeturtles use to store code.
Since trees are not natively supported in NetLogo, they have been
implemented as nested lists.
This model is built in two parts. The first part is the "GP Library for
NetLogo", which consists of a framework of procedures that are useful
for any model that is using genetic programming. The second part
consists of procedures that are specific to this model. Since NetLogo
doesn't support any formal concept of code libraries, this separation
is largely achieved through positioning of the code, naming
conventions, and comments.
"Genetic Programming Maze Marchers" and "Genetic Programming Maze
Rules"
-- two genetic programming approaches to solving mazes.
There are also several models out there that work with Genetic
Algorithms, which are closely related to Genetic Programming. See:
"Echo" under Biology
"ARS-Genetics" and "BinaryGA" by Thomas Hills, in the User Community
Models.
Author: Forrest Sondahl
Date: November 28, 2005
Project Web Page: http://cs.northwestern.edu/~fjs750/netlogo/final/
Part of the Genetic Programming Library for NetLogo project, which
consists of a library of code that makes it easier to write genetic
programming models, as well as several sample models that demonstrate
the use of the library.
Created for the course CS 460 Multi-Agent Modeling, at Northwestern
University.
Back to Forrest's Final Project Main Page.