EECS 311: DATA STRUCTURES |
Home Course Info Links Grades |
Lectures Newsgroup Homework Exams |
The exact test cases for this assignment are still being worked out.
The goals of this assignment are:
Be sure to follow the 311 coding standards.
There are three parts to this assignment:
Pair-programming is allowed if it is true pair programming, i.e., coding done by both partners together, on one computer, switching every 20 minutes or so. The analyses should be done individually.
An octree is a variation of an quad-tree, suitable for storing three-dimensional data. They are very simple data structures. No rotations to worry about, just pointers, points, and some way to keep track of what region of space any node in the tree represents.
You'll obviously need a class for octrees, but equally important to keep things simple will be at least two more classes:
point3d
class for three-dimensional coordinate pointsregion3d
class for three-dimensional regionsThe points are obvious because that's what we're storing in the octree. The regions are what we'll use to query our octree.
Your point3d
can be a simple struct
.
Your region3d
can also be a struct
,
with a 3-dimensional point P and a size N,
representing a cube with width, height and depth N and center at P.
We'll use a cube to keep the arithmetic simple. A more typical solution would
support general cuboids or even spheres.
Your octree
should support at least the following
public member functions:
void insert(point3d point)
that adds the given
point to the tree, following the rules for octrees. Duplicate points
should be allowed.template <Iter> int query(region3d region, Iter iter)
that takes a region and an insert iterator,
and stores in iter
all points, if any, that are in the given region. query
should return how many points were found.Each node in the octree needs to hold eight pointers to the eight octants, in some standard order. In addition, you'll need to decide how you want to store the actual points in the tree, and how you'll keep track of the dimensions each node represents. Note that if you know the dimensions of the root node, all the other dimensions are directly determined by their location in the tree.
Octrees are quad-trees extended to three dimensions. Quad-trees are described in exercises 12.39 and 12.40 on page 562. An example of quad-trees applied to two dimensional video games is given here.
A good description of octrees is here.
An excellent Powerpoint presentation with an empirical and analytical comparison of octrees and a number of other approaches to storing three-dimensional points is here.
Octants (and quadrants in quad-trees) are formed repeatedly until points fall into separate bins. A maximum depth is necessary to avoid very close points creating arbitrarily deep trees.
Do as formal an analysis as you can of the upper and lower complexity bounds of your algorithm for space and time. The number of points stored N is clearly important, but not the only relevant factor. Think about what else affects the space and time requirements. Include and refer to the key lines of your code that you believe dominate the complexity. Don't counting reading and printing, but definitely include the time to insert a point. Don't worry about deletion.
Run your program on some real data and a larger set of randomly generated data.
Use PhotoObjects.zip for the real data set. Each line of the file gives galactic coordinates for a galactic object in the Sloan Digital Sky Survey. I found the file here.
The coordinates in this data fall within the following bounds:
Axis | Minimum | Maximum |
---|---|---|
X | -0.998 | -0.994 |
Y | -0.105 | -0.069 |
Z | -0.022 | -0.003 |
For your second run, generate and store 100,000 random points in the same ranges.
Your program should run standalone, without user input. It should do two runs, one with real data, one with the randomly generated data. For each run it should
You can organize your code into multiple .cpp and .h
files, if that makes sense, but be sure to define
main()
in a file called main.cpp.
Note the use of a templated query()
member function.
This is so you can pass an insert operator to any container you
create to collect results. The simplest thing is to pass a
back_inserter
for a vector.
Use Weiss' Random class to create the random data. This
class is described in Section 10.4.1, with comments on why it might
be better than what your system comes with. I put the code in
Weiss-Random.zip.
It includes a test program you should compile and
run to make sure it works. Be sure to include the Random.cpp
in your project, and put Random.h
in your project
directory.
There are two separate attachments to email: your code, in a compressed archive, and your algorithmic analysis, in a text document. These are described below. Both should be emailed to me with the Subject line EECS 311 PA2.
Submit a zipped or tar-gzipped archive of one directory called PA2
containing:
If you worked alone, call the archive
last-name_first-initial_pa2.zip (use .tgz
for tar-gzipped files). Example: riesbeck_c_pa2.zip.
If you pair-programmed, call the archive
last-name1_last_name2_pa2.zip (use .tgz
for tar-gzipped files). Example: kornhauser_tarzia_pa2.zip.
Make sure your code is compile-and-run.. That means
"PhotoObjects.txt"
,
not absolute file paths, like "c:/eecs311/projects/pa2/PhotoObjects.txt"
Do not include PhotoObjects.txt
! My email account
is overfull as it is.
Every source file should have an opening header comment of the following form:
// Filename: main.c (or whatever) // EECS311 PA2 solution: Simple octree // Date: April 3, 2007 // Authors: Dan Kornhauser, Stephen Tarzia // Programming sessions: 4/1/07: 4-6pm; 4/2/07: 3-6pm; 4/3/07: 10am-noon // // Except for code provided by the instructor or textbook, this code was written // solely by the authors listed above, and not copied from another student // or external source.
Write an algorithmic analysis of the parts of the code that dominate the running time. The key independent variable is the number of points. Use "big-Oh" notation, as shown in the book. Include the key lines of code that you're analyzing. Compare your theoretical results to your empirical results. Ideally they should match, but don't claim a match that isn't there. It's better to recognize a problem with the analysis or empirical measurements than to fudge or misread the data.