EECS 311: DATA STRUCTURES

Programming Assignment 2

The exact test cases for this assignment are still being worked out.

Goal

The goals of this assignment are:

Be sure to follow the 311 coding standards.

Task

There are three parts to this assignment:

Pair-programming is allowed if it is true pair programming, i.e., coding done by both partners together, on one computer, switching every 20 minutes or so. The analyses should be done individually.

Implement an Octree

An octree is a variation of an quad-tree, suitable for storing three-dimensional data. They are very simple data structures. No rotations to worry about, just pointers, points, and some way to keep track of what region of space any node in the tree represents.

You'll obviously need a class for octrees, but equally important to keep things simple will be at least two more classes:

The points are obvious because that's what we're storing in the octree. The regions are what we'll use to query our octree.

Your point3d can be a simple struct.

Your region3d can also be a struct, with a 3-dimensional point P and a size N, representing a cube with width, height and depth N and center at P. We'll use a cube to keep the arithmetic simple. A more typical solution would support general cuboids or even spheres.

Your octree should support at least the following public member functions:

Each node in the octree needs to hold eight pointers to the eight octants, in some standard order. In addition, you'll need to decide how you want to store the actual points in the tree, and how you'll keep track of the dimensions each node represents. Note that if you know the dimensions of the root node, all the other dimensions are directly determined by their location in the tree.

Resources

Octrees are quad-trees extended to three dimensions. Quad-trees are described in exercises 12.39 and 12.40 on page 562. An example of quad-trees applied to two dimensional video games is given here.

A good description of octrees is here.

An excellent Powerpoint presentation with an empirical and analytical comparison of octrees and a number of other approaches to storing three-dimensional points is here.

Octants (and quadrants in quad-trees) are formed repeatedly until points fall into separate bins. A maximum depth is necessary to avoid very close points creating arbitrarily deep trees.

Analyze your algorithm

Do as formal an analysis as you can of the upper and lower complexity bounds of your algorithm for space and time. The number of points stored N is clearly important, but not the only relevant factor. Think about what else affects the space and time requirements. Include and refer to the key lines of your code that you believe dominate the complexity. Don't counting reading and printing, but definitely include the time to insert a point. Don't worry about deletion.

Input

Run your program on some real data and a larger set of randomly generated data.

Use PhotoObjects.zip for the real data set. Each line of the file gives galactic coordinates for a galactic object in the Sloan Digital Sky Survey. I found the file here.

The coordinates in this data fall within the following bounds:

AxisMinimumMaximum
X-0.998-0.994
Y-0.105-0.069
Z-0.022-0.003

For your second run, generate and store 100,000 random points in the same ranges.

Output

Your program should run standalone, without user input. It should do two runs, one with real data, one with the randomly generated data. For each run it should

Additional Requirements

You can organize your code into multiple .cpp and .h files, if that makes sense, but be sure to define main() in a file called main.cpp.

Note the use of a templated query() member function. This is so you can pass an insert operator to any container you create to collect results. The simplest thing is to pass a back_inserter for a vector.

Use Weiss' Random class to create the random data. This class is described in Section 10.4.1, with comments on why it might be better than what your system comes with. I put the code in Weiss-Random.zip. It includes a test program you should compile and run to make sure it works. Be sure to include the Random.cpp in your project, and put Random.h in your project directory.

What to Hand in

There are two separate attachments to email: your code, in a compressed archive, and your algorithmic analysis, in a text document. These are described below. Both should be emailed to me with the Subject line EECS 311 PA2.

Code

Submit a zipped or tar-gzipped archive of one directory called PA2 containing:

If you worked alone, call the archive last-name_first-initial_pa2.zip (use .tgz for tar-gzipped files). Example: riesbeck_c_pa2.zip.

If you pair-programmed, call the archive last-name1_last_name2_pa2.zip (use .tgz for tar-gzipped files). Example: kornhauser_tarzia_pa2.zip.

Make sure your code is compile-and-run.. That means

Do not include PhotoObjects.txt! My email account is overfull as it is.

Every source file should have an opening header comment of the following form:

// Filename: main.c (or whatever)
// EECS311 PA2 solution: Simple octree
// Date: April 3, 2007
// Authors: Dan Kornhauser, Stephen Tarzia
// Programming sessions: 4/1/07: 4-6pm; 4/2/07: 3-6pm; 4/3/07: 10am-noon
// 
// Except for code provided by the instructor or textbook, this code was written
// solely by the authors listed above, and not copied from another student
// or external source.

Analysis

Write an algorithmic analysis of the parts of the code that dominate the running time. The key independent variable is the number of points. Use "big-Oh" notation, as shown in the book. Include the key lines of code that you're analyzing. Compare your theoretical results to your empirical results. Ideally they should match, but don't claim a match that isn't there. It's better to recognize a problem with the analysis or empirical measurements than to fudge or misread the data.


Valid HTML 4.01 Strict