EECS 311: DATA STRUCTURES Home Course Info Links Grades Lectures Newsgroup Homework Exams

# Programming Assignment 3

## Goal

The goal of this assignment is to implement a variant form of priority queue capable of supporting updating and removing elements, while maintaining the complexity guarantees of standard priority queues.

Be sure to follow the 311 coding standards.

There are two parts to this assignment:

• Implement an updatable priority queue data structure.
• Test your queue on both artifical and real data.

Pair-programming is allowed if it is true pair programming, i.e., coding done by both partners together, on one computer, switching every 20 minutes or so. The analyses should be done individually.

### Implement an Updatable Priority Queue

A standard priority queue, such as `priority_queue` in the C++ standard library, does not let you remove items after they have been added, except by popping them off when they (eventually) reach the front of the queue.

Further, if the priority of the item is increased or decreased after it is added, there is no way to tell the queue to adjust itself.

Your job is to implement an updatable priority queue, called `u_priority_queue`. It should mirror the standard `priority_queue` template class, with a few simplifications, and two extra member functions:

• `u_priority_queue<T> upq` should make `upq` an empty updatable priority queue for data of type `T`.
• `upq` should support the following standard member functions of regular priority queues, with the same complexity: `empty()`, `size()`, `top()`, `pop()`, and `push(T value)`.
• `remove(T value)` should remove `value` from the queue. Nothing should happen if `value` is not in the queue. This operation should have worst case O(log N) behavior.
• `update(T value)` should recalculate the location of `value` in the queue and move it if necessary. Nothing should happen if `value` is not in the queue. This operation should have worst case O(log N) behavior.

You are not required to support additional template parameters, such as the container in which to store the heap or the comparison function to use. Instead, you should just use a `vector` (not an array) for the heap, and use `operator<` to compare values.

Note that updatable priority queues only make sense for compound objects, e.g., tasks with priorities or scheduled times, where the priority is just one part of the value and does not determine the identity of the value. An updatable priority queue of simple numbers would make no sense.

In order for `remove(T value)` and `update(T value)` to work in O(log N) time or less, an updatable priority queue must be able to

• find the current location of `value` in the heap in O(log N) or less time
• change the location of `value` to a new value in O(1) time, with as small a time constant as possible

The first part can be done with either a `map` or `hash_map` to associate a value with its current location.

A `map` would provide O(log N) performance, but introduces a conflict:

• The priority queue needs `operator<()` to sort values by their priority.
• The map needs `operator<()` to sort values by their identity, e.g., by id number or name.

One alternative would be to require users to specify in the template parameters the ordering predicates for priorities and identities. But this seems a bit annoying to have to do every time.

Therefore, use a `hash_map` (or `unordered_map`, if that's what your compiler supports) to map values to heap locations. Then, to store objects of type `T` in an updatable priority queue, the user just needs to make sure that:

• `operator<(const T &v1, const T & v2)` compares values by their priorities
• `hash(const T &v)` hashes values by their identifiers

A more flexible updatable priority queue could allow the user to specify template parameters for these functions, but it would not be required.

The second problem is to make changes to locations as cheap as possible. If the hashtable stores just a location, i.e., an integer, then every time a value is moved, the hashtable would have to be accessed and updated. Hashtables are O(1) but they are still much more expensive than simple vector assignments.

The trick, shown on many web sites, is to internally dynamically allocate a pair object containing the value and its location in the heap. Store pointers to that pair in the hashtable and the heap. When you need the location of a value, the hashtable gives you the pair, which gives you the location in the heap. Then, on remove or update, as the pair moves up or down the heap, the location can be updated with a simple assignment to the pair.

As always, when you allocate memory dynamically, you have to remember to deallocate it. That means whenever an item is removed, you need to `delete` the pair, and when the priority queue is destroyed, it should delete any pairs remaining in it.

## Testing

Use test_upq.cpp to test your updatable priority queue. This test code builds and tests a `u_priority_queue` with various data sets.

test_upq.cpp does two things:

• It runs a series of tests queuing, removing and/or modifying simple "event" objects with priorities.
• It then uses a priority queue to track a simulated election, using vote tallies (apparently from an actual election).

The event part is ready to run, as soon as you have a working priority queue. The election part requires you to define some additional classes to model the domain of candidates and votes.

The code and sample data files are in upq.zip.

### The Event Tests

The event testing part of test_upq.cpp is designed to help you uncover common bugs in implementing this kind of container. The code creates simple `Event` objects. These objects have just an id number and a priority. The class is defined in event.hpp and event.cpp. `operator<(const Event &e1, const Event &e2)` is defined to compare event priorities, while a hashing function is defined to use event ids. The number of events queued is controlled by `MAX_QUEUE`. It's set to 6, which is enough to uncover most common bugs in implementing heaps.

The hashing function declaration is specific to g++ 3.4.2, which is used in Dev-C++. If you are using another compiler, you will need to find out how hashing functions are defined for classes in that system.

test_upq.cpp should be very helpful when you develop and debug your queue, because it runs tests starting with very small data sets that can be analyzed by hand. test_upq.cpp illustrates an important point about testing code. Testing code should catch errors itself, rather than printing data for a (fallible) human to inspect for errors.

### The Election Test

After the event tests pass, uncomment the two lines in test_upq.cpp for doing elections. Declare the function testElection() in test_election.hppand implement it in test_election.cpp. You also need to define a class to represent a candidate, with the candidate's name and current vote tally. Use the `Event` class as a model.

The call testElection(int numSeats, const char* nameFile, const char *votesFile) should

• read and create candidates for each name in the given name file
• store the candidates in a vector for updating
• store all the candidates in an updatable priority queue
• then, until the end of the votes file
• read the next line of vote tallies
• update the candidate objects in the vector and priority queue
• print the current leader, i.e., at the top of the queue<
• Finally, it should use the queue to find and print the top numSeats vote getters

Two sample data files are provided.

• names.txt gives 24 candidate names, one to a line.
• votes.txt gives the tallies for a set of fictious precincts; each line contains the counts for the candidates, in the order given by names.txt

Here's some sample output:

```After precinct 1, leader is Toomey:550
After precinct 2, leader is ...
...

Top 9 vote-getters are:
1: xxxx:2729
2: ...
...
9: ...
```

## What to Hand in

Submit a zipped or tar-gzipped archive of one directory called last-name_first-initial_pa3, containing:

• your output in a plain text file called output.txt

If you worked alone, call the archive last-name_first-initial_pa3.zip (use `.tgz` for tar-gzipped files). Example: riesbeck_c_pa3.zip.

If you pair-programmed, call the archive last-name1_last_name2_pa3.zip (use `.tgz` for tar-gzipped files). Example: kornhauser_tarzia_pa3.zip.

Important! Make sure your archive meets the general code submission requirements.

Every source file should have an opening header comment of the following form:

```// Filename: u_priority_queue.hpp (or whatever)
// EECS311 PA3 solution: Updatable priority queue
// Date: May 9, 2007
// Authors: Dan Kornhauser, Stephen Tarzia
// Programming sessions: 5/2/07: 4-6pm; 5/6/07: 3-6pm; 5/8/07: 10am-noon
//
// Except for code provided by the instructor or textbook, this code was written
// solely by the authors listed above, and not copied from another student
// or external source.
```

Email your code, in a compressed archive, to me with the Subject line EECS 311 PA3.