D64 Assignment #4

Due Wednesday, March 1, 2000 at 4:00pm.

The assignment should be turned in before class on Wednesday.

Problem Set

  1. (30 pts.) It's your job to implement an aggregation module for a query evaluator.  The module will be used to compute an average of one column with GROUP BY on another column.  The output of the module will be rows with two columns, the average value and the group by value.  You may assume that the input relation is already sorted in ascending order of the group by column.

    You must implement three functions:

    You can assume that the following three global variable are available to you and are set correctly:

    You also have the following functions that you can call (in addition to the iterator calls on the input):

    For example, if the user asks the query:
    SELECT AVERAGE(gpa),class
    FROM Student
    GROUP BY class
    your module would be the root of the query plan. groupByColumnNum would be the column number of the class column and averageColumnNum the column number of the gpa column. Show the psuedo-code for each of the three routines you have to write.

  2. (80 pts.) This question is about evaluating following SQL query:
    SELECT *
    FROM Student,Dorm
    WHERE year = 1
      AND Student.dorm = Dorm.name

    Here is some info you may need for this problem:

    Answer the following questions
    1. (40 pts.) Determine the I/O cost of computing the answer to the SQL query using four different join algorithms: sort-merge join, hash join, index join, and nested loop join. Show all of your work. You may assume that you have 20 free buffers.
    2. (40 pts.) Repeat question 1 assuming that you have 100 free buffers.