Problem Set 2: Due 5:00 PM Thursday, October 23

  1. Your boss sends you this e-mail: "I want a model that can predict when users are likely to return to our Website, based on which pages they visit. Mail me something by tomorrow morning." Answer the following three questions in 2-4 sentences each.
    1. Assume the dataset you have easy access to is a log of <user ID, url, timestamp> triples compiled over several weeks. How would you transform the data for use by a machine learning algorithm? Be specific. (2 points)
    2. Given the choice between neural networks and decision trees for this task, which would you choose? Give two reasons why. (2 points)
    3. What statistics would you measure to convince your boss that your model was successful at the task? (1 point)
  2. Consider a pair of perceptrons defined over continuous-valued ordered pairs (x1, x2). Perceptron P has weights (w0, w1, w2) and perceptron P' has weights (w0', w1', w2'). If the biases are w0 = w0' = 1 and the weights are w1 = 1 and w1' = 1/2, under what conditions on w2 and w2' is P' more_general_than P? The more_general_than relation is defined in Mitchell, Chapter 2. (2 points)
  3. Consider a sequential covering algorithm (such as CN2) and a simultaneous covering algorithm (such as your decision tree learner).
    1. Describe a data set for which you would expect a sequential covering algorithm to outperform a simultaneous covering algorithm.
    2. Describe a different data set for which you would expect the opposite to be true.
    (4 points for this problem, plus up to 2 points of extra credit for particularly insightful or clear answers.)
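
As background for Problem 2, the following Python sketch shows one way to probe the generality ordering between two perceptrons numerically. It assumes the threshold convention in which a perceptron outputs 1 when w0 + w1*x1 + w2*x2 > 0 and -1 otherwise, and it reads "h is more general than or equal to g" as "every instance g classifies as positive is also classified as positive by h". The function names, the sample grid, and the weight vectors `a` and `b` are hypothetical and chosen only for illustration; they are not the weights from Problem 2. A check over a finite sample can refute the ordering but cannot prove it, so it does not replace the analytical argument the problem asks for.

```python
import itertools


def perceptron(weights, x1, x2):
    """Threshold unit (assumed convention): 1 if w0 + w1*x1 + w2*x2 > 0, else -1."""
    w0, w1, w2 = weights
    return 1 if w0 + w1 * x1 + w2 * x2 > 0 else -1


def more_general_or_equal_on(points, general, specific):
    """True if every sampled point that `specific` labels positive
    is also labeled positive by `general`."""
    return all(
        perceptron(general, x1, x2) == 1
        for x1, x2 in points
        if perceptron(specific, x1, x2) == 1
    )


# Hypothetical sample: a coarse grid over [-10, 10] x [-10, 10].
grid = [i / 2 for i in range(-20, 21)]
points = list(itertools.product(grid, grid))

# Illustrative weight vectors (w0, w1, w2); not the ones in Problem 2.
a = (1, 2, 1)
b = (0, 2, 1)

a_covers_b = more_general_or_equal_on(points, a, b)
b_covers_a = more_general_or_equal_on(points, b, a)
print(a_covers_b, b_covers_a)
```

On this sample, `a` covers every positive of `b` while the converse fails at the origin, which is consistent with `a` being strictly more general than `b` under the assumed definitions.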
 

Submit your homework via email to f-iacobelli@northwestern.edu. Use EECS349-PS<problem set number>-<first name>-<LastName> as the subject of your email and attach a compressed ZIP file containing your solution. Name the ZIP file PS<problem set number>-<first name>-<LastName>.zip. For example, if your name is James Bond and you are submitting your solution to Problem Set 2, send the TA an email with the subject EECS349-PS2-James-Bond and attach the file PS2-James-Bond.zip, which contains all the files that make up your solution to Problem Set 2.
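
The naming convention above can be sketched in Python as follows. This is only a convenience sketch: the helper names `submission_names` and `build_zip` are hypothetical, and the choice to store files flat inside the archive is an assumption, since the instructions do not specify an internal layout.

```python
import zipfile
from pathlib import Path


def submission_names(problem_set, first_name, last_name):
    """Return (email subject, ZIP file name) following the stated convention."""
    stem = f"PS{problem_set}-{first_name}-{last_name}"
    return f"EECS349-{stem}", f"{stem}.zip"


def build_zip(zip_name, solution_files):
    """Write the listed solution files into a compressed ZIP archive."""
    with zipfile.ZipFile(zip_name, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for path in solution_files:
            # Assumption: store each file under its base name (flat archive).
            archive.write(path, arcname=Path(path).name)


subject, zip_name = submission_names(2, "James", "Bond")
print(subject)   # EECS349-PS2-James-Bond
print(zip_name)  # PS2-James-Bond.zip
```

Calling `build_zip(zip_name, [...])` with the paths of your solution files would then produce the attachment.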