Due Wednesday, January 27 at 11:59 PM via e-mail to BOTH jiangxu2011 at u.northwestern.edu and ddowney at eecs.northwestern.edu. Use EECS 395/495 Homework 2 as the e-mail subject line. PDF format is preferred, though plain text, Word, and HTML are also acceptable.

1. (2 points) Exercise 3.4

2. (2 points) Exercise 3.5

3. (1 point) Exercise 3.6

4. (2 points) Exercise 3.15

5. (1 point) Exercise 4.10 (only the strong union part)

6. Project Exercises.
a. (1 point) First, divide your data set into a "training and development" portion, and a "test" portion. Randomly choose 200 examples or 30% of your data set (whichever is more data) and set this aside as a "test set" (you won't use this test data until the end of the quarter). Briefly describe how you made the random selection (incl. code if applicable) and report on how large your test set is and how large your training/dev set is.
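One possible way to make the random selection is sketched below. This is only an illustration, not a required method: the function name `split_dataset`, its parameters, and the fixed seed are assumptions made for the example, and it assumes the data set fits in memory as a Python list.

```python
import random


def split_dataset(examples, min_test=200, test_fraction=0.30, seed=0):
    """Return (train_dev, test) with a randomly chosen held-out test set.

    Hypothetical helper: the name, defaults, and seed are illustrative only.
    """
    n = len(examples)
    # "Whichever is more data": the larger of 200 examples or 30% of the set.
    n_test = max(min_test, round(test_fraction * n))
    if n_test >= n:
        raise ValueError("data set too small to hold out the requested test set")

    rng = random.Random(seed)  # fixed seed so the same split can be reproduced
    indices = list(range(n))
    rng.shuffle(indices)
    test_idx = set(indices[:n_test])

    # Preserve the original ordering within each portion.
    test = [ex for i, ex in enumerate(examples) if i in test_idx]
    train_dev = [ex for i, ex in enumerate(examples) if i not in test_idx]
    return train_dev, test
```

Recording the seed alongside the two sizes makes it easy to describe the selection and to regenerate exactly the same test set later in the quarter.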

b. (3 points) Read Boxes 3.C and 3.D in the textbook. Then, attempt to design a Bayesian Network graph for your project task. If your data set has a large number of features, you should select what you think are 8-10 important ones for this exercise.
(i) Provide an illustration of your Bayes Net graph. Briefly describe one specific aspect of your graph that provides sparsity -- that is, particular nodes or edges that significantly reduce the number of modeling parameters required, when compared with a full joint model.
(ii) Based on your experience constructing the graph, what are 1-2 other features that would likely be helpful for your task, and where would they fit in the graph?
(iii) Describe a "hidden" node that might be useful for your task (this node may or may not have been part of your original graph in part (i)). What information would the node capture, and why would the node be helpful in the model?
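To make the sparsity comparison in part (i) concrete, the sketch below counts independent parameters for a discrete Bayes Net and for the corresponding full joint model. Everything in it is assumed for illustration: the function names, the three-variable graph (SodaConsumption -> Cavity, with BearsWin unconnected), and the number of values each variable takes. A node with k values and parents that jointly take q configurations needs (k - 1) * q independent parameters, while a full joint table over all variables needs one fewer than the total number of assignments.

```python
from math import prod


def full_joint_params(cardinality):
    """Independent parameters in a full joint table over all variables."""
    return prod(cardinality.values()) - 1


def bayes_net_params(cardinality, parents):
    """Independent parameters summed over the CPTs of a discrete Bayes Net."""
    total = 0
    for node, k in cardinality.items():
        # Number of joint parent configurations (1 if the node has no parents).
        parent_configs = prod(cardinality[p] for p in parents.get(node, []))
        total += (k - 1) * parent_configs
    return total


# Hypothetical example: value counts and edges are assumptions, not given data.
cardinality = {"SodaConsumption": 3, "Cavity": 2, "BearsWin": 2}
parents = {"Cavity": ["SodaConsumption"]}

print(full_joint_params(cardinality))          # prints 11
print(bayes_net_params(cardinality, parents))  # prints 6
```

The gap grows quickly with more variables: ten binary variables need 1023 parameters as a full joint, but only 19 if they form a simple chain.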

c. (3 points) Here, you'll try to add CPTs to a portion of your graph from part (b). Important: use your domain intuition, rather than your data set, for this exercise.
(i) Take a subgraph containing three important nodes from your graph in part (b), and manually specify reasonable CPTs for each node. Show the subgraph and the CPTs.
(ii) Based on these CPTs, compute the joint probability of one particular assignment of values to the three variables. For example, if your model expresses P(SodaConsumption, Cavity, BearsWin), you might evaluate P(SodaConsumption=High, Cavity=Yes, BearsWin=No). Do you think the probability is realistic?
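As an illustration of the calculation in part (ii), the sketch below multiplies CPT entries along an assumed graph in which SodaConsumption is a parent of Cavity and BearsWin has no parents or children. The structure and every number in the tables are made up for the example; your own graph and CPTs will differ. Under this structure the joint factorizes as P(SodaConsumption) * P(Cavity | SodaConsumption) * P(BearsWin).

```python
from itertools import product

# Hypothetical CPTs: all values below are invented for illustration.
p_soda = {"Low": 0.5, "Medium": 0.3, "High": 0.2}
p_cavity_given_soda = {
    "Low": {"Yes": 0.1, "No": 0.9},
    "Medium": {"Yes": 0.2, "No": 0.8},
    "High": {"Yes": 0.4, "No": 0.6},
}
p_bears_win = {"Yes": 0.5, "No": 0.5}


def joint(soda, cavity, bears_win):
    """P(SodaConsumption, Cavity, BearsWin) via the Bayes Net chain rule."""
    return (
        p_soda[soda]
        * p_cavity_given_soda[soda][cavity]
        * p_bears_win[bears_win]
    )


p = joint("High", "Yes", "No")
print(f"{p:.3f}")  # prints 0.040

# Sanity check: the joint should sum to 1 over all assignments.
total = sum(
    joint(s, c, b)
    for s, c, b in product(p_soda, ["Yes", "No"], p_bears_win)
)
print(f"{total:.3f}")  # prints 1.000
```

Checking that the probabilities sum to 1 is a quick way to catch a CPT row that was not normalized.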