Probabilistic Reasoning in AI (Winter 2002)

Programming Assignment 1

[40 points] Knowledge Engineering (adapted from Russell and Norvig, Artificial Intelligence: A Modern Approach)

You are a consultant for an auto insurance company. Your task is to construct a belief network that will allow the company to assess how much rish they run from various policy holders. In this domain, risk is assessed by three major variables: medical cost, property cost and intangible liability cost. Medical and property costs are incurred by all individuals involved in an accident. Auto theft or vandalism might also incur property costs. Intangible liability costs are legal penalties for things like "pain and suffering", punitive damages and other costs that a driver might incur in an accident in which he or she is at fault. Evidence variables for this doamin are: driver's age and record; whether he or she owns a car, how far they drive, vehicle's make, model and year, presence of safety equipment (e.g. air bag), where vehicle is garaged and whether it has an anti-theft device.
1. Build a network for this problem. Note that you will have to make choices considering the domains of the variables, conditional independencies, and whetehr or not to add intermediate nodes (e.g. driving skill). You can use JavaBayes, the Matlab Bayes net toolbox or Microsoft's Bayes net package to input your network. In the homework report, show your network and justify your choices.
2. Construct a few queries (at least 3) and perform exact inference on this network, with the method of your choice. Explain the results you are getting
3. If you are not satisfied with the results and you re-structure your network, describe in the report how you did the restructuring
[60 points] Approximate inference
Consider the Sprinkler network from the lecture notes. For this problem, you should implement likelihood weighting and Gibbs sampling for this network, in the language of your choice. Note that you will need a data structure for holding the network. You will also need to keep probability tables for each node. This data structure does not have to be general, and you do not need to do fancy I/O. Just input the network as given, directly in your code.
To test your code, first compute (by hand) $P(\lnot Cloudy|WetGrass)$.
1. Generate 1000 instances. Use likelihood weighting to compute the required probability using 100, 200, $\dots$ 1000 instances. Repeat the generation process 5 times. Plot a graph with the amount of data on the x axis, and the estimated probability on the y axis. The graph should have 4 curves: the correct value of the probability, and the curves representing the average, minimum and maximum estimated probability over the 5 runs.
2. Repeat the same experiment with Gibbs sampling. Use two different settings for the duration of the burn-in period (e.g. 10 and 50). Plot a similar graph to the one described above.
3. Explain the differences between the graphs (if any). Comment on the time duration fo running the two algorithms
Please submit your documented code via e-mail. The report can be handed in on paper. No credit will be given without the code. Each inference method is worth 25 points. The report quality (graphs etc) is worth 10 points.

This assignment is worth 10% of your final grade

Prof. Doina PRECUP

Last modified: Mon Mar 4 15:00:59 EST 2002