Please make sure your final output file is a PDF document. You may submit handwritten solutions for non-programming exercises or type them using R Markdown, LaTeX, or any other word processor. All programming exercises must be done in R, typed up clearly, with all code attached. Submissions should be made on Gradescope: go to Assignments \(\rightarrow\) Homework 5.
Simulate data assuming \(\boldsymbol{y_i} = (y_{i1},y_{i2})^T \sim \mathcal{N}_2(\boldsymbol{\theta}, \Sigma)\), \(i = 1, \ldots, 100\), with \(\boldsymbol{\theta} = (0,0)^T\) and \(\Sigma\) chosen so that both marginal variances equal \(1\) and the correlation is \(\rho = 0.8\).
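One way to simulate the data (a minimal sketch; the seed, object names, and the use of MASS::mvrnorm are illustrative choices, not part of the assignment):

```r
library(MASS)  # for mvrnorm

set.seed(1)                        # arbitrary seed, for reproducibility
n     <- 100
theta <- c(0, 0)                   # true mean vector
rho   <- 0.8
Sigma <- matrix(c(1,   rho,
                  rho, 1), 2, 2)   # unit marginal variances, correlation 0.8
y <- mvrnorm(n, mu = theta, Sigma = Sigma)  # 100 x 2 data matrix
```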
Part (a): Compute the MLE of \((\boldsymbol{\theta}, \Sigma)\). Make two contour plots, one using the true values of \((\boldsymbol{\theta}, \Sigma)\) and the other using the MLEs. Comment on the differences, if any. (2 points)
You are not expected to derive the maximum likelihood estimators for the two parameters from scratch. If you don’t know what the MLEs are, you can look up the forms (just the forms!) online.
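A sketch of part (a), assuming the mvtnorm package for evaluating the bivariate normal density (you could equally well code the density by hand):

```r
library(mvtnorm)  # for dmvnorm

theta_hat <- colMeans(y)                       # MLE of the mean
Sigma_hat <- cov(y) * (nrow(y) - 1) / nrow(y)  # MLE of Sigma uses divisor n, not n - 1

grid <- seq(-3, 3, length.out = 100)
dens_grid <- function(mu, S)
  outer(grid, grid, function(a, b) dmvnorm(cbind(a, b), mean = mu, sigma = S))

par(mfrow = c(1, 2))
contour(grid, grid, dens_grid(theta, Sigma), main = "True parameters",
        xlab = expression(y[1]), ylab = expression(y[2]))
contour(grid, grid, dens_grid(theta_hat, Sigma_hat), main = "MLEs",
        xlab = expression(y[1]), ylab = expression(y[2]))
```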
Part (b): Assuming independent normal and inverse-Wishart priors for \(\boldsymbol{\theta}\) and \(\Sigma\), that is, \(\pi(\boldsymbol{\theta}, \Sigma) = \pi(\boldsymbol{\theta})\, \pi(\Sigma)\), run a Gibbs sampler (hyperparameters are up to you, but you must justify your choices) to generate posterior samples of \((\boldsymbol{\theta}, \Sigma)\). (2 points)
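A sketch of one such Gibbs sampler, using the semi-conjugate full conditionals from Hoff Chapter 7 (the hyperparameter values below are weakly informative placeholders, not recommendations; you must pick and justify your own):

```r
library(MASS)  # for mvrnorm

n    <- nrow(y)
ybar <- colMeans(y)

## Hyperparameters (illustrative placeholders -- justify your own choices)
mu0     <- c(0, 0)        # prior mean for theta
Lambda0 <- 10 * diag(2)   # diffuse prior covariance for theta
nu0     <- 4              # small prior degrees of freedom => weak prior on Sigma
S0      <- diag(2)        # prior scale matrix for Sigma

S_iter <- 5000
THETA  <- matrix(NA, S_iter, 2)
SIGMA  <- array(NA, c(2, 2, S_iter))
Sigma_cur <- cov(y)       # starting value

for (s in 1:S_iter) {
  ## theta | Sigma, y ~ N(mu_n, Lambda_n)
  iL0 <- solve(Lambda0); iS <- solve(Sigma_cur)
  Lambda_n  <- solve(iL0 + n * iS)
  mu_n      <- as.numeric(Lambda_n %*% (iL0 %*% mu0 + n * (iS %*% ybar)))
  theta_cur <- as.numeric(mvrnorm(1, mu_n, Lambda_n))

  ## Sigma | theta, y ~ inverse-Wishart(nu0 + n, (S0 + S_theta)^{-1}),
  ## drawn by inverting a Wishart draw from base R's rWishart
  resid     <- sweep(y, 2, theta_cur)
  S_theta   <- t(resid) %*% resid
  Sigma_cur <- solve(rWishart(1, nu0 + n, solve(S0 + S_theta))[, , 1])

  THETA[s, ]   <- theta_cur
  SIGMA[, , s] <- Sigma_cur
}
```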
Part (c): Compare the Bayes estimates of \((\boldsymbol{\theta}, \Sigma)\) to the MLEs and to the truth. Comment on the similarities and differences. In particular, how much influence do you think your prior had on the Bayes estimates? (2 points)
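For example, posterior means computed from the saved Gibbs draws can serve as the Bayes estimates (a sketch reusing the objects THETA, SIGMA, and theta_hat from the earlier sketches):

```r
theta_bayes <- colMeans(THETA)               # posterior mean of theta
Sigma_bayes <- apply(SIGMA, c(1, 2), mean)   # elementwise posterior mean of Sigma
rbind(truth = theta, MLE = theta_hat, Bayes = theta_bayes)
```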
Part (d): Given \(y_{i2}\) values for 50 new (test) subjects, describe in as much detail as possible how you would use the Gibbs sampler output to predict the corresponding \(y_{i1}\) values from the “conditional posterior predictive distribution” of \((y_{i1} \mid y_{i2})\). (1 point)
You should view this as a “train \(\rightarrow\) test” prediction problem rather than a missing-data problem on the original data.
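The key fact is the bivariate normal conditional: given a posterior draw \((\boldsymbol{\theta}^{(s)}, \Sigma^{(s)})\) with entries \(\theta_j^{(s)}\) and \(\sigma_{jk}^{(s)}\), \[ y_{i1} \mid y_{i2} \sim \mathcal{N}\!\left(\theta_1^{(s)} + \frac{\sigma_{12}^{(s)}}{\sigma_{22}^{(s)}}\,(y_{i2} - \theta_2^{(s)}),\ \ \sigma_{11}^{(s)} - \frac{(\sigma_{12}^{(s)})^2}{\sigma_{22}^{(s)}}\right). \] A sketch of the prediction step (here y2_test is a hypothetical vector holding the 50 new \(y_{i2}\) values, and THETA, SIGMA are the saved Gibbs draws from part (b)):

```r
n_test  <- length(y2_test)
Y1_pred <- matrix(NA, S_iter, n_test)

for (s in 1:S_iter) {
  th <- THETA[s, ]; Sg <- SIGMA[, , s]
  cond_mean <- th[1] + (Sg[1, 2] / Sg[2, 2]) * (y2_test - th[2])
  cond_var  <- Sg[1, 1] - Sg[1, 2]^2 / Sg[2, 2]
  Y1_pred[s, ] <- rnorm(n_test, mean = cond_mean, sd = sqrt(cond_var))
}
## Column i of Y1_pred holds conditional posterior predictive draws of y_i1
```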
Hoff problem 7.4.
You can find the data mentioned in the question here: http://www2.stat.duke.edu/~pdh10/FCBS/Exercises/.
For 7.4(d), parts (ii) and (iii) are NOT required, but feel free to attempt them for practice; you are only required to submit answers to part (i). For 7.4(d)(i), there is no need to read or complete the entire Exercise 7.1. Simply rely on the form of the Jeffreys prior (from the class slides) and the corresponding full conditionals (from the discussion session).
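As a reminder, the form used in class should match the standard multivariate-normal Jeffreys prior (defer to the slides if they differ): \[\pi_J(\boldsymbol{\theta}, \Sigma) \propto |\Sigma|^{-(p+2)/2},\] under which the full conditionals are \(\boldsymbol{\theta} \mid \Sigma, \boldsymbol{y} \sim \mathcal{N}_p(\bar{\boldsymbol{y}}, \Sigma/n)\) and \(\Sigma \mid \boldsymbol{\theta}, \boldsymbol{y} \sim \text{inverse-Wishart}(n + 1, S_{\boldsymbol{\theta}}^{-1})\), where \(S_{\boldsymbol{\theta}} = \sum_{i=1}^n (\boldsymbol{y}_i - \boldsymbol{\theta})(\boldsymbol{y}_i - \boldsymbol{\theta})^T\).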
Rejection sampling (20 points). Recall from class that the idea of rejection sampling is to construct an importance/proposal/envelope function that is greater than or equal to the density we wish to sample from at all points, and that we can sample from easily. Suppose we end up with the following form for our posterior: \[f(\theta) \propto \sin^2(\pi \theta), \quad \theta \in [0,1],\] where \(\pi = 3.141593\). This density is a single bump on \([0,1]\), peaking at \(\theta = 1/2\) and vanishing at the endpoints.
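Although not required by the statement above, a minimal sketch of the idea in R may help. Since \(\int_0^1 \sin^2(\pi\theta)\,d\theta = 1/2\), the normalized target is \(f(\theta) = 2\sin^2(\pi\theta)\), which lies below \(M g(\theta)\) everywhere for the Uniform\((0,1)\) envelope \(g \equiv 1\) with \(M = 2\):

```r
## Normalized target density on [0, 1]
f <- function(theta) 2 * sin(pi * theta)^2

curve(f, from = 0, to = 1, xlab = "theta", ylab = "f(theta)")  # plot the density

## Rejection sampler: propose from g = Uniform(0,1),
## accept with probability f(theta) / (M * g(theta)) = f(theta) / 2
M <- 2
draws <- replicate(10000, {
  repeat {
    theta <- runif(1)                    # proposal from the envelope
    if (runif(1) <= f(theta) / M) break  # accept
  }
  theta
})
hist(draws, breaks = 50, freq = FALSE, main = "Rejection samples")
curve(f, add = TRUE)                     # overlay the target density
```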