Bayesian regression — part I

To illustrate Bayesian regression, I’ll use an example from Simon Sheather’s new book on linear regression.  A restaurant guide collects a number of variables from a group of Italian restaurants in New York City.  One is interested in predicting the Price of a meal based on Food (measure of quality), Decor (measure of decor of the restaurant), Service (measure of quality of service), and East (dummy variable indicating if the restaurant is east or west of 5th Avenue).

Here is the dataset in csv format:  http://www.stat.tamu.edu/~sheather/book/docs/datasets/nyc.csv

Assuming the usual regression model

y_i = x_i' beta + e_i,   e_i ~ iid N(0, sigma^2),   i = 1, ..., n,

suppose we assign the following prior to (beta, sigma^2):  we assume beta and sigma^2 are independent, with beta ~ N(beta0, B0^(-1)) and sigma^2 ~ Inverse-Gamma(c0/2, d0/2).  Here B0 is the precision matrix, the inverse of the variance-covariance matrix.

How do we obtain values of the parameters beta0, B0, c0, and d0?  We'll use a small part of the data to construct our prior, and then use the remaining part of the data to do inference and eventually some model selection.

First, I read in the data, select a random sample of rows, and partition the data into the two datasets italian1 and italian2.

italian=read.csv("http://www.stat.tamu.edu/~sheather/book/docs/datasets/nyc.csv")
indices=sample(168,size=25)
italian1=italian[indices,]
italian2=italian[-indices,]

I use the function blinreg in the LearnBayes package to simulate from the posterior using a vague prior.
library(LearnBayes)
y=italian1$Price
X=with(italian1,cbind(1,Food,Decor,Service,East))
fit=blinreg(y,X,5000)

From the simulated draws, I can now get estimates of my prior.  The parameters beta0 and B0 for the normal prior on beta are found by computing the sample mean and sample var-cov matrix of the simulated draws; the precision matrix B0 is the inverse of the var-cov matrix.  The parameters c0 and d0 correspond to the degrees of freedom and the sum of squared errors of my initial fit.

beta0=apply(fit$beta,2,mean)
Sigma0=cov(fit$beta)
B0=solve(Sigma0)
c0=25-5   # 25 observations minus 5 regression parameters
d0=sum((y-X%*%beta0)^2)
The function MCMCregress in the MCMCpack package will simulate from the posterior of (beta, sigma^2) using this informative prior.  It is pretty easy to use.  I show the function call including all of its arguments.  Note that I'm using the second part of the data, italian2; I input my prior through the b0, B0, c0, and d0 arguments; and I'm going to simulate 10,000 draws (through the mcmc=10000 input).
fit2=MCMCregress(Price~Food+Decor+Service+East,
data = italian2, burnin = 0, mcmc = 10000,
thin = 1, verbose = 0, seed = NA, beta.start = NA,
b0 = beta0, B0 = B0, c0 = c0, d0 = d0,
marginal.likelihood = c("Chib95"))
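Under the hood, this conditionally conjugate model can be sampled by Gibbs sampling, alternating a normal draw of beta given sigma^2 with an inverse-gamma draw of sigma^2 given beta.  Here is a minimal numpy sketch of such a sampler (my own illustration of the algorithm, not MCMCpack's code; the function name gibbs_regress is made up):

```python
import numpy as np

def gibbs_regress(y, X, b0, B0, c0, d0, n_iter=5000, seed=0):
    """Gibbs sampler for y ~ N(X beta, sigma^2 I) with priors
    beta ~ N(b0, B0^{-1}) and sigma^2 ~ Inverse-Gamma(c0/2, d0/2)."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    XtX, Xty = X.T @ X, X.T @ y
    beta, sig2 = np.zeros(p), 1.0
    beta_draws = np.empty((n_iter, p))
    sig2_draws = np.empty(n_iter)
    for t in range(n_iter):
        # beta | sigma^2, y ~ Normal(m, V): precision-weighted
        # combination of the prior and the least-squares information
        V = np.linalg.inv(B0 + XtX / sig2)
        m = V @ (B0 @ b0 + Xty / sig2)
        beta = rng.multivariate_normal(m, V)
        # sigma^2 | beta, y ~ Inverse-Gamma((c0 + n)/2, (d0 + SSE)/2)
        sse = np.sum((y - X @ beta) ** 2)
        sig2 = 1.0 / rng.gamma((c0 + n) / 2.0, 2.0 / (d0 + sse))
        beta_draws[t], sig2_draws[t] = beta, sig2
    return beta_draws, sig2_draws
```

The full-conditional for beta shows why B0 is passed as a precision matrix: it simply adds to the data precision X'X/sigma^2.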
The output of MCMCregress, fit2, is an MCMC object that one can easily summarize and display using the summary and plot methods in the coda package.
Suppose we are interested in estimating the mean price at a restaurant on the east side (East = 1) with Food = 20, Decor = 20, and Service = 20.  We define a column vector x that consists of these covariates.  Then we compute the linear predictor x' beta for each of the simulated beta vectors.  The vector lp contains 10,000 draws from the posterior of the mean price E(y | x).
x=matrix(c(1,20,20,20,1),5,1)
beta=fit2[,1:5]
sigma=sqrt(fit2[,6])
lp=c(beta%*%x)
Suppose next that we’re interested in predicting the price of a meal at the same covariates.  Since we have a sample from the posterior of the linear predictor, we just need to do an additional step to draw from the posterior predictive distribution — we store these values in ys.
ys=rnorm(10000,lp,sigma)
We illustrate the difference between estimating the mean price and predicting the price by summarizing the two sets of simulated draws.  The point estimates are similar, but, as expected, the predictive interval estimate is much wider.
summary(lp)
Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
43.59   46.44   47.00   46.99   47.53   50.17
summary(ys)
Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
25.10   43.12   46.90   46.93   50.75   67.82
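The widening has a simple explanation via the law of total variance: the predictive variance is the variance of the linear predictor plus the expected value of sigma^2.  A toy numpy sketch (using made-up posterior draws whose scales roughly mimic the summaries above, not the actual restaurant data) shows the effect:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

# Stand-in posterior draws: linear predictor centered near 47,
# residual standard deviation held fixed at 5.8 for simplicity.
lp = rng.normal(47.0, 0.8, size=n)    # posterior of the mean price
sigma = np.full(n, 5.8)               # posterior of sigma (fixed here)
ys = rng.normal(lp, sigma)            # posterior predictive draws

# Predictive spread = spread of the mean plus observation noise,
# so sd(ys) is close to sqrt(0.8^2 + 5.8^2), much larger than sd(lp).
print(lp.std(), ys.std())
```

The point estimates (means) of lp and ys agree, but the extra observation-level noise is why the predictive interval is so much wider than the interval for the mean price.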