### Archive

Archive for September, 2011

## Contour plots on R

In one of the homework problems, you are asked to do a Bayesian sensitivity analysis for a posterior probability with respect to the prior.   Specifically, suppose we are interested in learning about a proportion and we wish to compute the posterior probability $P(\theta > 0.5)$.  We observe the data (25 successes, 40 failures) and we want to know the sensitivity of the posterior probability with respect to the prior mean $\eta = a/(a+b)$ and the prior precision parameter $K = a + b$ where $a$ and $b$ are the beta shape parameters.

It is easy to construct a contour plot to illustrate this sensitivity analysis on R.

1.  First we set up a grid of values of $\eta$ and $K$:

eta = seq(.1, .9, by=.1)
K = c(1, 2, 8, 16, 32)

2.  Next we write a short function that computes the posterior probability $P(\theta > 0.5)$ as a function of  $\eta$ and $K$.

myfun=function(eta, K)
1 - pbeta(.5, K*eta+ 25, K*(1-eta) + 40)

3.  We now use the outer function to compute this probability on all combinations of  $\eta$ and $K$ — the result is a matrix where the rows correspond to $\eta$ and the columns to $K$.

PROB = outer(eta, K, myfun)

4.  Last, we use the contour function with inputs the two vectors and the matrix of probabilities.  We can specify which levels to draw on the contour graph.

contour(eta, K, PROB,
levels=c(0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.95),
xlab="eta", ylab="K", main="P(theta > 0.5)")

Categories: R work

## Prediction problem

In class, I described a simple Bayesian prediction problem.  One observes the number of fire calls in a small California city for 30 days — the total number of fire calls in this period is 158.  An administrator is interested in predicting the total number of calls in the next seven days.

We assume that y, the number of calls $y_i$ in a day, is Poisson with mean $\lambda$.  Suppose our prior beliefs about $\lambda$ are described by a normal density with mean 10 and standard deviation 2.

I first write a simple function compute.post that computes and plots the posterior (using the likelihood times prior recipe) on a grid between lo and hi.
compute.post=function(lo, hi,...){ lambda = seq(lo, hi, length=100) prior = dnorm(lambda, 10, 2) like = lambda^158 * exp(-30*lambda) post = prior * like plot(lambda, post,...) list(L = lambda, P = post) }

With some trial and error, the interval (3.3, 7.5) seems to bracket the posterior well.

S=compute.post(3.3, 7.5, type="l", col="red")

Next I take a simulated sample from the posterior by using the sample function.

sim.lambda = sample(S$L, 1000, replace=TRUE, prob=S$P)

Last, I simulate a sample from the posterior predictive density of z, the number of fire calls in seven days.  Since [z | $\lambda$] is Poisson(7 $\lambda$, we simulate z by (1) simulating $\lambda$ from the posterior and (2) simulating z from a Poisson(7 $\lambda$) distribution.  This is easy to do by using the vector properties of the random simulation functions.

sim.z = rpois(1000, 7*sim.lambda)

Here is a graph of the predictive distribution (by tabulating and plotting the simulated values of z).

Categories: Single parameter

## Brute force Bayes for one parameter

Although we talk a lot about conjugate analyses, one doesn’t need to restrict oneself to the use of conjugate priors.  Here we illustrate learning about a Poisson mean using a normal prior on the mean $\lambda$.

Suppose I observe the following number of fire calls for a particular community each week:  0, 0, 1, 3, 2, 2.  If we assume $y_i$, the number of fire calls in week $i$, is Poisson($\lambda$), then the likelihood function is given by

$L(\lambda) = \exp(- 6 \lambda) \lambda^8, \, \, \lambda>0$

Suppose I represent my prior beliefs about $\lambda$ by a normal curve with mean 3 and standard deviation 1.  Then the posterior is given by (up to an unknown proportionality constant) by

$g(\lambda | data) = L(\lambda) \times \exp(-(\lambda - 3)^2)$

Here is a simple brute-force method of summarizing this posterior.

1.  Choose a grid of $\lambda$ values that covers the region where the posterior is concentrated (this might take some trial and error).

2.  On this grid, compute the prior, likelihood, and posterior.

3.  Using the R sample function, take a large sample from the grid where the probabilities of the points are proportional to the like x prior values.

4.  Summarize this posterior simulated sample to learn about the location of the posterior.

Here’s some R code for this example.  I use the plot function to make sure the grid does cover the posterior.  The vector L contains 1000 draws from the posterior.

lambda = seq(0, 5, by=0.1)
like = exp(-6*lambda)*lambda^8
prior = dnorm(lambda, 3, 1)
post = like * prior
plot(lambda, post)
L = sample(lambda, size=1000, prob=post, replace=TRUE)
plot(density(L))  # density graph of simulated draws

Categories: Single parameter