Bayesian Thinking

Learning about a correlation — part IV


Currently we are introducing MCMC fitting of posterior distributions.  Here we illustrate the use of one popular algorithm, the Metropolis random walk chain, to simulate from an arbitrary posterior distribution.

Let’s return to the problem of learning about a correlation from standardized data.  We have programmed the logarithm of the posterior density in the function cor.sampling:

cor.sampling <- function(rho, data)
{
  x <- data[, 1]
  y <- data[, 2]
  # log posterior of rho for standardized bivariate data;
  # outside (-1, 1), return a huge negative value so the
  # posterior is essentially zero there
  ifelse(abs(rho) < 1,
         sum(-0.5 * log(1 - rho^2) -
               0.5 / (1 - rho^2) * (x^2 - 2 * rho * x * y + y^2)),
         -9.99e200)
}

Here the parameter rho falls in the interval (-1, 1), but I want to define the log posterior for all values of rho.  Note that in the ifelse command, I’m checking whether abs(rho) < 1; if it isn’t, then I’m defining the log posterior to be a very large negative value (so the posterior is essentially zero outside of that interval).
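
To make this post self-contained, here is one way to create a standardized data set dz for experimentation (the sample size 50 and the correlation 0.6 are illustrative assumptions; in this series dz holds the standardized data from the earlier posts) and to check the behavior of cor.sampling inside and outside the interval:

set.seed(123)
z1 <- rnorm(50)
z2 <- 0.6 * z1 + sqrt(1 - 0.6^2) * rnorm(50)
dz <- cbind(scale(z1), scale(z2))   # standardize each column

cor.sampling(0.5, dz)   # a finite log posterior value inside (-1, 1)
cor.sampling(1.5, dz)   # -9.99e200, so the posterior vanishes out here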

How does one implement a random walk Metropolis chain for this example?

1.  Recall that in this algorithm, you have a current simulated draw $\rho$ and one proposes a new simulated draw

$\rho^* = \rho + c\,Z,$

where $c$ is a scale parameter and $Z$ is a simulated draw from a normal distribution with mean 0 and standard deviation $\sqrt{v}$.  One has to choose a value of the scale parameter $c$ and the variance $v$.  (A one-step sketch in R follows this list.)

2.  Then when you implement the algorithm, you choose a starting parameter value (where the simulation begins) and the total number of simulation draws you collect.
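
To make the proposal-and-accept logic concrete, here is a minimal one-step sketch in R (the function metrop.step is mine for illustration, not part of LearnBayes):

metrop.step <- function(rho.cur, scale, var, data)
{
  # propose a new value with a normal random walk step
  rho.prop <- rho.cur + scale * rnorm(1, mean = 0, sd = sqrt(var))
  # log of the Metropolis acceptance ratio (the proposal is symmetric)
  log.ratio <- cor.sampling(rho.prop, data) - cor.sampling(rho.cur, data)
  if (log(runif(1)) < log.ratio) rho.prop else rho.cur
}

Repeating this step many times from a starting value is essentially what rwmetrop does for you.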

The function rwmetrop in the LearnBayes package has inputs:  (1) the name of the function that defines the log posterior, (2) a list containing values of $v$ and $c$ (the components var and scale), (3) the starting value of the parameter in the simulation, (4) the number of simulation draws to collect, and (5) any data used in the log posterior function.

Here I’m using the function cor.sampling; I’ll choose $v = .009$ and $c = 2$, start the simulation at the value 0.5, collect 10,000 simulation draws, and my data is stored in dz.

R1=rwmetrop(cor.sampling,list(var=.009,scale=2),.5,10000,dz)

The output is R1:  R1$par contains the simulated draws and R1$accept gives the acceptance rate of the algorithm.

Here’s some advice about using this function.

1.  Typically it is good to let the proposal have a standard deviation equal to twice the estimated standard deviation of the posterior.  So you let $v$ be an estimated posterior variance and set $c = 2$.

2.  How does one estimate the posterior variance?  One can use the normal approximation (computed by the function laplace) to estimate the posterior variance; a short sketch follows this list.

3.  What is a good acceptance rate?  A random walk Metropolis algorithm works well when the acceptance rate is in the 25-40% range.  You can monitor this value by displaying R1$accept.

4.  A trace plot of the simulated parameter values is helpful for seeing how well the simulated sample is exploring the posterior density.  One looks for a trend in this graph (which indicates the MCMC algorithm is slow to converge to the posterior) and for a snake-like appearance (which indicates poor mixing in the chain).  Trace plots and other diagnostics are available in the coda package.
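
Here is a sketch of step 2, assuming the cor.sampling function and the data dz defined above: laplace (also in LearnBayes) returns the posterior mode and an estimated posterior variance that can serve as $v$.

library(LearnBayes)
fit <- laplace(cor.sampling, 0.5, dz)   # normal approximation, started at 0.5
fit$mode                                # estimated posterior mode
fit$var                                 # estimated posterior variance, a candidate for v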

Let’s look at two runs of this algorithm, one with a good choice of scale and one with a poor choice.  The acceptance rate for the first run is 50% and for the second run is 94%.

library(coda)
R1 <- rwmetrop(cor.sampling, list(var = .009, scale = 2), .5, 10000, dz)
R1$accept
traceplot(mcmc(R1$par), main = "Good choice of scale")

R2 <- rwmetrop(cor.sampling, list(var = .009, scale = 0.2), .5, 10000, dz)
R2$accept
traceplot(mcmc(R2$par), main = "Poor choice of scale")
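
Once a run looks satisfactory, the draws in R1$par can be summarized in the usual way.  For example (a sketch):

rho.draws <- as.vector(R1$par)       # simulated draws of rho
mean(rho.draws)                      # posterior mean
sd(rho.draws)                        # posterior standard deviation
quantile(rho.draws, c(.025, .975))   # 95% posterior interval estimate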
