## Help with LearnBayes?

We starting to get into the computational part of my Bayesian class, so I thought it would be helpful to point out some of the ways to get help in the special R functions in LearnBayes.

**Using R scripts from my website**

In the folder http://bayes.bgsu.edu/bcwr/R%20scripts/ I show the R scripts for all of the examples in the 2nd edition of the book. You can copy and paste these scripts directly into the R console. Or you can use the source command to bring these examples into R. Here is an example where I illustrate the multinomial example from Chapter 4.

source(“http://bayes.bgsu.edu/bcwr/R%20scripts/Chapter.4.3.R”)

**Using the R demo function:**

Also, all of the R scripts in the examples are available as part of the LearnBayes package.

Once you have loaded in the LearnBayes package, you can see a listing of all of these examples by typing

demo()

If I want to run the multinomial demo from Chapter 4, section 3, type

demo(“Chapter.4.3”)

You may get some error messages if you run these demos on a Macintosh. I use special R Windows functions so you get to pause the graphical output. I suspect there are better ways of writing these demos to adapt to the particular environment (Mac or Windows), but I haven’t figured it out yet.

## New version of LearnBayes

Based on my experience teaching Bayes this fall, I’ve added some new functions to LearnBayes. The new version, LearnBayes 1.20, is available from the book website http://bayes.bgsu.edu/bcwr

One simple way of extending conjugate priors is by the use of discrete mixtures. There are three new functions, binomial.beta.mix, poisson.gamma.mix, and normal.normal.mix, that do the posterior computations for binomial, Poisson, and normal problems using mixtures of conjugate priors.

Here I’ll illustrate one of the functions for Poisson sampling. Suppose we are interested in learning about the home run rate for Derek Jeter before the start of the 2004 season. (A home run rate is the proportion of official at-bats that are home runs.) Suppose our prior beliefs are that the median of is equal to 0.05 and the 90th percentile is equal to 0.081.

Here are two priors that match this information. The first is a conjugate Gamma prior and the second is a mixture of two conjugate Gamma priors.

Prior 1: Gamma(shape = 6, rate = 113.5)

Prior 2: 0.88 x Gamma(shape = 10, rate = 193.5) + 0.12 x Gamma(shape = 0.2, rate = 0.415)

I check below that these two priors match the prior information.

pgamma(c(.05,.081),shape=6,rate=113.5)

[1] 0.5008145 0.8955648

probs[1]*pgamma(c(.05,.081),shape=gammapar[1,1],rate=gammapar[1,2])+

probs[2]*pgamma(c(.05,.081),shape=gammapar[2,1],rate=gammapar[2,2])

[1] 0.5007136 0.9012596

Graphs of the two priors are shown below. They look similar, but the mixture prior has flatter tails.

Now we observe some data — the number of at-bats and home runs hit by Jeter in the 2004 season. This data is contained in the dataset jeter2004 contained in the LearnBayes pacakge.

library(LearnBayes)

data(jeter2004)

The function poisson.gamma.mix will compute the posterior for the mixture prior. The inputs are prior, the vector of prior probabilities of the components, gammapar, a matrix of the gamma parameters for the components, and data, a list with components y (the observed counts) and t (the corresponding time intervals).

probs=c(.88,.12)

gammapar=rbind(c(10,193.5),c(.2,.415)

data=list(y=jeter2004$HR,t=jeter2004$AB)

Now we can run the function.

poisson.gamma.mix(probs,gammapar,data)

$probs

[1] 0.98150453 0.01849547

$gammapar

[,1] [,2]

[1,] 33.0 836.500

[2,] 23.2 643.415

We see from the output that the posterior for is distributed

0.98 x Gamma(33.0, 836.5) + 0.02 x Gamma(23.3, 643.4)

Does the choice of prior make a difference here? The following figure displays the posteriors for the two priors. They look similar, indicating that inference about is robust to the choice of prior.

But this robustness will depend on the observed data. Suppose that Jeter is a “steriod slugger” during the 2004 season and hits 70 home runs in 500 at-bats.

We run the function poisson.gamma.mix for these data.

poisson.gamma.mix(probs,gammapar,list(y=70,t=500))

$probs

[1] 0.227754 0.772246

$gammapar

[,1] [,2]

[1,] 80.0 693.500

[2,] 70.2 500.415

Here the posterior is

0.23 x gamma(80, 693.5) + 0.77 x gamma(70.2, 500.4)

I’ve graphed the two posteriors from the two priors. Here we see that the two posteriors are significantly different, indicating that the inference depends on the choice of prior.

## No more loops!

One aspect of the LearnBayes package that I’ve been uncomfortable with is the use of loops in my definition of the functions defining log posteriors. I had a reason for using the loops (the theta argument to the function could be a matrix), but it is generally bad programming practice. The worst aspect of the looping is that it makes the process of a writing a posterior more difficult than it really is. Looping is bad from the user’s perspective. After all, we are teaching statistics not programming, and I want to encourage people to code the posteriors for their problems using R.

Here is a simple example. Suppose your model is that y1,…, yn are independent Cauchy with location mu and scale sigma. The log posterior is given by

log g = sum (log f),

where log f is the log of the Cauchy density conditional on parameters. My old way of programming the posterior had the loop

for (i in 1:length(data))

val = val + log(dt((data[i] – mu)/sigma, df = 1)/sigma)

I think this new way is preferable. First you define the function logf for a single observation y:

logf=function(y,mu,sigma) log(dt((y-mu)/sigma,df=1)/sigma)

Then the log posterior is given by

sum(logf(data,mu,sigma))

Anyway, I think that by avoiding loops, the function for the log posterior becomes more transparent.

The new version of the LearnBayes package will contain fewer loops.

## Using a Mixture of Conjugate Priors

One way to extend the use of conjugate priors is by the use of discrete mixtures. Here is a simple example. Suppose you have a coin that you believe may be fair or biased towards heads. If p represents the probability of flipping heads, then suppose your prior is

g(p) = 0.5 beta(p, 20, 20) + 0.5 beta(p, 30, 10)

Suppose that we flip the coin 30 and get 20 heads. It can be shown that the posterior density can also be represented by a mixture of beta distributions.

I have written a short R function pbetamix that computes the posterior distribution when a proportion has a prior that is a mixture of beta distributions.

The matrix bpar contains the beta parameters where each row gives the beta parameters for each component. The vector prob gives the prior probabilities of the components. The vector data contains the number of successes and failures. Here are the values of these components for the example.

> bpar=rbind(c(20,20),c(30,10))

> prob=c(.5,.5)

> data=c(20,10)

We use the function pbetamix with the inputs prob, bpar, and data. The output is the posterior probabilities of the components and the beta parameters of each component.

> pbetamix(prob,bpar,data)

$probs

[1] 0.3457597 0.6542403

$betapar

[,1] [,2]

[1,] 40 30

[2,] 50 20

The posterior density is

g(p | data) = 0.346 beta(p, 40, 30) + 0.654 beta(p, 50, 20)

We plot the prior and posterior densities

> plot(p,.5*dbeta(p,30,30)+.5*dbeta(p,30,10),type=”l”,ylim=c(0,5),col=”red”,lwd=2)

> lines(p,.346*dbeta(p,40,30)+.654*dbeta(p,50,20),lwd=2,col=”blue”)

> text(locator(n=1),”PRIOR”)

> text(locator(n=1),”POSTERIOR”)

Here the data (20 heads and 10 tails) is more supportive of the belief that the coin is biased and the posterior places more of its mass where p > .5.