## Inference for a Poisson Rate

In 2006, Ryan Howard had 58 home runs in 581 at bats. Suppose the number of home runs y is distributed Poisson with mean ab lambda, where lambda is the true home run rate. If we assume the usual noninformative prior, the posterior for lambda is Gamma with shape alpha=y and rate ab.

To learn about lambda, we simulate 1000 draws from the Gamma posterior. We construct a density estimate of the simulated draws and then find a 90% interval estimate by finding the 5th and 95th percentiles of the draws.

> y=58

> t=581

> lambda=rgamma(1000,shape=y,rate=t)

> plot(density(lambda),xlab=”LAMBDA”,main=”POSTERIOR”)

> quantile(lambda,c(.05,.95))

5% 95%

0.07897515 0.12170184

We see that a 90% interval estimate for lambda is (.079, .122).

Next, suppose that Howard has 550 at-bats in 2007 — how many home runs will he hit? We learn about the future number of home runs ys by use of the posterior predictive distribution. In the R output, we (1) simulate 1000 draws from the posterior predictive distribution, (2) tabulate the values by use of the table function, (3) construct a graph of this distribution, and (4) construct a 90% interval estimate for ys by use of the discint function (from the LearnBayes package).

> ys=rpois(1000,500*lambda)

> T=table(ys)

> plot(T,main=”POSTERIOR PREDICTIVE”)

>

> ys=as.integer(names(T))

> freq=as.integer(T)

> freq=freq/sum(freq)

> dist=cbind(ys,freq)

> discint(dist,.9)

$prob

[1] 0.905

$set

[1] 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

[26] 61 63 64 65 66 67

## Plot of two distributions for proportion inference

In tomorrow’s class, I’ll hand out the graph below that shows a Bayesian “triplot” (showing the likelihood, prior, and posterior) and the prior predictive distribution. The triplot is useful to show how the the two types of information (data and prior) are combined. The predictive plot is helpful in judging the suitability of the Bayesian model.

I wrote two short R functions, triplot and predplot, that construct the graphs. I assume that you have already loaded the LearnBayes package.

> library(LearnBayes)

> source(url(“http://bayes.bgsu.edu/m648/triplot.R”))

> source(url(“http://bayes.bgsu.edu/m648/predplot.R”))

> prior=c(6.8,2.5)

> data=c(9,15)

> n=sum(data)

> par(mfrow=c(2,1))

> triplot(prior,data)

> predplot(prior,n,data[1])

## R computations for a proportion using a beta prior

Today we talked about using a beta prior to learn about a proportion. Inference about p is done by use of the beta posterior distribution and prediction about future samples is done by means of the predictive distribution.

Here are the R computations for the cell-phone example. I’ll first illustrate inference for the proportion p, and then I’ll illustrate the use of the special function pbetap (in the LearnBayes package) to compute the beta-binomial predictive distribution to learn about the number of successes in a future sample.

> library(LearnBayes)

>

> a=6.8; b=2.5 # parameters of beta prior

> n=24; y=9 # sample size and number of yes’s in sample

>

> a1=a+y; b1=b+n-y # parameters of beta posterior

>

> # I’ll illustrate different types of inferences

>

> # a point estimate is given by the posterior mean

> a1/(a1+b1)

[1] 0.4744745

>

> # or you could find the posterior median

> qbeta(.5,a1,b1)

[1] 0.4739574

>

> # a 90% interval estimate is found by use of the 5th and 95th quantiles

> # of the beta curve

> qbeta(c(.05,.95),a1,b1)

[1] 0.3348724 0.6158472

>

> # we illustrate prediction by use of the function pbetap

> # suppose we take a future sample of 20

> # how many will be driving when using a cell phone?

>

> m=20; ys=0:m

> pred.probs=pbetap(c(a1,b1),m,ys)

>

> # display the predictive probabilities

>

> cbind(ys,pred.probs)

ys pred.probs

[1,] 0 7.443708e-05

[2,] 1 6.444416e-04

[3,] 2 2.897264e-03

[4,] 3 8.968922e-03

[5,] 4 2.139155e-02

[6,] 5 4.170364e-02

[7,] 6 6.884411e-02

[8,] 7 9.841322e-02

[9,] 8 1.236003e-01

[10,] 9 1.376228e-01

[11,] 10 1.365218e-01

[12,] 11 1.208324e-01

[13,] 12 9.524434e-02

[14,] 13 6.650657e-02

[15,] 14 4.075296e-02

[16,] 15 2.159001e-02

[17,] 16 9.665296e-03

[18,] 17 3.527764e-03

[19,] 18 9.889799e-04

[20,] 19 1.901993e-04

[21,] 20 1.891124e-05

>

> # what is the probability that there are at least

> # 10 cell phone drivers in my sample?

>

> sum(pred.probs*(ys>=10))

[1] 0.4958392

>