Bayesian Thinking

Poisson mean example, part II


In the last post, I described the process of determining a prior for , my son’s average number of text messages per day.   I decided to model my beliefs with a gamma(a, b) density with a = 44.4 and b = 4.4.

Now I observe some data.  I looked at the online record of text messaging for the first seven days that’s been at school and observe the counts

Sat  Sun  Mon  Tue  Wed  Thu  Fri
19     4      26      17    15       0    17

If we assume these counts are Poisson with mean , then the likelihood function is given by

L(lambda) =  lambda^s  exp(-n lambda),

where is the number of observations (7) and is the sum of the observations ( 98).

The posterior density (using the prior times likelihood recipe) is given by

L(lambda) x g(lambda) = lambda^{a+s-1}  exp(-(b+n) lambda),

which we recognize as a gamma density with shape and rate .

Using the following R commands, we construct a triplot (prior, likelihood, and posterior).

> a=44.4; b=4.4
> s=sum(y); n=length(y)

> a=44.4; b=4.4
> s=sum(y); n=length(y)
> curve(dgamma(x,shape=s+a,rate=n+b),col=”red”,xlab=”LAMBDA”,
+   ylab=”DENSITY”,lwd=3,from=3,to=25)
> curve(dgamma(x,shape=a,rate=b),col=”blue”,lwd=3,add=TRUE)
> curve(dgamma(x,shape=s+1,rate=n),col=”green”,lwd=3,add=TRUE)
> legend(“topright”,c(“PRIOR”,”LIKELIHOOD”,”POSTERIOR”),
+  col=c(“blue”,”green”,”red”),lty=1,lwd=3)

Note that (1) the posterior is a compromise between the data information (likelihood) and the prior  and (2) the posterior is more precise since we are combining two sources of information.
I would like to predict my son’s text message use in the next month.  Let denote the number of text messages in the next month.  The total number of messages has, conditional on , a Poisson distribution with mean .
I am interested in computing the posterior predictive  distribution for .  One can simulate this distribution by (1) simulating a value of the parameter from the gamma(142.4, 11.4) posterior and then (2) simulating from a Poisson distribution with mean .
We can do this simulation in R in two lines.   In the last line, I tabulate the values of and plot the counts.
> lambda=rgamma(1000,shape=142.4,rate=11.4)
> ys=rpois(1000,30*lambda)
> plot(table(ys),ylab=”FREQUENCY”)
Looking at the graph, it is likely that my son will have between 310 and 450 text messages in the next 30 days.