In Chapter 11 of BCWR, I describe an analysis of a famous dataset, the counts of British coal mining disasters described in Carlin et al (1992). We observe the number of disasters
This model can be fit by the use of Gibbs sampling through the introduction of latent data. For each year, one introduces a state
In Chapter 11, I illustrate the use of WinBUGS and the R interface to simulate from this model. MCMCpack also offers a R function MCMCpoissonChangepoint to fit from this model which I’ll illustrate here.
First we load in the MCMC package.
library(MCMCpack)
We load in the disaster numbers in a vector data.
data=c(4,5,4,1,0,4,3,4,0,6,
3,3,4,0,2,6,3,3,5,4,5,3,1,4,4,1,5,5,3,4,2,5,2,2,3,4,2,1,3,2,
1,1,1,1,1,3,0,0,1,0,1,1,0,0,3,1,0,3,2,2,
0,1,1,1,0,1,0,1,0,0,0,2,1,0,0,0,1,1,0,2,
2,3,1,1,2,1,1,1,1,2,4,2,0,0,0,1,4,0,0,0,
1,0,0,0,0,0,1,0,0,1,0,0)
Suppose we decide to assign gamma(c0, d0) priors on each Poisson mean where c0=1 and d0=1. Then we fit this changepoint model simply by running the function MCMCpoissonChangepoint:
fit=MCMCpoissonChangepoint(data, m = 1, c0 = 1, d0 = 1,
burnin = 10000, mcmc = 10000)
I have included the important arguments: data is obviously the vector of counts, m is the number of unknown changepoints (here m = 1), c0, d0 are the gamma prior parameters, we choose to have a burnin period of 10,000 iterations, and then collect the following 10,000 iterations.
MCMCpack includes several graphical and numerical summaries of the MCMC output: plot(fit), summary(fit), plotState(fit), and plotChangepoint(fit).
plot(fit) shows trace plots and density estimates for the two Poisson means.
summary(fit) gives you summary statistics, including suitable standard errors, for each Poisson mean
Iterations = 10001:20000
Thinning interval = 1
Number of chains = 1
Sample size per chain = 10000
1. Empirical mean and standard deviation for each variable,
plus standard error of the mean:
Mean SD Naive SE Time-series SE
lambda.1 3.0799 0.2870 0.002870 0.002804
lambda.2 0.8935 0.1130 0.001130 0.001170
2. Quantiles for each variable:
2.5% 25% 50% 75% 97.5%
lambda.1 2.5411 2.8861 3.0660 3.271 3.667
lambda.2 0.6853 0.8153 0.8895 0.966 1.130
plotState(fit) – this shows the probability that the process falls in each of the two states for all years
plotChangepoint(fit) — this displays the posterior distribution of the changepoint location.