## Three Sampling Models for Fire Calls

Continuing our example, suppose our prior beliefs about the mean count of fire calls λ are described by a Gamma(280, 4) distribution. (Essentially this says that our prior guess at λ is 70 and the prior standard deviation is about 4.2.) But we’re unsure about the sampling model — it could be (M1) exponential with mean λ, (M2) Poisson(λ), or (M3) normal with mean λ and standard deviation 12.
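Those prior summaries follow directly from the shape/rate parameterization of the gamma: the mean is a/b and the standard deviation is sqrt(a)/b. A quick sketch of the arithmetic in Python:

```python
from math import sqrt

# Gamma(shape=a, rate=b) prior on the mean fire-call count
a, b = 280, 4
prior_mean = a / b        # a/b = 70
prior_sd = sqrt(a) / b    # sqrt(280)/4, about 4.18

print(prior_mean, round(prior_sd, 2))
```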

To get some sense about the best sampling model, I’ve plotted a histogram of the fire call counts below. I’ve overlaid fitted exponential, Poisson, and normal distributions where I estimate λ by the sample mean.

I think it is pretty clear from the graph that the exponential model is a poor fit. The fits of the Poisson and normal (sd = 12) models are similar, although I’d give a slight preference to the normal model.

For each model, I compute the logarithm of the predictive probability f(y | M). (I wrote a short function defining the log posterior of λ and then used the laplace function in the LearnBayes package to compute log f(y | M).)

Here are the results:

Model         log f(y | M)
--------------------------
Poisson         -148.0368
exponential     -194.3483
normal(12)      -144.3027
--------------------------

The exponential model is a clear loser and the Poisson and normal(12) models are close. The Bayes factor in support of the normal(12) model over the Poisson model is exp(-144.3027 - (-148.0368)) = exp(3.7341), or about 42.
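The Bayes factor is just the exponentiated difference of log marginal densities. A quick sketch in Python using the values from the table:

```python
from math import exp

# log marginal densities log f(y | M) from the table
log_m_poisson = -148.0368
log_m_exponential = -194.3483
log_m_normal12 = -144.3027

# Bayes factor in support of the normal(12) model over the Poisson model
BF_normal_vs_poisson = exp(log_m_normal12 - log_m_poisson)
print(round(BF_normal_vs_poisson, 1))  # roughly 42
```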

## Modeling fire alarm counts

Sorry for the long delay since my last post. It seems that one needs to keep a regular posting pattern, say several days a week, to keep this blog active.

We’re now talking about comparing models by Bayes factors. To motivate the discussion of plausible models, I found the following website that gives the number of fire calls for each month in Franklinville, NC for the last several years.

Suppose we observe the fire call counts y_1, ..., y_N for N consecutive months. Here is a general model for these data.

- y_1, ..., y_N are independent from a sampling density f(y | λ)
- λ has a prior g(λ)
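For the Poisson/gamma combination in the previous post, the marginal density m(y) is actually available in closed form (the gamma prior is conjugate), which gives a useful check on the laplace approximation. Here is a sketch in Python; the monthly counts below are hypothetical, since the Franklinville data are not reproduced in the post:

```python
from math import lgamma, log

def log_marginal_poisson_gamma(y, a, b):
    """Exact log m(y) for y_i ~ Poisson(lambda), lambda ~ Gamma(a, rate=b).

    m(y) = [b^a / Gamma(a)] * Gamma(a + sum(y)) / (b + n)^(a + sum(y)) / prod(y_i!)
    """
    n, s = len(y), sum(y)
    return (a * log(b) - lgamma(a)
            + lgamma(a + s) - (a + s) * log(b + n)
            - sum(lgamma(yi + 1) for yi in y))

# hypothetical monthly fire-call counts
y = [64, 72, 70, 75, 66, 68]
val = log_marginal_poisson_gamma(y, a=280, b=4)
print(round(val, 2))  # roughly -19
```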

## Are birth weights normal?

A few weeks ago, I asked my class for an example of data that would be normally distributed and someone mentioned weights of people.

Are weights, more specifically birthweights of newborns, normally distributed?

Peter Dunn in a JSE article here analyzes statistics for 44 babies born in a single day in an Australian hospital. The article has a link to the dataset babyboom.dat that contains the data.

If we construct a histogram of the birthweights, we get the following display.

We see some left skewness in this distribution — does this indicate that our normality assumption is false?

One can check the normality assumption by means of the posterior predictive distribution. Here’s an outline of the approach. (There are special functions in the LearnBayes package that make this process simple to implement.)

1. First we fit the normal(μ, σ²) model using the traditional noninformative prior proportional to 1/σ². A simulated sample from the posterior is found using the function normpostsim.

fit=normpostsim(birth.weight,10000)

2. Next we think of an appropriate testing function. Since the birth weights don’t appear symmetric, a natural checking function is the sample skewness. We write a short function “skew” to compute this statistic.

skew=function(y) sum((y-mean(y))^3)/sd(y)^3/(length(y)-1)
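The whole predictive check has a simple simulation recipe under this noninformative prior: draw σ² from a scaled inverse chi-square, draw μ normal given σ², simulate a replicated dataset, and compare its skewness to the observed value. Here is a sketch in Python with hypothetical left-skewed "weights," since the babyboom data are not reproduced here:

```python
import random, statistics

random.seed(1)

def skew(y):
    # sample skewness, matching the form of the R skew function above
    m, s, n = statistics.mean(y), statistics.stdev(y), len(y)
    return sum((yi - m) ** 3 for yi in y) / s ** 3 / (n - 1)

# hypothetical left-skewed birth weights (grams)
y = [random.gauss(3500, 400) - random.expovariate(1 / 600) for _ in range(44)]
n, ybar = len(y), statistics.mean(y)
S = sum((yi - ybar) ** 2 for yi in y)

obs = skew(y)
reps = []
for _ in range(2000):
    # posterior draw under the prior proportional to 1/sigma^2
    sigma2 = S / random.gammavariate((n - 1) / 2, 2)  # S / chisq_{n-1}
    mu = random.gauss(ybar, (sigma2 / n) ** 0.5)
    # replicated dataset from the posterior predictive distribution
    yrep = [random.gauss(mu, sigma2 ** 0.5) for _ in range(n)]
    reps.append(skew(yrep))

# predictive p-value: how often the replicated skewness is as small as observed
pval = sum(r <= obs for r in reps) / len(reps)
print(round(obs, 2), round(pval, 3))
```

A small p-value indicates the observed left skewness is hard to reproduce under the normal model.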

## A “robustness” illustration of Bayes factors

Here is another illustration of Bayes factors. We are worried about the possibility of outliers and so we assume the observations are distributed from a t distribution with location μ, scale σ, and a fixed number of degrees of freedom. We have some prior beliefs about the location and scale: we assume μ and log σ are independent, with μ ~ N(100, 10) and log σ ~ N(0, 2).

We observe the following data:

y=c(112.6,86.0,99.1,106.3,101.5,80.5,99.6,108.6,95.9)

The log posterior is programmed in the following function, where the list stuff contains the data and the degrees of freedom:

logpost=function(theta,stuff)
{
mu=theta[1]
sigma=exp(theta[2])
y=stuff$y
df=stuff$df
loglike=sum(dt((y-mu)/sigma, df, log=TRUE)-log(sigma))
logprior=dnorm(mu,100,10,log=TRUE)+dnorm(log(sigma),0,2,log=TRUE)
loglike+logprior
}
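This log posterior is exactly the input a Laplace approximation needs. Here is a sketch in Python, with numpy/scipy standing in for the LearnBayes laplace function; taking 4 degrees of freedom is my assumption, since the post does not fix a value:

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import t, norm

y = np.array([112.6, 86.0, 99.1, 106.3, 101.5, 80.5, 99.6, 108.6, 95.9])
df = 4  # assumed degrees of freedom

def neg_log_post(theta):
    mu, log_sigma = theta
    sigma = np.exp(log_sigma)
    loglike = np.sum(t.logpdf((y - mu) / sigma, df) - log_sigma)
    logprior = norm.logpdf(mu, 100, 10) + norm.logpdf(log_sigma, 0, 2)
    return -(loglike + logprior)

fit = minimize(neg_log_post, x0=[100.0, 2.0])
mode = fit.x
# Laplace approximation: log m(y) ~ (d/2) log(2 pi) + 0.5 log|V| + log h(mode),
# where V is the inverse Hessian (here the BFGS inverse-Hessian estimate)
H_inv = fit.hess_inv
log_m = (len(mode) / 2) * np.log(2 * np.pi) \
        + 0.5 * np.log(np.linalg.det(H_inv)) - fit.fun
print(np.round(mode, 2), round(float(log_m), 2))
```

The posterior mode of μ lands near the sample mean of about 99, pulled slightly toward the N(100, 10) prior.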

## A Bayesian test of the equality of proportions

One problem with a p-value is that it is hard to interpret if you don’t know the sample size. Andy Gelman talks about this problem on his blog.

Here I’ll contrast the message communicated by a p-value and a Bayesian test by looking at the following three tables:

Table A: [32 18; 18 32]

Table B: [272 228; 228 272]

Table C: [2570 2430; 2430 2570]

Here are the outputs for Tables A, B, and C, respectively:

[1] 0.9436246

[1] -0.3035413

[1] -1.411366
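The point of these three tables is that they carry essentially the same frequentist evidence even though the Bayesian answers diverge as the sample size grows. A quick check of the pooled two-proportion z statistic in Python makes that concrete:

```python
from math import sqrt, erf

def z_and_p(a, b, c, d):
    """Pooled two-proportion z test for the 2x2 table [a b; c d] (rows are groups)."""
    n1, n2 = a + b, c + d
    p1, p2 = a / n1, c / n2
    p = (a + c) / (n1 + n2)  # pooled proportion
    z = (p1 - p2) / sqrt(p * (1 - p) * (1 / n1 + 1 / n2))
    pval = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))  # two-sided normal p-value
    return z, pval

for label, tab in [("A", (32, 18, 18, 32)),
                   ("B", (272, 228, 228, 272)),
                   ("C", (2570, 2430, 2430, 2570))]:
    z, pval = z_and_p(*tab)
    print(label, round(z, 2), round(pval, 4))  # z near 2.8, p near 0.005 each time
```

All three tables give z near 2.8 and a p-value near 0.005; only the sample size changes.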

## Bayesian testing of a point null hypothesis

Here is today’s example illustrating a Bayesian test of a point null hypothesis. In Agresti and Franklin’s intro stats book, there is an example wondering if Americans work, on average, 40 hours per week. If μ represents the mean hours worked per week for all Americans, we are interested in testing H: μ = 40 against the alternative A: μ ≠ 40.

In the example in Agresti and Franklin’s text, a sample of 868 workers was selected; one observes that the workers worked an average of 39.11 hours, and one assumes the population standard deviation is equal to 14. The test statistic is

z = (39.11 - 40)/(14/sqrt(868)) = -1.87

and one computes a p-value of 0.061. There seems to be moderately significant evidence that the mean working hours for all workers is different from 40.

Here’s an outline of a Bayesian test (I give more details in class):

1. We assign probabilities to the hypotheses H and A. Here we assume each hypothesis is equally likely.

2. Under the hypothesis H, the prior for μ is a point mass at the hypothesized value of 40.

3. Under the alternative hypothesis, we’ll place a N(40, τ) prior on μ. This reflects the belief that, under the alternative, values of μ near 40 are more likely than values far from 40.

Under these assumptions, we give (in class) the posterior probability of the hypothesis H. This probability is programmed in the function mnormt.twosided in the LearnBayes package. The inputs to this function are (1) the value to be tested, (2) the prior probability of H, (3) a vector of values of the prior standard deviation τ, and (4) a data vector consisting of the sample mean, the sample size, and the known population standard deviation. The outputs are the Bayes factor in support of H and the posterior probability of H.

Since it is not clear how to specify the prior standard deviation τ, we compute the posterior probability for a range of values of τ. I write a short function to output the posterior probability as a function of τ and then I use the curve function to plot this probability for values of τ from 0.01 to 2.

What we see is that the posterior probability of H exceeds 0.35 for all values of τ. This suggests that the p-value overstates the evidence against the null hypothesis.

> data=c(39.11, 868, 14)
> prob.H=function(tau)
+ mnormt.twosided(40,.5,tau,data)$post
> curve(prob.H(x),from=.01,to=2,xlab="TAU",ylab="P(H)")
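The quantity being plotted has a simple closed form in this known-σ setup: under H the marginal density of the sample mean is N(40, σ²/n), and under A it is N(40, σ²/n + τ²). A sketch of that closed form in Python:

```python
from math import pi, exp, log

ybar, n, sigma = 39.11, 868, 14.0
se2 = sigma ** 2 / n  # sampling variance of the sample mean

def norm_logpdf(x, mean, var):
    return -0.5 * log(2 * pi * var) - (x - mean) ** 2 / (2 * var)

def post_prob_H(tau, prior_H=0.5):
    # Bayes factor BF = f(ybar | H) / f(ybar | A), then posterior probability of H
    log_bf = norm_logpdf(ybar, 40, se2) - norm_logpdf(ybar, 40, se2 + tau ** 2)
    bf = exp(log_bf)
    return prior_H * bf / (prior_H * bf + 1 - prior_H)

taus = [0.01 + 0.01 * i for i in range(200)]  # tau from 0.01 to 2
probs = [post_prob_H(t) for t in taus]
print(round(min(probs), 3))  # about 0.35 at its lowest
```

Even at its least favorable τ the posterior probability of H stays near 0.35, far larger than the 0.061 p-value might suggest.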

## Comparing sampling models by Bayes factors

In the last posting, I illustrated fitting a t(4) sampling model to some baseball data where I suspected there was an outlier. To compare this model to other sampling models, we can use Bayes factors.

Here is a t sampling model with a convenient choice of prior.

1. y_1, ..., y_n is a random sample from a t distribution with location μ, scale σ, and a fixed number of degrees of freedom

2. μ and log σ are independent, with μ distributed N(mu.mean, mu.sd) and log σ distributed N(lsigma.mean, lsigma.sd)

I define an R function tsampling.R that computes the logarithm of the posterior. Note that I am careful to include all of the normalizing constants, since I am primarily interested in computing the marginal density of y.

tsampling=function(theta,datapar)
{
mu=theta[1]; logsigma=theta[2]; sigma=exp(logsigma)
y=datapar$y; df=datapar$df
mu.mean=datapar$mu.mean; mu.sd=datapar$mu.sd
lsigma.mean=datapar$lsigma.mean; lsigma.sd=datapar$lsigma.sd
loglike=sum(dt((y-mu)/sigma,df,log=TRUE)-log(sigma))
logprior=dnorm(mu,mu.mean,mu.sd,log=TRUE)+
dnorm(logsigma,lsigma.mean,lsigma.sd,log=TRUE)
return(loglike+logprior)
}

We use this function together with the function laplace to compute the log marginal density for two models.
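For reference, the quantity returned in the int component of laplace is the standard Laplace approximation to the log marginal density. Writing $h(\theta)$ for the product of likelihood and prior, $\hat\theta$ for its mode, and $V$ for the inverse of the negative Hessian of $\log h$ at the mode,

$$\log m(y) \approx \frac{d}{2}\log(2\pi) + \frac{1}{2}\log|V| + \log h(\hat\theta),$$

where $d = 2$ is the dimension of $\theta = (\mu, \log\sigma)$.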

Model 1 — t(4) sampling, μ is normal(90, 20), log σ is normal(1, 1).

Model 2 — t(30) sampling, μ is normal(90, 20), log σ is normal(1, 1).

Note that I’m using the same prior for both models — the only difference is the choice of degrees of freedom in the sampling density.

dataprior1=list(y=y, df=4, mu.mean=90, mu.sd=20,
lsigma.mean=1, lsigma.sd=1)
log.m1=laplace(tsampling,c(80, 3), dataprior1)$int

dataprior2=list(y=y, df=30, mu.mean=90, mu.sd=20,
lsigma.mean=1, lsigma.sd=1)
log.m2=laplace(tsampling,c(80, 3), dataprior2)$int

BF.12=exp(log.m1-log.m2)
BF.12
[1] 14.8463

We see that there is substantial support for the t(4) model over the “close to normal” t(30) model.