Home run hitting — priors?

At our department’s colloquium last week, there was an interesting presentation on the use of an exchangeable prior in a regression context. The analysis was done in WinBUGS with some standard choices of “weakly informative” priors. That raises the natural question: how did you decide on this prior?

Let’s return to the home run hitting example, where an exchangeable prior was placed on the set of home run hitting probabilities. We assumed that p_1, ..., p_k were iid B(\eta, K), a beta distribution with mean \eta and precision K. The hyperparameters \eta and K were assumed independent: \eta was assigned the prior \eta^{-1}(1-\eta)^{-1}, and K was assigned the log-logistic form g(K) = 1/(K+1)^2.

When I described these priors, I just said that they were “noninformative” and didn’t explain how I obtained them.

The main questions are:

1.  Does the choice of vague prior matter?
2.  If the answer to question 1 is “yes”, what should we do?

When one fits an exchangeable model, one is shrinking the observed rates y_i/n_i towards a common value. The posterior of \eta tells us about the common value and the posterior of K tells us about the degree of shrinkage, with larger values of K giving more shrinkage. The choice of vague prior for \eta is not a concern: we can assume \eta is uniform or distributed according to the improper prior \eta^{-1}(1-\eta)^{-1}. However, the prior on K can be a concern. If one assigns the improper form g(K) = 1/K, it can be shown that the posterior may be improper. So in general one needs to assign a proper prior to K. One convenient choice is to assume that \log K has a logistic density with location \mu and scale \tau. In my example, I assumed that \mu = 0 and \tau = 1.
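
As a quick check, a change of variables shows that the logistic prior on \log K with \mu = 0 and \tau = 1 matches the form g(K) = 1/(K+1)^2 given earlier. The symbol \theta = \log K is shorthand introduced only for this derivation.

```latex
\begin{align*}
f(\theta \mid \mu, \tau) &= \frac{e^{-(\theta-\mu)/\tau}}
  {\tau\,\bigl(1 + e^{-(\theta-\mu)/\tau}\bigr)^{2}},
  \qquad \theta = \log K, \\
f(\theta \mid 0, 1) &= \frac{e^{-\theta}}{(1+e^{-\theta})^{2}}
  = \frac{e^{\theta}}{(1+e^{\theta})^{2}}, \\
g(K) &= f(\log K \mid 0, 1)\,\left|\frac{d\theta}{dK}\right|
  = \frac{K}{(1+K)^{2}} \cdot \frac{1}{K}
  = \frac{1}{(1+K)^{2}}.
\end{align*}
```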

Since I was arbitrary in my choice of parameters for this logistic prior on \log K, I tried different choices for \mu and \tau.  I tried values of \mu between -4 and 4 and values of \tau between 0.2 and 10.  What did I find?  The posterior mode of \log K stayed in the 4.56 – 4.61 range for all of the priors I considered.  In this example, we have data from 605 players and clearly the likelihood is driving the inference.
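
A sensitivity check of this kind can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the computation behind the numbers above: the data are simulated (100 hypothetical players, with made-up "true" values \eta = 0.03 and K = 100), the mode is found by a coarse grid search on (logit \eta, \log K), and every function name is invented for the example. It assumes the beta-binomial marginal likelihood with p_i distributed as Beta(K\eta, K(1-\eta)). On the logit scale the prior \eta^{-1}(1-\eta)^{-1} is flat, so only the logistic prior on \log K enters.

```python
import math
import random

def log_beta(a, b):
    """Log of the beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)

def logistic_logpdf(x, mu, tau):
    """Log density of a logistic(mu, tau) distribution at x."""
    z = abs((x - mu) / tau)  # the density is symmetric in z
    return -z - 2.0 * math.log1p(math.exp(-z)) - math.log(tau)

def log_marginal_lik(theta1, theta2, data):
    """Beta-binomial log likelihood (up to a constant), with
    theta1 = logit(eta) and theta2 = log K."""
    eta = 1.0 / (1.0 + math.exp(-theta1))
    K = math.exp(theta2)
    a, b = K * eta, K * (1.0 - eta)
    base = log_beta(a, b)
    return sum(log_beta(a + y, b + n - y) - base for y, n in data)

def simulate(k=100, eta=0.03, K=100.0, seed=1):
    """Simulated (home runs, at-bats) pairs; all settings are hypothetical."""
    rng = random.Random(seed)
    data = []
    for _ in range(k):
        n = rng.randint(100, 600)
        p = rng.betavariate(K * eta, K * (1.0 - eta))
        y = sum(rng.random() < p for _ in range(n))
        data.append((y, n))
    return data

# Coarse grids (an assumption of this sketch, chosen to cover the simulation).
T1 = [-4.5 + 0.05 * i for i in range(41)]  # logit(eta)
T2 = [1.0 + 0.1 * j for j in range(71)]    # log K from 1.0 to 8.0

def profile_loglik(data):
    """For each log K on the grid, maximize the log likelihood over logit(eta).
    The prior is flat in logit(eta), so this profile plus the log prior on
    log K is maximized at the log K coordinate of the joint posterior mode."""
    return [max(log_marginal_lik(t1, t2, data) for t1 in T1) for t2 in T2]

def mode_log_K(profile, mu, tau):
    """Grid value of log K at the joint posterior mode for a given prior."""
    scores = [ll + logistic_logpdf(t2, mu, tau) for ll, t2 in zip(profile, T2)]
    return T2[scores.index(max(scores))]

data = simulate()
profile = profile_loglik(data)  # computed once, reused for every prior
modes = {(mu, tau): mode_log_K(profile, mu, tau)
         for mu in (-4, 0, 4) for tau in (0.2, 1, 10)}
for (mu, tau), m in sorted(modes.items()):
    print(f"mu={mu:+d} tau={tau:<4} mode of log K = {m:.1f}")
```

Because the data here are simulated and far fewer than 605 players, the printed modes need not reproduce the 4.56 – 4.61 range, and the tightest priors may move the mode more than they did in the real example.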

What if the choice of \mu and \tau does matter? Then one has to think more carefully about these parameters. In this baseball context, I am pretty familiar with home run hitting rates. Based on this knowledge, I would need to come up with a reasonable guess at the variability of the home run probabilities, which in turn would give me information about K.
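
One way to turn such a guess into a value of K, assuming the p_i follow a beta distribution with mean \eta and precision K, is to use the beta variance \eta(1-\eta)/(K+1) and solve for K. The sketch below does this. The function name and the two guessed numbers are purely hypothetical placeholders, not values from the home run data.

```python
import math

def precision_from_spread(eta, sd):
    """Return K such that Beta(K*eta, K*(1-eta)) has standard deviation sd.

    Uses Var(p) = eta * (1 - eta) / (K + 1), so K = eta*(1-eta)/sd^2 - 1.
    """
    if not 0.0 < eta < 1.0:
        raise ValueError("eta must lie strictly between 0 and 1")
    max_var = eta * (1.0 - eta)
    if not 0.0 < sd * sd < max_var:
        raise ValueError("sd^2 must lie strictly between 0 and eta*(1-eta)")
    return max_var / (sd * sd) - 1.0

# Hypothetical guesses, for illustration only.
eta_guess = 0.03   # guessed typical home run probability
sd_guess = 0.015   # guessed spread of probabilities across players

K_guess = precision_from_spread(eta_guess, sd_guess)
mu_guess = math.log(K_guess)  # a candidate location for the prior on log K
print(f"K = {K_guess:.1f}, log K = {mu_guess:.2f}")
```

The scale \tau would then reflect how much one trusts the guess: a small \tau concentrates the prior on \log K near this location, while a large \tau leaves it vague.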

