## Home run rates — fitting an exchangeable model

Continuing our home run hitting example, we observe home run counts $y_1, ..., y_k$ where $y_i$ is distributed $Bin(n_i, p_i)$. Suppose we assign the home run probabilities the following exchangeable model.

1.  $p_1, ..., p_k$ distributed $Beta(K, \eta)$

2. $K, \eta$ independent, with $\eta$ having density proportional to $\eta^{-1}(1-\eta)^{-1}$ and $K$ having density proportional to $\frac{1}{1+K^2}$.
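As a rough illustration of the model above, here is a minimal Python sketch of the unnormalized log posterior of $(\eta, K)$ with the $p_i$ integrated out. It assumes that $Beta(K, \eta)$ means a beta distribution with mean $\eta$ and precision $K$, that is, shape parameters $K\eta$ and $K(1-\eta)$. This is an interpretation of the notation, not something stated in the text. The function names and the small data set are hypothetical.

```python
import math


def log_beta(a, b):
    """Log of the beta function B(a, b), computed via log-gamma."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def log_posterior(eta, K, y, n):
    """Unnormalized log posterior of (eta, K) with the p_i integrated out.

    ASSUMPTION: p_i ~ Beta(K*eta, K*(1-eta)), i.e. mean eta and precision K.
    """
    if not (0.0 < eta < 1.0) or K <= 0.0:
        return -math.inf
    a, b = K * eta, K * (1.0 - eta)
    # Beta-binomial marginal likelihood; binomial coefficients are dropped
    # because they do not depend on (eta, K).
    loglik = sum(
        log_beta(a + yi, b + ni - yi) - log_beta(a, b)
        for yi, ni in zip(y, n)
    )
    # Priors from the text: eta^{-1} (1 - eta)^{-1} and 1 / (1 + K^2).
    logprior = -math.log(eta) - math.log(1.0 - eta) - math.log(1.0 + K * K)
    return loglik + logprior


# Made-up home run counts and at-bats, for illustration only.
y = [0, 2, 5, 12, 30]
n = [20, 80, 150, 400, 550]
print(log_posterior(0.03, 100.0, y, n))
```

A function like this could be handed to a grid evaluation or a general-purpose sampler. Which strategy was actually used in class is not specified here.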

In class we described a computing strategy for summarizing the posterior distribution of $(p_1, ..., p_k)$, $\eta$, and $K$. We find:

1.  For $\eta$, the posterior median is 0.0283 and a 90% interval estimate is (0.0268, 0.0296).

2. For $K$, the posterior median is 99 and a 90% interval estimate is (85, 116).
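Summaries like these are typically computed from simulated posterior draws. The sketch below shows one way to get a median and an equal-tailed 90% interval from a list of draws. The `summarize` helper and the stand-in draws are hypothetical, and the draws are generated from a normal distribution purely for illustration, not from the actual posterior.

```python
import random
import statistics


def summarize(draws, level=0.90):
    """Return (median, (lower, upper)) for an equal-tailed interval."""
    xs = sorted(draws)

    def quantile(q):
        # Linear interpolation between adjacent order statistics.
        pos = q * (len(xs) - 1)
        lo = int(pos)
        hi = min(lo + 1, len(xs) - 1)
        return xs[lo] + (pos - lo) * (xs[hi] - xs[lo])

    tail = (1.0 - level) / 2.0
    return statistics.median(xs), (quantile(tail), quantile(1.0 - tail))


# Stand-in draws; in practice these would be simulated values of K
# from the posterior distribution.
random.seed(0)
fake_K_draws = [random.gauss(100.0, 10.0) for _ in range(5000)]
med, (lo, hi) = summarize(fake_K_draws)
print(f"median {med:.1f}, 90% interval ({lo:.1f}, {hi:.1f})")
```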

We find posterior means for the home run probabilities $p_1, ..., p_k$.  We plot the posterior means against the square root of the at-bats for the 605 non-pitchers.

This pattern is very different from the one in the figure that plotted the observed home run rates against sqrt(AB). Several things to note:

1.  For players with few AB, the home run rates are strongly shrunk towards the overall home run rate. This is reasonable, since we have little information about these players’ true home run hitting abilities.

2.  Who are the best home run hitters? It is clear from the figure that four hitters stand out: they are the ones in the upper right corner of the graph, which I’ve labeled with their numbers of home runs. The best home run hitters, that is, the ones with the highest estimated home run probabilities, are the two players with 46 and 54 home runs.

3.  I’ve drawn a smoothing curve to indicate the main pattern in the graph.  Note that it’s an increasing curve — this means that players with more AB tend to have higher home run probabilities.  This makes sense — the better hitters tend to get more playing time and more at-bats.
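The shrinkage in the first point can be sketched numerically. Assuming the beta prior has mean $\eta$ and precision $K$ (an interpretation of the $Beta(K, \eta)$ notation), the posterior mean of $p_i$ given $(\eta, K)$ is $(y_i + K\eta)/(n_i + K)$, a weighted average of the raw rate $y_i/n_i$ and $\eta$ with weight $K/(K + n_i)$ on $\eta$. The code below plugs in the reported posterior medians as a rough approximation. The actual posterior means average over the uncertainty in $(\eta, K)$, and the two players are hypothetical.

```python
def shrinkage_estimate(y, n, eta, K):
    """Posterior mean of p given (eta, K).

    ASSUMPTION: p ~ Beta(K*eta, K*(1-eta)), i.e. mean eta and precision K.
    Returns (estimate, weight), where weight = K / (K + n) is the weight
    placed on eta rather than on the raw rate y / n.
    """
    weight = K / (K + n)
    estimate = (y + K * eta) / (n + K)
    return estimate, weight


# Posterior medians reported in the text, used here as plug-in values.
ETA, K = 0.0283, 99.0

# Hypothetical players as (at-bats, home runs); not from the data set.
players = {"few at-bats": (10, 2), "many at-bats": (550, 40)}
for label, (ab, hr) in players.items():
    est, w = shrinkage_estimate(hr, ab, ETA, K)
    print(f"{label}: raw {hr / ab:.3f}, estimate {est:.3f}, weight on eta {w:.2f}")
```

With $K$ near 99, a player with only a handful of at-bats gets an estimate dominated by $\eta$, while a full-season player's estimate stays close to the observed rate.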