Author Archive

Modeling Field Goal Kicking — part 2

September 16, 2011 Leave a comment

Let’s continue the football kicking example.

1.  Prior beliefs.  On the surface, it seems challenging to talk about prior information since the regression parameters \beta_0 and \beta_1 are not very meaningful.  But we might have some beliefs about the probability of kicking a goal successful and we can use these beliefs to indirectly construct a prior on the regression coefficients.

2.  Conditional means prior.  We instead consider the parameters (p(30) and p(50)), the probabilities of successfully kicking a field goal at 30 yards and 50 yards.  After some thought, I believe:

  • The median and 90th percentile at p(30) are respectively 0.90 and 0.98  (I’m pretty confident of a successful kick at 30 yards.)
  • The median and 90th percentile at p(50) are respectively 0.50 and 0.70. (I’m less confident of a successful kick of 50 yards.)
Assuming my beliefs about p(30) and p(50) are independent, and assuming beta priors, my joint prior is given by
g(p(30), p(50)) = beta(p(30), a_1, b_1) beta(p(50), a_2, b_2))
where a_1, b_1, a_2, b_2 are the matching beta shape parameters found using a function like in the LearnBayes package.
3.  Then we can transform this prior on (p(30), p(50)) to a density on (\beta_0, \beta_1).  It can be shown that this transformed prior has a product of betas form which makes it very convenient to use.
The log of the logistic likelihood is actually a function logisticpost in the LearnBayes package, and it is convenient to use it for this example.
Categories: Regression

Modeling Field Goal Kicking

September 15, 2011 Leave a comment

Since it is (American) football season, it seemed appropriate to do a football example to illustrate summarizing a bivariate posterior.

One way of scoring points in football is by kicking field goals.  The ball is lined up at a particular location on the field and the kicker attempts to kick the ball through the uprights.

We are interested in looking at the relationship between the distance of the attempt (in yards) and the result of the kick (success and failure).  I collect the results of 31 field goal attempts for the Big Ten college games in the second week of the 2011 season.   I read the data into R from the class web folder and display the beginning of the data frame fieldgoal:

> fieldgoal=read.csv("")
> head(fieldgoal)
  Distance Result
1       42      1
2       54      1
3       20      1
4       42      0
5       50      1
6       38      0

A logistic model can be used to represent this data.  Let y_i denote the result of the ith kick at distance x_i — either the kick is good (y_i = 1) or it is missed (y_i = 0).  Let p_i denote the probability that y_i = 1.  A simple logistic model says that

log \frac{p_i}{1-p_i} = \beta_0 + \beta_1 x_i

The standard way of fitting this model is based on maximum likelihood based on the iterative reweighted least-squares algorithm.  We can implement this algorithm using the R glm function with the family = binomial argument.

> glm(Result ~ Distance, family = binomial, data = fieldgoal)

(Intercept)     Distance
    3.81329     -0.07549

As expected, the slope is negative — this means that it is harder to make a field goal for a longer distance.

We’ll use this example in a future post to illustrate prior modeling and “brute force” computation of the posterior.

Categories: Regression

Contour plots on R

September 12, 2011 Leave a comment

In one of the homework problems, you are asked to do a Bayesian sensitivity analysis for a posterior probability with respect to the prior.   Specifically, suppose we are interested in learning about a proportion and we wish to compute the posterior probability P(\theta > 0.5).  We observe the data (25 successes, 40 failures) and we want to know the sensitivity of the posterior probability with respect to the prior mean \eta = a/(a+b) and the prior precision parameter K = a + b where a and b are the beta shape parameters.

It is easy to construct a contour plot to illustrate this sensitivity analysis on R.

1.  First we set up a grid of values of \eta and K:

eta = seq(.1, .9, by=.1)
K = c(1, 2, 8, 16, 32)

2.  Next we write a short function that computes the posterior probability P(\theta > 0.5) as a function of  \eta and K.

myfun=function(eta, K)
  1 - pbeta(.5, K*eta+ 25, K*(1-eta) + 40)

3.  We now use the outer function to compute this probability on all combinations of  \eta and K — the result is a matrix where the rows correspond to \eta and the columns to K.

PROB = outer(eta, K, myfun)

4.  Last, we use the contour function with inputs the two vectors and the matrix of probabilities.  We can specify which levels to draw on the contour graph.

contour(eta, K, PROB,
        levels=c(0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.95),
        xlab="eta", ylab="K", main="P(theta > 0.5)")

Categories: R work

Prediction problem

September 8, 2011 Leave a comment

In class, I described a simple Bayesian prediction problem.  One observes the number of fire calls in a small California city for 30 days — the total number of fire calls in this period is 158.  An administrator is interested in predicting the total number of calls in the next seven days.

We assume that y, the number of calls y_i in a day, is Poisson with mean \lambda.  Suppose our prior beliefs about \lambda are described by a normal density with mean 10 and standard deviation 2.

I first write a simple function that computes and plots the posterior (using the likelihood times prior recipe) on a grid between lo and hi., hi,...){
lambda = seq(lo, hi, length=100)
prior = dnorm(lambda, 10, 2)
like = lambda^158 * exp(-30*lambda)
post = prior * like
plot(lambda, post,...)
list(L = lambda, P = post)

With some trial and error, the interval (3.3, 7.5) seems to bracket the posterior well., 7.5, type="l", col="red")

Next I take a simulated sample from the posterior by using the sample function.

sim.lambda = sample(S$L, 1000, replace=TRUE, prob=S$P)

Last, I simulate a sample from the posterior predictive density of z, the number of fire calls in seven days.  Since [z | \lambda] is Poisson(7 \lambda, we simulate z by (1) simulating \lambda from the posterior and (2) simulating z from a Poisson(7 \lambda) distribution.  This is easy to do by using the vector properties of the random simulation functions.

sim.z = rpois(1000, 7*sim.lambda)

Here is a graph of the predictive distribution (by tabulating and plotting the simulated values of z).

Categories: Single parameter

Brute force Bayes for one parameter

September 6, 2011 Leave a comment

Although we talk a lot about conjugate analyses, one doesn’t need to restrict oneself to the use of conjugate priors.  Here we illustrate learning about a Poisson mean using a normal prior on the mean \lambda .

Suppose I observe the following number of fire calls for a particular community each week:  0, 0, 1, 3, 2, 2.  If we assume y_i, the number of fire calls in week i, is Poisson(\lambda), then the likelihood function is given by

L(\lambda) = \exp(- 6 \lambda) \lambda^8, \, \, \lambda>0

Suppose I represent my prior beliefs about \lambda by a normal curve with mean 3 and standard deviation 1.  Then the posterior is given by (up to an unknown proportionality constant) by

g(\lambda | data) = L(\lambda) \times \exp(-(\lambda - 3)^2)

Here is a simple brute-force method of summarizing this posterior.

1.  Choose a grid of \lambda values that covers the region where the posterior is concentrated (this might take some trial and error).

2.  On this grid, compute the prior, likelihood, and posterior.

3.  Using the R sample function, take a large sample from the grid where the probabilities of the points are proportional to the like x prior values.

4.  Summarize this posterior simulated sample to learn about the location of the posterior.

Here’s some R code for this example.  I use the plot function to make sure the grid does cover the posterior.  The vector L contains 1000 draws from the posterior.

lambda = seq(0, 5, by=0.1)
like = exp(-6*lambda)*lambda^8
prior = dnorm(lambda, 3, 1)
post = like * prior
plot(lambda, post)
L = sample(lambda, size=1000, prob=post, replace=TRUE)
plot(density(L))  # density graph of simulated draws
Categories: Single parameter

Nice R Interface

August 30, 2011 Leave a comment

This week I’m introducing R to two different classes, one is a probability class to secondary-ed math majors and the second is a Bayesian course. Recently I learned about a nice R interface called RStudio.

I’ve included a screenshot below (you first install R and then RStudio and then launch the RStudio application).

What’s so special about RStudio?

1.  RStudio partitions the screen into four sections.  The bottom left is the Console window, the top left window shows R scripts, the top right window shows all the workspace variables, and the bottom right displays the help output and any plots.  I like that you can see all parts of R at once.

2.  It has a nice interface for importing data.  One clicks on an Import Dataset button and it steps the user through the process of reading in datasets from a file or from a web location.

3.  It is user friendly when one types in commands.  It automatically completes left parentheses and brackets.  Also it has a nice Code Completion — when you hit the tab key it will give you different options for completing the typing.  Also it gives you a glimpse into the arguments of the particular function typed.

I’m sure there are more nice features.  But I found RStudio easy to use and I’ve been using it all of the time the past few months.


Categories: R work

Bayes Rule in R

August 28, 2011 Leave a comment

On Friday, I gave several examples of Bayes’ rule in class. I’ll do a slight generalization of the testing for a disease example to illustrate using a special R function bayes to do the calculations.

This special function can be read into R directly from my web site.

> source("")

This function has three inputs prior, likelihood, and data.

1. The prior. Here there are two models — either we have the disease (D) or we don’t (not D) — and suppose that the probability of D is only 0.0001. prior is a vector of probabilities, where we give names to the two values:

> prior = c(D = .0001, not.D = 1 - .0001)

2. The likelihood. Here the data is the test result (pos or neg) where the test result is incorrect with probability 0.01. We’ll specify the likelihood as a matrix where each row of the matrix gives the probability of pos and neg for each model.

> like.D = c(pos=.99, neg=.01)
> like.not.D = c(pos=.01, neg=.99)
> likelihood = rbind(D = like.D, not.D = like.not.D)
> likelihood
pos neg
D 0.99 0.01
not.D 0.01 0.99

3. data is a vector of data values. Suppose we get a positive test result (pos). Then

> data="pos"

So we enter as inputs prior, likelhood, and data — the output is a matrix — the first row gives the prior probs and the second row gives the posterior probabilities.

> bayes(prior, likelihood, data)

D not.D
prior 0.000100000 0.9999000
pos 0.009803922 0.9901961

Although the positive blood test result is “bad news”, it is still very unlikely that you have the disease.

Categories: Bayes rule