Estimating a log odds ratio

We were looking at the relationship between a student’s ACT score (lo or hi) and his GPA (lo or hi).  We constructed a prior on and , where is the log odds ratio and is the sum of the log odds.  We assumed that was N(0.5, 0.3), was N(0, 2), and the two parameters were independent.

We now observe some data for a sample of students:

GPA <= 3.0  GPA > 3.0
ACT <=21      29                24
ACT > 21        15                 34

The likelihood function is given by

.

Here is a computing strategy to summarize the posterior density.

1.  We write a R function to compute the log posterior density of .

2.  We use the function mycontour to compute the log posterior density on a grid.

3.  Using the function simcontour, we simulate values of from the posterior density.

4.  We learn about the log odds ratio by summarizing the sample of values.

lctablepost=function (theta, s)
{
ilogit=function(x) exp(x)/(1+exp(x))
logitp1 = (theta[1] + theta[2])/2
logitp2 = (theta[2] – theta[1])/2
s$data[1]*ilogit(logitp1)+s$data[2]*(1-ilogit(logitp1))+
s$data[3]*ilogit(logitp2)+s$data[4]*(1-ilogit(logitp2))+
dnorm(theta[1],s$prior[1],s$prior[2],log=TRUE) +
dnorm(theta[2],s$prior[3],s$prior[4],log=TRUE)
}
d=c(29,24,15,34)
p=c(0.5,0.3,0,2)
stuff=list(data=d,prior=p)
mycontour(lctablepost,c(-1,2,-9,3),stuff,
xlab=”THETA1″,ylab=”THETA2″,col=”red”)
s=simcontour(lctablepost,c(-1,2,-9,3),stuff,1000)
points(s)

Here is the R work.  I write a function, lctablepost, that has two inputs, the vector theta (), and a list s that has two components, data (a vector of table counts) and prior (a vector of the means and standard deviations of the two normal priors).  Note that I use a special function ilogit that computes the inverse logit of a probability.

lctablepost=function (theta, s)
{
ilogit=function(x) exp(x)/(1+exp(x))
logitp1 = (theta[1] + theta[2])/2
logitp2 = (theta[2] – theta[1])/2
s$data[1]*ilogit(logitp1)+s$data[2]*(1-ilogit(logitp1))+
s$data[3]*ilogit(logitp2)+s$data[4]*(1-ilogit(logitp2))+
dnorm(theta[1],s$prior[1],s$prior[2],log=TRUE) +
dnorm(theta[2],s$prior[3],s$prior[4],log=TRUE)
}

We define the data and the prior inputs.

d=c(29,24,15,34)
p=c(0.5,0.3,0,2)
stuff=list(data=d,prior=p)

Now we use the LearnBayes function mycontour — the bounding rectangle of (-1, 2, -9, 3) seems to work well.

mycontour(lctablepost,c(-1,2,-9,3),stuff,
xlab=”THETA1″,ylab=”THETA2″,col=”red”)

Last, I simulate 1000 values from this grid by the LearnBayes function simcontour.  The output is a list s with components x and y — so s$x are the simulated draws of and s$y are the simulated draws of .

s=simcontour(lctablepost,c(-1,2,-9,3),stuff,1000)

I show the simulated sample on my contour by use of the points function.

points(s)