Home > Bayesian computation > Estimating a log odds ratio

## Estimating a log odds ratio

We were looking at the relationship between a student’s ACT score (lo or hi) and his GPA (lo or hi).  We constructed a prior on $\theta_1$ and $\theta_2$, where $\theta_1$ is the log odds ratio and $\theta_2$ is the sum of the log odds.  We assumed that $\theta_1$ was N(0.5, 0.3), $\theta_2$ was N(0, 2), and the two parameters were independent.

We now observe some data for a sample of students:

GPA <= 3.0  GPA > 3.0
ACT <=21      29                24
ACT > 21        15                 34

The likelihood function is given by

$L(p_1, p_2) = p_1^{29} (1-p_1)^{24} p_2^{15} (1-p_2)^{34}$.

Here is a computing strategy to summarize the posterior density.

1.  We write a R function to compute the log posterior density of $(\theta_1, \theta_2)$.

2.  We use the function mycontour to compute the log posterior density on a grid.

3.  Using the function simcontour, we simulate values of $(\theta_1, \theta_2)$ from the posterior density.

4.  We learn about the log odds ratio $\theta_1$ by summarizing the sample of $\theta_1$ values.

lctablepost=function (theta, s)
{
ilogit=function(x) exp(x)/(1+exp(x))
logitp1 = (theta[1] + theta[2])/2
logitp2 = (theta[2] – theta[1])/2
s$data[1]*ilogit(logitp1)+s$data[2]*(1-ilogit(logitp1))+
s$data[3]*ilogit(logitp2)+s$data[4]*(1-ilogit(logitp2))+
dnorm(theta[1],s$prior[1],s$prior[2],log=TRUE) +
dnorm(theta[2],s$prior[3],s$prior[4],log=TRUE)
}
d=c(29,24,15,34)
p=c(0.5,0.3,0,2)
stuff=list(data=d,prior=p)
mycontour(lctablepost,c(-1,2,-9,3),stuff,
xlab=”THETA1″,ylab=”THETA2″,col=”red”)
s=simcontour(lctablepost,c(-1,2,-9,3),stuff,1000)
points(s)

Here is the R work.  I write a function, lctablepost, that has two inputs, the vector theta ($\theta_1, \theta_2$), and a list s that has two components, data (a vector of table counts) and prior (a vector of the means and standard deviations of the two normal priors).  Note that I use a special function ilogit that computes the inverse logit of a probability.

lctablepost=function (theta, s)
{
ilogit=function(x) exp(x)/(1+exp(x))
logitp1 = (theta[1] + theta[2])/2
logitp2 = (theta[2] – theta[1])/2
s$data[1]*ilogit(logitp1)+s$data[2]*(1-ilogit(logitp1))+
s$data[3]*ilogit(logitp2)+s$data[4]*(1-ilogit(logitp2))+
dnorm(theta[1],s$prior[1],s$prior[2],log=TRUE) +
dnorm(theta[2],s$prior[3],s$prior[4],log=TRUE)
}

We define the data and the prior inputs.

d=c(29,24,15,34)
p=c(0.5,0.3,0,2)
stuff=list(data=d,prior=p)

Now we use the LearnBayes function mycontour — the bounding rectangle of (-1, 2, -9, 3) seems to work well.

mycontour(lctablepost,c(-1,2,-9,3),stuff,
xlab=”THETA1″,ylab=”THETA2″,col=”red”)

Last, I simulate 1000 values from this grid by the LearnBayes function simcontour.  The output is a list s with components x and y — so s$x are the simulated draws of $\theta_1$ and s$y are the simulated draws of $\theta_2$.

s=simcontour(lctablepost,c(-1,2,-9,3),stuff,1000)

I show the simulated sample on my contour by use of the points function.

points(s)