Home > R work > Using the sapply function in R

Using the sapply function in R

In R, I think one of the most useful functions is sapply.   By using this function, one can avoid writing a loop.  Let me illustrate using this function for the binomial/beta exchangeable model.

The model is

1.  y_i distributed binomial(n_i, p_i)
2.  p_1, ..., p_k are iid beta(K, \eta)
3.  (K, \eta) have a vague prior

The log posterior of \theta_1 = logit(\eta), \theta_2 = \log K is programmed in the function betabinexch in the LearnBayes package.

Suppose we wish to obtain a 90% interval estimate for p_j.  To simulate from the posterior, we first simulate a value of (K, \eta) from its posterior and then simulate p_j from the posterior  beta(y_j + K \eta, n_j - y_j + K (1-\eta) )

Let’s consider the famous Efron and Morris baseball data — y_j is the number of hits in n_j at-bats for the jth player.

I’ll read in the data, get a normal approximation to the posterior (function laplace), and then do a random walk Metropolis chain — I obtain a simulated sample from (\theta_1, \theta_2).

y=c(12,7,14,9,18,16,15,13,8,10,17,10,11,10,14,11,10,10)
n=rep(45,18)
library(LearnBayes)
data=as.matrix(cbind(y,n))
fit=laplace(betabinexch,c(-1,4),data)
mcmc.fit=rwmetrop(betabinexch,list(var=fit$var,scale=2),c(-1,4),10000,data)
Now I write a function interval.estimate that has a single input j — it will simulate from the posterior of p_j and output a 90% interval estimate.
interval.estimate=function(j)
{
K=exp(mcmc.fit$par[,2])
eta=exp(mcmc.fit$par[,1])/(1+exp(mcmc.fit$par[,1]))
thetaj=rbeta(1000,y[j]+K*eta,n[j]-y[j]+K*(1-eta))
quantile(thetaj,c(.05,.95))
}
If I wish to obtain a 90% interval estimate for p_1, I type
> interval.estimate(1)
5%       95%
0.2052578 0.3288738
To do this in a single stroke for all 18 probabilities, I use sapply.  I wish to apply interval.estimate over all values 1, …, 18:
> sapply(1:18,interval.estimate)
[,1]      [,2]      [,3]      [,4]      [,5]      [,6]      [,7]
5%  0.1981516 0.1535653 0.2240657 0.1825992 0.2481277 0.2339640 0.2203171
95% 0.3314778 0.2906053 0.3578334 0.3076716 0.4013249 0.3805313 0.3658903
[,8]      [,9]     [,10]     [,11]     [,12]     [,13]     [,14]
5%  0.2061546 0.1637018 0.1841709 0.2375183 0.1837387 0.1948228 0.1878043
95% 0.3435467 0.3000630 0.3211896 0.3969801 0.3151900 0.3279358 0.3136143
[,15]     [,16]     [,17]     [,18]
5%  0.2193047 0.1949027 0.1821835 0.1864202
95% 0.3616939 0.3240116 0.3109836 0.3109164
I get interval estimates for all probabilities.  The 18th column contains the 5th and 95th percentiles of the posterior of p_{18}.
Advertisements
Categories: R work
  1. No comments yet.
  1. No trackbacks yet.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: