Learning from selected order statistics

To illustrate some computational methods for summarizing a posterior, I describe a missing data problem motivated by my undergraduate days at Bucknell. Ken Kaminsky and Paul Nelson, two of my Bucknell professors, were interested in learning about populations based on selected order statistics. (I wrote an undergraduate honors thesis on this topic.) Here is a simple illustration of the problem.

Suppose a sample y1, …, y20 is taken from the two-parameter exponential distribution with density f(y | mu, beta) = (1/beta) exp(-(y - mu)/beta), y > mu. But you don't observe the complete dataset; all you observe are the two order statistics y(5) and y(15) (the order statistics are the observations arranged in ascending order).

Based on this selected data, we wish to (1) estimate the parameters mu and beta by 90% interval estimates and (2) predict the values of the order statistics y*(5) and y*(20) from a future sample taken from the same population.
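
To make the problem concrete, here is a minimal R sketch of how such selected data could arise. The true values mu = 0.5 and beta = 1 are hypothetical choices for illustration only; the simulated pair is reused in the sketches below.

set.seed(1)                          # for reproducibility
mu.true=0.5; beta.true=1             # hypothetical true values
y=mu.true+rexp(20,rate=1/beta.true)  # complete (unobserved) sample
y.ord=sort(y)
data=c(y.ord[5],y.ord[15])           # all we observe: (y(5), y(15))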

Here’s the plan:

1. First, we write the likelihood, which is the density of the observed data (y(5) and y(15)) given values of the exponential parameters mu and beta. Of the 20 observations, 4 fall below y(5), 9 fall between y(5) and y(15), and 5 fall above y(15), so up to a proportionality constant this likelihood is given by

L(mu, beta) = f(y(5)) f(y(15)) F(y(5))^4 (F(y(15)) - F(y(5)))^9 (1 - F(y(15)))^5, mu < y(5), beta > 0,

where f and F denote the exponential density and cdf.

2. Assuming a flat (uniform) prior on (mu, beta), the posterior density is proportional to the likelihood. We write an R function kaminsky0.R that computes the logarithm of the posterior; here the parameters are (mu, beta) and the data is (y(5), y(15)).

kaminsky0=function(theta,data)
{
# log posterior of (mu, beta) under a flat prior;
# theta is a matrix with columns (mu, beta), data = (y(5), y(15))

f=function(y,mu,beta)   # exponential density f(y | mu, beta)
return(dexp(y-mu,rate=1/beta))
F=function(y,mu,beta)   # exponential cdf F(y | mu, beta)
return(pexp(y-mu,rate=1/beta))

y5=data[1]; y15=data[2]
mu=theta[,1]
beta=theta[,2]

# log likelihood of the two observed order statistics
loglike=log(f(y5,mu,beta))+log(f(y15,mu,beta))+
4*log(F(y5,mu,beta))+9*log(F(y15,mu,beta)-F(y5,mu,beta))+
5*log(1-F(y15,mu,beta))

return(loglike)
}
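
As a quick check, we can graph this log posterior over a grid. Here is a minimal sketch, assuming the simulated data above and the mycontour function from the LearnBayes package; the grid for mu is kept below y(5), where the likelihood is positive.

library(LearnBayes)

# contour plot of the (mu, beta) posterior; mu must stay below y(5)
mycontour(kaminsky0,c(0.01,data[1]-0.01,0.1,3),data,
  xlab="mu",ylab="beta")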

3. Graphing the posterior of (mu, beta), we see strong skewness in both parameters.
It is usually helpful to transform to real-valued parameters

theta1 = log(y(5) - mu), theta2 = log(beta).

We write the following function kaminsky.R that computes the log posterior of (theta1, theta2); note the log Jacobian term needed for this change of variables.

kaminsky=function(theta,data)
{
# log posterior of (theta1, theta2) = (log(y(5) - mu), log(beta));
# theta is a matrix with columns (theta1, theta2), data = (y(5), y(15))

f=function(y,mu,beta)   # exponential density f(y | mu, beta)
return(dexp(y-mu,rate=1/beta))
F=function(y,mu,beta)   # exponential cdf F(y | mu, beta)
return(pexp(y-mu,rate=1/beta))

y5=data[1]; y15=data[2]
mu=y5-exp(theta[,1])    # invert the transformation
beta=exp(theta[,2])

loglike=log(f(y5,mu,beta))+log(f(y15,mu,beta))+
4*log(F(y5,mu,beta))+9*log(F(y15,mu,beta)-F(y5,mu,beta))+
5*log(1-F(y15,mu,beta))

# log Jacobian of the transformation:
# |d(mu, beta)/d(theta1, theta2)| = exp(theta1) exp(theta2)
logjack=theta[,1]+theta[,2]

return(loglike+logjack)
}

Graphing the posterior of the reexpressed parameters, we see that it is much more normal-shaped.
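
Here is a sketch of how this graph could be drawn, again assuming the simulated data and mycontour from LearnBayes; both transformed parameters are unconstrained, so any rectangular grid works.

# contour plot of the (theta1, theta2) posterior
mycontour(kaminsky,c(-4,1,-2,2),data,
  xlab="theta1 = log(y(5) - mu)",ylab="theta2 = log(beta)")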


4. In the next posting, we'll use several functions to summarize the posterior.

(a) The laplace function is useful for finding the posterior mode and a normal approximation to the posterior.

(b) Using the rwmetrop function, we construct a random-walk Metropolis algorithm to simulate from the joint posterior. (A sketch of both steps follows.)
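
As a hedged preview of that posting, here is a sketch using the simulated data above and assuming the laplace and rwmetrop functions from LearnBayes. Since kaminsky indexes theta as a matrix, a small wrapper lets it also accept the parameter vectors those functions pass in.

# wrapper: coerce theta to a one-row matrix when needed
kaminsky.v=function(theta,data)
  kaminsky(matrix(theta,ncol=2),data)

# (a) posterior mode and normal approximation
fit=laplace(kaminsky.v,c(0,0),data)

# (b) random-walk Metropolis, scaled by the laplace variance estimate
proposal=list(var=fit$var,scale=2)
mcmc=rwmetrop(kaminsky.v,proposal,fit$mode,10000,data)

# back-transform the draws and read off 90% interval estimates
mu.sim=data[1]-exp(mcmc$par[,1])
beta.sim=exp(mcmc$par[,2])
quantile(mu.sim,c(0.05,0.95))
quantile(beta.sim,c(0.05,0.95))

# posterior predictive draws of y*(5) and y*(20) from a future sample
idx=sample(length(mu.sim),1000)
ystar=sapply(idx,function(i)
  sort(mu.sim[i]+rexp(20,rate=1/beta.sim[i]))[c(5,20)])
quantile(ystar[1,],c(0.05,0.95))   # 90% prediction interval for y*(5)
quantile(ystar[2,],c(0.05,0.95))   # 90% prediction interval for y*(20)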
