## Increasing popularity of Bayesian thinking

It is remarkable to see the growth in the development and use of Bayesian methods over my academic lifetime. One way of measuring this growth is simply to count the number of Bayesian papers presented at meetings. In Statistics, our major meeting is JSM (Joint Statistical Meetings), which is held each summer in a major U.S. city. I pulled out the program for the 1983 JSM. Scanning through the abstracts, I found 18 presented papers that had Bayes in the title. Looking at the online program for the 2011 JSM, I found 58 sessions where Bayes was in the title of the session, and a session typically includes 4-5 papers. That works out to roughly 230 to 290 papers, so I would guess that there were approximately 15 times as many Bayesian papers presented in 2011 as in 1983.

Another way of measuring the growth is to look at the explosion of Bayesian texts that have been published recently. At first, the Bayesian books were more advanced and had limited applications. Now there are many applied statistics books that illustrate the application of Bayesian thinking in disciplines such as economics, biology, ecology, and the social sciences.

## Welcome to MATH 6480 – Fall 2011

Welcome to MATH 6480 Bayesian Statistical Analysis. I’ll be using this blog to help you with Bayesian computation (specifically R examples) and with general issues that come up in this course.

## Bayesian communication

Here are some thoughts about the project that my Bayesian students are working on now.

1. When one communicates a Bayesian analysis, one should clearly state the prior beliefs and the prior distribution that matches these beliefs, the likelihood, and the posterior distribution.

2. In any problem, there are particular inferential questions, and a Bayesian report should give the summaries of the posterior that answer the inferential questions.

3. In the project, the questions are how the two proportions compare and whether it is reasonable to assume that the proportions are equal. (The first question is an estimation problem and the second question relates to the choice of model.)

4. What role does the informative prior play in the final inference? In the project, the students perform two analyses, one with an informative prior and one with a vague prior. By comparing the two posterior inferences, one can better understand the influence of the prior information.

5. There is a computational aspect involved in obtaining the posterior distribution. In a Bayesian report, one can talk about the general algorithms that were used. But the computational details (like R code) have to be in the background, say in an appendix.

6. The focus of the project (of course) is the Bayesian analysis. But it is helpful to contrast the Bayesian analysis with frequentist methods. The student should think about frequentist methods for estimation and testing and about which of those methods are appropriate for addressing these questions. In the project drafts, the weakest part seemed to be the description of the frequentist methods.
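The prior comparison in point 4 can be sketched in a few lines. This is only an illustration, written in Python with the standard library rather than R, and it covers the estimation question only. The data, the Beta(6, 4) and Beta(1, 1) priors, the assumption of independent beta priors on the two proportions, and the function names are all made up for the example; they are not taken from the project.

```python
import random
import statistics


def beta_posterior(prior_a, prior_b, successes, failures):
    """Conjugate update: Beta(a, b) prior plus binomial data gives a beta posterior."""
    return prior_a + successes, prior_b + failures


def simulate_difference(post1, post2, n_draws=10000, seed=1):
    """Simulate draws of p1 - p2 from two independent beta posteriors."""
    rng = random.Random(seed)
    return [rng.betavariate(*post1) - rng.betavariate(*post2)
            for _ in range(n_draws)]


def summarize(draws):
    """Posterior mean, 90% interval, and P(p1 > p2) from simulated differences."""
    ordered = sorted(draws)
    n = len(ordered)
    return {
        "mean": statistics.mean(ordered),
        "interval_90": (ordered[int(0.05 * n)], ordered[int(0.95 * n) - 1]),
        "prob_p1_greater": sum(d > 0 for d in ordered) / n,
    }


# Hypothetical data: (successes, failures) observed in each group.
data1, data2 = (14, 6), (9, 11)

# Hypothetical priors, applied to each proportion separately.
priors = {"informative": (6, 4), "vague": (1, 1)}

results = {}
for label, (a, b) in priors.items():
    post1 = beta_posterior(a, b, *data1)
    post2 = beta_posterior(a, b, *data2)
    results[label] = summarize(simulate_difference(post1, post2))
    print(label, results[label])
```

Putting the two sets of summaries side by side shows how much the informative prior moves the inference about the difference in proportions.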

## Why should I buy a textbook?

I just discovered that many of my students didn’t buy my text since there is free electronic access to an older edition of my text through the library system.

That raises the question “Why purchase the textbook?”

I’ll give several reasons.

1. If you are taking a graduate course in your major, you should purchase all of the recommended texts. After all, you are planning to be a statistician and you’ll need a collection of books that will help you in your job.

2. There is a reason why authors write second editions of a book. In my case, I changed the LearnBayes package, and so I wanted to revise the book to be consistent with version 2.0 of the package. Also, there are new sections and new exercises in the 2nd edition, so a student using the older edition is missing useful material. There was some confusion on the last homework, partly because the students were not reading the current edition.

3. “Books are expensive?” I understand a book can be expensive (actually, my text is relatively inexpensive), but it is certainly cheap relative to the cost of taking the course.

4. By not purchasing the textbook, the student is sending a message to the instructor, saying “I can get through the course without reading the book” which really is an insult to the instructor. This is common practice among undergraduates, but I hope it isn’t common among statistics students.

## Grading homework …

My recent homework in my Bayesian class was on several one-parameter problems where R was used in the posterior and predictive calculations.

There was much variation in what was turned in. One student’s homework consisted of 37 pages, and every simulated parameter value was displayed. Another student’s turn-in was 4 pages, with all of the R work displayed (in a 2-point font) on a single page.

Here are some guidelines for what I’d like a student’s homework to look like.

1. Homework consisting completely of R work (input and output) is clearly inappropriate.

2. The answers to the exercise questions should be written in paragraph form with complete sentences. Imagine that the student was supposed to report to his or her boss about what he or she learned. The student would write a report that describes in words what was learned.

3. Obviously, I’d like to see that the student is using R functions in a reasonable way. But I’m primarily interested in a copy of what the student entered and the relevant output. For example, suppose the student is summarizing a beta posterior using simulation. I don’t want to see the 1000 simulated draws, but the student could convince me that he or she is getting reasonable results by showing several summaries, such as a posterior mean and posterior standard deviation.

4. If I assign a homework with 8 exercises, then I think that 3 pages is too brief (not enough said), but over 20 pages indicates that too much irrelevant R output is included. The student needs a reasonable balance. Maybe 10 pages would be an optimal length for a turn-in, and maybe longer if the student wishes to include some graphs.
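As an illustration of guideline 3, here is a minimal sketch of reporting a couple of summaries instead of the simulated draws. It is written in Python with the standard library rather than R, and the Beta(12, 30) posterior and the seed are hypothetical choices made only for this example.

```python
import random
import statistics

# Hypothetical posterior: Beta(a, b) with a = 12 and b = 30.
a, b = 12, 30

rng = random.Random(2011)  # fixed seed so the summaries are reproducible
draws = [rng.betavariate(a, b) for _ in range(1000)]

# Report a few summaries rather than the 1000 draws themselves.
sim_mean = statistics.mean(draws)
sim_sd = statistics.stdev(draws)

# Exact moments of a beta distribution, used as a check on the simulation.
exact_mean = a / (a + b)
exact_sd = (a * b / ((a + b) ** 2 * (a + b + 1))) ** 0.5

print(f"simulated mean {sim_mean:.3f}, exact mean {exact_mean:.3f}")
print(f"simulated sd   {sim_sd:.3f}, exact sd   {exact_sd:.3f}")
```

Two lines of output like these are enough to show that the simulation is behaving reasonably.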

By the way, it was interesting how one particular question was answered.

In Chapter 2, exercise 4, I asked the student to contrast predictions using two different priors, one a discrete one, and the second a beta prior. Most of the students were successful in computing the predictive probabilities using the two priors. But there were different comparisons that were done.

1. DISPLAY? Some students just displayed the two probability distributions and said they were similar. Let’s say that this approach wasn’t that persuasive.

2. GRAPH? Some students graphed the two sets of predictive probabilities on a single graph. Assuming the graph is readable, that is a much better idea. One can quickly see if the distributions are similar by looking at the graph.

3. SUMMARIZE? Another approach is to compare the two distributions by summarizing each distribution in some way. For example, one could compute the mean and standard deviation of each distribution. Or one could compute a 90% predictive interval for each distribution.

What would I prefer? It is pretty obvious that simply displaying the two probability distributions is not enough. I think graphing the two distributions and summarizing the distributions is a good strategy. Otherwise, you really aren’t answering the question of whether the two distributions are similar.
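To make the summarizing approach concrete, here is a rough sketch in Python (standard library only, not R). The discrete prior, the Beta(3, 7) prior, the future sample size of 20, and the function names are hypothetical; they are not the values from the exercise. The sketch assumes a binomial model for the future count, so the predictive distribution is a mixture of binomials under the discrete prior and a beta-binomial under the beta prior.

```python
import math


def binomial_pmf(k, n, p):
    """Binomial probability of k successes in n trials."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


def predictive_discrete(n, prior):
    """Predictive pmf of y = 0..n when p has a discrete prior {value: probability}."""
    return [sum(w * binomial_pmf(k, n, p) for p, w in prior.items())
            for k in range(n + 1)]


def log_beta(a, b):
    """Logarithm of the beta function."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def predictive_beta(n, a, b):
    """Beta-binomial predictive pmf of y = 0..n when p has a Beta(a, b) prior."""
    return [math.comb(n, k) * math.exp(log_beta(a + k, b + n - k) - log_beta(a, b))
            for k in range(n + 1)]


def mean_sd(pmf):
    """Mean and standard deviation of a pmf on 0, 1, ..., len(pmf) - 1."""
    mean = sum(k * pk for k, pk in enumerate(pmf))
    var = sum((k - mean) ** 2 * pk for k, pk in enumerate(pmf))
    return mean, math.sqrt(var)


n = 20                                              # hypothetical future sample size
discrete_prior = {0.2: 0.25, 0.3: 0.50, 0.4: 0.25}  # hypothetical discrete prior
pred_d = predictive_discrete(n, discrete_prior)
pred_b = predictive_beta(n, 3, 7)                   # hypothetical Beta(3, 7) prior

for label, pmf in [("discrete prior", pred_d), ("beta prior", pred_b)]:
    m, s = mean_sd(pmf)
    print(f"{label}: predictive mean {m:.2f}, sd {s:.2f}")
```

With these made-up priors the two predictive means agree, but the standard deviations do not. That is the kind of difference a side-by-side display of probabilities can hide.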

## Why Bayes?

When I start my Bayesian class, I like to mention some reasons why this is a relevant class. Specifically, what is wrong with frequentist inference, and what can Bayesian thinking add to the statistician’s toolkit?

There is an article published in Science about 10 years ago titled “Bayes Offers a ‘New’ Way to Make Sense of Numbers” that you can find at

http://bayes.bgsu.edu/m6480/LECTURE%20NOTES/science.article.pdf

It does a good job selling Bayes to the public. Here are a couple of things from the article that I mentioned in my class.

1. Part of the motivation for considering Bayesian methods is the advances in computers and computational methods, together with some limitations of frequentist methods.

2. Bayesian conclusions are easier to understand.

3. The FDA is currently encouraging more use of Bayesian methods for clinical trials. One area where Bayesian methods appear to have an advantage is sequential trials, where one collects data over time and wishes to stop the trial once there is sufficient evidence to make a decision.

4. P-values, one of the standard frequentist summaries, are frequently misinterpreted. In addition, there is a strong literature that suggests that p-values typically overstate the evidence against the null hypothesis.

5. One of the popular computer tools is the Microsoft animated paperclip http://en.wikipedia.org/wiki/Office_Assistant that is driven by Bayesian methods. But it seems that people are generally annoyed with this help device and it is going away.

## Welcome to MATH 6480

Welcome to the 2009 version of MATH 6480 Bayesian Analysis. I’ll be using this blog to post new examples and explain how to use R, especially for Bayesian computations. One reason why I’m using wordpress is that it is simple to write mathematical expressions like this important one:

$$g(\theta \mid y) \propto g(\theta) \, L(\theta)$$

or “Posterior” is proportional to “Prior” times “Likelihood”.
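Here is a small sketch of “Posterior is proportional to Prior times Likelihood” for a proportion with a discrete prior. It is written in Python with the standard library rather than R, and the prior values, the data (7 successes and 3 failures), and the function name are all hypothetical.

```python
def discrete_posterior(prior, successes, failures):
    """Posterior on a grid of p values: multiply prior by likelihood, then normalize."""
    products = {p: w * p**successes * (1 - p) ** failures
                for p, w in prior.items()}
    total = sum(products.values())
    return {p: v / total for p, v in products.items()}


# Hypothetical prior: four plausible values of the proportion p and their probabilities.
prior = {0.1: 0.2, 0.3: 0.3, 0.5: 0.3, 0.7: 0.2}

# Hypothetical data: 7 successes and 3 failures.
posterior = discrete_posterior(prior, successes=7, failures=3)

for p, prob in posterior.items():
    print(f"p = {p:.1f}: prior {prior[p]:.2f}, posterior {prob:.3f}")
```

The binomial coefficient is left out of the likelihood because it does not depend on p and cancels when the products are normalized.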