A PERSONAL HISTORY OF BAYESIAN STATISTICS

 

Thomas Hoskyns Leonard

 

Retired Professor of Statistics, Universities of Wisconsin-Madison and Edinburgh

 

APPENDIX A

 

Bayesian Methods for the Simultaneous Estimation of Several Parameters

 

Tom's external Ph.D. examiner, Patricia M.E. Altham
(Statistical Laboratory, University of Cambridge)

 

    The data in Table 1 combine the results reported in Tables 2.4.1 and 2.4.2 of my long-lost 1973 University of London Ph.D. Thesis.

    The second and third columns of Table 1 give the numbers of females and males attending ten different courses of a similar type. The raw proportions (1) of females may be contrasted with an overall proportion of 170/553 = 0.307. The hierarchical Bayes shrinkage estimates (2) smooth the raw proportions towards 0.280, which is a smoothed overall proportion. These shrinkage estimates refer to my hierarchical exchangeable first-stage normal prior distribution for the binomial logits, with the parameters ν and λ in my inverse chi-squared distribution for the first-stage prior variance set equal to 0.0001 and 1.0 respectively. They transform the exact joint posterior modes of the binomial logits α back to the probability scale, using the reverse transformation θ = exp(α)/{1 + exp(α)}.
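    As a minimal sketch of the arithmetic behind column (1), the following Python fragment recomputes the raw proportions and the overall proportion from the counts, and defines the logit and its reverse transformation. The course 1 male count is taken as 20, the value consistent with the stated raw proportion 0.259 and the overall total of 553. The function and variable names are illustrative only.

```python
import math

# Female and male counts for the ten courses.
# Assumption: the course 1 male count is 20, the value implied by the
# stated raw proportion 0.259 and the overall total of 553 students.
females = [7, 3, 3, 10, 11, 42, 5, 32, 45, 12]
males = [20, 13, 12, 16, 84, 47, 22, 40, 57, 72]


def logit(theta):
    """Map a probability theta in (0, 1) to the logit scale."""
    return math.log(theta / (1.0 - theta))


def inverse_logit(alpha):
    """Reverse transformation theta = exp(alpha) / {1 + exp(alpha)}."""
    return math.exp(alpha) / (1.0 + math.exp(alpha))


# Column (1): raw proportion of females on each course.
raw = [f / (f + m) for f, m in zip(females, males)]

# Overall proportion of females, pooling all ten courses.
overall = sum(females) / (sum(females) + sum(males))

for course, p in enumerate(raw, start=1):
    print(f"course {course:2d}: raw proportion {p:.3f}, logit {logit(p):+.3f}")
print(f"overall proportion: {overall:.3f}")
```

    The logits printed here are the unadjusted empirical logits; the shrinkage estimates in columns (2) to (4) are computed on that scale and then mapped back with the reverse transformation.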

    The entries in (3) approximate the modal estimates in (2) by reference to a normal approximation to the likelihood of the logits, based on the unadjusted empirical logits and their likelihood dispersions. The approximations are reassuringly close to the exact results.
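    The idea of shrinking under a normal approximation to the logit likelihood can be sketched as follows. This is an illustration under simplifying assumptions, not the thesis computation: the first-stage prior mean mu and variance sigma2 are held fixed at illustrative values, whereas the analysis described above assigns an inverse chi-squared distribution to the variance. The likelihood dispersion is taken to be the usual large-sample variance 1/y + 1/(n - y) of the empirical logit, and all function names are hypothetical.

```python
import math


def empirical_logit(y, n):
    """Unadjusted empirical logit log{y / (n - y)} for y successes in n trials."""
    return math.log(y / (n - y))


def likelihood_dispersion(y, n):
    """Approximate variance of the empirical logit (assumed form)."""
    return 1.0 / y + 1.0 / (n - y)


def shrink_logit(y, n, mu, sigma2):
    """Precision-weighted compromise between the empirical logit and mu.

    With a N(l, v) approximation to the likelihood and a N(mu, sigma2)
    prior, the posterior mean and mode of the logit coincide at this value.
    """
    l = empirical_logit(y, n)
    v = likelihood_dispersion(y, n)
    weight = sigma2 / (sigma2 + v)  # weight given to the data
    return mu + weight * (l - mu)


def to_probability(alpha):
    """Reverse transformation from the logit scale."""
    return math.exp(alpha) / (1.0 + math.exp(alpha))


# Illustrative values only: mu is the logit of 0.280, sigma2 is arbitrary.
mu = math.log(0.280 / 0.720)
sigma2 = 0.3

# Course 5: 11 females out of 95 students.
estimate = to_probability(shrink_logit(11, 95, mu, sigma2))
print(f"raw {11 / 95:.3f} -> shrunk {estimate:.3f}")
```

    Courses with few students have large dispersions and are therefore pulled further towards the common value, which is the qualitative pattern visible in the table.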

    The entries in (4) invoke unadjusted reverse transformations of the unconditional posterior MEANS of the logits, but subject to the preceding approximation to their likelihood. They smooth the raw proportions noticeably less towards 0.280 than do the modal estimates.

    This is a not-too-alarming example of the ‘collapsing phenomenon’ which I discussed in the current Ch. 4.

    I computed all these results during the academic year 1970-1.  I nevertheless felt persuaded into believing that joint posterior modes, rather than unconditional posterior means, were the way to go.

    My 1972 and 1973 Biometrika papers were appended to my thesis, and the eight chapter headings were:

 

Chapter 1                 Introduction

Chapter 2                 The Estimation of Several Parameters

Chapter 3                 The Simultaneous Estimation of Multinomial Cell Probabilities

Chapter 4                 A Bayesian Method for Histograms

Chapter 5                 A Bayesian Analysis for Several Multinomial Distributions

Chapter 6                 Two-Way Contingency Tables and Related Topics

Chapter 7                 The Linear Model with Unequal Variances

Chapter 8                 Regression Models

 
 

    A modified and slightly extended version of the material in Ch. 7 was published in Technometrics in 1975. It invokes very general informative prior assumptions which incorporate linear models for both the normal means and the log-variances. Much later, after I taught this approach on my Statistics 775 course at the University of Wisconsin-Madison, a modified form of the methodology was used very beneficially in animal breeding to smooth the log-variances. See, for example, the article by Jean-Louis Foulley, Daniel Gianola, Magali San Cristobal and Sotan Im in Computational Statistics and Data Analysis (1992).

    Daniel Gianola’s student Rob Tempelman did something similar, for the litter sizes of dairy cattle and the logs of the corresponding Poisson means, in his 1993 University of Wisconsin Ph.D. thesis ‘Poisson Mixed Models for the Analysis of Counts with an Application to Dairy Cattle Breeding’.

    Gianola is a leading Bayesian in this general area. Tempelman went on to become a Professor of Animal Sciences at Michigan State University, where he has published a number of important Bayesian papers. Jean-Louis Foulley’s subsequent career at INRA in France has also been very impressive. He has published numerous applications of Bayesian inference in Animal Sciences and Genetics.

    When the normally distributed observations are unreplicated, some special cases of the prior covariance matrices yield interesting time series models in which the log-variances are taken to be stochastically related. These formulations may be contrasted with the various ‘stochastic volatility models’ which have been proposed in the Economics literature.

    My thesis created a quite general paradigm for the construction of non-conjugate prior distributions. Firstly, seek a transformation of the parameters such that the new parameters can be taken to possess a multivariate normal distribution. If you regard this as the first stage of a hierarchical prior, then you may, at the second stage, assign further distributions to the hyperparameters appearing in your first-stage mean vector and covariance matrix. This also provides a general formulation for non-linear random effects models. See also section 6.3 of [15].
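    The two-stage construction can be sketched by simulating from such a prior in the simplest exchangeable case. Everything specific here is an assumption made for illustration: the parameters are binomial probabilities transformed to logits, the first-stage normal distribution has a common mean mu and variance sigma2 with independent components, mu is held fixed, and sigma2 is drawn so that nu * lam / sigma2 has a chi-squared distribution with nu degrees of freedom. The values of mu, nu and lam are arbitrary, and the function names are hypothetical.

```python
import math
import random


def inverse_logit(alpha):
    """Numerically stable reverse transformation from logit to probability."""
    if alpha >= 0:
        return 1.0 / (1.0 + math.exp(-alpha))
    e = math.exp(alpha)
    return e / (1.0 + e)


def sample_prior_probabilities(m, mu, nu, lam, rng):
    """Draw m probabilities from a two-stage prior on their logits.

    Second stage: nu * lam / sigma2 ~ chi-squared(nu)  (assumed form).
    First stage:  logits alpha_i ~ N(mu, sigma2), independently given sigma2.
    """
    # A chi-squared(nu) variate is Gamma(shape=nu/2, scale=2).
    chi2 = rng.gammavariate(nu / 2.0, 2.0)
    sigma2 = nu * lam / chi2
    alphas = [rng.gauss(mu, math.sqrt(sigma2)) for _ in range(m)]
    return sigma2, [inverse_logit(a) for a in alphas]


rng = random.Random(0)  # fixed seed so the illustration is repeatable
sigma2, thetas = sample_prior_probabilities(m=10, mu=-1.0, nu=5.0, lam=1.0, rng=rng)
print(f"sigma2 = {sigma2:.3f}")
print(", ".join(f"{t:.3f}" for t in thetas))
```

    Replacing the common mean and variance by a mean vector and covariance matrix, and the logit by another suitable transformation, gives the more general structure described above.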

 

Table 1: Estimated Gender Probabilities for Students attending 10 courses

 
Course   Female   Male    (1)     (2)     (3)     (4)
  1.        7      20    0.259   0.267   0.268   0.265
  2.        3      13    0.188   0.236   0.240   0.227
  3.        3      12    0.200   0.243   0.247   0.234
  4.       10      16    0.385   0.345   0.345   0.355
  5.       11      84    0.116   0.145   0.149   0.139
  6.       42      47    0.472   0.444   0.444   0.452
  7.        5      22    0.185   0.223   0.227   0.215
  8.       32      40    0.444   0.416   0.416   0.424
  9.       45      57    0.441   0.421   0.421   0.426
 10.       12      72    0.143   0.169   0.169   0.163

 

 
 
 
 
  © Thomas Hoskyns Leonard, 2014