Mixture Densities, Maximum Likelihood and the EM Algorithm

The problem of estimating the parameters that determine a mixture density has been the subject of a large, diverse body of literature spanning nearly ninety years. During the last two decades, the method of maximum likelihood has become the most widely followed approach to this problem, thanks primarily to the advent of high-speed electronic computers. Here, we first offer a brief survey of the literature directed toward this problem and review maximum-likelihood estimation for it. We then turn to the subject of ultimate interest, which is a particular iterative procedure for numerically approximating maximum-likelihood estimates for mixture density problems. This procedure, known as the EM algorithm, is a specialization to the mixture density context of a general algorithm of the same name used to approximate maximum-likelihood estimates for incomplete data problems. We discuss the formulation and the theoretical and practical properties of the EM algorithm for mixture densities, focusing in particular on mixtures of densities from exponential families.
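To make the iterative procedure concrete, the following is a minimal sketch of EM iterations for one simple special case: a mixture of univariate normal densities, which is an exponential-family mixture. The function names (`normal_pdf`, `em_step`, `fit_mixture`), the synthetic data, the starting values, and the small variance floor are all illustrative assumptions, not the notation or formulation of the paper. Each iteration computes the posterior probability that each observation came from each component (E-step), then re-estimates proportions, means, and variances as posterior-weighted averages (M-step).

```python
import math
import random


def normal_pdf(x, mean, var):
    """Density of a univariate normal with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def log_likelihood(data, weights, means, variances):
    """Log-likelihood of the sample under the current mixture parameters."""
    return sum(
        math.log(sum(w * normal_pdf(x, m, v)
                     for w, m, v in zip(weights, means, variances)))
        for x in data
    )


def em_step(data, weights, means, variances):
    """One EM iteration for a finite mixture of univariate normals."""
    n, k = len(data), len(weights)
    # E-step: posterior probability of each component for each observation.
    resp = []
    for x in data:
        joint = [w * normal_pdf(x, m, v)
                 for w, m, v in zip(weights, means, variances)]
        total = sum(joint)
        resp.append([j / total for j in joint])
    # M-step: posterior-weighted proportions, means, and variances.
    new_w, new_m, new_v = [], [], []
    for i in range(k):
        r_i = sum(r[i] for r in resp)
        mean_i = sum(r[i] * x for r, x in zip(resp, data)) / r_i
        var_i = sum(r[i] * (x - mean_i) ** 2 for r, x in zip(resp, data)) / r_i
        new_w.append(r_i / n)
        new_m.append(mean_i)
        # Variance floor: an assumed guard against degenerate components.
        new_v.append(max(var_i, 1e-6))
    return new_w, new_m, new_v


def fit_mixture(data, weights, means, variances, iterations=200):
    """Run a fixed number of EM iterations, recording the log-likelihood."""
    history = [log_likelihood(data, weights, means, variances)]
    for _ in range(iterations):
        weights, means, variances = em_step(data, weights, means, variances)
        history.append(log_likelihood(data, weights, means, variances))
    return weights, means, variances, history


# Synthetic sample (assumed for illustration): 300 draws from N(0, 1)
# and 200 draws from N(5, 1), so the true mixing proportions are 0.6 and 0.4.
rng = random.Random(0)
data = ([rng.gauss(0.0, 1.0) for _ in range(300)]
        + [rng.gauss(5.0, 1.0) for _ in range(200)])

weights, means, variances, history = fit_mixture(
    data, [0.5, 0.5], [-1.0, 1.0], [1.0, 1.0]
)
print("proportions:", weights)
print("means:", means)
print("variances:", variances)
```

The recorded `history` gives a quick check on an implementation: as long as the variance floor is never reached, the log-likelihood should not decrease from one iteration to the next. A fixed iteration count is used here only for brevity; a practical stopping rule and the choice of starting values are separate questions.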
