Identifying mixtures of mixtures using Bayesian estimation

Malsiner-Walli, Gertraud and Frühwirth-Schnatter, Sylvia and Grün, Bettina (2017) Identifying mixtures of mixtures using Bayesian estimation. Journal of Computational and Graphical Statistics, 26 (2). pp. 285-295. ISSN 1537-2715

Available under License Creative Commons: Attribution 4.0 International (CC BY 4.0).



The use of a finite mixture of normal distributions in model-based clustering makes it possible to capture non-Gaussian data clusters. However, identifying the clusters from the normal components is challenging and is generally achieved either by imposing constraints on the model or by using post-processing procedures. Within the Bayesian framework, we propose a different approach based on sparse finite mixtures to achieve identifiability. We specify a hierarchical prior whose hyperparameters are carefully selected to reflect the cluster structure aimed at. In addition, this prior allows the model to be estimated using standard MCMC sampling methods. In combination with a post-processing approach that resolves the label switching issue and results in an identified model, our approach makes it possible to simultaneously (1) determine the number of clusters, (2) flexibly approximate the cluster distributions in a semi-parametric way using finite mixtures of normals, and (3) identify cluster-specific parameters and classify observations. The proposed approach is illustrated in two simulation studies and on benchmark data sets.
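
The two-level structure described in the abstract can be illustrated with a minimal univariate sketch. It is not the authors' implementation: the function names, the parameter values, and the Dirichlet concentration `e0` are assumptions chosen for illustration. Each cluster density is itself a finite mixture of normals, and a Dirichlet prior with a small concentration on the cluster weights tends to leave most of an overfitted set of clusters nearly empty, which is the general idea behind sparse finite mixtures.

```python
import math
import random


def normal_pdf(y, mean, sd):
    """Density of a univariate normal distribution at y."""
    z = (y - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def cluster_density(y, sub_weights, means, sds):
    """Lower level: one cluster modelled as a finite mixture of normals."""
    return sum(w * normal_pdf(y, m, s)
               for w, m, s in zip(sub_weights, means, sds))


def mixture_of_mixtures_density(y, cluster_weights, clusters):
    """Upper level: weighted sum of the cluster densities."""
    return sum(eta * cluster_density(y, *params)
               for eta, params in zip(cluster_weights, clusters))


def sample_dirichlet(concentration, rng):
    """Draw from a Dirichlet distribution by normalising gamma variates."""
    draws = [rng.gammavariate(a, 1.0) for a in concentration]
    total = sum(draws)
    return [d / total for d in draws]


# Illustrative values only (assumptions, not taken from the paper).
# Each tuple is (sub-component weights, means, standard deviations).
clusters = [
    ([0.5, 0.5], [-1.0, 1.0], [1.0, 0.5]),  # an asymmetric, non-Gaussian cluster
    ([0.7, 0.3], [6.0, 8.0], [1.0, 1.0]),
]
cluster_weights = [0.6, 0.4]
print(mixture_of_mixtures_density(0.0, cluster_weights, clusters))

# One prior draw of K cluster weights under a small, assumed concentration e0.
rng = random.Random(1)
K, e0 = 10, 0.01
prior_weights = sample_dirichlet([e0] * K, rng)
print(sum(w > 0.01 for w in prior_weights), "of", K, "weights exceed 0.01")
```

The sketch only evaluates the density and draws from the weight prior; estimating the model would additionally require an MCMC sampler and the post-processing step for label switching mentioned in the abstract.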

Item Type: Article
Additional Information: The author gratefully acknowledges support by the Austrian Science Fund (FWF): V170-N18. As a service to authors and researchers we are providing this version of an accepted manuscript (AM). Copyediting, typesetting, and review of the resulting proofs will be undertaken on this manuscript before final publication of the Version of Record (VoR). During production and pre-press, errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal relate to these versions also.
Keywords: Dirichlet prior; Finite mixture model; Model-based clustering; Bayesian nonparametric mixture model; Normal gamma prior; Number of components
Divisions: Departments > Finance, Accounting and Statistics > Statistics and Mathematics
Version of the Document: Accepted for Publication
Depositing User: Gertraud Novotny
Date Deposited: 02 Dec 2016 12:16
Last Modified: 17 Aug 2021 10:38

