The Bayesian Gaussian Mixture Model (BGMM) extends the classical GMM by placing prior distributions over the model parameters, enabling automatic regularization, complexity control, and uncertainty quantification. Rather than relying solely on point estimates, this approach integrates over parameter uncertainty.
Generative Process:
Each observation \( y_t \) is assumed to be drawn from one of \( K \) Gaussian components, so its marginal density is a weighted sum of Gaussians:
\[
p(y_t) = \sum_{k=1}^{K} \pi_k \cdot \mathcal{N}(y_t \mid \mu_k, \Sigma_k)
\]
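For concreteness, here is a minimal sketch of sampling from this generative process; the values of \( K \), the mixing proportions, means, and standard deviations are hypothetical choices for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical mixture with K = 3 one-dimensional components.
K = 3
pi = np.array([0.5, 0.3, 0.2])      # mixing proportions, sum to 1
mu = np.array([-2.0, 0.0, 3.0])     # component means
sigma = np.array([0.5, 1.0, 0.8])   # component standard deviations

# Generate T observations: first a latent component z_t ~ Categorical(pi),
# then y_t | z_t ~ N(mu_{z_t}, sigma_{z_t}^2).
T = 1000
z = rng.choice(K, size=T, p=pi)
y = rng.normal(mu[z], sigma[z])
```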
Priors:
- Mixture Weights:
\[
\boldsymbol{\pi} \sim \text{Dirichlet}(\boldsymbol{\alpha})
\]
where \( \boldsymbol{\pi} = (\pi_1, \ldots, \pi_K) \) are the mixing proportions across the \( K \) components, with \( \pi_k \ge 0 \) and \( \sum_{k=1}^{K} \pi_k = 1 \) (see the sampling sketch after this list).
- Component Parameters:
\[
(\mu_k, \Sigma_k) \sim \text{Normal-Inverse-Wishart}(\mu_0, \kappa_0, \Psi_0, \nu_0)
\]
  where:
  - \( \mu_0 \): prior mean of component means
  - \( \kappa_0 \): strength (confidence) of the prior on the mean
  - \( \Psi_0 \): scale matrix of the Inverse-Wishart prior (covariance structure)
  - \( \nu_0 \): degrees of freedom for the covariance prior
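As a concrete illustration of these two priors, the sketch below draws \( \boldsymbol{\pi} \) from the Dirichlet and each \( (\mu_k, \Sigma_k) \) from the Normal-Inverse-Wishart via its standard hierarchical form (\( \Sigma_k \sim \mathcal{IW}(\Psi_0, \nu_0) \), then \( \mu_k \mid \Sigma_k \sim \mathcal{N}(\mu_0, \Sigma_k / \kappa_0) \)); all hyperparameter values are hypothetical.

```python
import numpy as np
from scipy.stats import dirichlet, invwishart, multivariate_normal

# Hypothetical hyperparameters for K = 3 components in D = 2 dimensions.
K, D = 3, 2
alpha = np.ones(K)        # symmetric Dirichlet concentration
mu_0 = np.zeros(D)        # prior mean of the component means
kappa_0 = 1.0             # strength of the prior on the mean
Psi_0 = np.eye(D)         # Inverse-Wishart scale matrix
nu_0 = D + 2              # degrees of freedom (must exceed D - 1)

# pi ~ Dirichlet(alpha)
pi = dirichlet.rvs(alpha, random_state=0)[0]

# Sigma_k ~ Inverse-Wishart(Psi_0, nu_0), then mu_k | Sigma_k ~ N(mu_0, Sigma_k / kappa_0)
Sigmas = [invwishart.rvs(df=nu_0, scale=Psi_0, random_state=k) for k in range(K)]
mus = [multivariate_normal.rvs(mean=mu_0, cov=Sigma / kappa_0, random_state=k)
       for k, Sigma in enumerate(Sigmas)]
```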
This formulation defines a full posterior over both the mixture weights and component parameters:
\[
p(\boldsymbol{\pi}, \mu_{1:K}, \Sigma_{1:K} \mid y_{1:T})
\]
Inference is typically performed via variational methods or Gibbs sampling. Compared with a standard maximum-likelihood GMM, the BGMM is less prone to overfitting and can automatically reduce the effective number of components by shrinking the weights of redundant components toward zero.
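As one concrete route, the sketch below fits a BGMM with scikit-learn's `BayesianGaussianMixture`, which uses variational inference; the synthetic data, the deliberately over-specified \( K = 10 \), and the small Dirichlet concentration are hypothetical choices for illustration.

```python
import numpy as np
from sklearn.mixture import BayesianGaussianMixture

rng = np.random.default_rng(0)
# Synthetic 1-D data with two true clusters.
y = np.concatenate([rng.normal(-2.0, 0.5, 500),
                    rng.normal(3.0, 0.8, 500)]).reshape(-1, 1)

bgmm = BayesianGaussianMixture(
    n_components=10,                                            # generous upper bound on K
    weight_concentration_prior_type="dirichlet_distribution",   # finite Dirichlet(alpha) prior on the weights
    weight_concentration_prior=1e-2,                            # small alpha favors sparse weights
    covariance_type="full",
    max_iter=500,
    random_state=0,
).fit(y)

print(np.round(bgmm.weights_, 3))
```

With a small `weight_concentration_prior`, the fitted `weights_` should concentrate on roughly two components, with the remaining weights shrunk close to zero.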