GaussianMixture & TableOfReal: To GaussianMixture (CEMM)...

Find the best GaussianMixture for the data with the iterative Component-wise Expectation-Maximization for Mixtures (CEMM) algorithm, which may delete components during the optimization.

Settings

Minimum number of components
defines the minimum number of components that have to survive the minimization process. If a value of zero is chosen, all components will survive and no deletions will take place.
Tolerance of minimizer
defines when to stop optimizing. Iteration stops when the relative change in the likelihood between two successive iteration steps is smaller than the tolerance, i.e. when |(L(i-1)-L(i))/L(i)| < tolerance.
Maximum number of iterations
defines another stop criterion. Iteration stops whenever the number of iterations reaches this value.
Stability coefficient lambda
defines the fraction of the total covariance that is added to the covariance of each component to prevent these matrices from becoming singular.
Criterion based on
defines whether the function to be optimized is the log likelihood or the related minimum description length.
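
The two stop criteria and the role of lambda can be illustrated with a minimal Python sketch. This is not Praat's implementation: the function names `should_stop` and `stabilize_covariance` are hypothetical, and the plain additive form C + lambda * C_total is an assumption based only on the description of the setting above.

```python
def should_stop(prev_loglik, curr_loglik, iteration, tolerance, max_iterations):
    """Return True when either stop criterion from the settings is met."""
    # Criterion 2: the iteration count has reached the maximum.
    if iteration >= max_iterations:
        return True
    # Guard against division by zero (assumption: treat as "not converged").
    if curr_loglik == 0.0:
        return False
    # Criterion 1: |(L(i-1) - L(i)) / L(i)| < tolerance.
    relative_change = abs((prev_loglik - curr_loglik) / curr_loglik)
    return relative_change < tolerance


def stabilize_covariance(component_cov, total_cov, lam):
    """Add a fraction lam of the total covariance to a component covariance.

    Matrices are given as lists of rows; the additive form is an assumption.
    """
    return [
        [c + lam * t for c, t in zip(comp_row, total_row)]
        for comp_row, total_row in zip(component_cov, total_cov)
    ]
```

With a singular component covariance such as [[1, 0], [0, 0]], adding a fraction of a non-singular total covariance yields a matrix that can be inverted again, which is the purpose of the coefficient.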

Algorithm

The component-wise optimization algorithm is described in Figueiredo & Jain (2002), where the function to be optimized is the minimum description length, defined as:

L(θ,Y) = (N/2) Σ_{m=1}^{k} ln(n α_m / 12) + (k/2) ln(n/12) + k(N+1)/2 - ln p(Y|θ),

where k is the number of components, α_m is the mixing probability of component m, N is the number of parameters of one component (i.e. d+d(d+1)/2 for a full covariance matrix of dimension d with means and d+d for a diagonal matrix with means), and n is the number of data vectors. The term ln p(Y|θ) is the log likelihood of the data given the model.
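
As a worked illustration, the description length can be evaluated term by term. The sketch below is a hedged Python example, not Praat code: the function names are hypothetical, it assumes that the α values in the formula are the mixing probabilities of the k surviving components (all greater than zero), and it takes the log likelihood as a precomputed number.

```python
import math


def parameters_per_component(d, full_covariance=True):
    """Number of parameters N of one component of dimension d (means included)."""
    if full_covariance:
        return d + d * (d + 1) // 2  # means + symmetric covariance matrix
    return d + d  # means + diagonal variances


def description_length(mixing_probabilities, n, N, log_likelihood):
    """Evaluate L(theta, Y) for k components and n data vectors.

    mixing_probabilities: one positive weight per surviving component.
    """
    k = len(mixing_probabilities)
    weight_term = (N / 2) * sum(math.log(n * a / 12) for a in mixing_probabilities)
    component_term = (k / 2) * math.log(n / 12) + k * (N + 1) / 2
    return weight_term + component_term - log_likelihood
```

For example, a two-dimensional component with a full covariance matrix has N = 2 + 3 = 5 parameters, while the diagonal variant has N = 4. Because the log likelihood enters with a minus sign, a better fit lowers L, whereas each extra component raises the penalty terms.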

Depending on the "Criterion based on" setting, we optimize either the complete function L(θ,Y) or only the log likelihood term ln p(Y|θ).


© djmw 20230801