GaussianMixture & TableOfReal: To GaussianMixture (CEMM)...


Find the best GaussianMixture from the data according to an iterative componentwise optimization algorithm in which components may be deleted.
Settings

Minimum number of components

defines the minimum number of components that have to survive the minimization process. If a value of zero is chosen all components will survive and no deletions will take place.

Tolerance of minimizer

defines when to stop optimizing. We stop when the relative difference between the likelihoods at two successive iteration steps is less than the tolerance, i.e. when (L(i-1)-L(i))/L(i) < tolerance.

Maximum number of iterations

defines another stop criterion. Iteration stops whenever the number of iterations reaches this value.

Stability coefficient lambda

defines the fraction of the total covariance that is added to the covariance of each component to prevent these matrices from becoming singular.

Criterion based on

defines whether the function to be optimized is the log likelihood or the related minimum description length.
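As an illustration of two of the settings above, the following minimal Python sketch shows a relative-change stop test and the covariance stabilization. The function names has_converged and stabilize are hypothetical and are not part of Praat. Taking absolute values in the stop test is an assumption made here so that the same test works for both criteria; the actual implementation may differ.

```python
def has_converged(previous, current, tolerance):
    """Stop test: relative change between two successive criterion values.

    Assumption: absolute values are used so the test behaves the same for
    a (negative) log likelihood and a (positive) description length.
    The current value must be nonzero.
    """
    return abs(previous - current) / abs(current) < tolerance


def stabilize(component_cov, total_cov, lam):
    """Add the fraction lam of the total covariance to a component covariance.

    Both matrices are given as lists of rows with identical dimensions.
    """
    return [
        [c + lam * t for c, t in zip(c_row, t_row)]
        for c_row, t_row in zip(component_cov, total_cov)
    ]


# A singular 2x2 component covariance (determinant 0) ...
singular = [[1.0, 1.0], [1.0, 1.0]]
total = [[2.0, 0.0], [0.0, 2.0]]

# ... becomes invertible after adding a small fraction of the total covariance.
stable = stabilize(singular, total, 0.001)
determinant = stable[0][0] * stable[1][1] - stable[0][1] * stable[1][0]
```

With these example values the determinant of the stabilized matrix is small but positive, which is the purpose of the stability coefficient.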
Algorithm
The componentwise optimization algorithm is described in Figueiredo & Jain (2002) where the function to be optimized is the minimum description length defined as:
L(θ,Y) = N/2 Σ_{m=1}^{k} ln(nα_{m}/12) + k/2 ln(n/12) + k(N+1)/2 - ln p(Y|θ),
where k is the number of components, α_{m} is the mixing probability of component m, N is the number of parameters of one component, i.e. d+d(d+1)/2 for a full covariance matrix of dimension d with means and d+d for a diagonal matrix with means; n is the number of data vectors. The term ln p(Y|θ) is the log likelihood of the data given the model.
For the optimization we either optimize the complete function L(θ,Y) or only the likelihood term ln p(Y|θ).
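The description length can be evaluated directly from its definition. The Python sketch below is only an illustration of the formula, not the code used by Praat. The function names are hypothetical, and the sketch assumes that the log likelihood has already been computed and that all mixing probabilities passed in are positive, i.e. belong to components that have not been deleted.

```python
import math


def parameters_per_component(d, full_covariance=True):
    """N: d means plus d(d+1)/2 (full) or d (diagonal) covariance parameters."""
    if full_covariance:
        return d + d * (d + 1) // 2
    return d + d


def description_length(log_likelihood, mixing_probabilities, n, d,
                       full_covariance=True):
    """L(theta, Y) for k components, n data vectors of dimension d.

    Assumption: every mixing probability is > 0 (surviving components only).
    """
    big_n = parameters_per_component(d, full_covariance)
    k = len(mixing_probabilities)
    # N/2 * sum over components of ln(n * alpha_m / 12)
    penalty = (big_n / 2) * sum(
        math.log(n * alpha / 12) for alpha in mixing_probabilities
    )
    # k/2 * ln(n/12) + k(N+1)/2
    penalty += (k / 2) * math.log(n / 12) + k * (big_n + 1) / 2
    # Subtract the log likelihood ln p(Y|theta)
    return penalty - log_likelihood


# One component, 12 one-dimensional data vectors: both logarithms vanish,
# so the result is k(N+1)/2 minus the log likelihood.
example = description_length(-10.0, [1.0], n=12, d=1)
```

In this example N = 2 and k = 1, so the penalty is 1.5 and the description length is 1.5 - (-10) = 11.5.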
© djmw, November 20, 2010