GaussianMixture & TableOfReal: To GaussianMixture (CEMM)...
Find the best GaussianMixture from the data according to an iterative Component-wise Expectation-Maximization for Mixtures algorithm by which components may be deleted.
Settings
- Minimum number of components
- defines the minimum number of components that have to survive the minimization process. If a value of zero is chosen, all components will survive and no deletions will take place.
- Tolerance of minimizer
- defines when to stop optimizing. Iteration stops when the relative difference between the likelihoods at two successive iteration steps is less than the tolerance, i.e. when |(L(i-1)-L(i))/L(i)| < tolerance.
- Maximum number of iterations
- defines another stop criterion. Iteration stops whenever the number of iterations reaches this value.
- Stability coefficient lambda
- defines the fraction of the total covariance that is added to the covariance of each component to prevent these matrices from becoming singular.
- Criterion based on
- defines whether the function to be optimized is the log likelihood or the related minimum description length.
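The two stop criteria and the stabilization step can be sketched in Python as follows. This is a minimal illustration and not Praat's implementation: the function names `should_stop` and `stabilize_covariance` are hypothetical, and adding `lam` times the total covariance to each component covariance is an assumed reading of the description of the stability coefficient.

```python
import numpy as np


def should_stop(prev_loglik, loglik, iteration, tolerance, max_iterations):
    """Return True when either stop criterion from the settings is met."""
    # Criterion 1: relative change |(L(i-1) - L(i)) / L(i)| is below the tolerance.
    # A log likelihood of exactly zero is not handled in this sketch.
    relative_change = abs((prev_loglik - loglik) / loglik)
    if relative_change < tolerance:
        return True
    # Criterion 2: the maximum number of iterations has been reached.
    return iteration >= max_iterations


def stabilize_covariance(component_cov, total_cov, lam):
    """Add a fraction lam of the total covariance to one component covariance.

    Assumption: the stabilized matrix is component_cov + lam * total_cov.
    """
    component_cov = np.asarray(component_cov, dtype=float)
    total_cov = np.asarray(total_cov, dtype=float)
    return component_cov + lam * total_cov


# Example: a singular component covariance becomes invertible after stabilization.
singular = np.array([[1.0, 1.0], [1.0, 1.0]])
total = np.eye(2)
stabilized = stabilize_covariance(singular, total, 0.01)
```

In this example the determinant of `singular` is zero, while `stabilized` has a small positive determinant, which is the effect the stability coefficient is meant to have.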
Algorithm
The component-wise optimization algorithm is described in Figueiredo & Jain (2002), where the function to be optimized is the minimum description length, defined as:
$$L(\theta, Y) = \frac{N}{2} \sum_{m=1}^{k} \ln\left(\frac{n\alpha_m}{12}\right) + \frac{k}{2} \ln\left(\frac{n}{12}\right) + \frac{k(N+1)}{2} - \ln p(Y|\theta),$$
where k is the number of components, α_m is the mixing proportion of component m, N is the number of parameters of one component (i.e. d+d(d+1)/2 for a full covariance matrix of dimension d with means and d+d for a diagonal matrix with means), and n is the number of data vectors. The term ln p(Y|θ) is the log likelihood of the data given the model.
For the optimization we either optimize the complete function L(θ,Y) or only the log likelihood term ln p(Y|θ).
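As a worked illustration, the minimum description length can be evaluated in Python from the mixing proportions (the α in the formula) and a precomputed log likelihood. This is a minimal sketch: the function names are hypothetical, and it assumes that all mixing proportions passed in are strictly positive, i.e. that deleted components have already been removed.

```python
import math


def parameters_per_component(d, diagonal=False):
    """Number of free parameters N of one component in d dimensions."""
    if diagonal:
        return d + d  # d means plus d variances
    return d + d * (d + 1) // 2  # d means plus d(d+1)/2 covariance entries


def minimum_description_length(mixing, log_likelihood, n, d, diagonal=False):
    """Evaluate L(theta, Y) for the k surviving components.

    mixing: strictly positive mixing proportions, one per component
    log_likelihood: ln p(Y|theta), computed elsewhere
    n: number of data vectors
    """
    k = len(mixing)
    n_params = parameters_per_component(d, diagonal)
    proportions_term = 0.5 * n_params * sum(math.log(n * a / 12.0) for a in mixing)
    return (proportions_term
            + 0.5 * k * math.log(n / 12.0)
            + 0.5 * k * (n_params + 1)
            - log_likelihood)


# Example: two equally weighted one-dimensional components and 24 data vectors.
mdl = minimum_description_length([0.5, 0.5], log_likelihood=-10.0, n=24, d=1)
```

When the criterion is the log likelihood alone, only the last term of this sum is used and the penalty terms that depend on k and N are left out.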
© djmw 20230801