concentration ellipse

The percentage of bivariate normally distributed data covered by an ellipse whose semi-axes have lengths of numberOfSigmas · σ can be obtained by integrating the probability density function over the elliptical area. This results in the following equation, as can be verified from equation 26.3.21 in Abramowitz & Stegun (1970):

percentage = (1 - exp (-numberOfSigmas²/2)) · 100%,

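where numberOfSigmas is the "radius" of the ellipse, written here in coordinates centred on the mean and aligned with the principal axes of the distribution: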

(x/σ_x)² + (y/σ_y)² = numberOfSigmas².

The numberOfSigmas=1 ellipse covers 39.3% of the data, the numberOfSigmas=2 ellipse 86.5%, and the numberOfSigmas=3 ellipse 98.9%.
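
The percentages above can be checked numerically by evaluating the formula directly. The following Python sketch is for illustration only; the function name coverage_percentage is made up for this example and is not a Praat command.

```python
import math

def coverage_percentage(number_of_sigmas):
    """Percentage of bivariate normal data inside the numberOfSigmas ellipse.

    Hypothetical helper: evaluates (1 - exp(-numberOfSigmas^2 / 2)) * 100.
    """
    return (1.0 - math.exp(-number_of_sigmas ** 2 / 2.0)) * 100.0

# Print the coverage for the three ellipses mentioned in the text.
for k in (1, 2, 3):
    print(f"numberOfSigmas = {k}: {coverage_percentage(k):.1f}%")
```

Note that these values are lower than the familiar 68.3%, 95.4% and 99.7% of the one-dimensional normal distribution, because the ellipse has to enclose the data in two dimensions at once.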

Solving the formula above for numberOfSigmas shows that, to cover p percent of the data, we have to choose:

numberOfSigmas = √(-2 ln(1-p/100)).

To cover 95% of the data we therefore need numberOfSigmas = √(-2 ln 0.05) ≈ 2.45.
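
The inverse relation can be sketched in the same way. This is a minimal Python example under the assumption that p is given as a percentage in the range [0, 100); the function name number_of_sigmas_for is hypothetical and chosen only for this illustration.

```python
import math

def number_of_sigmas_for(p):
    """numberOfSigmas needed so that the ellipse covers p percent of the data.

    Hypothetical helper: evaluates sqrt(-2 ln(1 - p/100)).
    """
    if not 0.0 <= p < 100.0:
        # p = 100 would require an infinitely large ellipse.
        raise ValueError("p must satisfy 0 <= p < 100")
    return math.sqrt(-2.0 * math.log(1.0 - p / 100.0))

# The 95% ellipse from the text.
print(f"{number_of_sigmas_for(95):.2f}")
```

A value computed this way could then be entered wherever a command asks for the number of sigmas of an ellipse.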

Links to this page

Discriminant: Draw sigma ellipses...
Discriminant: Get concentration ellipse area...
GaussianMixture & PCA: Draw concentration ellipses...
GaussianMixture: Draw concentration ellipses...
SSCP: Draw sigma ellipse...