Covariance & TableOfReal: To TableOfReal (mahalanobis)...

Calculates the Mahalanobis distance of each row in the selected TableOfReal with respect to the selected Covariance object.

Setting

Use table centroid
Use the mean vector calculated from the columns in the selected TableOfReal instead of the means in the selected Covariance.

Explanation

The Mahalanobis distance is defined as

d = √((x − mean)′ S⁻¹ (x − mean)),

where x is a data vector (one row of the table), mean is the vector of means and S is the covariance matrix.

It is the multivariate form of the distance measured in units of standard deviation: in one dimension it reduces to |x − mean| / σ, where σ is the standard deviation.
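The definition above can be checked by hand for a small case. The following is a minimal sketch in Python, not Praat's own implementation; the function name `mahalanobis_2d` and the sample numbers are made up for illustration, and the sketch is restricted to two dimensions so that the inverse of S can be written out explicitly.

```python
import math

def mahalanobis_2d(x, mean, cov):
    """Mahalanobis distance of a 2-D point x from mean, given a 2x2 covariance matrix."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    if det == 0:
        raise ValueError("covariance matrix is singular")
    # Inverse of a 2x2 matrix, written out explicitly.
    inv = ((d / det, -b / det), (-c / det, a / det))
    dx = (x[0] - mean[0], x[1] - mean[1])
    # Quadratic form (x - mean)' S^-1 (x - mean).
    q = (dx[0] * (inv[0][0] * dx[0] + inv[0][1] * dx[1])
         + dx[1] * (inv[1][0] * dx[0] + inv[1][1] * dx[1]))
    return math.sqrt(q)

# Hypothetical data: uncorrelated columns with variances 4 and 9,
# so the point (2, 3) lies one standard deviation away along each axis.
dist = mahalanobis_2d((2.0, 3.0), (0.0, 0.0), ((4.0, 0.0), (0.0, 9.0)))
print(dist)
```

With uncorrelated columns the distance is the square root of the sum of the squared z-scores, which is why this point, one standard deviation out on both axes, ends up at a distance of √2.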

Example

Count the number of items that are within 1, 2, 3, 4 and 5 standard deviations from the mean.

We first create a table with only one column and 100000 rows and fill it with numbers drawn from a normal distribution with mean zero and standard deviation one. Its covariance matrix, of course, is one dimensional. We next create a table with Mahalanobis distances.

n = 100000
t0 = Create TableOfReal... table n 1
Formula... randomGauss(0,1)
c = To Covariance
plus t0
ts = To TableOfReal (mahalanobis)... no

for nsigma to 5
    select ts
    Extract rows where... self < nsigma
    nr = Get number of rows
    nrp = nr / n * 100
    expect = (1 - 2 * gaussQ (nsigma)) * 100
    printline 'nsigma'-sigma: 'nrp:4', 'expect:4'
    Remove
endfor
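For readers who want to cross-check the numbers outside Praat, the same experiment can be sketched in Python with only the standard library. This is an illustration under stated assumptions, not a translation of Praat's internals: `gauss_q` is a hypothetical stand-in for Praat's gaussQ (the upper-tail probability of the standard normal distribution), the seed is arbitrary, and the variance is assumed to be estimated with an n − 1 denominator. In one dimension the Mahalanobis distance is simply |x − mean| divided by the standard deviation.

```python
import math
import random

def gauss_q(z):
    """Upper-tail probability of the standard normal distribution."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

random.seed(1)  # arbitrary seed, only to make the sketch repeatable
n = 100000
xs = [random.gauss(0.0, 1.0) for _ in range(n)]

# Sample mean and standard deviation (n - 1 denominator, an assumption).
mean = sum(xs) / n
sd = math.sqrt(sum((x - mean) ** 2 for x in xs) / (n - 1))

# One-dimensional Mahalanobis distances.
dists = [abs(x - mean) / sd for x in xs]

results = []
for nsigma in range(1, 6):
    observed = 100.0 * sum(1 for d in dists if d < nsigma) / n
    expected = 100.0 * (1.0 - 2.0 * gauss_q(nsigma))
    results.append((nsigma, observed, expected))
    print(f"{nsigma}-sigma: {observed:.4f}, {expected:.4f}")
```

The observed percentages depend on the random draw, so they should only agree with the expected ones to within sampling error.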

© djmw, January 6, 2010