Confusion: To Dissimilarity (pdf)...

A command that creates a Dissimilarity from every selected Confusion.

Settings

Symmetrize first
when on, the confusion matrix is symmetrized before we calculate dissimilarities.
Maximum dissimilarity (units of sigma)
specifies the dissimilarity assigned to confusion matrix elements that are zero.

Algorithm

1. Normalize rows by dividing each row element by the row sum (optional).
2. Symmetrize the matrix by averaging f_ij and f_ji.
3. Transform the confusion measure, which is a kind of similarity measure, into a dissimilarity measure.
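
Steps 1 and 2 above can be sketched in Python as follows. This is only an illustration, not the actual implementation of this command: the function names, the example counts and the handling of rows that sum to zero are assumptions made for the sketch.

```python
def normalize_rows(matrix):
    """Step 1: divide each row element by the row sum.

    Assumption of this sketch: a row that sums to zero is copied unchanged.
    """
    result = []
    for row in matrix:
        total = sum(row)
        if total > 0:
            result.append([value / total for value in row])
        else:
            result.append(list(row))
    return result


def symmetrize(matrix):
    """Step 2: replace f_ij and f_ji by their average."""
    n = len(matrix)
    return [[(matrix[i][j] + matrix[j][i]) / 2 for j in range(n)]
            for i in range(n)]


# Made-up confusion counts for three objects (rows: stimulus, columns: response).
confusion = [[8, 2, 0],
             [1, 6, 3],
             [0, 4, 6]]

f = symmetrize(normalize_rows(confusion))
```

After these two steps f is symmetric, so f[0][1] and f[1][0] hold the same averaged fraction.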

Similarity and dissimilarity have an inverse relationship: the greater the similarity, the smaller the dissimilarity, and vice versa. Both have a monotonic relationship with distance. The simplest way to transform the similarities f_ij into dissimilarities is:

dissimilarity_ij = maximumSimilarity − similarity_ij
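
As a minimal sketch of this simple transformation, the Python fragment below subtracts every similarity from the maximum similarity. Taking the maximum over all elements of the matrix is an assumption of the sketch, and the function name and example values are hypothetical.

```python
def simple_dissimilarity(similarity):
    """Return maximumSimilarity - similarity_ij for every element.

    Assumption: maximumSimilarity is the largest element of the matrix.
    """
    maximum_similarity = max(max(row) for row in similarity)
    return [[maximum_similarity - s for s in row] for row in similarity]


# Made-up symmetric similarity (confusion) fractions for three objects.
similarity = [[0.70, 0.15, 0.00],
              [0.15, 0.60, 0.35],
              [0.00, 0.35, 0.60]]

dissimilarity = simple_dissimilarity(similarity)
```

The order of the values is reversed but their spacing is preserved, which is why this transformation is linear in the confusion.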

For ordinal analyses like Kruskal this transformation is fine because only order relations are important in this analysis. However, for metrical analyses like INDSCAL this is not optimal. In INDSCAL, distance is a linear function of dissimilarity. This means that, with the transformation above, you ultimately fit an INDSCAL model in which the distance between object i and j will be linearly related to the confusion between i and j.

For the relation between confusion and dissimilarity, the model implemented here assumes that the amount of confusion between objects i and j is related to the amount by which their probability density functions (pdfs) overlap. Because we do not know these pdfs, we assume that both are normal, have equal sigma and are one-dimensional. The parameter to be determined is the distance between the centres of the two pdfs. According to formula 26.2.23 in Abramowitz & Stegun (1970), for each fraction f_ij, we have to find an x that solves:

f_ij = 1/√(2π) ∫_x^∞ e^(−t²/2) dt

This x will be used as the dissimilarity between i and j. The relation between x and f_ij is monotonic. This means that the results for a Kruskal analysis will not change much. For INDSCAL, in general, you will note a significantly better fit.
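
To make the pdf-based transformation concrete, the Python sketch below solves the equation above for x with the inverse normal distribution function from the standard library, rather than with the Abramowitz & Stegun approximation. It is not the implementation of this command: the function name, the default of 3.0 for the maximum dissimilarity, and the clamping of x to the range from 0 to that maximum are assumptions of the sketch.

```python
from statistics import NormalDist


def pdf_dissimilarity(fraction, max_dissimilarity=3.0):
    """Find x (in units of sigma) whose upper normal tail area equals fraction.

    Assumptions of this sketch: a zero fraction gets max_dissimilarity
    (the exact solution would be infinite), and x is clamped to
    the range 0 .. max_dissimilarity.
    """
    if fraction <= 0.0:
        return max_dissimilarity
    if fraction >= 1.0:
        return 0.0
    # The upper tail area at x equals the lower tail area at -x,
    # so x is minus the inverse cumulative distribution at fraction.
    x = -NormalDist().inv_cdf(fraction)
    return min(max(x, 0.0), max_dissimilarity)


# Made-up confusion fractions: more confusion gives a smaller dissimilarity.
x_small_confusion = pdf_dissimilarity(0.15)
x_large_confusion = pdf_dissimilarity(0.35)
x_no_confusion = pdf_dissimilarity(0.0)
```

Here the zero fraction receives the value of the hypothetical max_dissimilarity argument, which plays the role of the Maximum dissimilarity setting.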

© djmw, April 7, 2004