kNN classifiers 1.1.1. Feature weighting

A kNN classifier in its most basic form operates under the implicit assumption that all features are equally valuable for the classification problem at hand. When irrelevant and noisy features influence the neighbourhood search to the same degree as highly relevant features, the accuracy of the model is likely to deteriorate. Feature weighting is a technique that uses a training set to approximate the optimal degree of influence of each individual feature. When it is applied successfully, relevant features are assigned a high weight value, whereas irrelevant features are given a weight value close to zero. Feature weighting can be used not only to improve classification accuracy but also to discard features whose weights fall below a certain threshold value, thereby increasing the resource efficiency of the classifier.
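To make the role of the weights concrete, here is a minimal Python sketch of a kNN classifier with a weighted Euclidean distance. It is an illustration only, not the Praat implementation: the function names, the toy data and the threshold value are all assumptions made for this example.

```python
import math
from collections import Counter

def weighted_distance(a, b, weights):
    """Euclidean distance with each squared feature difference scaled by its weight."""
    return math.sqrt(sum(w * (x - y) ** 2 for x, y, w in zip(a, b, weights)))

def knn_predict(train_x, train_y, query, weights, k=3):
    """Majority vote among the k training points nearest to the query."""
    order = sorted(range(len(train_x)),
                   key=lambda i: weighted_distance(train_x[i], query, weights))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]

# Hypothetical toy data: feature 0 separates the classes,
# feature 1 is noise on a much larger scale.
train_x = [[0.0, 2.0], [0.1, 0.0], [0.2, 5.0],
           [1.0, 0.5], [1.1, 9.5], [0.9, 4.0]]
train_y = ["a", "a", "a", "b", "b", "b"]
query = [0.15, 9.4]  # feature 0 places it clearly among the "a" points

# Equal weights: the noisy feature dominates the neighbourhood search.
uniform = knn_predict(train_x, train_y, query, [1.0, 1.0])

# Weight of zero on the noisy feature: only the relevant feature counts.
weights = [1.0, 0.0]
weighted = knn_predict(train_x, train_y, query, weights)

# Features below an (assumed) threshold can be dropped entirely.
threshold = 0.05
kept = [i for i, w in enumerate(weights) if w >= threshold]
```

With equal weights the query is assigned to class "b", because its nearest neighbours are decided by the noisy feature. With the weights [1.0, 0.0] it is assigned to class "a", and only feature 0 survives the threshold.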

Two fundamentally different approaches to this optimization problem can be identified: the filter-based and the wrapper-based. The class of filter-based methods contains algorithms that use no input other than the training data itself to calculate the feature weights, whereas wrapper-based algorithms use feedback from a classifier to guide the search. Wrapper-based algorithms are inherently more powerful than their filter-based counterparts, as they implicitly take the inductive bias of the classifier into account. This power comes at a price, however: the use of wrapper-based algorithms increases the risk of overfitting the training data.
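The difference between the two approaches can be sketched in Python as follows. This is a hedged illustration and not either of the algorithms implemented in Praat: the filter score (spread of the class means relative to the feature range), the grid search used as a wrapper, and all names and data are assumptions chosen for brevity.

```python
import math
from collections import Counter
from itertools import product

def knn_predict(train_x, train_y, query, weights, k=1):
    """Weighted-Euclidean kNN with a majority vote."""
    def dist(a):
        return math.sqrt(sum(w * (x - y) ** 2
                             for x, y, w in zip(a, query, weights)))
    order = sorted(range(len(train_x)), key=lambda i: dist(train_x[i]))
    return Counter(train_y[i] for i in order[:k]).most_common(1)[0][0]

def filter_weights(xs, ys):
    """Filter style: score each feature from the training data alone.
    Assumed score: spread of the per-class means divided by the feature range."""
    classes = sorted(set(ys))
    weights = []
    for j in range(len(xs[0])):
        col = [x[j] for x in xs]
        means = [sum(v for v, y in zip(col, ys) if y == c) / ys.count(c)
                 for c in classes]
        spread = max(col) - min(col)
        weights.append((max(means) - min(means)) / spread if spread else 0.0)
    return weights

def loo_accuracy(xs, ys, weights, k=1):
    """Leave-one-out accuracy of the weighted kNN classifier on the training set."""
    hits = 0
    for i in range(len(xs)):
        rest_x = xs[:i] + xs[i + 1:]
        rest_y = ys[:i] + ys[i + 1:]
        hits += knn_predict(rest_x, rest_y, xs[i], weights, k) == ys[i]
    return hits / len(xs)

def wrapper_weights(xs, ys, grid=(0.0, 0.25, 0.5, 0.75, 1.0)):
    """Wrapper style: try candidate weight vectors and keep the one the
    classifier itself scores best (exhaustive grid search, for illustration)."""
    best, best_acc = None, -1.0
    for cand in product(grid, repeat=len(xs[0])):
        acc = loo_accuracy(xs, ys, list(cand))
        if acc > best_acc:
            best, best_acc = list(cand), acc
    return best, best_acc

# Hypothetical toy data: feature 0 is relevant, feature 1 is large-scale noise.
xs = [[0.0, 2.0], [0.1, 0.0], [0.2, 5.0],
      [1.0, 0.5], [1.1, 9.5], [0.9, 4.0]]
ys = ["a", "a", "a", "b", "b", "b"]

fw = filter_weights(xs, ys)            # computed without any classifier
ww, ww_acc = wrapper_weights(xs, ys)   # guided by classifier feedback
baseline = loo_accuracy(xs, ys, [1.0, 1.0])
```

On this toy set both approaches rank feature 0 above feature 1. Note that the wrapper selects its weights by the same accuracy estimate it then reports, which is exactly how the overfitting risk mentioned above arises; a held-out set would be needed for an honest estimate.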

In section 1.1.1.1. the filter-based feature weighting algorithm implemented in Praat is presented. Section 1.1.1.2. contains an account of the implemented wrapper-based feature weighting algorithm.


© Ola Söder, May 29, 2008