Description
A method is presented that quantifies the discriminative power of the input features in a fuzzy model. The proposed quantification helps the interpretation of fuzzy models constructed on high-dimensional and very fragmented training sets. First, a measure of the information contained in the fuzzy model is defined on the basis of its fuzzy rules. The classification is then performed along one of the input features, that is, the fuzzy rules are split according to that feature's linguistic values. For each linguistic value, a fuzzy sub-model is generated from the original fuzzy model. The average information contained in these fuzzy sub-models is measured, and its comparison with the information measure of the original fuzzy model quantifies the information gain that derives from the classification performed on the selected input feature. This information gain characterizes the discriminative power of that input feature. The proposed information gain can therefore be used to gain better insight into the selected fuzzy classification strategy, even in very high-dimensional cases, and possibly to reduce the input dimension.
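The procedure described above can be illustrated with a minimal sketch. The abstract does not specify the information measure, so the code below assumes one for illustration only: the Shannon entropy of the class distribution of the rules, with each rule contributing a weight. The rule representation (antecedent dictionary, class label, weight) and the function names `entropy` and `information_gain` are hypothetical and are not taken from the paper.

```python
import math
from collections import defaultdict

# Hypothetical rule format: (antecedent, class_label, weight), where
# antecedent maps each input feature to one of its linguistic values.

def entropy(rules):
    """Shannon entropy (bits) of the weight-normalized class distribution.

    Assumption: this stands in for the paper's information measure,
    which the abstract does not define.
    """
    totals = defaultdict(float)
    for _, label, weight in rules:
        totals[label] += weight
    total = sum(totals.values())
    if total == 0:
        return 0.0
    return -sum((w / total) * math.log2(w / total)
                for w in totals.values() if w > 0)

def information_gain(rules, feature):
    """Entropy of the full rule set minus the weighted average entropy
    of the sub-models obtained by splitting on `feature`."""
    # One sub-model per linguistic value of the selected feature.
    sub_models = defaultdict(list)
    for rule in rules:
        sub_models[rule[0][feature]].append(rule)
    total = sum(weight for _, _, weight in rules)
    average = sum(
        sum(weight for _, _, weight in sub) / total * entropy(sub)
        for sub in sub_models.values()
    )
    return entropy(rules) - average

# Toy model: the class depends only on x1, so x2 carries no information.
rules = [
    ({"x1": "low", "x2": "low"}, "A", 1.0),
    ({"x1": "low", "x2": "high"}, "A", 1.0),
    ({"x1": "high", "x2": "low"}, "B", 1.0),
    ({"x1": "high", "x2": "high"}, "B", 1.0),
]
gain_x1 = information_gain(rules, "x1")
gain_x2 = information_gain(rules, "x2")
```

In this toy model, splitting on `x1` yields sub-models that each contain a single class, so the gain equals the full entropy of the model, whereas splitting on `x2` leaves the class mixture unchanged and the gain is zero. Under these assumptions, features could be ranked by gain to flag candidates for removal.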
Several analyses of artificial and real-world data are reported as examples to illustrate the characteristics and potential of the proposed algorithm. As real-world examples, the most informative electrocardiographic measures are detected for an arrhythmia classification problem, and the role of duration, amplitude, and pitch variations of syllabic nuclei in American English spoken sentences is investigated for prosodic stress classification.