Cai, Di (2010) An Information-Theoretic Foundation for the Measurement of Discrimination Information. IEEE Transactions on Knowledge and Data Engineering, 22 (9). pp. 1262-1273. ISSN 1041-4347

Hitherto, it has not been easy to interpret the meaning of the amount of discrimination information conveyed in a term
rationally and explicitly within practical application contexts; it has not been simple to introduce the concept of the extent of semantic
relatedness between two terms meaningfully and successfully into scientific discussions. This study is part of an attempt to do this. We
attempt to answer two important questions: 1) What is the discrimination information conveyed by a term, and how can it be measured?
2) What is the relatedness between two terms, and how can it be estimated? We focus on the first question and present an in-depth
investigation into the discrimination measures based on several information measures, which are widely used in a variety of
applications. The relatedness measures are then naturally defined according to the individual discrimination measures. Some key
points are made for clarifying potential problems arising from using the relatedness measures, and solutions are suggested. Two
example applications in the contexts of text mining and information retrieval are provided. The aim of this study, of which this paper
forms part, is to establish a unified theoretical framework, with measurement of discrimination information (MDI) at the core, for
achieving effective measurement of semantic relatedness (MSR). Owing to its generality, our method can be expected to be a useful tool
across a wide range of application areas.