Cai, Di and van Rijsbergen, C.J. (2009) Learning semantic relatedness from term discrimination information. Expert Systems With Applications, 36 (2). pp. 1860-1875. ISSN 0957-4174
Abstract

Formalizing and quantifying the intuitive notion of relatedness between terms has long been a major challenge for computing science, and an intriguing problem for other sciences. In this study, we address the challenge by considering a general notion of relatedness between terms and a given topic. We introduce a formal definition of a relatedness measure based on term discrimination measures. Measurement of discrimination information (MDI) of terms is a fundamental issue for many areas of science. We focus on MDI and present an in-depth investigation into the concept of discrimination information conveyed by a term. Information radius, an information measure relevant to a wide variety of applications, is the basis of this investigation. In particular, we formally interpret discrimination measures in terms of a simple but important property identified by this study, and argue that this interpretation is essential for guiding their application. The discrimination measures can then be used naturally and conveniently to formalize and quantify the relatedness between terms and a given topic. Some key points about the information radius, discrimination measures and relatedness measures are also made. An example is given to demonstrate how the relatedness measures can deal with some basic concepts of applications in the context of text information retrieval (IR). We summarize important features of, and differences between, the information radius and two other information measures, from a practical perspective. This study is part of an attempt to establish a theoretical framework, with MDI at its core, for the effective estimation of semantic relatedness between terms. Because of its generality, our method can be expected to be a useful tool across a wide range of application areas.

Documents
Cai+2009.pdf - Published Version
Restricted to Repository staff only