Identifying Domains and Concepts in Short Texts via Partial Taxonomy and Unlabeled Data

Zhang, Yihong, Szabo, Claudia, Sheng, Quan Z., Zhang, Wei Emma and Qin, Yongrui (2017) Identifying Domains and Concepts in Short Texts via Partial Taxonomy and Unlabeled Data. In: The 29th International Conference on Advanced Information Systems Engineering (CAiSE), 12-16 June 2017, Essen, Germany. (Unpublished)

PDF - Accepted Version

Abstract

Accurate, real-time identification of domains and concepts discussed in microblogging texts is crucial for many important applications such as earthquake monitoring, influenza surveillance and disaster management. Existing techniques such as machine learning and keyword generation are application specific and require a significant amount of training to achieve high accuracy. In this paper, we propose to use a multiple domain taxonomy (MDT) to capture general user knowledge. We formally define the problems of domain classification and concept tagging. Using the MDT, we devise domain-independent pure frequency count methods that require neither training data nor annotations and that are not sensitive to misspellings or shortened word forms. Our extensive experimental analysis on real Twitter data shows that both methods achieve significantly better identification accuracy than existing methods while maintaining low runtime on large datasets.
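
The general idea of frequency counting against a taxonomy can be illustrated with a minimal sketch. This is not the authors' implementation: the toy taxonomy, the tokenizer, and the function names (`count_matches`, `classify_domain`, `tag_concepts`) are all hypothetical, and the sketch uses exact term matching only, so it does not reproduce the tolerance to misspellings or shortened word forms described in the abstract.

```python
import re
from collections import Counter

# Hypothetical toy taxonomy (domain -> concept -> terms); not taken from the paper.
TAXONOMY = {
    "earthquake": {
        "seismic_event": {"earthquake", "quake", "tremor", "aftershock"},
        "damage": {"collapsed", "rubble"},
    },
    "influenza": {
        "symptom": {"fever", "cough", "sneezing"},
        "illness": {"flu", "influenza"},
    },
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def count_matches(text, taxonomy=TAXONOMY):
    """Count exact taxonomy-term hits per domain and per (domain, concept)."""
    tokens = tokenize(text)
    domain_counts = Counter()
    concept_counts = Counter()
    for domain, concepts in taxonomy.items():
        for concept, terms in concepts.items():
            hits = sum(1 for tok in tokens if tok in terms)
            if hits:
                domain_counts[domain] += hits
                concept_counts[(domain, concept)] += hits
    return domain_counts, concept_counts


def classify_domain(text, taxonomy=TAXONOMY):
    """Return the domain with the most term hits, or None if nothing matches."""
    domain_counts, _ = count_matches(text, taxonomy)
    if not domain_counts:
        return None
    return domain_counts.most_common(1)[0][0]


def tag_concepts(text, taxonomy=TAXONOMY):
    """Return the sorted (domain, concept) pairs that have at least one hit."""
    _, concept_counts = count_matches(text, taxonomy)
    return sorted(concept_counts)
```

Under these assumptions, `classify_domain` picks the domain whose terms occur most often in a short text and `tag_concepts` lists the matched concepts. No training data is involved; all knowledge comes from the taxonomy itself.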

Item Type: Conference or Workshop Item (Paper)
Subjects: Q Science > QA Mathematics > QA76 Computer software
Schools: School of Computing and Engineering
School of Computing and Engineering > High-Performance Intelligent Computing > Planning, Autonomy and Representation of Knowledge
Depositing User: Yongrui Qin
Date Deposited: 22 May 2017 13:46
Last Modified: 21 Jul 2017 00:24
URI: http://eprints.hud.ac.uk/id/eprint/31928

