Hybrid Classification for Tweets Related to Infection with Influenza

Xiangfeng Dai, Marwan Bikdash

Research output: Contribution to conference (Paper)

Abstract

Traditional public health surveillance methods, such as those employed by the United States Centers for Disease Control and Prevention (CDC), rely on regular clinical reports, which are almost always manual and labor-intensive. Twitter, a popular micro-blogging service, offers the possibility of automated public health surveillance. Tweets, however, are limited to 140 characters and do not provide enough word occurrences for conventional classification methods to work reliably. Moreover, the complexity of natural language makes health-related classification even more challenging. In this study, we use flu-related classification to demonstrate a hybrid classification method that combines two classification approaches: one based on manually defined features and one based on features generated automatically by machine learning. Preprocessing based on Natural Language Processing (NLP) is used to help extract useful information and to eliminate noisy features. Our simulations show improved accuracy.
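The hybrid idea in the abstract can be illustrated with a minimal sketch. The abstract does not specify the paper's features, preprocessing steps, or classifier, so everything here is an assumption made for illustration: the word lists (`STOPWORDS`, `SYMPTOMS`, `SELF_REFS`), the two manual flags, the bag-of-words features, the small Naive Bayes model, and the toy training tweets are all hypothetical.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical word lists; the abstract does not give the actual ones.
STOPWORDS = {"a", "an", "the", "i", "is", "am", "to", "of", "and", "my", "for", "in", "on", "it"}
SYMPTOMS = {"fever", "cough", "chills", "aches", "sick"}
SELF_REFS = {"i", "my", "me"}


def preprocess(tweet):
    """Lowercase, drop URLs and @mentions, and split into alphabetic tokens."""
    text = re.sub(r"https?://\S+|@\w+", " ", tweet.lower())
    return re.findall(r"[a-z]+", text)


def extract_features(tweet):
    """Combine auto-generated bag-of-words counts with manually defined flags."""
    tokens = preprocess(tweet)
    # Auto-generated features: one count per non-stopword token.
    features = Counter(f"w={t}" for t in tokens if t not in STOPWORDS)
    # Manually defined features: hand-written rules over the same tokens.
    if any(t in SYMPTOMS for t in tokens):
        features["m=has_symptom"] += 1
    if any(t in SELF_REFS for t in tokens):
        features["m=self_reference"] += 1
    return features


class NaiveBayes:
    """Multinomial Naive Bayes with add-one smoothing over the combined features."""

    def __init__(self):
        self.class_counts = Counter()
        self.feature_counts = defaultdict(Counter)
        self.vocab = set()

    def fit(self, tweets, labels):
        for tweet, label in zip(tweets, labels):
            feats = extract_features(tweet)
            self.class_counts[label] += 1
            self.feature_counts[label].update(feats)
            self.vocab.update(feats)
        return self

    def predict(self, tweet):
        feats = extract_features(tweet)
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.feature_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)
            for feat, n in feats.items():
                if feat in self.vocab:  # ignore features never seen in training
                    score += n * math.log((counts[feat] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy, made-up training data for illustration only.
train = [
    ("I have a fever and a bad cough, staying home", "infection"),
    ("my whole body aches, I think I caught the flu", "infection"),
    ("CDC publishes new flu vaccine guidance http://example.com", "other"),
    ("flu season news: clinics extend hours", "other"),
]
clf = NaiveBayes().fit([t for t, _ in train], [y for _, y in train])
print(clf.predict("ugh I am sick with chills and fever"))
```

Because a tweet supplies so few word occurrences, the manual flags give the model a signal even when most of a tweet's words never appeared in training. In the usage line above, only `fever` and the two flags are known to the model.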

Original language: English
State: Published - 2015
Event: 2015 IEEE SoutheastCon
Duration: Jan 1 2015 → …

Conference

Conference: 2015 IEEE SoutheastCon
Period: 01/1/15 → …
