Searching for the X-Factor: Exploring Corpus Subjectivity for Word Embeddings
Speaker(s): Maksim TKACHENKO, PhD Candidate, School of Information Systems, Singapore Management University
Date: June 29, 2018, Friday
Time: 2:00pm - 2:30pm
Venue: Meeting Room 5.1, Level 5, School of Information Systems, Singapore Management University, 80 Stamford Road, Singapore 178902
We look forward to seeing you at this research seminar.
About the Talk
Distributional word embedding methods such as Word2Vec and GloVe have been critical for the success of many large-scale natural language processing applications. In this talk, we explore the notion of subjectivity and how the varying levels of subjectivity in input corpora affect word embeddings for text classification (e.g., sentiment, subjectivity, topic). Through systematic comparative analysis, we discover the outsized role that sentiment words play in subjectivity-sensitive tasks and develop SentiVec, a novel word embedding method. The SentiVec embeddings are infused with sentiment information from a lexical resource and are shown to outperform baselines on such tasks.
This is a pre-conference talk for the 56th Annual Meeting of the Association for Computational Linguistics (ACL 2018).
About the Speaker
Maksim TKACHENKO is a PhD candidate at Singapore Management University (SMU). He received his diploma in mathematics and software engineering from Saint Petersburg State University, Russia, where he subsequently served as a research engineer. At SMU, his research focuses on text mining and natural language processing methods for user preference acquisition.