About The Talk
Over the past few decades, supervised machine learning has been one of the most important methodologies in the Natural Language Processing (NLP) community. Although many supervised learning methods have been proposed and achieve state-of-the-art performance on most NLP tasks, their bottleneck is a heavy reliance on large amounts of manually annotated in-domain data, which is not always available in the desired target domains, especially newly emerging ones. To alleviate this data sparsity issue in target domains, an attractive solution is to find sufficient labeled data from a related source domain. However, for most NLP applications, the distributions of the source and target domains differ, so training a supervised model solely on labeled source-domain data usually results in poor performance in the target domain. It is therefore necessary to develop effective techniques that leverage the rich annotations in source domains to improve model performance in target domains.
To address this problem, this thesis proposes novel domain adaptation methods for different NLP tasks, with the goal of inducing a domain-invariant latent feature space in which the knowledge gained from the source domain can be easily adapted to the target domain.
Firstly, we propose a simple yet effective unsupervised domain adaptation method, which reduces computational cost and scales easily to large applications. We show theoretically that our method assigns appropriate weights to target-specific features that co-occur with useful domain-independent features. Our extensive evaluations on three NLP tasks show that our method outperforms a number of baselines, including the widely used structural correspondence learning (SCL).
Secondly, we develop an unsupervised neural network-based domain adaptation framework, together with two novel auxiliary tasks for sentiment classification. We apply them to sentence-level and document-level sentiment classification, respectively, building on two state-of-the-art neural network models, and conduct a series of experiments to examine the effectiveness of the proposed framework. As future work, we plan to propose a supervised neural domain adaptation method for retrieval-based question answering systems that can efficiently and effectively improve model performance in the target domain.