Published online by Cambridge University Press: 14 March 2012
This paper proposes a novel application of topic models to entity relation detection (ERD). To exploit the latent semantics of text, we formulate relation detection as a topic modeling problem. The motivation is to find underlying topics that are indicative of relations between named entities (NEs). Our approach treats each pair of NEs, together with its associated features, as a mini document, and uses the underlying topic distributions as indicators of the types of relations that may exist between the NE pair. Our system, ERD-MedLDA, adapts Maximum Entropy Discriminant Latent Dirichlet Allocation (MedLDA) with mixed membership for relation detection. By using supervision, ERD-MedLDA is able to learn topic distributions indicative of relation types. Further, ERD-MedLDA is a topic model that combines the benefits of both maximum likelihood estimation (MLE) and maximum margin estimation (MME), and the mixed-membership formulation enables the system to incorporate heterogeneous features. We incorporate different features into the system and perform experiments on the ACE 2005 corpus. Our approach achieves better overall precision, recall, and F-measure than baseline SVM-based and LDA-based models. We also find that our system shows better and more consistent improvements than the baseline systems as complex informative features are added.
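
As a rough illustration of the "mini document" view described above, the following sketch turns one NE pair into a bag of features that a topic model could consume. The function name `mini_document`, the feature names (`TYPE1`, `M1`, `BETWEEN`, and so on), and the example sentence are assumptions made for illustration only; they are not the feature set or implementation used in ERD-MedLDA.

```python
from collections import Counter


def mini_document(tokens, e1, e2):
    """Build a bag-of-features 'mini document' for one named-entity pair.

    tokens: list of words in the sentence.
    e1, e2: dicts with 'start' and 'end' (token span, end exclusive) and 'type'.
    All feature names are hypothetical and chosen only for illustration.
    """
    # Order the two mentions by their position in the sentence.
    first, second = sorted((e1, e2), key=lambda e: e["start"])

    features = [
        "TYPE1=" + first["type"],   # entity type of the first mention
        "TYPE2=" + second["type"],  # entity type of the second mention
    ]
    # Words inside each mention.
    features += ["M1=" + w.lower() for w in tokens[first["start"]:first["end"]]]
    features += ["M2=" + w.lower() for w in tokens[second["start"]:second["end"]]]
    # Words between the two mentions.
    features += ["BETWEEN=" + w.lower() for w in tokens[first["end"]:second["start"]]]

    # The pair is represented by feature counts, like a word-count document.
    return Counter(features)


tokens = "Alice works for Acme Corp in Boston".split()
person = {"start": 0, "end": 1, "type": "PER"}
org = {"start": 3, "end": 5, "type": "ORG"}
doc = mini_document(tokens, person, org)
```

Each such count vector plays the role that a word-count document plays in ordinary LDA. In a supervised setting, the relation label of the pair would accompany the vector during training.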