
Possession identification in text

Published online by Cambridge University Press: 04 April 2018

CARMEN BANEA
Affiliation:
Department of Electrical Engineering and Computer Science, University of Michigan, Ann Arbor, MI, USA
RADA MIHALCEA
Affiliation:
Department of Electrical Engineering and Computer Science, University of Michigan, Ann Arbor, MI, USA

Abstract

Just as industrialization matured from mass production to customization and personalization, so has the Web migrated from generic content to public disclosures of one’s most intimately held thoughts, opinions, and beliefs. This relatively new type of data is able to represent finer and more narrowly defined demographic slices. While researchers have so far focused primarily on leveraging personalized content to identify latent information such as gender, nationality, location, or age, this article seeks to establish a structured way of extracting possessions, that is, items that people own or are entitled to, as a way to ultimately provide insights into people’s behaviors and characteristics. We introduce the new task of ‘possession identification in text’ and release a novel dataset in which possessions are marked at different confidence levels. We present experiments and results on automatically identifying and extracting possessions from text.

Type
Article
Copyright
Copyright © Cambridge University Press 2018 

