
Mining, analyzing, and modeling text written on mobile devices

Published online by Cambridge University Press:  10 October 2019

K. Vertanen*
Michigan Technological University, Houghton, MI, USA

P.O. Kristensson
University of Cambridge, Cambridge, UK

*Corresponding author. Email: [email protected]

Abstract

We present a method for mining the web for text entered on mobile devices. Using searching, crawling, and parsing techniques, we locate text that can be reliably identified as originating from 300 mobile devices. This includes 341,000 sentences written on iPhones alone. Our data enables a richer understanding of how users type “in the wild” on their mobile devices. We compare text and error characteristics of different device types, such as touchscreen phones, phones with physical keyboards, and tablet computers. Using our mined data, we train language models and evaluate these models on mobile test data. A mixture model trained on our mined data, Twitter, blog, and forum data predicts mobile text better than baseline models. Using phone and smartwatch typing data from 135 users, we demonstrate our models improve the recognition accuracy and word predictions of a state-of-the-art touchscreen virtual keyboard decoder. Finally, we make our language models and mined dataset available to other researchers.

Type
Article
Copyright
© Cambridge University Press 2019

