
EHME: a New Word Database for Research in Basque Language

Published online by Cambridge University Press: 14 November 2014

Joana Acha*
Affiliation:
Universidad del País Vasco UPV/EHU (Spain)
Itziar Laka
Affiliation:
Universidad del País Vasco UPV/EHU (Spain)
Josu Landa
Affiliation:
Universidad del País Vasco UPV/EHU (Spain)
Pello Salaburu
Affiliation:
Universidad del País Vasco UPV/EHU (Spain)
*Correspondence concerning this article should be addressed to Joana Acha. Department of Basic Cognitive Processes. Universidad del País Vasco UPV/EHU. Tolosa Hiribidea. 20018. Donostia (Spain). E-mail: [email protected]

Abstract

This article presents EHME, the frequency dictionary of Basque structure, an online program that enables researchers in psycholinguistics to extract word and nonword stimuli based on a broad range of statistics concerning the properties of Basque words. The database consists of 22.7 million tokens. In addition to the classical indexes (word frequency, orthographic structure, orthographic similarity, bigram and biphone frequency, and syllable-based measures), the available properties include morphological structure frequency and word-similarity measures. Measures are indexed at the lemma, morpheme and word levels. We include reliability and validation analyses. The application is freely available, and enables the user to extract words that meet specific statistical criteria, as well as to obtain statistical characteristics for a list of words.
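To make three of the classical indexes named in the abstract concrete, the following is a minimal sketch of how word frequency per million tokens, a type-based bigram frequency, and a substitution-based orthographic similarity measure could be computed. It is not EHME's implementation: the token list is invented for illustration and is not drawn from the EHME corpus, the function names are hypothetical, and the exact definitions EHME uses (for example, whether bigram frequency is counted over types or tokens) are assumptions here.

```python
from collections import Counter

# Toy token list, invented for illustration; it is NOT drawn from the EHME corpus.
tokens = ["etxe", "etxe", "ume", "ura", "ure", "ume", "etxe", "ura"]


def frequency_per_million(tokens):
    """Relative frequency of each word type, scaled to occurrences per million tokens."""
    counts = Counter(tokens)
    total = len(tokens)
    return {word: count * 1_000_000 / total for word, count in counts.items()}


def bigram_type_frequency(tokens):
    """Count how many distinct word types contain each letter bigram (assumed definition)."""
    counts = Counter()
    for word in set(tokens):
        # A set, so that a bigram repeated inside one word is counted once for that word.
        bigrams = {word[i:i + 2] for i in range(len(word) - 1)}
        counts.update(bigrams)
    return counts


def substitution_neighbors(word, lexicon):
    """Words of the same length that differ from `word` in exactly one letter position."""
    return sorted(
        other
        for other in lexicon
        if len(other) == len(word)
        and other != word
        and sum(a != b for a, b in zip(word, other)) == 1
    )


freq = frequency_per_million(tokens)
bigrams = bigram_type_frequency(tokens)
neighbors = substitution_neighbors("ura", set(tokens))
```

Scaling to occurrences per million keeps frequencies comparable across corpora of different sizes, which matters when stimuli are matched against norms from another database.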

Type
Research Article
Copyright
Copyright © Universidad Complutense de Madrid and Colegio Oficial de Psicólogos de Madrid 2014 

