Approaching Gradience in Acceptability with the Tools of Signal Detection Theory

doi:10.1017/9781108569620.004

3 - Approaching Gradience in Acceptability with the Tools of Signal Detection Theory

from Part I - General Issues in Acceptability Experiments

Published online by Cambridge University Press: 16 December 2021

Brian Dillon and Matthew W. Wagers

Edited by Grant Goodall (University of California, San Diego)

Summary

This chapter outlines a framework for using signal detection theory (SDT) to guide the design and analysis of acceptability judgment studies in experimental linguistics. It presents a worked example: an experiment on the syntactic phenomenon of D-linking (discourse linking) and wh-movement. It shows how to derive common SDT measures (such as d_a and s), how to perform inferential statistics over those measures, and where to find additional theoretical and practical resources.
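As a rough illustration of what deriving such measures can involve, the sketch below estimates s and d_a from cumulative response rates. It is not the chapter's own procedure. It assumes the standard unequal-variance Gaussian model, takes s to be the slope of the z-transformed ROC, and fits the line by ordinary least squares for simplicity, where a published analysis would more likely use maximum likelihood. The function names `zroc_measures` and `simulate_rates` and all numeric values are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def zroc_measures(hit_rates, fa_rates):
    """Fit z(H) = intercept + s * z(F) by least squares; return (s, d_a).

    Assumes every rate lies strictly between 0 and 1; inv_cdf raises
    an error for rates of exactly 0 or 1.
    """
    z = NormalDist().inv_cdf
    zh = [z(h) for h in hit_rates]
    zf = [z(f) for f in fa_rates]
    n = len(zf)
    mean_zf = sum(zf) / n
    mean_zh = sum(zh) / n
    sxx = sum((x - mean_zf) ** 2 for x in zf)
    sxy = sum((x - mean_zf) * (y - mean_zh) for x, y in zip(zf, zh))
    s = sxy / sxx                      # zROC slope
    intercept = mean_zh - s * mean_zf  # zROC vertical intercept
    # d_a rescales the intercept by the root-mean-square of the two
    # standard deviations implied by the slope.
    d_a = sqrt(2.0 / (1.0 + s ** 2)) * intercept
    return s, d_a


def simulate_rates(mu, sigma_signal, criteria):
    """Noise-free cumulative rates from a known model (illustration only).

    The "noise" distribution is N(0, 1) and the "signal" distribution is
    N(mu, sigma_signal); each criterion yields one (hit, false-alarm) pair.
    """
    norm = NormalDist()
    fa = [norm.cdf(-c) for c in criteria]
    hits = [norm.cdf((mu - c) / sigma_signal) for c in criteria]
    return hits, fa


# Hypothetical values, chosen only to show that the fit recovers them.
hits, fas = simulate_rates(mu=1.5, sigma_signal=1.25,
                           criteria=[-0.5, 0.0, 0.5, 1.0, 1.5])
s, d_a = zroc_measures(hits, fas)
print(round(s, 3), round(d_a, 3))
```

In a rating-scale acceptability study, each criterion would correspond to one boundary between adjacent rating categories, and which condition counts as "signal" and which as "noise" is a design decision that this sketch leaves open.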

Keywords

signal detection theory; acceptability judgments; D-linking; gradience

Type: Chapter
Information: The Cambridge Handbook of Experimental Syntax, pp. 62–96

DOI: https://doi.org/10.1017/9781108569620.004

Publisher: Cambridge University Press

Print publication year: 2021
