Genies, lawyers, and smart-asses: Extending proxy failures to intentional misunderstandings
Published online by Cambridge University Press: 13 May 2024
Abstract
We propose that the logic of a genie – an agent that exploits an ambiguous request to intentionally misunderstand a stated goal – underlies a common and consequential phenomenon that falls well within what is currently called proxy failure. We argue that such intentional misunderstandings are not covered by the currently proposed framework for proxy failure, and we suggest expanding the framework to include them.
- Type: Open Peer Commentary
- Copyright © The Author(s), 2024. Published by Cambridge University Press
Target article
Dead rats, dopamine, performance metrics, and peacock tails: Proxy failure is an inherent risk in goal-oriented systems