Genies, lawyers, and smart-asses: Extending proxy failures to intentional misunderstandings

Tomer D. Ullman; Sophie Bridgers

doi:10.1017/S0140525X23002820

Genies, lawyers, and smart-asses: Extending proxy failures to intentional misunderstandings

Published online by Cambridge University Press: 13 May 2024

Tomer D. Ullman

and

Sophie Bridgers

Show author details

Tomer D. Ullman*: Affiliation:
Department of Psychology, Harvard University, Cambridge, MA, USA www.tomerullman.org
Sophie Bridgers: Affiliation:
Department of Psychology, Harvard University, Cambridge, MA, USA www.tomerullman.org Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA, USA [email protected]
*: Corresponding author: Tomer D. Ullman; Email: [email protected]

Article contents

Abstract
References

Get access

Rights & Permissions

Abstract

We propose that the logic of a genie – an agent that exploits an ambiguous request to intentionally misunderstand a stated goal – underlies a common and consequential phenomenon, well within what is currently called proxy failures. We argue that such intentional misunderstandings are not covered by the current proposed framework for proxy failures, and suggest to expand it.

Type: Open Peer Commentary
Information: Behavioral and Brain Sciences , Volume 47 , 2024 , e86

DOI: https://doi.org/10.1017/S0140525X23002820 [Opens in a new window]
Copyright: Copyright © The Author(s), 2024. Published by Cambridge University Press

Access options

Get access to the full version of this content by using one of the access options below. (Log in options will check for institutional or personal access. Content may require purchase if you do not have access.)

Article purchase

Temporarily unavailable

References

Bridgers, S. E. C., Taliaferro, M., Parece, K., Schulz, L., & Ullman, T.. (2023). Loopholes: A window into value alignment and the communication of meaning. PsyArxiv.Google Scholar

Da Silva, S. G., Tehrani, J. J. (2016). Comparative phylogenetic analyses uncover the ancient roots of Indo-European folktales. Royal Society open science, 3(1), 150645.CrossRef Google Scholar PubMed

Goodman, N. D., & Frank, M. C. (2016). Pragmatic language interpretation as probabilistic inference. Trends in Cognitive Sciences, 20(11), 818–829.CrossRef Google Scholar PubMed

Hannikainen, I. R., Tobia, K. P., de Almeida, G. D. F., Struchiner, N., Kneer, M., Bystranowski, P., … Żuradzki, T. (2022). Coordination and expertise foster legal textualism. Proceedings of the National Academy of Sciences of the United States of America, 119(44), e2206531119.CrossRef Google Scholar PubMed

Isenbergh, J. (1982). Musings on form and substance in taxation. HeinOnline.CrossRef Google Scholar

Katz, L. (2010). A theory of loopholes. The Journal of Legal Studies, 39(1), 1–31.CrossRef Google Scholar

Krakovna, V. (2020). Specification gaming examples in AI – Master list. http://bit.ly/kravokna_examples_list (accessed: 2020-12-28).Google Scholar

Opie, I. A., & Opie, P. (2001). The lore and language of schoolchildren. New York Review of Books.Google Scholar

Scott, J. C. (1985). Weapons of the weak: Everyday forms of peasant resistance. Yale University Press.Google Scholar

Uther, H.-J. (2004). The types of international folktales – A classification and bibliography. Suomalainen Tiedeakatemia Academia Scientiarum Fennica Exchange Centre.Google Scholar