
Dynamic diversity is the answer to proxy failure

Published online by Cambridge University Press:  13 May 2024

Zeb Kurth-Nelson*
Affiliation:
Google DeepMind, London, UK; Max Planck UCL Centre for Computational Psychiatry and Ageing Research, University College London, London, UK
Steve Sullivan
Affiliation:
Department of Anesthesiology and Perioperative Medicine, Oregon Health and Science University, Portland, OR, USA
Joel Z. Leibo
Affiliation:
Google DeepMind, London, UK
Marc Guitart-Masip
Affiliation:
Max Planck UCL Centre for Computational Psychiatry and Ageing Research, University College London, London, UK; Aging Research Center, Department of Neurobiology, Care Sciences and Society, Karolinska Institutet, Stockholm, Sweden; Center for Psychiatry Research, Region Stockholm, Stockholm, Sweden; Center for Cognitive Computational Neuropsychiatry (CCNP), Karolinska Institutet, Stockholm, Sweden
*
Corresponding author: Zeb Kurth-Nelson

Abstract

We argue that a diverse and dynamic pool of agents mitigates proxy failure. Proxy modularity plays a key role in the ongoing production of diversity. We review examples from a range of scales.

Type
Open Peer Commentary
Copyright
Copyright © The Author(s), 2024. Published by Cambridge University Press

The ingredients for proxy failure are a target and an agent that optimizes for an approximation (proxy) of the target. Because the proxy is not the actual target, the behavior of the agent can become misaligned with the target (John et al.). In fact, Sohl-Dickstein (2022) points out that if proxy optimization is too efficient, it reliably becomes not only ineffective but also actively harmful. Here, we argue that, from molecules to societies, the harm of proxy failure is minimized by a diverse and dynamic population of proxies, and that periodic separation between agents forces them to both individualize and work together, leading to new solutions.

John et al. give the example of decision-making algorithms in the brain as proxies for evolutionary fitness. These proxies fail with, for example, abused drugs or excessive consumption of food. In our view, diversity in decision-making systems is a central defense against this kind of proxy failure. The hypothalamus contains a set of segregated circuits, each implementing a distinct “hard-wired” behavioral policy aimed toward one homeostatic or reproductive goal, such as feeding, drinking, or mating (Saper & Lowell, 2014; Schulkin & Sterling, 2019; Sewards & Sewards, 2003). In service of basic drives, corticostriatal circuitry also learns a more general and flexible set of goals (Balleine, Delgado, & Hikosaka, 2007; Cardinal, Parkinson, Hall, & Everitt, 2002; Frank & Claus, 2006; Saunders & Robinson, 2012). Of course, the behaviors prescribed by different goals often conflict, and the striatum can be viewed as a “parliament” dynamically arbitrating between goals (Cui et al., 2013; Da Silva, Tecuapetla, Paixão, & Costa, 2018; Graybiel & Grafton, 2015; Klaus et al., 2017; Mohebi et al., 2019). Humans in particular adopt a dizzying diversity of goals (O'Reilly, Hazy, Mollick, Mackie, & Herd, 2014; Schank & Abelson, 1977) and also synthesize new goals when existing ones are frustrated. Each goal represents a different proxy for evolutionary fitness, and they better approximate fitness when they are in balance than when an individual goal is excessively optimized. Pathological states occur when the system gets stuck on a single goal, such as in addiction or rumination.

Diversity of beliefs protects against proxy failure in the same way as diversity of goals. Every human holds many distinct beliefs. The beliefs are “separate,” in that they are not required to be consistent with one another (Wood, Douglas, & Sutton, 2012), and when one is active, others are largely inaccessible (Hills, Todd, Lazer, Redish, & Couzin, 2015). Each belief (or perspective, or metaphor) is only a partial description of the world – a proxy for a broader truth. This proxy diversity serves us well. An individual with multiple perspectives on a problem is less likely to get stuck in a particular approach (De Bono, 1970; Duncker, 1945; Ohlsson, 1992), and a deep understanding of a topic means having many different perspectives available (Feyerabend, 1975; Lakoff & Johnson, 1980; Saffo, 2008; Wittgenstein, 1953). Conversely, if we attach to and optimize for a single perspective, our thinking is rigid and shallow: Optimizing too strongly for that single proxy leads to divergence from the broader truth. In the brain, a network centered on hippocampus appears to support diversity and dynamism. This network separates knowledge modularly into distinct entities and narratives (McClelland, McNaughton, & O'Reilly, 1995; Yassa & Stark, 2011). Vitally, after they are separated, the entities are then also flexibly composed together in many different ways, synthesizing new knowledge and perspectives (Buckner, 2010; Kazanina & Poeppel, 2023; Kurth-Nelson et al., 2023; O'Reilly, Ranganath, & Russin, 2022).

Just as the brain holds diverse motivations and beliefs in balance, multiagent systems such as human societies contain diverse and competing forces, which can be seen as proxies for collective welfare. There is a rich tradition of studying the conditions under which this diversity of objectives is conducive to broader success (Ostrom, Gardner, & Walker, 1994). Empirically, excess communication reduces diversity and worsens performance in human groups (Lorenz, Rauhut, Schweitzer, & Helbing, 2011; Page, 2017). However, if individuals are allowed to spend time first working on a problem in isolation and then combine solutions, the group performs better (Bernstein, Shore, & Lazer, 2018). This example follows the general pattern that entities must first separate to diversify and gain individual stability. Then, interaction creates higher-order structures, leading to hierarchies and open-ended evolution.

Diversity plays a similar role in groups of artificial agents. Imagine an evolving population of game-playing agents, where the fitness of each individual is determined by playing paper-rock-scissors against each other. If the population loses diversity and collapses on a single strategy, such as “always play rock,” then a mutation that produces the strategy “always play paper” will dominate the population. These waves of dominant strategies can go in circles through the optimization landscape, never improving overall. However, if the population is diverse, agents are forced to discover truly new solutions, an effect also documented in much more complex games (Crepinšek, Liu, & Mernik, 2013; Czarnecki et al., 2020; Leibo, Hughes, Lanctot, & Graepel, 2019; Vinyals et al., 2019).
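The cycling described above can be illustrated with a minimal sketch. It assumes a simple model that is not taken from the cited work: strategy frequencies are updated by exponential (multiplicative-weights) selection on expected payoff, followed by a small uniform mutation. The function names, the selection strength, and the mutation rate are all illustrative assumptions.

```python
import math

STRATEGIES = ("rock", "paper", "scissors")

# PAYOFF[i][j]: payoff to strategy i when it meets strategy j (win = +1, loss = -1).
PAYOFF = (
    (0, -1, 1),   # rock loses to paper, beats scissors
    (1, 0, -1),   # paper beats rock, loses to scissors
    (-1, 1, 0),   # scissors loses to rock, beats paper
)


def fitness(freqs):
    """Expected payoff of each pure strategy against the current population."""
    return [sum(PAYOFF[i][j] * freqs[j] for j in range(3)) for i in range(3)]


def step(freqs, selection=1.0, mutation=0.01):
    """One generation: exponential selection, then uniform mutation.

    Both parameter values are arbitrary choices made for illustration.
    """
    fit = fitness(freqs)
    weights = [f * math.exp(selection * w) for f, w in zip(freqs, fit)]
    total = sum(weights)
    selected = [w / total for w in weights]
    # Mutation reintroduces every strategy at a low rate each generation.
    return [(1 - mutation) * s + mutation / 3 for s in selected]


def simulate(generations=200):
    """Return the sequence of dominant strategies and the mean payoff per generation."""
    freqs = [1.0, 0.0, 0.0]  # population collapsed on "always play rock"
    dominant, mean_payoffs = [], []
    for _ in range(generations):
        leader = STRATEGIES[max(range(3), key=lambda i: freqs[i])]
        # Record the leader only when it changes.
        if not dominant or dominant[-1] != leader:
            dominant.append(leader)
        # Population-average payoff under random pairing.
        mean_payoffs.append(sum(f * w for f, w in zip(freqs, fitness(freqs))))
        freqs = step(freqs)
    return dominant, mean_payoffs


if __name__ == "__main__":
    dominant, mean_payoffs = simulate()
    print(" -> ".join(dominant[:7]))
    print("largest |mean payoff|:", max(abs(m) for m in mean_payoffs))
```

Under these assumptions, the dominant strategy passes from rock to paper to scissors and back to rock, while the population's average payoff stays at zero throughout, because the game is zero-sum. The waves of dominance replace one another without any overall improvement.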

As a final example, sexual reproduction is remarkably common (Judson & Normark, 1996; Speijer, Lukeš, & Eliáš, 2015), despite the cost of producing males and the challenge of finding mates in the vast world (Lehtonen, Jennions, & Kokko, 2012; Maynard Smith, 1978). What advantages does sex offer? A traditional view is that recombination generates diversity by exploring new combinations of genes. A fascinating extension of this theory is that recombination also forces the genes to be modular, democratizing the genome (Agren, Haig, & McCoy, 2022; Livnat, Papadimitriou, Dushoff, & Feldman, 2008; Melo, Porto, Cheverud, & Marroig, 2016; Srivastava, Hinton, Krizhevsky, Sutskever, & Salakhutdinov, 2014; Veller, 2022). A gene can't depend on the presence of another particular gene because it might disappear in the next shuffling. Instead, each gene is incentivized to function productively with any new genome it finds itself in – yielding a genetic foundation ripe for synthesis of new solutions. Although each gene is selfish and is only an imperfect proxy for the welfare of the organism, a diverse and dynamic set of genes protects against proxy failure.

In conclusion, connectedness must be balanced with periods of separation to maintain diversity and protect against proxy failures. We should be cautious about moving toward continual interconnectedness and premature exchange of information. Similarly, with rapid advances in artificial intelligence (AI), we should be cautious about concentrating intelligence in one place. Diverse AI systems should exist with different objectives and modes of operation. Troublingly, proxy failure may explain the Fermi paradox – the puzzle that we don't see other intelligent life in the universe. Through Earth's history, evolutionary experiments have had opportunities to develop separately. Archaea and prokaryotic mitochondrial ancestors specialized separately for hundreds of millions of years before achieving the distinct forms that enabled fruitful endosymbiosis, fueling the explosion of multicellular complexity (Lane & Martin, 2010; Margulis, 1970; Roger, Muñoz-Gómez, & Kamikawa, 2017). However, the trend with increased intelligence is toward immediate exchange of information between entities across the planet, reducing proxy diversity, with risk of catastrophic failure (Diamond, 2005).

Acknowledgments

We thank Zora Wessely for her comments on an earlier version of the manuscript.

Financial support

This research received no specific grant from any funding agency, commercial, or not-for-profit sectors.

Competing interest

Z. K.-N. and J. Z. L. are employed by Google DeepMind. S. S. and M. G.-M. have no competing interest to declare.

Footnotes

*

Equal contribution

References

Agren, J. A., Haig, D., & McCoy, D. E. (2022). Meiosis solved the problem of gerrymandering. Journal of Genetics, 101(2), 38.
Balleine, B. W., Delgado, M. R., & Hikosaka, O. (2007). The role of the dorsal striatum in reward and decision-making. Journal of Neuroscience, 27(31), 8161–8165.
Bernstein, E., Shore, J., & Lazer, D. (2018). How intermittent breaks in interaction improve collective intelligence. Proceedings of the National Academy of Sciences of the United States of America, 115(35), 8734–8739.
Buckner, R. L. (2010). The role of the hippocampus in prediction and imagination. Annual Review of Psychology, 61, 27–48.
Cardinal, R. N., Parkinson, J. A., Hall, J., & Everitt, B. J. (2002). Emotion and motivation: The role of the amygdala, ventral striatum, and prefrontal cortex. Neuroscience & Biobehavioral Reviews, 26(3), 321–352.
Crepinšek, M., Liu, S.-H., & Mernik, M. (2013). Exploration and exploitation in evolutionary algorithms: A survey. ACM Computing Surveys (CSUR), 45(3), 1–33.
Cui, G., Jun, S. B., Jin, X., Pham, M. D., Vogel, S. S., Lovinger, D. M., & Costa, R. M. (2013). Concurrent activation of striatal direct and indirect pathways during action initiation. Nature, 494(7436), 238–242.
Czarnecki, W. M., Gidel, G., Tracey, B., Tuyls, K., Omidshafiei, S., Balduzzi, D., & Jaderberg, M. (2020). Real world games look like spinning tops. Advances in Neural Information Processing Systems, 33, 17443–17454.
Da Silva, J. A., Tecuapetla, F., Paixão, V., & Costa, R. M. (2018). Dopamine neuron activity before action initiation gates and invigorates future movements. Nature, 554(7691), 244–248.
De Bono, E. (1970). Lateral thinking (Vol. 70). Harper & Row.
Diamond, J. (2005). Collapse: How societies choose to fail or succeed. Penguin Books. ISBN 9780241958681.
Duncker, K. (1945). On problem-solving. Psychological Monographs, 58(5), 1–113.
Feyerabend, P. K. (1975). Against method. Verso.
Frank, M. J., & Claus, E. D. (2006). Anatomy of a decision: Striato-orbitofrontal interactions in reinforcement learning, decision making, and reversal. Psychological Review, 113(2), 300.
Graybiel, A. M., & Grafton, S. T. (2015). The striatum: Where skills and habits meet. Cold Spring Harbor Perspectives in Biology, 7(8), a021691.
Hills, T. T., Todd, P. M., Lazer, D., Redish, A. D., & Couzin, I. D. (2015). Exploration versus exploitation in space, mind, and society. Trends in Cognitive Sciences, 19(1), 46–54.
Judson, O. P., & Normark, B. B. (1996). Ancient asexual scandals. Trends in Ecology & Evolution, 11(2), 41–46.
Kazanina, N., & Poeppel, D. (2023). The neural ingredients for a language of thought are available. Trends in Cognitive Sciences, 27(11), 996–1007.
Klaus, A., Martins, G. J., Paixao, V. B., Zhou, P., Paninski, L., & Costa, R. M. (2017). The spatiotemporal organization of the striatum encodes action space. Neuron, 95(5), 1171–1180.
Kurth-Nelson, Z., Behrens, T., Wayne, G., Miller, K., Luettgau, L., Dolan, R., … Schwartenbeck, P. (2023). Replay and compositional computation. Neuron, 111(4), 454–469.
Lakoff, G., & Johnson, M. (1980). Metaphors we live by. University of Chicago Press.
Lane, N., & Martin, W. (2010). The energetics of genome complexity. Nature, 467(7318), 929–934.
Lehtonen, J., Jennions, M. D., & Kokko, H. (2012). The many costs of sex. Trends in Ecology & Evolution, 27(3), 172–178.
Leibo, J. Z., Hughes, E., Lanctot, M., & Graepel, T. (2019). Autocurricula and the emergence of innovation from social interaction: A manifesto for multi-agent intelligence research. arXiv preprint arXiv:1903.00742.
Livnat, A., Papadimitriou, C., Dushoff, J., & Feldman, M. W. (2008). A mixability theory for the role of sex in evolution. Proceedings of the National Academy of Sciences of the United States of America, 105(50), 19803–19808.
Lorenz, J., Rauhut, H., Schweitzer, F., & Helbing, D. (2011). How social influence can undermine the wisdom of crowd effect. Proceedings of the National Academy of Sciences of the United States of America, 108(22), 9020–9025.
Margulis, L. (1970). Origin of eukaryotic cells: Evidence and research implications for a theory of the origin and evolution of microbial, plant, and animal cells on the Precambrian Earth. Yale University Press.
Maynard Smith, J. (1978). The evolution of sex (Vol. 4). Cambridge University Press.
McClelland, J. L., McNaughton, B. L., & O'Reilly, R. C. (1995). Why there are complementary learning systems in the hippocampus and neocortex: Insights from the successes and failures of connectionist models of learning and memory. Psychological Review, 102(3), 419.
Melo, D., Porto, A., Cheverud, J. M., & Marroig, G. (2016). Modularity: Genes, development, and evolution. Annual Review of Ecology, Evolution, and Systematics, 47, 463–486.
Mohebi, A., Pettibone, J. R., Hamid, A. A., Wong, J.-M. T., Vinson, L. T., Patriarchi, T., … Berke, J. D. (2019). Dissociable dopamine dynamics for learning and motivation. Nature, 570(7759), 65–70.
Ohlsson, S. (1992). Information-processing explanations of insight and related phenomena. Advances in the Psychology of Thinking, 1, 1–44.
O'Reilly, R. C., Hazy, T. E., Mollick, J., Mackie, P., & Herd, S. (2014). Goal-driven cognition in the brain: A computational framework. arXiv preprint arXiv:1404.7591.
O'Reilly, R. C., Ranganath, C., & Russin, J. L. (2022). The structure of systematicity in the brain. Current Directions in Psychological Science, 31(2), 124–130.
Ostrom, E., Gardner, R., & Walker, J. (1994). Rules, games, and common-pool resources. University of Michigan Press.
Page, S. E. (2017). The diversity bonus. Princeton University Press.
Roger, A. J., Muñoz-Gómez, S. A., & Kamikawa, R. (2017). The origin and diversification of mitochondria. Current Biology, 27(21), R1177–R1192.
Saper, C. B., & Lowell, B. B. (2014). The hypothalamus. Current Biology, 24(23), R1111–R1116.
Saunders, B. T., & Robinson, T. E. (2012). The role of dopamine in the accumbens core in the expression of Pavlovian-conditioned responses. European Journal of Neuroscience, 36(4), 2521–2532.
Schank, R. C., & Abelson, R. P. (1977). Scripts, plans, goals, and understanding: An inquiry into human knowledge structures. Psychology Press.
Schulkin, J., & Sterling, P. (2019). Allostasis: A brain-centered, predictive mode of physiological regulation. Trends in Neurosciences, 42(10), 740–752.
Sewards, T. V., & Sewards, M. A. (2003). Representations of motivational drives in mesial cortex, medial thalamus, hypothalamus and midbrain. Brain Research Bulletin, 61(1), 25–49.
Sohl-Dickstein, J. (2022). Too much efficiency makes everything worse: Overfitting and the strong version of Goodhart's law, November 2022. https://sohl-dickstein.github.io/2022/11/06/strong-Goodhart.html
Speijer, D., Lukeš, J., & Eliáš, M. (2015). Sex is a ubiquitous, ancient, and inherent attribute of eukaryotic life. Proceedings of the National Academy of Sciences of the United States of America, 112(29), 8827–8834.
Srivastava, N., Hinton, G., Krizhevsky, A., Sutskever, I., & Salakhutdinov, R. (2014). Dropout: A simple way to prevent neural networks from overfitting. The Journal of Machine Learning Research, 15(1), 1929–1958.
Veller, C. (2022). Mendel's first law: Partisan interests and the parliament of genes. Heredity, 129(1), 48–55.
Vinyals, O., Babuschkin, I., Czarnecki, W. M., Mathieu, M., Dudzik, A., Chung, J., … Silver, D. (2019). Grandmaster level in Starcraft II using multi-agent reinforcement learning. Nature, 575(7782), 350–354.
Wittgenstein, L. (1953). Philosophical investigations. Basil Blackwell.
Wood, M. J., Douglas, K. M., & Sutton, R. M. (2012). Dead and alive: Beliefs in contradictory conspiracy theories. Social Psychological and Personality Science, 3(6), 767–773.
Yassa, M. A., & Stark, C. E. (2011). Pattern separation in the hippocampus. Trends in Neurosciences, 34(10), 515–525.