Origins of life: first came evolutionary dynamics

Charles Kocher; Ken A. Dill

doi:10.1017/qrd.2023.2

Published online by Cambridge University Press: 22 March 2023

Charles Kocher: Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, NY, USA; Department of Physics and Astronomy, Stony Brook University, Stony Brook, NY, USA
Ken A. Dill*: Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, NY, USA; Department of Physics and Astronomy, Stony Brook University, Stony Brook, NY, USA; Department of Chemistry, Stony Brook University, Stony Brook, NY, USA
*: Corresponding author: Ken A. Dill; Email: [email protected]

Abstract

When life arose from prebiotic molecules 3.5 billion years ago, what came first? Informational molecules (RNA, DNA), functional ones (proteins), or something else? We argue here for a different logic: rather than seeking a molecule type, we seek a dynamical process. Biology required an ability to evolve before it could choose and optimise materials. We hypothesise that the evolution process was rooted in the peptide folding process. Modelling shows how short random peptides can collapse in water and catalyse the elongation of others, powering both increased folding stability and emergent autocatalysis through a disorder-to-order process.

Keywords

Darwinian evolution; protein folding; origin of life

Type: Perspective
Information: QRB Discovery, Volume 4, 2023, e4

DOI: https://doi.org/10.1017/qrd.2023.2
Creative Commons: This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright: © The Author(s), 2023. Published by Cambridge University Press

Requirement for life’s origin: persistent propagation

What was the origin of life? A prerequisite for answering that question is to define the difference between dead and alive. Defining life has been notoriously challenging (Schrodinger, 1944; Cleland and Chyba, 2002; Popa, 2004; Benner, 2010; Machery, 2012; Pross, 2016; Plaxco and Gross, 2021). For example, the ability to metabolise, grow, and duplicate is not sufficient to distinguish life from a candle flame. Nevertheless, a good consensus definition is from NASA: ‘Life is a self-sustaining chemical system capable of Darwinian evolution’ (Joyce et al., 1994). The italics are ours; the clear implication is that life could not begin before some dynamical adaptation process of molecules was already operative.

So, in order to seek the origins of life, we ask what physico-chemical process(es) could have driven prebiotic molecules to become autocatalytic and adaptive. We call this The Day Two Problem, to distinguish it from traditional questions about ‘What Came First’, which we call The Day One Problem.

• The ‘Day One’ Question: What material came first? Think of the metaphorical ‘chicken or the egg problem’ (although in the real world, the chicken-and-egg framing is misleading because many things, including chickens and eggs, emerged in parallel, not in series). For example: Did life start as an RNA World (Gilbert, 1986; Joyce and Szostak, 2018)? Or a lipid world (Segré et al., 2001; Deamer, 2011), an amyloid world (Maury, 2009, 2015, 2018) or through metabolism first (Wächtershäuser, 1988; De Duve and De Neufville, 1991; De Duve, 1995; Cody, 2004; Shapiro, 2006; Jordan et al., 2021; Matsuo and Kurihara, 2021)? Our view is that the full origin of life required many aspects – metabolism, information, protein-like catalysts and makers, and others – to arise together.
• The ‘Day Two’ Question: What dynamical process might have driven prebiotic molecules to become self-sustaining and self-serving? How might prebiotic molecules undergo directed change from Day One to Day Two and beyond? For biology to operate, it requires an operating system. What would drive molecules to evolve further on their own?

We give here a hypothesis. We describe a mechanism by which prebiotic undirected syntheses of short peptides could plausibly become ‘makers’, that is molecules that persistently make other molecules. We give a summary and perspective of recent computer modelling of a disorder-to-order process that achieves positive feedback against the forces of degradation. We start below by positing Darwinian evolutionary dynamics as a driven machine cycle of steps, because this vantage point illuminates principles about possible molecular origins.

Description of Darwinian dynamics as a cyclic machine

Darwinian Dynamics has three well-known features: (1) Replication, moms making more moms; (2) Mutation, search and discovery of sequence → function polymers that create new molecular entities and mechanisms; and (3) Selection, competition-driven upratcheting of fitness. Fig. 1 expresses these properties in the form of a machine-like, biosphere-wide nonequilibrium (NEQ) cycle that we call the Darwinian Evolution Machine (DEM). Described in detail elsewhere (Kocher and Dill, 2023), the cycle operates from left to right. (a) At time $ t $, $ X $ indicates a wild-type (status quo) population. (b) A mutation occurs in some individual. (c) That individual cell is grown up into a population $ Y $. Populations $ X $ and $ Y $ compete for resources. (a, at time $ t+1 $) The winner becomes the new status quo wild-type population, thus becoming ‘remembered’ in the population, and gains more resources. The cycle repeats, driven by a persistent external supply of resources.

Fig. 1. Top: The DEM cycle. (a) shows a population $ X $ of wild-type cells at time $ t $. (b) One cell mutates. That cell grows to have population $ Y $ in (c). Populations $ X $ and $ Y $ compete, one wins, and the cycle begins again as (a) at time $ t+1 $. Bottom: Fitness landscapes show the separation of actions. From point $ X $, mutation entails a random, relatively unbiased exploration (orange region). The third landscape shows selection, in this case of $ Y $, where the bias and preference occur.
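The cycle in Fig. 1 can be sketched as a loop, purely for illustration. The sketch assumes a one-dimensional phenotype, a hypothetical single-peaked `fitness` function, Gaussian mutations, and strict replacement of the loser by the winner. None of these choices comes from the text, and strict replacement leaves out coexistence.

```python
import random


def fitness(x):
    """Hypothetical single-peaked fitness landscape (an assumption, peak at x = 3)."""
    return -(x - 3.0) ** 2


def dem_cycle(x, rng, step=0.5):
    """One pass around the cycle for a wild-type phenotype x."""
    y = x + rng.gauss(0.0, step)  # (b) unbiased random mutation of X gives Y
    # (c) X and Y compete; the winner becomes the new wild type (a) at time t + 1.
    # Strict replacement is a simplifying assumption of this sketch.
    return y if fitness(y) > fitness(x) else x


def run(cycles=200, x0=0.0, seed=1):
    """Repeat the cycle and record the wild-type phenotype after each pass."""
    rng = random.Random(seed)
    history = [x0]
    for _ in range(cycles):
        history.append(dem_cycle(history[-1], rng))
    return history


history = run()
print("start:", history[0], "end:", round(history[-1], 3))
```

Mutation here is unbiased, so all of the directionality comes from the selection step, which is the separation of actions drawn in the bottom row of Fig. 1.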

The steps (b) and (c) of population growth on the resources, competition and winning can be expressed as population-resource dynamics. For example, if we have $ N $ competitors $ {A}_n $ , labelled by their phenotype, fighting for one resource $ r $ ,

(1)

$$ \begin{aligned} \frac{dr}{dt} &= g(r) - D_r(r) - \sum_n U_n\left(A_n, r\right), \\ \frac{dA_n}{dt} &= R_n\left(\frac{U_n\left(A_n, r\right)}{A_n}\right) A_n - D_n\left(A_n\right), \end{aligned} $$

where $ g(r) $ describes the NEQ input of the resource, $ {D}_r(r) $ and $ {D}_n\left({A}_n\right) $ are decay/death terms, $ {U}_n\left({A}_n,r\right) $ is the total resource use rate by moms of type $ n $ , and $ {R}_n $ is the reproduction rate of one mom of type $ n $ given that mom eats resource at a rate $ {U}_n\left({A}_n,r\right)/{A}_n $ . Mutational discovery will introduce new moms, say $ {A}_{N+1} $ , which are then selected for or against by the DEM. The most competitive moms are ‘remembered’ by the DEM because they are good enough autocatalysts to maintain persistent populations. ‘Fitness’, which is the term that describes this competition-driven selection, is non-trivial to define, and can be model-dependent; see Supplementary Material SI.2 for more discussion.

Known biology constrains the mathematical form that is needed for the function $ U $. Often, population genetics modelling approximates it as linear in both resources and moms, $ U_n\left(A_n, r\right) = r k_n A_n $. But such linearity leads to ‘winner-take-all’ (WTA) dynamics (Volterra, 1928; Fisher, 1930; Gause, 1934; MacArthur, 1970; Hsu et al., 1977; Tilman, 1982; Chesson, 1990; Lifson, 1997; Pross, 2011; van Opheusden et al., 2015), a dynamics that misses important features of evolution. Instead, evolution often gives ‘peaceful coexistence’ of multiple species on a given resource (Hutchinson, 1961; Armstrong and McGehee, 1980; Chesson, 2000; Chesson and Kuang, 2008; Charlebois and Balázsi, 2016; Barabás et al., 2018; Goyal et al., 2018; Wang et al., 2022). Peaceful coexistence is captured using a saturating function (Beddington, 1975; DeAngelis et al., 1975; Novak and Stouffer, 2021; Stouffer and Novak, 2021; Kocher and Dill, 2023),

(2)

$$ U_n\left(A_n, r\right) = \frac{r k_n A_n}{b_n + c_n r + A_n}, $$

which simply expresses two natural limits, that maximum concentrations of moms are finite and that speeds of producing offspring are finite. What is novel in the present DEM perspective is the combining of this generalised form of population genetics (Eq. (1)) with iterative mutation, competition and selection cycles (Kocher and Dill, 2023). In the following section, we extract four principles from this DEM perspective that help us formulate possible molecular precursors in the next sections.
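A short numerical sketch can make the contrast between linear and saturating uptake concrete. It integrates Eq. (1) for two competitors on one resource under closures that are assumptions of this sketch, not choices stated in the text: constant input $ g(r) = g $, linear resource decay $ D_r(r) = \delta r $, reproduction $ R_n(u) = u $, and linear death $ D_n(A_n) = d A_n $. All parameter values are hypothetical.

```python
def uptake_linear(r, k, a):
    """Linear uptake, U = r * k * A."""
    return r * k * a


def uptake_saturating(r, k, a, b=1.0, c=1.0):
    """Saturating uptake of Eq. (2), U = r * k * A / (b + c * r + A)."""
    return r * k * a / (b + c * r + a)


def simulate(uptake, k=(1.0, 0.8), g=1.0, decay=0.1, death=0.1,
             r0=0.2, a0=(1.0, 1.0), dt=0.05, t_end=1000.0):
    """Integrate Eq. (1) with fixed-step RK4; return the final [r, A_1, A_2].

    Assumed closures: g(r) = g, D_r(r) = decay * r, R_n(u) = u, D_n(A) = death * A.
    """
    def deriv(state):
        r, pops = state[0], state[1:]
        use = [uptake(r, kn, an) for kn, an in zip(k, pops)]
        d_r = g - decay * r - sum(use)
        d_pops = [u - death * an for u, an in zip(use, pops)]
        return [d_r] + d_pops

    def shifted(state, slope, h):
        return [s + h * d for s, d in zip(state, slope)]

    state = [r0, *a0]
    for _ in range(int(t_end / dt)):
        k1 = deriv(state)
        k2 = deriv(shifted(state, k1, dt / 2))
        k3 = deriv(shifted(state, k2, dt / 2))
        k4 = deriv(shifted(state, k3, dt))
        state = [s + dt / 6 * (p + 2 * q + 2 * u + v)
                 for s, p, q, u, v in zip(state, k1, k2, k3, k4)]
    return state


linear = simulate(uptake_linear)
saturating = simulate(uptake_saturating)
print("linear     r, A1, A2:", [round(v, 3) for v in linear])
print("saturating r, A1, A2:", [round(v, 3) for v in saturating])
```

For these assumed parameters, the steady-state algebra of the linear model leaves only the competitor with the larger $ k_n $ (at $ r = d/k_1 = 0.1 $), whereas the saturating model admits a steady state at $ r \approx 0.71 $ in which both competitors persist, because the $ A_n $ term in the denominator of Eq. (2) makes each type self-limiting.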

Features of evolution that are relevant for origins

The DEM model perspective illuminates what is needed for the origin of life. First, the DEM is a maker of makers, a process of moms creating more moms. The DEM is an autocatalytic set, or a collection of entities, each of which can be created catalytically by other entities, such that as a whole, the set is able to catalyse its own production (Hordijk, 2019). An extensive literature describes the importance of autocatalytic sets in the origins of life (Eigen, 1971; Kauffman, 1971, 1986; Kauffman et al., 1993; Dyson, 1982, 1999; Jain and Krishna, 2002; Hordijk and Steel, 2014; Hordijk, 2019; Hordijk et al., 2022). The positive feedback of makers making makers contributes to the self-sustaining nature of evolution.

Second, environments that are unruly and fluctuating can sort winners from losers through booms and busts (Doebeli et al., 2021; Wang et al., 2022). Booms and busts drive the recycling of resources, taking resources away from the losers and giving them to the winners, thus driving the rich to get richer on the road to autocatalysis.

Third, DEM Dynamics can sustain peaceful coexistence among multiple agents at the same time. In a world of winners-taking-all (WTA), without peaceful coexistence, evolution would have been brittle, always on the edge of extinction. If $ X $ is more fit than $ Y $ in environment $ {E}_1 $ , $ X $ would be the lone survivor in a WTA model. Now, if the environment fluctuates to $ {E}_2 $ , which kills $ X $ , then the whole ecosystem dies. Instead, in a world of coexistence, diversity preserves the ecosystem. An ecosystem that has an ensemble of backup moms is more robust to unpredictable new environments. Ensembles are crucial for long-term survival and persistence of the DEM.

And fourth, the biosphere-wide DEM is a driven machine: its cycles of molecule-making are powered by uptake of out-of-equilibrium resources from the environment. There are different tendencies for driven systems than for equilibrium processes. Some detail is given in Supplementary Material SI.1; here we just give a few examples. (1) A fluid subject to gravity will flow down a hill and stop in the valley at the bottom. But, a fluid subject to a strong force can flow beyond the valley to cross the next hill and beyond. (2) A TV set or computer performs intricate functions as long as it is ‘plugged in’. Its current flows are not predicted by principles of equilibrium, such as the Second Law. Such devices only tend to equilibrium when they are unplugged. Think about an electromagnet. A metal rod will not pick up nails, but when a current is driven around the bar, it will. An electromagnet is driven by the current input. Its action is a nonequilibrium (NEQ) force. That force goes to zero when the input current is turned off. (3) While equilibrium systems tend downhill in energy (or free energy), driven systems can also go uphill. Think of chemical reactions, like binding events or protein folding or molecular association or partitioning processes; under common circumstances, their stable states are predicted by tendencies towards minimum free energies. But, living systems have biochemical cycles, where uphill steps are driven by a coupling to downhill steps. The persistence of the DEM for 3.5 billion years is because biology has become so capable of exploiting the food, energy, and matter out-of-equilibrium aspects of its environments. How might the DEM have arisen from prebiotic molecular processes? Below are some of the key questions.

Puzzles about the molecular origins of the DEM

• What molecules were the first ‘makers?’ Today’s biological maker molecules are proteins and RNA, chain molecules that encode different functionalities as different monomer sequences. How did prebiotic polymers come to have sequence $ \to $ function relationships? What simple molecular process started producing self-sustaining maker molecules?
• How was molecule-making powered by external forces? How did molecule-making come to outcompete molecule degradation and become so sustainable?
• How did makers and catalysts become mobile, molecular-scale and editable? An enormously transformative event in the prebiotic transition from chemistry to biology was the transition from catalysts that were immutable macroscale surfaces to microscale mobile editable proteins. Current thinking is that prebiotic reactions were first catalysed by mineral or clay surfaces (Wächtershäuser, 1988), or interfaces (Holden et al., 2022), or in hot volcanic vents (Martin et al., 2008). Such catalysts are macroscopic, geographically immovable, and fixed in their single-reaction catalysis under fixed conditions. But cells need whole biochemical pathways, where multiple reactions are strung together to achieve complex chemistry. Each step has a tailored catalyst that provides precisely the right acceleration of precisely that reaction, is mobile and small enough to fit inside a cell, and functions in the same water solvent as all the other requisite catalysts at room temperature. As a metaphor, consider the importance in the Industrial Revolution of steam engines, which replaced immovable energy sources of rivers and waterfalls by power that was mobile and tailorable to circumstances. How did prebiotic catalysis ‘learn’ to become untethered from rigid macroscopics to become flexible mobile microscopic biopolymers?
• What was ‘fitness’ before there were cells? Biological organisms are self-serving. This is captured in the multi-faceted notion of fitness. In contrast, molecules are not self-serving. How would molecules start becoming selected for or against?
• Needles in haystacks and blind watchmakers: overcoming the infinitesimal probabilities. Life’s origin is often considered impossibly improbable, like finding a needle in a haystack, or finding a watch made by a blind watchmaker (Dawkins et al., 1996). But those arguments are based on models that assume many improbable steps happen independently. There is a problem with those models. Life’s originating events were surely not independent: they were correlated. The key questions are: (1) What was the nature of those correlations? and (2) In what physical process does each step build on the advantages of the preceding steps to give cumulative long-term sustainability?

The case for proteins and the folding process

On the one hand, our view is that even the earliest life requires multiple components – functional molecules like proteins, informational molecules like RNA/DNA, encapsulation like lipids, and on-board energy like ATP; see Supplementary Material SI.3, Carter and Kraut (1974) and Frenkel-Pinter et al. (2020). On the other hand, our goal here is more modest, namely just to explain the roots of evolution-like dynamics, how molecules became makers, and how maker molecules developed sequence-to-function relationships.

In principle, the first sequence-to-function maker molecules could have been either RNA or proteins. The pros and cons of the RNA world hypothesis, that RNA came first, are discussed elsewhere (Joyce et al., 1994; Atkins et al., 2011; Robertson and Joyce, 2012; Joyce and Szostak, 2018; Wills and Carter, 2018). Here, we postulate that proteins came first, both for reasons discussed in those references and because of the need to first establish some form of propagation dynamics. Here is a short summary. (1) Proteins are most of a cell’s mass, so the differential growth rates of cell evolution are largely a matter of differential protein production. (2) Proteins are today’s main maker molecules, catalysing the reactions of cell growth. (3) Proteins are unique in having sequence $ \to $ structure $ \to $ function relationships. Most other polymers, including most RNAs, do not. Proteins achieve their actions, functions and mechanisms by virtue of their native molecular structures. The folding code is primarily a hydrophobic (H) and polar (P) code, which other linear biomolecules do not have. Consequently, while some RNA molecules do fold uniquely and are catalysts, they are driven by different forces. Proteins are compact because they are dominated by hydrophobic tertiary interactions, whereas RNA molecules tend to be stringy because they are dominated by secondary-structure interactions of hydrogen bonding and base stacking. Moreover, because hydrogen bonds and base stacking are relatively sequence-independent, where chain slippage leads to many local minima in free energy, RNA folding landscapes are bumpier and less funnelled than protein landscapes (Chen and Dill, 2000). Thus, even RNAs that actually have folded structures tend to have multiple ones, and those structures are only weakly specified by RNA sequences. (4) Proteins’ unique folded states make proteins good catalysts. Folded proteins are miniature solids. Being a solid is exactly what is needed to catalyse chemical reactions, because catalyst atoms need to hold their places long enough to assist the reaction. (5) A 20 amino acid alphabet spans a range of chemistries, so proteins can catalyse a range of reactions. For these purposes, RNA molecules are not as good as proteins. Even where a given reaction can be catalysed by either proteins or RNAs, proteins are often better (Plaxco and Gross, 2021). (6) While some RNA molecules can self-copy, those molecules would need to have very low error rates in order to persist (Eigen, 1971; Jeancolas et al., 2020). The first copying machines would have to have had near-perfect fidelities. However, exact copying would be too brittle, for the same reasons we explained above that winner-take-all (WTA) competitions are: without a way of generating diversity, exact copying is too prone to extinction in the face of environmental changes. In our view, prebiotic forces did not aim at self-copying; they aimed instead towards becoming autocatalytic sets, not strict autocatalysts. Variance is crucial. Progeny must not be identical to moms. The origins process must have some aspects of replication that are also to some degree unfaithful.

In terms of a dynamical process, protein folding has pertinent features. Protein folding entails a probabilistic needle-in-a-haystack search challenge through a disorder-to-order transformation. The folding search problem is now well understood in terms of funnel-shaped energy landscapes (Chan and Dill, 1991; Dill and Shortle, 1991; Wolynes et al., 1995; Onuchic et al., 1997; Wolynes, 1997; Dill et al., 2007, 2008; Thirumalai et al., 2010; Rollins and Dill, 2014; Nassar et al., 2021). Fig. 2 compares a funnel landscape to a ‘golf-course’ landscape, which is premised on the assumption of uncorrelated independent events. ‘Funnel’ refers to the coarsest level of kinetic features, and not the potentially many finer-grained kinetic traps. Protein folding occurs so rapidly and towards such a unique ordered state because small random local steps combine together to lead effectively to the native state. In short, many proteins fold by rapidly finding needles in haystacks and creating complex watchmaker-like structures through small random correlated actions following combinatorially many microscopic routes via opportunistic chemical preferences. Protein folding gives both a metaphor for needle-in-a-haystack searching and a specific physical process, as described below, that could have become evolutionary dynamics.

Fig. 2. Different landscapes of stochastic exploration: Golf courses versus Funnel Landscapes. Lateral directions are sampling degrees of freedom; the up-down direction is some measure of value (more value is downhill). (Left) Blind Watchmaker, Needle-in-a-Haystack: all states, except one, have no value. Success is nearly impossible. (Right) Funnel Landscape: From any starting point, there is often some direction that gives incremental advantage. And there are many routes for chaining together small advantages to more global advantage (black lines). Success is nearly inevitable. We believe evolution, once it gets going, is more funnel-like.

Emergent autocatalysis from HP foldcats

Here is our hypothesis, first in overview, then in more detail. We postulate that prebiotic syntheses could produce short peptide chains, some of which collapse into compact structures in water because of their hydrophobic content. A fraction of those collapsed chains will have exposed hydrophobic surfaces that act as primitive catalytic sites, slightly accelerating the binding and elongation of other peptides. Computer simulations show that this mechanism leads to autocatalytic sets. The premise that amino acids could be produced and could polymerise into short random peptide chains under plausible prebiotic conditions is well established (Miller and Urey, 1959; Wächtershäuser, 1988; Botta and Bada, 2002; Johnson et al., 2008; Lambert, 2008; Ikehara, 2014; Foden et al., 2020; Frenkel-Pinter et al., 2020; Muchowska et al., 2020; Holden et al., 2022; Krasnokutski et al., 2022).

However, existing peptide synthesis experiments do not explain how chains could have become long enough to fold and function like proteins; how they could become catalysts and makers; how the process could become autocatalytic; how they could give non-random sequence $ \to $ structure relationships; how catalysis became mobile; or what are the molecular origins of fitness. We address these below.

Fig. 3 illustrates the chain elongation challenge. Typical polymer syntheses give mostly only short chains that are not long enough to fold and function as today’s proteins do. However, it has been found in computer modelling that some heteropolymers behave differently (Guseva et al., 2017). Chains that have particular sequences of hydrophobic (H) and polar (P) types of monomers, called HP polymers, collapse in water into compact states due to the hydrophobic effect. Even some relatively short sequences can collapse. Here, we call those chains foldamers. Furthermore, a small fraction of foldamer sequences can act as primitive catalysts, described below.

Fig. 3. Traditional polymerisations give mostly short chains, described by the Flory distribution. (a) A stationary catalyst polymerisation scheme. (b) Examples of the resulting Flory distribution, fit to experimental data: orange (Kanavarioti et al., 2001), pink (Ferris, 1999), and blue (Ding et al., 1996). Populations diminish exponentially with chain length. This plot was reprinted with permission from Guseva et al. (2017).
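The exponential fall-off in Fig. 3 can be made concrete with the standard Flory (most probable) distribution, in which each growing chain adds another monomer with probability $ p $, so the number fraction of chains of length $ n $ is $ (1-p)\,p^{n-1} $. The value of `p` below is an assumption chosen for illustration and is not fitted to the cited experiments.

```python
def flory_fraction(n, p):
    """Number fraction of chains with exactly n monomers: (1 - p) * p**(n - 1)."""
    return (1.0 - p) * p ** (n - 1)


def fraction_at_least(n, p):
    """Fraction of chains with n or more monomers; the geometric tail sums to p**(n - 1)."""
    return p ** (n - 1)


p = 0.5  # assumed bond-formation probability, for illustration only
for n in (1, 5, 10, 20):
    print(n, flory_fraction(n, p), fraction_at_least(n, p))
```

With the assumed $ p = 0.5 $, only $ 0.5^{9} = 1/512 $ of chains reach ten or more monomers, which is the elongation challenge in numerical form.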

HP chains can fold, catalyse and elongate

Fig. 4 shows that HP chain molecules have three general classes of behaviour in water, depending on their sequence of H and P monomers. (1) Some chains do not fold at all (think of the all-P sequence, for example). (2) Some HP sequences are foldamers, compact with hydrophobic cores. And (3) a fraction of HP foldamers happen to have surface patches that are concentrated in hydrophobic monomers; we call these surface regions ‘landing pads’, because these are regions that are sticky for other hydrophobic molecules floating in solution. Landing pads can be regions of catalysis. We call collapsed chains having landing pads foldcats, short for foldamer-catalysts.

Fig. 4. Some HP chains can fold in water. HP chains are heteropolymers of hydrophobic (H, red) and polar (P, blue) monomers. Some sequences will not fold (nonfolders), while others will fold into compact states in water (foldamers), and a fraction of those sequences will have surface sites that can catalyse other reactions (foldcats).
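The distinction between nonfolders and foldamers can be illustrated with a toy HP lattice calculation. The sketch assumes a 2D square lattice, an energy of −1 per non-bonded H–H contact, and the criterion that a foldamer is a sequence with a single lowest-energy conformation. These are common HP-model conventions, adopted here as assumptions; the cited simulations may differ in detail. Landing-pad detection is not attempted.

```python
from itertools import product

MOVES = ((1, 0), (-1, 0), (0, 1), (0, -1))


def conformations(n):
    """Yield self-avoiding walks of n >= 2 monomers on a 2D square lattice.

    The first bond is fixed along +x and the first turn must go to +y,
    so rotated and mirrored copies are counted only once.
    """
    def grow(path, occupied):
        if len(path) == n:
            first_turn = next((y for _, y in path if y != 0), 0)
            if first_turn >= 0:
                yield tuple(path)
            return
        x, y = path[-1]
        for dx, dy in MOVES:
            nxt = (x + dx, y + dy)
            if nxt not in occupied:
                path.append(nxt)
                occupied.add(nxt)
                yield from grow(path, occupied)
                path.pop()
                occupied.remove(nxt)

    yield from grow([(0, 0), (1, 0)], {(0, 0), (1, 0)})


def energy(seq, path):
    """Return minus the number of H-H lattice contacts between non-bonded monomers."""
    index = {site: i for i, site in enumerate(path)}
    contacts = 0
    for i, (x, y) in enumerate(path):
        if seq[i] != "H":
            continue
        for dx, dy in ((1, 0), (0, 1)):  # look right and up so each pair counts once
            j = index.get((x + dx, y + dy))
            if j is not None and seq[j] == "H" and abs(i - j) > 1:
                contacts += 1
    return -contacts


def classify(seq):
    """Return (lowest energy, number of lowest-energy conformations, is_foldamer)."""
    energies = [energy(seq, path) for path in conformations(len(seq))]
    e_min = min(energies)
    degeneracy = energies.count(e_min)
    return e_min, degeneracy, e_min < 0 and degeneracy == 1


print(classify("HPPH"))  # (-1, 1, True): one U-shaped state brings the two H ends together
print(classify("PPPP"))  # (0, 5, False): all five conformations tie at zero energy

n = 6
folders = [s for s in map("".join, product("HP", repeat=n)) if classify(s)[2]]
print(f"{len(folders)} of {2 ** n} sequences of length {n} have a unique ground state")
```

Exhaustive enumeration is only practical for short chains, since the number of conformations grows exponentially with length.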

These landing pads on foldcats could catalyse the covalent elongation of other ‘client’ HP sequences. The mechanism of this catalysis process is shown in Fig. 5. Each foldcat sequence balls up, leaving a sticky spot (clustered H monomers) on its surface. A different peptide chain, call it a client, lands with its H monomers binding hydrophobically to the landing pad of the foldcat. A free H monomer from solution also lands on the landing pad. The spatial colocalization of the H monomer adjacent to the client chain can reduce the kinetic barrier to elongation of the client chain. The foldcat’s job is to keep all the required pieces for elongation (the growing chain and a free monomer) in the same place. Peptide bond formation has a transition state barrier of 18 kcal mol⁻¹ (Gindulyte et al., 2006). Spatial localization of two reactants, often called proximity effects or enhanced effective concentrations, is known to accelerate covalent bond formation reactions by a factor of as much as $ 10^8 $ (Menger and Nome, 2019). For illustration, we have supposed only a binary code and only hydrophobicity-based landing pads. More realistically, a code will have more than two amino acid types, and more diverse interactions. The expectation that prebiotic peptides would have had both H and P amino acids is supported by the Miller–Urey experiment and recent variants of it (Miller and Urey, 1959; Botta and Bada, 2002; Johnson et al., 2008). A proposed minimal set would be GADV peptides (Ikehara, 2014), although there would be value in including cysteine (Foden et al., 2020), and lysine or arginine for breadth of chemistry and control of aggregation.

Fig. 5. HP foldcat chains can catalyse the elongation of other peptides. From the left: an HP sequence folds and exposes a hydrophobic surface (‘landing pad’), a site on which a different chain can land and add a monomer to grow longer.
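To put the quoted numbers on a common scale, the short calculation below converts a rate enhancement into an equivalent lowering of the activation barrier. It assumes transition-state-theory scaling, rate ∝ exp(−ΔG‡/RT), with an unchanged prefactor and a temperature of 298.15 K; both are assumptions of this sketch, not statements from the text.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal mol^-1 K^-1
T_ASSUMED = 298.15    # assumed temperature in kelvin


def barrier_drop(rate_factor, temperature=T_ASSUMED):
    """Barrier reduction (kcal/mol) equivalent to a given rate enhancement."""
    return R_KCAL * temperature * math.log(rate_factor)


def rate_enhancement(drop_kcal, temperature=T_ASSUMED):
    """Rate enhancement produced by lowering the barrier by drop_kcal (kcal/mol)."""
    return math.exp(drop_kcal / (R_KCAL * temperature))


drop = barrier_drop(1e8)
print(f"a 10^8-fold speed-up corresponds to about {drop:.1f} kcal/mol")
print(f"effective barrier: about {18.0 - drop:.1f} of the original 18 kcal/mol")
```

Under these assumptions, the upper-end proximity effect of $ 10^8 $ corresponds to removing roughly 11 of the 18 kcal mol⁻¹.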

This HP foldcat mechanism has recently been observed and explored in computer simulations (Guseva et al., 2017). First, note that all these effects would likely have been almost negligibly small at first. Foldamers constitute only a fraction of all HP sequences; foldcats constitute an even smaller fraction; and colocalization-based rate enhancements are unlikely to be greater than a few $ \mathrm{kT} $ in free energy (based on hydrophobicity estimates). But it is not the smallness of populations or actions that matters. Rather, it is whether one step to the next entails some form of systematic positive cooperativity. What matters for origins (as well as for evolution in general) is whether some sub-population, even a very small one, is capable of some action – call it emergent behaviour – of positive feedback, so that it grows relative to other sub-populations, ultimately overcoming the relatively fixed forces of degradation. In general, a big challenge in origins-of-life research is that the initial seeding event is likely to be a very small signal in very large noise – precisely the sort of event for which devising a good experiment is difficult. Below, we describe how the HP foldcat mechanism predicts such emergent behaviours.
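A deliberately minimal worked equation, which is not taken from the cited simulations, shows why feedback matters more than initial size. Let $ x $ be the concentration of a sub-population, and assume a constant uncatalysed synthesis rate $ s $, a per-molecule self-promoted production rate $ a $, and a degradation rate $ d $ (all three symbols are hypothetical):

$$ \frac{dx}{dt} = s + (a - d)\,x, \qquad x(t) = \left(x_0 + \frac{s}{a - d}\right) e^{(a - d)t} - \frac{s}{a - d} \quad (a \ne d). $$

For $ a < d $, the population settles at the small steady value $ x^{*} = s/(d - a) $. For $ a > d $, it grows exponentially however small $ x_0 $ and $ s $ are. In any real system that growth would eventually be capped, for example by monomer supply, which this sketch ignores.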

How the folding process leads to the evolution process

Here are the emergent behaviours of the HP folding and catalysis mechanism.

• Emergence of makers, catalysts and molecular functionalities. From the short random peptides that are plausibly synthesised prebiotically, the HP foldcat mechanism produces longer chains; see Fig. 6a. On average, longer HP chains are more stably folded and more protein-like (because they bury more hydrophobic surface). So, as long as amino acids are input, the HP foldamer mechanism pushes from peptides towards proteins, creating more catalytic power and functional diversity. These foldcats are makers that make makers. An alternative mechanism proposed for chain elongation is templated ligation, but it requires enzyme assistance (Tkachenko and Maslov, 2015; Kudella et al., 2021).
• Emergence of sequence-to-function relationships. This mechanism amplifies the populations of foldamers and foldcats, simply because foldcats are a larger proportion of longer-chain sequences. Foldcats form an autocatalytic set; see Fig. 6b. Such situations, where some sequences are populated selectively relative to other sequences based on their functionalities, are the basis for sequence-to-function relationships.
• Emergence of programmable mobile molecular machines. Presumably, the first prebiotic peptides were synthesised on macroscale catalysts, fixed in space and inflexible in their actions. But the foldcat mechanism then produces its own catalysts, poor at first and better later. This untethers the peptide catalysis process from fixed spaces. Now, catalysts are at the microscale: they are mobile, and they are diverse and programmable by virtue of the sampling of sequence space. We regard this untethering, from macro to micro and from fixed to mobile, as having been a transformative step from prebiotic chemistry to biology.

• Emergence of adaptation. Arguably, evolution’s central principle is that organisms adapt to environments. Evolution’s great power of innovation and resourcefulness comes from its mutational search, competition, and fitness-based selection. The HP foldamer perspective posits that such adaptivity could have originated from a disorder-to-order process, in which chain molecules sample different sequences; molecules compete for limited resources; and winners are those that are more stable and get more resources.
• What is ‘fitness’ among molecules? First, just persistence. Darwinian evolution chooses winners and losers based on fitness ratcheting. What are winners and losers in a prebiotic world of molecules? HP chains persist in stably folded states for longer or shorter times, based on their sequences. Longer chains are more stable because they bury more hydrophobic residues upon folding, and because compactness limits access to chemical agents that hydrolyze proteins. In unruly environments of booms and busts, molecules that are more stable persist by scavenging the recycled monomers and peptides from molecules that are less stable.
• Emergence of a tipping point from error catastrophes to success catastrophe. Prebiotic molecules are subject to degradation. Error catastrophes are unavoidable in direct-replicator mechanisms (Eigen, 1971; Jeancolas et al., 2020). Short peptides will hydrolyze to monomers. The origin of life was a tipping point from error catastrophes (where degradation dominates) to a ‘success catastrophe’, where maker molecules establish persistent populations. Beyond this point, evolution and growth prevail over degradation. Three factors explain this tipping point in the HP foldamer model: (1) Autocatalysis. As noted above, peptides grow longer, become more stably folded, and form an autocatalytic set. This contributes positive cooperativity towards self-sustainability. (2) A driven machine. Like a TV set that is ‘plugged in’, the HP foldamer mechanism is driven by a persistent input. The input is amino acids (and at early stages, also a catalyst of peptide synthesis). It does not matter that most product peptides fail and degrade; being ‘plugged in’ means that the system keeps pumping to push the chain lengths higher. (3) Adaptivity. Autocatalysis and input power alone are not sufficient. Environments are unruly. Biology would not have survived without adaptability to changing environments. The combination of these factors contributes to a drive to ratchet up persistence over time; see Fig. 7.
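The tipping point between degradation and growth can be illustrated with a deliberately minimal toy model; it is not the HP foldcat simulation of Guseva et al. (2017). Assume a single population $ x $ of maker molecules with a constant input rate $ s $, an autocatalytic growth rate $ a $ and a degradation rate $ d $, so that $ dx/dt = s + (a-d)x $. All symbols, parameter values and the function name are assumptions made for illustration only.

```python
def simulate(autocat_rate: float, degr_rate: float, input_rate: float = 1.0,
             x0: float = 0.0, dt: float = 0.01, t_end: float = 50.0) -> float:
    """Forward-Euler integration of dx/dt = input_rate + (autocat_rate - degr_rate) * x.

    Toy model only: one population, constant rates, no sequence or folding detail.
    Returns the population at time t_end.
    """
    x = x0
    for _ in range(int(round(t_end / dt))):
        x += dt * (input_rate + (autocat_rate - degr_rate) * x)
    return x


# Hypothetical rates on either side of the tipping point a = d.
below = simulate(autocat_rate=0.5, degr_rate=1.0)  # degradation dominates
above = simulate(autocat_rate=1.2, degr_rate=1.0)  # autocatalysis dominates

print(f"a < d: population settles near {below:.3f}")
print(f"a > d: population has grown to {above:.3e} and keeps growing")
```

In this sketch, when $ d $ exceeds $ a $ the population plateaus at $ s/(d-a) $ and persists only because the system is 'plugged in'; once $ a $ exceeds $ d $, growth outruns degradation. The third factor, adaptivity, is not represented here.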

Fig. 6. The HP Foldcat mechanism: (a) grows longer chains and (b) populates an autocatalytic set. (a) Starting from random short HP molecules, chains elongate (orange) more than in the traditional Flory distribution (black) or the case of foldamers only with no catalysis (green). (b) Active, folded foldcat sequences ( $ {A}_f $ and $ {B}_f $ ) amplify the populations of other growing foldcats ( $ {A}_{gr} $ and $ {B}_{gr} $ ) while they are unfolded ( $ {A}_u $ and $ {B}_u $ ), leading to an autocatalytic set.
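For reference, the Flory distribution used as the baseline in Fig. 6a has a simple closed form. The sketch below assumes the textbook 'most probable' form, in which each bond forms independently with probability $ p $, so the number fraction of chains of length $ n $ is $ (1-p)\,p^{n-1} $. The value $ p=0.5 $ and the function names are illustrative assumptions, not parameters from the simulations of Guseva et al. (2017).

```python
def flory_fraction(n: int, p: float) -> float:
    """Number fraction of chains of length n when each bond forms with probability p."""
    if n < 1 or not 0.0 <= p < 1.0:
        raise ValueError("need n >= 1 and 0 <= p < 1")
    return (1.0 - p) * p ** (n - 1)


def mean_length(p: float) -> float:
    """Number-average chain length of the Flory distribution, 1 / (1 - p)."""
    return 1.0 / (1.0 - p)


p = 0.5  # hypothetical bond-formation probability
for n in (1, 5, 10, 20):
    print(f"length {n:2d}: fraction {flory_fraction(n, p):.2e}")
print(f"mean length: {mean_length(p):.1f}")
```

Each added monomer multiplies the population by $ p $, which is the exponential fall-off with chain length that the foldcat mechanism must overcome.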

Fig. 7. The HP foldcat mechanism spontaneously grows populations of longer chains. (a) Longer proteins fold more stably and are more persistent to fluctuating environments. (b) A conceptual fitness landscape based on time of persistence of folding stability.

To summarise, the HP foldamer mechanism explains how peptide synthesis and folding could result in the emergence of evolution-like propagation; see Table 1. But to be clear, we regard this not as the origin of life itself, but rather only as a precursor to it. Origins surely required much more than this: cell-like encapsulation, information and heritability, and more (some further discussion is given in the Supplementary Material).

Table 1. Correspondence between properties of evolution and properties of origins, in the HP foldamer model described in the text

Evidence supporting the HP foldcat mechanism

Although there is no direct experiment testing this foldcat mechanism, several of its components are supported by experiments.

• HP chains can fold and function. The binary HP code dominates protein folding (Lim and Sauer, 1989; Bowie et al., 1990; Kamtekar et al., 1993; Dill et al., 1995; Dill and MacCallum, 2012; Koga et al., 2020). But also, a biomolecule backbone is not required; HP peptoids can fold and function too (Lee et al., 2005; Yoo and Kirshenbaum, 2008).
• Some random peptide sequences can fold. It is not an infinitely dilute space of sequences that can fold. As discussed in Guseva et al. (2017), for HP chains up to length 25, 2.3% fold to unique structures and 12.7% of those foldamers, or 0.3% of all sequences, have the foldcat catalytic surface.
• Peptide syntheses occur naturally. Even in interstellar space, 6–8-mer peptides have been found (Kebukawa et al., 2022; Krasnokutski et al., 2022). Sea spray or air-water surfaces could catalyse small peptide formation (Griffith and Vaida, 2012; Deal et al., 2021; Holden et al., 2022).
• Some short peptides can catalyse reactions (Adamala and Szostak, 2013; Rufo et al., 2014).
• Hydrophobic patches are common on proteins (Lijnzaad et al., 1996; Tonddast-Navaei and Skolnick, 2015), which we call landing pads in the foldamer mechanism.
• Some proteins are synthesised without ribosomes (Finking and Marahiel, 2004; Miller and Gulick, 2016).
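The foldcat fraction quoted above follows from multiplying the two percentages. A quick check, using only the figures cited from Guseva et al. (2017); the variable names are illustrative:

```python
# Fractions quoted in the text for HP chains up to length 25.
foldamer_fraction = 0.023        # 2.3% of sequences fold to unique structures
foldcat_given_foldamer = 0.127   # 12.7% of those foldamers have the catalytic surface

# Fraction of all sequences that are foldcats.
foldcat_fraction = foldamer_fraction * foldcat_given_foldamer
print(f"foldcats among all sequences: {foldcat_fraction:.2%}")
```

The product is about 0.29%, consistent with the rounded 0.3% stated in the text.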

Outlook: from protein folding to evolution

We have posited that life could not have arisen without first having a Darwin-like propagation mechanism. We believe function came before information, because we know of no driving force for the reverse. Rather than genes using proteins to make new genes (as in the Selfish Gene hypothesis; Dawkins, 1976), our view is that proteins use genes to make new proteins. And the foldcat mechanism indicates a way that the middleman – the gene – was simply not needed at first. This mechanism is based on solution physics – the oil–water and hydrogen-bonding forces of protein folding, the ability of miniature solids having different chemical moieties to catalyse reactions, and the ability of random syntheses to find and retain useful sequences based on their persistences. The foldcat mechanism addresses an important problem of origins research: it does not require the guiding hand of a researcher who chooses molecules, systems or processes. Instead, the foldcat mechanism is a disorder-to-order transition that bootstraps functional advantages that it finds from random search.

Open peer review

To view the open peer review materials for this article, please visit http://doi.org/10.1017/qrd.2023.2.

Supplementary materials

To view supplementary material for this article, please visit http://doi.org/10.1017/qrd.2023.2.

Acknowledgements

We thank Luca Agozzino and Gabor Balazsi for early discussions and Charlie Carter for extensive insightful comments. We are grateful to the Templeton Foundation and the Laufer Center for their support.

Author contribution

All authors contributed equally to this work.

Financial support

This work was supported by the John Templeton Foundation (grant ID 62564).

Competing interest

The authors declare none.

References

Adamala, K and Szostak, JW (2013) Competition between model protocells driven by an encapsulated catalyst. Nature Chemistry 5(6), 495–501.

Armstrong, RA and McGehee, R (1980) Competitive exclusion. The American Naturalist 115(2), 151–170.

Atkins, JF, Gesteland, RF and Cech, T (eds) (2011) RNA Worlds: From Life’s Origins to Diversity in Gene Regulation. New York: Cold Spring Harbor Laboratory Press.

Barabás, G, D’Andrea, R and Stump, SM (2018) Chesson’s coexistence theory. Ecological Monographs 88(3), 277–303.

Beddington, JR (1975) Mutual interference between parasites or predators and its effect on searching efficiency. The Journal of Animal Ecology 44, 331–340.

Benner, SA (2010) Defining life. Astrobiology 10(10), 1021–1030.

Botta, O and Bada, JL (2002) Extraterrestrial organic compounds in meteorites. Surveys in Geophysics 23, 411–467.

Bowie, JU, Reidhaar-Olson, JF, Lim, WA and Sauer, RT (1990) Deciphering the message in protein sequences: Tolerance to amino acid substitutions. Science 247(4948), 1306–1310.

Carter, CW Jr and Kraut, J (1974) A proposed model for interaction of polypeptides with RNA. Proceedings of the National Academy of Sciences 71(2), 283–287.

Chan, HS and Dill, KA (1991) Polymer principles in protein structure and stability. Annual Review of Biophysics and Biophysical Chemistry 20(1), 447–490.

Charlebois, DA and Balázsi, G (2016) Frequency-dependent selection: A diversifying force in microbial populations. Molecular Systems Biology 12(8), 880.

Chen, S-J and Dill, KA (2000) RNA folding energy landscapes. Proceedings of the National Academy of Sciences 97(2), 646–651.

Chesson, P (1990) MacArthur’s consumer-resource model. Theoretical Population Biology 37(1), 26–38.

Chesson, P (2000) Mechanisms of maintenance of species diversity. Annual Review of Ecology and Systematics 31, 343–366.

Chesson, P and Kuang, JJ (2008) The interaction between predation and competition. Nature 456(7219), 235–238.

Cleland, CE and Chyba, CF (2002) Defining ‘life’. Origins of Life and Evolution of the Biosphere 32(4), 387–393.

Cody, GD (2004) Transition metal sulfides and the origins of metabolism. Annual Review of Earth and Planetary Sciences 32, 569.

Dawkins, R (1976) The Selfish Gene. New York: Oxford University Press.

Dawkins, R (1996) The Blind Watchmaker: Why the Evidence of Evolution Reveals a Universe without Design. New York: WW Norton & Company.

De Duve, C (1995) The beginnings of life on earth. American Scientist 83(5), 428–437.

De Duve, C and De Neufville, R (1991) Blueprint for a Cell: The Nature and Origin of Life. Burlington, NC: Carolina Biological Supply Company.

Deal, AM, Rapf, RJ and Vaida, V (2021) Water–air interfaces as environments to address the water paradox in prebiotic chemistry: A physical chemistry perspective. The Journal of Physical Chemistry A 125(23), 4929–4942.

Deamer, D (2011) First Life: Discovering the Connections between Stars, Cells, and How Life Began. Berkeley, CA: University of California Press.

DeAngelis, DL, Goldstein, R and O’Neill, RV (1975) A model for trophic interaction. Ecology 56(4), 881–892.

Dill, KA, Bromberg, S, Yue, K, Chan, HS, Fiebig, KM, Yee, DP and Thomas, PD (1995) Principles of protein folding—A perspective from simple exact models. Protein Science 4(4), 561–602.

Dill, KA and MacCallum, JL (2012) The protein-folding problem, 50 years on. Science 338(6110), 1042–1046.

Dill, KA, Ozkan, SB, Shell, MS and Weikl, TR (2008) The protein folding problem. Annual Review of Biophysics 37, 289.

Dill, KA, Ozkan, SB, Weikl, TR, Chodera, JD and Voelz, VA (2007) The protein folding problem: When will it be solved? Current Opinion in Structural Biology 17(3), 342–346.

Dill, KA and Shortle, D (1991) Denatured states of proteins. Annual Review of Biochemistry 60(1), 795–825.

Ding, PZ, Kawamura, K and Ferris, JP (1996) Oligomerization of uridine phosphorimidazolides on montmorillonite: A model for the prebiotic synthesis of RNA on minerals. Origins of Life and Evolution of the Biosphere 26, 151–171.

Doebeli, M, Jaque, EC and Ispolatov, Y (2021) Boom-bust population dynamics increase diversity in evolving competitive communities. Communications Biology 4(1), 1–8.

Dyson, FJ (1982) A model for the origin of life. Journal of Molecular Evolution 18(5), 344–350.

Dyson, F (1999) Origins of Life. Cambridge: Cambridge University Press.

Eigen, M (1971) Selforganization of matter and the evolution of biological macromolecules. Naturwissenschaften 58(10), 465–523.

Ferris, JP (1999) Prebiotic synthesis on minerals: Bridging the prebiotic and RNA worlds. The Biological Bulletin 196(3), 311–314.

Finking, R and Marahiel, MA (2004) Biosynthesis of nonribosomal peptides. Annual Review of Microbiology 58, 453.

Fisher, R (1930) The Genetical Theory of Natural Selection. Oxford: Clarendon.

Foden, CS, Islam, S, Fernández-García, C, Maugeri, L, Sheppard, TD and Powner, MW (2020) Prebiotic synthesis of cysteine peptides that catalyze peptide ligation in neutral water. Science 370(6518), 865–869.

Frenkel-Pinter, M, Haynes, JW, Mohyeldin, AM, Sargon, AB, Petrov, AS, Krishnamurthy, R, Hud, NV, Williams, LD and Leman, LJ (2020) Mutually stabilizing interactions between proto-peptides and RNA. Nature Communications 11(1), 3137.

Gause, G (1934) The Struggle for Existence. Baltimore, MD: Williams and Wilkins.

Gilbert, W (1986) Origin of life: The RNA world. Nature 319(6055), 618–618.

Gindulyte, A, Bashan, A, Agmon, I, Massa, L, Yonath, A and Karle, J (2006) The transition state for formation of the peptide bond in the ribosome. Proceedings of the National Academy of Sciences 103(36), 13327–13332.

Goyal, A, Dubinkina, V and Maslov, S (2018) Multiple stable states in microbial communities explained by the stable marriage problem. The ISME Journal 12(12), 2823–2834.

Griffith, EC and Vaida, V (2012) In situ observation of peptide bond formation at the water–air interface. Proceedings of the National Academy of Sciences 109(39), 15697–15701.

Guseva, E, Zuckermann, RN and Dill, KA (2017) Foldamer hypothesis for the growth and sequence differentiation of prebiotic polymers. Proceedings of the National Academy of Sciences 114(36), E7460–E7468.

Holden, DT, Morato, NM and Cooks, RG (2022) Aqueous microdroplets enable abiotic synthesis and chain extension of unique peptide isomers from free amino acids. Proceedings of the National Academy of Sciences 119(42), e2212642119.

Hordijk, W (2019) A history of autocatalytic sets. Biological Theory 14(4), 224–246.

Hordijk, W and Steel, M (2014) Conditions for evolvability of autocatalytic sets: A formal example and analysis. Origins of Life and Evolution of Biospheres 44(2), 111–124.

Hordijk, W, Steel, M and Kauffman, S (2022) Autocatalytic sets arising in a combinatorial model of chemical evolution. Life 12(11), 1703.

Hsu, S-B, Hubbell, S and Waltman, P (1977) A mathematical theory for single-nutrient competition in continuous cultures of micro-organisms. SIAM Journal on Applied Mathematics 32(2), 366–383.

Hutchinson, GE (1961) The paradox of the plankton. The American Naturalist 95(882), 137–145.

Ikehara, K (2014) [GADV]-protein world hypothesis on the origin of life. Origins of Life and Evolution of Biospheres 44, 299–302.

Jain, S and Krishna, S (2002) Large extinctions in an evolutionary model: The role of innovation and keystone species. Proceedings of the National Academy of Sciences 99(4), 2055–2060.

Jeancolas, C, Malaterre, C and Nghe, P (2020) Thresholds in origin of life scenarios. iScience 23(11), 101756.

Johnson, AP, Cleaves, HJ, Dworkin, JP, Glavin, DP, Lazcano, A and Bada, JL (2008) The Miller volcanic spark discharge experiment. Science 322(5900), 404–404.

Jordan, SF, Ioannou, I, Rammu, H, Halpern, A, Bogart, LK, Ahn, M, Vasiliadou, R, Christodoulou, J, Maréchal, A and Lane, N (2021) Spontaneous assembly of redox-active iron-sulfur clusters at low concentrations of cysteine. Nature Communications 12(1), 1–14.

Joyce, G, Deamer, DW and Fleischaker, G (1994) Foreword to Origins of Life: The Central Concepts. Boston, MA: Jones and Bartlett Publishers.

Joyce, GF and Szostak, JW (2018) Protocells and RNA self-replication. Cold Spring Harbor Perspectives in Biology 10(9), a034801.

Kamtekar, S, Schiffer, JM, Xiong, H, Babik, JM and Hecht, MH (1993) Protein design by binary patterning of polar and nonpolar amino acids. Science 262(5140), 1680–1685.

Kanavarioti, A, Monnard, P-A and Deamer, DW (2001) Eutectic phases in ice facilitate nonenzymatic nucleic acid synthesis. Astrobiology 1(3), 271–281.

Kauffman, SA (1971) Cellular homeostasis, epigenesis and replication in randomly aggregated macromolecular systems. Journal of Cybernetics 1, 71–96.

Kauffman, SA (1986) Autocatalytic sets of proteins. Journal of Theoretical Biology 119(1), 1–24.

Kauffman, SA, et al. (1993) The Origins of Order: Self-Organization and Selection in Evolution. Oxford: Oxford University Press.

Kebukawa, Y, Asano, S, Tani, A, Yoda, I and Kobayashi, K (2022) Gamma-ray-induced amino acid formation in aqueous small bodies in the early solar system. ACS Central Science 8, 1664–1671.

Kocher, CD and Dill, KA (2023) Darwinian evolution as a dynamical principle. Proceedings of the National Academy of Sciences 120(11), e2218390120.

Koga, R, Yamamoto, M, Kosugi, T, Kobayashi, N, Sugiki, T, Fujiwara, T and Koga, N (2020) Robust folding of a de novo designed ideal protein even with most of the core mutated to valine. Proceedings of the National Academy of Sciences 117(49), 31149–31156.

Krasnokutski, S, Chuang, K-J, Jäger, C, Ueberschaar, N and Henning, T (2022) A pathway to peptides in space through the condensation of atomic carbon. Nature Astronomy 6(3), 381–386.

Kudella, PW, Tkachenko, AV, Salditt, A, Maslov, S and Braun, D (2021) Structured sequences emerge from random pool when replicated by templated ligation. Proceedings of the National Academy of Sciences 118(8), e2018830118.

Lambert, J-F (2008) Adsorption and polymerization of amino acids on mineral surfaces: A review. Origins of Life and Evolution of Biospheres 38(3), 211–242.

Lee, B-C, Zuckermann, RN and Dill, KA (2005) Folding a nonbiological polymer into a compact multihelical structure. Journal of the American Chemical Society 127(31), 10999–11009.

Lifson, S (1997) On the crucial stages in the origin of animate matter. Journal of Molecular Evolution 44(1), 1–8.

Lijnzaad, P, Berendsen, HJ and Argos, P (1996) Hydrophobic patches on the surfaces of protein structures. Proteins: Structure, Function, and Bioinformatics 25(3), 389–397.

Lim, WA and Sauer, RT (1989) Alternative packing arrangements in the hydrophobic core of λ repressor. Nature 339(6219), 31–36.

MacArthur, R (1970) Species packing and competitive equilibrium for many species. Theoretical Population Biology 1(1), 1–11.

Machery, E (2012) Why I stopped worrying about the definition of life… And why you should as well. Synthese 185(1), 145–164.

Martin, W, Baross, J, Kelley, D and Russell, MJ (2008) Hydrothermal vents and the origin of life. Nature Reviews Microbiology 6(11), 805–814.

Matsuo, M and Kurihara, K (2021) Proliferating coacervate droplets as the missing link between chemistry and biology in the origins of life. Nature Communications 12(1), 1–13.

Maury, CPJ (2009) Self-propagating β-sheet polypeptide structures as prebiotic informational molecular entities: The amyloid world. Origins of Life and Evolution of Biospheres 39(2), 141–150.

Maury, CPJ (2015) Origin of life. Primordial genetics: Information transfer in a pre-RNA world based on self-replicating beta-sheet amyloid conformers. Journal of Theoretical Biology 382, 292–297.

Maury, CPJ (2018) Amyloid and the origin of life: Self-replicating catalytic amyloids as prebiotic informational and protometabolic entities. Cellular and Molecular Life Sciences 75(9), 1499–1507.

Menger, FM and Nome, F (2019) Interaction vs. preorganization in enzyme catalysis. A dispute that calls for resolution. ACS Chemical Biology 14(7), 1386–1392.

Miller, BR and Gulick, AM (2016) Structural biology of nonribosomal peptide synthetases. In Nonribosomal Peptide and Polyketide Biosynthesis. New York, NY: Springer, pp. 3–29.

Miller, SL and Urey, HC (1959) Organic compound synthesis on the primitive earth: Several questions about the origin of life have been answered, but much remains to be studied. Science 130(3370), 245–251.

Muchowska, KB, Varma, SJ and Moran, J (2020) Nonenzymatic metabolic reactions and life’s origins. Chemical Reviews 120(15), 7708–7744.

Nassar, R, Dignon, GL, Razban, RM and Dill, KA (2021) The protein folding problem: The role of theory. Journal of Molecular Biology 433(20), 167126.

Novak, M and Stouffer, DB (2021) Systematic bias in studies of consumer functional responses. Ecology Letters 24(3), 580–593.

Onuchic, JN, Luthey-Schulten, Z and Wolynes, PG (1997) Theory of protein folding: The energy landscape perspective. Annual Review of Physical Chemistry 48(1), 545–600.

Plaxco, KW and Gross, M (2021) Astrobiology: An Introduction. Baltimore, MD: Johns Hopkins University Press.

Popa, R (2004) Between Necessity and Probability: Searching for the Definition and Origin of Life. Berlin: Springer Science & Business Media.

Pross, A (2011) Toward a general theory of evolution: Extending Darwinian theory to inanimate matter. Journal of Systems Chemistry 2(1), 1–14.

Pross, A (2016) What Is Life? How Chemistry Becomes Biology. Oxford: Oxford University Press.

Robertson, MP and Joyce, GF (2012) The origins of the RNA world. Cold Spring Harbor Perspectives in Biology 4(5), a003608.

Rollins, GC and Dill, KA (2014) General mechanism of two-state protein folding kinetics. Journal of the American Chemical Society 136(32), 11420–11427.

Rufo, CM, Moroz, YS, Moroz, OV, Stöhr, J, Smith, TA, Hu, X, DeGrado, WF and Korendovych, IV (2014) Short peptides self-assemble to produce catalytic amyloids. Nature Chemistry 6(4), 303–309.

Schrödinger, E (1944) What Is Life? Cambridge: Cambridge University Press.

Segré, D, Ben-Eli, D, Deamer, DW and Lancet, D (2001) The lipid world. Origins of Life and Evolution of the Biosphere 31(1), 119–145.

Shapiro, R (2006) Small molecule interactions were central to the origin of life. The Quarterly Review of Biology 81(2), 105–126.

Stouffer, DB and Novak, M (2021) Hidden layers of density dependence in consumer feeding rates. Ecology Letters 24(3), 520–532.

Thirumalai, D, O’Brien, EP, Morrison, G and Hyeon, C (2010) Theoretical perspectives on protein folding. Annual Review of Biophysics 39(1), 159–183.

Tilman, D (1982) Resource Competition and Community Structure. Monographs in Population Biology. Princeton, NJ: Princeton University Press.

Tkachenko, AV and Maslov, S (2015) Spontaneous emergence of autocatalytic information-coding polymers. The Journal of Chemical Physics 143(4), 07B612_1.

Tonddast-Navaei, S and Skolnick, J (2015) Are protein-protein interfaces special regions on a protein’s surface? The Journal of Chemical Physics 143(24), 12B631_1.

van Opheusden, JH, Hemerik, L, van Opheusden, M and van der Werf, W (2015) Competition for resources: Complicated dynamics in the simple Tilman model. SpringerPlus 4(1), 1–31.

Volterra, V (1928) Variations and fluctuations of the number of individuals in animal species living together. ICES Journal of Marine Science 3(1), 3–51.

Wächtershäuser, G (1988) Before enzymes and templates: Theory of surface metabolism. Microbiological Reviews 52(4), 452–484.

Wang, Z, Fridman, Y, Maslov, S and Goyal, A (2022) Fine-scale diversity of microbial communities due to satellite niches in boom-and-bust environments. PLOS Computational Biology 18, e1010244.

Wills, PR and Carter, CW (2018) Insuperable problems of the genetic code initially emerging in an RNA world. Biosystems 164, 155–166.

Wolynes, PG (1997) Folding funnels and energy landscapes of larger proteins within the capillarity approximation. Proceedings of the National Academy of Sciences 94(12), 6170–6175.

Wolynes, PG, Onuchic, JN and Thirumalai, D (1995) Navigating the folding routes. Science 267(5204), 1619–1620.

Yoo, B and Kirshenbaum, K (2008) Peptoid architectures: Elaboration, actuation, and application. Current Opinion in Chemical Biology 12(6), 714–721.

Fig. 1. Top: The DEM cycle. (a) At time $ t $ shows a population $ X $ of wild-type cells. (b) One cell mutates. That cell grows to have population $ Y $ in (c). Populations $ X $ and $ Y $ compete, one wins, and the cycle begins again as (a) at time $ t+1 $. Bottom: Fitness landscapes show the separation of actions. From point $ X $, mutation entails a random, relatively unbiased exploration (orange region). The third landscape shows selection, in this case of $ Y $, where the bias and preference occurs.

Fig. 2. Different landscapes of stochastic exploration: Golf courses versus Funnel Landscapes. Lateral directions are sampling degrees of freedom; the up-down direction is some measure of value (more value is downhill). (Left) Blind Watchmaker, Needle-in-a-Haystack: all states, except one, have no value. Success is nearly impossible. (Right) Funnel Landscape: From any starting point, there is often some direction that gives incremental advantage. And there are many routes for chaining together small advantages to more global advantage (black lines). Success is nearly inevitable. We believe evolution, once it gets going, is more funnel-like.

Fig. 3. Traditional polymerizations give mostly short chains, described by the Flory distribution. (a) A stationary catalyst polymerisation scheme. (b) Examples of the resulting Flory distribution, fit to experimental data: Orange (Kanavarioti et al., 2001), pink (Ferris, 1999), and blue (Ding et al., 1996). Populations rapidly diminish exponentially with chain length. This plot was reprinted with permission from Guseva et al. (2017).

Fig. 4. Some HP chains can fold in water. HP chains are heteropolymers of hydrophobic (H, red) and polar (P, blue) monomers. Some sequences will not fold (nonfolders), while others will fold into compact states in water (foldamers), and a fraction of those sequences will have surface sites that can catalyse other reactions (foldcats).



Review: Origins of life: First came evolutionary dynamics — R0/PR1

Published online by Cambridge University Press: 22 March 2023

DOI: https://doi.org/10.1017/qrd.2023.2.pr1

Rodrigo Gonzalo Parra

Life Sciences, Barcelona Supercomputing Center: Centro Nacional de Supercomputacion, Germany

Date of review: 22 February 2023

Revision round: 0

Role: reviewer

Recommendation/decision: minor-revision

Conflict of interest statement

Reviewer declares none.

Comments

Comments to Author: The authors present their work “Origins of life: first came evolutionary dynamics”, in which they describe a Darwinian Evolution Machine (DEM) that explains the evolution of a population in which mutations are possible. Mutations create distinct populations that compete for resources, and one winning population takes the system to a new equilibrium in which it becomes dominant. However, it is important to note that their DEM takes into account the possibility of peaceful coexistence between different populations instead of a winner-takes-all model. Peaceful coexistence would provide evolution with the robustness it needs; otherwise, extinction would most likely have been the rule. In addition, the DEM is not a closed system. Instead, the DEM sustains itself through the uptake of external resources.

Given these definitions, the authors then discuss which types of molecules would have had the necessary properties to be the protagonists in such a DEM model.

I think that the introduction is very well structured, as are the initial explanations that lead the reader to an understanding of the properties that the DEM's first makers need to have.

I find the logical steps explaining how certain molecules become makers, and how these in turn develop sequence-to-function relationships, really appealing. Also very interesting is the hypothesis that the important property for the initial emergence of makers is not RNA and its ability to replicate, but the ability of autocatalytic molecules to propagate dynamically. Furthermore, the way in which the authors link the previously introduced terms and concepts with funneled energy landscape theory is very nice.

To be honest, I have liked the paper a lot, and since it is more of a hypothesis that builds on several of the authors’ previous publications, I do not have much to correct.

I do have some questions though.

The authors say that the evolutionary landscape that led to the initial pre-protein molecules was somehow funneled and not a golf course. What I understand from this analogy is that the landscape, because of the biophysical properties of the early makers, was already funneled, and that success was therefore nearly inevitable (the authors say in the Fig. 2 legend that they believe evolution is more funnel-like). If this is true, do you think that the emergence of protein-like systems on planets like Earth is almost certain, given that such planets would already provide the conditions that funnel evolutionary landscapes? What would the consequences of this theoretical framework be for the likely development of life, or at least of early makers, beyond Earth?

The authors provide an explanation of the emergence of foldamers and foldcats based on a hydrophobic and polar set of amino acids. Could the authors hypothesize or reflect on what kind of alphabet would be needed for, and compatible with, the emergence of foldamers and foldcats? Do they have a preference for any study that has hypothesized about the size of the primitive alphabet at the time proteins emerged?

Minor comments

Fig. 3, which has panels a and b, is mentioned, but the individual panels are not. Panel b is difficult to understand. I found that a similar figure is reported in Guseva et al., PNAS 2017. There the figure is better explained (it also has another panel) and the colors are meaningful. I would improve that figure panel to make it easier to understand. I would also place the y-axis label next to the y-axis instead of in the corner; I was confused for a while about whether it was a title or an axis label.

In Fig. 4, there is no mention of the non-folders in the legend, as there is for the others.

Both Figs. 4 and 5 would benefit from saying which color is the H and which is the P type of amino acid. I know it is a conceptual figure, but I still tried to make sense of it mentally, and for a while that was hard.

Regarding Fig. 5 and the explanation from lines 186 to 193: I did not fully understand it. I more or less understand the mechanism, but some details escape me. In particular, I don’t understand how “These landing pads on foldcats could catalyze the COVALENT elongation of other client HP sequences.” Maybe a better description of the conceptual scheme in Fig. 5, or a modification of Fig. 5 to make it clearer, would help.

In Fig. 6, there is no explanation of the labels in panel b (Agr, Au, Af, etc.). What are they?

This is just a suggestion that, in my opinion, would close the paper in a nicer way. I am missing a connection between the title and the last part of the OUTLOOK section. The title states that “First Came evolutionary dynamics”. Although I understand what the authors are referring to throughout the manuscript, a final punch that returns to that concept more explicitly would be a really nice way to close the paper.

There is a repeated “the” at line 117 on page 4.

Fig. 8 appears in the supplementary material. Maybe it should be named Fig. S1?

The rest of the paper after Fig. 5 and its reference goes very smoothly and is nicely structured, making its interpretation really straightforward.

Review: Origins of life: First came evolutionary dynamics — R0/PR2

Published online by Cambridge University Press: 22 March 2023

DOI: https://doi.org/10.1017/qrd.2023.2.pr2

Shi-Jie Chen

Physics & Biochemistry, University of Missouri, United States

Date of review: 17 February 2023

Revision round: 0

Role: reviewer

Recommendation/decision: minor-revision

Conflict of interest statement

Reviewer declares none.

Comments

Comments to Editor: Hi Giulia, I am happy to review this manuscript (sorry that I clicked the “decline” button by mistake). Best wishes, Shi-Jie

Comments to Author: This manuscript presents a novel study of evolutionary dynamics. The authors first provided an in-depth description of the Darwinian Evolution Machine (DEM) cycle and its advantages for modeling the origins of life. The authors then discussed the puzzles about the molecular origin of the DEM and concluded that proteins have essential features for being the maker molecules. Finally, the authors presented the HP foldamer mechanism and demonstrated that short HP peptide chains can fold and catalyze the elongation of other peptides, resulting in an autocatalytic set and an evolution-like propagation.

This is an excellent manuscript with robust findings and interesting conclusions. I have only a few minor comments:

1. The figure captions lack sufficient details and do not fully describe the information presented in the panels. More detailed descriptions are necessary. Below are a few specific questions:

Fig. 3: In panel (b), what are the different lines and points representing?

Fig. 4 and Fig. 5: Why are the red and blue dots there? H monomers or P monomers?

Fig. 6: In panel (b), what do the symbols (Au, Af, etc.) mean?

2. The relationships between the different sections are not clear. Table 1 could be moved to the main text, and further discussion of the correspondence between the DEM cycle and the HP foldamer model could be provided.

3. Page 4, Line 141. “Folded proteins are nanoscale solids, different for different sequences. In contrast, because RNA molecules are stiff and hydrogen bonded, they tend to be stringier - less folded, less compact and more floppy - and with poorly defined sequence-to-structure relationships.”

Structures are generally more conserved than the sequences and different RNA sequences can fold to the same or similar structures. Indeed, RNAs are usually more flexible. It would be useful for the authors to clarify the meaning of “poorly defined sequence-to-structure relationships” for RNAs.

4. Page 4, Line 144. “Accurate self-copying is not the property we seek here.”

Why is accurate replication of informational polymers not a key property for the autocatalytic system? More discussion would be helpful here.

5. Page 4, the last paragraph describes the “golf course” landscape and an idealized “funnel” landscape for fast protein folding. What about more complex funnel landscapes, such as a global funnel-like landscape that involves kinetic traps and bumps? Can complex funnel landscapes also result in disorder-to-order transformation?

6. Page 5, Line 173. “However, it has been found in computer modeling that some heteropolymers behave differently.”

Citations of references are needed here.

7. Page 6, Line 186. “These landing pads on foldcats could catalyze the covalent elongation of other “client” HP sequences.”

In the HP foldamer mechanism, how long can a sequence be elongated? How does the covalent elongation occur when the H monomer is close to the landing pads on the foldcats?

8. The key idea behind the HP foldamer mechanism comes from hydrophobic effects, where the hydrophobic “landing pads” can attract peptides and catalyze polymerization. The “landing pads” idea is quite interesting and could be generalized in the discussion. For example, nucleotides in a loop region of a folded RNA may also serve as “landing anchors” for other nucleotides or short chains through base pairing.

Recommendation: Origins of life: First came evolutionary dynamics — R0/PR3

Published online by Cambridge University Press: 22 March 2023

DOI: https://doi.org/10.1017/qrd.2023.2.pr3

Giulia Palermo

Department of Chemistry and Biochemistry, University of California San Diego, United States

Revision round: 0

Role: Associate Editor

Recommendation/decision: minor-revision

Comments

Comments to Author: Reviewer #2: The authors present their work “Origins of life: First came evolutionary dynamics”, in which they describe a Darwinian Evolution Machine (DEM) that explains the evolution of a population in which mutations are possible. Mutations create differential populations that compete for resources, leading the system to a new equilibrium in which one winning population becomes dominant. However, it is important to note that their DEM allows for peaceful coexistence between different populations rather than a winner-takes-all model. Peaceful coexistence would give evolution the robustness it needs to be possible; otherwise extinction would most likely have been the rule. In addition, the DEM is not a closed system; instead, it sustains itself through the uptake of external resources.

Given these definitions, the authors then examine which types of molecules would have had the necessary properties to be the protagonists in such a DEM model.

I think that the introduction is very well structured, as are the initial explanations that lead the reader to an understanding of the properties that the DEM’s first makers need to have.

To be honest, I have liked the paper a lot, and since it is more of a hypothesis that builds on several of the authors’ previous publications, I do not have much to correct.

I do have some questions though.

Minor comments

In Fig. 4, there is no mention of the non-folders in the legend, as there is for the others.

In Fig. 6, there is no explanation of the labels in panel b (Agr, Au, Af, etc.). What are they?

There is a repeated “the” at line 117 on page 4.

Fig. 8 appears in the supplementary material. Maybe it should be named Fig. S1?

The rest of the paper after Fig. 5 and its reference goes very smoothly and is nicely structured, making its interpretation really straightforward.

Reviewer #3: This manuscript presents a novel study of evolutionary dynamics. The authors first provided an in-depth description of the Darwinian Evolution Machine (DEM) cycle and its advantages for modeling the origins of life. The authors then discussed the puzzles about the molecular origin of the DEM and concluded that proteins have essential features for being the maker molecules. Finally, the authors presented the HP foldamer mechanism and demonstrated that short HP peptide chains can fold and catalyze the elongation of other peptides, resulting in an autocatalytic set and an evolution-like propagation.