THE REPLICATOR'S NEW CLOTHES.
Although the autogenic theory proposed earlier in this book was intended primarily as a model system for elucidating the logic of the emergence of teleodynamic processes, it also offers a partial answer to Bastian's challenge and an approach to the origin of life. It is, however, something less than an exemplar of spontaneous generation, and considerably less than an account of the last universal common ancestor (often identified with the acronym LUCA) of living organisms. This is because autogens are not alive, at least not in any current sense of that concept. They lack persistent non-equilibrium dynamics, diffusible surfaces, genetic information, an autonomously implemented reproductive process, or any means of selectively reacting to or acting upon their environment in a way that is self-supportive. Yet autogenic theory may provide something equally useful: a first building block in a theory that allows us to deduce the origins of these fundamental attributes of organisms. Below, we will see that extrapolating from the logic of autogen evolution, and utilizing the logic of emergent dynamics, it is possible to provide an account of how such protolife forms might give rise to these properties that we recognize as the hallmarks of life. Most important, a principled account of the origins of biological information is a critical step toward demystifying and de-homunculizing our understanding of the relationship between genetic information and the defining property of life.
Autogenic theory provides a glimpse of an elusive law of emergence operative at the dawn of life, exemplifying the emergence of ententional properties from non-equilibrium thermodynamics, and thus a bridge from non-living to living processes. The theory also demonstrates why spontaneous generation is so exceedingly rare. This is because the conditions that make it possible are highly precise and at the same time highly atypical of the thermodynamically driven processes that are ubiquitous outside of biology.[6] But once teleodynamic systems capable of natural selection emerged, an unbounded territory opened up. The evolution of life has led to many levels of radical and unprecedented higher-order teleodynamic phenomena, including mental phenomena. But prerequisite to this capacity to evolve level upon level of more complex ententional relationships is an ability to capture the critical dynamical constraints for each lower level of teleodynamics, in a form that allows them to be preserved and transmitted irrespective of the further convolutions of dynamical organizations that may be incorporated during the course of future evolution. If the source of constraints maintaining the core critical dynamics is not somehow itself insulated from these modifications, there can be no solid foundation to build upon. Each new change will modify existing constraints, and the compounding of higher-order teleodynamic constraints on a base of preexisting constraints will be nearly impossible. There need to be separately sequestered constraints embodied in some non-dynamical attribute, which can be preserved unmodified across changes in dynamics, so that earlier dynamical achievement will not be continually undermined.
This is what genetic information provides for living organisms, and much more. It sequesters an independent source of constraints that is partially redundant to that intrinsic to the dynamics of the organism itself. This has two immediate advantages. First, it is a conservative factor. It protects against degradation of evolved adaptive constraints that might occur due to dynamical interactions with unprecedented environmentally derived factors. Second, it is an innovative factor. Its separate material properties provide a basis for indirectly modifying dynamical constraints that are physically independent of the details and limitations of these dynamics.
The molecular informational mechanism constituting genetics that is today ubiquitous to all living organisms was not just an augmentation of the autogenic process; it took it to a whole new level. The conservative effect of embodying dynamical constraint in a separate physical substrate from that doing the work to maintain the organism provided a critical foundation on which evolution could build progressively higher-order teleodynamic processes. As our analysis of the concept of information in previous chapters has demonstrated, however, genetic information cannot be simply identified with a physical substrate or pattern. Information is dependent on the propagation of constraints linking a teleodynamic system and its environmental context. That means that information is not any intrinsic property of the substrate that embodies or conveys these constraints. Although one of the crucial properties of an information-bearing medium is that it can serve as a template for copying and propagating constraints, this simple physical quality is not what defines it. The general theory of information that we explored in the two previous chapters demonstrated that information is identified with the transmission of constraints, exemplified by some physical medium linking a teleodynamic system with its environment. Information does not stand apart from this relationship, nor does it preexist the teleodynamics that it informs. Another way to say this is that teleodynamic organization is primary, and information is a special feature of some teleodynamic processes.
What does this mean for the role of genetic information in the origin of life? Basically, it suggests that genetic information is not primary, but is rather a derived feature of life. Although a molecule that has the unusual property of serving as a template for producing a precise replica of itself is without doubt potentially useful for propagating constraints, the process of structural replication by itself does not constitute information. A DNA molecule outside of an organism does not convey information about anything, and is mostly just sticky goo. And gene sequences transplanted from one sort of organism to another sort are likely to be noise in that new context.[7] So, even if DNA and RNA were abundantly synthesized and replicated in the merely geochemical environment of the early Earth, they would not under those circumstances be information about anything. It is not the template replication that is the basis for the information-conveying capacity of DNA and RNA in organisms; it is the integration of the patterns that they can exhibit into the teleodynamics of the living process that matters.
To state this claim more forcefully, DNA is just another (albeit very useful) adaptation that itself must have evolved this function by natural selection. This should not surprise us, because these molecules are remarkably well suited for the purpose. The almost identical bonding energies of any given adjacent nucleotides, the nearly unlimited size of a given DNA strand, and the precision of template specificity, among other attributes, give the impression of a molecule that was honed by natural selection in response to its information-carrying function. More important, replication of a molecule like DNA is not essential for either reproduction or evolution, since something as simple as an autogen can reproduce and evolve. Calling it the "secret of life" is thus hyperbole. And considering genetic information to be the defining character of life is also a bit hasty. It is without question critical to the evolution of all higher-order forms of life, including all that are currently available to biologists. An account of how biological information emerges from more basic teleodynamic processes is the first step to explaining the nature of all higher-order ententional properties.
This view of information poses a challenge to a widely accepted account of the origin of life, and of evolution in general. This is the belief that life is fundamentally just a complex kind of copying process. The best-known version of this is replicator selection theory. The term replicator was coined for this use by Richard Dawkins in 1976 in his influential book The Selfish Gene. Though Dawkins gave it a name, as we will see, the core assumption of this theory, that the essential feature of reproduction is the copying of template molecules, is in some form or other characteristic of nearly all modern conceptions of the evolutionary process. Like Darwin's account of the necessary conditions for natural selection, replicator selection also begins by assuming some unspecified and highly non-trivial kinds of processes. But unlike Darwin, who refused to speculate about these, replicator theories often assume that this process is so ubiquitous and uncomplicated that it can be accepted as a defining attribute of life.
A replicator is something that gets copied. More precisely, this something is a pattern embodied in a physical substrate. In biological systems, this is most commonly taken to be DNA or RNA, but Dawkins suggested that certain cultural artifacts and habits could also be replicators, and coined the term memes to refer to them. Critics of replicator theories have often argued that the concept is too narrowly reserved for genetic information, and that other features of organisms must also be included as replicators, such as the membrane of the cell and many organelles. These are not created entirely anew in cell division, but physically inherited from the progenitor cell when it divides. In both views, however, whether there is only one kind of replicator or many at different levels, it is copying that is assumed to be the defining principle.
Although it is generally believed that polynucleotide chains like DNA and RNA molecules constitute life's replicators, even by Dawkins' own description, based on a characterization of DNA "replication," these molecules fail the crucial criterion. They do not replicate themselves. To be more explicit: polynucleotide A cannot directly produce another exact duplicate of polynucleotide A.[8] Instead, with the assistance of special contextual conditions (e.g., in company of a molecular complex made up of a number of supportive transcription molecules, or within an operating PCR machine, and including critical component molecules as raw materials), polynucleotide molecule A can produce a complementary polynucleotide molecule B, which in turn under the same conditions can produce polynucleotide molecule A.
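The two-step character of this loop can be made concrete with a minimal sketch. It assumes, purely for illustration, that a strand can be modeled as a string of DNA bases and that templating amounts to base-pair complementation read in the antiparallel direction; the function name template_copy and the example sequence are hypothetical, and all of the supporting molecular machinery is left out.

```python
# Illustrative sketch only: strands are plain strings and "templating" is
# reduced to Watson-Crick complementation. The name template_copy and the
# example sequence are assumptions, not anything taken from the text.
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}


def template_copy(strand: str) -> str:
    """Return the complementary strand, read in the antiparallel direction."""
    return "".join(PAIRS[base] for base in reversed(strand))


strand_a = "ATGGCA"
strand_b = template_copy(strand_a)        # one templating step yields B, not A
strand_a_again = template_copy(strand_b)  # only a second step recovers A

print(strand_b)                    # the complement of A
print(strand_b == strand_a)        # False: B is not a replica of A
print(strand_a_again == strand_a)  # True: A reappears after two steps
```

In this toy version a single step never returns the starting sequence unless that sequence happens to be its own reverse complement, which is the sense in which A reproduces itself only by way of B.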
Whereas viral, bacterial cell, and eukaryotic cell division do produce replicas (with slight variation), this involves the replication of both strands of a DNA molecule to produce two duplicated double strands. On more careful inspection, then, we can see that the idea of a replicator, which Dawkins has identified with DNA, is instead an oversimplified projection of cellular reproduction onto the process called DNA replication. Indeed, we can now easily recognize that polynucleotide "replication" is in fact a special case of what is often described as autocatalysis (also a misnomer for similar reasons). As we noted in our discussions of autogenic chemistry, the chemical process described in both cases might more accurately be called reciprocal catalysis, or even reciprocal indirect catalysis, since it always involves at least two and sometimes more complementary catalysts, each catalyzing another member of the set. So, DNA replication in a PCR machine, ignoring the role of the supportive machinery and mediating molecules, is a two-step, reciprocally catalytic loop.
Perhaps the closest analogue (though not a true example) to what can be called direct molecular replication is found in the special case of prion replication. In this process, there is no new molecular synthesis or lysis involved to form the new molecule. The precursor molecule, the so-called pre-prion protein, is made up of the same component amino acids arranged into the same polymeric sequence as the prion protein. It's just that the three-dimensional conformation of the two proteins is different. Prion formation simply involves the prion molecule binding with the pre-prion molecule in such a way as to cause the latter to deform into the prion conformation, and thus become capable of similarly deforming other pre-prion proteins. Pre-prion proteins require the molecular machinery found in mammalian brains (including the DNA code for this pre-prion sequence) in order to be synthesized. In this respect, prion "replication" does not actually generate any new material. It is merely "damage" done to proteins synthesized in mammalian brains, which subsequently propagate this damage to others.
If we return to the actual case of bacterial cell replication and try to capture its minimal logical structure, we must take into account that it is made up of a complex of reciprocally interdependent molecules. The crucial factor is the complex reciprocities that enable each part to be both end and means in forming the whole integrated organism. Specifying these details is critical to fully account for the process of self-replication. There is no subset of molecules that suffices. Ultimately, all essential components need to get replicated, if for no other reason than to replace those that have become damaged. In this sense, the nucleic acid sequences have no special claim to be the replicators. They are merely more central because many other molecular replication processes depend on them.
But assuming for the moment that naked nucleic acids could serve as templates for replicating identical copies, we still wouldn't be any closer to understanding the relationship between replication and genetic information. Although a replicated molecule is literally a re-presentation of its "parent" molecule's form, there is nothing but this form (or its complement) copied over and over. There is no information about something else that is copied in this process, just the molecular structure. If inserted into a living cell, this sequence might be capable of producing a protein product or some other sort of biological effect, as can be achieved by artificially generated DNA sequences, but the best we could say of this is that this sequence is "potential information" (or misinformation). It is the cellular machinery that determines that a DNA sequence has this potential, and the sequence only inherits this potential because of the existence of cells that have used similar molecules to their advantage in the past. Randomly generated DNA sequences are parasitic on this potential in the same way that a randomly mutated gene can be.
There is a significant conceptual gap to be bridged between the replication of DNA and its role as a medium of information. Although Dawkins often speaks as though a given base sequence on a DNA molecule is intrinsically a form of information (so that copying it is transmitting information), it is only information in the minimalistic Shannonian sense. As we have seen, the Shannonian conception of information is an abstraction that only considers the most minimal criteria for the possibility of carrying information. Although no molecular biologist would consider the structure of a DNA molecule to be information were it not for the fact that it contributes to the operation of other cellular-molecular processes, they may still be willing to bracket this from consideration when thinking of evolution. In life, DNA molecules do not provide information about other replicas of themselves, but rather about the molecular dynamics of the cell in relation to its likely environmental milieu. And yet many theories of the origins of life are based on the assumption that molecular replication is a sufficient defining property of living information.
Probably the most influential of the scenarios explaining the origin of life based on molecular replication is known as the RNA-World hypothesis. In this scenario, it is argued that RNA replication is the core process distinguishing the first lifelike process from other chemical processes. This view has grown in influence over recent decades because of discoveries of RNA functions in addition to its role in mediating between DNA sequences and the amino acid sequence specifying a protein. Single-stranded RNA molecules can coil back on themselves to form complex, cross-linked hairpin forms and coils, producing a complex three-dimensional structure. In the form of transfer RNA, this structure plays the critical role in binding to amino acids and aligning them with respect to a messenger RNA molecule within a ribosome. But also due to this structure, some RNA molecules are capable of catalytic action. More recently, RNA polymer fragments have been found to play a wide variety of regulatory roles in the cell as well. These many diverse functions have suggested that RNA molecules could serve all the essential functions assumed to be requisite for an early life form. RNA can thus serve both a synthetic role and a template role. However, such scenarios treat the template-copying capacity as primary.
But does being a replicable pattern constitute information? Without the elaborate system of molecular machines that transcribes DNA or RNA sequences into amino acid sequences, without the resulting protein functions, and without the evolutionary history, there would be no information about anything in its base sequence structure. There is a special case that makes this obvious: so-called junk DNA. Over the course of the past decades, it has become clear that only a small fraction of the DNA contained in a eukaryotic cell actually codes for a protein or a regulatory function. Although there are reasons to suspect that at least some of this is nevertheless retained for other functions, it is almost certain that vast lengths get replicated in each cell division simply because they are linked to useful sections. These still qualify as replicators in Dawkins' analysis (though he would probably call them passive replicators), but it is less easy to justify calling these sequences information, precisely because they do not play any role in organizing the cellular dynamics that makes their persistence more probable.
This suggests that the information-bearing function of nucleic acids is dependent on their embeddedness in the metabolism of a cell that is adapted to its context. This means that nucleotide information is not primary. It is an adaptation, not the ultimate basis of adaptation. It is an evolutionarily derived feature, and not a primitive one. This also suggests that we should be able to trace an evolutionary path from a pre-DNA world to a post-DNA world, from protolife to life. In this respect, autogens don't merely provide a heuristic model of the transition to teleodynamics; they also offer a context in which to investigate how the structure of one molecule (e.g., DNA) can become information about certain patterns of chemical interaction that obtain between other molecules. DNA structure effectively represents, in concrete form, the dynamics of the chemical system that contains it and replicates it. This something that DNA is about is the source of natural selection that maintains the relatively conserved replication of certain sequences as opposed to others. The information function is thus in an evolutionary sense dependent on this prior dynamics, and so is an indirect adaptation for stabilizing the form of this molecular dynamics.
In this way, we may be able to reconstruct the steps from teleodynamics to information that constitutes the most unprecedented feature of life.
AUTOGENIC INTERPRETATION.
The dependency of information on involvement in a teleodynamic process can be demonstrated by a slight complexification of the autogen model. In a discussion with two of my colleagues, Chris Southgate and Andrew Robinson, concerning the semiotic status of autogens, they proposed a modification that we all could agree involved a semiotic aspect. They argue that an autogenic system in which containment is made more fragile by the bonding of relevant substrate molecules to its surface could be considered to respond selectively to information about its environment. Although our discussion concerned a slightly more subtle question (whether autogenic theory can help decide if iconicity or indexicality is more primary), its bearing on the nature of biological information is more illuminating.
If an autogen's containment is disrupted in a context in which the substrates that support autocatalysis are absent or of low concentration, re-enclosure and replication will be unlikely. So stability of containment is advantageous for persistence of a given variant in contexts where the presence of relevant substrates is of low probability. If, however, the surface of the containing capsule has molecular features to which the relevant substrate molecules tend to bind, and in so doing weaken its structural stability, then the probability of autogenic replication will be significantly increased. The process will tend to be more stable in environments lacking essential substrates and less stable in environments where they are plentiful. Sensitivity to substrate concentrations would likely also be a spontaneous consequence of this bonding, because if binding of substrates to container molecules weakens the hydrogen bonds between containment molecules, it would follow that weakness of containment would be a correlate of the number of bound substrates. Higher substrate concentrations would make disruption more probable, and subsequent use of local substrates would deplete their concentration and make the replicated autogens more stable and more likely to diffuse to new environments.
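The qualitative relationship just described, in which containment weakens as more substrate binds, can be illustrated with a toy calculation. Everything in this sketch is an assumption introduced for illustration: the saturation form chosen for binding, the parameter values, and the function names bound_fraction and disruption_probability. Nothing here is derived from actual chemistry.

```python
# Toy model only: the functional forms and parameter values are assumptions
# chosen to illustrate the qualitative argument, not measured quantities.

def bound_fraction(concentration: float, k_half: float = 1.0) -> float:
    """Fraction of surface binding sites occupied (simple saturation curve)."""
    return concentration / (concentration + k_half)


def disruption_probability(concentration: float,
                           baseline: float = 0.05,
                           sensitivity: float = 0.9) -> float:
    """Chance that containment is disrupted, rising with bound substrate.

    baseline is the assumed chance of disruption with no substrate bound;
    sensitivity scales how strongly bound substrate weakens containment.
    """
    weakening = sensitivity * bound_fraction(concentration)
    return baseline + weakening * (1.0 - baseline)


for c in (0.0, 0.1, 1.0, 10.0):
    print(f"substrate={c:5.1f}  p(disrupt)={disruption_probability(c):.3f}")
```

Under these assumptions the probability of disruption is lowest where substrate is absent and climbs toward an upper limit where it is plentiful, so a capsule of this kind would tend to stay closed in poor environments and open in supportive ones.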
In evolutionary terms, this is an adaptation. Autogen lineages with this sensitivity to relevant substrates will effectively be selective about which environments are best to dissociate and reproduce in. Though such "sensitive" autogens would not exactly initiate their own reproduction (that is still a matter dependent on extrinsic disruption), their differential susceptibility to disruption with respect to relevant context is a move in this direction.
It seems to me that at this stage we have introduced an unambiguous form of information and its interpretive basis. Binding of relevant substrate molecules is information about the suitability of the environment for successful replication; and since successful replication increases the probability that a given autogenic form will persist, compared to other variants with less success at replication, we are justified in describing this as information about the environment for the maintenance of this interpretive capacity. Using terms introduced by the father of semiotic theory, Charles Sanders Peirce, we can describe these consequences as interpretants of this information. Peirce introduced this way of talking about the process of interpretation in terms of interpretive consequences in order to more fully unpack the somewhat opaque notion of sign interpretation. Specifically, he would have termed the decreased integrity of containment provided by bound substrates the immediate interpretant of the information, and he would have termed the support that this provides to the perpetuation of this interpretive habit via the persistence of the lineage the final interpretant. We can now unpack the notion of information in semiotic terms as well. The sign in this case is the binding of substrate, and its object is the suitability of the environment. Or again to use Peirce's more specific terminology, we might describe the presence of substrate in the environment as the dynamical object of the binding (that physical fact that is indicated by the sign), and the general suitability of the environment as the immediate object (that general property of the dynamical object that is significant for the process).[9]
What is the difference between the sensitivity in this simple molecular system and the sensitivity of a mechanical sensor, such as a thermostat used to control room temperature or a photodetector in a doorway used to detect the entrance of someone into a store?
In the absence of the human designer/user, I would describe the action of these mechanical devices as providing Shannonian information only. Certain of the physical constraints embodied in these mechanisms provide the basis for potential information about particular kinds of events. But the rate of drying of a wet towel has the capacity to indicate room temperature and the tracking of dirt in from the street has the capacity to indicate that a person has entered a room. All are potential information about particular sorts of events. But there are also numerous other things that these physical mechanisms and processes could indicate as well, probably a nearly infinite number depending on the interpretive process brought to bear. That is the difference. A person focused only on those aspects deemed relevant to some end they were pursuing would determine what it is information about. There is no intrinsic end-directedness to these mechanisms or physical processes.
FIGURE 14.1: A speculative depiction of the possible evolutionary stages that could lead from a simple autogenic system to full internal representation of the normative relationship between autogen dynamics and environmental conditions. Though somewhat fanciful, this account provides a constructive demonstration that referential normative information is supervenient on (and emergent from) teleodynamics.
A. Depiction of a tubular autogen with a simple modification that provides the capacity to assess information indicating the presence of favorable environmental conditions. This is accomplished because the exposed structure of the autogen surface includes molecular surface structures that selectively bind catalytic substrate molecules present in the environment, and where increasing numbers of bound substrates weaken containment. This increases the probability that containment will be selectively disrupted in supportive versus non-supportive environments and thus provides information to the autogenic system about the suitability of the environment for successful reproduction.
B. Depiction of a tubular autogen that produces free nucleotides as byproducts. This might evolve in environments with high concentrations of high-energy phosphate molecules as a protection against oxidative damage, and could subsequently be exapted as a means of extracting and mobilizing energy to drive exothermic catalytic reactions.
C. Within the inert state of an autogen, diverse nucleotide molecules could be induced to polymerize as water is excluded. This would both render phosphate residues inert and conserve nucleotides for future use. Although the spontaneous order of nucleotide binding will be unbiased, the resulting sequence of nucleotides can serve as a substrate onto which various free molecules within the autogen (e.g., catalysts) will differentially bind due to sequence-specific stereochemical affinities.
D. In this way, catalysts and other free molecules can become linearly ordered along a polynucleotide template, such that relative proximity determines reaction probability. Thus, for example, if this template molecule releases catalysts according to linear position (e.g., by depolymerization) they will become available to react in a fixed order. To the extent that this order correlates with the order of reactions that is most efficient at reconstituting the autogenic structure there will be favored template sequences. So long as one strand of this template is preserved, as in DNA, sequence preservation and replication are possible. Since the optimal network of catalytic reactions will be dependent on the available resources provided in the environment, this template structure is at the same time a representation of this adaptive correspondence. Although this scenario has been described using a nucleotide template in order to be suggestive of genetic information, the molecular basis of such a template could be diversely realized.
This is what is provided in the most minimal sense by the autogen's tendency to reconstitute or reproduce itself after being disrupted. The autogenic process not only tends to develop toward a specific target state, it tends to develop toward a state that reconstitutes and replicates this tendency as well. So the interpretation of substrate binding is a self-constituting feature. It is a dynamical organization that is present because of its propensity to bring itself into existence. Of course, each interpretation is a unique event, so it is more accurate to say that the general type of this specific dynamical constraint (or organization) that we have identified as an interpretive process is self-constituting. It is only the form of this dynamical constraint that will be perpetuated by being passed on, not any specific collection of molecules, and so on. To again describe this in terms that resonate with Peirce's semiotics, the ultimate ground of interpretation is a self-sustaining habit.
There is also a necessary intrinsic normative character to this interpretive process. If by virtue of structural similarity, other molecules that are not potential catalytic substrates also tend to bind to the autogen surface and also weaken it, this would, in effect, be misinformation, or error. Sensitive autogens that tend to respond in this non-specific way would be less successful reproducers than those that were more selectively and appropriately sensitive. This would provide a selective influence favoring an increase in specificity. Although individual autogens themselves would not detect this as error, the autogen lineage would. Over the course of evolution, such error-prone autogens will tend to be eliminated from the population. This exemplifies the fact that as soon as there is information (in the full sense of the term), there is also the possibility for error. Because aboutness is an extrinsic relationship, it is necessarily fallible. Detection of error within an individual involves an additional level of information about the information, and thus an additional level of interpretive process. Autogens are too simple to register anything about the interpretation process. An interpretation of the interpretation process is a higher logical type relationship. This is why it only arises at the level of autogen lineage selection. As we saw in the last chapter, this implies that natural selection is a form of distributed error detection; and as we will see in the next chapter, only when some analogue of natural selection is internalized within the interpretive process itself (in physiology, say, or brain function) does error detection become intrinsic to the system that does the interpreting.
ENERGETICS TO GENETICS.
This use of the autogen model to explore the necessary and sufficient conditions to constitute a minimal interpretive capacity does not, however, address the most compelling issue: the nature and origin of genetic information. The above demonstration of how autogenic teleodynamics can provide the basis for an interpretive dynamic offers a recipe of sorts, showing how a molecular relationship (binding of substrate molecules to capsule molecules) can come to be interpreted as information about some relevant extrinsic state of affairs. Whereas this is a reciprocal coupling between this one molecular relationship and the extrinsic requirements of the teleodynamic system in which it has come to be incorporated, genetic information (and its precursor analogues) must in effect involve an additional level of referential relationship. It must be in relationship to the teleodynamics of the organism (or autogen) as this substrate-binding relationship is to the environment. Genetic information is about some aspect of this teleodynamic organization with respect to certain environmental factors. So our question now becomes: How can this infolding of reference arise?
Obviously, as there are innumerable molecular details in the autogen story that I have merely a.s.sumed to be plausible without actually investigating the chemistry involved, when we complicate this account, there are exponentially more to be faced. These const.i.tute the critical science that must be undertaken before any of this can be said to actually apply to the origins of life, or protolife, much less to the origins of genetic information. My purpose, however, is not to explain the origin of life, but rather to get clear about the principles that must be understood in order to focus this research on the most relevant details. What I intend is only to provide what might be described as a proof of principle. Neither protolife nor genetic information may have arisen in the specific ways that I describe; but I believe that the principles exemplified in these scenarios also apply to whatever specific molecular processes actually took place at the dawn of life on Earth.
With this caveat in mind, let"s explore a somewhat fanciful-but not in any sense magical or homuncular-scenario for how a simple form of genetic information might arise in an autogenic context.
The intuition behind this imaginative scenario is motivated by noticing a curious coincidence that appears to be common to all organisms: some of the building blocks of the information-conveying molecules of life (DNA and RNA) are also the principal energy-conveying molecules (e.g., ATP and GTP) and so-called second-messenger molecules (e.g., cAMP and cGMP) of the cell. All of these molecules and their DNA-RNA monomeric counterparts have a three-component structure. This includes a purine double-ring molecule at one end (A = adenine, G = guanine), one or more phosphate molecules (PO4) forming a sort of tail at the other end, and a five-carbon (pentagonal ring) sugar molecule in the middle (ribose). In their non-informational roles, these three-component molecules are transferred or diffused from place to place to serve energy delivery or "switching" functions with respect to other molecular systems. In contrast, their role as bearers of genetic information is only realized in a polymeric form, in which each nucleotide is linked to another by having its phosphate linked to its neighbor's sugar, one after another, to produce a long sugar-phosphate-sugar-phosphate- . . . "backbone," with base residues linked alongside. In polymeric form, the purine-containing nucleotides (homologous to AMP and GMP) are joined by pyrimidine (single-ring)-containing nucleotides (that can for comparison be designated CMP, TMP, and UMP, where C = cytosine, T = thymine, and U = uracil). In their roles as information conveyors, it is the sequence of these dangling purines and pyrimidines that matters, aided by the preferential binding of A with T, C with G, and U with A (the interchangeability between U and T distinguishes RNA from DNA; RNA substitutes U for T in DNA and has a slightly modified ribose sugar), that makes replication and translation possible.
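The pairing preferences just described (A with T, C with G, and U with A) can be made concrete with a small sketch. The function names and the example sequence are hypothetical, chosen only for illustration, and the sketch ignores strand direction and every other piece of real chemistry.

```python
# Illustrative only: pairing rules as stated in the text (A-T, C-G, and U-A in RNA).
# Strand direction and all actual chemistry are ignored in this sketch.
DNA_PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}
DNA_TO_RNA = {"A": "U", "T": "A", "C": "G", "G": "C"}


def complement(strand: str) -> str:
    """Return the base-by-base DNA complement of a DNA strand."""
    return "".join(DNA_PAIRS[base] for base in strand)


def transcribe(dna_template: str) -> str:
    """Return the RNA strand that would pair with a DNA template (U replaces T)."""
    return "".join(DNA_TO_RNA[base] for base in dna_template)


template = "ATGCCGTA"              # hypothetical sequence
partner = complement(template)     # the strand that pairs with the template
restored = complement(partner)     # pairing again recovers the original order
rna_copy = transcribe(template)

print(partner, restored, rna_copy)
```

Because each base has exactly one preferred partner, taking the complement twice restores the original order, which is the bare logical core of what makes copying a sequence possible.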
Why this coincidence? My hypothesis is that the monomeric functions (energy transfer and switching functions) came first, and the information-conveying functions evolved as an afterthought, so to speak. Again, this suggests that information is not primary.
To begin, I need to postulate the (unexplained) presence of nucleotide molecules serving one or more of these basic energetic functions. Though the spontaneous inorganic synthesis of some of these nucleotide monomers has been recently demonstrated, and claimed as support for an RNA-World origins scenario, all that matters for the scenario proposed here is for there to be some means for their synthesis. Important for this argument is that their synthesis by catalytic processes of a complex autogenic system must be plausible, which for the sake of this scenario I will take as unproblematic. The starting assumption, then, is that some aspect of autogen catalysis or self-assembly is potentiated by these phosphate-ferrying molecules, as is the case in living cells. This might involve picking up an additional phosphate molecule or two in the high-energy context of a volcanic vent; but again, for the sake of the principle being explored, this chemistry is unimportant. During the reconstitution and replication phases of the "life cycle" of these more complex energy-assisted autogens, the presence of captured energy in the form of triphosphates could make up for the need for energy-rich substrate molecules to fuel the catalytic reactions involved. So, where simple autogen catalysis is parasitic on specific energy-rich substrates, the availability of a more generic energy source, in the form of phosphate-phosphate bonds, would both offer a sort of jump start for catalysis and a freedom from such specificity of substrates. Such augmentations of the autogenic reconstitution and replication process would give lineages that generated and incorporated nucleotides the ability to capture and deliver energy, a significant evolutionary advantage.
As I have surveyed the literature on the origins of life, I have found two other authors who have independently proposed this evolutionary direction from energetic to informational use of nucleotides: the evolutionary biologist Lynn Margulis and the theoretical physicist Freeman Dyson.10 Dyson's argument is that the polymerization of these molecules might be a sort of garbage-collection trick, to remove those monomers that have given up their extra phosphates and might otherwise compete for the extra phosphate residues still available to do work carried by other nucleotides. My speculation is vaguely similar, but specific to the context of autogenic "metabolism." Recall that unlike most living organisms, autogens do not actively maintain non-equilibrium conditions. They are not continually in a dynamic state, but may spend vastly more time as inert structures. Phosphate-mediated molecular chemistry would therefore only be important during those brief, rare periods when containment has broken down and catalysis and self-assembly processes are critical for reconstituting this stable form. Polymerization of nucleotides (binding the phosphates between sugars) would make them unavailable for interacting with other molecular components while still maintaining a store of them, available for the next replication cycle. This might further be aided if autogen enclosure tended to exclude water, increasing the dehydrating conditions that facilitate polymerization. Disruption of autogen containment would thus inversely increase exposure to water, allowing rehydration to facilitate depolymerization of the nucleotides, making them again available to capture new high-energy phosphates.
So far, this scenario offers an interesting augmentation of the autogenic logic in which the addition of energy capture and management is a significant evolutionary advantage. But it also offers something in addition: a new potential source of constraint and constraint propagation. To see this possibility, we need to consider certain functionally incidental physical properties of such a polymer. First, as a means of molecular storage, the relative positions of different nucleotides along the polymeric chain are irrelevant. If many different nucleotides are all capable of serving some phosphate-carrying function, their polymerization order will tend to occur at random, only reflecting the degree of their relative prevalence. This unconstrained ordering of nucleotides is, of course, a critical property of nucleic acid information-conveying capacity (its Shannonian entropy). If the nucleic acid sequence were to strongly favor certain bonds or combinations over others, then this would introduce a bias, and thus redundancy and a reduction of the information-bearing capacity. Second, precisely because to be useful in autogen metabolism requires that phosphate-bearing nucleotides bind selectively to the catalytic and other constitutive molecules of the autogen, there will be a tendency for these other molecules to also bind to the polymer; and if there is some specificity associated with the various nucleotide sequence variants, then this binding will have some degree of specificity.
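The first point, that bias in nucleotide ordering reduces information-bearing capacity, can be illustrated with a minimal sketch of Shannon's entropy measure. The two frequency profiles below are invented for illustration, and the calculation considers only how often each of four nucleotides occurs, not any preference between neighbors.

```python
import math


def entropy_bits(probabilities):
    """Shannon entropy, in bits per position, of a discrete distribution."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)


# Hypothetical frequency profiles for four nucleotide types.
unbiased = [0.25, 0.25, 0.25, 0.25]   # all four equally likely at each position
biased = [0.70, 0.10, 0.10, 0.10]     # one nucleotide strongly favored

h_unbiased = entropy_bits(unbiased)   # the maximum for four symbols
h_biased = entropy_bits(biased)

# Redundancy: the fraction of the maximum capacity lost to the bias.
redundancy = 1 - h_biased / h_unbiased

print(f"unbiased: {h_unbiased:.3f} bits, biased: {h_biased:.3f} bits, "
      f"redundancy: {redundancy:.1%}")
```

The unbiased polymer carries the maximum of two bits per position, while the biased one carries less. That shortfall is the redundancy the text refers to.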
Together, these two properties can serve as the basis for the recruitment of the polymeric form to serve a function other than nucleotide collection and phosphate inactivation; it can act as a template.
FIGURE 14.2: Depiction of an autogen thought experiment, demonstrating how a component molecule might spontaneously evolve to provide information about the production of the autogenic system of which it is a part. In this example it is assumed that variant nucleotide molecules are generated as side products of autocatalysis. In the dynamic phase of autogenesis these nucleotides could serve to capture free energetic phosphate molecules to provide generic energy for catalytic reactions (left). Assuming additionally that during the inert phase of the autogenic cycle free nucleotides were induced to polymerize into a randomly arranged linear molecule, different nucleotide orders would tend to provide differential substrates onto which free catalysts might tend to bind (center). The binding order of catalysts along the nucleotide polymer would incidentally bias interaction probabilities between catalysts by virtue of proximity and timing of release. In this way different sequences of nucleotides could come to be selected with respect to the catalytic interaction biases most conducive to autogen reproduction. The selectively favored order in this way re-presents the constraints that constitute the specific teleodynamic reaction network.
There are a number of serious limitations affecting the autogenic form of evolution. One of the most significant has to do with the size of the network of catalytic interactions that is sufficient to complete autogen replication. While an increasingly complex autocatalytic set might provide autogens with some flexibility with respect to variable environments, as the number of catalysts and the complexity of the interaction patterns increase, an upper limit to evolvability will be quickly reached. For every catalyst that is added to an autocatalytic network, the number of possible non-productive molecular interactions between them and other molecules increases exponentially. The larger and more specific the interaction network necessary for autogen replication, the slower and less efficient the process. What is required is some source of constraint on the possible molecular interactions, besides that which is intrinsic to their structures. A mechanism that constrains interactions to significantly favor those that are appropriate to autogen formation, and to significantly inhibit those that are not, would be a significant aid in overcoming this explosion of possible side reactions.
FIGURE 14.3: The living fossil remnants of the major pre-life stages of Morphota evolution may still be exhibited in living systems. Some possibilities are depicted here.
The probability of interaction between molecules is, in large part, a function of relative proximity. Molecules in low concentration are thus on average seldom close enough to interact because of the vast numbers of molecules in between. An autocatalytic process is for this reason significantly sensitive to the relative proximity of any catalyst molecules capable of interacting chemically with any other catalysts. And where autocatalysis involves more than just a few interacting catalysts, and where some reactions between the catalysts in the set are relevant, the differential proximity of catalysts becomes much more important. In living cells, where molecular "machines" may involve sometimes as many as a dozen different molecules working precisely together, the different components are often bound to substrates in a specific configuration, or sequestered in different cellular partitions, or assembled by sequential availability of components to produce multipart molecular complexes (such as a ribosome), in order to avoid irrelevant interfering molecular interaction. An autogen that utilizes more than a small number of interacting components would likewise require similar constraints.
Without constraints on the relative probabilities of molecular interactions within an even modestly complex autogen, that is, with an autocatalytic set of say more than about four or five catalysts, there would be a tendency for side reactions to impede reproduction by generating inappropriate products and reducing the rates of relevant reactions. Moreover, this would rapidly grow out of control as autogenic chemistry became even modestly more complex and added new components. With respect to this limit-to-complexity problem, the availability of a randomly constructed nucleotide polymer could make a significant difference.
An important factor determining chemical reaction rate is the relative probability that the relevant molecules will interact with one another compared to the probability that competing interactions will take place. So, for example, the relative concentrations of diverse molecules will decrease the probability of any specific molecules interacting. At a molecular level, this is effectively a problem of relative proximity. One of the ways that catalysts can increase reaction rates, then, is to increase the probability that certain molecules will be near enough to interact. But in an analogous way, a large linear polymer with irregular structure can bias proximities and reaction rates of other molecules. Presuming that the various proteinlike catalysts enclosed in the inactive autogen will have slightly different affinities to bind to the linear sequence of the nucleotide polymer, this binding order will provide proximity constraints affecting the probability that any two bound catalysts will interact. Molecules bound nearby one another on the same polymer will have a higher probability of interacting than those bound far apart.
To the extent that molecules bound nearby one another are also those whose immediate interaction is conducive to autogen formation, rather than a side reaction, this reaction bias will facilitate preservation of this feature, because more of this variant will get produced as a result of their more efficient chemistry. In other words, the degree of correlation between catalyst binding order and the topology of the optimal autocatalytic reaction network will provide a source of selection favoring preservation of certain nucleotide sequences over others. Selective preservation and transmission of this correlation between nucleotide sequence and catalytic interaction sequence can thus become the crude equivalent of genetic information. It would now be accurate to describe the structural features of a given nucleotide sequence as conveying information about an advantageous chemical reaction sequence. A molecular structure would thereby inherit constraints from the more functional autocatalytic dynamics and transfer those constraints to future dynamics, as some autogen lineages out-reproduced others.
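The correlation between binding order and reaction network can be illustrated with a toy calculation. Everything in it is an assumption made for illustration rather than part of the scenario itself: four hypothetical catalysts named A to D, one binding site per catalyst, a productive reaction chain A-B, B-C, C-D, and a rule that interaction likelihood falls off as 1/(1 + distance) along the polymer.

```python
from itertools import combinations


def proximity_weight(pos_a, pos_b):
    """Assumed rule: interaction likelihood falls off with distance along the polymer."""
    return 1.0 / (1 + abs(pos_a - pos_b))


def productive_fraction(binding_order, productive_pairs):
    """Share of the total interaction weight that falls on productive catalyst pairs."""
    position = {catalyst: i for i, catalyst in enumerate(binding_order)}
    productive = total = 0.0
    for a, b in combinations(binding_order, 2):
        weight = proximity_weight(position[a], position[b])
        total += weight
        if frozenset((a, b)) in productive_pairs:
            productive += weight
    return productive / total


# Hypothetical autocatalytic chain: A reacts with B, B with C, C with D.
productive_pairs = {frozenset(p) for p in [("A", "B"), ("B", "C"), ("C", "D")]}

# Two polymers whose sequences happen to bind the same catalysts in different orders.
aligned = productive_fraction(["A", "B", "C", "D"], productive_pairs)
scrambled = productive_fraction(["B", "D", "A", "C"], productive_pairs)

print(f"aligned order:   {aligned:.2f} of interaction weight is productive")
print(f"scrambled order: {scrambled:.2f} of interaction weight is productive")
```

Under these assumptions, the polymer whose binding order mirrors the reaction chain directs a larger share of interactions toward productive pairs. This is the kind of difference on which lineage selection could act.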
Many more details are required to complete this simple scenario. They include the preservation of advantageous nucleotide sequences, the means by which they can be replicated along with autogen replication, and the maintenance of their linear form. Most of these difficulties can be answered by assuming a double-stranded DNA-like form of these earliest nucleotide polymers, rather than a single-stranded RNA-like form. This of course contradicts RNA-World assumptions, but in part the RNA advantage is reduced in this case because catalytic functions are already assumed irrespective of nucleotide functions. The point of this sketchy scenario is not, of course, to make any claim about the relative plausibility of any particular molecular process, but simply to show how a teleodynamic process can offload certain critical constraints that it must preserve onto a separate physical substrate, and in so doing endow this substrate with semiotic functionality: in other words, aboutness.
This vision of a DNA molecule with proteins bound to it in locations determined by nucleotide sequence has a living parallel. Molecular biologists are already familiar with a number of classes of DNA-binding proteins. These play diverse roles in living cells, from merely protective and structural packing and maintenance to a variety of regulatory roles. The binding of these various proteins and protein complexes to the DNA molecule is likewise determined by various stereochemical properties, including the structural correspondence between protein structure and the distinctive "twist" of the double helix that correlates with nucleotide sequence at that point. These slight differences in twist are due to the slightly variant properties of the cross-linked bases and slightly different energetic configurations in the way the molecule "relaxes." The regulatory functions of DNA-binding proteins are also mediated by the protein-protein interactions they precipitate or block, as these will determine what other protein complexes will bind to adjacent segments. Analogously, the scenario just sketched effectively reverses what we might normally imagine the priority of genetic functions to be. What in contemporary molecular genetics we take to be a secondary supportive function of DNA-bound proteins, in service of regulating the expression of the genetic code, is in this scenario taken to be the primordial "coding" function of DNA: a substrate for organizing the interactions among various catalytically interacting proteins.
Any number of additional modifications to this scenario could be suggested to give it increased plausibility for an origins theory. For example, we might speculate that the significant temperature differences found in regions near volcanic vents could act like natural PCR machines, causing DNA-like chains to separate and re-anneal to form replicas. For the purpose of this discussion, explaining how these mechanisms evolved may improve the plausibility of this as a theory of life's origins, but it adds little to the constructive definition of biological information that results. What questions we may be predisposed to ask about the origins of these biological functions will however be quite different as a result.
It may be a disappointment to some that I have not also attempted an account of the origins of the triplet genetic code that determines protein structure. That is a far more complex problem, which I am not equipped to speculate about. So this is far from an account of the so-called code of codes that matches nucleotide sequences to the amino acid residue sequences constituting proteins. But to the extent that this scenario provides a general model demonstrating how a teleodynamic molecular process can facilitate the transfer of critical constraints onto other substrates, then it may provide hints to help reconstruct the likely large number of incremental evolutionary steps that could lead from a reaction sequence template to a molecular code.
If we step back from the specifics of the molecular code problem and ask only how a molecule's structure could come to play a critical role in the organization of the dynamical properties of life, autogenic theory provides such an account. Interestingly, one implication is that it demotes DNA replication to a supporting role, not the defining feature of information, as in replicator theories. What must get preserved and replicated are certain constraints which are critical to the teleodynamic process that generates them, and these constraints can be embodied in many different ways. Molecular information is not, then, intrinsic to nucleic acid sequences, or to the process of replication, but to these constraints in whatever form or substrate. This also means that there is no reason to assume, even in contemporary organisms, that the only information transmitted generation to generation is genetic. In whatever form it occurs, biological information is not an intrinsic attribute of that substrate. As this entire enterprise has been at pains to demonstrate, it is precisely this non-intrinsic character of information (and of all ententional phenomena) that must be accounted for. Only when this general principle is grasped will we be able to avoid importing cryptic homuncular properties into our theories of evolutionary genetics.
This scenario demonstrates that a given component attribute of a teleodynamic system becomes informational because it comes to embody and propagate constraints relevant to the preservation of the dynamical organization that it is a part of. These constraints had to already be an implicit feature of the teleodynamics of that system, prior to becoming re-presented in this additional form. But once redundantly offloaded onto a component substrate, the maintenance of these constraints by features intrinsic to the global teleodynamic organization of the system can degrade without loss of their preservation. This will subsequently decrease the constraints on specific molecular dynamics, allowing a kind of evolutionary exploration of alternative forms that would not be possible otherwise. The highly specific dynamical and chemical features that were previously maintaining these constraints can now become multiply realized.
Thinking of biological information in these dynamical and substrate-neutral terms reframes how we think about the function of biological inheritance. In this scenario, genetic information arises from the shifting of dynamically sustained constraints onto structurally embodied constraints that have no direct dynamical role to play. The localized informational function of this structure is thus secondary to, and parasitic on, these prior, more holistically distributed constraints. In the above scenario, for example, the nucleotide sequence is initially merely redundant with certain favored dynamical constraints. However, the dissociation of this molecular pattern from the dynamics that it influences additionally protects these constraints from the thermodynamic unpredictability of the dynamical context.
Dynamically embodied constraints are probabilistic, and will vary to a greater or lesser degree by chance and with respect to such variables as the relative concentrations and distributions of catalysts and substrates in an autogenic context. The critical teleodynamic constraints will persist within an autogen lineage only because their probability of expression in this diverse milieu is just sufficient to stay slightly ahead of the increase in their thermodynamic degradation. Constraints that are, instead, redundantly embodied in a molecular structure are comparatively insulated from these sources of unpredictability, as well as being more thermodynamically stable. Moreover, both the reduction of constraints on properties that must be embodied in dynamical components and the re-embodiment of critical organizational constraints in structural form open up new opportunities for incidental physical features to become functional features. This is because there will be an increased tolerance for interchangeable dynamical components and a structural means to "explore" distinct and reproducible variants of global dynamical organization (e.g., via structural modification of the template). In this way, molecular structure changes can give rise to organizational changes and organizational regularities can give rise to new forms of molecular information. So the capacity of a teleodynamic system to generate and transfer constraints from substrate to substrate is the key to open-ended evolution, involving ever more complicated dynamics and ever more diverse substrates.
FALLING INTO COMPLEXITY.
Pursued as an origins-of-life scenario, autogenic theory leads us to ask very different sorts of questions than those typically considered by the origins-of-life research community. For example, elsewhere11 I have argued that the large, inorganically formed polymers that are required to be available for autogen processes to emerge might be more likely to arise on colder planets in non-aqueous conditions, and consist of polymers formed from hydrogen cyanide monomers. Also, because of the generic nature of the autogenic mechanism, I suspect that this type of not-quite-living teleodynamic system would be able to form from a diverse range of molecular substrates. Thus the generic nature of the chemistry, the simplicity of the process, and the diversity of conditions that such a system could evolve within should make this sort of replicating molecular system far more prevalent in the universe than anything as complicated and delicate as life.12 The shift from simple autogen replication to information-based reproduction, though it might be a rare evolutionary transition in a cosmic sense, is one that would make a fundamental difference wherever and whenever it occurred. The capacity to offload, store, conserve, transmit, and manipulate information about the relationship between components in a teleodynamic system and its potential environmental contexts is the ultimate ententional revolution. It marks the beginning of semiosis as we normally conceive of it, and with it a vast virtual representational universe of possibilities, because it marks a fundamental decoupling of what is dynamically possible from immediately present dynamical probabilities-the point at which the merely probable becomes subordinate to representational possibility. This is the source of the explosive profligacy of biological evolution.
This account of the derivative origin of genetic information from teleodynamic processes does not therefore diminish its fundamental role in biological evolution. It does, however, help us to better understand the complex underpinnings of the natural selection process ("life's several powers") and specifically the importance of morphodynamic processes in development, and the way this contributes to the evolution of complexity and the generation of new biological information. Though it is beyond the intention of this analysis to explore the implications of these ideas for evolution more generally, it is worth concluding this discussion of the nature of the evolutionary process and its role in the generation of information with a few comments about how this perspective illuminates certain recent issues in evolutionary theory.
First, consider the logic of the transfer of constraints from chemical dynamics to a nucleic acid substrate, thus imbuing the latter with information-conveying capacity. In the autogenic model presented, this originally developed from a redundancy relationship. The sequence structure of the nucleic acid polymer became correlated with the dynamical constraints conducive to efficient autogen replication by a process of natural selection because of the way this redundant source of constraint increased the probability of favorable versus unfavorable reaction networks. This redundancy decouples genetic information from metabolic dynamics, effectively re-presenting constraints on interaction probabilities as constraints on the stereochemistry of the interacting molecules. In this way, both forms of constraint can vary separately, and so novel synergies between these dynamical and structural constraints can be explored by natural selection.
An analogous role for redundancy in the creation of new genetic information occurs via the serendipitous duplication of genes. Gene duplication is an error or mutation that causes a length of DNA to be copied and spliced back into the chromosome, often just adjacent to the original. There are some instances where the extra product produced by a duplicate is beneficial and other instances where it is harmful or just neutral. Duplication, like redundancy in Shannon's information theory, is an important hedge against error. In the case of a duplicated gene, this means that the functional degradation of one copy, such as might be caused by mutational damage, might not produce any degradation of function, much less catastrophic failure. And in general, even the damaged copy will not become entirely non-functional. Point mutations that change only one amino acid residue in a large protein will usually have the effect of only modestly altering its structure. Thus its change of properties will also tend to be modest, if noticeable. By unlucky chance in a protein composed of hundreds of amino acids, it is nonetheless possible for a single substitution to affect a critical structural feature of the protein, and it is also possible for a point mutation that inserts or deletes a nucleotide to cause a "frame shift" which alters how every subsequent codon is read, and thus produces an entirely different protein, if any at all. So long as there is redundancy and no additional deleterious effect produced by the altered copy, natural selection is relaxed on this duplicate; its copies in various progeny can continue to accumulate mutations and thereby "explore" variations on the original functional theme, so to speak, without significant cost in viability.
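The contrast between a single-base substitution and a frame shift can be shown with a short sketch. The sequence is hypothetical and the helper names are invented for illustration. The code only splits a string into triplets and compares them; it does not translate codons into amino acids.

```python
def codons(sequence):
    """Split a nucleotide string into consecutive triplets, dropping any incomplete tail."""
    usable = len(sequence) - len(sequence) % 3
    return [sequence[i:i + 3] for i in range(0, usable, 3)]


def changed_codons(original, mutated):
    """Count codon positions that differ, compared over the shorter codon list."""
    return sum(a != b for a, b in zip(codons(original), codons(mutated)))


original = "ATGGCCTTAGACCGT"                      # hypothetical 15-base sequence
substituted = original[:4] + "A" + original[5:]   # one base replaced at index 4
inserted = original[:4] + "A" + original[4:]      # one base added before index 4

print(codons(original))
print(codons(substituted), changed_codons(original, substituted))
print(codons(inserted), changed_codons(original, inserted))
```

The substitution alters a single triplet, whereas the insertion shifts the reading frame so that every triplet from that point onward is read differently.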
Because even a single molecule's function is seldom simple, and is often the result of various "chemical compromises," this "exploration" can often produce synergistic effects with respect to the original. This occurs if the slightly divergent duplicate happens to contribute a useful variant version of the original function, and in so doing frees the original form from the constraints of compromises that may have been necessary to operate in multiple contexts. Randomly hitting upon such a synergistic interrelationship is made more likely in sexually reproducing organisms, because constant recombination generation-to-generation effectively samples the combinatorial options that are potentially available in a population with many different independently varying versions of the duplicate. This process of duplication, degradation variation, random recombination of variants, and selective stabilization of the most synergistic combinations can happen repeatedly in the evolution of a species' genome. The result can be a whole family of related genes, working collectively or separately to achieve a complex result.
Probably the classic example of this sort of synergy due to gene duplication is duplication of the hemoglobin genes in mammals. Bloodborne hemoglobin is a tetrameric complex in which four hemoglobin molecules (two alpha and two beta hemoglobins) fit together into a tetrahedral shape, with each hemoglobin's oxygen-binding region facing outward. A gene duplication event resulted in the separate alpha and beta forms diverging in shape over evolutionary time, allowing for the evolution of their current synergistic fit. Independent variations in the shapes of the non-oxygen-binding surfaces of the alpha and beta variants over the course of their evolution made it possible for recombination to "sample" their different tendencies to self-assemble into larger complexes, and to favor the replication of those complexes with improved self-assembly, stability, and oxygen-carrying capacity. Even more interesting is the fact that further duplications of the beta hemoglobin gene in evolution have resulted in a number of variants of beta hemoglobin, each with slightly different oxygen affinities. These appear to have been recruited by analogous recombination and selection processes to serve the changing oxygen-transfer demands of the developing mammalian fetus, whose hemoglobin needs to be able to extract oxygen from its mother's hemoglobin and deliver it to its own body. Moreover, the different variant fetal beta hemoglobins are produced in different amounts at different stages of gestation, and appear to correspond to the changing oxygenation demands of a growing fetus, thus achieving a sort of temporal synergy.
Probably the most dramatic example of the effect of gene duplication, and subsequent complementation of function, produces the theme-and-variation structure of multicellular plant and animal bodies. This turns out to be the effect of a class of genes that play regulatory roles by producing protein products which bind to other regions of the genome and act to effect the expression of large constellations of other genes. The family of genes that was found to be responsible for the segmentation of animal bodies is named the homeobox family of genes ("Hox genes," for short, in vertebrates). These are named for a coding region they share in common that constitutes the critical DNA-binding domain of the proteins they code for. They were discovered in fruit flies because mutations of these genes produce bizarre but systematic modifications of whole body segments, such as a duplicated thoracic region or a leg where an antenna should have developed.
It was found that different variant duplicates of these genes orchestrate the expression of large numbers of "