Impact Statement
Creating digital twins (or the process of digital twinning) is a concept of growing importance in a wide range of industries and technology sectors. Digital twins can be used as a method to obtain value from data and as deployment platforms for AI and data-science techniques such as machine learning and statistical analysis. In many applications, digital twins offer the means to integrate multiple previously separate components in order to achieve one or more specified objectives. This type of integration of digital components is based on a fundamentally holistic philosophy. This paper presents a conceptual framework for digital twins that considers how such a holistic integration can be achieved, including current questions of interest and challenges for future research.
1. Introduction
A digital twin is a virtual representation of a natural, engineered, or social system [Footnote 1] (called the physical twin) that enables a two-way coupling between the digital and physical domains, using some form of network-based connectivity (that is, a bidirectional flow of information, typically across the Internet). The digital twin evolves over time and is constructed from digitized information such as recorded data and the output of computational models.
The origins of digital twinning are usually attributed to the work of NASA during the Apollo program, where physical duplicates were used to help mission control support their astronauts in responding to a critical failure with their oxygen tanks and engine (Rosen et al., 2015). However, the term “digital twin” itself first appears in work relating to product lifecycle management (see Grieves, 2019, and discussion therein). The idea has received considerable attention since then in a wide range of areas including product design, manufacturing, civil infrastructure, medicine, asset management, health/condition monitoring, energy networks, space structures, and nuclear fusion. [Footnote 2]
Digital twins have been promoted as a way to accelerate our ability to understand engineering (and other) systems at previously unmatched levels of performance. This vision and aspiration were captured in 2011 by Eric Tuegel and his coauthors (in the context of structural life prediction), who stated that:
“The digital twin is a reengineering of structural life prediction and management. Is this science fiction? It is certainly an audacious goal that will require significant scientific and technical developments. But even if only a portion of this vision is realized, the improvements in structural life prediction will be substantial” (Tuegel et al., 2011).
This is certainly a very exciting prospect. However, it is important to always maintain a healthy level of skepticism when dealing with such claims. The aspiration for digital twins, particularly from commercial vendors, seems to imply that the new technology will somehow capture and contain “the best of everything” in some optimal way (e.g. efficient assemblage and federation between models, data, machine learning methods, processes, controls, decision, etc.). In addition, it is often implied that digital twins will somehow overcome the fundamental challenges and limitations related to modeling that we already have (e.g. limited computational resources, epistemic gaps), enabling benefits such as improved fidelity, trust, and insight. But how exactly might that happen? When such questions are not satisfactorily answered, the conclusion for some is that the whole idea is over-hyped, skepticism can become cynicism, and genuine scientific and technological progress can become stalled.
We believe that part of the underlying issue stems from the fact that the concept of a digital twin is so versatile and universally applicable that it is open to a very wide range of interpretations, as evidenced by recent reviews (Korenhof et al., 2021). Those interpretations come from a large number of different research and practitioner communities, which themselves have very wide-ranging cultures and practices. However, a significant part of the digital twin paradigm is about interconnecting these previously unconnected domains. For example, building socio-technical digital twins is a major ambition in this field (Okita et al., 2019; Wang et al., 2020; Zhang et al., 2021a; Savage et al., 2022; Yossef Ravid and Aharon-Gutman, 2022). As a result, when conversations happen, people are often talking at cross-purposes, because they have different starting points, cultural assumptions, biases, and motivations.
Therefore, in this paper, we seek to understand the philosophical context (or foundations) that underpins the concept of a digital twin. We will argue that the philosophy of digital twins is fundamentally based on the idea of holism. In short, certain properties represented by a digital twin apply only to wholes formed out of assemblies of more basic parts, such that it is conceptually incoherent to apply these properties to lower-level components of the system. Furthermore, a key (and related) aspiration for digital twins is that they can capture emergent phenomena. However, as we will describe, these are not the only philosophies that relate to digital twinning. Little has previously been published on the philosophy of digital twins specifically, with some notable exceptions, such as the work of Korenhof et al. (2021), who have proposed that digital twins are “steering representations”, something we discuss later on.
It is important to note that, because holism and emergentism are typically at odds with the reductionist paradigm used in the majority of current mainstream science and engineering models, the approach is neither well established nor well understood. To support our argument, we will review several aspects related to the philosophy of modeling. In the second half of the paper, we present a set of principles for digital twinning along with some examples that are intended to support the development of a more complete philosophical framework—an exercise we leave for further research. Some readers may wish to jump ahead to these principles prior to reading the first half of the paper, to get an idea of our destination. Others may wish to review the foundational context set out in the first half. We leave this as a choice for the reader.
The paper is structured as follows. In Section 2, we present the philosophical context for the concept of digital twins. The concept of modeling will be key to this discussion, and we will explore related philosophical topics and issues that will help the reader understand the motivation behind the principles presented in Section 4. In Section 3, we then turn to consider the types of complexity that occur in engineering systems, and how this might be represented in a digital twin. With the context set, in Section 4, we then introduce a series of principles that we argue could form a conceptual framework for digital twins, or the process of digital twinning. This framework is then used to suggest answers to four key questions relating to digital twins. Finally, in Section 5, we conclude and suggest several open questions and future directions for research.
2. Philosophical issues and concepts for digital twins
Although there has been much discussion on the potential definitions relating to digital twins (Semeraro et al., 2021; Committee on Foundational Research Gaps and Future Directions for Digital Twins et al., 2024), one area that has not received much attention is the philosophical underpinnings of digital twinning as a distinct concept and practice. [Footnote 3] In this section, we seek to address this gap.
We begin by reviewing key topics in the philosophy of modeling as applied to a wide range of scientific and engineering domains, before turning to related concerns (e.g. epistemic issues). Models are very important for digital twins because they are one of the key components that make up a digital twin. [Footnote 4] However, the relationship between models and digital twins can also be a common source of confusion (e.g. what distinguishes a digital twin from a mere model or simulation). [Footnote 5] Therefore, it is helpful to first unpack some key issues that will help us in our later discussions.
2.1. Some key concepts in the philosophy of modeling
The history of philosophical approaches to modeling as a concept and practice goes back at least as far as ancient periods of Greek, Chinese, and Indian thought (Curd et al., 2014). It is not our goal to rehearse or review this entire history. [Footnote 6] Rather, our goal is simply to understand key parts of the philosophical context of digital twinning in order to motivate the presentation of the principles for digital twinning set out in Section 4. Therefore, we identify and discuss some key concepts that underpin how a model of a physical system is typically made in a science and engineering context, with an emphasis on drawing out certain implicit assumptions. To support this goal, an idealized example of such a model-making process is shown schematically in Figure 1.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20250207130253812-0594:S2632673625000048:S2632673625000048_fig1.png?pub-status=live)
Figure 1. Schematic diagram showing the typical method of making a model of a physical system. The physical system can be a process or a material object.
The process starts with a series of “observations” (or measurements) of some physical system. This presumes that there is already a physical system in existence, which may not be the case in engineering when we are asked to design something not previously built. Some discussion on this is given in Wagg et al. (2020), but for now, we assume that a physical system is available for observation. These observations are then used to construct a model based on a set of “assumptions”. Next, the “output(s)” of the model (e.g. predictions) are interpreted, and in many cases, this leads to “improvements” being made to the model (e.g. optimizing the performance of predictions), and the process is repeated as often as deemed necessary.
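The cycle of observations, assumptions, outputs, and improvements described above can be sketched in a few lines of code. The following is a minimal illustration only, not a method taken from this paper: the “physical system” is a hypothetical stiffening spring with simulated sensor noise, the assumptions are the polynomial terms allowed in the model, and every function name, parameter value, and tolerance is an assumption introduced for the example.

```python
import random


def observe(n=50, seed=0):
    """Hypothetical physical system: force = 2x + 0.5x^3 plus sensor noise."""
    rng = random.Random(seed)
    xs = [2.0 * i / (n - 1) for i in range(n)]
    ys = [2.0 * x + 0.5 * x**3 + rng.gauss(0.0, 0.02) for x in xs]
    return xs, ys


def solve(a, b):
    """Solve a small linear system by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit(xs, ys, powers):
    """Least-squares fit of y ~ sum_j c_j * x**powers[j] (normal equations)."""
    a = [[sum(x ** (p + q) for x in xs) for q in powers] for p in powers]
    b = [sum(y * x**p for x, y in zip(xs, ys)) for p in powers]
    return solve(a, b)


def rms_residual(xs, ys, powers, coeffs):
    """Interpretation step: how far are the model outputs from the observations?"""
    errs = [y - sum(c * x**p for c, p in zip(coeffs, powers))
            for x, y in zip(xs, ys)]
    return (sum(e * e for e in errs) / len(errs)) ** 0.5


def model_making_loop(tolerance=0.05, candidates=(1, 3)):
    """Repeat: add an assumed term, fit, interpret, improve if needed."""
    xs, ys = observe()                      # observations
    powers, history = [], []
    for p in candidates:                    # each pass relaxes an assumption
        powers.append(p)
        coeffs = fit(xs, ys, powers)        # model built on current assumptions
        err = rms_residual(xs, ys, powers, coeffs)
        history.append((tuple(powers), err))
        if err <= tolerance:                # output judged good enough
            break
    return powers, coeffs, history


if __name__ == "__main__":
    powers, coeffs, history = model_making_loop()
    for assumed, err in history:
        print(f"assumed terms x^{assumed}: RMS residual = {err:.3f}")
```

In this sketch the first pass assumes a purely linear spring, the residual is judged too large, and the second pass adds a cubic term. The stopping tolerance is itself a modeling choice, which is one of the implicit assumptions the text goes on to discuss.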
The idealized process of model making (or abstraction) shown in Figure 1 allows us to locate and separate some key philosophical concerns: natural laws, determinism, reductionism, and holism & emergence.
In science and engineering, a “physical system” is typically assumed to be mechanistic in nature. This mechanistic worldview is characterized by a belief in a set of “natural laws” that govern physical systems. Science seeks to discover these laws or rules, relies upon them applying “universally,” and then utilizes them within subsequent processes of model-making. The value of such a belief is articulated well by Descartes, who said that:
“…reliable rules which are easy to apply, and such that if one follows them exactly, one will never take what is false to be true or fruitlessly expend one’s mental efforts, but will gradually and constantly increase one’s knowledge till one arrives at a true understanding of everything within one’s capacity.” (René Descartes, Rules for the Direction of the Mind; see reprint: Descartes, 1985, first published 1701).
Natural laws, such as those of physics, biology, and chemistry, provide a foundational set of “rules” describing how real-world systems operate. Because digital twins aim to represent these real-world systems in a virtual environment, an understanding of these laws allows researchers and developers to ensure that their models accurately reflect these fundamental rules (e.g. incorporating Newton’s laws of motion into a digital twin of a spaceship).
“Determinism” connects to this concept of natural laws insofar as it is assumed that if natural laws are strictly followed, then future states of a system can be precisely predicted [Footnote 7]. And, if the physical world is deterministic, then digital twins should, in theory, be able to predict the future states of a system given accurate input data. These beliefs (among others) are the first set of assumptions that cascade through the typical model-making process and affect our “observations”, and they are assumptions that we will later challenge.
A belief in a law-governed, deterministic world, however, does not preclude the possibility of a complex world. Physical systems can be highly complex (e.g. economies or ecologies) [Footnote 8], and so it is common to (conceptually) decompose (or reduce) physical systems into “simpler” parts to be more effectively represented and studied. It is assumed that, by studying this reduced version, useful information about the complete system can be obtained (Heylighen et al., 2007).
For the purpose of this discussion, we can consider two forms of reductionism. The first is component-based reductionism, which involves dividing the physical system into separate physical (e.g. geometric or process) components and, if required, dividing these components into smaller and/or simpler parts. The second is physics-based reductionism, which is to simplify, approximate, or even neglect entirely some part of the physics. Physics-based reductionism can be applied to the whole system or to sub-components of the whole system after component-based reductionism has been carried out. In Figure 1, these reductions are encoded as a further set of assumptions. The reductionist philosophical approach has come to dominate scientific and engineering practices over time, and is associated with a Newtonian world-view. Newton himself said:
“We are to admit no more causes of natural things than such as are both true and sufficient to explain their appearances. To this purpose, the philosophers say that Nature does nothing in vain, and more is in vain when less will serve; for Nature is pleased with simplicity and affects not the pomp of superfluous causes.” (Isaac Newton, Principia: The Mathematical Principles of Natural Philosophy; Newton, 1686, emphasis added).
Here the idea of avoiding “superfluous causes” and the idea that “Nature is pleased with simplicity” have been taken as an argument for reduction to enable simplification.
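As a concrete illustration of physics-based reductionism, consider the simple pendulum, an example that is not drawn from this paper. Replacing the restoring term sin(θ) with θ (the small-angle approximation) neglects part of the physics and gives a period that no longer depends on amplitude. The sketch below compares that reduced model with the full nonlinear period, computed from the standard arithmetic-geometric mean expression; the function names and the default parameter values are assumptions made for the example.

```python
import math


def agm(a, b, tol=1e-15):
    """Arithmetic-geometric mean of two positive numbers."""
    while abs(a - b) > tol:
        a, b = (a + b) / 2.0, math.sqrt(a * b)
    return a


def period_linearized(length, g=9.81):
    """Reduced model: sin(theta) replaced by theta, so amplitude drops out."""
    return 2.0 * math.pi * math.sqrt(length / g)


def period_full(length, amplitude_rad, g=9.81):
    """Full nonlinear pendulum period for a release angle below 180 degrees."""
    return period_linearized(length, g) / agm(1.0, math.cos(amplitude_rad / 2.0))


def relative_error(amplitude_deg, length=1.0):
    """Fraction of the true period that the reduced model fails to capture."""
    theta0 = math.radians(amplitude_deg)
    full = period_full(length, theta0)
    return (full - period_linearized(length)) / full


if __name__ == "__main__":
    for amplitude in (5, 30, 90, 150):
        print(f"{amplitude:>3} deg: relative error = {relative_error(amplitude):.4f}")
```

For a 5 degree swing the reduced model is in error by well under 1 percent, while at 90 degrees the error exceeds 10 percent. The reduction is therefore an assumption whose validity depends on the operating regime, which is exactly the kind of encoded assumption shown in Figure 1.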
Classical mechanics has been built on these beliefs and assumptions, with huge success, and the plethora of law-governed, deterministic, and reduced models are typically defined with a high degree of mathematical rigor. But experience also tells us that much about the physical world still cannot be predicted or modeled to high levels of precision. Reductionist models, by definition, cannot capture the holistic behavior of the physical system, and they also struggle to account for emergent behaviors.
Roughly, “emergence” refers to “phenomena that arise from and depend on some more basic phenomena yet are simultaneously autonomous from that base” (Bedau and Humphreys, 2008, p. 1). However, as is evident from the ongoing debates in the philosophy of science, we should not presume that a single type of emergence exists, nor that a single definition will always and everywhere suffice [Footnote 9]. This is also true for the emergence of digital twins, given the myriad uses and phenomena that digital twins can represent and affect. However, a key aspect of emergence that will be important later on is that complex systems often exhibit properties or behaviors that cannot be fully explained or predicted by understanding the individual components of the system alone. Often such interacting systems contain intricate hierarchies or interdependencies, and emergence (e.g. self-organization) can happen within a part, or across the entire system (see e.g. Bedau and Humphreys, 2008). The existence of such behaviors supports the idea of holism, which we can describe as the assumption that systems should be viewed as wholes, not just as collections of parts. In terms of model-making, this approach requires studying (and modeling) the interdependencies and interactions within a system, rather than isolating and analyzing individual components, in contrast to the reductionist approach.
But emergence and holism also place pressure on the notions of determinism and natural laws, as emergent behaviors cannot typically be anticipated or predicted. This raises the question of whether such a barrier is metaphysical or epistemic in nature (that is, something about the fundamental nature of the physical system or about our knowledge of the system). Let us look at the epistemic issues first.
2.2. The role of knowledge in model making
To start, two distinctions can be drawn between objectivism and subjectivism, and between epistemic uncertainty and aleatory uncertainty.
Very briefly, objectivism holds that the “model making” process in Figure 1 is an attempt to objectively describe and represent reality. Models, then, are seen as tools that aim to accurately capture the underlying structures, laws, and relationships that exist in the natural world, in a manner that is independent of human perception or interpretation. In contrast, subjectivism holds that models are information constructs that reflect or encode human perspectives, interpretations, and choices. As such, the process of observing a physical system is contingent on factors such as the goals of the study. [Footnote 10] The distinction between aleatory and epistemic uncertainty relates to this first distinction.
On the one hand, aleatory uncertainty (sometimes known as “stochastic uncertainty”) refers to any uncertainty that arises from inherent randomness or variability in a system (see e.g. Hughes and Hase, 2010). Such uncertainty (or chance) is considered irreducible because it is linked to the natural variability of phenomena. An objectivist, therefore, would take this uncertainty to be real and to exist independently of our knowledge or beliefs.
On the other hand, epistemic uncertainty refers to uncertainty that arises from a lack of knowledge or incomplete information about a system. The missing knowledge (or model error/inadequacy) can arise due to many factors (e.g. inadequate measurement or sensors, incomplete theories). In the context of subjectivism, uncertainty is, therefore, not just a feature of the external world but crucially depends on the observer’s knowledge and perspective. For instance, if we are uncertain about the parameters of a model due to insufficient data (that is, observations), this uncertainty is epistemic in nature and could in principle be reduced by gathering more data.
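The difference between the two kinds of uncertainty can be made concrete with a small numerical sketch. It assumes a hypothetical sensor whose readings scatter around a fixed but unknown value; the scatter stands in for aleatory variability, and the standard error of the estimated mean stands in for epistemic uncertainty about a model parameter. The function name, the true mean, the noise level, and the seed are all assumptions introduced for illustration.

```python
import random
import statistics


def summarize(n, true_mean=10.0, noise_sd=1.0, seed=1):
    """Simulate n readings and separate the two kinds of uncertainty."""
    rng = random.Random(seed)
    data = [rng.gauss(true_mean, noise_sd) for _ in range(n)]
    estimate = statistics.fmean(data)
    # Aleatory part: spread of individual readings; more data does not shrink it.
    spread = statistics.stdev(data)
    # Epistemic part: uncertainty in the estimated mean; shrinks like 1/sqrt(n).
    std_error = spread / n**0.5
    return estimate, spread, std_error


if __name__ == "__main__":
    for n in (20, 200, 2000, 20000):
        estimate, spread, std_error = summarize(n)
        print(f"n={n:>5}  mean={estimate:.3f}  "
              f"spread={spread:.3f}  std_error={std_error:.4f}")
```

As the number of observations grows, the standard error of the estimate falls while the spread of the readings stays close to the underlying noise level. This mirrors the claim in the text that epistemic uncertainty could in principle be reduced by gathering more data, whereas aleatory uncertainty is treated as irreducible.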
In practice, the distinctions between objectivism and subjectivism, and between aleatory and epistemic uncertainty, are not always as clear-cut as presented here. [Footnote 11] Some uncertainties may exhibit characteristics of both aleatory and epistemic uncertainty, but saying more about this is beyond the scope of the current discussion. A related issue, not only in the philosophy of science but also in moral philosophy and the social sciences, is the impact of human bias on the myriad value judgments that intersect with science and engineering (e.g. model selection and design choices).
Firstly, human interpretations are problematic, and it is difficult for us to be objective when constructing models and interpreting the results. Humans tend to adopt worldviews that suit them (e.g. techniques or methods that are familiar to a specific community of practice), and we are all subject to cognitive biases (e.g. confirmation bias). In addition, poor research practices can mean that models are not properly validated, calibrated, or tested once they are built, leading to claims that many published research results may in fact be false (Ioannidis, 2005; Marques, 2021). Furthermore, groups and communities are subject to negative group dynamic effects, such as those that may stem from a lack of diversity or inclusivity, combined with entrenchment and groupthink that can exacerbate negative views of other groups and their associated philosophies. For example, those working in the “hard sciences” often fail to understand the approach and values of those working in social sciences or humanities and vice versa (see reprint: Snow, 2012, first published 1964). There are multiple other types of philosophical tribalism and dogmatic behavior [Footnote 12] that can impede the adoption of useful model-making practices. For example, researchers and practitioners are often philosophically aligned to either quantitative or qualitative methodologies, where in many circumstances, mixed methods (e.g. a combination of quantitative and qualitative methodologies; see e.g. Varga, 2018) would be more beneficial. This will be an important point for digital twins, where both quantitative and qualitative functions are often required.
So far, these initial philosophical concepts have addressed only the top row of Figure 1. However, the schematic is cyclical in nature, with improvements feeding back in after the interpretation of a model’s output(s). Therefore, let us focus on this stage next.
2.3. Defining a purpose for a model
In 1982, British statistician George Box published the now-famous adage, “All models are wrong, some are useful.” Box’s central point was that no (statistical) model can ever be “correct” in the sense that there is a “perfect” match with the physical system. But Box’s statement also emphasizes the idea of model usefulness (or utility [Footnote 13]). We can capture the idea that models can have a useful purpose, even though they can never be perfect, with the concept of utility. But we can also ask, “useful for what?”
An obvious response would be that the primary purpose of a model is to gain (or enhance, extend, and/or clarify) knowledge. Such a view seems to be captured by the idea of “model-dependent realism,” expressed in Hawking and Mlodinow’s The Grand Design (Hawking and Mlodinow, 2010), which claims that the value of models lies in their ability to make accurate predictions and provide useful explanations, not in any claim to absolute truth (that is, realism). As Hawking and Mlodinow state:
“According to model-dependent realism, it is pointless to ask whether a model is real, only if it agrees with observation.” (Hawking and Mlodinow, 2010).
This pragmatic perspective has a rich history in both philosophy and philosophy of science [Footnote 14]. A common emphasis of pragmatic perspectives is how knowledge gained from scientific models ultimately leads to additional explanatory capability or insight, noting, again, the cyclical nature of Figure 1. [Footnote 15]
Both insight and utility depend on a final key concept: trust. As Harlow Shapley, the American astronomer, said:
“No one trusts a model except the man who wrote it; everyone trusts an observation except the man who made it.”
The first part of this quote acknowledges that models are often, and rightfully, viewed skeptically by those who did not create them. For instance, whereas the creators of a model may be intimately familiar with its strengths and limitations, leading to greater trust or confidence in the model’s utility and ability to generate insights, others may be more critical or cautious about accepting its predictions or conclusions due to a lack of familiarity with the assumptions that went into its development. But as the second part of the quotation emphasizes, the creators of the model who actually made the initial observations will also be more aware of potential errors, biases, or limitations in their methodology, and thus may be more cautious about the reliability of their own observations.
With Shapley’s quote in mind, if we cycle through the process depicted in Figure 1 we can begin to see how the various assumptions and values we have identified so far would become increasingly concealed, leading to greater uncertainty and potential sources of distrust. It is perfectly possible, for instance, for people to be working on a model for which they did not do any of the original model making, and therefore to be unaware of the philosophical context used in developing the original version of the model, or of the encoded assumptions within the model that are inherited by successive generations of practitioners (e.g. a choice to use one “standardized” measurement scale over another). And, in many domains (engineering being one), the separation of practitioners from the model-making process (and the associated assumptions) is increasingly common as modeling becomes more frequently integrated into sophisticated software tools or packages.
To take just one example of how this separation could affect notions such as utility, insight, or trust, consider the choice about how to delineate and demarcate the boundaries of the modeled system when dealing with complex physical systems.
In his book, “Seeing Like a State”, James Scott provides a compelling account of how such choices about delineation or demarcation (choices that are inseparable from the process of modeling) can have significant societal impacts. Scott’s account deals with non-computational models, and focuses on the emergence of scientific agriculture (Scott, 2020) in 18th- and 19th-century Germany. However, it is still instructive for our paper, because Scott provides evidence of how foresters began to view forests primarily in terms of their commercial timber yield, leading to a model of forests that were orderly, regimented, predictable, and controllable, but also fundamentally mono-cultural and antithetical to the inherent diversity of the real forests. To put it another way, the utility of this scientific (but non-computational) model optimized the production of a single commodity (timber) while ignoring other aspects of the forest’s ecosystem, such as undergrowth, soil health and nutrient cycles, wildlife habitats, and impacts on local communities who used trees for additional purposes (e.g. sap). The result was a myopic set of insights.
Initially, this approach led to increased timber yields and was seen as a success. But, because so much of the real-world (or physical system) fell outside of the scope of the model, over time these “simplified” forests became more vulnerable to diseases, pests, and environmental stresses due to how they were managed (or controlled). In summarizing his example, Scott provides a persuasive argument against modeling that is fundamentally reductionistic:
“The metaphorical value of this brief account of scientific production forestry is that it illustrates the dangers of dismembering an exceptionally complex and poorly understood set of relations and processes in order to isolate a single element of instrumental value.” (Scott, 2020, p. 21, emphasis added).
While some may wish to dismiss such concerns as falling beyond the scope of science, and instead falling within the jurisdiction of ethicists or policy-makers, this would be a naive position to hold.
In contrast, the claim we wish to defend here is that utility, trust, and insight are three key generic requirements (or properties) of models that should also extend to digital twins [Footnote 16]. This should not be read as a dismissal of other important scientific values like fidelity, parsimony, cost, or optimality. Rather, we would argue that these characteristics will depend on the specific context of the model or digital twin (that is, in contrast with the previous generic requirements). Parsimony, however, requires a further comment as it relates to one of the key principles to be discussed later in Section 4.
Essentially the parsimony principle for models means that a simpler model with fewer parameters is regarded as better than more complex models with more parameters, assuming that both models fit the observations similarly well. However, in recent years, and particularly in research related to living systems, cognitive science, and AI, there is a growing amount of evidence that does not favor parsimony. For example:
“AI researchers were beginning to suspect—reluctantly, for it violated the scientific canon of parsimony—that intelligence might very well be based on the ability to use large amounts of diverse knowledge in different ways,” (Pamela McCorduck; McCorduck, 2004).
See also discussions in Marsh and Hau (1996); Huelsenbeck et al. (2008); Hastie et al. (2009) (for example) relating to nonparsimonious models [Footnote 17].
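As a point of reference for the parsimony principle described above, the sketch below shows one conventional way it is put into practice: two models that fit the same observations similarly well are compared using a criterion that penalizes the number of parameters. The Bayesian information criterion (BIC) is used here purely as a familiar example and is not a method proposed in this paper; the synthetic data, the polynomial degrees, and all function names are assumptions made for illustration.

```python
import math
import random


def solve(a, b):
    """Solve a small linear system by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def polyfit_rss(xs, ys, degree):
    """Residual sum of squares of a least-squares polynomial fit."""
    powers = range(degree + 1)
    a = [[sum(x ** (p + q) for x in xs) for q in powers] for p in powers]
    b = [sum(y * x**p for x, y in zip(xs, ys)) for p in powers]
    coeffs = solve(a, b)
    return sum((y - sum(c * x**p for p, c in enumerate(coeffs))) ** 2
               for x, y in zip(xs, ys))


def bic(rss, n, k):
    """BIC for Gaussian errors, up to a constant: lower is preferred."""
    return n * math.log(rss / n) + k * math.log(n)


def compare_models(n=200, seed=0):
    """Fit a 2-parameter and a 5-parameter model to noisy linear data."""
    rng = random.Random(seed)
    xs = [-1.0 + 2.0 * i / (n - 1) for i in range(n)]
    ys = [1.0 + 2.0 * x + rng.gauss(0.0, 0.1) for x in xs]
    rss_simple = polyfit_rss(xs, ys, 1)
    rss_complex = polyfit_rss(xs, ys, 4)
    bic_simple = bic(rss_simple, n, 2)
    bic_complex = bic(rss_complex, n, 5)
    return {
        "rss_simple": rss_simple,
        "rss_complex": rss_complex,
        "bic_simple": bic_simple,
        "bic_complex": bic_complex,
        "preferred": "simple" if bic_simple < bic_complex else "complex",
    }


if __name__ == "__main__":
    for name, value in compare_models().items():
        print(f"{name}: {value}")
```

The more complex model always achieves a residual at least as small as the simpler one, so the preference for the simpler model comes entirely from the parameter penalty. That penalty is a value judgment built into the criterion, which is why evidence against parsimony, such as that cited above, matters for how digital twins are assessed.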
In summary, this section has explored a range of philosophical topics and concepts related to an idealized process of modeling. This will serve as an important context for the principles we set out in Section 4. However, the discussion has been based on an idealized model, with an emphasis on scientific representation. As such, it is important that we provide a more realistic account of the true complexity of such a process, as well as addressing some of the specifics of engineering practice.
3. Complexity in engineering systems
“Engineering is the art of modeling materials we do not wholly understand, into shapes, we cannot precisely analyze, to withstand forces we cannot properly assess, in such a way that the public has no reason to suspect the extent of our ignorance”—Dr. A. R. Dykes, from the British Institution of Structural Engineers President’s Address, 1978.
By the mid-20th century, many researchers were identifying complex behaviors that could not be modeled using the reductionist paradigm. This included a diverse range of applications such as the long-established field of life sciences (see e.g. Weaver, 1948), and new areas like information theory (Shannon, 1948), cybernetics (see e.g. Ashby, 1956), operations research (see e.g. Churchman et al., 1957), and artificial intelligence (see e.g. Turing, 1950). Within these fields, many researchers were interested in studying emergent behaviors, and so were attracted to holistic philosophies for building models. All of these topics would become large fields of research in their own right, but they also all contributed to three important current fields of study. The first is complexity science [Footnote 18], which primarily focuses on emergent and adaptive behaviors (see e.g. Waldrop, 1993; Mitchell, 2009; Jensen, 2022). The second is systems research, which is focused on managing large-scale socio-technical systems (see for example Meadows, 2008), and the related field of systems engineering. The third is artificial intelligence (see Russell and Norvig, 2010), which has been very influential in developing new approaches to learning from data, knowledge modeling, and intelligent agents (amongst many other things).
When considering complex system behavior, it is possible to distinguish between different categories of a system based on linear versus nonlinear, ordered versus disordered, deterministic versus non-deterministic [Footnote 19], reduced versus holistic, etc., and combinations of these categories. Here we will adopt the broad distinction that complex relates to a system that can have emergent behavior (i.e. “more is different”) whereas complicated relates to a system that is not “simple” but does not have interacting components that could lead to emergent behaviors. [Footnote 20] These distinctions will be important when we discuss the principles related to digital twins in Section 4. We also note here that complexity techniques are already being promoted for digital twins of cities (see for example Rozenblat and Fernández-Villacanas, 2023; Caldarelli et al., 2023).
3.1. Types of complexity in engineering systems
Engineers are expected to design, build, commission, operate, maintain, manage, and decommission a huge range of different systems. The quote from A. R. Dykes at the start of this section gives a sense of the engineering process. Multiple categories of complex and uncertain factors (in this case materials, shapes, forces, and public expectations) need to be brought together to achieve the required task. Table 1 lists some of the types of complex (and/or complicated) phenomena that can arise in, or influence, physical systems.
Table 1. Examples of complex (and/or complicated) phenomena that can influence physical systems
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20250207130253812-0594:S2632673625000048:S2632673625000048_tab1.png?pub-status=live)
It is typical for engineering applications to have multiple types of complexity contained within them from the list in Table 1. For example, geometric complexity and joints are used extensively in a wide range of manufactured products, as are sophisticated materials, such as composites. These different aspects of the manufactured product are often designed, modeled, and tested separately before being integrated into the final version of the product.
As the format of Table 1 indicates, our usual method for dealing with mixed complexity is to separate it and consider each type independently. Usually, this is mapped onto our siloed (i.e. reduced) set of divisions within subject areas (and the education system). Roles and specialisms are also then aligned with these divisions, creating teams of experts in each separate topic area.
Furthermore, unlike scientific inquiry, where the focus is on understanding and explaining the behavior we observe (as in complexity science), engineering is often required to create something new or deal with a socio-technical system that is highly complex/uncertain and is changing over timeFootnote 21. In order to try to address some of the related challenges, the field of systems engineering has developed some methodologies based on a holistic approach, which we discuss next.
3.2. Systems engineering
Systems engineering was developed during the 20th century alongside the other related fields of systems research and complexity science described aboveFootnote 22 (Schlager, Reference Schlager1956). The underlying philosophy of systems engineering is that of holism, and the field has now matured into an established methodology for managing complex engineering projects (see e.g. Walden et al. Reference Walden, Roedler, Forsberg, Hamelin and Shortell2015; Hirshorn et al. Reference Hirshorn, Voss and Bromley2017). At the heart of current-day systems engineering is the role of processes, to enable the design, implementation, and management of the engineering application or project.
Systems engineering processes have evolved from being document-based to being model-based (Estefan et al., Reference Estefan2007), as technologies have improved to allow information to be captured with more automation and presented graphically. This approach underpins the diagrammatic approach to enterprise architecture (Dandashi et al., Reference Dandashi, Siegers, Jones and Blevins2006) and could be regarded as a predecessor to digital twinning. Indeed, digital twins that enable planning and design may be considered examples of model-based systems engineering as they facilitate the exchange of information, alignment of design, and management of programmatic complexity in the same way as now-traditional systems engineering documentation processes do.
The ethos of systems engineering is to give a framework that enables multiple uncertainties and complexities to be managed simultaneously, and for the technical processes to be aligned with the decision, management, and wider related business processes. It is important to make a clear distinction between working with “engineered systems” and the practice of engineering in complex systems. Confusingly, both can be called systems engineering, but the key distinction is that engineered systems can be controlled/optimized whereas complex systems typically cannot.
The systems engineering community has given a considerable amount of time and thought to the philosophical and pragmatic frameworks needed to deal with complex/complicated engineering applications. For example, in recent papers (Watson, Reference Watson2019; Watson et al., Reference Watson2019) three hypotheses for systems engineering were articulated:
H1. If a solution exists for a specific context, then there exists at least one ideal systems engineering solution for that specific context.
H2. System complexity is greater than or equal to the ideal system complexity necessary to fulfil all system outputs.
H3. Key stakeholders’ preferences can be represented mathematically.
We shall discuss these hypotheses further in Section 4.2, in the context of digital twins. To mention them just briefly here, H1 relates to the concepts of existence and uniqueness. H2 is related to the idea of counter-parsimony, by which we mean choosing not the simplest model that fits the data, but the model with sufficient complexityFootnote 23. Finally, H3 anticipates the stakeholders' preference for quantitative solutions.
Other important concepts that are emphasized in systems engineering are the idea of the lifecycle of a system, requirements analysis, and hierarchies of systems that lead to systems-of-systems (see e.g. Adams and Meyers (Reference Adams and Meyers2011))Footnote 24.
Although the subject borrows from, and integrates, several of the concepts and methodologies from systems research and complexity science, it should be noted that some researchers have been critical of the systems theory ethos. For example, Michael Grieves expresses reservations about treating everything as a process:
“We like to think that what we do in our organizations is process. Under systems theory, the process is a deterministic way of linking inputs to outputs. In a systems view of the world, we have inputs, processes, and outputs. For any given set of inputs, we get a well-specified and consistent set of outputs. It is all very neat and well defined.”
(Grieves, Reference Grieves2005, page 19).
Grieves argues instead that not everything can be made a deterministic process, and that engineers need to make extensive use of practices as well, with results that lead to satisficingFootnote 25 instead of optimization (Grieves, Reference Grieves2005).
The broader point is that engineering contains some form of “art” (alluded to in the A. R. Dykes quote above) typically encoded in the form of attributes like engineering judgments and design choices Footnote 26. As much as many practitioners would like, these creative activities cannot be entirely turned into repeatable processes. It is interesting to note that some in the social science community, who have adapted systems thinking, have extended the concepts to include dialogue and create an architecture of evolution—see for example Christakis (Reference Christakis2006)Footnote 27.
Using more philosophical arguments, Weinbaum (Reference Weinbaum2015) describes systems theories as based on a “black box dogma” with a lack of clarity on issues relating to the role of feedback, evolutionary adaptation, and causality.
In response to the criticisms, it is certainly true that the systems engineering approach favors defining multiple processes with associated inputs and outputs, and that in itself could be an over-constraining structural format for some applications. It is also true that the role of reductionism and deterministic modeling was strongly used in some of the early systems research fields, and some of that thinking has been inherited by the modern version of the field. Finally, creative activities cannot always be turned into processes, and we should recognize that.Footnote 28
As pragmatists, engineers often have little concern for this type of philosophical subtlety, but it should be borne in mind when these approaches are used in digital twins. Despite the limitations, systems engineering offers some useful tools for constructing digital twins, and the connections have already begun to be discussed in the literature—e.g. by Heber and Groll (Reference Heber and Groll2017); Schluse et al. (Reference Schluse, Priggemeyer, Atorf and Rossmann2018); Madni et al. (Reference Madni, Madni and Lucero2019); Jinzhi et al. (Reference Jinzhi, Zhaorui, Xiaochen, Jian and Dimitris2022); Michael et al. (Reference Michael, Pfeiffer, Rumpe and Wortmann2022); Olsson and Axelsson (Reference Olsson and Axelsson2023). However, we are seeking to simulate emergent behaviors, of the kind discussed in Section 2, which we will now return to.
3.3. Emergent behaviors
In the context of digital twins, the basic idea is to join components together to reconstruct the dynamic behavior of the combined system. The simplest case is joining two components.
In engineering, we make extensive use of numerical simulation tools that essentially reduce (i.e. break up) complex geometries and behaviors into an assemblage of simpler elements for which the behavior can be defined. These techniques, such as the finite element method, have evolved into sophisticated tools that are widely used to simulate the behavior of complex/complicated systems that cannot be captured using simpler modeling techniques (see e.g. Crisfield Reference Crisfield1997). The outputs from element-based methods are, in fact, emergent behaviors. These outputs are usually field quantities such as stress, displacement, flow rate, or temperature, which are approximated as a form of “self-organization” between the elements, acting within the overall element-based model. Essentially, the overall behavior arises from local interactions between the multiple elements.
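As a minimal sketch of this idea, the following Python fragment assembles a one-dimensional axial bar from identical two-node elements and solves for the nodal displacements. Each element only "knows" its own stiffness and its two neighbouring nodes; the displacement field along the bar arises from the assembly. The function names (`assemble_bar`, `solve`, `bar_displacements`) and all parameter values are illustrative assumptions, not taken from any particular finite element package.

```python
def assemble_bar(n_elements, length, EA):
    """Assemble the global stiffness matrix of a uniform axial bar."""
    n_nodes = n_elements + 1
    k = EA / (length / n_elements)  # stiffness of one two-node element
    K = [[0.0] * n_nodes for _ in range(n_nodes)]
    for e in range(n_elements):
        i, j = e, e + 1  # the two nodes shared with neighbouring elements
        K[i][i] += k
        K[i][j] -= k
        K[j][i] -= k
        K[j][j] += k
    return K


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x


def bar_displacements(n_elements=4, length=1.0, EA=1.0, tip_load=1.0):
    """Nodal displacements of a bar fixed at node 0 and loaded at its tip."""
    K = assemble_bar(n_elements, length, EA)
    f = [0.0] * (n_elements + 1)
    f[-1] = tip_load
    # Apply the fixed boundary condition by removing the first row and column.
    K_reduced = [row[1:] for row in K[1:]]
    return [0.0] + solve(K_reduced, f[1:])
```

For this simple load case the nodal values coincide with the analytical solution u(x) = Px/(EA), so the tip displacement can be checked against PL/(EA). This is a deliberately simple, linear example; it illustrates how a global field is reconstructed from local element interactions, not the richer forms of emergence discussed later.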
In addition to self-organization, there are other types of emergent behavior, and multiple authors have described how the various types might be categorized (see, for example, Ashby (Reference Ashby1956); Holland (Reference Holland2007); Frei and Serugendo (Reference Frei and Serugendo2012); Fernández et al. (Reference Fernández, Maldonado and Gershenson2014); Holland (Reference Holland2018); Tadić (Reference Tadić2019); Jensen (Reference Jensen2022) and references therein). Broadly speaking, the types of emergent behaviors range from relatively simple types, such as self-organization and synchronization (Jensen, Reference Jensen2022), through to evolutionary forms of emergence (Kauffman, Reference Kauffman2000). The ability to make predictions for emergent behaviors is a significant capability that is seen as a very desirable functionality (Gershenson, Reference Gershenson2013), including for digital twins. We will return to discuss how digital twins might be expected to produce such behaviors, especially for very complicated applications, in Section 4.3. Next, we consider the role of artificial intelligence for digital twins.
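To make the notion of synchronization as a "relatively simple" emergent behavior concrete, the sketch below uses the Kuramoto model of coupled phase oscillators. This model is our choice of illustration and is not discussed in the text above. Each oscillator has its own natural frequency and is pulled toward the mean phase of the population; the order parameter r (between 0 and 1) measures how aligned the phases are. The function names, the number of oscillators, the frequency range, and the coupling values are all assumptions made for the example.

```python
import math
import random


def order_parameter(phases):
    """Kuramoto order parameter r; values near 1 mean the phases are aligned."""
    n = len(phases)
    c = sum(math.cos(p) for p in phases) / n
    s = sum(math.sin(p) for p in phases) / n
    return math.hypot(c, s)


def simulate(coupling, n=20, steps=2000, dt=0.01, seed=0):
    """Integrate n all-to-all coupled oscillators with explicit Euler steps."""
    rng = random.Random(seed)
    omega = [rng.uniform(-0.5, 0.5) for _ in range(n)]          # natural frequencies
    theta = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n)]  # initial phases
    for _ in range(steps):
        c = sum(math.cos(t) for t in theta) / n
        s = sum(math.sin(t) for t in theta) / n
        # (1/n) * sum_j sin(theta_j - theta_i) = s*cos(theta_i) - c*sin(theta_i)
        theta = [
            t + dt * (w + coupling * (s * math.cos(t) - c * math.sin(t)))
            for t, w in zip(theta, omega)
        ]
    return order_parameter(theta)


r_uncoupled = simulate(coupling=0.0)
r_coupled = simulate(coupling=4.0)
```

With no coupling the oscillators drift independently, whereas with sufficiently strong coupling the population locks together and r approaches 1. No individual oscillator is instructed to synchronize; the collective state arises from the pairwise interaction rule.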
3.4. Artificial intelligence
The quest for artificial intelligence (AI) (as described, for example, by Nilsson (Reference Nilsson2009)) is multi-faceted, and has been driven by several different motivations. Those motivations include inspiration from human intelligence and other biological examples, the desire to create intelligent machines, and the application of AI to solve complex applied problems. There are multiple other facets, implementations, and deployments of AI, which we leave to the interested reader to explore—see for example Minsky (Reference Minsky1988); Nilsson (Reference Nilsson2009); Russell and Norvig (Reference Russell and Norvig2010); Haenlein and Kaplan (Reference Haenlein and Kaplan2019); Marcus (Reference Marcus2020) and references therein.
Russell and Norvig (Reference Russell and Norvig2010) use the unifying theme of intelligent agents in their comprehensive textbook on artificial intelligence. A current topic of interest is deep reinforcement learning, where agents are used (for example) to solve sequential decision-making problems, such as autonomous driving vehicles (see e.g. Kiran et al. (Reference Kiran, Sobh, Talpaert, Mannion, Al Sallab, Yogamani and Pérez2021)). Sequential decision-making problems are also highly relevant to digital twins, which by their nature are time evolving, and are also required to support a sequence of decision-making tasks.
Importantly for the digital twin paradigm, the AI work on agent-based methods has enabled more sophisticated multi-agent methods than previously developed either in complexity science or systems engineering (although there is now some cross-over between these topics (Vrabič et al., Reference Vrabič, Erkoyuncu, Farsi and Ariansyah2021)). Examples include techniques such as multi-agent reinforcement learning, where the agents take actions and receive feedback in a highly adaptive manner (Graesser and Keng, Reference Graesser and Keng2019; Kiran et al., Reference Kiran, Sobh, Talpaert, Mannion, Al Sallab, Yogamani and Pérez2021).
In very general terms, it could be said that symbolic AI (such as logical reasoning) was the earliest to mature. Despite the success of some aspects, such as expert systems (Krishnamoorthy and Rajeev, Reference Krishnamoorthy and Rajeev2018), it has been overtaken in recent years by sub-symbolic AI, which has become the dominant force in AI, particularly deep learning (LeCun et al., Reference LeCun, Bengio and Hinton2015; Goodfellow et al., Reference Goodfellow, Bengio and Courville2016) and most recently large language models (Teubner et al., Reference Teubner, Flath, Weinhardt, van der Aalst and Hinz2023). In the past few years, some AI experts have been pointing out the limitations of connectionism (Marcus, Reference Marcus2018), and there is a renewed interest in the possibility of combining the two approaches in the form of neurosymbolic AIFootnote 29 (Belle, Reference Belle2022).
For the purposes of our discussion, we note the following points regarding AI for digital twins. Firstly, both learning and reasoning are highly desirable functions that we often want to build into our digital twin applications, meaning that AI techniques are very important in this respect. In addition, digital twins can be viewed as a method of deployment for AI and its associated techniquesFootnote 30. There are multiple examples of this type of deployment (see, for example, DebRoy et al. (Reference DebRoy, Zhang, Turner and Babu2017); Farhat et al. (Reference Farhat, Chiementin, Chaari, Bolaers and Haddar2020); Kapteyn et al. (Reference Kapteyn, Knezevic and Willcox2020); Ritto and Rochinha (Reference Ritto and Rochinha2021); Tripura et al. (Reference Tripura, Desai, Adhikari and Chakraborty2023); Siyaev et al. (Reference Siyaev, Valiev and Jo2023)), and this is a topic we will return to later on. Finally, just like digital twins, AI still has no formally agreed overarching definition. In large part, this is because of the philosophical breadth of the topic, something which hopefully is conveyed by the preceding discussionFootnote 31.
3.5. Other methods
Lastly in this section, we would like to mention that there are multiple other communities of researchers and practitioners that have developed sophisticated methods for modeling highly complex and uncertain applications. Some overlap with AI and other fields mentioned above, and others have developed their own areas of endeavor. Examples (with just a few selected references) include dynamical systems theory (Kuznetsov, Reference Kuznetsov2004; Strogatz, Reference Strogatz2019), data assimilation (Evensen et al., Reference Evensen2009; Kutz, Reference Kutz2013), Bayesian statistics (Barber, Reference Barber2012; Särkkä, Reference Särkkä2013; Gelman et al., Reference Gelman, Carlin, Stern and Dunson2014; Kruschke, Reference Kruschke2014), data mining (Hastie et al., Reference Hastie, Tibshirani, Friedman and Friedman2009; Han and Kamber, Reference Han and Kamber2022), game theory (Jones, Reference Jones2000), ensemble modeling (Zhou, Reference Zhou2019), spatiotemporal modeling (Banerjee et al., Reference Banerjee, Carlin and Gelfand2014), agent-based modeling (Abar et al., Reference Abar, Theodoropoulos, Lemarinier and O'Hare2017; Zhang et al., Reference Zhang, Valencia and Chang2021b), statistical relational learning (Getoor and Taskar, Reference Getoor and Taskar2007; Belle, Reference Belle2022), asymptotic theory (Van der Vaart, Reference Van der Vaart2000), time series analysis (Hamilton, Reference Hamilton2020), adaptive and nonlinear control (Åström and Wittenmark, Reference Åström and Wittenmark1995; Fradkov et al., Reference Fradkov, Miroshnik and Nikiforov1999; Barlow, Reference Barlow2002; Wagg and Neild, Reference Wagg and Neild2015), information theory (MacKay, Reference MacKay2003), network science (Baker, Reference Baker2013), and optimization methods (Boyd and Vandenberghe, Reference Boyd and Vandenberghe2004), to name just a few.
4. Towards a philosophical framework for digital twins
“It ought to be remembered that there is nothing more difficult to take in hand, more perilous to conduct, or more uncertain in its success than to take the lead in the introduction of a new order of things.”—Niccolò Machiavelli, The Prince, 1532.
As we discussed in Section 1, the ambitions for digital twins are set very high across a very wide spectrum of possible applications. In practice, we need to manage these high expectations and make clear what are the possibilities and limitations to using digital twins. To this end, in this section we develop the foundations for a philosophical framework within which we can build, evaluate, and better understand specific instances of digital twins.
As noted by Machiavelli, introducing something new is fraught with potential difficulties, and we argue that a firm philosophical foundation is an essential part of the process. However, it is important to note that we are not the first to attempt this goal. For instance, Korenhof et al. (Reference Korenhof, Blok and Kloppenburg2021) reviewed and critically analyzed the dominant conceptualizations of digital twins in the academic literature. In doing so, they raise the question, “If a digital twin is expected to actively intervene in a physical entity, is it really only a representation?” Their answer is that digital twins should be treated as “steering representations” that are used to “direct a physical entity towards certain goals by means of multiple representations.” Their proposal has considerable merit, and should likely figure (in some form) in a fully articulated and developed philosophical framework, one that can also be used to support ethical reasoning and decision-making about the societal impacts of digital twins. However, we do not wish to take this argument as our starting point without first considering the fundamental purpose of a digital twin ourselves. In Section 2.3 we argued that utility, trust, and insight are the three key generic properties we want for digital twins. These three characteristics form the basis of our characterization of purpose.
Specifically, we take utility to mean context-specific usefulness that relates to the application at hand, and is expressed as a set of functional requirements within the contextual setting in which the digital twin operates. Here, the contextual setting relates to the specific properties of the physical twin, such as its geometry, materials, the environment in which it is located or deployed, and so forth. The functional requirements could be, for example, to support decisions, to learn patterns of behavior, or to develop more efficient ways of operation.
The attribute of (unbiased) trust is related to the uncertainty within the digital twin and is also connected to security, openness, and quality (Bolton et al., Reference Bolton, Butler, Dabson, Enzer, Evans, Fenemore, Harradence, Keaney, Kemp and Luck2018). Trust is therefore essential for supporting the functional requirements of the digital twin. Lastly, the role of insight is related to knowledge, but not just lists of facts: insight relates to enhanced understanding of the physical twin within the contextual setting. The insight(s) gained from the use of a digital twin could be some measurable improvement in understanding the behavior of the physical twin, or the learning acquired via the successful completion of a sequential decision-making problem(s) over time (such as those mentioned in Section 3.4).
Since the concept of digital twins was first suggested, there has been much discussion and debate over what exactly the definition of a digital twin actually is (see, for example, Negri et al. (Reference Negri, Fumagalli and Macchi2017); Miller et al. (Reference Miller, Alvarez and Hartman2018); Wright and Davidson (Reference Wright and Davidson2020); Wagg et al. (Reference Wagg, Worden, Barthorpe and Gardner2020); Arthur et al. (Reference Arthur, French, Ganguli, Kinard, Kraft, Marks, Matlik, Fischer, Sangid, Seal, Tucker and Vickers2020)). This is natural when the idea is new, but can be unhelpful to the overall debate at times. Therefore, in an attempt to give some additional clarity about digital twins, but without getting overly restricted by a technical definition (at least for now), a set of principles for digital twins is proposed here, based in part on the discussion above. While these principles fall short of establishing a full philosophical framework, they are anchored in the philosophical concepts discussed in the previous two sections and are set out in three categories: (a) what digital twins are, (b) what properties they should have, and (c) what they should enable.
We begin with what digital twins are. Digital twins are:
1. Holistic in nature, but may use reductionist ideas when appropriate, i.e. both the whole and the parts are considered important, in order to capture any heterogeneity;
2. Purpose driven, where the clearly articulated useful purpose (or set of purposes) is underpinned by a set of functional requirements;
3. Time evolving dynamic systems that can reflect changes in the physical twin that occur over time via updating and evolution of the digital twin;
4. Context-specific representations that are bespoke to an individual physical twin, and which can be both artefacts (objects) and/or processes within the contextual setting;
5. Counter-parsimonious, meaning not seeking simplicity for its own sake, but instead aiming to reflect the required level of complexity, although they may make use of parsimonious concepts when appropriate;
6. Reconstructivist, meaning they aim to reconstruct (some or all of) the behavior of a physical twin by assembling the components of the digital twin, including emergent behaviors; and
7. Biased, due to the philosophical worldviews of the communities that constructed them, but able to acknowledge the limitations that this brings.
Digital twins should have:
8. A set of components, which can include agents, models, networks, data sets, and other digital objects;
9. Access to real-world data, recorded/streamed from the physical twin, or its surrounding environment;
10. A means of dynamic assembly, so that the components can be connected, or otherwise integrated together in a time-dependent way;
11. An operational platform, consisting of software, hardware and network infrastructure, including a user interface, data storage and other computational resources;
12. A method for representing and updating knowledge that is shared between the users and the digital twin;
13. A time-dependent connectivity to the physical twin, usually via an internet-of-things (IoT) network or similar, so that data, control, and other signals can be exchanged between the twins; and
14. An integration architecture that enables components and/or other parts of the digital twin to interoperate and/or federate with each other, and in some cases entire other digital twins.
Digital twins should enable:
15. Outputs to be produced that relate to observed quantities of interest (QoIs) in the physical twin and to the functional requirements;
16. Trust in the outputs to be expressed through processes of assurance, including validation, verification, and/or error detection and correctionFootnote 32 in order to account for relevant forms of uncertaintyFootnote 33;
17. Inheritance of (at least) some of the properties of the components within the digital twin (e.g. object-property inheritance, described below);
18. Interaction, such that the components are able to interact with the aim of reconstructing emergent behavior(s);
19. Learning both from data (e.g. QoIs and outputs), and more broadly from the deployment of advanced techniques such as those from AI, statistics, dynamical systems, etc.;
20. Insights to be obtained that serve the purpose of the user, and maximize explainability and interpretability of the outputs; and
21. Exploitation of the insights to provide value (e.g. improved decisions, efficiency gains, etc.) and/or enable real-world actions to be taken such as control/scheduling actions for the physical twin.
These 21 principles incorporate the key attributes of a digital twin and capture the holism of systems engineering, emergent behaviors from complexity science, uncertainty analysis from statistics, time-evolution from dynamical systems theory, techniques from AI, control actions, and decision theories—amongst other things. We believe that such a set of principles is sufficiently versatile and universal to fit a wide range of digital twin applications, across multiple domains, whilst still capturing some of the most important specific aspects of digital twins.
However, to help justify this belief, we now consider how these philosophical principles can be applied to explain some common questions relating to digital twins.
4.1. Why is a digital twin not a model?
We will offer more than a single answer to this particular question, all of which can coexist with each other. The first is shown in Figure 2 and relates to the connectivity of the physical twin and the digital object. Kritzinger et al. (Reference Kritzinger, Karner, Traar, Henjes and Sihn2018) make the following distinctions between three concepts which are shown schematically in Figure 2:
1. Digital model: no connection between virtual and physical (Figure 2a). This is the “traditional” approach to modeling in science and engineering.
2. Digital shadow: data received from a connection (e.g. over an IoT network) with the physical twin is used to update and “shadow” the state of the physical twin (Figure 2b). In this way, the digital shadow will evolve over time to reflect changes that occur in the physical twin.
3. Digital twin: as for the digital shadow, but with the addition of control actions, or interventions (in the case of a system that cannot be directly controlled), being given over the network to the physical twin domain (Figure 2c).
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20250207130253812-0594:S2632673625000048:S2632673625000048_fig2.png?pub-status=live)
Figure 2. Schematic diagram of a common way to interpret the distinction between a digital model and a digital twin, showing (a) a digital model, (b) a digital shadow, and (c) a digital twin. This is a common interpretation found within the literature and helps explain why a digital twin that is “bidirectionally connected” with a physical twin is best seen as a broader “cyber-physical system” rather than two separate components (see for example Kritzinger et al., Reference Kritzinger, Karner, Traar, Henjes and Sihn2018).
The 21 principles set out above relate to digital twins, but digital models and shadows could also be represented by selecting fewer principles to apply.
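The model/shadow/twin distinction of Figure 2 can be sketched in a few lines of Python, where each class adds one direction of information flow to the previous one. This is only a toy illustration under stated assumptions: the class names mirror the three concepts, but `ToyPhysicalAsset`, the `cool_if_hot` policy, the temperature values, and the method names are all hypothetical and are not drawn from Kritzinger et al. or from any digital twin platform.

```python
class DigitalModel:
    """No connection: the state changes only if someone edits it by hand."""

    def __init__(self, state):
        self.state = dict(state)


class DigitalShadow(DigitalModel):
    """One-way flow: measurements from the physical twin update the state."""

    def receive(self, measurement):
        self.state.update(measurement)


class DigitalTwin(DigitalShadow):
    """Two-way flow: the digital side can also send actions back."""

    def __init__(self, state, policy):
        super().__init__(state)
        self.policy = policy  # maps the current state to an action

    def act(self):
        return self.policy(self.state)


class ToyPhysicalAsset:
    """Hypothetical physical twin with a single measured quantity."""

    def __init__(self, temperature):
        self.temperature = temperature

    def measure(self):
        return {"temperature": self.temperature}

    def apply(self, action):
        self.temperature += action


def cool_if_hot(state, limit=70.0):
    """Illustrative control rule: request cooling above the limit."""
    return -5.0 if state["temperature"] > limit else 0.0


asset = ToyPhysicalAsset(temperature=80.0)
twin = DigitalTwin({"temperature": None}, policy=cool_if_hot)
for _ in range(3):
    twin.receive(asset.measure())  # physical -> digital (shadowing)
    asset.apply(twin.act())        # digital -> physical (intervention)
```

The sketch says nothing about when the connection is established, whether data transfer is continuous or intermittent, or how several twins might federate, which is consistent with the limitations of this explanation discussed next.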
However, the model/shadow/twin explanation does not capture some aspects that we have discussed above relating to digital twins. Critics can point out that, using existing terminology, Figure 2a shows a model, Figure 2b shows an updated model, and Figure 2c shows a control system. In addition, the explanation given in relation to Figure 2 has little or no sense of timing or mechanisms. For example, when does the digital become connected to the physical? Is the data transfer to the shadow continuous or intermittent? Are the actions taken part of the digital twin or something separate? Another criticism is that Figure 2 does not show (or even anticipate) connections between digital twins, via federation.
Furthermore, it is difficult to understand the ideas of holism, or emergent behavior with the model/shadow/twin explanation. So, we believe it is useful to also suggest an additional explanation that can complement the rationale of Figure 2. This additional explanation relates to the use of models in digital twins, as we have described in this paper (e.g. as a combination of multiple digital objects). As a result, digital twins will have the property of object-property inheritance. Therefore, digital twins include models among their components, such that digital twins are more than just models (and models are not digital twins). In other words, a digital twin is something more than a model but can be used to perform functions that have been previously carried out using models, because it inherits the properties of the model. In general, object-property inheritance relates to all the components within a digital twin and will be explained in further detail in the next section.
4.2. What previously unseen results can we expect from a digital twin?
“It is the mark of an educated mind to rest satisfied with the degree of precision which the nature of the subject admits and not to seek exactness where only an approximation is possible”—Aristotle (384 BC–322 BC).
It will be fundamental to the purpose of a digital twin to establish whether the digital twin can produce an output that suits our particular purpose(s). As the quote from Aristotle reminds us, every output from a digital twin will most likely include (multiple) approximations, and we should be wary of seeking exactness beyond that which is possible. The “degree of precision,” as Aristotle puts it, relates to the fidelity of an output. However, before any attempt to assess fidelity can be made, we need to consider if a viable output for our particular purpose is possible.
Grieves and Vickers (Reference Grieves and Vickers2017) have considered how the outputs from digital twins could be used to anticipate the types of emergent behaviors that may arise. They proposed a categorization of outcomes for the digital twin that is shown in Figure 3a. Here there are four categories of outcome that depend on what the digital twin predicts and whether the predicted behavior was desirable in a design context (meaning the intended design) or undesirable (problematic and/or unwanted designs). This framework is then used iteratively to try to minimize the undesirable and unpredicted aspects as much as possibleFootnote 34.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20250207130253812-0594:S2632673625000048:S2632673625000048_fig3.png?pub-status=live)
Figure 3. Schematic diagram showing (a) how the outputs from a digital twin might be able to predict emergent behaviors proposed by Grieves and Vickers (Reference Grieves and Vickers2017), and (b) the “Rumsfeld” matrix.
However, this approach also suffers from the problem of the need to know in advance what to include in the digital twin to get the desired outcome. As pointed out by Kauffman (Reference Kauffman2000) for example, this is a particular problem in the field of emergent behavior. In fact, problems relating to prior knowledge are well known in other fields, such as the domain of uncertainty and risk management (Okashah and Goldwater, Reference Okashah and Goldwater1994; Lanza, Reference Lanza2000). The “Rumsfeld” matrix captures the key issue, as shown in Figure 3b.Footnote 35
In the Rumsfeld matrix, we create four categories based on what is known (i.e. what we know at the present moment) and what could be known (i.e. all possible knowledge, if we had a way to access it). It should be clear that if we do not know something at the present moment, then we cannot include it in our digital twin, and therefore we can never access the “unknown unknowns” categoryFootnote 36. Knowing in advance, for example by prescribing a specific solution space, is a practical necessity for modeling, but will exclude the more advanced behaviors, particularly evolutionary forms of emergence; see for example Tononi et al. (Reference Tononi, Boly, Massimini and Koch2016); Kauffman (Reference Kauffman2000) and references therein.
To take one example, emergent behaviors are often modeled using multiple agents that interact with each other according to a predefined set of “rules”, typically relating to the environment and their nearest neighboring agents (Jensen, Reference Jensen2022). The idea has already been explored in a digital twin context by several authors; see for example Croatti et al. (Reference Croatti, Gabellini, Montagna and Ricci2020); Zheng et al. (Reference Zheng, Psarommatis, Petrali, Turrin, Lu and Kiritsis2020); Vrabič et al. (Reference Vrabič, Erkoyuncu, Farsi and Ariansyah2021); Clemen et al. (Reference Clemen, Ahmady-Moghaddam, Lenfers, Ocker, Osterholz, Ströbele and Glake2021); dos Santos et al. (Reference dos Santos, Montevechi, de Queiroz, de Carvalho Miranda and Leal2022). So, although the emergent behaviors are not necessarily known in advance, the rules for the agents have to be prescribed in advance, and so the rules are known knowns. The emergence will be a product of the prescribed rules (as was the case for DeepMind's AlphaGo algorithm (Silver et al., Reference Silver, Huang, Maddison, Guez, Sifre, Van Den Driessche, Schrittwieser, Antonoglou, Panneershelvam, Lanctot, Dieleman, Grewe, Nham, Kalchbrenner, Sutskever, Lillicrap, Leach, Kavukcuoglu, Graepel and Hassabis2016; Chouard, Reference Chouard2016)), and so if we have never observed a particular type of interaction before, it cannot be included in the digital twin. Nor will it be in any of our previously recorded data sets, or associated data-based models.
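As a minimal illustrative sketch of this point (not taken from any of the works cited above), consider agents on a ring that each follow one hypothetical, prescribed rule: adopt the majority state among themselves and their two nearest neighbors. The rule, the function names, and the initial state are all assumptions made purely for illustration. Contiguous blocks of agreeing agents form without being specified directly, yet every behavior that appears is a consequence of the prescribed rule.

```python
def step(states):
    """Apply the prescribed local rule once to every agent (synchronous update)."""
    n = len(states)
    new_states = []
    for i in range(n):
        # Each agent sees only itself and its two nearest neighbors on the ring.
        local_sum = states[(i - 1) % n] + states[i] + states[(i + 1) % n]
        # Hypothetical rule: adopt the local majority state.
        new_states.append(1 if local_sum >= 2 else 0)
    return new_states


def run(states, steps):
    """Return the full history of agent states, starting from the initial state."""
    history = [list(states)]
    for _ in range(steps):
        history.append(step(history[-1]))
    return history


# Hypothetical initial condition: ten agents with binary states.
initial = [1, 0, 1, 1, 0, 0, 0, 1, 0, 0]
history = run(initial, steps=5)
final_state = history[-1]
```

The block of agreeing agents in `final_state` is a product of the rule alone. An interaction that is not encoded in `step` cannot occur, however long the simulation runs.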
With this in mind, let us consider what can be reasonably expected from digital twins in terms of emergent behaviors. Object-property inheritance can be interpreted as both related to individual components (objects) in the digital twin, and relational combinations of the components.Footnote 37 The relational combinations of the components are achieved using what we can call “dynamic assembly”—an example of which is described in the next section. Therefore, if a digital twin consists of n objects it would be expected to have a number (say d) of directly inherited properties that come from the n objects without any interaction between them. In addition, the digital twin would have a combinatoric number (say r) of relational properties, including any emergent behaviors, which are generated from the dynamic assembly. Note also that the combinatoric metric will depend on the specific context of the digital twin.
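To give a feel for how quickly r can grow, the following sketch enumerates candidate groupings of components under one simplifying assumption that is ours rather than part of the framework: each relational property is associated with a distinct subset of at least two components. Under that assumption, r is bounded by n(n-1)/2 if only pairwise interactions are assembled, and by 2^n - n - 1 if any subset may interact. The component labels and the function name are hypothetical.

```python
from itertools import combinations
from math import comb


def candidate_groupings(components, max_size=None):
    """List every subset of at least two components, up to max_size members."""
    n = len(components)
    max_size = n if max_size is None else max_size
    groups = []
    for k in range(2, max_size + 1):
        groups.extend(combinations(components, k))
    return groups


# Hypothetical digital twin with n = 4 components.
components = ["A", "B", "C", "D"]
n = len(components)

pairwise = candidate_groupings(components, max_size=2)
all_groups = candidate_groupings(components)

pairwise_bound = comb(n, 2)      # n(n-1)/2
full_bound = 2 ** n - n - 1      # all subsets of size >= 2
```

In practice, the context of the digital twin determines which of these candidate groupings are actually assembled, so the counts are upper bounds rather than predictions.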
A simplified schematic example for a series of digital objects (e.g. components) is shown in Figure 4, where dynamic assembly methods are used to obtain interactions between the components. In Figure 4, the directly inherited properties are shown to come from the components, and relational properties come from the dynamic assembly of the components. Both direct and relational properties can be then used as digital twin outputs.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20250207130253812-0594:S2632673625000048:S2632673625000048_fig4.png?pub-status=live)
Figure 4. Schematic diagram showing how the outputs from a digital twin might be created using a series of digital objects (e.g. the components of the digital twin). The directly inherited properties that come from each of the components are grouped together. The relational properties, such as any reconstructed or emergent behaviors, come from the process of dynamic assembly. Both the directly inherited and relational properties can be used to form digital twin outputs.
It is important to emphasize that all the emergent (and non-emergent) behaviors observed in digital twin outputs are contained in the categories of known knowns, known unknowns, and unknown knowns, shown in Figure 3b. The unknown unknowns, also shown in Figure 3b, are not accessible to the digital twin by definition, and could only become known through the addition of new information that is not available at the current time.
As a result, assuming that the known knowns category is already well understood, it is the known unknowns, and particularly the unknown knowns categories where value can be obtained from using digital twins. Note that we would expect to see more previously unseen results from an ecosystem of connected digital twins (Nativi et al., Reference Nativi, Mazzetti and Craglia2021). This is simply because of the nature of systems—the more connections there are, the more potential there is for emergence. We now consider an example of dynamic assembly.
4.3. How can emergent behaviors be predicted using a digital twin?
Emergent behavior can be reconstructed via interaction. This can be achieved using certain components in digital twins (e.g. models, agents, etc.) which can be dynamically assembled (e.g. joined together) as was shown schematically in Fig. 4.
We, therefore, consider a digital twin to be made up of a series of digital components that will be combined in such a way that they can reconstruct the time-evolving behavior of the physical twinFootnote 38. Dynamic assembly is interpreted here based on the idea of creating “connectors” such that the resulting connections lead to interactions between the components with the aim of reconstructing emergent behavior. More specifically we define dynamic assembly as a method for connecting the components together such that the subsequent interactions enable emergent behaviors, including time-dependent (e.g. dynamic) behaviors.
We are using the terminology dynamic assembly to include a range of methods via which DT components might be able to interact. These include, but are not limited to:
• Coupling of system equations describing the behavior of the DT component. E.g. two separate models are “coupled” via the states and parameters and/or boundary/initial conditions.
• Synchronisation of all or part of the states of two DT components. E.g. in a similar way to methods such as hybrid testing, where synchronization is used to achieve a joining effect; see for example Gonzalez-Buelga et al. (Reference Gonzalez-Buelga, Wagg, Wallace, Neild and Macdonald2005).
• Organisation of DT components. E.g. into networks and/or hierarchies or more complex structures such as holarchies; see for example Calabrese et al. (Reference Calabrese, Amato, Lecce and Piuri2010); Cardin et al. (Reference Cardin, Derigent and Trentesaux2018).
• Interoperation via a bespoke protocol to enable communication between DT components that are not using standardized formats. For example, this is often the case when legacy systems are present.
As an example of dynamic assembly, consider the case where it is desired to create a digital twin of the transport systems in a city. Ideally, we would include all modes of transport, but for argument’s sake, let us assume we had separate models of the road, rail, and pedestrian traffic in a specific area of the city. Running the models separately misses the interactions between transport modes, so we might want to create a method for dynamically assembling the road, rail, and pedestrian traffic models togetherFootnote 39. In order to do this, we create a “dynamically assembled” system, where information is passed between the three simulations to represent the interactions between the transport modes. As a result of the interactions between models, they (i) are more likely to represent the real world, and (ii) may lead to emergent behaviors; see for example discussions in Ambra et al. (Reference Ambra, Caris and Macharis2019); Busse et al. (Reference Busse, Gerlach, Lengeling, Poschmann, Werner and Zarnitz2021); Wolf et al. (Reference Wolf, Dawson, Mills, Blythe and Morley2022); Jafari et al. (Reference Jafari, Kavousi-Fard, Chen and Karimi2023).
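The transport example can be sketched in a few lines, under heavy simplifying assumptions. The three toy models below, their update rules, the numerical coefficients, and the “connector” logic inside `run` are all hypothetical and chosen only to show information being passed between simulations at each time step. They are not drawn from the works cited above. A single external disruption is applied to the rail model, and the simulation is run both with and without the connectors.

```python
def road_step(congestion, extra_demand):
    """Toy road model: congestion relaxes toward 0.3 and rises with extra demand."""
    return 0.5 * congestion + 0.15 + extra_demand


def rail_step(delay, disruption):
    """Toy rail model: delay (minutes) halves each step unless disrupted."""
    return 0.5 * delay + disruption


def pedestrian_step(walkers, displaced):
    """Toy pedestrian model: count relaxes toward 100 plus displaced travelers."""
    return 0.5 * walkers + 50.0 + displaced


def run(steps=10, assembled=True):
    """Advance all three models together; connectors are active only if assembled."""
    congestion, delay, walkers = 0.3, 0.0, 100.0
    history = []
    for t in range(steps):
        # External event acting on the rail model only (hypothetical).
        disruption = 10.0 if t == 2 else 0.0
        if assembled:
            # Connectors: rail delay adds road demand; congestion displaces walkers.
            extra_demand = 0.02 * delay
            displaced = 100.0 * (congestion - 0.3)
        else:
            extra_demand = 0.0
            displaced = 0.0
        congestion, delay, walkers = (
            road_step(congestion, extra_demand),
            rail_step(delay, disruption),
            pedestrian_step(walkers, displaced),
        )
        history.append((congestion, delay, walkers))
    return history


separate = run(assembled=False)
coupled = run(assembled=True)
peak_congestion_separate = max(c for c, _, _ in separate)
peak_congestion_coupled = max(c for c, _, _ in coupled)
peak_walkers_coupled = max(w for _, _, w in coupled)
```

When the models are run separately, the rail disruption never reaches the road or pedestrian models. When they are assembled, the same disruption propagates through both, which is the kind of cross-modal behavior that separate simulations miss.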
4.4. How can we assess the existence and uniqueness of digital twin outputs?
As we said above, it will be fundamental to the purpose of a digital twin that some type of output exists that is relevant to the context of the physical twin.Footnote 40 One example of an output is a chosen quantity of interest. In the study of differential equations, an important underlying concept is the idea of the existence and uniqueness of a solution to the problem (Hirsch and Smale, Reference Hirsch and Smale1974; Guckenheimer and Holmes, Reference Guckenheimer and Holmes1983; King et al., Reference King, Billingham and Otto2003). The concept asks the questions (1) does a solution exist?, and (2) if it does, is it a unique solution? If the solution is nonunique, then other solutions will exist that also satisfy the same defined problemFootnote 41.
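A standard textbook illustration of nonuniqueness is the initial value problem dy/dt = sqrt(|y|) with y(0) = 0, whose right-hand side is not Lipschitz continuous at y = 0. Both y(t) = 0 and y(t) = t²/4 (for t ≥ 0) satisfy it. The short check below, with function names chosen purely for illustration, confirms numerically that each candidate satisfies the equation to within finite-difference error, even though the two disagree for every t > 0.

```python
import math


def rhs(y):
    """Right-hand side of dy/dt = sqrt(|y|); not Lipschitz at y = 0."""
    return math.sqrt(abs(y))


def zero_solution(t):
    return 0.0


def parabola_solution(t):
    return t * t / 4.0


def max_residual(candidate, t_end=2.0, steps=200, h=1e-6):
    """Largest |dy/dt - rhs(y)| on a grid in (0, t_end], via central differences."""
    worst = 0.0
    for k in range(1, steps + 1):
        t = t_end * k / steps
        dydt = (candidate(t + h) - candidate(t - h)) / (2 * h)
        worst = max(worst, abs(dydt - rhs(candidate(t))))
    return worst


residual_zero = max_residual(zero_solution)
residual_parabola = max_residual(parabola_solution)
same_start = zero_solution(0.0) == parabola_solution(0.0)
differ_later = zero_solution(2.0) != parabola_solution(2.0)
```

A digital twin component built on such an equation could return either trajectory depending on the numerical scheme or initial perturbation, which is one concrete way that nonunique outputs might arise.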
Although the idea of existence and uniqueness is typically applied in a deterministic worldview, in the absence of a developed theory for digital twins, we consider how questions (1) and (2) could be applied to the case of digital twins in general. To widen the application beyond the deterministic realm, instead of a “solution” (which typically implies a precise answer to a specific set of equations), we will take the idea of an output from the digital twin.
In practical terms, there would appear to be two potential approaches (and at least one caveat) to determining the existence and uniqueness of digital twin outputs. The first approach is to rely on the object-property inheritance of the digital twin, so that if the underlying objects (components) in the digital twin have the property of existence and uniqueness, then the digital twin can also inherit those properties (under some defined conditions). For example, if the digital twin has an ordinary differential equation (ODE) as one component, and that ODE has solutions that exist and are unique, then the digital twin can also inherit those properties—see for example Han et al. (Reference Han, Niyato, Leung, Kim, Zhu, Feng, Shen and Miao2022); Area et al. (Reference Area, Fernández, Nieto and Tojo2022). The caveat is that the philosophical framework for differential equations is (almost always) deterministic, and so this will act as a limiting factor with this approach.
The second possible approach (either separately or in combination with the above) is to consider the behavior of the interconnections between components in the digital twin. It might be possible that the existence and uniqueness of digital twin outputs (e.g. the reconstructed behaviors) could be assessed using the information at the interface of components. Further work is needed to develop a more formal analysis relating to the existence of digital twin outputs.
Now turning to the question of uniqueness, it is perhaps obvious to state that digital twin outputs may or may not be unique. Nonuniqueness could be a major problem for digital twin users if they are expecting (or assuming) a unique output but do not obtain one. However, the precise nature of what is meant by the uniqueness of an output will depend on the context and components that make up the digital twin.
Finally, we note that nonuniqueness relates to a broader issue of spurious solutions and related problems such as missing solutions, and false emergent behaviors—this could be considered to be a failure mode of the digital twin. We will not consider these problems explicitly here, but we would need to consider the possibility of these outcomes when building a digital twin—see discussion in Grieves and Vickers (Reference Grieves and Vickers2017).
5. Conclusions and future directions for research
In this paper, we have explored and discussed key philosophical concepts that apply to the concept of digital twins—particularly, those that underpin the series of principles presented in Section 4, which could be used as the building blocks for a more complete philosophical framework. This discussion also enabled us to consider how the philosophical context could help define a purpose for a model, and it was concluded that utility, trust, and insight are the three key generic requirements of models that we wanted to extend to digital twins.
A key part of the digital twinning philosophy is representing complicated/complex systems. This was discussed in detail in Section 3, where we considered the limitations of traditional reductionist methodologies. We then discussed how systems engineering and complexity science had been used to attempt to overcome these limitations by adopting a more holistic worldview. In particular, we discussed the importance of modeling emergent behaviors, that cannot be captured in a reductionist paradigm. Importantly, it is interactions that lead to emergent behaviors, and these have to occur dynamically—depending on the exact context, we note that the environment might also influence the emergent behavior.
In Section 4 of the paper, we presented 21 principles set out in three categories; what digital twins are, what properties they should have, and what they should enable. We then used the 21 principles to consider some common questions that arise regarding digital twins. Namely, the questions were: Why is a digital twin not a model? What previously unseen results can we expect from a digital twin? How can emergent behaviors be simulated using a digital twin? How can we assess the existence and uniqueness of digital twin outputs? We do not claim to have provided definitive answers to these questions, rather we have used the philosophical principles to frame the questions in a way that might help provide more insight and understanding of the questions and the associated topics they relate to.
In concluding this paper, we draw together some further comments and open research questions.
5.1. Further comments and open questions
As a reflection of some of the key points raised in this paper, we offer the following further comments that lead to some open research questions.
1. Potential limitations of model-dependent realism. In Section 2, we mentioned the concept of model-dependent realism, which commits us to the following three beliefs/attitudes:
(a) Pragmatism: a digital twin (or model) is deemed successful if it is able to explain and predict phenomena according to some validation criteria (e.g. making observations). The issue of realism versus non-realism is effectively side-stepped.
(b) Utility as an over-arching value for a digital twin (or model): the new emphasis is on the utility of the digital twin output(s) rather than on finding a digital twin (or model) that is ontologically “true” in terms of representing the behavior(s) of the physical twin.
(c) Pluralism: as there may be multiple digital twin output(s) that adequately describe the same phenomena, or have similar levels of utility, the choice between different twins may depend on additional (so-called extra-theoretic) virtues, which also links to the issue of uniqueness of outputs.
Furthermore, model-dependent realism is developed from a scientific worldview that is focused entirely on explaining the physical behavior of the Universe we live in. It could be considered that the “direction-of-fit” is one-way. In other words, the definition of utility is focused primarily on “representation” or “description”. For engineering problems, we also need to consider other factors, such as (i) the consequences of utility on subsequent actions taken, such as decisions and interventions in the real world, and (ii) the possibility that there is no physical system to represent, if we are trying to engineer something completely new. In both these cases, the argument for a philosophy built on model-dependent realism is more difficult to make, and leaves open the question of whether there is a more appropriate philosophical approach. We also note that, more formally, the utility, trust, and insight requirements could be contextualized using a more detailed philosophical analysis such as that proposed by Douglas (Reference Douglas2013), which distinguishes between internal consistency (a minimal criterion) and external consistency (an ideal desideratum, presuming general confidence in other scientific theories). While internal consistency is a minimum requirement for acceptance of any scientific theory, external consistency is not, as it depends on confidence in other theories and external bodies of knowledge.
2. Emergence is counter-parsimonious. As was described in Section 4.2, a digital twin will only be able to exhibit behaviors within the constraints of the choices and assumptions that have been made during its construction. Therefore, the less simplification there is in the process of constructing the digital twin, the greater the likelihood that a wider range of emergent behaviors will be exhibited in the subsequent digital twin outputs. The aim stated in Principle 5 (and systems engineering Hypothesis H2 from Section 3.2) is to represent the observed complexity rather than seek parsimony. Furthermore, if the digital twin maker has been too parsimonious (and/or biased in worldview), there is a possibility of creating a digital twin that is only capable of reinforcing their own (or an inherited) prejudicial expectations. The exact relationship between emergent behaviors and parsimony is an open question.
3. Purpose dictates your parsimony. Following on from the comments above, digital twins developed for different purposes will enable different levels of parsimony to be used. Therefore, care should be exercised if transferring a digital twin developed for one purpose into a new domain or purpose. One way to help mitigate these effects is to make use of error detection and correction (EDAC) (MacKay, Reference MacKay2003). Similar comments relate to the inter-operation or federation of digital twins that might have been constructed using different levels of assumed parsimony. It is an open question how such systems might be integrated systematically.
4. Validation of digital twins. Three comments can be made regarding the validation of digital twin outputs:
(a) In general the validation of a digital twin is context-specific and will be relevant to a specific applicationFootnote 42. In some cases, validation can be defined as a function of utility, where the metric of validity relates to the output of some utility function. This situation enables a strong connection with the model-dependent realism philosophy.
(b) In some applications, the accuracy of a digital twin output does not serve well as a universal metric for validation. For example, from a control perspective, the stability and robustness of a predictive model might be more important than the tracking accuracy of any particular output.
(c) In Section 4, we presented a framework for defining what potential outputs can be expected from a digital twin. When a system is relatively “simple,” it is often possible to know in advance what behavior to expect, and therefore validate the output quite easilyFootnote 43. Cases where we cannot know what to expect in advance will obviously be more challenging to validate, and there is ongoing research as to how this might be most effectively achieved.
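To make comment (a) above concrete, the following sketch treats validity as the average of a utility function evaluated over paired predictions and observations. The particular utility (1 if a prediction falls within a tolerance, 0 otherwise), the acceptance threshold, the function names, and the data values are all hypothetical. A real application would define its own utility in terms of the decisions that the digital twin output supports.

```python
def utility(predicted, observed, tolerance):
    """Hypothetical utility: 1.0 if the prediction is within tolerance, else 0.0."""
    return 1.0 if abs(predicted - observed) <= tolerance else 0.0


def validity(predictions, observations, tolerance):
    """Validity metric defined as the mean utility over all paired outputs."""
    scores = [utility(p, o, tolerance) for p, o in zip(predictions, observations)]
    return sum(scores) / len(scores)


def is_valid(predictions, observations, tolerance, threshold):
    """Accept the digital twin output if mean utility reaches the threshold."""
    return validity(predictions, observations, tolerance) >= threshold


# Hypothetical digital twin outputs and matching measurements.
predictions = [1.0, 2.0, 3.0, 4.0]
observations = [1.1, 2.5, 2.9, 4.05]

score = validity(predictions, observations, tolerance=0.2)
accepted_lenient = is_valid(predictions, observations, tolerance=0.2, threshold=0.7)
accepted_strict = is_valid(predictions, observations, tolerance=0.2, threshold=0.9)
```

The same outputs pass or fail depending on the tolerance and threshold chosen, which reflects the point that validation is context-specific rather than a universal property of the digital twin.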
Next, we propose several areas for future research development.
5.2. Future directions for research
1. Human factors. Broadly this area of research, as it applies to digital twins, includes topics such as (i) the role of humans in designing and building digital twins (partially discussed in Section 2.2), (ii) how human users interface with digital twins and act on the outputs they receive, and (iii) digital twins as instances of sociotechnical systems that include humans in some way (e.g. medicine or social systems). Early work in this area includes Nguyen (Reference Nguyen2022); Lin et al. (Reference Lin, Chen, Ali, Nugent, Ian, Li, Gao, Wang, Wang and Ning2022); Sun et al. (Reference Sun, Tian, Fu, Geng and Liu2021); Fan et al. (Reference Fan, Zhang, Yahja and Mostafavi2021).
2. Ethical, legal, and societal issues. In their original context as tools for product engineering, digital twins raised a (relatively) narrowly circumscribed set of ethical, legal, and societal issues (e.g. safety compliance). However, as digital twins are now used increasingly to represent not just products or objects, but living entities and systems (from the cellular level to whole ecosystems), they enable new forms of knowledge generation (that is, principles 19 and 20: learning and insights obtained from the digital twin) and means for interacting with and influencing the coupled physical systems (principle 21: exploiting the derived value of the relevant insights). A number of papers have already explored a variety of normative issues that arise in the context of digital twins, especially in high-stakes and fault-intolerant environments such as health and healthcare (Kuersten, Reference Kuersten2023; Huang et al., Reference Huang, Kim and Schermer2022; Tigard, Reference Tigard2021; Popa et al., Reference Popa, Van Hilten, Oosterkamp and Bogaardt2021; Korenhof et al., Reference Korenhof, Blok and Kloppenburg2021; Braun, Reference Braun2021). In combination with current and emerging frameworks for regulation, governance, and assurance, these analyses provide significant value for identifying and mitigating possible risks that could arise when deploying and using digital twins within society (e.g. unintended behaviors caused by model drift, data privacy, and security violations). There is a lot to explore here, and the presence of a robust and comprehensive philosophical framework could provide a systematic means for grounding and evaluating the myriad normative issues associated with digital twins.
3. Methods for dynamic assembly. In practical terms, one of the most interesting areas for future research is methods that enable dynamic assembly of digital objects within a digital twin. As we have already noted, dynamic assembly is the method by which we can recreate interaction within the digital twin, and thereby reconstruct emergent behaviors. There are already techniques, such as agent-based modeling including intelligent agents, and heterogeneous multi-agents, as discussed and reviewed in Sections 3.3 and 3.4. Such models have the potential to recreate the type of multi-level interactions that occur in a complex system, including socio-economic systems (see for example Yossef Ravid and Aharon-Gutman (Reference Yossef Ravid and Aharon-Gutman2022); Wang et al. (Reference Wang, Qin, Li, Yuan and Wang2020); Okita et al. (Reference Okita, Kawabata, Murayama, Nishino and Aichi2019); Tadić (Reference Tadić2019)). However, creating appropriate “connectors” for heterogeneous sets of digital objects is an open area of research. Several of the most obvious potential methods were listed in Section 4.3. The scope for new developments in this area is significant.
4. The role of knowledge. While this relates to the topic of human factors listed above, it is significant enough to warrant a separate discussion point—in particular, the role of knowledge and insight, in supporting subsequent actions taken, such as decisions. One way we can distinguish this topic from human factors is the idea of removing the process from the human, and automating the action/decision process, possibly using some form of artificial intelligence. From a practical perspective, a starting point would be to establish with more rigor what knowledge means in the context of a digital twin, particularly linking to topics such as knowledge representation, inference, and model interpretability.Footnote 44 This would extend and complement much of the existing work, which primarily focuses on ontologies (e.g. Nguyen, Reference Nguyen2022; Akroyd et al., Reference Akroyd, Mosbach, Bhave and Kraft2021; Singh et al., Reference Singh, Shehab, Higgins, Fowler, Reynolds, Erkoyuncu and Gadd2020; West, Reference West2011).
5. Logic versus learning. In Section 3.4 we touched upon symbolic and neurosymbolic AI but did not explicitly discuss the types of logical approaches that could be applied to digital twins. There has long been a philosophical discussion about how logic, learning, and probability interrelate (see for example the discussion in Belle (Reference Belle2022)). This is an interesting topic that raises several relevant questions for digital twin research. For example, is there an underlying logical methodology relating to digital twins, or is the logic dependent on the context? How are logic and learning combined? How does the logic relate to a “top-down” versus “bottom-up” approach to creating a digital twin?Footnote 45 It should also be noted that statistical relational learning and hyperdimensional computing are novel approaches that enable knowledge representation, logic, and learning to be brought together (Getoor and Taskar, Reference Getoor and Taskar2007; Thomas et al., Reference Thomas, Dasgupta and Rosing2021). These approaches offer the possibility of bringing logic more formally into the digital twin operation. The exact details of how this might work are an open question.
Data availability statement
There is no new data or code associated with this current work.
Acknowledgments
The authors would like to acknowledge the support of The Alan Turing Institute Research and Innovation Cluster for Digital Twins (TRIC-DT). In addition, the authors would also like to thank the DTNet+ Leadership Team members, the TRIC-DT Co-Directors, Mark Girolami, Ben MacArther, Rebecca Ward, Tim Rogers, Nikos Dervilis, Matthew Bonney, Xiaoxue Shen, Ziad Gauch, Matthew Tipuric, Prajwal Devaraja, and Saeid Taghizadeh for insightful discussions about these topics.
Author contribution
DW created the initial framework and drafted the initial version of this paper. All authors have contributed to the content and read and approved the final manuscript.
Funding statement
This work was funded by the Alan Turing Institute Research and Innovation Cluster for Digital Twins. The authors would also like to acknowledge the funding support of UKRI via the grants EP/Y016289/1, EP/R006768/1, and the UKRI BRAID program for funding the Trustworthy and Ethical Assurance of Digital Twins (TEA-DT).
Competing interest
The authors declare that they have no conflict of interest.
Ethical standard
The research meets all ethical guidelines, including adherence to the legal requirements of the study country.