There are four reasons why integrative experiment design as developed by Almaatouq et al. should be extended to thought-experiments.
First, while thought-experiments differ from “real” experiments in being performed only mentally (Kuhn, 1977), there is a close connection: “thought experiment is experiment (albeit a limiting case of it)” (Sorensen, 1992, p. 3). One tempting idea is that a thought-experimenter performs an experiment on herself: She runs her own cognitive processes off-line with a certain hypothetical input and then reports the outcome. For instance, in the moral machine experiment (Awad et al., 2018), participants are faced with a certain decision situation. They take the hypothetical situation as an input for running their decision-making process off-line (without actually taking a decision) and report the outcome.
Second, the practice of thought-experimenting is widespread in science, including the social and behavioural sciences. Here, thought-experiments play a role in producing, justifying, and refuting scientific theories. Typical examples from the social sciences are Keynes's beauty contest scenario (Kornberger & Mantere, 2020), DuBois's colour line scenario, or Addams's scenario where only women are allowed to vote (Hill, 2005). Moral machine experiments started from ethical thought-experiments and extended them. Participants were invited to perform a thought-experiment, and their responses were treated as data in a real social experiment. The widespread use of thought-experiments calls for methodological reflection.
Third, the practice of thought-experimenting aggravates the incommensurability problems with real experiments. The limited description of a thought-experiment leaves many details unspecified. Thought-experiments are performed only occasionally, one after the other, without being coordinated with each other or with theorizing. At most they follow the scheme “question → theory → hypothesis → experiment → analysis → revision to theory → repeat,” as do, for instance, Gettier experiments in epistemology (Gettier, 1963; Praëm & Steglich-Petersen, 2015). They are not systematically controlled for variables like implicit biases of the experimenter. There is no systematic variation in their explicit input. Completeness with respect to the target theory is not prioritized. The trolley case and its numerous variations are a clear example of how these problems affect thought-experiments (Dewitt, Fischhoff, & Salin, 2019; Foot, 1967).
These issues are not sufficiently discussed in the literature. The literature addresses what a thought-experiment is (Sorensen, 1992), how thought-experiments are processed (Nersessian, 2007), whether and how they can provide evidence without empirical observation (Brown, 1991), what their function and scope are (Praëm & Steglich-Petersen, 2015), how they relate to arguments (Norton, 2004), and how to formalize them (Dohrn, 2018; Williamson, 2007). Where incommensurability problems are discussed, it is mostly with a sceptical twist (Machery, 2017). What is lacking is a more constructive methodology for when and how to perform thought-experiments in a coordinated manner that contributes to building and testing theories.
Fourth, thought-experimenting can be a useful or even indispensable device for integrative experiment design. It arguably plays a role in coming up with such a design, and the perspective on thought-experiments naturally supplements the design space.
Having outlined four reasons for giving thought-experiments a role in integrative experiment design, we distinguish two directions of relevance. Thought-experiments are relevant to integrative experiment design and vice versa.
Integrative experiment design proceeds from a research question to a design space of theories and possible experiments with regard to relevant dimensions of independent variables. Thought-experiments are relevant to this procedure.
First, thought-experiments are heuristically useful in ruling out certain theories and focusing on others. For instance, Galilei's thought-experiments ruled out the Aristotelian theory of motion and oriented scientists towards Newtonian mechanics. Moreover, Almaatouq et al. suggest that candidate dimensions of relevant variables are taken “either from the literature or from experience” (target article, sect. 3.1, para. 6). Yet before the advent of integrative design, neither literature nor experiments were guaranteed to systematically survey relevant dimensions. Thought-experimenting provides an efficient heuristic for identifying these dimensions. Take the moral machine experiment: One has to check both variables explicitly fixed (rich vs. poor, old vs. young, etc.) and inexplicit variables that may have an influence on the decision (nationality, education, religion, etc.). One heuristic step in identifying such variables is a mental simulation of their impact on one's own decision making.
Second, integrative design proceeds by identifying a universe of possible experiments in a domain of inquiry. Since not all of these experiments can actually be performed, one has to define an order of priority and a coordinating design plan for the experiments that will be. Beyond the plan of real experiments, there are those possible experiments in the universe which are dismissed as irrelevant, but there are also those which are relevant but not actually performed for various reasons: too complicated, too costly, unethical, or simply too numerous. Sometimes one can reliably anticipate the result of an experiment without having to actually perform it. Therefore, among the relevant experiments which are not part of the planned real experiments, one may select those to be performed as thought-experiments. These thought-experiments also should be ordered according to their priority and coordinated with regard to the explicit and implicit parameters to be controlled for.
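The partition just described can be illustrated with a minimal sketch. It is not a procedure taken from Almaatouq et al.; the dimension names, the numerical relevance score, the feasibility flag, and the budget are all hypothetical devices introduced here only to make the three groups (planned real experiments, candidate thought-experiments, dismissed experiments) concrete.

```python
from dataclasses import dataclass
from itertools import product

# Hypothetical dimensions, loosely modelled on the moral machine example.
DIMENSIONS = {
    "wealth": ["rich", "poor"],
    "age": ["old", "young"],
    "group_size": [1, 5],
}


@dataclass(frozen=True)
class Candidate:
    condition: tuple   # one point in the design space
    relevance: float   # illustrative priority score in [0, 1] (an assumption)
    feasible: bool     # can it actually be performed (cost, ethics, ...)?


def enumerate_conditions(dimensions):
    """List every combination of dimension values (the 'universe')."""
    names = sorted(dimensions)
    return [
        tuple(zip(names, values))
        for values in product(*(dimensions[n] for n in names))
    ]


def plan(candidates, threshold, budget):
    """Split candidates into real, thought-experiment, and dismissed groups.

    Candidates below the relevance threshold are dismissed. The remaining
    ones are visited in descending order of relevance: feasible ones fill
    the real-experiment budget, and the rest become candidates for
    thought-experiments, still ordered by priority.
    """
    relevant = sorted(
        (c for c in candidates if c.relevance >= threshold),
        key=lambda c: -c.relevance,
    )
    real, thought_candidates = [], []
    for c in relevant:
        if c.feasible and len(real) < budget:
            real.append(c)
        else:
            thought_candidates.append(c)
    dismissed = [c for c in candidates if c.relevance < threshold]
    return real, thought_candidates, dismissed
```

In practice the relevance ordering would come from theoretical judgement rather than a single number, and only some of the thought-experiment candidates would be selected; the sketch merely shows that the same design space and the same ordering can govern both kinds of experiment.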
The last paragraph already shows how integrative experiment design is relevant for the practice of thought-experimenting. Until now, thought-experiments have been largely performed in an anarchic manner. The proposed extension of integrative experiment design lends guidance to performing them methodically. It surveys potential thought-experiments, subjects them to an order of priority, and fixes parameters for explicit variation and parameters to be controlled for. It renders experimental settings comparable and reduces problems with incommensurability.
We note in closing that the anarchic mode of performing thought-experiments may sometimes be useful in playing a disruptive role (Stuart, 2020). Such a role may not be captured by integrative design, but integrative design does not exclude it either.
Financial support
This research received no specific grant from any funding agency, commercial, or not-for-profit sectors.
Competing interest
None.