Controversies in economics often fizzle out unresolved. One reason is that, despite their professed empiricism, economists find it hard to agree on the interpretation of the relevant empirical evidence. In this paper I will present an example of a controversial issue first raised and then solved by recourse to laboratory experimentation. A major theme of this paper, then, concerns the methodological advantages of controlled experiments. The second theme is the nature of experimental artefacts and of the methods devised to detect them. Recent studies of experimental science have stressed that experimenters are often concerned merely with determining whether a certain phenomenon exists or not, or whether, when, and where it can be produced, without necessarily engaging in proving or disproving any theoretical explanation of the phenomenon itself. In this paper I shall be concerned mainly with such a case, and focus on the example of preference reversals, a phenomenon whose existence was until quite recently denied by the majority of economists. Their favourite strategy consisted in trying to explain the phenomenon away as an artefact of the experimental techniques used to observe it. By controlled experimentation, as we shall see, such an interpretation has been discredited, and preference reversals are now generally accepted as real. The problem of distinguishing an artefact from a real phenomenon is related to methodological issues traditionally discussed by philosophers of science, such as the theory-ladenness of observation and Duhem's problem. Part of this paper is devoted to clarifying these two philosophical problems, and to arguing that only the latter is relevant to the case in hand. The solutions to Duhem's problem devised by economic experimentalists will be presented and discussed. I shall show that they belong in two broad categories: independent tests of new predictions derived from the competing hypotheses at stake, and ‘no-miracle arguments’ from different experimental techniques delivering converging results despite their being theoretically independent.