Recent embodied theories of meaning known as ‘simulation semantics’ posit that language comprehension engages, or even amounts to, mental simulation. What is meant here by ‘language comprehension’, however, deviates from the perspectives on interpersonal communication adhered to by researchers in social psychology and interactional linguistics. In this paper, we outline four alternative perspectives on comprehension in spoken interaction, each of which highlights factors that have remained largely outside the current purview of simulation theories. These perspectives characterize comprehension in terms of (i) striving for inter-subjective conformity; (ii) recognition of communicative intentions; (iii) prediction and anticipation in a dynamic environment; and (iv) integration of multimodal cues. By contrasting these views with simulation theories of comprehension, we identify fundamental differences in the kind of process comprehension is assumed to be (passive and event-like versus active and continuous) and in the kind of stimulus language is assumed to be (comprising unimodal units versus being multimodal and distributed across conversational turns). Finally, we discuss potential points of connection between simulation semantics and research on spoken interaction, and touch on some methodological implications of an interactive and multimodal reappraisal of simulation semantics.