Embodied models of language comprehension assume that words become associated with sensorimotor experiences during initial word learning. To test this assumption, adult participants learned artificial words as labels for novel objects in a multisensory environment. In a word learning phase, novel objects were located in the participant’s upper or lower visual field, and participants learned the objects’ names by interacting with them. In a test phase, participants responded to the color of the words with either an upward or a downward arm movement in a Stroop-like paradigm. Responses were fastest when the movement direction was compatible with the word’s referent location (i.e., the vertical location of the novel object during the learning phase). This finding suggests that sensorimotor experiences become associated with words during initial word learning. The results and their implications for language learning are discussed.