Movement scientists have proposed grounding the relation between prosody and gesture in ‘vocal-entangled gestures’, defined as biomechanical linkages between upper limb movement and the respiratory–vocal system. Focusing on spoken language negation, this article identifies an acoustic profile with which gesture is plausibly entangled, specifically linking the articulatory behaviour of onset consonant lengthening with forelimb gesture preparation and facial deformation. This phenomenon was discovered in a video corpus of accented negative utterances from English-language televised dialogues. Eight target examples were selected and examined using visualization software to analyse the correspondence of gesture phase structures (preparation, stroke, holds) with the negation word’s acoustic signal (duration, pitch and intensity). The results show that as the syllable-onset consonant lengthens (voiced alveolar /n/ = 300 ms on average), with pitch and intensity increasing (e.g. ‘NNNNNNEVER’), the speaker’s humerus rotates and the palm pronates/adducts while his or her face distorts. Different facial distortions, furthermore, were found to be entangled with different post-onset phonetic profiles (e.g. vowel rounding). These findings illustrate whole-bodily dynamics and multiscalarity as key theoretical proposals within ecological and enactive approaches to language. Bringing multimodal and entangled treatments of utterances into conversation has important implications for gesture studies.