This article describes how anticipatory rounding gestures in [iC(CCCC)y] sequences are perceived visually, in light of the characteristics of their production (articulatory configurations, temporal data, and kinematic events). Productions of two French speakers were analyzed to obtain the data needed to interpret the results of a perception test that presented truncated visual sequences following the gating paradigm. The results indicate that the perceptually effective portion of the gesture usually begins when a significant velocity peak is observed. In contrast, if the sequence has no prominent velocity peak, the rounded vowel can be recognized only when the labial configuration is closer to the articulatory target. The results can be interpreted on the basis of general models of movement perception, in this case representational momentum.