In the audio-visual Stroop matching task, participants compare one dimension of a Stroop stimulus (e.g., the color of a written word) to a second stimulus (e.g., a spoken word) and indicate whether the two match or mismatch. Slower responses on certain trials may arise from conflict between color representations (semantic conflict) or from conflict between the responses evoked by task comparisons (response conflict). The contribution of these conflicts has been investigated with color-word distracters. This is the first study to explore how two types of first- and second-language words affect audio-visual matching. Native French speakers performed a bilingual Stroop matching task in which intermixed French (L1) and English (L2) color words (Experiment 1) and color associates (Experiment 2) were presented in congruent and incongruent colors simultaneously with spoken French color words. Participants were instructed to indicate whether the spoken word “matches” or “mismatches” the font color while ignoring the meaning of the written word. Interestingly, results on the critical “mismatch” trials were similar for French and English words. Responses were fastest on trials in which task comparisons activated fewer response alternatives, supporting the assumption of the response conflict account.