Published online by Cambridge University Press: 07 June 2004
The mid-frequencies and bandwidths of formants 1–5 were measured at the vowel targets and at 0.01 s before and 0.01 s after the targets, for vowels in a 100-word list read by five male and five female speakers. This yielded 3390 spectrum specifications of 10 variables each. Each of the six Polish vowel phonemes was represented approximately the same number of times. The original 3390 × 10 data matrix was processed by probabilistic neural networks to classify the spectra with respect to (a) vowel phoneme, (b) speaker identity, and (c) speaker gender. For (a) and (b), networks given additional input information from another independent variable were also used, as were appropriately normalized versions of the numerical data matrix. In the multi-speaker design, mean phoneme classification scores in the testing sets were around 95%; mean speaker-dependent scores for the phonemes varied between 86% and 100%, with two speakers scoring 100% correct. Individual voices were identified correctly 95% to 96% of the time, and classification of the spectra by speaker gender was practically 100% correct.
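For readers unfamiliar with the classifier named above, the following is a minimal sketch of a probabilistic neural network in its usual textbook form: a Gaussian kernel is centred on every training vector, the kernel responses are averaged per class, and the class with the highest average wins. It is not the authors' implementation. The function name `pnn_classify`, the smoothing parameter `sigma`, the labels "A" and "B", and the synthetic 10-variable vectors are all assumptions made for illustration; none of the study's data is reproduced.

```python
import math
import random


def pnn_classify(train_x, train_y, x, sigma=1.0):
    """Classify x with a textbook probabilistic neural network.

    Pattern layer: one Gaussian kernel per training vector.
    Summation layer: mean kernel response per class.
    Decision layer: the class with the largest mean response.
    sigma is a hypothetical smoothing parameter, not a value from the study.
    """
    sums = {}
    counts = {}
    for xi, yi in zip(train_x, train_y):
        # Squared Euclidean distance between the input and this training vector
        d2 = sum((a - b) ** 2 for a, b in zip(xi, x))
        sums[yi] = sums.get(yi, 0.0) + math.exp(-d2 / (2.0 * sigma ** 2))
        counts[yi] = counts.get(yi, 0) + 1
    scores = {label: sums[label] / counts[label] for label in sums}
    return max(scores, key=scores.get)


# Synthetic illustration only: two made-up classes of 10-variable vectors,
# standing in for 10-variable spectrum specifications.
random.seed(0)
N_VARS = 10
train_x, train_y = [], []
for label, centre in (("A", 0.0), ("B", 3.0)):
    for _ in range(20):
        train_x.append([random.gauss(centre, 0.5) for _ in range(N_VARS)])
        train_y.append(label)

near_a = pnn_classify(train_x, train_y, [0.0] * N_VARS)
near_b = pnn_classify(train_x, train_y, [3.0] * N_VARS)
print(near_a, near_b)
```

In practice the input variables would normally be put on a comparable scale before the distances are computed, since frequencies and bandwidths span different ranges; the sketch omits that step, and the choice of `sigma` strongly affects the result.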