VOICE WITHOUT SPEAKER: HUMAN SPEECH SYNTHESIS IN ACOUSTIC INSTRUMENTAL CONTEXTS

Andrew Chen

doi:10.1017/S0040298223000098

VOICE WITHOUT SPEAKER: HUMAN SPEECH SYNTHESIS IN ACOUSTIC INSTRUMENTAL CONTEXTS

Published online by Cambridge University Press: 12 July 2023

Andrew Chen

Article contents

Abstract
References

Get access

Rights & Permissions

Abstract

This article provides an overview of compositional practices that invoke representations of the human voice by acoustic means, without the presence of actual human speakers, and, in identifying the limitations and deficiencies of extant practice, proposes a new method or methodological basis for achieving the same phenomenon, to different ends. Clarence Barlow's seminal synthrumentation method forms a central point of discussion; when considered alongside the work of, among others, Mayuzumi, Grisey and Ablinger, it becomes clear that, in existing synthrumental (or instrumental synthesis) practices, the innate inharmonic components of target material for reproduction have largely been discarded or overlooked. After a brief overview of the basic linguistic and perceptual features of speech, a new method designed to evoke whispered speech, with inharmonicity in mind, is demonstrated.

Type: RESEARCH ARTICLE
Information: Tempo , Volume 77 , Issue 305 , July 2023 , pp. 60 - 71

DOI: https://doi.org/10.1017/S0040298223000098 [Opens in a new window]
Copyright: Copyright © The Author(s), 2023. Published by Cambridge University Press

Access options

Get access to the full version of this content by using one of the access options below. (Log in options will check for institutional or personal access. Content may require purchase if you do not have access.)

References

¹ Pareidolia is sometimes also used to refer to the broader misrecognition of any discrete object where it does not exist.

² Ellison, Barbara and William, Thomas Bey, Sonic Phantoms, ed. López, Francisco (New York and London: Bloomsbury, 2022), pp. 19–22Google Scholar.

³ Cheetham, Marcus, ‘Editorial: The Uncanny Valley Hypothesis and beyond’, Frontiers in Psychology, 8, no. 1738 (2017), p. 4CrossRef Google Scholar PubMed.

⁴ Frith, Simon, ‘Afterword’, in The Relentless Pursuit of Tone: Timbre in Popular Music, ed. Fink, Robert, Latour, Melinda and Wallmark, Zachary (New York: Oxford University Press, 2018), pp. 373–75Google Scholar.

⁵ For a more in-depth contextualisation of Auto-Tune's usage in popular music production practice and its connections with other methods of vocal modification mentioned here, see Provenzano, Catherine, ‘Auto-Tune, Labor, and the Pop-Music Voice’, in The Relentless Pursuit of Tone: Timbre in Popular Music, ed. Fink, Robert, Latour, Melinda and Wallmark, Zachary (New York: Oxford University Press, 2018), pp. 159–76Google Scholar.

⁶ Andrew Chen, ‘Synthrumentation, Revisited: Towards a New Method of Additively Synthesising Speech in Acoustic Instrumental Contexts’ (master's thesis, Royal College of Music, 2022), doi: 10.24379/RCM.00002280.

⁷ Sachs, Curt, The Rise of Music in the Ancient World, East and West (New York: Dover, 2008), pp. 19–23Google Scholar; Thaut, Michael H., ‘Music as Therapy in Early History’, in Music, Neurology and Neuroscience: Evolution, the Musical Brain, Medical Conditions, and Therapies, ed. Altenmüller, Eckart, Finger, Stanley and Boller, François (Amsterdam: Elsevier, 2015), pp. 146–49Google Scholar.

⁸ Ravel, Maurice, Daphnis et Chloé (Paris: Durand, 1913), p. 52Google Scholar.

⁹ Jessie Fillerup, ‘Purloined Poetics: The Grotesque in the Music of Maurice Ravel’ (Ph.D. thesis, University of Kansas, 2009), pp. 263–64.

¹⁰ Clarence Barlow quoted in Kaske, Stephan, ‘A Conversation with Clarence Barlow’, Computer Music Journal, 9, no. 1 (1985), p. 27Google Scholar. For more in-depth technical explanations of Barlow's technique, see: Poller, Tom Rojo, ‘Clarence Barlow's Technique of “Synthrumentation” and Its Use in Im Januar am Nil’, TEMPO, 69, no. 271 (2015), pp. 7–23CrossRef Google Scholar; and Clarence Barlow, liner note for Musica Algorithmica. 2018, Ensemble Modelo62, pp. 5–6. as well as various other Barlow writings.

¹¹ Barlow, Clarence, ‘On the Spectral Analysis of Speech for Subsequent Resynthesis by Acoustic Instruments’, Forum Phoneticum, 66 (1998), p. 184Google Scholar.

¹² Barlow, liner note for Musica Algorithmica, pp. 5–6.

¹³ Chen, ‘Synthrumentation Revisited’, pp. 24–25. ‘Inharmonic’ is used here to refer generally to the complementary set of non-harmonic frequencies in a given spectrum. Some theorists, including Barlow himself, prefer to make a distinction between inharmonic overtones and noise spectra within this category.

¹⁴ Poller, ‘Clarence Barlow's Technique of “Synthrumentation”’, pp. 9, 22.

¹⁵ Nicholas Donin, ‘Sonic Imprints: Instrumental Resynthesis in Contemporary Composition’, in Musical Listening in the Age of Technological Reproduction, ed. Gianmario Borio (Burlington: Ashgate, 2015), p. 326. I am grateful to Julian Anderson for suggesting ‘synthrumentation’ to me as a useful, concise term to refer to any kind of sound resynthesis by acoustic means without the target source present, as it avoids any possible confusion between ‘instrumental synthesis’ or ‘re-synthesis’ and other, more conventional, electronic-based methods; ‘instrumental synthesis’, for example, might also refer to an electronic synthesis of instrumental sound.

¹⁶ Toshiro Mayuzumi quoted in Judith Ann Herd, ‘The Neonationalist Movement: Origins of Japanese Contemporary Music’, Perspectives of New Music, 27, no. 2 (1989), p. 137.

¹⁷ Yuriko Takakura, ‘A Visual Analysis Methodology for Music Compositional Process with Sound Resynthesis’ (Ph.D. dissertation, Keio University, 2019), p. 51.

¹⁸ François-Xavier Féron, ‘Gérard Grisey: première section de Partiels (1975)’, Genesis, 31 (2010), p. 93.

¹⁹ Yves Krier, ‘Partiels, de Gérard Grisey, manifestation d'une nouvelle esthétique’, Musurgia, 7, nos 3–4 (2000), pp. 162–63.

²⁰ Peter Ablinger, ‘Phonorealism’, https://ablinger.mur.at/phonorealism.html (accessed 1 March 2023).

²¹ Peter Ablinger quoted in Donin, ‘Sonic Imprints’, p. 326.

²² G. Douglas Barrett, ‘Window Piece: Seeing and Hearing the Music of Peter Ablinger’, https://ablinger.mur.at/docs/barrett_window_piece.pdf (accessed 1 March 2023), p. 5.

²³ Ibid.

²⁴ Winfried Ritsch, ‘Robotic Piano Player Making Pianos Talk’, Proceedings of the 8th Sound and Music Computing Conference, Padova (2011), pp. 395–396.

²⁵ Robert Hasegawa, ‘Timbre as Harmony – Harmony as Timbre’, in Oxford Handbook of Timbre, ed. Emily I. Dolan and Alexander Rehding (Oxford: OUP, 2021), p. 537.

²⁶ ‘Auditory Illusions: Hearing Lyrics Where There Are None’, www.youtube.com/watch?v=ZY6h3pKqYI0 (accessed 28 February 2023); and ‘Shrek but the ENTIRE MOVIE is converted to MIDI’, www.youtube.com/watch?v=wcehaxidJZk (accessed 28 February 2023).

²⁷ Jonathan Harvey, Speakings (London: Faber, 2008), p. iv.

²⁸ The Orchidée software was named by IRCAM after Clarence Barlow's Orchideæ Ordinariæ.

²⁹ Gilbert Nouno, Arshia Cont, Grégoire Carpentier and Jonathan Harvey, ‘Making an Orchestra Speak’, Proceedings of the 6th Sound and Music Computing Conference, Porto (2009), p. 280.

³⁰ Ville Pulkki and Matti Karjalainen, Communication Acoustics: An Introduction to Speech, Audio and Psychoacoustics (Chichester: Wiley, 2015), pp. 90–91.

³¹ Chen, ‘Synthrumentation Revisited’, pp. 37–43.

³² Ibid., pp. 44–47.

³³ Jonathan Harrington and Steve Cassidy, Techniques in Speech Acoustics. Vol. 8. Text, Speech and Language Technology (Dordrecht: Springer Netherlands, 1999), pp. 1–4.

³⁴ Chen, ‘Synthrumentation Revisited’, pp. 51–62.

³⁵ Ibid., pp. 55–59.

³⁶ Ibid.

³⁷ These prior examples are different from those shown in this article.

Article contents

VOICE WITHOUT SPEAKER: HUMAN SPEECH SYNTHESIS IN ACOUSTIC INSTRUMENTAL CONTEXTS

Abstract

Access options

References

Save article to Kindle

Save article to Dropbox

Save article to Google Drive

Reply to: Submit a response

Your details

You have entered the maximum number of contributors

Conflicting interests