Book contents
- Frontmatter
- Dedication
- Contents
- Foreword
- Preface
- Part I Sound Analysis and Representation Overview
- Part II Systems Theory for Hearing
- Part III The Auditory Periphery
- Part IV The Auditory Nervous System
- Part V Learning and Applications
- 24 Neural Networks for Machine Learning
- 25 Feature Spaces
- 26 Sound Search
- 27 Musical Melody Matching
- 28 Other Applications
- Bibliography
- Author Index
- Subject Index
- Plate section
27 - Musical Melody Matching
from Part V - Learning and Applications
Published online by Cambridge University Press: 28 April 2017
Summary
I hope my critics will excuse me if I conclude from the opposite nature of their objections that I have struck out nearly the right path. As to my Theory of Consonance, I must claim it to be a mere systematisation of observed facts (with the exception of the functions of the cochlea of the ear, which is moreover an hypothesis that may be entirely dispensed with). But I consider it a mistake to make the Theory of Consonance the essential foundation of the Theory of Music, and, I had thought that this opinion was clearly enough expressed in my book. The essential basis of Music is Melody.
—On the Sensations of Tone, Hermann Ludwig F. Helmholtz (1870)

This chapter draws on material from the 2012 paper "The Intervalgram: An Audio Feature for Large-scale Melody Recognition" by Thomas C. Walters, David A. Ross, and Richard F. Lyon (Walters et al., 2013).
In this chapter, we review a system for representing the melodic content of short pieces of audio using a novel chroma-based representation known as the "intervalgram," a summary of the local pattern of musical intervals in a segment of music. We introduced chroma as pitch within an octave in Section 4.7. The intervalgram is based on a chroma representation derived from the pitchogram, the temporal profile of the stabilized auditory image. Each intervalgram frame is made locally key invariant by means of a "soft" pitch transposition to a local reference. Intervalgrams are generated for a piece of music using multiple overlapping windows, and these sets of intervalgrams form the basis of a system for detecting identical melodies across a database of music. Using a dynamic-programming-like approach to compare a reference song against the song database, performance was evaluated on a cover-song dataset. A first test of an intervalgram-based system on this dataset yields a precision at top-1 of 53.8%, with a precision–recall curve that shows very high precision up to moderate recall, suggesting that the intervalgram identifies the easier-to-match cover songs in the dataset with high robustness. The intervalgram is designed to support locality-sensitive hashing, such that an index lookup from each single intervalgram feature has a moderate probability of retrieving a match, with relatively few false matches.
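The core idea in the summary above — a local interval pattern made key invariant by a "soft" transposition to a reference chroma — can be sketched in a few lines. This is a minimal illustration, not the book's implementation: the function name, window parameters, and the choice of a single reference frame are assumptions, and the soft transposition is realized here as a circular cross-correlation between each frame's chroma vector and the reference, so that every candidate key contributes in proportion to its weight in the reference rather than committing to a single hard key estimate.

```python
import numpy as np

def intervalgram(chroma, center, width=16):
    """Sketch of an intervalgram-like feature (hypothetical helper).

    chroma: (T, 12) array of chroma vectors (one per time frame).
    center: index of the frame used as the local reference.
    width:  number of frames taken on each side of the reference.

    Returns a (12, 2*width) array whose rows are musical intervals
    (semitones relative to the reference) and whose columns are
    time offsets within the window.
    """
    ref = chroma[center]                               # local reference chroma
    frames = chroma[center - width:center + width]     # local window
    out = np.zeros((12, frames.shape[0]))
    for t, frame in enumerate(frames):
        for k in range(12):
            # "Soft" transposition: weight interval k by how strongly
            # each pitch class appears in the reference, summed over
            # all 12 circular shifts of the frame's chroma.
            out[k, t] = np.dot(ref, np.roll(frame, -k))
    return out
```

Because both the reference and the window frames shift together under a key change, circularly transposing the input chroma leaves the output unchanged, which is the local key invariance the chapter describes.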
Human and Machine Hearing: Extracting Meaning from Sound, pp. 467–480. Publisher: Cambridge University Press. Print publication year: 2017.