1. Introduction
Ascribing ‘structure’ to music is a useful way of describing and conceptualising how music changes and evolves over time. Conventionally, composers create structure in music by defining the progression of musical ideas, sections, sequences and events within a work to formulate a musical narrative that adheres to their creative intentions. In such cases, musical works exist within a single structural realisation, meaning the narrative progression of musical ideas remains the same every time it is heard. This notion has been challenged by various works throughout history, such as the musical dice games of the eighteenth century (Hedges 1978), the open form works of the twentieth century such as Brown’s Available Forms I (Brown 1961) or Stockhausen’s Klavierstück XI (Stockhausen 1956), and many minimalist pieces such as Terry Riley’s In C (Riley 1964) – all of which rely on performer interpretation or rearranging written notation into different structures. However, with the ever-developing integration of digital technologies in music-making, this evolution towards structurally plastic musical works is greatly expanding. Interactive music apps such as Björk’s Biophilia enable music to change according to user interaction; video games commonly rearrange pre-composed music dynamically to adapt to gameplay events (Zdanowicz and Bambrick 2019); and generative music systems can create or manipulate music via algorithmic processes (Herremans et al. 2017) or via deep learning generative AI models such as OpenAI’s Jukebox (Dhariwal et al. 2020), Meta AI’s MusicGen (Copet et al. 2023) and Google’s MusicLM (Agostinelli et al. 2023). No longer is music bound to a single structural realisation; it may be liberated to exist in more versatile, alive and dynamic forms that can freely take on new shapes unimpeded by strict compositional decisions. This freedom from fixed musical structure is referred to here as structural plasticity, which can be understood as the ability for the components of a musical work (e.g., events, ideas, sequences, figures, patterns) to be varied in how and when they are presented. This shift towards structural plasticity not only broadens the compositional palette but also invites listeners to experience music that is itself plastic and alterable, thus reshaping the way we perceive and interact with music.
This article presents a specific approach to integrating structural plasticity into music composition formulated out of our own creative experimentations with methods commonly employed in video game music and many interactive music apps. This is referred to here as modular composition, which may be defined as a method of composing flexible and interactive musical works by creating a collection of musical ideas, phrases, sequences and progressions, termed ‘modules’, and designing a system that dynamically constructs these modules into cohesive structures. This approach involves two highly interconnected creative processes:
1. The creation of disjointed segments of musical content (modules).
2. The creation of a system that dynamically restructures musical content.
In this context, ‘musical content’ is used in the broadest sense and encompasses any kind of sonic material designed to embody a musical intention. A module, at its most basic level, may be defined as an individual pre-composed musical idea, sequence, phrase, or progression belonging to a broader collection of modules that constitute a modular musical work. System in this case refers to an executable program that handles the playback of modules. The distinguishing factor of this approach is the focus on achieving flexible and dynamic music via variably sequencing and layering broader musical components (e.g., phrases, progressions, sequences) rather than via the generation of music at a micro level; for example, by generating individual pitches, tones, rhythmic values, chords, and so on (Figure 1).
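To make the module/system distinction concrete, the sketch below models both entities in Python. It is our own illustration rather than an implementation from the works discussed; all field and function names are hypothetical.

```python
# A minimal sketch (our own illustration, not the works' actual implementation)
# of the two entities defined above: a pre-composed module and the system that
# handles module playback. All names are hypothetical.
from dataclasses import dataclass, field

@dataclass
class Module:
    name: str          # identifier within the collection
    audio_file: str    # path to the rendered audio for this phrase/idea
    duration_s: float  # length of the rendered audio in seconds

@dataclass
class ModularWork:
    modules: list[Module] = field(default_factory=list)

    def play(self, module: Module) -> None:
        # A real system would hand the file to an audio engine (e.g., via
        # middleware such as Wwise); here we only record the decision.
        print(f"playing {module.name} ({module.audio_file})")
```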
Both creative processes – musical content creation and music system creation – have been used to varying extents throughout history to create plastic music structures. For instance, in the cases mentioned earlier, eighteenth-century musical dice games, open form compositions, many minimalist pieces and (to some extent) musique concrète involved the creation of disjointed musical material as well as the creation of rules that dictated how material could be reconstructed and performed. Today, however, digital technology has brought about a plethora of instances in which music exhibits structural plasticity, from the dynamic scores of many video games to a world of intelligent digital music systems designed to generate music out of programmed rules and algorithmic processes. In the latter case, it is worth noting that creative practices do not necessarily involve composition directly, but rather involve the modelling of composition itself (Collins 2008b).
The modular compositional process outlined in this article is an investigation into the creative implications of combining two seemingly contrasting creative processes: traditional music composition (in the form of modules) and digital music system design (resulting in the program that dynamically combines modules). The specific focus of this experimentation, however, is to arrive at a more flexible and semi-autonomous form of composition in which music does not exist in a single unchanging state but rather requires active shaping and interaction to be experienced. Rather than adopting a strictly anthropocentric perspective of a musical work existing only as a product of human decision, various decisions regarding a work’s structure are intentionally relinquished such that the reshaping of that work by the system or the audience is an inherent aspect of how it exists.
This article outlines a framework of modular composition developed from creative experimentations conducted with the intention of understanding the compositional opportunities afforded by the creation of music as a collection of rearrangeable modular components. The framework itself is not intended to be applicable to any particular musical style or aesthetic, but rather to showcase some of the unique compositional outcomes afforded by a modular approach to music creation, which in this case are directly inspired by minimalist music and dynamic video game music. An analysis of two interactive musical works is conducted to demonstrate the creative implications of this approach. The analysis concentrates on the structural considerations of each work and how applying modular composition in alternate ways can lead to distinct musical outcomes.
2. Background
In contextualising how modular composition relates to the current discourse surrounding contemporary music-making, it is worth overviewing the various areas where structural plasticity in music has previously emerged and how digital technology has expanded this area of the current music-making landscape.
2.1. Non-digital modular music
While computers have unquestionably expanded the means by which music can be created, manipulated and interacted with, composers throughout history have demonstrated the ability to create structurally plastic musical works through written notation and performance instructions. Perhaps the earliest recorded examples of this are the musical dice games of the eighteenth century, such as The Minuet and Polonaise Composer by Johann Philipp Kirnberger or the Musikalisches Würfelspiel commonly attributed to Mozart (Hedges 1978). Such works had no inherent structure but instead required audiences to formulate their own structure by cutting and pasting measures of a notated score based on the rolling of dice. In the twentieth century, a number of experimental composers moved towards structural variability by integrating indeterminate elements into how pieces could be performed: for example, via graphic notation open to interpretation, such as in Feldman’s Projections pieces (Boutwell 2012) or Cage’s Variations I–VIII (1958–78), or by providing only text-based instructions, such as in Cardew’s Schooltime Special (1968). Another approach observable in many works is for the various sections of music to be modular, in that the order of musical events can be rearranged in some way. Examples include Stockhausen’s Klavierstück XI (1956), in which 19 brief musical sections may be played in any order; Brown’s Available Forms I (1961), which allows a conductor to freely guide an ensemble through any part of a six-page score; and Cowell’s Mosaic Quartet (Cowell 1935), which contains five movements with no set sequential order. Of a similar disposition are a number of minimalist works such as Riley’s In C (1964) or the original version of Shaker Loops (1978), in which music is structured into multiple looping modules that can be variably sequenced and layered with each other. The modular nature of these types of works showcases not only the unique ability for a musical work to vary significantly with each performance but also the capability for a work to be endowed with a level of changeability such that it can be reshaped beyond the direct control of the composer.
In the structurally indeterminate pieces of this period, the performer takes on a unique role: no longer just a performer, but also a collaborator invited to creatively contribute to the work itself (Eco 1989; Harries 2013). Further, the composition itself is no longer strictly delineated but constitutes a type of musical boundary within which varying levels of freedom and expression can be exercised; the composer actively chooses to relinquish control over a work’s structure to forces beyond themselves. In such cases, however, with the exception of the musical dice games, it is difficult to relinquish this power to anyone but an experienced instrumentalist, and it is not readily possible for audiences to participate in this compositional collaboration. These non-digital instances of modularity in composition showcase the creative affordances of allowing a work’s structure to be plastic, but they raise the question of how much further this concept could be taken when utilising digital technology.
2.2. Practices in video game composition
In responding to the interactive aspect of video games, composers have developed a new range of compositional practices heavily integrated with game technology that afford varying levels of structural plasticity depending on creative needs or intentions. The video game framework has been incorporated into a variety of experimental compositional projects, from musical puzzle games solved by performing an augmented drumkit (Michalakos 2021), to virtual environments that encourage competitive composition (Studley et al. 2020), and even to adaptive soundtracks that accompany players within a physical escape room (Délécraz 2023). However, within more commercial settings, music primarily functions to enhance the players’ emotional experience (Stevens 2021; Michelmore 2021) despite the challenges posed by interactivity. As such, music must, in a sense, narrativise player actions while also satisfying the aesthetic expectations (e.g., instrumentation, style, mood) imposed by a game’s overall design and emotional intentions.
Dynamic music in video games commonly involves the creation of music as separate events that can be rearranged and modified during gameplay. These events, which often exist as rendered digital audio, have been referred to as ‘segments’ (Aska 2017), ‘stems’ (Michelmore 2021) or ‘modules’ (Medina-Gray 2016; Zdanowicz and Bambrick 2019), as they will be named hereafter. The methods involved in rearranging and modifying these modules can be broken down into two general categories: vertical methods and horizontal methods. Both techniques are discussed in greater detail in section 3.2, but to overview, vertical methods generally refer to the addition, subtraction or substitution of musical layers to make changes to the overall musical arrangement/texture, while horizontal methods refer to changes made to how music progresses over time, that is, the sequencing of musical events. These methods have been discussed in depth by various video game composers (Paul 2013; Phillips 2014; Sweet 2015; Zdanowicz and Bambrick 2019; Michelmore 2021) and many ludomusicologists (Collins 2008a; Summers 2016; Aska 2017; Medina-Gray 2019). The benefit of using digital audio to present sound and music within games is that sonic content can be meticulously crafted according to the stylistic needs of the game while still exhibiting dynamic behaviour by being alterable in how and when constituent audio modules are heard. While the use of audio (a fixed form of media) limits the flexibility of music compared with algorithmic approaches, the favouring of audio-based techniques in modern video games (Plut and Pasquier 2020) implies that not only are these techniques suitable for most adaptive music situations, but they also offer certain advantages. These advantages include greater approachability, better stylistic control, and the ability to work within computer processing and memory restrictions, which can outweigh the benefits of highly flexible, algorithmically driven music.
Integrating these audio-based methods has traditionally involved programming dynamic behaviour directly within the game engine. However, within the last decade, audio middleware programs such as Wwise (Audiokinetic 2023), FMOD (Firelight Technologies 2023), Elias 4 (Elias Software 2023) and ADX2 (CRI Middleware Co. Ltd 2024) have emerged as more accessible and reliable solutions for integrating sound and music into games. While these programs provide the tools for creating dynamic music, their video-game-oriented design arguably inhibits their potential application to other forms of art media, such as enhanced instrument design or sound installations.
A modular approach to game scoring has been integrated into thousands of games, and discussions concerning its value as a compositional approach are commonplace within the domain of video game development. However, owing to the close relationship between gaming and modular composition, the latter is rarely discussed on its own terms. The modular compositional approach outlined in this article is highly influenced by dynamic game music practices and seeks to understand its implications for composition detached from considerations defined by the game context.
2.3. Interactive music apps
In a parallel domain to video games, interactive apps that run on smartphones and tablets have hosted an eclectic array of interactive musical works that feature, to varying extents, a level of plasticity in their structures.
In referencing his CEMS system, Joel Chadabe notes that the duality of his system as both a composition and an instrument also holds true for many music apps (Chadabe 2015). Apps from Brian Eno such as Bloom (Eno and Chilvers 2008), Trope (Eno and Chilvers 2009) and Scape (Eno and Chilvers 2012) may be considered types of instruments in that they allow audiences to generate their own soundscapes based on how they interact with them. However, the specific ways users can interact with these apps and the range of sounds they can produce give each app a distinct identity, meaning one could also consider them individual compositions. Other apps that fall into this category include Thicket by Joshue Ott (2010), Bubble Harp by Scott Snibbe (2011a) and Borderlands Granular by Chris Carlson (Carlson and Wang 2012). Composition in this context leans more towards the system programming and interface design through which musical content can be created than towards the creation of the content itself.
Another category of interactive music apps comprises those that invite the audience to participate within an auditory world of the composer’s own making. Björk’s Biophilia app (Snibbe 2011b) and Radiohead’s Polyfauna app (Pyke 2014) both feature musical content created by their respective artists and invite the audience to add to or shape it by interacting within different types of virtual environments. The variPlay system similarly allows artists to consolidate their music into an app format from which listeners are invited to move between alternate mixes and arrangements as the music is playing (Paterson et al. 2019). Location-based apps such as National Mall (Bluebrain 2011) and the Daoplayer (Hazzard and Greenhalgh 2019) follow similar principles to adaptive video game music and recombine recorded audio events to provide musical experiences based on where the user is and how they move through specific locations. Music in these cases is specifically crafted by the composers/artists and then situated within an interactive system from which audiences are provided with different levels of agency to add to, alter or explore that music. However, with the exception of the Daoplayer, many of these systems rarely enable variability beyond textural alterations. Compared with video games, interactive music apps that incorporate composed music are limited in their musical variability, leaving room for further exploration into how an interactive system can use pre-composed musical material to produce works with a high degree of structural plasticity.
2.4. AI in music: composition and digital music systems
In the context of this discussion, digital music systems are those that generate music via a computer-based algorithmic process. The function and methods behind these systems vary greatly (Herremans et al. 2017), and there is increasing interest in the implementation of deep learning techniques to generate music (Briot et al. 2020; Civit et al. 2022; Ji et al. 2023), among other forms of machine learning (Roberts et al. 2019; Ens and Pasquier 2020; Pachet et al. 2021). Of particular relevance to this article is the plethora of intelligent music systems designed to engage in compositional activities.
Emerging within the fields of Music Metacreation and Human–Computer Interaction are numerous digital music systems designed for creative collaboration (D’inverno et al. 2020). From learning and replicating a composer’s style (Lupker 2021), to responding to live improvised performance (Erdem et al. 2022), to generating multitrack MIDI sequences (Ens and Pasquier 2020; Dong et al. 2022), there is a noticeable trend towards using AI as a collaborative creative tool. Notably, plugins such as Google’s Magenta Studio (Roberts et al. 2019) and Sony CSL’s Flow Machines (ibid.) use machine learning to generate new MIDI-based musical ideas as a source of creative inspiration for musicians. There is also an emerging trend of AI music tools that are less oriented towards collaboration and instead involve the full generation of musical works. Systems such as OpenAI’s Jukebox (Dhariwal et al. 2020), Meta’s MusicGen (Copet et al. 2023) and Google’s MusicLM (Agostinelli et al. 2023) can be prompted with a genre, artist or even a full text description to generate complete audio tracks based on that information. For these types of systems, the human user plays a less active role in music creation, instead guiding the creation process by providing descriptive prompts that the system subsequently uses to generate new tracks. However, in the cases outlined thus far, artificially intelligent music systems exist as tools for music creation rather than as musical works themselves.
With the growing integration of digital technology and programming in music-making, the term ‘musical work’ may extend to systems that generate music. This is apparent in the range of programming languages designed for creating algorithmically generated music, such as Max/MSP, PureData, SuperCollider or Csound. Within these environments, instead of creating musical content directly, composers create works by modelling composition itself (Collins 2008b); they create musical works as systems, and these systems make music. Further examples of this concept can be found within the growing field of affectively driven algorithmic composition, in which systems are designed to generate music with specific emotional qualities and affective intentions (Williams 2018; Williams et al. 2020). Additionally, within the field of Music Metacreation there has been extensive experimentation with the creation of quasi-autonomous music agents designed to complete creative tasks such as composition, live accompaniment, content generation and arrangement (Tatar and Pasquier 2019; Carnovalini and Rodà 2020). The act of composing, in these cases, is the act of designing and programming the systems that make music. Structural plasticity is apparent in these systems in that musical structure (and indeed musical content) is generated in real time and may be guided towards different structures by altering the system itself.
3. Modular Compositional Approach
The following compositional framework outlines the main components, methods and considerations regarding the creation of modular musical works. This includes overviewing the constitution of a music module, vertical and horizontal arrangement methods, and considerations regarding system design. This framework is developed from creative experimentations regarding modular composition and is primarily informed by existing theorisation of dynamic video game music. It is important to note that the framework is not intended to be an all-encompassing method of creating plastic musical structures in any style or aesthetic; rather, it is modelled to distinguish our own approach from that of video game music and interactive music apps, and to create unique musical works that embody our understanding of structural plasticity.
3.1. Music module
A musical module – or simply module – may be defined as an individual pre-composed musical idea, sequence, phrase or progression belonging to a broader collection of modules that constitute a modular musical work. The content of a module is expected to be designed such that spontaneous and variable musical interactions occur as the module is layered and/or sequenced with other modules belonging to the greater collection. This also implies that modules are sonically complementary to the other modules in the collection. Further, this collection, together with the ways in which modules rearrange and interact, is what defines the modular composition itself.
Since the variable sequencing of modules is a core aspect of what enables a modular work to be structurally plastic, the duration of a module is generally expected to range from approximately half a second to a minute. This ensures that modules can follow one another frequently, allowing the musical progression to be more changeable and thus showcasing a greater degree of structural plasticity within the work.
Within our own creative experimentation (presented in section 4), music modules were created in the form of individual audio files. This choice was made for several reasons, the most notable of which is the research focus on structural plasticity as driven by the rearrangement of broader musical components such as musical sequences, phrases and progressions, which can easily be represented by digital audio. Another significant motivation was the fact that recorded audio can be drawn from multiple sources, including human performances on acoustic instruments, sonically versatile virtual instruments, digital or analogue synthesised sounds, and field recordings, to name a few. This diversity of potential sound sources provides a broad sonic palette that can be accessed through relatively common music-making tools and is challenging to replicate through synthesis alone. In our case, modules were composed, produced and rendered into audio using Cubase Pro 11, and audio sources included a variety of virtual instrument VSTs as well as recordings of our own acoustic instrument performances.
3.2. Horizontal and vertical methods
Considering the inherent separation of musical material present in this modular framework, the methods by which musical content is combined into new structures carry much compositional significance. The methods outlined here can be broken down into two categories: horizontal and vertical.
Horizontal methods refer to changes made to the succession of musical ideas. This essentially refers to how modules are sequenced and the ways in which broader structures are created via their variable sequential progression. By allowing one module to be followed by a number of potential others, musical progression can branch off in multiple directions.
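As a simple illustration of this branching, the sketch below encodes a hypothetical successor table: each module lists the modules permitted to follow it, and one is chosen at random at each step. The table contents are invented for illustration.

```python
import random

# Hypothetical successor table: each module maps to the modules allowed to
# follow it, so the musical progression can branch in multiple directions.
successors = {
    "intro":   ["theme_1", "theme_2"],
    "theme_1": ["theme_2", "bridge"],
    "theme_2": ["theme_1", "bridge"],
    "bridge":  ["intro", "theme_1"],
}

def next_module(current: str) -> str:
    """Pick one of the permitted followers at random (uniform choice)."""
    return random.choice(successors[current])

# Example: walk eight steps through the branching structure.
module, path = "intro", ["intro"]
for _ in range(7):
    module = next_module(module)
    path.append(module)
print(" -> ".join(path))
```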
Vertical methods refer to the addition, subtraction or substitution of musical layers to make changes to the overall musical arrangement/texture. This can be approached in multiple ways, though here we outline two main approaches: verticality within modules (submodules) and verticality through module groups (voices).
Verticality within modules can be achieved by dividing a module into several submodules, with each submodule containing audio for a different musical layer of the main module. For instance, a module containing a single four-bar orchestral phrase might contain separate audio layers (or stems) for each section of the orchestra, or that same module could instead contain several recordings of that phrase played at different levels of intensity from which only one is selected to play.
Alternatively, vertical changes may be made by organising modules into separate layer groups which we refer to as voices. A voice may be defined as a distinct layer within a modular composition, characterised by its specific set of modules that can only be recombined via horizontal methods. So, while the modules within one voice can only vary in their sequence, other voices can also operate concurrently, allowing modules from each voice to play in parallel and combine in variable ways. This brings about considerations regarding the synchronisation of voices to a common metre or tempo but also opens up unique polyrhythmic and polyphonic opportunities.
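The sketch below illustrates this voice-based verticality under the same assumptions as the earlier examples: each voice holds its own module set and is stepped horizontally on its own, while the modules returned in one step sound in parallel. Voice and module names are invented.

```python
import random

# Each voice owns a distinct set of modules; horizontal recombination happens
# only within a voice, while all voices sound concurrently.
voices = {
    "background": ["bg_pad_1", "bg_pad_2", "bg_pad_3"],
    "midground":  ["mid_arp_1", "mid_arp_2"],
    "foreground": ["lead_1", "lead_2", "lead_3"],
}

def step_all_voices() -> dict[str, str]:
    # One horizontal step per voice; the returned modules play in parallel.
    return {voice: random.choice(mods) for voice, mods in voices.items()}

print(step_all_voices())  # e.g., {'background': 'bg_pad_2', ...}
```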
Both horizontal and vertical methods provide a general means of arranging and moulding the components of a modular composition into alternate structures. Questions regarding the composition of music to suit these methods are highly dependent on how they are intended to be employed and the creative intentions of the composer. The main concern here, however, is how these methods enable musical structure to be fluid, and how they enable the composer to relinquish certain structural decisions to be made within or through the system that the overall composition is situated in.
3.3. System design: relinquishing structural decision-making
Since a modular composition has no inherent structure, all its components must be situated within a system that handles the reconstruction of these components into a single flow of music. The system is just as much a part of the composition as the music itself. The approach to designing such a system fundamentally involves programming behaviours regarding how modules are dynamically rearranged and how the system responds to various inputs such as user interaction or randomisation. This is quite broad and will greatly depend on the musical content, the context of the composition and the creative intentions of the composer. So, while the approach to system design outlined here is not exhaustive, it nonetheless demonstrates an informative process by which modular composition can result in a structurally plastic and living musical work.
Our approach to system design comes down to several key processes: mapping the non-linear structure of a whole work; delineating the means by which modules interact; and defining the boundaries of autonomous behaviour, that is, the freedom by which a work may reshape itself. The latter process also includes considerations of interactivity and the level of control relinquished from the composer to either the system controlling the work or the audience.
3.3.1. Structural mapping
Structural mapping involves designing and defining the organisational framework within which the various modules of a modular composition interact and are arranged. This mapping helps to define the overall form of the piece and its potential to be reshaped. In our own creative experimentations, this process was highly dependent on establishing two essential factors: the voices and the variable parameters present within the work.
As mentioned in section 3.2, a voice is a specific set of modules that can only be recombined via horizontal methods and operates in conjunction with other voices. Assuming a work integrates more than one voice, defining the number of voices within a work and the musical characteristics that distinguish each voice is an essential step in determining how a modular work operates and sounds. For example, in our work Sum, there are three voices that each represent a different instrumental arrangement. Modules from each voice can be sequenced semi-randomly, causing an unpredictable overlapping of musical ideas, but the differing instrumentation of modules from each voice avoids potential timbral clashing. The interaction between the modules of each voice dictates the level and extent of structural plasticity within a modular work.
Parameters, as discussed here, are the broader aspects of a modular work that may vary between multiple states. A parameter could encompass a specific musical feature such as chord types or even broader features such as emotional tone. Each parameter must have a minimum of two distinctive states, and these states are the main contributors to how modules are organised. For instance, Sum includes a broad chord-type parameter which contains five different states, each representing different chords. All modules within the piece are subsequently grouped according to these chord types and only played when their respective state is active. Alternatively, an earlier experimental work of ours includes a much broader ‘emotional brightness’ parameter comprising ‘bright’, ‘medium’ and ‘dark’ states each representing different overall moods (Lynch et al. 2024).
By establishing both the voices and the alterable parameters of a work, structural mapping essentially involves organising modules according to these factors. Each voice will contain its own set of modules, and this set of modules will be composed and organised according to the number of parameters and their respective states relevant to that voice.
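A possible data organisation for this mapping is sketched below, using Sum’s chord-type parameter as the example: modules are keyed by voice and by parameter state, so the active states determine which modules are eligible to play. The key format and module names are our own.

```python
from itertools import product

# Modules organised per voice and per parameter state, echoing Sum's
# five-state chord parameter. Four candidate modules per (voice, state) pair.
voices = ["background", "midground", "foreground"]
chord_states = ["Dm", "Am", "Gm", "Bb", "C"]

module_map = {
    (voice, chord): [f"{voice}_{chord}_{i}" for i in range(1, 5)]
    for voice, chord in product(voices, chord_states)
}

def candidates(voice: str, active_chord: str) -> list[str]:
    """Modules eligible to play for this voice given the active state."""
    return module_map[(voice, active_chord)]

print(candidates("midground", "Gm"))  # ['midground_Gm_1', ..., 'midground_Gm_4']
```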
3.3.2. Delineating modular interaction
The active assembly of modules within a work first requires consideration regarding the delineation of how modules interact in terms of horizontal and vertical arrangement. This concerns aspects of how and when modules are sequenced, the compatibility of modules (i.e., which modules may layer and/or follow one another), and the synchronisation (or lack thereof) of modules along a time scale.
The horizontal sequencing of modules may technically be approached by crossfading between modules or simply by beginning a following module only once another has ended. However, for the approach outlined here, these methods are discouraged. To minimise clear moments of transition and enable a seamless musical flow, it is suggested that modules always be allowed to play out in their entirety and that sequencing always allow modules to briefly overlap with one another. A method we have previously referred to as ‘dovetailing’ (Lynch et al. 2024), or ‘imbrication’ by others (Hulme 2017), avoids clear moments of transition between modules by overlapping the beginning of a following module with any musical information from the current module meant to extend past the moment of transition, such as a final melodic note, a reverb tail or the decay of a cymbal. This is most beneficial when a common tempo needs to be maintained; when tempo is of less concern, following modules may be allowed to overlap with others at any moment, as is the case for Sum.
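The scheduling logic behind dovetailing can be sketched as follows: each module carries a (hypothetical) tail length marking how much of its rendered audio is decay rather than core material, and the next module is scheduled to begin exactly where the core material ends, so the tail overlaps the new entry.

```python
# Dovetailing sketch: the next module begins while the current module's tail
# (reverb decay, final note, etc.) is still sounding, avoiding audible seams.
# 'tail_s' is a hypothetical per-module field, not from the original works.
def schedule_dovetailed(modules: list[dict], start_t: float = 0.0) -> list[tuple]:
    timeline, t = [], start_t
    for m in modules:
        timeline.append((t, m["name"]))     # when this module begins
        t += m["duration_s"] - m["tail_s"]  # next entry overlaps the tail
    return timeline

mods = [
    {"name": "phrase_a", "duration_s": 20.0, "tail_s": 3.0},
    {"name": "phrase_b", "duration_s": 18.0, "tail_s": 2.5},
]
print(schedule_dovetailed(mods))  # phrase_b enters at t = 17.0 s
```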
The determination of which modules are compatible with one another in terms of layering and sequential progression is mostly addressed during the structural mapping process as the voices and parameters within a work provide crucial information as to how modules are grouped together. Generally, modules from all voices are designed to be heard simultaneously, while the variations in states of a parameter dictate which modules will follow one another. However, the sequential progression of modules belonging to a single parameter state may be authored such that this progression is random or follows a specific pattern. Sum allows the progression of modules belonging to a single parameter state to be random, whereas in Shifting Patterns, which includes multiple parameters per voice, each combination of parameter states will result in one specific module being played. These approaches result in works that are fundamentally different in their operation.
Synchronisation determines how modules from each voice do or do not follow a common metric pulse. This can occur in one of three ways:
- Voices are synchronised to a common metre and tempo: modules share the same metre and tempo, so the timing and phrasing of musical material remain predictable and consistent.
- Voices are synchronised only in tempo: modules may vary in metre but play in time to a common pulse. This can result in interesting polymetric relationships between modules and is utilised throughout Shifting Patterns, in which the metre of different voices can be freely altered.
- Voices are not synchronised: generally expected when modules have no discernible metric identity. This can be observed throughout Sum, as modules are absent of any tempo or metre and may overlap freely with other modules.
Considerations regarding synchronisation will generally concern the overall rhythmic style of a work. Depending on how matters of tempo and metre are approached, musical results are likely to vary greatly.
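The three modes can be reduced to a simple quantisation rule for when a triggered module may actually begin, as sketched below under assumed tempo and metre values.

```python
import math

# Given a requested start time in seconds, return when a module may begin
# under each synchronisation mode. Tempo/metre values are illustrative.
def quantised_start(requested_t: float, mode: str,
                    tempo_bpm: float = 120.0, beats_per_bar: int = 4) -> float:
    beat = 60.0 / tempo_bpm
    if mode == "metre_and_tempo":
        bar = beat * beats_per_bar
        return math.ceil(requested_t / bar) * bar    # wait for the next barline
    if mode == "tempo_only":
        return math.ceil(requested_t / beat) * beat  # wait for the next beat
    return requested_t                               # unsynchronised: start now

print(quantised_start(3.1, "metre_and_tempo"))  # 4.0 (next barline)
print(quantised_start(3.1, "tempo_only"))       # 3.5 (next beat)
```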
3.3.3. Relinquishing structural decision-making
The process of structurally mapping a modular work and specifying the intricacies of how modules interact frames the overall process by which such a work operates, along with the musical result. The degree of freedom of interaction between modules, and the extent of voices and parameters present, establishes the conditions in which a modular composition may structurally develop and transform. The process of setting these conditions is the process of relinquishing decisions regarding a work’s structure. It must then be considered whether these decisions are being relinquished to the system itself, the audience, or other external or environmental forces, though this will depend greatly on the context in which the work is intended to be experienced or shared. For instance, alterations to parameter states and the sequencing of modules could be relinquished to the system itself, whether by letting these alterations happen at random or according to an algorithmic process such as a Markov model. Alternatively, these decisions could be relinquished to the audience by providing a graphical user interface that allows them to directly change parameters or select which modules are heard, as is the case for the works outlined in the following section. There is also the possibility of binding parameters and sequential progressions to forces beyond the system but not subject to direct human decision, such as environmental or contextual factors.
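As one concrete possibility for relinquishing parameter changes to the system, the sketch below drives a chord-state parameter with a first-order Markov model. The states echo Sum’s five chords, but the transition weights are invented for illustration.

```python
import random

# First-order Markov model over chord states; transition weights are invented.
transitions = {
    "Dm": {"Am": 0.4, "Gm": 0.3, "Bb": 0.2, "C": 0.1},
    "Am": {"Dm": 0.5, "Bb": 0.3, "C": 0.2},
    "Gm": {"Dm": 0.4, "C": 0.6},
    "Bb": {"C": 0.5, "Am": 0.5},
    "C":  {"Dm": 0.7, "Gm": 0.3},
}

def next_state(current: str) -> str:
    options = transitions[current]
    return random.choices(list(options), weights=list(options.values()))[0]

# The system, rather than a human, walks the parameter through its states.
state = "Dm"
for _ in range(8):
    state = next_state(state)
    print(state, end=" ")
```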
Ultimately, the chosen method of relinquishing control over a modular composition’s structure shapes the listener’s experience, offering a unique interplay between predetermined musical elements and dynamic variability. By embracing the potential of modularity, and indeed structural plasticity, composers can craft musical works that embody alterability and reflect both the technology and the creativity driving them.
4. Original Works
This section overviews two original musical works created using the modular compositional approach: Sum and Shifting Patterns. Each was created with the primary goal of understanding the creative implications of composing music as rearrangeable modular components. As such, their systems currently exist in a state that requires direct user interaction via a GUI to make structural decisions, both for testing purposes and to understand how audiences might engage with such works. Nonetheless, each work showcases alternate creative avenues afforded by the overall approach of modular composition.
The composition and rendering of modules was done within Cubase Pro 11, and the programming of each work’s system was conducted using the audio middleware program Wwise, while the Unity game engine was used to create the user interface.
4.1. Sum
Sum seeks to showcase variable structural progression in terms of both harmony and textural density. The work comprises three voices, each representing alternate musical textures conceptualised as spatial perspectives (background, midground and foreground; Figure 2). Voices and their respective modules are distinct in their instrumental arrangement, with the overall instrumentation similar to that of a chamber orchestra. Further, modules from both the background and the midground voice contain submodules associated with different instrument groups. These submodules are programmed to have a 50% probability of being heard when the module is played, meaning the instrumental arrangement of a single module can vary slightly. The audio content of modules in this case includes a mix of virtual instruments, covering many of the string and wind sounds, as well as some live recorded acoustic guitar and violin.
The work includes a chordal parameter with five variable states, each representing a different diatonic chord within the key of D minor (Dm, Am, Gm, B♭, C) from which all modules were accordingly composed and categorised. Musical material is not metric, meaning the approach to module rearrangement between voices is asynchronous.
The modules within this work generally range between 15 and 25 seconds in duration. Since there is no tempo or metre, modules tend to slowly swell and ebb in volume and intensity to avoid abrupt transitions in and out of silence. Modules from the foreground voice, however, were exempt, as their melodic role suited a more abrupt entrance.
Both the background and the midground voice contain at least four unique modules for each of the five harmonic states. The foreground voice differs in that it can be set to three different instruments; as such, each instrument in the foreground voice contains four modules per harmonic state. Altogether, this work comprises over one hundred unique modules.
In terms of how overall structural decisions are made, the voices and parameters are the main vehicles by which structure may be actively altered. In this case, the GUI allows the user/listener to make general structural decisions by setting the state of each parameter and by triggering when a voice plays one of its constituent modules (Figure 3). The decision of which respective module plays, however, is random. Additional triggering of any voice will result in modules overlapping with each other until those modules end and no more are triggered. Altering the state of any parameters will dictate which modules may be randomly selected to play whenever a voice is triggered. The parameters in this instance include the chord parameter (associated with all voices), and the instrument parameter (only associated with the foreground voice).
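Sum’s triggering behaviour, as described above, can be summarised in the following sketch: triggering a voice randomly selects one of its modules for the active chord state, and each submodule (instrument layer) independently has a 50% chance of sounding. Module data and names are illustrative, not the actual asset list.

```python
import random

def trigger_voice(voice_modules: dict, chord_state: str) -> dict:
    # Random pick among the modules grouped under the active chord state.
    module = random.choice(voice_modules[chord_state])
    # Each submodule independently has a 50% probability of being heard.
    layers = [sub for sub in module["submodules"] if random.random() < 0.5]
    return {"module": module["name"], "layers": layers}

background = {
    "Dm": [
        {"name": "bg_Dm_1", "submodules": ["strings", "winds"]},
        {"name": "bg_Dm_2", "submodules": ["strings", "winds"]},
    ],
}
print(trigger_voice(background, "Dm"))  # e.g., {'module': 'bg_Dm_2', 'layers': ['winds']}
```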
Sum currently demonstrates a unique ability to vary in its textural and harmonic progression. While there are some limitations in terms of the broadness of its structural plasticity as well as its susceptibility to occasional harmonic and timbral clashing, its overall framework is endlessly expandable and potentially applicable to a variety of musical styles.
4.2. Shifting Patterns
Inspired by the minimalist composers Steve Reich and Terry Riley, Shifting Patterns is a work whose structure is driven by the recombination of looping polymetric patterns. The work contains four voices, each representing a different type of pitched percussion: xylophone, vibraphone, marimba and glockenspiel (Figure 4). Each voice also contains four variable parameters (harmony, metre, pattern type and variation), each of which can be altered independently for each voice. Modules are designed to loop, and all contain different kinds of two-bar patterns, the sounds of which were recorded in Cubase 11 using various virtual instruments. The length of modules in this case generally ranges from 3 to 6 seconds (including the reverb tail), and all share a common tempo of 152 bpm.
The selection of which module within a voice is played comes down to the combination of states between all four parameters. Each voice contains the same set of parameters: harmony, metre, pattern and variation. Each combination of parameter states within a voice leads to a unique module which, when allowed, will begin to loop. Any adjustments to the parameters will cause the current module to play out until it is finished, after which the newly selected module will begin in time with the pulse.
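This deterministic mapping from parameter states to a single module can be sketched as below: each combination of the four states resolves to exactly one looping module per voice, and a newly selected module waits for the current loop to play out. The key format and state names are our own.

```python
# One unique module per combination of the four parameter states, per voice.
def module_for(voice: str, states: dict) -> str:
    metre = states["metre"].replace("/", "-")
    return f"{voice}_{states['harmony']}_{metre}_{states['pattern']}_{states['variation']}"

def on_parameter_change(voice: str, states: dict, loop_end_t: float):
    # The current loop plays out; the new module then enters on the pulse.
    return module_for(voice, states), loop_end_t

current = {"harmony": "Dm", "metre": "3/4", "pattern": "asc", "variation": "v1"}
print(module_for("marimba", current))  # marimba_Dm_3-4_asc_v1
```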
The variety of parameter states results in 180 unique modules required per voice, and 720 modules in total for the entire work. While this is a large number, modules between different voices follow similar rhythmic accents and pitch contours, making their creation more formulaic and efficient. The main differences between the modules of each voice, beyond their instrumentation, are that they cover different note ranges and harmonic territory.
Considering the importance of rhythm within this work, voices are all synchronised to a common pulse to ensure temporal homogeneity. They are not, however, synchronised to a common metre, so that unique pattern combinations can arise as loops of differing pattern lengths metrically phase with one another.
Since Shifting Patterns currently exists to be controlled via a user interface, the user is afforded a significant amount of structural control. Each module can be individually selected to play by adjusting the parameters of each voice, and, unlike in Sum, there is no element of randomness to this selection process. There is, however, a set of controls within the user interface that, when triggered, will set their respective parameters to a random state. There are also macro controls that set the relevant parameter of all voices to a selected value (see Figure 4). This overall format, in which musical patterns emerge out of the shifting and alternating of fundamental musical modules, enables a unique and versatile way for a musical work to exist such that it requires active decision-making to structurally coalesce.
5. Conclusion
This article aimed to explore the creative opportunities of engendering structural plasticity in music through a modular compositional approach. This approach represents an alternate means of structuring traditional composition, taking advantage of digital technology to move towards more flexible, semi-autonomous forms of composition. Plastic structures in music have taken many forms, from eighteenth-century musical dice games and twentieth-century open form compositions, to the current digital forms of intelligent music systems and music in interactive media. Such instances showcase the intriguing potential of flexible musical structures, and, as many video games and other forms of interactive media have shown, a modular approach to achieving plastic structures in music composition holds considerable creative potential.
Even considering the limited scope of the works analysed in this article, it is possible to observe the musical and creative opportunities afforded by modular composition. Further creative research is necessary to investigate how modular composition may be applied to other musical genres and contexts. Additionally, there is a need to conduct user studies to understand how audiences respond to and engage with such works. Overall, this article offers a forward-looking perspective on music’s future, presenting a novel method of musical expression and creation. It opens avenues for new formats through which music may be created and shared, and carries positive implications for the evolution of music technology, expression and experience.
Acknowledgements
This work is supported through an Australian Government Research Training Programme Scholarship.