SpectraScore: A software tool for the performance of interactive spectral music
Below is a short clip of me using my virtual reality glove to manipulate some of the live input coming into my computer from my synthesiser and microphone.
This technology later developed into SpectraScore.
SpectraScore is my set of software tools that enables the user to generate live spectral music from any audio input they choose. The software analyses an audio input and generates musical notation, synthesised audio and MIDI based on the most prominent spectra in that input. In his book ‘Twentieth-Century Harmony’ Vincent Persichetti writes, ‘Resonant harmony is not formed by seeking higher and higher overtones but by using overtones of overtones’. I suggest that this concept need not be confined to highly technical spectral compositional methods that consider only combinations of tones in isolation and assume a ‘natural’ resonance with the audience’s ‘innate’ emotional responses. If we define ‘resonant harmony’ as the perception of musical ideas that resonate with the personal beliefs and cultural predispositions of the listener, and ‘overtones of overtones’ as the literal notions strongly implied by the selection of particular tone combinations, the relationship between the composer’s intention and the audience’s perception of musical meaning is deepened.
I have recently completed a master’s thesis on the subject, which is available here. I will post the tools as soon as they are fully stable; they currently exist as Max for Live modules.
Examination of the compositional system “SpectraScore”
Essentially, the process of “interactive spectral music” as implemented in SpectraScore is outlined in the following diagram:
The Process of Interactive Spectral Composition using SpectraScore
The overall goal of this method is to create a tangible link between audio source material and sonic results by blurring the concepts of harmony and timbre in real time. A selection of ‘material objects’ (for example, the five little bells bought at a flea market that feature in one of my pieces) are made to resonate, and a small audio snapshot of each of these resonances is analysed using FFT calculations. The results of this calculation are stored in a collection as pairs of amplitude and frequency components. To derive the ‘sonic object’, which is made up of the most essential tones found in the analysis, a histogram analysis is performed on the results of the FFT calculation to determine the most prominent spectra in the collection in terms of amplitude and frequency of detection. Once the most fundamental tones have been determined, a ‘dissonance factor’ is calculated, along with a number of musical calculations such as finding the closest key area and mode to the note collection. These parameters are used in the generation of scales and chords, melodic improvisation guides, Markov-chain melodies, probabilistic orchestral resyntheses, dissonance weighting calculations and additive resynthesis. In addition, specific emotional descriptors are assigned to each sonic object according to its ‘dissonance factor’; these are used to trigger samples of speech associated with the descriptor, taken from EmoDB (the Berlin Database of Emotional Speech). Finally, this information is summarised as a number of scores, which can be handled in a variety of ways by the user, e.g. distributing spectral material to an ensemble or solo performer for improvisation via e-scores, or providing parameters for a sound synthesis engine.
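The histogram stage described above can be sketched in a few lines of plain Python. The real implementation is a Max/MSP patch, so the function names and the exact prominence score here are illustrative assumptions: peaks from successive FFT frames are binned to the nearest quartertone and ranked by summed amplitude weighted by detection count.

```python
import math
from collections import defaultdict

def freq_to_quartertone(freq, ref=440.0):
    """Round a frequency to the nearest quartertone step (24 per octave)."""
    steps = round(24 * math.log2(freq / ref))
    return ref * 2 ** (steps / 24)

def sonic_object(frames, top_n=5):
    """Reduce many frames of (freq, amp) FFT peaks to the most prominent
    partials. Prominence combines summed amplitude and detection count,
    as described in the text; the exact weighting is an assumption."""
    hist = defaultdict(lambda: [0.0, 0])  # bin -> [summed amp, count]
    for frame in frames:
        for freq, amp in frame:
            q = freq_to_quartertone(freq)
            hist[q][0] += amp
            hist[q][1] += 1
    ranked = sorted(hist.items(), key=lambda kv: kv[1][0] * kv[1][1], reverse=True)
    return [f for f, _ in ranked[:top_n]]
```

Feeding in two analysis frames, `sonic_object([[(440.0, 1.0), (881.0, 0.5)], [(439.5, 0.9)]], top_n=2)` folds the near-identical detections of A4 into one bin and ranks it first.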
It is useful to reference the work of Leonard B. Meyer when approaching this conceptual territory. Meyer makes an important distinction between ‘absolute expressionist’ and ‘referential expressionist’ positions on musical meaning. The ‘absolute expressionists’ take the stance that emotional meanings arise in response to music without reference to the ‘extra-musical’ world beyond the music itself. Any work aiming to correlate emotional descriptors with specific harmonic configurations therefore finds itself in the latter group, which presupposes the existence of referential meaning within any symbolic musical language. A primary impetus for the implementation of this system of correlation is the work of Dr Robert Plutchik, a psychologist who proposed in his Emotional Circumplex Model (see below) that there is a psycho-evolutionary basis behind our emotions and that these emotions are co-dependent and interrelated. This makes it possible to graph Plutchik’s proposed eight primary emotions – anger, fear, sadness, disgust, surprise, anticipation, trust and joy – over a diagram not unlike the colour wheel. The results are plotted according to positive/negative valence and potency across a two-dimensional interactive graph derived from the results of Structure of Emotions (Morgan and Heise, 1988), so that they can be compared with a dissonance score derived from calculations based on William Sethares’ Tuning, Timbre, Spectrum, Scale (1998) and Vincent Persichetti’s Twentieth-Century Harmony (1961).
Plutchik’s Emotional Circumplex Model (Plutchik, 1984)
As part of Ada: Intelligent Space, a project created for the Swiss National Exposition (Expo.02) in 2002, similar correlative procedures were used to synthesise music targeting a series of desired emotional states. This project was a large collaboration between neuroscientists and musicians, and represents a rather sophisticated piece of artificial intelligence of which music was only one expression of the emotional states it synthesised during the exhibition. SpectraScore, on the other hand, functions as a compositional system, a score and an instrument, and contains only a basic correlative system, not artificial intelligence. The assessments it draws about the emotional content of sounds are offered merely as suggestions to the performer, and provide a necessary system for structuring those sounds. My application of a metaphor to music composition follows on, perhaps more directly, from earlier attempts to categorise sounds using finite series of ‘non-musical’ descriptors. Many attempts have been made to correlate sound and colour, for example Sabaneev and Pring (1929), Skriabin (1910) and Hector (1918). Luigi Russolo famously created his ‘Futurist Manifesto’ The Art of Noises in 1913, notably at around the same time that Marcel Duchamp presented a urinal at the 291 gallery in New York and established the tradition of the objet trouvé in sculpture. Perhaps with similar intentions in mind, Russolo developed a system of categorisation that represents a fusion of both ‘musical’ and ‘non-musical’ sound worlds. In doing so he divided the spectrum of sounds he knew from the world around him into groups unified both by their acoustic qualities and by the referential meanings associated with them.
1. Booms, Thunderclaps, Explosions, Crashes, Splashes, Roars
2. Whistles, Hisses, Snorts
3. Whispers, Murmurs, Mutterings, Bustling noises, Gurgles
4. Screams, Screeches, Buzzes, Cracklings, Sounds obtained by friction
5. Noises obtained by percussion on metals, wood, stone, terra cotta
6. Voices of animals and men: Shouts, Shrieks, Groans, Howls, Laughs, Wheezes, Sobs
Categories of sounds from The Art of Noises (Russolo, 1913)
It seems to me that this chart represents a system of organising primary musical ‘colours’ according to dynamic and timbral features, sometimes also related through their associations with behavioural language. For instance, sounds from group one could perhaps elicit a fight-or-flight response, whereas sounds from group three would more likely elicit caution and curiosity. On the chart is written: ‘In this list we have included the most characteristic fundamental noises; the others are but combinations of these.’ This same organisational principle – A) reducing a perceptual phenomenon down to an assortment of fundamental building blocks and B) further dividing aspects of that phenomenon into primary and secondary classes – is used in both the structure of Newton’s colour wheel (so fundamental to the work of Seurat) and Plutchik’s Emotion Circumplex.
These systems seek to expand the range of sounds and organisational methods available to the composer, so as to include noise and sound structures from the entire world of human experience. A problem arises, however, when one wishes to create a harmonic relationship between instruments classified as ‘noise makers’ and traditional orchestral instruments, which are optimised to produce spectra closely approximating the harmonic series. ‘Noise’ sounds are generally harmonically ‘dissonant’ compared with ‘musical instruments’, or are at least less flexible in terms of tuning. One solution to this problem is to create software that analyses the properties of sounds produced by ‘noise makers’ so as to create closely related score fragments that can be interpreted by musicians playing conventional instruments. This method repositions these noisy ‘sonic objects’ as the core elements in a composition, replacing chord function with a collection of curious resonators.
Spectral techniques synthesised in SpectraScore
A number of traditional techniques of spectral composition are made available to the composer in real-time through the SpectraScore software. They are:
- FFT analysis (SpectraScore FFT)
- Resynthesis and ‘Spectral modality’ (SpectraScore Scores)
- Sine wave resynthesis (SpectraScore ReSyn)
- Probabilistic orchestral resynthesis (SpectraScore ORS)
- Interpolation between various ‘spectral nodes’ (SpectraScore Nodes)
- ‘Spectral cell’ based improvisation (SpectraScore Scores)
- ‘Spectral collection as harmonic morpheme’ (SpectraScore Plutchik’s Flower)
SpectraScore FFT analysis
With SpectraScore, the spectral data collected through FFT analysis is used to generate electronic accompaniment and scored pitch aggregates (see Fig 4.4). From this, performers can improvise melodic fragments within a specified harmonic and structural context, or follow real-time scores. Accurate spectral analysis algorithms therefore had to be carefully selected and implemented for this whole process to function correctly. After examining the available solutions, Max/MSP seemed the most viable environment in which to develop my software, due to its range of real-time digital signal processing objects, the availability of the Maxscore objects (Hajdu) based on the Java Music Specification Language (Didkovsky), and the existence of Max for Live (Ableton, Cycling ’74), which makes the task of performing live electronic and experimental music considerably easier. Maxscore enables the use of conventional Western notation and offers support for microtonal notation, something that is essential to the accurate representation of spectral data. For the FFT calculation I trialled fiddle~ and pitch~ (Puckette, 1998) and the later reworking of the fiddle~ object, sigmund~ (Puckette et al., 2004). Further data interpolation and quantisation was ultimately required to extract stable pitch information from the data streams produced by sigmund~, and this object, with all of its imperfections, proved to be the best choice for a resynthesis patch at the time. The gabor~ object (Schnell, Schwarz, 2005), available as part of the FTM overlay for Max/MSP (Schnell et al., 2011), was also trialled. However, this object is no longer adequately supported in Max version 6, so zsa.freqpeak~ (Jourdan, Malt, 2008), which was found to be as accurate as gabor~, was implemented instead to maintain compatibility. This also simplifies installation, as the FTM installer adds functionality that is unnecessary for the purposes of SpectraScore.
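The interpolation and quantisation needed to tame a raw pitch stream can be pictured along these lines. This is a hypothetical stand-in in Python, not the logic of the actual patch: a short median filter that only reports a pitch once the whole analysis window agrees within a tolerance.

```python
import math
from collections import deque
from statistics import median

def cents(f1, f2):
    """Absolute pitch distance between two frequencies, in cents."""
    return abs(1200.0 * math.log2(f1 / f2))

class PitchStabiliser:
    """Hypothetical smoothing stage for a noisy pitch stream."""
    def __init__(self, window=5, tolerance_cents=50.0):
        self.buf = deque(maxlen=window)
        self.tolerance = tolerance_cents
        self.current = None  # last stable pitch, if any

    def push(self, freq):
        """Feed one raw detection; return the current stable pitch."""
        self.buf.append(freq)
        if len(self.buf) == self.buf.maxlen:
            m = median(self.buf)
            # accept the median only when the whole window agrees
            if all(cents(f, m) < self.tolerance for f in self.buf):
                self.current = m
        return self.current
```

Feeding it a jittery A4 (440, 441, 439, 440, 442 Hz) yields a stable 440 Hz only once the window is full, which is the kind of latency/stability trade-off the text describes.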
The zsa objects have proved supremely useful for a great host of spectral analysis applications.
The live generated score window is primarily for viewing data collected through FFT analysis, but it is also useful in performance scenarios. The leftmost score window displays the analysis pitches transposed to one octave and ordered from left to right in terms of prominence within the analysed spectrum. This takes into account not just amplitude but how often each frequency is detected within the specified time window. As with the orchestral resynthesis module, this represents a metaphorical view of the ‘sonic object’ rather than a literal resynthesis of the source material. By treating the ‘sonic object’ as a metaphor from which a scale can be derived, melodic improvisation is made possible within traditional performance scenarios; I have referred to this earlier as ‘spectral cell’ based improvisation. Clicking on any note in the scores plays it back through MIDI channel 1 for previewing purposes.
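The octave transposition described above amounts to folding each detected frequency to a pitch class. A minimal Python sketch follows; the reference frequency (middle C) and the note spelling are assumptions, and the input list is taken to be already ordered by prominence, so the output preserves that ordering.

```python
import math

NOTE_NAMES = ['C', 'C#', 'D', 'D#', 'E', 'F',
              'F#', 'G', 'G#', 'A', 'A#', 'B']

def fold_to_octave(freqs, ref=261.626):
    """Fold prominence-ordered frequencies into one octave as pitch-class
    names, dropping duplicates but keeping the prominence order."""
    out = []
    for f in freqs:
        semis = round(12 * math.log2(f / ref)) % 12
        name = NOTE_NAMES[semis]
        if name not in out:
            out.append(name)
    return out
```

For example `fold_to_octave([440.0, 880.0, 330.0])` folds both A detections into one pitch class and returns `['A', 'E']`.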
Automated Mode Selection
Using micro-modes taken from the overtone series, Calin Ioachimescu introduced the idea of ‘Spectralism and Modality’ (Vieru, 1980) in his compositions. To increase the versatility of the software in improvisation situations, a feature was implemented that selects and displays the musical mode closest to the loudest and most commonly detected analysis pitches. A database of common modes is searched for the mode with the highest number of notes in common with the analysis pitches, and that mode is displayed both in notated form and as a word. In faster passages, where slow, careful sight-reading of quartertones and previously unseen pitch material becomes an issue, the musician can fall back on the mode. This feature makes the software more useful to jazz musicians engaging in interactive spectral music in a traditional format.
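The mode-matching step can be sketched as a search over transpositions of a small database of scale patterns, scored by shared pitch classes. This is an illustrative Python reduction; the real database of modes is larger, and the actual scoring presumably also weights pitch prominence.

```python
PATTERNS = {  # a deliberately small illustrative database
    'major':      {0, 2, 4, 5, 7, 9, 11},
    'minor':      {0, 2, 3, 5, 7, 8, 10},
    'dorian':     {0, 2, 3, 5, 7, 9, 10},
    'mixolydian': {0, 2, 4, 5, 7, 9, 10},
}
NOTE_NAMES = ['C', 'C#', 'D', 'D#', 'E', 'F',
              'F#', 'G', 'G#', 'A', 'A#', 'B']

def closest_mode(analysis_pcs):
    """Return the transposed mode sharing the most pitch classes
    with the analysis pitch classes (0-11)."""
    pcs = set(analysis_pcs)
    best, best_score = None, -1
    for root in range(12):
        for name, pattern in PATTERNS.items():
            mode = {(root + p) % 12 for p in pattern}
            score = len(mode & pcs)
            if score > best_score:
                best, best_score = f"{NOTE_NAMES[root]} {name}", score
    return best
```

Ties are broken by search order here; a real implementation would need a more musical tie-break (e.g. preferring the mode whose tonic is the most prominent analysis pitch).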
Resynthesis with SpectraScore Nodes
Additive synthesis is a common technique for performing resynthesis; it is presented here as an expression of the conceptual notions of spectral music. Each material object analysed by the software is recorded as a set of nodes placed within a two-dimensional graph, where the x-axis represents pitch class and the y-axis amplitude. Results are available in scored and resynthesised form for previewing or musical performance. This module may be used to create synthesised textures that are closely related to both the ‘sonic object’ and the pitch material displayed in the various scores that SpectraScore generates. Even if no other audio is present to transform, electronic sounds may still be produced from this sine wave resynthesis texture.
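The mapping from analysed partials onto that node graph can be sketched as follows. The reference frequency (middle C) is an assumption, and the fractional pitch-class x-coordinate is my reading of how microtonal detail would survive the mapping.

```python
import math

def to_nodes(partials, ref=261.626):
    """Map (freq, amp) partials onto a 2-D node graph:
    x = pitch class (0-12, fractional for microtones), y = amplitude."""
    return [((12 * math.log2(f / ref)) % 12, a) for f, a in partials]
```

For example a C an octave above the reference maps to x = 0 (same pitch class), carrying its amplitude as y.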
Sine wave resynthesis
The sixty-four loudest partials from the analysis are stored and reproduced using a bank of oscillators. These oscillators allow for control over pitch and amplitude, and both parameters are extracted from the spectral collection. This resynthesis is paralleled by an orchestral resynthesis, but the two are not identical: the orchestral resynthesis contains only thirteen members and is quantised to the closest quartertone, whereas the sine wave resynthesis contains sixty-four members and reflects the exact frequencies detected at the analysis stage.
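Offline, a bank of sine oscillators of this kind reduces to summing one sinusoid per (frequency, amplitude) partial. The minimal non-real-time Python sketch below omits envelopes and normalisation, which the real module would need; in Max/MSP this is a cycle~ bank rather than a sample loop.

```python
import math

def resynthesise(partials, sr=44100, dur=0.5):
    """Sum a bank of sine oscillators, one per (freq, amp) partial,
    into a list of samples. No envelope or gain normalisation."""
    n = int(sr * dur)
    out = [0.0] * n
    for freq, amp in partials:
        w = 2 * math.pi * freq / sr  # radians per sample
        for i in range(n):
            out[i] += amp * math.sin(w * i)
    return out
```

With sixty-four partials the summed amplitudes can easily exceed full scale, which is why normalisation matters in practice.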
Probabilistic orchestral resynthesis with SpectraScore ORS
The orchestral resynthesis created using SpectraScore is not an attempt to represent the subject of analysis closely, but merely to create a metaphorical representation of it. These resynthesised textures are created through a probability algorithm, which selects the thirteen members of the chord based on their prominence in the spectral collection. This means that if a pitch is particularly prominent in a spectral collection, in terms of its amplitude or frequency of occurrence, it will be more likely to appear within the resynthesis chord. The detected frequencies are quantised to the nearest quartertone using a rounding system similar to the one used by Tristan Murail in his preparation of Gondwana (Rose, 1996). Currently, the chords are spread out over the whole range of a string section consisting of parts for Violin, Viola, Cello and Double Bass, but plans exist to expand this ensemble in future versions. Individual notes can be previewed by clicking on them, over MIDI channels 1–13, and the entire chord can also be played back simultaneously over MIDI. In future implementations, presenting all of this material on one screen in a more compact format (i.e. a grand staff) could communicate the concept of this module to the user more clearly.
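The core of that probability algorithm can be illustrated as a weighted draw followed by quartertone rounding. This Python sketch is an assumption about the mechanism, not the patch itself: prominence is reduced to a single weight per pitch, and range allocation to the string parts is omitted.

```python
import math
import random

def quantise_quartertone(freq, ref=440.0):
    """Round a frequency to the nearest quartertone (24 steps/octave)."""
    steps = round(24 * math.log2(freq / ref))
    return ref * 2 ** (steps / 24)

def orchestral_chord(collection, voices=13, seed=None):
    """Draw chord members with probability proportional to prominence
    (here simply the amplitude), then quantise to quartertones."""
    rng = random.Random(seed)
    freqs = [f for f, _ in collection]
    weights = [a for _, a in collection]
    picks = rng.choices(freqs, weights=weights, k=voices)
    return [quantise_quartertone(f) for f in picks]
```

Because the draw is probabilistic, each rendering of the same spectral collection yields a slightly different thirteen-note chord, weighted towards its most prominent partials.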
Interpolation between ‘spectral nodes’
Interpolation is a common spectral technique used by composers such as Gérard Grisey and Tristan Murail (Rose, 1996). In his piece Gondwana, Murail creates a stepwise transformation between two chords by pitch-shifting every element of chord A very gradually until all instruments land on the constituent notes of chord B. In SpectraScore this same technique is available between up to five spectral collections. A node-based controller (Fig 4.6) allows the user to slide between these node configurations in real time. Each morph between two points on the controller involves a multi-point interpolation on the node diagram, which is translated into glissandi in the case of the sine wave resynthesis and automatically updates the score of those performing the orchestral resynthesis.
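One step of such a morph can be sketched as pairwise interpolation between two spectral collections. This linear-in-frequency version is a simplification; a perceptually even glissando would interpolate in log-frequency, and the pairing of partials between collections is assumed to be positional.

```python
def interpolate(chord_a, chord_b, t):
    """Linearly interpolate each (freq, amp) pair of chord A towards its
    positional counterpart in chord B; t=0 gives A, t=1 gives B."""
    return [
        ((1 - t) * fa + t * fb, (1 - t) * aa + t * ab)
        for (fa, aa), (fb, ab) in zip(chord_a, chord_b)
    ]
```

Sweeping t from 0 to 1 over time produces exactly the stepwise glissando behaviour described above for each paired partial.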
Spectral collections as harmonic morphemes
A graph resembling Plutchik’s Emotion Circumplex (Plutchik, 1984) was created to represent the emotional states detected in various pitch collections by the SpectraScore module Plutchik’s Flower. This system involves real-time tracking of the psycho-acoustic properties of live sound. To calculate what is summarised as a ‘roughness’ factor, the roughness~ object (MacCallum and Einbond, 2008) for Max/MSP was trialled for calculating the dissonance factor of ‘sonic objects’. Eventually this was replaced with a custom external developed for this project by John Matthews (University of Wollongong), called mxj Dissonance. It is a direct implementation of the formula created by William Sethares for calculating sensory dissonance (Sethares, 1994). In addition, a secondary method of deriving a ‘dissonance score’ was implemented using ideas presented by Vincent Persichetti in his book Twentieth-Century Harmony, which result in a ranking of the twelve possible chromatic interval pairs by their respective levels of dissonance. By implementing both components within SpectraScore, a synthesis between cultural and psychophysical aspects of perception was created within a two-dimensional graph not unlike Plutchik’s Emotion Circumplex. The ‘dissonance index’ derived using Persichetti’s method is reduced to a number between 1 and 8 for comparison with ranked emotional descriptors extracted from a study on the structure of emotions across three dimensions (Morgan and Heise, 1988). In this study, participants evaluated three emotional dimensions subsequently interpreted as activity level, positive/negative affect and potency. The emotional descriptors used in Plutchik’s Emotion Circumplex model and in the study by Morgan and Heise were compared to achieve a synthesis of both structures and to facilitate automated selection of emotional states associated with chord clusters derived from spectral analysis.
The information taken from Persichetti is represented as a point on the graph within the boundaries of the particular emotional descriptor, chosen based on the correlation of intensity and harmonic complexity. The proximity of this point to the centre of the graph represents the intensity of the emotion, and is determined by the level of sensory dissonance calculated from the spectral collection using the formula developed by Sethares.
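Sethares' sensory dissonance can be computed by summing a dissonance curve over every pair of partials in the collection. The sketch below uses the commonly quoted parameterisation of Sethares' model; the actual mxj Dissonance external may differ in scaling and detail.

```python
import math

def pair_dissonance(f1, a1, f2, a2):
    """Sensory dissonance of one pair of partials, after Sethares' model
    (constants are the commonly published values)."""
    fmin = min(f1, f2)
    s = 0.24 / (0.021 * fmin + 19.0)  # scales with critical bandwidth
    d = abs(f2 - f1)
    return a1 * a2 * (math.exp(-3.5 * s * d) - math.exp(-5.75 * s * d))

def dissonance(partials):
    """Total sensory dissonance of a spectral collection: the sum of
    pair_dissonance over all unordered pairs of (freq, amp) partials."""
    total = 0.0
    for i in range(len(partials)):
        for j in range(i + 1, len(partials)):
            f1, a1 = partials[i]
            f2, a2 = partials[j]
            total += pair_dissonance(f1, a1, f2, a2)
    return total
```

The model behaves as expected: a unison scores zero, and two partials a rough semitone-ish distance apart score far higher than an octave.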
Speech and music correlation through emotional descriptors
The EmoDB Player uses samples taken from the Berlin Database of Emotional Speech (Burkhardt et al., 2005). This database was created through voice acting based on emotional trigger words, and all samples are spoken in German. Each phrase in the database was chosen for its flexibility of interpretation: using the trigger words, the actor turns a simple phrase like ‘The washcloth is on the table’ into an example of speech encapsulating a particular emotion. The EmoDB Player plays back whichever expression matches the emotional state detected by the Plutchik’s Flower module in SpectraScore.
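Stripped of audio playback, the player's selection step is a lookup from detected descriptor to a matching sample. The mapping and filenames below are entirely hypothetical placeholders (real EmoDB files encode speaker, text and emotion in their names); only the lookup mechanism is the point.

```python
# Hypothetical descriptor -> sample mapping, for illustration only.
SAMPLES = {
    'anger':   ['anger_01.wav', 'anger_02.wav'],
    'sadness': ['sadness_01.wav'],
}

def sample_for(descriptor, index=0):
    """Return a speech sample matching the detected emotional descriptor,
    cycling through the available takes; None if no match exists."""
    options = SAMPLES.get(descriptor)
    if not options:
        return None
    return options[index % len(options)]
```

In the real module the returned sample would be triggered for playback each time Plutchik's Flower reports a new emotional state.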