The first annual Digital Audio Effects Processing conference (DAFX98) took place in the
beautiful city of Barcelona. The call for papers was an opportunity for me to revisit some
toy ideas that had emerged from my work at the Media Lab. The paper applies the concept of Musical
Gestures (which came from my Ph.D. dissertation) to audio effects processing.
As an embodiment of the general framework, I brought with me a sound example
resulting from the processing of a recording of Ravel's Sonate pour violon et violoncelle.
Harmonic structure likelihoods (standing for the chosen set of musical gestures for this example)
were estimated from the polyphonic recording and they were then used to add a synthetic layer
consisting of female voices (generated as part of the processing through some trivial wavetable
procedure). The resulting audio is satisfying, and it illustrates that even a difficult
problem such as real-time polyphonic tracking can be reasonably achieved with a soft analysis/control
system which doesn't attempt to capture high-level musical intentions but rather confines itself
to an expressive and humble set of measurements.
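The analysis side can be sketched roughly as follows. This is a hypothetical illustration, not the paper's actual algorithm: the function name, the frame length, the log-spaced candidate pitches, and the fixed number of harmonics are all my assumptions. The idea is simply that, for each candidate pitch in the analysis range, spectral energy summed at its first few harmonics yields the energy state variable of a soft key:

```python
import numpy as np

def harmonic_likelihoods(frame, sr, n_keys=32, f_lo=100.0, f_hi=800.0,
                         n_harmonics=5):
    """Estimate (frequency, energy) state variables -- "soft keys" --
    for n_keys log-spaced candidate pitches between f_lo and f_hi.

    Hypothetical sketch: for each candidate pitch, sum the spectral
    magnitude at its first few harmonics; a candidate whose harmonics
    line up with the recording's partials scores high.
    """
    spectrum = np.abs(np.fft.rfft(frame * np.hanning(len(frame))))
    freqs = np.geomspace(f_lo, f_hi, n_keys)  # three octaves, log-spaced
    keys = []
    for f0 in freqs:
        # FFT bin index of each harmonic of the candidate pitch
        bins = [int(round(h * f0 * len(frame) / sr))
                for h in range(1, n_harmonics + 1)]
        bins = [b for b in bins if b < len(spectrum)]
        energy = float(np.sum(spectrum[bins]))
        keys.append((f0, energy))
    return keys
```

Run on a frame containing a single tone, the highest-energy soft key lands near that tone's pitch, which is all the synthesis stage downstream needs.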
My 4-page paper is available in the conference's proceedings, or you can download it here:
AUDIO EXAMPLE
The following audio example results from the processing of a recording of the first movement
of Ravel's Sonate pour violon et violoncelle. The chosen musical gestures (harmonic
likelihoods, or more specifically 32 soft keys) are extracted over a range of three octaves
(between 100 and 800 Hz). This extraction follows the general scheme outlined in
the paper. As these musical gestures are being estimated, they are fed as control parameters
to a rudimentary wavetable synthesis module that was loaded with samples of female vocals.
We recall that a soft key consists of two state variables: frequency and energy. In this example,
these two state variables are mapped literally to the pitch and the loudness of appropriate
samples in the synthesis engine. This process results in a synthetic choir that is then mixed
with the original stream, leading to the following: (MPEG-1 Layer III - 44.1kHz, 96kbps, joint stereo - 43 seconds / 507 kbytes)
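The literal mapping described above can be sketched in a few lines. This is a minimal stand-in, not the synthesis module from the paper: the function name, the single-cycle wavetable, and the nearest-neighbour table lookup are my assumptions; the real module was loaded with female-vocal samples rather than a single cycle. Each soft key's frequency sets the table read rate (pitch) and its energy scales the amplitude (loudness):

```python
import numpy as np

SR = 44100  # sample rate in Hz (matching the 44.1 kHz recording)

def render_soft_keys(soft_keys, wavetable, duration):
    """Render one wavetable voice per soft key and sum the voices.

    soft_keys : list of (frequency_hz, energy) pairs -- the two state
                variables of a soft key.
    wavetable : 1-D float array holding one cycle of the sample
                (standing in here for the vocal samples).
    """
    n = int(duration * SR)
    t = np.arange(n) / SR
    out = np.zeros(n)
    L = len(wavetable)
    for freq, energy in soft_keys:
        # phase accumulator: sweep the table `freq` times per second
        # so the voice sounds at the soft key's frequency
        phase = (t * freq * L) % L
        voice = wavetable[phase.astype(int)]  # nearest-neighbour lookup
        out += energy * voice                 # energy -> loudness
    return out
```

Summing the rendered layer with the original recording then yields the mixed result described above.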