A Real-Time Spatial Sound Processor for Music and Virtual Reality Applications
Jean-Marc Jot, Olivier Warusfel
ICMC 95, Banff (Canada) 1995
Copyright © Ircam - Centre Georges-Pompidou 1995
Abstract
The Spatialisateur, developed by Espaces Nouveaux and Ircam,
is a real-time spatial processor which makes it possible to reproduce and
control the localization of sound sources and the projection of sounds in a
real or virtual space. It can be configured for various reproduction formats
over loudspeakers or headphones, and controlled through a higher-level user
interface based on perceptual attributes derived from psychoacoustical
research. Applications include studio recording and computer music, virtual
reality and multimedia, and variable acoustics in rooms (sound reinforcement
and reverberation enhancement).
Introduction
The Spatialisateur was developed in the Max object-oriented software
environment (Puckette 1991), and is implemented as a Max object (named
Spat~) running in real-time on the Ircam Musical Workstation.
Spat~ can also be considered as a library of elementary objects for
real-time spatial processing of sounds (artificial reverberators, multichannel
panning potentiometers, parametric equalizers, etc.). This modularity allows
one to configure the spatial processor for different applications or with
different computational costs, depending on the reproduction format or set-up,
the desired flexibility in controlling the room effect, and the available
digital signal processing resources. The design approach focuses on letting
the user specify the desired effect from the point of view of the listener,
rather than from the point of view of the technological apparatus or physical
process which generates that effect. In a musical context, this allows the
user to take spatial effects into account directly at the composition stage,
without referring to a particular electro-acoustical apparatus or performing
space.
1. Processing Structure
To provide a global description of the reproduced effect, the temporal and
directional aspects are integrated in a cost-efficient implementation, using
the capacity of a single programmable digital signal processor per sound
source, with no additional arithmetic hardware. Spat~ can be viewed as
an extension of the system proposed in (Chowning 1971), allowing effective and
intuitive control of the direction of sound events as well as of their
distance or proximity (see section 3 below). The Spat~ processor is
formed by cascading four configurable sub-modules, namely: Source~,
Room~, Pan~, Out~. The Room~ module is a computationally
efficient and scalable multi-channel reverberator based on multi-channel delay
networks with feedback, designed to ensure the naturalness and accuracy
required by music and virtual reality applications (Jot et al. 1995). The
input signal (assumed devoid of reverberation) is pre-processed by the
Source~ module, which includes a low-pass filter and a variable delay
line to reproduce air absorption and the Doppler effect. Input equalizers
allow additional corrections according to the nature of the input signal(s) or
the position of the microphone(s) relative to the instrument.
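As an illustration of this processing chain, the following is a minimal sketch
in Python of a source pre-processing stage (a propagation delay and a one-pole
low-pass filter standing in for the Doppler effect and air absorption) feeding
a small four-channel feedback delay network. The function names, delay
lengths, filter laws and the Householder feedback matrix are illustrative
assumptions, not the actual Spat~ algorithms or parameter values.

# Illustrative sketch only; not the Spat~ implementation.
import numpy as np

FS = 44100  # sampling rate (Hz)

def source_preprocess(x, distance_m, fs=FS):
    """Delay and low-pass the dry signal to mimic propagation delay
    (Doppler when the delay varies over time) and air absorption."""
    # propagation delay in samples (speed of sound ~ 343 m/s)
    delay = int(round(distance_m / 343.0 * fs))
    y = np.concatenate([np.zeros(delay), x])[:len(x)]
    # one-pole low-pass: cutoff lowered as distance grows (air absorption)
    cutoff = max(500.0, 16000.0 / (1.0 + distance_m / 10.0))
    a = np.exp(-2.0 * np.pi * cutoff / fs)
    out = np.zeros_like(y)
    state = 0.0
    for n, v in enumerate(y):
        state = (1.0 - a) * v + a * state
        out[n] = state
    return out

def fdn_reverb(x, t60=2.0, fs=FS):
    """4-channel feedback delay network: mutually prime delay lines,
    a Householder feedback matrix, and per-line attenuation derived
    from the requested decay time."""
    delays = np.array([1031, 1327, 1523, 1783])      # samples
    gains = 10.0 ** (-3.0 * delays / (fs * t60))      # -60 dB after t60
    A = np.eye(4) - 0.5 * np.ones((4, 4))             # Householder, unitary
    lines = [np.zeros(d) for d in delays]
    ptr = np.zeros(4, dtype=int)
    out = np.zeros((len(x), 4))
    for n, v in enumerate(x):
        taps = np.array([lines[i][ptr[i]] for i in range(4)])
        out[n] = taps
        fb = A @ (taps * gains)                        # feedback mixing
        for i in range(4):
            lines[i][ptr[i]] = v + fb[i]
            ptr[i] = (ptr[i] + 1) % delays[i]
    return out

# usage: dry mono burst -> pre-processing -> 4-channel reverberant signal
dry = np.random.randn(FS // 2) * np.hanning(FS // 2)
wet = fdn_reverb(source_preprocess(dry, distance_m=5.0))

The per-line gain 10^(-3 d / (fs t60)) simply sets the attenuation of each
recirculating path so that the network decays by 60 dB over the requested
reverberation time.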
2. Configuration According to the Reproduction Context
The directional distribution module Pan~ converts the multi-channel
output of the Room~ module to a given reproduction format, while
simultaneously allowing control of the apparent direction of the sound source.
It can be configured for two-channel formats, including three-dimensional
stereophony (binaural or transaural) over headphones or over a pair of
loudspeakers (Jot et al. 1995), and the simulation of coincident or
non-coincident microphone recordings. Multi-channel configurations,
appropriate for studios or auditoria, include the '3/2-stereo' format derived
from the motion picture industry (Theile 1993) and systems of 4 to 8
loudspeakers capable of reproducing all directions in the horizontal plane.
The reproduced effect can be specified irrespective of the reproduction
context and is, as much as possible, preserved from one reproduction mode or
listening room to another. The Out~ module compensates for the frequency
response of the loudspeakers or headphones and for time lags due to the
geometry of the loudspeaker system. Additionally, when the listening room is
not acoustically neutral, the processor can take into account measurements
made at a reference listening position in order to automatically perform the
necessary corrections in the room effect synthesis, so that the perceived
effect at the reference position is as close as possible to the specification
given by the user.
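The following sketch illustrates, in simplified form, the kind of processing
performed by Pan~ and Out~ for a horizontal loudspeaker ring:
constant-power pairwise panning between the two loudspeakers adjacent to the
target azimuth, followed by per-loudspeaker delay and gain alignment to a
reference listening position. The panning law, loudspeaker layout and
alignment rule are assumptions chosen for illustration, not the Spat~
algorithms.

# Illustrative sketch only; not the Spat~ implementation.
import numpy as np

FS = 44100
SPEED_OF_SOUND = 343.0  # m/s

def pan_ring(x, azimuth_deg, speaker_az_deg):
    """Distribute a mono signal to the two loudspeakers adjacent to the
    target azimuth, with a constant-power (sine/cosine) crossfade."""
    az = np.deg2rad(azimuth_deg % 360.0)
    spk = np.deg2rad(np.sort(np.asarray(speaker_az_deg) % 360.0))
    n = len(spk)
    out = np.zeros((len(x), n))
    # find the pair of adjacent speakers surrounding the target azimuth
    i = int(np.searchsorted(spk, az) % n)
    j = (i - 1) % n
    lo, hi = spk[j], spk[i]
    span = (hi - lo) % (2 * np.pi)
    if span == 0.0:
        span = 2 * np.pi
    frac = ((az - lo) % (2 * np.pi)) / span   # 0 at speaker j, 1 at speaker i
    out[:, j] = np.cos(frac * np.pi / 2) * x  # constant-power gains
    out[:, i] = np.sin(frac * np.pi / 2) * x
    return out

def align_outputs(multi, speaker_dist_m, fs=FS):
    """Compensate for unequal loudspeaker distances: delay and attenuate
    the nearer speakers so all channels arrive aligned in time and level
    at the reference listening position."""
    d = np.asarray(speaker_dist_m, dtype=float)
    extra = (d.max() - d) / SPEED_OF_SOUND    # seconds of delay to add
    gains = d / d.max()                        # 1/r level matching
    out = np.zeros_like(multi)
    for k in range(multi.shape[1]):
        lag = int(round(extra[k] * fs))
        out[lag:, k] = multi[:multi.shape[0] - lag, k] * gains[k]
    return out

# usage: pan a short noise burst to 60 degrees over a 6-speaker ring
speakers_az = [0, 60, 120, 180, 240, 300]
speakers_dist = [2.0, 2.1, 1.9, 2.0, 2.2, 2.0]
sig = np.random.randn(FS // 10)
feeds = align_outputs(pan_ring(sig, 60.0, speakers_az), speakers_dist)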
3. Perceptual Control Interface
The reproduced effect can be specified through a higher-level user interface,
which controls the different signal processing modules in Spat~
simultaneously. At its core is a perceptual control module derived from
psychoacoustical research carried out at Ircam (Jullien 1995, Warusfel 1990).
The perceptual approach makes it possible to design a spatial processor which
does not rely on a physical and geometrical description of the virtual
environment for synthesizing the artificial room effect (e.g. Moore 1983,
Foster et al. 1991). Instead, the user interface is directly related to the
perception of the reproduced sound by the listener, which is described by a
small number of mutually independent perceptual factors:
- source proximity, brilliance and warmth (energy and spectrum of direct
sound and early reflections),
- room presence and envelopment (relative energies of direct sound, early and
late room effect),
- running reverberance (early decay time) and late reverberance (late decay
time),
- heaviness and liveness (variation of decay time with frequency).
Each perceptual factor is related to a measurable acoustical criterion
characterizing the sound transformation, which makes it possible to map the
perceptual representation onto signal processing parameters. Consequently,
virtual and measured acoustical qualities can be manipulated within a unified
framework.
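The sketch below illustrates one way such a mapping could look: a handful of
perceptual factors, each scaled to the range 0..1, are translated into decay
times, reverberator line gains and send levels. The factor names follow the
paper, but the formulas, value ranges and parameter names are illustrative
assumptions rather than the actual Spat~ mapping.

# Illustrative sketch only; not the Spat~ mapping.
import numpy as np

FS = 44100
FDN_DELAYS = np.array([1031, 1327, 1523, 1783])  # reverberator delays (samples)

def perceptual_to_dsp(factors, fs=FS):
    """Translate perceptual factors (each scaled 0..1) into low-level
    reverberator and mixing parameters."""
    p = factors
    # late reverberance -> mid-frequency decay time -> per-delay-line gain
    t60_mid = 0.5 + 3.5 * p["late_reverberance"]          # 0.5 s .. 4.0 s
    line_gains = 10.0 ** (-3.0 * FDN_DELAYS / (fs * t60_mid))
    # heaviness / liveness -> decay-time ratios at low / high frequencies
    t60_low = t60_mid * (1.0 + p["heaviness"])             # longer bass decay
    t60_high = t60_mid * (0.3 + 0.7 * p["liveness"])       # brighter tail
    # room presence -> reverberant-to-direct balance -> reverb send level
    reverb_send_db = -40.0 + 40.0 * p["room_presence"]
    # source proximity -> direct-sound level (a closer source sounds louder)
    direct_db = -20.0 + 20.0 * p["source_proximity"]
    return {
        "t60_low_s": t60_low,
        "t60_mid_s": t60_mid,
        "t60_high_s": t60_high,
        "fdn_line_gains": line_gains,
        "reverb_send_db": reverb_send_db,
        "direct_db": direct_db,
    }

# usage: a fairly reverberant, distant source
spec = {"late_reverberance": 0.7, "heaviness": 0.3, "liveness": 0.6,
        "room_presence": 0.8, "source_proximity": 0.2}
print(perceptual_to_dsp(spec))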
4. Applications
By inserting a Spat~ processor in each channel of a mixing console or
virtual mixing environment (devoting one DSP to each source channel), the
localization and room effect can be controlled intuitively for each sound
event. The mix can be produced in traditional as well as currently developing
formats, including 3/2-stereo or three-dimensional two-channel stereo (binaural
or transaural recording). The processor allows dynamic movements of sound
sources and remote control through pointing or tracking devices. The realism
of sound reproduction over headphones is substantially enhanced by
interfacing the spatial processor with a head-tracking device and by the
synthesis of a natural-sounding room effect. Music, multimedia or virtual
reality applications can benefit from a perceptually-oriented user interface
which is particularly suitable for dynamic interpolation between different
acoustical qualities. The Spat~ library can also be used in the design
of an electro-acoustic system that dynamically modifies the acoustical quality
of a large hall, for sound reinforcement or reverberation enhancement.
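As a small illustration of the interpolation idea mentioned above, the sketch
below glides between two hypothetical perceptual presets ("dry booth" and
"large hall") by interpolating the factor values themselves; at each update
the resulting factors would then be mapped to signal processing parameters,
as in the mapping sketch of section 3. The preset names and values are
invented for the example.

# Illustrative sketch only; preset names and values are hypothetical.
PRESET_DRY_BOOTH = {"room_presence": 0.1, "late_reverberance": 0.15,
                    "envelopment": 0.2, "source_proximity": 0.9}
PRESET_LARGE_HALL = {"room_presence": 0.85, "late_reverberance": 0.8,
                     "envelopment": 0.9, "source_proximity": 0.3}

def interpolate_presets(a, b, t):
    """Linear interpolation between two perceptual presets, t in [0, 1]."""
    return {k: (1.0 - t) * a[k] + t * b[k] for k in a}

# usage: a gradual transition from a dry booth to a large hall
for step in range(101):
    t = step / 100.0
    factors = interpolate_presets(PRESET_DRY_BOOTH, PRESET_LARGE_HALL, t)
    # the factors would be mapped to reverberator / mixing parameters here
    if step % 50 == 0:
        print(t, factors)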
References
J. Chowning, "The simulation of moving sound sources", J. Audio Eng.
Soc., vol. 19, no. 1, 1971.
S. Foster, E. M. Wenzel, R. M. Taylor, "Real-time synthesis of complex acoustic
environments", Proc. IEEE Workshop on Applications of Digital Signal
Processing to Audio and Acoustics, 1991.
J.-M. Jot, V. Larcher, O. Warusfel, "Digital signal processing issues in the
context of binaural and transaural stereophony", Proc. 98th Conv. Audio
Eng. Soc., preprint 3980, 1995.
J.-P. Jullien, "Structured model for the representation and the control of room
acoustical quality", Proc. 15th International Conf. on Acoustics,
1995.
F. R. Moore, "A general model for spatial processing of sounds", Computer
Music Journal, vol. 7, no. 3, 1983.
M. Puckette, "Combining event and signal processing in the Max graphical
programming environment", Computer Music Journal, vol. 15, no. 3,
1991.
G. Theile, "The new sound format '3/2-stereo'", Proc. 94th Conv. Audio Eng.
Soc., preprint 3550a, 1993.
O. Warusfel, "Étude des paramètres liés à la prise de son pour les
applications d'acoustique virtuelle", Proc. 1st French Congress on Acoustics,
1990.