
Stratified approach to spatialisation revisited

December 26, 2009

I remain impressed with the latest version of the IRCAM Spat Max/MSP library for live spatialisation.

I have for some years maintained a small Jamoma UserLib with Jamoma module wrappers for a few of the functionalities offered by Spat. So far the purpose has been to make available to Jamoma some essential functionalities and externals from Spat that are not available elsewhere freely (or under GNU LGPL-compatible terms): air filtering and binaural decoding of ambisonic B-format signals. Up until now these were based on earlier (3.x) versions of Spat. Today I updated one of them, the air filter, to work with Spat 4.1.5. The other module requires more work, and might be abandoned in favour of other and better implementations. It is tempting to start a much more thorough Jamoma wrapping of all of Spat, as a supplement to the other functionalities for spatialisation already present.

The PDF documentation of Spat currently seems to be in a transitional state between versions 3.x and 4.x. This is actually a good thing, for me at least, as it reveals more of the inner logic of the system than the objects provided in Spat 4.1.5 do.

I notice structural ideas related to the stratified approach to spatialisation proposed in a paper by Nils Peters, myself and others at the SMC 2009 conference:

Stratified model according to Peters et al. (2009).


In our model, the DSP processing required for spatialisation was structured in two layers, the encoding and decoding layers.

Spat expands on, or further details, this layered model. In Spat, signal processing is divided into four successive stages, separating directional effects from temporal effects:

  • Pre-processing of input signals (Source)
  • Room effects module (reverberator) (Room)
  • Directional distribution module (Panning)
  • Output equalization module (Decoding)


Quoting the documentation “the reunion of these four modules constitutes a full processing chain from sound pickup to the output channels, for one source or sound event. Each one of these four modules works independently from the others (they have distinct control syntaxes) and can be used individually. Each module has a number of attributes that allow to vary its configuration (for instance, varying complexities for the room effect module or different output channel configurations for the directional distribution module). This modularity allows easy configuration of Spat~ according to the reproduction format, to the nature of input signals, or to hardware constraints (e.g. available processing power).”
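To make the modularity described in the quote a bit more concrete, here is a minimal Python sketch of four independently configurable stages chained in the order listed above. It is not Spat's API and does no real DSP: the function names (`source`, `room`, `panning`, `decoding`, `chain`) and their parameters (`gain`, `wet`, `position`, `gains`) are hypothetical, and each stage is a deliberately trivial placeholder. The only point is that each stage can be configured and used on its own, or composed into a full chain.

```python
from typing import Callable, List, Sequence

Signal = List[float]                           # samples of one channel
Stage = Callable[[List[Signal]], List[Signal]]  # channels in, channels out


def source(gain: float = 1.0) -> Stage:
    """Pre-processing placeholder: scales every input channel by `gain`."""
    def process(channels: List[Signal]) -> List[Signal]:
        return [[gain * s for s in ch] for ch in channels]
    return process


def room(wet: float = 0.0) -> Stage:
    """Room placeholder: adds a one-sample-delayed copy scaled by `wet`.
    This is a stand-in, not a reverberator."""
    def process(channels: List[Signal]) -> List[Signal]:
        out = []
        for ch in channels:
            delayed = [0.0] + ch[:-1]
            out.append([dry + wet * echo for dry, echo in zip(ch, delayed)])
        return out
    return process


def panning(position: float = 0.5) -> Stage:
    """Directional placeholder: linear pan of the first channel to two outputs
    (position 0.0 = fully first output, 1.0 = fully second output)."""
    def process(channels: List[Signal]) -> List[Signal]:
        mono = channels[0]
        return [[(1.0 - position) * s for s in mono],
                [position * s for s in mono]]
    return process


def decoding(gains: Sequence[float] = (1.0, 1.0)) -> Stage:
    """Output placeholder: one static gain per output channel."""
    def process(channels: List[Signal]) -> List[Signal]:
        return [[g * s for s in ch] for g, ch in zip(gains, channels)]
    return process


def chain(*stages: Stage) -> Stage:
    """Composes stages so the output of one feeds the next."""
    def process(channels: List[Signal]) -> List[Signal]:
        for stage in stages:
            channels = stage(channels)
        return channels
    return process


# Full chain for one source, in the order Source, Room, Panning, Decoding.
full_chain = chain(source(gain=0.5), room(wet=0.0),
                   panning(position=0.25), decoding())
out = full_chain([[1.0, 2.0]])
```

Because every stage has the same channels-in, channels-out shape, any one of them can be swapped or reconfigured without touching the others, for example replacing `panning` with a stage that produces more output channels for a different reproduction format.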

Our encoding layer covers the Source and Room modules and half of the Panning module, while our decoding layer covers the Spat Decoding module as well as the remainder of the Panning module. One could imagine the model having three main layers: authoring, signal processing and hardware, with further subdivision as follows:

Update: SVG files apparently do not work too well with RSS readers, so the last image shows up as cropped or not at all in the feeds I have tested so far.