Virtual Synthesis for the Real World

Caillou: A Granular Resynthesis Engine

Caillou combines the techniques of Granular Synthesis and Resynthesis in a single engine, giving you a powerful sound generation tool.

Granular Synthesis

Granular Synthesis is a synthesis method that divides the input audio into small chunks, or Grains, and applies rules to them to alter the sound. Each Grain represents a small portion of the overall audio signal. Grains can be repeated or shortened to lengthen or shorten the audio signal, with little effect on the character of the signal itself.

Alternatively, a Grain can be played back at a different speed, which changes the pitch of the audio signal. Speeding up playback raises the pitch. Slowing down playback lowers it. With these levers, you can control the playback speed and the pitch of the audio signal independently.
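The two levers described above can be sketched in a few lines of Python. This is only an illustration, not Caillou's implementation: the function names (split_into_grains, stretch, repitch) are hypothetical, audio is assumed to be a plain list of samples, and the windowing and overlapping that a real granular engine would use to avoid clicks are left out.

```python
def split_into_grains(samples, grain_size):
    """Cut a list of samples into consecutive Grains of grain_size samples."""
    return [samples[i:i + grain_size] for i in range(0, len(samples), grain_size)]


def stretch(samples, grain_size, repeats):
    """Lengthen the signal by playing every Grain `repeats` times in a row."""
    out = []
    for grain in split_into_grains(samples, grain_size):
        out.extend(grain * repeats)
    return out


def repitch(grain, speed):
    """Read a Grain back at `speed` times its original rate.

    speed > 1 raises the pitch (fewer output samples), speed < 1 lowers it.
    Linear interpolation is used between neighbouring samples; speed must be > 0.
    """
    out = []
    pos = 0.0
    last = len(grain) - 1
    while pos <= last:
        i = int(pos)
        frac = pos - i
        nxt = grain[min(i + 1, last)]
        out.append(grain[i] * (1.0 - frac) + nxt * frac)
        pos += speed
    return out
```

Repeating Grains changes the length without touching the pitch, while repitch changes the pitch of each Grain. Using both together is what makes the two controls independent.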


Resynthesis

Resynthesis is the process of breaking an audio signal into its component frequencies and synthesizing waveforms to play at these frequencies. When multiple voices are combined, a mostly accurate version of the audio signal can be created.

The accuracy of the resynthesized audio signal depends on a few main variables.

Frame Size

The Frame Size adjusts how much of the audio signal is used to determine the frequencies and volumes that in turn generate the resynthesized signal. A small Frame Size is less accurate at determining the exact frequencies to generate, because each frequency band is wider:

frequency_band = sample_rate / frame_size

On the other hand, a smaller Frame Size allows for a higher-definition audio signal, because each Frame covers a shorter slice of time. A large Frame Size provides more accurate frequency definition, but the tradeoff is a lower-definition audio signal.
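To make the tradeoff concrete, the formula above can be evaluated for a few Frame Sizes. The sample rate of 48000 Hz and the Frame Sizes listed are example values chosen for illustration, and frame_duration is a helper added here to show how much time each Frame covers.

```python
def frequency_band(sample_rate, frame_size):
    """Width in Hz of each frequency band for a given Frame Size."""
    return sample_rate / frame_size


def frame_duration(sample_rate, frame_size):
    """Length in seconds of the audio covered by one Frame."""
    return frame_size / sample_rate


SAMPLE_RATE = 48000  # example value, not a Caillou default

for frame_size in (256, 1024, 4096):
    band = frequency_band(SAMPLE_RATE, frame_size)
    duration_ms = 1000.0 * frame_duration(SAMPLE_RATE, frame_size)
    print(f"frame_size={frame_size}: band={band} Hz, frame covers {duration_ms:.2f} ms")
```

At 48000 Hz, a Frame Size of 1024 gives bands 46.875 Hz wide, while 4096 narrows them to 11.71875 Hz at the cost of each Frame covering four times as much audio.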

Wave Shape

Sine waves are the fundamental building block of the sounds we hear, but other wave shapes can also be used to recreate these sounds. Each wave shape adds its own character to the overall sound, ranging from smooth to very rough.
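As a rough illustration of what a wave shape is, the sketch below evaluates four common shapes at a phase between 0 and 1 (one full cycle). The shape names and the wave_value function are assumptions made for this example. The text above does not list which shapes Caillou offers.

```python
import math


def wave_value(shape, phase):
    """Return the value in [-1, 1] of one cycle of `shape` at `phase` in [0, 1)."""
    if shape == "sine":
        return math.sin(2.0 * math.pi * phase)      # smooth, a single frequency
    if shape == "triangle":
        return 1.0 - 4.0 * abs(phase - 0.5)         # soft corners, mild overtones
    if shape == "saw":
        return 2.0 * phase - 1.0                    # ramp with a sharp reset, bright
    if shape == "square":
        return 1.0 if phase < 0.5 else -1.0         # hard edges, rough and hollow
    raise ValueError(f"unknown wave shape: {shape}")


# One cycle of each shape, sampled at 8 points.
for shape in ("sine", "triangle", "saw", "square"):
    print(shape, [round(wave_value(shape, i / 8), 3) for i in range(8)])
```

The sharper the corners in a shape, the more overtones it carries, which is why the same set of frequencies sounds smoother with sines and rougher with saws or squares.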

Voice Count

The number of voices used to regenerate the signal alters the overall depth of the resynthesized audio signal. A smaller number of voices produces a shallower audio signal that favors the highest-magnitude frequencies, whereas a larger number of voices provides a fuller and deeper sound.
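The effect of the voice count can be sketched as follows: analyze one Frame, keep only the strongest frequencies, and rebuild the signal from them. This is a minimal sketch, assuming a naive discrete Fourier transform for the analysis and sine waves for the voices. It is not Caillou's actual analysis method, and analyze_frame, pick_voices, and resynthesize are hypothetical names.

```python
import cmath
import math


def analyze_frame(frame, sample_rate):
    """Naive DFT of one Frame: returns (frequency, amplitude, phase) per band."""
    n = len(frame)
    voices = []
    for k in range(1, n // 2):
        x = sum(frame[i] * cmath.exp(-2j * math.pi * k * i / n) for i in range(n))
        voices.append((k * sample_rate / n, 2.0 * abs(x) / n, cmath.phase(x)))
    return voices


def pick_voices(voices, voice_count):
    """Keep only the voice_count bands with the highest magnitude."""
    return sorted(voices, key=lambda v: v[1], reverse=True)[:voice_count]


def resynthesize(voices, length, sample_rate):
    """Sum one sinusoid per voice to rebuild `length` samples."""
    return [
        sum(a * math.cos(2.0 * math.pi * f * i / sample_rate + p) for f, a, p in voices)
        for i in range(length)
    ]


# Example input: a loud 500 Hz tone plus a quieter 1500 Hz tone.
SAMPLE_RATE, FRAME_SIZE = 8000, 64
frame = [
    math.sin(2 * math.pi * 500 * i / SAMPLE_RATE)
    + 0.5 * math.sin(2 * math.pi * 1500 * i / SAMPLE_RATE)
    for i in range(FRAME_SIZE)
]
all_voices = analyze_frame(frame, SAMPLE_RATE)
one_voice = resynthesize(pick_voices(all_voices, 1), FRAME_SIZE, SAMPLE_RATE)
two_voices = resynthesize(pick_voices(all_voices, 2), FRAME_SIZE, SAMPLE_RATE)
```

With a single voice only the loudest tone survives. With two voices both tones are rebuilt, which is the fuller result described above.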

Granular Resynthesis

Granular Resynthesis takes the best parts of both Granular Synthesis and Resynthesis and combines them into a single engine that provides aspects of both.


Tuning

Tuning provides the ability to alter the frequencies of all voices simultaneously, without altering any other aspect of the synthesis. This allows for fine control of the pitch of the audio signal.
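One way to picture Tuning is as a single ratio applied to every voice's frequency while the amplitudes stay untouched. The sketch below assumes the tuning amount is given in semitones and that a voice is a (frequency, amplitude) pair. Both are assumptions for illustration, as is the name tune_voices.

```python
def tune_voices(voices, semitones):
    """Shift every voice's frequency by `semitones`, leaving amplitudes as they are."""
    ratio = 2.0 ** (semitones / 12.0)  # 12 semitones doubles the frequency
    return [(frequency * ratio, amplitude) for frequency, amplitude in voices]


voices = [(220.0, 1.0), (440.0, 0.5), (660.0, 0.25)]
print(tune_voices(voices, 12))    # one octave up
print(tune_voices(voices, -0.5))  # a quarter tone down, for fine adjustment
```

Because every voice is scaled by the same ratio, the relationships between the frequencies are preserved and only the overall pitch moves.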

Step Size

The Step Size controls the playback speed of the Granular Synthesis, and thus the amount of time that it takes to step through a Frame. Step Size can be positive or negative, giving the option to run in reverse as well as forward.
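The behaviour of a signed Step Size can be sketched as a playback position that advances by the Step Size on every tick. In this sketch the position is measured in Frames, and it wraps around at either end of the material. Both choices, and the name frame_positions, are assumptions made for illustration rather than documented Caillou behaviour.

```python
def frame_positions(start, step_size, count, total_frames):
    """Return `count` successive playback positions, wrapping at the ends."""
    positions = []
    pos = float(start)
    for _ in range(count):
        positions.append(pos % total_frames)  # wrap forward and backward
        pos += step_size
    return positions


print(frame_positions(0, 1.0, 4, 10))   # normal speed, forward
print(frame_positions(0, 0.5, 4, 10))   # half speed, forward
print(frame_positions(0, -1.0, 4, 10))  # normal speed, in reverse
```

A Step Size with a smaller magnitude dwells longer on each Frame, and a negative one walks through the same Frames in the opposite order.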

Sonic Fingerprints

Sonic Fingerprints are the distillation of an audio signal, encompassing multiple Spectral Grains across multiple Frame Rates. They provide the basis for Granular Resynthesis: a selection of frequencies and amplitudes from which to build new sounds and textures.