His vocoder was a channel vocoder. The original intention was to transmit intelligible voice signals over telephone wires, but it didn't get used that way. (However, vocoders are used in cell phones today.) Dudley mentions in his patent that the vocoder had musical potential, and indeed, as early as 1956, people began using it to create what sounds like "robotic" singing.
That's what we'll be doing here. Step by step, we will take two audio signals and combine them the way a channel vocoder would, right here in the browser.
The first time through, all of the parameters will be fixed. On subsequent runs, you can change anything you want!
A vocoder needs at least two signals: a carrier signal and an "information" signal. The information signal is usually speech, and the carrier signal is usually a harmonically rich signal, like a sawtooth wave or a cello recording.
In the original communication context, the goal was to faithfully recreate the original information signal, and so a carrier signal was chosen to optimize for that. Here, we want an "interesting" result signal, one which resembles speech but also has the "pitched" qualities of the carrier.
I've chosen a bassoon snippet for the carrier signal and a recording of me talking for the information signal. Listen to them.
The first step is to filter the information signal so that we get rid of the frequencies in ranges that aren't relevant to speech and keep the ones that are. Dudley's original vocoder retained frequencies centered around 112.5, 337.5, 575, 850, 1200, 1700, 2350, 3250, 4600, and 6450 Hz. We're just going to use six of these, dropping 112.5, 3250, 4600, and 6450, just because that seems to work fine and is less work for the browser.
Once we run bandpass filters centered at those frequencies, we'll get six separate signals containing only frequencies near those six centers (337.5, 575, 850, 1200, 1700, and 2350 Hz). These are the "channels" referred to by the name "channel vocoder."
How near the frequencies must be to the centers of the bandpasses is determined by the Q value. Q describes how narrow each bandpass is (roughly, its center frequency divided by the width of the band it lets through). The higher the Q, the narrower the frequency band.
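If you'd like to see the filtering step as code, here is a minimal sketch of a bandpass filter bank working on plain arrays of samples. It is not necessarily how the app does it: I'm assuming a second-order (biquad) bandpass with the widely used "Audio EQ Cookbook" coefficients, and the function names (bandpassCoefficients, bandpass, splitIntoChannels) are made up for this illustration. The center frequencies are the six listed above.

```javascript
// Assumption: a biquad bandpass with constant 0 dB peak gain, using the
// "Audio EQ Cookbook" formulas. The real app may filter differently.
function bandpassCoefficients(centerHz, q, sampleRate) {
  const w0 = (2 * Math.PI * centerHz) / sampleRate;
  const alpha = Math.sin(w0) / (2 * q); // higher Q -> smaller alpha -> narrower band
  const a0 = 1 + alpha;
  // Everything is divided by a0 so the filter loop can skip that step.
  return {
    b0: alpha / a0,
    b1: 0,
    b2: -alpha / a0,
    a1: (-2 * Math.cos(w0)) / a0,
    a2: (1 - alpha) / a0,
  };
}

// Runs one bandpass over an array of samples and returns the filtered copy.
function bandpass(samples, centerHz, q, sampleRate) {
  const { b0, b1, b2, a1, a2 } = bandpassCoefficients(centerHz, q, sampleRate);
  const out = new Float64Array(samples.length);
  let x1 = 0, x2 = 0; // the two previous input samples
  let y1 = 0, y2 = 0; // the two previous output samples
  for (let i = 0; i < samples.length; i++) {
    const x0 = samples[i];
    const y0 = b0 * x0 + b1 * x1 + b2 * x2 - a1 * y1 - a2 * y2;
    out[i] = y0;
    x2 = x1; x1 = x0;
    y2 = y1; y1 = y0;
  }
  return out;
}

// The six center frequencies used in this walkthrough.
const CENTER_FREQUENCIES_HZ = [337.5, 575, 850, 1200, 1700, 2350];

// Produces one "channel" signal per center frequency.
function splitIntoChannels(samples, q, sampleRate) {
  return CENTER_FREQUENCIES_HZ.map((hz) => bandpass(samples, hz, q, sampleRate));
}
```

Calling splitIntoChannels(samples, 10, 44100) would give back six arrays, one per channel. Raising the Q argument makes each channel keep a thinner slice of the spectrum around its center.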
Hit "Get channel signals," then listen to the resulting channel signals.
Next, we're going to run an envelope follower on these channel signals. An envelope follower measures the average power of a signal. It's called an "envelope follower" because it produces an envelope that follows the shape of the signal.
This averaging produces a signal that is smoother, meaning that it jumps up and down less abruptly. (If we were going to use this for communication, this would be nice because a signal with less meaningful detail in it can be represented by a smaller amount of data.)
We're not calculating straight averages, though; we have weights, called smoothing factors, on how quickly a rise in the signal should affect the average and how quickly a fall should affect it. With these, we can adjust the output's sensitivity to attack and decay in the input.
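To make the weighted averaging concrete, here is a rough sketch of an envelope follower with separate smoothing factors for rises and falls. Some assumptions to flag: the function and parameter names (followEnvelope, attackSmoothing, releaseSmoothing) are hypothetical, the smoothing factors are taken to be between 0 and 1 with higher values meaning slower reactions, and I'm tracking the absolute value of each sample rather than its square. The app's own follower may differ on any of these points.

```javascript
// Assumption: a one-pole smoother over the rectified (absolute value) signal.
// attackSmoothing applies while the input is above the current level,
// releaseSmoothing applies while it is at or below it.
function followEnvelope(samples, attackSmoothing, releaseSmoothing) {
  const envelope = new Float64Array(samples.length);
  let level = 0; // the running, weighted average
  for (let i = 0; i < samples.length; i++) {
    const magnitude = Math.abs(samples[i]);
    const smoothing = magnitude > level ? attackSmoothing : releaseSmoothing;
    // Keep `smoothing` of the old level and take the rest from the new sample.
    level = smoothing * level + (1 - smoothing) * magnitude;
    envelope[i] = level;
  }
  return envelope;
}
```

With a small attack factor and a large release factor, the envelope jumps up quickly when the voice gets louder and trails off slowly when it gets quieter.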
Hit the "Get envelope followers for bands" button to run the envelope followers on the channel signals.
Then, check out the results. (Some of them will be very quiet; that's OK!)
We're going to split the carrier signal into channels, just as we did with the information signal. Hit the "Get channel signals for carrier" button to run the bandpass filters on the carrier, then check out the results.
Now we're getting into the synthesis part of the process. The previous steps extracted information from the input signals. Now we'll put them together. First, we'll modulate the carrier channel signals with the envelope signals (which we got by running the envelope follower on the information channel signals).
By modulating, I mean multiplying. Every sample in the carrier signal will get multiplied by its corresponding sample in the envelope signal to produce the corresponding sample in the result signal. For example, if we had extremely short signals that each had five samples, the modulation would look like this:
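Here is that five-sample case written out as a small sketch. The sample values are made up for illustration, and modulate is a hypothetical helper name, not something from the app.

```javascript
// Multiplies two equal-length signals sample by sample.
function modulate(carrier, envelope) {
  if (carrier.length !== envelope.length) {
    throw new Error("carrier and envelope must have the same length");
  }
  return carrier.map((sample, i) => sample * envelope[i]);
}

// Made-up five-sample signals.
const carrier = [0.5, -0.5, 1, -1, 0.25];
const envelope = [0, 0.5, 1, 0.5, 0];

const modulated = modulate(carrier, envelope);
console.log(modulated); // → [0, -0.25, 1, -0.5, 0]
```

Where the envelope is 0, the carrier is silenced. Where it is 1, the carrier passes through unchanged, and in between it is scaled down.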
In Web Audio, we do this with a GainNode.
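Here is a sketch of how that wiring can look. It relies on two Web Audio behaviors: an AudioNode can be connected to an AudioParam, and audio arriving at a param is added to the param's own value. The helper name createModulator and its arguments are hypothetical, and this isn't necessarily how the app's code is organized.

```javascript
// Hypothetical helper. `audioContext` is an AudioContext (or an
// OfflineAudioContext); `carrierChannelNode` and `envelopeNode` are AudioNodes
// that output a carrier channel signal and its matching envelope signal.
function createModulator(audioContext, carrierChannelNode, envelopeNode) {
  const modulator = audioContext.createGain();
  // Start the gain at zero so that only the envelope signal drives it.
  modulator.gain.value = 0;
  // The carrier channel is the audio that gets scaled...
  carrierChannelNode.connect(modulator);
  // ...and the envelope supplies the scale factor, sample by sample.
  envelopeNode.connect(modulator.gain);
  return modulator;
}
```

Whatever the returned node is connected to next receives the carrier channel multiplied by the envelope.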
Hit the Modulate button, then listen to the modulated signals, but also look at them. (The differences between the modulated signals and their carrier channel ancestors may be more apparent via sight than sound.)
Hit the "Merge modulated signals" button to assemble the modulated signals into the final result signal. This will sum all of the modulated signals together.
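Summing is as simple as it sounds. Here is a minimal sketch on plain arrays, assuming all of the signals have the same length. The mergeSignals name is hypothetical.

```javascript
// Adds up any number of equal-length signals, sample by sample.
function mergeSignals(signals) {
  if (signals.length === 0) return new Float64Array(0);
  const length = signals[0].length; // assumption: every signal has this length
  const merged = new Float64Array(length);
  for (const signal of signals) {
    for (let i = 0; i < length; i++) {
      merged[i] += signal[i];
    }
  }
  return merged;
}
```

In Web Audio itself, connecting several nodes to the same destination sums them for you. Either way, a sum of several channels can end up louder than any single one of them, so it may need to be scaled down afterward.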
^ There it is! Our final signal! Give it a listen and scroll up to the top to listen to the original input signals, for good measure.
We did it! By following these steps, we've synthesized a signal that has characteristics of both the carrier signal and information signal, albeit with perhaps an "unpleasant electrical accent". (Maybe we'll be able to do better than that in the future with a phase vocoder.)
If you want to do this again with the ability to provide your own input signals and to change all of the parameters, you have two options:
The code for this web app is on GitHub.
I wrote this as part of an exploration of signal processing that I'm doing (virtually) at the Recurse Center. Spencer Russell, a Recurse Center alum, patiently explained envelope followers to me and addressed other gaps in my understanding. So, thanks to Spencer and the Recurse Center!