A while back, I wrote an article here about how the digitally controlled oscillator (DCO) in the Roland Juno synths (the old analog ones, not the Juno-D series) works. At the time, I had intended to also write something about how the DCO in the Korg Poly-800 (and its rack-mount sibling, the EX-800) works. I had heard from several sources that it was quite a different design from the Juno DCO, but I wasn't able to find any solid technical information. And I don't actually own a Poly-800, so I didn't have a guinea pig to experiment on. (Years and years ago, I tried one out in a music store, and to be honest I was not that impressed. But back to the topic.)
Recently, I got interested in the topic again, and after a few days, I managed to finally uncover some documentation. And yes, as it turns out, the Poly-800 DCO is quite different from the Juno. It actually flirts with the line between "analog" and "digital" a lot more than the Juno DCO (which has a completely analog audio path) does. And in some ways, it's more capable than the Juno DCO, but in other ways it's quite limited and just plain screwy.
A few months ago I went to the excellent FDISKC web site and downloaded a copy of the Poly-800 schematics. I recall looking at this before and not being able to make much sense of it, and it doesn't help that it's a scanned copy of an original that was already in rather poor condition. But this time, having read up a bit more on the synth's features and its patch programming options, I had a better idea of what to look for.
For those who have not encountered one: the Poly-800 is an eight-voice synth. Its DCOs generate outputs in four octaves for each voice. There are 16', 8', 4', and 2' octaves that can be turned on and off individually. There are two choices of waveform (or so the synth likes to pretend; we'll talk about this later): square and sawtooth. No pulse width modulation on the squares, and no triangle or sine. The normal operating mode is a single DCO per voice, but the 800 can be put in a "double" mode wherein two DCOs are allocated to each voice, the penalty being that the synth is reduced to four voices.
When I looked over the schematics, I noticed an IC with the part number MSM5232. It had two groups of outputs marked as being the four footages mentioned above. Aha, I thought, that must be the IC that generates a voice, or possibly two voices. I got to looking for some notation on the schematic that would explain that that part of the circuit was replicated some number of times (4 or 8 was what I expected), but I couldn't find any such. Also, the IC looked like it was maybe some sort of processor; it had incoming address and data lines. And then there were eight lines marked as "C1" through "C8". I couldn't figure out what those were. A bit of Web searching quickly uncovered that this part was once upon a time made by Oki Electric. However, Oki Electric spun off its semiconductor business into a separate company some years ago; I think it may have been through several changes of hands since then, and in any event, Oki Semiconductor, if it still exists, doesn't seem to have a Web site. So no going to the manufacturer for a data sheet.
I saw several mentions of the Poly-800 service manual having the data sheet, but I only turned up a couple of online sources for the manual, and they looked sketchy (they demanded that you disable your firewall and virus protection in order to download). So no luck there. After hours of searching, I finally found a several-years-old posting that had a pointer to an Italian site. I crossed my fingers and clicked. It was there! And it explains a lot. And now I know...
The Original Chiptunes Synth
The reason I couldn't find any block-replication notation on the schematics was that a single MSM5232 handles all eight DCOs. As it turns out, the MSM5232 wasn't intended to be used in music synthesizers -- it was a tune chip for arcade video games. It contains eight counters that divide down a pair of master clock inputs, and bit shifters that act like octave dividers and produce all of the different footages. It also has a sort-of VCA for each voice, and a pair of onboard attack-sustain-release envelope generators. What it does not have is filters, a problem that we'll get to later.
So here's how it works: Each DCO, as stated above, has a counter-divider that is loaded with a value and then counts down every time the clock signal at the external clock input cycles. When it reaches zero, it sends a reset pulse, and its value gets reloaded. This much is similar to the Juno DCO. On the Juno, each time the counter reaches zero, the pulse resets a fairly conventional sawtooth VCO core. However, the 5232 has no VCO core. Instead, it has a flip-flop that toggles its state on every counter reset -- which means that it is generating a square wave. That's the only waveform it can produce.
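The counter-and-flip-flop arrangement can be sketched in a few lines of Python. This is only a behavioral model under my own assumptions, not anything taken from the data sheet: the function name, the divisor value, and the idea of treating each loop iteration as one clock cycle are all made up for illustration.

```python
def square_from_divider(divisor, clock_ticks):
    """Model a down-counter that toggles a flip-flop each time it reaches zero."""
    counter = divisor          # counter starts loaded with the divide-down value
    state = 0                  # flip-flop output
    out = []
    for _ in range(clock_ticks):
        counter -= 1           # count down once per clock cycle
        if counter == 0:
            state ^= 1         # reset pulse toggles the flip-flop
            counter = divisor  # counter is reloaded
        out.append(state)
    return out

# Hypothetical divisor of 4, observed for 16 clock cycles.
wave = square_from_divider(divisor=4, clock_ticks=16)
print(wave)
```

In this model the flip-flop needs two counter periods to complete one full cycle, so a count of N gives a square wave at the clock frequency divided by 2N.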
Each DCO has a register into which the CPU places a note number when the DCO is to play a note, and a gate flag that turns the voice on and off. The note number is used to look up a counter value from an internal ROM, which will be used to divide down the incoming clock frequency. The flip-flop controlled by the counter drives a chain of octave dividers which generate the four footage outputs. Basically, there is only one octave's worth of counter values, and it taps into the octave divider chain in different places for higher or lower octaves.
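Here is a rough sketch of that lookup-and-divide scheme. Everything numeric in it is an assumption on my part: I don't know the real ROM contents, the clock frequency, or how the chip encodes note numbers, so the table is simply computed from equal temperament and the note numbering (octave times 12 plus semitone) is hypothetical.

```python
# All constants here are assumptions for illustration, not data-sheet values.
CLOCK_HZ = 2_000_000.0   # assumed master clock frequency
BASE_HZ = 2093.0         # assumed pitch of ROM entry 0 at the top of the chain
TOP_OCTAVE = 3           # assumed: notes 36..47 use the first tap of the chain

# One octave of divide-down counts, standing in for the chip's internal ROM.
ROM = [round(CLOCK_HZ / (2 * BASE_HZ * 2 ** (s / 12))) for s in range(12)]

def footage_frequencies(note_number, clock_hz=CLOCK_HZ):
    """Return the frequency of each footage output for a note number."""
    octave, semitone = divmod(note_number, 12)
    octaves_down = TOP_OCTAVE - octave       # how far down the chain to tap
    top = clock_hz / (2 * ROM[semitone])     # flip-flop output frequency
    names = ("2'", "4'", "8'", "16'")
    return {name: top / 2 ** (octaves_down + k) for k, name in enumerate(names)}

freqs = footage_frequencies(0)
for name, hz in freqs.items():
    print(name, round(hz, 2))
```

The point of the sketch is that only twelve counts are ever stored. Moving a note up or down an octave, or picking a different footage, only changes which divide-by-two stage gets tapped.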
So far so good. Now here's where it begins to get screwy. One would think that the logical way to output the voices from the chip would be to have each voice output on its own output pin. That's not what it does. The voices are divided into two groups, and for each group, all of the outputs of a given footage are mixed onto one output pin; for example, all of the 16' footages for voices 1 through 4 come out mixed on pin 28. This answers a big question that is often asked about this synth: why does it use a paraphonic VCF? Answer: because the 5232 doesn't make the individual voice outputs available. The IC provides amplitude control over each voice, but not over the individual footages -- they can only be turned on or off, and the choice applies for all voices in a group. There are two ASR envelope generators onboard, one for each group, but the Poly-800 does not use them. Rather, it applies envelopes generated externally by the synth's CPU. These are input to the chip through eight input pins, one for each voice. I don't think the chip really has VCAs -- I think that all it does is toggle back and forth between the current envelope level and ground, which produces the square wave of the desired amplitude.
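To make the consequences of that pin arrangement concrete, here is a small model of one four-voice group. It follows my guess above that each voice simply switches between its envelope level and ground; the function name, the data layout, and the sample values are hypothetical, and only two voices are filled in to keep it short.

```python
FOOTAGES = ("16'", "8'", "4'", "2'")

def group_outputs(voice_bits, envelope_levels, enabled):
    """Return one mixed sample per footage pin for a group of voices.

    voice_bits:      per-voice dict mapping footage -> 0 or 1 (square-wave state)
    envelope_levels: per-voice envelope level supplied from outside the chip
    enabled:         footages switched on for the whole group
    """
    pins = {}
    for footage in FOOTAGES:
        if footage not in enabled:
            pins[footage] = 0.0   # footage is off for every voice in the group
            continue
        # Each voice contributes either its envelope level or ground (0).
        pins[footage] = sum(level if bits[footage] else 0.0
                            for bits, level in zip(voice_bits, envelope_levels))
    return pins

voice_bits = [
    {"16'": 1, "8'": 0, "4'": 1, "2'": 1},
    {"16'": 1, "8'": 1, "4'": 0, "2'": 0},
]
pins = group_outputs(voice_bits, envelope_levels=[0.5, 0.25], enabled={"16'", "8'"})
print(pins)
```

Once the voices have been summed onto a footage pin there is no way to pull them apart again, which is why anything downstream of the chip, the filter included, has to treat the group as a single signal.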
Each group of four voices is driven by its own external clock source, fed in through its own master clock input. The chip itself has no mechanism for any kind of pitch modulation, so pitch bend and envelope/LFO control over pitch have to be implemented external to the chip, by modulating the master clock frequencies. This is reflected in the Poly-800's architecture; if you put it in the "double" mode, it splits the voices along the two groups and, when detune is selected, drives the groups with clocks of different frequencies.
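Since every voice's pitch is just its group's clock divided by a fixed count, shifting the clock shifts all the voices in that group by the same musical interval. A quick sketch of the arithmetic, with an assumed clock frequency, an assumed counter value, and helper names I made up:

```python
def bent_clock(base_clock_hz, cents):
    """Clock frequency needed to shift every voice in a group by `cents`."""
    return base_clock_hz * 2 ** (cents / 1200)

def voice_frequency(clock_hz, divisor):
    """Square-wave frequency for a counter that toggles a flip-flop at zero."""
    return clock_hz / (2 * divisor)

BASE_CLOCK_HZ = 2_000_000.0   # assumed value, for illustration only
DIVISOR = 478                 # assumed counter value for one note

# "Double" mode with detune: same note in both groups, second clock 10 cents sharp.
group_a = voice_frequency(BASE_CLOCK_HZ, DIVISOR)
group_b = voice_frequency(bent_clock(BASE_CLOCK_HZ, 10), DIVISOR)
beat_hz = group_b - group_a

print(round(group_a, 2), round(group_b, 2), round(beat_hz, 2))
```

The same calculation covers pitch bend and LFO vibrato: whatever moves the clock moves every note in the group by the same number of cents.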
The drawing below shows the basic signal flows. (To reduce drawing clutter, only 4 of the 8 voices are shown.) Each voice consists of a note number register, a counter/divider, and four octave dividers. To play a note, the synth chooses a voice, writes the desired note number into its note number register, and then sets a flag telling the voice to play. The note number is used to look up the divide-down count in the ROM, which then goes to the counter/divider. This divides down the master clock (not shown) for the group that the voice is in (purple or green) and produces the top octave. The four octave dividers then produce the four footages.
This leaves a big question: we've established that the MSM5232 is only capable of generating square waves. But the Poly-800 provides a choice of square or sawtooth waveforms. How does it do that? You may have read something about the Poly-800 using a mathematical technique called "Walsh functions" to generate the sawtooth.
What's a Walsh function? Well, you might know that the process called the "Fourier transform" breaks up a waveform into a set of sine waves that are mixed at different amplitudes. Walsh functions are like the sine waves used in Fourier analysis: by adding together a set of Walsh functions at different frequencies and amplitudes, you can re-create an arbitrary waveform, within a certain bandwidth. And that's what the Poly-800 does to approximate a sawtooth wave: it uses the four footages of square wave that the DCO produces to do the inverse Walsh transform equivalent.
When you have the "square" waveform selected for the DCOs, the four square-wave footages are mixed together in equal amounts before the composite signal goes to the filter. However, when "sawtooth" is selected, the four footages get routed into an analog adder circuit that adds them in a proportion such that the output roughly resembles a sawtooth. We say "roughly" because trying to do Walsh transforms with only four functions is about like trying to do additive synthesis with only four harmonics. (Further, it's not true that all of the Walsh functions are square waves; only some of them are, and it takes a more complete set to do a good Walsh transform.) Nonetheless, it does sort of produce a sawtooth wave.
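To see why adding octave-spaced square waves can give something sawtooth-like, here is a small numerical sketch. The binary weights (1/2, 1/4, 1/8, 1/16, with the lowest footage weighted most) are my assumption, chosen because they give the cleanest staircase; I have not worked out the actual proportions used by the Poly-800's adder circuit.

```python
def square(step, period):
    """Bipolar square wave: +1 for the first half of each period, -1 for the second."""
    return 1 if (step % period) < period // 2 else -1

def staircase_saw(steps=16):
    """Sum four octave-spaced squares with assumed binary weights."""
    # Period in steps -> weight. The 16-step square stands in for the 16'
    # footage, and each shorter period is one octave higher.
    weights = {16: 1 / 2, 8: 1 / 4, 4: 1 / 8, 2: 1 / 16}
    # The sign flip just turns the falling ramp into a rising one.
    return [-sum(w * square(n, p) for p, w in weights.items())
            for n in range(steps)]

saw = staircase_saw()
print(saw)
```

With these weights the sum climbs in 16 equal steps from -15/16 to +15/16 and then snaps back, which is a sawtooth as drawn by someone with only a 4-bit ruler. Different weights would give a lumpier ramp, and with only four squares to work with the steps never go away.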
I've still got a lot more digging to do into the schematic. For one thing, I'd like to be able to identify how the source oscillator that produces the two clock signals for the 5232 works. It's obviously not a crystal oscillator since it has to be variable in frequency to an extent. It appears to be based on an LC-type resonant circuit, but that part of the schematic is in particularly bad shape and it's hard to read.