Most efficient sine implementation


Hi all. Just started with the BX board.
For a project, I want to have a lot of sine (or cosine) waves running at the same time (about 200-300, if possible). They will run at audio rates (20 Hz to 20 kHz) and at fixed frequencies.
Is there a way of doing this that minimizes LUT count? Things such as latency do not matter too much.
Is CORDIC the most efficient way?




What do you mean by running cosine? Are you thinking of analog output of the different sines on different pins, or something different?


I may have been a bit sparse on the details. I want to sum all the different sines together with variable amplitudes in the end, and then output the result on a single pin.


A lookup table (ROM) for sines and cosines at whatever resolution you want might be the simplest solution.
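As a rough software model of the lookup-table idea (not HDL, and not taken from any particular design), here is a phase accumulator indexing a sine ROM. The table size, sample width, accumulator width and the helper names (`SINE_ROM`, `phase_increment`, `lut_sine`, `generate`) are all assumptions picked for illustration.

```python
import math

TABLE_BITS = 8    # assumed: 256-entry ROM
PHASE_BITS = 24   # assumed: phase accumulator width
AMPLITUDE = 127   # assumed: signed 8-bit samples
PHASE_MASK = (1 << PHASE_BITS) - 1

# ROM contents: one full sine period, rounded to integers.
SINE_ROM = [round(AMPLITUDE * math.sin(2 * math.pi * i / (1 << TABLE_BITS)))
            for i in range(1 << TABLE_BITS)]


def phase_increment(freq_hz, sample_rate_hz):
    """Amount added to the phase accumulator once per sample."""
    return round(freq_hz * (1 << PHASE_BITS) / sample_rate_hz)


def lut_sine(phase):
    """Use the top TABLE_BITS bits of the phase as the ROM address."""
    return SINE_ROM[(phase & PHASE_MASK) >> (PHASE_BITS - TABLE_BITS)]


def generate(freq_hz, sample_rate_hz, count):
    """Return `count` samples of a fixed-frequency sine."""
    inc = phase_increment(freq_hz, sample_rate_hz)
    phase = 0
    samples = []
    for _ in range(count):
        samples.append(lut_sine(phase))
        phase = (phase + inc) & PHASE_MASK  # wraps like a hardware register
    return samples
```

For fixed frequencies the increments are constants, so in hardware the per-voice state would only be the phase register. Storing just a quarter of the period and mirroring it is a common way to shrink the ROM further.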

See, for example -


It’s also worth considering that you probably don’t actually need to run all of the generators separately.

For example, to get a 20 kHz signal out, you only need a sample rate of about 40 kHz (strictly, a little more than twice the highest frequency). That’s super slow for an FPGA. To save on gate count, instead of running 300 generators at 40 kHz, you’re probably better off running a single generator at 12 MHz (300 × 40 kHz) and time-division-multiplexing the calculations.
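To make the time-division idea a bit more concrete, here is a small software model (again, not HDL). One shared sine ROM and one accumulator are visited once per voice per output sample, and each loop iteration stands in for one tick of the fast clock. The class name `TdmSineBank`, the bit widths and the integer amplitudes are assumptions for the sketch.

```python
import math

SAMPLE_RATE = 40_000   # output sample rate in Hz
PHASE_BITS = 24        # assumed phase accumulator width
TABLE_BITS = 8         # assumed 256-entry sine ROM
PHASE_MASK = (1 << PHASE_BITS) - 1

SINE_ROM = [round(127 * math.sin(2 * math.pi * i / (1 << TABLE_BITS)))
            for i in range(1 << TABLE_BITS)]


class TdmSineBank:
    """One shared generator serving many voices, one voice per fast-clock tick."""

    def __init__(self, freqs_hz, amplitudes):
        self.increments = [round(f * (1 << PHASE_BITS) / SAMPLE_RATE)
                           for f in freqs_hz]
        self.phases = [0] * len(freqs_hz)   # per-voice state (would sit in RAM)
        self.amplitudes = list(amplitudes)
        self.clock_ticks = 0                # counts fast-clock ticks used

    def next_sample(self):
        """Visit every voice once and return the summed output sample."""
        acc = 0
        for slot in range(len(self.phases)):
            index = self.phases[slot] >> (PHASE_BITS - TABLE_BITS)
            acc += self.amplitudes[slot] * SINE_ROM[index]
            self.phases[slot] = (self.phases[slot] + self.increments[slot]) & PHASE_MASK
            self.clock_ticks += 1
        return acc


def required_clock_hz(voices, sample_rate_hz=SAMPLE_RATE):
    """Fast-clock rate needed if each voice takes exactly one tick per sample."""
    return voices * sample_rate_hz
```

With 300 voices, `required_clock_hz(300)` gives the 12 MHz figure. The only thing that grows with the voice count is the per-voice state (phase, increment, amplitude), not the sine logic itself.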

I ended up doing similar things with some of the synth stuff I built earlier. If you’re doing audio stuff, it’s really easy to chew through the gates on an FPGA like the BX (without DSP blocks) when you start doing things that require multipliers (eg. volume scaling, envelope generators, filters etc). The more that you’re able to re-use the same gates for different things at different points in time, the more you’ll be able to fit in your LUT budget.

I don’t suppose you’re able to provide any more detail about what you’re building (pure curiosity on my part :) )? THX Deep Note perhaps? :)



That’s a very good point - I will try something like that. I am actually coming from a synth angle as well. My first idea was to see whether part of a Hammond organ emulation could fit on the FPGA. The good thing with this is that volume scaling and envelopes are very simple, so I don’t have to spend gates on that.

Your suggestion is a bit like the divide-down oscillators on old transistor organs, right? Just a much higher frequency oscillator, divided down to all the “real” ones…


From what I understand, a divide-down oscillator divides down the frequency of the top-octave notes in order to get lower notes. What I was thinking of was more like a pipeline in a CPU…

Imagine you have a set of things that need to be done every sample period (1/40000th of a second):

  • Oscillator: you need to generate the next sample for each of your (91?) tone wheels
  • Bus-bar Mixer: you need to add the appropriate harmonic content from the tone wheels to the (9?) bus-bars based on keys that have been pressed (I assume it’ll be possible to press multiple keys at once?)
  • Harmonic Bar scaler: you need to scale the content of the 9 bus-bars according to the drawbar settings [9 multiplies]
  • Final mixer: You need to mix the content of each of the bus-bars into a mono signal that you can send to the output. [9 adds]
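The steps above could be modelled per sample roughly like this. It is a sketch under heavy assumptions: the tone-wheel samples are taken as already generated, `DRAWBAR_OFFSETS` uses the usual drawbar intervals in semitones relative to the 8' drawbar, and the key-to-wheel mapping in the hypothetical `wheel_for` helper simply clamps out-of-range wheels instead of modelling the foldback of a real organ.

```python
NUM_WHEELS = 91
NUM_BARS = 9

# Semitone offsets of the 9 drawbars relative to the 8' drawbar
# (16', 5 1/3', 8', 4', 2 2/3', 2', 1 3/5', 1 1/3', 1').
DRAWBAR_OFFSETS = [-12, 7, 0, 12, 19, 24, 28, 31, 36]


def wheel_for(key, bar):
    """Assumed mapping: key 0 at 8' uses wheel 12; out-of-range wheels are clamped."""
    wheel = key + 12 + DRAWBAR_OFFSETS[bar]
    return min(max(wheel, 0), NUM_WHEELS - 1)


def render_sample(wheel_samples, pressed_keys, drawbars):
    """Produce one mono output sample from the current tone-wheel samples."""
    # Bus-bar mixer: each pressed key adds one wheel to each bus-bar.
    busbars = [0] * NUM_BARS
    for key in pressed_keys:
        for bar in range(NUM_BARS):
            busbars[bar] += wheel_samples[wheel_for(key, bar)]

    # Harmonic bar scaler: 9 multiplies by the drawbar settings (0..8).
    scaled = [busbars[bar] * drawbars[bar] for bar in range(NUM_BARS)]

    # Final mixer: 9 adds down to a mono sample.
    return sum(scaled)
```

Every `+=` and `*` in there is a candidate for sharing a single adder or multiplier across clock cycles, since all of it only has to finish once per sample period.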

I’m sure there’s probably more complexity in there too, but that’s a useful starting point.

So you can see that there are probably 3 main components that could be sequenced in time:

  • Oscillator
  • Multiplier/Modulator
  • Adder/Mixer

The trick is in coming up with a way to have a single “instance” (or at least as few as possible) of those components in your design, where you switch out the data each is using per FPGA clock cycle, and sequence everything so that each component can be kept as busy as possible.



Actually, thinking about it, the drawbar volume scaling could probably be done using only simple shift operations, which would make things a lot simpler - I assume that’s what you meant?
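A minimal sketch of what shift-only scaling might look like, assuming power-of-two volume steps are acceptable (each step down halves the amplitude, which is coarser than a real drawbar). The function names and the 0..8 setting range are assumptions for illustration.

```python
def scale_by_shift(sample, setting, max_setting=8):
    """Scale a signed sample using only a right shift.

    setting == max_setting passes the sample through unchanged, each step
    below that halves it, and setting 0 mutes the bar entirely.
    """
    if setting <= 0:
        return 0
    # Python's >> on negative ints floors, like an arithmetic shift in hardware.
    return sample >> (max_setting - setting)


def three_quarters(sample):
    """In-between gains need one extra add: 0.75 * x = x/2 + x/4."""
    return (sample >> 1) + (sample >> 2)
```

So a bar at full volume costs nothing, and intermediate gains cost a shift (just wiring in hardware) plus at most an adder or two, with no multiplier.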