Timing Analysis leads to wisdom


#1

New Verilog engineer here. I’m writing some code that seems to fail timing (I’m probably a bit out of my depth). As an exercise, I thought I could create a very simple design and see what timing constraints apply to it. Here’s what I see:

Code

    module top (
        input CLK,     // 16MHz clock
        output PIN_6,  // OUT
        input  PIN_13, // IN
        output LED,    // User/boot LED next to power LED
        output USBPU   // USB pull-up resistor
    );

        assign USBPU = 0;
        assign LED = 0;

        assign PIN_6 = PIN_13;

    endmodule

Then the timing output

    icetime topological timing analysis report
    ==========================================

    Info: max_span_hack is enabled: estimate is conservative.

    Report for critical path:
    -------------------------

            pre_io_0_3_0 (PRE_IO) [clk] -> DIN0: 0.307 ns
         0.307 ns net_262 (PIN_13$2)
            odrv_0_3_262_307 (Odrv12) I -> O: 0.796 ns
            t4 (Span12Mux_v12) I -> O: 0.796 ns
            t3 (Span12Mux_v12) I -> O: 0.796 ns
            t2 (Sp12to4) I -> O: 0.662 ns
            t1 (Span4Mux_h4) I -> O: 0.465 ns
            t0 (LocalMux) I -> O: 0.486 ns
            inmux_0_23_2555_2548 (IoInMux) I -> O: 0.382 ns
         4.690 ns net_2548 (PIN_13$2)
            pre_io_0_23_1 (PRE_IO) DOUT0 [setup]: 0.103 ns
         4.793 ns io_pad_0_23_1_din

    Resolvable net names on path:
         0.307 ns ..  4.690 ns PIN_13$2

    Total number of logic levels: 1
    Total path delay: 4.79 ns (208.64 MHz)

It’s little surprise that my designs don’t want to run faster than 200MHz when even the most simple assignment won’t run faster than 208MHz.
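For reference, the MHz figure in the report appears to be simply the reciprocal of the total path delay, which is itself the sum of the per-stage delays on the critical path. A worked check using only the numbers printed in the report above:

```latex
\begin{align*}
t_{\text{path}} &= 0.307 + 3(0.796) + 0.662 + 0.465 + 0.486 + 0.382 + 0.103 = 4.793\ \text{ns} \\
f_{\max} &= \frac{1}{t_{\text{path}}} = \frac{1}{4.793\ \text{ns}} \approx 208.64\ \text{MHz}
\end{align*}
```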

More experienced people:

  • What am I doing wrong that makes even my simplest assignment so slow?
  • Are there special techniques I can employ to somehow compel this assignment to run faster?
  • In general, should I have hope that my more ambitious code can run at 200MHz?

#2

Can I ask what sort of speed you were expecting? The iCE40 datasheet shows, for example, that the global buffer network inside the chip runs at a maximum of 275MHz, so it’s unlikely you’re ever going to get much faster than that with a device like the iCE40.

That said, in your example, you’re driving PIN_6 from the PIN_13 input. Those pins aren’t adjacent to each other in the floorplan of the device, so there’ll be some routing overhead (the spans and muxes that appear in your critical path).

I’d be interested to know if things are any faster if you drive PIN_10 from the PIN_13 value instead, as pins 10 and 13 are right next to each other in the layout.

Depending on how far you want to push things, this may end up being a matter of expectation management. These aren’t super-high-speed devices, although they are more than capable for a great many interesting use cases 🙂

D.


#3

Thanks for the quick and clear answer. I didn’t think of pin adjacency. Indeed, when I route PIN_13 to PIN_10 instead, as you predicted, things look a lot more like what I had hoped to see. See below: 548MHz!
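For anyone following along, the change presumably amounts to swapping the output port in the original module. The exact code used for this run was not posted, so this is a minimal sketch, assuming the board's pin constraint file defines a port named PIN_10 the same way it defines PIN_6:

```verilog
// Sketch only: the original top module with the output moved from PIN_6 to PIN_10.
// Assumption: the pin constraint file (not shown in this thread) maps a port named PIN_10.
module top (
    input  CLK,     // 16MHz clock (unused by this purely combinational design)
    output PIN_10,  // OUT, adjacent to PIN_13 in the floorplan per the reply above
    input  PIN_13,  // IN
    output LED,     // User/boot LED next to power LED
    output USBPU    // USB pull-up resistor
);

    assign USBPU = 0;
    assign LED = 0;

    // Pass the input straight through to the neighbouring pin.
    assign PIN_10 = PIN_13;

endmodule
```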

Meanwhile I will certainly manage my expectations for my projects, and keep things chip-appropriate.

    icetime topological timing analysis report
    ==========================================

    Report for critical path:
    -------------------------

            pre_io_0_3_0 (PRE_IO) [clk] -> DIN0: 0.307 ns
         0.307 ns net_262 (PIN_13$2)
            odrv_0_3_262_141 (Odrv4) I -> O: 0.548 ns
            t0 (LocalMux) I -> O: 0.486 ns
            inmux_0_3_283_269 (IoInMux) I -> O: 0.382 ns
         1.723 ns net_269 (PIN_13$2)
            pre_io_0_3_1 (PRE_IO) DOUT0 [setup]: 0.103 ns
         1.826 ns io_pad_0_3_1_din

    Resolvable net names on path:
         0.307 ns ..  1.723 ns PIN_13$2

    Total number of logic levels: 1
    Total path delay: 1.83 ns (547.56 MHz)