What is Hold time?
As we saw in the previous question about setup time, for any sequential element (e.g. a latch
or flip-flop) data needs to be held stable around the capturing clock edge.
Specifically, data needs to be held stable for a certain time after the capturing clock edge,
because if data changes too close to that edge, the sequential element can go
metastable or capture the wrong value at its output.
The time for which data must remain stable after the capturing clock edge
is called the hold time requirement of that sequential element.
Hold Time Failure at a Flip-Flop
Like setup, there is a ‘hold’ requirement for each sequential element (flop or latch).
It dictates that after the active (capturing) clock edge of the sequential element,
input data must remain stable for a certain time window.
If input data changes within this hold window, the output of the sequential element could
go metastable or could capture unintended input data. It is therefore crucial that input
data be held until the hold requirement of the sequential element in question is met.
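The hold window described above can be expressed as a simple check. The following is a
minimal Python sketch, not a real timing tool: the function name `violates_hold` and the
numbers in the usage lines are hypothetical, and all times are in picoseconds measured
from an arbitrary reference.

```python
def violates_hold(capture_edge_ps: int, data_change_ps: int, t_hold_ps: int) -> bool:
    """Return True if the data input changes inside the hold window.

    The hold window is modeled as the interval that starts at the capturing
    clock edge and lasts for t_hold_ps. Setup is not checked here.
    """
    window_start = capture_edge_ps
    window_end = capture_edge_ps + t_hold_ps
    return window_start <= data_change_ps < window_end


# Hypothetical numbers: capture edge at 1000 ps, hold requirement of 50 ps.
early_change = violates_hold(1000, 1020, 50)  # data changes 20 ps after the edge
late_change = violates_hold(1000, 1080, 50)   # data changes 80 ps after the edge
print(early_change, late_change)
```

A change 20 ps after the edge falls inside the 50 ps window and is flagged, while a change
80 ps after the edge falls outside it.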
In our figure below, data at input pin ‘In’ of the first flop meets setup and is
correctly captured by the first flop. Output of the first flop, ‘FF1_out’, happens to be
an inverted version of input ‘In’.
As you can see, once the active edge of the clock for the first flop occurs (the
rising edge here), output FF1_out falls after a certain clock-to-out delay. Now, for the
sake of understanding, assume that the combinational delay from FF1_out to FF2_in is
very small and the signal travels very fast from FF1_out to FF2_in, as shown in the figure
below.

In real life this could happen for several reasons. It could happen by design
(imagine no device between the first and second flop, just a short wire; or even
both flops abutting each other). It could be because of device variation, leaving you
with very fast devices along the signal path. There could also be capacitive coupling
with adjacent wires that favors the transition along the FF1_out to FF2_in path
(a node adjacent to FF2_in might be transitioning from high to low with a sharp slew
rate, which couples favorably with FF2_in going down and speeds up the FF2_in fall
delay).
In short, in reality there are several reasons for delay to shrink along the
signal propagation path. What ends up happening because of the fast data is that
FF2_in transitions within the hold requirement window of the flop clocked by clk2
and so violates the hold requirement of that flop.
This causes the falling transition of FF2_in to be captured in the first clk2 cycle,
whereas the design intention was to capture it in the second cycle of clk2.
In a normal synchronous design, where you have a series of flip-flops clocked by a grid
clock (clock shown in figure below), the intention is that FF1_out transitions in the
first clock cycle of clk1 and clk2, and that there is enough delay from FF1_out to FF2_in
to meet the hold requirement of the second flop for the first clock cycle of clk2.
FF2_in then meets setup before the second clock cycle of clk2, and when the second
clock cycle starts, at the active edge of clk2, the original transition of
FF1_out is propagated to Out.
Notice also that there is skew between clk1 and clk2: the skew makes the clk2 edge
arrive later than the clk1 edge (ideally we expect clk1 and clk2 to be perfectly aligned,
but that is only the ideal).
In our example this exacerbates the hold issue. If both clocks
were perfectly aligned, the FF2_in fall would have happened later relative to the clk2
edge and would have met the hold requirement of the clk2 flop, and we would not have
captured wrong data.
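The effect of skew on the hold check can be sketched with a simplified slack calculation.
This assumes the common first-order model in which data launched by clk1 must arrive at
the second flop no earlier than the clk2 edge plus the hold time. The function name and
all delay values are made up for illustration and are not taken from the figure.

```python
def hold_slack_ps(t_clk_to_q: int, t_comb: int, t_hold: int, skew: int) -> int:
    """Hold slack in picoseconds under a first-order model (an assumption).

    skew is (clk2 arrival - clk1 arrival); a positive value means the
    capturing clock clk2 arrives later than the launching clock clk1.
    Negative slack means the hold requirement is violated.
    """
    data_arrival = t_clk_to_q + t_comb   # earliest change at FF2_in
    data_required = t_hold + skew        # end of the hold window at FF2
    return data_arrival - data_required


# Hypothetical fast path: 40 ps clock-to-out, 10 ps wire, 30 ps hold time.
with_skew = hold_slack_ps(40, 10, 30, skew=35)  # clk2 is 35 ps late
no_skew = hold_slack_ps(40, 10, 30, skew=0)     # clocks perfectly aligned

print("slack with skew:", with_skew)   # -15 ps: hold violated
print("slack without skew:", no_skew)  # 20 ps: hold met
```

With these numbers, the same fast path passes the hold check when the clocks are aligned
and fails once clk2 arrives 35 ps late.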
If Hold Violations Exist in the Design, Is It OK to Sign Off?
You cannot sign off the design if it has hold violations, because hold violations
are functional failures. Setup violations are frequency dependent:
you can reduce the clock frequency and avoid setup failures. Hold violations stemming
from a same-clock-edge race are frequency independent and are functional failures,
because you can end up capturing unintended data, thus putting your state machine in an
unknown state.
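The frequency argument can be illustrated by sweeping the clock period in a simplified
model. This is only a sketch under first-order assumptions: setup slack is taken as
period + skew - (clock-to-out + max path delay + setup time), and hold slack as
(clock-to-out + min path delay) - (hold time + skew). All names and numbers are
hypothetical.

```python
def setup_slack_ps(period: int, t_clk_to_q: int, t_comb_max: int,
                   t_setup: int, skew: int = 0) -> int:
    """Setup slack: depends on the clock period (first-order model)."""
    return period + skew - (t_clk_to_q + t_comb_max + t_setup)


def hold_slack_same_edge_ps(t_clk_to_q: int, t_comb_min: int,
                            t_hold: int, skew: int = 0) -> int:
    """Hold slack for a same-edge race: the period does not appear at all."""
    return (t_clk_to_q + t_comb_min) - (t_hold + skew)


# Hypothetical path: 40 ps clock-to-out, 900 ps slowest path, 10 ps fastest
# path, 50 ps setup time, 60 ps hold time, no skew.
results = []
for period in (900, 1000, 2000):
    s = setup_slack_ps(period, 40, 900, 50)
    h = hold_slack_same_edge_ps(40, 10, 60)
    results.append((period, s, h))
    print(f"period={period} ps  setup slack={s} ps  hold slack={h} ps")
```

In this sketch, stretching the period from 900 ps to 1000 ps turns the setup slack
positive, while the hold slack stays at the same negative value for every period.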
