Dynamic Voltage Droop

The current cannot arrive fast enough

Consider a block of logic, perhaps a few million transistors in a processor's execution unit, transitioning from idle to full activity on a single clock edge. In a matter of tens of picoseconds, the block demands a surge of current. The local decoupling capacitance, whatever is placed immediately beneath or beside that block, begins to discharge. If the capacitance is sufficient to supply the entire transient, the story ends there. In practice, it almost never is.

The remaining current must come from farther away. It may come from decoupling capacitors placed millimeters away on the die, or from package-level capacitors centimeters away, or ultimately from the voltage regulator itself. Every path between the switching block and these remote charge reservoirs passes through conductors, and every conductor has inductance. The inductance is not a parasitic that can be eliminated by better manufacturing. It is a consequence of the geometry of the conductor and the magnetic field it creates when current flows through it. It is a physical inevitability.

The voltage drop across an inductor is proportional to the rate of change of current through it: V = L · di/dt. When di/dt is small, as in the gradual ramp-up of a workload over microseconds, the inductive voltage drop is negligible. When di/dt is enormous, as in a logic block switching on in tens of picoseconds, the voltage drop can be tens or even hundreds of millivolts. This is dynamic voltage droop. It is fundamentally different from IR drop, which depends only on the magnitude of the current and the resistance of the path.
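
As a rough numerical sketch of the relationship above, the following Python snippet compares a slow and a fast current ramp through the same inductance. The 1 pH path inductance, the 1 A current step, and the two ramp times are illustrative assumptions, not values taken from a specific design.

```python
def inductive_drop(inductance_h, delta_i_a, ramp_time_s):
    """Return V = L * di/dt in volts for a linear current ramp."""
    return inductance_h * delta_i_a / ramp_time_s

# Assumed, illustrative values.
L_PATH = 1e-12   # 1 pH of path inductance
DELTA_I = 1.0    # 1 A current step

slow = inductive_drop(L_PATH, DELTA_I, 1e-6)     # ramp over 1 microsecond
fast = inductive_drop(L_PATH, DELTA_I, 50e-12)   # ramp over 50 picoseconds

print(f"slow ramp: {slow * 1e6:.1f} uV")
print(f"fast ramp: {fast * 1e3:.1f} mV")
```

Under these assumptions the microsecond ramp produces about a microvolt, while the 50-picosecond ramp produces about 20 millivolts across the same inductance.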

The droop formula and its implications

To understand droop quantitatively, it helps to model the on-chip power distribution network as a transmission line. The network has inductance per unit length, which we call L_p, arising from the metal traces that carry current from the supply pads to the logic. It also has capacitance per unit length along that same path, which we call C_d, arising from the intentional decoupling capacitors and the intrinsic capacitance of the transistors and interconnect. The characteristic impedance of this power network is Z = √(L_p / C_d).

When a block of logic draws a transient current I from this network, the peak voltage droop is approximately ΔV = I × √(L_p / C_d). This result, derived from transmission-line theory, reveals something important. The droop does not scale linearly with inductance. It scales with the square root of the ratio of inductance to capacitance. Adding decoupling capacitance reduces droop, but with diminishing returns: halving the droop requires quadrupling the capacitance. Reducing inductance obeys the same square-root law, so halving the droop requires cutting the inductance to a quarter, and inductance is constrained by the geometry of the power grid, the number and placement of supply vias, and the package design. Neither parameter is easy to change in isolation.
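
The square-root scaling can be checked with a few lines of Python. This is a minimal sketch of the approximate formula ΔV = I × √(L_p / C_d); the current, inductance, and capacitance values are hypothetical and were chosen only so that the impedance comes out to a round 10 milliohms.

```python
import math

def peak_droop(current_a, l_p, c_d):
    """Approximate peak droop in volts: I * sqrt(L_p / C_d)."""
    return current_a * math.sqrt(l_p / c_d)

# Hypothetical values; only the ratio L_p / C_d matters here.
I_STEP = 5.0    # transient current, A
L_P = 1e-12     # inductance, H
C_D = 10e-9     # capacitance, F

base = peak_droop(I_STEP, L_P, C_D)
four_x_cap = peak_droop(I_STEP, L_P, 4 * C_D)   # quadruple the capacitance
quarter_l = peak_droop(I_STEP, L_P / 4, C_D)    # quarter the inductance

print(f"baseline:          {base * 1e3:.1f} mV")
print(f"4x capacitance:    {four_x_cap * 1e3:.1f} mV")
print(f"1/4 inductance:    {quarter_l * 1e3:.1f} mV")
```

With these numbers the baseline droop is 50 mV, and either quadrupling the capacitance or quartering the inductance brings it down to 25 mV.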

The formula also reveals that droop depends on the current magnitude, not directly on di/dt. This may seem to contradict the L · di/dt relationship, but the two views are consistent. The transmission-line model absorbs the temporal profile of the current into the impedance. A faster edge rate, for a given peak current, produces a droop that arrives sooner and with a sharper onset, but the peak amplitude is set by the impedance and the peak current. The temporal sharpness matters for a different reason: it determines how far the droop propagates before local capacitance can begin to arrest it.

Timescale and the process scaling problem

IR drop reaches its steady-state value on a timescale determined by the RC time constant of the power grid. For a typical on-chip grid, this is on the order of nanoseconds. Once the current distribution stabilizes, the IR drop is fully developed and does not change until the current changes. It is, in effect, a DC phenomenon.

Dynamic droop operates on a much shorter timescale. The droop event begins at the moment the current starts to rise, and its peak occurs within a fraction of a nanosecond. In advanced process nodes, where clock frequencies exceed several gigahertz and transistor switching times are measured in single-digit picoseconds, the current edge rates are extraordinarily steep. A block of logic that demands 10 amperes in 50 picoseconds produces a di/dt of 2 × 10¹¹ A/s. Even a tiny inductance of 10 femtohenries produces a 2-millivolt drop across that single segment. Summed across the distributed inductance of the entire current path, the droop can easily reach tens of millivolts.
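
The arithmetic in this paragraph is easy to reproduce. The sketch below computes the edge rate for 10 amperes in 50 picoseconds and then tabulates the L · di/dt drop for a few segment inductances. The list of inductances is an illustrative choice meant to show how the drop scales linearly with inductance at a fixed edge rate.

```python
def edge_rate(delta_i_a, ramp_time_s):
    """Average di/dt in A/s for a current step over a given ramp time."""
    return delta_i_a / ramp_time_s

def segment_drop(l_seg_h, rate_a_per_s):
    """Inductive drop in volts across one segment: L * di/dt."""
    return l_seg_h * rate_a_per_s

rate = edge_rate(10.0, 50e-12)   # 10 A in 50 ps
print(f"di/dt = {rate:.1e} A/s")

# Illustrative segment inductances.
for label, l_seg in [("1 fH", 1e-15), ("10 fH", 10e-15), ("100 fH", 100e-15)]:
    print(f"{label:>7}: {segment_drop(l_seg, rate) * 1e3:.1f} mV")
```

At this edge rate every femtohenry of inductance costs 0.2 mV, so a full picohenry in a single segment would already correspond to 200 mV.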

The scaling trend is unfavorable. Each new process generation delivers faster transistors, which means steeper current edges. Supply voltages continue to decrease, which means the same absolute droop consumes a larger fraction of the available voltage margin. The inductance of the power grid does not scale down at the same rate, because inductance is a geometric property that depends on the length and spacing of conductors, not on the minimum feature size. The result is that dynamic droop has become progressively more important with each generation, and in many modern designs it is the dominant contributor to supply voltage variation.

Droop propagates as a wave

IR drop is a spatial field. At steady state, the voltage at every point in the power grid is determined by the resistance between that point and the supply pads, and the current flowing through that resistance. The field is smooth, continuous, and predictable from the grid geometry and the current distribution. It does not change unless the current changes.

Dynamic droop has a different spatial character. When a block of logic switches and draws a sudden current, it creates a localized voltage disturbance. That disturbance does not remain localized. It propagates outward from the switching block as a wave, traveling along the power grid at a velocity determined by the grid's distributed inductance and capacitance per unit length: v = 1 / √(L_p · C_d). The wave carries the voltage deficit outward, momentarily depressing the supply voltage at points increasingly far from the switching block.
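
To get a feel for the timescales involved, the velocity expression above can be evaluated directly. The per-unit-length inductance and capacitance below are hypothetical placeholders for a heavily decoupled grid, and the 500 micrometer distance is likewise an assumption. Real values depend strongly on the grid and decoupling design.

```python
import math

def wave_velocity(l_per_m, c_per_m):
    """Propagation velocity in m/s: 1 / sqrt(L * C), both per unit length."""
    return 1.0 / math.sqrt(l_per_m * c_per_m)

# Hypothetical per-unit-length values.
L_PER_M = 5e-7   # H/m
C_PER_M = 2e-9   # F/m

v = wave_velocity(L_PER_M, C_PER_M)
distance_m = 500e-6                 # assumed distance to a neighboring block
arrival_s = distance_m / v

print(f"velocity: {v:.3e} m/s")
print(f"arrival after {arrival_s * 1e12:.1f} ps")
```

For these placeholder values the disturbance covers 500 micrometers in roughly 16 picoseconds, well inside a single clock cycle at gigahertz frequencies.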

The wave eventually attenuates. The resistance of the grid dissipates its energy, and the distributed capacitance absorbs and re-reflects portions of it. But before it attenuates, it can travel a significant distance. In a large die, the wave launched by one switching block can reach another block hundreds of micrometers away within a fraction of a nanosecond. If that remote block is also switching, or is in a timing-critical state, the arriving voltage wave can cause a timing violation or a functional error. The spatial reach of dynamic droop is one of the reasons it is so difficult to manage. The problem is not confined to the block that caused it; it radiates outward and affects its neighbors.

Why droop resists simple prediction

IR drop can be computed by solving a linear system of equations. Given the resistance network and the current sources, the voltage at every node follows from Ohm's law and Kirchhoff's laws. The problem is large, sometimes involving billions of nodes, but it is conceptually straightforward. The solution is unique, and it depends only on the present values of resistance and current.
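
A toy version of this linear solve fits in a few lines. The sketch below assumes a one-dimensional supply rail: a single pad feeding a chain of nodes through identical segment resistances, with each node drawing the same current. The topology, the function names, and the numbers are all hypothetical. Production tools use sparse solvers on far larger networks, but the structure (a conductance matrix times the vector of drops equals the vector of currents) is the same.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):                     # back substitution
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def rail_ir_drop(n_nodes, r_seg_ohm, i_node_a):
    """Return the IR drop (VDD - V_k) at each node of a 1-D rail."""
    g = 1.0 / r_seg_ohm
    cond = [[0.0] * n_nodes for _ in range(n_nodes)]
    for k in range(n_nodes):
        cond[k][k] += g                 # segment toward the pad side
        if k > 0:
            cond[k][k - 1] -= g
        if k < n_nodes - 1:
            cond[k][k] += g             # segment toward the far end
            cond[k][k + 1] -= g
    # Kirchhoff's current law at every node: cond * drops = currents
    return solve_linear(cond, [i_node_a] * n_nodes)

drops = rail_ir_drop(4, 0.01, 0.5)      # 4 nodes, 10 mOhm segments, 0.5 A each
print([f"{d * 1e3:.1f} mV" for d in drops])
```

The drop grows monotonically away from the pad, because each segment carries the current of every node downstream of it.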

Dynamic droop is harder to predict for several reasons. First, the droop depends not just on how much current a block draws, but on the temporal profile of that current. Two blocks drawing the same peak current but with different rise times will produce different droop signatures. The current profile, in turn, depends on the switching activity of the logic, which depends on the data being processed and the microarchitectural state of the processor. It is workload-dependent in a way that IR drop is not.

Second, inductance is non-local. The inductance of a conductor depends not only on its own geometry but on the geometry of every other conductor carrying current nearby. Mutual inductance couples the current paths together, so the droop caused by one switching block is influenced by the simultaneous activity of other blocks. This coupling makes the problem inherently multi-dimensional. Analyzing one block in isolation does not give the correct answer.

Third, the interaction between multiple switching events at different times creates interference patterns. Two droop waves arriving at the same point can reinforce or partially cancel, depending on their relative timing. Predicting the worst-case droop requires understanding not just the worst-case current from any single block, but the worst-case combination of currents from all blocks, accounting for their relative phasing. The space of possible combinations is vast, and exhaustive search is impractical.
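
The dependence on relative timing can be illustrated with a toy superposition. The sketch below assumes that each droop event, as seen at one observation point, looks like a damped sinusoid. That waveform shape, the 2 GHz ringing frequency, the 0.5 ns decay constant, and the 50 mV amplitude are all assumptions made for illustration. Positive values represent a voltage deficit.

```python
import math

def droop_wave(t, amplitude=0.05, freq_hz=2e9, tau_s=0.5e-9):
    """Hypothetical damped-sinusoid droop (volts of deficit) at one grid point."""
    if t < 0:
        return 0.0
    return amplitude * math.exp(-t / tau_s) * math.sin(2 * math.pi * freq_hz * t)

def peak_combined(delay_s, n_samples=2000, t_end=2e-9):
    """Largest combined deficit when a second identical wave arrives delay_s later."""
    peak = 0.0
    for k in range(n_samples):
        t = k * t_end / n_samples
        total = droop_wave(t) + droop_wave(t - delay_s)
        peak = max(peak, total)
    return peak

aligned = peak_combined(0.0)         # both waves arrive together
opposed = peak_combined(0.25e-9)     # second wave arrives half a period later

print(f"aligned arrival:     {aligned * 1e3:.1f} mV")
print(f"half-period offset:  {opposed * 1e3:.1f} mV")
```

In this toy model the aligned case doubles the peak deficit, while a half-period offset leaves the peak no worse than that of a single wave. Finding the true worst case means searching over such delays for every pair, and larger group, of blocks.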

The partition experiment: when reducing peak current increases noise

A simulation experiment from the early Anasim literature illustrates the counterintuitive nature of dynamic droop. The experiment began with a simple setup: a single clock domain driving a block of logic, producing a triangular current pulse on each clock edge. The resulting power grid noise was measured. Then the clock domain was split into two partitions, each containing half the logic, with the two halves switching at slightly different times. The expectation was straightforward. Fewer gates switching simultaneously should mean less peak current, and less peak current should mean less noise.

The results were startling. The partitioned configuration produced a noise peak of approximately 200 millivolts, roughly 66% higher than the single-domain case. The intuitive prediction was wrong.

The explanation lies in charge conservation and the geometry of current waveforms. When the original block is split in half, each partition switches half the logic and therefore delivers half the total charge, but over a shorter effective duration. The peak current does not fall in proportion. A triangular pulse with half the base and half the height would carry only a quarter of the original charge, so two such pulses would deliver only half of what the logic requires. To conserve charge, two half-base triangles must have the same height as one full-base triangle, and this means the narrower pulses have steeper edges. The di/dt of each individual pulse is higher than the di/dt of the original combined pulse, by a factor of two in this idealized geometry.
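
The charge-conservation argument can be verified numerically. The sketch below assumes symmetric triangular pulses (rising to the peak over half the base) and uses a hypothetical 200 ps base and 4 A peak. Only the ratios matter.

```python
def triangle_charge(base_s, height_a):
    """Charge in coulombs under a triangular current pulse."""
    return 0.5 * base_s * height_a

def triangle_edge_rate(base_s, height_a):
    """di/dt in A/s of a symmetric triangle, which peaks at half its base."""
    return height_a / (base_s / 2.0)

# Assumed single-domain pulse.
BASE = 200e-12   # s
PEAK = 4.0       # A

total_charge = triangle_charge(BASE, PEAK)

# Two partitions, each carrying half the charge in half the base width.
half_base = BASE / 2
split_peak = (total_charge / 2) / (0.5 * half_base)

single_rate = triangle_edge_rate(BASE, PEAK)
split_rate = triangle_edge_rate(half_base, split_peak)

print(f"peak per partition: {split_peak:.1f} A (original {PEAK:.1f} A)")
print(f"di/dt ratio:        {split_rate / single_rate:.1f}x")
```

Each partition still peaks at the original 4 A, and its edge rate is twice that of the single broad pulse.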

The dual-peak load waveform, two narrow pulses instead of one broad pulse, displays higher di/dt and produces greater grid noise. This effect is especially pronounced where the inductance to nearby decoupling is significant, because the higher di/dt drives a larger L · di/dt voltage drop. The experiment demonstrates a principle that recurs throughout power integrity engineering: the rate of change of current matters more than the peak current magnitude. An engineer who focuses only on reducing peak current, without considering the effect on di/dt, can inadvertently make the droop problem worse.

This is the central lesson of dynamic voltage droop. It is not enough to know how much current your chip draws. You must know how fast that current changes, how the inductance of your power network interacts with that rate of change, and how the resulting disturbance propagates through the grid. The tools and intuitions developed for IR drop analysis do not transfer directly to the dynamic regime. A new set of models, rooted in transmission-line theory and wave propagation, is required.

In the next lesson, we will formalize the transmission-line model of the power distribution network and show how the impedance profile of the PDN determines the chip's vulnerability to droop across a range of frequencies.