Engineering Optimization

Dense–Sparse Refactor: Threshold the Rotation, Shrink the Budget

Any rotation matrix R splits exactly into a dense part (entries with magnitude at most a threshold τ) and a sparse correction (entries with magnitude above τ). Routing the bf16-safe path through the dense part strictly reduces the associativity error budget whenever τ < ρ, where ρ is the entrywise envelope of the original rotation.

This is the constructive optimization step that turns the Bounded Arithmetic monotonicity theorem into an engineering action: find τ, prove the split exact, prove the budget improves, hand the dense part to the bf16 kernel and the sparse part to a higher-precision fallback.

Theorem Reference

Lean anchors. densePart_add_sparsePart, densePart_entrywise_bound, densePart_bf16_budget, densePart_budget_strictly_improves

Exact split.

\[ R^{\tau}_{\mathrm{dense}} + R^{\tau}_{\mathrm{sparse}} = R \]

Strict budget improvement.

\[ \tau < \rho \;\Longrightarrow\; \mathrm{assocBudget}(d,q,k,\tau) \;<\; \mathrm{assocBudget}(d,q,k,\rho) \]

In English. Thresholding at τ splits R exactly (no approximation error in the split itself). The dense part is entrywise bounded by τ, so sending it through a bf16 kernel inherits the tighter budget. Strict improvement holds whenever τ is strictly below the original rotation envelope ρ.
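
As a minimal sketch of the split described above (not the Lean construction itself), the following NumPy code masks a matrix at τ and checks both the exact reconstruction and the entrywise bound. The function name split_at_threshold and the 2×2 example rotation are assumptions made for illustration.

```python
import numpy as np

def split_at_threshold(R, tau):
    """Return (dense, sparse) such that dense + sparse reproduces R."""
    R = np.asarray(R)
    small = np.abs(R) <= tau           # entries that stay on the dense path
    dense = np.where(small, R, 0.0)    # every entry has magnitude <= tau
    sparse = np.where(small, 0.0, R)   # the remaining large entries
    return dense, sparse

# Illustrative rotation by 0.1 rad: diagonal ~0.995, off-diagonal ~0.0998.
theta = 0.1
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
dense, sparse = split_at_threshold(R, tau=0.5)

# Each entry lives in exactly one part and the other part holds 0 there,
# so the sum involves no rounding and reproduces R exactly.
assert np.array_equal(dense + sparse, R)
assert np.max(np.abs(dense)) <= 0.5
```

The split is exact in floating point because adding zero to a value returns that value unchanged; no entry is ever the sum of two nonzero numbers.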

Important caveat. The split introduces a sparse correction that must be handled separately (e.g. in fp32). The net win depends on the sparsity ratio at τ and the relative cost of the two paths. The Lean theorems certify the error-bound side only.
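
The cost side of this trade-off is not covered by the theorems, but it can be explored numerically. The sketch below computes the fraction of entries that land in the sparse correction at a given τ and feeds it to a toy cost model. The model (a dense bf16 pass over every entry plus an fp32 pass over the sparse entries only) and the cost parameters bf16_cost and fp32_cost are assumptions, not measurements from any kernel.

```python
import numpy as np

def sparse_fraction(R, tau):
    """Fraction of entries with |entry| > tau, i.e. routed to the fallback."""
    R = np.asarray(R)
    return float(np.mean(np.abs(R) > tau))

def relative_cost(R, tau, bf16_cost, fp32_cost):
    """Toy per-entry cost model (hypothetical, for illustration only).

    Assumes the dense kernel touches every entry, zeros included, and the
    fallback touches only the entries above tau.
    """
    return bf16_cost + sparse_fraction(R, tau) * fp32_cost

theta = 0.1
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

# Sweep tau: lower thresholds tighten the bound but grow the fallback.
for tau in (0.05, 0.5, 1.0):
    print(tau, sparse_fraction(R, tau))
```

In this example a threshold of 0.05 sends every entry to the fallback, 0.5 sends half, and 1.0 sends none, which shows why the error-bound improvement alone does not determine the best τ.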

engineering: bf16-safe rotation kernel
compilation: mixed-precision code generation
optimization: dynamic-range splitting

Splitting a rotation matrix at threshold τ

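\[ \underbrace{R}_{\text{full rotation}} \;=\; \underbrace{R^{\tau}_{\mathrm{dense}}}_{|\cdot| \le \tau,\ \text{bf16-safe path}} \;+\; \underbrace{R^{\tau}_{\mathrm{sparse}}}_{|\cdot| > \tau,\ \text{fp32 fallback}} \]

\[ \mathrm{budget}(\tau) \;<\; \mathrm{budget}(\rho) \qquad \text{(strict improvement when } \tau < \rho\text{)} \]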

The split is algebraically exact: no rounding in the decomposition itself. Error enters only when the dense part goes through bf16 arithmetic — and that error is now bounded by the tighter budget at τ.
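
To make the routing concrete, here is a hedged sketch of applying the two parts to a vector. NumPy has no native bfloat16 type, so bf16 is simulated by rounding float32 values to the nearest value with the low 16 bits cleared (ties to even, finite inputs only). The helper names, the float32 accumulation, and the example inputs are all assumptions; a real bf16 kernel may behave differently.

```python
import numpy as np

def round_to_bf16(x):
    """Round finite float32 values to the nearest bfloat16 (ties to even).

    The result is kept as float32 with the low 16 bits cleared.
    """
    bits = np.ascontiguousarray(x, dtype=np.float32).view(np.uint32)
    bias = ((bits >> np.uint32(16)) & np.uint32(1)) + np.uint32(0x7FFF)
    return ((bits + bias) & np.uint32(0xFFFF0000)).view(np.float32)

def split_at_threshold(R, tau):
    """Exact split of R into entries with |.| <= tau and entries with |.| > tau."""
    R = np.asarray(R, dtype=np.float32)
    small = np.abs(R) <= tau
    zero = np.float32(0)
    return np.where(small, R, zero), np.where(small, zero, R)

def mixed_precision_apply(R, x, tau):
    """Compute R @ x with the dense part on a simulated bf16 path."""
    dense, sparse = split_at_threshold(R, tau)
    x = np.asarray(x, dtype=np.float32)
    y_dense = round_to_bf16(dense) @ round_to_bf16(x)  # bf16 inputs, fp32 accumulate
    y_sparse = sparse @ x                              # fp32 fallback
    return y_dense + y_sparse

theta = 0.1
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
x = np.array([1.0, 2.0])
y = mixed_precision_apply(R, x, tau=0.5)
print(np.max(np.abs(y - R @ x)))  # deviation from the float64 reference
```

Because every dense entry has magnitude at most τ, rounding it to bf16 changes it by at most τ · 2⁻⁸ in this simulation, while the large entries keep full fp32 precision on the fallback path.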

Why This Is Useful

Mixed precision

Route the small-magnitude part through bf16 and handle the large entries in fp32; the two parts still sum to the original rotation.

Certified improvement

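densePart_budget_strictly_improves gives a Lean proof that the error budget at τ is strictly smaller than the budget at ρ whenever τ < ρ. It certifies the error bound only, not the runtime cost.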

Exact decomposition

The split has zero reconstruction error — no approximation until the bf16 kernel fires.

Depends on Bounded Arithmetic Semantics for the budget definition and monotonicity lemma.