Precision Axiom

Bounded Arithmetic Semantics: Making bf16 Error Explicit

A symbolic surrogate for hardware associativity error, monotone in the dynamic range of the rotation matrix. The point is not to fully formalize bf16; it is to make the dependency on dynamic range explicit enough for downstream optimization theorems to consume.

Caveat. This is a Higham-style worst-case estimate, not a calibrated tensor-core variance model. If the target kernel is dominated by fp32 accumulation with variance-style cancellation, these bounds may be highly pessimistic. The value is in the order structure (which configurations have smaller budgets), not in the constants.

Theorem Reference

Lean anchors. associativityBudget_nonneg, associativityBudget_mono_rotation, bf16_non_associative_bound

Budget definition.

\[ \mathrm{assocBudget}(d,\,q,\,k,\,\tau) \;=\; d \cdot q \cdot k \cdot \tau^2 \]

Monotonicity theorem.

\[ \tau_1 \le \tau_2 \;\Longrightarrow\; \mathrm{assocBudget}(d,q,k,\tau_1) \;\le\; \mathrm{assocBudget}(d,q,k,\tau_2) \]

In English. The bf16 associativity error is bounded by a quantity proportional to the squared dynamic range of the rotation matrix. For a nonnegative envelope τ, shrinking the rotation envelope never increases the error budget, and strictly reduces it when d, q, and k are all positive. This is the load-bearing algebraic fact for the Dense–Sparse refactor.
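The budget and its order properties can be sketched in Lean 4 with Mathlib. This is a minimal sketch, not the survey's actual development: the namespace, the theorem names, the choice of natural-number d, q, k and real τ, and the explicit hypothesis 0 ≤ τ₁ are all assumptions made for illustration. The nonnegativity hypothesis matters because τ² is not monotone on negative reals (τ₁ = −2 ≤ τ₂ = 1 would otherwise be a counterexample).

```lean
import Mathlib

namespace BudgetSketch

/-- The budget `d * q * k * τ ^ 2`.
Natural-number `d q k` and real `τ` are assumptions of this sketch. -/
noncomputable def assocBudget (d q k : ℕ) (τ : ℝ) : ℝ :=
  (d : ℝ) * (q : ℝ) * (k : ℝ) * τ ^ 2

/-- The budget is never negative. -/
theorem assocBudget_nonneg (d q k : ℕ) (τ : ℝ) : 0 ≤ assocBudget d q k τ := by
  unfold assocBudget
  positivity

/-- Monotone in `τ`, provided the smaller envelope is nonnegative. -/
theorem assocBudget_mono (d q k : ℕ) {τ₁ τ₂ : ℝ} (h₀ : 0 ≤ τ₁) (h : τ₁ ≤ τ₂) :
    assocBudget d q k τ₁ ≤ assocBudget d q k τ₂ := by
  unfold assocBudget
  have hc : (0 : ℝ) ≤ (d : ℝ) * (q : ℝ) * (k : ℝ) := by positivity
  have hsq : τ₁ ^ 2 ≤ τ₂ ^ 2 := by nlinarith
  exact mul_le_mul_of_nonneg_left hsq hc

/-- Strict version: additionally needs positive shape parameters. -/
theorem assocBudget_strictMono (d q k : ℕ) (hd : 0 < d) (hq : 0 < q) (hk : 0 < k)
    {τ ρ : ℝ} (h₀ : 0 ≤ τ) (h : τ < ρ) :
    assocBudget d q k τ < assocBudget d q k ρ := by
  unfold assocBudget
  have hd' : (0 : ℝ) < (d : ℝ) := by exact_mod_cast hd
  have hq' : (0 : ℝ) < (q : ℝ) := by exact_mod_cast hq
  have hk' : (0 : ℝ) < (k : ℝ) := by exact_mod_cast hk
  have hc : (0 : ℝ) < (d : ℝ) * (q : ℝ) * (k : ℝ) := mul_pos (mul_pos hd' hq') hk'
  have hsq : τ ^ 2 < ρ ^ 2 := by nlinarith
  exact mul_lt_mul_of_pos_left hsq hc

end BudgetSketch
```

The strict variant is the shape a lemma such as densePart_budget_strictly_improves would plausibly need, though the actual statement in the survey may differ.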

Axiomatic interface. BF16Axiomatics is a typeclass that abstracts the exact hardware semantics. The associator bound holds for any instance; concrete bf16 formalization can refine the instance later without changing theorems that depend only on monotonicity.
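To illustrate the "prove once, refine the instance later" pattern, here is a hedged Lean 4 sketch of what such an interface could look like. Only the class name BF16Axiomatics comes from the text above; every field (rAdd, val, d, q, k, assoc_bound), the namespace, and the downstream theorem are hypothetical stand-ins, and the real typeclass may be structured quite differently.

```lean
import Mathlib

namespace InterfaceSketch

/-- Budget restated so this block stands alone; types are assumptions. -/
noncomputable def assocBudget (d q k : ℕ) (τ : ℝ) : ℝ :=
  (d : ℝ) * (q : ℝ) * (k : ℝ) * τ ^ 2

/-- Monotonicity in `τ` on the nonnegative reals. -/
theorem assocBudget_mono (d q k : ℕ) {τ₁ τ₂ : ℝ} (h₀ : 0 ≤ τ₁) (h : τ₁ ≤ τ₂) :
    assocBudget d q k τ₁ ≤ assocBudget d q k τ₂ := by
  unfold assocBudget
  have hc : (0 : ℝ) ≤ (d : ℝ) * (q : ℝ) * (k : ℝ) := by positivity
  have hsq : τ₁ ^ 2 ≤ τ₂ ^ 2 := by nlinarith
  exact mul_le_mul_of_nonneg_left hsq hc

/-- Hypothetical interface: all field names are illustrative. -/
class BF16Axiomatics (α : Type) where
  /-- Rounded hardware addition. -/
  rAdd : α → α → α
  /-- Real value represented by a hardware number. -/
  val : α → ℝ
  /-- Shape parameters of the kernel. -/
  d : ℕ
  q : ℕ
  k : ℕ
  /-- The single axiom: inside envelope `τ`, the associator defect is within budget. -/
  assoc_bound : ∀ (τ : ℝ) (a b c : α),
    |val a| ≤ τ → |val b| ≤ τ → |val c| ≤ τ →
    |val (rAdd (rAdd a b) c) - val (rAdd a (rAdd b c))| ≤ assocBudget d q k τ

open BF16Axiomatics in
/-- A downstream fact proved once for every instance: a bound at envelope `τ`
also holds at any wider envelope `ρ`. It uses only the axiom and monotonicity. -/
theorem assoc_defect_le_of_wider_envelope {α : Type} [BF16Axiomatics α]
    {τ ρ : ℝ} (h₀ : 0 ≤ τ) (h : τ ≤ ρ) (a b c : α)
    (ha : |val a| ≤ τ) (hb : |val b| ≤ τ) (hc : |val c| ≤ τ) :
    |val (rAdd (rAdd a b) c) - val (rAdd a (rAdd b c))|
      ≤ assocBudget (d (α := α)) (q (α := α)) (k (α := α)) ρ :=
  le_trans (assoc_bound τ a b c ha hb hc) (assocBudget_mono _ _ _ h₀ h)

end InterfaceSketch
```

The design point is that the final theorem never inspects how rAdd rounds. A later, more faithful bf16 instance would only need to discharge assoc_bound, and the downstream result would carry over unchanged.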

engineering: rotation dynamic-range control
compilation: bf16-safe code generation
axiomatics: hardware-agnostic bound structure

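Figure: Budget grows as τ². Horizontal axis: rotation envelope τ. Vertical axis: assocBudget. Marked points: τ (dense threshold) and ρ (full envelope); the vertical gap between them is the budget saving.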

Thresholding the rotation to dynamic range τ < ρ moves you left on this curve. Because the budget is quadratic in τ, the saving scales with the square of the ratio τ/ρ: halving the dynamic range cuts the budget to a quarter.
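The size of the saving follows directly from the budget definition. As a worked ratio, assuming d, q, k > 0 and 0 ≤ τ < ρ (positivity assumptions added here so the division makes sense):

\[ \frac{\mathrm{assocBudget}(d,q,k,\tau)}{\mathrm{assocBudget}(d,q,k,\rho)} \;=\; \frac{d \cdot q \cdot k \cdot \tau^2}{d \cdot q \cdot k \cdot \rho^2} \;=\; \left(\frac{\tau}{\rho}\right)^2 \]

The relative saving is therefore 1 − (τ/ρ)², independent of d, q, and k. For example, τ = 0.9ρ gives 1 − 0.81 = 0.19, a 19% saving from a 10% reduction in dynamic range. Near τ = ρ the relative saving is roughly twice the relative reduction in τ.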

Why This Is Useful

Dynamic-range control

The bounds make explicit which part of the rotation drives bf16 error: the large-magnitude tail.

Hardware-agnostic

Typeclass interface lets downstream theorems be proved once; hardware instance can be refined later.

Dense–Sparse bridge

Monotonicity is the key lemma consumed by densePart_budget_strictly_improves.