QuantXVerse
Sky130 · Yosys · 3 PDKs Independently Validated · Stanford G-set · IBM Quantum

~50× less silicon vs. published softmax SOTA.
~858× less silicon vs. the textbook baseline.

One ~100-cell standard-cell block replaces a hardware softmax block. The result is independently re-validated across 3 open-source PDKs (Sky130, NanGate45, IHP130). At Sky130 8-bit, the block is ~50× smaller than published academic SOTA softmax hardware (ConSmax DAC 2024-class shared-exp implementations) and ~858× smaller than the textbook N-exp + N-divide baseline taught in introductory hardware courses. Both numbers are real arithmetic; the modern-baseline number is the more conservative comparison and the relevant one for state-of-the-art chip vendors. Why we report both: marketing has historically led with the textbook number. We disclose both so customers can pick the comparison that matches their internal baseline.

Energy: ~10,000× less energy per solution on Stanford G-set MaxCut vs. a reference classical annealer (independently confirmed at adversarial settings).
Quantum: ~10¹⁰–10¹²× less full-system energy than IBM Quantum Heron-R2 hardware at K≥30 (cryostat-included — the only defensible quantum framing).

Same job. One block. Two honest ratios.

Sky130 8-bit · Render BH vs softmax baselines

Same physical block whether the workload is softmax, GELU, layer-norm, RMS-norm, rotary embedding, sigmoid, tanh, or SiLU. Numbers below are independently re-synthesized cell counts (Yosys + Sky130 + open PDKs).

Headline ratio depends on which softmax baseline you compare against:
vs MODERN baseline (shared-exp / ConSmax-class): ~50× win
Modern softmax · ~5,000 cells
Render BH block · ~100 cells
vs TEXTBOOK baseline (N parallel exp + N divide): ~858× win
Textbook softmax · ~87,475 cells
Render BH block · ~100 cells

Modern shipping AI accelerators typically use shared-exp or ConSmax-class softmax, so the honest comparison for most customers is ~50×. The ~858× figure is real arithmetic against the introductory-textbook design. We disclose both; pick the comparison that matches your stack.
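Both ratios are plain division of cell counts. The sketch below reproduces them from the rounded counts quoted on this page; the variable and function names are illustrative and do not come from any QuantXVerse tool. With the block rounded to ~100 cells, the textbook ratio works out to about 875×. The ~858× headline is consistent with a block of roughly 102 cells, which is an inference from the published numbers, not a published figure.

```python
# Cell counts as quoted on this page (rounded figures).
TEXTBOOK_CELLS = 87_475  # textbook softmax: N parallel exp + N divide
MODERN_CELLS = 5_000     # modern softmax: shared-exp / ConSmax-class
BLOCK_CELLS = 100        # Render BH block, rounded to ~100


def area_ratio(baseline_cells: float, block_cells: float) -> float:
    """Return how many times more cells the baseline needs than the block."""
    return baseline_cells / block_cells


modern_ratio = area_ratio(MODERN_CELLS, BLOCK_CELLS)      # 50.0
textbook_ratio = area_ratio(TEXTBOOK_CELLS, BLOCK_CELLS)  # 874.75

# Assumption: the ~858x headline was computed from an unrounded block size.
# This is the block size that would reproduce it exactly.
implied_block_cells = TEXTBOOK_CELLS / 858

print(f"vs modern baseline:   ~{modern_ratio:.0f}x")
print(f"vs textbook baseline: ~{textbook_ratio:.0f}x (rounded 100-cell block)")
print(f"block size implied by 858x: ~{implied_block_cells:.1f} cells")
```

Cell count is used here as a proxy for silicon area; a comparison in µm² would also depend on the mix of cell types in each design.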

Combinatorial Optimization. ~10,000× less energy on MaxCut.

Same physical block. New workload. Stanford G-set MaxCut benchmark vs reference classical annealer (dwave-neal) and real IBM Quantum hardware. Independently re-validated at adversarial CPU settings.

  • Benchmark: Stanford G-set MaxCut, instance G1 (800 nodes, published best-known cut = 11,624)
  • Cut quality: chip and dwave-neal both hit 100% of best-known cut at iso-quality compute (independently re-verified; chip ties dwave-neal on quality)
  • Chip energy per solution: hundreds of nanojoules to hundreds of microjoules depending on activity assumptions (A6 conservative: ~776 nJ at 30% activity, 5 fJ/gate; A7 adversarial-stacked: ~446 µJ at 50% activity, 10 fJ/gate)
  • vs dwave-neal CPU baseline: 1.5–5 J per solution depending on CPU TDP + dwave config (A7 measured 4.84 J at 15 W TDP, exact BKS) → ~10,000× less energy at adversarial-on-both-sides settings (A7 STRONG CONFIRM: 10,839× at 15 W TDP, 14,080× at median CPU time)
  • vs real IBM Quantum Heron-R2 (NISQ-era, K≥30) full-system energy including cryostat: ~10¹⁰–10¹²× less. The full-system framing (which includes the 25 kW dilution refrigerator) is the only quantum comparison that survives adversarial probing; "marginal-per-shot" framings are fragile and we no longer cite them.
  • Honest framing: chip ties dwave-neal on quality at iso-quality compute. We don't have a better algorithm — we have radically more efficient hardware for the same algorithm.
CPU baseline (dwave-neal, 15 W TDP, exact-BKS config)
~1.5–5 J
Per G1 solution (depends on TDP and dwave config) · open-source reference annealer
Cut quality: 100% of BKS · A7 measured 4.84 J at 15 W
Render BH chip projection (Sky130 standard cells)
~0.8–450 µJ
Per G1 solution (depends on activity assumptions) · projection from measured cell counts + library power model
Cut quality: 100% of BKS · ties dwave-neal at iso-quality compute (A6 confirmed)
Energy advantage
⭐ ~10,000×
Less energy per solution vs the strongest classical baseline.
Independent of and additive to the silicon-area advantage above (~50× vs modern softmax baselines / ~858× vs textbook).

Methodology: Sky130 standard-cell counts (Yosys + open PDK) × sky130_fd_sc_hd power model. CPU baseline: dwave-neal measured directly on a commodity laptop. The chip energy advantage holds at the order-of-magnitude level under adversarial assumption stacks.
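The energy ratios quoted above can be checked from the per-solution figures on this page. The sketch below is a back-of-envelope check, not the measurement pipeline: the names are illustrative, the inputs are the rounded A6 and A7 values, and the CPU runtime line assumes the 4.84 J figure was derived as TDP × CPU time. Because the inputs are rounded, the adversarial ratio lands near the published 10,839× without matching it exactly.

```python
# Per-solution energies for G-set instance G1, as quoted on this page.
CPU_ENERGY_J = 4.84        # dwave-neal, A7 measurement at 15 W TDP
CPU_TDP_W = 15.0           # CPU power figure used for that measurement
CHIP_ENERGY_A6_J = 776e-9  # A6 conservative: 30% activity, 5 fJ/gate
CHIP_ENERGY_A7_J = 446e-6  # A7 adversarial-stacked: 50% activity, 10 fJ/gate


def energy_ratio(baseline_j: float, chip_j: float) -> float:
    """Return how many times more energy the baseline spends per solution."""
    return baseline_j / chip_j


# Adversarial-on-both-sides comparison: the source of the ~10,000x headline.
adversarial_ratio = energy_ratio(CPU_ENERGY_J, CHIP_ENERGY_A7_J)

# Same CPU figure against the conservative chip assumptions.
conservative_ratio = energy_ratio(CPU_ENERGY_J, CHIP_ENERGY_A6_J)

# Assumption: CPU energy = TDP x CPU time, so the implied runtime is E / P.
implied_cpu_seconds = CPU_ENERGY_J / CPU_TDP_W

print(f"adversarial ratio:   ~{adversarial_ratio:,.0f}x")
print(f"conservative ratio:  ~{conservative_ratio:,.0f}x")
print(f"implied CPU runtime: ~{implied_cpu_seconds:.2f} s per solution")
```

The headline uses the adversarial figure, which is the smaller of the two ratios; under the A6 chip assumptions the same CPU measurement gives a ratio in the millions.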

The same chip computes activation functions not yet invented.

A small amount of on-die configuration, loaded at boot, adapts the block to any nonlinear operation a future architect specifies. No new mask set. No new tape-out. No 18-month chip cycle. Future-proof transformer hardware that adapts in firmware.

IP Portfolio

14 USPTO Provisional Patents

Filed across the architecture, programmability, cross-foundry portability, and firmware portability.

14 Provisional applications filed
391+ Independent + dependent claims
USPTO Micro-entity status · veteran-owned applicant
Get in touch

Let's build something.

Licensing, partnerships, due diligence, technical questions — all welcome. Verilog source files available under NDA after mutual due diligence.

ryan@quantxverse.com

Founder · Render BH · United States