Banking77 · official test split

Small models. Serious accuracy.

SeedFrontier ships task-specific models that approach SOTA accuracy at a fraction of the size, latency and cost. Built for humanoid robotics, where milliseconds and megabytes are the budget, not the rounding error.

10,000 free calls / month · no credit card · API key in 30 seconds

94.42% · Banking77 acc · official test split
~225 ms · CPU latency · single core
~68 MiB · On-disk footprint
Built for the platforms shipping humanoids
Tesla Optimus
Figure 02
Boston Dynamics Atlas
1X NEO
Agility Digit
Apptronik Apollo
Sanctuary Phoenix
Unitree H1
02 — Products

Three products. One deployment story.

AutoArch finds the right model shape. Cortex OS keeps that system learning without forgetting. Soto is the first shape we shipped: a 27 MiB byte-level encoder you can call today.

Soto

A 27 MiB byte-level encoder, already shipping.

01

Soto is the first frontier point we productized — a tiny byte-level encoder serving classification and embeddings at BERT-class quality, 8 ms latency, and roughly $0.01 per million calls. Hosted API, downloadable checkpoint, MCU-deployable.

  • 86.3% on Banking77 at 7.2M params, no tokenizer, no vocabulary to maintain.
  • Same checkpoint runs on a laptop CPU, a $5 microcontroller, or our hosted API.
  • Free API key in a minute — the smallest, fastest corner of the same frontier AutoArch maps.
curl https://soto.seedfrontier.ai/v1/classify \
  -H "Authorization: Bearer $SOTO_KEY" \
  -H "Content-Type: application/json" \
  -d '{"task":"banking77","text":"My card was declined"}'

# → { "results": [{ "label": "card_declined", "score": 0.94 }, ...] }
# 8 ms · 27 MiB model · no tokenizer
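
The same call can be made from Python. The sketch below uses only the standard library. The endpoint, the `task` and `text` fields, and the `results` response shape come from the curl example above; the helper names (`build_request`, `top_label`, `classify`) and the `Content-Type` header are assumptions, not documented API.

```python
import json
import os
import urllib.request

SOTO_URL = "https://soto.seedfrontier.ai/v1/classify"  # endpoint from the curl example


def build_request(text, task="banking77", api_key=None):
    """Build (but do not send) a classify request. Helper name is hypothetical."""
    key = api_key or os.environ.get("SOTO_KEY", "")
    body = json.dumps({"task": task, "text": text}).encode("utf-8")
    return urllib.request.Request(
        SOTO_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {key}",
            "Content-Type": "application/json",  # assumed; not shown in the original
        },
        method="POST",
    )


def top_label(response):
    """Return (label, score) for the highest-scoring entry in `results`."""
    best = max(response["results"], key=lambda r: r["score"])
    return best["label"], best["score"]


def classify(text, task="banking77", api_key=None, timeout=5.0):
    """Send the request and return the top (label, score). Needs network access."""
    request = build_request(text, task, api_key)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return top_label(json.load(resp))
```

Splitting request construction from sending keeps the payload easy to inspect and test without a live key.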
Seed AutoArch

Find the smallest model that clears your bar.

02

Seed AutoArch searches the accuracy, latency, and size frontier for a specific task so teams can deploy the smallest model that still clears the bar.

  • Finds efficient frontier points instead of returning one opaque model.
  • Keeps runtime, footprint, and quality visible in the decision loop.
  • Built for teams shipping models onto real hardware.
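
Picking the smallest model that still clears a bar can be illustrated with the three shapes whose numbers are published on this page. This is a minimal sketch of the final filtering step only, not AutoArch's search itself; `Shape` and `smallest_clearing` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Shape:
    name: str
    accuracy: float    # Banking77 accuracy, percent
    memory_mib: float  # footprint in MiB
    latency_ms: float  # CPU latency in ms


# Figures as published on this page.
SHAPES = [
    Shape("Seed AutoArch High-Accuracy", 94.42, 68.4, 225.0),
    Shape("Seed AutoArch Efficient", 91.46, 210 / 1024, 0.11),
    Shape("Soto", 86.3, 27.0, 8.0),
]


def smallest_clearing(shapes, min_accuracy, max_latency_ms=float("inf")) -> Optional[Shape]:
    """Return the smallest-footprint shape meeting both constraints, or None."""
    ok = [s for s in shapes if s.accuracy >= min_accuracy and s.latency_ms <= max_latency_ms]
    return min(ok, key=lambda s: s.memory_mib, default=None)


if __name__ == "__main__":
    for bar in (85.0, 93.0):
        pick = smallest_clearing(SHAPES, min_accuracy=bar)
        print(f"bar {bar}%: {pick.name if pick else 'no shape qualifies'}")
```

With an 85% bar the Efficient shape wins on footprint; raising the bar to 93% leaves only the High-Accuracy shape.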
Cortex OS

Lifelong memory that never forgets.

03

Cortex OS is SeedFrontier's long-horizon memory system for AI products that need continuity, recall, and adaptation over time without catastrophic forgetting.

  • Maintains durable memory across sessions, updates, and changing tasks.
  • Designed for systems that need persistent knowledge instead of stateless inference.
  • Pairs naturally with Seed AutoArch when efficiency and memory both matter.
03 — Explorer

Pick your point on the frontier.

Three Seed shapes for the same task. Slide between top accuracy, balanced and most efficient — and watch the size, latency and capability move together.

[Chart: Banking77 accuracy (91% to 95%) against memory (log scale, 0.1 MiB to 1k MiB). Points: SPACE, CUD, and the three Seed shapes ACC, BAL and EFF.]

Accuracy · Balanced · Efficient
+0.59 pp vs CUD

Beats CUD by +0.59 pp at a fraction of the size.

When accuracy is non-negotiable. The high-accuracy Seed shape beats the published CUD baseline on the official Banking77 test split and lands within 0.52 pp of the current SOTA (SPACE), while running on commodity CPU and fitting alongside other modules in the same memory budget.

94.42% · Banking77 acc
68.4 MiB · Memory
225 ms · Latency CPU
502,170 weights · Parameters
04 — Proof

No asterisks. Official test set.

Banking77 with the published, untouched test split. Same protocol everyone else reports. Compared to current SOTA and the published CUD baseline.

Model                          Banking77  Params          Memory                Latency
SPACE (current SOTA)           94.94%     ~hundreds of M  ~1–2 GiB              20–100 ms (GPU)
Seed AutoArch — High-Accuracy  94.42%     502,170         ~68 MiB               ~225 ms (CPU)
Test SOTA (CUD baseline)       93.83%     ~355 M+         ~1.4 GiB              15–50 ms (GPU)
Seed AutoArch — Efficient      91.46%     54,000          210 KiB               0.11 ms (CPU)
Soto (shipped)                 86.3%      7.2 M           27 MiB · ~7 MiB int8  ~8 ms (CPU)
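
The headline comparisons follow from the table by simple arithmetic. The sketch below recomputes the accuracy gaps in percentage points, the memory ratio against the CUD baseline, and the raw weight footprint implied by Soto's parameter count. The 4-byte (float32) and 1-byte (int8) weight sizes are assumptions used to sanity-check the table's 27 MiB and ~7 MiB figures, and the approximate memory values are taken at face value.

```python
MIB = 1024 ** 2

# Banking77 accuracy (%) from the table above.
space, seed_high, cud = 94.94, 94.42, 93.83

gap_vs_cud = round(seed_high - cud, 2)      # percentage points ahead of CUD
gap_vs_space = round(space - seed_high, 2)  # percentage points behind SPACE

# Memory ratio: CUD baseline (~1.4 GiB) vs the High-Accuracy shape (~68 MiB).
memory_ratio = (1.4 * 1024) / 68


def weight_footprint_mib(params, bytes_per_weight):
    """Raw weight storage in MiB, ignoring any runtime overhead."""
    return params * bytes_per_weight / MIB


soto_fp32 = weight_footprint_mib(7_200_000, 4)  # assumes float32 weights
soto_int8 = weight_footprint_mib(7_200_000, 1)  # assumes int8 weights

print(f"vs CUD: +{gap_vs_cud} pp, vs SPACE: -{gap_vs_space} pp")
print(f"CUD uses about {memory_ratio:.0f}x the memory")
print(f"Soto weights: {soto_fp32:.1f} MiB fp32, {soto_int8:.1f} MiB int8")
```

This gives a 0.59 pp lead over CUD, a 0.52 pp gap to SPACE, a memory ratio of roughly 21x, and about 27.5 MiB (float32) or 6.9 MiB (int8) for Soto's weights, in line with the table.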
05 — Why robotics

Humanoids run on milliseconds, not gigabytes.

01

Latency budgets are absolute

Vision, planning, balance, locomotion, manipulation, speech and intent must each close their control loops on a fixed period, from about 1 ms for balance to 200 ms or more for planning. Cloud round-trips and giant transformer stacks simply cannot fit.

02

Memory and power are scarce

Humanoid platforms run on battery. Every gigabyte of RAM and every watt of GPU draw is taken from balance, vision and runtime. You need dozens of skills, not one giant model.

03

Reliability over hype

On a robot, accuracy without bounded latency is worthless. Operators need shapes whose worst-case behavior is known, repeatable, and small enough to deploy at the edge.

06 — Control stack

The shape of the budget.

Public industry numbers. Each layer of a humanoid stack has its own latency budget and memory ceiling — and Seed shapes fit cleanly inside.

LayerFrequencyMemory ceiling
Balance & gait1 kHz< 1 MiB
Manipulation200–500 Hz< 5 MiB
Vision tracking60–120 Hz< 50 MiB
Scene & semantics10–30 Hz< 200 MiB
Intent & dialogue5–10 Hz< 500 MiB
Planning1–5 Hz< 1 GiB
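
Each frequency in the table implies a per-cycle time budget of 1000 / f milliseconds. As a rough sketch, the check below tests a model against a layer by requiring one inference to finish within the layer's tightest period (the upper end of its frequency range) and to fit under its memory ceiling. Treating a single inference as the whole cycle is a simplifying assumption, and the names `Layer`, `period_ms` and `fits` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Layer:
    name: str
    max_hz: float       # upper end of the frequency range in the table
    ceiling_mib: float  # memory ceiling in MiB

    @property
    def period_ms(self) -> float:
        """Tightest time budget per control cycle, in milliseconds."""
        return 1000.0 / self.max_hz


LAYERS = [
    Layer("Balance & gait", 1000, 1),
    Layer("Manipulation", 500, 5),
    Layer("Vision tracking", 120, 50),
    Layer("Scene & semantics", 30, 200),
    Layer("Intent & dialogue", 10, 500),
    Layer("Planning", 5, 1024),
]


def fits(layer: Layer, latency_ms: float, memory_mib: float) -> bool:
    """True if one inference fits inside the layer's period and memory ceiling."""
    return latency_ms <= layer.period_ms and memory_mib < layer.ceiling_mib


if __name__ == "__main__":
    # Seed AutoArch Efficient, as published: 0.11 ms on CPU, 210 KiB.
    latency_ms, memory_mib = 0.11, 210 / 1024
    for layer in LAYERS:
        verdict = "fits" if fits(layer, latency_ms, memory_mib) else "does not fit"
        print(f"{layer.name:<18} period {layer.period_ms:6.2f} ms  {verdict}")
```

With the Efficient shape's published numbers, every layer passes, including the 1 ms balance loop. Swapping in another shape's latency and footprint shows which layers it can serve.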
07 — Get started

Pick a frontier point — or try one in your terminal in 60 seconds.

Soto is live today — free API key, no card. Or bring your task, latency budget, and hardware ceiling and we'll find you a Seed shape that fits.