SeedFrontier ships task-specific models that match SOTA accuracy at a fraction of the size, latency and cost. Built for humanoid robotics, where milliseconds and megabytes are the budget — not the rounding error.
10,000 free calls / month · no credit card · API key in 30 seconds
AutoArch finds the right model shape. Cortex OS keeps that system learning without forgetting. Soto is the first shape we shipped — a 27 MB byte-level encoder you can call today.
Soto is the first frontier point we productized — a tiny byte-level encoder serving classification and embeddings at BERT-class quality, 8 ms latency, and roughly $0.01 per million calls. Hosted API, downloadable checkpoint, MCU-deployable.
```shell
curl https://soto.seedfrontier.ai/v1/classify \
  -H "Authorization: Bearer $SOTO_KEY" \
  -d '{"task":"banking77","text":"My card was declined"}'
# → { "results": [{ "label": "card_declined", "score": 0.94 }, ...] }
# 8 ms · 27 MB model · no tokenizer
```

Seed AutoArch searches the accuracy, latency, and size frontier for a specific task so teams can deploy the smallest model that still clears the bar.
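
Picking "the smallest model that still clears the bar" can be pictured as a filter followed by a minimum. The sketch below is an illustration only, not AutoArch itself: `Shape`, `CANDIDATES`, and `smallest_clearing` are hypothetical names, and the figures are the Seed and Soto rows of the Banking77 table on this page.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Shape:
    """One candidate model (hypothetical type, for illustration only)."""
    name: str
    accuracy_pct: float  # Banking77 test accuracy
    memory_kib: float    # model memory footprint
    latency_ms: float    # CPU latency as listed in the table


# Figures copied from the Banking77 comparison table on this page.
CANDIDATES = [
    Shape("Seed AutoArch High-Accuracy", 94.42, 68 * 1024, 225.0),
    Shape("Seed AutoArch Efficient", 91.46, 210, 0.11),
    Shape("Soto", 86.3, 27 * 1024, 8.0),
]


def smallest_clearing(
    candidates: Sequence[Shape],
    min_accuracy_pct: float,
    max_latency_ms: float,
) -> Optional[Shape]:
    """Return the lowest-memory shape that meets both bars, or None."""
    eligible = [
        s for s in candidates
        if s.accuracy_pct >= min_accuracy_pct and s.latency_ms <= max_latency_ms
    ]
    return min(eligible, key=lambda s: s.memory_kib, default=None)
```

With a 90% accuracy bar and a 10 ms latency budget, this returns the Efficient shape. Raising the bar to 94% while keeping the 10 ms budget returns `None`, because the only shape above 94% is listed at about 225 ms.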
Cortex OS is SeedFrontier's long-horizon memory system for AI products that need continuity, recall, and adaptation over time without catastrophic forgetting.
Three Seed shapes for the same task. Slide between top accuracy, balanced and most efficient — and watch the size, latency and capability move together.
Beats CUD by +0.59 pp at a fraction of the size.
When accuracy is non-negotiable. The high-accuracy Seed shape scores 94.42% on the official Banking77 test split, above the published CUD baseline (93.83%) and within 0.52 pp of SPACE (94.94%), while running on commodity CPU and fitting alongside other modules in the same memory budget.
Banking77 with the published, untouched test split. Same protocol everyone else reports. Compared to current SOTA and the published CUD baseline.
| Model | Banking77 test accuracy | Params | Memory | Latency |
|---|---|---|---|---|
| SPACE (current SOTA) | 94.94% | ~hundreds of M | ~1–2 GiB | 20–100 ms (GPU) |
| Seed AutoArch — High-Accuracy | 94.42% | 502,170 | ~68 MiB | ~225 ms (CPU) |
| Test SOTA (CUD baseline) | 93.83% | ~355 M+ | ~1.4 GiB | 15–50 ms (GPU) |
| Seed AutoArch — Efficient | 91.46% | 54,000 | 210 KiB | 0.11 ms (CPU) |
| Soto (shipped) | 86.3% | 7.2 M | 27 MiB · ~7 MiB int8 | ~8 ms (CPU) |
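
The headline comparison against the CUD baseline can be checked directly from the table. The short calculation below is a sketch using only the figures above. The baseline's "~355 M+" parameter count and "~1.4 GiB" footprint are approximate, so the ratios are rough estimates, and the variable names are illustrative.

```python
# Figures from the Banking77 table above.
seed_high_accuracy = {"accuracy_pct": 94.42, "params": 502_170, "memory_mib": 68}
cud_baseline = {
    "accuracy_pct": 93.83,
    "params": 355_000_000,      # table lists "~355 M+", so treat as a lower bound
    "memory_mib": 1.4 * 1024,   # table lists "~1.4 GiB"
}

# Accuracy gap in percentage points.
gap_pp = round(seed_high_accuracy["accuracy_pct"] - cud_baseline["accuracy_pct"], 2)

# How many times larger the baseline is, by parameter count and by memory.
param_ratio = cud_baseline["params"] / seed_high_accuracy["params"]
memory_ratio = cud_baseline["memory_mib"] / seed_high_accuracy["memory_mib"]

print(f"gap: +{gap_pp} pp")                      # gap: +0.59 pp
print(f"params: about {param_ratio:.0f}x fewer")
print(f"memory: about {memory_ratio:.0f}x smaller")
```

The gap matches the +0.59 pp quoted above. The listed figures give roughly 707 times fewer parameters and roughly 21 times less memory.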
Vision, planning, balance, locomotion, manipulation, speech and intent must all close their control loops in tens of milliseconds. Cloud round-trips and giant transformer stacks simply cannot fit.
Humanoid platforms run on battery. Every gigabyte of RAM and every watt of GPU draw is taken from balance, vision and runtime. You need dozens of skills, not one giant model.
On a robot, accuracy without bounded latency is worthless. Operators need shapes whose worst-case behavior is known, repeatable, and small enough to deploy at the edge.
Public industry numbers. Each layer of a humanoid stack has its own latency budget and memory ceiling — and Seed shapes fit cleanly inside.
| Layer | Frequency | Memory ceiling |
|---|---|---|
| Balance & gait | 1 kHz | < 1 MiB |
| Manipulation | 200–500 Hz | < 5 MiB |
| Vision tracking | 60–120 Hz | < 50 MiB |
| Scene & semantics | 10–30 Hz | < 200 MiB |
| Intent & dialogue | 5–10 Hz | < 500 MiB |
| Planning | 1–5 Hz | < 1 GiB |
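
One way to read the table is to turn each loop frequency into a period (1 kHz leaves 1 ms per cycle) and check a model's latency and memory against it. The sketch below makes two simplifying assumptions that are not from the source: a single inference may use the whole period, and the CPU latencies from the Banking77 table carry over to the target hardware. `LAYERS`, `period_ms`, and `layers_that_fit` are hypothetical names.

```python
# (layer, min Hz, max Hz, memory ceiling in MiB), copied from the table above.
LAYERS = [
    ("Balance & gait", 1000, 1000, 1),
    ("Manipulation", 200, 500, 5),
    ("Vision tracking", 60, 120, 50),
    ("Scene & semantics", 10, 30, 200),
    ("Intent & dialogue", 5, 10, 500),
    ("Planning", 1, 5, 1024),
]


def period_ms(hz: float) -> float:
    """Time available per control-loop cycle, in milliseconds."""
    return 1000.0 / hz


def layers_that_fit(latency_ms: float, memory_mib: float, tightest: bool = True):
    """List the layers whose period and memory ceiling a model stays within.

    tightest=True uses the highest frequency of each range (the shortest
    period); tightest=False uses the lowest frequency instead.
    """
    fits = []
    for name, min_hz, max_hz, ceiling_mib in LAYERS:
        budget_ms = period_ms(max_hz if tightest else min_hz)
        if latency_ms <= budget_ms and memory_mib < ceiling_mib:
            fits.append(name)
    return fits


# Latency and memory figures from the Banking77 table on this page.
print(layers_that_fit(0.11, 210 / 1024))        # Seed AutoArch Efficient
print(layers_that_fit(8.0, 27))                 # Soto
print(layers_that_fit(225.0, 68, tightest=False))  # Seed AutoArch High-Accuracy
```

Under these assumptions the Efficient shape fits every layer, Soto fits Vision tracking and everything slower, and the High-Accuracy shape fits only Planning, and only at the low end of its frequency range. A real budget would also reserve part of each period for sensing, actuation, and scheduling overhead.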
Soto is live today — free API key, no card. Or bring your task, latency budget, and hardware ceiling and we'll find you a Seed shape that fits.