Engineering

CPU, not cloud

2026-05-25 · Qovaryx Team

Every architectural decision in Qovaryx was made in service of one constraint: the model has to run on your CPU, on your laptop, locally. Not "could in theory run." Not "runs with a Pro subscription to our cloud." Runs.

Why this is the right constraint

What it forced us to give up

Honest answer: depth. A massive transformer would have more raw capacity. We don't have that.

What we have instead is a cluster of small heads that, between them, cover the dimensions that matter for the task. The aggregate parameter count is a fraction of a frontier LLM's, but the task-specific surface is dense.
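
To make the "cluster of small heads" idea concrete, here is a toy sketch in Python. It is not the Qovaryx architecture, which this post deliberately does not disclose. The names `SmallHead`, `make_heads`, and `ensemble_score`, the linear form of each head, the random weights, and the averaging rule are all assumptions chosen for illustration. The only point it demonstrates is that several tiny scorers can be combined while the total parameter count stays small.

```python
import random
from dataclasses import dataclass


@dataclass
class SmallHead:
    """Hypothetical tiny linear scorer: one weight per feature plus a bias."""
    weights: list
    bias: float

    @property
    def param_count(self) -> int:
        # Every weight and the bias counts as one parameter.
        return len(self.weights) + 1

    def score(self, features: list) -> float:
        return sum(w * x for w, x in zip(self.weights, features)) + self.bias


def make_heads(n_heads: int, n_features: int, seed: int = 0) -> list:
    """Build n_heads randomly initialised heads (illustrative, untrained)."""
    rng = random.Random(seed)
    return [
        SmallHead(
            weights=[rng.uniform(-1.0, 1.0) for _ in range(n_features)],
            bias=rng.uniform(-1.0, 1.0),
        )
        for _ in range(n_heads)
    ]


def ensemble_score(heads: list, features: list) -> float:
    """Average the per-head scores (an assumed aggregation rule)."""
    return sum(h.score(features) for h in heads) / len(heads)


heads = make_heads(n_heads=8, n_features=16)
total_params = sum(h.param_count for h in heads)
print(f"{len(heads)} heads, {total_params} parameters in total")
print(f"ensemble score: {ensemble_score(heads, [0.5] * 16):.4f}")
```

Eight heads over sixteen features come to 8 × 17 = 136 parameters here. A real system would use trained heads and a task-specific way of combining them, but the arithmetic of "many small parts, small total" is the same.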

How we made it fit

We won't write the recipe here — that's the part that took 18 months. The shape:

What it looks like at runtime

The Trading Engine card in the app shows CPU: Ryzen 7 7700 (44%) and 54.9 / 63.0 GB RAM. No GPU row, no GPU dependency, no "spinning up inference" delay. The first scored chart appears within milliseconds of pressing send.
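
One way to check a "within milliseconds" claim on your own machine is to time a CPU-only scoring call directly. The sketch below uses only the Python standard library. `score_chart` is a hypothetical stand-in (it just averages price changes), not the Qovaryx model, and `median_latency_ms` is an illustrative helper, so the number it prints says nothing about the real engine.

```python
import statistics
import time


def score_chart(prices: list) -> float:
    """Hypothetical stand-in for a small CPU-only model: mean price change."""
    changes = [b - a for a, b in zip(prices, prices[1:])]
    return sum(changes) / len(changes)


def median_latency_ms(fn, arg, runs: int = 200) -> float:
    """Median wall-clock time of fn(arg), in milliseconds, over `runs` calls."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fn(arg)
        samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(samples)


# Synthetic 500-point price series, purely for timing purposes.
chart = [100.0 + 0.1 * i for i in range(500)]
latency = median_latency_ms(score_chart, chart)
print(f"median latency: {latency:.3f} ms")
```

The median is used rather than the mean so that a single slow call, for example one interrupted by the operating system scheduler, does not dominate the result.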

Cloud AI is what you do when the model is too big for the machine. We made the machine right by making the model small.

Not financial advice. Architecture notes describe what we built, not how to trade. Options trading involves substantial risk of loss.