Building A Deterministic Quote Router In Rust

shipped · 2026

A technical case study on Aqueducta's snapshot-first routing API for Movement L1.

Rust · DeFi · Routing · Distributed Systems · Movement

Aqueducta started with a simple product question: if a wallet asks for the best route between two assets on Movement L1, how much hidden nondeterminism am I willing to accept?

My answer ended up being: as little as possible.

In DeFi routing, the obvious hard parts are pool math, graph search, and execution handoff. The less obvious hard part is making the system explainable. If two requests return different routes, I want to know whether the liquidity changed, the request changed, the policy changed, or the software regressed. That pushed the backend design toward a snapshot-first architecture: discover chain state outside the request path, name that state with a content-derived identifier, and route against that fixed input.

This article covers the Rust routing/API side of Aqueducta as a technical case study. The on-chain Move routing module is a separate article.

The Shape Of The System

At runtime, the backend has three jobs:

  1. keep an in-memory view of liquidity fresh enough to serve quotes
  2. evaluate exact-in route candidates deterministically
  3. expose enough diagnostics for clients and operators to know what happened

The core design looks like this:

flowchart LR
  Client["Wallet, app, or route preview"] --> API["Rust quote API"]
  API --> Snapshot["Hot snapshot<br/>named by snapshot_id"]
  API --> Graph["Token graph<br/>k-hop route search"]
  API --> Math["Quote engine<br/>CP + CLMM math"]
  API --> Response["Deterministic quote response"]

  Worker["Refresh workers"] --> Discovery["DEX discovery adapters"]
  Discovery --> Chain["Movement REST API"]
  Discovery --> Bundle["Snapshot bundle"]
  Bundle --> Snapshot

  API --> Metrics["Metrics and status"]

The important boundary is between refresh work and request work. Quote requests do not go back to chain discovery to figure out what pools exist. They evaluate against the active snapshot, or against a retained historical snapshot if the client asks for one explicitly.

That one decision simplified nearly everything else.
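One way to picture that boundary is a store that refresh workers write to and quote handlers only read from. The following is a minimal sketch, not Aqueducta's actual code: `SnapshotStore`, `install`, and `load` are hypothetical names, the `Snapshot` type is trimmed to two fields, and eviction of old retained snapshots is left out.

```rust
use std::collections::HashMap;
use std::sync::{Arc, RwLock};

/// Immutable snapshot. The real bundle also carries pools, tokens, and diagnostics.
#[derive(Debug)]
pub struct Snapshot {
    pub snapshot_id: String,
    pub as_of_unix_secs: u64,
}

#[derive(Default)]
struct Inner {
    active: Option<Arc<Snapshot>>,
    retained: HashMap<String, Arc<Snapshot>>,
}

/// Hypothetical store: refresh workers write, quote handlers only read.
#[derive(Default)]
pub struct SnapshotStore {
    inner: RwLock<Inner>,
}

impl SnapshotStore {
    /// Called by a refresh worker once a bundle has been built.
    pub fn install(&self, snapshot: Snapshot) {
        let snapshot = Arc::new(snapshot);
        let mut inner = self.inner.write().unwrap();
        inner
            .retained
            .insert(snapshot.snapshot_id.clone(), Arc::clone(&snapshot));
        inner.active = Some(snapshot);
    }

    /// Called on the request path. It never reaches out to chain discovery.
    pub fn load(&self, required_snapshot_id: Option<&str>) -> Option<Arc<Snapshot>> {
        let inner = self.inner.read().unwrap();
        match required_snapshot_id {
            // A pinned snapshot is either retained or the request fails explicitly.
            Some(id) => inner.retained.get(id).cloned(),
            None => inner.active.clone(),
        }
    }
}
```

Because a request holds an `Arc` to one immutable snapshot, a refresh that lands mid-request cannot change the state that request is evaluated against.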

Snapshot-First Routing

The snapshot is the unit of truth. It contains pool metadata, pool state, token metadata, and discovery diagnostics. The backend sorts the material that goes into the snapshot and derives a snapshot_id from the canonical payload.

Conceptually:

struct SnapshotBundle {
    chain_id: u64,
    as_of_unix_secs: u64,
    snapshot_id: String,
    pools: Vec<PoolMeta>,
    snapshots: Vec<PoolSnapshot>,
    tokens: Vec<TokenInfo>,
    diagnostics: Option<DiscoveryDiagnostics>,
}

The snapshot ID is not just a label. It is the input handle for replay. A quote response can say, in effect: “this route was computed against snapshot sha256:....” If a client wants to verify or replay a quote, it can pass required_snapshot_id and force the API to use that retained snapshot rather than whatever happens to be hot now.
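To make "content-derived" concrete, here is a hedged sketch of deriving an identifier from sorted pool material. Several things are assumptions for illustration: the `PoolEntry` fields, the text layout of the canonical payload, and the choice to hash only the chain ID and pool state. The hash is FNV-1a, swapped in only because it needs no dependencies; an identifier of the `sha256:...` form would come from a cryptographic hash instead.

```rust
/// Minimal stand-in for the pool material inside a snapshot (hypothetical fields).
#[derive(Clone, Debug)]
pub struct PoolEntry {
    pub dex: String,
    pub pool_ref: String,
    pub reserve_a: u128,
    pub reserve_b: u128,
}

/// FNV-1a 64-bit. Not cryptographic; used here only to keep the sketch dependency-free.
fn fnv1a64(bytes: &[u8]) -> u64 {
    let mut hash: u64 = 0xcbf2_9ce4_8422_2325;
    for byte in bytes {
        hash ^= u64::from(*byte);
        hash = hash.wrapping_mul(0x0000_0100_0000_01b3);
    }
    hash
}

pub fn derive_snapshot_id(chain_id: u64, pools: &[PoolEntry]) -> String {
    // Sort references to the pools so discovery order never leaks into the identifier.
    let mut sorted: Vec<&PoolEntry> = pools.iter().collect();
    sorted.sort_by(|a, b| (&a.dex, &a.pool_ref).cmp(&(&b.dex, &b.pool_ref)));

    // Canonical payload: fixed field order and explicit separators.
    let mut payload = format!("chain={}\n", chain_id);
    for pool in sorted {
        payload.push_str(&format!(
            "{}|{}|{}|{}\n",
            pool.dex, pool.pool_ref, pool.reserve_a, pool.reserve_b
        ));
    }
    format!("fnv1a64:{:016x}", fnv1a64(payload.as_bytes()))
}
```

Two discovery runs that see the same pools in a different order produce the same identifier, while any change in pool state produces a different one.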

sequenceDiagram
  autonumber
  participant Worker as Refresh worker
  participant DEX as Discovery adapters
  participant Chain as Movement REST
  participant Store as Snapshot store
  participant API as Quote API
  participant Client

  Worker->>DEX: discover pools and pool state
  DEX->>Chain: view/resources/events
  Chain-->>DEX: liquidity data
  DEX-->>Worker: sorted discovery result
  Worker->>Worker: build snapshot_id
  Worker->>Store: install active snapshot
  Client->>API: quote request
  API->>Store: load active or required snapshot
  API-->>Client: quote response with snapshot_id

This creates a useful operational distinction:

  • freshness is a property of the snapshot lifecycle
  • determinism is a property of routing over a named snapshot

Those are related, but they are not the same problem. Freshness is monitored with snapshot age, discovery coverage, and readiness checks. Determinism is tested with fixed fixtures and repeated request replay.

Request Path

The quote path is deliberately staged. Every stage has a narrow purpose.

flowchart TD
  Request["Quote request"] --> Validate["Validate amount, chain, tokens,<br/>slippage, deadline, fee controls"]
  Validate --> Snapshot{"required_snapshot_id?"}
  Snapshot -->|yes| Historical["Load retained snapshot"]
  Snapshot -->|no| Active["Load active snapshot"]
  Historical --> Policy["Apply routing policy"]
  Active --> Policy
  Policy --> Pools["Filter pools by DEX and token controls"]
  Pools --> Cache{"Route skeleton cache hit?"}
  Cache -->|yes| Candidates["Use cached candidates"]
  Cache -->|no| Search["Build graph and enumerate candidates"]
  Search --> Candidates
  Candidates --> Quote["Run quote math"]
  Quote --> Rank["Rank by quality, output, stable route key"]
  Rank --> IDs["Derive quote_id and route_id"]
  IDs --> Response["Return quote + diagnostics"]

A simplified request model looks like this:

struct QuoteRequest {
    chain_id: u64,
    token_in: String,
    token_out: String,
    amount_in: u128,
    slippage_bps: u16,
    deadline_unix_secs: u64,
    max_hops: usize,
    routing_mode: Option<RoutingMode>,
    required_snapshot_id: Option<String>,
    allow_dexes: Option<Vec<String>>,
    exclude_dexes: Option<Vec<String>>,
    partner_fee_bps: Option<u16>,
    partner_fee_recipient: Option<String>,
}

The real request surface has more controls, but the theme is the same: clients can trade latency for breadth, pin snapshots for replay, narrow DEXes, and ask for diagnostics when they need to explain a decision.
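The validation stage at the top of the request path can be sketched as a pure function. Everything below is illustrative: `QuoteRequestLite` is a trimmed copy of the request model, the error variants are hypothetical, and the limits are placeholder values rather than Aqueducta's real policy. Passing the current time in as a parameter keeps the check itself deterministic.

```rust
#[derive(Debug, PartialEq, Eq)]
pub enum ValidationError {
    ZeroAmount,
    SameToken,
    SlippageOutOfRange,
    DeadlineExpired,
    HopsOutOfRange,
    PartnerFeeWithoutRecipient,
}

/// Trimmed-down request with only the fields this sketch checks.
pub struct QuoteRequestLite {
    pub token_in: String,
    pub token_out: String,
    pub amount_in: u128,
    pub slippage_bps: u16,
    pub deadline_unix_secs: u64,
    pub max_hops: usize,
    pub partner_fee_bps: Option<u16>,
    pub partner_fee_recipient: Option<String>,
}

// Hypothetical limits, chosen for illustration only.
const MAX_SLIPPAGE_BPS: u16 = 5_000;
const MAX_HOPS: usize = 4;

pub fn validate(req: &QuoteRequestLite, now_unix_secs: u64) -> Result<(), ValidationError> {
    if req.amount_in == 0 {
        return Err(ValidationError::ZeroAmount);
    }
    if req.token_in.eq_ignore_ascii_case(&req.token_out) {
        return Err(ValidationError::SameToken);
    }
    if req.slippage_bps > MAX_SLIPPAGE_BPS {
        return Err(ValidationError::SlippageOutOfRange);
    }
    if req.deadline_unix_secs <= now_unix_secs {
        return Err(ValidationError::DeadlineExpired);
    }
    if req.max_hops == 0 || req.max_hops > MAX_HOPS {
        return Err(ValidationError::HopsOutOfRange);
    }
    // A nonzero partner fee is meaningless without somewhere to send it.
    if req.partner_fee_bps.unwrap_or(0) > 0 && req.partner_fee_recipient.is_none() {
        return Err(ValidationError::PartnerFeeWithoutRecipient);
    }
    Ok(())
}
```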

Route Search As A Token Graph

Pools become directed edges in a token graph. A pool between token A and token B contributes both A-to-B and B-to-A edges. The graph search is breadth-first with cycle prevention, and every ordering decision is canonicalized.

The route search is not trying to be clever first. It is trying to be stable first.

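use std::collections::HashMap;

// One direction of a pool; each pool contributes two of these.
struct Edge {
    pool: PoolRef,
    dex: String,
    token_in: String,
    token_out: String,
    fee_bps: Option<u16>,
}

struct PoolGraph {
    // token -> outgoing edges, sorted by stable fields before any search runs
    adj: HashMap<String, Vec<Edge>>,
}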

The key detail is edge ordering. Before route enumeration, each token’s outgoing edges are sorted by stable fields: output token, DEX name, pool reference, and fee. Candidate routes are sorted again by a route key. That means identical input produces an identical candidate order, regardless of the order in which discovery returned the pools.

flowchart LR
  A["Token A"] -- "Pool 1" --> B["Token B"]
  B -- "Pool 2" --> C["Token C"]
  A -- "Pool 3" --> C
  C -- "Pool 4" --> D["Token D"]

  subgraph Routes["Candidate routes"]
    R1["A -> C"]
    R2["A -> B -> C"]
    R3["A -> C -> D"]
  end
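The edge ordering and cycle-free breadth-first enumeration described above can be sketched as follows. This is an illustration under assumptions, not the production search: pools are passed in as plain tuples, `PoolRef` is replaced by a `String`, routes are returned as lists of pool names, and the final route ordering (fewer hops first, then pool names) is one possible route key.

```rust
use std::collections::{HashMap, VecDeque};

#[derive(Clone, Debug)]
pub struct Edge {
    pub pool: String,
    pub dex: String,
    pub token_in: String,
    pub token_out: String,
    pub fee_bps: Option<u16>,
}

pub struct PoolGraph {
    adj: HashMap<String, Vec<Edge>>,
}

impl PoolGraph {
    /// Each pool is (pool, dex, token_a, token_b, fee) and contributes both directions.
    pub fn build(pools: &[(&str, &str, &str, &str, Option<u16>)]) -> Self {
        let mut adj: HashMap<String, Vec<Edge>> = HashMap::new();
        for &(pool, dex, a, b, fee_bps) in pools {
            for (token_in, token_out) in [(a, b), (b, a)] {
                adj.entry(token_in.to_string()).or_default().push(Edge {
                    pool: pool.to_string(),
                    dex: dex.to_string(),
                    token_in: token_in.to_string(),
                    token_out: token_out.to_string(),
                    fee_bps,
                });
            }
        }
        // Canonical edge order: output token, DEX name, pool reference, fee.
        for edges in adj.values_mut() {
            edges.sort_by(|x, y| {
                (&x.token_out, &x.dex, &x.pool, x.fee_bps)
                    .cmp(&(&y.token_out, &y.dex, &y.pool, y.fee_bps))
            });
        }
        PoolGraph { adj }
    }

    /// Breadth-first enumeration of pool paths; a token is never revisited.
    pub fn routes(&self, from: &str, to: &str, max_hops: usize) -> Vec<Vec<String>> {
        let mut found: Vec<Vec<String>> = Vec::new();
        // Queue items: (current token, tokens visited, pools taken so far).
        let mut queue = VecDeque::new();
        queue.push_back((from.to_string(), vec![from.to_string()], Vec::<String>::new()));
        while let Some((token, visited, path)) = queue.pop_front() {
            if path.len() == max_hops {
                continue;
            }
            let edges = match self.adj.get(&token) {
                Some(edges) => edges,
                None => continue,
            };
            for edge in edges {
                if visited.contains(&edge.token_out) {
                    continue; // cycle prevention
                }
                let mut next_path = path.clone();
                next_path.push(edge.pool.clone());
                if edge.token_out == to {
                    found.push(next_path);
                    continue;
                }
                let mut next_visited = visited.clone();
                next_visited.push(edge.token_out.clone());
                queue.push_back((edge.token_out.clone(), next_visited, next_path));
            }
        }
        // Final canonical order: fewer hops first, then pool names.
        found.sort_by(|a, b| a.len().cmp(&b.len()).then_with(|| a.cmp(b)));
        found
    }
}
```

With the four pools from the diagram, a two-hop search from A to C yields the direct route through Pool 3 followed by the route through Pools 1 and 2, and the result is the same whichever order the pools are supplied in.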

I use routing modes to make the latency/coverage tradeoff explicit:

| Mode | Purpose | Behavior |
| --- | --- | --- |
| fast | wallet previews and immediate UI feedback | smaller candidate budget, tighter hop policy |
| balanced | ranked alternatives | broader candidate search |
| best_price | slower power-user comparisons | widest search budget |

This is a product decision as much as an engineering one. A wallet preview and a research route comparison should not pretend to have the same latency budget.
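In code, a routing mode can reduce to a small lookup from mode to search budget. The sketch below assumes a `SearchBudget` type and invents the numbers purely for illustration; only the relative ordering (fast is narrowest, best_price is widest) comes from the modes described above.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub enum RoutingMode {
    Fast,
    Balanced,
    BestPrice,
}

#[derive(Debug, PartialEq, Eq)]
pub struct SearchBudget {
    pub max_hops: usize,
    pub max_candidates: usize,
}

impl RoutingMode {
    /// Combine the mode's cap with what the client asked for.
    pub fn budget(self, requested_max_hops: usize) -> SearchBudget {
        // Placeholder numbers: the real budgets are not given in this article.
        let (hop_cap, max_candidates) = match self {
            RoutingMode::Fast => (2, 16),
            RoutingMode::Balanced => (3, 64),
            RoutingMode::BestPrice => (4, 256),
        };
        SearchBudget {
            // The mode can tighten the client's hop limit but never widen it.
            max_hops: requested_max_hops.min(hop_cap),
            max_candidates,
        }
    }
}
```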

Caching Route Skeletons, Not Quotes

One easy trap in quote services is caching too much. Full quote responses depend on amount, slippage, deadline, partner fees, diagnostics, and execution options. Caching those can create subtle invalidation bugs.

Aqueducta caches route skeletons instead.

A route skeleton says: “for this snapshot, token pair, routing mode, and hop policy, these are candidate paths worth evaluating.” It does not say how much output they produce for a particular request amount.

flowchart TD
  Key["snapshot_id + token pair + mode + hop policy"] --> Skeletons["Cached route skeletons"]
  Skeletons --> QuoteA["Quote amount A"]
  Skeletons --> QuoteB["Quote amount B"]
  Skeletons --> QuoteC["Quote amount C"]

That gives the hot path a useful optimization without turning the cache into a source of stale quote data. Popular pairs can skip repeated graph search, while every request still runs fresh quote math against the selected snapshot.
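A minimal sketch of such a cache, assuming hypothetical names (`SkeletonKey`, `SkeletonCache`, `get_or_search`) and leaving out eviction and concurrency. The point it illustrates is that the key contains the snapshot ID but no amount, so a new snapshot misses naturally and different amounts share one entry.

```rust
use std::collections::HashMap;

#[derive(Clone, Debug, PartialEq, Eq, Hash)]
pub struct SkeletonKey {
    pub snapshot_id: String,
    pub token_in: String,
    pub token_out: String,
    pub mode: String,
    pub max_hops: usize,
}

/// A skeleton is an ordered list of pool references with no amounts attached.
pub type RouteSkeleton = Vec<String>;

#[derive(Default)]
pub struct SkeletonCache {
    entries: HashMap<SkeletonKey, Vec<RouteSkeleton>>,
    /// Number of times the graph search actually ran.
    pub searches: usize,
}

impl SkeletonCache {
    /// Return cached skeletons, running `search` only on a miss.
    pub fn get_or_search<F>(&mut self, key: SkeletonKey, search: F) -> &[RouteSkeleton]
    where
        F: FnOnce() -> Vec<RouteSkeleton>,
    {
        if !self.entries.contains_key(&key) {
            self.searches += 1;
            self.entries.insert(key.clone(), search());
        }
        &self.entries[&key]
    }
}
```

Quote math still runs per request on whatever skeletons come back, so a cache hit changes latency but not the quoted amounts.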

Quote Math Boundaries

The quote engine evaluates a route one hop at a time. Each hop receives the current input amount, pool metadata, and pool snapshot. The output of one hop becomes the input of the next.

trait Quoter {
    fn quote_hop(
        &self,
        token_in: &str,
        token_out: &str,
        pool: PoolContext,
        amount_in: u128,
    ) -> Result<HopQuote>;
}

For constant-product pools, the math is straightforward integer arithmetic:

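fn quote_constant_product(
    reserve_in: u128,
    reserve_out: u128,
    fee_bps: u16,
    amount_in: u128,
) -> u128 {
    // Assumes fee_bps <= 10_000, a nonzero denominator, and products that fit in u128.
    let fee_denominator = 10_000u128;
    // Take the fee from the input first; integer division rounds down.
    let amount_after_fee =
        amount_in * (fee_denominator - fee_bps as u128) / fee_denominator;

    // Constant-product output for the post-fee input, rounded down.
    amount_after_fee * reserve_out / (reserve_in + amount_after_fee)
}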
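The function above trusts its inputs. A defensive variant, sketched here under the assumption that a failed quote should surface as `None` rather than a panic or a wrapped value, uses checked arithmetic throughout. The name `quote_constant_product_checked` is hypothetical.

```rust
/// Same formula as the unchecked version, but every step that can overflow,
/// underflow, or divide by zero returns `None` instead.
pub fn quote_constant_product_checked(
    reserve_in: u128,
    reserve_out: u128,
    fee_bps: u16,
    amount_in: u128,
) -> Option<u128> {
    const FEE_DENOMINATOR: u128 = 10_000;
    // Rejects fee_bps above 10_000.
    let fee_keep = FEE_DENOMINATOR.checked_sub(u128::from(fee_bps))?;
    let amount_after_fee = amount_in.checked_mul(fee_keep)? / FEE_DENOMINATOR;
    let numerator = amount_after_fee.checked_mul(reserve_out)?;
    let denominator = reserve_in.checked_add(amount_after_fee)?;
    if denominator == 0 {
        return None;
    }
    Some(numerator / denominator)
}
```

As a worked example, take reserves of 1,000,000 in and 2,000,000 out, a 30 bps fee, and an input of 10,000. The post-fee input is 10,000 × 9,970 / 10,000 = 9,970, and the output is 9,970 × 2,000,000 / 1,009,970, which rounds down to 19,743.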

Concentrated liquidity is more involved because the quote depends on tick state. The design keeps that complexity behind the same hop-quote interface. The route engine does not need to know whether a hop is constant-product, CLMM, or backed by an upstream preview call. It only needs a quote quality and an output amount.
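The hop-by-hop chaining can be sketched with a deliberately simplified trait. `HopQuoter` below takes only an amount, whereas the interface shown earlier also receives tokens and pool context, and `FixedRatio` is a toy quoter standing in for real constant-product or CLMM math. Both are assumptions made to keep the example self-contained.

```rust
/// Simplified hop interface: amount in, amount out, or `None` if the hop cannot be quoted.
pub trait HopQuoter {
    fn quote_hop(&self, amount_in: u128) -> Option<u128>;
}

/// Toy quoter that pays out a fixed ratio of its input.
pub struct FixedRatio {
    pub num: u128,
    pub den: u128,
}

impl HopQuoter for FixedRatio {
    fn quote_hop(&self, amount_in: u128) -> Option<u128> {
        if self.den == 0 {
            return None;
        }
        amount_in.checked_mul(self.num).map(|v| v / self.den)
    }
}

/// Feed each hop's output into the next hop; stop at the first hop that fails.
pub fn quote_route(hops: &[&dyn HopQuoter], amount_in: u128) -> Option<u128> {
    hops.iter()
        .try_fold(amount_in, |amount, hop| hop.quote_hop(amount))
}
```

The route evaluator only sees the trait, so adding a new pool type means adding an implementation rather than touching the chaining logic.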

Ranking

After quote math, routes are sorted by:

  1. quote quality, best first
  2. expected output, highest first
  3. deterministic route key, ascending

The third point is easy to overlook. If two routes tie, random map iteration order should not decide which one a wallet sees first.

quotes.sort_by(|a, b| {
    quality_rank(b.quality)
        .cmp(&quality_rank(a.quality))
        .then_with(|| b.amount_out.cmp(&a.amount_out))
        .then_with(|| route_key(&a.plan).cmp(&route_key(&b.plan)))
});

That tie-breaker makes tests sharper. A snapshot replay should fail because behavior changed, not because a collection happened to iterate differently.
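A self-contained version of that comparator shows the tie-breaker at work. The `Quality` variants, their ranks, and the use of a plain `String` as the route key are assumptions for illustration.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Quality {
    Exact,
    Estimated,
}

/// Higher rank sorts first. The variants and ranks are hypothetical.
fn quality_rank(quality: Quality) -> u8 {
    match quality {
        Quality::Exact => 2,
        Quality::Estimated => 1,
    }
}

#[derive(Clone, Debug, PartialEq, Eq)]
pub struct RankedQuote {
    pub quality: Quality,
    pub amount_out: u128,
    pub route_key: String,
}

pub fn rank(quotes: &mut [RankedQuote]) {
    quotes.sort_by(|a, b| {
        quality_rank(b.quality)
            .cmp(&quality_rank(a.quality))
            .then_with(|| b.amount_out.cmp(&a.amount_out))
            // Ascending route key settles exact ties the same way every time.
            .then_with(|| a.route_key.cmp(&b.route_key))
    });
}
```

Two routes with equal quality and equal output come back in route-key order no matter how they were arranged before sorting, which is what lets a replay test compare whole responses.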

Response Identity

The API returns deterministic IDs:

  • snapshot_id names the input state
  • quote_id names the normalized request over that input state
  • route_id names a specific route for that normalized request

flowchart TD
  Snapshot["snapshot_id"] --> QuoteID["quote_id"]
  Request["normalized request"] --> QuoteID
  Snapshot --> RouteID["route_id"]
  Request --> RouteID
  Route["route key"] --> RouteID

The normalized request key sorts unordered filters, lowercases case-insensitive fields, and includes policy-relevant controls. The goal is that semantically equivalent requests produce the same identity material.
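Normalization can be sketched as a function that builds a canonical string from the request. The field set, the `key=value|...` layout, and the assumption that token addresses and DEX names are case-insensitive are all illustrative; the real key includes more policy controls.

```rust
/// Build a canonical key so semantically equal requests hash to the same identity.
pub fn normalized_request_key(
    token_in: &str,
    token_out: &str,
    amount_in: u128,
    slippage_bps: u16,
    max_hops: usize,
    allow_dexes: &[&str],
) -> String {
    // Unordered filter: lowercase, sort, and drop duplicates.
    let mut dexes: Vec<String> = allow_dexes
        .iter()
        .map(|dex| dex.trim().to_ascii_lowercase())
        .collect();
    dexes.sort();
    dexes.dedup();

    format!(
        "in={}|out={}|amt={}|slip={}|hops={}|allow={}",
        token_in.to_ascii_lowercase(),
        token_out.to_ascii_lowercase(),
        amount_in,
        slippage_bps,
        max_hops,
        dexes.join(",")
    )
}
```

Hashing this key together with the snapshot ID would give material for a quote identifier, and adding a route key to the same material would give a route identifier.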

This also makes client caching cleaner. An app can key route state by snapshot_id, token pair, amount, slippage, hop policy, and routing mode, then invalidate naturally when the snapshot changes.

Operating The Router

The backend exposes two types of status:

  • product-level status for clients and dashboards
  • infrastructure-level status for orchestration

flowchart LR
  API["Quote API"] --> Health["/v1/health"]
  API --> Ready["/readyz"]
  API --> Live["/livez"]
  API --> Status["/v1/status"]
  API --> Metrics["/metrics"]

  Ready --> Kube["Kubernetes readiness"]
  Live --> Kube
  Metrics --> Prom["Prometheus"]
  Prom --> Grafana["Grafana"]

The status surface includes the things I would want during an incident:

  • active snapshot ID
  • retained snapshot IDs
  • snapshot age
  • pool count
  • discovery coverage by DEX
  • tick coverage for CLMM pools
  • route cache entries and warm status
  • refresh worker success/failure timestamps
  • runtime controls such as enabled DEXes and execution hint mode

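The most valuable design choice here is that liveness and readiness are not the same. The process can be live while the router is not ready to serve traffic because no fresh snapshot is available. That distinction matters in Kubernetes: a failing liveness probe restarts the container, while a failing readiness probe only takes the pod out of the service's endpoints until it reports ready again.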
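A hedged sketch of the two checks, assuming a hypothetical `RouterState` that records when the active snapshot was taken and a configurable maximum snapshot age. The real readiness check likely considers more signals, such as discovery coverage.

```rust
pub struct RouterState {
    /// `as_of_unix_secs` of the active snapshot, if one has been installed.
    pub active_snapshot_as_of: Option<u64>,
}

/// Liveness: the process is up and able to answer. Nothing else is implied.
pub fn is_live() -> bool {
    true
}

/// Readiness: there is an active snapshot and it is fresh enough to quote against.
pub fn is_ready(state: &RouterState, now_unix_secs: u64, max_snapshot_age_secs: u64) -> bool {
    match state.active_snapshot_as_of {
        // saturating_sub guards against a snapshot stamped slightly in the future.
        Some(as_of) => now_unix_secs.saturating_sub(as_of) <= max_snapshot_age_secs,
        None => false,
    }
}
```

A freshly started process is live but not ready, and a process whose refresh workers have stalled drifts from ready to not ready without ever being restarted for it.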

Testing For The Things That Actually Break

The tests I care about most go beyond “does the endpoint return 200?” They also ask:

  • does the same request over the same snapshot return the same response?
  • does a required historical snapshot quote against that snapshot or fail explicitly?
  • does route ordering stay stable when two routes tie?
  • does the API reject invalid fee, token, hop, and deadline controls?
  • does the route cache behave as an optimization rather than a source of quote truth?
  • does the service report degraded discovery without hiding it?

Fixed snapshot fixtures are central to this. They let the test suite replay quote behavior without depending on live chain state.

What I Would Keep

The snapshot-first design is the part I would repeat. It gave the system a useful spine:

  • discovery is responsible for building named state
  • routing is responsible for deterministic candidate generation
  • quote math is responsible for evaluating candidates
  • the API is responsible for policy, identity, and diagnostics

That separation made the project easier to test and easier to reason about. It also created better vocabulary. When something changes, I can ask a precise question: did the snapshot change, did the request change, did policy change, or did the code change?

For a DeFi routing service, that question is worth designing around.