Mastering the Orderbook

Welcome

Polymarket prices are outcome-token prices, not clean probabilities. This tutorial shows you how to reconstruct a live book, compute real executable edge, and build a paper-trading loop that survives contact with actual spreads and latency.

Estimated time: 90 minutes
Format: Step-by-step

What you'll learn

Why edge = fair_prob minus effective fill price, never midpoint
How to map UP and DOWN token IDs without guessing
How to reconstruct a live level-2 book from WebSocket deltas
How to compute spread, depth, microprice, and diffusion fair value
How to gate signals with regime filters and churn guards
How to paper-trade with latency-aware depth-walked fills and markout logging

What you'll need

Python 3.10+ with websockets, httpx, scipy, and numpy installed
Familiarity with REST and WebSocket APIs (exchange experience is fine)
A Polymarket account for reading market metadata (no funds required for paper trading)
Basic understanding of binary options or prediction markets is helpful but not required

Understand Outcome Tokens and the Edge Formula

Before writing a single line of bot code, you need a precise mental model of what a Polymarket price actually represents. Getting this wrong produces strategies that look profitable in backtests and lose money in production. This step establishes the vocabulary and the one formula that governs every trading decision in this tutorial.

Follow Along With AI

Walk me through this step

Paste this into your AI coding agent to work through this step. Includes both walk-me-through framing and the specific sub-tasks for this step.

I'm working through a step-by-step tutorial.

I'm on Step 01: Understand Outcome Tokens and the Edge Formula.

Step goal: Before writing a single line of bot code, you need a precise mental model of what a Polymarket price actually represents. Getting this wrong produces strategies that look profitable in backtests and lose money in production. This step establishes the vocabulary and the one formula that governs every trading decision in this tutorial.

Walk me through this step interactively. Ask me clarifying questions if I'm stuck. When I write code, review it for any setup-specific gotchas before I run it. When I hit errors, quote my logs back to me with a plain-English explanation. Don't assume I know every library or API surface this step touches — point me to the right docs when I need them. Confirm I've actually completed the step before suggesting we move on.

Outcome tokens are not probabilities

In a Polymarket binary market, two tokens exist: one that pays 1 unit of collateral if the outcome resolves UP, and one that pays 1 unit if it resolves DOWN. A displayed price of 0.64 means the market is currently valuing that token at 64 cents on the dollar. That is not automatically a 64% probability.

The gap between a displayed price and a tradeable probability matters because you cannot buy at the displayed price. You buy at the **best ask**, and for any meaningful size you buy at the **depth-walked effective fill price**, which is never better than the top-of-book ask and is worse as soon as your size exceeds what rests at the best level. Every cent of that gap eats directly into your edge.

There are six distinct numbers a bot must track and never confuse: the displayed probability, the midpoint, the best bid, the best ask, the effective fill price for a given size, and the model fair value. Only the last two produce a meaningful edge calculation.

Concept	Example Value	Use In Bot
displayed_probability	0.50	UI only — never trade on this
midpoint	0.50	Reference only — not executable
best_bid	0.46	Sell price at top of book
best_ask	0.54	Buy price at top of book
effective_fill_price (50 shares)	0.57	Actual cost after walking depth
model_fair_value	0.53	Your diffusion estimate

The six price concepts and their relationship for a typical BTC UP token

The one formula that governs every trade

Edge is always computed against the price you will actually pay, not the price you see displayed. For a taker buy of the UP token, the formula is: taker_buy_edge = model_fair_probability - effective_fill_price. If that number is negative, you do not trade. In practice the bar is higher still, because the edge must also clear fees, a latency buffer, and one tick.

The midpoint trap is the most common mistake in prediction-market backtests. A model probability of 0.53 looks bullish against a midpoint of 0.50, but if the ask is 0.54 the executable edge is -0.01. The correct decision is no trade. Using midpoint instead of fill price inflates backtest returns by the full half-spread, which can be 4-8 cents on a thin BTC 5-minute market.
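The arithmetic behind the midpoint trap can be checked in a few lines. This is a minimal sketch that reuses the example values from the price-concepts table; the helper name `edge_vs` is illustrative and not part of the tutorial's modules.

```python
# Example values from the price-concepts table above.
fair = 0.53
best_bid, best_ask = 0.46, 0.54
fill_50 = 0.57  # depth-walked price for 50 shares

midpoint = (best_bid + best_ask) / 2


def edge_vs(price: float) -> float:
    """Edge of a buy at `price`, rounded to suppress float noise."""
    return round(fair - price, 6)


edges = {
    "midpoint (not executable)": edge_vs(midpoint),
    "best ask (top of book)": edge_vs(best_ask),
    "depth-walked fill, 50 shares": edge_vs(fill_50),
}

for label, edge in edges.items():
    print(f"{label:30s} {edge:+.2f}")
```

Only the first row is positive, and it is the one price you cannot trade at.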

python · edge.py

Core edge functions: never pass midpoint as fill_price

from dataclasses import dataclass
from typing import Optional

@dataclass
class EdgeResult:
    fair_prob: float
    fill_price: float
    raw_edge: float          # fair_prob - fill_price
    net_edge: float          # after fee and latency buffers
    tradeable: bool

FEE_BUFFER    = 0.002   # ~0.2 c round-trip fee estimate
LATENCY_BUFFER = 0.003  # conservative latency adverse-selection buffer
TICK_SIZE     = 0.01    # Polymarket minimum price increment

def compute_taker_buy_edge(
    fair_prob: float,
    effective_ask: float,   # depth-walked fill price, NOT midpoint
    fee_buffer: float = FEE_BUFFER,
    latency_buffer: float = LATENCY_BUFFER,
) -> EdgeResult:
    raw   = fair_prob - effective_ask
    net   = raw - fee_buffer - latency_buffer
    # Edge must exceed one tick to be meaningful
    return EdgeResult(
        fair_prob=fair_prob,
        fill_price=effective_ask,
        raw_edge=round(raw, 6),
        net_edge=round(net, 6),
        tradeable=(net > TICK_SIZE),
    )

# Example: looks bullish on midpoint, negative on ask
result = compute_taker_buy_edge(fair_prob=0.53, effective_ask=0.54)
print(result)
# EdgeResult(fair_prob=0.53, fill_price=0.54, raw_edge=-0.01,
#            net_edge=-0.015, tradeable=False)

Warning

Tick size kills small edges

Polymarket uses a tick size of 0.01. A raw edge of 0.006 leaves a net edge of only 0.001 after the default fee and latency buffers (0.002 + 0.003), far below one tick. Always require net_edge > TICK_SIZE before flagging a signal as tradeable. Many paper strategies fail here.

Warning

UP + DOWN = 1 is not always true

The complement identity holds for standard binary markets but breaks for negative-risk markets and multi-outcome events. Always check the neg_risk metadata flag before applying complement logic. Applying it blindly to the wrong market type produces silent mispricing.
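One way to make that check hard to forget is to route every complement calculation through a guard. The sketch below is an assumption about how you might structure it: `complement_price` and `ask_sum_gap` are hypothetical helpers, and `neg_risk` is the metadata flag described above.

```python
import warnings
from typing import Optional


def complement_price(price: float, neg_risk: bool) -> Optional[float]:
    """Return 1 - price for a standard binary market, or None if neg_risk is set."""
    if neg_risk:
        warnings.warn("neg_risk market: complement identity not applied")
        return None
    if not 0.0 <= price <= 1.0:
        raise ValueError(f"price out of range: {price}")
    return round(1.0 - price, 6)


def ask_sum_gap(up_ask: float, down_ask: float) -> float:
    """Amount by which the two best asks sum above 1.

    A positive gap means buying both sides costs more than the 1-unit payout.
    """
    return round(up_ask + down_ask - 1.0, 6)


print(complement_price(0.54, neg_risk=False))
print(ask_sum_gap(0.54, 0.50))
```

Returning None instead of a number forces the caller to handle the non-standard case explicitly rather than trading on a silently wrong price.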

Discover the Market and Map UP/DOWN Token IDs

A bot that trades the wrong token with full confidence is worse than a bot that does nothing. Polymarket does not guarantee that the first token in an API response is UP and the second is DOWN. You must query market metadata, read the outcome names explicitly, and build a verified token map before touching any orderbook data. This step shows exactly how to do that for the BTC Up/Down 5-Minute market.

Follow Along With AI

Walk me through this step

Paste this into your AI coding agent to work through this step. Includes both walk-me-through framing and the specific sub-tasks for this step.

I'm working through a step-by-step tutorial.

I'm on Step 02: Discover the Market and Map UP/DOWN Token IDs.

Step goal: A bot that trades the wrong token with full confidence is worse than a bot that does nothing. Polymarket does not guarantee that the first token in an API response is UP and the second is DOWN. You must query market metadata, read the outcome names explicitly, and build a verified token map before touching any orderbook data. This step shows exactly how to do that for the BTC Up/Down 5-Minute market.

Walk me through this step interactively. Ask me clarifying questions if I'm stuck. When I write code, review it for any setup-specific gotchas before I run it. When I hit errors, quote my logs back to me with a plain-English explanation. Don't assume I know every library or API surface this step touches — point me to the right docs when I need them. Confirm I've actually completed the step before suggesting we move on.

---

Specific sub-tasks to complete during this step:

## TASK 1: Generate a market-discovery validator
Use this after writing market_discovery.py to generate a test that catches bad token maps before any live code runs.

I have a Python function `discover_btc_updown_market(slug)` that returns a MarketMeta TypedDict with fields: condition_id (str), question (str), tokens (dict with keys 'UP' and 'DOWN' mapping to token ID strings), end_time_utc (ISO string), tick_size (float), min_order_size (float), neg_risk (bool), enable_order_book (bool).

Write a pytest test module `test_market_discovery.py` that:
1. Mocks the httpx.get call with a fixture that returns a realistic Gamma API response containing two markets, one with outcomes ['Up', 'Down'] and one with outcomes ['Down', 'Up'] (reversed order).
2. Asserts that in both cases the returned tokens['UP'] matches the token ID paired with the 'Up' outcome name, not the first element.
3. Tests that a market with neg_risk=True raises a warning (use warnings.warn, not an exception) when complement logic would be applied.
4. Tests that a slug with no matching market raises ValueError.
5. Uses only stdlib + pytest + pytest-mock. No real HTTP calls.

Why token mapping is the first thing to get right

Every Polymarket market has a condition ID, a list of outcome names, and a corresponding list of token IDs. The token IDs are the handles you pass to the CLOB when subscribing to orderbook data or placing orders. If you swap UP and DOWN, your model will compute a bullish signal and place a bet on the outcome it thinks is cheap, which is actually the opposite side.

The Gamma Markets API is the right place to start for market discovery. It returns structured metadata including the event slug, question text, outcomes array, and clobTokenIds array. The two arrays are index-aligned, so outcomes[0] maps to clobTokenIds[0]. Depending on the response, these arrays may arrive as JSON-encoded strings rather than native lists, so decode them before pairing. Read the outcome name; do not assume position.

For the BTC Up/Down 5-Minute market, the relevant fields are the condition_id (which identifies the market), the two token IDs (used for CLOB book requests and WebSocket subscriptions), the end time (stored as end_time_utc and used to compute tau), the tick_size (0.01), and the min_order_size. Collect all of these before writing any book-reading code.

python · market_discovery.py

Query Gamma API, extract condition ID, and build an explicit UP/DOWN token map

import json
from typing import TypedDict

import httpx

GAMMA_API = "https://gamma-api.polymarket.com"
CLOB_API  = "https://clob.polymarket.com"

class TokenMap(TypedDict):
    UP:   str
    DOWN: str

class MarketMeta(TypedDict):
    condition_id:       str
    question:           str
    tokens:             TokenMap
    end_time_utc:       str
    tick_size:          float
    min_order_size:     float
    neg_risk:           bool
    enable_order_book:  bool

def _as_list(value) -> list:
    """Gamma can return list fields as JSON-encoded strings; accept both forms."""
    return json.loads(value) if isinstance(value, str) else list(value)

def discover_btc_updown_market(slug: str) -> MarketMeta:
    """Fetch metadata for a BTC Up/Down market by event slug."""
    resp = httpx.get(f"{GAMMA_API}/markets", params={"slug": slug}, timeout=10)
    resp.raise_for_status()
    markets = resp.json()
    if not markets:
        raise ValueError(f"No market found for slug: {slug}")

    m = markets[0]  # take the first matching market

    # Build explicit token map: never assume index order
    outcomes  = _as_list(m["outcomes"])       # e.g. ["Up", "Down"] or ["Down", "Up"]
    token_ids = _as_list(m["clobTokenIds"])   # same length, same order
    if len(outcomes) != len(token_ids):
        raise ValueError(f"Mismatched outcomes/token IDs: {outcomes} vs {token_ids}")

    token_map: dict[str, str] = {}
    for outcome, tid in zip(outcomes, token_ids):
        key = outcome.strip().upper()   # normalise to "UP" or "DOWN"
        if key in ("UP", "DOWN"):
            token_map[key] = tid

    if set(token_map) != {"UP", "DOWN"}:
        raise ValueError(f"Unexpected outcomes: {outcomes}")

    return MarketMeta(
        condition_id      = m["conditionId"],
        question          = m["question"],
        tokens            = TokenMap(UP=token_map["UP"], DOWN=token_map["DOWN"]),
        # Assumption: "endDate" holds the full close timestamp and "endDateIso" may be
        # date-only; verify against a live response before relying on either.
        end_time_utc      = m.get("endDate") or m["endDateIso"],
        tick_size         = float(m.get("tickSize", 0.01)),
        min_order_size    = float(m.get("minOrderSize", 1)),
        neg_risk          = bool(m.get("negRisk", False)),
        enable_order_book = bool(m.get("enableOrderBook", True)),
    )

if __name__ == "__main__":
    meta = discover_btc_updown_market("btc-up-down-5-minute")
    print(meta)

Note

Check neg_risk before complement math

If neg_risk is True, the UP + DOWN = 1 complement identity does not hold without additional conversion mechanics. For standard BTC 5-minute markets neg_risk is typically False, but always assert it in code rather than assuming.

Tip

BTC 5-minute markets roll over

Each 5-minute cycle is a separate market with a new condition_id and new token IDs. Build a scheduler that re-runs discovery at the start of each cycle. Caching the token map across cycles will silently trade a closed market.
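A scheduler needs to know when the next cycle begins. The sketch below assumes that cycles are exactly 300 seconds long and aligned to the Unix epoch; that alignment is an assumption for illustration, so confirm it against the end time returned by discovery. Both function names are hypothetical.

```python
import math

CYCLE_SECONDS = 300  # assumption: 5-minute cycles aligned to the Unix epoch


def next_cycle_start(now_ts: float, period: int = CYCLE_SECONDS) -> float:
    """Unix timestamp of the first cycle boundary strictly after now_ts."""
    return (math.floor(now_ts / period) + 1) * period


def seconds_until_rediscovery(now_ts: float, period: int = CYCLE_SECONDS) -> float:
    """How long to sleep before re-running market discovery."""
    return next_cycle_start(now_ts, period) - now_ts


print(next_cycle_start(1_700_000_000))
print(seconds_until_rediscovery(1_700_000_000))
```

In a running bot you would sleep for `seconds_until_rediscovery(time.time())`, re-run discovery, and replace the cached token map rather than mutating it.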

python · market_discovery.py

Compute seconds remaining (tau) from market end_time — needed for the diffusion formula in Step 4

from datetime import datetime, timezone

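def seconds_remaining(end_time_utc: str) -> float:
    """Return tau: seconds until market close. Returns 0.0 if already closed."""
    end = datetime.fromisoformat(end_time_utc.replace("Z", "+00:00"))
    if end.tzinfo is None:
        # Naive or date-only strings: treat as UTC so the subtraction below is valid
        end = end.replace(tzinfo=timezone.utc)
    now = datetime.now(timezone.utc)
    return max(0.0, (end - now).total_seconds())

# Usage: guarded so that importing this module does not require `meta` to exist
if __name__ == "__main__":
    tau = seconds_remaining(meta["end_time_utc"])
    print(f"tau = {tau:.1f}s")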

Success

Checkpoint: what a correct token map looks like

Print meta['tokens'] and confirm you see exactly two keys, 'UP' and 'DOWN', each with a long numeric token ID string, and that the two IDs differ. Also confirm neg_risk is False and enable_order_book is True before proceeding. If either flag is wrong, the rest of this tutorial does not apply to that market.
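The checkpoint can also be enforced in code instead of by eye. This is a minimal sketch; `validate_market_meta` is a hypothetical helper that takes the dictionary returned by discovery and lists anything that fails the checks described above.

```python
def validate_market_meta(meta: dict) -> list[str]:
    """Return a list of problems; an empty list means the checkpoint passes."""
    problems: list[str] = []
    tokens = meta.get("tokens") or {}

    if set(tokens) != {"UP", "DOWN"}:
        problems.append(f"expected keys UP and DOWN, got {sorted(tokens)}")
    elif not all(isinstance(t, str) and t for t in tokens.values()):
        problems.append("token IDs must be non-empty strings")
    elif tokens["UP"] == tokens["DOWN"]:
        problems.append("UP and DOWN share the same token ID")

    if meta.get("neg_risk", False):
        problems.append("neg_risk is True")
    if not meta.get("enable_order_book", False):
        problems.append("enable_order_book is False")
    return problems


good = {
    "tokens": {"UP": "111", "DOWN": "222"},
    "neg_risk": False,
    "enable_order_book": True,
}
print(validate_market_meta(good))
```

Calling this at the start of every cycle and refusing to continue on a non-empty result keeps a bad token map from reaching the book-reading code.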

Reconstruct the Live Orderbook via WebSocket

A REST snapshot gives you a starting point, but it goes stale within seconds on an active BTC 5-minute market. The only way to maintain a reliable local book is to fetch a REST snapshot once, then apply every WebSocket delta in order, remove zero-size levels, and resync when the state drifts. This step builds that pipeline for both the UP and DOWN token books simultaneously.

Follow Along With AI

Walk me through this step

Paste this into your AI coding agent to work through this step. Includes both walk-me-through framing and the specific sub-tasks for this step.

I'm working through a step-by-step tutorial.

I'm on Step 03: Reconstruct the Live Orderbook via WebSocket.

Step goal: A REST snapshot gives you a starting point, but it goes stale within seconds on an active BTC 5-minute market. The only way to maintain a reliable local book is to fetch a REST snapshot once, then apply every WebSocket delta in order, remove zero-size levels, and resync when the state drifts. This step builds that pipeline for both the UP and DOWN token books simultaneously.

Walk me through this step interactively. Ask me clarifying questions if I'm stuck. When I write code, review it for any setup-specific gotchas before I run it. When I hit errors, quote my logs back to me with a plain-English explanation. Don't assume I know every library or API surface this step touches — point me to the right docs when I need them. Confirm I've actually completed the step before suggesting we move on.

REST snapshot plus WebSocket delta is the only reliable approach

Polymarket's CLOB exposes a REST endpoint at GET /book?token_id=<id> that returns the current level-2 book as a list of bid and ask price levels with sizes. This is your initial state. The moment you receive it, it begins to age. On a BTC 5-minute market near the cycle midpoint, quotes can change every few hundred milliseconds.

The WebSocket channel wss://ws-subscriptions-clob.polymarket.com/ws/market delivers price_change events. Each event contains a token ID, a side (BUY or SELL), a price, and a new size. The rule is simple: if the new size is greater than zero, upsert that level. If the new size is zero, delete that level. Apply every message in arrival order.
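The upsert-or-delete rule is easiest to see on a plain dictionary. The sketch below uses made-up levels and events, not real market data, and `apply_price_change` is an illustrative stand-in for the book method built later in this step.

```python
def apply_price_change(levels: dict[float, float], price: float, size: float) -> None:
    """Upsert the level when size > 0, delete it when size is 0."""
    if size > 0:
        levels[price] = size
    else:
        levels.pop(price, None)  # deleting a level we never saw is a no-op


asks = {0.54: 10.0, 0.57: 30.0}

# (price, new_size) pairs in arrival order
events = [
    (0.55, 25.0),  # new level appears
    (0.54, 0.0),   # best ask is pulled
    (0.57, 12.0),  # existing level shrinks
    (0.60, 0.0),   # delete for an unknown level: ignored
]
for price, size in events:
    apply_price_change(asks, price, size)

print(sorted(asks.items()))
```

Note that the size in each event is the new total at that price, not an increment, which is why a plain assignment is enough.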

You need two independent book states: one for the UP token and one for the DOWN token. Subscribe to both token IDs in a single WebSocket connection using the assets_ids field. Never merge the two books. Never infer one from the other using complement math during reconstruction — apply complement logic only after both books are independently confirmed valid.

Book reconstruction pipeline: REST snapshot feeds initial state, WebSocket deltas maintain it

python · book.py

LocalBook class: REST-initialized, WebSocket-maintained, with stale-state detection

import time
from collections import defaultdict
from typing import Literal, Optional

Side = Literal["bids", "asks"]

class LocalBook:
    """Level-2 orderbook for a single token, maintained via WS deltas."""

    STALE_THRESHOLD_S = 5.0  # resync if no update for this many seconds

    def __init__(self, token_id: str):
        self.token_id   = token_id
        self.bids: dict[float, float] = {}  # price -> size
        self.asks: dict[float, float] = {}
        self._last_update = 0.0
        self._snapshot_ts = 0.0

    def load_snapshot(self, bids: list[dict], asks: list[dict]) -> None:
        """Initialise from REST /book response."""
        self.bids = {float(b["price"]): float(b["size"]) for b in bids}
        self.asks = {float(a["price"]): float(a["size"]) for a in asks}
        now = time.monotonic()
        self._last_update = now
        self._snapshot_ts = now

    def apply_delta(self, side: str, price: float, size: float) -> None:
        """Apply a single price_change event from the WebSocket."""
        book = self.bids if side.upper() == "BUY" else self.asks
        if size == 0.0:
            book.pop(price, None)   # remove zero-size level
        else:
            book[price] = size      # upsert
        self._last_update = time.monotonic()

    @property
    def is_stale(self) -> bool:
        return (time.monotonic() - self._last_update) > self.STALE_THRESHOLD_S

    def best_bid(self) -> Optional[float]:
        return max(self.bids) if self.bids else None

    def best_ask(self) -> Optional[float]:
        return min(self.asks) if self.asks else None

    def spread(self) -> Optional[float]:
        bb, ba = self.best_bid(), self.best_ask()
        # Compare against None explicitly so a 0.0 price is not mistaken for "missing"
        if bb is None or ba is None:
            return None
        return round(ba - bb, 6)

python · ws_feed.py

WebSocket feed: subscribe to both UP and DOWN tokens, apply deltas, trigger resync on stale state

import asyncio
import json
import time
import httpx
import websockets
from book import LocalBook

CLOB_REST = "https://clob.polymarket.com"
CLOB_WS   = "wss://ws-subscriptions-clob.polymarket.com/ws/market"

async def fetch_snapshot(token_id: str) -> dict:
    async with httpx.AsyncClient() as client:
        r = await client.get(f"{CLOB_REST}/book", params={"token_id": token_id})
        r.raise_for_status()
        return r.json()

async def run_book_feed(
    up_token_id: str,
    down_token_id: str,
    on_update,          # async callback(up_book, down_book, recv_ts)
) -> None:
    books = {
        up_token_id:   LocalBook(up_token_id),
        down_token_id: LocalBook(down_token_id),
    }

    subscribe_msg = json.dumps({
        "auth": {},
        "type": "Market",
        "assets_ids": [up_token_id, down_token_id],
    })

    async for ws in websockets.connect(CLOB_WS, ping_interval=20):
        try:
            # 1. Seed both books from REST on every (re)connect. Deltas missed while
            #    disconnected cannot be replayed, so the old state is not trustworthy.
            for tid, book in books.items():
                snap = await fetch_snapshot(tid)
                book.load_snapshot(snap.get("bids", []), snap.get("asks", []))

            await ws.send(subscribe_msg)
            async for raw in ws:
                recv_ts = time.time()
                events = json.loads(raw)
                if not isinstance(events, list):
                    events = [events]
                for event in events:
                    if event.get("event_type") != "price_change":
                        continue
                    # Assumption: the payload is either flat or nests per-level changes in a
                    # list. The nested field names below are assumptions; check the current docs.
                    changes = event.get("price_changes") or event.get("changes") or [event]
                    for ch in changes:
                        tid = ch.get("asset_id", event.get("asset_id"))
                        if tid in books:
                            books[tid].apply_delta(
                                ch["side"],            # "BUY" or "SELL"
                                float(ch["price"]),
                                float(ch["size"]),
                            )

                # Resync stale books
                for tid, book in books.items():
                    if book.is_stale:
                        snap = await fetch_snapshot(tid)
                        book.load_snapshot(snap.get("bids", []), snap.get("asks", []))

                await on_update(
                    books[up_token_id],
                    books[down_token_id],
                    recv_ts,
                )
        except websockets.ConnectionClosed:
            continue  # reconnect via the async-for loop

Warning

Never trade off a stale book

If is_stale returns True, the local book may be missing levels or showing phantom liquidity. Gate all edge calculations behind a staleness check. A stale book that shows a wide spread is not a trading opportunity — it is a data gap.
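A small gate function makes the staleness rule explicit at the call site. This is a sketch under stated assumptions: `book_is_usable` is a hypothetical helper, the 5-second default mirrors the stale threshold used in this step, and the one-sided and crossed checks are additional sanity checks that are not part of the original book class.

```python
import time
from typing import Optional


def book_is_usable(
    last_update: float,
    best_bid: Optional[float],
    best_ask: Optional[float],
    now: Optional[float] = None,
    max_age_s: float = 5.0,
) -> tuple[bool, str]:
    """Return (ok, reason). Edge calculations should run only when ok is True."""
    now = time.monotonic() if now is None else now
    if now - last_update > max_age_s:
        return False, "stale"
    if best_bid is None or best_ask is None:
        return False, "one-sided"
    if best_bid >= best_ask:
        return False, "crossed"  # usually a sign of missed deltas
    return True, "ok"


print(book_is_usable(last_update=100.0, best_bid=0.46, best_ask=0.54, now=101.0))
print(book_is_usable(last_update=100.0, best_bid=0.46, best_ask=0.54, now=110.0))
```

Logging the reason string alongside each skipped signal makes it easy to see later how much of the session was spent on unusable data.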

Tip

Resync at cycle boundaries

At the start of each new 5-minute cycle, build fresh book objects for the new token IDs and seed them from REST, even if the WebSocket connection appears healthy. The new cycle has new token IDs and a fresh book state, so the subscription has to change as well. Carrying over the previous cycle's book is a silent bug.

Compute Executable Edge: Spread, Depth, Microprice, and Fair Value

With a live local book in hand, you can now compute the metrics that actually drive trading decisions. This step covers five calculations: best bid/ask and spread, depth within price bands, microprice as a weighted mid, depth-walked effective fill price for multiple sizes, and the diffusion fair value anchored to BTC spot price and time remaining. All five feed into the final edge formula.

Follow Along With AI

Walk me through this step

Paste this into your AI coding agent to work through this step. Includes both walk-me-through framing and the specific sub-tasks for this step.

I'm working through a step-by-step tutorial.

I'm on Step 04: Compute Executable Edge: Spread, Depth, Microprice, and Fair Value.

Step goal: With a live local book in hand, you can now compute the metrics that actually drive trading decisions. This step covers five calculations: best bid/ask and spread, depth within price bands, microprice as a weighted mid, depth-walked effective fill price for multiple sizes, and the diffusion fair value anchored to BTC spot price and time remaining. All five feed into the final edge formula.

---

Specific sub-tasks to complete during this step:

## TASK 1: Generate a vectorized edge surface for multiple sizes and tau buckets
Use after implementing metrics.py and fair_value.py to explore how edge varies with trade size and time remaining.

I have two Python functions:
- `effective_fill_price(book, side, quantity)` that depth-walks a LocalBook and returns the average fill price for a given quantity, or None if depth is insufficient.
- `diffusion_fair_prob(spot, strike, sigma_per_sqrt_s, tau_seconds)` that returns P(UP) using the binary diffusion formula Phi(log(S/K) / (sigma * sqrt(tau))).

Write a function `edge_surface(up_book, spot, strike, sigma, tau, sizes, fee_buffer, latency_buffer)` that:
1. Accepts a list of `sizes` (e.g. [5, 10, 25, 50, 100]).
2. For each size, computes effective_fill_price for a taker buy of the UP token.
3. Computes net_edge = diffusion_fair_prob - fill_price - fee_buffer - latency_buffer.
4. Returns a pandas DataFrame with columns: size, fill_price, raw_edge, net_edge, tradeable (bool, net_edge > 0.01).
5. Also adds a column `depth_consumed_pct` showing what fraction of total ask-side depth within 3 cents is consumed by that size.

Then write a second function `tau_edge_surface(up_book, spot, strike, sigma, tau_values, size)` that holds size fixed and varies tau over a list of values, returning a similar DataFrame. This lets me see how edge changes as the cycle approaches expiry.

Use only pandas, numpy, and the two functions above. No matplotlib yet.

Five metrics, one decision

Top-of-book spread tells you the minimum round-trip cost. Depth within 1 cent and 3 cents tells you how much size is available at reasonable prices. Microprice weights the best bid and ask by the size resting on the opposite side, so a heavier bid queue pulls the estimate toward the ask and a heavier ask queue pulls it toward the bid, which gives a better read on where the book is leaning than the plain midpoint. The depth-walked fill price tells you what you actually pay for a specific quantity. And the diffusion fair value tells you what the UP token should be worth given BTC's current position relative to the strike and the time remaining.

None of these metrics alone is sufficient. A tight spread with no depth is a trap. A good diffusion fair value with a stale book is noise. A large depth-walked edge that disappears after 2 seconds of latency is not executable. The edge engine computes all five and gates on all five before flagging a signal.

For the BTC Up/Down 5-Minute market, the diffusion anchor is a Black-Scholes-style binary probability: P(UP) = Phi(log(S/K) / (sigma * sqrt(tau))), where S is the current BTC spot price, K is the cycle reference price, sigma is short-horizon realized volatility per square-root second, and tau is seconds remaining. This is not a final model but a powerful sanity check against the live book.
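As a worked instance of the formula, take the same illustrative inputs used in the fair_value.py example: S = 67,450, K = 67,400, sigma = 0.00015 per square-root second, and tau = 45 seconds. The intermediate values are rounded.

```latex
\begin{align*}
P(\mathrm{UP}) &= \Phi\!\left(\frac{\ln(S/K)}{\sigma\sqrt{\tau}}\right) \\
\ln(S/K) &= \ln\!\left(\tfrac{67450}{67400}\right) \approx 7.42 \times 10^{-4} \\
\sigma\sqrt{\tau} &= 0.00015 \times \sqrt{45} \approx 1.006 \times 10^{-3} \\
z &\approx \frac{7.42 \times 10^{-4}}{1.006 \times 10^{-3}} \approx 0.737 \\
P(\mathrm{UP}) &= \Phi(0.737) \approx 0.769
\end{align*}
```

A 50-dollar lead with 45 seconds left is therefore worth roughly 77 cents under this model, which is the number to compare against the depth-walked ask.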

python · metrics.py

Spread, depth, microprice, and depth-walked fill price from a LocalBook

from typing import Optional
from book import LocalBook

def depth_within(book: LocalBook, side: str, band: float) -> float:
    """Total size within `band` of the best price. `band` is in price units (0.01 = 1 cent)."""
    eps = 1e-9  # tolerance so float sums like best + band still include the boundary level
    if side == "ask":
        best = book.best_ask()
        if best is None:
            return 0.0
        return sum(sz for px, sz in book.asks.items() if px <= best + band + eps)
    else:
        best = book.best_bid()
        if best is None:
            return 0.0
        return sum(sz for px, sz in book.bids.items() if px >= best - band - eps)

def microprice(book: LocalBook) -> Optional[float]:
    """Size-weighted mid: a heavier bid queue pulls it toward the ask, and vice versa."""
    bb, ba = book.best_bid(), book.best_ask()
    if bb is None or ba is None:
        return None
    bid_sz = book.bids.get(bb, 0.0)
    ask_sz = book.asks.get(ba, 0.0)
    total  = bid_sz + ask_sz
    if total == 0:
        return (bb + ba) / 2
    # Each price is weighted by the size on the opposite side of the book
    return (bb * ask_sz + ba * bid_sz) / total

def effective_fill_price(book: LocalBook, side: str, quantity: float) -> Optional[float]:
    """Depth-walk the book for `quantity` shares. Returns None if not enough depth."""
    if quantity <= 0:
        return None  # nothing to fill; also avoids dividing by zero below
    if side == "buy":
        levels = sorted(book.asks.items())       # ascending price
    else:
        levels = sorted(book.bids.items(), reverse=True)  # descending price

    remaining = quantity
    cost      = 0.0
    for price, size in levels:
        take       = min(remaining, size)
        cost      += take * price
        remaining -= take
        if remaining <= 1e-9:
            break

    if remaining > 1e-9:
        return None  # insufficient depth
    return cost / quantity

def book_summary(book: LocalBook, qty: float = 10.0) -> dict:
    return {
        "best_bid":      book.best_bid(),
        "best_ask":      book.best_ask(),
        "spread":        book.spread(),
        "depth_1c_bid":  depth_within(book, "bid", 0.01),
        "depth_1c_ask":  depth_within(book, "ask", 0.01),
        "depth_3c_ask":  depth_within(book, "ask", 0.03),
        "microprice":    microprice(book),
        "fill_price_10": effective_fill_price(book, "buy", 10.0),
        "fill_price_50": effective_fill_price(book, "buy", 50.0),
    }
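To see the depth walk and the microprice on concrete numbers, here is a self-contained toy version that works on plain dictionaries instead of the book class. The levels are invented so that the results line up with the price-concepts table from Step 1 (best ask 0.54, fill for 50 shares 0.57); `walk_asks` and `toy_microprice` are illustrative names.

```python
from typing import Optional

asks = {0.54: 10.0, 0.57: 30.0, 0.60: 10.0}  # price -> size (invented levels)
bids = {0.46: 30.0, 0.44: 50.0}


def walk_asks(levels: dict[float, float], quantity: float) -> Optional[float]:
    """Average price paid to buy `quantity`, or None if depth runs out."""
    if quantity <= 0:
        raise ValueError("quantity must be positive")
    remaining, cost = quantity, 0.0
    for price in sorted(levels):  # cheapest ask first
        take = min(remaining, levels[price])
        cost += take * price
        remaining -= take
        if remaining <= 1e-9:
            return round(cost / quantity, 6)
    return None


def toy_microprice(bid_levels: dict[float, float], ask_levels: dict[float, float]) -> float:
    """Best bid and ask, each weighted by the size on the opposite side."""
    bb, ba = max(bid_levels), min(ask_levels)
    bid_sz, ask_sz = bid_levels[bb], ask_levels[ba]
    return round((bb * ask_sz + ba * bid_sz) / (bid_sz + ask_sz), 6)


for qty in (10, 50, 100):
    print(qty, walk_asks(asks, qty))
print("microprice", toy_microprice(bids, asks))
```

Ten shares fill at the best ask, fifty shares average three cents worse, and a hundred shares cannot be filled at all. The microprice sits above the 0.50 midpoint because the bid queue (30) is heavier than the ask queue (10).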

python · fair_value.py

Diffusion fair value for BTC Up/Down: P(UP) = Phi(log(S/K) / (sigma * sqrt(tau)))

import math
from scipy.stats import norm
from typing import Optional

def diffusion_fair_prob(
    spot: float,          # current BTC price (e.g. from Binance/Coinbase)
    strike: float,        # cycle reference/comparison price
    sigma_per_sqrt_s: float,  # realized vol per sqrt-second (e.g. 0.00012)
    tau_seconds: float,   # seconds remaining in cycle
    epsilon: float = 0.5, # floor to avoid division by zero near expiry
) -> float:
    """
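    Binary diffusion probability that BTC closes above strike.
    Uses the same form as a digital call option with drift ignored.
    Returns a float in [0, 1]; it saturates at 0.0 or 1.0 for extreme z.
    """
    if spot <= 0 or strike <= 0 or sigma_per_sqrt_s <= 0:
        raise ValueError("spot, strike, and sigma_per_sqrt_s must be positive")
    tau = max(tau_seconds, epsilon)
    log_moneyness = math.log(spot / strike)
    z = log_moneyness / (sigma_per_sqrt_s * math.sqrt(tau))
    return float(norm.cdf(z))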

def realized_vol_per_sqrt_second(
    returns: list[float],  # list of log-returns, one per second
) -> float:
    """Estimate sigma_per_sqrt_s from recent second-by-second log-returns."""
    if len(returns) < 2:
        return 1e-4  # fallback
    mean = sum(returns) / len(returns)
    var  = sum((r - mean) ** 2 for r in returns) / (len(returns) - 1)
    return math.sqrt(var)  # already per sqrt-second since returns are per second

# Example: BTC at 67,450, strike 67,400, 45s remaining, sigma 0.00015/sqrt(s)
p_up = diffusion_fair_prob(spot=67450, strike=67400,
                           sigma_per_sqrt_s=0.00015, tau_seconds=45)
print(f"P(UP) = {p_up:.4f}")  # -> 0.7694

Assembling the full edge signal

With diffusion_fair_prob and effective_fill_price in hand, the edge calculation from Step 1 becomes concrete. For a taker buy of the UP token at quantity Q: edge = diffusion_fair_prob(spot, strike, sigma, tau) - effective_fill_price(up_book, 'buy', Q). Subtract fee_buffer and latency_buffer. If the result exceeds one tick, the signal is tradeable.

The same logic applies to the DOWN token using the complement: down_fair_prob = 1 - p_up. This only works when neg_risk is False, which you verified in Step 2. Always compute both sides independently and only trade the side with the larger net edge, not both simultaneously unless you are explicitly running an arbitrage strategy.
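The side-selection rule can be sketched as a single function. The assumptions here are that neg_risk is False, that both fill prices come from independent depth walks (None when depth is insufficient), and that the combined fee and latency buffer is 0.005 as in Step 1. `pick_side` is a hypothetical helper, not part of the tutorial's modules.

```python
from typing import Optional


def pick_side(
    p_up: float,
    up_fill: Optional[float],
    down_fill: Optional[float],
    buffers: float = 0.005,  # fee_buffer + latency_buffer from Step 1
    tick: float = 0.01,
) -> Optional[tuple[str, float]]:
    """Return (side, net_edge) for the better side, or None if neither clears one tick."""
    candidates: dict[str, float] = {}
    if up_fill is not None:
        candidates["UP"] = p_up - up_fill - buffers
    if down_fill is not None:
        candidates["DOWN"] = (1.0 - p_up) - down_fill - buffers
    if not candidates:
        return None
    side = max(candidates, key=lambda s: candidates[s])
    net = round(candidates[side], 6)
    return (side, net) if net > tick else None


print(pick_side(p_up=0.77, up_fill=0.74, down_fill=0.27))
print(pick_side(p_up=0.53, up_fill=0.54, down_fill=0.48))
```

The second call returns None: both sides have negative net edge, which is the common case when the book is priced fairly.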

Tip

Calibrate sigma from recent seconds, not daily vol

Daily realized volatility divided by sqrt(86400) gives a per-second sigma, but BTC intraday vol is not constant. Use a rolling window of the last 30-60 second-by-second log-returns from your external BTC feed. Stale sigma estimates make the diffusion anchor unreliable near high-volatility events like macro prints or large spot moves.
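A rolling estimator along these lines might look like the sketch below. It assumes you feed it exactly one BTC price per second from your external feed; `RollingSigma` is a hypothetical class, and the fallback value mirrors the one used in fair_value.py.

```python
import math
from collections import deque


class RollingSigma:
    """Sample standard deviation of the last `window` one-second log-returns."""

    def __init__(self, window: int = 60, fallback: float = 1e-4):
        # window returns require window + 1 prices
        self._prices: deque[float] = deque(maxlen=window + 1)
        self._fallback = fallback

    def update(self, price: float) -> None:
        if price <= 0:
            raise ValueError("price must be positive")
        self._prices.append(price)

    def sigma(self) -> float:
        prices = list(self._prices)
        returns = [math.log(b / a) for a, b in zip(prices, prices[1:])]
        if len(returns) < 2:
            return self._fallback  # not enough data yet
        mean = sum(returns) / len(returns)
        var = sum((r - mean) ** 2 for r in returns) / (len(returns) - 1)
        return math.sqrt(var)


rs = RollingSigma(window=3)
for px in (67_400.0, 67_410.0, 67_395.0, 67_420.0):
    rs.update(px)
print(f"sigma = {rs.sigma():.6f} per sqrt-second")
```

If samples arrive at irregular intervals, the result is no longer per square-root second and needs rescaling by the actual spacing, so it is worth resampling the feed to a fixed one-second grid first.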

Add Regime Filters and Churn Guards

A raw edge signal fires too often. Many of those fires are in market conditions where the edge is illusory: the book is churning without real liquidity, volatility has spiked and the diffusion anchor is unreliable, the cycle is in its final seconds where spreads blow out, or the book has not repriced after a BTC move. Regime filters and churn guards are the gates that prevent the bot from trading in these conditions.

Follow Along With AI

Walk me through this step

Paste this into your AI coding agent to work through this step. Includes both walk-me-through framing and the specific sub-tasks for this step.

I'm working through a step-by-step tutorial.

I'm on Step 05: Add Regime Filters and Churn Guards.

Step goal: A raw edge signal fires too often. Many of those fires are in market conditions where the edge is illusory: the book is churning without real liquidity, volatility has spiked and the diffusion anchor is unreliable, the cycle is in its final seconds where spreads blow out, or the book has not repriced after a BTC move. Regime filters and churn guards are the gates that prevent the bot from trading in these conditions.

---

Specific sub-tasks to complete during this step:

## TASK 1: Generate a regime transition logger and post-trade regime attribution
Use after implementing regime.py and churn.py to build a diagnostic that shows which regime suppressed the most signals.

I have a `RegimeState` dataclass with fields: label (str), trade_allowed (bool), reason (str). My trading loop calls `classify_regime(...)` on every WebSocket tick and logs the result.

Write a `RegimeLogger` class that:
1. Accepts RegimeState objects via a `.record(state, timestamp)` method.
2. Tracks, per regime label: total ticks in that regime, ticks where trade_allowed=False (suppressed), and the most recent reason string.
3. Exposes a `.summary()` method that returns a pandas DataFrame with columns: regime, total_ticks, suppressed_ticks, suppression_rate_pct, last_reason.
4. Exposes a `.transition_log()` method that returns a list of dicts recording every time the regime label changes: from_label, to_label, timestamp, duration_in_prior_regime_s.
5. Exposes a `.plot_timeline(ax)` method that draws a horizontal bar chart of regime durations on a matplotlib Axes object, color-coded: calm=green, high_vol_boundary=orange, stale_book=red, close_gamma=purple, thin_liquidity=gray.

Also write a `post_trade_regime_attribution(trade_log, regime_log)` function that, given a list of trade dicts (each with a 'signal_time' field) and the RegimeLogger, returns a DataFrame showing for each trade what regime was active at signal_time and whether it was suppressed or allowed through.

Why raw edge signals are not enough

The diffusion fair value is a model, and models are wrong in specific, predictable ways. Near the strike boundary (z close to zero), tiny BTC moves flip the probability dramatically and the book can lag by several seconds. In high-volatility regimes, sigma estimates are noisy and the diffusion anchor oscillates. When the book is thin, an order of even 10 shares may consume most of the visible depth, pushing the effective fill price far enough above the best ask to erase the edge the signal was based on.

Churn is a separate problem. Quote churn means market makers are rapidly posting and cancelling orders without real intent to trade. A book that shows 50 shares at the ask but cancels and resets every 200ms is not providing 50 shares of liquidity. If your bot fires on that quote, it will either miss the fill or get filled at a worse level after the churn resolves.

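The five regime states to detect are: calm (normal trading, all filters pass), high-vol near-boundary (diffusion anchor unreliable), stale-book (book has not moved despite BTC moving), close-gamma (final 30 seconds, spreads blow out), and thin-liquidity (depth within 3 cents of the best ask is below the minimum threshold, or the spread is too wide). A production bot might respond differently to each state: suppress, widen the edge threshold, resync the book, or skip the cycle entirely. The classifier below takes the simplest approach and sets trade_allowed to False in every state except calm.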

python · regime.py

Five-state regime classifier: returns a regime label and a trade_allowed flag

import math
from dataclasses import dataclass
from typing import Literal
from book import LocalBook

RegimeLabel = Literal["calm", "high_vol_boundary", "stale_book", "close_gamma", "thin_liquidity"]

@dataclass
class RegimeState:
    label:         RegimeLabel
    trade_allowed: bool
    reason:        str

# Thresholds — tune these against your paper-trade log
MIN_DEPTH_3C       = 15.0   # minimum shares within 3c of ask
MAX_SPREAD         = 0.08   # suppress if spread > 8c
CLOSE_GAMMA_TAU    = 30.0   # seconds: final window, spreads blow out
HIGH_VOL_Z_THRESH  = 0.25   # abs(z) < this AND high vol = boundary risk
HIGH_VOL_SIGMA_MUL = 2.0    # sigma > 2x rolling median = high-vol regime
STALE_BOOK_DELTA   = 0.005  # book counts as unmoved if UP microprice changed < 0.5c
STALE_BTC_MOVE_BPS = 3.0    # ...while BTC moved > 3 bps over the same 2s window

def classify_regime(
    up_book:       LocalBook,
    tau:           float,
    z:             float,       # log(S/K) / (sigma * sqrt(tau))
    sigma:         float,
    sigma_median:  float,       # rolling median sigma for comparison
    btc_move_bps:  float,       # abs BTC move in bps over last 2s
    book_mid_move: float,       # abs change in UP microprice over last 2s
) -> RegimeState:

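    spread   = up_book.spread()
    best_ask = up_book.best_ask()
    # Ask-side depth within 3c of the best ask (0.0 if the ask side is empty)
    depth = 0.0 if best_ask is None else sum(
        sz for px, sz in up_book.asks.items() if px <= best_ask + 0.03)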

    if up_book.is_stale:
        return RegimeState("stale_book", False, "WebSocket book is stale")

    if tau <= CLOSE_GAMMA_TAU:
        return RegimeState("close_gamma", False,
                           f"tau={tau:.0f}s: final window, spreads unreliable")

    if depth < MIN_DEPTH_3C:
        return RegimeState("thin_liquidity", False,
                           f"depth_3c={depth:.1f} < {MIN_DEPTH_3C}")

    if spread and spread > MAX_SPREAD:
        return RegimeState("thin_liquidity", False,
                           f"spread={spread:.3f} > {MAX_SPREAD}")

    high_vol = sigma > HIGH_VOL_SIGMA_MUL * sigma_median
    near_boundary = abs(z) < HIGH_VOL_Z_THRESH
    if high_vol and near_boundary:
        return RegimeState("high_vol_boundary", False,
                           f"sigma={sigma:.5f} high and abs(z)={abs(z):.2f} near boundary")

    stale_book = (btc_move_bps > STALE_BTC_MOVE_BPS
                  and book_mid_move < STALE_BOOK_DELTA)
    if stale_book:
        return RegimeState("stale_book", False,
                           f"BTC moved {btc_move_bps:.1f} bps but book mid unchanged")

    return RegimeState("calm", True, "all filters pass")
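
The classifier takes z, btc_move_bps, and book_mid_move as ready-made inputs. A minimal sketch of how the first two might be derived is shown here; `z_score` and `move_bps` are hypothetical helper names, and the z formula is the one given in the signature comment above.

```python
import math


def z_score(spot: float, strike: float, sigma: float, tau: float) -> float:
    """Standardised distance to strike: log(S/K) / (sigma * sqrt(tau))."""
    if sigma <= 0 or tau <= 0:
        raise ValueError("sigma and tau must be positive")
    return math.log(spot / strike) / (sigma * math.sqrt(tau))


def move_bps(price_now: float, price_then: float) -> float:
    """Absolute relative move in basis points (1 bp = 0.01%)."""
    return abs(price_now / price_then - 1.0) * 10_000
```

For btc_move_bps, `price_then` would be the spot price 2 seconds earlier. book_mid_move is a plain absolute difference in token price, so it needs no scaling.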

python · churn.py

Churn guard: track quote-age and cancel rate at the best ask level

import time
from collections import deque

class ChurnGuard:
    """
    Track how often the best-ask price and size change.
    High churn rate = market makers are not committed to their quotes.
    """

    def __init__(self, window_s: float = 5.0, max_churn_rate: float = 4.0):
        self.window_s      = window_s
        self.max_churn_rate = max_churn_rate  # changes per second
        self._events: deque[float] = deque()  # timestamps of best-ask changes
        self._last_best_ask: float | None = None
        self._last_best_size: float | None = None

    def update(self, best_ask: float | None, best_ask_size: float | None) -> None:
        now = time.monotonic()
        # Prune old events outside the window
        while self._events and now - self._events[0] > self.window_s:
            self._events.popleft()

        changed = (
            best_ask  != self._last_best_ask or
            best_ask_size != self._last_best_size
        )
        if changed:
            self._events.append(now)
            self._last_best_ask  = best_ask
            self._last_best_size = best_ask_size

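    @property
    def churn_rate(self) -> float:
        """Changes per second over the rolling window."""
        # Count only events still inside the window, so the rate decays
        # even when update() has not been called recently.
        now = time.monotonic()
        recent = sum(1 for t in self._events if now - t <= self.window_s)
        return recent / self.window_s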

    @property
    def is_churning(self) -> bool:
        return self.churn_rate > self.max_churn_rate

    @property
    def quote_age_s(self) -> float:
        """Seconds since the last best-ask change."""
        if not self._events:
            return float("inf")
        return time.monotonic() - self._events[-1]

Warning

Churn spikes precede adverse fills

A churn rate above 4 changes per second at the best ask usually means a market maker is repricing rapidly in response to BTC movement. If you fire a taker order into a churning book, you are likely to fill at a worse level than the signal computed. Gate on is_churning before any order submission.
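
One way to wire the gate is a single go/no-go function that runs before any order submission. This is a sketch under assumptions: `order_gate` is a hypothetical helper, and `min_quote_age_s` is an extra threshold introduced here for illustration, not a value from this tutorial.

```python
from typing import Tuple


def order_gate(
    trade_allowed: bool,
    is_churning: bool,
    quote_age_s: float,
    min_quote_age_s: float = 0.25,  # hypothetical threshold, tune on your own logs
) -> Tuple[bool, str]:
    """Combine the regime flag and churn readings into one go/no-go decision."""
    if not trade_allowed:
        return False, "regime filter blocked the trade"
    if is_churning:
        return False, "best ask is churning"
    if quote_age_s < min_quote_age_s:
        return False, f"best-ask quote is only {quote_age_s:.2f}s old"
    return True, "ok"
```

In the trading loop the arguments would come from `RegimeState.trade_allowed`, `ChurnGuard.is_churning`, and `ChurnGuard.quote_age_s`. Returning the reason string keeps suppressed signals easy to attribute in the log.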

Warning

Final 30 seconds: spreads blow out

In the last 30 seconds of a 5-minute cycle, the binary gamma is enormous. A 1-2 bps BTC move can flip settlement. Spreads widen, depth thins, and adverse selection risk is highest. The close_gamma regime suppresses all trades in this window by default. Only override this with strong empirical evidence from your paper-trade log.
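
To see why the final window is dangerous, the sketch below measures how much a 2 bps spot move shifts the UP probability at different values of tau. It assumes a zero-drift diffusion where p_up is the normal CDF of z, which may differ in detail from the fair-value function used earlier in this tutorial; `SIGMA` and the strike are hypothetical values chosen only for illustration.

```python
import math


def phi(x: float) -> float:
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def up_prob(spot: float, strike: float, sigma: float, tau: float) -> float:
    """Zero-drift diffusion probability that spot finishes above strike."""
    return phi(math.log(spot / strike) / (sigma * math.sqrt(tau)))


def prob_shift(move_bps: float, sigma: float, tau: float) -> float:
    """Change in UP probability when spot moves away from the strike by move_bps."""
    strike = 100_000.0  # hypothetical strike; only the ratio S/K matters
    bumped = strike * (1.0 + move_bps / 10_000)
    return up_prob(bumped, strike, sigma, tau) - up_prob(strike, strike, sigma, tau)


SIGMA = 1e-4  # hypothetical per-second vol, roughly 1 bp per second
for tau in (240.0, 30.0, 5.0):
    shift = prob_shift(2.0, SIGMA, tau)
    print(f"tau={tau:>5.0f}s  2 bps move shifts p_up by {shift:.3f}")
```

The shift grows as tau shrinks because z scales with 1/sqrt(tau): the same spot move covers more standard deviations when less time remains.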

Success

Checkpoint: regime filter is working

Run the regime classifier against 10 minutes of recorded book data and check the summary. In a typical BTC 5-minute cycle, you should see close_gamma suppressing the final 30 seconds, thin_liquidity firing occasionally when depth drops, and calm dominating the mid-cycle window. If the non-calm regimes together suppress more than 80% of ticks, your thresholds are too tight.
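
A quick tally is enough for this checkpoint. The sketch below assumes you have logged one (label, trade_allowed) pair per tick, mirroring the RegimeState fields; `suppression_summary` is a hypothetical helper and uses only the standard library.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def suppression_summary(ticks: Iterable[Tuple[str, bool]]) -> Dict[str, float]:
    """Share of ticks per regime label, plus the overall suppressed share.

    Each tick is (label, trade_allowed).
    """
    counts: Counter = Counter()
    suppressed = 0
    total = 0
    for label, trade_allowed in ticks:
        counts[label] += 1
        suppressed += 0 if trade_allowed else 1
        total += 1
    if total == 0:
        return {}
    out = {label: n / total for label, n in counts.items()}
    out["suppressed_total"] = suppressed / total
    return out
```

A `suppressed_total` above 0.8 is the signal to loosen thresholds. The per-label shares show which filter is responsible.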

Paper-Trade with Realistic Fills, Log Everything, and Iterate

Paper trading is not a formality. It is the only way to find out whether your edge survives latency, depth consumption, and adverse selection before any real capital is at risk. This step builds a paper-trade loop that applies a configurable latency delay before checking fills, caps size to available depth, records every prediction with markout at 5s/10s/30s and final settlement, and runs a compliance geoblock check as the first line of any execution path.

Follow Along With AI

Walk me through this step

Paste this into your AI coding agent to work through this step. Includes both walk-me-through framing and the specific sub-tasks for this step.

I'm working through a step-by-step tutorial.

I'm on Step 06: Paper-Trade with Realistic Fills, Log Everything, and Iterate.

Step goal: Paper trading is not a formality. It is the only way to find out whether your edge survives latency, depth consumption, and adverse selection before any real capital is at risk. This step builds a paper-trade loop that applies a configurable latency delay before checking fills, caps size to available depth, records every prediction with markout at 5s/10s/30s and final settlement, and runs a compliance geoblock check as the first line of any execution path.

---

Specific sub-tasks to complete during this step:

## TASK 1: Generate a paper-trade performance report bucketed by regime, tau, and edge
Use after collecting at least one full session of paper-trade fills with markout data to generate the diagnostic report.

I have a list of PaperFill dataclass objects with these fields: fill_id, token_side, signal_time, fill_time, intended_ask, fill_price (None if no fill), fill_size, fair_prob_at_signal, raw_edge_at_signal, regime, markout_5s, markout_10s, markout_30s, markout_final.

Write a function `performance_report(fills: list[PaperFill]) -> dict` that returns a dict of pandas DataFrames with these sections:

1. 'summary': total signals, fill rate (fill_price is not None), mean fill_size, mean raw_edge_at_signal, mean markout_5s, mean markout_10s, mean markout_30s, mean markout_final, win_rate (markout_final > 0).

2. 'by_regime': group by regime label, same metrics as summary.

3. 'by_edge_bucket': bucket raw_edge_at_signal into [0.01-0.02, 0.02-0.04, 0.04-0.06, 0.06+], same metrics. The mean markout_final should increase monotonically with edge bucket if the signal is real.

4. 'fill_realism': for each fill, compute ask_slippage = fill_price - intended_ask. Report mean, p50, p90, p99 slippage. Also report no_fill_rate by regime.

5. 'adverse_selection': for filled trades, report the fraction where markout_5s < 0 (filled and immediately moved against). Break this down by regime.

Return each section as a separate DataFrame. Print a text summary of the most important finding from each section. Use only pandas and numpy.

What makes a paper-trade loop realistic

Most paper-trade implementations are optimistic: they assume the signal price is available at the exact moment the signal fires, fill the full requested size, and record the midpoint as the fill price. All three assumptions are wrong. The realistic version applies a latency delay (1-3 seconds is a conservative estimate for a non-co-located bot), checks whether the ask is still available within one cent of the signal price after that delay, caps fill size to the depth visible within 3 cents of the delayed best ask, and records the actual depth-walked fill price.
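
A depth-walked fill price can be sketched as follows. `walk_asks` is a hypothetical stand-in for illustration and may differ from this tutorial's `effective_fill_price` in metrics.py; it assumes the ask side is a dict mapping price to size.

```python
from typing import Dict, Optional


def walk_asks(asks: Dict[float, float], qty: float) -> Optional[float]:
    """Volume-weighted average price to buy `qty` shares, walking up the asks.

    Returns None if qty <= 0 or the book cannot supply the full quantity.
    """
    if qty <= 0:
        return None
    remaining = qty
    cost = 0.0
    for px in sorted(asks):              # cheapest level first
        take = min(remaining, asks[px])
        cost += take * px
        remaining -= take
        if remaining <= 1e-12:           # fully filled
            return cost / qty
    return None                          # ran out of depth
```

With asks of 10 shares at 0.52 and 20 shares at 0.53, buying 20 shares costs an average of 0.525, half a cent worse than the best ask. That gap is exactly what a midpoint or best-ask fill assumption hides.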

Markout is the most important diagnostic. For every paper fill, record the fair value at 5s, 10s, and 30s after fill, and at final settlement. If your fills have negative markout at 5 seconds, you are being adversely selected: the book moved against you before you even had a chance to profit. Negative 5-second markout is a strong signal that your entry timing is wrong or that you are chasing stale quotes.
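
The markout arithmetic is simple enough to state in a few lines. This sketch uses hypothetical helper names and assumes markouts are stored as fair value minus fill price, with None for trades that did not fill or have not reached the horizon yet.

```python
from typing import Iterable, Optional


def markout(fair_value_now: float, fill_price: float) -> float:
    """Positive when the token is now worth more than the buyer paid."""
    return fair_value_now - fill_price


def adverse_selection_rate(markouts_5s: Iterable[Optional[float]]) -> Optional[float]:
    """Fraction of filled trades whose 5-second markout is negative.

    None entries (no fill, or horizon not reached) are ignored.
    """
    vals = [m for m in markouts_5s if m is not None]
    if not vals:
        return None
    return sum(1 for m in vals if m < 0) / len(vals)
```

A fill at 0.55 whose fair value is 0.58 five seconds later has a markout of about +0.03. A log where most 5-second markouts are negative points at entry timing rather than at the fair-value model.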

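The compliance check is not optional. According to the Polymarket documentation reviewed for this tutorial, order placement is restricted from certain jurisdictions, including the United States. The geoblock check must be the first line of any function that would submit a real order. In paper-trade mode the check logs a warning and returns False instead of raising, and the paper trader below records a no-fill for a blocked jurisdiction; in live mode it raises. The code path must exist so that switching from paper to live does not silently bypass it. Because attempt_paper_fill defaults jurisdiction to "US", a run that leaves the default in place records every signal as a no-fill.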

python · compliance.py

Geoblock compliance check: must be called before any order-placement code path

import logging
import os

log = logging.getLogger(__name__)

# Jurisdictions blocked for order placement as of Polymarket docs reviewed 2026-05-18.
# Verify current restrictions at https://polymarket.com/geographic-restrictions
# before enabling live execution.
BLOCKED_JURISDICTIONS = {"US", "USA", "UNITED STATES"}

def check_compliance(jurisdiction: str, paper_trade: bool = True) -> bool:
    """
    Returns True if order placement is allowed.
    In paper_trade mode, logs a warning but does not raise.
    In live mode, raises RuntimeError for blocked jurisdictions.
    """
    j = jurisdiction.strip().upper()
    if j in BLOCKED_JURISDICTIONS:
        msg = (
            f"Jurisdiction '{jurisdiction}' is listed as blocked for Polymarket "
            f"order placement. Verify current restrictions before enabling execution."
        )
        if paper_trade:
            log.warning("[COMPLIANCE] %s (paper-trade mode: logging only)", msg)
            return False
        else:
            raise RuntimeError(f"[COMPLIANCE BLOCK] {msg}")
    return True

# Usage: call this as the FIRST line of any order-submission function
# allowed = check_compliance(os.environ.get("JURISDICTION", "US"), paper_trade=True)
# if not allowed:
#     return  # skip order, log suppression

python · paper_trader.py

Paper-trade loop: latency delay, depth-capped fill, markout logging

import asyncio
import time
import uuid
from dataclasses import dataclass
from typing import Optional
from book import LocalBook
from metrics import effective_fill_price
from compliance import check_compliance

@dataclass
class PaperFill:
    fill_id:        str
    token_side:     str          # "UP" or "DOWN"
    signal_time:    float
    fill_time:      float
    intended_ask:   float
    fill_price:     Optional[float]  # None = no fill (ask moved)
    fill_size:      float
    fair_prob_at_signal: float
    raw_edge_at_signal:  float
    regime:         str
    markout_5s:     Optional[float] = None
    markout_10s:    Optional[float] = None
    markout_30s:    Optional[float] = None
    markout_final:  Optional[float] = None

LATENCY_DELAY_S = 1.5   # simulate round-trip latency before checking fill
MAX_FILL_SIZE   = 50.0  # cap per-entry size

async def attempt_paper_fill(
    book_at_signal:  LocalBook,
    book_after_delay: LocalBook,   # same book object, updated by WS
    token_side:      str,
    quantity:        float,
    fair_prob:       float,
    raw_edge:        float,
    regime:          str,
    jurisdiction:    str = "US",
) -> PaperFill:

    fill_id      = str(uuid.uuid4())[:8]
    signal_time  = time.time()
    intended_ask = book_at_signal.best_ask()

    # Compliance check first — always
    allowed = check_compliance(jurisdiction, paper_trade=True)
    if not allowed:
        # Blocked jurisdiction: record a no-fill and stop. In live mode
        # check_compliance raises instead of returning False.
        return PaperFill(
            fill_id=fill_id, token_side=token_side,
            signal_time=signal_time, fill_time=time.time(),
            intended_ask=intended_ask, fill_price=None,
            fill_size=0.0, fair_prob_at_signal=fair_prob,
            raw_edge_at_signal=raw_edge, regime=regime,
        )

    # Simulate latency
    await asyncio.sleep(LATENCY_DELAY_S)
    fill_time = time.time()

    # Check if ask is still available after delay
    delayed_ask = book_after_delay.best_ask()
    if delayed_ask is None or (intended_ask and delayed_ask > intended_ask + 0.01):
        return PaperFill(
            fill_id=fill_id, token_side=token_side,
            signal_time=signal_time, fill_time=fill_time,
            intended_ask=intended_ask, fill_price=None,
            fill_size=0.0, fair_prob_at_signal=fair_prob,
            raw_edge_at_signal=raw_edge, regime=regime,
        )

    # Cap size to available depth within 3c
    available = sum(
        sz for px, sz in book_after_delay.asks.items()
        if px <= delayed_ask + 0.03
    )
    capped_qty = min(quantity, MAX_FILL_SIZE, available)
    fill_price = effective_fill_price(book_after_delay, "buy", capped_qty)

    return PaperFill(
        fill_id=fill_id, token_side=token_side,
        signal_time=signal_time, fill_time=fill_time,
        intended_ask=intended_ask, fill_price=fill_price,
        fill_size=capped_qty, fair_prob_at_signal=fair_prob,
        raw_edge_at_signal=raw_edge, regime=regime,
    )

python · markout.py

Markout recorder: update each fill with fair value at 5s, 10s, 30s, and settlement

import asyncio
import time
from typing import Callable
from paper_trader import PaperFill
from fair_value import diffusion_fair_prob

async def record_markouts(
    fill:          PaperFill,
    get_spot:      Callable[[], float],   # live BTC spot price getter
    strike:        float,
    sigma:         float,
    get_tau:       Callable[[], float],   # live tau getter
    final_outcome: Callable[[], float | None],  # payout of the filled token (1.0/0.0), None until settled
) -> PaperFill:
    """Wait for markout horizons and record fair value vs fill price."""
    if fill.fill_price is None:
        return fill  # no fill, nothing to mark

    # Horizons are measured from fill_time, so sleep only until each target
    # timestamp instead of stacking the delays (5 + 10 + 30 = 45s).
    horizons = {"markout_5s": 5, "markout_10s": 10, "markout_30s": 30}
    for attr, horizon in horizons.items():
        await asyncio.sleep(max(0.0, fill.fill_time + horizon - time.time()))
        spot = get_spot()
        tau  = get_tau()
        p_up = diffusion_fair_prob(spot, strike, sigma, tau)
        # Assumes diffusion_fair_prob returns the UP probability; a DOWN
        # token is worth the complement.
        fv = p_up if fill.token_side == "UP" else 1.0 - p_up
        # markout = fair_value_now - fill_price (positive = good for buyer)
        setattr(fill, attr, round(fv - fill.fill_price, 6))

    # Wait for settlement
    while True:
        outcome = final_outcome()
        if outcome is not None:
            fill.markout_final = round(outcome - fill.fill_price, 6)
            break
        await asyncio.sleep(1)

    return fill

Warning

Event-level, not row-level backtests

Do not report per-second row EV as if each row were an independent trade. The correct unit is the cycle event. Allow at most one to three entries per 5-minute cycle. Row-level EV from per-second data overstates edge by conflating correlated observations with independent trades.
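
Collapsing rows to cycle events can be sketched as follows. `cycle_level_pnl` is a hypothetical helper; it assumes each row carries a cycle identifier (for 5-minute cycles, something like `int(timestamp // 300)` would work) and that rows arrive in time order.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def cycle_level_pnl(
    rows: Iterable[Tuple[int, float]],
    max_entries_per_cycle: int = 1,
) -> Dict[int, float]:
    """Collapse per-row results into one PnL figure per cycle.

    Each row is (cycle_id, pnl). Only the first `max_entries_per_cycle`
    rows of a cycle count as entries; later rows are ignored.
    """
    taken: Dict[int, List[float]] = defaultdict(list)
    for cycle_id, pnl in rows:
        if len(taken[cycle_id]) < max_entries_per_cycle:
            taken[cycle_id].append(pnl)
    return {cid: sum(pnls) for cid, pnls in taken.items()}
```

Three identical +0.03 rows in one cycle count as a single +0.03 event, not +0.09. Statistics such as mean EV and win rate should then be computed over the cycle-level values.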

Success

You are ready to iterate when

Your paper-trade log shows: fill rate above 60% after the latency delay (equivalently, a no-fill rate below 40%), mean markout_5s positive, and a markout_final win rate above 52% in the calm regime. If markout_5s is negative, fix entry timing before tuning anything else.