Stealing HFT’s mean reversion playbook for your Polymarket bot.

If you Google “mean reversion strategy,” you’ll get a thousand variations of the same advice: _“When price is two standard deviations below the moving average, buy. When it’s above, sell.”_ That’s not how the pros do it. In an HFT shop, mean reversion isn’t about Bollinger Bands - it’s about studying **price movements** themselves and betting that recent moves get unwound.

The same logic applies brilliantly to Polymarket. Prediction market prices are bounded between 0 and 1, news shocks cause overreactions, and retail traders pile in late on every poll release. That’s a mean reversion playground - _if_ you build the bot the right way.

The core idea

If a price moved **down** today, bet it goes **up** tomorrow. If it moved **up**, bet it goes **down**. That’s it. No moving averages, no indicators. Just: _what goes up must come down._

The trick is proving, statistically, that this pattern actually exists in your data and persists into the future. Most “strategies” overfit a pattern that’s already evaporated by the time you deploy. We use real out-of-sample validation to avoid that.

Step 01: Get the data

The technique

Pull daily OHLC (Open, High, Low, Close) data for the asset you want to study. Each row is one bar: date, symbol, duration, open, close, high, low. For a daily strategy on a liquid asset like Bitcoin Cash, a few years of history is plenty.
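As a rough sketch of what one bar looks like in code, here is a minimal Python layout using only the standard library. The `Bar` class, the `is_valid` check, the `"BCHUSDT"` symbol string, and every price are made up for illustration; your data source will have its own field names.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Bar:
    day: date
    symbol: str
    duration: str
    open: float
    close: float
    high: float
    low: float


def is_valid(bar: Bar) -> bool:
    """Basic sanity check: high and low must bracket open and close."""
    return bar.low <= min(bar.open, bar.close) and bar.high >= max(bar.open, bar.close)


# Made-up prices, purely to show the shape of the data.
bars = [
    Bar(date(2024, 1, 2), "BCHUSDT", "1d", 104.0, 101.0, 104.8, 100.2),
    Bar(date(2024, 1, 1), "BCHUSDT", "1d", 100.0, 104.0, 105.5, 99.0),
    Bar(date(2024, 1, 3), "BCHUSDT", "1d", 101.0, 103.0, 103.9, 100.5),
]

# Bars must be in chronological order before any lagging or splitting.
bars = sorted(bars, key=lambda b: b.day)
closes = [b.close for b in bars]
```

Sorting by date up front matters more than it looks: every later step assumes row order is time order.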

One thing worth flagging early: Polymarket prices are probabilities (0 to 1), not unbounded asset prices. That bound can help mean reversion. A market at 0.85 has only 0.15 of room above it and literally cannot trend to infinity, so overshoots have less space to run. Keep in mind that every market still converges to 0 or 1 at resolution.

Step 02: Convert prices to log returns

The technique

This is the most important conceptual move in the whole approach. Stop looking at prices. Start looking at price movements. Specifically, log returns:

log_return = log(today's close / yesterday's close)

Why log returns? Two reasons. First, they’re additive: sum them up and you get the log of your total growth, so exponentiating the sum gives your compound return. Second, they’re symmetric. A +5% log return and a -5% log return cancel out exactly, which makes the math clean.
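Both properties are easy to check on a toy series. The closes below are made up; the sketch only assumes Python's standard `math` module.

```python
import math

closes = [100.0, 105.0, 100.0, 110.0]  # made-up closing prices

# One log return per consecutive pair of closes.
log_returns = [math.log(today / yesterday)
               for yesterday, today in zip(closes, closes[1:])]

# Additive: the sum of log returns is the log of the total growth factor.
total_growth = math.exp(sum(log_returns))  # 110 / 100

# Symmetric: a move up and the move straight back down cancel out,
# which simple percentage returns do not (+5% up, then about -4.76% down).
round_trip = log_returns[0] + log_returns[1]
```

Here `total_growth` comes out at 1.1 (up to floating-point rounding) and `round_trip` at zero, even though the simple percentage moves are not mirror images.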

Step 03: Add the lag (autoregression)

The technique

Create a new column called close_log_return_lag_1, yesterday’s log return, sitting next to today’s. Now every row in the dataset says: “yesterday moved this much, today moved this much.”

This is autoregression, using a previous price movement to predict the next one. It’s the foundation of the whole strategy.
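A minimal sketch of the lag step in plain Python, with made-up closes. A dataframe library would do the same thing with a one-bar shift; the list-of-dicts layout here is just an assumption to keep the example dependency-free.

```python
import math

closes = [100.0, 102.0, 101.0, 103.0, 102.5]  # made-up closes

close_log_return = [math.log(b / a) for a, b in zip(closes, closes[1:])]

# Shift by one bar: each row pairs today's return with yesterday's.
# The first return has no predecessor, so it is dropped rather than guessed.
rows = [
    {"close_log_return_lag_1": prev, "close_log_return": cur}
    for prev, cur in zip(close_log_return, close_log_return[1:])
]
```

The lag column only ever contains information that was available before today's bar opened, which is what keeps the later backtest free of look-ahead.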

Step 04: Encode the direction

The technique

Reduce each lagged return to a simple sign, +1 if it went up, -1 if it went down. Throw away the magnitude on purpose. This lets you group the data into two clean buckets: “previous bar was up” vs “previous bar was down.”

direction = +1 if lag > 0 else -1
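One detail the one-liner hides: a perfectly flat bar (lag of exactly zero) lands in the "down" bucket. A small sketch with made-up lagged returns makes that visible; the `direction` function name is only for illustration.

```python
def direction(lag: float) -> int:
    """+1 if the previous bar closed up, -1 otherwise (a flat bar counts as down)."""
    return 1 if lag > 0 else -1


lagged_returns = [0.012, -0.004, 0.0, 0.031, -0.020]  # made-up values
directions = [direction(r) for r in lagged_returns]

# Optional variant: drop flat bars instead of assigning them to a bucket.
non_flat = [r for r in lagged_returns if r != 0.0]
```

On liquid daily data exact zeros are rare, but on thin prediction markets unchanged closes are common enough that the choice is worth making on purpose.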

Step 05: Study the price movements

The technique

This is where the mean reversion either shows up or it doesn’t. Group every row by direction (was the previous bar up or down?) and compute three numbers per bucket:

Sum of today’s log returns within each bucket
Mean of today’s log returns within each bucket
Count (how many bars fell in each bucket)

On Bitcoin Cash daily data from 2022 onward, the result is clean:

When the previous bar was down, today’s average return is positive.
When the previous bar was up, today’s average return is negative.

That’s the mean reversion signature showing up in the data. The mean of each bucket is your expected value (EV) per trade, and both buckets show a tiny positive EV when traded in the reversion direction. Whether it holds up is what the out-of-sample step is for.
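The bucket analysis can be sketched in a few lines of plain Python. The closes below are made up and deliberately choppy, so the output demonstrates the mechanics and proves nothing about any real market.

```python
import math
from collections import defaultdict

closes = [100.0, 98.0, 99.5, 101.0, 100.0, 100.8, 99.9, 100.6]  # made-up closes
rets = [math.log(b / a) for a, b in zip(closes, closes[1:])]

# Bucket today's return by the sign of yesterday's return.
buckets = defaultdict(list)
for prev, cur in zip(rets, rets[1:]):
    key = 1 if prev > 0 else -1
    buckets[key].append(cur)

# Sum, mean and count of today's log returns per bucket.
stats = {
    key: {"sum": sum(vals), "mean": sum(vals) / len(vals), "count": len(vals)}
    for key, vals in buckets.items()
}
```

Always read the mean next to the count: a bucket mean built on a handful of bars is noise, not an edge.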

Step 06: Out-of-sample validation

The technique

This is the single most important step, and the one most retail “quants” skip. Split the data 75/25 by time. The oldest 75% is “in-sample,” the newest 25% is “out-of-sample.” Run the same analysis on each chunk separately.

If the mean reversion pattern shows up in both the old data and the recent data, it’s probably real. If it shows up only in old data, the pattern is dead and you’ll lose money trading it.

Financial data is non-stationary. Patterns shift. Think FTX collapsing overnight: Bitcoin’s return distribution changed dramatically in a single day. A pattern that worked from 2020-2022 might be gone by 2024.
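A minimal sketch of the 75/25 split and the "does it show up in both halves?" check. The helper names (`split_by_time`, `bucket_means`, `reverts`) and the data are assumptions for illustration, and a sign check like this is a sanity test, not a significance test.

```python
def split_by_time(rows, in_sample_fraction=0.75):
    """Split chronologically ordered rows; no shuffling, oldest rows first."""
    cut = int(len(rows) * in_sample_fraction)
    return rows[:cut], rows[cut:]


def bucket_means(rows):
    """Mean of today's return per previous-bar direction; rows are (lag, today) pairs."""
    sums, counts = {1: 0.0, -1: 0.0}, {1: 0, -1: 0}
    for lag, today in rows:
        key = 1 if lag > 0 else -1
        sums[key] += today
        counts[key] += 1
    return {k: sums[k] / counts[k] for k in sums if counts[k]}


def reverts(means):
    """True if both buckets exist and each mean has the opposite sign to its bucket."""
    return means.get(-1, 0.0) > 0 and means.get(1, 0.0) < 0


# Made-up (lag, today) pairs, already sorted oldest to newest.
rows = [(-0.02, 0.01), (0.01, -0.005), (-0.005, 0.004), (0.004, -0.002),
        (-0.002, 0.003), (0.003, -0.001), (-0.001, 0.002), (0.002, -0.004)]

in_sample, out_of_sample = split_by_time(rows)
pattern_holds = reverts(bucket_means(in_sample)) and reverts(bucket_means(out_of_sample))
```

The split is by position, never by random sampling: shuffling would leak recent regimes into the in-sample set and hide exactly the decay this step is meant to catch.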

Step 07: Generate the signal and trade

The technique

The signal is dead simple. Flip the sign of the previous return:

signal = -1 * direction(lag_1)

If yesterday went down (direction = -1), signal = +1 (bet it goes up). If yesterday went up, signal = -1 (bet it goes down). Then:

trade_log_return = signal * close_log_return

This gives you the realized return of each trade. Sum them up cumulatively and you have your equity curve.
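Put together, the signal and the equity curve fit in a short loop. This is a sketch on made-up closes that ignores fees, spread, and slippage.

```python
import math
from itertools import accumulate

closes = [100.0, 98.0, 99.5, 101.0, 100.0, 100.8]  # made-up closes
rets = [math.log(b / a) for a, b in zip(closes, closes[1:])]

trade_log_returns = []
for lag_1, close_log_return in zip(rets, rets[1:]):
    direction = 1 if lag_1 > 0 else -1
    signal = -direction                       # fade yesterday's move
    trade_log_returns.append(signal * close_log_return)

# Running sum of log returns is the log of the equity curve.
equity_log = list(accumulate(trade_log_returns))
equity = [math.exp(x) for x in equity_log]    # growth of 1 unit of capital
```

A trade's log return is positive whenever today moved against yesterday, and negative whenever the move continued.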

Step 08: Evaluate the strategy

Three numbers matter, in this order.

Win rate

On the Bitcoin Cash example, this strategy wins only 52% of trades. That’s it. People obsess over win rate and miss the point. What matters is that your average trade is positive (positive EV). A 49% win-rate strategy with big wins and small losses crushes a 70% win-rate strategy with small wins and huge losses.
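The win-rate point is just arithmetic. The payoff sizes below are invented to make the comparison concrete; only the 49% and 70% win rates come from the paragraph above.

```python
def expected_value(win_rate: float, avg_win: float, avg_loss: float) -> float:
    """Average return per trade; avg_loss is given as a positive magnitude."""
    return win_rate * avg_win - (1.0 - win_rate) * avg_loss


# Made-up payoff profiles, in percent per trade.
low_win_rate = expected_value(0.49, avg_win=2.0, avg_loss=1.0)   # big wins, small losses
high_win_rate = expected_value(0.70, avg_win=0.5, avg_loss=2.0)  # small wins, huge losses
```

With these numbers the 49% strategy makes about +0.47% per trade while the 70% strategy loses about 0.25% per trade.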

Total compound return

Convert log returns back to normal returns:

total_return = exp(sum(trade_log_returns)) - 1

On the Bitcoin Cash example, this works out to ~21x over the period. Log returns naturally model compounding: every winning trade increases your next position size, every loss decreases it.
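To see why the exponential of the sum is the compounded result, compare it with multiplying the per-trade growth factors one by one. The trade returns here are made up.

```python
import math

trade_log_returns = [0.010, -0.004, 0.007, -0.002, 0.005]  # made-up values

total_return = math.exp(sum(trade_log_returns)) - 1

# Same number the long way: compound the per-trade growth factors.
growth = 1.0
for r in trade_log_returns:
    growth *= math.exp(r)
```

Both routes land on roughly a 1.61% total return, because adding logs and multiplying growth factors are the same operation.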

Annualized Sharpe ratio

Risk-adjusted return:

sharpe = (mean_trade_return / std_trade_return) * sqrt(N)

Where N is the number of bars per year (365 for daily crypto, 252 for daily equities, way higher for hourly bars). Higher Sharpe = smoother equity curve = safer to use leverage.
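A minimal sketch of that formula using the standard library. It assumes a zero risk-free rate (as the formula above does) and uses the sample standard deviation; the function name and the trade returns are made up.

```python
import math
import statistics


def annualized_sharpe(trade_returns, bars_per_year):
    """Mean over sample standard deviation of per-bar returns, scaled by sqrt(N)."""
    mean = statistics.fmean(trade_returns)
    std = statistics.stdev(trade_returns)    # sample std (n - 1 denominator)
    return (mean / std) * math.sqrt(bars_per_year)


trade_log_returns = [0.010, -0.004, 0.007, -0.002, 0.005]  # made-up values
sharpe_daily_crypto = annualized_sharpe(trade_log_returns, bars_per_year=365)
sharpe_15min = annualized_sharpe(trade_log_returns, bars_per_year=96 * 365)
```

The same per-bar statistics annualize very differently depending on N, so make sure N matches the bar size you actually trade before comparing Sharpe ratios across strategies.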

Worked example: 15-min BTC Up/Down bot

Everything above, applied end-to-end to one specific Polymarket market series. The steps so far used Bitcoin Cash daily bars; here the same pipeline runs on 15-minute bars.

The market

Polymarket lists a fresh “Will BTC be up in the next 15 minutes?” market every 15 minutes. The YES contract pays $1 if BTC is up at the next 15-min UTC boundary versus the previous one. Otherwise the NO side pays $1. New market, fresh book, every 15 minutes, all day, every day.

Bot loop

Once the per-bucket EV is validated and clears spread cost, the live loop looks like this:

every 15 min at UTC :00, :15, :30, :45:
    # 1. close the previous bar
    p_close_t = last_trade_price(active_market)
    log_ret_t = log(p_close_t / p_close_prev)
    p_close_prev = p_close_t        # becomes the previous close for the next bar

    # 2. close any open position; record realized PnL
    if open_position:
        exit_at_market()

    # 3. open the next market
    new_market = subscribe_to_next_market()
    p_open = mid_price(new_market)

    # 4. compute the signal
    direction = +1 if log_ret_t > 0 else -1
    signal    = -direction          # mean reversion: bet against last move

    # 5. check edge clears costs
    if abs(modeled_ev[direction]) < spread_cost + buffer:
        skip()
        continue

    # 6. enter
    side  = 'YES' if signal == +1 else 'NO'
    size  = 0.02 * capital          # 2% of bot capital
    place_marketable_limit(new_market, side, size, slippage_ticks=1)

    # 7. update stats & (weekly) re-run validation
    log_trade(...)
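The "every 15 min at UTC :00, :15, :30, :45" scheduling is the one part of the loop that can be sketched without any exchange API. The helper names below are hypothetical, and the sketch assumes you pass timezone-aware datetimes.

```python
from datetime import datetime, timedelta, timezone

BAR = timedelta(minutes=15)


def next_boundary(now: datetime) -> datetime:
    """Next UTC quarter-hour strictly after `now` (:00, :15, :30 or :45)."""
    now = now.astimezone(timezone.utc)
    floored = now.replace(minute=now.minute - now.minute % 15, second=0, microsecond=0)
    return floored + BAR


def seconds_until_next_bar(now: datetime) -> float:
    """How long the loop should sleep before the next bar closes."""
    return (next_boundary(now) - now).total_seconds()


example = datetime(2024, 1, 1, 12, 7, 30, tzinfo=timezone.utc)
wait = seconds_until_next_bar(example)  # 12:07:30 -> 12:15:00
```

Computing the sleep from the wall clock each cycle, rather than sleeping a fixed 900 seconds, keeps the loop from drifting off the boundaries over time.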

Expected per-bar economics

Annualized math

Punching the per-bar net through 96 bars/day, 365 days, with 2% capital sizing per trade:

96 bars/day × 365 days       = 35,040 trades/year
2% sizing × $10,000 capital  = $200 per trade
$2 net edge per $1,000       = $0.40 net per trade
0.40 × 35,040                = ~$14,000/year on $10K capital, before recompounding
With reinvestment (Kelly-ish): equity curve climbs ~3-5x per year, calibrated
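The same arithmetic in Python, including where the reinvestment range comes from. The compounding line is an idealized assumption: it treats the average edge as if it were realized on every single trade, with no variance, skipped bars, or edge decay.

```python
bars_per_day = 96
trades_per_year = bars_per_day * 365            # 35,040
capital = 10_000.0
sizing = 0.02                                   # 2% of capital per trade
net_edge_per_dollar = 2.0 / 1_000.0             # the $2 per $1,000 figure

stake = sizing * capital                        # $200 per trade
net_per_trade = stake * net_edge_per_dollar     # $0.40
flat_pnl = net_per_trade * trades_per_year      # no reinvestment

# With reinvestment, each trade grows capital by sizing * edge on average.
per_trade_growth = sizing * net_edge_per_dollar
compounded_multiple = (1.0 + per_trade_growth) ** trades_per_year
```

Under those assumptions the flat figure is $14,016 and the compounded multiple is a little over 4x, which sits inside the ~3-5x range quoted above.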

That’s not “$200/year retail,” and it’s not “$25M HFT desk” either. It’s the boring middle: a small, validated edge that pays because compute is cheap and the bot trades 35,000 times a year.

Reality check

These numbers are illustrative, not a guarantee. Real performance depends on (a) whether the reversion edge is currently alive on this market, (b) how tight your execution actually is, and (c) regime stability. Run the validation pipeline before you trust any estimate. The bot’s whole job is to keep checking.

The real lesson

The Bitcoin Cash example wins 52% of its trades and 21x’s the capital because of one thing: **a tiny statistical edge, traded frequently, with compounding.** It’s not magic, it’s not deep learning, it’s not even a particularly sophisticated model. It’s careful data analysis, honest validation, and disciplined execution.

That’s the model to copy for Polymarket. Don’t reach for a neural network. Find a signal with a clean statistical edge, validate it survives out-of-sample, account for fees and slippage, and let compounding do the work. Re-validate constantly, because prediction markets shift faster than crypto.

The bot doesn’t need to be smart. It needs to be honest about its edge.

Building this in production?

The Discord `#research-mean-reversion` channel has the full notebook, the dataset, and members actively running this on live markets.
