PolyDoge Journal

Every pivot, every loss, every win

Honest write-ups of every major decision. No silent pivots. No revisionism.
Entry #006 · Reliability 2026-05-31

We gave the bot a harness. It caught a lie on day one.

We wrapped the v4 engine in a harness: one command to verify everything (./init.sh) and a feature_list.json where a feature only turns green when its check actually passes. First run: 146/147. A test that had read “137/137 passing” in our notes for ten days was secretly reading the live replay file that the */30 cron keeps growing: 62 records when we froze the $65.98 figure, 138 records ($77.28) by the time the harness ran it. Not an engine bug; a test reading a moving target and calling it correct. Fix: we froze the original 62 records as a committed fixture and pointed the test at that. 147/147 now, zero engine changes. The harness’s whole job is making “green” mean something.
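A minimal sketch of that failure mode, not the project's actual harness: the same check is run against a frozen fixture and against a file that keeps growing. The JSONL layout, the `pnl` field, the file names, and the helpers `load_records`, `total_pnl`, and `run_checks` are all assumptions made for illustration.

```python
import json
import tempfile
from pathlib import Path


def load_records(path):
    """Read one JSON object per line (assumed JSONL layout)."""
    with open(path) as fh:
        return [json.loads(line) for line in fh if line.strip()]


def total_pnl(records):
    """Sum the hypothetical 'pnl' field and round to cents."""
    return round(sum(r["pnl"] for r in records), 2)


def run_checks(checks):
    """Mark a feature green (True) only if its check runs and returns True."""
    results = {}
    for name, check in checks.items():
        try:
            results[name] = check() is True
        except Exception:
            results[name] = False  # a crashing check is never green
    return results


def demo():
    workdir = Path(tempfile.mkdtemp())
    fixture = workdir / "replay_fixture.jsonl"  # frozen, committed copy
    live = workdir / "replay_live.jsonl"        # file the cron appends to

    frozen = [{"pnl": 1.25}, {"pnl": 0.50}, {"pnl": 2.00}]
    fixture.write_text("".join(json.dumps(r) + "\n" for r in frozen))
    live.write_text(fixture.read_text())

    # Simulate the cron growing the live file after the expected value was set.
    with live.open("a") as fh:
        fh.write(json.dumps({"pnl": 3.00}) + "\n")

    expected = 3.75  # total recorded when the fixture was frozen
    return run_checks({
        "replay_total_from_fixture": lambda: total_pnl(load_records(fixture)) == expected,
        "replay_total_from_live": lambda: total_pnl(load_records(live)) == expected,
    })


if __name__ == "__main__":
    print(demo())
```

The fixture-backed check stays green no matter how often the cron runs, while the live-file check flips to red as soon as a new record lands. The engine is identical in both cases; only the input moved.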
harness-engineering · reliability · flaky-test · fixture · building-in-public
Entry #005 · Experiments #015 + #016 2026-05-21

Phase B is live. On paper.

15 dispatched tasks across two stages. Phase B replaces the single-pass scanner with a stateful order machine: OrderClient interface, Inventory state file (seeded $65.98 from Phase A), 5 kill switches, pre-resolution cancel, one-sided fill resolution, Discord alerts. Phase B.2 Stage 1 expands the universe to 5 categories (was crypto-only), fixes a 5x undercount in the maker rebate constant, and adds queue-aware fill simulation: the truthful version of paper trading, which drains existing maker depth before crediting our fills. Two pre-existing bugs caught: the CoinGecko funding rate was under-reported by 100x, and the migration could be poisoned by NaN/Inf values. 137/137 tests. Cron bumped to */30. Phase C gate: 2026-06-04.
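The queue-aware idea can be sketched as follows, assuming simple price-time priority at a single price level. The function name `queue_aware_fill` and its parameters are hypothetical; the engine's real fill logic is not shown in this journal.

```python
def queue_aware_fill(taker_volume, depth_ahead, our_size):
    """Return (our_fill, remaining_depth_ahead) for one price level.

    Taker volume first consumes the maker depth that was already resting
    ahead of our order; only what is left over can fill us.
    """
    if min(taker_volume, depth_ahead, our_size) < 0:
        raise ValueError("volumes and sizes must be non-negative")

    consumed_ahead = min(taker_volume, depth_ahead)
    leftover = taker_volume - consumed_ahead
    our_fill = min(leftover, our_size)
    return our_fill, depth_ahead - consumed_ahead


if __name__ == "__main__":
    # Not enough flow to reach us: the queue ahead absorbs everything.
    print(queue_aware_fill(30, 100, 50))
    # Flow clears the queue and partially fills our order.
    print(queue_aware_fill(120, 100, 50))
    # Flow clears the queue and fills us completely.
    print(queue_aware_fill(500, 100, 50))
```

A naive paper simulator would credit a fill in all three cases. Under the assumption above, the first case credits nothing, which is why a queue-aware paper result is the more conservative number.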
v4.0 · phase-b · state-machine · queue-aware-fills · 5-categories · kill-switches · rebate-fix
Entry #004 · Experiment #014 2026-05-18

The 0% that was actually 30%.

Yesterday's "0% conversion = pivot the strategy" read was wrong. The number was a measurement bug — classify_replay() substring-matched outcome strings against "YES"/"NO", but Polymarket's "Up or Down" series uses "Up"/"Down". 100% of our fills were invisible for two days. Actual conversion: 30% both-side fills, 27% dollar conversion ($45.68 / $171.26). The Phase B decision gate is already exceeded. Pre-committed rules + wrong measurement = pre-committed pivot away from a working strategy. Fixed, tested, and shipping with a pytest regression so this bug class fails loudly next time.
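A hedged reconstruction of this bug class, not the actual classify_replay() source: `legacy_classify` substring-matches against "YES"/"NO" and silently returns nothing for "Up"/"Down", while `classify_outcome` normalizes known labels and raises on anything it does not recognize. Both function names, the label sets, and the reading of the two dollar figures as converted versus quoted amounts are assumptions.

```python
FIRST_SIDE = {"yes", "up"}
SECOND_SIDE = {"no", "down"}


def legacy_classify(label):
    """Illustrative buggy version: unknown labels fall through silently."""
    if "YES" in label.upper():
        return "first"
    if "NO" in label.upper():
        return "second"
    return None  # "Up" and "Down" end up here, so fills look like zero


def classify_outcome(label):
    """Map an outcome label to a side; fail loudly on unknown labels."""
    key = label.strip().lower()
    if key in FIRST_SIDE:
        return "first"
    if key in SECOND_SIDE:
        return "second"
    raise ValueError(f"unrecognized outcome label: {label!r}")


def dollar_conversion(converted_usd, quoted_usd):
    """Fraction of quoted dollars that converted (assumed meaning)."""
    return converted_usd / quoted_usd


if __name__ == "__main__":
    print(legacy_classify("Up"), legacy_classify("Down"))
    print(classify_outcome("Up"), classify_outcome("Down"))
    print(round(dollar_conversion(45.68, 171.26) * 100))  # percent, rounded
```

Raising on an unknown label is the design choice that matters here: a new outcome vocabulary then breaks a test instead of quietly producing a 0% conversion.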
postmortem · measurement-bug · replay-validation · pre-commit · v4.0
Entry #003 · Experiment #014 2026-05-16 (evening)

First quoteables, asset surprise, $50 stake.

Six hours after launch, v4.0 paper has 11 cycles, 180 markets scanned, 9 quoteable opportunities. The arbitrage thesis fired faster than expected — and not where expected. The biggest spreads are in DOGE (20¢ room) and XRP (19¢), not BTC (6¢). Bitcoin's hourly market is efficient; the alt-coin "Up or Down" series are wide open. Bumping stake $5 → $50/side so the daily numbers are loud enough to learn from. Phase B decision gate scales accordingly ($0.30/day → $3/day). Tomorrow morning the first overnight replay data tells us whether the spreads are tradeable or theatrical.
v4.0 · paper-results · asset-mix · stake-bump · doge · xrp
Entry #002 · Experiment #014 2026-05-16 (afternoon)

v4.0 first quoteable opportunity.

The combined-cost arbitrage thesis fired its first signal. 2 markets crossed the MIN_BID_ROOM = 0.03 threshold within hours of the v4.0 deploy — bid_combined under $0.97 on Bitcoin and Dogecoin "Up or Down" hourly series. Paper mode only; the replay layer revisits these over the next ~8 hours to check whether actual taker volume crossed the hypothetical bid prices. "Quoteable" ≠ "would have filled" — that's the whole point of replay validation.
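The threshold test can be sketched like this, assuming bid_combined is the sum of the two bid prices on a binary market that pays out $1.00. The helper names `bid_room` and `is_quoteable` are hypothetical, and treating exactly $0.97 as not quoteable is an assumption taken from the word "under".

```python
from decimal import Decimal

MIN_BID_ROOM = Decimal("0.03")  # value quoted in the entry
PAYOUT = Decimal("1.00")        # assumed payout of a resolved binary pair


def bid_room(up_bid, down_bid):
    """Room left under the payout after bidding on both sides."""
    bid_combined = Decimal(up_bid) + Decimal(down_bid)
    return PAYOUT - bid_combined


def is_quoteable(up_bid, down_bid):
    """True when bid_combined is strictly under PAYOUT - MIN_BID_ROOM."""
    bid_combined = Decimal(up_bid) + Decimal(down_bid)
    return bid_combined < PAYOUT - MIN_BID_ROOM


if __name__ == "__main__":
    # Prices are passed as strings so Decimal keeps them exact.
    print(bid_room("0.46", "0.48"), is_quoteable("0.46", "0.48"))
    print(bid_room("0.50", "0.49"), is_quoteable("0.50", "0.49"))
```

Decimal is used instead of float so that a combined bid sitting right at the threshold is compared exactly. A market passing this test is only quoteable; whether it would have filled is a separate question for the replay layer.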
v4.0 · first-signal · paper-mode · auto-journal
Entry #001 · Experiment #014 2026-05-16 (morning)

v3.x is dead. v4.0 lives. We're now a market maker.

After 83 days, 6,267 predictions, and $582 in losses, the directional prediction engine is retired. We ran one final diagnostic and found a structural problem: across 4,294 shadow predictions, the algorithm never confidently disagreed with the market (0 instances). Every signal we use is public, so we reach the same conclusion as the market: there's no information edge to find. v4.0 pivots to combined-cost arbitrage on Polymarket binaries: market-neutral, no fair-value model, reliant on Polymarket's maker rebate program. Currently paper-only via a GHA hourly cron. Honest projection: $0.30-1.00/day on $500 capital. At current scale, a science project, not a business.
pivot · v4.0 · liquidity-provision · combined-cost-arbitrage · honest-loss
Next checkpoint: tomorrow morning (2026-05-17) — first overnight replay data lands. Either the spreads were real or they were theatrical.