Anyone can show you a chart that went up. The entire discipline of this platform is refusing the three lies that power every hype tool: peak-price screenshots, resurrection accounting, and survivorship. Here is exactly how we grade — including the tests we ran against our own ideas and lost.
When a flagged token's liquidity collapses below 15% of its level at the flag, that fire is invalidated — permanently. If the token later pumps, that pump is a new episode with its own entry point, its own grade, and its own record. It never retro-credits the original call.
Two of our shadow-lane fires died exactly this way — observed liquidity walking down to nothing over days, no recovery. Their grades are failures, permanently, no matter what any later clone of the ticker does.
And one correction, because this rule bit us: our liquidity feed emits 0.0 when it loses a pool (tokens migrate exchanges) while trading continues. For two weeks the grader read those zeros as deaths — it invalidated fires on tokens that were alive, including one that later ran ~700×. We caught it by cross-checking independent price data, fixed the grader (zero-liquidity rows are missing data, never death), re-graded every fire, and published the changes the same day. Invalidation now requires observed nonzero liquidity below the threshold — death must be proven, not inferred from silence.
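A minimal sketch of the corrected invalidation rule, for readers who want to see the logic spelled out. The function name, the constant name, and the use of `None` for absent rows are illustrative assumptions, not the production grader; only the 15% threshold and the "zeros are missing data" rule come from the text above.

```python
from typing import Optional, Sequence

# From the rule above: a fire dies when liquidity falls below 15% of its level at the flag.
DEATH_THRESHOLD = 0.15


def is_invalidated(liq_at_flag: float, observations: Sequence[Optional[float]]) -> bool:
    """Return True only when death is proven by an observed nonzero reading.

    `observations` is the time-ordered liquidity series after the flag.
    None and 0.0 rows are treated as missing data (hypothetical convention),
    never as evidence of death.
    """
    floor = DEATH_THRESHOLD * liq_at_flag
    for liq in observations:
        if liq is None or liq == 0.0:
            continue  # feed dropout or pool migration: silence, not death
        if liq < floor:
            return True  # observed, nonzero, and below the threshold
    return False
```

A series such as `[80_000, 0.0, 0.0, 90_000]` against a flag level of `100_000` stays valid under this sketch, while `[60_000, 20_000, 9_000]` is invalidated at the last reading.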
A signal is graded tradeable only if it reached at least 2× within its original episode — net of execution cost. We model price impact against the actual pool depth plus fees at three position sizes ($500 / $2,000 / $10,000), and we check whether a realistic stop (−50% before the peak) would have killed the trade before the move ever happened.
A 3× "move" in a $30k pool that a $2,000 fill would have eaten is not a 3× for you. A 10× that drew down 60% first belongs to whoever had no stop and infinite nerve — which is nobody. Grades carry all three fill sizes so you can read them at your size.
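To make the arithmetic concrete, here is a rough sketch of a tradeable check. It assumes a constant-product pool, a 0.3% swap fee, and the same quote-side depth at entry and exit; all three are simplifying assumptions chosen for illustration, and the function names are hypothetical. The three fill sizes, the 2× bar, and the 50% stop come from the text above.

```python
from typing import Dict, Sequence

FILL_SIZES_USD = (500, 2_000, 10_000)  # the three position sizes named above
TARGET_MULTIPLE = 2.0                  # "at least 2x", net of execution cost
STOP_DRAWDOWN = 0.5                    # a -50% stop before the peak


def fill_efficiency(depth_usd: float, trade_usd: float, fee: float) -> float:
    """Fraction of spot value realised by one swap in a constant-product pool.

    depth_usd is the quote-side reserve; trade_usd is the spot value swapped.
    """
    effective = trade_usd * (1 - fee)
    return (1 - fee) * depth_usd / (depth_usd + effective)


def net_multiple(gross: float, depth_usd: float, fill_usd: float, fee: float = 0.003) -> float:
    """Gross peak multiple after entry and exit impact plus fees (assumed model)."""
    entry = fill_efficiency(depth_usd, fill_usd, fee)
    value_at_peak = fill_usd * entry * gross  # spot value of the position at the peak
    exit_ = fill_efficiency(depth_usd, value_at_peak, fee)
    return gross * entry * exit_


def stopped_before_peak(prices: Sequence[float], stop: float = STOP_DRAWDOWN) -> bool:
    """True if price fell to the stop level at or before the peak. prices[0] is the entry."""
    peak_idx = max(range(len(prices)), key=prices.__getitem__)
    return min(prices[: peak_idx + 1]) <= prices[0] * (1 - stop)


def grade_tradeable(prices: Sequence[float], depth_usd: float, fee: float = 0.003) -> Dict[int, bool]:
    """Per-fill-size verdict: survived the stop AND cleared the target net of cost."""
    gross = max(prices) / prices[0]
    stopped = stopped_before_peak(prices)
    return {
        size: (not stopped) and net_multiple(gross, depth_usd, size, fee) >= TARGET_MULTIPLE
        for size in FILL_SIZES_USD
    }
```

Under these assumptions, a 3× peak in a pool with $15,000 on the quote side nets roughly 2.6× at a $500 fill and just under 2× at $2,000, so the same move passes at one size and fails at the next.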
Binary asks "did it run?" Tradeable asks "could you have caught it?" The gap is brutal: our biggest binary "runner" carried a −66% drawdown in its first three hours — any realistic stop dies before the move — and needed weeks of holding through chop to reach its peak. Thin pools grade out entirely at larger fill sizes. A binary hit-rate flatters the teller; tradeable precision, stated per fill size with drawdown and time-to-peak attached, is the only number a trader can act on. We publish that, or nothing.
Every promising pattern gets a temporal split: discover it on older data, verify it on newer data it never saw. Patterns that fade out-of-sample are published as failed — including our own favorites. Signals run in shadow lanes, against real forward data, before they are allowed to touch anything a member sees.
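The temporal split can be sketched in a few lines. This is an illustration only: the `Obs` record, the cutoff parameter, and the choice of "hit rate among all tokens in the window" as the baseline are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Obs:
    ts: int       # observation time, any sortable unit
    signal: bool  # did the pattern fire on this token?
    hit: bool     # did the token reach the target (for example 2x)?


def lift(rows: List[Obs]) -> Optional[float]:
    """Hit rate among signal rows divided by the hit rate among all rows.

    Returns None when the ratio is undefined (no rows, no signal rows, or zero baseline).
    """
    flagged = [r for r in rows if r.signal]
    if not rows or not flagged:
        return None
    base = sum(r.hit for r in rows) / len(rows)
    if base == 0:
        return None
    return (sum(r.hit for r in flagged) / len(flagged)) / base


def temporal_split(rows: List[Obs], cutoff_ts: int) -> Tuple[Optional[float], Optional[float]]:
    """Lift on the older (discovery) window and on the newer (verification) window."""
    older = [r for r in rows if r.ts < cutoff_ts]
    newer = [r for r in rows if r.ts >= cutoff_ts]
    return lift(older), lift(newer)
```

A pattern whose first number is well above 1 and whose second number sits near 1 is the "faded out-of-sample" case: the edge existed only in the data it was discovered on.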
| Claim we tested | Verdict | What we measured |
|---|---|---|
| "Lots of smart wallets buying = bullish" | INVERTED — holds OOS | 10+ tracked smart wallets was an anti-signal: 0.65× the baseline 2× rate in-sample, 0.26× out-of-sample. The crowd is exit liquidity. n=29,514 tokens. |
| "The first smart wallet in, early, predicts runners" | FAILED OOS | Our own favorite candidate edge. 2.44× lift in-sample collapsed to 0.91× out-of-sample — an artifact. Published as failed, never shipped. |
| "KOLs with good track records keep winning" | NO PERSISTENCE | Leak-free split (classify on early trades, measure on later ones): good-track wallets hit 2× on 38.5% of future buys, bad-track 35.0%. Past performance told you almost nothing. n=424 splittable trades. |
| "A high engine score means it will run" | NO — and we say so | Tokens scoring 65+ ran at 0.98× the baseline — no predictive edge over random. The scorer is a safety filter, not a crystal ball, and we froze it for prediction claims until that changes. |
| "Your gates are blocking winners" | GATES PROTECT | 30-day audit, n=386 eligible tokens: gate-blocked tokens were 3% tradeable vs 9% for those that passed. No gate group beat the control at meaningful sample size. |
| "The revival, not the first fire, is the tradeable event" | RETIRED — was an artifact | Every detected "revival" followed a fake death: a liquidity-feed dropout the grader misread as a rug. With artifact rows excluded, zero revival episodes remain. We had started building on this pattern; the audit killed it first. |
| "Our own grading data is trustworthy" | NO — verified, corrected, test-locked | 2026-07-04 audit: the liquidity feed emits 0.0 on pool migration while trading continues; the grader read that as death and mis-graded fires (one "round-trip" later ran ~700×). Fixed, re-graded, regression-locked. The measurement layer measures itself too. |
This table is the product. Most of it says no edge found — that is what an honest measurement layer looks like in a market where ~85% of tokens die. When something finally survives every rule on this page, you'll know exactly how much scrutiny it beat.
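The leak-free split behind the persistence row (classify on early trades, measure on later ones) can be sketched as follows. The half-and-half split per wallet and the 50% cutoff for a "good" track record are assumptions picked for illustration; the table does not state the exact values used.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Optional, Tuple

Trade = Tuple[str, int, bool]  # (wallet, timestamp, reached the target?)


def persistence_split(
    trades: Iterable[Trade], good_cutoff: float = 0.5
) -> Tuple[Optional[float], Optional[float]]:
    """Label each wallet from its early trades only, then score its later trades.

    Returns (later hit rate of "good" wallets, later hit rate of "bad" wallets).
    Wallets with fewer than two trades cannot be split and are skipped.
    """
    by_wallet: Dict[str, List[Tuple[int, bool]]] = defaultdict(list)
    for wallet, ts, hit in trades:
        by_wallet[wallet].append((ts, hit))

    later: Dict[str, List[bool]] = {"good": [], "bad": []}
    for rows in by_wallet.values():
        if len(rows) < 2:
            continue
        rows.sort()  # chronological order
        half = len(rows) // 2
        early, late = rows[:half], rows[half:]
        early_rate = sum(hit for _, hit in early) / len(early)
        label = "good" if early_rate >= good_cutoff else "bad"
        # The label never sees the later trades, so the measurement cannot leak.
        later[label].extend(hit for _, hit in late)

    def rate(xs: List[bool]) -> Optional[float]:
        return sum(xs) / len(xs) if xs else None

    return rate(later["good"]), rate(later["bad"])
```

If the two returned rates are close, the early track record carried little information about later trades, which is the pattern the persistence row reports.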
Discovery tools market their winners and bury their misses. Before any external source influences anything here, we capture its entire flag stream — every token it surfaces, hourly, winners and corpses alike — and grade that stream with the same rules above. If a tool's flags don't beat the base rate, it isn't intelligence; it's a lagging spotlight. Results get published either way.
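One way to ask "does this flag stream beat the base rate?" is sketched below. The exact binomial tail is a technique chosen for this example, not necessarily the test used on the platform, and the function names are hypothetical. The input must be the whole captured stream, winners and corpses alike, already graded by the rules above.

```python
from math import comb
from typing import Dict, Sequence


def binom_tail(k: int, n: int, p: float) -> float:
    """P(X >= k) for X ~ Binomial(n, p), computed exactly."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def grade_stream(outcomes: Sequence[bool], base_rate: float) -> Dict[str, float]:
    """Compare a tool's full flag stream against the market base rate.

    outcomes: one graded result per flagged token (True = tradeable), nothing dropped.
    base_rate: fraction of all eligible tokens that grade tradeable over the same window.
    """
    n = len(outcomes)
    hits = sum(outcomes)
    hit_rate = hits / n
    return {
        "n": n,
        "hit_rate": hit_rate,
        "lift": hit_rate / base_rate,
        # Chance of seeing at least this many hits if the tool had no edge at all.
        "p_value": binom_tail(hits, n, base_rate),
    }
```

A lift near 1, or a lift above 1 with a large p-value on a small sample, is the "lagging spotlight" case: the stream is not distinguishable from picking tokens at the base rate.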
See the grading applied to real tokens — including the ones that made us look wrong.
See the Receipts