Depth, not breadth: what a stock-pattern library taught me about systems

I spent a few months building a library of a particular kind of event in the stock market — the rare cases where a small company’s shares run up several hundred percent in a short window and then collapse. Not to trade them in the moment; I’d given up on that for reasons I’ll get to. I wanted a catalog of how these things actually unfold, so I could study the pattern before pretending I knew what it meant.

The way you build a catalog like that is to scan the whole market for the pattern and collect what you find. So that’s what I built first: a scanner that swept thousands of tickers across every major exchange and added anything matching the profile — low-float microcaps that spike on thin volume and then unwind. It worked. And then it exposed a systems mistake I’d been making the whole time: I was optimizing the wrong axis.
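A minimal sketch of what that kind of profile check could look like, in Python. It is not the scanner described here: the function name, the `Event` fields, and every threshold (a 300% run-up, a 50% drawdown, a 10-bar window) are assumptions chosen for illustration, and it looks only at closing prices, so the float and volume filters are left out.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Event:
    ticker: str
    peak_index: int      # position of the peak in the price series
    run_up_pct: float    # percent gain from the pre-peak low to the peak
    drawdown_pct: float  # percent drop from the peak to the post-peak low


def find_spike_and_collapse(
    ticker: str,
    closes: Sequence[float],
    window: int = 10,                # hypothetical: bars to look back and ahead
    min_run_up_pct: float = 300.0,   # hypothetical threshold
    min_drawdown_pct: float = 50.0,  # hypothetical threshold
) -> Optional[Event]:
    """Return the first spike-then-collapse event in `closes`, or None."""
    for peak in range(1, len(closes) - 1):
        start = max(0, peak - window)
        end = min(len(closes), peak + 1 + window)
        # Only consider bars that are the highest close in their neighborhood.
        if closes[peak] < max(closes[start:end]):
            continue
        base = min(closes[start:peak])
        if base <= 0:
            continue
        run_up = (closes[peak] - base) / base * 100.0
        if run_up < min_run_up_pct:
            continue
        trough = min(closes[peak + 1:end])
        drawdown = (closes[peak] - trough) / closes[peak] * 100.0
        if drawdown >= min_drawdown_pct:
            return Event(ticker, peak, run_up, drawdown)
    return None
```

Calling `find_spike_and_collapse("XYZ", closes)` on one ticker's closing prices returns either one `Event` or `None`; a full sweep would be this call repeated over every ticker in the universe.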

The day the scanner stopped finding things

For the first while, every run added new events and it felt like progress. Then the curve flattened. A full sweep of the entire market would surface one new event. Then sometimes zero. I’d built a machine whose entire job was to find more, and it had essentially run out of more to find, because the pattern I was hunting is genuinely rare and I’d already caught most of the historical instances.

My instinct, which I suspect is most people’s, was to widen the net. Loosen the thresholds. Add more exchanges. Scan more often. Get more.

That instinct was wrong, and why it was wrong is the whole point. The scanner wasn’t underperforming. It had succeeded. Within the universe and history I could actually collect, the remaining undiscovered cases were scarce — running the sweep harder couldn’t manufacture events that hadn’t happened. Each additional scan was burning effort to add a row or two to a collection that was already nearly complete. I was measuring the system by how much it collected, and by that measure it looked stalled, so I reached for the lever I knew how to push.
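The flattening curve can be made explicit by counting how many genuinely new events each sweep contributes. The sketch below assumes events are identified by a `(ticker, date)` pair and treats "at most one new event in each of the last three sweeps" as saturated; both choices, and the function names, are illustrative rather than anything the original scanner did.

```python
from typing import Iterable, List, Set, Tuple

EventKey = Tuple[str, str]  # hypothetical identity: (ticker, peak date)


def merge_sweep(catalog: Set[EventKey], found: Iterable[EventKey]) -> int:
    """Add unseen events to the catalog in place; return how many were new."""
    new = set(found) - catalog
    catalog.update(new)
    return len(new)


def breadth_saturated(new_per_sweep: List[int], recent: int = 3, max_new: int = 1) -> bool:
    """True when each of the last `recent` sweeps added at most `max_new` events."""
    if len(new_per_sweep) < recent:
        return False  # not enough history to judge
    return all(n <= max_new for n in new_per_sweep[-recent:])
```

Recording the return value of `merge_sweep` after every run gives the yield curve directly. Once `breadth_saturated` turns true, another sweep is unlikely to buy much.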

The actual value wasn’t in how many events I had. It was in how much I understood about each one. And on that axis, the library was barely started.

Breadth was done. Depth had barely started.

Each event in my catalog was, at that point, a thin record: a ticker, a date, a peak, a percentage. That tells you almost nothing about why it happened or whether the next one will rhyme. The questions that actually mattered were depth questions, asked of events I already had:

How much of the company was even available to trade at the moment it peaked — a tiny float that a little buying can rip upward, or a large one? Did the company quietly issue new shares right into the spike, which is sometimes a big part of the story? Was there a single catalyst, a filing or a piece of news, and what kind? Did the run happen in one violent move or several? None of those answers required finding a new event. They required going deeper into the ones in front of me.

So I stopped scanning and started enriching. I went back through the existing catalog and added the float at the moment of the peak, then the share-issuance timing, then the catalyst behind each one. Same events, much richer records — a thin row of ticker/date/peak/percent became something that could answer questions instead of just listing occurrences. The breadth had been finished for weeks. I just hadn’t noticed, because I was watching the wrong number.
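As a rough sketch of the difference between the thin row and the enriched one, the record below keeps the four original fields and adds the three depth fields as optional values that start out empty. The class name, field names, and types are assumptions for illustration, not the catalog's actual schema.

```python
from dataclasses import dataclass, replace
from typing import List, Optional


@dataclass(frozen=True)
class EventRecord:
    # Breadth: the thin row the scanner produced.
    ticker: str
    peak_date: str
    peak_price: float
    run_up_pct: float
    # Depth: filled in later, one pass at a time. None means "not researched yet".
    float_at_peak: Optional[int] = None    # shares available to trade at the peak
    issuance_timing: Optional[str] = None  # when new shares were issued, if any
    catalyst: Optional[str] = None         # filing, news item, or other trigger


DEPTH_FIELDS = ("float_at_peak", "issuance_timing", "catalyst")


def enrich(record: EventRecord, **known) -> EventRecord:
    """Return a copy of `record` with some depth fields filled in."""
    unexpected = set(known) - set(DEPTH_FIELDS)
    if unexpected:
        raise ValueError(f"not depth fields: {sorted(unexpected)}")
    return replace(record, **known)


def missing_depth(record: EventRecord) -> List[str]:
    """List the depth fields that are still empty for this event."""
    return [name for name in DEPTH_FIELDS if getattr(record, name) is None]
```

Keeping unknown values as `None` rather than a guess means `missing_depth` always shows how much enrichment is left for each event.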

The general trap

The mistake wasn’t specific to markets.

Most systems we build have a breadth dimension and a depth dimension, and breadth is almost always the one that’s easier to measure and easier to push: how many rows, users, files, items collected. So that’s what we keep pushing, well past the point where it’s paying anything, because the number still moves a little and a number that moves feels like progress.

I didn’t notice breadth had saturated until I was weeks past the point where it mattered. The hard part was noticing the moment another sweep bought one row instead of a hundred. The right response at that point is not “scan harder.” It is to stop scanning and go deeper on what you already have. That switch is uncomfortable because the depth axis has no satisfying counter ticking up. Enriching records is slow and quiet and doesn’t make the collection look bigger. It made the catalog more useful. My metrics just weren’t built to notice that.
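One way to give the depth axis a counter of its own is to measure how many of the enrichment fields are actually filled in across the collection. This is a sketch under assumptions: records are plain dictionaries, the field names are hypothetical, and an empty field is represented as `None` or a missing key.

```python
from typing import Any, Dict, List, Sequence


def depth_coverage(records: List[Dict[str, Any]], depth_fields: Sequence[str]) -> float:
    """Fraction of depth fields filled in across all records, from 0.0 to 1.0."""
    total = len(records) * len(depth_fields)
    if total == 0:
        return 0.0
    filled = sum(
        1
        for record in records
        for name in depth_fields
        if record.get(name) is not None
    )
    return filled / total
```

Tracked next to `len(records)`, this number keeps moving while enrichment is under way even when the row count does not change, which is exactly when the breadth metric goes quiet.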

A second discipline: don’t pretend you have data you don’t

There’s a related habit this project drilled into me, and it belongs in any honest system. My data feed ran on a delay of roughly fifteen minutes, the standard lag for the data tier I was on. For a long time I kept trying to engineer around that to catch these moves as they happened. I finally accepted that the delay made live reaction pointless for me: by the time a move showed up clearly in my delayed feed, the part I had been trying to study had usually already happened, and any “alert” I built would just be inviting me to buy a thing that had already peaked.

So I redesigned the entire system around what the data could honestly support: after-the-close analysis, pattern study, recognizing the shape for next time. Not live trading I had no real ability to do. Once I admitted the feed was delayed, the system got less exciting but more useful.
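The arithmetic behind that decision is small enough to write down. The sketch below takes the fifteen-minute figure from above and assumes, for illustration only, that a move has a known start time and peak time; the function names are hypothetical.

```python
from datetime import datetime, timedelta

FEED_DELAY = timedelta(minutes=15)  # "roughly fifteen minutes" on this data tier


def first_visible(event_time: datetime, delay: timedelta = FEED_DELAY) -> datetime:
    """Earliest moment a delayed feed can show something that happened at event_time."""
    return event_time + delay


def alert_arrives_after_peak(
    move_start: datetime, peak_time: datetime, delay: timedelta = FEED_DELAY
) -> bool:
    """True when the start of the move only becomes visible at or after the peak."""
    return first_visible(move_start, delay) >= peak_time
```

Any move that goes from start to peak in less time than the delay can never be caught live on such a feed, no matter how the alerting is engineered.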

I had been rewarding the scanner for getting bigger, even after bigger had stopped helping. I still run the scanner once in a while; it almost never finds anything new now, and that used to feel like failure. It isn’t. It just means breadth is done, and the catalog is finally deep enough to answer the questions I’m actually asking.


About the author

Fillip Kosorukov is a published researcher in behavioral psychology and substance use intervention — co-author of a peer-reviewed study on protective behavioral strategies and motivational interviewing (Journal of Substance Use, 2023; PMID: 37275205), BS Psychology (Summa Cum Laude), University of New Mexico, 2020. He now applies a behavioral-science lens to applied decision-making, market-data systems, and founder-led technology products.

