Six Market Microstructure Signals That Fire Before the Price Print: A Practitioner's Execution Quality Architecture

Ariel Silahian

HFT Systems Architect & Consultant | 20+ years architecting high-frequency trading systems. Author of "Trading Systems Performance Unleashed" (Packt, 2024). Creator of VisualHFT.

I help financial institutions architect high-frequency trading systems that are fast, stable, and profitable.

>> Learn more about what I do:
https://hftAdvisory.com

>> Your execution logs contain $200K+ in recoverable edge.
>> Microstructure Diagnostics — one-time audit, 3-5 day turnaround
https://hftadvisory.com/microstructure-diagnostics

The Gap Between When the Move Begins and When It Shows Up on Your Report

Every execution quality report I have reviewed in the past several years shares the same structural limitation: the data starts at the print. Effective spread, implementation shortfall, price reversion at T+30: all of these are measured from the moment the trade hits the tape. The cascade that produced that fill is invisible to the report.

This is an instrumentation design choice that most post-trade analytics platforms made deliberately, because their buyer is compliance and regulatory reporting, not real-time execution management. The SEC’s amended Rule 605 (compliance deadline extended to August 1, 2026 per SEC Release No. 34-104147) requires execution-quality disclosure including effective spread and time-to-execution in millisecond increments. All of it is post-trade. That is the regulatory ask. The question of what fires in the one second before the print is entirely outside that scope.

The six signals described in this article operate in the T+0 to T+1s window, where T+0 is the first observable repositioning in the book and the print lands at roughly T+1s. They are measurable, instrumentable, and in several cases well-established in academic microstructure literature. What is rarer is a single architectural description of how they sequence, what each one tells you about information flow in the book, and where the academic findings separate cleanly from practitioner-calibrated thresholds.

That is what this article covers. The diagnostic protocol is at the end. If your current instrumentation cannot walk back one second from a bad fill and tell you which signals fired and in what order, the gap this article describes is yours.

T+0 to T+10ms: The Cancellation Layer

The first signal in the cascade is an absence: resting orders disappearing from one side of the book.

Before a directional move, informed participants reposition their resting quotes. They cancel from the side they expect to be adversely selected against, and in some cases resubmit at a less exposed price or on a different venue. The result is cancel-side asymmetry: one side of the book is draining limit orders faster than the other, while the spread itself has not yet moved and no trade has printed.

Foucault, Hombert, and Roşu (2016, “News Trading and Speed,” Journal of Finance 71(1):335–382) document how fast informed speculators reposition ahead of news: they cancel and resubmit before the information becomes public, extracting value from slower participants who have not yet updated their quotes. Hendershott and Riordan (2013, “Algorithmic Trading and the Market for Liquidity,” JFQA 48:1001–1024) show that algorithmic order flow is correlated with value-relevant information and imposes adverse selection costs on slower participants, with the informational advantage expressed primarily through order management speed rather than trade aggression. Dahlström, Hagströmer, and Nordén (2024, “The Determinants of Limit Order Cancellations,” Financial Review 59:181–201) establish that depth changes at the best bid/offer and queue position are measurable determinants of cancellation decisions, which means the cancellation event itself is a signal, not just noise.

In practice, I observe cancel-side asymmetry of 3–5x prior to a directional move on single-venue books: one side cancels at three to five times the rate of the other, normalized to the resting volume on each side. That is a practitioner-calibrated threshold, not a number from the academic literature. It is consistent with the repositioning mechanics described above.
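A minimal sketch of how that ratio could be tracked, assuming cancel events arrive as a timestamp plus a side and that resting volume per side is available from the book. The class name, the field names, and the 100ms default window are illustrative assumptions, not part of any specific feed handler.

```python
from collections import deque


class CancelAsymmetry:
    """Rolling cancel-rate ratio between the two sides of one book (illustrative)."""

    def __init__(self, window_ns=100_000_000):
        self.window_ns = window_ns
        self.events = deque()               # (timestamp_ns, side), oldest first
        self.counts = {"bid": 0, "ask": 0}  # cancels currently inside the window

    def _expire(self, now_ns):
        # Drop cancels that have aged out of the rolling window.
        while self.events and now_ns - self.events[0][0] > self.window_ns:
            _, side = self.events.popleft()
            self.counts[side] -= 1

    def on_cancel(self, ts_ns, side):
        # Assumes timestamps arrive in non-decreasing order.
        self._expire(ts_ns)
        self.events.append((ts_ns, side))
        self.counts[side] += 1

    def ratio(self, now_ns, resting_bid, resting_ask):
        """Return (draining_side, ratio): heavier normalized cancel rate over the lighter."""
        self._expire(now_ns)
        bid_rate = self.counts["bid"] / resting_bid if resting_bid > 0 else 0.0
        ask_rate = self.counts["ask"] / resting_ask if resting_ask > 0 else 0.0
        if bid_rate == ask_rate:
            return None, 1.0
        side = "bid" if bid_rate > ask_rate else "ask"
        low, high = min(bid_rate, ask_rate), max(bid_rate, ask_rate)
        return side, (high / low if low > 0 else float("inf"))
```

A reading in the 3–5x band on one side, with the spread unchanged and no print yet, is the condition described above. The ratio says which side is draining, not why, so it belongs next to the other layers rather than in front of them.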

The second signal in this layer is refresh-latency widening. When a market maker’s quote-refresh cycle lengthens, the book goes stale before it is formally crossed. A signal I track from production instrumentation: the p95 of quote refresh latency widens measurably in the seconds before a directional move, while the p50 stays flat. This divergence is consistent with conditional quote management: the desk adjusting its refresh cadence based on inventory or information signals. Bouchaud, Gefen, Potters, and Wyart (2004, “Fluctuations and Response in Financial Markets,” Quantitative Finance 4(2):176–190) characterize this dynamic at the market structure level: limit orders mean-revert while market orders super-diffuse. A slow quote refresh tilts the book toward persistence rather than reversion. Persistence, in this context, is a precursor to a directional run.
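The p95-versus-p50 divergence can be sketched as a comparison between a baseline window and a recent window of quote-update timestamps. The function names, the nearest-rank percentile choice, and the `p95_mult` and `p50_tol` cut-offs are assumptions for illustration; they are not calibrated values from this article.

```python
import math


def percentile(values, pct):
    """Nearest-rank percentile of a non-empty sequence (pct is an integer 1..100)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def inter_quote_gaps(quote_ts_us):
    """Intervals between consecutive quote updates, in microseconds."""
    return [b - a for a, b in zip(quote_ts_us, quote_ts_us[1:])]


def tail_widening(baseline_ts_us, recent_ts_us, p95_mult=2.0, p50_tol=0.2):
    """Flag a recent window whose p95 gap has widened while its p50 stays near baseline.

    p95_mult and p50_tol are hypothetical thresholds chosen for the example.
    """
    base = inter_quote_gaps(baseline_ts_us)
    recent = inter_quote_gaps(recent_ts_us)
    b50, b95 = percentile(base, 50), percentile(base, 95)
    r50, r95 = percentile(recent, 50), percentile(recent, 95)
    p50_flat = abs(r50 - b50) <= p50_tol * b50
    p95_wide = r95 >= p95_mult * b95
    return {"p50_us": r50, "p95_us": r95, "fired": p50_flat and p95_wide}
```

Tracking the two percentiles separately is the point of the design: a mean or a median alone would hide the case where most refreshes are on schedule and only the slow tail is stretching.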

These two signals (cancel-side asymmetry and refresh-latency widening) are the earliest observable precursors in the cascade. They are also the signals most frequently absent from agency execution and smart order router inputs.

T+50ms to T+100ms: The Toxicity and Depth Layer

By T+50ms, the information has begun flowing through the tape in a measurable way, even if no large print has appeared. This is where flow toxicity and multi-level order book imbalance become the primary diagnostic instruments.

VPIN (Volume-synchronized Probability of Informed Trading) was introduced by Easley, López de Prado, and O’Hara (2011, “The Microstructure of the ‘Flash Crash’,” Journal of Portfolio Management 37(2):118–128) and formalized in their canonical paper (2012, “Flow Toxicity and Liquidity in a High-Frequency World,” Review of Financial Studies 25(5):1457–1493). The construct measures the imbalance between buyer-initiated and seller-initiated volume within standardized volume buckets, using that imbalance as a proxy for the probability that recent order flow is informed. The practical insight is that when informed traders are active, they tend to arrive on one side; uninformed flow is more balanced. A sustained VPIN elevation signals that one-directional informed flow is dominating recent volume.

In my operational use of VPIN, I trigger on values above approximately 0.7 sustained across 8 or more recent volume bars. Those are practitioner-calibrated parameters. The Easley et al. papers use percentile-based triggers and a larger bucket count, calibrated for the full-day volume profile rather than real-time intrabar sensing. My thresholds are tuned for single-venue real-time execution management; they would need recalibration on different instruments and venues.
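As a rough illustration of the mechanics, the sketch below fills fixed-size volume buckets, classifies each trade with the tick rule, and reports the average absolute buy/sell imbalance over the last 8 buckets. The tick rule is a simplification chosen for brevity (the Easley et al. papers use their own classification scheme), and the class name and the integer-size assumption are mine.

```python
from collections import deque


class RollingVPIN:
    """Volume-bucketed buy/sell imbalance over the last n buckets (illustrative)."""

    def __init__(self, bucket_volume, n_buckets=8):
        self.bucket_volume = bucket_volume
        self.imbalances = deque(maxlen=n_buckets)  # |buy - sell| per closed bucket
        self.buy = 0
        self.sell = 0
        self.last_price = None
        self.last_sign = 1  # assumption: the very first trade is treated as a buy

    def on_trade(self, price, size):
        # Tick rule: an uptick is a buy, a downtick is a sell, no change keeps the last sign.
        if self.last_price is not None:
            if price > self.last_price:
                self.last_sign = 1
            elif price < self.last_price:
                self.last_sign = -1
        self.last_price = price

        remaining = size  # sizes are assumed to be integer share or lot counts
        while remaining > 0:
            room = self.bucket_volume - (self.buy + self.sell)
            take = min(room, remaining)
            if self.last_sign > 0:
                self.buy += take
            else:
                self.sell += take
            remaining -= take
            if self.buy + self.sell >= self.bucket_volume:
                # Bucket is full: record its imbalance and start a new one.
                self.imbalances.append(abs(self.buy - self.sell))
                self.buy = 0
                self.sell = 0

    def value(self):
        """VPIN over the full window, or None until n_buckets have closed."""
        if len(self.imbalances) < self.imbalances.maxlen:
            return None
        return sum(self.imbalances) / (len(self.imbalances) * self.bucket_volume)
```

Comparing `value()` against 0.7 is one way to read the trigger described above. Bucket size drives everything here, which is why the thresholds do not transfer between instruments without recalibration.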

One counter-view is necessary here. Andersen and Bondarenko (2014, “VPIN and the Flash Crash,” Journal of Financial Markets) found that after controlling for volume and volatility, VPIN showed no incremental predictive power for future volatility in their sample. That finding limits the claim. VPIN is a flow-toxicity monitor, not a stand-alone directional predictor. I use it as one layer in a multi-signal stack, not as a primary signal. A recent 2026 study in Research in International Business and Finance (Vol. 81) extends VPIN to Bitcoin and finds it significantly predictive of price jumps in crypto markets, suggesting the construct retains signal value in fragmented, less-mature venues.

The companion signal at T+100ms is multi-level order book imbalance. Cont, Kukanov, and Stoikov (2014, “The Price Impact of Order Book Events,” Journal of Financial Econometrics 12(1):47–88) establish that price changes are primarily driven by order flow imbalance at the best bid and ask prices, a level-1 finding. The multi-level extension comes from Xu, Gould, and Howison (2019, “Multi-Level Order-Flow Imbalance in a Limit Order Book,” arXiv:1907.06230), who show that the out-of-sample goodness-of-fit of the OFI-price relationship improves with each additional price level incorporated into the imbalance calculation. That is the paper to cite when referencing 10-level depth shifts. The level-1 finding (Cont et al.) and the multi-level extension (Xu et al.) are separate contributions.
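For readers who want the arithmetic, here is a sketch of the per-level order flow imbalance between two consecutive book snapshots, following the level-1 event definition in Cont et al. and applying it level by level in the spirit of Xu et al. The tuple layout and function names are assumptions, and any scaling or normalization of the per-level values is left out.

```python
def level_ofi(prev, curr):
    """OFI contribution at one level between two snapshots.

    Each snapshot level is (bid_price, bid_qty, ask_price, ask_qty).
    Positive values mean net buying pressure at that level.
    """
    pb0, qb0, pa0, qa0 = prev
    pb1, qb1, pa1, qa1 = curr
    # Bid side: new size counts if the price held or improved,
    # old size is removed if the price held or fell.
    bid_flow = (qb1 if pb1 >= pb0 else 0) - (qb0 if pb1 <= pb0 else 0)
    # Ask side: mirror image, with added ask size counting as selling pressure.
    ask_flow = (qa1 if pa1 <= pa0 else 0) - (qa0 if pa1 >= pa0 else 0)
    return bid_flow - ask_flow


def multi_level_ofi(prev_book, curr_book):
    """Per-level OFI for books given as lists of levels, best level first."""
    return [level_ofi(p, c) for p, c in zip(prev_book, curr_book)]
```

With unchanged prices, a bid that grows from 10 to 15 while the ask shrinks from 10 to 6 yields +9 at that level: 5 units of added bid interest plus 4 units of withdrawn ask interest.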

In the cascade I instrument, a multi-level LOB shift across 10 or more depth levels at T+100ms, combined with VPIN crossing threshold at T+50ms, is a high-confidence pre-trade signal that a large directional print is imminent. Neither signal alone is sufficient. Together, they represent aligned information across flow toxicity (VPIN) and structural book imbalance (multi-level OFI), which is harder for noise to replicate.
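The "neither alone, both together" logic can be sketched as a simple gate. VPIN is unsigned, so in this sketch the direction comes from the depth layer and VPIN only decides whether the gate is open. The function names and the `min_levels` default are assumptions; `min_levels` should be set to whatever depth requirement the desk has calibrated.

```python
def depth_shift_direction(ofi_by_level, min_levels=5):
    """+1 or -1 if at least min_levels levels share a sign and outnumber the other sign, else 0."""
    up = sum(1 for x in ofi_by_level if x > 0)
    down = sum(1 for x in ofi_by_level if x < 0)
    if up >= min_levels and up > down:
        return 1
    if down >= min_levels and down > up:
        return -1
    return 0


def cascade_gate(vpin, ofi_by_level, vpin_threshold=0.7, min_levels=5):
    """Expected print direction when toxicity and depth agree, 0 otherwise."""
    if vpin is None or vpin <= vpin_threshold:
        return 0  # toxicity layer has not fired, so depth alone is not acted on
    return depth_shift_direction(ofi_by_level, min_levels)
```

For example, `cascade_gate(0.8, [5, 3, 2, 4, 1, 0, -1, 2, 0, 1])` returns 1, while the same book with a VPIN of 0.5 returns 0.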

T+500ms to T+1s: The Spread and Print Layer

By T+500ms, the leading signals have resolved into a structural shift in the book. The spread widens at the top, and within another 500ms the print lands.

The spread-widening at T+500ms is a confirmation, not a leading signal: adverse selection pressure has already been incorporated into market maker pricing. Glosten and Milgrom (1985, “Bid, Ask and Transaction Prices in a Specialist Market with Heterogeneously Informed Traders,” Journal of Financial Economics 14(1):71–100) establish the information-asymmetry origin of the bid-ask spread: the quoted spread compensates the market maker for the expected cost of trading against an informed counterparty. Kyle (1985, “Continuous Auctions and Insider Trading,” Econometrica 53(6):1315–1335) formalizes the lambda framework: lambda is the price impact per unit of net order flow, and market depth is its inverse, so depth serves as an inverse proxy for informed-flow intensity. When informed flow is elevated, depth decreases and the spread widens. The T+500ms spread signal is these mechanics playing out in observable data.

For execution purposes, the spread widening is too late to use as an avoidance signal. If your system is waiting for the top-of-book spread to widen before adjusting its order management strategy, it is reacting to the confirmation, not the cause. The cascade was signaling 500ms earlier through VPIN and multi-level OFI. This is the structural argument for instrumenting earlier in the timeline.
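The confirmation check itself is simple arithmetic. A sketch, assuming top-of-book prices as floats and a known tick size; the function names and the one-tick default are illustrative.

```python
def spread_bps(bid, ask):
    """Quoted spread in basis points of the mid price."""
    mid = (bid + ask) / 2.0
    return (ask - bid) / mid * 1e4


def widened_ticks(prev_bid, prev_ask, bid, ask, tick):
    """Change in quoted spread between two snapshots, in whole ticks."""
    return round(((ask - bid) - (prev_ask - prev_bid)) / tick)


def spread_confirmation(prev_bid, prev_ask, bid, ask, tick, min_ticks=1):
    """True when the spread has widened by more than min_ticks ticks."""
    return widened_ticks(prev_bid, prev_ask, bid, ask, tick) > min_ticks
```

Rounding to whole ticks keeps floating-point noise in the prices from flipping the comparison at the boundary.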

The print at T+1s is the last stage in the cascade. Queue position determines who receives which fill quality at that print, and queue position is itself a function of everything that preceded it. Moallemi and Yuan (2016, “A Model for Queue Position Valuation in a Limit Order Book,” SSRN 2996221) quantify the value of queue position. Their model finds that for some large tick-size stocks, the front-of-queue versus average-queue value difference is of the same order of magnitude as the half-spread. That is a meaningful but bounded finding. For institutional flow, the compounded effect of consistent queue-position degradation across thousands of fills represents a real execution-cost component, even if each individual instance is measured in fractions of a tick.

Hasbrouck (1993, “Assessing the Quality of a Security Market,” RFS 6(1):191–212) documented average transaction costs of approximately 26 basis points on an NYSE sample in that period. That figure is 30-year-old data on a very different market structure and should be read as directional context, not a current HFT benchmark. The relevant point is that execution cost decomposition has been an active research area for decades, and the pre-trade cascade I am describing is the part of that cost that most decomposition frameworks still do not reach.

What Production Desks Actually Compute, and What They Don’t

The following reflects patterns I have observed across production architectures. These are practitioner observations, not industry survey data.

HFT market makers treat VPIN and LOB imbalance as core risk management inputs. Both signals are native to their execution infrastructure, typically computed at sub-millisecond latency and used to adjust quote aggressiveness and inventory risk limits in real time. Cancel-side asymmetry is also monitored, though the specific threshold calibration varies by venue and instrument. This is the desk type for which the full cascade is most natively instrumented.

Agency execution desks and smart order routers tend to operate on depth and price snapshots at the best bid and ask. Multi-level OFI as a real-time execution input is less common. Cancel-asymmetry monitoring in real time is rarer still. Most SOR logic I have reviewed optimizes for static spread and available liquidity at level 1, with some venues incorporating intraday volume patterns. The cascade signals at T+50ms and T+100ms are typically not in the real-time decision loop.

Post-trade analytics platforms operate almost entirely in the post-print domain: effective spread, implementation shortfall, price impact measured at T+30m or T+1d, VWAP deviation. These platforms are execution-quality reporting tools, and the best of them are well-designed for that purpose. They are not designed to surface pre-trade cascade signals, and most desks do not use them as such.

Quant desks often use VPIN and OFI as post-hoc strategy calibration inputs: backtesting execution quality under different toxicity regimes, tuning participation rates based on historical VPIN distributions. Real-time VPIN as an execution gate is less common than its role as a research and calibration signal.

How the Cascade Behaves Across Venue Classes

The single-venue calibration described throughout this article is a starting point. In my experience advising desks operating across multiple venue classes, three architectural realities consistently complicate the portable-threshold assumption.

Opaque venues (dark pools, block crosses). In production observations across venues like POSIT, Liquidnet, and Sigma X, the T+0 cancel-side asymmetry signal is partially or fully unobservable when a venue does not publish cancel events in its feed. The cascade’s earliest precursor disappears. What remains is post-fill flow analysis and queue-position math on the fills that do report. This is an architectural constraint, not a calibration problem. Desks routing 30 to 40 percent of flow through dark venues should not expect the full six-signal stack to be available on that slice; the signal architecture needs to account for which signals each venue class actually exposes.

Crypto venues (24/7 fragmented CEX and DEX). Patterns I have observed across venues such as Binance, Coinbase Advanced, and Kraken suggest that VPIN baselines shift in ways that matter for threshold calibration. Contributing factors include tick-size variance across pairs, the absence of overnight gaps that typically anchor volume-bucket sizing in equities, and what appears to be a more concentrated set of active market-making participants at any given time. The 0.7 threshold and 8-bar window calibrated on single-venue equity books may not transfer directly. One observation that runs counter to what I expected: on several crypto books I have reviewed, the multi-level OFI signal appears sharper than on comparable equity venues, because full depth visibility extends further in the published feed. Whether that depth visibility advantage persists under stressed conditions is something I treat as venue-specific rather than generalizable.

Multi-venue OFI does not roll up cleanly. When execution splits across venues with millisecond-level latency spread, a VPIN cross on the primary venue does not arrive synchronously with the OFI shift on the secondary. The cascade fires per-venue, not portfolio-wide. Smart order routers that assume synchronized snapshot truth across venues, then aggregate at the parent-order level, can get caught between venues where the signal has fired on one leg and not yet on the other. The practical consequence is whipsawed routing decisions during the exact conditions where careful routing matters most.
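One way to make the per-venue nature of the cascade explicit is to keep signal state per venue and expose the disagreement, rather than aggregating first. The sketch below is a hypothetical illustration: the function name, the state labels, and the freshness rule are all assumptions.

```python
def venue_signal_state(last_fire_ns, now_ns, max_age_ns):
    """Classify venues by whether their signal fired recently.

    last_fire_ns maps venue name to the last fire timestamp in ns, or None if it never fired.
    Returns (state, fired, pending), where state is "quiet", "split" or "aligned".
    """
    fired = {
        venue
        for venue, ts in last_fire_ns.items()
        if ts is not None and 0 <= now_ns - ts <= max_age_ns
    }
    pending = set(last_fire_ns) - fired
    if not fired:
        state = "quiet"
    elif pending:
        state = "split"    # fired on some legs but not others: the whipsaw risk case
    else:
        state = "aligned"
    return state, fired, pending
```

A router that can see a "split" state can choose to hold or slow the lagging leg instead of treating the aggregate as a synchronized snapshot.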

The architectural takeaway from all three observations: instrumentation should be designed per venue class first, then aggregated across venues. The single-venue cascade is a building block, not a finished architecture. Venue-class translation carries its own threshold-calibration problem, and the desks I have seen handle it well treat each venue class as a distinct instrumentation problem rather than a parameter to adjust at the aggregation layer.

The practical implication is that the desks for whom the cascade matters most in real time (agency execution, mid-frequency quant executing into lit markets) are the ones least likely to have all six signals in their instrumentation stack. HFT market makers have them by necessity. Everyone else tends to meet the cascade at T+500ms or T+1s, where the structural shift is already priced in.

The Diagnostic Protocol: What to Instrument and What to Expect

The following is an instrumentation checklist for the full cascade. Each signal is listed with what to capture, at what granularity, and what the absence or presence of that signal tells you about fill quality.

1. Cancel-side asymmetry (T+0)
Capture: cancel event counts by side, normalized to resting order volume on each side, in a rolling 100ms window.
Resolution: individual cancel events with nanosecond timestamps.
Presence: a 3–5x imbalance on one side (practitioner threshold) in the 100ms before a fill is consistent with informed repositioning ahead of the print.
Absence: balanced cancel rates suggest the fill quality problem is downstream: queue position, venue selection, or timing, not pre-trade informed flow.

2. Refresh-latency p95 widening (T+0 to T+10ms)
Capture: quote-update timestamps from the feed, grouped by market participant identifier where available, tracking the distribution of inter-quote intervals.
Resolution: microsecond timestamping on quote updates.
Presence: p95 widening while p50 is flat in the seconds before a fill signals conditional quote management.
Absence: flat latency distribution suggests the market maker on the other side was not adjusting behavior ahead of the fill.

3. VPIN (T+50ms)
Capture: volume-synchronized buy/sell classification using Lee-Ready or tick-rule classification; bucket size calibrated to your typical daily volume. Compute the rolling imbalance across 8 recent buckets.
Resolution: per-trade classification, updated on each trade.
Presence: VPIN above 0.7 (practitioner threshold) sustained across 8 buckets in the period before a fill indicates elevated informed-flow probability.
Absence: VPIN in the 0.4–0.6 range prior to the fill suggests the fill cost was not primarily adverse-selection driven.

4. Multi-level LOB shift (T+100ms)
Capture: order book snapshots at levels 1 through 10 on both sides, with the OFI calculated per the Xu, Gould, and Howison (2019) multi-level extension. A net multi-level OFI shift of consistent sign across 5 or more levels constitutes a structural depth signal.
Resolution: full depth-of-book feed, 100ms or better.
Presence: a sustained multi-level shift in the direction of the eventual fill is the highest-confidence signal in the cascade.
Absence: flat multi-level OFI with elevated VPIN suggests toxicity is concentrated in large-size flow at the top of book rather than distributed across depth.

5. Top-of-book spread widening (T+500ms)
Capture: best bid and ask, quoted spread in basis points, rolling 10-second window.
Resolution: top-of-book feed, 1ms or better.
Presence: spread widening by more than one tick in the 500ms before a fill is confirmation that market makers have updated adverse-selection pricing. At this point in the cascade, it is confirmation, not a leading signal.
Absence of spread widening despite elevated VPIN and multi-level OFI suggests the fill arrived before market makers fully repriced, a relatively favorable execution environment.

6. Print and queue position (T+1s)
Capture: fill price vs. arrival price, fill size vs. resting depth at fill level, time-to-fill from order submission.
Resolution: nanosecond order and fill timestamps.
Presence of slippage beyond the spread is consistent with fill arriving at queue positions behind significant resting size, or with market impact from your own order.
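The capture fields in item 6 reduce to a few lines of arithmetic. A sketch with hypothetical function names, in which slippage is signed so that a positive number is always a cost to the order:

```python
def slippage_bps(side, arrival_px, fill_px):
    """Signed slippage versus arrival price in basis points; positive means worse than arrival."""
    sign = 1.0 if side == "buy" else -1.0
    return sign * (fill_px - arrival_px) / arrival_px * 1e4


def fill_share_of_depth(fill_size, resting_size):
    """Fill size as a fraction of resting depth at the fill level, or None if depth is unknown or zero."""
    if not resting_size:
        return None
    return fill_size / resting_size


def slippage_beyond_spread(side, arrival_px, fill_px, quoted_spread_bps):
    """True when slippage exceeds the quoted spread observed at arrival."""
    return slippage_bps(side, arrival_px, fill_px) > quoted_spread_bps
```

A buy that arrives at 100.00 and fills at 100.05 shows roughly 5 bps of slippage. Against a 2 bps quoted spread, that is the "beyond the spread" case the checklist points at.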

The falsifiability test: Take 10 fills from last month where slippage exceeded your average. Walk back one second from each fill timestamp. Which of the six signals fired? Which one fired first? If VPIN was flat, multi-level OFI was balanced, and cancel-side symmetry was intact, and you still had a bad fill, the problem is your queue position management or venue selection, not your signal stack. If VPIN crossed threshold and LOB imbalance shifted before your order was in the market, the information was available and the question becomes why it was not in your execution decision.
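The walk-back can be sketched as a query over a log of signal firings. This assumes each signal writes a `(name, timestamp_ns)` record when it fires; the record layout, the signal names, and the function name are hypothetical.

```python
def walk_back(fill_ts_ns, signal_events, lookback_ns=1_000_000_000):
    """List the signals that fired in the lookback window before a fill, earliest first.

    signal_events is an iterable of (signal_name, timestamp_ns) records.
    Returns [(signal_name, lead_time_ns)], using each signal's first firing in the window.
    """
    first_fire = {}
    for name, ts in signal_events:
        if fill_ts_ns - lookback_ns <= ts <= fill_ts_ns:
            if name not in first_fire or ts < first_fire[name]:
                first_fire[name] = ts
    ordered = sorted(first_fire.items(), key=lambda item: item[1])
    return [(name, fill_ts_ns - ts) for name, ts in ordered]
```

An empty result for a bad fill points toward queue position or venue selection. A non-empty result gives both the order in which the layers fired and how much lead time each one offered.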

Where I have not closed the loop is multi-venue calibration. The thresholds I use (the 3–5x cancel asymmetry, the 0.7 VPIN trigger, the 8-bar window) are calibrated on single-venue books in equities. How these parameters translate across fragmented crypto venues, dark pools with different queue mechanics, or futures books with mixed participant types is something I treat as venue-specific tuning rather than portable calibration. If you have run this stack in production across multiple venue types and found stable parameters, you have the piece that is missing here: that calibration is the part of this architecture that is hardest to document from outside a specific venue’s data.

One open-source tool that surfaces these signals in real time is VisualHFT (Apache-2.0, github.com/VisualHFT/VisualHFT, 1,100+ stars on GitHub). It was built as practitioner instrumentation for exactly this kind of pre-trade cascade monitoring. The visualhft.com blog has implementation deep-dives if that is the starting point you want.

This article was originally shared as a LinkedIn post with a 60-second video walking through the cascade signal by signal.
