Natural Language Processing for Feature Creation from Financial News and Filings

From TradingHabits, the trading encyclopedia · 7 min read · February 28, 2026

High-frequency trading (HFT) is a game of speed and information. In the time it takes to blink, an HFT firm can execute thousands of trades. To succeed in this environment, traders need to be able to process vast amounts of data and make decisions in a fraction of a second. This requires a deep understanding of market microstructure, the study of how assets are traded and how prices are formed. This article explores the world of microstructure feature engineering, focusing on two key concepts: order book imbalance and trade flow.

The Order Book: A Window into the Market

The order book is a real-time record of all the buy and sell orders for a particular asset. It provides a wealth of information about the supply and demand for the asset at different price levels. By analyzing the order book, traders can get a sense of the market's direction and identify potential trading opportunities.

One of the most important features that can be extracted from the order book is the order book imbalance (OBI). The OBI measures the difference between the volume of resting buy orders (bids) and resting sell orders (asks), either at the best quotes or summed over several price levels. A positive OBI indicates more resting buying interest than selling interest and is often read as a short-term bullish signal. Conversely, a negative OBI indicates more selling interest than buying interest and is often read as a bearish signal. Because resting orders can be cancelled before they trade, the OBI is best treated as one input among several rather than as a signal on its own.
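To make the definition concrete, here is a minimal sketch of an OBI calculation. It assumes the book is available as lists of (price, size) pairs sorted from the best price outward, and it divides the bid/ask volume difference by the total volume so the result falls between -1 and +1. That normalization is a common convention rather than something this article prescribes, and the function name and `depth` parameter are illustrative.

```python
from typing import Sequence, Tuple

Level = Tuple[float, float]  # (price, size); assumed sorted best price first


def order_book_imbalance(
    bids: Sequence[Level], asks: Sequence[Level], depth: int = 1
) -> float:
    """Normalized imbalance over the top `depth` levels, in [-1, 1].

    Assumption: (bid volume - ask volume) / (bid volume + ask volume).
    """
    bid_vol = sum(size for _, size in bids[:depth])
    ask_vol = sum(size for _, size in asks[:depth])
    total = bid_vol + ask_vol
    if total == 0:
        return 0.0  # empty book: treat as balanced
    return (bid_vol - ask_vol) / total


# Hypothetical snapshot: more size resting on the bid than on the ask.
bids = [(99.99, 300.0), (99.98, 500.0)]
asks = [(100.01, 100.0), (100.02, 500.0)]

obi_top = order_book_imbalance(bids, asks, depth=1)   # (300 - 100) / 400
obi_deep = order_book_imbalance(bids, asks, depth=2)  # (800 - 600) / 1400
```

In this snapshot the top-of-book imbalance is 0.5, while the two-level imbalance is much closer to zero, which illustrates why the choice of depth matters.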

Trade Flow: Following the Smart Money

Trade flow is another important concept in market microstructure. It refers to the sequence of trades executed in the market. By analyzing trade flow, traders try to infer which side is initiating trades and whether the initiators are likely to be informed or uninformed. For example, a large buy order that executes at the ask price may come from an informed trader who has positive information about the asset. Conversely, a small sell order that executes at the bid price is more likely to come from an uninformed trader who is simply looking to liquidate a position. These are tendencies rather than rules: trade size and aggressor side are observable, but the trader's information is not.
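The idea of reading direction from where a trade executes can be sketched with a simple quote rule: a trade at or above the prevailing ask is labeled buyer-initiated, and a trade at or below the prevailing bid is labeled seller-initiated. The `Trade` record, its field names, and the helper functions below are assumptions made for illustration. The sketch classifies direction only and does not attempt to say whether a trader is informed.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Trade:
    price: float
    size: float
    bid: float  # best bid prevailing when the trade printed (assumed known)
    ask: float  # best ask prevailing when the trade printed (assumed known)


def trade_sign(trade: Trade) -> int:
    """Quote rule: +1 buyer-initiated, -1 seller-initiated, 0 unclassified."""
    if trade.price >= trade.ask:
        return 1
    if trade.price <= trade.bid:
        return -1
    return 0  # executed inside the spread; left unclassified in this sketch


def signed_volume(trades: Iterable[Trade]) -> float:
    """Net trade flow: buy-initiated size minus sell-initiated size."""
    return sum(trade_sign(t) * t.size for t in trades)


# Hypothetical tape: a large buy at the ask, a small sell at the bid,
# and one print inside the spread.
trades = [
    Trade(price=100.01, size=500.0, bid=99.99, ask=100.01),
    Trade(price=99.99, size=50.0, bid=99.99, ask=100.01),
    Trade(price=100.00, size=20.0, bid=99.99, ask=100.01),
]
net_flow = signed_volume(trades)  # 500 - 50 + 0
```

Trades inside the spread are left unclassified here. Other conventions exist for handling them, and which one fits depends on the data feed.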

By combining information about the order book imbalance and the trade flow, traders can get a much more complete picture of the market. For example, if we see a positive OBI and a large number of buy orders being executed at the ask price, resting liquidity and executed trades both point the same way, which is a stronger indication that the price is likely to go up than either measure alone. Conversely, if we see a negative OBI and a large number of sell orders being executed at the bid price, the two measures agree that the price is likely to go down.
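The combination described above can be written as a small agreement rule that only takes a view when the OBI and the net trade flow point the same way. This is a sketch under stated assumptions: the function name and both thresholds are hypothetical placeholders that would need to be tuned per market, not values suggested by this article.

```python
def agreement_signal(
    obi: float,
    net_flow: float,
    obi_threshold: float = 0.2,   # hypothetical cutoff on normalized OBI
    flow_threshold: float = 0.0,  # hypothetical cutoff on signed volume
) -> int:
    """Return +1 (bullish), -1 (bearish), or 0 (no agreement)."""
    if obi > obi_threshold and net_flow > flow_threshold:
        return 1
    if obi < -obi_threshold and net_flow < -flow_threshold:
        return -1
    return 0


bullish = agreement_signal(obi=0.5, net_flow=450.0)     # both positive
bearish = agreement_signal(obi=-0.4, net_flow=-300.0)   # both negative
conflict = agreement_signal(obi=0.5, net_flow=-300.0)   # measures disagree
```

Returning 0 when the two measures disagree keeps the rule conservative: it expresses no view rather than letting one input override the other.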

Feature Engineering for HFT

The concepts of order book imbalance and trade flow can be used to engineer a wide variety of features for HFT models. For example, we can create features that measure the OBI at different levels of the order book, or that measure the size and direction of the trade flow over different time horizons. We can also create features that measure the interaction between the OBI and the trade flow. For example, we could create a feature that measures the OBI at the time of a large trade.
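The three kinds of features mentioned above (OBI at several depths, signed trade flow over several horizons, and an OBI-at-large-trade interaction) could be assembled along the following lines. Everything here is illustrative: the feature names, the depth and horizon choices, the `large_size` cutoff, and the use of the current book snapshot as a stand-in for the book at the time of the last trade are all assumptions.

```python
from typing import Dict, Sequence, Tuple

Level = Tuple[float, float]        # (price, size), best price first
SignedTrade = Tuple[float, float]  # (timestamp in seconds, signed size)


def obi(bids: Sequence[Level], asks: Sequence[Level], depth: int) -> float:
    """Normalized imbalance over the top `depth` levels (assumed convention)."""
    bid_vol = sum(size for _, size in bids[:depth])
    ask_vol = sum(size for _, size in asks[:depth])
    total = bid_vol + ask_vol
    return (bid_vol - ask_vol) / total if total else 0.0


def build_features(
    bids: Sequence[Level],
    asks: Sequence[Level],
    trades: Sequence[SignedTrade],
    now: float,
    depths: Sequence[int] = (1, 2),
    horizons: Sequence[float] = (1.0, 10.0),
    large_size: float = 400.0,  # hypothetical "large trade" cutoff
) -> Dict[str, float]:
    feats: Dict[str, float] = {}
    # OBI measured at different depths of the book.
    for d in depths:
        feats[f"obi_l{d}"] = obi(bids, asks, d)
    # Net signed trade flow over different lookback horizons.
    for h in horizons:
        feats[f"flow_{h:g}s"] = sum(s for t, s in trades if now - h < t <= now)
    # Interaction: top-of-book OBI, kept only if the latest trade was large.
    last_size = trades[-1][1] if trades else 0.0
    feats["obi_l1_at_large_trade"] = (
        obi(bids, asks, 1) if abs(last_size) >= large_size else 0.0
    )
    return feats


bids = [(99.99, 300.0), (99.98, 500.0)]
asks = [(100.01, 100.0), (100.02, 500.0)]
trades = [(0.5, -50.0), (9.2, 20.0), (9.8, 500.0)]
feats = build_features(bids, asks, trades, now=10.0)
```

The result is a flat dictionary of named features, which is a convenient shape for feeding a model or logging. A production system would more likely update these values incrementally than rescan the trade list on every call.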

The key to successful feature engineering for HFT is to be creative and to experiment with different ideas. There is no single set of features that will work for all markets and all trading strategies. The best features will be those that are tailored to the specific characteristics of the market and the trading strategy being used.

Conclusion

Microstructure feature engineering is an important component of any successful HFT strategy. By extracting information from the order book and the trade flow, traders can get a much more complete picture of the market and identify trading opportunities that would not be visible from the price data alone. While the world of HFT is constantly evolving, the concepts of order book imbalance and trade flow are likely to remain at the heart of many successful trading strategies for years to come.