Module 1 · Chapter 12 · Lesson 9

Monitoring and Alerting Infrastructure

Setting Up Your Trading Infrastructure
The Black Book of Day Trading Strategies

Real-time Performance Monitoring

Monitor mean reversion strategies continuously. Feed real-time data into a dashboard that shows key performance indicators (KPIs): daily profit and loss (P&L), maximum drawdown, trade count, and current exposure per asset. For example, a strategy trading SPY and QQQ shows separate P&L for each symbol alongside the combined portfolio P&L.

Implement live P&L calculation. Update P&L every minute. Use the last traded price for open positions. For instance, if a strategy holds 1,000 shares of SPY bought at $400, and the current price is $400.50, the unrealized P&L is $500. Aggregate these for total portfolio P&L.
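The mark-to-market arithmetic above can be sketched in a few lines of Python. This is a minimal illustration, not a production implementation: the Position class, the function names, and the QQQ figures are assumptions made for the example; only the SPY numbers come from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Position:
    """Hypothetical container for one open position."""
    symbol: str
    quantity: int        # positive for long, negative for short
    entry_price: float   # average fill price


def unrealized_pnl(position: Position, last_price: float) -> float:
    """Mark an open position to the last traded price."""
    return (last_price - position.entry_price) * position.quantity


def portfolio_pnl(positions, last_prices):
    """Return per-symbol unrealized P&L plus a combined 'TOTAL' entry."""
    per_symbol = {p.symbol: 0.0 for p in positions}
    for p in positions:
        per_symbol[p.symbol] += unrealized_pnl(p, last_prices[p.symbol])
    per_symbol["TOTAL"] = sum(per_symbol.values())
    return per_symbol


# SPY figures from the text; the QQQ position is made up for illustration.
book = [Position("SPY", 1000, 400.0), Position("QQQ", 200, 350.0)]
print(portfolio_pnl(book, {"SPY": 400.50, "QQQ": 349.0}))
# {'SPY': 500.0, 'QQQ': -200.0, 'TOTAL': 300.0}
```

A scheduler would call portfolio_pnl once per minute with the latest prices; realized P&L from closed trades would be added separately.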

Track maximum drawdown. This metric measures the largest peak-to-trough decline in portfolio equity. A strategy with $1,000,000 in starting capital whose equity rises to $1,050,000, falls to $1,020,000, and then recovers to $1,060,000 has a maximum drawdown of $30,000. Display this as a percentage of the peak: $30,000 / $1,050,000 = 2.86%.
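One way to compute this figure is a single pass over the equity curve that remembers the running peak. The sketch below assumes the curve is a simple list of portfolio values; the function name is illustrative. It reports the largest dollar decline together with that decline as a fraction of its peak.

```python
def max_drawdown(equity_curve):
    """Return (largest peak-to-trough decline in dollars,
    that same decline as a fraction of the peak it fell from)."""
    if not equity_curve:
        raise ValueError("equity_curve must contain at least one value")
    peak = equity_curve[0]
    worst_dollars = 0.0
    worst_fraction = 0.0
    for value in equity_curve:
        peak = max(peak, value)      # running high-water mark
        decline = peak - value       # distance below the high-water mark
        if decline > worst_dollars:
            worst_dollars = decline
            worst_fraction = decline / peak
    return worst_dollars, worst_fraction


dollars, fraction = max_drawdown([1_000_000, 1_050_000, 1_020_000, 1_060_000])
print(f"${dollars:,.0f} ({fraction:.2%})")  # $30,000 (2.86%)
```

Note that the largest dollar decline and the largest percentage decline can occur in different episodes on a long curve; a dashboard may want to track both.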

Monitor the number of open positions. A mean reversion strategy might target 5-10 simultaneous trades. If the system shows 20 open positions, it indicates a potential issue. This could be over-allocation or a bug in the position-sizing algorithm.

Track execution quality. Monitor slippage and fill rates. Slippage measures the difference between the expected execution price and the actual execution price. An order to buy 1,000 shares of MSFT at $300.00 that fills at an average of $300.05 incurs $0.05 of slippage per share, or $50 on the order. High slippage erodes profitability. Low fill rates indicate liquidity issues or unsuitable order types.
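A small sketch of both execution-quality metrics follows. The function names and the sign convention (positive slippage means a worse fill than expected) are assumptions for illustration; the fill-rate example quantity is made up.

```python
def slippage_per_share(expected_price, avg_fill_price, side):
    """Slippage per share; a positive result means the fill was worse than expected."""
    if side == "buy":
        return avg_fill_price - expected_price   # paid more than expected
    if side == "sell":
        return expected_price - avg_fill_price   # received less than expected
    raise ValueError("side must be 'buy' or 'sell'")


def fill_rate(filled_qty, ordered_qty):
    """Fraction of the ordered quantity that actually executed."""
    if ordered_qty <= 0:
        raise ValueError("ordered_qty must be positive")
    return filled_qty / ordered_qty


# MSFT example from the text: buy 1,000 shares expected at $300.00, filled at $300.05.
per_share = slippage_per_share(300.00, 300.05, "buy")
print(round(per_share, 2), round(per_share * 1000, 2))  # 0.05 50.0
```

Prices are plain floats here for brevity, so results are rounded before display; a real system might prefer decimal arithmetic or integer cents.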

Important Alerting Systems

Set up alerts for abnormal behavior. Alerts notify traders of potential problems. Use multiple communication channels. Send emails, SMS messages, and push notifications to a mobile app. Prioritize alerts by severity.

Configure P&L deviation alerts. A mean reversion strategy typically exhibits stable, albeit small, daily P&L. Set a threshold for negative P&L. If daily P&L drops below -$5,000, trigger an alert. For example, on October 26, 2023, a strategy might lose $6,500. This triggers a "High Severity" alert. Investigate the cause. This could be an unexpected market move or a strategy malfunction.
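The threshold rule can be expressed as a small check that returns an alert record when the limit is breached. This is a sketch under assumptions: the function name and the shape of the alert dictionary are hypothetical, and delivery to email, SMS, or push is left out.

```python
def check_daily_pnl(daily_pnl, loss_limit=-5_000.0):
    """Return an alert dict when daily P&L is below loss_limit, otherwise None."""
    if daily_pnl < loss_limit:
        return {
            "severity": "High",
            "message": f"Daily P&L {daily_pnl:,.2f} is below limit {loss_limit:,.2f}",
        }
    return None


# The $6,500 loss from the text breaches the -$5,000 threshold.
print(check_daily_pnl(-6_500.0))
```

Returning None when nothing is wrong keeps the caller simple: it only forwards non-empty results to whichever notification channels are configured.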

Implement maximum drawdown alerts. If the drawdown from the portfolio's peak exceeds a pre-defined limit, send an alert. With a 5% limit and a peak of $1,000,000, the alert triggers when the portfolio value drops below $950,000. The threshold moves with the peak: if the portfolio peaks at $1,020,000, the limit sits at $969,000, so a drop to $965,000 (a 5.39% drawdown) triggers an "Urgent" alert. This helps prevent large losses.
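Because the limit is measured from the peak, the monitor needs to remember the highest value it has seen. A minimal stateful sketch, with a hypothetical class name and interface, might look like this:

```python
class DrawdownMonitor:
    """Tracks the running peak and reports when drawdown exceeds a limit."""

    def __init__(self, limit=0.05):
        self.limit = limit   # maximum tolerated drawdown as a fraction of the peak
        self.peak = None     # highest portfolio value seen so far

    def update(self, portfolio_value):
        """Record a new portfolio value; return True if the limit is breached."""
        if self.peak is None or portfolio_value > self.peak:
            self.peak = portfolio_value
        drawdown = (self.peak - portfolio_value) / self.peak
        return drawdown > self.limit


monitor = DrawdownMonitor(limit=0.05)
for value in (1_000_000, 1_020_000, 970_000, 965_000):
    print(value, monitor.update(value))
# Only the last value, 965,000 against a 1,020,000 peak, breaches the 5% limit.
```

The value of 970,000 is an added illustration: it is about 4.9% below the peak and so does not trigger.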

Set alerts for trading activity. Abnormal trade frequency indicates a problem. A strategy designed to execute 5-10 trades per day should trigger an alert if it executes 50 trades in an hour. This might indicate a runaway loop or incorrect signal generation. Conversely, zero trades for an extended period, when trades are expected, also warrants an alert. This suggests a data feed issue or a strategy halt.
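Both conditions, too many trades and too few, can be checked with a sliding window of trade timestamps. The sketch below is illustrative: the class name is hypothetical, and the two-hour idle limit is an assumed stand-in for the "extended period" the text leaves unspecified.

```python
from collections import deque
from datetime import datetime, timedelta


class TradeRateMonitor:
    """Flags bursts of trades and long periods with no trades."""

    def __init__(self, started_at, max_trades=50,
                 window=timedelta(hours=1), max_idle=timedelta(hours=2)):
        self.max_trades = max_trades
        self.window = window
        self.max_idle = max_idle          # assumed value, tune per strategy
        self._times = deque()             # trade timestamps inside the window
        self._last_activity = started_at

    def record_trade(self, ts):
        """Record a trade; return True if the window now holds too many trades."""
        self._times.append(ts)
        self._last_activity = ts
        cutoff = ts - self.window
        while self._times[0] <= cutoff:   # drop trades older than the window
            self._times.popleft()
        return len(self._times) >= self.max_trades

    def is_idle(self, now):
        """Return True if no trade has occurred for at least max_idle."""
        return now - self._last_activity >= self.max_idle


start = datetime(2023, 10, 26, 9, 30)
monitor = TradeRateMonitor(started_at=start)
flags = [monitor.record_trade(start + timedelta(seconds=30 * i)) for i in range(50)]
print(flags[-1])  # True: the 50th trade inside one hour trips the burst alert
```

The idle check should only run during hours when trades are expected, otherwise it will fire every night.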

Monitor connectivity and data feeds. Losing the connection to an exchange or a data provider means the strategy acts on stale data, so an alert should trigger as soon as the outage is confirmed. For example, if the connection to Interactive Brokers drops for more than 30 seconds, send an "Urgent" alert before stale data causes erroneous trades. Verify data integrity as well. If a stock quote for AAPL suddenly jumps from $170 to $1,700, this indicates a bad tick. Filter these out and alert the trader.
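A simple bad-tick filter compares each new quote with the previous accepted one and rejects implausible jumps. In this sketch the function name and the 20% jump limit are assumptions; an appropriate limit depends on the instrument and the quote frequency.

```python
def is_bad_tick(previous_price, new_price, max_jump=0.20):
    """Return True if new_price moves more than max_jump (as a fraction)
    away from previous_price, or if either price is not positive."""
    if previous_price <= 0 or new_price <= 0:
        return True
    change = abs(new_price - previous_price) / previous_price
    return change > max_jump


print(is_bad_tick(170.0, 1700.0))  # True: the AAPL jump from the text
print(is_bad_tick(170.0, 171.0))   # False: an ordinary move
```

A rejected tick should not replace the stored previous price, so that the next valid quote is still compared against the last good one.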

Infrastructure Health Checks

Automate infrastructure monitoring. Ensure all components function correctly. This includes servers, databases, and trading platforms. Use health check scripts.

Monitor server resource utilization. Track CPU usage, memory consumption, and disk space. High CPU usage (e.g., above 90% for 10 minutes) on a trading server indicates a process consuming too many resources. This can slow down execution. Memory leaks can crash processes. If a trading application's memory usage increases by 20% in an hour, trigger an alert. Disk space running low can prevent logging or data storage. If disk usage exceeds 80%, alert.
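The three resource rules above can be sketched as small checks. Disk usage comes from Python's standard shutil.disk_usage; the CPU and memory checks assume that some collector, not shown here, supplies one CPU percentage per minute and periodic memory readings. All function names are illustrative.

```python
import shutil


def disk_usage_fraction(path="/"):
    """Fraction of the filesystem containing path that is in use."""
    usage = shutil.disk_usage(path)
    return usage.used / usage.total


def disk_space_alert(path="/", limit=0.80):
    """True if disk usage exceeds the limit (80% in the text)."""
    return disk_usage_fraction(path) > limit


def cpu_sustained_high(samples, threshold=90.0, required=10):
    """samples: per-minute CPU percentages, oldest first.
    True if the most recent `required` samples are all above threshold."""
    recent = samples[-required:]
    return len(recent) == required and all(s > threshold for s in recent)


def memory_growth_alert(usage_hour_ago, usage_now, max_growth=0.20):
    """True if memory usage grew by at least max_growth over the hour."""
    if usage_hour_ago <= 0:
        return False
    return (usage_now - usage_hour_ago) / usage_hour_ago >= max_growth


print(cpu_sustained_high([95.0] * 10))     # True
print(memory_growth_alert(100.0, 125.0))   # True
```

Requiring a full run of high samples avoids alerting on a brief CPU spike, which matches the "for 10 minutes" condition in the text.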

Check database status. Ensure the database is accessible and queries execute efficiently. A slow database query can delay signal generation or order placement. Monitor query latency. If a specific query takes longer than 1 second, flag it. Monitor database connection pools and ensure they are not exhausted.
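Query latency can be measured by wrapping each query in a timer. The sketch below uses Python's built-in sqlite3 module with an in-memory database purely as a stand-in; the text does not name a database, and the function name is hypothetical.

```python
import sqlite3
import time


def timed_query(conn, sql, params=(), slow_after=1.0):
    """Run a query and return (rows, elapsed_seconds, is_slow)."""
    start = time.perf_counter()
    rows = conn.execute(sql, params).fetchall()
    elapsed = time.perf_counter() - start
    return rows, elapsed, elapsed > slow_after


conn = sqlite3.connect(":memory:")   # stand-in for the real trading database
rows, elapsed, is_slow = timed_query(conn, "SELECT 1")
print(rows, is_slow)
```

The same wrapper doubles as an accessibility check: if the connection is down, the query raises an exception that the health check can turn into an alert.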

Verify trading application status. Confirm the trading application process is running. A simple "heartbeat" mechanism sends a signal every minute. If no heartbeat is received for 5 minutes, assume the application crashed. Trigger an "Urgent" alert. Log application errors. Parse logs for specific error messages like "Order Rejected" or "Connection Lost."
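The heartbeat mechanism amounts to recording the time of the last signal and comparing it with the current time. A minimal sketch, assuming a hypothetical HeartbeatWatchdog class with an injectable clock so it can be tested without waiting:

```python
import time


class HeartbeatWatchdog:
    """Declares the application dead if no heartbeat arrives within the timeout."""

    def __init__(self, timeout_seconds=300.0, now=time.monotonic):
        self.timeout = timeout_seconds   # 5 minutes, as in the text
        self._now = now                  # clock function, replaceable in tests
        self._last_beat = self._now()

    def beat(self):
        """Called by the trading application, e.g. once per minute."""
        self._last_beat = self._now()

    def is_alive(self):
        """False once timeout_seconds have passed since the last heartbeat."""
        return self._now() - self._last_beat < self.timeout


watchdog = HeartbeatWatchdog()
watchdog.beat()
print(watchdog.is_alive())  # True immediately after a heartbeat
```

The watchdog should run in a separate process or on a separate host from the trading application, so that a crash of one does not silence the other.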

Implement network latency monitoring. Measure the round-trip time to exchanges and data providers. Increased latency impacts execution speed. If latency to NASDAQ exceeds 50 milliseconds for 3 consecutive minutes, send an alert. This could indicate network congestion or routing issues.
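The "3 consecutive minutes" rule is a streak counter: each measurement above the threshold extends the streak, and any measurement at or below it resets the streak. The sketch assumes one round-trip measurement per minute from a probe that is not shown; the class name is illustrative.

```python
class ConsecutiveBreach:
    """Fires once a value has exceeded the threshold N times in a row."""

    def __init__(self, threshold=50.0, consecutive=3):
        self.threshold = threshold       # milliseconds
        self.consecutive = consecutive   # required streak length
        self._streak = 0

    def update(self, value):
        """Feed one measurement; return True while the streak is long enough."""
        if value > self.threshold:
            self._streak += 1
        else:
            self._streak = 0
        return self._streak >= self.consecutive


monitor = ConsecutiveBreach(threshold=50.0, consecutive=3)
print([monitor.update(ms) for ms in (60, 70, 40, 55, 65, 75)])
# [False, False, False, False, False, True]
```

The dip to 40 ms resets the count, so a single slow minute between normal ones never triggers an alert.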

Regularly test the entire alerting system. Simulate a critical event, like a server going down. Confirm all alerts trigger and reach the correct recipients. Do this quarterly. For example, on January 15, 2024, at 10:00 AM EST, initiate a test in which the data feed for GOOGL is intentionally stopped. Verify the "Data Feed Down" alert arrives within 1 minute via both email and SMS. This ensures the system works when needed.