TradingHabits.com

Expectancy quantifies the average profit or loss per trade. A positive expectancy indicates a profitable system over time. However, a small sample size can misrepresent true expectancy. This lesson details the mathematical requirements for a reliable expectancy calculation.

Understanding Statistical Significance

Statistical significance measures how unlikely an observed result would be if it were produced by chance alone. For trading, this means distinguishing between random winning and losing streaks and a system's actual edge. We use hypothesis testing to establish significance.

The null hypothesis ($H_0$) states there is no true edge; the system's expectancy is zero. The alternative hypothesis ($H_1$) states the system has a positive expectancy. We aim to reject $H_0$ with a high degree of confidence.
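As a rough sketch of how such a test could be run, the snippet below divides the sample expectancy by its standard error (both defined later in this lesson) and converts the result to a one-sided p-value under a normal approximation. The function name `edge_test` and the trade figures are illustrative assumptions, and the approximation presumes a reasonably large number of independent trades.

```python
import math
from statistics import NormalDist, mean, stdev

def edge_test(outcomes):
    """One-sided test of H0: expectancy = 0 against H1: expectancy > 0.

    Returns (z, p_value). Uses a normal approximation, which is an
    assumption that is only reasonable for larger samples.
    """
    n = len(outcomes)
    x_bar = mean(outcomes)                # sample expectancy
    sem = stdev(outcomes) / math.sqrt(n)  # stdev uses the n-1 denominator
    z = x_bar / sem                       # distance from zero in standard errors
    p_value = 1.0 - NormalDist().cdf(z)   # chance of a z this large if H0 holds
    return z, p_value

# Hypothetical per-trade P&L in dollars (made-up numbers for illustration)
trades = [120, -80, 45, -60, 200, -150, 90, -30, 75, -40] * 5
z, p = edge_test(trades)
print(f"z = {z:.2f}, one-sided p = {p:.3f}")
```

A small p-value (for example, below 0.05) would be grounds for rejecting $H_0$; a large one means the observed expectancy is still compatible with no edge at all.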

Key Statistical Concepts

Standard Deviation ($\sigma$)

Standard deviation measures the dispersion of individual trade outcomes around the mean. A higher standard deviation indicates greater variability in trade results.

$\sigma = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1}}$

Where:

$x_i$ is the outcome of each individual trade
$\bar{x}$ is the mean outcome (expectancy)
$n$ is the number of trades

Standard Error of the Mean (SEM)

The SEM estimates the standard deviation of the sample mean itself. It quantifies how much the sample mean is likely to vary from the true population mean. A smaller SEM indicates a more precise estimate of expectancy. The formula below assumes trade outcomes are independent of one another.

$SEM = \frac{\sigma}{\sqrt{n}}$
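The two formulas above translate directly into code. The following is a minimal Python sketch; the function names `sample_std` and `standard_error` and the P&L figures are illustrative assumptions.

```python
import math

def sample_std(outcomes):
    """Sample standard deviation with the n-1 denominator."""
    n = len(outcomes)
    x_bar = sum(outcomes) / n
    squared_deviations = sum((x - x_bar) ** 2 for x in outcomes)
    return math.sqrt(squared_deviations / (n - 1))

def standard_error(outcomes):
    """Standard error of the mean: sigma / sqrt(n)."""
    return sample_std(outcomes) / math.sqrt(len(outcomes))

# Hypothetical per-trade P&L in dollars
trades = [125.0, -75.0, 50.0, -100.0, 200.0, -50.0, 75.0, -25.0]
print(f"sigma = {sample_std(trades):.2f}")
print(f"SEM   = {standard_error(trades):.2f}")
```

Python's built-in `statistics.stdev` uses the same $n-1$ denominator, so it can stand in for `sample_std` here.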

Confidence Intervals

A confidence interval provides a range within which the true population expectancy likely falls. A 95% confidence interval means that if we were to take many samples, 95% of those intervals would contain the true population expectancy.

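For a 95% confidence interval, we use a Z-score of 1.96. This normal approximation is reasonable for large samples ($n \ge 30$ is a common rule of thumb); for smaller samples, the wider critical values of the t-distribution are more appropriate.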

$CI = \bar{x} \pm (Z \times SEM)$

Where:

$\bar{x}$ is the sample expectancy
$Z$ is the Z-score corresponding to the desired confidence level
$SEM$ is the standard error of the mean
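A short Python sketch of the interval calculation follows. The function name `confidence_interval` and the sample trades are assumptions made for illustration, and the default `z=1.96` corresponds to the 95% level discussed above.

```python
import math
from statistics import mean, stdev

def confidence_interval(outcomes, z=1.96):
    """Return (lower, upper) bounds of the CI around the sample expectancy."""
    x_bar = mean(outcomes)
    sem = stdev(outcomes) / math.sqrt(len(outcomes))
    return x_bar - z * sem, x_bar + z * sem

# Hypothetical per-trade P&L in dollars, repeated to get 40 trades
trades = [125.0, -75.0, 50.0, -100.0, 200.0, -50.0, 75.0, -25.0] * 5
low, high = confidence_interval(trades)
print(f"95% CI for expectancy: {low:.2f} to {high:.2f}")
```

If the lower bound sits below zero, the sample cannot rule out a system with no edge at this confidence level.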

Determining Minimum Sample Size

The minimum sample size depends on the desired precision of our expectancy estimate and the variability of our trade outcomes. We need enough trades to ensure the confidence interval around our calculated expectancy is sufficiently narrow.

Let's define the maximum acceptable error ($E$) as the half-width of our desired confidence interval.

$E = Z \times \frac{\sigma}{\sqrt{n}}$

We can rearrange this formula to solve for $n$:

$n = \left( \frac{Z \times \sigma}{E} \right)^2$

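This formula requires an estimate of $\sigma$. If no prior data exists, a pilot sample can provide an initial $\sigma$. Alternatively, a conservative (deliberately high) estimate can be used, which errs toward overstating the number of trades needed rather than understating it.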
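The rearranged formula can be wrapped in a small helper. This is a minimal sketch: the name `min_sample_size` is an assumption, and the result is rounded up because a fractional trade cannot be taken.

```python
import math

def min_sample_size(sigma, max_error, z=1.96):
    """Smallest whole n satisfying n >= (z * sigma / max_error)^2."""
    if sigma <= 0 or max_error <= 0:
        raise ValueError("sigma and max_error must be positive")
    return math.ceil((z * sigma / max_error) ** 2)

# Hypothetical inputs: sigma of $200 per trade, margin of error of $20
print(min_sample_size(sigma=200, max_error=20))
```

Passing a different `z` (for example, 2.576 for a 99% interval) shows how a stricter confidence level raises the requirement.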

Worked Example: Futures Trading

Consider a futures day trading system on the ES contract. Each point is worth $50. We want to estimate the system's expectancy with a 95% confidence interval, such that the margin of error ($E$) is no more than $10 per contract.

Assume we have historical data or a pilot study suggesting the standard deviation of trade outcomes ($\sigma$) is $150 per contract.

Desired Confidence Level: 95%
Z-score for 95% CI: 1.96
Estimated Standard Deviation ($\sigma$): $150
Maximum Acceptable Error ($E$): $10

Now, calculate the minimum sample size ($n$):

$n = \left( \frac{Z \times \sigma}{E} \right)^2$
$n = \left( \frac{1.96 \times 150}{10} \right)^2$
$n = \left( \frac{294}{10} \right)^2$
$n = (29.4)^2$
$n = 864.36$

Rounding up, we need at least 865 trades to be 95% confident that our calculated expectancy is within $10 of the true expectancy.

Impact of Variability and Desired Precision

Higher Variability ($\sigma$)

If the standard deviation of trade outcomes is higher, more trades are needed to achieve the same level of precision.

Example: If $\sigma$ were $300 instead of $150, with $E$ still at $10:

$n = \left( \frac{1.96 \times 300}{10} \right)^2$
$n = \left( \frac{588}{10} \right)^2$
$n = (58.8)^2$
$n = 3457.44$

This requires at least 3,458 trades. Because $n$ scales with $\sigma^2$, doubling the standard deviation quadruples the required sample size.

Tighter Precision ($E$)

If a tighter margin of error is desired, more trades are needed.

Example: If $E$ were $5 instead of $10, with $\sigma$ still at $150:

$n = \left( \frac{1.96 \times 150}{5} \right)^2$
$n = \left( \frac{294}{5} \right)^2$
$n = (58.8)^2$
$n = 3457.44$

This also requires at least 3,458 trades. Because $n$ scales with $1/E^2$, halving the acceptable error quadruples the required sample size.
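The same relationship can be read in the other direction: given the number of trades already collected, what margin of error has been reached? The sketch below applies $E = Z \times \sigma / \sqrt{n}$ with the $150 standard deviation from the worked example; the function name `margin_of_error` and the chosen trade counts are assumptions for illustration.

```python
import math

def margin_of_error(sigma, n, z=1.96):
    """Half-width of the confidence interval after n trades."""
    if sigma <= 0 or n <= 0:
        raise ValueError("sigma and n must be positive")
    return z * sigma / math.sqrt(n)

# Each step quadruples the trade count, which halves the margin of error
for n in (100, 400, 1600):
    print(f"n = {n:>4}: E = ${margin_of_error(150, n):.2f}")
```

This is why precision becomes progressively more expensive: every further halving of the error costs four times as many trades as the last.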

Practical Considerations for Day Trading

Data Collection

Accurate trade journaling is essential. Record every trade's outcome, including commissions and fees. This data forms the basis for $\bar{x}$ and $\sigma$.

Non-Stationarity

Market conditions change. A system's expectancy and standard deviation may not be constant over long periods. A sample of 800 trades collected over two years may not be representative if market regimes shifted significantly. Consider using rolling windows for expectancy calculations or segmenting data by market type.
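A rolling-window calculation could look like the following sketch. The function name `rolling_expectancy`, the window length, and the trade figures are all illustrative assumptions; the right window for a real system depends on trade frequency and how quickly conditions change.

```python
from collections import deque

def rolling_expectancy(outcomes, window):
    """Mean outcome of the most recent `window` trades, one value per
    trade once the window has filled."""
    if window <= 0:
        raise ValueError("window must be positive")
    recent = deque(maxlen=window)
    running_sum = 0.0
    result = []
    for x in outcomes:
        if len(recent) == window:
            running_sum -= recent[0]  # oldest trade is about to drop out
        recent.append(x)
        running_sum += x
        if len(recent) == window:
            result.append(running_sum / window)
    return result

# Hypothetical P&L: an early profitable stretch followed by a weaker one
trades = [60, -20, 80, -30, 50, -40, 10, -60, 20, -50]
print(rolling_expectancy(trades, window=5))
```

A shorter window reacts faster to regime changes but has a larger standard error, so the trade-off between responsiveness and reliability described in this lesson applies to each window as well.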

Small Sample Bias

Relying on a small number of trades (e.g., 30-50) to determine expectancy is unreliable. A few large wins or losses can skew the average significantly, leading to incorrect conclusions about system profitability.
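A quick simulation can make this concrete. The sketch below assumes, purely for illustration, a system with a true expectancy of $10 and normally distributed outcomes with a $150 standard deviation; real trade outcomes are often skewed or fat-tailed, which tends to make small samples even less reliable. The function name `simulate_expectancies` is hypothetical.

```python
import random
from statistics import stdev

def simulate_expectancies(n_trades, n_runs=500, true_mean=10.0,
                          sigma=150.0, seed=1):
    """Return the sample expectancy from each of n_runs simulated journals."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    results = []
    for _ in range(n_runs):
        total = sum(rng.gauss(true_mean, sigma) for _ in range(n_trades))
        results.append(total / n_trades)
    return results

for n in (30, 900):
    means = simulate_expectancies(n)
    losing = sum(m < 0 for m in means) / len(means)
    print(f"n = {n}: spread of estimates = {stdev(means):.1f}, "
          f"journals showing a negative expectancy = {losing:.0%}")
```

Under these assumptions, theory predicts that roughly a third of 30-trade journals would show a losing system even though the true edge is positive, against about 2% of 900-trade journals.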

Cost of Data

Collecting a large sample takes time and capital. Traders must balance the need for statistical reliability with the practical constraints of trading. A system with a very high expectancy and low $\sigma$ may require fewer trades to demonstrate profitability with confidence than a system with a low expectancy and high $\sigma$.
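One way to illustrate this comparison is to set the margin of error equal to the expectancy itself, so that the confidence interval just excludes zero: $n = (Z \times \sigma / \bar{x})^2$. This is a simplification, since it treats the observed expectancy and $\sigma$ as if they were the true values, and the function name `trades_to_exclude_zero` and both example systems are assumptions.

```python
import math

def trades_to_exclude_zero(expectancy, sigma, z=1.96):
    """Trades needed before the CI half-width shrinks to the expectancy."""
    if expectancy <= 0 or sigma <= 0:
        raise ValueError("expectancy and sigma must be positive")
    return math.ceil((z * sigma / expectancy) ** 2)

# Hypothetical systems: (expectancy per trade, sigma per trade) in dollars
systems = {"high edge, low sigma": (30, 100),
           "low edge, high sigma": (10, 150)}
for label, (expectancy, sigma) in systems.items():
    n = trades_to_exclude_zero(expectancy, sigma)
    print(f"{label}: about {n} trades")
```

The ratio of $\sigma$ to expectancy drives the result, so a system whose typical swing is many times its average profit needs a far longer track record.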

Conclusion

A minimum sample size is not arbitrary. It is mathematically derived from the desired confidence level, the variability of trade outcomes, and the acceptable margin of error. For most day trading systems, hundreds of trades are necessary to generate a statistically reliable expectancy figure, and a tight margin of error or high variability can push the requirement into the thousands. Without sufficient data, any calculated expectancy is merely an estimate with a wide confidence interval, making it an unreliable indicator of true profitability. Focus on consistent data collection and rigorous statistical analysis to validate your trading edge.

The Minimum Sample Size You Need for Reliable Expectancy