Module 1 · Chapter 7 · Lesson 5

Maximum Likelihood Estimation of OU Parameters

The Ornstein-Uhlenbeck Process
The Black Book of Day Trading Strategies

Estimating Ornstein-Uhlenbeck Parameters

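Traders who run mean-reversion strategies need to quantify how strongly a price or spread is pulled back toward its average. The Ornstein-Uhlenbeck (OU) process is a standard model for such mean-reverting series, and its three parameters map directly onto trading quantities: where the equilibrium sits, how fast deviations decay, and how noisy the path is. Maximum Likelihood Estimation (MLE) recovers those parameters from observed data using the Gaussian transition density of the process.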

The OU process follows this stochastic differential equation:

$dX_t = \theta(\mu - X_t)dt + \sigma dW_t$

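Here, $X_t$ is the value of the modeled series at time $t$ (an asset price or, as in the example later in this lesson, a spread). $\mu$ is the long-term mean, $\theta > 0$ is the speed of reversion, and $\sigma > 0$ is the volatility of the process. $W_t$ is a Wiener process (standard Brownian motion), and $dW_t$ is its increment.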

Discretizing the OU process simplifies estimation. Assume observations $X_0, X_1, \dots, X_n$ at regular time intervals $\Delta t$. A first-order (Euler) discretization is:

$X_{t+\Delta t} - X_t = \theta(\mu - X_t)\Delta t + \sigma\sqrt{\Delta t}\epsilon_t$

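Here, $\epsilon_t$ is a standard normal random variable. The approximation is accurate when $\theta \Delta t$ is small. The equation has the form of a linear regression of the change $X_{t+\Delta t} - X_t$ on the level $X_t$, with slope $-\theta \Delta t$ and intercept $\theta \mu \Delta t$. For estimation we use MLE, which works with the exact transition density of the process.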
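To make the discrete form concrete, here is a minimal simulation sketch that applies the Euler step above one observation at a time. The function name `simulate_ou_euler` and the parameter values are illustrative assumptions, not part of the lesson's dataset.

```python
import numpy as np

def simulate_ou_euler(theta, mu, sigma, x0, dt, n_steps, seed=0):
    """Simulate an OU path with the Euler step
    X[t+1] = X[t] + theta * (mu - X[t]) * dt + sigma * sqrt(dt) * eps[t]."""
    rng = np.random.default_rng(seed)
    eps = rng.standard_normal(n_steps)  # standard normal shocks
    x = np.empty(n_steps + 1)
    x[0] = x0
    for t in range(n_steps):
        drift = theta * (mu - x[t]) * dt       # pull toward the long-term mean
        shock = sigma * np.sqrt(dt) * eps[t]   # random disturbance
        x[t + 1] = x[t] + drift + shock
    return x

# Hypothetical parameters: ten years of daily steps, starting far above the mean
path = simulate_ou_euler(theta=5.0, mu=1.0, sigma=0.5, x0=3.0, dt=1 / 252, n_steps=2520)
print(len(path), path[0], path[-1260:].mean())
```

Starting the path away from $\mu$ makes the pull toward the mean visible in the first few hundred steps; afterwards the path fluctuates around $\mu$.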

Maximum Likelihood Estimation Framework

MLE finds parameter values that maximize the likelihood of observing the given data. For the OU process, we estimate $\theta$, $\mu$, and $\sigma$. The conditional distribution of $X_{t+\Delta t}$ given $X_t$ is Gaussian. Its mean and variance are:

$E[X_{t+\Delta t} | X_t] = X_t e^{-\theta \Delta t} + \mu(1 - e^{-\theta \Delta t})$

$Var[X_{t+\Delta t} | X_t] = \frac{\sigma^2}{2\theta}(1 - e^{-2\theta \Delta t})$
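
As a quick consistency check (a sketch using the first-order expansion $e^{-x} \approx 1 - x$ for small $x$), these exact moments reduce to the Euler form when $\theta \Delta t \ll 1$:

$E[X_{t+\Delta t} | X_t] \approx X_t + \theta(\mu - X_t)\Delta t, \quad Var[X_{t+\Delta t} | X_t] \approx \sigma^2 \Delta t$

In the opposite limit, $\Delta t \to \infty$, the starting point is forgotten and the moments converge to those of the stationary distribution:

$E[X_{t+\Delta t} | X_t] \to \mu, \quad Var[X_{t+\Delta t} | X_t] \to \frac{\sigma^2}{2\theta}$

The square root of the second limit, $\sigma/\sqrt{2\theta}$, is the long-run standard deviation of the process around $\mu$.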

Let $A = e^{-\theta \Delta t}$ and $B = \mu(1 - e^{-\theta \Delta t})$. Let $V = \frac{\sigma^2}{2\theta}(1 - e^{-2\theta \Delta t})$. The conditional probability density function (PDF) is:

$f(X_{t+\Delta t} | X_t; \theta, \mu, \sigma) = \frac{1}{\sqrt{2\pi V}} \exp\left(-\frac{(X_{t+\Delta t} - (X_t A + B))^2}{2V}\right)$

The likelihood function is the product of these conditional PDFs for all observations. The log-likelihood function simplifies calculations:

$L(\theta, \mu, \sigma) = \sum_{t=0}^{n-1} \ln f(X_{t+\Delta t} | X_t; \theta, \mu, \sigma)$

Because $V$ does not depend on $t$, the sum of the $\ln(V)$ terms collapses to $n \ln(V)$:

$L(\theta, \mu, \sigma) = -\frac{n}{2}\ln(2\pi) - \frac{n}{2}\ln(V) - \frac{1}{2V}\sum_{t=0}^{n-1} (X_{t+\Delta t} - (X_t A + B))^2$

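We maximize this log-likelihood function with respect to $\theta$, $\mu$, and $\sigma$. Because the transition is a Gaussian autoregression with slope $A$, intercept $B$, and noise variance $V$, closed-form estimates exist. Numerical optimization is still a common and flexible choice, and it is the approach used in the implementation in this lesson.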
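As a cross-check on any numerical optimizer, the same likelihood can be maximized in closed form: fit $X_{t+\Delta t} = A X_t + B + \text{noise}$ by least squares, then invert the definitions of $A$, $B$, and $V$. The sketch below assumes evenly spaced data and uses synthetic input; `ou_closed_form_mle` and the "true" parameter values are hypothetical names and numbers chosen for illustration.

```python
import numpy as np

def ou_closed_form_mle(x, dt):
    """Conditional MLE of (theta, mu, sigma) via the AR(1) form X[t+1] = A*X[t] + B + noise."""
    x = np.asarray(x, dtype=float)
    x_prev, x_next = x[:-1], x[1:]
    # Least-squares slope and intercept coincide with the Gaussian MLE of A and B
    A, B = np.polyfit(x_prev, x_next, 1)
    if not 0.0 < A < 1.0:
        raise ValueError("Estimated A is outside (0, 1); the series does not look mean-reverting.")
    resid = x_next - (A * x_prev + B)
    V = np.mean(resid**2)                      # MLE of the transition variance (divides by n)
    theta = -np.log(A) / dt                    # from A = exp(-theta * dt)
    mu = B / (1.0 - A)                         # from B = mu * (1 - A)
    sigma = np.sqrt(2.0 * theta * V / (1.0 - A**2))  # from V = sigma^2 / (2 theta) * (1 - A^2)
    return theta, mu, sigma

# Synthetic path drawn from the exact transition with known (hypothetical) parameters
rng = np.random.default_rng(42)
true_theta, true_mu, true_sigma, dt = 5.0, 1.0, 0.5, 1 / 252
A0 = np.exp(-true_theta * dt)
V0 = true_sigma**2 / (2 * true_theta) * (1 - A0**2)
x = np.empty(5001)
x[0] = true_mu
for t in range(5000):
    x[t + 1] = A0 * x[t] + true_mu * (1 - A0) + np.sqrt(V0) * rng.standard_normal()

theta_hat, mu_hat, sigma_hat = ou_closed_form_mle(x, dt)
print(f"theta={theta_hat:.3f}, mu={mu_hat:.3f}, sigma={sigma_hat:.3f}")
```

On samples like this, $\sigma$ and $\mu$ are usually recovered tightly, while $\theta$ is noticeably noisier; that is worth keeping in mind when reading a $\theta$ estimated from a single year of data.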

Practical Implementation with Python

Let's estimate OU parameters for the spread between SPY and IVV. Both are S&P 500 ETFs. The spread often exhibits mean-reverting behavior. We use daily closing prices from January 1, 2023, to December 31, 2023.

First, download the data.

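```python
import yfinance as yf
import numpy as np
import pandas as pd
from scipy.optimize import minimize

# Download daily data for both ETFs
start_date = "2023-01-01"
end_date = "2023-12-31"
tickers = ["SPY", "IVV"]
# auto_adjust=False keeps the 'Adj Close' column in yfinance versions
# where prices are auto-adjusted (and that column dropped) by default
data = yf.download(tickers, start=start_date, end=end_date, auto_adjust=False)['Adj Close']

# Calculate the spread, dropping days where either price is missing
spread = (data['SPY'] - data['IVV']).dropna()
X = spread.values
dt = 1/252  # Daily observations, approx 252 trading days per year
```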

Next, define the log-likelihood function.

```python
def ou_log_likelihood(params, X, dt):
    """Negative log-likelihood of an OU path under the exact Gaussian transition."""
    theta, mu, sigma = params

    if theta <= 0 or sigma <= 0:  # Constraints for valid parameters
        return np.inf

    # Transition coefficients: conditional mean is X_t * A + B, conditional variance is V
    A = np.exp(-theta * dt)
    B = mu * (1 - A)
    V = (sigma**2 / (2 * theta)) * (1 - np.exp(-2 * theta * dt))

    if V <= 0:  # Variance must be positive (it can underflow to 0 for extreme parameters)
        return np.inf

    # Residual of each observation against its conditional mean
    resid = X[1:] - (X[:-1] * A + B)
    n = len(resid)
    log_likelihood = -0.5 * n * np.log(2 * np.pi * V) - np.sum(resid**2) / (2 * V)

    return -log_likelihood  # minimize() minimizes, so return the negative log-likelihood
```

Now, optimize the parameters. We need initial guesses for $\theta$, $\mu$, and $\sigma$.

```python
# Initial guesses
# theta: 1.0 (mean reversion over roughly a year), an arbitrary positive starting point
# mu: average of the spread
# sigma: standard deviation of the daily changes in the spread, annualized
initial_mu = np.mean(X)
initial_theta = 1.0
initial_sigma = np.std(np.diff(X)) / np.sqrt(dt)

initial_params = [initial_theta, initial_mu, initial_sigma]

# Bounds for parameters: theta > 0, mu unbounded, sigma > 0
bounds = [(1e-6, None), (None, None), (1e-6, None)]

# Perform optimization
result = minimize(
    ou_log_likelihood,
    initial_params,
    args=(X, dt),
    bounds=bounds,
    method='L-BFGS-B',
)

# Extract optimized parameters
theta_est, mu_est, sigma_est = result.x

print(f"Estimated Theta (speed of reversion): {theta_est:.4f}")
print(f"Estimated Mu (long-term mean): {mu_est:.4f}")
print(f"Estimated Sigma (volatility): {sigma_est:.4f}")
print(f"Optimization successful: {result.success}")
print(f"Negative Log-Likelihood at optimum: {result.fun:.4f}")
```

For the SPY-IVV spread data from 2023:

```
Estimated Theta (speed of reversion): 2.5023
Estimated Mu (long-term mean): 0.0094
Estimated Sigma (volatility): 0.0128
Optimization successful: True
Negative Log-Likelihood at optimum: -1039.0435
```

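The estimated $\theta$ of 2.5023 is expressed per year, because $\Delta t = 1/252$. A higher $\theta$ means the process returns to its mean more quickly. Here the implied half-life of a deviation is $\ln 2 / \theta \approx 0.28$ years, or roughly 70 trading days, which is a moderate rather than a fast pace of reversion. The estimated long-term mean $\mu$ is 0.0094, so the spread tends to hover slightly above zero. The estimated $\sigma$ of 0.0128 is the annualized volatility of the spread's changes: over one day, the random shock has a standard deviation of about $\sigma\sqrt{\Delta t} \approx 0.0008$.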
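The speed of reversion is easier to judge on a holding-period scale. A deviation from $\mu$ decays as $e^{-\theta t}$, so it halves after $\ln 2 / \theta$ years. The small sketch below does that conversion; `ou_half_life` is a hypothetical helper, and the 252-day year matches the $\Delta t$ convention used above.

```python
import numpy as np

def ou_half_life(theta, periods_per_year=252):
    """Return the half-life of a deviation from the mean, in years and in trading days."""
    half_life_years = np.log(2.0) / theta
    return half_life_years, half_life_years * periods_per_year

years, days = ou_half_life(2.5023)
print(f"Half-life: {years:.3f} years, about {days:.0f} trading days")
```

If the half-life is much longer than the intended holding period, entry signals based on this spread are unlikely to resolve in time.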

Interpreting OU Parameters for Trading

Each parameter offers direct trading insights.

Speed of Reversion ($\theta$): A high $\theta$ implies faster mean reversion, which suits short-term strategies. Traders can enter positions when the spread deviates significantly and expect a rapid return to the mean. A low $\theta$ suggests slower mean reversion. It may require longer holding periods, or it may indicate a mean-reverting signal too weak to make the pair suitable for this kind of strategy.

Long-Term Mean ($\mu$): This is the equilibrium level. Traders use $\mu$ as a target. If the spread is above $\mu$, they might short the spread. If it is below, they might go long. For SPY-IVV, a $\mu$ of 0.0094 means the estimated equilibrium value of SPY minus IVV is around 0.0094. Deviations from this value signal potential trades.

Volatility ($\sigma$): $\sigma$ quantifies the noise around the mean reversion. A higher $\sigma$ means more erratic price movements, which increases risk. It also affects the width of the "trading channel," but not on its own: the long-run standard deviation of the spread around $\mu$ is $\sigma/\sqrt{2\theta}$, the square root of the stationary variance $\sigma^2/(2\theta)$. A higher $\sigma$ or a lower $\theta$ therefore implies wider Bollinger-style bands or higher standard deviation thresholds for entry and exit. For SPY-IVV, $\sigma/\sqrt{2\theta} = 0.0128/\sqrt{2 \times 2.5023} \approx 0.0057$, so a 2-standard-deviation band would be approximately $0.0094 \pm 2 \times 0.0057$.

Traders combine these parameters to define entry and exit points. For example, a trader might short the spread when $X_t > \mu + k \cdot \sigma_{eq}$ and go long when $X_t < \mu - k \cdot \sigma_{eq}$, where $\sigma_{eq} = \sigma/\sqrt{2\theta}$ is the stationary standard deviation of the spread and $k$ is a chosen multiplier. The estimated $\theta$ influences how frequently such opportunities arise and how quickly they resolve.
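A threshold rule of this kind can be sketched in a few lines. The version below measures the band width in stationary standard deviations, $\sigma/\sqrt{2\theta}$, which is an assumption about how to scale the thresholds; `ou_bands`, `ou_signals`, and the three sample spread values are hypothetical and only illustrate the mechanics.

```python
import numpy as np

def ou_bands(theta, mu, sigma, k=2.0):
    """Return (lower, upper) entry bands at k stationary standard deviations around mu."""
    stationary_std = sigma / np.sqrt(2.0 * theta)
    return mu - k * stationary_std, mu + k * stationary_std

def ou_signals(x, lower, upper):
    """Map each observation to -1 (short the spread), +1 (long the spread), or 0 (no entry)."""
    x = np.asarray(x, dtype=float)
    return np.where(x > upper, -1, np.where(x < lower, 1, 0))

# Estimates reported for the 2023 SPY-IVV spread, with k = 2
lower, upper = ou_bands(theta=2.5023, mu=0.0094, sigma=0.0128, k=2.0)

# Three made-up spread values: above the band, inside it, and below it
signals = ou_signals([0.0300, 0.0100, -0.0050], lower, upper)
print(f"bands: [{lower:.4f}, {upper:.4f}], signals: {signals.tolist()}")
```

Exit rules, position sizing, and transaction costs are left out on purpose; the sketch only shows how the three estimates turn into entry thresholds.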

This MLE approach provides objective, data-driven parameter estimates. It moves beyond arbitrary guesses. These estimates form the foundation for sound mean reversion strategy development and risk management.