4. Strategy Design & Modeling
This document covers the principles and methodologies for designing, developing, and validating systematic trading strategies, from initial concept through backtesting to deployment.
4.1 Trading Styles
Definition
Trading styles define the timeframe and execution cadence a trader uses to engage with markets. Trading style is independent of strategy logic and reflects constraints such as available time, attention, and risk tolerance.
Important Distinction:
Trading style defines when you trade. Strategy archetypes define why a trade has edge.
Core Principles
- Timeframe Selection: Choose style based on available time, personality, and goals
- Consistency: Stick to one primary style rather than mixing approaches
- Capital Requirements: Different styles have different capital and margin needs
- Risk Profile: Each style has characteristic risk/reward patterns
- Lifestyle Fit: Trading style must align with personal schedule and temperament
Trading Style Comparison
| Style | Timeframe | Holding Period | Key Traits |
|---|---|---|---|
| Scalping | Seconds to minutes | Seconds to minutes | Tiny moves, high volume, requires intense focus |
| Day Trading | Minutes to hours | Same day | Exploit intraday moves, avoid overnight risk |
| Swing Trading | Days to weeks | Days to weeks | Fewer trades, larger moves, higher selectivity |
| Position Trading | Weeks to months | Weeks to months | Regime- or fundamentally driven |
Scalping
Characteristics:
- Trade duration: Seconds to a few minutes
- Profit targets: 1-10 ticks/pips
- High frequency: Dozens to hundreds of trades per day
- Requires: Fast execution, tight spreads, intense focus
Best For: Full-time traders with direct market access and low commissions.
Challenges: High transaction costs, requires constant attention, mentally exhausting.
Day Trading
Characteristics:
- Trade duration: Minutes to hours
- All positions closed by market close
- No overnight risk exposure
- Typical trades: 1-10 per day
Best For: Traders who can dedicate market hours to active trading.
Advantages: No overnight gap risk, fresh start each day, compounds learning quickly.
Swing Trading
Characteristics:
- Trade duration: Days to weeks
- Captures multi-day price swings
- Fewer trades, higher quality setups
- Can be done with a full-time job
Best For: Part-time traders, those seeking work-life balance.
Advantages: Less screen time, larger profit targets, lower transaction costs as percentage of gains.
Best Practices for Style Selection
- Match trading style to available time and lifestyle
- Start with longer timeframes (swing) before attempting shorter (day/scalp)
- Account for transaction costs relative to target profits
- Consider psychological fit: can you hold overnight? Can you focus for hours?
- Paper trade the style before committing capital
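The point about transaction costs relative to target profits can be made concrete with a small calculation. The sketch below is illustrative only: the function name and the per-trade figures are hypothetical, chosen to show why the same fixed cost weighs far more on a scalping target than on a swing target.

```python
def cost_share_of_target(target_profit, round_trip_cost):
    """Fraction of the gross profit target consumed by round-trip costs."""
    if target_profit <= 0:
        raise ValueError("target_profit must be positive")
    return round_trip_cost / target_profit


# Hypothetical per-trade figures in account currency (not market data).
scalp = cost_share_of_target(target_profit=10.0, round_trip_cost=4.0)
swing = cost_share_of_target(target_profit=400.0, round_trip_cost=4.0)
print(f"scalp: {scalp:.0%}, swing: {swing:.0%}")  # scalp: 40%, swing: 1%
```

A style whose costs consume a large share of the target needs a correspondingly higher win rate or payoff to stay profitable.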
4.2 Strategy Archetypes
Definition
Strategy archetypes classify trading strategies by the market behavior they exploit, not by timeframe or instrument. Archetypes describe why a strategy has expected positive return under certain conditions.
Core Principles
- Behavioral Alignment: Each archetype exploits a specific market dynamic
- Regime Dependence: Archetypes rotate in effectiveness across regimes
- Style Purity: Mixing archetypes without intention creates fragile strategies
- Distinct Risk Profiles: Each archetype has characteristic skew and drawdowns
- Complementarity: Robust systems often combine multiple archetypes with clear hierarchy
Common Use Cases
- Strategy selection based on market conditions
- Portfolio diversification across archetypes
- Understanding strategy performance attribution
- Designing hybrid approaches with clear signal hierarchy
- Setting appropriate expectations for win rate and skew
Trend Following
Core Premise: Markets exhibit persistent directional regimes driven by sustained capital flows.
What it answers:
Should we be exposed in this direction?
Mechanism:
- Identify sustained directional movement
- Enter after trend confirmation
- Exit on regime break, not noise
Risk Profile:
- Win Rate: 30-40% (low)
- Skew: POSITIVE (occasional large wins)
- Tail Risk: MODERATE - losses are bounded by stops
- Drawdowns: Extended choppy periods cause bleeding
Characteristics:
- Large winners offset frequent small losses
- Performs well in trending markets
- Underperforms in range-bound markets
- Positive skew means you profit from fat tails
Example Strategies:
- Moving average crossovers
- Donchian channel breakouts
- Time-series momentum
- Managed futures / CTA strategies
Risk Warning: Trend following has LOW win rate. Do NOT expect to be right often. The edge comes from letting winners run far enough to compensate for frequent small losses.
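The trade-off between win rate and payoff size can be checked with the standard expectancy formula, E = p * W - (1 - p) * L. The sketch below is a minimal illustration; the function names and the example inputs (expressed in multiples of risk, R) are hypothetical, not measured results.

```python
def expectancy(win_rate, avg_win, avg_loss):
    """Expected P&L per trade; avg_loss is given as a positive magnitude."""
    return win_rate * avg_win - (1.0 - win_rate) * avg_loss


def breakeven_payoff_ratio(win_rate):
    """Minimum avg_win / avg_loss ratio needed for zero expectancy."""
    if not 0.0 < win_rate < 1.0:
        raise ValueError("win_rate must be strictly between 0 and 1")
    return (1.0 - win_rate) / win_rate


# Hypothetical trend-following profile: 35% winners, wins of 3R, losses of 1R.
trend_edge = expectancy(win_rate=0.35, avg_win=3.0, avg_loss=1.0)

# Payoff ratio a 35% win rate needs just to break even (about 1.86).
required_ratio = breakeven_payoff_ratio(0.35)
```

With these assumed numbers the expectancy is about 0.40R per trade despite losing 65% of the time, which is the arithmetic behind letting winners run.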
Momentum
Core Premise: Price movement exhibits short-term persistence due to acceleration, positioning pressure, and behavioral effects.
What it answers:
Is directional pressure increasing or fading right now?
Mechanism:
- Measure rate of change or acceleration
- Enter during expansion of pressure
- Exit as momentum decays
Risk Profile:
- Win Rate: 50-60% (moderate)
- Skew: NEUTRAL to slightly positive
- Tail Risk: MODERATE - vulnerable to sharp reversals
- Drawdowns: Quick reversals can trap momentum traders
Characteristics:
- Higher win rate than trend following
- Shorter holding periods
- Vulnerable to sharp reversals
- Works both within and outside trends
Example Strategies:
- RSI / ROC momentum
- MACD-based systems
- Cross-sectional momentum
- Short-term continuation models
Relationship to Trend Following:
Trend defines regime. Momentum defines timing.
In practice, trend is often used as a filter and momentum as an entry/exit engine.
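One way to express "trend as filter, momentum as trigger" is sketched below. It assumes `close` is a pandas Series of closing prices; the function name, the 200-bar moving average, and the 10-bar rate of change are illustrative choices, not parameters prescribed by this document.

```python
import pandas as pd


def trend_filtered_momentum(close, trend_window=200, roc_window=10):
    """Long signal: price above its moving average AND positive rate of change."""
    sma = close.rolling(trend_window).mean()
    uptrend = close > sma               # trend defines the regime
    roc = close.pct_change(roc_window)  # momentum defines the timing
    momentum_up = roc > 0
    return uptrend & momentum_up
```

Bars without enough history for either window evaluate to False, so the function does not signal during warm-up.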
Breakouts (Transition Archetype)
Core Premise: Periods of volatility compression are followed by expansion as new information is revealed.
What it answers:
Is a new move beginning?
Role in Systems:
- Often marks trend initiation
- Combines momentum entry with trend potential
- Accepts false breakouts as cost of discovery
Risk Profile:
- Win Rate: 40-50%
- Skew: POSITIVE (breakouts that work tend to run)
- Tail Risk: MODERATE - false breakouts are common
- Drawdowns: Choppy markets produce many false signals
Example Strategies:
- Range breakouts
- Volatility squeeze breakouts (Bollinger Band squeeze)
- Opening range breakouts
Usage Note: Breakouts bridge momentum (entry timing) and trend (regime establishment). They work best when combined with regime filters.
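Volatility compression can be detected in several ways; the sketch below uses Bollinger bandwidth relative to its own recent history. It assumes `close` is a pandas Series, and the window lengths and the 20% quantile are illustrative assumptions rather than recommended settings.

```python
import pandas as pd


def bollinger_bandwidth(close, window=20, num_std=2.0):
    """Width of the Bollinger Bands relative to the middle band."""
    mid = close.rolling(window).mean()
    std = close.rolling(window).std()
    upper = mid + num_std * std
    lower = mid - num_std * std
    return (upper - lower) / mid


def squeeze(close, window=20, num_std=2.0, lookback=120, quantile=0.2):
    """True where bandwidth sits in the lowest `quantile` of its recent range."""
    bandwidth = bollinger_bandwidth(close, window, num_std)
    threshold = bandwidth.rolling(lookback).quantile(quantile)
    return bandwidth <= threshold
```

A squeeze flag only marks compression; the breakout entry itself still needs a directional trigger and, per the usage note, a regime filter.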
Volatility
Core Premise: Volatility clusters: quiet and active periods alternate, and an expansion in range tends to produce moves large enough to trade regardless of direction.
What it answers:
Is the market moving enough to capture profitable swings?
Mechanism:
- Trade when volatility expands
- Direction-agnostic (long or short based on breakout direction)
- Exit when volatility contracts or move exhausts
Risk Profile:
- Win Rate: 30-50%
- Skew: NEUTRAL to slightly positive
- Tail Risk: MODERATE - whipsaws during false breakouts
- Drawdowns: Choppy, range-bound markets produce losses
Characteristics:
- Works in both trending and ranging markets
- Requires volatility to capture meaningful moves
- Often uses ATR, Bollinger Bands, or volatility indicators
- Not directionally biased
Example Strategies:
- ATR breakout strategies
- Bollinger Band squeeze breakouts
- Straddle/strangle equivalents in spot
- Keltner Channel expansions
- Volatility expansion trades
Usage Note: Volatility strategies don't predict direction - they trade movement size. Best in markets with alternating quiet and active periods.
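Since ATR is the most common input to the strategies listed above, a minimal sketch of its calculation follows. It assumes `high`, `low`, and `close` are aligned pandas Series, and it averages true range with a simple rolling mean; Wilder's original ATR uses a smoothed average instead, so values will differ slightly from charting platforms that follow Wilder.

```python
import pandas as pd


def true_range(high, low, close):
    """Largest of: bar range, gap up from prior close, gap down from prior close."""
    prev_close = close.shift(1)
    ranges = pd.concat(
        [high - low, (high - prev_close).abs(), (low - prev_close).abs()],
        axis=1,
    )
    return ranges.max(axis=1)


def atr(high, low, close, window=14):
    """Average true range as a simple rolling mean (an assumption, see lead-in)."""
    return true_range(high, low, close).rolling(window).mean()
```

The first bar has no prior close, so its true range falls back to the plain high-low range.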
Mean Reversion
Core Premise: Prices temporarily deviate from equilibrium and revert as liquidity providers and arbitrageurs act.
What it answers:
Has price extended too far from fair value?
Mechanism:
- Buy oversold conditions
- Sell overbought conditions
- Exit near perceived mean
Risk Profile:
- Win Rate: 60-70% (high)
- Skew: NEGATIVE (occasional large losses)
- Tail Risk: HIGH - can fail catastrophically during trend emergence
- Drawdowns: Single large losses can wipe out many small wins
Characteristics:
- High win rate is psychologically seductive
- Performs best in range-bound regimes
- DANGEROUS in trending markets
- Negative skew means tail losses are disproportionately large
Example Strategies:
- RSI mean reversion
- Bollinger Band reversion
- Pairs trading
- Statistical arbitrage
CRITICAL WARNING:
DO NOT describe mean reversion as "lower risk" or "safer."
High win rate does NOT equal low risk. The negative skew means losses, when they occur, are disproportionately large. Mean reversion strategies can and do blow up when trends emerge.
"The trend is your friend" exists because fighting trends is dangerous.
Archetype Risk Summary Table
| Archetype | Win Rate | Skew | Tail Risk | Best Regime |
|---|---|---|---|---|
| Trend Following | 30-40% | Positive | Moderate | Strong trends |
| Momentum | 50-60% | Neutral | Moderate | Any with clear direction |
| Volatility | 30-50% | Neutral | Moderate | High volatility periods |
| Breakouts | 40-50% | Positive | Moderate | After compression |
| Mean Reversion | 60-70% | Negative | High | Range-bound only |
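The Skew column can be measured directly from a trade log. The sketch below computes sample skewness (third standardized moment) for two hypothetical P&L lists in multiples of risk; the numbers are made up to mimic a high-win-rate, negative-skew profile and a low-win-rate, positive-skew profile, and are not backtest results.

```python
import numpy as np


def sample_skewness(pnl):
    """Third standardized moment of per-trade P&L (population form)."""
    x = np.asarray(pnl, dtype=float)
    centered = x - x.mean()
    m2 = np.mean(centered ** 2)
    m3 = np.mean(centered ** 3)
    return m3 / m2 ** 1.5


# Hypothetical trade logs: 7 small wins and one large loss vs. mostly
# small losses and one large win.
mean_reversion_pnl = [1, 1, 1, 1, 1, 1, 1, -1, -1, -6]
trend_pnl = [-1, -1, -1, -1, -1, -1, 1, 1, 1, 8]

mr_skew = sample_skewness(mean_reversion_pnl)  # negative
tf_skew = sample_skewness(trend_pnl)           # positive
```

Tracking this statistic alongside win rate makes it harder to mistake a 70% win rate for low risk.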
Archetype Contraindications
| Archetype | DO NOT USE WHEN | Why It Fails |
|---|---|---|
| Trend Following | Choppy/range-bound markets | Whipsaws accumulate losses |
| Momentum | End of trend / momentum divergence | Reversal traps |
| Volatility | Sustained low volatility | No meaningful moves to capture |
| Breakouts | Low volatility / no compression | False breakouts dominate |
| Mean Reversion | Strong directional trends | Will blow up |
Best Practices for Archetype Selection
- Match archetypes to current market regime
- Combine complementary archetypes with clear hierarchy
- Understand skew and tail risk for each archetype
- Never combine conflicting archetypes without explicit priority (e.g., momentum vs mean reversion)
- Monitor regime transitions and adapt allocation
- Size positions according to archetype risk profile, not just win rate
4.3 Building a Trade Plan
Definition
A trade plan is a comprehensive document that defines all aspects of your trading approach, including entry criteria, exit rules, risk parameters, and psychological guidelines. It serves as your trading rulebook and removes emotional decision-making from the trading process.
Core Principles
- Pre-Definition: All rules defined before trading, not during
- Specificity: Rules must be clear enough that another trader could execute them
- Completeness: Plan covers all scenarios: entry, exit, sizing, timing
- Documentation: Written down and accessible during trading
- Evolution: Reviewed and updated based on performance data
Trade Plan Components
1. Entry Criteria
Conditions required before taking a trade:
- Price action setup (pattern, structure break, etc.)
- Indicator alignment (trend confirmation, momentum)
- Volume confirmation requirements
- Higher timeframe alignment
- Minimum reward-to-risk threshold
def entry_checklist(setup):
    """
    Validate all entry conditions are met
    """
    criteria = {
        'price_action_setup': setup.has_valid_pattern(),
        'trend_alignment': setup.higher_tf_trend == setup.trade_direction,
        'indicator_confirmation': setup.indicators_aligned(),
        'volume_confirmation': setup.volume > setup.avg_volume,
        'rr_threshold': setup.reward_risk >= 2.0,
    }
    # All criteria must be True
    return all(criteria.values()), criteria
2. Confirmation Rules
Additional validation before execution:
- Wait for candle close (don't enter on wick)
- Retest of broken level
- Volume spike on breakout
- Indicator crossover completion
3. Exit Criteria
Stop-Loss Rules:
- Technical placement (below support, above resistance)
- ATR-based calculation (e.g., 1.5x ATR)
- Maximum dollar/percentage loss per trade
Take-Profit Rules:
- Technical targets (next resistance/support)
- Risk-multiple targets (1.5R, 2R, 3R)
- Partial exit strategy (scale out at levels)
Trailing Stop Rules:
- When to activate trailing stop
- How much to trail (fixed % or ATR-based)
Indicator-Driven Exits:
- Specific conditions that signal exit regardless of P/L
Trade Plan Template
TRADE PLAN: [Strategy Name]
Date: [Creation Date]
Last Updated: [Date]
MARKET & TIMEFRAME:
- Markets traded: [e.g., ES, NQ, BTC]
- Primary timeframe: [e.g., 15-min]
- Higher timeframe for context: [e.g., 4-hour]
ENTRY RULES:
1. [Specific condition 1]
2. [Specific condition 2]
3. [Specific condition 3]
- Minimum R:R required: [e.g., 2:1]
CONFIRMATION:
- [What confirms the entry]
EXIT RULES:
- Stop-loss: [How determined]
- Take-profit 1: [Level and size]
- Take-profit 2: [Level and size]
- Trailing stop: [When and how]
POSITION SIZING:
- Risk per trade: [e.g., 1% of account]
- Maximum positions: [e.g., 3]
TRADING HOURS:
- [e.g., 8:00 AM - 11:30 AM EST]
DAILY RULES:
- Max daily loss: [e.g., 3%]
- Stop trading after: [e.g., 3 consecutive losses]
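The "Risk per trade" entry in the template translates into a position size once the stop distance is known. A minimal sketch, assuming a linear instrument where P&L per unit equals the price difference (contract multipliers, lot sizes, and rounding are deliberately ignored; the function name is illustrative):

```python
def position_size(account_equity, risk_fraction, entry_price, stop_price):
    """Units to trade so that a stop-out loses `risk_fraction` of equity."""
    risk_per_unit = abs(entry_price - stop_price)
    if risk_per_unit == 0:
        raise ValueError("stop_price must differ from entry_price")
    return (account_equity * risk_fraction) / risk_per_unit


# Hypothetical account: 50,000 equity, 1% risk, entry 100, stop 98.
units = position_size(50_000, 0.01, entry_price=100.0, stop_price=98.0)
```

With these assumed inputs the risk budget is 500 and the stop distance is 2, giving 250 units.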
Best Practices for Trade Plans
- Write your plan when not actively trading (clear head)
- Be specific: vague rules lead to inconsistent execution
- Include "what if" scenarios and how to handle them
- Review plan performance weekly/monthly
- Update based on data, not emotions or single trades
- Treat the plan as non-negotiable during trading hours
4.4 Signal Types
Definition
Signal types classify trading signals by their role in the decision-making process. Understanding signal types helps traders combine signals effectively and build robust strategies.
The Four Signal Types
| Type | Purpose | When It Fires | Example |
|---|---|---|---|
| Entry | Triggers opening a new position | When conditions favor starting a trade | macd_bullish_cross - MACD line crosses above signal line |
| Exit | Triggers closing an existing position | When conditions suggest ending a trade | rsi_overbought - RSI exceeds 70, indicating potential reversal |
| Filter | Screens out trades / Sets context | Continuous condition check that gates other signals | adx_strong_trend - Only trade when ADX > 25 |
| Confirmation | Validates another signal | Secondary check that strengthens conviction | macd_positive - Histogram positive confirms bullish momentum |
Mapping Signal Types to Archetypes
| Archetype | Typical Signal Role |
|---|---|
| Trend Following | Filter - Establishes directional context |
| Momentum | Entry / Exit - Times entries and exits |
| Mean Reversion | Entry - Triggers counter-trend position |
| Regime Detection | Meta-Filter - Gates which archetypes to deploy |
How Signal Types Work Together
A typical strategy combines multiple signal types:
def generate_trade_signal(data):
    """
    Combine signal types for robust decision-making
    """
    # FILTER: Is the environment right for trading?
    if not filter_price_above_200_sma(data):
        return None  # Only trade in uptrend
    # ENTRY: Should we get in now?
    if not entry_rsi_oversold(data):
        return None  # Wait for entry trigger
    # CONFIRMATION: Are we sure?
    if not confirmation_macd_positive(data):
        return None  # Need momentum confirmation
    return "LONG"  # All conditions met

def check_exit(data, position):
    """
    Exit signals close positions
    """
    if position == "LONG":
        if exit_rsi_overbought(data):
            return "CLOSE"
    return "HOLD"
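The helper functions called above are not defined in this section. One possible sketch of two of them follows, assuming `data` is a pandas DataFrame with a `close` column holding history up to the current bar. The RSI here uses simple rolling averages of gains and losses rather than Wilder's smoothing, so readings will differ somewhat from most charting packages; the 30 threshold is a conventional default, not a tested value.

```python
import pandas as pd


def rsi(close, window=14):
    """RSI from simple rolling averages of gains and losses (an assumption)."""
    delta = close.diff()
    avg_gain = delta.clip(lower=0).rolling(window).mean()
    avg_loss = delta.clip(upper=0).abs().rolling(window).mean()
    rs = avg_gain / avg_loss
    return 100 - 100 / (1 + rs)


def filter_price_above_200_sma(data):
    """FILTER: latest close is above its 200-bar simple moving average."""
    sma = data["close"].rolling(200).mean()
    return bool(data["close"].iloc[-1] > sma.iloc[-1])


def entry_rsi_oversold(data, threshold=30):
    """ENTRY: latest RSI reading is below the oversold threshold."""
    return bool(rsi(data["close"]).iloc[-1] < threshold)
```

Both helpers return False when there is not yet enough history, because comparisons against NaN evaluate to False.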
Signal Type Characteristics
Entry Signals:
- Fire at specific moments when trade setup is complete
- Typically require immediate action
- Often based on crossovers, breakouts, or pattern completions
- Examples: sma_cross_up, macd_bullish_cross, rsi_oversold
Exit Signals:
- Indicate when to close existing positions
- Can be based on profit targets, stop losses, or reversal conditions
- Should be defined before trade entry
- Examples: sma_cross_down, macd_bearish_cross, rsi_overbought
Filter Signals:
- Provide ongoing context rather than discrete triggers
- Answer: "Should we be trading in this direction at all?"
- Reduce false signals by ensuring market environment is favorable
- Examples: is_above_sma, adx_strong_trend, vwap_above
Confirmation Signals:
- Add conviction to entry signals
- Reduce false positives by requiring multiple conditions
- Often come from different indicator categories (trend + momentum + volume)
- Examples: macd_positive, obv_bullish, bb_squeeze
Best Practices for Signal Types
- Layer signals intentionally: Filter -> Entry -> Confirmation is a common flow
- Don't over-confirm: 2-3 signals is usually sufficient; more adds lag
- Match signal types to archetype: Trend-following needs different filters than mean reversion
- Test signal combinations: Backtest which combinations improve expectancy
- Document signal roles: Know why each signal is in your strategy
4.5 Entry Logic Frameworks
Definition
Entry logic frameworks are systematic approaches for determining when to initiate positions. A robust entry framework provides clear, repeatable criteria that generate consistent signal quality.
Core Principles
- Objectivity: Rules should be unambiguous and programmable
- Edge Definition: Understand why the entry has expected positive return
- Confirmation: Multiple confirming factors improve reliability
- Timing: Entry timing affects risk/reward ratio
- Filter Quality: Pre-entry filters reduce false signals
Signal Generation Approaches
Indicator-Based:
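def indicator_entry(data, indicator_threshold):
    """
    Enter when indicator crosses above threshold.
    Assumes calculate_indicator returns a pandas Series.
    """
    indicator_value = calculate_indicator(data)
    above = indicator_value > indicator_threshold
    # A cross is the first bar above the threshold, not every bar above it
    signal = above & ~above.shift(1, fill_value=True)
    return signal
Pattern-Based:
def pattern_entry(data, pattern_func):
    """
    Enter when pattern is detected
    """
    pattern_detected = pattern_func(data)
    return pattern_detected
Statistical-Based:
def statistical_entry(returns, threshold_zscore=2, window=60):
    """
    Enter on statistical deviation.
    Rolling statistics keep each z-score free of future data;
    the 60-bar window is an illustrative default.
    """
    rolling_mean = returns.rolling(window).mean()
    rolling_std = returns.rolling(window).std()
    zscore = (returns - rolling_mean) / rolling_std
    long_signal = zscore < -threshold_zscore
    short_signal = zscore > threshold_zscore
    return long_signal, short_signal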
Entry Confirmation Methods
Multi-Factor Confirmation:
def multi_factor_entry(data):
    """
    Require multiple signals to align.
    Assumes each helper returns a boolean pandas Series.
    """
    trend_signal = calculate_trend_signal(data)
    momentum_signal = calculate_momentum_signal(data)
    volume_signal = calculate_volume_signal(data)
    # Cast to int first: adding boolean Series gives a logical OR, not a count
    confirmation_count = (
        trend_signal.astype(int)
        + momentum_signal.astype(int)
        + volume_signal.astype(int)
    )
    # Require at least 2 of 3 signals
    entry = confirmation_count >= 2
    return entry
Timeframe Confirmation:
def multi_timeframe_entry(data_dict):
    """
    Require alignment across timeframes.
    Assumes the three results share one index, e.g. higher-timeframe
    values forward-filled onto the 1h bars from completed bars only.
    """
    htf_trend = calculate_trend(data_dict['daily'])
    mtf_signal = calculate_signal(data_dict['4h'])
    ltf_trigger = calculate_trigger(data_dict['1h'])
    # Long: HTF uptrend + MTF bullish + LTF trigger
    long_entry = (htf_trend > 0) & (mtf_signal > 0) & ltf_trigger
    return long_entry
Entry Timing Refinements
Pullback Entry:
def pullback_entry(price, trend_signal, pullback_threshold=0.02):
    """
    Enter on pullback within trend
    """
    recent_high = price.rolling(20).max()
    pullback_pct = (recent_high - price) / recent_high
    entry = trend_signal & (pullback_pct > pullback_threshold)
    return entry
Breakout Entry:
def breakout_entry(price, period=20):
    """
    Enter on breakout of recent range
    """
    high = price.rolling(period).max()
    low = price.rolling(period).min()
    long_entry = price > high.shift(1)
    short_entry = price < low.shift(1)
    return long_entry, short_entry
Entry Filters
Regime Filter:
def regime_filtered_entry(signal, regime):
    """
    Only trade in favorable regime
    """
    favorable_regime = regime == 'trending'
    filtered_signal = signal & favorable_regime
    return filtered_signal
Volatility Filter:
def volatility_filtered_entry(signal, volatility, min_vol, max_vol):
    """
    Avoid extremes of volatility
    """
    vol_okay = (volatility > min_vol) & (volatility < max_vol)
    filtered_signal = signal & vol_okay
    return filtered_signal
Best Practices for Entry Logic
- Define entry criteria precisely before coding
- Use multiple confirmation factors
- Test entries with various exit rules to isolate entry quality
- Track entry efficiency (how far price moves from entry in intended direction)
- Avoid overly complex entry rules that overfit
- Include regime and volatility filters
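Entry efficiency is often tracked as maximum favorable excursion (MFE) and maximum adverse excursion (MAE). A minimal sketch, assuming `highs` and `lows` are the bar highs and lows observed while the trade was open; the function name and the example prices are hypothetical.

```python
def excursions(entry_price, highs, lows, direction="long"):
    """Return (MFE, MAE) as fractions of the entry price."""
    max_high = max(highs)
    min_low = min(lows)
    if direction == "long":
        mfe = (max_high - entry_price) / entry_price
        mae = (entry_price - min_low) / entry_price
    else:
        mfe = (entry_price - min_low) / entry_price
        mae = (max_high - entry_price) / entry_price
    # Floor at zero: price may never have moved in one of the two directions
    return max(mfe, 0.0), max(mae, 0.0)


# Hypothetical long trade entered at 100.
mfe, mae = excursions(100.0, highs=[101, 104, 103], lows=[99, 100, 98])
```

Here the trade ran 4% in favor and 2% against. Comparing MFE with realized profit across many trades shows how much of the available move the exits capture.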
4.6 Exit Logic
Definition
Exit logic determines when to close positions. Effective exit management is often more important than entry timing for overall strategy performance.
Core Principles
- Protect Capital: Exits should limit losses first, maximize gains second
- Let Winners Run: Don't exit winning trades prematurely
- Cut Losers Short: Remove losing positions quickly
- Rule-Based: Exits should be as systematic as entries
- Exit Reason: Know why you're exiting (stop, target, time, signal)
Stop-Loss Exits
Fixed Stop:
def fixed_stop_exit(entry_price, current_price, stop_pct=0.02, direction='long'):
    """
    Exit on fixed percentage loss
    """
    if direction == 'long':
        stop_price = entry_price * (1 - stop_pct)
        exit_signal = current_price <= stop_price
    else:
        stop_price = entry_price * (1 + stop_pct)
        exit_signal = current_price >= stop_price
    return exit_signal, stop_price
ATR-Based Stop:
def atr_stop_exit(entry_price, current_price, atr, multiplier=2.0, direction='long'):
    """
    Exit based on ATR distance
    """
    stop_distance = atr * multiplier
    if direction == 'long':
        stop_price = entry_price - stop_distance
        exit_signal = current_price <= stop_price
    else:
        stop_price = entry_price + stop_distance
        exit_signal = current_price >= stop_price
    return exit_signal, stop_price
Profit Target Exits
Fixed Target:
def fixed_target_exit(entry_price, current_price, target_pct=0.04, direction='long'):
    """
    Exit on fixed percentage profit
    """
    if direction == 'long':
        target_price = entry_price * (1 + target_pct)
        exit_signal = current_price >= target_price
    else:
        target_price = entry_price * (1 - target_pct)
        exit_signal = current_price <= target_price
    return exit_signal, target_price
Risk-Multiple Target:
def risk_multiple_target(entry_price, stop_price, risk_multiple=2.0, direction='long'):
    """
    Target based on multiple of risk
    """
    risk = abs(entry_price - stop_price)
    if direction == 'long':
        target = entry_price + (risk * risk_multiple)
    else:
        target = entry_price - (risk * risk_multiple)
    return target
Trailing Stop Exits
Simple Trailing Stop:
def trailing_stop(prices, trail_pct=0.02, direction='long'):
    """
    Trail stop behind price.
    prices should start at the entry bar so the extreme is measured from entry.
    """
    if direction == 'long':
        highest = prices.expanding().max()
        stop = highest * (1 - trail_pct)
    else:
        lowest = prices.expanding().min()
        stop = lowest * (1 + trail_pct)
    return stop
ATR Trailing Stop:
def atr_trailing_stop(prices, atr, multiplier=2.5, direction='long'):
    """
    Trail stop by ATR distance
    """
    if direction == 'long':
        highest = prices.expanding().max()
        # cummax keeps the stop from loosening when ATR expands
        stop = (highest - (atr * multiplier)).cummax()
    else:
        lowest = prices.expanding().min()
        # cummin does the same for short positions
        stop = (lowest + (atr * multiplier)).cummin()
    return stop
Signal-Based Exits
Indicator Exit:
def indicator_exit(position, indicator_value, exit_threshold):
    """
    Exit when indicator crosses threshold.
    position is the string 'long' or 'short'.
    """
    if position == 'long':
        exit_signal = indicator_value < exit_threshold
    else:
        exit_signal = indicator_value > exit_threshold
    return exit_signal
Opposing Signal Exit:
def opposing_signal_exit(current_position, new_signal):
    """
    Exit when opposite signal triggers.
    Both arguments are signed numbers (+1 long, -1 short, 0 flat).
    """
    # A negative product means the signs disagree
    exit_signal = (current_position * new_signal) < 0
    return exit_signal
Time-Based Exits
def time_based_exit(entry_time, current_time, max_holding_period):
    """
    Exit after maximum holding period
    """
    holding_time = current_time - entry_time
    exit_signal = holding_time >= max_holding_period
    return exit_signal
Best Practices for Exit Logic
- Always have a stop-loss for every position
- Use ATR-based stops for volatility-adjusted risk
- Consider trailing stops for trend-following strategies
- Don't let winners turn into losers (use break-even stops)
- Test different exit strategies with same entry to find optimal
- Document reason for each exit in trade log
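The break-even stop mentioned above can be sketched as a small rule: once open profit reaches a chosen multiple of initial risk, move the stop to the entry price. The function name and the 1R trigger are illustrative assumptions; costs and slippage are ignored, so "break-even" here means entry price, not net zero.

```python
def breakeven_stop(entry_price, initial_stop, current_price,
                   trigger_r=1.0, direction="long"):
    """Return the stop to use: entry price once profit >= trigger_r * risk."""
    risk = abs(entry_price - initial_stop)
    if direction == "long":
        open_profit = current_price - entry_price
    else:
        open_profit = entry_price - current_price
    if open_profit >= trigger_r * risk:
        return entry_price
    return initial_stop
```

The function is stateless, so a caller should keep the tighter of the previously active stop and the returned value; otherwise the stop would loosen again if price retreats below the trigger.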
4.7 Time-Based Logic
Definition
Time-based logic incorporates temporal factors into trading decisions, including holding period constraints, session filtering, and calendar-based adjustments.
Core Principles
- Holding Period: Strategies have optimal holding periods
- Session Selection: Not all trading hours are equal
- Calendar Awareness: Certain dates have predictable behavior
- Time Decay: Time affects option and volatility strategies
- Periodicity: Returns may cluster at specific times
Session Filters
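from datetime import time

def session_filter(timestamp, allowed_sessions):
    """
    Filter trades by market session
    """
    # Example sessions; timestamp must be in the same time zone as these hours
    sessions = {
        'asia': (time(0, 0), time(8, 0)),
        'london': (time(8, 0), time(16, 0)),
        'new_york': (time(13, 0), time(21, 0)),
    }
    # Compare time objects; comparing a time to a string raises TypeError
    current = timestamp.time()
    for session in allowed_sessions:
        start, end = sessions[session]
        if start <= current <= end:
            return True
    return False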
Holding Period Constraints
def enforce_holding_period(position_time, min_hold, max_hold):
    """
    Enforce minimum and maximum holding periods
    """
    can_exit = position_time >= min_hold
    must_exit = position_time >= max_hold
    return can_exit, must_exit
Calendar Filters
def calendar_filter(date):
    """
    Avoid trading on specific dates.
    date is a pandas Timestamp (days_in_month is a Timestamp attribute).
    """
    # Avoid NFP days, FOMC days, etc.
    # load_economic_calendar is a placeholder for your own calendar source
    blackout_dates = load_economic_calendar()
    if date in blackout_dates:
        return False
    # Avoid first/last calendar day of month for some strategies
    if date.day == 1 or date.day == date.days_in_month:
        return False
    return True
4.8 Regime Detection & Filtering
Conceptual Anchor
Market regimes explain why archetypes rotate in effectiveness.
Regime detection does not predict markets - it selects which behaviors to exploit.
Definition
Regime detection identifies the current market state (trending, ranging, volatile, calm) to adapt strategy behavior or filter signals inappropriate for current conditions.
Core Principles
- Market States: Markets cycle through distinct regimes
- Strategy Matching: Different strategies excel in different regimes
- Lag Acceptance: Regime detection has inherent lag
- Probabilistic: Regime identification is probabilistic, not certain
- Adaptation: Strategies should adapt to detected regime
Trend vs. Range Detection
ADX-Based:
def adx_regime(adx, trend_threshold=25, range_threshold=20):
    """
    Classify regime based on ADX (a single latest value)
    """
    if adx > trend_threshold:
        return 'trending'
    elif adx < range_threshold:
        return 'ranging'
    else:
        return 'transitional'
Efficiency Ratio:
def efficiency_ratio_regime(price, period=20):
    """
    Efficiency ratio: Direction / Volatility, evaluated at the latest bar
    """
    direction = abs(price.diff(period))
    volatility = price.diff().abs().rolling(period).sum()
    # Use the latest value: an if-test on a whole Series raises ValueError
    efficiency = (direction / volatility).iloc[-1]
    if efficiency > 0.6:
        return 'trending'
    elif efficiency < 0.3:
        return 'ranging'
    else:
        return 'mixed'
Volatility Regime Detection
Percentile-Based:
def volatility_regime(current_vol, historical_vol, lookback=252):
    """
    Classify volatility regime by percentile
    """
    # Share of the last `lookback` observations below the current reading
    window = historical_vol.iloc[-lookback:]
    percentile = (window < current_vol).mean()
    if percentile > 0.8:
        return 'high_volatility'
    elif percentile < 0.2:
        return 'low_volatility'
    else:
        return 'normal_volatility'
Hidden Markov Model:
from hmmlearn import hmm
def hmm_regime_detection(returns, n_states=2):
    """
    HMM-based regime detection.
    State labels are arbitrary integers; inspect model.means_ and
    model.covars_ to interpret them. Fitting and predicting on the same
    sample is in-sample, so refit on past data only when backtesting.
    """
    X = returns.values.reshape(-1, 1)
    model = hmm.GaussianHMM(n_components=n_states, covariance_type="diag")
    model.fit(X)
    regime = model.predict(X)
    return regime
Strategy Adaptation by Regime
def regime_adapted_strategy(price, regime):
    """
    Select strategy based on regime
    """
    if regime == 'trending':
        signal = trend_following_signal(price)
    elif regime == 'ranging':
        signal = mean_reversion_signal(price)
    else:
        signal = 0  # No trade in uncertain regime
    return signal
Best Practices for Regime Detection
- Accept lag in regime detection; don't expect real-time accuracy
- Use multiple regime indicators for confirmation
- Have default behavior for uncertain/transitional regimes
- Backtest regime detection separately from strategy
- Consider regime detection failures in risk management
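Combining several regime indicators with a default for disagreement can be sketched as a simple vote. The function name, the `min_agreement` parameter, and the `'uncertain'` label are assumptions for illustration; the labels fed in are whatever the individual detectors return.

```python
from collections import Counter


def combined_regime(votes, min_agreement=2, default="uncertain"):
    """Return the majority label, or `default` on weak or tied agreement."""
    counts = Counter(votes).most_common(2)
    if not counts:
        return default
    top_label, top_count = counts[0]
    tied = len(counts) > 1 and counts[1][1] == top_count
    if top_count < min_agreement or tied:
        return default
    return top_label


# Hypothetical detector outputs for the current bar.
regime = combined_regime(["trending", "trending", "mixed"])  # 'trending'
```

Returning a default label on disagreement gives the strategy layer an explicit state to map to "no trade" or reduced size.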
4.9 Data Quality & Preprocessing
Definition
Data quality and preprocessing ensure that strategy inputs are accurate, complete, and properly formatted. Poor data quality is a leading cause of strategy failure.
Core Principles
- Garbage In, Garbage Out: Strategy is only as good as its data
- Survivorship Bias: Ensure delisted assets are included
- Look-Ahead Bias: Never use future information
- Point-in-Time: Use data as it was available at each historical point
- Corporate Actions: Adjust for splits, dividends, mergers
Data Cleaning
Handling Missing Data:
def clean_missing_data(df):
    """
    Handle missing values appropriately
    """
    df = df.copy()  # Avoid mutating the caller's DataFrame
    # Forward fill for prices (common approach)
    # .ffill() replaces the deprecated fillna(method='ffill')
    df['close'] = df['close'].ffill()
    # Don't fill volume - missing volume is meaningful
    # df['volume'] remains NaN
    # Drop rows with critical missing data
    df = df.dropna(subset=['open', 'high', 'low', 'close'])
    return df
Outlier Detection:
def detect_outliers(returns, threshold=5):
    """
    Identify potential data errors
    """
    zscore = (returns - returns.mean()) / returns.std()
    outliers = abs(zscore) > threshold
    return outliers
Corporate Action Adjustments
Split Adjustment:
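def adjust_for_splits(price, splits):
    """
    Adjust historical prices for stock splits.
    splits maps split date -> ratio, assumed to mean new shares per old
    share (e.g. 2.0 for a 2-for-1 split).
    """
    adjusted = price.astype(float)
    for split_date, split_ratio in splits.items():
        # Scale down prices strictly before the split date;
        # the split-date price is already quoted post-split
        adjusted.loc[adjusted.index < split_date] /= split_ratio
    return adjusted
Dividend Adjustment:
def adjust_for_dividends(price, dividends):
    """
    Create total return series including dividends.
    dividends holds the cash amount per share on each ex-date.
    """
    # Align to the price index; dates without a dividend get a factor of 1
    dividends = dividends.reindex(price.index).fillna(0)
    # Calculate adjustment factor
    adjustment = (price + dividends) / price
    adjusted_price = price * adjustment.cumprod()
    return adjusted_price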
Avoiding Biases
Survivorship Bias Prevention:
def include_delisted(universe, date):
    """
    Include stocks that existed on date, even if later delisted
    """
    # Point-in-time universe
    available_stocks = universe[universe['listing_date'] <= date]
    available_stocks = available_stocks[
        (available_stocks['delisting_date'].isna()) |
        (available_stocks['delisting_date'] > date)
    ]
    return available_stocks
Look-Ahead Bias Prevention:
import pandas as pd
def point_in_time_data(df, target_date):
    """
    Return only data available on target_date
    """
    # Shift data to simulate reporting lag
    reporting_lag = pd.Timedelta(days=1)  # Or appropriate lag
    available_data = df[df.index <= target_date - reporting_lag]
    return available_data
Best Practices for Data Quality
- Validate data against multiple sources
- Check for obvious errors (negative prices, impossible moves)
- Document data sources and any adjustments made
- Use point-in-time databases when available
- Include delisted securities in backtests
- Account for reporting lags in fundamental data
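The "obvious errors" check can be automated with a few vectorized tests. The sketch below assumes a pandas DataFrame with `open`, `high`, `low`, and `close` columns; the function name and the 50% single-bar move threshold are illustrative and should be tuned per asset class.

```python
import pandas as pd


def find_bad_bars(df, max_abs_return=0.5):
    """Flag bars with non-positive prices, inconsistent OHLC, or huge jumps."""
    price_cols = ["open", "high", "low", "close"]
    non_positive = (df[price_cols] <= 0).any(axis=1)
    # High must be the largest and low the smallest value of the bar
    inconsistent = (
        (df["high"] < df[["open", "close", "low"]].max(axis=1))
        | (df["low"] > df[["open", "close", "high"]].min(axis=1))
    )
    jump = df["close"].pct_change().abs() > max_abs_return
    return non_positive | inconsistent | jump
```

Flagged bars are candidates for review against a second data source rather than automatic deletion, since genuine extreme moves also trip the jump test.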
4.10 Backtesting Best Practices
Definition
Backtesting is the process of applying a trading strategy to historical data to evaluate how it would have performed. Proper backtesting methodology is critical for realistic performance estimation.
Core Principles
- Historical Simulation: Test strategy on past data
- Out-of-Sample Testing: Reserve data for validation
- Realistic Assumptions: Include costs, slippage, constraints
- Multiple Tests: Test across different periods and conditions
- Skepticism: Assume backtest overstates live performance
Backtesting Framework
class Backtester:
    def __init__(self, data, strategy, capital=100000):
        self.data = data
        self.strategy = strategy
        self.capital = capital
        self.positions = []
        self.equity_curve = []

    def run(self):
        """
        Execute backtest
        """
        equity = self.capital
        position = 0
        for i in range(len(self.data)):
            # Update equity with the position held coming into this bar
            if position != 0:
                returns = self.data['returns'].iloc[i]
                equity *= (1 + position * returns)
            # Get signal from data up to and including this bar
            signal = self.strategy.generate_signal(self.data.iloc[:i+1])
            # Execute trades with realistic costs; the new position earns
            # returns from the next bar on, which avoids look-ahead bias
            if signal != position:
                trade_cost = self.calculate_costs(position, signal, equity)
                equity -= trade_cost
                position = signal
            self.positions.append(position)
            self.equity_curve.append(equity)
        return self.calculate_metrics()

    def calculate_costs(self, old_pos, new_pos, equity):
        """
        Realistic transaction costs, charged on current equity
        """
        commission = 0.001  # 0.1%
        slippage = 0.0005  # 0.05%
        return abs(new_pos - old_pos) * (commission + slippage) * equity

    def calculate_metrics(self):
        """
        Minimal placeholder: total return from the equity curve
        """
        return {'total_return': self.equity_curve[-1] / self.capital - 1}
Realistic Assumptions
Transaction Costs:
def calculate_transaction_costs(trade_value, asset_class):
"""
Realistic cost assumptions by asset class
"""
costs = {
'equities': 0.001, # 10 bps
'futures': 0.0002, # 2 bps
'fx': 0.0001, # 1 bp
'crypto': 0.002, # 20 bps
'options': 0.01 # $1 per contract + spread
}
return trade_value * costs.get(asset_class, 0.001)
Slippage Modeling:
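import numpy as np

def estimate_slippage(trade_size, adv, volatility):
    """
    Market impact estimation (square-root rule).
    Returns the expected impact as a fraction of price.
    """
    # Fraction of average daily volume (ADV) consumed by the trade
    participation = trade_size / adv
    # 0.1 is an illustrative coefficient; calibrate it to your own fills
    impact = volatility * np.sqrt(participation) * 0.1
    return impact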
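To make the square-root rule above concrete, the sketch below re-implements the same formula and evaluates it for a hypothetical order. The order size, ADV, volatility, and $50 share price are made-up inputs, and the `coefficient` parameter is an assumption added here so the 0.1 scaling can be varied.

```python
import numpy as np

def estimate_slippage(trade_size, adv, volatility, coefficient=0.1):
    """Square-root market impact, returned as a fraction of price."""
    participation = trade_size / adv
    return coefficient * volatility * np.sqrt(participation)

# Hypothetical order: 10,000 shares in a stock that trades 1,000,000 shares
# per day with 2% daily volatility
shares = 10_000
impact = estimate_slippage(trade_size=shares, adv=1_000_000, volatility=0.02)
impact_bps = impact * 10_000

# Dollar cost, assuming a $50 share price
cost_dollars = impact * shares * 50.0
print(f"Estimated impact: {impact_bps:.1f} bps, about ${cost_dollars:.0f}")
```

With 1% participation the estimate works out to about 2 bps, or roughly $100 on a $500,000 order. Because impact grows with the square root of size, quadrupling the order doubles the per-share impact rather than quadrupling it.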
Performance Metrics
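import numpy as np

def calculate_backtest_metrics(returns):
    """
    Standard backtest performance metrics.
    Assumes `returns` is a pandas Series of daily returns and a zero risk-free rate.
    calculate_max_drawdown and count_trades are helpers that are not defined in this section.
    """
    wins = returns[returns > 0]
    losses = returns[returns < 0]
    metrics = {
        'total_return': (1 + returns).prod() - 1,
        'annual_return': returns.mean() * 252,  # arithmetic annualization
        'annual_volatility': returns.std() * np.sqrt(252),
        'sharpe_ratio': returns.mean() / returns.std() * np.sqrt(252),
        'max_drawdown': calculate_max_drawdown(returns),
        'win_rate': (returns > 0).mean(),  # share of winning periods, not winning trades
        # Infinite when there are no losing periods
        'profit_factor': wins.sum() / abs(losses.sum()) if len(losses) else np.inf,
        'avg_win': wins.mean(),
        'avg_loss': losses.mean(),
        'num_trades': count_trades(returns)
    }
    return metrics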
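The metrics function above calls `calculate_max_drawdown`, which is not defined in this section. One possible sketch follows, assuming simple per-period returns in a pandas Series and the convention that drawdown is reported as a negative fraction; other sign conventions are equally common.

```python
import pandas as pd

def calculate_max_drawdown(returns):
    """
    Largest peak-to-trough decline of the compounded equity curve,
    returned as a negative fraction (e.g. -0.25 for a 25% drawdown).
    """
    equity = (1 + returns).cumprod()
    # The starting equity of 1.0 counts as the first peak
    running_peak = equity.cummax().clip(lower=1.0)
    drawdown = equity / running_peak - 1
    return drawdown.min()

# Illustrative returns: equity peaks at 1.10, then falls to 0.8316
returns = pd.Series([0.10, -0.20, 0.05, -0.10, 0.30])
max_dd = calculate_max_drawdown(returns)
```

In this example the trough is 0.8316 against a peak of 1.10, so the maximum drawdown is 0.8316 / 1.10 - 1 = -0.244.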
Best Practices for Backtesting
- Use out-of-sample data for final validation
- Include realistic transaction costs and slippage
- Test across multiple market regimes
- Be skeptical of exceptional results
- Compare to simple benchmarks
- Document all assumptions and parameters
4.11 Walk-Forward Optimization
Definition
Walk-forward optimization (WFO) is a method for optimizing strategy parameters while avoiding overfitting by repeatedly optimizing on in-sample data and testing on subsequent out-of-sample data.
Core Principles
- Rolling Optimization: Optimize on window, test on next period
- Out-of-Sample Validation: Only OOS results count
- Adaptive Parameters: Parameters adjust to changing market conditions
- Overfitting Prevention: Reduces curve-fitting to historical data
- Realistic Performance: OOS results estimate true performance
Walk-Forward Process
import pandas as pd

def walk_forward_optimization(data, strategy_class,
                              optimization_window=252,
                              test_window=63,
                              param_grid=None):
    """
    Walk-forward optimization framework
    """
    if param_grid is None:
        raise ValueError("param_grid is required")
    results = []
    # + 1 so that the final full test window is included
    for i in range(optimization_window, len(data) - test_window + 1, test_window):
        # In-sample optimization window
        train_data = data.iloc[i-optimization_window:i]
        # Out-of-sample test window
        test_data = data.iloc[i:i+test_window]
        # Optimize parameters on training data
        best_params = optimize_parameters(train_data, strategy_class, param_grid)
        # Test on out-of-sample data
        strategy = strategy_class(**best_params)
        oos_returns = strategy.backtest(test_data)
        results.append({
            'period_start': data.index[i],
            'period_end': data.index[i+test_window-1],  # last bar of the test window
            'params': best_params,
            'oos_returns': oos_returns,
            'oos_sharpe': calculate_sharpe(oos_returns)
        })
    return pd.DataFrame(results)
Parameter Optimization
from itertools import product

import numpy as np

def optimize_parameters(data, strategy_class, param_grid):
    """
    Grid search for optimal parameters
    """
    best_sharpe = -np.inf
    best_params = None
    # Generate all parameter combinations
    param_combinations = list(product(*param_grid.values()))
    for params in param_combinations:
        param_dict = dict(zip(param_grid.keys(), params))
        strategy = strategy_class(**param_dict)
        returns = strategy.backtest(data)
        sharpe = calculate_sharpe(returns)
        if sharpe > best_sharpe:
            best_sharpe = sharpe
            best_params = param_dict
    return best_params
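The optimization code above assumes two things that this section does not define: a strategy class whose constructor accepts the grid's keys as keyword arguments and whose `backtest(data)` method returns per-period returns, and a `calculate_sharpe` helper. The sketch below shows one way both could look. `MovingAverageStrategy`, its `lookback` parameter, the `'close'` column, and the synthetic prices are all hypothetical and used only for illustration.

```python
import numpy as np
import pandas as pd

def calculate_sharpe(returns, periods_per_year=252):
    """Annualized Sharpe ratio, assuming a zero risk-free rate."""
    std = returns.std()
    # Return 0.0 when volatility is undefined or zero (an assumption of this sketch)
    if len(returns) < 2 or np.isnan(std) or std == 0:
        return 0.0
    return returns.mean() / std * np.sqrt(periods_per_year)

class MovingAverageStrategy:
    """Hypothetical long/flat strategy: long when price is above its moving average."""

    def __init__(self, lookback=20):
        self.lookback = lookback

    def backtest(self, data):
        """Return per-period strategy returns for a DataFrame with a 'close' column."""
        close = data['close']
        asset_returns = close.pct_change().fillna(0.0)
        signal = (close > close.rolling(self.lookback).mean()).astype(float)
        # Shift by one bar so each position uses only the previous bar's signal
        return signal.shift(1).fillna(0.0) * asset_returns

# Synthetic prices for illustration only
rng = np.random.default_rng(0)
data = pd.DataFrame(
    {'close': 100 * np.exp(np.cumsum(rng.normal(0.0005, 0.01, 500)))},
    index=pd.bdate_range('2020-01-01', periods=500),
)

# The grid maps each constructor argument to its candidate values
param_grid = {'lookback': [10, 20, 50]}
sharpe_by_lookback = {
    lookback: calculate_sharpe(MovingAverageStrategy(lookback=lookback).backtest(data))
    for lookback in param_grid['lookback']
}
```

A `param_grid` of this shape is what the grid search expands with `itertools.product`, one combination per strategy instance.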
Anchored vs. Rolling Walk-Forward
Anchored (Expanding Window):
def anchored_walk_forward(data, strategy_class, test_window=63):
    """
    Training window expands over time
    """
    min_train = 252
    results = []
    # + 1 so that the final full test window is included
    for i in range(min_train, len(data) - test_window + 1, test_window):
        # Training: all data from start to i
        train_data = data.iloc[:i]
        test_data = data.iloc[i:i+test_window]
        # ... rest of optimization
Rolling (Fixed Window):
def rolling_walk_forward(data, strategy_class, train_window=252, test_window=63):
    """
    Training window is fixed size
    """
    results = []
    # + 1 so that the final full test window is included
    for i in range(train_window, len(data) - test_window + 1, test_window):
        # Training: fixed window before i
        train_data = data.iloc[i-train_window:i]
        test_data = data.iloc[i:i+test_window]
        # ... rest of optimization
Best Practices for Walk-Forward
- Use at least 2-3 years of data for optimization windows (the 252-day defaults in the code above are for illustration only)
- Test windows should be long enough to contain a statistically meaningful number of trades
- Monitor parameter stability across periods
- Aggregate OOS results for overall performance estimate
- Be wary of strategies that require frequent reoptimization
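Two of the practices above, aggregating out-of-sample results and monitoring parameter stability, can be sketched as a single summary step. The function name `summarize_walk_forward` and the input layout (a list of dicts with `'params'` and `'oos_returns'` keys, mirroring the records built in the walk-forward loop) are assumptions made for this example.

```python
import numpy as np
import pandas as pd

def summarize_walk_forward(results, periods_per_year=252):
    """
    Combine per-window walk-forward results into one summary.

    `results` is assumed to be a list of dicts with the keys 'params' (dict)
    and 'oos_returns' (pd.Series), one dict per test window.
    """
    # Stitch the out-of-sample windows together in chronological order
    oos = pd.concat([r['oos_returns'] for r in results]).sort_index()
    sharpe = oos.mean() / oos.std() * np.sqrt(periods_per_year)
    # One row per window; count how often each parameter changes between windows
    params = pd.DataFrame([r['params'] for r in results])
    changes = (params != params.shift()).iloc[1:].sum()
    return {
        'oos_total_return': (1 + oos).prod() - 1,
        'oos_sharpe': sharpe,
        'param_changes': changes.to_dict(),
    }

# Two illustrative test windows with made-up returns and parameters
idx = pd.bdate_range('2023-01-02', periods=6)
results = [
    {'params': {'lookback': 10}, 'oos_returns': pd.Series([0.01, -0.02, 0.03], index=idx[:3])},
    {'params': {'lookback': 20}, 'oos_returns': pd.Series([0.00, 0.01, -0.01], index=idx[3:])},
]
summary = summarize_walk_forward(results)
```

Computing one Sharpe ratio on the stitched return series, rather than averaging per-window Sharpe ratios, avoids giving short or quiet windows undue weight. A parameter that changes in nearly every window is a sign that the optimizer may be chasing noise.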
4.12 Strategy Validation (Avoiding Biases)
Definition
Strategy validation involves rigorous testing procedures to ensure a strategy has genuine predictive power rather than being an artifact of data mining, overfitting, or statistical flukes.
Core Principles
- Skepticism: Assume the strategy doesn't work until proven otherwise
- Multiple Testing: Account for trying many strategies/parameters
- Economic Rationale: Strategy should have logical explanation
- Robustness: Results should hold across variations
- Statistical Significance: Returns should be statistically significant
Common Biases to Avoid
Look-Ahead Bias:
- Using information not available at trade time
- Prevention: Use only data available at decision point
Survivorship Bias:
- Excluding failed companies/assets from analysis
- Prevention: Use point-in-time databases with delisted securities
Data Snooping Bias:
- Testing many variations until one works
- Prevention: Pre-specify tests, use out-of-sample data
Selection Bias:
- Cherry-picking favorable time periods
- Prevention: Test across full available history
Overfitting:
- Fitting noise rather than signal
- Prevention: Use simple models, walk-forward optimization
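Look-ahead bias is often a one-line mistake. The sketch below uses five made-up closing prices to contrast a signal applied to the same bar it was computed from with the same signal lagged by one bar using `Series.shift(1)`.

```python
import pandas as pd

# Illustrative closing prices
close = pd.Series(
    [100.0, 102.0, 101.0, 105.0, 104.0],
    index=pd.date_range('2024-01-01', periods=5, freq='D'),
)
returns = close.pct_change()

# Signal computed from the bar's own close: 1 if the bar closed up, else 0
signal = (returns > 0).astype(float)

# Biased: trades bar t's return with a signal that already "knows" bar t's close
biased_returns = signal * returns

# Unbiased: the signal from bar t-1 determines the position held over bar t
unbiased_returns = signal.shift(1) * returns
```

The biased version can never lose, because it is long only on bars that are already known to have closed up. Once the signal is lagged by one bar, the same rule loses money on this sample.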
Statistical Significance Testing
T-Test for Returns:
from scipy import stats

def test_significance(returns, null_return=0):
    """
    Test if mean return is significantly different from null
    """
    t_stat, p_value = stats.ttest_1samp(returns, null_return)
    return {
        't_statistic': t_stat,
        'p_value': p_value,
        'significant_5pct': p_value < 0.05,
        'significant_1pct': p_value < 0.01
    }
Multiple Testing Correction:
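import numpy as np

def bonferroni_correction(p_values, alpha=0.05):
    """
    Adjust for multiple hypothesis tests
    """
    # Convert so that plain Python lists are compared element-wise
    p_values = np.asarray(p_values)
    n_tests = len(p_values)
    adjusted_alpha = alpha / n_tests
    significant = p_values < adjusted_alpha
    return significant, adjusted_alpha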
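The reason for the correction is easy to see numerically. Assuming the tests are independent, the chance of at least one false positive among n tests at level alpha is 1 - (1 - alpha)^n. The helper name `familywise_error_rate` is illustrative.

```python
def familywise_error_rate(n_tests, alpha=0.05):
    """Probability of at least one false positive among n independent tests."""
    return 1 - (1 - alpha) ** n_tests

# Twenty strategy variations, each tested at the 5% level
n_tests = 20
uncorrected = familywise_error_rate(n_tests, alpha=0.05)

# Same twenty tests at the Bonferroni-adjusted level alpha / n
corrected = familywise_error_rate(n_tests, alpha=0.05 / n_tests)

print(f"Uncorrected: {uncorrected:.3f}, Bonferroni-corrected: {corrected:.3f}")
```

With twenty variations, the uncorrected chance of at least one spurious "significant" result is about 64%. The Bonferroni threshold brings it back to just under 5%, at the price of less power to detect real effects.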
Robustness Checks
Parameter Sensitivity:
import numpy as np

def parameter_sensitivity(data, strategy_class, param_name, param_range):
    """
    Test how sensitive results are to parameter changes
    """
    results = {}
    for param_value in param_range:
        strategy = strategy_class(**{param_name: param_value})
        returns = strategy.backtest(data)
        results[param_value] = calculate_sharpe(returns)
    # Check if results are stable across parameter range
    # (coefficient of variation of the Sharpe ratios; lower is more stable)
    values = np.array(list(results.values()))
    mean = np.mean(values)
    # abs() keeps the ratio positive when the mean Sharpe is negative
    stability = np.std(values) / abs(mean) if mean != 0 else np.inf
    return results, stability
Time Period Stability:
def period_stability(data, strategy, periods):
    """
    Test performance across different time periods
    """
    period_results = {}
    for period_name, (start, end) in periods.items():
        # Label-based slice; both endpoints are included
        period_data = data.loc[start:end]
        returns = strategy.backtest(period_data)
        # calculate_backtest_metrics is the function from Section 4.10
        period_results[period_name] = calculate_backtest_metrics(returns)
    return period_results
Universe Stability:
def universe_stability(strategy, universes):
    """
    Test on different asset universes
    """
    universe_results = {}
    for universe_name, universe_data in universes.items():
        returns = strategy.backtest(universe_data)
        # calculate_backtest_metrics is the function from Section 4.10
        universe_results[universe_name] = calculate_backtest_metrics(returns)
    return universe_results
Validation Checklist
- Economic Rationale: Can you explain why the strategy works?
- Out-of-Sample Testing: Does it work on data not used in development?
- Statistical Significance: Are returns significantly different from zero?
- Parameter Stability: Do results hold across parameter variations?
- Time Period Stability: Does it work in different market regimes?
- Transaction Costs: Does it survive realistic costs?
- Capacity: Can it be implemented at desired scale?
- Simplicity: Is the strategy simpler than alternatives?
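The transaction-cost item in the checklist can be checked with a simple sweep: recompute performance under progressively harsher cost assumptions and see where the edge disappears. The sketch below is one way to do that. The function name, the turnover definition (traded notional per period as a fraction of equity), and the synthetic returns are assumptions for illustration.

```python
import numpy as np
import pandas as pd

def net_sharpe_by_cost(gross_returns, turnover, cost_levels_bps, periods_per_year=252):
    """
    Annualized Sharpe ratio after costs for several cost assumptions.

    gross_returns: per-period strategy returns before costs
    turnover: per-period traded notional as a fraction of equity
    cost_levels_bps: one-way cost assumptions in basis points
    """
    results = {}
    for bps in cost_levels_bps:
        # Each period pays cost in proportion to how much was traded
        net = gross_returns - turnover * bps / 10_000
        results[bps] = net.mean() / net.std() * np.sqrt(periods_per_year)
    return pd.Series(results, name='net_sharpe')

# Synthetic gross returns and a constant 50% daily turnover, for illustration only
rng = np.random.default_rng(1)
gross = pd.Series(rng.normal(0.0005, 0.01, 1000))
turnover = pd.Series(0.5, index=gross.index)
net_sharpes = net_sharpe_by_cost(gross, turnover, cost_levels_bps=[0, 5, 10, 20])
```

A strategy whose Sharpe ratio collapses between two plausible cost levels is fragile. The cost level at which the net Sharpe crosses zero gives a rough break-even to compare against the per-asset-class assumptions used in the backtest.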
Best Practices for Validation
- Pre-register hypotheses before testing
- Use out-of-sample data for final validation only
- Account for multiple testing in significance claims
- Require economic rationale, not just statistical evidence
- Test robustness to reasonable parameter perturbations
- Be skeptical of Sharpe ratios above 2.0 without clear edge
- Paper trade before committing real capital
Summary
Effective strategy design requires:
- Clear separation of style, archetype, and signal: Know when you trade vs. why you trade
- Explicit alignment between market behavior and strategy logic: Match archetypes to regimes
- Understanding of risk profiles: Win rate alone is misleading; skew determines survivability
- Regime-aware deployment: Don't apply mean reversion in trends or trend following in ranges
- Robust execution and exits: Entry is less important than risk management
- Rigorous validation and skepticism: Assume strategies don't work until proven otherwise
Critical Insight:
Strategies do not fail randomly - they fail when applied to the wrong market behavior.
Mean reversion in a trend will blow up. Trend following in a range will bleed out.
Know your archetype's risk profile and contraindications.