
107 Features for Gold: Building an Institutional Feature Pipeline

Feature Engineering · Rahul S. P.

Abstract

We describe the design and validation of a 107-feature pipeline for intraday gold trading. The pipeline spans six feature groups: price dynamics, cross-asset signals, volatility regimes, microstructure proxies, temporal patterns, and statistical complexity measures. We detail the engineering choices behind each group, the cache invalidation strategy, and the empirical AUC contribution of each feature family. The pipeline supports both batch backtesting and live execution with sub-second latency.

1. Introduction

Feature engineering is widely acknowledged as the most labor-intensive and consequential stage of quantitative model development. In academic literature, model architectures receive disproportionate attention, yet practitioners consistently report that the quality and diversity of input features determine the ceiling of model performance. A well-constructed feature pipeline can make a simple model competitive; a poor one renders even sophisticated architectures ineffective.

We document the design, implementation, and validation of a 107-feature pipeline for intraday XAUUSD (gold) trading at the M1 (one-minute) frequency. The pipeline spans six instruments (gold plus five cross-asset inputs), five feature groups, and 14 sub-families within the largest (Extended) group. Each feature was individually validated via AUC scoring on a held-out validation set, with quality controls including feature inversion, noise removal, and cache invalidation.

This paper serves as both a technical reference for the pipeline and an empirical guide to which feature families contribute meaningful signal for gold intraday trading. The pipeline is implemented in a single feature builder function, which accepts M1 OHLCV DataFrames for all six instruments and returns a fully aligned feature matrix. The features are registered in an official feature column registry (OFFICIAL_FEATURE_COLS), which serves as both the canonical feature set and the cache invalidation key. As of February 2026 the registry holds 105 core entries; two Alpha101 features (alpha024 and alpha083) are added conditionally when ENABLE_ALPHA101=True, for a total of 107.

2. Data Sources

The pipeline ingests M1 OHLCV bars from six instruments, providing a multi-asset view of the macroeconomic environment:

| Instrument | Symbol | Role | Source |
|---|---|---|---|
| Gold | XAUUSD | Primary traded instrument | MT5 / CSV |
| Silver | XAGUSD | Precious metals co-movement | MT5 / CSV |
| US Dollar Index | DX.f | Currency regime | MT5 / CSV |
| Nasdaq 100 | NAS100 | Risk appetite proxy | MT5 / CSV |
| S&P 500 | US500.f | Broad equity regime | MT5 / CSV |
| VIX | VIX.f | Implied volatility / fear gauge | MT5 / CSV |

Data loading is handled by a bar loader function, which attempts CSV first (from the data directory) and falls back to MetaTrader 5's Python API. The CSV-first approach allows development and backtesting without a live MT5 connection, while the MT5 fallback enables live trading with real-time data. XAUUSD OHLCV columns are always prefixed with "xau_" (producing xau_open, xau_high, xau_low, xau_close, xau_volume) to avoid namespace collisions during cross-asset merges. This renaming occurs regardless of whether cross-asset data is available, ensuring consistent column names throughout the pipeline.

Cross-asset data is merged on timestamps via an inner join. Bars where any instrument has missing data are dropped rather than forward-filled, ensuring that no feature computation uses stale or interpolated prices. This approach reduces the available bar count by approximately 5–10% (due to misaligned trading hours and data gaps) but eliminates look-ahead bias from filling future prices into past bars.
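The alignment step can be sketched with pandas as follows. This is a minimal illustration under stated assumptions, not the pipeline's loader: the function name merge_instruments, the prefixing of non-gold columns (the text only specifies the xau_ prefix), and the toy prices are all hypothetical.

```python
import pandas as pd

def merge_instruments(frames: dict[str, pd.DataFrame]) -> pd.DataFrame:
    """Inner-join per-instrument bars on their timestamp index."""
    merged = None
    for name, df in frames.items():
        # Prefix columns with the instrument tag, e.g. close -> xau_close.
        prefixed = df.add_prefix(f"{name}_")
        merged = prefixed if merged is None else merged.join(prefixed, how="inner")
    return merged

idx_xau = pd.date_range("2026-01-05 00:00", periods=5, freq="min")
idx_dxy = idx_xau.delete(2)  # toy data: DXY has no 00:02 bar

xau = pd.DataFrame({"close": [2650.0, 2650.4, 2650.1, 2649.8, 2650.2]}, index=idx_xau)
dxy = pd.DataFrame({"close": [108.10, 108.12, 108.09, 108.11]}, index=idx_dxy)

aligned = merge_instruments({"xau": xau, "dxy": dxy})
print(aligned)
```

Because DXY has no 00:02 bar in this toy data, the inner join drops that timestamp for both instruments instead of carrying the 00:01 DXY price forward.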

3. Feature Groups

3.1 Original Features (10 features)

The foundational feature set, designed to capture core price dynamics and temporal structure. These were the first features implemented and have been in the pipeline since inception:

| # | Feature | Description | Computation |
|---|---|---|---|
| 1 | accelz_60_30 | Acceleration z-score | Z-score of the difference between 30-bar and 60-bar momentum (second derivative of price). Captures whether momentum is accelerating or decelerating. High positive values indicate accelerating upward movement. |
| 2 | volaccelz60_30 | Volatility acceleration | Z-score of the difference between 30-bar and 60-bar realized volatility. Detects transitions between calm and volatile regimes. A rising volaccel often precedes breakouts. |
| 3 | dist_ma120 | Distance from 120-bar MA | $\frac{\text{close} - \text{SMA}(\text{close}, 120)}{\text{SMA}(\text{close}, 120)}$. Normalized distance from the 2-hour moving average. Mean-reversion anchor: extreme values suggest overextension. |
| 4 | resid_z60 | AR(1) residual z-score | Z-score of the residual from a 60-bar rolling linear regression of close prices. Captures deviations from the recent linear trend. Computed by the residual z-score function. |
| 5 | resid_z60_dxy | AR(1) residual z-score (DXY) | Z-score of the residual from a 60-bar rolling linear regression of XAUUSD against DXY. Captures gold-specific deviations from the dollar relationship. |
| 6 | resid_z60_nas100 | AR(1) residual z-score (NAS100) | Z-score of the residual from a 60-bar rolling linear regression of XAUUSD against NAS100. Captures gold-specific deviations from the equity relationship. |
| 7 | er60 | Efficiency ratio (60-bar) | $\frac{\lvert \text{close}_t - \text{close}_{t-60} \rvert}{\sum_{i=t-59}^{t} \lvert \text{close}_i - \text{close}_{i-1} \rvert}$. Ranges [0, 1]. High values indicate trending (price moved far relative to path length); low values indicate choppy/mean-reverting. |
| 8 | tod_sin | Time-of-day (sine) | $\sin\left(\frac{2\pi \cdot \text{minutes\_since\_midnight}}{1440}\right)$. Cyclical encoding of time that the model can use to learn session-dependent patterns without discrete session boundaries. |
| 9 | tod_cos | Time-of-day (cosine) | $\cos\left(\frac{2\pi \cdot \text{minutes\_since\_midnight}}{1440}\right)$. Paired with tod_sin to provide a complete cyclical encoding. Note: INVERTED (original AUC was 0.476; inverted to 0.524). The cosine component peaked at midnight UTC, which anti-correlates with direction during the Asian session. |
| 10 | leadcorr_nas100 | Lead-lag correlation with NAS100 | Rolling 60-bar Pearson correlation between XAUUSD and NAS100 returns. Captures the time-varying risk-on/risk-off relationship. Not predictive as a lagged feature (see companion paper), but informative as a regime indicator. |
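Two of the formulas above, the 60-bar efficiency ratio and the time-of-day encoding, can be checked with a small standard-library sketch. The function names and the toy price paths are assumptions for illustration; the pipeline's own implementation is vectorized and is not shown in this paper.

```python
import math

def efficiency_ratio(closes: list[float], window: int = 60) -> float:
    """Net move over `window` bars divided by the summed absolute bar-to-bar moves."""
    if len(closes) < window + 1:
        raise ValueError("need at least window + 1 closes")
    recent = closes[-(window + 1):]
    net = abs(recent[-1] - recent[0])
    path = sum(abs(b - a) for a, b in zip(recent, recent[1:]))
    return net / path if path > 0 else 0.0

def time_of_day(minutes_since_midnight: int) -> tuple[float, float]:
    """Cyclical (sin, cos) encoding over a 1440-minute day."""
    angle = 2 * math.pi * minutes_since_midnight / 1440
    return math.sin(angle), math.cos(angle)

# Toy paths: a steady climb versus a zigzag that ends where it started.
trend = [2000.0 + 0.1 * i for i in range(61)]
chop = [2000.0 + (0.1 if i % 2 else 0.0) for i in range(61)]

print(efficiency_ratio(trend), efficiency_ratio(chop))
print(time_of_day(0), time_of_day(360))
```

The steady climb gives a ratio at the top of the range and the zigzag gives zero, which matches the trending versus choppy reading in the table.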

3.2 OG Extended Features (26 features)

Extensions to the original set, adding multi-horizon returns, cross-asset correlations, betas, and volatility structure. These features provide the model with a richer view of price dynamics at multiple timescales:

  • Returns (6): Log returns computed at 1-minute, 5-minute, 30-minute, 60-minute, and 120-minute horizons, plus 30-bar rolling realized volatility (vol_30m). Each horizon captures different dynamics: 1m returns are dominated by microstructure noise but have significant AR(1) structure; 5m returns capture short-term momentum; 30m, 60m, and 120m returns capture intra-session trends. The model learns to weight these horizons differently via the Variable Selection Network (VSN). Features: ret_1m, ret_5m, ret_30m, ret_60m, ret_120m, vol_30m.
  • MA distances (3): Normalized distance from 15-bar, 30-bar, and 290-bar moving averages. The 15-bar distance captures very short-term mean-reversion potential; the 30-bar captures short-term; the 290-bar distance captures the position within the broader intraday trend. Distance is normalized by the MA value to produce a percentage rather than an absolute dollar distance. Features: dist_ma_15, dist_ma_30, dist_ma_290.
  • Cross-asset correlations (6): Rolling Pearson correlation of XAUUSD returns with XAGUSD and DXY at three horizons (30, 60, 120 bars). These are not predictive as lagged features (confirmed by our cross-asset lead-lag study) but serve as regime indicators: a breakdown in the normally strong gold-silver correlation, or an inversion of the gold-dollar correlation, signals a regime change that affects optimal trading behavior. Features: corr_xau_xag_30, corr_xau_xag_60, corr_xau_xag_120, corr_xau_dxy_30, corr_xau_dxy_60, corr_xau_dxy_120.
  • Cross-asset betas (7): Rolling regression beta of XAUUSD returns against XAG at four horizons (5, 30, 60, 120 bars) and DXY at three horizons (30, 60, 120 bars). The beta measures gold's sensitivity to each instrument. Note: the three XAG beta features (beta_xag_to_xau_30, beta_xag_to_xau_60, beta_xag_to_xau_120) required inversion — their natural orientation had AUC below 0.500, meaning that higher beta (more sensitivity to cross-assets) actually predicted opposite gold direction. After inversion, AUC values improved to 0.519–0.525 (1 − original AUC). Features: beta_xag_to_xau_5, beta_xag_to_xau_30, beta_xag_to_xau_60, beta_xag_to_xau_120, beta_xau_to_dxy_30, beta_xau_to_dxy_60, beta_xau_to_dxy_120.
  • XAU core (4): xaucore is the residual return after removing cross-asset beta exposures: $r_{\text{gold}} - \beta_{\text{DXY}} \cdot r_{\text{DXY}} - \beta_{\text{NAS}} \cdot r_{\text{NAS}}$. Computed at four horizons (5, 30, 60, 120 bars), this isolates gold-specific returns from cross-asset factor exposure at multiple timescales. Positive xaucore indicates gold is outperforming what cross-asset factors predict, potentially due to gold-specific flows (physical demand, central bank buying, ETF inflows). Features: xaucore_5, xaucore_30, xaucore_60, xaucore_120.
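The rolling beta and the xaucore residual can be sketched in NumPy as below. This is an illustration under assumptions: it estimates each beta separately as cov(y, x) / var(x), since the text does not say whether the DXY and NAS100 betas are fitted jointly, and it uses an explicit loop for readability where the pipeline's helper is described as vectorized. The synthetic returns are made up.

```python
import numpy as np

def rolling_beta(y: np.ndarray, x: np.ndarray, window: int) -> np.ndarray:
    """Rolling OLS slope cov(y, x) / var(x); NaN until a full window is available."""
    out = np.full(len(y), np.nan)
    for t in range(window - 1, len(y)):
        ys = y[t - window + 1 : t + 1]
        xs = x[t - window + 1 : t + 1]
        var = np.var(xs)
        if var > 0:
            out[t] = np.mean((ys - ys.mean()) * (xs - xs.mean())) / var
    return out

def xaucore(r_gold, r_dxy, r_nas, window):
    """Gold return minus its beta-weighted DXY and NAS100 returns."""
    b_dxy = rolling_beta(r_gold, r_dxy, window)
    b_nas = rolling_beta(r_gold, r_nas, window)
    return r_gold - b_dxy * r_dxy - b_nas * r_nas

# Synthetic one-minute returns (illustrative only).
rng = np.random.default_rng(0)
r_dxy = rng.normal(0, 1e-4, 200)
r_nas = rng.normal(0, 2e-4, 200)
r_gold = -0.8 * r_dxy + 0.1 * r_nas + rng.normal(0, 5e-5, 200)

core_30 = xaucore(r_gold, r_dxy, r_nas, window=30)
print(core_30[-5:])
```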

3.3 Level and Channel Features (17 features)

Support/resistance levels and regression channels provide structural context that pure price dynamics features cannot capture. A price that is $2 above the nearest resistance level has different implications than a price that is $2 below it, even if the returns, volatility, and momentum are identical.

  • KMeans levels (13): Computed via the LevelState dataclass, which maintains a dynamic set of K=7 price levels computed by K-Means clustering on a 5-day (7,200-bar) lookback of close prices. The level features function returns 13 values per bar (extended from the original 10 in February 2026):
    • dist_nearest: Distance to the nearest KMeans level
    • dist_support: Distance to the nearest level below current price
    • dist_resistance: Distance to the nearest level above current price
    • dist_nearest_norm: Distance to nearest level, normalized by ATR
    • dist_support_norm: Distance to support, normalized by ATR
    • dist_resistance_norm: Distance to resistance, normalized by ATR
    • level_rank: Ordinal rank of the nearest level among all K levels
    • near_level_flag: Binary flag: within 0.5 ATR of any level
    • touch_count_6h: Number of times price has touched any level in the last 6 hours
    • bounce_count: Number of bounces off levels (reversals after touch)
    • touch_velocity: Speed of approach to the nearest level
    • false_breakout_count: Number of false breakouts through levels in the last 24 hours
    • time_since_last_touch: Minutes since the last interaction with any level
    The three features added in the February 2026 extension (bounce_count, false_breakout_count, time_since_last_touch) improved the model's ability to distinguish between fresh levels (recently formed, actively tested) and stale levels (not touched in hours, likely irrelevant).
  • Quantile regression channels (4): Fit quantile regression lines at Q=[0.1, 0.5, 0.9] over a 180-minute rolling window. The features are:
    • channel_upper: Q=0.9 regression line value (upper boundary)
    • channel_lower: Q=0.1 regression line value (lower boundary)
    • channel_width: $(\text{upper} - \text{lower}) / C$ (normalized width)
    • channel_position: $(C - \text{lower}) / (\text{upper} - \text{lower})$, ranging [0, 1], indicating position within the channel (0 = at lower boundary, 1 = at upper boundary)
    Quantile regression is preferred over standard linear regression for channels because it captures the actual boundaries of price movement rather than the central tendency. The Q=0.1 and Q=0.9 lines approximate the 10th and 90th percentile price paths.
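Given a set of levels and the channel boundaries, several of the features above reduce to simple arithmetic. The sketch below assumes the levels and the Q=0.1/Q=0.9 line values have already been computed elsewhere; the function names and example prices are hypothetical, and the sketch does not clip channel_position when price sits outside the channel (the text does not say whether the pipeline does).

```python
import math

def level_distances(price: float, levels: list[float], atr: float) -> dict[str, float]:
    """Distances from price to the nearest level, support, and resistance."""
    below = [lv for lv in levels if lv <= price]
    above = [lv for lv in levels if lv > price]
    dist_support = price - max(below) if below else math.nan
    dist_resistance = min(above) - price if above else math.nan
    dist_nearest = min(abs(price - lv) for lv in levels)
    return {
        "dist_nearest": dist_nearest,
        "dist_support": dist_support,
        "dist_resistance": dist_resistance,
        "dist_nearest_norm": dist_nearest / atr,
        "near_level_flag": int(dist_nearest <= 0.5 * atr),  # within 0.5 ATR
    }

def channel_features(close: float, lower: float, upper: float) -> tuple[float, float]:
    """Return (channel_width, channel_position) from the boundary line values."""
    width = (upper - lower) / close
    position = (close - lower) / (upper - lower)
    return width, position

levels = [2630.0, 2640.0, 2650.0, 2660.0, 2670.0, 2680.0, 2690.0]  # K = 7
feats = level_distances(price=2652.0, levels=levels, atr=1.5)
width, position = channel_features(close=2652.0, lower=2648.0, upper=2658.0)
print(feats, width, position)
```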

3.4 Session and Timing Features (5 features)

Gold's 23-hour trading day spans three major liquidity sessions with distinct microstructure characteristics. Session features allow the model to learn session-dependent behavior:

| Feature | Description | Rationale |
|---|---|---|
| session_asian | Binary: Asian session (00:00–08:00 UTC) | Low volume, narrow ranges, strong mean reversion. The model should apply tighter stops and prefer counter-trend trades during this session. |
| session_london | Binary: London session (07:00–16:00 UTC) | Highest liquidity, London AM/PM gold fixes, pronounced trending behavior. The session where most genuine moves occur. |
| session_ny | Binary: New York session (13:00–22:00 UTC) | Equity-correlated flows, macroeconomic data releases (NFP, CPI, FOMC). Most volatile during the London-NY overlap. |
| session_overlap | Binary: London–NY overlap (13:00–16:00 UTC) | The single most liquid and volatile period of the trading day. Both London and New York desks are active simultaneously. ~40% of daily gold volume concentrates here. |
| vol_session_ratio | Continuous: current volatility / session average volatility | Normalizes volatility by session expectations. A vol_session_ratio of 2.0 during the Asian session is more noteworthy than 2.0 during the London-NY overlap. |
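The four binary flags follow directly from the UTC hour. The sketch below uses the session windows from the table and assumes half-open intervals (start inclusive, end exclusive); the function and constant names are illustrative.

```python
from datetime import datetime, timezone

# Session windows in UTC hours, taken from the table above.
SESSIONS = {
    "session_asian": (0, 8),
    "session_london": (7, 16),
    "session_ny": (13, 22),
}

def session_flags(ts: datetime) -> dict[str, int]:
    """Binary session flags for a timezone-aware timestamp."""
    hour = ts.astimezone(timezone.utc).hour
    flags = {name: int(start <= hour < end) for name, (start, end) in SESSIONS.items()}
    # The overlap is simply the bars where London and New York are both open.
    flags["session_overlap"] = int(flags["session_london"] and flags["session_ny"])
    return flags

afternoon = session_flags(datetime(2026, 1, 5, 14, 30, tzinfo=timezone.utc))
morning = session_flags(datetime(2026, 1, 5, 7, 30, tzinfo=timezone.utc))
print(afternoon)
print(morning)
```

Note that the Asian and London windows overlap between 07:00 and 08:00 UTC, so both flags are set for the morning timestamp.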

Removed feature: london_open (binary flag for the first 15 minutes of London session) was tested and removed with AUC = 0.503 — indistinguishable from noise. The session boundaries themselves provide sufficient temporal context without precise-minute indicators.

3.5 Extended Features (47 features)

The largest feature group, organized into 14 sub-families. Each sub-family targets a specific aspect of market microstructure, regime dynamics, or cross-asset relationships:

| Sub-Family | Count | Features | Description |
|---|---|---|---|
| Price-Volume Interaction | 4 | vw_return_60, obv_slope_60, vol_surprise, pv_corr_60 | Volume-weighted returns capture whether price moves are backed by participation. OBV (On-Balance Volume) slope tracks cumulative buying/selling pressure. Volume surprise detects unusual activity. PV correlation measures the price-volume relationship (positive in trending, negative in distribution). |
| Tick Proxies | 3 | tick_direction_ratio, tick_intensity, bar_tick_vol_ratio | Since true order flow is unavailable in OTC gold, we estimate it from price microstructure. Tick direction ratio counts uptick vs. downtick bars in a rolling window. Tick intensity measures the number of price changes per bar. Bar-to-tick volatility ratio detects when bar volatility diverges from tick-level volatility (indicating large single prints vs. gradual movement). |
| Multi-TF Momentum | 3 | rsi_14, mom_divergence, mom_60 | RSI computed via the rolling RSI function (window=14) captures overbought/oversold conditions. Momentum at the 60-bar horizon and momentum divergence ($\text{mom}_5 - \text{mom}_{60}$) flag when short-term momentum opposes long-term momentum, which is often a reversal precursor. |
| Regime Indicators | 4 | hurst_60, trend_mr_class, regime_persistence, regime_change_rate | Hurst exponent (>0.5 = trending, <0.5 = mean-reverting) via R/S analysis. Trend/MR classification is a derived binary. Regime persistence measures how many consecutive bars have maintained the same regime. Regime change rate counts regime switches per 120-bar window. |
| Volatility Proxies | 3 | parkinson_vol, gk_vol, vol_of_vol | Parkinson volatility uses high-low range (more efficient than close-to-close). Garman-Klass (_garman_klass_vol) uses full OHLC. Vol-of-vol (volatility of volatility) captures clustering. |
| Volatility Clustering | 2 | vol_autocorr_10, vol_breakout_flag | Volatility autocorrelation at 10-bar lag quantifies clustering strength (typically 0.6–0.8 for gold M1, confirming strong GARCH-like behavior). Vol breakout flags bars where volatility exceeds 2σ above its 120-bar mean. |
| Risk-On/Off | 3 | gold_equity_corr_regime, vix_level, vix_change | Gold-equity correlation regime captures whether gold is trading as a risk asset (positive correlation with equities) or a safe haven (negative correlation). VIX level and change capture fear gauge dynamics. |
| Lead-Lag Improvements | 3 | dxy_lag5_ret, nas_lag5_ret, xag_lag5_ret | Despite our finding that 1-bar lagged returns have no predictive power, we retain 5-bar lagged returns as features because they capture a slightly different dynamic: the 5-minute lag allows for slower information transmission channels (e.g., option hedging flows). Their AUC is marginal (0.505–0.510) but they contribute to the model's regime awareness. |
| Candle Patterns | 3 | body_range_ratio, upper_wick_ratio, lower_wick_ratio | Body-to-range ratio measures conviction (high ratio = strong directional bar, low ratio = indecision). Wick ratios quantify rejection from high/low prices. |
| Liquidity Windows | 3 | mins_to_session_open, mins_to_session_close, london_fix_proximity | Distance (in minutes) to the nearest session open/close. Session opens attract positioning flows; session closes attract book-squaring. London fix proximity (distance to 10:30 AM and 3:00 PM London gold fixes) captures the pre-fix positioning that systematically affects gold prices. |
| Calendar Patterns | 3 | week_of_month, month_of_quarter, nfp_week_flag | Week-of-month captures turn-of-month effects (institutional rebalancing, pension fund flows). Month-of-quarter captures quarter-end dynamics. NFP week flag marks the first Friday of each month ±2 days, when non-farm payrolls data creates unique volatility patterns. |
| S/R Improvements | 3 | round_number_10, prev_day_hl_dist, pivot_point_dist | Round number proximity (distance to nearest $10 levels) captures psychological support/resistance. Previous-day high/low distance provides key structural levels. Pivot point distance (classic floor-trader pivots) captures institutional reference levels. |
| Multi-Scale Analysis | 4 | fractal_dim_60, dfa_60, wavelet_energy_ratio, multiscale_entropy | Fractal dimension (_rolling_fractal_dimension, Higuchi method) measures price path complexity, with values in [1, 2]. DFA (_rolling_dfa, Detrended Fluctuation Analysis) quantifies long-range dependence. Wavelet energy ratio captures the distribution of variance across frequency bands. Multi-scale entropy measures complexity at multiple embedding dimensions. Note: Hurst exponent is counted under Regime Indicators. |
| Self-Similarity | 6 | autocorr_lag1, autocorr_lag5, autocorr_lag15, autocorr_lag60, partial_autocorr, self_similarity_idx | Autocorrelation at four lags captures the AR structure at different horizons. Partial autocorrelation isolates the direct (not mediated) lag effect. Self-similarity index measures how well the recent return distribution matches the longer-term distribution (a form of stationarity test). |

Note: alpha024 and alpha083 (the two surviving Kakushadze 2016 factors) are added conditionally when ENABLE_ALPHA101=True, bringing the total from 105 to 107.

4. Feature Quality Control

4.1 AUC-Based Validation

Every feature in the pipeline undergoes individual AUC testing on the held-out validation set (last 20% of training data). The target is binary: 1 if the next M1 bar's close exceeds the current close, 0 otherwise. AUC measures how well the feature alone discriminates between positive and negative next-bar returns, providing a baseline assessment of univariate predictive power before any feature interactions are considered.

The AUC threshold for inclusion is context-dependent:

  • AUC > 0.515 or < 0.485: Strong candidate. Features below 0.485 are inverted (see 4.2), which flips them above 0.515.
  • AUC 0.505–0.515 or 0.485–0.495: Marginal. Included if they provide incremental value in forward feature selection (tested by adding to the existing set and measuring pipeline AUC improvement).
  • AUC 0.495–0.505: Noise. Removed. No feature in this band has ever survived forward selection.
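The univariate AUC test and the three bands can be sketched as follows. The pairwise AUC below is quadratic in sample size and is meant only to make the definition concrete; the function names are illustrative, and the treatment of values exactly on a band edge (0.505, 0.515, and so on) is an assumption, since the bands above share their endpoints.

```python
def auc(scores: list[float], labels: list[int]) -> float:
    """Probability that a positive example outranks a negative one (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def triage(a: float) -> str:
    """Map a univariate AUC to the inclusion bands described above."""
    if a > 0.515 or a < 0.485:
        return "strong"    # kept; inverted first if below 0.485
    if a > 0.505 or a < 0.495:
        return "marginal"  # kept only if forward selection shows incremental value
    return "noise"         # removed

# Toy check of the AUC definition on four bars (label 1 = next close is higher).
toy_auc = auc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1])
print(toy_auc, triage(0.524), triage(0.508), triage(0.503))
```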

4.2 Feature Inversion

Features with AUC consistently below 0.500 are inverted (multiplied by −1) rather than discarded. A feature with AUC = 0.480 is just as informative as one with AUC = 0.520 — it simply has the opposite sign convention. The inversion is applied in the feature pipeline before caching, ensuring that the model always sees the correctly oriented version. Four features required inversion:

| Feature | Original AUC | Post-Inversion AUC | Explanation |
|---|---|---|---|
| tod_cos | 0.476 | 0.524 | Cosine component peaked at midnight UTC; gold direction during Asian session was opposite to what the raw encoding implied. |
| beta_xag_to_xau_30 | 0.475 | 0.525 | Higher gold-silver beta (more sensitivity) predicted opposite gold direction, possibly because high beta indicates regime stress. |
| beta_xag_to_xau_60 | 0.476 | 0.524 | Same dynamic as the 30-bar beta, with nearly identical strength. |
| beta_xag_to_xau_120 | 0.481 | 0.519 | Same pattern, weaker at the longest horizon as the relationship stabilizes. |

The inverted features are tracked in a dedicated feature inversion list, which is included in the cache signature hash. Any change to this list triggers automatic cache invalidation and feature recomputation.
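The symmetry behind inversion (negating a feature maps its AUC to 1 − AUC) can be verified on a toy example. The list name INVERTED and the helper names are hypothetical stand-ins for the pipeline's inversion list, and the feature values are made up.

```python
# Hypothetical stand-in for the pipeline's feature inversion list.
INVERTED = ["tod_cos", "beta_xag_to_xau_30", "beta_xag_to_xau_60", "beta_xag_to_xau_120"]

def apply_inversions(row: dict[str, float], inverted: list[str]) -> dict[str, float]:
    """Multiply the listed features by -1 and pass all others through unchanged."""
    return {name: (-value if name in inverted else value) for name, value in row.items()}

def auc(scores: list[float], labels: list[int]) -> float:
    """Pairwise AUC with ties counted as 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

row = {"tod_cos": 0.3, "ret_1m": 0.001, "beta_xag_to_xau_30": -1.2}
oriented = apply_inversions(row, INVERTED)

scores, labels = [0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1]
before = auc(scores, labels)
after = auc([-s for s in scores], labels)
print(oriented, before, after)
```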

4.3 Feature Removal

Features with AUC indistinguishable from 0.500 (within ±0.005 after inversion) were removed entirely. These contribute no directional signal and add noise to the model. Notable removals include:

  • sign_agree — cross-asset sign agreement (fraction of cross-asset instruments moving in the same direction as gold). AUC: 0.501. This feature measures contemporaneous agreement, which has no predictive value for the next bar.
  • kalman_state, kalman_gain — outputs from a Kalman filter applied to close prices. AUC: 0.499, 0.502. The Kalman filter's state estimate is essentially a smoothed price, and the gain measures how much the filter trusts new observations. Neither provides directional signal beyond what the existing MA distance and residual z-score features capture.
  • london_open — binary flag for the first 15 minutes of the London session. AUC: 0.503. The session indicators (session_london) already capture the London session boundary; a precise 15-minute window adds no incremental information.
[Figure 2 diagram: 120+ raw candidates → AUC validation → remove AUC ≈ 0.50 (9 noise features) → invert AUC < 0.50 (4 features flipped) → 107 official features. Cache invalidation via SHA-256 hash of OFFICIAL_FEATURE_COLS.]

Figure 2: Feature quality control pipeline. Starting from 120+ candidates, noise features are removed via AUC validation, 4 features are inverted, yielding the final 107 official features.

4.4 Alpha101 Screening

All 101 Kakushadze (2016) alpha factors were evaluated as candidate features. Only 4 had an AUC above 0.515, and only 2 of those survived forward selection: alpha024 (AUC: 0.521) and alpha083 (AUC: 0.518). The remaining 99 were discarded. The overall survival rate (2 of 101, or 1.98%) is consistent with the hypothesis that equity cross-sectional factors do not transfer to single-instrument commodity intraday trading. See our companion paper for a detailed analysis of the five structural failure modes.

5. Helper Functions

The pipeline relies on eight core vectorized helper functions, each designed for efficiency on M1-scale datasets (hundreds of thousands of rows). All helpers are defined before the feature builder function in the source file and are implemented with NumPy vectorized operations, avoiding Python-level loops:

| Function | Signature | Output Range | Purpose |
|---|---|---|---|
| _rolling_rsi | (series, window) → series | [0, 100] | Relative Strength Index: $\text{RSI} = 100 - \frac{100}{1 + \text{RS}}$ where $\text{RS} = \frac{\text{EMA}(\text{gains}, w)}{\text{EMA}(\text{losses}, w)}$. Uses the Wilder smoothing method (equivalent to an EMA with span=2*window-1). window=14 is the standard configuration used in the pipeline. |
| _rolling_atr_series | (high, low, close, window) → series | [0, ∞) | Average True Range (vectorized). Computes $\text{TR}_t = \max(H_t - L_t,\, \lvert H_t - C_{t-1} \rvert,\, \lvert L_t - C_{t-1} \rvert)$, then applies EMA smoothing: $\text{ATR}_t = \text{EMA}(\text{TR}, w)$. This is the vectorized version for the feature pipeline; a separate scalar ATR function exists for live execution, where only the latest value is needed. |
| _rolling_hurst | (series, window) → series | [0, 1] | Hurst exponent via Rescaled Range (R/S) analysis: $H$ is estimated from the scaling law $\frac{R}{S} \sim n^H$. $H > 0.5$ indicates trending (persistent) behavior; $H < 0.5$ indicates mean-reverting (anti-persistent) behavior; $H = 0.5$ indicates random walk. Minimum bar floor: 20 (below this, R/S analysis is statistically unreliable). |
| _rolling_fractal_dimension | (series, window) → series | [1, 2] | Higuchi fractal dimension. D=1 for a smooth curve, D=2 for space-filling noise. Values near 1.5 are typical for financial time series. Higher values indicate more complex/noisy price paths. Minimum bar floor: 12 (Higuchi method requires at least 12 samples for stable k-max estimation). |
| _rolling_dfa | (series, window) → series | [0, 2] | Detrended Fluctuation Analysis exponent. α > 0.5 indicates long-range correlations (trending); α < 0.5 indicates anti-correlations (mean-reverting); α = 0.5 indicates white noise. More robust than the Hurst exponent for non-stationary series. |
| _garman_klass_vol | (high, low, close, open, window) → series | [0, ∞) | Garman-Klass volatility estimator. Uses full OHLC information, making it approximately 8× more efficient than close-to-close volatility estimation. Formula: $\sigma_{GK}^2 = \frac{1}{n}\sum_{i=1}^{n}\left[\frac{1}{2}\left(\ln\frac{H_i}{L_i}\right)^2 - (2\ln 2 - 1)\left(\ln\frac{C_i}{O_i}\right)^2\right]$, where $n$ is the rolling window length. |
| _rolling_beta | (y, x, window) → series | (−∞, ∞) | Rolling OLS regression beta coefficient. Computes cov(y, x) / var(x) on a rolling window. Used for cross-asset betas (gold returns regressed on DXY, NAS100, US500 returns). |
| _resid_z60 | (close) → series | (−∞, ∞) | Z-scored residual from a 60-bar rolling linear regression. Fits a linear trend to the last 60 close prices, computes the residual (actual - predicted), then z-scores it. Captures how far price deviates from its recent linear trend, normalized by the typical deviation magnitude. |
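Two statements in the table lend themselves to a quick numerical check: the Garman-Klass formula, and the equivalence between Wilder smoothing and an EMA with span = 2·window − 1. The sketch below is a scalar, standard-library illustration with hypothetical function names and made-up bars; it is not the vectorized helper.

```python
import math

def garman_klass_var(opens, highs, lows, closes) -> float:
    """Mean of 0.5*ln(H/L)^2 - (2 ln 2 - 1)*ln(C/O)^2 over the supplied bars."""
    k = 2 * math.log(2) - 1
    terms = [
        0.5 * math.log(h / l) ** 2 - k * math.log(c / o) ** 2
        for o, h, l, c in zip(opens, highs, lows, closes)
    ]
    return sum(terms) / len(terms)

def wilder_alpha(window: int) -> float:
    """Smoothing factor used by Wilder's method."""
    return 1.0 / window

def ema_alpha_from_span(span: int) -> float:
    """Smoothing factor of a standard EMA parameterized by span."""
    return 2.0 / (span + 1)

# Two illustrative M1 bars (made-up prices).
gk_var = garman_klass_var(
    opens=[2650.0, 2650.6], highs=[2651.2, 2651.0],
    lows=[2649.5, 2649.9], closes=[2650.6, 2650.1],
)
gk_vol = math.sqrt(gk_var)
print(gk_vol, wilder_alpha(14), ema_alpha_from_span(2 * 14 - 1))
```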

6. Feature Caching

6.1 Architecture

Feature computation is expensive: the full 107-feature pipeline on six months of M1 data requires approximately 90 seconds on a modern workstation. The bottleneck is the statistical features (Hurst, fractal dimension, DFA) which require rolling window computations over hundreds of thousands of rows. To avoid redundant computation, we implement a Parquet-based caching system:

  • Cache format: Apache Parquet with SNAPPY compression. Parquet is columnar, enabling efficient reading of individual feature columns without deserializing the entire frame. A 107-feature, 180K-row cache file is approximately 40 MB compressed.
  • Metadata: A companion metadata file stores the feature signature hash, creation timestamp, feature count, row count, and data time range. The metadata enables quick validation without reading the Parquet file.
  • Signature: The cache key is a composite hash of the features mode, official feature list, inversion list, Alpha101 flag, and any custom cache keys. The cache_key_dict can include additional application-specific keys (e.g., the data file modification timestamp). Any change to any component of this signature triggers a full cache rebuild.
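A composite signature of this kind can be sketched with hashlib and json. The function name, payload keys, and the "full" mode value below are assumptions made for illustration; the only details taken from the text are the components that feed the hash, the use of SHA-256 (per Figure 2), and the fact that feature order matters.

```python
import hashlib
import json

def cache_signature(features_mode, official_cols, inverted, enable_alpha101, cache_key_dict=None):
    """SHA-256 over a canonical JSON encoding of every signature component."""
    payload = {
        "features_mode": features_mode,
        "official_cols": list(official_cols),  # order kept: reordering must invalidate
        "inverted": sorted(inverted),          # assumption: membership matters, order does not
        "enable_alpha101": bool(enable_alpha101),
        "custom": cache_key_dict or {},        # values must be JSON-serializable
    }
    blob = json.dumps(payload, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(blob).hexdigest()

cols = ["accelz_60_30", "dist_ma120", "tod_cos"]
base = cache_signature("full", cols, ["tod_cos"], False, {"data_mtime": 1767571200})
same = cache_signature("full", cols, ["tod_cos"], False, {"data_mtime": 1767571200})
reordered = cache_signature("full", cols[::-1], ["tod_cos"], False, {"data_mtime": 1767571200})
toggled = cache_signature("full", cols, ["tod_cos"], True, {"data_mtime": 1767571200})
print(base == same, base == reordered, base == toggled)
```

Serializing with sort_keys=True keeps the hash stable across dictionary insertion order, so only genuine configuration changes alter the signature.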

6.2 Invalidation

The cache is automatically invalidated when any of the following change:

  • A feature is added to or removed from the official feature list
  • A feature is added to or removed from the feature inversion list
  • The order of features changes (affects model input layer ordering)
  • The Alpha101 flag (ENABLE_ALPHA101) is toggled (adds/removes 2 features)
  • The underlying data source is updated (detected via file modification timestamp in cache_key_dict)
  • Any custom cache key in cache_key_dict changes

On cache hit, feature loading takes approximately 0.3 seconds (vs. 90 seconds for full recomputation) — a 300× speedup. Cache rebuild events are logged with the reason for invalidation (which component of the signature changed), facilitating debugging when unexpected rebuilds occur. This makes iterative model development practical without risking stale feature data.
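Logging the reason for a rebuild amounts to diffing the stored signature components against the current configuration. The sketch below assumes the metadata keeps per-component values alongside the overall hash, which the text does not state; the function name, component keys, and values are hypothetical.

```python
def invalidation_reasons(stored: dict, current: dict) -> list[str]:
    """Names of signature components that differ between cache metadata and config."""
    keys = set(stored) | set(current)
    return sorted(k for k in keys if stored.get(k) != current.get(k))

stored = {"features_mode": "full", "enable_alpha101": False,
          "n_features": 105, "data_mtime": 1767571200}
current = {"features_mode": "full", "enable_alpha101": True,
           "n_features": 107, "data_mtime": 1767571200}

reasons = invalidation_reasons(stored, current)
if reasons:
    print("cache rebuild, changed components:", ", ".join(reasons))
else:
    print("cache hit")
```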

7. Minimum Bar Floors

Several statistical features require a minimum number of input bars to produce mathematically stable outputs. Below these thresholds, the estimators are unreliable or degenerate. The bar-floor scaling helper enforces minimum bar floors before computing these features:

| Feature | Minimum Bars | Rationale | Behavior Below Floor |
|---|---|---|---|
| Hurst exponent (_rolling_hurst) | 20 | R/S analysis requires partitioning the window into sub-ranges. With <20 bars, the partition sizes are too small for stable R/S estimation, and the Hurst estimate degenerates toward 0.5 (random walk) regardless of the true data-generating process. | Output is NaN, forward-filled from the last valid estimate |
| Fractal dimension (_rolling_fractal_dimension) | 12 | The Higuchi method computes curve lengths at multiple resolutions (k=1,2,...,k_max). With <12 bars, k_max is too small to fit a reliable log-log regression for the dimension estimate. | Output is NaN, forward-filled |
| Wavelet energy ratio | 2 | Wavelet decomposition at the minimum level (1) requires at least 2 data points. This is a very low bar: the wavelet feature is available from the 2nd bar onward. | Output is 0.5 (equal energy at all scales) |
| DFA exponent (_rolling_dfa) | 16 | DFA requires fitting linear trends to windows of multiple sizes. With <16 bars, the range of window sizes is too narrow for reliable exponent estimation. | Output is NaN, forward-filled |

The bar floors are enforced in the bar-floor scaling helper, which wraps each statistical feature computation and replaces outputs below the floor with NaN (or, for the wavelet energy ratio, with the neutral value 0.5). The NaN values are then forward-filled from the most recent valid estimate. This approach ensures that the model never sees garbage statistical estimates from insufficient data, while avoiding the loss of entire rows at the start of the dataset. In practice, the bar floors only affect the first 10–20 bars of a session (after a data gap or session start), which are typically excluded from training and inference anyway due to the sequence length requirements of the model (minimum 30 bars for the SHORT scale).
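The mask-then-forward-fill behavior can be sketched in a few lines. The function name and the per-bar bars_available input are assumptions about how the floor is tracked; the example values are made up, with the Hurst floor of 20 taken from the table.

```python
import math

def enforce_bar_floor(values: list[float], bars_available: list[int], floor: int) -> list[float]:
    """Mask estimates computed on fewer than `floor` bars, then forward-fill."""
    out, last_valid = [], math.nan
    for value, n_bars in zip(values, bars_available):
        if n_bars >= floor and not math.isnan(value):
            last_valid = value
        # Below the floor the raw estimate is discarded and the last valid one is reused;
        # before any valid estimate exists the output stays NaN.
        out.append(last_valid)
    return out

raw_hurst = [0.50, 0.50, 0.62, 0.58, 0.41, 0.66]
bars_available = [5, 19, 20, 25, 3, 30]   # the drop to 3 mimics a data gap
clean = enforce_bar_floor(raw_hurst, bars_available, floor=20)
print(clean)
```

The fifth estimate (0.41, computed on only 3 bars) is replaced by the previous valid value rather than passed to the model.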

8. Summary Feature Table

The complete pipeline organized by family, with feature counts:

| Group | Family | Count | Key Signals |
|---|---|---|---|
| Original (10) | Price dynamics | 7 | Acceleration, MA distance, residual z-scores (XAU, DXY, NAS100), efficiency ratio |
| | Temporal encoding | 3 | Time-of-day (sin/cos), lead correlation |
| OG Extended (26) | Returns | 6 | ret_1m, ret_5m, ret_30m, ret_60m, ret_120m, vol_30m |
| | MA distances | 3 | dist_ma_15, dist_ma_30, dist_ma_290 |
| | Correlations & betas | 13 | corr_xau_xag/dxy at 30/60/120, beta_xag 5/30/60/120, beta_dxy 30/60/120 |
| | XAU core | 4 | xaucore_5, xaucore_30, xaucore_60, xaucore_120 |
| Level & Channel (17) | KMeans levels | 13 | Distances (nearest/support/resistance, raw + ATR-normalized), level rank, near-level flag, touch/bounce counts, touch velocity, false breakouts, time-since-touch |
| | Quantile channels | 4 | Upper/lower bounds, width, position |
| Session (5) | Session indicators | 5 | Asian/London/NY/overlap flags, vol_session_ratio |
| Extended (47) | Price-volume interaction | 4 | Volume-weighted returns, OBV, PV correlation |
| | Tick proxies | 3 | Tick direction, intensity, bar-tick vol ratio |
| | Multi-TF momentum | 3 | RSI, momentum, divergence |
| | Regime indicators | 4 | Hurst, trend/MR class, persistence, change rate |
| | Volatility proxies | 3 | Parkinson, Garman-Klass, vol-of-vol |
| | Volatility clustering | 2 | Vol autocorrelation, breakout flag |
| | Risk-on/off | 3 | Gold-equity corr regime, VIX level/change |
| | Lead-lag improvements | 3 | DXY/NAS100/XAG lagged returns |
| | Candle patterns | 3 | Body ratio, wick ratios |
| | Liquidity windows | 3 | Session open/close proximity, London fix |
| | Calendar patterns | 3 | Week-of-month, month-of-quarter, NFP week |
| | S/R improvements | 3 | Round number proximity, prev day H/L, pivot |
| | Multi-scale analysis | 4 | Fractal dimension, DFA, wavelets, entropy |
| | Self-similarity | 6 | Autocorrelations (4 lags), partial autocorr, self-similarity |
| Core Total | | 105 | + 2 Alpha101 features (alpha024, alpha083) when ENABLE_ALPHA101=True = 107 |

Feature composition (105 core + 2 Alpha101 = 107 total)

Figure 1: Feature composition across the five major groups. The Extended group (47 features) is the largest, spanning 14 sub-families of market microstructure, regime, and cross-asset features. The 105 core features plus 2 conditionally-added Alpha101 features total 107.

Feature AUC stability across time periods

Figure 3: Feature AUC stability heatmap across different time periods. Features with consistent AUC values across periods (uniform colour) are the most reliable; those with high variance (mixed colours) may be regime-dependent.

Regime gate effectiveness versus feature AUC

Figure 4: Regime gate effectiveness versus feature AUC. Features with high static AUC tend to also receive high gate activation in the Variable Selection Network, but some features with moderate AUC show regime-dependent gating that amplifies their conditional predictive power.

Quantile regression channel with price overlay

Figure 5: Quantile regression channel (Q=0.1 and Q=0.9) overlaid on XAUUSD price data. The channel width and the position of price within the channel are two of the 17 level and channel features in the pipeline.
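
The width and position features can be illustrated with a short sketch. Note the substitution: rolling empirical quantiles stand in here for the fitted quantile regression lines, so this shows the feature arithmetic and not the channel estimator itself. The function name, column names, and the 120-bar default window are assumptions.

```python
import numpy as np
import pandas as pd


def channel_features(close: pd.Series, window: int = 120,
                     q_low: float = 0.1, q_high: float = 0.9) -> pd.DataFrame:
    """Upper/lower bounds, width, and position of price within a quantile channel.

    Stand-in estimator: rolling empirical quantiles, not quantile regression.
    """
    lower = close.rolling(window).quantile(q_low)
    upper = close.rolling(window).quantile(q_high)
    # A zero-width channel (flat prices) would divide by zero, so mask it.
    width = (upper - lower).replace(0.0, np.nan)
    # 0 = at the lower bound, 1 = at the upper bound; values outside [0, 1]
    # mean price has left the channel.
    position = (close - lower) / width
    return pd.DataFrame({
        "chan_lower": lower,
        "chan_upper": upper,
        "chan_width": width,
        "chan_position": position,
    })


# A steadily rising series ends above its own 0.9 quantile.
demo_close = pd.Series(np.arange(100.0))
demo_channel = channel_features(demo_close, window=20)
```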

XAUUSD close price versus AR(1) model prediction

Figure 6: XAUUSD close price versus AR(1) model prediction over a 24-hour window. The residual from this regression (the AR(1) residual z-score) is one of the original 10 features in the pipeline and captures short-term deviations from the linear trend.
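
One way to compute an AR(1) residual z-score is sketched below. The 60-bar window, the in-window OLS fit, and standardizing by the in-window residual standard deviation are assumptions made for illustration; the pipeline's exact estimator is not specified here.

```python
import numpy as np
import pandas as pd


def ar1_residual_z(close: pd.Series, window: int = 60) -> pd.Series:
    """Rolling AR(1) residual z-score (hypothetical implementation).

    For each bar t, fit x_s = a + b * x_{s-1} by least squares on the last
    `window` pairs ending at t, then standardize the residual of bar t by
    the residual standard deviation of that window.
    """
    x = close.to_numpy(dtype=float)
    out = np.full(len(x), np.nan)
    for t in range(window, len(x)):
        prev = x[t - window:t]            # lagged values x_{s-1}
        curr = x[t - window + 1:t + 1]    # current values x_s
        slope, intercept = np.polyfit(prev, curr, 1)
        resid = curr - (intercept + slope * prev)
        sd = resid.std(ddof=2)            # two fitted parameters
        if sd > 0:
            out[t] = resid[-1] / sd
    return pd.Series(out, index=close.index, name="ar1_resid_z")


# Synthetic random walk around a gold-like price level.
rng = np.random.default_rng(0)
walk = pd.Series(2000.0 + rng.normal(size=300).cumsum())
z = ar1_residual_z(walk, window=60)

# A sudden jump on the final bar shows up as a large positive z-score.
shocked = walk.copy()
shocked.iloc[-1] += 10.0
z_shocked = ar1_residual_z(shocked, window=60)
```

The explicit loop favours readability over speed; a production version at M1 frequency would more likely use rolling moments or an incremental update.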

9. Conclusion

A structured feature pipeline with rigorous quality control is essential for robust quantitative trading. Our 107-feature set balances breadth (6 instruments and 5 feature groups, with 14 sub-families in the Extended group alone) with discipline: every feature passes AUC validation, features with inverted polarity are corrected rather than discarded, and a hash-based caching system ensures reproducibility without stale data.

The pipeline supports both batch backtesting (full historical recomputation) and live execution (incremental feature updates with sub-second latency). The main entry point, the feature builder function, accepts M1 OHLCV DataFrames for all six instruments and returns a fully aligned feature matrix ready for model ingestion. The function handles all preprocessing (column renaming, timestamp alignment, missing data removal) internally, presenting a clean interface to the caller.
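
The preprocessing steps named above (column renaming, timestamp alignment, missing data removal) could look roughly like the following. This is a sketch under stated assumptions: `align_instruments`, the lowercase symbol-prefixed column scheme, and the inner-join alignment are illustrative choices, not the feature builder's actual internals.

```python
import pandas as pd


def align_instruments(frames: dict[str, pd.DataFrame]) -> pd.DataFrame:
    """Align per-instrument M1 frames on their common timestamps (hypothetical)."""
    prepared = []
    for symbol, df in frames.items():
        df = df.rename(columns=str.lower)                        # column renaming
        df = df[~df.index.duplicated(keep="last")].sort_index()  # one row per minute
        prepared.append(df.add_prefix(f"{symbol.lower()}_"))
    # Inner join keeps only timestamps present for every instrument.
    aligned = pd.concat(prepared, axis=1, join="inner")
    # Drop rows where any instrument still has a missing value.
    return aligned.dropna()


# Two instruments with partially overlapping timestamps and one missing bar.
xau = pd.DataFrame(
    {"Open": [1.0, 2.0, 3.0, 4.0, 5.0], "Close": [1.0, 2.0, 3.0, 4.0, 5.0]},
    index=pd.date_range("2026-02-02 00:00", periods=5, freq="1min"),
)
dxy = pd.DataFrame(
    {"Close": [10.0, None, 12.0, 13.0, 14.0]},
    index=pd.date_range("2026-02-02 00:02", periods=5, freq="1min"),
)
aligned = align_instruments({"XAUUSD": xau, "DXY": dxy})
```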

The most informative feature families, ranked by their contribution to pipeline AUC, are: (1) regime indicators (Hurst exponent, trend/MR classification), (2) volatility proxies (Garman-Klass, Parkinson, vol-of-vol), (3) multi-scale analysis (fractal dimension, DFA), (4) level features (KMeans distances, touch counts), and (5) price dynamics (acceleration z-score, efficiency ratio). Cross-asset features provide critical regime context but limited direct predictive power, consistent with our finding that cross-asset lead-lag relationships do not hold at the M1 frequency.

Key Finding: Of the 107 features, the Extended group (47 features) contributes the largest share of marginal AUC, with volatility proxies, regime indicators, and multi-scale analysis being the highest-value families. Cross-asset features provide critical regime context but limited direct predictive power. The Alpha101 screening (101 candidates, 2 survivors) demonstrates that feature sourcing from adjacent domains has extremely low yield — domain-specific engineering is irreplaceable.

The pipeline is maintained via the official feature list registry, which serves as both the canonical feature list and the cache invalidation key. Adding or removing a feature requires only updating this list; the caching system, model input layer, and validation suite adapt automatically. The minimum bar floors enforced by the bar-floor scaling helper ensure that statistical features are never computed on insufficient data, and the forward-fill strategy for sub-floor bars preserves row alignment without introducing garbage estimates into the training data.
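
To illustrate how a feature list can double as a cache invalidation key, the sketch below hashes the ordered column names. The function name, the SHA-256 choice, the 16-character truncation, and the `pipeline_version` field are assumptions; only the feature names and the ENABLE_ALPHA101 flag come from the text above.

```python
import hashlib
import json


def feature_cache_key(feature_columns: list[str], pipeline_version: str = "v1") -> str:
    """Derive a short, deterministic cache key from the ordered feature list.

    Hypothetical helper: any change to the list (add, remove, reorder)
    yields a different key, so stale cached matrices are never reused.
    """
    payload = json.dumps(
        {"version": pipeline_version, "features": list(feature_columns)},
        separators=(",", ":"),
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()[:16]


# Truncated stand-in for the 105-entry core registry.
CORE_FEATURES = ["ret_1m", "ret_5m", "dist_ma_15"]
ENABLE_ALPHA101 = True
ALPHA101_FEATURES = ["alpha024", "alpha083"] if ENABLE_ALPHA101 else []

key_core = feature_cache_key(CORE_FEATURES)
key_full = feature_cache_key(CORE_FEATURES + ALPHA101_FEATURES)
```

Keeping the hash order-sensitive is deliberate in this sketch: the model input layer depends on column order, so a reordered registry should also invalidate the cache.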