Fixing Gold Market Overfitting: A Predictive Machine Learning Method – Trading Systems – 17 February 2026

By admin2010

February 17, 2026


Fixing Gold Market Overfitting: A Predictive Machine Learning Method with ONNX and Gradient Boosting

Case Study: The “Golden Gauss” Architecture

Author: Daglox Kankwanda

ORCID: 0009-0000-8306-0938
Technical Paper: Zenodo Repository (DOI: 10.5281/zenodo.18646499)

The algorithmic trading space, notably in retail markets, faces a fundamental credibility problem. The pattern is predictable and pervasive: systems exhibit impressive backtest performance, followed by rapid degradation in forward testing, culminating in account destruction during live deployment. This failure mode stems from a single root cause: optimization for in-sample performance without rigorous out-of-sample validation.

The mathematical reality is simple: given sufficient degrees of freedom, any model can “memorize” historical price patterns. Such memorization produces impressive backtest metrics while providing zero predictive power for future market behavior. The model has learned the noise, not the signal.

Beyond overfitting, traditional indicator-based approaches suffer from a fundamental timing deficiency. Technical indicators, by construction, are reactive: they process historical data to generate signals after price movements have already begun.

Core Thesis: A truly useful trading system must identify the conditions preceding significant price activity, not the activity itself. The goal is prediction, not confirmation.

This article presents a methodology that synthesizes machine learning research insights into a practical, deployable trading system for XAUUSD (Gold) markets, demonstrated through the “Golden Gauss” architecture.

2. The Core Problems in Algorithmic Trading

2.1 The Overfitting Crisis

The proliferation of “AI-powered” trading systems in retail markets has created a credibility crisis, with most systems exhibiting catastrophic failure when deployed on unseen data due to severe overfitting.

Figure 1: Conceptual illustration of the typical Expert Advisor lifecycle. Models optimized for historical performance frequently fail catastrophically when deployed on unseen market conditions.

2.2 The Latency Problem in Technical Analysis

Technical indicators are inherently reactive:

By the time RSI crosses the overbought threshold, the price has already moved significantly
By the time a MACD crossover confirms, the optimal entry window has passed
By the time a breakout is “confirmed,” stop-loss requirements have expanded significantly

Figure 2: Comparison of timing between reactive technical indicators and predictive machine learning approaches. Traditional indicators confirm moves after the optimal entry has passed, while predictive systems identify setup conditions before execution.

2.3 Literature Context

The application of machine learning to financial time-series prediction has advanced considerably. Several consistent findings are relevant:

Finding	Implication
Gradient Boosting Dominance on Tabular Data	Despite the marketing appeal of “deep learning,” ensemble methods consistently outperform neural networks on structured financial data
Feature Engineering Criticality	Quality of engineered features typically determines model success more than architectural choices
Temporal Validation Requirements	Standard cross-validation that shuffles data is inappropriate for financial time-series due to lookahead bias
Cross-Asset Information	Financial instruments do not trade in isolation; correlated instruments provide useful context

3. Methodology

3.1 The Predictive Labeling Methodology

Standard approaches to training trading models label data at the point where price movement occurs. This creates a fundamental problem: if the model learns features calculated from the same bars that are labeled, it effectively learns to recognize moves that are already happening rather than moves that are about to happen.

The Golden Gauss architecture employs a methodology that maintains temporal separation between feature calculation and label placement:

The labeling process identifies profitable zones where price moved significantly in a specific direction
All features are calculated from market data that occurred before the labeled zone begins

Figure 3: Manual labeling interface showing XAUUSD price action with identified directional zones. The labeled BUY and SELL areas represent profitable moves used as training targets; the model learns to predict these moves using features calculated from preceding market data.

Implications: This temporal separation ensures the model learns to recognize preconditions (the market microstructure patterns that precede significant moves) rather than characteristics of the moves themselves.
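
To make the separation concrete, here is a minimal Python sketch, not the author's labeling engine. The `Zone` structure, the `lookback` length, and the two toy features are hypothetical; the only point it illustrates is that the feature window ends strictly before the first bar of the labeled zone.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Zone:
    start: int      # index of the first bar inside the labeled zone
    end: int        # index of the last bar inside the labeled zone
    direction: str  # "BUY" or "SELL"


def feature_window(closes: Sequence[float], zone: Zone, lookback: int) -> List[float]:
    """Return the `lookback` closes that end strictly before the zone starts."""
    if zone.start < lookback:
        raise ValueError("not enough history before the zone")
    # The slice stops at zone.start (exclusive), so no zone bar leaks in.
    return list(closes[zone.start - lookback:zone.start])


def simple_features(window: Sequence[float]) -> dict:
    """Two toy features (hypothetical) computed only from pre-zone bars."""
    return {
        "momentum": window[-1] / window[0] - 1.0,
        "range": max(window) - min(window),
    }


closes = [100.0, 100.5, 100.2, 100.8, 101.0, 103.0, 105.0, 106.0]
zone = Zone(start=5, end=7, direction="BUY")
window = feature_window(closes, zone, lookback=4)
features = simple_features(window)
```

Here the window covers bars 1 to 4, while the labeled move occupies bars 5 to 7, so the features cannot describe the move itself.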

3.2 Quality-Filtered Training Labels

Not all price movements are meaningful or tradeable. Many are:

Too small to overcome transaction costs (spread + commission)
Too erratic to execute cleanly
Part of larger consolidation patterns without directional follow-through

The labeling process applies strict filtering criteria, identifying only zones where price moved with sufficient magnitude and directional consistency. This ensures the model learns only from setups that exceeded minimum profitability thresholds.
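
The article does not publish its filtering criteria, so the following Python sketch is only one plausible way to express "sufficient magnitude and directional consistency". It assumes a net-move threshold after costs and an efficiency ratio (net move divided by total path length); the function names and all numeric thresholds are hypothetical.

```python
def zone_quality(closes, start, end):
    """Return (net_move, efficiency) for the bars closes[start..end]."""
    net_move = closes[end] - closes[start]
    # Total distance travelled bar to bar; equals |net_move| for a straight move.
    path = sum(abs(closes[i] - closes[i - 1]) for i in range(start + 1, end + 1))
    efficiency = abs(net_move) / path if path > 0 else 0.0
    return net_move, efficiency


def keep_zone(closes, start, end, min_move, cost, min_efficiency):
    """Keep a zone only if it clears costs and moves consistently one way."""
    net_move, efficiency = zone_quality(closes, start, end)
    return abs(net_move) - cost >= min_move and efficiency >= min_efficiency


trending = [2000.0, 2001.5, 2003.0, 2004.0]   # clean directional move
choppy = [2000.0, 2003.0, 1999.0, 2001.0]     # erratic, little net progress

keep_trending = keep_zone(trending, 0, 3, min_move=2.0, cost=0.5, min_efficiency=0.6)
keep_choppy = keep_zone(choppy, 0, 3, min_move=2.0, cost=0.5, min_efficiency=0.6)
```

The trending zone passes both checks, while the choppy zone fails on magnitude and on efficiency, which mirrors the three rejection reasons listed above.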

3.3 Dual-Model Directional Architecture

Market dynamics exhibit fundamental asymmetry between bullish and bearish behavior:

Accumulation patterns differ structurally from distribution patterns
Fear-driven selling typically executes faster than greed-driven buying
Support behavior differs from resistance behavior
Volume characteristics differ between advances and declines

To respect this asymmetry, the architecture employs two independent binary models:

Model	Output	Training Data
BUY Model	P(Bullish Move Imminent)	Trained exclusively on bullish labels
SELL Model	P(Bearish Move Imminent)	Trained exclusively on bearish labels

Each model is a binary classifier detecting only its respective directional setup. This prevents the confusion that occurs when a single model attempts to learn contradictory patterns simultaneously.
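
As a rough illustration of how one labeled dataset could feed two independent binary classifiers, the Python sketch below derives a separate target vector for each model. The label strings and the choice to treat opposite-direction bars as negatives (rather than excluding them) are assumptions; the article does not specify this detail.

```python
def binary_targets(labels):
    """Split directional labels into one binary target per model.

    labels: sequence of "BUY", "SELL", or "NONE" (hypothetical encoding).
    Returns (buy_y, sell_y), each a list of 0/1 targets.
    """
    buy_y = [1 if lab == "BUY" else 0 for lab in labels]
    sell_y = [1 if lab == "SELL" else 0 for lab in labels]
    return buy_y, sell_y


labels = ["NONE", "BUY", "BUY", "NONE", "SELL", "NONE"]
buy_y, sell_y = binary_targets(labels)
```

Each target vector would then be passed to its own gradient boosting classifier, so neither model is asked to separate bullish from bearish setups directly.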

3.4 Walk-Forward Validation Protocol

Standard machine learning cross-validation, which shuffles data randomly, is inappropriate for financial time-series due to temporal dependencies and lookahead bias risks.

The system uses strict walk-forward validation with full chronological separation:

Training data extends through December 31, 2024
All architectural decisions, hyperparameters, and feature engineering choices were finalized using only this data
The model was then frozen and validated on a 13-month out-of-sample period (January 2025 through January 2026)

Figure 4: Temporal data separation for walk-forward validation. Training data extends through the end of 2024; all 2025-2026 evaluation represents strictly out-of-sample performance on data not used for training.

Critical Rules:

No shuffling of time-series data
Evaluation period analysis only after all model decisions are finalized
No iterative “peeking” at evaluation results to adjust parameters
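
The chronological split described above can be sketched in a few lines of Python. The cutoff date comes from the article; the row layout (timestamp first) and the function name are assumptions made for illustration.

```python
from datetime import datetime

# Cutoff stated in the article: training data extends through December 31, 2024.
TRAIN_END = datetime(2024, 12, 31, 23, 59, 59)


def chronological_split(rows, train_end=TRAIN_END):
    """Split (timestamp, ...) rows by time, never by random shuffling."""
    ordered = sorted(rows, key=lambda r: r[0])
    train = [r for r in ordered if r[0] <= train_end]
    test = [r for r in ordered if r[0] > train_end]
    return train, test


rows = [
    (datetime(2025, 3, 1, 14, 0), "c"),
    (datetime(2023, 6, 1, 14, 0), "a"),
    (datetime(2024, 12, 31, 17, 59), "b"),
    (datetime(2026, 1, 15, 15, 0), "d"),
]
train, test = chronological_split(rows)
```

Every training timestamp precedes every evaluation timestamp, which is the property a shuffled split cannot guarantee.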

4. System Architecture

The system comprises two distinct but integrated components:

Training Pipeline: implemented in Python for model development and validation
Execution Engine: implemented in MQL5 for real-time deployment within MetaTrader 5

Figure 5: High-level architecture of the system. The training pipeline (top) processes historical data through feature engineering and model training, exporting via ONNX. The execution engine (bottom) calculates features in real time, obtains probability scores, and applies trade management logic for position execution.

4.1 Model Architecture Selection

The choice of model architecture was driven by empirical evaluation against criteria specific to financial time-series prediction:

Criterion	Priority
Performance on structured/tabular data	Critical
Robustness to noise and outliers	Critical
Handling of regime changes	High
Training data efficiency	High
Inference speed for live deployment	High
Interpretability (feature importance)	Medium

Based on extensive testing, Gradient Boosting Decision Trees (GBDT) were selected. This choice aligns with consistent findings in the machine learning literature that GBDT architectures outperform deep learning approaches on structured financial data.

Why Not Neural Networks?

While “Neural Network” generates marketing appeal, the technical reality for tabular financial data is:

GBDTs handle feature interactions naturally without explicit specification
GBDTs are more robust to noise and outliers in financial data
GBDTs require significantly less training data
GBDTs provide interpretable feature importance rankings
GBDTs train faster, enabling more extensive hyperparameter search

4.2 ONNX Deployment

The model is exported via ONNX (Open Neural Network Exchange) for platform-agnostic deployment, enabling Python-trained models to execute at C++ speeds within MT5.

A critical requirement is training-serving parity: feature calculations in MQL5 must be mathematically identical to those performed during Python training. Any discrepancy creates “training-serving skew” that degrades model performance.
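
One way to guard against such skew is to dump a feature vector from each side for the same bar and compare them element by element. The Python sketch below assumes both vectors are already available as lists of floats; the function name and tolerances are hypothetical and would need tuning to the float precision used in the MQL5 code.

```python
import math


def parity_report(python_features, mql5_features, rel_tol=1e-5, abs_tol=1e-7):
    """Return the indices where the two feature vectors disagree."""
    if len(python_features) != len(mql5_features):
        raise ValueError("feature vectors have different lengths")
    return [
        i
        for i, (a, b) in enumerate(zip(python_features, mql5_features))
        if not math.isclose(a, b, rel_tol=rel_tol, abs_tol=abs_tol)
    ]


py_vec = [0.25, 14.2, -1.5]
mq_vec = [0.25, 14.2000001, -1.4]   # last value simulates a skewed calculation
mismatches = parity_report(py_vec, mq_vec)
```

An empty list means the two implementations agree within tolerance for that bar; any returned index points at a feature whose MQL5 calculation should be reviewed.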

4.3 The MQL5-ONNX Interface

The bridge between Python training and MQL5 execution relies on the native ONNX API introduced in MetaTrader 5 Build 3600. The primary engineering challenge is ensuring the input tensor shape matches the Python export exactly, and correctly interpreting the classifier’s dual-output structure.

Below is the structural logic used to initialize and run inference with the Gradient Boosting model within the Expert Advisor:

Model Initialization

// Embed the exported model in the compiled Expert Advisor
#resource "InformationBULLISH_Model.onnx" as uchar ExtModelBuy[]
long g_onnx_buy = INVALID_HANDLE;
const int SNIPER_FEATURES = 239;

bool InitializeONNXModels()
{
    Print("Loading ONNX models...");

    // Create the inference session from the embedded buffer
    g_onnx_buy = OnnxCreateFromBuffer(ExtModelBuy, ONNX_DEFAULT);
    if(g_onnx_buy == INVALID_HANDLE)
    {
        Print("[FAIL] Failed to load BUY model: ", GetLastError());
        return false;
    }

    // Fix the input shape to one sample of SNIPER_FEATURES values
    ulong input_shape_buy[] = {1, SNIPER_FEATURES};
    if(!OnnxSetInputShape(g_onnx_buy, 0, input_shape_buy))
    {
        Print("[FAIL] Failed to set BUY model input shape: ", GetLastError());
        return false;
    }

    Print("   [OK] BUY model loaded successfully");
    return true;
}

Probability Inference

The classifier outputs two tensors: predicted labels and class probabilities. For probability-based execution, we extract the probability of the target class:

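bool GetBuyPrediction(const float &features[], double &probability)
{
    probability = 0.0;

    if(g_onnx_buy == INVALID_HANDLE)
    {
        Print("[FAIL] BUY model not loaded");
        return false;
    }

    // Reject vectors whose length differs from the exported input shape
    if(ArraySize(features) != SNIPER_FEATURES)
    {
        Print("[FAIL] Expected ", SNIPER_FEATURES, " features, got ", ArraySize(features));
        return false;
    }

    // Copy into a local buffer that matches the {1, SNIPER_FEATURES} input
    float input_data[];
    ArrayResize(input_data, SNIPER_FEATURES);
    ArrayCopy(input_data, features);

    // First output: predicted label; second output: class probabilities
    long output_labels[];
    float output_probs[];

    ArrayResize(output_labels, 1);
    ArrayResize(output_probs, 2);
    ArrayInitialize(output_labels, 0);
    ArrayInitialize(output_probs, 0.0f);

    if(!OnnxRun(g_onnx_buy, ONNX_NO_CONVERSION, input_data, output_labels, output_probs))
    {
        int error = GetLastError();
        Print("[FAIL] BUY ONNX inference failed: ", error);
        return false;
    }

    // Index 0 holds the target-class probability under this model's class mapping
    probability = (double)output_probs[0];

    return true;
}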

Key Implementation Details:

Dual-Output Structure: Gradient Boosting classifiers exported via ONNX produce two outputs: the predicted label and the probability distribution across classes. The probability output is used for threshold-based execution.
Class Mapping: Class 0 represents the target condition (BULLISH for the BUY model). The probability output_probs[0] directly indicates model confidence in an imminent bullish move.
Shape Validation: Strict shape checking at initialization catches training-serving mismatches immediately rather than producing silent prediction errors during live trading.

4.4 Execution Configuration

Parameter	Value
Symbol	XAUUSD only
Timeframe	M1 (feature calculation)
Active Hours	14:00–18:00 (broker time, configurable)
Probability Threshold	88%
Stop Loss	Fixed initial; dynamically managed
Take Profit	Target-based with ratchet protection
Prohibited Strategies	No grid, no martingale
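
For readers prototyping on the Python side, the table above could be mirrored as a small configuration object with a gate function. This is a sketch, not the Expert Advisor's code: the class and function names are hypothetical, and treating the session end as exclusive and the threshold as inclusive are assumptions.

```python
from dataclasses import dataclass
from datetime import time


@dataclass(frozen=True)
class ExecutionConfig:
    symbol: str = "XAUUSD"
    timeframe: str = "M1"
    active_start: time = time(14, 0)    # broker time
    active_end: time = time(18, 0)      # broker time, treated as exclusive here
    probability_threshold: float = 0.88


def may_open_trade(cfg, symbol, broker_time, probability):
    """Check symbol, session window, and probability threshold together."""
    in_session = cfg.active_start <= broker_time < cfg.active_end
    return (
        symbol == cfg.symbol
        and in_session
        and probability >= cfg.probability_threshold
    )


cfg = ExecutionConfig()
allowed = may_open_trade(cfg, "XAUUSD", time(15, 30), 0.91)
blocked_by_time = may_open_trade(cfg, "XAUUSD", time(19, 0), 0.95)
blocked_by_prob = may_open_trade(cfg, "XAUUSD", time(15, 30), 0.80)
```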

5. Feature Engineering

The system processes 239 engineered features across multiple research-backed domains. These features were developed through academic literature review, domain expertise in market microstructure, and iterative empirical testing with strict validation protocols.

5.1 Feature Categories Overview

Category	Conceptual Focus
Volatility Regime	Market state classification, tradeable vs. non-tradeable conditions
Momentum	Multi-scale rate of change, trend persistence
Volume Dynamics	Participation levels, unusual activity detection
Price Structure	Support/resistance proximity, range position
Cross-Asset	Correlated instrument signals, correlation regime shifts
Microstructure	Directional pressure and short-horizon stress proxies
Temporal	Session timing, cyclical patterns
Sequential	Pattern recognition, run-length analysis

5.2 Key Driving Features

The following features consistently ranked among the most influential according to global SHAP importance analysis:

ADX Trend Strength (14-period): Measuring trend strength, independent of direction
VWAP Volatility Deviation: Distance of price from intraday VWAP, normalized by recent volatility
Volatility Regime Classifier: ATR relative to its moving average, indicating low-, normal-, or high-volatility states
MACD Histogram Momentum: Capturing short-term momentum and potential reversals
60-minute Gold/DXY Rolling Correlation: Rolling correlation between XAUUSD and DXY returns
60-minute Gold/USDJPY Rolling Correlation: Rolling correlation between XAUUSD and USDJPY returns
Directional Volatility Regime: Signed volatility feature combining EMA-based trend strength with current ATR regime
Order-Flow Persistence: Proxy for how long directional moves persist across recent candles
EMA Spread Dynamics: Distances and slopes between fast and slow EMAs

The presence of well-known indicators (ADX, MACD) alongside proprietary regime and correlation features demonstrates that the model enhances, rather than replaces, established market relationships with higher-resolution timing signals.
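
As an example of how one of these features might be computed, the Python sketch below classifies the volatility regime from the ratio of the latest ATR to its own moving average. The exact formula used by the system is not published: the simple-average ATR (rather than Wilder smoothing), the periods, and the 0.8 and 1.2 cutoffs are all assumptions.

```python
def true_ranges(highs, lows, closes):
    """True range per bar, starting from the second bar."""
    out = []
    for i in range(1, len(closes)):
        out.append(max(
            highs[i] - lows[i],
            abs(highs[i] - closes[i - 1]),
            abs(lows[i] - closes[i - 1]),
        ))
    return out


def volatility_regime(trs, atr_period, ma_period, low=0.8, high=1.2):
    """Return (ratio, state) where ratio = latest ATR / moving average of ATR."""
    atrs = [
        sum(trs[i - atr_period + 1:i + 1]) / atr_period
        for i in range(atr_period - 1, len(trs))
    ]
    if len(atrs) < ma_period:
        raise ValueError("not enough bars for the ATR moving average")
    baseline = sum(atrs[-ma_period:]) / ma_period
    if baseline == 0:
        raise ValueError("flat series: ATR baseline is zero")
    ratio = atrs[-1] / baseline
    if ratio < low:
        state = "low"
    elif ratio > high:
        state = "high"
    else:
        state = "normal"
    return ratio, state


calm_then_spike = [1.0] * 10 + [3.0] * 2
ratio, state = volatility_regime(calm_then_spike, atr_period=2, ma_period=5)
```

With this input the last five ATR values are 1, 1, 1, 2, and 3, so the ratio is 3 / 1.6 and the bar is classified as high volatility.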

5.3 Cross-Asset Intelligence

Gold (XAUUSD) does not trade in isolation. Its price action is influenced by:

US Dollar Dynamics: Typically inverse correlation; dollar strength often pressures gold prices
Safe-Haven Flows: Correlation with other safe-haven assets during risk-off periods
Yield Expectations: Relationship with real interest rate proxies

The feature set incorporates lagged returns from correlated instruments, rolling correlations at multiple time scales, divergence detection, and regime change signals.
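
A rolling correlation of returns, one of the cross-asset inputs mentioned above, can be sketched in plain Python as follows. The window length, the price series, and the function names are illustrative assumptions; the article does not disclose its actual parameters beyond the 60-minute horizon named earlier.

```python
import math


def returns(prices):
    """Simple bar-to-bar returns."""
    return [prices[i] / prices[i - 1] - 1.0 for i in range(1, len(prices))]


def pearson(x, y):
    """Pearson correlation; returns 0.0 when either series is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)


def rolling_correlation(gold, other, window):
    """Correlation of returns over each trailing window of `window` bars."""
    rg, ro = returns(gold), returns(other)
    return [
        pearson(rg[i - window:i], ro[i - window:i])
        for i in range(window, len(rg) + 1)
    ]


gold = [2000.0, 2002.0, 2001.0, 2004.0, 2003.0, 2006.0]
dxy = [104.0, 103.9, 103.95, 103.8, 103.85, 103.7]   # made-up inverse moves
corr = rolling_correlation(gold, dxy, window=4)
```

In this toy series the dollar index falls whenever gold rises, so every rolling value is negative; a shift toward zero or positive values would be the kind of correlation regime change the feature set is meant to capture.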

6. Validation and Results

The validation approach follows a single principle: demonstrate generalization, not memorization. Any model can achieve impressive results on data it has seen. The only meaningful evaluation is performance on strictly unseen data.

6.1 Out-of-Sample Performance

All 2025 performance represents true out-of-sample (OOS) results. The model architecture, hyperparameters, and feature set were frozen before any 2025 data was evaluated.

Figure 6: Backtest equity and balance curves from Jan 2021 to Jan 2026. The period Jan 2021–Dec 2024 represents data included in model training; the period Jan 2025–Jan 2026 constitutes strictly out-of-sample evaluation.

Metric	Full Period (Jan 2021–Jan 2026)	OOS Only (Jan 2025–Jan 2026)
Win Rate	88.71%	83.67%
Total Trades	1,030	319
Profit Factor	1.77	1.50
Sharpe Ratio	9.90	13.9
Max Drawdown (0.01 lot)	~$500	~$313
Recovery Factor	11.57	3.66
Avg Holding Time	30 min 30 sec	30 min 30 sec

Interpretation: The out-of-sample period demonstrates continued profitability, with metrics that degrade gracefully relative to the full backtest period:

Win rate decreases from 88.71% to 83.67%, a controlled reduction of about 5 percentage points indicating the model generalizes rather than memorizes
Profit factor holds at 1.50, well above the break-even value of 1.0, confirming positive expectancy on unseen data
The higher OOS Sharpe ratio (13.9 vs 9.90) provides strong evidence against overfitting

This performance gap is expected and healthy. The controlled degradation is consistent with genuine pattern generalization.
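
For readers reproducing these metrics from a trade log, the Python sketch below computes win rate and profit factor under their usual definitions, which are assumed to match the platform's report. It also derives the ratio of average loss to average win implied by a given win rate and profit factor, using only the two OOS figures from the table above. The sample P&L list is made up.

```python
import math


def win_rate(pnl):
    """Share of trades with positive profit."""
    return sum(1 for p in pnl if p > 0) / len(pnl)


def profit_factor(pnl):
    """Gross profit divided by gross loss."""
    gross_profit = sum(p for p in pnl if p > 0)
    gross_loss = -sum(p for p in pnl if p < 0)
    return math.inf if gross_loss == 0 else gross_profit / gross_loss


def implied_loss_to_win_ratio(wr, pf):
    """Average loss / average win implied by a win rate and a profit factor.

    From pf = (wr * avg_win) / ((1 - wr) * avg_loss), ignoring break-even trades.
    """
    return wr / ((1.0 - wr) * pf)


sample_pnl = [1.0, 1.0, 1.0, 1.0, -2.0]   # made-up trade results
sample_wr = win_rate(sample_pnl)
sample_pf = profit_factor(sample_pnl)
oos_ratio = implied_loss_to_win_ratio(0.8367, 1.50)
```

Under these definitions, the OOS figures imply an average losing trade roughly 3.4 times the size of an average winning trade, which is why a win rate above 80% coexists with a profit factor of 1.50.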

6.2 Probability Threshold Analysis

The model outputs continuous probability scores. Analysis reveals the relationship between probability levels and trade outcomes:

Probability Range	Trades	Win Rate
0.880 – 0.897	231	88.3%
0.897 – 0.923	167	90.4%
0.923 – 0.950	190	93.2%
0.950 – 0.976	107	87.9%
0.976 – 0.993	27	96.3%

Why 88% Minimum Threshold? The 88% threshold was determined through systematic evaluation as the optimal entry point balancing trade frequency against quality. Below this threshold, false-positive rates increase significantly.
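
A bucketed table like the one above can be produced from a list of (probability, outcome) pairs. The Python sketch below uses the bucket edges from the table; the function names and the sample trades are hypothetical, and the top edge is treated as inclusive by assumption.

```python
def bucket_index(prob, edges):
    """Index of the bucket containing prob, or None if it is out of range."""
    last = len(edges) - 2
    for k in range(len(edges) - 1):
        below_upper = prob < edges[k + 1] or (k == last and prob == edges[-1])
        if edges[k] <= prob and below_upper:
            return k
    return None


def win_rate_by_bucket(trades, edges):
    """trades: (probability, won) pairs. Returns (low, high, count, win_rate)."""
    counts = [0] * (len(edges) - 1)
    wins = [0] * (len(edges) - 1)
    for prob, won in trades:
        k = bucket_index(prob, edges)
        if k is not None:
            counts[k] += 1
            wins[k] += int(won)
    return [
        (edges[k], edges[k + 1], counts[k],
         wins[k] / counts[k] if counts[k] else None)
        for k in range(len(counts))
    ]


edges = [0.880, 0.897, 0.923, 0.950, 0.976, 0.993]
trades = [(0.885, True), (0.890, False), (0.930, True), (0.940, True), (0.980, True)]
table = win_rate_by_bucket(trades, edges)
```

Running the same routine with lower edges added below 0.880 is one way to check the claim that false positives rise under the threshold.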

6.3 Exit Composition Analysis

Exit Type	Share	Interpretation
Ratchet Profit (SL_WIN)	87.1%	Dynamic profit capture
Take Profit (TP)	3.2%	Full target reached
Stop Loss (SL_LOSS)	9.7%	Controlled losses

The vast majority of winning trades exit via the ratchet system, capturing profits dynamically rather than waiting for the full TP.

6.4 Temporal Consistency

Year	Trades	Win Rate	Status
2021	172	93.6%	Training
2022	125	93.6%	Training
2023	64	87.5%	Training
2024	124	93.5%	Training
2025	237	85.2%	Out-of-Sample
2026	—	—	—

All years were profitable, with consistent performance patterns across training and out-of-sample periods.

7. Trade Management

The system implements a comprehensive trade management layer that extends beyond simple entry execution.

7.1 Probability-Based Decision Making

Unlike systems that generate discrete “buy” or “sell” signals, the architecture calculates probability scores in real time on each new bar:

Entry Decision: Probability must exceed the 88% threshold before position opening
Direction Selection: The higher probability between the BUY and SELL models determines direction
Exit Timing: Probability changes inform position closure decisions
Hold/Close Logic: Continuous probability monitoring during open positions

7.2 Entry Validation and Filtering

Dual-Model Confirmation: Both BUY and SELL model probabilities are assessed to confirm directional bias and filter ambiguous conditions
Regime Filtering: Additional filters detect unfavorable market regimes (extreme volatility events, low liquidity periods)
Conditional Execution: Trade execution proceeds only after probability thresholds are satisfied and regime filters confirm favorable conditions
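
The entry rules in sections 7.1 and 7.2 can be combined into a single decision function. The Python sketch below is an interpretation, not the shipped logic: the 0.88 threshold comes from the article, while the `min_gap` ambiguity margin, the inclusive threshold comparison, and the boolean `regime_ok` flag are assumptions standing in for filters the article does not detail.

```python
from typing import Optional


def entry_decision(p_buy: float, p_sell: float, regime_ok: bool,
                   threshold: float = 0.88, min_gap: float = 0.10) -> Optional[str]:
    """Return "BUY", "SELL", or None when no trade should be opened."""
    if not regime_ok:
        return None                      # regime filter vetoes the trade
    if max(p_buy, p_sell) < threshold:
        return None                      # neither model is confident enough
    if abs(p_buy - p_sell) < min_gap:
        return None                      # both models fire: ambiguous setup
    return "BUY" if p_buy > p_sell else "SELL"


decisions = [
    entry_decision(0.92, 0.10, True),    # clear bullish setup
    entry_decision(0.10, 0.95, True),    # clear bearish setup
    entry_decision(0.92, 0.10, False),   # blocked by regime filter
    entry_decision(0.80, 0.10, True),    # below threshold
    entry_decision(0.90, 0.89, True),    # ambiguous
]
```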

7.3 Ratchet Profit Protection

Problem Addressed: Price may move 80% toward the take-profit level, then reverse; without active management, this unrealized profit would be lost.

Ratchet Solution: As price moves favorably, the system progressively locks in profit by tightening exit conditions, ensuring that significant favorable moves are captured even when the full take-profit is not reached.
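
The ratchet parameters are not disclosed, so the Python sketch below only illustrates the general mechanism for a long position: once price has covered a given fraction of the distance to the take-profit, the stop is raised to lock in a given fraction of that distance, and it never moves back. The step table and all prices are hypothetical.

```python
def ratchet_stop(entry, take_profit, best_price, initial_stop, steps):
    """Stop level for a long position after price has reached best_price.

    steps: (progress_fraction, locked_fraction) pairs in ascending order.
    """
    target = take_profit - entry
    progress = (best_price - entry) / target
    stop = initial_stop
    for reached, locked in steps:
        if progress >= reached:
            # Only ever tighten: keep the highest stop seen so far.
            stop = max(stop, entry + locked * target)
    return stop


steps = [(0.5, 0.0), (0.8, 0.5)]   # hypothetical: break-even at 50%, lock half at 80%
early = ratchet_stop(2000.0, 2010.0, 2003.0, 1995.0, steps)
halfway = ratchet_stop(2000.0, 2010.0, 2005.0, 1995.0, steps)
near_tp = ratchet_stop(2000.0, 2010.0, 2008.0, 1995.0, steps)
```

In the 80% scenario described above, a reversal would close the trade at 2005 instead of giving back the whole move.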

7.4 Ratchet Loss Minimization

Problem Addressed: Even high-confidence predictions occasionally fail; waiting for the fixed stop-loss results in maximum loss on every losing trade.

Ratchet Solution: When price moves adversely, the system actively manages the exit to minimize loss rather than passively waiting for stop-loss execution, reducing average loss per unsuccessful trade.

8. Honest Limitations

8.1 What This System Is NOT

Not infallible: Roughly 15–18% of signals result in suboptimal entries depending on market conditions
Not universal: Trained exclusively for XAUUSD with its specific market microstructure and session dynamics
Not static: Periodic retraining (every 3–6 months) is required as markets evolve
Not guaranteed: Out-of-sample validation demonstrates methodology soundness but does not guarantee future performance

8.2 Known Risk Factors

Risk	Description	Mitigation
Regime Change	Market structure evolves through policy shifts and geopolitical events	Periodic retraining protocol
Execution Risk	Slippage during volatility can degrade realized results	Session-aware execution, active hours restriction
Edge Decay	Predictive edges face decay as markets evolve	Retraining with methodology preservation
Concentration	Exclusive XAUUSD focus provides no diversification	User responsibility for portfolio allocation

8.3 Execution Assumptions

All reported results are based on historical simulations. No additional slippage model has been applied, and real-world execution may lead to materially different performance. These statistics should be interpreted as estimates under ideal execution conditions.

9. Conclusion

This article presented a methodology for solving two fundamental failures that characterize retail algorithmic trading (overfitting to historical noise and reactive signal generation) through rigorous machine learning practices.

The core innovations demonstrated in the Golden Gauss architecture include:

Predictive labeling that enables genuine anticipation of price moves
Dual-model directional specialization that respects market asymmetry
Probability-driven execution that quantifies confidence before trade entry
Intelligent trade management that minimizes losses when predictions prove suboptimal

On strictly out-of-sample 2025 data, collected after all model decisions were finalized, the system demonstrates approximately 83.67% directional accuracy at the 88% probability threshold. The controlled performance differential from training metrics indicates genuine pattern learning rather than memorization.

Key Takeaways for Practitioners

Never shuffle time-series data during validation; this creates lookahead bias and data leakage
Out-of-sample performance is the only meaningful metric for evaluating live trading potential
Probability thresholds enable accuracy/frequency tradeoffs; higher thresholds yield fewer but higher-quality signals
Dual binary models respect the asymmetry between bullish and bearish market dynamics
Trade management amplifies edge; ratchet mechanisms maximize wins and minimize losses
All systems have limitations; honest acknowledgment enables appropriate deployment and risk management

The retail algorithmic trading industry suffers from systematic misalignment between vendor incentives and user outcomes. The methodology presented here (strict temporal separation, documented performance degradation, bounded confidence claims) offers a template for honest system evaluation that prioritizes sustainable operation over marketing appeal.

Expert critique of the validation methodology and underlying assumptions is welcomed. Progress in algorithmic trading requires systems designed to survive scrutiny rather than avoid it.

10. Implementation & Availability

The architecture described in this paper, specifically the predictive labeling engine and the ONNX probability inference, has been fully implemented in the Golden Gauss AI system.

To support further research and validation, the complete system is available for testing in the MQL5 Market. The package includes the “Visualizer” mode, which renders the probability cones and “Kill Zones” directly on the chart, allowing traders to observe the model’s decision-making process in real time.

Risk Disclaimer: Trading forex and CFDs involves substantial risk of loss and is not suitable for all investors. Past performance, whether in backtesting or live trading, does not guarantee future results. The validation results presented represent historical analysis under specific market conditions that may not persist. Traders should only use capital they can afford to lose and should consider their financial situation before trading.

References

Cao, L. J. and Tay, F. E. H. (2001). Financial forecasting using support vector machines. Neural Computing & Applications, 10(2), 184-192.
Chen, T. and Guestrin, C. (2016). XGBoost: A scalable tree boosting system. Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 785-794.
López de Prado, M. (2018). Advances in Financial Machine Learning. Wiley.
Bailey, D. H. and López de Prado, M. (2014). The probability of backtest overfitting. Journal of Computational Finance, 17(4), 39-69.
Pardo, R. (2008). The Evaluation and Optimization of Trading Strategies (2nd ed.). Wiley.
Krauss, C., Do, X. A., and Huck, N. (2017). Deep neural networks, gradient-boosted trees, random forests: Statistical arbitrage on the S&P 500. European Journal of Operational Research, 259(2), 689-702.
Baur, D. G. and McDermott, T. K. (2010). Is gold a safe haven? International evidence. Journal of Banking & Finance, 34(8), 1886-1898.
ONNX Runtime Developers (2021). ONNX Runtime: High performance inference and training accelerator. Available: https://onnxruntime.ai/
