Models Reference
How the intelligence is built. Twelve production ML models, plus an algorithmic layer (CTMC win/podium probability with a full position distribution, ECP/ECPA expected championship points, and EKF tyre health), produce the analytics.* and weather.* feeds. This page documents how each model works — its type, inputs, and calibration. For the payload each one emits, see the Analytics feeds →
analytics.strategy); the models behind it run server-side in real time. The model count is the documented source of truth; the live GCS manifest is the runtime truth.
Production ML models
Deep dives for the most-integrated models follow. The rest share the same training and calibration pipeline.
Pit stop — BiLSTM
A bidirectional LSTM reads a 20-lap rolling feature window per driver — tyre age, fuel-corrected pace delta, stint length, gap to the cars ahead and behind, and track status — and emits pitStopProbability each lap. pitRecommended trips when the probability clears an era-adjusted threshold (0.53 pre-2026, 0.60 for 2026). It drives analytics.strategy and analytics.tire-strategy.
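The thresholding step can be sketched as follows. This is a minimal illustration, not the production logic: the function names, the `season` parameter, the strict `>` reading of "clears", and the choice to apply the 2026 threshold to any later season are all assumptions. Only the two threshold values come from the description above.

```python
# Threshold values quoted in the text; all names below are hypothetical.
PRE_2026_THRESHOLD = 0.53
THRESHOLD_2026 = 0.60


def pit_threshold(season: int) -> float:
    """Return the era-adjusted threshold for a season.

    Assumption: seasons from 2026 onward use the 2026 value.
    """
    return THRESHOLD_2026 if season >= 2026 else PRE_2026_THRESHOLD


def pit_recommended(pit_stop_probability: float, season: int) -> bool:
    """Return True when the probability clears the season's threshold.

    Assumption: "clears" is read as strictly greater than.
    """
    if not 0.0 <= pit_stop_probability <= 1.0:
        raise ValueError("pit_stop_probability must lie in [0, 1]")
    return pit_stop_probability > pit_threshold(season)
```

Under these assumptions, the same probability of 0.55 would trigger a recommendation in a pre-2026 session but not in a 2026 one.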
Safety car — LightGBM + Hawkes
A gradient-boosted classifier scores incident risk from track state, lap, field spread, and recent incidents; a Hawkes self-exciting process layers on the post-incident clustering that a static classifier misses. A 3-class variant separates a full safety car from a VSC (fullScProbability / vscProbability). The output is session-level: the same value applies to all drivers.
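To illustrate what "self-exciting" means, here is a minimal sketch of a Hawkes intensity with an exponential kernel. The kernel shape, the parameter values (`mu`, `alpha`, `beta`), the use of laps as the time unit, and the function name are assumptions for illustration; the page does not specify them, nor how the intensity is combined with the classifier score.

```python
import math
from typing import Sequence


def hawkes_intensity(
    t: float,
    event_times: Sequence[float],
    mu: float = 0.02,    # baseline incident rate per lap (assumed value)
    alpha: float = 0.3,  # jump in intensity after each incident (assumed value)
    beta: float = 0.5,   # decay rate of that jump per lap (assumed value)
) -> float:
    """Intensity lambda(t) = mu + sum over past events of alpha * exp(-beta * (t - t_i))."""
    excitation = sum(
        alpha * math.exp(-beta * (t - t_i))
        for t_i in event_times
        if t_i < t  # only incidents strictly before t contribute
    )
    return mu + excitation
```

With no prior incidents the intensity equals the baseline; a recent incident raises it more than an older one, which is the clustering effect a static classifier would not capture.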
Overtake — XGBoost
Predicts overtakeProbability — P(position gain via on-track overtake within three laps) — from the gap to the car ahead, pace delta, tyre offset, and a per-circuit overtake index. For 2026 the indices were recalibrated for zone-independent Overtake Mode rather than fixed DRS zones.
Pace — LTOE
Lap Time Over Expected. A LightGBM model predicts the expected lap time from fuel-corrected pace and stint context; the residual (actual − expected) is the LTOE signal, where a negative value means the lap was faster than expected. It feeds paceMode classification (PUSH / HOLD / MANAGE / WARM_UP / DELTA). During the 2026 calibration window the signal is returned with reduced confidence (ltoeConfidenceFlag: "2026_early").
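The sign convention is easy to get backwards, so a tiny sketch may help. The function name and the use of seconds are assumptions; in practice the expected lap time would come from the LightGBM model, whereas here it is simply passed in.

```python
def ltoe(actual_lap_time_s: float, expected_lap_time_s: float) -> float:
    """Return the residual actual - expected, in seconds.

    Negative values mean the lap was faster than expected.
    """
    return actual_lap_time_s - expected_lap_time_s


# Hypothetical lap: expected 91.5 s, driven in 91.2 s.
signal = ltoe(91.2, 91.5)
faster_than_expected = signal < 0
```

The mapping from this residual to a paceMode label is not specified on this page, so it is not sketched here.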
Algorithmic intelligence layer
Three non-ML components ship alongside the models and produce the bulk of the race-odds and tyre signals.
Win probability — CTMC
A continuous-time Markov chain over race positions produces a full 20-element positionDistribution per driver, from which podium/points odds, expected points, expected laps led, and forward position-change projections are derived analytically. Powers analytics.race-odds and analytics.constructor (via joint-distribution convolution).
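As a rough sketch of how odds can be read off a position distribution, the snippet below derives win, podium, and points-finish probabilities from a 20-element list. It assumes index 0 corresponds to P1, that the list sums to 1, and that a "points" finish means the top ten; the function names are hypothetical and the CTMC that produces the distribution is not shown.

```python
import math
from typing import Sequence

GRID_SIZE = 20  # length of positionDistribution per the description above


def validate(dist: Sequence[float]) -> None:
    """Check the distribution has 20 non-negative entries summing to 1."""
    if len(dist) != GRID_SIZE:
        raise ValueError("expected a 20-element distribution")
    if any(p < 0 for p in dist):
        raise ValueError("probabilities must be non-negative")
    if not math.isclose(sum(dist), 1.0, abs_tol=1e-6):
        raise ValueError("probabilities must sum to 1")


def win_probability(dist: Sequence[float]) -> float:
    validate(dist)
    return dist[0]  # assumption: index 0 is P1


def podium_probability(dist: Sequence[float]) -> float:
    validate(dist)
    return sum(dist[:3])  # P1 to P3


def points_probability(dist: Sequence[float]) -> float:
    validate(dist)
    return sum(dist[:10])  # assumption: top ten positions score points
```

Because each quantity is a plain sum over the distribution, no sampling is needed once the distribution is available.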
Tyre health — EKF
An extended Kalman filter tracks each driver's lap-delta series to estimate tireHealth (0–1), degradation rate (s/lap), and stint lap — robust to safety-car and traffic noise that fools a naive regression.
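The filtering idea can be illustrated with a plain linear Kalman filter, which is a simpler stand-in for the extended filter described above, not the production model. The two-element state (lap-time delta and degradation rate), the noise variances, and the function name are all assumptions; the mapping to tireHealth and the handling of safety-car or traffic laps are not shown.

```python
from typing import Iterable, Tuple


def estimate_degradation(
    lap_deltas: Iterable[float],
    meas_var: float = 0.04,  # measurement noise variance in s^2 (assumed)
    q_delta: float = 1e-4,   # process noise on the delta (assumed)
    q_rate: float = 1e-6,    # process noise on the rate (assumed)
) -> Tuple[float, float]:
    """Return (delta in s, rate in s/lap) after filtering the series.

    Model: delta grows by rate each lap; only delta is observed.
    """
    d, r = 0.0, 0.0                          # initial state estimate
    p00, p01, p10, p11 = 1.0, 0.0, 0.0, 1.0  # initial covariance
    for z in lap_deltas:
        # Predict: x = F x and P = F P F^T + Q, with F = [[1, 1], [0, 1]].
        d = d + r
        p00, p01, p10, p11 = (
            p00 + p01 + p10 + p11 + q_delta,
            p01 + p11,
            p10 + p11,
            p11 + q_rate,
        )
        # Update with measurement z of delta (H = [1, 0]).
        s = p00 + meas_var           # innovation variance
        k0, k1 = p00 / s, p10 / s    # Kalman gain
        y = z - d                    # innovation
        d += k0 * y
        r += k1 * y
        p00, p01, p10, p11 = (
            (1 - k0) * p00,
            (1 - k0) * p01,
            p10 - k1 * p00,
            p11 - k1 * p01,
        )
    return d, r
```

Fed a clean series that loses 0.05 s per lap, the rate estimate settles close to 0.05 s/lap; a single outlier lap shifts it only by the gain-weighted innovation rather than dominating the fit.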
Expected championship points — ECP/ECPA
Season-level points expectation (and an adjusted variant) projecting championship outcomes from the current standings and per-race position distributions. Surfaces on analytics.championship-probability.
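A minimal sketch of the unadjusted projection, under stated assumptions: it uses the standard Grand Prix points table for the top ten (no sprint or bonus points), treats index 0 of each distribution as P1, and simply adds per-race expected points to the current total. The function names are hypothetical, and the adjusted variant (ECPA) is not defined on this page, so it is not sketched.

```python
from typing import Sequence

# Standard Grand Prix points for P1 to P10; sprint and bonus points are ignored here.
F1_POINTS = (25, 18, 15, 12, 10, 8, 6, 4, 2, 1)


def expected_points(dist: Sequence[float]) -> float:
    """Expected points for one race from a position distribution.

    zip stops after ten entries, so P11 and below contribute zero.
    """
    return sum(p * pts for p, pts in zip(dist, F1_POINTS))


def projected_season_points(
    current_points: float,
    remaining_race_dists: Sequence[Sequence[float]],
) -> float:
    """Current standings total plus expected points over the remaining races."""
    return current_points + sum(expected_points(d) for d in remaining_race_dists)
```

For example, a driver on 100 points with one certain win and one certain second place remaining would project to 143 points under these assumptions.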
2026 regulation adjustments
The 2026 F1 regulations — tripled ERS power, turbo lag, active aero replacing DRS — required every model to be re-assessed. What changed: