Deep Learning · Price Forecasting

How neural networks read price

Price forecasting with deep learning isn't a single model — it's a decision about architecture, data pipeline, and what "error" actually costs you.

These visuals map out the key ideas: from raw time-series to trained recurrent networks, from feature engineering to evaluation metrics that matter in practice.

Figure: deep learning pipeline diagram showing neural network layers processing financial time-series data

Architecture choices and what they cost

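LSTM and GRU networks handle sequential dependencies well, but they're slow to train on long windows because each time step has to be processed in order. Transformer-based models like the Temporal Fusion Transformer process a whole window in parallel and handle mixed input types (static covariates, known future inputs, observed history) more flexibly.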

Choosing between them isn't about which is "better" — it's about your sequence length, available compute, and whether interpretability matters in your use case.

"The architecture should fit the data, not the trend."

60+ hours of video walkthroughs, code labs and model comparisons

Enough depth to build and critique a forecasting pipeline from scratch.

8 distinct model families covered across the curriculum

ARIMA baseline through attention-based architectures — with honest comparisons.

Data pipeline

Normalization, windowing, and train-test split without leakage
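A minimal sketch of what "without leakage" can mean in practice, assuming a plain list of prices and z-score normalization. The function names and the 80/20 split are illustrative choices, not the course's own code; a real pipeline would more likely use NumPy or pandas.

```python
def chronological_split(series, train_frac=0.8):
    """Split by time order: the first part trains, the rest tests. No shuffling."""
    cut = int(len(series) * train_frac)
    return series[:cut], series[cut:]


def fit_standardizer(train):
    """Compute mean and standard deviation from the training segment only."""
    mean = sum(train) / len(train)
    var = sum((x - mean) ** 2 for x in train) / len(train)
    std = var ** 0.5 or 1.0  # fall back to 1.0 for a constant series
    return mean, std


def make_windows(values, window):
    """Pair each run of `window` past values with the value that follows it."""
    return [(values[i:i + window], values[i + window])
            for i in range(len(values) - window)]


# Made-up prices, for illustration only.
prices = [100.0, 101.0, 103.0, 102.0, 105.0, 107.0, 106.0, 108.0, 110.0, 109.0]
train, test = chronological_split(prices, train_frac=0.8)

mean, std = fit_standardizer(train)              # statistics come from train only
train_scaled = [(x - mean) / std for x in train]
test_scaled = [(x - mean) / std for x in test]   # test reuses the train statistics

train_windows = make_windows(train_scaled, window=3)
```

The point of the ordering is that nothing computed from the test period (not even a mean) is visible when the training inputs are built.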

Feature engineering

Lag features, rolling statistics, and external regressors
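As a rough illustration, lag and rolling features can be built so that each row only sees values from before the step it predicts. The helper name, the lag choices, and the rolling window are assumptions made for this sketch; external regressors are left out.

```python
def lag_and_rolling_features(series, lags=(1, 2), roll=3):
    """Build one feature row per time step t using only values before t."""
    start = max(max(lags), roll)
    rows = []
    for t in range(start, len(series)):
        past = series[t - roll:t]                 # excludes series[t] itself
        row = {f"lag_{k}": series[t - k] for k in lags}
        row[f"roll_mean_{roll}"] = sum(past) / roll
        row["target"] = series[t]
        rows.append(row)
    return rows


# Made-up prices, for illustration only.
prices = [100.0, 101.0, 103.0, 102.0, 105.0]
rows = lag_and_rolling_features(prices)
```

A rolling statistic that includes the current value is one of the more common ways a target leaks into its own features, which is why the slice stops at `t`.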

Model evaluation

MAE, RMSE, directional accuracy and their trade-offs
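The three metrics can be sketched in a few lines. The directional-accuracy definition here is one common convention (predicted move versus realized move, both measured from the last known price, with a zero move counted as "not up"); other definitions exist, so treat it as an assumption.

```python
import math


def mae(actual, predicted):
    """Mean absolute error: average size of the miss, in price units."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean squared error: penalizes large misses more than MAE does."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def directional_accuracy(actual, predicted):
    """Share of steps where the predicted move has the same sign as the real move."""
    hits = 0
    for t in range(1, len(actual)):
        real_up = actual[t] - actual[t - 1] > 0
        pred_up = predicted[t] - actual[t - 1] > 0
        if real_up == pred_up:
            hits += 1
    return hits / (len(actual) - 1)


# Made-up values, for illustration only.
actual = [100.0, 102.0, 101.0, 104.0]
predicted = [100.0, 101.0, 103.0, 103.0]

print(mae(actual, predicted))                   # 1.0
print(rmse(actual, predicted))                  # sqrt(1.5), about 1.2247
print(directional_accuracy(actual, predicted))  # 2 of 3 moves called correctly
```

The trade-off shows up even in this toy case: the forecast is within about one unit on average, yet it gets one of three directions wrong, and which of those matters depends on how the forecast is used.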

Deployment

Serving forecasts with FastAPI, monitoring drift in production
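One simple way to watch for drift is to compare recent forecast error against the error measured at validation time. The sketch below is a hypothetical monitor, not a FastAPI feature or the course's implementation; the class name, window size, and tolerance factor are all assumptions, and the serving layer itself is not shown.

```python
from collections import deque


class ErrorDriftMonitor:
    """Flags drift when recent MAE exceeds a baseline MAE by a chosen factor."""

    def __init__(self, baseline_mae, window=3, tolerance=1.5):
        self.baseline_mae = baseline_mae
        self.tolerance = tolerance
        self.errors = deque(maxlen=window)   # keeps only the latest errors

    def update(self, actual, predicted):
        """Record one realized error; return True if drift is suspected."""
        self.errors.append(abs(actual - predicted))
        window_full = len(self.errors) == self.errors.maxlen
        recent_mae = sum(self.errors) / len(self.errors)
        return window_full and recent_mae > self.tolerance * self.baseline_mae


# Made-up observations, for illustration only.
monitor = ErrorDriftMonitor(baseline_mae=1.0, window=3, tolerance=1.5)
observations = [(100.0, 101.0), (100.0, 99.0), (100.0, 101.0), (100.0, 104.0)]
flags = [monitor.update(actual, predicted) for actual, predicted in observations]
```

In this toy run only the last observation raises the flag, once the rolling MAE climbs to 2.0 against a baseline of 1.0.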

What makes this hard

Price series are non-stationary — the statistical properties shift over time
Signal-to-noise ratio is low, especially at daily and hourly granularity
Overfitting is easy to hide if you don't walk-forward validate properly
Published papers have already overfit to most benchmark datasets
Production models degrade — monitoring matters as much as training
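Walk-forward validation, mentioned in the list above, can be sketched as a generator of index ranges. This version uses an expanding training window; the function name and parameters are illustrative, and a sliding window is an equally valid variant.

```python
def walk_forward_splits(n_samples, initial_train, horizon):
    """Yield (train_indices, test_indices) with an expanding training window."""
    start = initial_train
    while start + horizon <= n_samples:
        # Train on everything before `start`, test on the next `horizon` steps.
        yield range(0, start), range(start, start + horizon)
        start += horizon


splits = list(walk_forward_splits(n_samples=10, initial_train=4, horizon=2))
for train_idx, test_idx in splits:
    print(list(train_idx), "->", list(test_idx))
```

Every test fold lies strictly after its training data, so each score reflects what the model could have known at that point in time. A single random split cannot offer that guarantee.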

Learning path through the curriculum

The sequence below reflects how we structured the material at Konpaterud — starting from where most learners actually get stuck, not from first principles.

Time-series fundamentals

Autocorrelation, seasonal decomposition, and stationarity tests, plus the simple baselines every model should beat.
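Two of these ideas fit in a short sketch: the sample autocorrelation at a given lag, and a persistence forecast (tomorrow equals today) as one possible baseline. The function names are illustrative, and persistence is only one of several reasonable baselines.

```python
def autocorrelation(series, lag):
    """Sample autocorrelation of `series` at the given lag."""
    n = len(series)
    mean = sum(series) / n
    denom = sum((x - mean) ** 2 for x in series)
    num = sum((series[t] - mean) * (series[t - lag] - mean) for t in range(lag, n))
    return num / denom


def persistence_mae(series):
    """MAE of the naive forecast that predicts each value as the previous one."""
    errors = [abs(series[t] - series[t - 1]) for t in range(1, len(series))]
    return sum(errors) / len(errors)


# Made-up series, for illustration only.
alternating = [1.0, -1.0, 1.0, -1.0]
prices = [100.0, 101.0, 103.0, 102.0]

print(autocorrelation(alternating, 1))  # -0.75: strong negative lag-1 dependence
print(persistence_mae(prices))          # 4/3: the error a model has to improve on
```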

Recurrent architectures

LSTM and GRU from scratch in PyTorch, including gradient flow visualizations and gating mechanics.
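To make the gating mechanics concrete, here is a single GRU update written for scalar inputs in plain Python rather than PyTorch. The parameter names in `p` are hypothetical, and the blend convention shown (update gate `z` weighting the candidate state) is one of two in use; some libraries swap the roles of `z` and `1 - z`.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def gru_step(x, h, p):
    """One scalar GRU update; `p` maps hypothetical weight names to floats."""
    z = sigmoid(p["wz"] * x + p["uz"] * h + p["bz"])   # update gate
    r = sigmoid(p["wr"] * x + p["ur"] * h + p["br"])   # reset gate
    # Candidate state: the reset gate scales how much of the old state is used.
    h_cand = math.tanh(p["wh"] * x + p["uh"] * (r * h) + p["bh"])
    # Blend the old state and the candidate according to the update gate.
    return (1.0 - z) * h + z * h_cand


zero_params = {name: 0.0 for name in
               ("wz", "uz", "bz", "wr", "ur", "br", "wh", "uh", "bh")}

# With all-zero weights both gates sit at 0.5 and the candidate is 0,
# so the new state is exactly half the old one.
h_new = gru_step(x=1.0, h=0.8, p=zero_params)

# A strongly negative update-gate bias keeps the old state almost unchanged.
keep_params = dict(zero_params, bz=-50.0)
h_kept = gru_step(x=1.0, h=0.8, p=keep_params)
```

The second case is the mechanism that lets a gated network carry information across many steps: when the update gate stays near zero, the state passes through nearly untouched.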

Attention and transformers

Self-attention applied to sequences, positional encoding, and when transformers outperform RNNs.
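Self-attention over a short sequence can be sketched without any framework. This version sets queries, keys, and values all equal to the input (no learned projections), and it omits positional encoding and the causal mask a forecasting model would normally need; those simplifications are assumptions made to keep the example small.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(x):
    """Scaled dot-product self-attention with Q = K = V = x."""
    d = len(x[0])
    outputs, weights = [], []
    for query in x:
        # Similarity of this step to every step, scaled by sqrt(d).
        scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d) for key in x]
        w = softmax(scores)
        weights.append(w)
        # Each output is a weighted average of all input vectors.
        outputs.append([sum(w[j] * x[j][c] for j in range(len(x))) for c in range(d)])
    return outputs, weights


# Made-up three-step sequence with two features per step.
sequence = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = self_attention(sequence)
```

Each row of `weights` sums to 1 and shows how much one time step draws on every other step, which is also what makes attention maps usable as a rough interpretability tool.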

Uncertainty quantification

Probabilistic forecasting, conformal prediction intervals, and calibration — because point estimates alone mislead.
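A prediction interval can be wrapped around any point forecast with split conformal prediction, sketched below under stated assumptions. The function name and data are illustrative. The coverage guarantee relies on exchangeable residuals, which price series generally violate, so read the resulting interval as approximate.

```python
import math


def conformal_half_width(cal_actual, cal_predicted, alpha=0.25):
    """Half-width of a split conformal interval from held-out residuals."""
    scores = sorted(abs(a - p) for a, p in zip(cal_actual, cal_predicted))
    n = len(scores)
    k = math.ceil((n + 1) * (1 - alpha))   # rank of the conformal quantile
    if k > n:
        return float("inf")                # too few calibration points for this alpha
    return scores[k - 1]


# Made-up calibration data: absolute residuals are 1, 2, ..., 7.
cal_actual = [101.0, 98.0, 103.0, 96.0, 105.0, 94.0, 107.0]
cal_predicted = [100.0] * 7

half_width = conformal_half_width(cal_actual, cal_predicted, alpha=0.25)

new_forecast = 110.0
interval = (new_forecast - half_width, new_forecast + half_width)
print(interval)  # (104.0, 116.0)
```

The calibration set has to be separate from the training data; otherwise the residuals understate the real error and the interval comes out too narrow.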

"One student, Petra Valkonen, described the walk-forward validation lab as the first time a backtest result felt honest. That's the kind of shift we're aiming for."