Predicting Economic Recessions

An interactive report on using leading indicators and network analysis to forecast economic downturns.

A Project by Gurpreet Singh


Project Overview

This project aims to predict U.S. economic recessions by leveraging a combination of traditional macroeconomic indicators and a novel network analysis approach. The core idea is to model the economy as a complex network where indicators are nodes and their correlations are edges. By analyzing the structural properties of this network over time, we can extract new features that may capture systemic risk and financial contagion.

The analysis compares a baseline logistic regression model, trained only on standard time-series features, with an enhanced XGBoost model that also receives network centrality metrics derived from this network. This interactive report walks through the methodology, results, and key findings of the project.

The Building Blocks: Data & Features

Core Economic Indicators

The model is built upon 12 leading economic indicators sourced from FRED and Yahoo Finance, chosen for their relevance in capturing different facets of the economy.

Yield Curve (T10Y3MM)
Unemployment (UNRATE)
S&P 500 (SP500)
Inflation (CPIAUCSL)
Consumer Sentiment
Building Permits (PERMIT)
Leading Indicator (OECD)
Volatility Index (VIX)
Initial Claims (ICSA)
Personal Consumption
Industrial Production
Core CPI

Two Flavors of Features

From the raw data, two distinct sets of features were engineered to feed the models.

1. Traditional Time-Series Features

Standard metrics that capture momentum and trends, such as Year-over-Year % Change and 12-Month Rolling Averages.
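As an illustration of these two transformations, here is a minimal, dependency-free Python sketch. The function names and the sample values are hypothetical and are not taken from the project's code; a pandas-based pipeline would typically obtain the same numbers with built-in methods.

```python
def yoy_pct_change(values, periods=12):
    """Percent change versus the value `periods` steps earlier.

    Returns None where there is not enough history (or the earlier value is 0).
    """
    out = []
    for i, v in enumerate(values):
        if i < periods or values[i - periods] == 0:
            out.append(None)
        else:
            prev = values[i - periods]
            out.append((v - prev) / prev * 100.0)
    return out


def rolling_mean(values, window=12):
    """Trailing average over the last `window` observations (None until the window is full)."""
    out = []
    for i in range(len(values)):
        if i + 1 < window:
            out.append(None)
        else:
            out.append(sum(values[i + 1 - window : i + 1]) / window)
    return out


# Hypothetical monthly index levels: 100, 101, ..., 124 (25 months).
levels = [100.0 + i for i in range(25)]

yoy = yoy_pct_change(levels)   # first 12 entries are None
avg = rolling_mean(levels)     # first 11 entries are None

print(yoy[12])  # month 13 vs. month 1: roughly a 12 percent increase
print(avg[11])  # mean of the first 12 months
```

Both helpers leave the warm-up period as None rather than guessing, so any rows without a full year of history would have to be dropped or imputed before modeling.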

2. Novel Network Features

Metrics derived from the correlation network, such as Degree, Betweenness, and Eigenvector Centrality, to quantify an indicator's systemic importance.
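To make the three centrality measures concrete, the sketch below computes them on a tiny, hypothetical unweighted graph using only the Python standard library. The graph, node names, and function names are assumptions for illustration; the project itself may well rely on a graph library and on weighted edges instead.

```python
from collections import deque


def degree_centrality(adj):
    """Share of the other nodes that each node is directly connected to."""
    n = len(adj)
    return {v: len(nbrs) / (n - 1) for v, nbrs in adj.items()}


def betweenness_centrality(adj):
    """Brandes' algorithm for an unweighted, undirected graph (normalized)."""
    bc = dict.fromkeys(adj, 0.0)
    for s in adj:
        stack, preds = [], {v: [] for v in adj}
        sigma = dict.fromkeys(adj, 0.0)   # number of shortest paths from s
        dist = dict.fromkeys(adj, -1)
        sigma[s], dist[s] = 1.0, 0
        queue = deque([s])
        while queue:                      # breadth-first search from s
            v = queue.popleft()
            stack.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = dict.fromkeys(adj, 0.0)
        while stack:                      # accumulate dependencies backwards
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != s:
                bc[w] += delta[w]
    n = len(adj)
    # Each undirected pair was counted twice, so divide by (n-1)(n-2).
    scale = 1.0 / ((n - 1) * (n - 2)) if n > 2 else 1.0
    return {v: b * scale for v, b in bc.items()}


def eigenvector_centrality(adj, iters=200, tol=1e-10):
    """Power iteration on (A + I), which has the same eigenvectors as A."""
    x = dict.fromkeys(adj, 1.0)
    for _ in range(iters):
        new = {v: x[v] + sum(x[w] for w in adj[v]) for v in adj}
        norm = sum(val * val for val in new.values()) ** 0.5
        new = {v: val / norm for v, val in new.items()}
        if sum(abs(new[v] - x[v]) for v in adj) < tol:
            return new
        x = new
    return x


# Hypothetical network: the yield curve is linked to every other indicator.
adj = {
    "yield_curve": ["sp500", "vix", "unrate"],
    "sp500": ["yield_curve", "vix"],
    "vix": ["yield_curve", "sp500"],
    "unrate": ["yield_curve"],
}

deg = degree_centrality(adj)
btw = betweenness_centrality(adj)
eig = eigenvector_centrality(adj)
print(deg["yield_curve"], btw["yield_curve"], eig["yield_curve"])
```

In this toy graph the yield curve scores highest on all three measures: it touches every node (degree), sits on the only short routes to the unemployment node (betweenness), and is connected to the best-connected neighbors (eigenvector).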

Performance Showdown

Comparing the Baseline Logistic Regression model against the Enhanced XGBoost model (with network features).
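The comparison is scored with ROC AUC, the metric quoted in the conclusion. As a reminder of what that number measures, here is a small sketch that computes it from scratch: the probability that a randomly chosen recession month receives a higher score than a randomly chosen non-recession month. The labels and scores below are made up for illustration and are not the project's results.

```python
def roc_auc(y_true, y_score):
    """ROC AUC as the share of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("ROC AUC needs at least one positive and one negative label")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical labels (1 = recession month) and two sets of model scores.
y_true = [0, 0, 0, 1, 0, 1]
scores_model_a = [0.10, 0.40, 0.35, 0.80, 0.20, 0.30]
scores_model_b = [0.10, 0.20, 0.30, 0.90, 0.15, 0.60]

auc_a = roc_auc(y_true, scores_model_a)  # 0.75: one recession month is outranked twice
auc_b = roc_auc(y_true, scores_model_b)  # 1.0: every recession month outranks every other month
print(auc_a, auc_b)
```

Because recessions are rare, a ranking metric like this is usually more informative than raw accuracy, which a model could inflate by never predicting a recession at all.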

Validation Set Performance

Test Set Performance

Model Predictions vs. Actual Recessions

Why Did It Work? A Look Inside the Model

Feature Importance: Full Model

A SHAP analysis of the best model shows which features have the biggest impact on its predictions. In the summary plot, each point's color encodes the feature's value (red for high, blue for low), and its horizontal position shows whether that value pushes the prediction higher or lower.

SHAP Summary Plot for the full model

Finding: The model's predictions are dominated by traditional indicators like the OECD Leading Indicator, Yield Curve, and Inflation. Network features are not among the top predictors.
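SHAP attributions are built on Shapley values from cooperative game theory. The sketch below computes exact Shapley values for a single prediction of a deliberately simple, hypothetical scoring function, to show how a prediction is split among features. It is not the estimator the SHAP library uses for tree models, and the feature names, coefficients, and baseline are invented for illustration.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one row `x`.

    Features outside a coalition are replaced by their `baseline` value.
    The cost grows exponentially with the number of features.
    """
    names = list(x)
    n = len(names)

    def value(coalition):
        row = {f: (x[f] if f in coalition else baseline[f]) for f in names}
        return predict(row)

    phi = {}
    for f in names:
        others = [g for g in names if g != f]
        total = 0.0
        for k in range(n):
            # Weight of a coalition of size k that does not contain f.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = set(subset)
                total += weight * (value(s | {f}) - value(s))
        phi[f] = total
    return phi


def toy_score(row):
    """Hypothetical linear recession score (not the project's model)."""
    return (-2.0 * row["yield_spread"]
            + 0.5 * row["unemployment_yoy"]
            + 0.1 * row["vix_degree"])


x = {"yield_spread": -0.5, "unemployment_yoy": 2.0, "vix_degree": 0.6}
baseline = {"yield_spread": 1.5, "unemployment_yoy": 0.0, "vix_degree": 0.5}

phi = shapley_values(toy_score, x, baseline)
print(phi)
# The attributions add up to the gap between this prediction and the baseline one.
print(sum(phi.values()), toy_score(x) - toy_score(baseline))
```

In this toy case the network-style feature receives a far smaller attribution than the yield spread, which is the same kind of pattern the summary plot reports for the real model.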

The Experiment: Removing Top Features

What happens if we remove the most powerful traditional features and retrain the model? Will the network features become more important?

SHAP Summary Plot for the reduced model

Finding: Even when deprived of its strongest predictors, the model still relies more on other traditional metrics (like S&P 500 volatility) than the network features.

Visualizing the Economic Network

The structure of the economy's correlation network changes over time. Below are snapshots from different critical periods. Green edges represent positive correlations, and red edges represent negative correlations.

Economic Network Graph for Stable Period 2015
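A rough sketch of how such snapshots could be produced: compute pairwise Pearson correlations inside a time window, keep the pairs whose correlation is strong enough, and color each edge by sign. The window length, the 0.5 threshold, the function names, and the data are all assumptions made for this example; the report does not state the values the project used.

```python
def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    vx = sum((a - mx) ** 2 for a in xs)
    vy = sum((b - my) ** 2 for b in ys)
    if vx == 0 or vy == 0:
        return 0.0  # a constant series is treated as uncorrelated
    return cov / (vx * vy) ** 0.5


def correlation_edges(series, start, window, threshold=0.5):
    """Edges (a, b, r, color) for pairs with |r| >= threshold inside one window."""
    names = sorted(series)
    edges = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            r = pearson(series[a][start:start + window],
                        series[b][start:start + window])
            if abs(r) >= threshold:
                edges.append((a, b, r, "green" if r > 0 else "red"))
    return edges


# Hypothetical monthly series: the sp500/vix relationship flips halfway through.
series = {
    "sp500": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12],
    "vix": [6, 5, 4, 3, 2, 1, 1, 2, 3, 4, 5, 6],
    "claims": [2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1],
}

early = correlation_edges(series, start=0, window=6)  # sp500 and vix move in opposite directions
late = correlation_edges(series, start=6, window=6)   # sp500 and vix move together
print(early)
print(late)
```

Sliding the window forward one month at a time yields a sequence of graphs, and recomputing centrality on each graph gives a centrality time series per indicator.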

Evolution of Network Centrality

These plots show how the systemic importance of key indicators (their centrality) changes over time, with actual recession periods shaded in gray for context.

Time series of Betweenness Centrality for the Yield Curve

Yield Curve Betweenness

Time series of Degree Centrality for S&P 500 Volatility

S&P 500 Volatility Degree

Time series of Eigenvector Centrality for Unemployment Rate

Unemployment Eigenvector

Final Conclusion

This project showed that augmenting the dataset with network centrality features can improve the performance of a non-linear model such as XGBoost for recession prediction, raising the validation ROC AUC to 0.95.

However, a closer interpretation using SHAP reveals a more nuanced story. The model's predictive power is still dominated by traditional macroeconomic indicators. The network features, while contributing a marginal performance lift, appear to capture information that is largely redundant with, or less powerful than, the information contained in standard time-series transformations of the base indicators.

The final conclusion is that while network analysis provides a measurable improvement, the primary drivers for this specific recession prediction task remain the powerful, well-established macroeconomic variables.