Two Markets, One Platform

Who Buys Synthetic Macro Data?

The synthetic data market is projected to reach $2.7B by 2030 (39% CAGR). WorldSim serves two distinct segments.

SEGMENT 1: FINANCIAL STRESS TESTING

Banks, Insurers, and Regulators (EU-focused)

Banks under CCAR, Basel, and EBA frameworks need macroeconomic stress scenarios to test portfolio resilience. Traditional approaches use fixed scenario sets from regulators or generate synthetic data with GANs/VAEs that lack structural coherence.

Structurally coherent scenarios where GDP, unemployment, inflation, and debt interact realistically

Full audit trail per scenario (run group ID, seed, rules fired) for regulatory compliance

Parameter sweeps: 1,000 scenario configs x 5,000 paths = 5 million trajectories on demand
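As a rough illustration of how a sweep of this size could be enumerated, the sketch below builds 1,000 scenario configurations as a Cartesian product of parameter axes. The field names, country list, and tilt values are assumptions for illustration only, not WorldSim's actual configuration schema.

```python
from itertools import product

# Hypothetical sweep axes; names and values are illustrative, not WorldSim's schema.
countries = ["DE", "FR", "IT", "ES", "NL", "PL", "SE", "RO"]   # 8 countries
sigma_grid = [-1.0, 0.0, 1.0, 2.0, 3.0]                        # 5 tilt levels per KPI
PATHS_PER_CONFIG = 5_000

# One config per combination of country and three KPI tilts: 8 x 5 x 5 x 5 = 1,000.
configs = [
    {
        "country": country,
        "path": "Shock",
        "gdp_sigma": gdp,
        "inflation_sigma": inflation,
        "unemployment_sigma": unemployment,
        "n_paths": PATHS_PER_CONFIG,
    }
    for country, gdp, inflation, unemployment in product(
        countries, sigma_grid, sigma_grid, sigma_grid
    )
]

total_trajectories = sum(cfg["n_paths"] for cfg in configs)
print(f"{len(configs):,} configs -> {total_trajectories:,} trajectories")
```

Each dictionary would correspond to one run; the total matches the 1,000 x 5,000 = 5 million figure above.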

SEGMENT 2: AI MODEL TRAINING & ROBUSTNESS

ML Teams and AI Developers (Global)

ML teams building credit scoring, demand forecasting, insurance pricing, or economic prediction models need diverse macro environments for training and robustness testing. Current synthetic data generators (GANs, VAEs) produce statistically plausible data but without causal structure.

195 countries globally, not just EU: train on the full spectrum of macro conditions

Raw simulation paths (CSV/Parquet) with 26 KPIs per country per year per path

Test model robustness: does your model degrade under recession, inflation spike, or demographic shift?

The Data Product

What WorldSim Delivers to AI Teams

Not just numbers. Structurally coherent, causally consistent, reproducible synthetic macro environments.

Raw Monte Carlo Paths

Up to 10,000 individual simulation paths per country per scenario, each containing 26 KPI values per year from 2025 to 2050. Delivered as CSV or Parquet. Each path is a complete, internally consistent synthetic country trajectory.

Quantile Distributions

P10/P50/P90 quantile outputs per KPI per year, plus full histograms at any target year. Use the distributions directly for model calibration, or sample from the raw paths for training data diversity.
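For teams that prefer to derive the quantiles themselves from raw paths, a minimal sketch using only the Python standard library is shown here. The input is stand-in random data, not WorldSim output, and the function name `quantile_bands` is made up for this example.

```python
import random
import statistics

def quantile_bands(values):
    """Return (P10, P50, P90) for one KPI in one year across all paths."""
    # n=10 yields nine cut points: index 0 is P10, index 4 is P50, index 8 is P90.
    deciles = statistics.quantiles(values, n=10, method="inclusive")
    return deciles[0], deciles[4], deciles[8]

# Stand-in data: 5,000 simulated inflation values for one country-year.
rng = random.Random(0)
inflation_values = [rng.gauss(2.5, 1.0) for _ in range(5_000)]

p10, p50, p90 = quantile_bands(inflation_values)
print(f"P10={p10:.2f}  P50={p50:.2f}  P90={p90:.2f}")
```

Repeating this per KPI and per year gives the envelope that a fan chart draws.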

Structural Coherence (Not Just Noise)

This is what separates WorldSim from GAN-generated synthetic data. Every path respects 100+ structural coupling rules: when GDP falls, unemployment rises, migration outflows increase, fiscal revenue drops, and debt accumulates. The causal structure is preserved across every trajectory.
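To make the idea of a coupling rule concrete, here is a toy sketch of two rules that fire when GDP contracts. The rule names, coefficients, and state fields are invented for illustration; WorldSim's actual rules and their parameters are not described in this document.

```python
def apply_coupling_rules(state, year):
    """Apply toy coupling rules to one year's state and return the rules that fired."""
    fired = []
    if state["gdp_growth"] < 0:
        contraction = -state["gdp_growth"]  # positive size of the GDP decline
        # Hypothetical Okun-style rule: a contraction pushes unemployment up.
        state["unemployment"] += 0.5 * contraction
        fired.append((year, "recession_raises_unemployment"))
        # Hypothetical fiscal rule: lost revenue adds to the debt ratio.
        state["debt_to_gdp"] += 1.0 * contraction
        fired.append((year, "recession_raises_debt"))
    return fired

state = {"gdp_growth": -2.0, "unemployment": 6.0, "debt_to_gdp": 80.0}
log = apply_coupling_rules(state, 2027)
print(state)
print(log)
```

The returned list doubles as a small audit trail: it records which rules fired and in which year.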

API Access (Coming Soon)

Programmatic access to submit scenario configurations, retrieve raw paths, and run batch sweeps. Submit 1,000 scenario configs, each generating 5,000+ paths across any set of countries. Enterprise-grade throughput for production ML pipelines.

Data Specification

Countries: 195
KPIs per country: 26
Simulation paths per run: Up to 10,000
Years per path: 2025 to 2050
Scenario paths: 3 (Better / Average / Shock)
Coupling rules: 100+ structural rules
Formats: CSV, Parquet, API
Reproducibility: Deterministic per seed

Example: Maximum Data Volume

A single Enterprise batch run can generate:

195 countries x 3 paths x 10,000 sims x 26 KPIs x 25 years = 3,802,500,000 (about 3.8 billion) data points

All structurally coherent, causally consistent, and fully reproducible.
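The arithmetic behind that figure can be checked in a few lines. The storage estimate at the end is an added assumption (8 bytes per value, uncompressed) and not a published WorldSim number.

```python
from math import prod

dimensions = {
    "countries": 195,
    "scenario_paths": 3,
    "simulations": 10_000,
    "kpis": 26,
    "years": 25,
}

total_points = prod(dimensions.values())
print(f"{total_points:,} data points")  # 3,802,500,000 data points

# Assumption for illustration: each value stored as an 8-byte float, uncompressed.
BYTES_PER_VALUE = 8
size_gb = total_points * BYTES_PER_VALUE / 1e9
print(f"about {size_gb:.2f} GB uncompressed")
```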

Structural vs Statistical

Why Structural Synthetic Data Beats GANs

GANs and VAEs generate statistically plausible data. WorldSim generates causally coherent data. The difference matters.

GAN/VAE Synthetic Data

Variables may correlate but don't respect causal structure

Can generate implausible combinations, such as GDP and unemployment falling at the same time

No audit trail: can't explain why a scenario was generated

Not deterministically reproducible (stochastic generators)

WorldSim Synthetic Data

100+ coupling rules enforce causal macro relationships

GDP falls, unemployment rises, migration responds, debt accumulates

Full audit trail: run group ID, seed, rules fired, per path

Deterministic: the same config and seed always produce identical trajectories
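Seed-level determinism of this kind can be illustrated with a toy random walk. This is a sketch of the general technique (a local, seeded generator per path), not WorldSim's engine; `simulate_path` and its parameters are hypothetical.

```python
import random

def simulate_path(seed, start=2.0, years=25, volatility=1.0):
    """Toy random-walk GDP growth path; the same seed always gives the same path."""
    rng = random.Random(seed)  # local generator, so no shared global state
    path, value = [], start
    for _ in range(years):
        value += rng.gauss(0.0, volatility)
        path.append(round(value, 4))
    return path

first = simulate_path(seed=42)
second = simulate_path(seed=42)
other = simulate_path(seed=43)

print(first == second)  # identical seed, identical trajectory
print(first == other)   # different seed, different trajectory
```

Storing the seed alongside the configuration is what makes a single path reproducible on request.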

Use Cases for AI Teams

What AI Teams Build With WorldSim Data

CREDIT SCORING

Test credit models under diverse macro conditions across 195 countries

Does your credit scoring model degrade when unemployment doubles? When inflation hits 8%? Generate thousands of macro environments and test model robustness across the full spectrum of plausible economic conditions.

DEMAND FORECASTING

Train demand models on structurally diverse economic environments

Consumer demand depends on GDP, inflation, employment, and housing costs. WorldSim generates macro environments where these variables interact realistically, providing better training data than historical time series from a single country.

INSURANCE PRICING

Model claim frequencies under different economic stress scenarios

Insurance claims correlate with macro conditions: recessions increase defaults, energy crises affect health costs, crime rates respond to unemployment. WorldSim provides the structurally coherent macro scenarios that actuarial models need.

PORTFOLIO STRESS TESTING

Generate macro scenarios for multi-asset portfolio risk models

Move beyond historical VaR. Generate thousands of forward-looking macro scenarios with structural coupling, then map them to asset returns. The distributional output gives you the tail risk that historical data doesn't cover.

BIAS DETECTION

Test AI fairness across different economic environments

Does your model perform differently for users in high-inflation vs low-inflation countries? Generate controlled macro environments where only specific variables change, isolating the effect on model predictions for fairness testing.
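A controlled test of this sort amounts to varying one macro variable while holding the others fixed. The sketch below assumes a stand-in scoring function and made-up baseline values; replace `score` with the model under test.

```python
def score(env):
    """Stand-in for a model under test (hypothetical); swap in your own predictor."""
    return 0.02 + 0.004 * env["unemployment"] + 0.001 * env["inflation"]

def sensitivity(model, baseline, variable, values):
    """Vary one macro variable, hold the rest fixed, and return prediction deltas."""
    base_prediction = model(baseline)
    deltas = {}
    for value in values:
        env = {**baseline, variable: value}  # copy, so the baseline stays untouched
        deltas[value] = model(env) - base_prediction
    return deltas

# Illustrative baseline environment; the numbers are made up.
baseline = {"unemployment": 5.0, "inflation": 2.0, "gdp_growth": 1.5}
deltas = sensitivity(score, baseline, "inflation", [2.0, 5.0, 8.0])

for inflation, delta in deltas.items():
    print(f"inflation={inflation:.1f}%  change in prediction={delta:+.4f}")
```

Running the same sweep for different user groups and comparing the deltas is one simple way to surface uneven model behaviour.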

ECONOMIC PREDICTION

Train macro prediction models on structurally coherent synthetic data

Historical macro data is limited: ~25 years of clean data for most countries, with only 2-3 recession episodes. WorldSim generates thousands of recession, recovery, and crisis scenarios that respect structural coupling, massively expanding your training set.

Data Preview

What the Data Looks Like

WorldSim covers 195 countries globally. Every simulation produces structured, multi-layered output that AI teams can consume as training data, test environments, or contextual inputs for LLM-based systems.

1. Global Coverage: 195 Countries, Same Depth

WorldSim isn't EU-only. Here's a US baseline simulation: TI 0.43, with 26 KPIs across all 9 structural domains. Every country in the database produces the same schema and depth of output, from the US to Bangladesh to Nigeria. AI teams training global models get globally diverse macro environments with consistent structure.

[Figure: WorldSim USA overview, 26 KPIs across 9 domains (worldsimlab.com/explore, United States, As Planned, 5,000 simulations)]

2. Raw Monte Carlo Paths: The Training Data

This histogram shows the distribution of US GDP per capita outcomes at 2050 across 500 stored sample paths. Each path is a complete 25-year trajectory with 26 KPIs. For Enterprise buyers, all paths (up to 10,000) are available as raw CSV/Parquet downloads. This is the underlying data that powers every chart, quantile, and regime classification. Your ML model trains on the paths; the distributions are the validation target.

[Figure: WorldSim Monte Carlo distribution of US GDP, year 2050, 500 stored sample paths]
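A histogram like the one described above can be rebuilt from raw paths with a few lines of code. The sketch assumes a long-format CSV with columns `path_id`, `year`, `kpi`, and `value`; the real export layout may differ, and the sample values are made up.

```python
import csv
import io
from collections import Counter

# Hypothetical long-format export with invented values, for illustration only.
SAMPLE = """path_id,year,kpi,value
0,2049,gdp_per_capita,91000
0,2050,gdp_per_capita,93500
1,2050,gdp_per_capita,88200
2,2050,gdp_per_capita,101300
3,2050,gdp_per_capita,94900
"""

def histogram_at_year(rows, kpi, year, bin_width):
    """Count paths per bin (keyed by lower bin edge) for one KPI in one target year."""
    counts = Counter()
    for row in rows:
        if row["kpi"] == kpi and int(row["year"]) == year:
            lower_edge = int(float(row["value"]) // bin_width) * bin_width
            counts[lower_edge] += 1
    return dict(sorted(counts.items()))

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
hist = histogram_at_year(rows, "gdp_per_capita", 2050, bin_width=5_000)
print(hist)  # {85000: 1, 90000: 2, 100000: 1}
```

For a real file, replace `io.StringIO(SAMPLE)` with an opened file handle.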

3. Coupling Rules: Structured Context for LLMs

Every simulation path comes with a full log of which structural rules fired, when, and why. For LLM and descriptive AI systems, this is high-value contextual metadata: "GDP fell because Energy Vulnerability triggered in 2025, which cascaded to Fuel Pressure in 2027, which triggered Monetary Tightening in 2029." Your model doesn't just learn the numbers; it learns the causal narrative. Rules can serve as structured inputs, outputs, or training labels depending on your architecture.

[Figure: WorldSim coupling rules timeline: year-by-year causal audit trail per simulation path, structured metadata for AI training]
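One way to turn a rule log into LLM-ready context is to render it as a sentence. The sketch assumes the log is a list of `(year, rule_name)` pairs, which is a guess at the format rather than WorldSim's documented structure; the rule names are taken from the example above.

```python
def narrate(rule_log):
    """Render (year, rule_name) events as one causal sentence, in chronological order."""
    if not rule_log:
        return "No coupling rules fired."
    events = sorted(rule_log)  # tuples sort by year first
    first_year, first_rule = events[0]
    parts = [f"{first_rule} triggered in {first_year}"]
    for year, rule in events[1:]:
        parts.append(f"which cascaded to {rule} in {year}")
    return ", ".join(parts) + "."

log = [
    (2027, "Fuel Pressure"),
    (2025, "Energy Vulnerability"),
    (2029, "Monetary Tightening"),
]
print(narrate(log))
```

The same log could equally be passed as structured JSON; the sentence form is simply convenient for prompt context or training labels.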

4. Time Series Format: 25-Year Trajectories for Sequential Models

Each simulation path is a complete 25-year time series (2025-2050) with annual values for all 26 KPIs. The fan chart shows P10/P50/P90 quantile envelopes, but the underlying raw paths are individual sequences. For LSTM, Transformer, and other sequential ML architectures, this is native training format: thousands of structurally coherent multivariate time series per country, each following different but causally consistent trajectories.

[Figure: WorldSim inflation rate fan chart: P10/P50/P90 envelopes from thousands of individual paths, 25-year time series for sequential ML models]
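Sequential models usually consume such paths as sliding windows of inputs and targets. The sketch below shows that slicing on a toy path with 10 annual steps and 3 KPIs; the function name and the window sizes are arbitrary choices for illustration.

```python
def make_windows(path, input_len, horizon):
    """Slice one multivariate path (a list of per-year KPI vectors) into (input, target) pairs."""
    samples = []
    last_start = len(path) - input_len - horizon
    for start in range(last_start + 1):
        inputs = path[start:start + input_len]
        targets = path[start + input_len:start + input_len + horizon]
        samples.append((inputs, targets))
    return samples

# Toy path: 10 annual steps, 3 KPIs per step (a real path would carry 26 KPIs).
path = [[float(t), 2.0 * t, 3.0 * t] for t in range(10)]

samples = make_windows(path, input_len=5, horizon=2)
print(len(samples))                            # number of training pairs from this path
print(len(samples[0][0]), len(samples[0][1]))  # input length, target length
```

Applying the same function to every path of a run yields a training set in which each window comes from a single internally consistent trajectory.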

5. Cross-Country Diversity: Train on the Full Spectrum

The same 26 KPIs produce fundamentally different structural profiles across countries. Romania (TI 0.53) and Sweden (TI 0.45) show opposite patterns: Romania leads on Income, Housing, and Demographics; Sweden leads on Fiscal and Energy. Training on a single country's data produces models that overfit to one structural pattern. WorldSim gives you 195 structurally distinct environments with the same schema, maximising training diversity without sacrificing consistency.

[Figure: WorldSim Romania vs Sweden comparison: same schema, fundamentally different structural profiles]

6. Parameterised Scenarios: Controllable Data Generation

Every dataset is controlled by a scenario configuration: country, path (Better/Average/Shock), KPI tilts (sigma shifts with persistence and decay), and simulation count. Via API, your system can programmatically generate specific macro environments: "Give me 5,000 paths for Germany where inflation is +3σ and unemployment is +2σ for 5 years." This enables iterative learning: your model can request increasingly extreme scenarios, test its own boundaries, and even integrate WorldSim's engine as a structured environment for reinforcement learning or agent-based exploration.

[Figure: WorldSim scenario output, parameterised by country, tilts, persistence, and simulation count]
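Since the API is not yet available, the following is only a guess at what a scenario configuration could look like in client code. Every class, field name, and default is hypothetical, and the tilt schedule (full shift held for the persistence period, then geometric decay) is one assumed reading of "sigma shifts with persistence and decay".

```python
import json
from dataclasses import dataclass, field, asdict

@dataclass
class Tilt:
    kpi: str
    sigma: float            # size of the shift, in standard deviations
    persistence_years: int  # years the full shift is held
    decay: float = 0.5      # assumed geometric decay factor per year afterwards

    def schedule(self, years):
        """Shift per year: full sigma while persisting, then decaying geometrically."""
        shifts = []
        for t in range(years):
            if t < self.persistence_years:
                shifts.append(self.sigma)
            else:
                shifts.append(self.sigma * self.decay ** (t - self.persistence_years + 1))
        return shifts

@dataclass
class ScenarioConfig:
    country: str
    path: str               # "Better", "Average", or "Shock"
    n_paths: int
    seed: int
    tilts: list = field(default_factory=list)

# The request quoted above: Germany, inflation +3 sigma and unemployment +2 sigma for 5 years.
config = ScenarioConfig(
    country="Germany", path="Average", n_paths=5_000, seed=42,
    tilts=[Tilt("inflation", 3.0, 5), Tilt("unemployment", 2.0, 5)],
)

payload = json.dumps(asdict(config), indent=2)  # hypothetical request body
print(payload)
print(config.tilts[0].schedule(8))
```

Keeping the configuration as plain data makes it easy to log next to the seed, so any generated dataset can be traced back to the exact request that produced it.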

Key Takeaway for AI & Data Teams

Historical macro data is scarce: ~25 years of clean data per country, with 2-3 recession episodes. You can't train a robust model on 3 recessions. WorldSim generates thousands of structurally coherent recession, recovery, crisis, and boom scenarios across 195 countries, each respecting the causal macro relationships that your model needs to learn. The coupling rules provide structured causal narratives that LLMs can consume as context. The raw paths provide multivariate time series that sequential models can train on. The parameterised API enables your system to generate its own training environments on demand. This is the synthetic macro data infrastructure that the synthetic data market, projected to reach $2.7B by 2030, has been missing.

Get Structurally Coherent Synthetic Macro Data

Raw Monte Carlo paths, quantile distributions, and full audit trails across 195 countries. Available as CSV/Parquet downloads or via API.
