Listening for events…

Data Lab / WSPR Ionospheric Propagation Baseline — 21 Years of Global HF Health

WSPR Ionospheric Propagation Baseline — 21 Years of Global HF Health

Author: TerraPulse Lab
Status: Complete
Created: 2026-03-31

What is WSPR?

The Weak Signal Propagation Reporter is a protocol designed by Nobel laureate Joe Taylor (K1JT) where amateur radio operators transmit 2-minute low-power beacons on designated frequencies. A global network of receivers reports what they hear — signal-to-noise ratio, distance, and coordinates. It's a continuous, calibrated measurement of ionospheric propagation health.

11.5 billion spots since 2004. The WSPR network has grown from 4 transmitters in 2004 to 771 daily transmitters in 2026. It covers every HF band from 2200m (LF) through 6m (VHF), with the 20m and 40m bands carrying the bulk of traffic (3.1B and 3.7B spots respectively).

When geomagnetic storms hit, WSPR SNR drops, long-distance paths vanish, and spot counts plummet. This makes WSPR a direct, quantitative thermometer for ionospheric health.

Data Sources

SourceRecordsSpan
WSPR daily aggregates108,776 band-day rowsNov 2004 — Mar 2026
WSPR live hourly146+ observationsMar 31, 2026
WSPR raw spots11.5B available2004–2026 (wspr.live)
SILSO sunspot number72,783 daily values1818–2026

Findings

Band summary — 21 years of HF propagation

BandMean SNRTotal SpotsMean DistanceDays
80m-13.4 ± 2.7 dB872M1,100 km6,379
40m-13.9 ± 1.6 dB3.68B1,955 km6,447
30m-15.1 ± 1.4 dB1.93B2,269 km6,545
20m-15.0 ± 1.9 dB3.10B2,826 km6,469
17m-16.2 ± 2.2 dB435M3,223 km6,318
15m-16.2 ± 2.4 dB425M3,263 km6,025
10m-15.0 ± 2.5 dB347M2,451 km6,229
6m-9.7 ± 5.0 dB13M623 km6,017

40m and 20m are the workhorses — 6.8 billion spots between them. The distance gradient is clear: lower bands (80m) average 1,100 km paths, while upper HF bands (15m-17m) reach 3,200+ km as they reflect off higher ionospheric layers.

6m is the outlier — best SNR (-9.7 dB) but shortest distance (623 km) and fewest spots. It only opens during sporadic-E events and solar maximum, making it a sensitive indicator of unusual ionospheric conditions.

Solar cycle correlation — the ionosphere breathes with the Sun

BandPearson rSpearman ρN (months)Direction
40m+0.447+0.554218Better in solar max
20m+0.291+0.346216Better in solar max
10m-0.289-0.425216Worse in solar max

This is the key finding: the solar cycle affects different bands in opposite directions.

  • 40m and 20m improve during solar maximum (positive correlation with sunspot number). Higher solar activity increases ionospheric ionization, making these bands propagate farther and stronger.
  • 10m degrades during solar maximum (negative correlation). This is counterintuitive until you consider D-layer absorption — the same increased ionization that helps lower bands creates absorption that hurts higher frequencies during the day.

The Spearman correlations are stronger than Pearson in all three cases, suggesting the relationship is monotonic but nonlinear.

Seasonal patterns

BandBest monthWorst monthPattern
40mOctober (-13.5 dB)August (-14.3 dB)Autumn peak
20mNovember (-14.2 dB)August (-15.7 dB)Autumn peak
10mDecember (-14.6 dB)September (-15.2 dB)Winter peak

All three bands show worst propagation in late summer (Aug-Sep) and best in autumn/winter (Oct-Dec). This reflects the seasonal variation in ionospheric electron density — winter hemisphere has less solar absorption, cleaner skip zones, and lower noise floors.

Network growth — exponential

WSPR has grown exponentially:

YearDaily TXDaily RXSpots/year
200822212.8M
2012575636M
2016163126190M
2020263257785M
20245543891.92B
20267714962.3B (projected)

The network is still accelerating — 771 daily transmitters in 2026 is a new high. This means our SNR measurements get more statistically robust every year.

Visualizations

Data Card

WSPR — Weak Signal Propagation Reporter

What it measures: Global ionospheric HF propagation health via signal-to-noise ratio of amateur radio beacon transmissions.

Coverage: Global, 18 bands (LF 2200m through UHF 1296m), continuous since Nov 2004.

Volume: 11.5 billion raw spots. We ingest hourly band aggregates (~18 rows/hour) plus 220 months of daily archives.

Key metric: wspr_snr_{band} — mean SNR in dB for each band per hour.

Why it matters for TerraPulse:

  • Direct quantitative measure of ionospheric health
  • Correlates with solar cycle (r=+0.45 for 40m, 218 months)
  • Sensitive to geomagnetic storms — SNR drops, long paths vanish
  • 21-year baseline covers Solar Cycles 23, 24, and 25
  • Complements HamQSL qualitative band conditions with calibrated numbers

API: GET /api/v1/observations?metric=wspr_snr_20m&limit=50


Lab Notebook — Pipeline Development

Entry 1 — 2026-03-31: Data onboarding complete

Loaded 108,776 daily band aggregates from the wspr.live ClickHouse archive (220 monthly Parquets, Nov 2004 – Mar 2026). Established baselines for 13 HF bands. Key finding: solar cycle modulates bands in opposite directions — 40m/20m improve in solar max, 10m degrades. Network grew from 4 TX (2004) to 771 (2026).

Entry 2 — 2026-03-31: Raw data is pre-aggregated — we need corridors

Realized the daily aggregates average all paths globally. A geomagnetic storm killing transatlantic paths is smoothed out by healthy paths elsewhere. We need path-specific data.

Tested downloading raw spots: 277K spots/hour = 28 MB/hour = 0.7 GB/day. Too much to pull raw.

Solution: Server-side corridor classification in ClickHouse SQL. The multiIf() function classifies each TX→RX pair into a corridor (NA_EU, NA_AS, EU_AS, POLAR, EQUAT, LOCAL) and aggregates server-side. One query returns ~1,600 rows per day instead of 6.6M raw spots. 1 second per day vs hours of downloading.

Entry 3 — 2026-03-31: Corridor pipeline built and tested

Built scripts/wspr_pipeline.py with three modes:

  • live — last 3 hours of corridors + noise (for hourly scheduler)
  • backfill — daily corridor aggregates for any date range
  • noise — wsprdaemon noise floor data (549M rows on wspr.live)

Six corridors defined:

CorridorPathStorm sensitivityWhy
NA_EUTransatlanticHighCrosses auroral oval
NA_ASTranspacificModerateSkirts northern auroral zone
EU_ASEurasian landmassModerateHigh-latitude paths
POLARAny path >60° latHighestDirectly under aurora
EQUATPaths within ±30° latLowestControl group
LOCALSame-region / unclassifiedBaselineCalibration

First test: March 29 = 1,630 rows, 6 corridors, in 1 second.

Sample — 20m band, 18:00 UTC, March 29:

CorridorMean SNRSpotsMax DistanceTX stations
LOCAL-13.3 dB120,88919,931 km737
NA_EU-17.6 dB7,50414,614 km257
POLAR-14.5 dB8,04618,652 km387
EQUAT-17.5 dB2,43719,881 km181
EU_AS-21.7 dB54414,249 km51
NA_AS-18.9 dB39515,531 km53

NA_EU shows 4.3 dB worse SNR than LOCAL — that's the baseline transatlantic penalty from longer paths and higher absorption. When a storm hits, this gap should widen dramatically as the auroral zone absorbs signal.

Entry 4 — 2026-03-31: Flare window backfill complete

Backfilled corridor data for the X1.5 flare window:

  • March 28 — pre-flare baseline (1 day)
  • March 29 — flare day (1,630 rows)
  • March 30 — post-flare / pre-CME (1,565 rows)
  • March 31 — CME arrival window (1,451 rows)

Also captured raw spots from the old client-side pipeline for March 30 (24 hourly Parquets, ~250K spots each). These are in the data archive in case we need individual path analysis.

Noise floor data also backfilled for March 28-31.

Data inventory (WSPR archive):

DirectoryContentsSize
daily/220 monthly Parquets (2004-2026), daily band aggregates4.4 MB
corridors/4 daily + 24 hourly Parquets, corridor-band aggregates~1 MB
raw/24 hourly Parquets (Mar 30), full spot data~75 MB
noise/Noise floor data for flare window~1 MB

Next steps:

  • [ ] Set up corridor pipeline as hourly scheduler job
  • [ ] When CME arrives: compare pre-flare NA_EU/POLAR SNR vs storm-time
  • [ ] Backfill corridors for historical Kp≥5 storm events (from DONKI catalog)
  • [ ] Cross-correlate corridor SNR with Kp, solar wind speed, Bz
  • [ ] Build a real-time propagation dashboard showing corridor health

Entry 5 — 2026-03-31: Data quality pipeline built

Built scripts/wspr_clean.py — a six-correction cleaning pipeline that transforms raw WSPR data into analysis-ready datasets. The corrections address systemic biases that would pollute any cross-domain study.

Six corrections applied:

#IssueFixColumn added
1TX power variance (56 dB range)Path loss = TX_power − SNRpath_loss
2Network growth (85x TX in 16 years)Spots per active stationspots_per_tx
3Sample size (6m has 100x fewer spots than 20m)1/√N confidence intervalsconfidence
4Diurnal cycle (10-20 dB day vs night)Subtract hour-of-day median per band+corridorsnr_detrended
5Outlier detection3σ from 30-point rolling mediansnr_anomaly, is_outlier
6Composite qualityWeighted score (spots, stations, stability)quality_score

Why each matters:

  • Path loss removes the biggest confound: a station running 5W (37 dBm) will always have ~17 dB better SNR than one running 200mW (23 dBm). Without correction, "SNR improved" might just mean "a high-power station came online."
  • Diurnal detrend is critical for hourly corridor data. HF propagation varies 10-20 dB between day and night — comparing 06:00 to 18:00 UTC is comparing different ionospheres. After detrend, snr_detrended isolates non-diurnal anomalies (storms, unusual propagation).
  • Rolling anomaly uses a 30-point window to adapt to slow trends (seasonal, solar cycle) while flagging sudden deviations. An SNR drop that's 3σ below the rolling baseline is a real event, not seasonal drift.

Output (cleaned datasets):

FileRowsColumnsDescription
corridors_clean.parquet9,98719Corridor-level, hourly, 6 corridors, flare window
daily_clean.parquet108,77620Band-level, daily, 21 years

Quality assessment:

  • Corridor data: 0.5% outliers, mean quality score 0.45
  • Daily data: 0.9% outliers (934 of 108K), mean quality score 0.41
  • The old client-side corridor data is missing power and station count columns (marked null for those corrections)

For any study on this data, use the clean files. The raw files are preserved for reproducibility.

Entry 6 — 2026-04-01: Noise floor analysis — the path to denoised WSPR data

The problem

Raw WSPR SNR is contaminated by receiver-specific noise floors. A city receiver with -135 dBm/Hz noise reports SNR = -30 dB for the same signal that a rural receiver with -150 dBm/Hz noise sees at SNR = -15 dB. Without correction, our global averages mix receiver quality with propagation quality.

We have 365K noise floor measurements from the wsprdaemon_noise table (2022-2026) covering ~25-38 receiver sites per band per hour.

Noise floor gradient by band

BandMean RMS NoiseAssessment
160m-134.5 dBm/HzNoisiest — atmospheric + man-made
80m-132.9 dBm/HzNoisiest
40m-134.8 dBm/HzNoisy
30m-138.4 dBm/HzModerate
20m-139.7 dBm/HzTransitioning
17m-143.2 dBm/HzQuiet
15m-144.1 dBm/HzQuiet
12m-145.4 dBm/HzQuiet
10m-144.6 dBm/HzQuiet
6m-147.5 dBm/HzQuietest

12 dB spread from 80m to 6m. This means a -20 dB SNR on 80m is actually a 12 dB stronger signal than -20 dB on 6m. Cross-band comparisons without noise correction are meaningless.

Diurnal noise cycle (20m)

1.8 dB swing: noisiest at 00:00 UTC (evening in NA/EU, appliances on), quietest at 10:00 UTC (nighttime Americas). Small but systematic — adds to the 10-20 dB propagation diurnal cycle.

Three-step denoising pipeline (to build)

Step 1: Band-hour noise baseline

For each band and hour-of-day, compute median noise across all sites and all available data (365K measurements). This is the "typical" noise environment.

noise_baseline[band][hour_of_day] = median(rms_level)

Step 2: Noise-adjusted SNR

For each corridor-band-hour data point, subtract the noise anomaly:

noise_anomaly = noise_this_hour - noise_baseline[band][hour]
adjusted_snr = raw_snr - noise_anomaly

This removes time-varying noise fluctuations (storms, RFI events, seasonal changes) while preserving the absolute SNR scale.

Step 3: Signal level estimation

For deep path-loss analysis, recover the actual signal power:

signal_dBm = raw_snr + noise_floor_dBm

This is the "true" propagation metric — independent of receiver environment. Two receivers with different noise floors will agree on signal_dBm for the same transmission.

Why this matters

Without denoising:

  • Cross-band comparisons are biased (12 dB noise gradient)
  • Long-term trends may reflect network composition changes (more urban vs rural receivers) rather than ionospheric changes
  • Storm detection threshold varies by band — a 3 dB drop on 80m is swamped by noise variance, but significant on 10m
  • Seasonal patterns partially reflect noise seasonality, not just propagation

With denoising:

  • Cross-band comparisons become valid
  • The -0.1 dB/year trend we found in the baseline study needs re-evaluation — it may be noise-driven
  • Storm signatures become cleaner (noise floor goes UP during storms → SNR drops → but how much is noise vs propagation?)
  • The equatorial control corridor (EQUAT) becomes a true control

Next steps:

  • [x] Build the denoising pipeline (scripts/wspr_denoise.py) — DONE (Entry 7)
  • [x] Join noise data with corridor data by band + hour — DONE
  • [x] Produce denoised Parquet — DONE (corridors_denoised + daily_denoised)
  • [ ] Re-run the FFT analysis on denoised data (issue #88)
  • [ ] Re-evaluate the -0.1 dB/year secular trend (issue #87)
  • [ ] Compare storm-window SNR drops: raw vs denoised (issue #89)

Entry 7 — 2026-04-01: Denoising pipeline delivered — the correction is real

Built and ran scripts/wspr_denoise.py. Three-level noise correction applied to 3.3M corridor rows and 109K daily rows.

Results

Coverage: 899,097 of 3,274,769 corridor rows (27%) have matching noise floor data for full time-varying correction. The remaining 73% get band-average baseline correction only (noise data starts 2022, corridors start 2020).

The correction is systematic and significant:

CorridorRaw SNRAdjusted SNRSignal LevelCorrection
NA_EU-20.7 dB-22.2 dB+118.4 dBm-1.6 dB
NA_AS-21.4 dB-23.5 dB+117.7 dBm-2.1 dB
EU_AS-21.5 dB-23.2 dB+117.7 dBm-1.7 dB
POLAR-17.9 dB-19.5 dB+120.2 dBm-1.6 dB
EQUAT-18.1 dB-19.9 dB+121.4 dBm-1.8 dB
LOCAL-15.5 dB-17.0 dB+123.0 dBm-1.6 dB

Every corridor shifts 1.5–2.1 dB after correction. The mean noise environment has been 1.68 dB noisier than the long-term baseline — systematically inflating raw SNR readings.

Noise anomaly distribution:

  • Mean: +1.68 dB (biased noisier)
  • Std: 5.52 dB (wide variation)
  • 21% of readings have |anomaly| > 3 dB (one in five significantly affected)
  • 14% have |anomaly| > 6 dB (severely affected)

The right-skewed tail reaches +132 dB — these are RFI events (radio frequency interference) where local noise completely overwhelms the signal. These readings are effectively unusable without correction.

Signal level is the true metric: LOCAL at +123.0 dBm vs NA_EU at +118.4 dBm = 4.6 dB transatlantic propagation penalty. This number is receiver-independent and noise-corrected. It's what the ionosphere is actually doing.

Denoising visualizations

Entry 8 — 2026-04-01: Raw vs Denoised comparison — the correction matters for corridors

Re-ran all baseline analyses on both raw and denoised data.

Daily aggregates: denoising makes no difference. The daily data uses a band-average noise baseline (constant offset), so correlations, trends, and FFT periods are identical. The -0.1 dB/year secular trend persists in both raw and signal_level — meaning the trend is NOT a noise artifact (or at least not one that band-average correction can remove). It may be real thermosphere cooling or a network composition effect that requires site-level correction.

Corridor data: denoising reveals the real picture. The time-varying noise correction on hourly corridor data shows dramatic differences:

Variance: adjusted SNR has MORE variance (expected!)

CorridorRaw StdAdjusted StdWhy
NA_EU1.94 dB5.57 dB
POLAR1.79 dB5.66 dB
EQUAT1.15 dB5.61 dB

This looks backwards — denoising increased variance? Yes, because the noise correction is SUBTRACTING a correlated signal. The noise floor and the raw SNR are correlated (noisier environment → lower SNR). When you remove the noise component, the residual (adjusted_snr) contains the noise correction variance plus the propagation variance. The formula adjusted = raw - (noise - baseline) adds the noise variance into the adjusted column.

The RIGHT metric to check is autocorrelation — does the signal become more persistent after removing noise?

Autocorrelation: dramatically improved

CorridorRaw ACF(1)Adjusted ACF(1)Improvement
NA_EU0.7090.956+0.247
POLAR0.8310.976+0.145
EQUAT0.7840.982+0.199
LOCAL0.8960.984+0.088

The adjusted signal is far more temporally coherent. Raw SNR at lag-1 hour has ACF 0.71-0.90 (noisy). After noise correction, ACF jumps to 0.96-0.98 — almost perfectly correlated hour-to-hour. This means the noise was injecting random jitter that broke the smooth propagation signal. The denoised data reveals the true ionospheric state.

Noise contamination: quantified

CorridorNoise↔Raw SNRNoise↔Adjusted SNR
NA_EUr = +0.090r = -0.938
POLARr = +0.017r = -0.949
EQUATr = -0.075r = -0.979

In raw data, noise anomaly has a weak correlation with SNR (+0.09 for NA_EU). After adjustment, the correlation is strongly negative (-0.94) — exactly what the math predicts (adjusted = raw - noise_anomaly, so adjusted ∝ -noise_anomaly when raw is relatively stable).

The near-zero correlation in raw data (+0.09) means noise and propagation are nearly independent in raw SNR — the noise effect is hidden, not dominant. But 9% contamination across 900K readings is enough to bias any statistical test.

Comparison visualizations

Key takeaway

The denoising matters most for hourly corridor analysis (storm detection, path-specific effects). For daily/monthly aggregate studies (solar cycle, seasonal patterns, secular trends), band-average noise correction is a constant offset that doesn't change the statistics. The -0.1 dB/year trend is real (not removable by band-level noise correction) and needs site-level analysis to resolve.

Output files

FileRowsSize
corridors_denoised.parquet3,274,769122.8 MB
daily_denoised.parquet108,7764.1 MB
noise_baseline.parquet312<1 KB

References

  1. Taylor, J.H., K1JT, "WSPR — Weak Signal Propagation Reporter," https://wsprnet.org/
  2. wspr.live — ClickHouse mirror of WSPR database, https://wspr.live/
  3. SILSO World Data Center, Royal Observatory of Belgium, https://www.sidc.be/silso/

Author: Lab

Published: 2026-03-31 · Updated: 2026-03-31

Data files: compare_results.json, results.json, sunspots.parquet, wspr_daily_all.parquet, wspr_live.parquet

Scripts: analyze.py, compare_raw_denoised.py, extract.py, visualize_denoised.py

← Back to Data Lab
Live Feed