BondTerminal research project

BT Sovereign Signal

An open methodology for turning public sovereign fundamentals into a score that can be inspected, challenged, and improved.

Paper: Public methodology
Date: 2026-06-29
Score: BT-SFS v0.4
Status: Draft

BT Sovereign Signal is an open research project for comparing sovereign fundamentals across countries. Its core output is the BT Sovereign Fundamentals Score, or BT-SFS, built from public macroeconomic, fiscal, external, institutional, and default-history data.

The score does not use bond spreads, CDS, or agency ratings as inputs. That separation lets anyone compare fundamentals with market pricing later, without building market prices into the score itself.

The Question

Sovereign investors usually look at countries through market data, credit views, and country data. Those lenses overlap, but they are not the same.

A country can trade at a high spread because fundamentals are weak. It can also trade wide because of liquidity, investor positioning, recent restructuring history, election risk, or a global selloff.

How strong are this country's measured sovereign fundamentals compared with the other countries in the same frozen release universe?

That is all BT-SFS tries to answer. Market comparison is a later use case, not part of score construction.

Design Principles

These rules keep BT-SFS open, auditable, and separate from the market prices it is later compared against.

| Rule | Meaning |
| --- | --- |
| Open enough to audit | Inputs, source years, weights, transforms, and penalties should be visible in each release. |
| Parsimonious | Use a small set of stable indicators rather than a large opaque model. |
| Non-circular | Do not use market spreads, CDS, or agency ratings as score inputs. |
| Directionally intuitive | Every metric should have a clear better direction before scoring. |
| Release-relative | Scores rank countries against the same frozen peer set. |
| Honest about gaps | Missing or weak data should be labeled, not silently neutral-filled. |

Source Data

BT-SFS uses public-source data where possible and a published default/restructuring event registry where public datasets lag recent events.

| Source family | Main use | Typical cadence |
| --- | --- | --- |
| IMF DataMapper / WEO / Fiscal Monitor | Debt, fiscal balances, revenue, current account, inflation, growth, GDP/capita | IMF release cycle; current-year values may be estimates or projections |
| World Bank World Development Indicators | Reserves months, external-debt diagnostics, fallbacks, country crosswalks | Annual or rolling |
| World Bank Worldwide Governance Indicators | Rule of Law, Government Effectiveness, Control of Corruption | Annual |
| Bank of Canada / Bank of England default database | Long-run default-history support | Annual |
| BT default/restructuring event registry | Recent source-linked default and restructuring overlay | Manual, release-gated |

The score blends source dates. Current IMF fields can be estimates or projections, while WDI and WGI observations often lag. Public releases should show the source year for every metric.

Metrics And Weights

The headline score uses nine required metrics plus a separate default/restructuring adjustment.

| Metric | Pillar | Weight | Better direction |
| --- | --- | --- | --- |
| General government debt/GDP | Fiscal | 10.50% | Lower |
| Interest expense/revenue | Fiscal | 10.50% | Lower |
| Primary balance/GDP | Fiscal | 9.00% | Higher |
| International reserves, months of imports | External | 14.25% | Higher |
| Current account/GDP | External | 10.75% | Higher |
| Inflation stress | Macro | 8.00% | Lower stress |
| Real GDP growth | Macro | 6.00% | Higher |
| GDP/capita, current USD | Macro | 6.00% | Higher |
| Institutional quality | Institutions | 25.00% | Higher |

External debt remains diagnostic for now. It is important, but current open-source coverage is not clean enough across the whole tradable country set to make it a required input.

Institutional quality is a single composite: the simple average of the World Bank WGI Rule of Law, Government Effectiveness, and Control of Corruption estimates, scored as one metric. WGI is perception-based and typically lags by about two years, so the institutions pillar is the slowest-moving quarter of the score and can carry a residue of past market sentiment.

Transforming Raw Data

Every metric is converted into an oriented score where 100 is better and 0 is worse. Ordinary metrics use release-relative percentile ranks.

    percentile = 100 * (count_less + (count_equal - 1) / 2) / (n - 1)
    score = percentile          for higher-is-better metrics
    score = 100 - percentile    for lower-is-better metrics
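The percentile formula can be sketched in a few lines of Python. This is a minimal illustration, not the release code: the function names are hypothetical, `count_equal` is assumed to include the country itself, and the five-country universe is a toy example that reuses raw debt/GDP values from the metric profile matrix. Because ranks are release-relative, the toy scores differ from the published ones, which are ranked against the full universe.

```python
from typing import Sequence


def percentile_rank(values: Sequence[float], x: float) -> float:
    """Mid-rank percentile of x within values; ties share the average rank."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two countries in the release universe")
    count_less = sum(1 for v in values if v < x)
    count_equal = sum(1 for v in values if v == x)  # assumed to include x itself
    return 100.0 * (count_less + (count_equal - 1) / 2) / (n - 1)


def oriented_score(values: Sequence[float], x: float, higher_is_better: bool) -> float:
    """Flip the percentile so that 100 is always the better end."""
    p = percentile_rank(values, x)
    return p if higher_is_better else 100.0 - p


# Toy universe: raw debt/GDP values, where lower is better.
debt_to_gdp = [31.4, 53.0, 70.4, 96.5, 25.5]
scores = [oriented_score(debt_to_gdp, v, higher_is_better=False) for v in debt_to_gdp]
```

With ties, the `(count_equal - 1) / 2` term gives each tied country the midpoint of the ranks it spans, so two countries tied in the middle of a four-country universe both score 50.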

Inflation is handled differently. Very low and stable inflation is good, but deflation and very high inflation are both credit-relevant stresses.

| Inflation range | Score treatment |
| --- | --- |
| 0% to 5% | Full score of 100 |
| 5% to 10% | Linear decline from 100 to 75 |
| 10% to 25% | Linear decline from 75 to 25 |
| Above 25% | 25 minus 1 point per point of inflation above 25%, floor at 0 |
| Below 0% | 100 minus 10 points per point of deflation, floor at 0 |
[Figure: Inflation stress transform. The score line stays at 100 from 0% to 5% inflation, then declines as inflation rises; deflation is also penalized. X-axis marks: -5%, 5%, 10%, 25%. Y-axis: score, 0 to 100.]
The chart is a visual guide; the exact v0.4 parameters are the band formulas in the table, locked in the release specification file.
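The band formulas translate directly into a piecewise function. The sketch below is one possible reading, with a hypothetical function name; it is not the locked release specification. The bands meet at 5%, 10%, and 25%, so it does not matter which band owns each boundary.

```python
def inflation_stress_score(inflation_pct: float) -> float:
    """Oriented inflation score: 100 is best, 0 is worst."""
    if inflation_pct < 0:
        # Deflation: lose 10 points per percentage point below zero.
        return max(0.0, 100.0 + 10.0 * inflation_pct)
    if inflation_pct <= 5:
        return 100.0
    if inflation_pct <= 10:
        # Linear decline from 100 at 5% to 75 at 10%.
        return 100.0 - (inflation_pct - 5) * 5.0
    if inflation_pct <= 25:
        # Linear decline from 75 at 10% to 25 at 25%.
        return 75.0 - (inflation_pct - 10) * (50.0 / 15.0)
    # Above 25%: lose 1 point per percentage point, floored at 0.
    return max(0.0, 25.0 - (inflation_pct - 25))
```

Under this reading, the raw inflation values in the metric profile matrix map to the scores shown there: 30.4% gives 19.6, 28.6% gives 21.4, and 5.8% gives 96.0.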

Aggregation And Default History

The default/restructuring adjustment is separate because recent default history is not just another macro variable. It captures willingness to pay, market-access track record, and unresolved restructuring complexity.

The adjustment should be source-linked, time-decayed, capped, visible to users, and included in the public release package. The v0.4 parameters are:

    penalty = base × recency multiplier + active-status penalty
    multiple events add together, floored at -18
    base: -14 high-severity event · -8 other events
    recency: ×1.00 within 2y · ×0.85 within 5y · ×0.55 within 10y · ×0.25 after 10y
    active: -2 while a standstill or restructuring remains unresolved
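A minimal sketch of the penalty arithmetic, using the v0.4 parameters above. The `Event` class and function names are hypothetical, and two details are assumptions the parameters do not settle: the recency bands are treated as inclusive at their upper edge, and the active-status penalty is applied per event, as the formula reads. The events in the tests are illustrative, not entries from the BT event registry.

```python
from dataclasses import dataclass
from typing import Iterable

HIGH_SEVERITY_BASE = -14.0
OTHER_BASE = -8.0
ACTIVE_PENALTY = -2.0
PENALTY_FLOOR = -18.0


@dataclass
class Event:
    years_since: float      # years between the event and the release date
    high_severity: bool
    active: bool = False    # standstill or restructuring still unresolved


def recency_multiplier(years_since: float) -> float:
    # Assumption: each band includes its upper edge (e.g. exactly 2y counts as "within 2y").
    if years_since <= 2:
        return 1.00
    if years_since <= 5:
        return 0.85
    if years_since <= 10:
        return 0.55
    return 0.25


def event_penalty(event: Event) -> float:
    base = HIGH_SEVERITY_BASE if event.high_severity else OTHER_BASE
    active = ACTIVE_PENALTY if event.active else 0.0
    return base * recency_multiplier(event.years_since) + active


def default_adjustment(events: Iterable[Event]) -> float:
    """Sum the per-event penalties, then apply the -18 floor."""
    return max(PENALTY_FLOOR, sum(event_penalty(e) for e in events))
```

For example, a single unresolved high-severity event from three years ago yields -14 × 0.85 - 2 = -13.9, and two high-severity events from six and twelve years ago yield -7.7 - 3.5 = -11.2.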

The overlay interacts with the metrics by design: a restructuring can improve measured debt ratios while the penalty offsets it, and the offset fades on the published decay schedule rather than on new behavior.
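The aggregation step is not written out as a formula here. The sketch below assumes the base score is the weighted sum of the nine oriented metric scores, using the published weights, and that the default adjustment is then added to it. The dictionary keys and function name are hypothetical. The assumption is at least consistent with the release exhibits: feeding in the rounded UAE scores from the metric profile matrix gives 86.92, the published UAE score.

```python
WEIGHTS = {
    "debt_gdp": 0.1050,
    "interest_revenue": 0.1050,
    "primary_balance": 0.0900,
    "reserves_months": 0.1425,
    "current_account": 0.1075,
    "inflation_stress": 0.0800,
    "gdp_growth": 0.0600,
    "gdp_per_capita": 0.0600,
    "institutional_quality": 0.2500,
}


def final_score(metric_scores: dict, default_adjustment: float = 0.0) -> float:
    """Assumed aggregation: weighted sum of metric scores plus the default adjustment."""
    missing = set(WEIGHTS) - set(metric_scores)
    if missing:
        # Required metrics are never neutral-filled; refuse to score instead.
        raise ValueError(f"missing required metrics: {sorted(missing)}")
    base = sum(WEIGHTS[m] * metric_scores[m] for m in WEIGHTS)
    return base + default_adjustment


# Rounded UAE metric scores taken from the metric profile matrix.
uae_scores = {
    "debt_gdp": 87.5,
    "interest_revenue": 89.6,
    "primary_balance": 100.0,
    "reserves_months": 51.1,
    "current_account": 100.0,
    "inflation_stress": 100.0,
    "gdp_growth": 46.9,
    "gdp_per_capita": 100.0,
    "institutional_quality": 97.9,
}
uae_final = final_score(uae_scores)
```

Because the matrix shows metric scores rounded to one decimal, results reproduced this way can differ from published scores by a few hundredths.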

Missing Data

| Status | Meaning |
| --- | --- |
| Publishable | All required headline metrics are present and reviewed for the release. |
| Provisional | A required metric is missing, stale, or under review. The country can be shown, but comparisons need caution. |
| Distressed extension | The country has BT USD sovereign bonds, but the representative instruments are defaulted or non-performing. Fundamentals can be shown; ordinary spread comparisons should be excluded. |
| Excluded | The country lacks enough source coverage to support a meaningful score. |

Required metrics should not be neutral-filled in the public score. A neutral fill can be useful internally for diagnostics, but it makes public comparisons look more precise than they are.

Why Ratings And Spreads Stay Out

Agency ratings and market spreads are useful. They are also exactly the kinds of things we may want to compare against the score.

| Layer | Role |
| --- | --- |
| Fundamentals score | Open public-data model. |
| Agency ratings | Reference fields only, not score inputs. |
| Market spreads/yields | BondTerminal market layer used for comparison, not score construction. |

One practical use case is a chart comparing BT-SFS with selected USD sovereign bond spreads. That chart is a product and research surface. It is not the methodology itself.

Research Context

BT-SFS reflects several inputs at once: academic work on ratings, spreads, and debt distress; public rating-agency methodology; public scorecard examples; open-data constraints; and our own product requirement that fundamentals stay separate from market pricing. No single external model is the template.

| Source family | What it contributes | BT-SFS decision |
| --- | --- | --- |
| Academic ratings literature | Shows that compact macro, income, inflation, external, and default-history variables explain much of sovereign rating variation. | Use a parsimonious fundamentals vector, but do not try to reproduce agency ratings. |
| Sovereign-spread literature | Shows that country fundamentals matter, but global risk and market conditions also matter. | Keep spreads outside the score and use them later as a comparison layer. |
| Debt-distress and debt-sustainability literature | Highlights debt burden, liquidity, policy quality, institutions, and crisis-warning variables. | Treat the score as an explainable scorecard, not a default-probability model. |
| Public rating-agency methodologies | Provide a common vocabulary around institutions, economy, external position, fiscal strength, monetary flexibility, and event risk. | Use similar broad categories with published weights and source rules. |
| Public investor frameworks, including VanEck's emerging-market debt materials | Show how radar profiles, standardized scores, and fundamentals-versus-market charts can communicate sovereign risk. | Borrow the communication idea, not the proprietary model or internal weights. |
| BondTerminal reasoning and data constraints | Require open sources, reproducible updates, and a clean separation between fundamentals and our market layer. | Prefer fewer auditable metrics over a larger opaque or hard-to-redistribute model. |

Academic work on sovereign ratings and spreads supports a compact fundamentals vector. Cantor and Packer show that macro, income, inflation, external, and default-history variables explain much of sovereign rating variation. Eichengreen and Mody, Hilscher and Nosbusch, Longstaff et al., and others show that spreads reflect both country fundamentals and global risk conditions. Mauro, Sussman, and Yafeh add a longer historical comparison across the 1870-1913 and modern globalization periods, which is useful because fiscal variables, political events, country-specific developments, and global co-movement can all matter.

Public rating-agency methodologies from S&P, Fitch, and Moody's use broad categories such as institutions, economy, external position, fiscal strength, monetary flexibility, and event risk. BT-SFS uses similar categories, but with published weights and source rules.

| Tradition | What it contributes | What BT-SFS does differently |
| --- | --- | --- |
| Rating-replication literature | Shows that compact macro and default-history variables explain much of sovereign ratings. | Does not try to reproduce agency ratings or issue a rating. |
| Debt-distress prediction literature | Highlights solvency, liquidity, debt burden, policy quality, and crisis-warning variables. | Does not estimate default probability. |
| Sovereign-spread literature | Shows that fundamentals matter, but global risk and market conditions also matter. | Keeps market spreads outside the score to avoid circularity. |
| Transparent scorecard methodology | Shows how public score frameworks can communicate country risk drivers. | Publishes weights, source vintages, and missing-data rules for debate. |

That boundary is important. BT-SFS is an inspectable fundamentals score that can later be compared with ratings, spreads, and default outcomes. It is not a rating model, a default model, or a pricing model.

Limitations

| Limitation | Why it matters |
| --- | --- |
| Relative ranking | A country can improve but still fall if peers improve more. |
| Mixed source years | Forecast IMF data and observed WDI/WGI data can live in the same release. |
| Judgmental weights | The v0.x weights are defensible assumptions, not statistically proven truths. |
| Indicator overlap | Fiscal, external, macro, and institutional variables are not fully independent. |
| Data coverage | Some important metrics are hard to source consistently. |
| Default penalty judgment | Event classification and decay rules require transparent but imperfect judgment. |
| Market comparison limits | Any spread chart depends on bond selection, liquidity, duration, and curve assumptions. |
| Distressed-end curvature | A single log-linear spread fit understates curvature for post-restructuring credits, whose spreads price recovery mechanics; see the tier-exclusion diagnostic in the validation appendix. |
| Within-tier discrimination | Inside one rating tier (single-B in the current cross-section) the score explains little of the remaining spread variation; BT-SFS ranks across the quality spectrum, not within tiers. |

BT-SFS measures observable macro, fiscal, external, institutional, and default-history fundamentals. It does not fully capture policy credibility, fiscal accounting quality, FX-regime fragility, currency mismatch, political transition risk, or market technicals. The external pillar also measures flow adequacy: reserves in months of imports covers trade needs, not rollover risk against short-term external debt, which stays diagnostic until open-source coverage improves. Those forces may appear later as residuals between the fundamentals score and market spreads.

BT-SFS is not a sovereign rating, default-probability model, recovery model, investment recommendation, or fair-value model for spreads.

Historical backcast policy

Historical series should be treated as current-vintage backcasts: they apply the current methodology to historical source series. They do not show what the model would have known in real time using only data available in each year.

| History range | Public treatment | Coverage finding |
| --- | --- | --- |
| 2007-2026 | Default public range | Stable enough for normal display; recurring gaps are concentrated in reserves months for Benin, Cote d'Ivoire, Senegal, and UAE through 2022. |
| 2002-2006 | Optional extended range with visible warning | Usable, but reserves gaps add Serbia and Suriname in early years. |
| 1997-2001 | Internal/research only | Debt/GDP, interest/revenue, primary balance, and reserves coverage begin to break for too many countries. |
| 1992-1996 | Do not publish under the current methodology | WGI institutions are missing before 1996 and fiscal data are patchy. |

The public chart should show what exists without inventing missing values: complete years can be drawn as solid lines, provisional years as faint or dotted lines, and missing years as gaps. Tooltips should show source coverage and the specific missing metric.
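The backcast ranges can be encoded as a small lookup so that a chart layer applies the same treatment everywhere. This is an illustrative sketch: the function name, the returned labels, and the behavior for years outside 1992-2026 are assumptions, not part of the published policy.

```python
# (first_year, last_year, treatment), mirroring the backcast policy ranges.
HISTORY_POLICY = [
    (2007, 2026, "default public range"),
    (2002, 2006, "optional extended range with visible warning"),
    (1997, 2001, "internal/research only"),
    (1992, 1996, "do not publish under the current methodology"),
]


def history_treatment(year: int) -> str:
    """Return the public treatment for a backcast year."""
    for first, last, treatment in HISTORY_POLICY:
        if first <= year <= last:
            return treatment
    # Assumption: years outside the documented ranges are not displayed.
    return "not covered by the backcast policy"
```

A display layer could call `history_treatment(year)` per data point and combine it with per-metric coverage flags to choose between solid lines, dotted lines, and gaps.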

First Validation Results

BT-SFS has a separate validation appendix because the methodology and the evidence should stay distinguishable. The paper explains how the score is built. The validation appendix asks whether the frozen score behaves sensibly against observable outcomes.

The first validation run is encouraging, but limited. It supports BT-SFS as a transparent fundamentals score with construct and concurrent validity. It does not prove forecast skill, and it does not justify statistically re-estimating the weights yet.

| Check | What we found | How to read it |
| --- | --- | --- |
| Event-risk panel | 333 country-year rows; 21 positive rows across 8 event countries; AUC 0.898. | Lower scores line up with later listed stress events, but the event count is small. |
| Weakest-score quartile | 18 / 83 event rows, versus 3 / 250 outside that quartile; 18.1x lift. | This is the headline event-risk statistic, but rows are not independent defaults. |
| Temporal split | A 2016-2020 cutoff caught all 6 positive rows in the 2021-2022 test window, while flagging 25 / 90 test rows. | Useful screening-burden evidence, not a production classifier. |
| Spread cross-section | Publishable sample Spearman score/spread correlation -0.768; log-spread R² 0.576. | Stronger scores generally trade tighter, but this is one market-date cross-section. |
| Robustness preview | Leave-one-event-country-out AUC range 0.888-0.916; the composite beats every single pillar; country-level permutation p = 0.00005. | Rules out single-country dependence and chance under clustering, not vintage leakage. |
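The weakest-quartile lift is a ratio of event rates, and the reported counts are enough to reproduce it. The short check below uses only the figures from the table; the variable and function names are hypothetical.

```python
def event_rate(event_rows: int, total_rows: int) -> float:
    """Share of country-year rows that are positive event rows."""
    return event_rows / total_rows


weak_quartile_rate = event_rate(18, 83)    # weakest-score quartile
rest_rate = event_rate(3, 250)             # all other rows
lift = weak_quartile_rate / rest_rate

# The two groups should add back up to the full panel.
total_rows = 83 + 250
total_events = 18 + 3
```

The rates are roughly 21.7% and 1.2%, which gives the 18.1x lift, and the group totals match the 333 rows and 21 positive rows of the event-risk panel.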

A tier-exclusion cut of the spread fit (validation appendix) shows where that concurrent validity lives. Most of the explanatory power is between rating tiers. The distressed CCC cohort sits off the single global curve: excluding it lifts the log-fit R² from 0.58 to 0.69. The strong-fundamentals tail anchors the slope, which is an argument for adding high-grade anchor countries in a future release.

The full appendix is published alongside this paper: BT-SFS validation appendix v0.1.

Publication Standard

A public release should include the short paper, the technical appendix, score specification lock, country universe, source-traceable inputs, metric scores, pillar scores, base score, default adjustment, final score, source years, fallback flags, event registry, known source failures, provisional countries, and changelog.

Before launch, the public product also needs a rights and framing review for agency-rating references, EMB-holdings-derived universe construction, and language that could be mistaken for investment advice.

How To Improve It

This should be an open project, not a closed claim of authority. Good feedback should be concrete enough to test.

| Contribution | Example |
| --- | --- |
| Better source | A reliable public source for gross financing needs or short-term external debt. |
| Better transform | A clearer inflation-stress curve or reserve-adequacy rule. |
| Better weight | A proposed weight set plus evidence that rankings are more stable or intuitive. |
| Better default treatment | A cleaner event taxonomy, penalty size, or decay schedule. |
| Better coverage rule | A way to classify provisional countries more consistently. |
| Data correction | A wrong country mapping, stale source value, or questionable event flag. |
| Literature suggestion | A paper or methodology that should change the score design. |

The standard for change should be simple: does the proposal make the model more transparent, more reproducible, more stable, or more useful without pretending to be more precise than the data allow?

Current Release Exhibits

The exhibits below show how the current v0.4 dry run looks when the methodology is applied. They are deliberately placed after the methodology: the table, spread view, and history view are evidence and reader aids, not inputs into the score.

Snapshot source: BT-SFS v0.4 internal dry run, market date 2026-06-25. Spreads are selected near-5-duration USD sovereign bond observations. History is a current-vintage backcast, not a real-time historical model.

| Country | Status | BT-SFS | Rank | Spread | Selected bond | Flag |
| --- | --- | --- | --- | --- | --- | --- |
| United Arab Emirates | Publishable | 86.92 | 1 | 29 bps | UAE 2 31 | none |
| Oman | Publishable | 80.02 | 2 | 79 bps | OMAN 7.375 32 | none |
| Saudi Arabia | Publishable | 77.53 | 3 | 75 bps | KSA 2.75 32 | none |
| Uruguay | Publishable | 72.41 | 4 | 80 bps | URUGUA 5.75 34 | weak 5D proxy |
| Chile | Publishable | 68.89 | 5 | 65 bps | CHILE 2.55 32 | none |
| Brazil | Publishable | 48.37 | 29 | 157 bps | BRAZIL 5.5 33 | default-history review, no penalty |
| Turkiye | Publishable | 46.94 | 30 | 258 bps | TURKEY 9.375 33Y | none |
| Ghana | Publishable | 45.00 | 32 | 303 bps | GHANA 5 35 | default penalty -13.9 |
| Argentina | Publishable | 42.55 | 33 | 468 bps | ARGENT 4.125 35 | default penalty -11.2 |
| Ecuador | Publishable | 38.56 | 38 | 441 bps | ECUA 6.9 35 | default penalty -7.7 |
| Senegal | Provisional | 41.36 | n/a | 1,605 bps | SENEGL 6.25 33 | one missing metric |
| Bolivia | Publishable | 20.88 | 45 | 497 bps | BOLIVI 9.45 31 | weak 5D proxy |
| Ukraine | Publishable | 11.50 | 46 | 807 bps | UKRAINE 4.5 34 | default penalty -16.0 |

Release exhibit 1 - Selected score versus spread

[Figure: BT-SFS versus selected 5D USD spreads. Scatter plot comparing selected BT-SFS scores with near-five-duration USD sovereign spreads for the countries in the table above. X-axis: BT-SFS score, higher is stronger. Y-axis: selected 5D spread in bps, higher is wider. Senegal (1,605 bps) is clipped above 850 bps.]

The line is a simple guide through the selected examples, not a fair-value model. Spreads remain a market comparison layer and do not enter BT-SFS.

Release exhibit 2 - Current-vintage history examples

[Figure: Current-vintage BT-SFS backcast examples. Line chart showing annual current-vintage BT-SFS scores for selected countries from 2016 to 2026. Y-axis: BT-SFS score, higher is stronger. 2026 endpoints: UAE 86.92, Argentina 42.55, Brazil 48.37, Ghana 45.00, Turkiye 46.94.]

This chart applies the current v0.4 methodology to annual historical source series. It is useful for context, but it is not a point-in-time backtest.

Metric Profile Matrix

The profile is table-first rather than radar-first. Each cell shows the oriented metric score, then the raw value and source year. Higher score is better for every metric after transformation.

| Metric | Argentina | Brazil | Ghana | UAE | Turkiye |
| --- | --- | --- | --- | --- | --- |
| Debt/GDP | 33.3 (raw 70.4, IMF 2026) | 12.5 (raw 96.5, IMF 2026) | 66.7 (raw 53.0, IMF 2026) | 87.5 (raw 31.4, IMF 2026) | 95.8 (raw 25.5, IMF 2026) |
| Interest/revenue | 81.3 (raw 4.5, IMF 2026) | 35.4 (raw 18.1, IMF 2026) | 25.0 (raw 22.2, IMF 2026) | 89.6 (raw 2.7, IMF 2026) | 58.3 (raw 10.2, IMF 2026) |
| Primary balance/GDP | 85.4 (raw 1.9, IMF 2026) | 54.2 (raw -0.5, IMF 2026) | 81.3 (raw 1.5, IMF 2026) | 100 (raw 5.6, IMF 2026) | 58.3 (raw -0.3, IMF 2026) |
| Reserves months | 26.7 (raw 3.6, WB 2024) | 93.3 (raw 8.0, WB 2024) | 2.2 (raw 1.6, WB 2024) | 51.1 (raw 5.2, WB 2024) | 44.4 (raw 4.7, WB 2024) |
| Current account/GDP | 60.4 (raw -0.8, IMF 2026) | 35.4 (raw -2.7, IMF 2026) | 97.9 (raw 10.1, IMF 2026) | 100 (raw 11.4, IMF 2026) | 33.3 (raw -2.8, IMF 2026) |
| Inflation stress | 19.6 (raw 30.4, IMF 2026) | 100 (raw 4.0, IMF 2026) | 96.0 (raw 5.8, IMF 2026) | 100 (raw 2.5, IMF 2026) | 21.4 (raw 28.6, IMF 2026) |
| GDP growth | 57.3 (raw 3.5, IMF 2026) | 18.8 (raw 1.9, IMF 2026) | 89.6 (raw 4.8, IMF 2026) | 46.9 (raw 3.1, IMF 2026) | 54.2 (raw 3.4, IMF 2026) |
| GDP/capita | 60.4 (raw $14,357, IMF 2026) | 56.3 (raw $12,313, IMF 2026) | 14.6 (raw $3,314, IMF 2026) | 100 (raw $54,214, IMF 2026) | 72.9 (raw $19,018, IMF 2026) |
| Institutional quality | 60.4 (raw -0.13, WGI 2024) | 35.4 (raw -0.36, WGI 2024) | 68.8 (raw -0.03, WGI 2024) | 97.9 (raw 0.98, WGI 2024) | 25.0 (raw -0.49, WGI 2024) |

References

  1. IMF DataMapper API help
  2. IMF DataMapper general government debt/GDP example
  3. World Bank API developer documentation
  4. World Bank WDI reserves months example
  5. World Bank Worldwide Governance Indicators
  6. World Bank WGI Rule of Law API example
  7. Bank of Canada sovereign default database update
  8. Bank of Canada sovereign default database methodology report
  9. VanEck Investment Case for Emerging Markets Debt
  10. Cantor and Packer, sovereign credit ratings
  11. Eichengreen and Mody, emerging-market spreads
  12. Mauro, Sussman, and Yafeh, sovereign bond spreads history
  13. Afonso, Gomes, and Rother, sovereign rating determinants
  14. Hilscher and Nosbusch, determinants of sovereign risk
  15. Longstaff et al., sovereign credit risk
  16. Manasse, Roubini, and Schimmelpfennig, sovereign debt crises
  17. Kraay and Nehru, external debt sustainability
  18. Hellwig, machine-learning fiscal-crisis prediction
  19. IMF Vulnerability Exercise approach using machine learning
  20. Reinhart, Rogoff, and Savastano, debt intolerance
  21. FTSE Russell / LSEG Sovereign Risk Monitor methodology
  22. S&P sovereign rating methodology
  23. Fitch sovereign rating criteria
  24. Moody's sovereign methodology overview