The Question
Sovereign investors usually look at countries through market data, credit views, and country data. Those lenses overlap, but they are not the same.
A country can trade at a high spread because fundamentals are weak. It can also trade wide because of liquidity, investor positioning, recent restructuring history, election risk, or a global selloff.
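BT-SFS tries to answer one narrower question: what do public fundamentals alone say about a sovereign, before any market price is consulted? Market comparison is a later use case, not part of score construction.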
Design Principles
These rules keep BT-SFS open, auditable, and separate from the market prices it is later compared against.
| Rule | Meaning |
|---|---|
| Open enough to audit | Inputs, source years, weights, transforms, and penalties should be visible in each release. |
| Parsimonious | Use a small set of stable indicators rather than a large opaque model. |
| Non-circular | Do not use market spreads, CDS, or agency ratings as score inputs. |
| Directionally intuitive | Every metric should have a clear better direction before scoring. |
| Release-relative | Scores rank countries against the same frozen peer set. |
| Honest about gaps | Missing or weak data should be labeled, not silently neutral-filled. |
Source Data
BT-SFS uses public-source data where possible and a published default/restructuring event registry where public datasets lag recent events.
| Source family | Main use | Typical cadence |
|---|---|---|
| IMF DataMapper / WEO / Fiscal Monitor | Debt, fiscal balances, revenue, current account, inflation, growth, GDP/capita | IMF release cycle; current-year values may be estimates or projections |
| World Bank World Development Indicators | Reserves months, external-debt diagnostics, fallbacks, country crosswalks | Annual or rolling |
| World Bank Worldwide Governance Indicators | Rule of Law, Government Effectiveness, Control of Corruption | Annual |
| Bank of Canada / Bank of England default database | Long-run default-history support | Annual |
| BT default/restructuring event registry | Recent source-linked default and restructuring overlay | Manual, release-gated |
The score blends source dates. Current IMF fields can be estimates or projections, while WDI and WGI observations often lag. Public releases should show the source year for every metric.
Metrics And Weights
The headline score uses nine required metrics plus a separate default/restructuring adjustment.
Exhibit A - Metric and pillar weights
| Metric | Pillar | Weight | Better direction |
|---|---|---|---|
| General government debt/GDP | Fiscal | 10.50% | Lower |
| Interest expense/revenue | Fiscal | 10.50% | Lower |
| Primary balance/GDP | Fiscal | 9.00% | Higher |
| International reserves, months of imports | External | 14.25% | Higher |
| Current account/GDP | External | 10.75% | Higher |
| Inflation stress | Macro | 8.00% | Lower stress |
| Real GDP growth | Macro | 6.00% | Higher |
| GDP/capita, current USD | Macro | 6.00% | Higher |
| Institutional quality | Institutions | 25.00% | Higher |
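The weights in Exhibit A can be checked mechanically. The sketch below copies the table rows and sums them by pillar. The short metric keys are illustrative names chosen for this example, not BT-SFS identifiers.

```python
from collections import defaultdict

# Exhibit A rows as (metric, pillar, weight in percent).
# The metric keys are illustrative, not official BT-SFS field names.
WEIGHTS = [
    ("debt_gdp", "Fiscal", 10.50),
    ("interest_revenue", "Fiscal", 10.50),
    ("primary_balance_gdp", "Fiscal", 9.00),
    ("reserves_months", "External", 14.25),
    ("current_account_gdp", "External", 10.75),
    ("inflation_stress", "Macro", 8.00),
    ("real_gdp_growth", "Macro", 6.00),
    ("gdp_per_capita", "Macro", 6.00),
    ("institutional_quality", "Institutions", 25.00),
]


def pillar_totals(rows):
    """Sum metric weights by pillar."""
    totals = defaultdict(float)
    for _metric, pillar, weight in rows:
        totals[pillar] += weight
    return dict(totals)


totals = pillar_totals(WEIGHTS)
print(totals)                # Fiscal 30, External 25, Macro 20, Institutions 25
print(sum(totals.values()))  # 100.0
```

Read this way, the nine metrics roll up to Fiscal 30%, External 25%, Macro 20%, and Institutions 25%.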
External debt remains diagnostic for now. It is important, but current open-source coverage is not clean enough across the whole tradable country set to make it a required input.
Institutional quality is a single composite: the simple average of the World Bank WGI Rule of Law, Government Effectiveness, and Control of Corruption estimates, scored as one metric. WGI is perception-based and typically lags by about two years, so the institutions pillar is the slowest-moving quarter of the score and can carry a residue of past market sentiment.
Transforming Raw Data
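Every metric is converted into an oriented score where 100 is better and 0 is worse. Ordinary metrics use release-relative percentile ranks: each country is ranked against the frozen peer set for that release, and the rank is flipped for metrics where a lower raw value is better.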
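A minimal sketch of a release-relative percentile rank follows. The paper does not spell out the exact convention, so two things here are assumptions: ties share their average rank, and ranks are scaled by (rank - 1) / (n - 1) so the best country scores 100 and the worst scores 0. The country labels and raw values are hypothetical.

```python
def percentile_scores(raw, higher_is_better=True):
    """Map raw values to 0-100 scores by rank within one peer set.

    Assumed convention (not confirmed by the paper): ties share their
    average rank, and the scale is (rank - 1) / (n - 1).
    """
    names = list(raw)
    n = len(names)
    if n < 2:
        raise ValueError("need at least two countries to rank")
    # Orient so that a larger oriented value is always better.
    oriented = {k: (v if higher_is_better else -v) for k, v in raw.items()}
    scores = {}
    for k in names:
        below = sum(1 for j in names if oriented[j] < oriented[k])
        equal = sum(1 for j in names if oriented[j] == oriented[k])
        avg_rank = below + (equal + 1) / 2  # 1-based average rank
        scores[k] = 100.0 * (avg_rank - 1) / (n - 1)
    return scores


# Hypothetical debt/GDP ratios; lower is better for this metric.
debt = {"A": 30.0, "B": 60.0, "C": 90.0}
print(percentile_scores(debt, higher_is_better=False))
```

Because the rank depends only on the peer set, the same raw value can score differently in two releases if the frozen universe changes.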
Inflation is handled differently. Low and stable inflation is good, but deflation and very high inflation are both credit-relevant stresses, so the score uses fixed bands rather than a percentile rank.
| Inflation range | Score treatment |
|---|---|
| 0% to 5% | Full score of 100 |
| 5% to 10% | Linear decline from 100 to 75 |
| 10% to 25% | Linear decline from 75 to 25 |
| Above 25% | 25 minus 1 point per point of inflation above 25%, floor at 0 |
| Below 0% | 100 minus 10 points per point of deflation, floor at 0 |
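The bands above translate directly into a piecewise function. This is a sketch of the table as written, with the input taken as annual inflation in percent; the function name is illustrative.

```python
def inflation_stress_score(inflation_pct):
    """Piecewise inflation score following the table bands (input in percent)."""
    x = inflation_pct
    if x < 0:
        # 10 points lost per point of deflation, floored at 0.
        return max(0.0, 100.0 + 10.0 * x)
    if x <= 5:
        return 100.0
    if x <= 10:
        # Linear decline from 100 at 5% to 75 at 10%.
        return 100.0 - (x - 5) * (25.0 / 5.0)
    if x <= 25:
        # Linear decline from 75 at 10% to 25 at 25%.
        return 75.0 - (x - 10) * (50.0 / 15.0)
    # 1 point lost per point of inflation above 25%, floored at 0.
    return max(0.0, 25.0 - (x - 25))


for pct in (2.5, 5.8, 28.6, 30.4, -3.0):
    print(pct, round(inflation_stress_score(pct), 1))
```

The bands join continuously at 5%, 10%, and 25%, so the shared endpoints in the table do not create ambiguity. The inflation rows of the Metric Profile Matrix later in this paper (for example 30.4 scoring 19.6 and 5.8 scoring 96.0) are consistent with this reading.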
Aggregation And Default History
Exhibit B - Score construction
The default/restructuring adjustment is separate because recent default history is not just another macro variable. It captures willingness to pay, market-access track record, and unresolved restructuring complexity.
The adjustment should be source-linked, time-decayed, capped, visible to users, and included in the public release package. The v0.4 parameters are:
The overlay interacts with the metrics by design: a restructuring can improve measured debt ratios while the penalty offsets it, and the offset fades on the published decay schedule rather than on new behavior.
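The aggregation step can be sketched as a weighted sum of the nine metric scores, with the default/restructuring adjustment subtracted in score points. The paper does not state this as a formula, so treat the subtraction as an assumption; it is consistent with the rounded values published for Ghana and the UAE in the Metric Profile Matrix and the release table later in this paper. The penalty is taken as an input here, and the metric keys are illustrative.

```python
# Weights in percent, from Exhibit A. Keys are illustrative names.
WEIGHTS = {
    "debt_gdp": 10.50, "interest_revenue": 10.50, "primary_balance_gdp": 9.00,
    "reserves_months": 14.25, "current_account_gdp": 10.75,
    "inflation_stress": 8.00, "real_gdp_growth": 6.00, "gdp_per_capita": 6.00,
    "institutional_quality": 25.00,
}


def final_score(metric_scores, default_penalty=0.0):
    """Weighted base score minus the default adjustment (assumed additive)."""
    missing = sorted(set(WEIGHTS) - set(metric_scores))
    if missing:
        # Required metrics are never neutral-filled in this sketch.
        raise ValueError(f"missing required metrics: {missing}")
    base = sum(WEIGHTS[m] * metric_scores[m] for m in WEIGHTS) / 100.0
    return base - default_penalty


# Oriented metric scores as printed (rounded) in the Metric Profile Matrix.
GHANA = {
    "debt_gdp": 66.7, "interest_revenue": 25.0, "primary_balance_gdp": 81.3,
    "reserves_months": 2.2, "current_account_gdp": 97.9,
    "inflation_stress": 96.0, "real_gdp_growth": 89.6, "gdp_per_capita": 14.6,
    "institutional_quality": 68.8,
}
UAE = {
    "debt_gdp": 87.5, "interest_revenue": 89.6, "primary_balance_gdp": 100.0,
    "reserves_months": 51.1, "current_account_gdp": 100.0,
    "inflation_stress": 100.0, "real_gdp_growth": 46.9, "gdp_per_capita": 100.0,
    "institutional_quality": 97.9,
}

print(round(final_score(GHANA, default_penalty=13.9), 2))  # published: 45.00
print(round(final_score(UAE), 2))                          # published: 86.92
```

Small differences from the published scores are expected because the matrix shows metric scores rounded to one decimal.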
Missing Data
| Status | Meaning |
|---|---|
| Publishable | All required headline metrics are present and reviewed for the release. |
| Provisional | A required metric is missing, stale, or under review. The country can be shown, but comparisons need caution. |
| Distressed extension | The country has BT USD sovereign bonds, but the representative instruments are defaulted or non-performing. Fundamentals can be shown; ordinary spread comparisons should be excluded. |
| Excluded | The country lacks enough source coverage to support a meaningful score. |
Required metrics should not be neutral-filled in the public score. A neutral fill can be useful internally for diagnostics, but it makes public comparisons look more precise than they are.
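A minimal sketch of the labeling rule, covering only the Publishable and Provisional statuses. The Distressed extension and Excluded statuses depend on instrument status and coverage judgments that the paper does not reduce to a rule, and the staleness test is omitted because no threshold is given. The function name, metric keys, and example values are hypothetical.

```python
REQUIRED_METRICS = (
    "debt_gdp", "interest_revenue", "primary_balance_gdp",
    "reserves_months", "current_account_gdp", "inflation_stress",
    "real_gdp_growth", "gdp_per_capita", "institutional_quality",
)


def headline_status(metric_values, under_review=()):
    """Return (status, problems) without filling any gap.

    A metric counts as a problem if it is absent, None, or under review.
    """
    problems = [
        m for m in REQUIRED_METRICS
        if metric_values.get(m) is None or m in under_review
    ]
    status = "Publishable" if not problems else "Provisional"
    return status, problems


# Hypothetical country with every required metric except reserves months.
values = {m: 1.0 for m in REQUIRED_METRICS}
values["reserves_months"] = None
print(headline_status(values))
```

Returning the list of problem metrics alongside the status keeps the gap visible to the reader instead of hiding it behind a filled value.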
Why Ratings And Spreads Stay Out
Agency ratings and market spreads are useful. They are also exactly the kinds of things we may want to compare against the score.
| Layer | Role |
|---|---|
| Fundamentals score | Open public-data model. |
| Agency ratings | Reference fields only, not score inputs. |
| Market spreads/yields | BondTerminal market layer used for comparison, not score construction. |
One practical use case is a chart comparing BT-SFS with selected USD sovereign bond spreads. That chart is a product and research surface. It is not the methodology itself.
Research Context
BT-SFS reflects several inputs at once: academic work on ratings, spreads, and debt distress; public rating-agency methodology; public scorecard examples; open-data constraints; and our own product requirement that fundamentals stay separate from market pricing. No single external model is the template.
| Source family | What it contributes | BT-SFS decision |
|---|---|---|
| Academic ratings literature | Shows that compact macro, income, inflation, external, and default-history variables explain much of sovereign rating variation. | Use a parsimonious fundamentals vector, but do not try to reproduce agency ratings. |
| Sovereign-spread literature | Shows that country fundamentals matter, but global risk and market conditions also matter. | Keep spreads outside the score and use them later as a comparison layer. |
| Debt-distress and debt-sustainability literature | Highlights debt burden, liquidity, policy quality, institutions, and crisis-warning variables. | Treat the score as an explainable scorecard, not a default-probability model. |
| Public rating-agency methodologies | Provide a common vocabulary around institutions, economy, external position, fiscal strength, monetary flexibility, and event risk. | Use similar broad categories with published weights and source rules. |
| Public investor frameworks, including VanEck's emerging-market debt materials | Show how radar profiles, standardized scores, and fundamentals-versus-market charts can communicate sovereign risk. | Borrow the communication idea, not the proprietary model or internal weights. |
| BondTerminal reasoning and data constraints | Require open sources, reproducible updates, and a clean separation between fundamentals and our market layer. | Prefer fewer auditable metrics over a larger opaque or hard-to-redistribute model. |
Academic work on sovereign ratings and spreads supports a compact fundamentals vector. Cantor and Packer show that macro, income, inflation, external, and default-history variables explain much of sovereign rating variation. Eichengreen and Mody, Hilscher and Nosbusch, Longstaff et al., and others show that spreads reflect both country fundamentals and global risk conditions. Mauro, Sussman, and Yafeh add a longer historical comparison across the 1870-1913 and modern globalization periods, which is useful because fiscal variables, political events, country-specific developments, and global co-movement can all matter.
Public rating-agency methodologies from S&P, Fitch, and Moody's use broad categories such as institutions, economy, external position, fiscal strength, monetary flexibility, and event risk. BT-SFS uses similar categories, but with published weights and source rules.
| Tradition | What it contributes | What BT-SFS does differently |
|---|---|---|
| Rating-replication literature | Shows that compact macro and default-history variables explain much of sovereign ratings. | Does not try to reproduce agency ratings or issue a rating. |
| Debt-distress prediction literature | Highlights solvency, liquidity, debt burden, policy quality, and crisis-warning variables. | Does not estimate default probability. |
| Sovereign-spread literature | Shows that fundamentals matter, but global risk and market conditions also matter. | Keeps market spreads outside the score to avoid circularity. |
| Transparent scorecard methodology | Shows how public score frameworks can communicate country risk drivers. | Publishes weights, source vintages, and missing-data rules for debate. |
That boundary is important. BT-SFS is an inspectable fundamentals score that can later be compared with ratings, spreads, and default outcomes. It is not a rating model, a default model, or a pricing model.
Limitations
| Limitation | Why it matters |
|---|---|
| Relative ranking | A country can improve but still fall if peers improve more. |
| Mixed source years | Forecast IMF data and observed WDI/WGI data can live in the same release. |
| Judgmental weights | The v0.x weights are defensible assumptions, not statistically proven truths. |
| Indicator overlap | Fiscal, external, macro, and institutional variables are not fully independent. |
| Data coverage | Some important metrics are hard to source consistently. |
| Default penalty judgment | Event classification and decay rules require transparent but imperfect judgment. |
| Market comparison limits | Any spread chart depends on bond selection, liquidity, duration, and curve assumptions. |
| Distressed-end curvature | A single log-linear spread fit understates curvature for post-restructuring credits, whose spreads price recovery mechanics — see the tier-exclusion diagnostic in the validation appendix. |
| Within-tier discrimination | Inside one rating tier (single-B in the current cross-section) the score explains little of the remaining spread variation; BT-SFS ranks across the quality spectrum, not within tiers. |
BT-SFS measures observable macro, fiscal, external, institutional, and default-history fundamentals. It does not fully capture policy credibility, fiscal accounting quality, FX-regime fragility, currency mismatch, political transition risk, or market technicals. The external pillar measures flow adequacy only: reserves in months of imports cover trade needs, not rollover risk against short-term external debt, which stays diagnostic until open-source coverage improves. Those omitted forces may appear later as residuals between the fundamentals score and market spreads.
BT-SFS is not a sovereign rating, default-probability model, recovery model, investment recommendation, or fair-value model for spreads.
Historical Backcast Policy
Historical series should be treated as current-vintage backcasts: they apply the current methodology to historical source series. They do not show what the model would have known in real time using only data available in each year.
| History range | Public treatment | Coverage finding |
|---|---|---|
| 2007-2026 | Default public range | Stable enough for normal display; recurring gaps are concentrated in reserves months for Benin, Cote d'Ivoire, Senegal, and UAE through 2022. |
| 2002-2006 | Optional extended range with visible warning | Usable, but reserves gaps add Serbia and Suriname in early years. |
| 1997-2001 | Internal/research only | Debt/GDP, interest/revenue, primary balance, and reserves coverage begin to break for too many countries. |
| 1992-1996 | Do not publish under the current methodology | WGI institutions are missing before 1996 and fiscal data are patchy. |
The public chart should show what exists without inventing missing values: complete years can be drawn as solid lines, provisional years as faint or dotted lines, and missing years as gaps. Tooltips should show source coverage and the specific missing metric.
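The range policy and the drawing rule can be combined in a small lookup, sketched below. The year ranges and treatments are copied from the table. The status labels (`complete`, `provisional`, `missing`) and the choice of `dotted` over `faint` are illustrative, and the example years are hypothetical.

```python
# (first year, last year, public treatment), copied from the policy table.
HISTORY_POLICY = [
    (2007, 2026, "Default public range"),
    (2002, 2006, "Optional extended range with visible warning"),
    (1997, 2001, "Internal/research only"),
    (1992, 1996, "Do not publish under the current methodology"),
]

# Drawing rule from the text: never invent a value for a missing year.
LINE_STYLE = {"complete": "solid", "provisional": "dotted", "missing": "gap"}


def public_treatment(year):
    """Return the table's treatment for a year, or None outside all ranges."""
    for start, end, treatment in HISTORY_POLICY:
        if start <= year <= end:
            return treatment
    return None


def chart_points(year_status):
    """Attach a treatment and a line style to each year, in year order."""
    return [
        (year, public_treatment(year), LINE_STYLE[status])
        for year, status in sorted(year_status.items())
    ]


# Hypothetical coverage for one country.
for row in chart_points({2005: "provisional", 2006: "missing", 2007: "complete"}):
    print(row)
```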
First Validation Results
BT-SFS has a separate validation appendix because the methodology and the evidence should stay distinguishable. The paper explains how the score is built. The validation appendix asks whether the frozen score behaves sensibly against observable outcomes.
The first validation run is encouraging, but limited. It supports BT-SFS as a transparent fundamentals score with construct and concurrent validity. It does not prove forecast skill, and it does not justify statistically re-estimating the weights yet.
| Check | What we found | How to read it |
|---|---|---|
| Event-risk panel | 333 country-year rows; 21 positive rows across 8 event countries; AUC 0.898. | Lower scores line up with later listed stress events, but the event count is small. |
| Weakest-score quartile | 18 of the 83 rows in the quartile are event rows, versus 3 of the 250 rows outside it; 18.1x lift. | This is the headline event-risk statistic, but rows are not independent defaults. |
| Temporal split | A score cutoff set on the 2016-2020 window caught all 6 positive rows in the 2021-2022 test window, while flagging 25 of 90 test rows. | Useful screening-burden evidence, not a production classifier. |
| Spread cross-section | Publishable sample Spearman score/spread correlation -0.768; log-spread R² 0.576. | Stronger scores generally trade tighter, but this is one market-date cross-section. |
| Robustness preview | Leave-one-event-country-out AUC range 0.888-0.916; the composite beats every single pillar; country-level permutation p = 0.00005. | Rules out single-country dependence and chance under clustering, not vintage leakage. |
A tier-exclusion cut of the spread fit (validation appendix) shows where that concurrent validity lives. Most of the explanatory power is between rating tiers. The distressed CCC cohort sits off the single global curve: excluding it lifts the log-fit R² from 0.58 to 0.69. The strong-fundamentals tail anchors the slope, which is an argument for adding high-grade anchor countries in a future release.
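The quartile and temporal-split figures in the results table are simple ratios, reproduced here as a worked check. The precision line assumes the 6 caught rows are among the 25 flagged rows, which the table wording implies but does not state.

```python
def event_rate(events, rows):
    """Share of country-year rows that are positive event rows."""
    return events / rows


# Weakest-score quartile versus the rest of the event-risk panel.
weak_rate = event_rate(18, 83)
rest_rate = event_rate(3, 250)
lift = weak_rate / rest_rate

# Temporal split on the 2021-2022 test window.
recall = 6 / 6       # all positive test rows were caught
flag_rate = 25 / 90  # screening burden: share of test rows flagged
precision = 6 / 25   # assumes the 6 caught rows are among the 25 flagged

print(f"weak-quartile rate {weak_rate:.1%}, rest {rest_rate:.1%}, lift {lift:.1f}x")
print(f"recall {recall:.0%}, flagged {flag_rate:.1%}, precision {precision:.0%}")
```

The two groups add back to the full panel (83 + 250 = 333 rows, 18 + 3 = 21 positive rows). The lift is large partly because the base rate outside the weakest quartile is only about 1.2%.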
The full appendix is published alongside this paper: BT-SFS validation appendix v0.1.
Publication Standard
A public release should include the short paper, the technical appendix, score specification lock, country universe, source-traceable inputs, metric scores, pillar scores, base score, default adjustment, final score, source years, fallback flags, event registry, known source failures, provisional countries, and changelog.
Before launch, the public product also needs a rights and framing review for agency-rating references, EMB-holdings-derived universe construction, and language that could be mistaken for investment advice.
How To Improve It
This should be an open project, not a closed claim of authority. Good feedback should be concrete enough to test.
| Contribution | Example |
|---|---|
| Better source | A reliable public source for gross financing needs or short-term external debt. |
| Better transform | A clearer inflation-stress curve or reserve-adequacy rule. |
| Better weight | A proposed weight set plus evidence that rankings are more stable or intuitive. |
| Better default treatment | A cleaner event taxonomy, penalty size, or decay schedule. |
| Better coverage rule | A way to classify provisional countries more consistently. |
| Data correction | A wrong country mapping, stale source value, or questionable event flag. |
| Literature suggestion | A paper or methodology that should change the score design. |
The standard for change should be simple: does the proposal make the model more transparent, more reproducible, more stable, or more useful without pretending to be more precise than the data allow?
Current Release Exhibits
The exhibits below show how the current v0.4 dry run looks when the methodology is applied. They are deliberately placed after the methodology: the table, spread view, and history view are evidence and reader aids, not inputs into the score.
Snapshot source: BT-SFS v0.4 internal dry run, market date 2026-06-25. Spreads are selected near-5-duration USD sovereign bond observations. History is a current-vintage backcast, not a real-time historical model.
| Country | Status | BT-SFS | Rank | Spread | Selected bond | Flag |
|---|---|---|---|---|---|---|
| United Arab Emirates | Publishable | 86.92 | 1 | 29 bps | UAE 2 31 | none |
| Oman | Publishable | 80.02 | 2 | 79 bps | OMAN 7.375 32 | none |
| Saudi Arabia | Publishable | 77.53 | 3 | 75 bps | KSA 2.75 32 | none |
| Uruguay | Publishable | 72.41 | 4 | 80 bps | URUGUA 5.75 34 | weak 5D proxy |
| Chile | Publishable | 68.89 | 5 | 65 bps | CHILE 2.55 32 | none |
| Brazil | Publishable | 48.37 | 29 | 157 bps | BRAZIL 5.5 33 | default-history review, no penalty |
| Turkiye | Publishable | 46.94 | 30 | 258 bps | TURKEY 9.375 33Y | none |
| Ghana | Publishable | 45.00 | 32 | 303 bps | GHANA 5 35 | default penalty -13.9 |
| Argentina | Publishable | 42.55 | 33 | 468 bps | ARGENT 4.125 35 | default penalty -11.2 |
| Ecuador | Publishable | 38.56 | 38 | 441 bps | ECUA 6.9 35 | default penalty -7.7 |
| Senegal | Provisional | 41.36 | n/a | 1,605 bps | SENEGL 6.25 33 | one missing metric |
| Bolivia | Publishable | 20.88 | 45 | 497 bps | BOLIVI 9.45 31 | weak 5D proxy |
| Ukraine | Publishable | 11.50 | 46 | 807 bps | UKRAINE 4.5 34 | default penalty -16.0 |
Release exhibit 1 - Selected score versus spread
The line is a simple guide through the selected examples, not a fair-value model. Spreads remain a market comparison layer and do not enter BT-SFS.
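As a sketch of the comparison layer, the snippet below computes a Spearman rank correlation between score and spread for the twelve publishable rows in the snapshot table above. Senegal is left out because it is provisional. This is a hand-picked subset, so the result is not comparable to the publishable-sample statistic reported in the validation results. The helper functions are written out in plain Python for illustration.

```python
# (country, BT-SFS score, spread in bps) for the publishable snapshot rows.
ROWS = [
    ("United Arab Emirates", 86.92, 29), ("Oman", 80.02, 79),
    ("Saudi Arabia", 77.53, 75), ("Uruguay", 72.41, 80),
    ("Chile", 68.89, 65), ("Brazil", 48.37, 157),
    ("Turkiye", 46.94, 258), ("Ghana", 45.00, 303),
    ("Argentina", 42.55, 468), ("Ecuador", 38.56, 441),
    ("Bolivia", 20.88, 497), ("Ukraine", 11.50, 807),
]


def average_ranks(values):
    """1-based ranks in ascending order; ties share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


scores = [r[1] for r in ROWS]
spreads = [r[2] for r in ROWS]
rho = spearman(scores, spreads)
print(round(rho, 3))
```

The correlation is strongly negative for this subset, roughly -0.94, which reflects how the examples were selected across the quality spectrum as much as the score itself.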
Release exhibit 2 - Current-vintage history examples
This chart applies the current v0.4 methodology to annual historical source series. It is useful for context, but it is not a point-in-time backtest.
Metric Profile Matrix
The profile is table-first rather than radar-first. Each cell shows the oriented metric score, then the raw value and source year. Higher score is better for every metric after transformation.
| Metric | Argentina | Brazil | Ghana | UAE | Turkiye |
|---|---|---|---|---|---|
| Debt/GDP | 33.3 (raw 70.4 · IMF 2026) | 12.5 (raw 96.5 · IMF 2026) | 66.7 (raw 53.0 · IMF 2026) | 87.5 (raw 31.4 · IMF 2026) | 95.8 (raw 25.5 · IMF 2026) |
| Interest/revenue | 81.3 (raw 4.5 · IMF 2026) | 35.4 (raw 18.1 · IMF 2026) | 25.0 (raw 22.2 · IMF 2026) | 89.6 (raw 2.7 · IMF 2026) | 58.3 (raw 10.2 · IMF 2026) |
| Primary balance/GDP | 85.4 (raw 1.9 · IMF 2026) | 54.2 (raw -0.5 · IMF 2026) | 81.3 (raw 1.5 · IMF 2026) | 100 (raw 5.6 · IMF 2026) | 58.3 (raw -0.3 · IMF 2026) |
| Reserves months | 26.7 (raw 3.6 · WB 2024) | 93.3 (raw 8.0 · WB 2024) | 2.2 (raw 1.6 · WB 2024) | 51.1 (raw 5.2 · WB 2024) | 44.4 (raw 4.7 · WB 2024) |
| Current account/GDP | 60.4 (raw -0.8 · IMF 2026) | 35.4 (raw -2.7 · IMF 2026) | 97.9 (raw 10.1 · IMF 2026) | 100 (raw 11.4 · IMF 2026) | 33.3 (raw -2.8 · IMF 2026) |
| Inflation stress | 19.6 (raw 30.4 · IMF 2026) | 100 (raw 4.0 · IMF 2026) | 96.0 (raw 5.8 · IMF 2026) | 100 (raw 2.5 · IMF 2026) | 21.4 (raw 28.6 · IMF 2026) |
| GDP growth | 57.3 (raw 3.5 · IMF 2026) | 18.8 (raw 1.9 · IMF 2026) | 89.6 (raw 4.8 · IMF 2026) | 46.9 (raw 3.1 · IMF 2026) | 54.2 (raw 3.4 · IMF 2026) |
| GDP/capita | 60.4 (raw $14,357 · IMF 2026) | 56.3 (raw $12,313 · IMF 2026) | 14.6 (raw $3,314 · IMF 2026) | 100 (raw $54,214 · IMF 2026) | 72.9 (raw $19,018 · IMF 2026) |
| Institutional quality | 60.4 (raw -0.13 · WGI 2024) | 35.4 (raw -0.36 · WGI 2024) | 68.8 (raw -0.03 · WGI 2024) | 97.9 (raw 0.98 · WGI 2024) | 25.0 (raw -0.49 · WGI 2024) |
References
- IMF DataMapper API help
- IMF DataMapper general government debt/GDP example
- World Bank API developer documentation
- World Bank WDI reserves months example
- World Bank Worldwide Governance Indicators
- World Bank WGI Rule of Law API example
- Bank of Canada sovereign default database update
- Bank of Canada sovereign default database methodology report
- VanEck Investment Case for Emerging Markets Debt
- Cantor and Packer, sovereign credit ratings
- Eichengreen and Mody, emerging-market spreads
- Mauro, Sussman, and Yafeh, sovereign bond spreads history
- Afonso, Gomes, and Rother, sovereign rating determinants
- Hilscher and Nosbusch, determinants of sovereign risk
- Longstaff et al., sovereign credit risk
- Manasse, Roubini, and Schimmelpfennig, sovereign debt crises
- Kraay and Nehru, external debt sustainability
- Hellwig, machine-learning fiscal-crisis prediction
- IMF Vulnerability Exercise approach using machine learning
- Reinhart, Rogoff, and Savastano, debt intolerance
- FTSE Russell / LSEG Sovereign Risk Monitor methodology
- S&P sovereign rating methodology
- Fitch sovereign rating criteria
- Moody's sovereign methodology overview