VT and Chill is Costing You Money – The Statistical Evidence

Since 1970, over a 56-year span, investors who simply sliced global equities into regional funds and applied buy-and-hold, annual rebalancing, or 5/25 threshold rebalancing historically boosted annualized returns by up to 1.1 percentage points while actually reducing volatility. In case you are wondering: over a 30-year career, an additional 1% per year can boost your final retirement nest egg by roughly 35%.
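The 35% figure comes from simple compounding on a lump sum. Here is a minimal sketch of that arithmetic. The function name is hypothetical, and the 9.36% baseline is an illustrative assumption borrowed from the VT return reported in the calibration test.

```python
def terminal_wealth_ratio(base_return: float, extra_return: float, years: int) -> float:
    """Ratio of ending wealth with and without the extra annual return (lump sum)."""
    return ((1 + base_return + extra_return) / (1 + base_return)) ** years

# Rule-of-thumb version: the extra 1% compounds on its own for 30 years.
simple_boost = 1.01 ** 30 - 1

# Same idea measured against an assumed 9.36% baseline return.
exact_boost = terminal_wealth_ratio(0.0936, 0.01, 30) - 1

print(f"rule of thumb: {simple_boost:.1%}, against 9.36% baseline: {exact_boost:.1%}")
```

The rule of thumb gives about 35%. Measured against a 9.36% baseline, the lump-sum boost is a little over 31%, and it would be smaller still for money contributed gradually over a career, because later contributions compound for fewer years.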

Total market funds like VT continuously track market weights: their holdings are driven by float-adjusted capitalization. This means VT continuously funnels money into expanding equity supply and corporate share issuance, which creates a structural drag. By contrast, a simple buy-and-hold regional slice portfolio updates its weights purely through investment returns, acting as a passive momentum engine that lets winners run. When you add a disciplined rebalancing schedule, you overlay a contrarian, value-investing mechanic that systematically buys low and sells high. Ultimately, passive portfolio construction isn’t just about what assets you own; it’s about choosing which structural engine dictates how your wealth compounds.
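To make the three weight-update engines concrete, here is a minimal one-year sketch. It assumes the common reading of the 5/25 rule (rebalance when a slice drifts from its target by more than 5 percentage points or 25% of the target, whichever is smaller). The targets, returns, and function names are hypothetical, and a real threshold strategy would typically check the bands more often than once a year.

```python
def drift(weights, returns):
    """Buy-and-hold: weights change only through each slice's own return."""
    grown = [w * (1 + r) for w, r in zip(weights, returns)]
    total = sum(grown)
    return [g / total for g in grown]

def breaches_5_25(weight, target):
    """Assumed 5/25 band: 5 points absolute or 25% of target, whichever is smaller."""
    band = min(0.05, 0.25 * target)
    return abs(weight - target) > band

def step(weights, targets, returns, rule):
    """Advance one year under 'hold', 'annual', or '5/25'."""
    drifted = drift(weights, returns)
    if rule == "annual":
        return list(targets)  # always reset to targets
    if rule == "5/25" and any(breaches_5_25(w, t) for w, t in zip(drifted, targets)):
        return list(targets)  # reset only when a band is breached
    return drifted            # otherwise let winners run

targets = [0.50, 0.30, 0.20]   # hypothetical regional targets
returns = [0.30, -0.10, 0.05]  # hypothetical one-year returns
for rule in ("hold", "annual", "5/25"):
    print(rule, [round(w, 3) for w in step(targets, targets, returns, rule)])
```

In this example the first slice drifts from 50% to roughly 57.5%, so buy-and-hold keeps the larger position while both rebalancing rules sell it back down to target.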


The story so far looks good, but are the observed performance differences real, or are they artifacts of the data?

Data Calibration Test

Before comparing portfolio outcomes, we first verified that the data accurately reproduces a global market-cap weighted portfolio.

To test this, we compared:

  • P1: a VT-style continuous market-weight portfolio
  • An annually market-weight rebalanced VTSAX + VTIAX portfolio

From 1970 through 2025:

Portfolio                                      | CAGR
VT (P1)                                        | 9.3604%
Annual Market Weight Rebalance (VTSAX + VTIAX) | 9.4842%

The difference: +0.1238% per year. Correlation: 99.7562%

The correlation is effectively perfect. The small return difference is expected because annual rebalancing only approximates the continuous market-weight adjustments that occur naturally inside VT.
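For readers who want to repeat this kind of check, both statistics can be computed from two return series with a few lines of code. This is a minimal sketch on made-up numbers. The article does not say whether the correlation was measured on annual or monthly returns, so the annual series here is an assumption, as are the function names.

```python
import math

def cagr(annual_returns):
    """Compound annual growth rate from a list of yearly returns."""
    growth = math.prod(1 + r for r in annual_returns)
    return growth ** (1 / len(annual_returns)) - 1

def correlation(xs, ys):
    """Pearson correlation between two equally long return series."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)

# Hypothetical yearly returns, not the article's data.
p1 = [0.10, -0.05, 0.20, 0.07]
proxy = [0.11, -0.04, 0.19, 0.08]

print(f"CAGR difference: {cagr(proxy) - cagr(p1):+.4%}")
print(f"Correlation: {correlation(p1, proxy):.4%}")
```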

This confirms that the historical data used by our portfolio construction methodology is consistent. The performance differences observed later are therefore unlikely to be data artifacts.

We then evaluated every available 20-year rolling period between 1970 and 2026 (end of April). A total of 38 overlapping rolling windows were examined.
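The window count follows directly from the length of the sample. The sketch below assumes 57 annual observations (1970 through the partial year 2026) and uses flat, made-up returns only to show the mechanics. The function names are hypothetical.

```python
import math

def cagr(annual_returns):
    """Compound annual growth rate from a list of yearly returns."""
    return math.prod(1 + r for r in annual_returns) ** (1 / len(annual_returns)) - 1

def rolling_excess_cagr(portfolio, benchmark, window):
    """Annualized return gap for every consecutive `window`-year period."""
    n_windows = len(portfolio) - window + 1
    return [
        cagr(portfolio[s:s + window]) - cagr(benchmark[s:s + window])
        for s in range(n_windows)
    ]

years = list(range(1970, 2027))   # 57 annual observations (assumption)
portfolio = [0.05] * len(years)   # hypothetical returns
benchmark = [0.04] * len(years)   # hypothetical returns

print(len(rolling_excess_cagr(portfolio, benchmark, 20)))  # 38 windows
print(len(rolling_excess_cagr(portfolio, benchmark, 30)))  # 28 windows
```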

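The results are striking. Across the three regional-slice portfolios (P2, P3, P4), average outperformance relative to VT (P1) was:

  • 0.74% to 0.97% annually for buy-and-hold
  • 0.88% to 1.39% annually for yearly rebalancing (R1)
  • 1.02% to 1.40% annually for threshold rebalancing (R5/25)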

Most importantly, the median result is remarkably close to the average result, indicating that the outperformance is not driven by a handful of lucky periods.

Even the worst windows showed only small shortfalls (no worse than -0.19% per year), while the best windows showed gains of roughly 1.4% to 2.9% per year.

count = 38                                        | P2(B&H)-P1 | P2(R1)-P1 | P2(R5/25)-P1
min                                               | -0.1432%   | -0.1857%  | -0.1035%
max                                               | 1.3806%    | 1.7723%   | 1.9085%
average                                           | 0.7402%    | 0.8823%   | 1.0197%
stdev                                             | 0.4612%    | 0.3854%   | 0.3759%
median                                            | 0.7982%    | 0.8982%   | 1.0519%
Newey-West adjusted t-statistic (lag 19)          | 3.2785     | 6.1443    | 7.8319
p-value (one-tailed asymptotic Z-test)            | 0.0522%    | 0         | 0
p-value (one-tailed finite-sample t-test, df=37)  | 0.0023     | 0         | 0

count = 38                                        | P3(B&H)-P1 | P3(R1)-P1 | P3(R5/25)-P1
min                                               | -0.1259%   | -0.0697%  | 0.0112%
max                                               | 1.7488%    | 2.3619%   | 2.3685%
average                                           | 0.9741%    | 1.2160%   | 1.2708%
stdev                                             | 0.4642%    | 0.4771%   | 0.4632%
median                                            | 1.0931%    | 1.1845%   | 1.2391%
Newey-West adjusted t-statistic (lag 19)          | 5.1666     | 7.6882    | 8.5140
p-value (one-tailed asymptotic Z-test)            | 0.0000%    | 0         | 0
p-value (one-tailed finite-sample t-test, df=37)  | 0.0004%    | 0         | 0

count = 38                                        | P4(B&H)-P1 | P4(R1)-P1 | P4(R5/25)-P1
min                                               | 0.2096%    | 0.0102%   | -0.0286%
max                                               | 1.4420%    | 2.8767%   | 2.8844%
average                                           | 0.9126%    | 1.3931%   | 1.3979%
stdev                                             | 0.3528%    | 0.5809%   | 0.5846%
median                                            | 0.9829%    | 1.3338%   | 1.3664%
Newey-West adjusted t-statistic (lag 19)          | 6.6426     | 7.2049    | 7.2024
p-value (one-tailed asymptotic Z-test)            | 0          | 0         | 0
p-value (one-tailed finite-sample t-test, df=37)  | 0          | 0         | 0

We then repeated the analysis for every available 30-year rolling period, a total of 28 windows:

  • Average annual advantage ranged from 0.75% to 1.45%
  • Median results closely matched averages
  • Minimum observations remained positive in all cases

A strategy that outperforms over 20-year periods could still be benefiting from luck. The persistence of the advantage across 30-year periods is much harder to dismiss as random variation.

count = 28                                        | P2(B&H)-P1 | P2(R1)-P1 | P2(R5/25)-P1
min                                               | 0.2151%    | 0.5279%   | 0.6636%
max                                               | 1.2602%    | 1.5797%   | 1.7068%
average                                           | 0.7459%    | 0.9101%   | 1.0427%
stdev                                             | 0.3686%    | 0.2880%   | 0.2888%
median                                            | 0.6937%    | 0.8332%   | 0.9578%
Newey-West adjusted t-statistic (lag 29)          | 4.9720     | 10.8479   | 13.0451
p-value (one-tailed asymptotic Z-test)            | 0.0000%    | 0         | 0
p-value (one-tailed finite-sample t-test, df=27)  | 0.0016%    | 0         | 0

count = 28                                        | P3(B&H)-P1 | P3(R1)-P1 | P3(R5/25)-P1
min                                               | 0.3181%    | 0.7768%   | 0.8371%
max                                               | 1.5318%    | 2.0204%   | 2.0332%
average                                           | 0.9546%    | 1.2378%   | 1.2890%
stdev                                             | 0.3625%    | 0.3537%   | 0.3458%
median                                            | 0.9580%    | 1.1395%   | 1.1999%
Newey-West adjusted t-statistic (lag 29)          | 7.6214     | 13.9084   | 15.1483
p-value (one-tailed asymptotic Z-test)            | 0          | 0         | 0
p-value (one-tailed finite-sample t-test, df=27)  | 0          | 0         | 0

count = 28                                        | P4(B&H)-P1 | P4(R1)-P1 | P4(R5/25)-P1
min                                               | 0.4438%    | 0.9219%   | 0.9601%
max                                               | 1.3093%    | 2.3794%   | 2.3881%
average                                           | 0.8757%    | 1.4420%   | 1.4484%
stdev                                             | 0.2720%    | 0.3883%   | 0.3889%
median                                            | 0.8543%    | 1.3010%   | 1.3101%
Newey-West adjusted t-statistic (lag 29)          | 9.9583     | 17.0949   | 16.4076
p-value (one-tailed asymptotic Z-test)            | 0          | 0         | 0
p-value (one-tailed finite-sample t-test, df=27)  | 0          | 0         | 0

A common criticism of rolling-period analysis is that the observations are highly overlapping. For example, the periods 1970-1989 and 1971-1990 share 19 of their 20 years of returns. This overlap creates serial correlation and can artificially inflate conventional t-statistics.
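The overlap shrinks by one year for each year that separates two window start dates, and disappears once the starts are a full window length apart. A small sketch makes this visible (the function name is hypothetical):

```python
def shared_years(start_a, start_b, length):
    """Number of calendar years that two equal-length windows have in common."""
    end_a = start_a + length - 1
    end_b = start_b + length - 1
    return max(0, min(end_a, end_b) - max(start_a, start_b) + 1)

for gap in (1, 10, 19, 20):
    print(f"starts {gap:2d} years apart: {shared_years(1970, 1970 + gap, 20)} shared years")
```

For 20-year windows, observations up to 19 positions apart share at least one year of returns, so serial correlation can extend across 19 lags.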

To address this issue, all significance tests were computed using Newey-West adjusted standard errors, with lag lengths of 19 and 29 years for the 20-year and 30-year windows, respectively. This adjustment is specifically designed to account for the autocorrelation introduced by overlapping rolling windows.
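For readers unfamiliar with the adjustment, here is a minimal sketch of a Newey-West t-statistic for the mean of a series. It assumes the standard Bartlett-kernel form with no small-sample correction. The article does not spell out its exact implementation, so this should not be read as the code behind the tables.

```python
import math

def newey_west_t(values, lag):
    """t-statistic for the mean using a Bartlett-weighted long-run variance."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]

    # Start with the ordinary (lag-0) variance.
    long_run_var = sum(d * d for d in dev) / n

    # Add weighted autocovariances; lags beyond n - 1 cannot be estimated.
    for k in range(1, min(lag, n - 1) + 1):
        weight = 1 - k / (lag + 1)  # Bartlett kernel
        gamma_k = sum(dev[i] * dev[i - k] for i in range(k, n)) / n
        long_run_var += 2 * weight * gamma_k

    return mean / math.sqrt(long_run_var / n)

# Hypothetical, strongly trending series: the adjustment lowers the t-statistic.
series = [1, 2, 3, 4, 5, 6, 7, 8]
print(newey_west_t(series, 0), newey_west_t(series, 3))
```

With `lag=0` the function reduces to the conventional t-statistic. With positive autocorrelation, as overlapping windows produce, the long-run variance grows and the t-statistic falls.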

The results remain remarkably strong.

Across all 20-year rolling return tests, Newey-West adjusted t-statistics ranged from 3.28 to 8.51. Across all 30-year rolling return tests, they ranged from 4.97 to 17.09.

Of the 18 rolling-return tests reported above (9 twenty-year tests and 9 thirty-year tests), every single Newey-West adjusted t-statistic exceeded 3, and most exceeded 5.

The corresponding p-values were extraordinarily small. In most cases, the probability of observing outperformance this large if the true difference were zero was effectively zero.
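The asymptotic p-values can be recovered from the t-statistics with the standard normal tail probability. A minimal sketch using only the Python standard library (the function name is hypothetical):

```python
import math

def one_tailed_z_p(t_stat):
    """Upper-tail probability of a standard normal beyond t_stat."""
    return 0.5 * math.erfc(t_stat / math.sqrt(2))

# Smallest 20-year t-statistic in the tables: P2 buy-and-hold.
p = one_tailed_z_p(3.2785)
print(f"{p:.4%}")
```

For a t-statistic of 3.2785 this gives approximately 0.052%, in line with the Z-test value reported for P2 buy-and-hold. The finite-sample version needs a Student-t distribution instead, which the standard library does not provide.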

A Possible Risk-Based Explanation

A natural question is: if regional-slice portfolios historically outperformed VT, why should that premium exist at all?

One possible answer is that investors in VT are purchasing a form of insurance.

A market-cap weighted portfolio continuously updates its holdings to reflect the market’s collective judgment about the future. As regions grow in economic importance and market capitalization, VT automatically increases exposure. As regions shrink, VT reduces exposure. In effect, investors are outsourcing the problem of forecasting the future to the market itself.

Regional-slice portfolios take a different approach. By maintaining fixed regional allocations or rebalancing back to predetermined targets, investors are implicitly making an active bet that today’s market-cap weights may not perfectly predict tomorrow’s winners. They are willing to tolerate larger deviations from the global market portfolio in exchange for the possibility of capturing a rebalancing premium, a mean-reversion effect, or a momentum effect.

Historically, that willingness was rewarded. But the reward was not free.

The risk is that future economic leadership may undergo a permanent structural shift that market-cap weights correctly anticipate and regional-slice portfolios fail to recognize. In that scenario, VT’s continuous adaptation would be an advantage rather than a drag.

Viewed through this lens, the historical premium documented in this series may not represent a free lunch. Instead, it may be compensation for accepting the risk that the future will look very different from the past.

An Alternative Explanation: The Equity Issuance Effect

Another possible explanation involves the way market-cap weighted portfolios respond to new equity issuance.

When companies issue new shares, the supply of equity increases. Because market-cap weighted funds allocate capital based on market capitalization, they automatically direct additional capital toward regions and companies that have expanded their equity base.

This mechanism helps channel capital to growing firms and economies, which is one of the strengths of market-cap weighting. However, it also means that VT investors systematically absorb a large share of new equity issuance.

A regional-slice portfolio is less sensitive to these flows. Its allocation is determined primarily by regional targets rather than by changes in the global supply of publicly traded equity.

As a result, regional-slice investors may avoid some of the dilution associated with persistent net share issuance while still benefiting from the economic growth generated by the capital that newly issued equity helps finance.

Viewed from this perspective, part of the historical return gap may represent compensation earned by investors who were less willing to continuously fund the expansion of the global equity supply.

Whether this effect is large enough to explain a meaningful portion of the observed premium remains an open question, but it provides another potential mechanism linking portfolio construction rules to long-term returns.

Taken together, the evidence supports three conclusions.

First, the calibration test showed that the underlying data faithfully reproduces a global market-cap weighted portfolio. A VT-style portfolio and an annually market-weight rebalanced VTSAX/VTIAX portfolio exhibited a 99.76% correlation, indicating that the historical dataset and portfolio construction methodology are internally consistent.

Second, the rolling-return analysis demonstrated that the advantage of regional-slice portfolios is remarkably persistent. Across 38 twenty-year rolling periods and 28 thirty-year rolling periods, regional-slice portfolios outperformed VT in the overwhelming majority of cases. Average annual excess returns ranged from approximately 0.7% to 1.4%, depending on the portfolio construction and rebalancing method.

Third, and most importantly, the results remain statistically significant even after accounting for the overlapping nature of rolling windows using Newey-West adjusted standard errors.

Across all 18 tests presented in this article:

  • Every Newey-West adjusted t-statistic exceeded 3.
  • Most exceeded 5.
  • Several exceeded 10.
  • The strongest result exceeded 17.

To put these numbers into perspective, a t-statistic above 2 is typically considered statistically significant in academic finance. Many widely discussed market anomalies were originally published with t-statistics in the 2 to 4 range. By comparison, the results observed here are substantially stronger.

This does not prove that regional-slice portfolios will outperform VT over the next decade, nor does it guarantee that the historical relationship will persist indefinitely.

However, the evidence is consistent with a structural explanation.

A continuously market-cap weighted portfolio such as VT systematically increases exposure to regions that have become larger and decreases exposure to regions that have become smaller. A regional-slice portfolio breaks that link. Depending on whether the investor chooses buy-and-hold or disciplined rebalancing, the portfolio gains exposure to a momentum effect, a contrarian rebalancing effect, or a combination of both.

For decades, investors have debated active versus passive investing, factor investing versus indexing, and domestic versus international diversification.

The results in this series point toward a different question:

Should investors pay attention not only to what they own, but also to the weight-update engine that determines how their portfolio evolves through time?

