A Statistically Well-Behaved Transformation of P/E for Growth–Value Inference
Why inference on P/E fails for a classical testing method — and what changes when we test valuation on a stable scale or use a non-parametric test.
Abstract: This note examines whether large-cap growth and value portfolios differ in valuation when measured using the conventional P/E ratio versus a modified statistic that is always defined and statistically well-behaved. The two portfolios are constructed from the top 25 constituents of VOOG and VOOV, respectively. The analysis asks whether the difference in valuation between the two portfolios is statistically significant, and whether the outcome depends on idiosyncrasies of the valuation variable or on the choice of statistical test.
INTRODUCTION
A previous post demonstrated that the price–earnings (P/E) ratio, the most familiar valuation statistic in industry practice, is substantially affected by cases where earnings are negative or small. The P/E ratio is undefined for negative earnings and, when earnings are small, becomes an outlier that changes sharply in response to a small change in earnings.
The previous analysis also demonstrated that a transformation of the traditional P/E ratio is defined for all values of earnings and does not become an outlier or turn unstable at small levels of earnings.
The follow-up analysis presented here considers the consequences of conducting formal statistical tests on differences in the valuation of two portfolios when using the traditional PE ratio and the transformed or modified PE ratio.
METHODS
The alternative valuation statistic considered in this paper is the difference between price per share and earnings per share, divided by price per share.
It can be written as S=(P-E)/P or S=1-(E/P) where P is price per share and E is earnings per share.
The transformation does not change the underlying economic information; it regularizes the statistic so that it is defined for all firms, including those with zero or negative earnings, and is not explosive when earnings are near zero.
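The contrast between the two statistics can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names pe_ratio and s_stat and the four (price, EPS) pairs are hypothetical, chosen to span the problem cases, and are not drawn from the VOOG or VOOV holdings.

```python
import math

def pe_ratio(price, eps):
    """Conventional P/E; returned as NaN when earnings are zero or negative."""
    if eps <= 0:
        return math.nan
    return price / eps

def s_stat(price, eps):
    """Transformed statistic S = (P - E) / P = 1 - E/P; needs only P > 0."""
    if price <= 0:
        raise ValueError("price per share must be positive")
    return (price - eps) / price

# Hypothetical (price, EPS) pairs, not real holdings.
firms = {
    "steady":      (100.0, 5.0),
    "thin_profit": (100.0, 0.01),
    "break_even":  (100.0, 0.0),
    "loss_maker":  (100.0, -4.0),
}

table = {name: (pe_ratio(p, e), s_stat(p, e)) for name, (p, e) in firms.items()}
for name, (pe, s) in table.items():
    print(f"{name:12s} P/E={pe:>10.2f}  S={s:.4f}")
```

In this sketch the thin-profit firm has a P/E in the thousands and the break-even and loss-making firms have no usable P/E at all, while S stays within a narrow band around 1 for all four.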
The potential advantages of using the modified valuation measure S instead of the traditional P/E ratio in formal statistical tests are demonstrated by testing the statistical significance of the difference in valuation between two portfolios: one primarily growth stocks, the other primarily value stocks.
The portfolios considered here are the top 25 holdings of the growth ETF VOOG and the top 25 holdings of the value ETF VOOV.
The following table reports the distributional diagnostics that explain why inference on raw P/E behaves poorly and why the transformed statistic S yields clear results. The P/E distributions for both VOOG and VOOV exhibit extreme skewness and heavy tails, and normality is decisively rejected. In contrast, the distribution of S is compact, nearly symmetric, and does not reject normality in the value sample — making standard inferential methods valid on that scale.
Notes.
Skewness measures asymmetry (0 = symmetric). Large positive skew in P/E reflects tail explosions near zero earnings.
Excess kurtosis measures tail weight relative to normal (0 = normal-like tails). P/E displays heavy outliers; S does not.
The Shapiro–Wilk p-value tests normality; a small p rejects. P/E rejects decisively; S does not reject for VOOV and is borderline for VOOG, making parametric inference meaningful.
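The first two diagnostics in the notes can be computed from central moments. The sketch below uses only the Python standard library and a small hypothetical P/E sample with one extreme value; the helper names are illustrative, and the numbers are not the VOOG or VOOV data. It uses the simple (biased) moment estimators. The normality test is not reproduced here; scipy.stats.shapiro is one way to obtain it if SciPy is available.

```python
from statistics import fmean

def central_moment(xs, k):
    """k-th central moment using the simple (biased) 1/n estimator."""
    m = fmean(xs)
    return fmean([(x - m) ** k for x in xs])

def skewness(xs):
    """Moment-based skewness: m3 / m2^(3/2). Zero for a symmetric sample."""
    return central_moment(xs, 3) / central_moment(xs, 2) ** 1.5

def excess_kurtosis(xs):
    """Moment-based excess kurtosis: m4 / m2^2 - 3. Zero for normal-like tails."""
    return central_moment(xs, 4) / central_moment(xs, 2) ** 2 - 3.0

# Hypothetical positive P/E values with a single near-zero-earnings outlier.
pe = [12.0, 15.0, 18.0, 20.0, 22.0, 25.0, 30.0, 40.0, 250.0]
s = [1.0 - 1.0 / x for x in pe]  # same firms on the S scale

print(f"P/E: skew={skewness(pe):.2f}, excess kurtosis={excess_kurtosis(pe):.2f}")
print(f"S:   skew={skewness(s):.2f}, excess kurtosis={excess_kurtosis(s):.2f}")
```

In this toy sample the single large multiple dominates the P/E moments, while the same firm sits just below 1 on the S scale and barely moves the diagnostics.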
RESULTS AND INTERPRETATION
When inference is performed directly on P/E using classical parametric tests that assume approximate normality, what fails is not the economic hypothesis but the suitability of the statistic itself. A stable alternative statistic or a non-parametric testing framework is required to recover a clearer conclusion.
• Classical parametric tests (normality-based):
P/E shows only a marginal difference between VOOG and VOOV (Welch p≈0.089),
whereas the transformed statistic S shows a decisive difference (Welch p≈0.00020).
→ Classical inference depends on the statistic: at the 0.05 level of significance, the test on P/E fails to reject equal means, while the test on S rejects.
• Non-parametric tests (distribution-free):
Mann–Whitney yields the same p-value for both P/E and S (p≈0.00034).
→ When inference does not assume normality, the underlying valuation gap is significant for both valuation measures. The identical p-value is what one expects when every firm in the sample has positive earnings: S = 1 - 1/(P/E) is then an increasing function of P/E, so the ranks on which the test is based are unchanged.
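The behaviour of the two tests in the list above can be illustrated with a minimal standard-library sketch. It computes Welch's t statistic with Welch–Satterthwaite degrees of freedom and the Mann–Whitney U statistic on both scales. The two samples are hypothetical, all-positive P/E values, not the VOOG and VOOV data, and the function names are illustrative. p-values are omitted because they need a t-distribution CDF; with SciPy installed, scipy.stats.ttest_ind(a, b, equal_var=False) and scipy.stats.mannwhitneyu(a, b) would supply them.

```python
import math
from statistics import fmean, variance

def welch_t(a, b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    va, vb = variance(a) / len(a), variance(b) / len(b)
    t = (fmean(a) - fmean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df

def mann_whitney_u(a, b):
    """U statistic for sample a: pairs with a_i > b_j, ties counted as 1/2."""
    return sum((x > y) + 0.5 * (x == y) for x in a for y in b)

def to_s(pe_values):
    """Map positive P/E values onto the S scale, S = 1 - 1/(P/E)."""
    return [1.0 - 1.0 / x for x in pe_values]

# Hypothetical samples; the growth basket contains one explosive multiple.
growth_pe = [22.0, 28.0, 31.0, 35.0, 40.0, 48.0, 60.0, 400.0]
value_pe = [9.0, 12.0, 14.0, 16.0, 18.0, 21.0, 24.0, 33.0]

t_pe, df_pe = welch_t(growth_pe, value_pe)
t_s, df_s = welch_t(to_s(growth_pe), to_s(value_pe))
u_pe = mann_whitney_u(growth_pe, value_pe)
u_s = mann_whitney_u(to_s(growth_pe), to_s(value_pe))

print(f"Welch on P/E: t={t_pe:.2f}, df={df_pe:.1f}")
print(f"Welch on S:   t={t_s:.2f}, df={df_s:.1f}")
print(f"Mann-Whitney U: P/E={u_pe}, S={u_s}")
```

In this toy example the outlier inflates the variance of the growth sample on the P/E scale and shrinks the t statistic, whereas on the S scale the same gap produces a much larger t. The U statistic depends only on the ordering of the observations, so it comes out the same on both scales.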
When inference is performed directly on P/E, what fails is not necessarily the economic hypothesis but the suitability of the statistic or the suitability of the testing method.
The central tendency of the two portfolios differs markedly on the conventional P/E scale: the median P/E in the VOOG basket is approximately 38, while the median in the VOOV basket is approximately 19. At the median stock, the growth-tilted portfolio trades at roughly twice the valuation multiple of the value-tilted portfolio.
The upper part of the distribution shows a similar spread. The 75th percentile P/E in the VOOG basket is approximately 52, compared with approximately 29 in the VOOV basket.
These differences in medians and upper-quartile values make clear that the economic gap is not subtle — growth names are valued substantially higher than value names even before any formal testing is applied.
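To see what these approximate quantiles look like on the transformed scale, the short sketch below maps each reported P/E onto S using S = 1 - 1/(P/E). This is a back-of-the-envelope conversion that assumes positive earnings at the quantiles in question; the resulting values are illustrative and are not the S quantiles computed from the underlying data. The helper name pe_to_s is hypothetical.

```python
def pe_to_s(pe):
    """Map a positive P/E onto the S scale: S = 1 - E/P = 1 - 1/(P/E)."""
    if pe <= 0:
        raise ValueError("mapping assumes a positive P/E")
    return 1.0 - 1.0 / pe

# Approximate P/E quantiles as reported in the text.
reported = {
    "VOOG median": 38,
    "VOOV median": 19,
    "VOOG 75th pct": 52,
    "VOOV 75th pct": 29,
}

on_s_scale = {label: round(pe_to_s(pe), 4) for label, pe in reported.items()}
for label, s in on_s_scale.items():
    # 1 - S is the implied earnings yield E/P.
    print(f"{label:14s} P/E~{reported[label]:>3}  S~{s:.4f}  E/P~{1 - s:.2%}")
```

Read this way, a twofold gap in the median multiple corresponds to a gap of roughly 2.6 percentage points in earnings yield (about 2.6 percent for the growth median against about 5.3 percent for the value median), which is the scale on which S is measured.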
What classical parametric testing on raw P/E fails to recover consistently is not this underlying pricing gap but the ability to make a reliable statistical statement about it. Once the statistic is placed on a stable scale or a non-parametric test is used, the statistical conclusion aligns with the economic magnitude already visible in the distribution itself.
DISCUSSION AND CONCLUSION
Two conclusions follow. First, the valuation gap between the VOOG-tilted and VOOV-tilted portfolios is real and statistically demonstrable once the statistic is placed on a stable scale. The “marginal” conclusion under P/E reflects the instability of the ratio, not the absence of a pricing difference. Second, the transformation to S does not attempt to replace the interpretive content of P/E; it makes the same information compatible with inferential methods that assume continuity, finiteness, and absence of tail explosions.
Non-parametric methods provide one workaround by avoiding distributional assumptions altogether. The transformation to S provides another by making parametric inference valid without excluding loss-makers or applying ad-hoc trimming rules. In short, P/E remains useful for interpretation, but not for inference with classical tests.