The current flap over the public debt and growth paper, originally published as Reinhart, Carmen M., Vincent R. Reinhart, and Kenneth S. Rogoff. 2012. "Public Debt Overhangs: Advanced-Economy Episodes since 1800." Journal of Economic Perspectives 26, no. 3: 69-86, brings out some core issues we face in the management and assessment of complex projects using performance data.
In the original paper, which has now been updated, there are several key words that trigger red flags in any credible statistical analysis of dynamic systems like the economy or complex programs. There is an assessment of the original paper. Let's start with the original authors' response to the authors of that critique:
This brings us to the core conceptual issue, which Herndon, Ash and Pollin argue greatly biases our results. They argue that we (they) use an “unconventional weighting of summary statistics.” In particular, for each bucket, we (they) take average growth rates for each country and then take an average of the result.
- Averages without variances are not very useful. We all know the story of comparing the most likely temperature in two locations, Cody, Wyoming and Trinidad-Tobago, and how the averages misrepresent both.
- Since economies and projects are driven by stochastic processes, we must model their data as random.
- Everyone claiming to forecast the future or make correlations between random processes must have on their shelf, and be able to show they have read, How to Lie with Statistics by Darrell Huff.
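To make the averaging complaint concrete, here is a minimal sketch of the difference between an average of country averages and a pooled average, with the spread reported alongside. The country labels and growth numbers are invented for illustration only; they are not taken from the Reinhart and Rogoff data set or from the Herndon, Ash and Pollin critique.

```python
from statistics import mean, stdev

# Hypothetical growth rates (percent per year) for one debt bucket.
# Country labels and values are made up for illustration.
growth = {
    "A": [2.1, 2.4, 1.9, 2.6, 2.0, 2.3, 2.2, 1.8, 2.5, 2.2],  # 10 country-years
    "B": [-7.6],                                              # 1 country-year
    "C": [1.0, 3.5, -0.5, 2.0],                               # 4 country-years
}

# Average of country averages: every country counts once,
# no matter how many years it contributes.
country_means = [mean(values) for values in growth.values()]
avg_of_avgs = mean(country_means)

# Pooled average: every country-year counts once.
pooled = [x for values in growth.values() for x in values]
pooled_mean = mean(pooled)
pooled_sd = stdev(pooled)  # sample standard deviation of the pooled data

print(f"average of country averages: {avg_of_avgs:.2f}")
print(f"pooled average:              {pooled_mean:.2f}")
print(f"pooled standard deviation:   {pooled_sd:.2f} over {len(pooled)} samples")
```

In this made-up data the single bad year for country B drags the average of averages below zero, while the pooled average stays positive. Neither number means much until the standard deviation and sample count are reported with it.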
So ignoring for the moment the very naive statement by two Harvard professors of averaging the averages in the absence of the variances, and then making projections of the correlations, there are other problems.
- Using Excel for modeling can certainly be done, but there are much better ways to model statistical and probabilistic processes - R for one. MATLAB is another.
- Excel also suffers from the auditability problem. If you build a spreadsheet and send it to me to make management decisions, I have to look at every cell to determine that it has the right formula, the right range for that formula, and the right data in the right rows and columns for that formula to work. A pain in the ass for simple spreadsheets. A nightmare for larger ones. This is simply bad modeling. Especially since R is free and an academic version of MATLAB or Mathematica is under $300. Come on Harvard professors, earn your reputation.
So What Does This Mean for Project Performance Analysis?
There are pictures in the links above to show that projects are statistical processes. So here's the bottom line:
- All numbers in projects and econometric models are random numbers.
- Correlations between the sources of these numbers may also be random or at least changing over time - nonstationary stochastic processes.
- Models that don't consider this information are probably not that useful and may actually be - and I'll be crass here - BOGUS.
- If we are ever to produce credible models of how projects work - cost, schedule, risk, and technical performance models - they must be credible statistical models. That means averages and averages of averages are simply not allowed without the variances and even higher order moments and the correlations between the generating functions of these random variables.
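The point about correlations that change over time can be sketched with a rolling-window correlation. This is an illustration on synthetic data, not an analysis of any real project or economic series: the two series, the flip in their relationship halfway through, the noise level, and the window size of 20 are all assumptions chosen to make the effect visible.

```python
import random
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def rolling_correlation(xs, ys, window):
    """Correlation computed over each consecutive window of the series."""
    return [pearson(xs[i:i + window], ys[i:i + window])
            for i in range(len(xs) - window + 1)]

# Synthetic, hypothetical data: y follows x in the first half of the
# series and moves against x in the second half.
rng = random.Random(42)
n = 200
xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
ys = [(x if i < n // 2 else -x) + rng.gauss(0.0, 0.1)
      for i, x in enumerate(xs)]

whole_sample = pearson(xs, ys)
rolling = rolling_correlation(xs, ys, window=20)

print(f"whole-sample correlation: {whole_sample:.2f}")
print(f"first window:             {rolling[0]:.2f}")
print(f"last window:              {rolling[-1]:.2f}")
```

A single whole-sample correlation blends the two regimes into a number that describes neither of them, which is the trouble with treating a nonstationary process as if it were stationary.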
So however this turns out with the Bad Math people at Harvard, we've got to do better.
- The current Earned Value Management processes don't consider the statistical nature of the performance indices. This is bad. Here's a simple and very understandable paper "Performing Statistical Analysis on Earned Value Data."
- As well, the calculations of future performance use either cumulative numbers, which wipe out the variances, or the current period, which is a point sample, to compute the Estimate At Completion. This is not only bad, it is naive math.
- And then we are surprised when things don't turn out as expected.
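Here is a rough sketch of what the cumulative numbers hide. The period Earned Value (EV) and Actual Cost (AC) figures and the Budget At Completion (BAC) are hypothetical, and the one-standard-deviation band around the period CPI is only an illustrative assumption of mine, not the method of the paper mentioned above. The formula EAC = AC + (BAC - EV) / CPI is the common CPI-based Estimate At Completion.

```python
from statistics import mean, stdev

# Hypothetical period-by-period Earned Value and Actual Cost.
ev = [100, 90, 120, 80, 110]
ac = [100, 110, 100, 120, 110]
bac = 1000  # hypothetical Budget At Completion

ev_cum = sum(ev)
ac_cum = sum(ac)

# Cumulative CPI: one number, with the period-to-period variation gone.
cpi_cum = ev_cum / ac_cum

# Period CPIs: the samples that the cumulative number averages away.
period_cpi = [e / a for e, a in zip(ev, ac)]
cpi_mean = mean(period_cpi)
cpi_sd = stdev(period_cpi)

def eac(cpi):
    """CPI-based Estimate At Completion: AC + (BAC - EV) / CPI."""
    return ac_cum + (bac - ev_cum) / cpi

eac_cum = eac(cpi_cum)
# Illustrative one-standard-deviation band around the mean period CPI.
eac_pessimistic = eac(cpi_mean - cpi_sd)
eac_optimistic = eac(cpi_mean + cpi_sd)

print(f"cumulative CPI: {cpi_cum:.3f}  EAC: {eac_cum:.0f}")
print(f"period CPI mean: {cpi_mean:.3f}  std dev: {cpi_sd:.3f}")
print(f"EAC range: {eac_optimistic:.0f} to {eac_pessimistic:.0f}")
```

The cumulative CPI produces a single EAC with no error bars. The period CPIs show how wide the plausible range actually is with only five samples.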
Some final thoughts
- Correlation is not causation.
- All numbers are random numbers drawn from a known or possibly unknown population. If you don't know the population statistics your assessment of the numbers is different than if you do.
- All random numbers need variance, standard deviation, and higher order moments to be considered credible as sources of any analysis.
- Drawing a picture of the average of anything is very sporty without the proper statistical analysis.
- Read Huff, reread it, read it all the time. Also go buy a good statistics and analysis book. Start with Advanced Statistics Demystified.
- If an academic paper being used for public policy has not been peer reviewed, throw it away. The Reinhart and Rogoff paper is the basis of Paul Ryan's economic plan. He may have really good points, but it is built on sand.
- Same goes for advice from self-proclaimed experts in anything, especially project management. If there is a PhD thesis that has ZERO references in CiteSeer, ignore it.
- Read more in the linked references below to see how bad statistics can go from bad to worse.
- Download R, download the free books, learn how to think and act with credible statistics. Learn how many samples you need, learn how to assess the needed population statistics and confidence levels, learn how to make decisions based on confidence levels, not absolute numbers. "We have a 70% confidence of completing on or before a date" or "we have a 70% confidence of completing at or below a specific cost" must be the answers when management asks about cost and schedule performance.
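A 70% confidence answer can be sketched with a small Monte Carlo simulation. Everything here is an assumption for illustration: three serial tasks, triangular distributions for their durations, the (low, most likely, high) values in days, and 10,000 trials. The same idea carries over to R or any other tool that can draw random samples.

```python
import math
import random

# Hypothetical serial tasks as (low, most likely, high) durations in days.
tasks = [(10, 12, 20), (5, 6, 9), (8, 10, 18)]

def simulate_total(rng):
    """One trial: draw each task duration and add them up."""
    # random.triangular takes its arguments in the order (low, high, mode).
    return sum(rng.triangular(low, high, mode) for low, mode, high in tasks)

def percentile(sorted_values, p):
    """Nearest-rank percentile of an already sorted list (0 < p <= 100)."""
    k = max(0, math.ceil(p / 100 * len(sorted_values)) - 1)
    return sorted_values[k]

rng = random.Random(2013)  # fixed seed so the sketch is repeatable
totals = sorted(simulate_total(rng) for _ in range(10_000))

p50 = percentile(totals, 50)
p70 = percentile(totals, 70)
p90 = percentile(totals, 90)

most_likely_sum = sum(mode for _, mode, _ in tasks)
print(f"sum of most likely durations: {most_likely_sum} days")
print(f"50% confidence: {p50:.1f} days")
print(f"70% confidence: {p70:.1f} days")
print(f"90% confidence: {p90:.1f} days")
```

Because these made-up distributions are skewed to the right, the 70% confidence duration lands later than the sum of the most likely durations, which is exactly the information a single point estimate throws away.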