When a chart shows an Ideal line, or plots samples of past performance - say, software delivered - in the absence of a baseline for what the performance of the work effort or duration should have been, was planned to be, or could have been, this is called Open Loop control.
The Should Cost, Will Cost, Must Cost forecasting problem has been around for a long time. This work continues in DOD, NASA, Heavy Construction, BioPharma, and other high-risk, software-intensive domains.
When we see graphs where the baseline to which delays or cost overages are compared is simply labeled Ideal (like the chart below), it's a prime example of How to Lie With Statistics, Darrell Huff, 1954. This can be overlooked in an un-refereed opinion paper in an IEEE magazine or a self-published presentation, but a bit of homework will reveal that charts like the one below are simply bad statistics.
This chart is now being used as the basis of several #NoEstimates presentations, which further propagates misunderstandings of how to do statistics properly.
Todd does have other papers that are useful - Context Adaptive Agility is one example from his site. But this often used and misused chart is not an example of how to properly identify problems with estimates.
Here are some core issues:
- If we want to determine something about a statistical process, we of course need to collect data about that process. This data is empirical - a much misused term itself - showing what happened over time: a time series of samples.
- To compute a trend, we can of course draw a line through the population of data, like the one above.
- Then we can compare this data with some reference data to determine the variances between the reference data and the data under measurement (a minimal sketch of these steps follows this list).
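Those steps are easy to make concrete. Here's a minimal sketch in Python, using made-up numbers rather than anything from the chart in question, that fits a trend through sampled data and computes the variances against a reference baseline.

```python
# A minimal sketch of the three steps above, with hypothetical data (nothing
# here comes from the chart being critiqued): sample a time series, fit a
# trend through it, and compute the variances against a reference baseline.
import numpy as np

weeks = np.arange(1, 11)                                   # sampling periods
actuals = np.array([3, 5, 9, 11, 16, 18, 24, 26, 31, 35])  # e.g., features delivered
baseline = 3.0 * weeks                                     # a reference plan: 3 features/week

# Fit a linear trend through the sampled data
slope, intercept = np.polyfit(weeks, actuals, deg=1)

# Variance (difference) of the actuals from the reference baseline
variance = actuals - baseline
print(f"trend: {slope:.2f} per week, mean variance from baseline: {variance.mean():+.2f}")
```

Notice the sketch presupposes a baseline with a basis worth comparing against - which is exactly what an Ideal line lacks.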
Here's where the process goes into the ditch.
- The reference data has no basis of reference. It's just labeled Ideal, meaning a number that was established with no basis of estimate. Just: this is what was estimated, now let's compare actuals to it, and if the actuals matched the estimate, let's call it ideal.
- Was that ideal credible? Was it properly constructed? What's the confidence level of that estimate? What's the allowable variance of that estimate that can still be considered OK (within the upper and lower limits of OK)? Those questions and their answers aren't there. It's just a line (a sketch of the missing answers follows this list).
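For contrast, here's what answers to those questions could look like. This is only a sketch with hypothetical past-performance samples: a point estimate, an empirical 80% range, and upper and lower control limits that bound the allowable variance.

```python
# A sketch (hypothetical samples) of the answers an "Ideal" line never
# provides: a point estimate, an empirical confidence range, and upper/lower
# control limits defining the allowable variance.
import numpy as np

# Hypothetical durations (weeks) of similar past work items
past_durations = np.array([8, 10, 9, 12, 11, 14, 9, 10, 13, 11])

point_estimate = past_durations.mean()
sigma = past_durations.std(ddof=1)

# An 80% range taken from the empirical percentiles of past performance
low, high = np.percentile(past_durations, [10, 90])

print(f"point estimate: {point_estimate:.1f} weeks")
print(f"80% range from past performance: {low:.1f} to {high:.1f} weeks")

# Simple control limits (mean +/- 2 sigma) bounding the "OK" variance
print(f"allowable variance: {point_estimate - 2*sigma:.1f} to {point_estimate + 2*sigma:.1f} weeks")
```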
We can use the ne plus ultra put-down of theoretical physicist Wolfgang Pauli: "This isn't right. It's not even wrong." As well, the projects were self-selected, and as with the Standish Report, self-selected statistics are among the techniques cataloged in the How to Lie book.
It's time to look at these sorts of conjectures in the proper light. They are Bad Statistics, and we can't draw any conclusions from the data, since the baselines to which the sampled values are compared "aren't right. They're not even wrong." We have no way of knowing why the sampled data varies from the ideal - the bogus ideal:
- Was the original estimate simply naïve?
- Was the project poorly managed?
- Did the project change direction while the ideal estimate was never updated?
- Were the requirements, productivity, risks, funding stability, and all the other project variables held constant while assessing the completion date? If not, the fundamental principles of experiment design were violated. These principles are taught in every design of experiments class in every university on the planet. Statistics for Experimenters is still on my shelf, with George Box as one of its authors - the source of the often misused and hugely misunderstood statement, all models are wrong, some are useful.
So it's time to stop using these charts and start looking for the Root Causes of the estimating problem (a sketch of the reference-class remedy appears after this list):
- No reference classes
- No past performance
- No parametric models
- No skills or experience constructing credible estimates
- No experience with estimating tools, processes, databases (and there are many for both commercial and government software intensive programs).
- Political pressure to come up with the right number
- Misunderstanding of the purpose of estimating - to provide information needed to make decisions.
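The first two root causes have a well-known remedy: reference class forecasting. A minimal sketch, assuming a hypothetical reference class of actual-to-estimate ratios from similar past projects:

```python
# A minimal sketch of reference class forecasting, one remedy for the first
# two root causes above. The reference data is hypothetical: ratios of actual
# to estimated effort for past projects of the same class.
import numpy as np

naive_estimate_weeks = 20.0
# Hypothetical reference class: how much similar past projects overran or underran
actual_over_estimate = np.array([1.1, 1.4, 0.9, 1.3, 1.6, 1.2, 1.5, 1.0])

# Anchor the new estimate on observed past performance, not on optimism
p50, p80 = np.percentile(actual_over_estimate, [50, 80])
print(f"50% confidence estimate: {naive_estimate_weeks * p50:.1f} weeks")
print(f"80% confidence estimate: {naive_estimate_weeks * p80:.1f} weeks")
```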
A colleague (a former NASA cost director) has three reasons for cost, schedule, and technical shortfalls:
- They didn't know
- They couldn't know
- They didn't want to know
Only the second is a credible reason for project shortfalls in performance.
Without a credible, calibrated, statistically sound baseline, the measurements and the decisions based on those measurements are Open Loop.
You're driving your car with no feedback other than knowing you ran off the road after you ran off the road, or you arrived at your destination after you arrived at your destination.
Just like this post: Control Systems - Their Misuse and Abuse.
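Closing the loop means comparing each sample to a credible baseline with a defined tolerance and acting on the variance while there's still time to act. A minimal sketch, with hypothetical numbers:

```python
# A sketch, with hypothetical numbers, of closing the loop: compare each
# sample to a credible baseline with a defined tolerance and flag variances
# in time to take corrective action.
planned_per_week = 3.0            # from a baseline with a basis of estimate
tolerance = 0.15                  # allowable variance before acting (15%)
actuals = [3, 5, 8, 10, 12, 13]   # cumulative deliveries, sampled weekly

for week, actual in enumerate(actuals, start=1):
    planned = planned_per_week * week
    variance = (actual - planned) / planned
    if abs(variance) > tolerance:
        print(f"week {week}: {variance:+.0%} off baseline -> corrective action needed")
    else:
        print(f"week {week}: {variance:+.0%} within tolerance")
```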