Came across this puzzling tweet today
... real empirical data & using probability to forecast are worlds apart. I don't buy "estimation uses data" argument.
This always reminds me of Wolfgang Pauli's remark to a colleague who showed him a paper from an author who wanted Pauli to comment on...
Das ist nicht nur nicht richtig, es ist nicht einmal falsch!
It is not only not right, it is not even wrong
So Let's Look at a Simple Forecasting Process
First, forecasting is about the future. Estimates are about the past, present, and future. So estimates of future cost, schedule, and technical performance can be called forecasts.
A project's past performance data is a time series, gathered from things that happened in the past at intervals of time. These intervals should be evenly spaced. They don't have to be, but uneven spacing makes the analysis more complex. The example below is done with R, a statistical programming language found at www.r-project.org. R is used in a wide variety of domains. I was introduced to R through our son's work in cellular biology, when he pointed out I'd get a D in the BioStats class he taught: stop making linear projections, unadjusted for the variances of the past, and most of all unadjusted for variances created from uncertainty in the future. Come on Dad, get with the program of making risk-informed decisions. Here's a good reference on how to do this at a much broader scale: Risk Informed Decision Making Handbook.
Below is an R plot from historical data of a project cost parameter, forecasting the possible values of this parameter into the future using ARIMA (Autoregressive Integrated Moving Average). ARIMA is built into R, which can be downloaded for free. R and its statistical analysis capabilities are used in our domain to develop estimates. Using past performance (in the example below, a cost index) we can forecast the range of cost index values and the confidence on the bounds of that range.
The chart above is from the paper below.
- Do you have any experience forecasting future outcomes from past performance that is mathematically credible?
- Did you adjust your forecast for past variances?
- Did you adjust your forecast for future uncertainty?
No? Then it's unlikely your number will have any chance of being correct.
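To make those three questions concrete, here is a minimal sketch in Python rather than R. It is not the model behind the chart. It assumes the simplest ARIMA variant, ARIMA(0,1,0) with drift (a random walk with drift), and it assumes the period-to-period changes are independent and roughly normal. The cost index values and the function name `forecast_random_walk_with_drift` are made up for illustration. Past variance enters through the standard deviation of the historical changes, and future uncertainty enters through a band that widens the further out we forecast.

```python
import math
import statistics


def forecast_random_walk_with_drift(history, horizon, z=1.96):
    """Forecast `horizon` periods ahead with an ARIMA(0,1,0)-with-drift model.

    Returns a list of (point, low, high) tuples. With z=1.96 the band is an
    approximate 95% interval under the normality assumption. The band ignores
    the sampling error in the estimated drift, so it is slightly too narrow.
    """
    if len(history) < 3:
        raise ValueError("need at least three observations")

    # First differences: the "integrated" step of ARIMA with d = 1.
    diffs = [later - earlier for earlier, later in zip(history, history[1:])]
    drift = statistics.mean(diffs)    # average change per period in the past
    sigma = statistics.stdev(diffs)   # variability of those past changes

    last = history[-1]
    forecasts = []
    for h in range(1, horizon + 1):
        point = last + h * drift
        # Forecast error variance grows linearly with h, so the band
        # grows with the square root of h.
        half_width = z * sigma * math.sqrt(h)
        forecasts.append((point, point - half_width, point + half_width))
    return forecasts


# Hypothetical, evenly spaced monthly cost index values (not real project data).
cost_index_history = [1.02, 0.99, 0.97, 0.98, 0.95, 0.96, 0.93, 0.94, 0.92, 0.91]

for step, (point, low, high) in enumerate(
    forecast_random_walk_with_drift(cost_index_history, horizon=4), start=1
):
    print(f"t+{step}: {point:.3f} (approx. 95% band {low:.3f} to {high:.3f})")
```

A straight-line projection would report only the point values. The band is the part that answers the second and third questions. A fuller ARIMA fit, such as R's `arima()` followed by `predict()`, would also estimate autoregressive and moving-average terms from the data.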
We Have No Empirical Data, Now What?
Here's a continuation of the Tweet stream
Have you ever been asked to estimate something and haven't got any empirical data? This happens all the time. New teams are put together in new domains and asked to estimate, which really means commit. I don't see too many managers gathering real data about their projects and using them to forecast lead times.
Here's the way to solve this non-problem, problem.
- Price Systems
- A Comparison of Parametric Software Estimation Models Using Real Data
- Parametric Estimating in the Knowledge Age (Young Kwak, was the editor of a book I contributed to)
- Software Cost Estimation: Parametric Models (DAU is the source of most things program management in the DOD. The course resource lists books and tools).
- Parametric Estimating Handbook (ICEAA is a professional cost estimation organization)
- QSM
- SEER
- Center for Systems and Software (This is my alma mater, MS Systems Management)
This is a short list, just from my office bookshelf. The office library has dozens of other books and the files have many dozens of recent papers on estimating software in the absence of empirical data. Google will find you hundreds more.
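As a rough illustration of what a parametric model looks like when you have no project history of your own, here is a sketch of Basic COCOMO (Boehm, 1981), one published model of this kind. Choosing it is my assumption for illustration. The sources above cover many models and the commercial tools use their own calibrations. The size range is hypothetical, and the function name `basic_cocomo` is mine. The coefficients are the published Basic COCOMO values, calibrated on other organizations' projects, which is exactly why it can be used before any local data exists.

```python
# Basic COCOMO (Boehm, 1981):
#   effort   = a * KLOC ** b      (person-months)
#   schedule = c * effort ** d    (calendar months)
COCOMO_MODES = {
    "organic":       (2.4, 1.05, 2.5, 0.38),
    "semi-detached": (3.0, 1.12, 2.5, 0.35),
    "embedded":      (3.6, 1.20, 2.5, 0.32),
}


def basic_cocomo(kloc, mode="organic"):
    """Return (effort in person-months, schedule in months) for a size in KLOC."""
    if kloc <= 0:
        raise ValueError("size must be positive")
    a, b, c, d = COCOMO_MODES[mode]
    effort = a * kloc ** b
    schedule = c * effort ** d
    return effort, schedule


# Hypothetical size range for a new product: the size itself is uncertain,
# so estimate over a low / likely / high range rather than a single number.
size_range_kloc = {"low": 20, "likely": 32, "high": 50}

for label, kloc in size_range_kloc.items():
    effort, schedule = basic_cocomo(kloc, mode="semi-detached")
    print(f"{label:>6}: {kloc} KLOC, {effort:.0f} person-months, {schedule:.1f} months")
```

The result is a range, not a commitment. As real performance data arrives from the project, it can replace or recalibrate the borrowed coefficients.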
So here's the final outcome. Whenever we hear about the reason we can't estimate, it's simply not true, never was true, never will be true.