In the #NoEstimates conversation, the term empirical data is used as a substitute for Not Estimating. This notion of No Estimates - that is making decisions (about the future) with No Estimates, is oxymoronic since gathering data and making decisions about the future from empirical data is actually estimating.
But that aside for the moment, the examples in the No Estimates community of empirical data are woefully inadequate for any credible decision making. Using 22 or so data samples with a ±30 variance to forecast future outcomes when spending other peoples money doesn't pass the smell test where I work.
Here's some sources of actual data for IT projects that can be used to build Reference Class that have better statistics.
- Nederlandse Software Metrieken Association
- International Software Benchmarking Standards Group
- Common Software Measurement International Consortium
The current issue of ORMS Today has resources as well ORMS can be obtained for free. There are several professional societies that provide guidance for estimating
Are two I participate in.
As well I have a colleague, Mario Vanhoucke, who speaks at our Earned Value Management conferences, whose graduate studies do research on project performance management. A recent paper, "Construction and Evaluation of Frameworks for Real Life Project Database," is a good source of how to apply empirical data to making estimates of outcomes in the future. Mario teaches Economics and Business Administration, at Ghent University and is a founder of OR-AS.
All of this is to say, using empirical data is necessary but not sufficient. Especially when the data being used if too small a sample size, statistically unstable, or at a minimum statistical broad variances. To be sufficient, we need a few more things:
- The correlations between the data samples as the evolve in time. This is Time Series Analysis.
- sample sizes sufficient to draw variances assessment of the future outcomes.
- A broader Reference Class basis, than just the small number of samples in the current work stream. These small samples can be useful IF the future work represents the same class of work. This would imply the project itself is straightforward, has little emergent risk (reducible or irreducible), and we're confident not much is going to change. Without those assumption the statistics from those 20 or so samples should not be used.
What's Next?
Starting with empirical samples to make estimates of future outcomes is call Estimating. Labeling it as No Estimates seems a bit odd at best.
With the basic understanding the empirical data is needed for any credible estimating process, look further into the principles and practices of probabilistic estimating for project work.
This, hopefully, will result in an understanding of sample size calculations to determine the confidence in the forecast as a start.