Bill Duncan posted a piece on PM Student titled Estimating Effort. It's a multipart series that addresses some of the issues around estimating cost and schedule in a project context. Multipart pieces are ineffective in the medium of a blog. But Project@Work did this all the time, so we'll have to wait for the next installment to discover the real meat. In the meantime, I've skipped to the end in the best Atkinson tradition.
In the first installment Bill states:
This is the case in most project management domains, especially IT. An estimate is defined as
The example of "guessing" the gender of a 6'6" person and a 5'2" person standing outside your door is slightly off, since the variability involved in estimating is not present in this example. The variability between two people standing outside the door in terms of gender is 0. Each person is either male or female.
This introduces a misunderstanding common to all naive estimating processes: confusing statistics with probability. Several posts in the past are background.
Understanding the difference between probability and statistics is critical to understanding how to make credible estimates of cost, schedule, technical performance, and finally project risk from these variables. What the project stakeholders want to know (or should want to know) is: What is the confidence that we'll come in on time and on budget, and that the project will produce working results (be they products or services)?
The primary difficulty comes with the confusion between statistics and probability. The questions that need answering in project work are:
- What are the underlying statistical behaviors in the estimate of cost, schedule, or technical performance?
- What is the confidence level of the estimate?
One question is statistical, one is probabilistic. Let's look at an example.
The real questions we need answered are:
- What is the probability that the cost of the project or any portion will be some value or less?
- What is the probability that the project will complete on or before some date?
The underlying variability of the schedule comes from the variability of the work efforts that make up the schedule. Here's a simple picture.
There are lots of subtle issues here about independence, interdependence, and other correlation behaviors in networks of activities whose behaviors are random variables. This, by the way, is why PERT estimates are consistently biased by as much as 27%.
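One source of that kind of bias can be shown with a small simulation. This is only a sketch under stated assumptions: three parallel paths that merge at a single milestone, each with a hypothetical right-skewed triangular duration (8, 10, and 18 days for low, most likely, and high), and all paths treated as independent. It does not reproduce the 27% figure; it only shows that the average finish of the merge point is later than the average of any single path, which a single-path calculation misses.

```python
import random
import statistics

def simulate_merge(n_paths=3, trials=20_000, seed=1):
    """Compare a single-path mean with the simulated mean finish of merging paths.

    All numbers are hypothetical and the paths are assumed independent.
    """
    rng = random.Random(seed)
    low, mode, high = 8.0, 10.0, 18.0  # assumed right-skewed estimate, in days

    # Mean of a triangular distribution: (low + mode + high) / 3.
    single_path_mean = (low + mode + high) / 3.0

    # The milestone is reached only when the slowest path finishes.
    finishes = [
        max(rng.triangular(low, high, mode) for _ in range(n_paths))
        for _ in range(trials)
    ]
    return single_path_mean, statistics.mean(finishes)

if __name__ == "__main__":
    expected, simulated = simulate_merge()
    print(f"single-path mean:      {expected:.1f} days")
    print(f"simulated merge mean:  {simulated:.1f} days")
```

Adding more parallel paths, or correlating them, changes the size of the gap, which is why the network structure matters as much as the individual estimates.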
So What's the Next Step in Building Credible Estimates?
- Understand the underlying statistical distributions of the random variables that make up cost and schedule. This means discovering their distribution functions. Symmetric functions are NEVER the case in project work. The distribution function ALWAYS has a tail to the right. If you don't know the underlying statistics, the best guess of a value can be no more than a 50/50 guess.
- Build a model that can answer the questions: What is the probability that the cost will be X or less? What is the probability that the project will complete on or before date Y?
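A minimal sketch of such a model is a Monte Carlo simulation. The task names, the three-point estimates, the triangular distribution, and the assumption that the tasks run in series and are independent are all hypothetical choices made for illustration; a real model would use distributions fitted to the project's own data and would account for correlation between activities.

```python
import bisect
import random

# Hypothetical three-point estimates in days: (low, most likely, high).
# Each has a longer tail to the right than to the left.
TASKS = {
    "design": (5, 8, 20),
    "build": (10, 15, 40),
    "test": (4, 6, 18),
}

def simulate_totals(tasks, trials=20_000, seed=7):
    """Return sorted total durations for a serial chain of independent tasks."""
    rng = random.Random(seed)
    totals = []
    for _ in range(trials):
        # random.triangular takes (low, high, mode), in that order.
        total = sum(
            rng.triangular(low, high, mode)
            for low, mode, high in tasks.values()
        )
        totals.append(total)
    totals.sort()
    return totals

def probability_at_or_below(sorted_totals, target):
    """Estimate P(total <= target) as the fraction of trials at or below target."""
    return bisect.bisect_right(sorted_totals, target) / len(sorted_totals)

def value_at_confidence(sorted_totals, confidence):
    """Return the duration that the given fraction of trials did not exceed."""
    index = min(len(sorted_totals) - 1, int(confidence * len(sorted_totals)))
    return sorted_totals[index]

if __name__ == "__main__":
    totals = simulate_totals(TASKS)
    most_likely_sum = sum(mode for _, mode, _ in TASKS.values())
    p = probability_at_or_below(totals, most_likely_sum)
    print(f"P(total <= {most_likely_sum} days) = {p:.2f}")
    print(f"80% confidence duration = {value_at_confidence(totals, 0.8):.1f} days")
```

The same structure works for cost by replacing durations with cost ranges. With right-skewed inputs like these, the sum of the "most likely" values lands well below the 50% point, which is the practical consequence of the long right tail.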