There are two general types of schedules in the planning domain.
- Deterministic schedules - are networks of tasks connected by dependencies that describe the work to be performed, that work's duration, and the planned completion of the project.
- Each task has a planned duration
- Each task has a predecessor and a successor. The only task that should not have a predecessor is the start of the project, and the only task without a successor is the completion of the project. This forms a closed network with no widows or orphans.
- The longest path through the network is the critical path
- The total duration of the project is a fixed value - it is deterministic
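The deterministic case above can be sketched in a few lines of code: a forward pass over the network computes each task's earliest finish, and the finish milestone's value is the fixed project duration. Task names and durations here are hypothetical.

```python
# A minimal sketch of a deterministic schedule: fixed durations, a closed
# network of predecessors, and a forward pass to get the project duration.
durations = {"start": 0, "design": 10, "build": 20, "docs": 5, "test": 8, "finish": 0}
predecessors = {
    "start": [],
    "design": ["start"],
    "build": ["design"],
    "docs": ["design"],
    "test": ["build"],
    "finish": ["test", "docs"],  # parallel paths join at the finish
}

earliest = {}  # memoized earliest-finish times

def earliest_finish(task):
    # Earliest finish = max earliest finish of predecessors + own duration.
    if task not in earliest:
        ef = max((earliest_finish(p) for p in predecessors[task]), default=0)
        earliest[task] = ef + durations[task]
    return earliest[task]

total = earliest_finish("finish")  # a single, fixed number - deterministic
```

Because every duration is a constant, running this twice always yields the same total; the longest path (start-design-build-test-finish) is the critical path.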
- Probabilistic schedules - are networks with all the elements of a deterministic plan, but the durations of the tasks are random variables
- The durations are not arbitrary numbers; they are random variables drawn from a probability distribution.
- Three point estimates "can" be used to describe these random variables. It is not necessary to use the three point estimate approach, but it is an easy starting point.
- The total duration of the project is a random number
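The probabilistic case can be sketched the same way, with each duration sampled from a triangular distribution defined by its three-point estimate. The task names and numbers below are hypothetical, and a serial chain keeps the sketch simple.

```python
import random

# Each task duration is a random variable drawn from a triangular
# distribution given by (low, most likely, high) - hypothetical values.
tasks = {
    "design": (8, 10, 16),
    "build": (15, 20, 35),
    "test": (6, 8, 14),
}

def sample_total():
    # Note: random.triangular takes arguments in the order (low, high, mode).
    return sum(random.triangular(lo, hi, mode) for lo, mode, hi in tasks.values())

# The project duration is now a random variable; each iteration produces
# a different total, and the collection of totals forms its distribution.
totals = [sample_total() for _ in range(10_000)]
mean_total = sum(totals) / len(totals)
```

The sum of the most-likely values (38 days here) is not the mean of the simulated totals, which is one of the first surprises of probabilistic scheduling.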
With that simple definition come several problems:
- How to capture the three point estimates?
- What probability density function should be used to model the values in between the three point estimates?
- What values do the three point estimates actually represent?
What Values do the Three Point Estimates Represent
This is a critical question. The first, naive approach is to say they are the "best case" and "worst case" values - that is, the 0% and 100% end points of a probability distribution. This forces the distribution to come from the family of bounded (finite-support) distributions. That is not really the problem, though. The problem is how these estimates are gathered.
If you ask someone to "tell me the worst case estimate," they will bias that estimate with a personal risk-avoidance factor. Field research has shown this to be the case. A better approach works like this:
- Tell me the Most Likely duration of this task. This means that if I performed this same work in the same project 10 times, this is the duration I would see "most often." It is the Mode of all the possible durations for the task. Notice this is NOT the Mean duration - the Mean is the average duration, while this is the "most likely" duration.
- Tell me the duration of the task that will be observed 1 time out of 10 if you perform this work 10 times. This is the 10% estimate. It says that for 10 executions of this task, 1 out of those 10 times the task will complete in this duration or less.
- Tell me the duration of the task that will be observed 9 times out of 10 if you perform this work 10 times. This is the 90% estimate. It says that for 10 executions of this task, 9 out of 10 times the task duration will be this value or less.
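These percentile interpretations can be checked by simulation. The sketch below assumes a task whose true distribution is triangular with hypothetical values low=5, most likely=10, high=20, and recovers the 10% and 90% values empirically: about 1 sample in 10 falls at or below the P10 value, and 9 in 10 at or below P90.

```python
import random
from statistics import quantiles

# Hypothetical task: triangular with low=5, mode=10, high=20.
# random.triangular's argument order is (low, high, mode).
samples = [random.triangular(5, 20, 10) for _ in range(100_000)]

# quantiles(..., n=10) returns the 9 decile cut points; the first is the
# 10th percentile and the last is the 90th percentile.
deciles = quantiles(samples, n=10)
p10, p90 = deciles[0], deciles[-1]
# p10 and p90 are exactly the values the interview questions above ask for.
```

Note that p10 and p90 are well inside the 5-to-20 range; the interview deliberately avoids asking for the extreme end points, which estimators bias.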
The picture below describes these cases for a Triangle distribution. A nice property of a symmetric Triangle distribution is that the Mean and the Mode coincide. This is NOT the case for a skewed Triangle, nor for other distributions, like Beta and BetaPERT.
With this information in hand (the three point estimates), the first inclination is to run the PERT tool in your favorite project management application. DO NOT do this. PERT has several significant problems. The first is it assumes the random variables are distributed under the Beta distribution. This may or may not be the case. Second, PERT assumes the Mean and Standard Deviation each have a single fixed formula: the Mean is (A + 4B + C) / 6 and the Standard Deviation is (C - A) / 6, where A is the optimistic estimate, B the most likely, and C the pessimistic. This is almost NEVER the case. It gets worse from there.
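A quick sketch shows how the fixed PERT formulas disagree with even a simple alternative distribution built from the same three points (the numbers are hypothetical):

```python
import random

# Hypothetical three-point estimates: A = optimistic, B = most likely,
# C = pessimistic.
A, B, C = 5, 10, 20

# The classic fixed-form PERT approximations.
pert_mean = (A + 4 * B + C) / 6   # assumes a particular Beta shape
pert_sd = (C - A) / 6             # assumes the range spans 6 sigma

# Simulate a triangular distribution over the same three points.
samples = [random.triangular(A, C, B) for _ in range(100_000)]
tri_mean = sum(samples) / len(samples)  # analytically (A + B + C) / 3

# The two means disagree; neither formula is "the" answer unless the
# underlying distribution really matches its assumptions.
```

For these numbers the PERT mean is about 10.8 days while the triangular mean is about 11.7 days, nearly a full day apart from the same three estimates.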
One very wicked problem is called merge bias. This is an unfavorable bias introduced into the duration calculation when parallel tasks are "joined" (usually at a milestone). A Google search for "PERT Merge Bias" will find all the information you can probably stomach for some time.
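Merge bias is easy to demonstrate by simulation. When two parallel paths join at a milestone, the milestone date is the maximum of the two path durations, and the mean of the maximum is later than either path's mean alone. The path durations below are hypothetical.

```python
import random

N = 100_000
# Two parallel paths with identical (hypothetical) triangular durations,
# joining at a milestone. random.triangular takes (low, high, mode).
path_a = [random.triangular(8, 16, 10) for _ in range(N)]
path_b = [random.triangular(8, 16, 10) for _ in range(N)]

mean_a = sum(path_a) / N
# The milestone occurs when BOTH paths are done: the max of the two.
merge = [max(a, b) for a, b in zip(path_a, path_b)]
mean_merge = sum(merge) / N
# mean_merge is noticeably later than mean_a, even though each path has
# the same mean - that gap is the merge bias a deterministic (or naive
# PERT) calculation misses entirely.
```

A deterministic network takes the max of the means and sees no slip; the simulation shows the milestone drifting right every time paths merge.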
Use Monte Carlo
Once you have the three point estimates, use a Monte Carlo Simulation (MCS) tool on the network. There are many of these: Risk+, @Risk for Project, and Crystal Ball are ones I've used. There are others, but make sure you know how to "trust" them by running a null network - Lo/Most Likely/Hi all set to the same value for 10,000 iterations - and see if there is any difference between the statistical network and the deterministic network.
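The null-network check can be sketched as follows: with Lo = Most Likely = Hi for every task, every one of the 10,000 iterations must reproduce the deterministic total exactly. The task values are hypothetical.

```python
import random

# Null network: all three points of every task set to the same value.
tasks = [(10, 10, 10), (20, 20, 20), (8, 8, 8)]  # (low, most likely, high)

def sample(lo, mode, hi):
    if lo == hi:
        return lo  # degenerate task: no uncertainty to sample
    return random.triangular(lo, hi, mode)

totals = [sum(sample(*t) for t in tasks) for _ in range(10_000)]
deterministic = sum(mode for _, mode, _ in tasks)

# Every iteration must equal the deterministic total; any spread here
# means the tool (or your model of it) cannot be trusted.
assert all(t == deterministic for t in totals)
```

The same idea applies to a commercial tool: feed it the degenerate inputs and verify its statistical answer collapses to the deterministic one before trusting anything else it reports.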
Once you get some experience with MCS you can start assessing the schedule in terms of probabilistic confidence. You're now doing probabilistic scheduling.
BTW, probabilistic networks have little use for Critical Paths. Since the duration of a task is a random variable following a known probability distribution, the critical path changes depending on the sampled value of this random variable. The MCS tools have ways of showing the sensitivity, criticality, and cruciality of the various paths through the network. But a single Critical Path is not there.
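One of those measures, a criticality index, can be sketched by counting how often each path is the longest across the simulation. The two paths and their durations below are hypothetical.

```python
import random

N = 10_000
wins = {"path_a": 0, "path_b": 0}
for _ in range(N):
    # Two hypothetical parallel paths to the same milestone;
    # random.triangular takes (low, high, mode).
    a = random.triangular(8, 20, 10)
    b = random.triangular(9, 15, 12)
    # Whichever path is longest on this iteration is "critical" this time.
    wins["path_a" if a >= b else "path_b"] += 1

# Criticality index: the fraction of iterations in which each path
# was the longest - a distribution of critical paths, not a single one.
criticality = {p: w / N for p, w in wins.items()}
```

Neither path wins every iteration, which is exactly why a single deterministic Critical Path disappears in a probabilistic network.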
This is about as detailed as it should get in a blog-level presentation. There is loads of detail that needs to be understood before probabilistic scheduling is used as a decision-making tool. But this is the way large aerospace and construction jobs are planned. This is the way modern projects are planned. There are pitfalls in this method, but the benefits far outweigh the little diversions along the way.