Had a conversation today with a colleague around probabilistic modeling of cost, schedule, and the impact on risk management. Lots of topics touched, but one was the concept of Merge Bias.
Merge Bias is the impact of parallel tasks on the probability of completing on or before a need date. I've talked in the past about Schedule Margin using Monte Carlo. The notion of a Deterministic versus Probabilistic Schedule is the starting point for Merge Bias.
In the management of projects, the important points along the path to completion are usually where several parallel paths come together. At these merge points, the paths must all be completed before a milestone can be achieved. An inspection is completed, sub-assemblies are integrated for testing, initial flight testing is finished - activities like these.
The Fallacy of the Critical Path and PERT Methods
There are several critical factors in the use of standard scheduling techniques.
- The project duration calculated by the critical path method is accurate only if everything goes according to plan. This is rare in real projects.
- In many cases the completion dates the critical path method produces are unrealistically optimistic and highly likely to be overrun, even if the schedule logic and duration estimates are accurately implemented.
- In almost all cases, the critical path method completion date is not even the most likely project completion date.
- The path identified as the critical path by traditional critical path method techniques may not be the one most likely to delay the project and most in need of management attention. This near-critical path is the "killer" path that static assessment methods do not show.
Extreme Parallel and Serial Task Example
Here's an example comparing a schedule where all the work is performed in series with one where all the work is performed in parallel. This extreme example shows that when project schedules contain merge points, there is a hidden effect - merge bias - that will produce unpleasant outcomes if we don't understand and deal with the probabilistic aspects of this type of structure.
In the first figure all the work is in series. We plan to finish on 1/25/2013. The work is a series of 10-day activities totaling 100 days of work.
When we run the Monte Carlo Simulation (MCS) using Risk+ and look at the final outcome, Series Tasks Complete, we get this curve.
This graph tells us there is a 55% probability of completing on or before our target date of 1/25/13. It is a cumulative probability curve over the 1,000 samples taken in the model. That's OK, since there is a 90% probability of completing on or before 1/30/13, 5 calendar days after our target date. The schedule has no slack, though - the need for margin in all schedules is another topic.
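To make the mechanics concrete, here is a minimal sketch of this kind of simulation in Python. It is not the Risk+ model behind the figures - the triangular duration ranges, the random seed, and the working-day framing are all assumptions for illustration, so the exact percentages will differ from those above.

```python
# Minimal sketch of a Monte Carlo simulation of the serial case:
# 10 activities, each planned at 10 days, for a 100-day deterministic finish.
# The task-level uncertainty is an assumption for illustration only.
import numpy as np

rng = np.random.default_rng(42)
n_samples = 1_000          # same order as the 1,000 samples in the post
n_tasks = 10
planned_total = 100.0      # 10 tasks x 10 days

# Assumed uncertainty: each 10-day task varies between 8 and 12 days,
# most likely 10 (a symmetric triangular distribution).
durations = rng.triangular(left=8, mode=10, right=12, size=(n_samples, n_tasks))

# In a serial network, the finish is simply the sum of the task durations.
totals = durations.sum(axis=1)

p_on_time = (totals <= planned_total).mean()
p80_finish = np.percentile(totals, 80)

print(f"P(finish <= {planned_total:.0f} days) = {p_on_time:.0%}")
print(f"80% confidence finish = {p80_finish:.1f} days")
```

With a roughly symmetric spread around the planned 10 days, the probability of finishing at or under 100 days lands near 50% - the same kind of coin-flip result the cumulative curve shows for the serial schedule.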
The extreme parallel example looks like this. Here there are two sets of parallel activities whose total durations sum to the same 100 days. How they do this is not the issue here - maybe fewer resources, maybe different work periods - but the parallel structure is the point of the model.
The MCS for this parallel case, over the same Period of Performance, looks like this. The target date of 1/25/13 is not even in the realm of possibilities. 1/31/13 has only a 5% probability. The 80% probability date - the common confidence level for a planned date - is 2/13/13, 3 weeks late. This lateness comes from the parallel structure of the network of activities.
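A sketch of the parallel case, using the same assumed task-level uncertainty as above, shows the merge bias directly. The milestone is reached only when both paths are done, so each sample takes the maximum of the two path finishes. The two-path structure and the distributions are assumptions for illustration, not the model behind the figures.

```python
# Minimal sketch of the parallel case: two paths, each planned at 100 days,
# merging at the finish milestone. The milestone date in each sample is the
# LATER (maximum) of the two path finishes - the source of merge bias.
import numpy as np

rng = np.random.default_rng(42)
n_samples = 1_000
n_tasks = 10
planned_total = 100.0

def path_finish():
    """Sample the finish (in days) of one path of ten 10-day tasks."""
    d = rng.triangular(left=8, mode=10, right=12, size=(n_samples, n_tasks))
    return d.sum(axis=1)

path_a = path_finish()
path_b = path_finish()

# Merge bias: the milestone is reached at the later of the two path finishes.
milestone = np.maximum(path_a, path_b)

for label, finishes in [("single path", path_a), ("merged milestone", milestone)]:
    p_on_time = (finishes <= planned_total).mean()
    p80 = np.percentile(finishes, 80)
    print(f"{label:17s}  P(on time) = {p_on_time:.0%}   80% finish = {p80:.1f} days")
```

Even though each path on its own has roughly a 50% chance of meeting the 100-day plan under these assumptions, the merged milestone has only about a 25% chance (roughly the product of the two path probabilities when the paths are independent), and the 80% confidence date moves later - the same direction of movement the Risk+ curve shows.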
This doesn't mean don't have parallel work. It means that when there is parallel work, standard PERT-style estimating is no good. It means:
- The network should not be developed at too high a level of detail; this creates, or may even mandate, parallel flows.
- The schedule should clearly show the parallel paths that could cause the project to be late if they are not coordinated.
- The network should not be developed in such great detail that it requires too much information and creates more opportunities for parallel flows.
Here's the killer concept that is little understood
When we think of statistical processes, say an estimate of cost, the variability of these numbers tends toward the mean. That is, a cost element that comes in higher than planned will be (or can be) offset by another that comes in lower than planned. This is the intuition behind the Central Limit Theorem - sets of random numbers tend toward the mean as the sample size grows large.
In a schedule, when the work is in parallel, this offsetting does not happen. When a work activity along one path experiences a higher than planned duration and an activity on another path experiences a lower than planned duration, it is the higher than planned duration that impacts the schedule. If both paths come in higher than planned, the highest one sets the result. If both come in lower, the higher of the two lower results still sets the result. The merge point is governed by the maximum of the joining paths, never by their average.
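A small numerical sketch makes the asymmetry visible. The distributions are again assumed for illustration only; the point is that sums (like independent cost elements) let overruns and underruns offset one another, while a merge point takes the maximum of the joining paths, so the bias only runs one way.

```python
# Illustrative contrast: quantities that ADD average out around the plan,
# while a merge point that takes the MAXIMUM is biased toward "late".
import numpy as np

rng = np.random.default_rng(7)
n = 100_000

# Two independent, symmetric estimates around a planned value of 10.
a = rng.triangular(8, 10, 12, size=n)
b = rng.triangular(8, 10, 12, size=n)

# Cost-like behavior: the quantities add, so high and low results offset.
print(f"mean of a + b      = {np.mean(a + b):.2f}   (plan: 20)")
# Merge-point behavior: the milestone takes the maximum, so it is biased late.
print(f"mean of max(a, b)  = {np.mean(np.maximum(a, b)):.2f}   (plan: 10)")
# For independent paths, P(milestone on time) = P(a on time) * P(b on time).
print(f"P(a <= 10)         = {np.mean(a <= 10):.0%}")
print(f"P(max(a,b) <= 10)  = {np.mean(np.maximum(a, b) <= 10):.0%}")
```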
Without a Monte Carlo Simulation of the network, there is no credible confidence of completing on or before a planned date, with a budget at or below the planned value.