Estimation is part of project management.
The most important estimates for the project manager are related to time and cost.
And remember your high school economics class: the value of something cannot be determined alone. You need to know the cost to acquire that value and the time when that value will be available for use.
Showing up late and over budget diminishes the value produced by the project.
Since it is easier to estimate small tasks, these estimates are often made as point estimates (for example, a task will take 3 days) or as two-point ranges (a task will take between 2 and 5 days).
When there are a number of tasks, each with possible dependencies on other tasks, there is a problem. You simply can't add up the durations and the ranges of those durations, since each task may or may not be dependent on other tasks, and when one task is being modeled at the low end of its range, another task may be modeled at the upper end of its range.
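A minimal sketch of why adding up the ranges misleads, assuming three hypothetical tasks performed in sequence, each drawn independently and uniformly from its own range (the task values and the uniform shape are illustrative assumptions, not data from a real project):

```python
import random

# Hypothetical (low, high) duration ranges in days for three tasks done in sequence.
TASK_RANGES = [(2, 5), (3, 8), (1, 4)]

def simulate_total(ranges, rng):
    """One trial: draw each task's duration independently and add them up."""
    return sum(rng.uniform(low, high) for low, high in ranges)

def percentile(sorted_values, fraction):
    """Simple rank-based percentile on an already sorted list."""
    index = min(len(sorted_values) - 1, int(fraction * len(sorted_values)))
    return sorted_values[index]

rng = random.Random(42)
totals = sorted(simulate_total(TASK_RANGES, rng) for _ in range(20_000))

naive_low = sum(low for low, _ in TASK_RANGES)     # every task at its best: 6 days
naive_high = sum(high for _, high in TASK_RANGES)  # every task at its worst: 17 days
p10, p90 = percentile(totals, 0.10), percentile(totals, 0.90)

print(f"naive range: {naive_low} to {naive_high} days")
print(f"simulated 10th to 90th percentile: {p10:.1f} to {p90:.1f} days")
```

The added-up extremes require every task to land on the same end of its range at once, which the simulation shows to be rare. The band holding most outcomes is much narrower than the naive range.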
When we hear the term Monte Carlo we think of the gambling center in the country of Monaco. The scientific study of probability concerns itself with the occurrence of random events and the characterization of those random happenings. Gambling casinos rely on probability to ensure, over the long run, that they are profitable.
For this to happen, the odds or chance of the casino winning has to be in its favor. This is where probability comes into play because the theory of probability provides a mathematical way to set the rules for each one of its games to make sure the odds are in its favor. As a simulation technique, Monte Carlo simulation relies on probability. [10] [18]
Monte Carlo simulation, also known as the Monte Carlo method, originated in the 1940s at Los Alamos National Laboratory. Physicists Stanislaw Ulam, Enrico Fermi, John von Neumann, and Nicholas Metropolis had to perform repeated simulations of their atomic physics models to understand how these models would behave given a large number of uncertain input variable values. As random samples of the input variables were chosen for each simulation run, a statistical description of the model output emerged that provided evidence as to how the real-world system would behave. That real-world system was the first atomic bomb. [13]
It is the repeated random sampling of the input variables over many simulation runs that defines Monte Carlo simulation. The result is an artificial world (model) that is meant to closely resemble the real world in all relevant aspects. [8]
So before we proceed, let's look at a definition of Monte Carlo Simulation in the project domain, so we don't have to decipher someone else's definition designed to obscure the actual definition.
A Monte Carlo Simulation is “a problem-solving technique used to approximate the probability of certain outcomes by running multiple trial runs, called simulations, using random variables.” ... It then simulates the completion of remaining work and produces a histogram showing the distribution of possible delivery dates.
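One way the "simulate the remaining work and draw a histogram" idea could look in code is sketched below. It assumes the forecast is made by resampling past sprint throughput. That is an assumption about how such a tool might work, and the history and backlog size are hypothetical:

```python
import random
from collections import Counter

# Hypothetical history: stories completed in each of the last eight sprints.
PAST_THROUGHPUT = [4, 6, 5, 3, 7, 5, 4, 6]
REMAINING_STORIES = 40  # hypothetical backlog size

def sprints_to_finish(remaining, history, rng):
    """One trial: replay randomly chosen past sprints until the backlog is empty."""
    sprints = 0
    while remaining > 0:
        remaining -= rng.choice(history)
        sprints += 1
    return sprints

rng = random.Random(7)
trials = [sprints_to_finish(REMAINING_STORIES, PAST_THROUGHPUT, rng) for _ in range(10_000)]

# Text histogram of the simulated delivery sprint.
histogram = Counter(trials)
for sprints in sorted(histogram):
    share = histogram[sprints] / len(trials)
    print(f"{sprints:2d} sprints  {share:6.1%}  {'#' * round(share * 50)}")
```

Each row is a possible delivery sprint, and the bar shows how often the simulation ended there.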
The Monte Carlo Method
Monte Carlo Simulation has four steps, no matter the domain or the problem:
- Define the distribution of possible inputs for each input random variable.
- Generate the inputs randomly from those distributions.
- Perform the deterministic computation using that set of inputs.
- Aggregate the results of the individual computations into a final result.
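The four steps above can be sketched in a few lines of Python. This is an illustration under assumed inputs, not a production tool: the `monte_carlo` helper, the cost model, and the distribution parameters are all hypothetical.

```python
import random
import statistics

def monte_carlo(input_samplers, model, trials, rng):
    """Run `model` on `trials` random input sets and summarize the outputs."""
    outputs = []
    for _ in range(trials):
        # Step 2: generate one random value per input from its distribution.
        inputs = {name: sampler(rng) for name, sampler in input_samplers.items()}
        # Step 3: run the deterministic computation on that set of inputs.
        outputs.append(model(inputs))
    # Step 4: aggregate the individual results into a final result.
    return {
        "mean": statistics.mean(outputs),
        "stdev": statistics.stdev(outputs),
        "min": min(outputs),
        "max": max(outputs),
    }

# Step 1: define the distribution of each input (hypothetical values).
samplers = {
    "effort_days": lambda rng: rng.triangular(8, 20, 10),  # low, high, most likely
    "daily_rate": lambda rng: rng.uniform(900, 1100),      # cost per day
}

def cost_model(inputs):
    """Deterministic model: cost is effort times rate."""
    return inputs["effort_days"] * inputs["daily_rate"]

summary = monte_carlo(samplers, cost_model, trials=10_000, rng=random.Random(1))
print(summary)
```

The model itself stays deterministic. All of the randomness enters through the inputs, which is why the same loop works in any domain.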
This is the simple but powerful process of Monte Carlo Simulation that is universally applicable from development projects to nuclear physics (my original domain) to molecular plant biological processes (our son's domain), to financial planning (our broker's domain), to the modeling of human behaviors of special needs clients (our daughter's domain).
This approach allows you to model systems in the future using past data OR using models of what the future might be if the system hasn't been done before.
The notion that Monte Carlo Simulation cannot be applied to a single project is simply wrong.
Let's look further.
What Is Monte Carlo Simulation of Projects
Monte Carlo Simulation started with Buffon's Needle Problem which says...
Let a needle of length L be thrown at random onto a horizontal plane ruled with parallel straight lines spaced by a distance d from each other, with d > L. What is the probability p that the needle will intersect one of those lines?
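Buffon's problem is easy to simulate. The sketch below throws virtual needles and compares the observed crossing frequency with the known closed-form answer p = 2L / (πd), which holds for d > L. The function name and the chosen L, d, and number of throws are illustrative assumptions.

```python
import math
import random

def buffon_needle(length, spacing, throws, rng):
    """Estimate the probability that a needle crosses a line (requires spacing > length)."""
    hits = 0
    for _ in range(throws):
        # Distance from the needle's center to the nearest line.
        center = rng.uniform(0, spacing / 2)
        # Acute angle between the needle and the lines.
        angle = rng.uniform(0, math.pi / 2)
        # The needle crosses when its half-projection reaches the line.
        if center <= (length / 2) * math.sin(angle):
            hits += 1
    return hits / throws

L, d = 1.0, 2.0
estimate = buffon_needle(L, d, throws=200_000, rng=random.Random(3))
exact = 2 * L / (math.pi * d)  # closed-form answer for d > L
print(f"simulated p = {estimate:.4f}, exact p = {exact:.4f}")
```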
Monte Carlo simulation on projects examines all paths through the network of activities or all possible states of the project for the duration, cost, and risk that create impacts on duration and cost.
It provides an accurate estimate (within the confidence intervals) for the overall duration of the project schedule for that work and the impact of risk on that cost and schedule.
It also provides a sensitivity analysis for all the interacting tasks.
Let's look at a notional project, where the tasks are interconnected and dependent, with predecessor and successor relationships, like the project below.
Each work activity in a discrete model will have an estimated duration - a scalar number, usually measured in days. Since all projects operate in the presence of uncertainty, this deterministic duration is not likely to have much credibility in actual use. For traditional projects, a Monte Carlo Simulation creates a list of durations from the Probability Distribution of a specific duration for a specific task.
This probability distribution can be built from past data for similar work, like the PDFs shown above, which have different shapes depending on the type of work. Using this past data is called Reference Class Forecasting [14]. Or the distribution can be pre-defined by a shape and an upper and lower range for that shape. In the simplest approach, when we know little about the past performance for the work, a Triangle distribution provides a confidence that isn't too optimistic on the lower bound or too pessimistic on the upper bound.
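Here is a small sketch of that idea, assuming a hypothetical four-task network with Triangle distributions given as (low, most likely, high) days. The task names, numbers, and the simple forward pass are illustrative. A commercial tool does far more.

```python
import random

# Hypothetical network: task -> (low, most likely, high) days and predecessors.
TASKS = {
    "design":    {"days": (4, 5, 9),   "preds": []},
    "build_hw":  {"days": (8, 10, 15), "preds": ["design"]},
    "build_sw":  {"days": (6, 9, 18),  "preds": ["design"]},
    "integrate": {"days": (3, 4, 8),   "preds": ["build_hw", "build_sw"]},
}

def simulate_finish(tasks, rng):
    """One trial: sample every duration, then do a forward pass through the network."""
    finish = {}
    for name, task in tasks.items():  # dict is listed in dependency order
        low, likely, high = task["days"]
        duration = rng.triangular(low, high, likely)
        start = max((finish[p] for p in task["preds"]), default=0.0)
        finish[name] = start + duration
    return finish["integrate"]

rng = random.Random(11)
finishes = sorted(simulate_finish(TASKS, rng) for _ in range(20_000))

deterministic = 5 + max(10, 9) + 4  # most likely values only: 19 days
mean_finish = sum(finishes) / len(finishes)
print(f"most-likely path: {deterministic} days, simulated mean: {mean_finish:.1f} days")
```

In this sketch the simulated mean lands later than the sum of the most likely values for two reasons. The ranges are skewed to the right, and the integration task has to wait for whichever parallel branch finishes last.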
When there is no past data available, a second approach can be used. This involves assigning ranked ranges around the most likely value for the variable. Here's an example for a spacecraft, with ranked ranges around the most likely value using a Triangle distribution.
In this case, the Business as Usual range is -5% for the low and +10% for the high around the most likely value for the duration, cost, or some technical performance parameter. Business as usual, but with some technical processes, gets a bit wider. Flight software is always an issue where we work, so those ranges are wider still. Putting all the components together into a working system is fraught with uncertainties, so a wider range is used. Getting the software qualified has about the same variability as getting it certified, so the same range is used. But the big problem comes when the spacecraft goes to the Thermal Vacuum chamber. That is modeled as -5% to +175%.
These assignments are made initially during the proposal, then updated monthly for the reality of the project's performance. The proposals I work usually require an 80% confidence of on-or-before for schedule and at-or-below for cost. Monte Carlo Simulation tools are the heart of this work.
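A rough sketch of how ranked ranges turn into an 80% confidence number follows. Only the two ranges stated above (-5%/+10% and -5%/+175%) are used. The activities, their most likely durations, and the assumption that they run in sequence are hypothetical.

```python
import random

# Percentage ranges around the most likely value, taken from the text above.
RANGES = {
    "business_as_usual": (-0.05, 0.10),
    "thermal_vacuum":    (-0.05, 1.75),
}

# Hypothetical activities in sequence: (most likely days, ranking class).
ACTIVITIES = [(20, "business_as_usual"), (30, "business_as_usual"), (15, "thermal_vacuum")]

def sample_total(activities, rng):
    """One trial: draw each activity from a triangle built from its ranking class."""
    total = 0.0
    for likely, ranking in activities:
        low_pct, high_pct = RANGES[ranking]
        total += rng.triangular(likely * (1 + low_pct), likely * (1 + high_pct), likely)
    return total

rng = random.Random(80)
totals = sorted(sample_total(ACTIVITIES, rng) for _ in range(20_000))
p80 = totals[int(0.80 * len(totals))]  # 80% of trials finish on or before this
most_likely = sum(likely for likely, _ in ACTIVITIES)  # 65 days
print(f"most likely total: {most_likely} days, P80: {p80:.1f} days")
```

The gap between the most likely total and the P80 value is one way to size the schedule margin, and in this sketch most of it comes from the single wide Thermal Vacuum range.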
This is a Closed Loop Control System for managing the performance of a Software Intensive System of Systems, all developed using Scrum.
The triangular distribution can be used when we have no idea what the distribution is, but we have some idea of the minimum value for the variable, the maximum value, and what we think the most likely value is. The Triangle distribution is a good place to start since, when skewed to the right, it roughly approximates the log-normal distribution, which is found in many naturally occurring processes. The uncertainties in the work effort on projects are a naturally occurring process derived from the aleatory uncertainty of humans working on technical processes.
When you run the Monte Carlo Simulation tool (Risky Project in this case), you get a chart that looks like this. This chart is the result of the MCS for a complete project, that is for the end of the project. A similar chart can be produced for any specific task in the project with the same results.
The chart shows the probability distribution of the completion dates for the project, when the durations for all the work on the project are defined by the most likely value, the upper and lower limits of those durations, and the PDF for the curve that the Monte Carlo tool is going to produce samples from. This is usually a Triangle curve for me, since it gives us credible values with the least amount of work.
Let's Look At Some Myths of Monte Carlo Simulation
Here are some common myths, misunderstandings, and willful ignorance of the use of probability and statistics, which are the basis of Monte Carlo Simulation, when making informed decisions about how to manage a software project.
Statistics don't apply to single events. Stats don't make sense in a single event
In the presence of uncertainty, for a single event in the future, like the delivery date or the cost to deliver that outcome, Monte Carlo Simulation is THE tool to use to develop a confidence and accuracy model of that future event. All that is needed is to know...
- The Most Likely value that event could take on. If you have no idea what that most likely value might be, there are several ways to come up with that answer, starting with wideband Delphi. Here's an example of How To Estimate Almost Any Software Deliverable in 90 Seconds.
- Then use a simple Monte Carlo tool that you can find on the web for Excel. RiskAmp is one I like.
- Then you'll be able to show the probability of occurrence for the variable's range of possible values and be able to debunk that statement, since Statistics DOES apply to a single event when you can answer the question:
What's the probability of completing this work on or before the due date, given I know something about what the work entails, what the dependencies of the work are, and what the ranges are of the random variables that drive the work?
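That question can be answered directly from the simulated outcomes. The sketch below assumes two hypothetical tasks in sequence with Triangle distributions. The `probability_on_or_before` helper is an illustrative name, not part of any tool.

```python
import random

def probability_on_or_before(samples, due):
    """Fraction of simulated outcomes that finish on or before `due`."""
    return sum(1 for value in samples if value <= due) / len(samples)

# Hypothetical work: two tasks in sequence, each with low, high, most likely days.
rng = random.Random(5)
samples = [
    rng.triangular(8, 16, 10) + rng.triangular(4, 12, 5)
    for _ in range(20_000)
]

for due in (15, 18, 21, 24):
    print(f"P(finish within {due} days) = {probability_on_or_before(samples, due):.0%}")
```

The work is performed once, but the probability describes our uncertainty about that one outcome, which is exactly what is needed to pick a due date with a stated confidence.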
If you don't know the answers to these broad statistical questions, your project is doomed before you start, assuming you actually have a due date, a not-to-exceed budget, and a minimum set of capabilities for that time or budget. If you don't have those, we'd call that a de minimis project, meaning no one cares when you show up, how much it costs, or what you'll deliver. Nice work if you can get it.
Here's an example from an actual project: the Wright Brothers' Army contract for the Wright Flyer.
Here's their schedule (this is the MSFT Project version, but we had access to the Smithsonian archives and recreated this schedule from their notebooks).
From the notebooks, they made estimates for the reducible and irreducible uncertainties to be assured they could meet the contractual dates.
Using an estimating technique of their own, we've recreated a Monte Carlo Simulation of the cost and schedule targets in the contract. Here's the cost model, given the schedule and the cost-loaded activities, with a single upper/lower range of -10%/+20% (we were lazy). With that you get a confidence (an 80% probability, referred to as a P80) that the work will come in under $20,000, with some assumptions shown in the picture below.
Orville and Wilbur needed to show up on time as well as on budget. They had to deliver the Flyer on or before September 30, 1908. So they needed a schedule with enough margin to protect that date. The Monte Carlo Simulation of the MSFT Project schedule, taken from the work in the notebooks, shows they understood the notion of schedule margin. They also understood the principles of Systems Engineering. Here's a paper (you need an INCOSE membership) on "The Concepts of Systems Engineering as Practiced by the Wright Brothers."
So when you hear that statistics can't be applied to a single project, it is simply NOT true.
Here's some more background, an Overview of the Wright Brothers Innovation Process, that debunks the myth that only in the modern world are there innovators.
Monte Carlo and Agile Development
There are a number of Monte Carlo Simulation tools for agile software development when you don't have an Integrated Master Schedule with planned durations and ranges of values.
- Start with Troy's book below and download the Excel simulator [12]. There is a download section at the website
- Jira has a number of plugins for Monte Carlo Simulation. My favorite is Agile Monte Carlo. But there are others in the Jira Marketplace. If you set up Jira properly and capture the estimating data, from Product Backlog, Release Plans, and Tee Shirt Sizes to Hours planned for development and Hours actually performed for development, this tool will provide you an Estimate to Complete using Monte Carlo. Now if you don't use Jira properly, well, then you get what you deserve.
- VersionOne has a portfolio item Monte Carlo Simulation dashboard. Again, you've got to use the tool properly to get any value out of this dashboard.
There are several issues with applying Monte Carlo Simulation to Agile projects.
- First, there is no schedule in the sense of a Gantt chart, with tasks arranged in the sequence needed. There is work, contained in Sprints, but those are not the same.
- So what is it that the Monte Carlo Simulator is simulating?
Here are some classic estimating fallacies.
My proposal is don't estimate. Stipulate. If I say that Feature A will be available in 1 month. I'll make sure that I have a working version of it in 1 week. So that 1-month from now, I'll have something at least. Then slice appropriately.
How does this person know they can get the work done in one week, one month, 6 months? Are there any uncertainties involved? With no model of the work, no model of the productivity needed to produce the outcomes, and no model of the uncertainties and the impact of those uncertainties, that statement is simply nonsense.
“80% confidence of on-or-before” is a meaningless term for a single project. It means “If we carry out this exact project a statistically significant number of times (>20 say) then 80% of those will be within this date.” But we will carry it out exactly once. Ever. Statistics matters.
... any Monte Carlo simulation has parameters that are guesses, with probability distributions that are more guesses.
Yes, statistics matters. Monte Carlo Simulation can be applied to work that has not yet been performed if you have some sense of the most likely effort and the range of possible efforts. If you don't have either, why are you spending your customer's money, unless it is to generate those values?
If you're guessing, I'd suggest that those paying you hired the wrong person to spend their money. There are numerous databases with reference class data for most of the software problems on the planet. Now you may have to pay to get access, but don't guess. Learn how to make informed decisions with good estimating processes.
Monte Carlo Simulation is a good approach to estimating in the presence of uncertainty.
From the same author, here's another
Monte Carlo predicts a probability distribution for a number of future trials. We are using it to estimate the result of a single trial.
That is not how Monte Carlo Simulation is used. MCS provides a probability distribution of the occurrence of all the possible outcomes from the model. In project work, this model is a network of activities, with durations and upper and lower limits on those durations. Then MCS can tell you what outcomes have what probability of occurrence.
Like the chart above produced by Risky Project, there is a 52% chance the cost will be less than 390.15 thousand dollars, or there is a 51% chance the duration will be less than 5 days. The Risky Project tool predicts these values from a number of samples of the work that drives that outcome. The notion of a single trial is NOT how Monte Carlo Simulation works, nor could it work that way.
Another example of misunderstanding how the tool works, either because of lack of knowledge and experience or willful ignorance.
Bibliography
"Knowledge is of two kinds. We know a subject ourselves, or we know where we can find information upon it. When we enquire into any subject, the first thing we have to do is to know what books have treated of it. This leads us to look at catalogues, and at the backs of books in libraries."
— Samuel Johnson (Boswell's Life of Johnson)
[1] "Examining the Value of Monte Carlo Simulation for Project Time Management," Goran Avlijas. Management: Journal of Sustainable Business and Management Solutions in Emerging Economies
[2] "Introduction To Monte Carlo Simulation," Robert L. Harrison, AIP Conference Proceedings, January 5, 2010, 1204, pp. 17-21.
[3] "Adding Probability to Your 'Swiss Army Knife'," John Goodpasture, Proceedings of the 30th Annual Project Management Institute 1999 Seminars & Symposium, October 1999.
[4] "Monte Carlo for Newbies," Simon Leger, QuantLabs
[5] "Monte Carlo Methods for Absolute Beginners," Christophe Andrieu, Advanced Lectures on Machine Learning, 2003
[6] "The Monte Carlo Method," Nicholas Metropolis and Stan Ulam, Journal of the American Statistical Association, Vol. 44, No. 247, September 1949, pp. 335 - 341
[7] "Fuzzy Monte Carlo Simulation and Risk Assessment in Construction," N. Sadeghi, A. R. Fayek, and W. Pedrycz, Computer-Aided Civil and Infrastructure Engineering, 25 (2010) 238–252
[8] "The Principles of Monte Carlo Simulation," University of Alberta
- Lecture One: Overview
- Lecture Two: Probability Distributions
- Lecture Three: Statistical Models and Stationarity
- Lecture Four: Monte Carlo Simulation
- Lecture Five: Dependence and Multivariable Distributions
- Lecture Six: Problem Formulations, Implementation Details, and Validation
- Lecture Seven: Transfer of Uncertainty
- Lecture Eight: Decision Making
[9] Essentials of Monte Carlo Simulation: Statistical Methods for Building Simulation Models, Nick T. Thomopoulos, Springer, 2013
[10] Modeling and Simulation Fundamentals Theoretical Underpinnings and Practical Domains, edited by John A. Sokolowski and Catherine M. Banks, John Wiley & Sons, 2010.
[11] Monte Carlo Methods, Second, Revised and Enlarged Edition, Malvin H. Kalos and Paula A. Whitlock, Wiley-Blackwell, 2008
[12] Forecasting and Simulating Software Development Projects: Effective Modeling of Kanban & Scrum Projects using Monte-carlo Simulation, Troy Magennis, www.focusedobjectives.com
[13] "Stan Ulam, John Von Neumann, and the Monte Carlo Method," Roger Eckhardt, Los Alamos Science, Special Issue, 1987, pp. 131-143.
[14] "Reference Class Forecasting: Resolving Its Challenge to Statistical Modeling," Robert F. Bordley, American Statistical Association November 2014, Vol. 68, No. 4.
[15] "Introduction to Monte Carlo, Astro 542," Princeton University, Shirley Ho.
[16] "Monte Carlo Methods," Dirk P. Kroese, The University of Queensland and Reuven Y. Rubinstein, Technion, Israel Institute of Technology
[17] Evidence-Based Software Engineering: Based on the Publicly Available Data, Derek M. Jones
[18] "Calibration and Validation of the SAGE Software Cost/Schedule Estimating System to United States Air Force Databases," David B. Marzo, Captain, USAF, AFIT/GCA/LAS/97S-6
[19] "Estimating Total Program Cost of a Long-Term, High-Technology, High-Risk Project with Task Duration and Cost That May Increase Over Time," Maj Roger T. Grose and Dr. Robert A. Koyak, Military Operations Research, V11, N4, 2006.
[20] "Parametric Quality Metrics for Evolutionary Software Development Models," Walker Royce, TRW Space and Defense Sector, Redondo Beach California.
[21] "Empirical Cost Estimating Tool," Dr. Johnathan Mun and Dr. Thomas Housel, Naval Postgraduate School, 17 October 2016.
[22] "Agile Estimation with Monte Carlo Simulation," Juanjuan Zang,