This is a quote from Twitter today:
Various projective practices upon trending have been used to forecast progress ... These have proven useful. However, these do not replace the importance of empiricism. In complex environments, what will happen is unknown. Only when what has already happened may be used for forward-looking decision-making.
This, of course, would mean that climate models, compressible fluid flows, stock markets, automatic landing systems, fracture mechanics, and any closed-loop process with stationary or non-stationary stochastic behaviors would NOT be possible. Here are two domains I'm intimately familiar with from past lives:
- Multidimensional integration of statistical mechanics processes in physics.
- Simulation of stochastic phenomena - adaptive flight controls with six-degree-of-freedom (6DoF) models in the presence of turbulent environments.
The original author of that opening quote may not be familiar with either of these or with the general principles of modeling stochastic processes. One of the current non-stationary stochastic processes I work with is Software Intensive System of Systems (SISoS).
Here are the elements of this approach that are applicable to all stochastic domains, including SISoS projects.
- Random variables - these come from the underlying processes of writing software for money. These variables fall into three major classes:
- Cost
- Schedule
- Technical Performance
- Our observations of these random variables are impacted by uncertainties, and uncertainties come in three forms:
- Aleatory Uncertainty (alea is Latin for a single die) - the tossing of dice is an aleatory process. The duration of software development work is an aleatory uncertainty process. The only protection from the impacts of aleatory uncertainty is margin.
- Epistemic Uncertainty (from the Greek ἐπιστήμη, knowledge) - a lack of knowledge about some variable under observation. This uncertainty can be addressed by gathering more knowledge: build a prototype, run a test, buy two in case one breaks, sample the produced work on fine-grained boundaries to see if we're making progress to plan - Scrum.
- Ontological Uncertainty - this is uncertainty about things we can know nothing about. The unknowable.
- Probability density function - defines the probability, p(x), that when the variable is sampled, the value x will be the result. For a large number of samples, p(x) ≈ (number of samples with result x) / (total number of samples).
- The probability that the reading of a single die is a 1, p(1), is 1/6.
- The Expectation (Mean) of a random variable is E[X] = Σ x · p(x), summed over all possible values of x. For a fair die this is (1 + 2 + 3 + 4 + 5 + 6)/6 = 3.5. The short sketch after this list illustrates both the empirical p(x) and the mean.
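Here's a quick sketch in Python, purely as an illustration, that checks both the empirical p(x) and the mean for a fair die by sampling:

```python
import random
from collections import Counter

N = 100_000                                    # number of samples
rolls = [random.randint(1, 6) for _ in range(N)]

counts = Counter(rolls)
for face in range(1, 7):
    # empirical p(x) = samples with result x / total samples; approaches 1/6
    print(f"p({face}) ~ {counts[face] / N:.3f}")

# empirical mean approaches E[X] = (1 + 2 + ... + 6) / 6 = 3.5
print(f"mean ~ {sum(rolls) / N:.3f}")
```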
There is more about these random variables in any good probability and statistics book, which any competent developer wishing to learn can find at the bookstore.
We can simulate a simple random process, like a drunkard walking down the street, with a simple piece of code and a plot of its output.
This simple, even simple-minded, example models future behavior from a model of the possible behaviors. No past empirical data is needed in this example.
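A minimal Python sketch of such a walk, assuming a symmetric one-unit step left or right at each time period, might look like this:

```python
import random
import matplotlib.pyplot as plt

STEPS = 1000
position = 0
path = [position]

# each step the drunkard lurches one unit left or right with equal probability
for _ in range(STEPS):
    position += random.choice((-1, 1))
    path.append(position)

plt.plot(path)
plt.xlabel("step")
plt.ylabel("distance from the lamp post")
plt.title("One realization of a drunkard's walk")
plt.show()
```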
There are other examples where empirical data is very useful.
If we have a model we'd like to use to explore what the outcomes of that model could be, having data from the past is useful. Climate and weather models are a good example. I live in the town where those models are created. NOAA and NCAR are climate and weather centers, and the models run on a supercomputer in Wyoming at the Computational and Information Systems Laboratory. It's a very cool place, literally, since the cooling is provided by the ambient outside temperature of Cheyenne, Wyoming. Several of my neighbors work there, and our evenings at the club sometimes revolve around comparing modeling techniques between climate and complex projects with emerging requirements and semi-chaotic processes.
Those models use past performance of the weather and climate to validate the model in its work of forecasting, predicting, and estimating future performance of the climate and weather. By the way, anyone claiming you can't predict the weather more than a few days ahead needs to do their homework on the NCAR and NOAA sites to learn that this is simply a naive understanding of how weather and climate are forecast. This is a common excuse used by anti-estimating advocates to avoid learning how to estimate.
So Now Modeling Projects
If we want to model a project to assess the probability of completing on or before a needed date, at or below a needed cost, and of showing up with the needed features, we need a model. That model can most certainly be informed by past performance.
This is called Reference Class Forecasting.
One of the originators of Reference Class Forecasting has a nice paper, From Nobel Prize to Project Management. May I strongly suggest that anyone conjecturing you can't estimate future outcomes of projects read that paper, as well as many of the other papers and books found here for Estimating Software Intensive System of Systems. And if you come across someone, like the original poster at the beginning of this blog, who can't cite 6 to 10 of those papers, you'll then know that person is uninformed (possibly willfully) about how estimating is used to manage software projects in the presence of uncertainty.
For traditional projects, estimating future cost, schedule, and technical performance is standard practice, and in some of the domains where I work it is a mandated standard practice. Doing the same for Agile projects is straightforward as well. Many of the papers in the references just above speak directly to how to do that.
Simply Google monte carlo simulation of agile software projects and you'll be on your way to understanding how this is done. You'll also be on your way to learning that those claiming it can't be done have not done their assigned homework, and their advice can't be considered credible without a basis of understanding beyond personal anecdotes showing they could not make it work.
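As a sketch of what such a simulation can look like (the backlog size, historical sprint velocities, and sprint length below are made-up numbers for illustration only, not data from any real project):

```python
import random

# illustrative inputs - replace with your project's past performance (the reference class)
backlog_points = 400                                       # remaining work in story points
observed_velocities = [18, 22, 25, 17, 30, 21, 19, 24]     # points completed in past sprints
sprint_length_weeks = 2
TRIALS = 10_000

sprints_needed = []
for _ in range(TRIALS):
    remaining, sprints = backlog_points, 0
    while remaining > 0:
        # draw each future sprint's velocity from the observed history (bootstrap resampling)
        remaining -= random.choice(observed_velocities)
        sprints += 1
    sprints_needed.append(sprints)

sprints_needed.sort()
p50 = sprints_needed[int(0.50 * TRIALS)]
p80 = sprints_needed[int(0.80 * TRIALS)]
print(f"50% confidence: {p50} sprints ({p50 * sprint_length_weeks} weeks)")
print(f"80% confidence: {p80} sprints ({p80 * sprint_length_weeks} weeks)")
print(f"schedule margin for aleatory uncertainty: {(p80 - p50) * sprint_length_weeks} weeks")
```

The spread between the 50% and 80% confidence completion dates is one way to size the schedule margin that protects against the aleatory uncertainty described above.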