In a recent presentation, Tim speaks further about managing in the presence of uncertainty and the application of agile in software development. Plans never go exactly right, and planning in the presence of uncertainty requires - demands, actually - estimating the risks, uncertainties, and unknowns on the project. Yet we still hear about making decisions in the absence of estimates of the probability of the drivers, impacts, and outcomes. As Tim says, this is a crock, just as making decisions in the absence of estimating is a crock. Ignoring the probabilistic behaviour of impacts from the future - the core of microeconomics - is, as Tim suggests, childish behaviour.

Managing in the Presence of Uncertainty

Here's an extract from a much larger briefing on managing complex, software-intensive systems in our domain - Enterprise IT. The critical issue here is that uncertainty is always present. Failure to recognize this, failure to deal with it, failure to make decisions based on the underlying statistical and probabilistic aspects of this uncertainty is, as Tim suggests, childish.

If we're looking for where we need estimates, look for the word potential - potential being something that might occur in the future. Potential cost, potential schedule, potential event.

Effective risk management - and therefore effective project management and effective value delivery - requires navigating through this causal chain, assessing the current potential for loss, and implementing strategies for minimizing the potential for loss. The next section builds on the concepts in this section by examining two fundamental approaches for analyzing risk.

There remain serious misunderstandings of how, why, when, and for what purpose estimates of cost, schedule, and delivered capabilities are made in the development of software systems using other people's money.

There are three distinct approaches to the problem:

Quantifying IT Forecast Quality, J. L. Eveleens and C. Verhoef, Science of Computer Programming, Volume 74, Issues 11–12, November 2009, Pages 934–988.

The first paper shows the self-selected projects and how they have completed - for the most part - longer than the ideal initial estimates. These estimates are not calibrated, meaning they are not assessed for credibility, error bands, or confidence. The paper notes that the solid line, initial versus actual, is the ideal line where actuals meet the estimated value. In any stochastic estimating process, it is unlikely any estimate will match the actual, for the very simple reason that the work processes are random, and when the estimates don't contain probabilistic confidence intervals, the actual must be different from the estimate.

As well, no root cause for the unfavorable performance of the actuals compared to the initial estimates is provided. This is a core failure to understand the process of estimating, root cause analysis, and the discovery of the corrective actions needed to improve both the estimating processes and project performance management.

This fundamental failure is not limited to the self-selected set of projects in the paper. This failure mode can be found in a wide variety of project domains, in and out of the software business.

The second paper speaks to the major flaws of the Standish Report: meaningless figures, self-selected samples, perverted accuracy, unrealistic rates, and misleading definitions. The paper states the root causes and suggests corrective actions.

The third paper shows how to quantify IT forecasts (estimates of future outcomes) in a mathematically sound manner.

Software Cost Estimation and Sizing Methods, a more in-depth report on the issues, root causes, and corrective actions, is a good starting point for further understanding. There are numerous other reports, guides, assessments, and corrective actions.

All five papers are useful in the right context. Little re-introduces Boehm's cone of uncertainty, the assessment of Standish shows the traps that are easily fallen into when good statistical practices are not followed, the third provides the mathematical foundation for restoring those sound practices, and the RAND report shows the mechanics of the corrective actions to restore credibility in software estimating.

A risk-based view of the estimating problem was developed for the recent successful launch and recovery of Orion, then called the Crew Exploration Vehicle.

In Twitter discussions and email exchanges there is a notion of populist books versus technical books used to address issues and problems encountered in our project management domains. My recent book Performance-Based Project Management® is a populist book. There are principles, practices, and processes in the book that can be put to use on real projects, but very few equations and numbers. It's mostly narrative about increasing the probability of project success. But the means to calculate that probability, based on other numbers, processes, and systems, is not there. That's the realm of technical books and journal papers.

The content of the book was developed with the help of editors at the American Management Association, the publisher. The Acquisition Editor contacted me about writing a book for the customers of AMA. He explained up front that AMA is in the money-making business of selling books. And that although I may have many good ideas, even ideas that people might want to read about, it's an AMA book and I'd be getting lots of help developing those ideas into a book that will make money for AMA.

The distinction between a populist book and a technical book is the difference between a book that addresses a broad audience with a general approach to the topic and a deep-dive book focused on a narrow audience.

But one other distinction is that for most of the technical approaches, some form of calculation takes place to support the materials found in the populist material. One simple example is estimating. There are estimating articles and some books that lay out the principles of estimates. We have those in our domain in the form of guidelines and a few texts. But to calculate the Estimate To Complete in a statistically sound manner, technical knowledge and the underlying mathematics of non-linear, non-stationary, stochastic processes (Monte Carlo Simulation of the project's work structure) are needed.
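As a sketch of what such a calculation looks like, here is a minimal Monte Carlo Estimate To Complete, assuming the remaining work can be modeled as serial tasks with triangular (low, most-likely, high) duration distributions. The task values are illustrative, not from any real project:

```python
import random

def simulate_etc(tasks, trials=10000, seed=1):
    """Monte Carlo Estimate To Complete: sample each remaining task's
    (low, most-likely, high) duration many times and report the
    percentiles of the total serial duration."""
    rng = random.Random(seed)
    totals = []
    for _ in range(trials):
        totals.append(sum(rng.triangular(lo, hi, mode)
                          for lo, mode, hi in tasks))
    totals.sort()
    pct = lambda p: totals[int(p * trials)]
    return {"p10": pct(0.10), "p50": pct(0.50), "p80": pct(0.80)}

# Hypothetical remaining work: (low, most-likely, high) days per task
remaining = [(3, 5, 10), (2, 4, 9), (5, 8, 15)]
etc = simulate_etc(remaining)
```

The 80th percentile is an estimate with a stated confidence level, which a single point number can never provide.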

Two examples of populist versus technical

Two from my past, two from my current work.

These two books are about the same topic: General Relativity and its description of the shape of our universe. One is a best-selling popularization of the topic, found in many home libraries of those interested in this fascinating subject. The one on the left is on my shelf from a graduate school course on General Relativity, along with Misner, Thorne, and Wheeler's Gravitation.

Dense is an understatement for the math and the results of the book on the left. So if you want to calculate something about a rapidly spinning Black Hole, you're going to need that book. The book on the right will talk about those Black Holes in non-mathematical terms, but no numbers come out of that description.

The book on the left is about probabilistic processes in everyday life that we misunderstand or are biased to misunderstand. The many cognitive biases we use to convince ourselves we are making the right decisions on projects are illustrated through nice charts and graphs.

We use the book on the left in our work with the non-stationary stochastic processes of complex project cost and schedule modeling. Making these decisions is critical to quantifying how technical and economic risks may affect a system's cost. This book is a treatment of how probability methods are applied to model, measure, and manage risk, schedule, and cost engineering for advanced systems. Garvey shows how to construct models, do the calculations, and make decisions with these calculations.

Here's The Point - Finally

If you come across a suggestion that decisions can be made in the absence of knowing anything about the future numbers or about actually doing the math, put that suggestion in the class of populist descriptions of a complex topic.

If you can't calculate something, then you can't make a decision based on the evidence represented by numbers. If you can't decide based on the math, then the only way left is to decide on intuition, hunches, opinion, or some other seriously flawed non-analytical basis.

Just a reminder from Mr. Deming, stated in yesterday's post:

If it's not your money, there's likely an expectation that those providing the money are interested in the calculations needed to make those decisions.

In the mathematics of physics, there are two essential types of values in all calculations - Scalars and Vectors.

Scalars are isolated values, with no outside context. Indeed, they remain the same regardless of any context. A common example would be mass. An object has a mass of 1 kilogram no matter where it is, or how much physical space it occupies. The context of the object cannot change the scalar value of its mass.

The number of stories produced in the last iteration.

The number of database rows selected on average for a transaction.

The number of defects found in this release to production.

Vectors are context-dependent values, and can change depending on that context. An object has weight, dependent on both its mass value and the gravity context of the object. An object with high mass may still have no weight in the corresponding gravity context.

The change in the productivity of value as a function of time and cost as the project moves through its maturation process.

The estimated productivity needed to complete the project on or before a need date, at or below the planned budget to achieve the needed Return on Investment on the need date, so the break even date can be announced to those paying for our work.

But most specifically, vector values allow the calculation of change over time. Numbers (scalars) without context (vectors) are not metrics.

It is meaningless to say that the cost to operate the IT Service Desk has doubled within the last ten years, without also showing how the number of employees has tripled in the same time.
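A minimal sketch of turning that scalar into a metric with context, using the service-desk example above. The dollar and headcount figures are hypothetical, chosen only to match "cost doubled, staff tripled":

```python
# Scalar: raw service-desk cost. Vector: that cost in its staffing context.
cost = {"2004": 1_000_000, "2014": 2_000_000}   # cost doubled
staff = {"2004": 100, "2014": 300}              # headcount tripled

# Normalize the scalar by its context to get a meaningful metric
cost_per_employee = {yr: cost[yr] / staff[yr] for yr in cost}
```

The scalar alone (cost doubled) suggests a problem; the contexted value (cost per employee fell by a third) tells the opposite story.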

It is meaningless to say a self-selected 120 projects exceeded their estimated cost and duration without an assessment of the credibility of that original estimate and the determination of the Root Cause of that overage.

It is meaningless to say the end date and cost can be forecast without saying something about the underlying uncertainties in effort size, risk, inter-dependencies, changing requirements, defect rates, labor absorption rates, integration issues, performance issues, and the complexities of emerging behaviours once the system starts to come together and is applied to fulfill the needed capabilities.

When we hear about small-sample-size forecasts of same-sized work activities, or selecting the next priority item to work on without considering the inter-dependencies of past or future work items - we're speaking about Scalar numbers, not Vector numbers.

Vectors state magnitude and direction. Open Loop control only states magnitude. Use it at your own risk.

Thanks to Mr. Honner, a mathematics teacher at Brooklyn Technical High School. If you like mathematics and appreciate the contribution a good teacher can make to mathematical understanding - which is woefully lacking in our project management domain - sign up to get his blog posts.

The term statistical significance is critical to most every discussion about spending other people's money on some new and innovative process.

When we hear I know a CEO that uses my approach, we need to ask several critical questions before getting too excited about the idea being suggested. Especially if this new idea violates some core business processes, like Microeconomics, let alone FASB 86 and GAAP cost and revenue recognition.

Is that CEO the CEO of a publicly traded firm, subject to governance processes? If so, someone outside the developer community gets to say whether the new and innovative idea has violated the business rules.

Does that CEO's company live in a domain where what they do is like what other people do? You know, a Reference Class that can be used outside the anecdote.

Does that CEO have to report cash obligations against his line of credit to his banker for some planning horizon in the future? Those pesky bankers do like to know the cash call demands from your firm for that LOC.

The notion of an anecdote is always interesting in conversation - I knew a guy once who .... But can we make policy decisions based on anecdotes? Hopefully not.

We can make policy decisions based on statistically sound observation - 8 out of 10 dentists recommend Pepsodent Toothpaste was a popular advertisement in the 1970s.

Let me ask all the people I ride with in our cycling club what they think of the local brewery we leave from on Tuesday evenings, and of the Nitro Milk Stout that is served for free. We like it.

Without a statistically sound sample space and a statistically sound sampling process, any conclusions are just anecdote. This is the core issue with things like the Standish report and other surveys suggesting the sky is falling on IT projects.

The same goes for those suggesting their favorite approach to spending other people's money can be done in the absence of knowing how much money, when that money will produce value, and what kinds of value will be produced.

Ask for data. No data, then as they say, "Data Talks, BS walks."

† The cartoon above is from Hugh MacLeod, gapingvoid art, 1521 Alton Road, Suite #518, Miami Beach, FL 33139. I've been following him since day one. You should do the same and buy his book.

"All the mathematical sciences are founded on relations between physical laws and laws of numbers, so that the aim of exact science is to reduce the problems of nature to the determination of quantities by operations with numbers." — James Clerk Maxwell, On Faraday's Lines of Force, 1856

There is a suggestion that only the final target of a project's performance is needed to steer toward success. This target can be budget, a finish date, the number of stories or story points in an agile software project. With the target and the measure of performance to date, collected from the measures at each sample point, there is still a missing piece needed to guide the project.

With the target and the samples, no error signal is available to make intermediate corrections to arrive on target. With the target alone, any variances in cost, schedule, or technical performance can only be discovered when the project arrives at the end. With the target alone, this is an Open Loop control system.

Pages 27 and 28 show the difference between Open Loop control and Closed Loop control of a notional software development project using stories as the unit of measure.

In the figure below (page 27), the cumulative performance of stories is collected from the individual performance of stories over the project's duration. The target stories - or budget, or some other measure - is the final target. But along the way, there is no measure of are we going to make it at this rate?

An Open Loop Control System

Is a non-feedback system, where the output – the desired state – has no influence or effect on the control action of the input signal — the measures of performance are just measures. They are not compared to what the performance should be at that point in time.

The output - the desired state - is neither measured nor "fed back" for comparison with the input - there is no intermediate target goal to measure the actual performance against. Over budget, late, and missing capabilities are only discovered at the end.

Is expected to faithfully follow its input command or set point regardless of the final result - the planned work can be sliced into small chunks of equal size - this is a huge assumption, by the way - but the execution of this work must also faithfully follow the planned productivity. (See assumptions below.)

Has no knowledge of the output condition - the difference between desired state and actual state - so it cannot self-correct any errors when the preset value drifts, even if this results in large deviations from that value - the final target is present, but compliance with that target along the way is missing, since there is no intermediate target to steer toward for each period of assessment - only the final one.

There are two very simplifying assumptions made in the slicing approach suggested to solve the control of projects:

The needed performance, in terms of stories or any other measure, is linear and of the same size - this requires decomposing the planned work for each period into nearly identical sizes, work efforts, and outcomes.

The productivity of the work performed is also linear and unvarying - this requires zero defects, zero variance in the work effort, and sustained productivity at the desired performance level.

Fulfilling these assumptions before the project starts requires effort, and the assumptions about the homogeneity of the planned production, the homogeneity of the work effort, and the homogeneity of any defects, rework, or changes in plan would require near-perfect planning and management of the project.

Instead, the reality of all project work is that the planned effort, duration, outcomes, dependencies, and cost are random variables. This is the nature of the non-stationary stochastic processes that drive project work. Nothing will turn out as planned, due to uncertainty. There are two types of uncertainty found in project work:

Irreducible Uncertainty - this is the noise of the project. Random fluctuations in productivity, technical performance, efficiency, effectiveness, and risk. These cannot be reduced. They are Aleatory Uncertainties.

Reducible Uncertainty - these are event-based uncertainties that have a probability of occurring, a probability of consequence, and a residual probability that, even when fixed, they will come back again. They are Epistemic Uncertainties.

Irreducible Uncertainty can only be handled with Margin. Cost margin, schedule margin, technical margin. This is the type of margin you use when you drive to work. The GPS navigation system says it's 23 minutes to the office. It's never exactly 23 minutes to the office. Something always interferes with our progress.

Reducible Uncertainty is handled in two ways: spending money to buy down the risk that results from this uncertainty, and Management Reserve (budget reserve and schedule contingency) to be used when something goes wrong, to pay for the fix when the uncertainty turns into reality.
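A hedged sketch of how these two handling strategies can be sized with a simple Monte Carlo model. The baseline cost, noise level, and risk-event probability and impact below are all illustrative assumptions, not values from any real project:

```python
import random

def size_margin_and_reserve(trials=20000, seed=7):
    """Separate aleatory noise (covered by margin) from an epistemic
    risk event (covered by Management Reserve) at 80% confidence."""
    rng = random.Random(seed)
    baseline = 100.0  # planned cost, illustrative units
    totals_noise, totals_all = [], []
    for _ in range(trials):
        noise = rng.gauss(0, 5)                      # irreducible fluctuation
        risk = 25.0 if rng.random() < 0.2 else 0.0   # 20% event, impact 25
        totals_noise.append(baseline + noise)
        totals_all.append(baseline + noise + risk)
    totals_noise.sort()
    totals_all.sort()
    p80 = lambda xs: xs[int(0.8 * len(xs))]
    margin = p80(totals_noise) - baseline            # covers the noise
    reserve = p80(totals_all) - p80(totals_noise)    # covers the risk event
    return margin, reserve

margin, reserve = size_margin_and_reserve()
```

Margin protects against the ever-present noise; Management Reserve covers the additional exposure from the discrete risk event.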

The next figure (page 28) shows how to manage in the presence of these uncertainties, by measuring actual performance against the desired performance at each step along the way.

In this figure, we measure at each assessment point the progress of the project against the desired progress - the planned progress, the needed progress. This planned, desired, or needed progress is developed by looking at the future effort, duration, risk, and uncertainty - the stochastic processes that drive the project - and determining what the progress should be at this point in time to reach our target on or before the need date, at or below the needed cost, and with the needed confidence that the technical capabilities can be delivered along the way. This is closed loop control.

The planned performance, the needed performance, the desired performance is developed early in the project. Maybe on day one, more likely after actual performance has been assessed to calibrate future performance. This is called Reference Class Forecasting. With this information estimates of the needed performance can then be used to establish steering targets along the way to completing the project. These intermediate references - or steering - points provide feedback along the way toward the goal. They provide the error signal needed to keep the project on track. They are the basis of Closed Loop control.
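The error signal those steering points provide can be sketched in a few lines. The planned and actual cumulative story counts and the 10% variance threshold below are hypothetical, for illustration only:

```python
def steering_signal(planned, actual, threshold=0.10):
    """Closed-loop control: compare cumulative actual progress with the
    planned reference at each assessment point, and raise an alert when
    the variance breaches the threshold."""
    alerts = []
    for period, (p, a) in enumerate(zip(planned, actual), start=1):
        variance = (a - p) / p
        if abs(variance) > threshold:
            alerts.append((period, round(variance, 2)))
    return alerts

# Illustrative cumulative story counts, planned versus actual
planned = [10, 22, 35, 50, 66]
actual  = [ 9, 20, 30, 41, 52]
alerts = steering_signal(planned, actual)
```

An empty alerts list means the project is inside its steering corridor; a non-empty one is the signal that corrective action is needed well before the end date arrives.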

In the US, many highways have rumble strips cut into the asphalt to signal that you are nearing the edge of the road on the right. They make a loud noise that tells you - hey get back in the lane, otherwise you're going to end up in the ditch.

This is the purpose of the intermediate steering targets for the project. When the variance between planned and actual exceeds a defined threshold, this says hey, you're not going to make it to the end on time, on budget, or with your needed capabilities if you keep going like this.

Kent Beck's quote is...

Optimism is the disease of software development. Feedback is the cure.

This feedback must have a reference to compare against if it is to be of any value in steering the project to a successful completion. Knowing it's going to be late, over budget, and doesn't work when we arrive at late, over budget, and not working is of little help to the passengers of the project.

Little's Law and the Central Limit Theorem are used many times in agile software discussions when speaking about flow processes based on Deming's principles.

These discussions usually start by quoting something from a summary of Little's Law or the Central Limit Theorem.

A critical element of both Little's Law and the CLT is the notion of Independent and Identically Distributed (IID) random variables. These variables describe the arrivals to a service - stories selected from the backlog for development, or someone arriving in line at the bank to deposit a check.

Let's start with some math. In probability theory, the central limit theorem (CLT) says,

Given certain conditions (we'll define these next), the arithmetic mean (the average) of a sufficiently large number (this needs to be defined) of samples of independent random variables, each with a well-defined expected value and well-defined variance, will be approximately normally (Gaussian) distributed.

There are some important ideas here. Independent random variables are the first important idea. That is, the variables are not related to each other.

As well, these random variables must be identically distributed. That is, they are all drawn from the same underlying probability distribution, and this assumes a large - usually infinite - population of random variables.
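A quick sketch of the CLT in action, assuming IID draws from a uniform distribution (any distribution with finite variance would do; the sample sizes are arbitrary):

```python
import random

def sample_means(trials=2000, n=200, seed=5):
    """Means of IID uniform(0, 1) samples: by the CLT they cluster
    around the population mean with spread sigma / sqrt(n)."""
    rng = random.Random(seed)
    return [sum(rng.random() for _ in range(n)) / n for _ in range(trials)]

means = sample_means()
grand = sum(means) / len(means)
spread = (sum((m - grand) ** 2 for m in means) / len(means)) ** 0.5
# grand is near 0.5; spread is near (1/sqrt(12)) / sqrt(200), about 0.02
```

This concentration of the sample mean is what makes calibrated estimates with confidence intervals possible in the first place.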

Using a grocery store check-out line or bank teller window example, Little's law gives the relation between the mean number of customers in the system, E(L), the mean transit time through the system, E(S), and the average number of customers entering the system per unit time, λ, as E(L) = λE(S).

An Actual Example In Preparation for Developing Software From The Story Queue

Let's pretend we're in line at the grocery store. We'll call the check-out line the resource and the people lining up at the check-out line the customers. If the clerk manning the check-out station is busy checking out customers, a queue will form in the line waiting to check out.

The population of customers that can use the store is usually finite, but this makes the problem harder, so let's assume for the moment the population of customers is infinite. The number of check-out lines can be one or many, but we'll want to assume they are identical in their service for the moment as well. Let's define the capacity of the store as the number of people that can wait in line, plus the person being served by the clerk. In most stores there will be a finite number of people in the queue at check-out, but again, assuming this number is infinite makes the analysis easier.

We need another simplifying assumption. The distribution of the amount of time each customer stays at check-out (once they arrive) is Independent and Identically Distributed (IID). As well, the probability that a customer will arrive at check-out is also an IID variable. This distribution is usually taken to be exponential, which means arrivals are memoryless: the chance someone shows up at the check-out stand in the next minute does not depend on how long you've already waited.

These are critical assumptions for what follows about Little's Law. If the above conditions are not met, Little's Law is not applicable to the problem being described.

So let's have a quick summary of Little's Law:

Mean number of people in the check-out system = Arrival Rate of customers × Mean Time in the system (waiting plus being checked out)

This law can be applied if some other conditions exist:

The mean number of people arriving at the check-out line equals the mean number of people being checked out.

This means - over a long period of time - the mean number of people arriving in line approximately equals the mean number of people leaving the store after check-out.

In math terms:

The number of arrivals N over a long period of time T defines the arrival rate.

Mean Arrival Rate = Total Arrivals / Total Time = N/T.
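A minimal simulation of that check-out line, where the IID conditions do hold by construction: exponential interarrival and service times, a single FIFO server, and illustrative rates λ = 0.8 and μ = 1.0:

```python
import random

def mm1_littles_law(lam=0.8, mu=1.0, n=50000, seed=42):
    """Single-server FIFO queue with IID exponential interarrival and
    service times. Compares the time-average number in the system, L,
    with (observed arrival rate) x (mean time in system, W)."""
    rng = random.Random(seed)
    arrivals, departures = [], []
    t = last_dep = 0.0
    for _ in range(n):
        t += rng.expovariate(lam)                      # IID interarrival
        dep = max(t, last_dep) + rng.expovariate(mu)   # FIFO service
        arrivals.append(t)
        departures.append(dep)
        last_dep = dep
    sojourn = sum(d - a for a, d in zip(arrivals, departures))
    W = sojourn / n                   # mean time each customer spends in system
    L = sojourn / departures[-1]      # time-average number in system
    lam_hat = n / arrivals[-1]        # observed arrival rate
    return L, lam_hat * W

L, lam_W = mm1_littles_law()
```

On a long run the two quantities agree, which is exactly Little's Law. The question in the sections that follow is whether project work satisfies the IID assumptions this simulation was built on.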

Now for Software Development

Instead of assuming Little's Law can be applied to software development, let's first ask whether the conditions are right to apply the law:

Is the arrival rate at the server an Independent and Identically Distributed random variable?

Is the service time for each piece of work an Independent and Identically Distributed random variable?

This means: do the jobs - stories - arriving at the service - development (or some other process) - behave like IID variables? That is, they have no knowledge of each other, are indistinguishable from each other, and when serviced, they cannot be distinguished from the other work serviced (meaning developed, tested, installed, etc.).

Let's look at an actual project, a simple one. We want to fly to the moon for the first time, land, and come home.

To do this we need to do some work, in a specific order, with dependencies between this work in order to produce some outcomes that enable us to fly to the moon and back - and live to tell about it.

Doing work on a development project is not the same as work arriving in the queue of a service. It is a network of dependencies, with interconnections, and most importantly - most critically, actually - the duration of the work, the time spent in the service, is not an independent, identically distributed random variable. A network of work looks like this (notionally).

So does Little's Law apply? Nope!

The work arriving at the service is not an Independent and Identically Distributed random variable.

The service time is not an Independent and Identically Distributed random variable.

The dependencies between work may themselves be random variables of unknown distribution, behaving dynamically depending on the prior work, the follow-on work, and other external conditions.

These networks of work are called Stochastic Networks and are not subject to Little's Law. The process in the Little's Law condition can be stochastic, but there has to be independence between the work elements, and they have to be drawn from identical probability distributions.
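A small sketch of why dependencies break the IID condition: two tasks where the successor inherits rework when the predecessor runs late. The triangular distributions and the 0.5 rework coupling factor are illustrative assumptions:

```python
import random

def correlation(xs, ys):
    """Pearson correlation of two equal-length samples."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n
    vx = sum((x - mx) ** 2 for x in xs) / n
    vy = sum((y - my) ** 2 for y in ys) / n
    return cov / (vx * vy) ** 0.5

rng = random.Random(3)
pred, succ = [], []
for _ in range(5000):
    d1 = rng.triangular(2, 10, 5)   # predecessor task duration (days)
    # Successor inherits rework: a late predecessor pushes its duration up
    d2 = rng.triangular(2, 10, 5) + 0.5 * max(0.0, d1 - 5)
    pred.append(d1)
    succ.append(d2)

rho = correlation(pred, succ)
```

A clearly nonzero correlation between successive durations means the service times are not independent, so the precondition for Little's Law fails.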

Production queues of parts going down an assembly line are. Cards being pulled from the tray in a Kanban furniture manufacturing system are. The notional Kanban systems in agile development can be - but If and Only If (IFF) the work pulled from the wall is independent of all other work, and the probability distribution of the duration of that work is independent of the other work as well.

If you can find a project where all the features are independent of each other and their work efforts are identical, independently distributed random variables, then you'll be able to apply Little's Law.

The End

Little's Law applies to software development work that looks like production flow - like the assembly line at Toyota, or the office furniture production line we designed at a factory in Idaho.

But those types of software projects must be intentionally designed to have no dependencies between the work performed, and to have the duration of the work in the service cycle (development) have no dependency on the prior or following work.

This is the condition of Independent and Identical Distribution (IID) needed for Little's Law as well as the Central Limit Theorem. So before anyone says Little's Law applies to software development, they need to show these conditions exist.

One Final Observation

The slicing proposed by some in the agile community might create the conditions for Little's Law to work. But the effort to slice the stories into equal-sized - or at least Independent and Identically Distributed - work sizes for the entire project duration seems like a lot of work. Especially when there are much easier ways of estimating the total work, total duration, and total cost.

But since this slicing paradigm appears to be anecdotal and untested across a wide variety of projects, domains, and sizes, the population sample size condition is unlikely to be met as well.

More research, based on actual analysis, needs to be done - and that research reviewed and tested - before the notion of mathematical slicing has much use outside of anecdotal examples.

Obviously, not every decision we make is based on mathematics, but when we're spending money, especially other people's money, we'd better have some good reason to do so. Some reason other than gut feel, for any significant value at risk. This is the principle of Microeconomics.

All Things Considered is running a series on how people interpret probability. From capturing a terrorist to the probability it will rain at your house today, the world lives on probabilistic outcomes. These probabilities are driven by underlying statistical processes. These statistical processes create uncertainties in our decision making processes.

Both Aleatory and Epistemic uncertainty exist on projects. These two uncertainties create risk. This risk impacts how we make decisions. Minimizing risk while maximizing reward is a project management process, as well as a microeconomics process. By applying statistical process control we can engage project participants in the decision making process. Making decisions in the presence of uncertainty is sporty business, and many examples of poor forecasts abound. The flaws of statistical thinking are well documented.

When we encounter the notion that decisions can be made in the absence of statistical thinking, there are some questions that need to be answered. Here's one set of questions and answers from the point of view of the mathematics of decision making using probability and statistics.

The book opens with a simple example.

Here's a question. We're designing airplanes - during WWII - in ways that will prevent them getting shot down by enemy fighters, so we provide them with armor. But armor makes them heavier. Heavier planes are less maneuverable and use more fuel. Armoring the planes too much is a problem. Too little is a problem. Somewhere in between is the optimum.

When the planes came back from a mission, the number of bullet holes was recorded. The damage was not uniformly distributed, but followed this pattern:

Engine - 1.11 bullet holes per square foot (BH/SF)

Fuselage - 1.73 BH/SF

Fuel System - 1.55 BH/SF

Rest of plane - 1.8 BH/SF

The first thought was to provide armor where the need was the highest. But after some thought, the right answer was to provide armor where the bullet holes aren't - on the engines.

"where are the missing bullet holes?" The answer was onb the missing planes. The total number of planed leaving minus those returning were the number of planes that were hit in a location that caused them not to return - the engines.

The mathematics here is simple. Start with setting a variable to zero. This variable is the probability that a plane that takes a hit in the engine manages to stay in the air and return to base. The result of this analysis (pp. 5-7 of the book) can be applied to our project work.
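A toy simulation of Wald's reasoning, with hypothetical hit and survival probabilities (the analysis on pp. 5-7 of the book is analytical; this sketch just makes the survivorship bias visible):

```python
import random

def survivorship(n_planes=20000, seed=11):
    """Hits land uniformly across four sections, but an engine hit
    downs the plane far more often. Holes are only counted on the
    planes that return - the sample the analysts actually see."""
    rng = random.Random(seed)
    sections = ["engine", "fuselage", "fuel", "rest"]
    loss_prob = {"engine": 0.8, "fuselage": 0.1, "fuel": 0.1, "rest": 0.1}
    observed = {s: 0 for s in sections}
    returned = 0
    for _ in range(n_planes):
        hit = rng.choice(sections)
        if rng.random() > loss_prob[hit]:   # plane survives and returns
            returned += 1
            observed[hit] += 1              # hole visible to analysts
    return observed, returned

obs, ret = survivorship()
```

The returning sample under-counts engine hits precisely because engine hits are the deadliest. Conditioning on survival biases the data, just as self-selected project samples do.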

This is an example of the thought processes needed for project management and the decision making processes needed for spending other people's money. The mathematician's approach is to ask: what assumptions are we making? Are they justified? The first assumption - the erroneous assumption - was that the returning planes represented a random sample of all the planes. Only if that held could the conclusions be drawn.

In The End

Show me the numbers. Numbers talk, BS walks is the crude phrase, but true. When we hear some conjecture about the latest fad, think about the numbers. But before that, read Beyond the Hype: Rediscovering the Essence of Management, Robert Eccles and Nitin Nohria. This is an important book that lays out the processes for sorting out the hype - and untested and likely untestable conjectures - from the testable processes.

The core issue starts with the first chart. It shows the actual completion of a self-selected set of projects versus the ideal estimate. This chart is now in use in the #NoEstimates paradigm as evidence that estimating is flawed and should be eliminated. How to eliminate estimates while making decisions about spending other people's money is not actually clear. You'll have to pay €1,300 to find out.

But let's look at this first chart. It shows the self-selected projects, the vast majority of which completed above the initial estimate. What is this initial estimate? In the original paper, the initial estimate appears to be the estimate made by someone for how long the project would take. Not sure how that estimate was arrived at - the basis of estimate - or how the estimate was derived. We all know that subject matter expertise is the least desired, and past performance, calibrated for all the variables, is the best.

So Therein Lies the Rub - to Misquote Shakespeare's Hamlet

The ideal line is not calibrated. There is no assessment of whether the original estimate was credible or bogus. If it was credible, what was the confidence of that credibility, and what was the error band on that confidence?

This is a serious - some might say egregious - error in statistical analysis. We're comparing actuals to a baseline that is not calibrated. This means the initial estimate is meaningless in the analysis of the variances, without an assessment of its accuracy and precision. To then construct a probability distribution chart is nice, but measured against what? Against bogus data.

This is harsh, but the paper and the presentation provide no description of the credibility of the initial estimates. Without that, any statistical analysis is meaningless. Let's move to another example in the second chart.

The second chart - below - is from a calibrated baseline. The calibration comes from a parametric model, where the parameters of the initial estimate are derived from prior projects - the reference class forecasting paradigm. The tool used here is COCOMO. There are other tools, based on COCOMO, Larry Putnam's methods, and others, that can be used for similar calibration of the initial estimates. A few we use are QSM, SEER, Price.
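For readers who have not seen a parametric model, here is a minimal sketch of the Basic COCOMO organic-mode equations from Boehm's Software Engineering Economics (effort = 2.4 x KLOC^1.05 person-months, schedule = 2.5 x effort^0.38 months). The 32 KSLOC project size is invented for illustration; real calibration uses your own reference-class data, which is the point of tools like QSM and SEER.

```python
# Minimal sketch of Basic COCOMO (organic mode). The coefficients are
# Boehm's published values; the project size below is a made-up example.
def cocomo_effort(kloc: float, a: float = 2.4, b: float = 1.05) -> float:
    """Estimated effort in person-months for a project of `kloc` KSLOC."""
    return a * kloc ** b

def cocomo_schedule(effort_pm: float, c: float = 2.5, d: float = 0.38) -> float:
    """Estimated development time in calendar months."""
    return c * effort_pm ** d

effort = cocomo_effort(32)        # hypothetical 32 KSLOC project
months = cocomo_schedule(effort)
print(round(effort, 1), round(months, 1))  # 91.3 13.9
```

The output is a point value; in practice the inputs are themselves random variables, so the model is run across their distributions to produce confidence levels rather than a single number.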

Issues of software management - estimates of software cost, time, and performance - abound. We hear about them every day. Our firm works on programs that have gone Over Target Baseline. So we walk the walk every day.

But when bad statistics are used to sell solutions to complex problems, that's when it becomes a larger problem. To solve this nearly intractable problem of project cost and schedule overrun, we need to look to the root cause. Let's start with the book Facts and Fallacies of Estimating Software Cost and Schedule. From there let's look to some more root causes of software project problems. Why Projects Fail is a good place to move to, with their 101 common causes. Like the RAND and IDA Root Cause Analysis reports, many are symptoms rather than root causes, but good information all the same.

So in the end, when it is suggested that the woes of project success can be addressed by applying:

Decision making frameworks for projects that do not require estimates.

Investment models for software projects that do not require estimates.

Project management (risk management, scope management, progress reporting, etc.) approaches that do not require estimates.

Ask a simple question - is there any tangible, verifiable, externally reviewed evidence for this? Or is this just another self-selected, self-reviewed, self-promoting idea that violates the principles of microeconomics as applied to software development, where:

Economics is the study of how people make decisions in resource-limited situations. This definition of economics fits the major branches of classical economics very well.

Macroeconomics is the study of how people make decisions in resource-limited situations on a national or global scale. It deals with the effects of decisions that national leaders make on such issues as tax rates, interest rates, and foreign and trade policy, in the presence of uncertainty.

Microeconomics is the study of how people make decisions in resource-limited situations on a personal scale. It deals with the decisions that individuals and organizations make on such issues as how much insurance to buy, which word processor to buy, what features to develop in what order, whether to make or buy a capability, or what prices to charge for their products or services, in the presence of uncertainty. Real Options is part of this decision making process as well.

Economic principles underlie the structure of the software development life cycle, and its primary refinements of prototyping, iterative and incremental development, and emerging requirements.

If we look at writing software for money, it falls into the microeconomics realm. We have limited resources, limited time, and we need to make decisions in the presence of uncertainty.

In order to decide about the future impact of any one decision - making a choice - we need to know something about the future, which is itself uncertain. The tool for making these decisions about the future in the presence of uncertainty is called estimating. Lots of ways to estimate. Lots of tools to help us. Lots of guidance - books, papers, classrooms, advisers.

But asserting we can in fact make decisions about the future in the presence of uncertainty without estimating is mathematically and practically nonsense.

So now is the time to learn how to estimate, using your favorite method, because to decide in the absence of knowing the impact of that decision is counter to the stewardship of our customer's money. And if we want to keep writing software for money we need to be good stewards first.

When there are charts showing an Ideal line, or a chart of samples of past performance - say software delivered - in the absence of a baseline for what the performance of the work effort or duration should have been, was planned to be, or even could have been, this is called Open Loop control.

The issue of forecasting the Should, Will, Must cost problem has been around for a long time. This work continues in DOD, NASA, Heavy Construction, BioPharma, and other high risk, software intensive domains.

When we see graphs where the baseline to which the delays or cost overages are compared is labeled Ideal (like the chart below), it's a prime example of How to Lie With Statistics, Darrell Huff, 1954. This can be overlooked in an un-refereed opinion paper in an IEEE magazine, or a self-published presentation, but a bit of homework will reveal that charts like the one below are simply bad statistics.

This chart is now being used as the basis of several #NoEstimates presentations, which further propagates the misunderstandings of how to do statistics properly.

Todd does have other papers that are useful - Context Adaptive Agility is one example from his site. But this often used and misused chart is not an example of how to properly identify problems with estimates.

Here are some core issues:

If we want to determine something about a statistical process, we of course need to collect data about that process. This data is empirical - a much misused term itself - showing what happened over time. A time series of samples.

To compute a trend, we can of course draw a line through the population of data, like the one above.

Then we can compare this data with some reference data to determine the variances between the reference data and the data under measurement.
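The mechanical part - drawing a trend line through the samples - is easy; a minimal least-squares sketch, with the weekly sample values invented for illustration:

```python
# Sketch: fit a trend line through a time series of samples. This is only
# step one - the reference the trend is compared against must itself be a
# credible, calibrated baseline, not an uncalibrated "ideal" line.
def fit_trend(ys):
    """Ordinary least-squares slope and intercept for y sampled at t=0,1,2,..."""
    n = len(ys)
    xs = range(n)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    slope = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys)) / \
            sum((x - x_mean) ** 2 for x in xs)
    return slope, y_mean - slope * x_mean

weekly_samples = [11, 9, 14, 12, 15, 13, 16]   # invented measurements
slope, intercept = fit_trend(weekly_samples)
print(round(slope, 2))   # 0.86: a rising trend through the samples
```

Fitting the line is never the problem; the problem is what the variances from that line are measured against.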

Here's where the process goes in the ditch - literally.

The reference data has no basis of reference. It's just labeled ideal - meaning a number that was established with no basis of estimate. Just 'this is what was estimated, now let's compare actuals to it, and if the actuals matched the estimate, let's call it ideal.'

Was that ideal credible? Was it properly constructed? What's the confidence level of that estimate? What's the allowable variance of that estimate that can still be considered OK (within the upper and lower limits of OK)? Neither the questions nor their answers are there. It's just a line.

We can use the ne plus ultra put-down of theoretical physicist Wolfgang Pauli: "This isn't right. It's not even wrong." As well, the projects were self-selected, and like the Standish Report, self-selected statistics can be found in the How to Lie book.

It's time to look at these sorts of conjectures in the proper light. They are Bad Statistics, and we can't draw any conclusions from any of the data, since the baseline to which the sampled values are compared "isn't right - it's not even wrong." We have no way of knowing why the sampled data has a variance from the ideal - the bogus ideal:

Was the original estimate simply naïve?

Was the project poorly managed?

Did the project change direction, while the ideal estimate was never updated?

Were the requirements, productivity, risks, funding stability, and all the other project variables held constant while assessing the completion date? If not, the fundamental principles of experiment design were violated. These principles are taught in every design of experiments class in every university on the planet. Statistics for Experimenters is still on my shelf. George Box is one of its authors, whose statement "all models are wrong, some are useful" is often misused and hugely misunderstood.

So time to stop using these charts and start looking for the Root Causes for the estimating problem.

No reference classes

No past performance

No parametric models

No skills or experience constructing credible estimates

No experience with estimating tools, processes, databases (and there are many, for both commercial and government software intensive programs).

Political pressure to come up with the right number

Misunderstanding of the purpose of estimating - provide information to make decisions.

A colleague (a former NASA cost director) has three reasons for cost, schedule, and technical shortfalls:

They didn't know

They couldn't know

They didn't want to know

Only the second is a credible reason for project shortfalls in performance.

Without a credible, calibrated, statistically sound baseline, the measurements and the decisions based on those measurements are Open Loop.

You're driving your car with no feedback other than knowing you ran off the road after you ran off the road, or you arrived at your destination after you arrived at your destination.

I'll admit up front I'm hugely biased toward statistical thinking. As one trained in physics and the mathematics that goes with physics, and Systems Engineering and the math that goes with that, thinking about statistics is what we do in our firm. We work programs with cost and schedule development, do triage on programs for cost and schedule, guide the development of technology solutions using probabilistic models, assess risk to cost, schedule, and technical performance using probability and statistics, and build business cases, performance models, Estimates To Complete, Estimates At Completion, the probability of program success, the probability of a proposal win, and the probability that the Go-Live date will occur on or before the need date, be at or below the planned cost, and deliver the mandatory needed capabilities as well.

We use probability and statistics not because we want to, but because we have to. Many intelligent, trained, and educated people in our domain - software intensive systems and the management of projects - find themselves frozen in fear when confronted by any mathematical problem beyond the level of basic arithmetic - especially in the software development domain. The algorithm writers on flight control systems we work with are not actually software developers in the common sense, but control system engineers who implement their algorithms in Handel-C - so they don't count, at least not in this sense.

We have to deal with probability and statistics for a simple reason - every variable on a project is a random variable. Only accountants deal with Point Numbers. The balance in your checking account is not subject to a statistical estimate. The price of General Electric stock in your 401(K) is a random variable. All the world is a non-stationary stochastic process, and many times a non-linear, non-stationary, stochastic process.

Stochastic processes are everywhere. They are time series subject to random fluctuations. Your heartbeat, the stock market, the productivity of your software team, the stability of technical requirements, the performance of the database server, the number of defects in the code you write.
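The difference between a stationary fluctuation and a non-stationary one can be seen in a few lines of simulation. This is an illustrative sketch; the means, sigmas, and seed are invented:

```python
import random

random.seed(7)

# A stationary process fluctuates around a fixed mean; a non-stationary
# random walk drifts, so simple averaging of its past is a poor forecast.
def stationary(n, mu=10.0, sigma=1.0):
    return [random.gauss(mu, sigma) for _ in range(n)]

def random_walk(n, sigma=1.0):
    x, path = 0.0, []
    for _ in range(n):
        x += random.gauss(0.0, sigma)
        path.append(x)
    return path

s = stationary(500)
w = random_walk(500)

print(round(sum(s) / len(s)))  # 10: the stationary mean is a useful forecast
print(round(w[-1], 1))         # wherever the drift took it - no fixed mean to forecast
```

Project variables behave far more like the second series than the first, which is why yesterday's average is not, on its own, a forecast.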

In our software development domain there is an overwhelming need to predict the future. Not for the reason you may think, but because at the same time there is a movement underway to Not Estimate the future. It turns out this is a need, not necessarily a desire. The need to predict - to have some sense of what is going to happen - is based on a few very simple principles of microeconomics:

It's not our money. If it were we could do with it as we please.

Those providing the money have a finite amount of money. They also have a finite amount of time in which to exchange that money for value produced by us, the development team.

If there were a non-finite amount of money and time, we wouldn't have to talk about things like estimates of when we'll be done, or how much it will cost, or the probability that the produced outcomes will meet the needs of the users.

Our natural tendencies are to focus on observation - empirical data - rather than the statistical data that drives the probabilistic aspects of our work.

This approach - statistical processes and probabilistic outcomes - requires that we know something about our underlying processes: our capacity for work, the generated defect rate, the defect fix rate. Without that knowledge the probabilistic answers aren't forthcoming, and if they are forced out in the Dilbert style of management, they'll be bogus at best and downright lies at worst.

Let's stop here for some critically important points:

If we don't have some sense of the underlying processes driving our project, we're in much bigger trouble than we think - we don't know what done looks like in units of measure meaningful to the decision makers.

When will we be done? Approximately? -I don't know - then we're late before we start.

How much will this cost? Approximately? - I don't know - we're over budget before we start.

What's the probability we'll be able to deliver all the needed features - minimal features or mandatory features, for a cost and schedule goal? I don't know - this project is going to be a Death March project before we run out of time and money.

If we don't know our capacity for work, which should be developed from empirical data - we can't make duration and cost estimates.
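One way to turn empirical capacity-for-work data into a duration estimate is a simple Monte Carlo simulation. The throughput history and backlog size below are invented for illustration; the technique is what matters:

```python
import random

random.seed(42)

throughput_samples = [4, 6, 3, 5, 7, 4, 5, 6]  # features finished per week (empirical history)
backlog = 60                                    # features remaining (hypothetical)

def simulate_weeks() -> int:
    """One possible future: draw each week's throughput from the history."""
    done, weeks = 0, 0
    while done < backlog:
        done += random.choice(throughput_samples)
        weeks += 1
    return weeks

# Run many futures and read off the 80th-percentile completion week.
trials = sorted(simulate_weeks() for _ in range(10_000))
p80 = trials[int(0.80 * len(trials))]
print(p80)  # the "80% confidence of on-or-before" week count
```

The answer is not a single date but a distribution of dates, from which an on-or-before confidence can be quoted - and none of it is possible without the empirical capacity data.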

Knowing this once the project is going is fine. But it's likely too late to make the business decisions needed to start the project.

The very naive assumption that all the work can be broken down into same-sized chunks has no broad evidence, and is likely to be highly domain dependent.

So let's look at the core problem of estimating.

As humans we are poor at estimating. Fine. Does that mean we should not estimate? Hardly. We need to become aware of our built-in problems and deal with them. The need for estimating in business is not going away. It is at the heart of business itself and core to all decision making.

Linda is 31 years old, single, outspoken, and very bright. She majored in philosophy. As a student, she was deeply concerned with issues of discrimination and social justice, and also participated in anti-nuclear demonstrations.

Which of these is more probable?

1. Linda is a bank clerk. 2. Linda is a bank clerk and an active member of a feminist movement.

In Kahneman's studies, 90% of all respondents selected the latter option. Why? Because the description of Linda and the word feminism intuitively raise an idea of compatibility. Being outspoken and an activist are not usually associated with the job of a bank clerk. To fast thinking, bank clerk seems a slightly more probable option if she is at least also a feminist. However, the second option is wrong because adding conditions can only lower the probability. The conjunction narrows the target group - there are more bank clerks in total than bank clerks who are also active feminists.
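The arithmetic behind the Linda problem is one line: P(A and B) = P(A) x P(B given A), which can never exceed P(A). The probabilities below are invented purely to show the inequality:

```python
# Conjunction fallacy: the compound event can never be more probable than
# either of its parts. All numbers are invented for illustration.
p_clerk = 0.05                  # P(Linda is a bank clerk)
p_feminist_given_clerk = 0.30   # P(active feminist | bank clerk)

p_clerk_and_feminist = p_clerk * p_feminist_given_clerk

print(round(p_clerk_and_feminist, 3))   # 0.015
print(p_clerk_and_feminist <= p_clerk)  # True, for ANY choice of numbers
```

No matter what values are plugged in, the conjunction multiplies by a number no greater than one, so it can only shrink.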

So Now What, We've Confirmed We're Bad At Making Estimates

How do we make it better? First, come to realize that good - meaning credible - estimates are part of good business. Knowing the cost of the value delivered is at the core of all business success. Second, look for the root causes of poor estimating outcomes. These come in many sizes. But the Dilbert excuse is not an excuse. It's a cartoon of bad management. So let's drop that right away. If you work for a Dilbert boss, or have a Dilbert boss for a customer, not much good estimating is going to do for you. So let's dispense with the charade of trying, too.

So what are some root causes of poor estimates:

Poor understanding of what done looks like - what capabilities do we need and when do we need them. Without this understanding building a list of requirements has no home and the project becomes an endless series of emerging work efforts in an attempt to discover the needed capabilities.

The numbers that appear in projects - cost, schedule, performance - are all random variables drawn from an underlying statistical process. This process is officially called a non-stationary stochastic process. It has several important behaviours that create problems for those trying to make decisions in the absence of understanding how these processes work in practice.

The first issue is that all point estimates for projects are wrong, in the absence of a confidence interval and an error band on that confidence.

"How long will this project take?" is a common question asked by those paying for the project. The technically correct answer is: there is an 80% confidence of completing on or before some date, with a 10% error on that confidence. This is a cumulative probability, collecting all the possible completion dates and describing the cumulative probability - the 80% - of an on-or-before, since the project can complete before that final probabilistic date as well.

Same conversation for cost. The cost of the project will be at or below some amount, with an 80% confidence.
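An "at or below X with 80% confidence" statement is just the 80th percentile of the distribution of possible outcomes. A sketch, with a triangular cost distribution (min / most-likely / max of 90 / 110 / 160, in some currency unit, all invented) standing in for a real cost model:

```python
import random

random.seed(1)

# 10,000 possible project costs drawn from a triangular distribution:
# random.triangular(low, high, mode)
costs = sorted(random.triangular(90, 160, 110) for _ in range(10_000))

point_estimate = 110               # the "most likely" single number
p80_cost = costs[int(0.80 * len(costs))]

print(p80_cost > point_estimate)   # True: the point estimate is optimistic
```

Because the distribution is right-skewed, the 80th percentile sits well above the most-likely value - which is why quoting the most-likely number as a commitment is routinely optimistic.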

The performance of products or services is the third random variable. Technical performance means anything and everything that is not cost or schedule. It is the wrapper term for the old concept of scope. In modern terms there are two general-purpose categories of Performance, plus one set of parameters.

Measures of Effectiveness - are the operational measures of success that are closely related to the achievement of the mission or operational objectives, evaluated in the operational environment under a specific set of conditions. The Measures of Effectiveness:

Are stated in units meaningful to the buyer,

Focus on capabilities independent of any technical implementation,

Are connected to mission success.

Measures of Performance - are the measures that characterize physical or functional attributes relating to the system operation, measured or estimated under specific conditions. The Measures of Performance are:

Attributes that assure the system has the capability and capacity to perform,

Assessment of the system to assure it meets design requirements to satisfy the MoE.

Key Performance Parameters - represent the capabilities and characteristics so significant that failure to meet them can be cause for reevaluation, reassessing, or termination of the program. Key Performance Parameters:

Have a threshold or objective value,

Characterize the major drivers of performance,

Are considered Critical to Customer (CTC).

These measures are all random variables with confidence intervals and error bands.

So What's The Point?

When we hear you can't forecast the future, that's not true. The person saying that didn't pay attention in High School statistics class. You can forecast the future. You can make estimates of anything. The answers you get may not be useful, but they're estimates all the same. If it is unclear how to do this, here's a reading assignment for the books we use nearly every month to make our estimates at completion and estimates to complete for software intensive projects, starting with the simplest:

How to Think About Statistics, 5th Edition - a survey of how good statistical thinking is needed to make decisions. And it is decision making we're after.

How to Measure Anything, Douglas Hubbard - shows how to do just that, with a bit more mathematics than the first book.

Introduction to Stochastic Models, 2nd Edition - starts down the path of real mathematics used to model cost, schedule, and performance of complex systems. And Dr. Goodman's Statistics page.

While on the topic of books, here are some books that should be on your shelf that put those probability and statistics to work.

Facts and Fallacies of Software Engineering, Robert Glass - speaks to the common fallacies in software development. The most common is we can't possibly estimate when we'll be done or how much it will cost. Read the book and start calling BS on anyone using that excuse to not do their homework. And a nice update by Jeff Atwood, co-founder of Stack Overflow.

Estimating Software-Intensive Systems, Richard Stutzke - this is the book that started the revolution of statistical modeling of software projects. When you hear oh this is so olde school, that person didn't take the HS Stats class either.

Software Engineering Economics, Barry Boehm - is how to pull all this together. And when you hear this concept is olde school, you'll know better as well.

There are several tools that make use of these principles and practices:

When the Dilbert boss comes around, you'll have the tools to have a credible discussion about why the Estimate to Complete number he's looking for is bogus. He may not listen or even understand, but you will.

And that's a start in fixing the dysfunction of bad estimating when writing software for money. Start with the person who can actually make a change - you.

The book How To Lie With Statistics, Darrell Huff, 1954 should be on the bookshelf of everyone who spends other people's money, for a very simple reason.

Everything on every project is part of an underlying statistical process. Those expecting any number associated with any project in any domain to be a single point estimate will be sorely disappointed, after reading the book, to find that is not the case.

As well, those expecting to make decisions about how to spend other people's money will be disappointed to learn that statistical information is needed to determine the impact of the decision: the cost of the decision, the cost of the value obtained by the decision, the impact on the schedule of the work needed to produce the value from that decision, and even the statistical outcomes of the benefits produced by making that decision.

One prime example of How To Lie (although likely not a Lie, but just poor application of statistical processes) is Todd Little's "Schedule Estimation and Uncertainty Surrounding the Cone of Uncertainty." In this paper the following figure is illustrative of the How to Lie paradigm.

This figure shows 106 sampled projects, their actual completion and their ideal completion. First let's start with another example of Bad Statistics - the Standish Report - often referenced when trying to sell the idea that software projects are always in trouble. Here's a summary of posts about the Standish Report, which speaks to a few Lies in the How to Lie paradigm.

The samples are self-selected, so we don't get to see the correlation between the sampled projects and the larger population of projects at the firms.

Those returning the survey for Standish stating they had problems, and those stating they had no problems, can't be compared to those not returning the survey. And neither can be compared to the larger population of IT projects that was not sampled.

This is a Huff example - limit the sample space to those examples that support your hypothesis.

The credibility of the original estimate is not stated or even mentioned.

Another good Huff example - no way to test what the root cause of the trouble was, so no way to tell the statistical inference of the suggested solution to the possible corrected outcome.

The Root Cause of the over budget, over schedule, and less than promised delivery of features is not investigated, nor are any corrective actions suggested, other than hire Standish.

Maybe the developers at these firms are not very good at their job, and can't stay on cost and schedule.

Maybe the sampled projects were much harder than first estimated, and the initial estimate was not updated - a new estimate to complete - when this was discovered.

Maybe management forced the estimate onto the development team, so the project was doomed from day one.

Maybe those making the estimate had no estimating process, skills, or experience in the domain they were asked to estimate for.

Maybe a few dozen other Root Causes were in place to create the Standish charts, but these were not separated from the statistical samples to seek the underlying data.

So let's look at Mr. Little's chart.

There is likely good data at his firm, Landmark Graphics, for assessing the root cause of the projects finishing above the line in the chart. But the core issue is that the line is not calibrated. It represents the ideal data. That is, using the original estimate, what did the project do? As stated on page 49 of the paper:

For the Landmark data, the x-axis shows the initial estimate of project duration, and the y-axis shows the actual duration that the projects required.

There is no assessment of the credibility of the initial estimate for the project. This initial estimate might accurately represent the projected time and cost, with a confidence interval. Or this initial estimate could be completely bogus - a guess made up by uninformed estimators, or worse yet, an estimate that was cooked in all the ways possible, from bad management to bad math.

So if our baseline for comparisons is bogus from the start, it's going to be hard to draw any conclusions from the actual data on the projects. Both initial estimates and actual measurements must be statistically sound if any credible decisions are to be made about the Root Cause of the overage, and any possible Corrective Actions that can be taken to prevent these unfavorable outcomes.

This is classic How To Lie - let me present a bogus scale or baseline, then show you some data that supports my conjecture that something is wrong.

In the case of the #NoEstimates approach, that conjecture starts with the Twitter clip below, which can be interpreted as we can make decisions without having to estimate the independent and dependent variables that go into that decision.

So if estimates are the smell of dysfunction, as the popular statement goes, what is the dysfunction? Let me count the ways:

The estimates in many software development domains are bogus to start. That'll cause management to be unhappy with the results and lower the trust in those making the estimates. Which in turn creates a distrust between those providing the money and those spending the money - a dysfunction.

The management in these domains doesn't understand the underlying statistical nature of software development, and has an unfounded desire for facts about the cost, duration, and probability of delivering the proper outcomes in the absence of understanding the statistical processes driving those outcomes. That'll cause the project to be in trouble from day one.

The insistence that estimating is somehow the source of these dysfunctions, and the corrective action is to Not Estimate, is a false trade off - in the same way as the Standish Report saying "look at all these bad IT projects, hire us to help you fix them." This will cause the project to fail on day one again, since those paying for the project have little or no understanding of what they are going to get in the end for an estimated cost if there is one.

So next time you hear estimates are the smell of dysfunction, or we can make decisions without estimating:

Ask if there is evidence of the root cause of the problem?

Ask to read - in simple bullet point examples - some of these alternatives - so you can test them in your domain.

Ask in what domains not estimating would be applicable? There are likely some. I know of some. Let's hear some others.

Ask to show how Not Estimating is the corrective action of the dysfunction?

When it is said that we can't forecast or estimate, it brings a smile, since in fact forecasting and estimating are done all the time. Not always correctly, and not always properly used once the estimate is made, but done all the same - every day in some domains, every week and every month in the domains I work.

In our domains the Estimate To Complete is submitted to the customer every month, and the Estimate At Completion quarterly on most projects we work. These are software intensive projects, and sometimes software-only projects. All innovative development, sometimes never been done before, sometimes inventing new physics.

Some of these estimates are very formal, using tools, reference class forecasting, Autoregressive Integrated Moving Average (ARIMA) projections of risk adjusted past performance, and compliance with Systems Engineering Measures of Effectiveness (MoE) and Measures of Performance (MoP), traceable to Technical Performance Measures (TPM) and Key Performance Parameters (KPP). Some are simple linear projections of what it will cost given a few parameters - the "is it bigger than a bread box" type of estimate. Here's how to estimate any software deliverable in an informal way.

Success is a function of persistence and doggedness and the willingness to work hard for twenty minutes to make sense of something that most people would give up on after thirty seconds - Malcolm Gladwell, Outliers: The Story of Success.

That chapter and others speak to making estimates about the things we want to measure, along with Monte Carlo Simulation - another powerful estimating tool we use on our programs. The process entering our domain (space and defense) is Bayesian estimating - adding to what we already know.

The instinctive Bayesian approach is very simple:

Start with a calibrated estimate

Gather additional information

Update the calibrated estimate subjectively, without doing additional calculations
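The same three steps can also be done numerically rather than subjectively. A minimal sketch using a Beta prior on a task slip rate, updated by new observations via the conjugate Beta-Binomial rule - all parameters invented for illustration:

```python
# Bayesian update sketch: a Beta(2, 8) prior says tasks slip about 20% of
# the time (the calibrated starting estimate). New observations update it.
prior_alpha, prior_beta = 2, 8     # calibrated prior (invented)
slips, on_time = 3, 5              # additional information gathered

post_alpha = prior_alpha + slips   # conjugate Beta-Binomial update:
post_beta = prior_beta + on_time   # just add the observed counts

prior_mean = prior_alpha / (prior_alpha + prior_beta)
post_mean = post_alpha / (post_alpha + post_beta)
print(round(prior_mean, 2), round(post_mean, 2))  # 0.2 0.28
```

The calibrated estimate moves toward what the new data shows, weighted by how strong the prior was - which is the disciplined version of "update the estimate."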

So if we hear, we can't forecast the future, estimates are a waste, we can't know anything about the future until it arrives — stop, think about all the estimating and forecasting activities you interact with every day, from the weather, to the stock market, to your drive to work, to the estimated cost of the repainting of your house, or the estimated cost of a kitchen remodel.

Anything can be estimated or forecast. All that has to happen is the desire to learn how. Since the purpose of estimates is to improve the probability of success for the project, the estimates start by providing information to those paying for the project. This is an immutable principle of business.

Value is exchanged for the cost of that value. We can't know the value of something until we know its cost. From the kitchen cabinets, to the garden upgrade, to the software for Medicaid enrollment. It's this simple:

Probability theory is nothing but common sense reduced to calculation — Pierre-Simon Laplace (1749-1827)

So when you hear we can't forecast the future, or estimates are evil, or we can't know what we need to do until we start doing, focus on the last part of Laplace's quote — if you don't apply probability theory and its partner statistics, those speakers are correct, and missing that basic common sense. If we apply basic statistical thinking to project management issues, we can calculate the probability of anything. The resulting probability may not have a sufficient confidence level - but we can calculate it nonetheless.
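"Calculate the probability of anything" looks like this in practice: a small Monte Carlo sketch estimating the probability that three sequential tasks finish within twelve months. The triangular distributions and their parameters are invented for illustration:

```python
import random

# Monte Carlo sketch: probability the project finishes within 12 months,
# given three sequential tasks whose durations are triangular random
# variables (low, most-likely, high). All numbers are illustrative.
random.seed(42)

tasks = [(2, 3, 6), (3, 4, 8), (2, 3, 7)]  # (low, mode, high) in months
trials = 100_000
under = sum(
    sum(random.triangular(lo, hi, mode) for lo, mode, hi in tasks) <= 12
    for _ in range(trials)
)
p = under / trials
print(f"P(finish <= 12 months) ~= {p:.2f}")
```

The answer is a probability with a confidence that depends on the quality of the input distributions - but it is a calculation, not a guess.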

When you hear we can make decisions without estimating the cost, schedule, or capability impacts of those decisions, consider Laplace and the nonsense of that notion. For a recent example of how to do the math for forecasting the future behaviour of a project in a specific domain, see Earned Value Meets Big Data and the annotated briefing of the same paper.

It is remarkable that a science which began with the consideration of games of chance should become the most important object of human knowledge — Pierre-Simon Laplace, 1812.

The notion that all project variables are Random Variables is not well understood in many instances, especially in the agile community and among those suggesting that estimates of cost, schedule, and performance are not needed to make business decisions.

In some agile paradigms, fixed-duration sprints mortgage the future by pushing unfinished or un-started features to future sprints, resulting in a Minimal Viable Features outcome rather than the Needed Capabilities for the business case or mission success.

While possibly useful in some domains, in many others the minimum features cannot be assumed to be the same as the required features. Without all the required features, the system is non-functional. Management Reserve, schedule margin, and cost margin are needed to protect those required features, their cost, and their schedule from the random behaviours of the project variables.

When developing products or services using other people's money, it is incumbent on us to have some understanding of how these random variables behave on their own and interact with each other. This knowledge provides the basis for making decisions about how that money will result in value to those providing the money. In the absence of that knowledge, those providing the money have no way of knowing when the project will complete, how much it will cost when it is complete, and what the probability is that the capabilities produced by the project will meet the needed business, technical, or mission goals. And, most importantly, they have no way of making decisions based on those behaviours and interactions. They are deciding without the needed information about the consequences of their decisions.

I'm speaking at the ICEAA conference here in Denver (not on travel for once) on forecasting the Estimate At Completion (EAC) for large complex programs using the Box-Jenkins algorithm, and on the Cure for Unanticipated Growth in EAC. The notion of making estimates of cost, schedule, and performance of something goes back to the beginning of all projects - from the Egyptians to modern times. Projects that bend metal, projects that develop new life forms, projects that write software in exchange for money.
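A toy version of that style of forecast - an AR(1) model, the simplest member of the Box-Jenkins (ARIMA) family - can be hand-rolled in a few lines. The monthly Cost Performance Index (CPI) numbers below are made up for illustration; real EAC forecasting fits a full ARIMA model to risk-adjusted actuals:

```python
import statistics

# Illustrative monthly CPI actuals (made-up data).
cpi = [1.00, 0.97, 0.95, 0.96, 0.93, 0.92, 0.90, 0.91]

mean = statistics.fmean(cpi)
dev = [x - mean for x in cpi]

# Least-squares AR(1) coefficient on the mean-centered series:
# phi = sum(d[t] * d[t-1]) / sum(d[t-1]^2)
phi = sum(a * b for a, b in zip(dev[1:], dev[:-1])) / sum(d * d for d in dev[:-1])

# Forecast the next three months; an AR(1) forecast decays toward the
# long-run mean of the fitted series.
last = dev[-1]
forecast = []
for _ in range(3):
    last *= phi
    forecast.append(mean + last)

print(f"phi={phi:.2f}, next 3 months CPI ~= {[round(f, 3) for f in forecast]}")
```

Even this crude model turns "what will performance be next quarter?" from an opinion into a calculation that can be checked against actuals as they arrive.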

Each and every project on the planet today has three variables. These variables are not independent. They are coupled in some way. Usually this coupling is non-linear, non-stationary (meaning it is evolving), and many times unknown. These variables are cost, schedule, and delivered capabilities.
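One way to see that coupling is a tiny Monte Carlo sketch in which a shared "scope growth" driver pushes cost and schedule up together, with a non-linear link to schedule. All distributions and numbers here are invented for illustration:

```python
import random
import statistics

random.seed(1)

def sample_project():
    """One sample of (cost, schedule), coupled through a shared scope driver."""
    scope = random.gauss(1.0, 0.15)                  # common driver of both
    cost = 1_000_000 * scope * random.uniform(0.9, 1.2)
    schedule = 12 * scope ** 1.3 * random.uniform(0.95, 1.15)  # non-linear coupling
    return cost, schedule

costs, schedules = zip(*(sample_project() for _ in range(50_000)))

def corr(xs, ys):
    """Pearson correlation coefficient, computed directly."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return cov / (sum((x - mx) ** 2 for x in xs)
                  * sum((y - my) ** 2 for y in ys)) ** 0.5

r = corr(costs, schedules)
print(f"cost-schedule correlation: {r:.2f}")
```

Sampling cost and schedule independently would miss this correlation entirely and understate the joint risk of a project that is both late and over budget.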

If these variables on the project are UNKNOWABLE (Black Swans), your project is in the ditch before it starts. So let's skip that excuse for not estimating the three variables of the project. Another excuse we can dispose of is our clients don't know what they want. If this is the case, someone has to pay to find out what DONE looks like in units of measure meaningful to the decision makers. Don't have this information in some form - any form - that can be used to further the conversation? Your project is a Death March project on day one.

Modeling Random Variables on Projects

If we are managing a project, or are a participant on a project, we should know something about the work we have been asked to do, the outcomes of that work, and the relationships between the elements of the work - the hull size of a ship and the cost of the propulsion system, or the size of the hardware needed to handle the number of users on the system.

But the first and most important thing to know is that all variables on projects are random variables. If you hear we can't estimate because we can't know exactly what the cost, schedule, or capabilities will be, that person may be unaware of the underlying statistical processes of all projects. Projects are not accounting. They are decision-making processes - decision making in the presence of uncertainty. Anyone seeking certainty - a manager, a customer, a provider - is going to be seriously disappointed when they discover all the project variables are random variables, with a Mode (Most Likely), a Standard Deviation, and higher-order Moments describing the shape of the Probability Distribution Function.
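Here's a small sketch of what that means for a single task duration (illustrative numbers; the same triangular formulas work in Excel): sample the variable and look at its mode, standard deviation, and third moment (skewness).

```python
import random
import statistics

random.seed(7)
# A task duration as a triangular random variable:
# low = 5, most-likely (mode) = 8, high = 20 days. Illustrative values.
samples = [random.triangular(5, 20, 8) for _ in range(200_000)]

mean = statistics.fmean(samples)
sd = statistics.stdev(samples)
# Third standardized moment (skewness): positive means a long right tail.
skew = statistics.fmean(((x - mean) / sd) ** 3 for x in samples)

print(f"mean={mean:.1f}  mode=8 (by construction)  sd={sd:.1f}  skew={skew:.2f}")
```

Note the mean is well above the mode: for right-skewed distributions like this, the "most likely" value is not the value to plan around, which is exactly why point estimates mislead.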

So here's a simple and straightforward approach to modeling our project in Excel. Let's start with some urban myths about estimating anything:

Estimating is not guessing - guessing is not only bad math, it's bad management, and it's bad development. You wouldn't guess at how to size the table space on disk for your Oracle instance. No, you wouldn't - you'd get to explain to your customer why you ran out of table space and the app crashed. ONCE. Then you'd have to explain to your customer why they should keep you as a supplier or an employee.

There is nothing new under the sun. We're not inventing new physics. (I worked a job early in my career where we did invent devices and their software that required new physics - surface acoustic wave digital filters.) Somewhere, someone has done what you have been asked to do. If you can't find them, you can build a parametric model of something that looks like what you've been asked to do. This is the "is it bigger than a bread box" game. It's also the 20 questions game. Here's an example of How To Estimate Any Software Deliverable.

Our Customers Don't Know What They Want - if this is the case, you're in a Death March project. Buy Yourdon's book, read it, and Stop Doing Stupid Things on Purpose. If you've been tasked to be the steward of your customer's money, it's time to behave like that. Inform them of the consequences of spending money without knowing what done looks like. You may be able to find out what DONE looks like, but it's going to cost money. And spending that money on the project may be a true waste, since the customer will be spending money exploring while you are writing code for things that may never be needed to deliver those pesky Capabilities that produce the needed business value.

We don't know how to estimate - that's acceptable. Many don't; it's a common situation. But saying we can't estimate, or estimating is a waste, or we can't forecast the future, ignores a very large body of literature, books, courses, and tools that can be the foundation of learning how to estimate, which you can find here.

In The End, It Is This Simple

If you're building products or providing services using someone else's money, you are professionally obligated to provide some sort of understanding of the cost, schedule, and probability of showing up with the needed capabilities. Since each of those is interconnected in some non-linear, non-stationary manner, with unknown but knowable correlations between the lowest-level work elements, assuming that a fixed budget and a set of past performance measurements of work will provide a credible forecast of future performance is naive at best.

As a Final Aside

Most of the conversation around agile software development assumes a list of work in the backlog, but completely ignores the correlations between these work elements in the future. This is the myth that the cost of change is flat in the agile world. It violates systems logic. The idea of flat cost is true IF AND ONLY IF - and this is a BIG IF - the produced components have no inter-dependencies. Once there are inter-dependencies, a change in one drives changes in others, in unknown and possibly unknowable ways in the absence of a detailed architectural description. A Design Structure Matrix applied to the architecture question is one way to assess the cost of any change in any system.
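A minimal sketch of that Design Structure Matrix idea: represent the dependencies as an adjacency matrix and trace, transitively, which components a change ripples into. The four components and their dependencies are invented for illustration:

```python
# Sketch of a Design Structure Matrix (DSM) as an adjacency matrix.
# dsm[i][j] = 1 means component i depends on component j,
# so a change to j can propagate to i. All names and links are illustrative.

components = ["UI", "API", "DB", "Auth"]
dsm = [
    [0, 1, 0, 1],  # UI depends on API and Auth
    [0, 0, 1, 1],  # API depends on DB and Auth
    [0, 0, 0, 0],  # DB depends on nothing
    [0, 0, 1, 0],  # Auth depends on DB
]

def change_impact(changed: int) -> set[str]:
    """Transitively find every component affected by changing one component."""
    affected, frontier = set(), {changed}
    while frontier:
        nxt = {i for i in range(len(dsm))
               for j in frontier if dsm[i][j] and i not in affected}
        affected |= frontier
        frontier = nxt - affected
    return {components[i] for i in affected}

print(change_impact(components.index("DB")))  # a DB change ripples into every component here
```

Notice the asymmetry: a change to DB touches everything, while a change to UI touches nothing else. The cost of change is anything but flat once dependencies exist.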

This branch of mathematics [probability] is the only one, I believe, in which good writers get results entirely erroneous - Charles Sanders Peirce

When it is said you can't estimate the future, or we can't know the total cost, think of Mr. Peirce. All things in project management are probabilistic, driven by the underlying statistical processes of irreducible and reducible uncertainty. Rarely, if ever, are these uncertainties Unknowable, in the mathematical sense.
