When “examining historical data” comes up as a method of building project estimates, the usual reaction is something like “OK, might be helpful, but not complex enough, not convincing enough”. And that is exactly the point.
Examination of historical data is typically listed as one of the available estimation methods. Rightly so, but it should never be used alone, never as the sole basis for an estimate. The objection quoted at the beginning of this article is completely correct: an estimate built only on historical data is not convincing enough, because each project is different, and its circumstances and context inevitably differ as well. With a bit of humor: as Gatsby is told in the recent movie of The Great Gatsby, “you can’t repeat the past”.
On the other hand, building an estimate without examining historical data is not convincing AT ALL. Could we even imagine ignoring past experience and events when forecasting the future? Every human being uses and benefits from individual or collective experience, even unconsciously. That is how we grow and become more mature and reliable. And this is exactly what we want to achieve with our estimates as well.
From our perspective, examination of historical data should be an obligatory step in the process of building estimates. It creates a baseline, a starting point for any method used later to refine and finalize the estimate. If nothing else, it can help set priorities and weights for the different factors considered in other estimation methods.
And with 99.99% probability we can find some historical data that is applicable to our estimates. Have you ever experienced a project so unique that nothing from the past was applicable? Typical questions to ask of the historical data are:
- how long did this type of activity take in similar projects before?
- what was the difference between the initial estimate and the final duration? (what was the planning mistake?)
- what was the reason for the difference? could it apply to us as well?
- was there some link between the achieved performance and external factors that could repeat in our current project?
- does the duration vary with the people involved? (considering the different efficiency levels of different actors)
etc.
Now the question is how to examine historical data consciously and systematically. We see three major success factors.
Know what you are missing
By definition, historical data are more or less infinite. And with the exponential growth in the pace and volume of information in today's world, it will only get worse (the Big Data phenomenon). So what to look for? How to find relevant data? To stay pragmatic and reasonable, you need to narrow the problem down to a manageable extent.
The answer comes from the fact that we work with estimates. We do not know the exact figures (otherwise we would be using them), because there is some factor of uncertainty that we are trying to “guess”, to estimate. In such a situation we normally also understand WHY we have that uncertainty, so we know what is missing, what the variable in our equation is. And that is the element to examine in history.
For example, today I can’t be sure about the duration of a SW module development, because I have no specification ready yet. But I know at least the expected size of the development (number of interfaces, or function points), so I can search history for similar implementations and see the typical durations of similar tasks. I can see whether different vendors implemented them with different durations. And I can see what the difference was between the original estimates and the final durations (if not a typical PI factor), so I can anticipate the same bias in our project (or understand the reasons and mitigate them… but this belongs more to Risk Management than to Estimation).
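The search described in this example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the records, the field names (`vendor`, `interfaces`, `estimated_days`, `actual_days`) and the similarity rule (interface count within a tolerance) are all invented here and would have to be replaced by whatever your own history actually contains.

```python
from statistics import median

# Hypothetical history records; every field name and figure is invented for illustration.
history = [
    {"vendor": "A", "interfaces": 4, "estimated_days": 20, "actual_days": 26},
    {"vendor": "A", "interfaces": 6, "estimated_days": 30, "actual_days": 36},
    {"vendor": "B", "interfaces": 5, "estimated_days": 25, "actual_days": 40},
    {"vendor": "B", "interfaces": 12, "estimated_days": 60, "actual_days": 90},
]


def similar_tasks(records, interfaces, tolerance=2):
    """Keep past tasks whose interface count is within `tolerance` of ours."""
    return [r for r in records if abs(r["interfaces"] - interfaces) <= tolerance]


def typical_days_per_interface(records):
    """Median actual duration per interface across the given records."""
    return median(r["actual_days"] / r["interfaces"] for r in records)


def typical_overrun(records):
    """Median ratio of actual to originally estimated duration (1.0 = on plan)."""
    return median(r["actual_days"] / r["estimated_days"] for r in records)


def overrun_by_vendor(records):
    """Median overrun ratio per vendor, to see whether vendors differ."""
    ratios = {}
    for r in records:
        ratios.setdefault(r["vendor"], []).append(r["actual_days"] / r["estimated_days"])
    return {vendor: median(values) for vendor, values in ratios.items()}


# We expect about 5 interfaces, so we look only at comparable past tasks.
similar = similar_tasks(history, interfaces=5)
baseline_days = 5 * typical_days_per_interface(similar)
overrun = typical_overrun(similar)
vendor_overrun = overrun_by_vendor(history)
```

The median is used instead of the mean so that one extreme project does not dominate the baseline; that is a design choice of this sketch, not a rule.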
Be prepared
The most difficult, or let’s say the most energy-consuming, task when examining historical data is not the examination itself. It is rather gathering and, more importantly, interpreting the data. In the first point above we discussed “how to find relevant data”; in this point we add “and how to reach it reasonably fast”.
Even if we imagine that you gather everything relevant into some XLS file (your personal knowledge database), after years of experience your file will be heavy, and digesting anything relevant from it will be practically impossible. The gold is not (only) in gathering data. The gold is in anticipating future questions and uses of the data, and in interpreting the data at the time it is recorded.
Honestly, almost nobody does this properly. But modern PMOs understand the benefits of this activity and put reasonable pressure on Project Managers to act in this direction:
- exhaustive and structured project evaluations (lessons learned) – not formalistic, but really focused on practical use. This is practically the interpretation of project data.
- templating and archiving of project data – the same structure allows easier search, data collection and processing (e.g. statistics about task durations etc.)
- as a new phenomenon, data mining and Big Data tools able to digest interesting facts from the heavy PMO databases of past and current projects
If you are not supported by such a modern PMO (or if you are an individual PM), try at minimum to:
- maintain your own knowledge base, with a well-thought-out and designed structure
- keep the same structure (characteristics, naming conventions etc.) of project data, such as WBS tasks
- try to keep it in a form that will allow future automated processing
- play with it, try to find some common characteristics, common behaviors. People are sometimes surprised how much value can be found in simple data exploration
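As an illustration of what “a form that will allow future automated processing” could look like for an individual PM, here is a minimal Python sketch. The record layout (`TaskRecord` and all of its fields) is a hypothetical example, not a recommended standard; the point is only that every task is stored with the same fields, in a plain format such as CSV, and that the interpretation (`lesson`) is recorded together with the numbers.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields


@dataclass
class TaskRecord:
    """One closed task; the field set is an assumption made for this sketch."""
    project: str
    wbs_code: str
    task_type: str
    estimated_days: float
    actual_days: float
    lesson: str  # interpretation, written down when the task is closed


def write_records(records, stream):
    """Write records as CSV with a fixed header taken from the dataclass."""
    writer = csv.DictWriter(stream, fieldnames=[f.name for f in fields(TaskRecord)])
    writer.writeheader()
    for record in records:
        writer.writerow(asdict(record))


def read_records(stream):
    """Read the CSV back, converting the numeric columns to floats."""
    return [
        TaskRecord(
            project=row["project"],
            wbs_code=row["wbs_code"],
            task_type=row["task_type"],
            estimated_days=float(row["estimated_days"]),
            actual_days=float(row["actual_days"]),
            lesson=row["lesson"],
        )
        for row in csv.DictReader(stream)
    ]


# Invented sample data, stored and reloaded through an in-memory file.
records = [
    TaskRecord("CRM rollout", "1.2.3", "interface development", 20.0, 26.0,
               "Specification was frozen late, which caused rework"),
    TaskRecord("Billing upgrade", "2.1.1", "interface development", 30.0, 36.0,
               "Vendor test environment was unavailable for a week"),
]
buffer = io.StringIO()
write_records(records, buffer)
buffer.seek(0)
restored = read_records(buffer)
```

In real use the in-memory buffer would be replaced by a file opened with `newline=""`, as the Python `csv` documentation recommends.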
The rest is statistics
One might expect the main content of this article to be a list of methods for examining historical data. In fact, this is probably the simplest part of the exercise, or if not the simplest, then the least complicated or sophisticated. Once you have your data collected and you understand it (you have it interpreted), you have a variety of statistical methods to apply. If you are not skilled enough here, at this point it should be quite easy to involve an expert and ask for help with understanding the data and potentially with the forecast, because most of the inputs for that work are ready.
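To give one small example of such a statistical method, the sketch below turns a list of past overrun ratios (actual duration divided by estimated duration) into a low / typical / high range for a new estimate, using quartiles. The ratios and the 30-day base estimate are invented, and quartiles are just one possible choice; an expert might prefer a different technique.

```python
from statistics import quantiles

# Hypothetical overrun ratios (actual / estimated) from past, comparable tasks.
overrun_ratios = [1.0, 1.1, 1.15, 1.2, 1.3, 1.3, 1.4, 1.6, 1.8]


def estimate_range(base_estimate_days, ratios):
    """Scale a base estimate by the quartiles of past overrun ratios.

    Returns (low, typical, high), i.e. the base estimate multiplied by the
    first quartile, the median and the third quartile of the ratios.
    """
    q1, q2, q3 = quantiles(ratios, n=4)
    return (base_estimate_days * q1, base_estimate_days * q2, base_estimate_days * q3)


low, typical, high = estimate_range(30, overrun_ratios)
print(f"low={low:.1f}, typical={typical:.1f}, high={high:.1f} days")
```

`statistics.quantiles` is available from Python 3.8 and needs at least two data points; with very little history the resulting range is of course only a rough indication.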
Not to forget, some historical data can be found in your current project as well. If you are not at its very start, you may already have collected some “historical” data there, even if the history does not go far into the past. For example, the Earned Value Analysis (EVA) method in practice helps you improve your estimate of the final project outcome based on the project performance achieved so far. A different perspective, but the same topic, isn’t it?
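The EVA idea can be shown with the standard earned value indices: CPI = EV / AC, SPI = EV / PV, and the common forecast EAC = BAC / CPI, which assumes that the cost efficiency achieved so far continues to the end of the project. The Python sketch below uses invented figures; other EAC formulas exist for other assumptions.

```python
def cost_performance_index(earned_value, actual_cost):
    """CPI = EV / AC; below 1.0 means the work costs more than planned."""
    return earned_value / actual_cost


def schedule_performance_index(earned_value, planned_value):
    """SPI = EV / PV; below 1.0 means less work is done than planned."""
    return earned_value / planned_value


def estimate_at_completion(budget_at_completion, earned_value, actual_cost):
    """EAC = BAC / CPI, assuming the cost efficiency so far continues."""
    return budget_at_completion / cost_performance_index(earned_value, actual_cost)


# Invented status of a running project (all amounts in the same currency).
bac = 100_000  # budget at completion
pv = 40_000    # planned value of the work scheduled so far
ev = 32_000    # earned value of the work actually completed
ac = 40_000    # actual cost spent so far

cpi = cost_performance_index(ev, ac)
spi = schedule_performance_index(ev, pv)
eac = estimate_at_completion(bac, ev, ac)
print(f"CPI={cpi:.2f}, SPI={spi:.2f}, EAC={eac:,.0f}")
```

Here the project's own short history (a CPI of 0.8) revises the original budget of 100,000 to a forecast of about 125,000.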