Using story points for estimation – the big view

13 May, 2016 (13:56) | Uncategorized | By: seth

Most agile software teams choose to define User Stories as the increment of value they deliver, with a given release or product comprising several of these stories. To figure out what they can do and when, teams need to estimate these stories. Estimating in time units such as days, person-days, or ideal days is a big mistake. You should instead use unit-less story points. Here is why.


Unit-less story points are better than estimating in time units

Relative estimation is easier than absolute estimation

If I were to ask you how tall the Empire State Building in NYC is, could you tell me in feet or meters? By asking for a measurement with an actual length unit, I am asking for an absolute estimate. (Similarly, asking for time in days is asking for an absolute estimate.) But what if I showed you the diagram below and asked you how tall it is relative to the “Great Pyramid”? You could answer that it is about three times as tall as the pyramid. Similarly, you can say the Eiffel Tower is about twice as tall as the Great Pyramid. These are relative estimates, and we tend to be a lot better at making these than we are at making absolute estimates.

The way this works with stories is that you pick several small-ish stories you have already completed and assign them one or two story points, where the two-point stories are about twice as big as the one-point ones. Now you have a baseline against which to measure stories you have not yet started. Going forward, pick a new story to be estimated; if it is twice as big as a one-pointer, assign it two story points. If it is twice as big as a two-pointer, assign it four story points. Keep in mind that a story point means something different for every team, depending on the baseline that team chose.


Time depends on who is doing the work

Another reason not to use time-based estimation is that a veteran developer who has been on the team for years will likely be able to complete a story in a fraction of the time it would take the new intern. So calling a story a 3-day story is meaningless: is it 3 days for the pro or 3 days for the intern? Instead we should measure story size using story points.


How it works in real life

What do we mean by “size”?

We are measuring story “size”, but what does that mean? It is not time; otherwise we would just be referring to time by another name. Size is a measure of scope and complexity, and therefore ultimately of effort. Yes, this will correlate with time, but as we discussed previously, time will vary depending on who is doing the work. With size we are attempting to capture a measure that is independent of who does the work.


But we need to ultimately know time

We do ultimately need to know what will be ready for release and when it will be ready. Our stakeholders are often keenly interested in release and availability dates. This information can be calculated from your story points. Just look at what your team has historically delivered: add up the story points for all stories the team has completed and divide by the time it took to complete them. This is called velocity (or delivery rate), and it tells us what the team is capable of doing. We can then look at all the stories that remain and add up their points. Using the velocity, we can calculate how much time it will take to complete all those stories. In other words:

Time to completion = Remaining points / Velocity (in points per time)

Of course this is the simplified version. For more details, see this.
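To make the arithmetic concrete, here is a minimal Python sketch of the simplified formula. The function names and the sprint numbers are made up for illustration, and velocity is assumed to be a plain average of points per sprint; a real team may prefer a rolling average or a range.

```python
def velocity(points_per_sprint):
    """Average story points delivered per sprint (simple mean of the history)."""
    if not points_per_sprint:
        raise ValueError("need at least one completed sprint")
    return sum(points_per_sprint) / len(points_per_sprint)


def time_to_completion(remaining_points, points_per_sprint):
    """Sprints needed to finish the remaining points at the historical velocity."""
    v = velocity(points_per_sprint)
    if v <= 0:
        raise ValueError("velocity must be positive")
    return remaining_points / v


# Hypothetical history: points completed in each of the last four sprints.
history = [21, 18, 24, 21]
# Hypothetical backlog: total points of all stories not yet done.
remaining = 63

print(velocity(history))                       # 21.0 points per sprint
print(time_to_completion(remaining, history))  # 3.0 sprints
```

The result is in whatever time unit the history uses, here sprints. Multiply by the sprint length to get calendar time.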


Story points can be fun

Many teams choose to make up fun units rather than use generic story points. One team I know uses “aspirins”, since the bigger a story is, the bigger headache it can be. Other teams use “jelly beans” or “gummy bears”.


Some best practices

Estimate as a group

Group exercises like Planning Poker enable you to engage the whole team, bringing the power of collaboration to your estimates. When doing group estimation be sure to use methods where everyone estimates simultaneously. Otherwise the first person to speak will create an anchor bias in subsequent answers from other folks in the group.


Avoid false precision by using an increasing sequence

When assigning story points you should use a sequence with increasing gaps between consecutive numbers. Many teams use powers of two (1, 2, 4, 8, 16…) or a Fibonacci sequence (1, 2, 3, 5, 8, 13, 21…). The reason for this is that a small story is easier to understand and leaves less room for estimation error. As a story gets bigger, your estimate will be less precise. Therefore deciding between 15, 16, or 17 story points is meaningless; with powers of two you would assign 16 and move on.
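As a rough illustration, the snippet below snaps a raw number onto a team's chosen sequence. The function name is hypothetical, and picking the nearest value (with ties going to the larger one) is an assumption; some teams prefer always rounding up to the next value in the sequence.

```python
FIBONACCI = (1, 2, 3, 5, 8, 13, 21)
POWERS_OF_TWO = (1, 2, 4, 8, 16, 32)


def snap_to_scale(raw, scale):
    """Return the value in `scale` closest to `raw`; ties go to the larger value."""
    # Sort key: smallest distance first, then the larger scale value on a tie.
    return min(scale, key=lambda s: (abs(s - raw), -s))


# 15, 16, and 17 all collapse to the same estimate on a powers-of-two scale.
for raw in (15, 16, 17):
    print(raw, "->", snap_to_scale(raw, POWERS_OF_TWO))  # each prints 16
```

The same helper works with FIBONACCI or any other increasing sequence the team has agreed on.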

Ease of estimation is one reason you should favor smaller stories, and when a story estimates as big, you should try to break it into smaller stories. Another reason to favor small stories is that you will finish them sooner. You will be able to show value to your stakeholders more often and get feedback to keep your project on track.


Getting started

The above information is all available elsewhere on the internet, but I could not find a single source that discusses all the points above together. With all this info in one place, you can now get some ideas on how to proceed with your team. As you do, you will need more details, and for those you can search and dive deeper on each issue I mention above.


Thank you to Ed Tellman and Alex Zotos who helped me with this blog.


Comment from Alex
Time May 16, 2016 at 4:21 pm

One thing I have seen teams doing “wrong” is that they don’t round up properly after averaging their estimates.

For example, if the average is 10.4 they will assign 11 and not 13 (if they use Fibonacci). However, the rounded-up value should always be one of the numbers in your system’s sequence.

Comment from PD
Time June 12, 2016 at 11:30 am

It started with defined function points at one time, followed by time-based estimation, and now story points; some teams even stop at T-shirt-size costing. It’s definitely getting better, but my humble belief is that it is not solving the problem that the business and the engineering teams wanted to solve.

The problem is always about the different types of personalities on the team. This has roots in Agile, and therefore I’ll use this word. A healthy Agile team (or swimlane team, or two-pizza team) consists of Generalized Specialists, and to be honest, no matter how much you tame, the estimates for the first 3-4 sprints will be wrong because the team is still learning what it is capable of doing and at what speed. From the 5th sprint onwards, the team gets the hang of what it even means to put 2 points on a story. But guess what: their definition of 2 will be different from another team’s definition of 2. So in general, the business and the stakeholders never get the estimates in their own language. This even causes friction and frustration between interacting teams, as they don’t know when and how the team providing a dependency will be ready. Oh yeah, and add to that the team churn. Folks joining and leaving just adds to the chaos, and believe me, this churn happens at a much higher frequency than we think.

My problem with these methodologies is that they are so generalized that they leave it to the teams to go figure out the other complexities. While being guiding (as against prescriptive) is supposed to be helpful, this has only caused more and more pain, and the engineering teams are spending too much time on these things instead of the real work.

I am a strong supporter of Agile, but after going through so many teams and seeing such effects, I still have a fondness for the good old days of function points or time-based estimates. They cannot be perfect, but they are much closer than any other method.

Comment from Ralph Case
Time June 13, 2016 at 6:40 am

The Great Pyramid probably took about 20 years to complete; the Eiffel Tower, just over two years; and the Empire State Building, just 410 days. The make-up of the team and the tools you use are significant. If the team and tools are changing, it can be hard to apply previous velocity to the current project. If the team and tools aren’t changing, why not?

Comment from seth
Time June 13, 2016 at 10:33 am

Ralph, yes, a challenge to establishing consistent velocity is consistency of the team.

And (even though you already know this), just to clarify for other readers, the image was intended to convey the ease of relative estimation using *height*, not effort, in this case.
