Sixth habit: done, done, done, done, done.
Posted in Seven habits, scrum on January 25th, 2010 by Dion Nicolaas
This is part of Seven habits of highly effective scrum-teams, a book in seven parts about scrum for teams.
The Team moved all stories (except the last one) to ‘Done’. Quite a successful sprint that was! But in the next sprint, things didn’t go so well. First of all, the new software still had to be integrated into the main product. Then they had to create the release packages to release the new software to production. They suddenly realized they hadn’t planned for that.
So they adapted the plan, did the work, and went on with the new sprint. But then the bug reports started flowing in...
Definition of ‘done’
It sounds odd to talk a lot about what ‘done’ really means: after all, when it’s done, it’s done, isn’t it? But actually it is well worth it to spend some time on the team’s ‘definition of done’.
Is your software ready when it passes all tests? Does that include integration tests, if it is part of a larger system? Was it tested by actual users? And does it really work in real life, on end-users’ machines?
In a talk at Google[1], Jeff Sutherland, co-inventor of scrum, explains the evolution of the definition of done at PatientKeeper. First a feature was considered done when it was unit tested, but the stakeholders were not very happy with that. Then they considered a feature ‘done done’ when it was system tested as well. They moved to ‘done done done’ when it was acceptance tested by the users. Then they considered it ‘done done done done’ when it was taken into production by at least four end users. But in the end they settled for ‘done done done done done’: released to production for all the users they have. In Jeff’s words: ‘Let’s see if the phone rings in the next hour. That’s our demo. And if the phone didn’t ring, it was a great demo.’
PatientKeeper invested a lot of time in advancing their definition of done as far towards the end user as possible. That’s the way to exchange as much unplanned work for planned work as possible.
Testing
If it is not practical, or not even possible, to release to production at the end of your sprint, your definition of done should include as much testing as possible. Just unit testing is the bare minimum; system testing or ‘end-to-end’ testing is a lot better. If at all possible, acceptance testing should be part of the sprint as well.
If you don’t do all levels of testing, stories that were considered done will come back in a later sprint. When they come back, and how urgent they are, depends entirely on people outside your team. In the meantime, the team has moved on and might be working on something different.
Just like your definition of done, your test strategy might need a lot of work. Just make sure you pay it the attention it deserves.
Test driven development
Test driven development means you write automated tests first, then implement the code to make the tests pass. Ideally, when the whole system is ready, you have automated tests for every bit and piece of the whole system.
Apart from the fact that test driven development shapes your development in a good way, it will also make sure you have automated (regression) tests for your system. This makes it very helpful in ensuring proper testing is part of your definition of done.
If you find bugs later, you first add tests to reproduce those bugs, and only then fix them. Then you run all the tests again. This will make sure the bug is really fixed, will not return later, and you didn’t break anything else in the process.
Test driven development is a possible test strategy that may fit in well with yours. There is a lot of information about it on-line.
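The bug-fixing routine described above can be sketched in a few lines. This is only an illustration: the function average_story_points, the reported bug (a crash on an empty sprint) and the test names are all made up for the example, and Python's standard unittest module is just one of many tools you could use.

```python
import unittest


def average_story_points(points):
    """Return the mean story points per story (hypothetical example function)."""
    # The fix: this guard was added only after the regression test below
    # had been written and seen to fail with a ZeroDivisionError.
    if not points:
        return 0.0
    return sum(points) / len(points)


class AverageStoryPointsTest(unittest.TestCase):
    def test_typical_sprint(self):
        # Existing test: proves the fix did not break normal behaviour.
        self.assertEqual(average_story_points([3, 5, 8]), 16 / 3)

    def test_empty_sprint_regression(self):
        # Written first, to reproduce the (hypothetical) reported bug.
        self.assertEqual(average_story_points([]), 0.0)


def run_all_tests():
    """Run the whole suite again, not just the new test."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(AverageStoryPointsTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Because the regression test stays in the suite, the same bug cannot quietly return in a later sprint.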
Dealing with bugs
Even if ‘done’ means ‘released to production’ in your team, you still can’t be 100% sure a finished story won’t come back to you. However well tested, bugs might be discovered after the release. And bugs in production software usually have a very high priority.
Incoming bugs almost always disrupt your sprint. There is little you can do about it. The best thing you can do is handle a bug like all other unplanned work: make a sticky for it, put it on the board, give it the priority it needs (‘should we drop everything right now or can I finish lunch first?’) and let the team pick it up.
The worst thing you can do is try to ignore it. Bugs will pile up and then kill your sprint completely.
Planned versus unplanned
All those unplanned items look really ugly on your scrum board. And they are not really prioritized, or sized, or handled top to bottom, as other items are. Why are unplanned items handled so sloppily in a neat and organized system like scrum?
First of all: unplanned items are, well, unplanned. You didn’t plan for them. The fact that you even have a place for them on the board is already pretty organized. You have to make sure though that they deserve the status of ‘unplanned’: they should have high enough priority to disrupt your sprint, and they should be small enough not to kill it completely. If the priority is not really high, put it off till next sprint. If it is so large that you cannot expect to finish it within weeks, but it has to be done now, kill the sprint and do a new planning session.
So a ‘good’ unplanned item is a small task with a priority high enough to disrupt your sprint. Do it now. Then it’s gone. In effect, the unplanned items area sits above all other stories on the scrum board. If you handle unplanned items really fast, they have the least impact on your sprint.
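The three outcomes above (do it now, put it off, or replan) can be written down as a small decision rule. This is a rough sketch, not a prescription: the names UnplannedItem, Action and triage are invented for the example, and the two-day size limit is an arbitrary assumption that each team would have to choose for itself. It also simplifies the text by treating every urgent item above that limit as a reason to replan.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    DO_NOW = "put it on the board and pick it up now"
    NEXT_SPRINT = "put it off till next sprint"
    REPLAN = "kill the sprint and do a new planning session"


@dataclass
class UnplannedItem:
    title: str
    urgent: bool           # is the priority high enough to disrupt the sprint?
    estimated_days: float  # rough size guess; the unit is an assumption


def triage(item, max_days_for_unplanned=2.0):
    """Decide what to do with an incoming item (threshold is hypothetical)."""
    if not item.urgent:
        return Action.NEXT_SPRINT
    if item.estimated_days <= max_days_for_unplanned:
        return Action.DO_NOW
    # Urgent but too large to absorb: the sprint cannot survive it.
    return Action.REPLAN
```

For example, triage(UnplannedItem("login page is down", urgent=True, estimated_days=0.5)) returns Action.DO_NOW, while the same item with urgent=False would wait for the next sprint.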
It’ll show in your velocity
Of course, handling unplanned items will take time: if you do a lot of unplanned work, you will do less planned work. In other words: the team’s velocity will go down.
Hopefully, the amount of unplanned work per sprint is more or less stable. It grows a bit after a big release and shrinks a little in the holiday season, but that is all difficult to predict (it’s unplanned, after all). Small variations don’t really matter: on average, your velocity will remain more or less stable.
If you have lots of unplanned work in one sprint and very little in another, that is a reason for concern. Where does it come from, and why does it come in bursts? This is a good topic for the retrospective.
If you always have a lot of unplanned items, the variance between sprints will be large as well, meaning your velocity will jump up and down. That is not a good thing, because the predictability of the team will suffer. That is a good reason to keep the number of unplanned items as low as possible.
If your amount of unplanned work is small enough, it will have little impact on your sprint. The length of your sprint will provide enough buffer space to compensate for some unplanned work.
To summarize: plan everything you can, but don’t bother about the things you cannot plan. They will be accounted for in your velocity.
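To make the link between unplanned work and velocity concrete, here is a small sketch with made-up numbers. It assumes the team notes a rough size for unplanned work after the fact, which the approach above does not require; the function velocity_report and both example teams are hypothetical.

```python
from statistics import mean, pstdev


def velocity_report(sprints):
    """Summarize sprints given as (planned_points_done, unplanned_points_done)."""
    planned = [p for p, _ in sprints]
    unplanned = [u for _, u in sprints]
    return {
        "mean_velocity": mean(planned),       # what the team can plan with
        "velocity_spread": pstdev(planned),   # how far velocity jumps around
        "mean_unplanned": mean(unplanned),
        "unplanned_spread": pstdev(unplanned),
    }


# Made-up data: both teams finish about 30 points of work per sprint in total.
steady = [(26, 4), (25, 5), (27, 3)]    # small, stable amount of unplanned work
bursty = [(29, 1), (15, 15), (28, 2)]   # unplanned work arrives in bursts

steady_report = velocity_report(steady)
bursty_report = velocity_report(bursty)
```

With these numbers the bursty team has both a lower average velocity and a much larger spread, so a release plan based on its velocity would be far less predictable.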
More tools is better
Apart from bugs, another source of unplanned work is lack of automation. If you have manual tests, manual setup work, manual deployments, or manual database maintenance, the team will always be needed to do these, and that will take a lot of time. Automated tools take even more time to create, but that will pay off later. Tools make tedious work easier and more repeatable, and ideally allow the team to let someone else do the work.
Writing tools is work that can simply be planned: create a story for it and do it. This is a form of ‘sharpening the saw’: don’t get caught in boring and time-consuming work if investing a little bit more can get rid of it. Planning the creation of a tool is easy and predictable. Manual work is much harder to plan, as it can come back frequently.
‘Fire fighters’
If the amount of operational work is high, either because of buggy software, or because your tools aren’t mature enough, it may be better to split off part of your team into a new team, the ‘fire fighters’. While they take care of all operational work, the rest of the team can work in peace. The fire fighters’ main focus should of course be to fix bugs and build tools; if they do that right, the ‘fire squad’ can be merged back into the team later.
Component teams get lots of unplanned work
Some teams don’t work independently, but work on a component of a larger system. Even though they can do unit testing, they probably can’t do system testing, because for that, the other teams, responsible for the other components, need to do their work as well, and then the integration work needs to be done. Only then can the system testing, acceptance testing and release be done.
This is a difficult situation. It is almost guaranteed that bugs will be found, bits will be missing and help will be needed. But the team can never properly plan that, as the timing depends on other teams. So a lot of unplanned work will be generated.
Furthermore, this structure will lead to a lot of managerial overhead. For each new feature that is planned, the product owner of the team needs to deal with the product owners of other teams to synchronize the schedules. Instead of people being on the critical path, you now have complete teams on the critical path, for every single new feature.
Feature teams
There is a way to prevent unplanned work and have a much more advanced definition of done: by forming feature teams. A feature team is responsible for a new feature, but will change every component that needs to be changed to implement that feature.
It is easy to see that feature teams will not create unnecessary critical paths: the team itself will do everything necessary to implement the new feature, and will not have to wait for other teams. It can also have a very advanced definition of done, because the team can consider the feature done when it is released to production: after all, they implement it all the way.
It does require something from the teams, though: they will need to learn something about each system, and they will need to work together with all other feature teams that might work on the same components. It will probably take some time before the teams work smoothly together.
This will have the added benefit of a lot of knowledge sharing, though: by the time you have real feature teams, they are more or less interchangeable. This will reduce the managerial overhead to a minimum.
If you have component teams, the change to feature teams is a large reorganization and can be difficult to realize. A possible way to form your initial feature teams is to spread the members of each component team over the feature teams, one per team. That way each feature team will have an expert for every component, which means that each team as a whole has all the knowledge necessary to implement the feature.
The end user’s done
Why is it so important to pay so much attention to the definition of done? Because your end users have only one definition of done: It Just Works. Perfectly. Anything short of that will make the end user unhappy, so they will come back and haunt you. (Or even worse: they will leave you for a competitor.)
The further ‘down the chain’ you shift your definition of done, the more work will be in your sprint. And more work in your sprint means less work after it. Work that comes in after your sprint is usually unplanned, or at least not planned very far ahead, which makes your release planning inaccurate. If all the work is done in the sprint, you can have a release planning that goes several sprints into the future and really contains all the work that has to be done.
This will not reduce the amount of work: that will remain the same. But a good definition of done will ensure an accurate estimate of that work well ahead of time, and that is what planning is all about. In the end, all software becomes ‘done done done done done’. The question is how far ahead you could see it coming.
[1] Scrum Tuning: Lessons learned from Scrum implementation at Google
Next chapter: Seventh habit: power to the team