How banning Scrum increased my team’s velocity and happiness

10/08/2022 — Software Development, Lessons Learned — 11 min read

Ok, the title of this post is a little bit click-baity, as it’s actually about using scrumban instead of outright banning scrum, but the sentiment is something I’m sure most people working on a software delivery team will find alluring.

In my day job I’m a cell lead working on a project that has evolved through multiple stages:

We started by taking on an existing codebase that had been running in production for 12 years with the view of keeping the live site working while refactoring the system
Then when the 2020 pandemic hit we found ourselves frantically expanding the client’s offering to additional markets in order to reduce some of the uncertainty that the situation created
Finally, after the pandemic panic, we were asked to pick up an existing project to deliver a new site on a newer technology stack. The project had been shelved due to the pandemic and the team hadn’t been part of it (in fact, I found myself leading a new team)

Preparing to scrum

During the time that the project we were picking up was shelved, the architects had been re-evaluating the technologies they wanted to use and moved from a series of REST based microservices to a GraphQL solution that used a GraphQL gateway to route operations to separate GraphQL services.

This led to the team being given an architectural plan that was half as-is and half to-be, along with a requirement to make use of the existing work that had been done. This expectation was fed into the initial estimation sessions.

We were given a couple of weeks to pull together a rough outline of how we’d tackle the different problems, but those problems were shaped by the architects who, having worked on the project previously, had made assumptions about how easy it was to adapt the existing work to meet the new architecture.

After we pulled together a rough action plan, we gave t-shirt sized estimates to the different epics that made up our backlog. The Delivery Managers then turned these into boxes on a Gantt chart and converted that into a number of two-week sprints.

The first sprint contained a lot of spikes to investigate what we were actually up against during which we identified how much work was actually required to refactor the existing code to meet the new architecture — it was a lot and we ended up rewriting it all.

At the same time as the spikes were unearthing how misaligned the estimates had been, we were also trying to refine and estimate a sprint’s worth of work so we would have something to do in the next sprint.

It’s hard to accurately plan for a sprint if your stories lack detail

With the initial spikes revealing how little we knew about the work there’s an assumption that it would have been a good time to take corrective action in order to de-risk the rest of the delivery, but this would undermine the established estimates so we looked to fix on the fly.

Fixing on the fly meant we had to add as much information as we could to tickets that were being prepped for upcoming sprints while also trying to meet the sprint commitment.

This took the form of me dedicating about 50% of my time to story refinement (in order to give the team time to do their work) alongside a Business Analyst who did the bulk of the story writing. I don’t think we had a backlog of more than a sprint’s worth of stories until the eighth sprint.

There was often a lot of context switching as the delivery management side of the team wanted to de-risk parts of the delivery by running things in parallel, but this normally meant that stories didn’t tie into the bigger picture of the features being delivered properly because we didn’t have the time to think about that.

The level of detail on the stories being written was bare bones, just containing basic bullet points of what needed to be done, what was out of scope for the ticket and some hastily pulled together development notes.

Because of the context switching and the bad state of the stories, the estimation sessions were often drawn out as the team scrambled to make sense of what was being asked of them and essentially worked to build up the detail in the ticket. These sessions were never shorter than two hours.

After getting a sprint pulled together, estimated and committed to, we would also then encounter issues in development as the team started to see how there were complexities that were hidden a layer or two under the surface.

All these issues led to a brutal loop of, at best, treading water from a velocity standpoint, and any external factors had the ability to sink the team’s throughput.

Team morale also suffered due to the low velocity and an ever increasing backlog of bugs and technical debt as we tried to trim the fat from the deliverable in order to meet the two week sprint cut-off.

It’s hard to deliver a backlog of stories if the list of stories is continually growing

While on paper the delivery looked slightly on-track in reality there was an ever increasing backlog of technical debt that wasn’t charted and was hard to get prioritised because the focus of the delivery was the features, even if it took longer to build them because of the technical issues.

One of the main initiatives was to build an end to end skeleton of the system to see how everything hung together which from a risk management point of view was great as we could see where potential integration issues between systems would happen but from a backlog management perspective this was a nightmare.

Having stories across the entire user journey which raised small bits of technical debt as they were worked on made it a lot harder to focus on planning on how to pay off that technical debt because by the time you’d refined the work item another ticket had been done that duplicated the issue elsewhere.

Working around the known technical debt also proved to be a pain for the team as we would often need to plan a sprint in such a way that stories were delivered in a slightly disjointed manner so someone could pay off the technical debt, but not block the others before their story was completed, and because the technical debt was often spread across multiple parts of the user journey this increased the context switching the team did within a story.

Eventually we did get to a point where once the bigger technical debt was paid off we could minimise the context switching, but this required a lot of effort in the refinement of the stories and I had to put a lot of effort into making it easier to trace technical debt across the code base.

It’s hard to adhere to scrum when you’re constantly fighting scrum

Ultimately the lack of time to prepare for a sprint meant we had bad sprints and scrum isn’t very forgiving when you fall at the first hurdle as it assumes the sprint 0 activities will ensure you hit the ground running and from there you’ll only go faster.

If I were to personify our sprints, it would be as someone finishing the 100m hurdles with half the hurdles wrapped round their foot. From memory, I think we had one sprint where we completed 100% of the committed stories and that was sprint 1, which was the sprint where we did all the investigations.

Every sprint would see us fail to close all of the stories we committed to, sometimes due to external influences such as the continuous delivery pipelines we were reliant on to merge our feature branches into the main branch failing, but more often than not because the work was underestimated and too complex.

We had programme goals we needed to hit and my team was just one of many working to a two week sprint cadence so there was no means to reconfigure our use of scrum. We just had to accept being an ‘at risk’ team that was actually delivering the value the programme wanted, but just didn’t do so in a way that matched the reporting framework.

It’s hard on a team when scrum isn’t tailored to the team’s situation

Having your team being a programme delivery risk isn’t a fun position to be in and it’ll take its toll on the team as no-one likes to be on the ‘losing side’, especially when the reason you’re losing isn’t directly attributable to your performance.

There were a number of initiatives within the programme to try and correct the course, with a bit of a ‘reset’ period happening where we tried to evaluate the approach we were taking, but the issue with this activity was that we still took the output of that period and tried to look at it as a series of two week sprints.

When framing the problem in sprints there are two variables that come into play: the number of sprints needed to complete the work and the amount of throughput the team has.

In our situation the number of sprints was already defined, as we had a set timescale to complete the work, which meant we ended up having to estimate how many more team members would be needed to ensure the delivery would happen on time.
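To make that arithmetic concrete, here is a minimal sketch in Python. The function name and all the figures are hypothetical and purely illustrative, and it assumes that throughput scales linearly with headcount, which is a big simplification.

```python
import math


def extra_people_needed(remaining_points, sprints_left, team_size,
                        points_per_person_per_sprint):
    """Naive estimate of the extra team members needed to hit a fixed deadline.

    Assumption: every person adds the same number of points per sprint,
    so throughput scales linearly with the size of the team.
    """
    # With the number of sprints fixed, the required velocity is forced on us.
    required_velocity = remaining_points / sprints_left
    # Round up, as you can't add a fraction of a person.
    required_people = math.ceil(required_velocity / points_per_person_per_sprint)
    return max(0, required_people - team_size)


# Hypothetical figures, for illustration only.
extra = extra_people_needed(remaining_points=300, sprints_left=10,
                            team_size=5, points_per_person_per_sprint=4)
print(extra)
```

With the deadline fixed, headcount is the only lever left in this model, which is why the conversation turned to team size.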

The problem with adding more people to a team where the work lacks detail and is complex is that while it increases the ability to pair on a problem (or run things in parallel) it doesn’t solve the underlying issues of the work itself and leads to more work for those trying to prepare a sprint as there’s more work to refine and get estimated.

Reflecting on the way we’re working and scrum’s shortcomings

A couple of months after that reset period we found ourselves with a change in delivery manager, and this gave us an opportunity to try and sell them a change to the way we worked.

We reflected on the issues we were having with the existing scrum-based approach:

We had a number of longer lasting tickets that were being dragged over which meant that our sprint report looked like we only ever completed half the work, even if those tickets only needed a couple more days
Context switching between epics was making it harder to refine and estimate stories, plan a sprint and to build up knowledge of the codebase within the team as constantly switching areas of focus meant no one had an in-depth knowledge of a particular area
My role in the team should have been more hands-on with the code but I was spending half my time working on story analysis and the BA was spending 80% of their time on just preparing a sprint instead of looking into future work

We proposed some changes based on this reflection:

We’d look to have a means to have these long lasting tickets take as long as they need without that negatively impacting the reporting of the sprint by balancing the longer stories with smaller stories being completed during the sprint and using scrumban to mitigate the commitment part of scrum
We’d look to create streams of work so that a small group within the team could focus on a particular area for a longer period of time and build up the in-depth knowledge across that group (giving us some redundancy if people are away)
Adopting scrumban and using a streaming approach would allow us to remove the need to prepare a sprint and instead rely on ‘kick-off’ meetings when a new ticket was started which meant we could spend more time refining the backlog and provide a just-in-time approach to estimation and commitment

(Kan)banning scrum

Firstly, I’ll explain what I mean by scrumban. We’re likely not following the textbook definition of the approach, because I haven’t actually read it, but I’m familiar enough with kanban and scrum to say we’re using the following practices from both approaches:

We’re using the sprint cadence from scrum as we have reporting requirements based on the stories completed in a sprint. This means we’re effectively reporting on the cycle time
We’re using the show and tell ceremony from scrum as it’s important that we show the value we’re delivering each sprint
We’re using the retrospective ceremony from scrum as it’s important that we get feedback from the team on how they feel about the process
We’re using the sprint planning ceremony from scrum but instead of this being an up-front commitment of work we’re focusing mostly on setting the goal for the upcoming sprint so the team are aware of what we want to achieve
We are not using the estimation aspect of scrum, instead we do this just as the team are picking up the issues to work on them
We’re using the work in process limit from kanban to focus the team on closing off work that has already been started before pulling new work into the sprint
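The work in process limit from the last item can be sketched as a small Python class. This is an illustration only: the `Board` class, its method names and the limit of two are all hypothetical, not a description of any real tool.

```python
class Board:
    """A toy kanban board that refuses new work once the WIP limit is hit."""

    def __init__(self, wip_limit):
        self.wip_limit = wip_limit
        self.in_progress = []
        self.done = []

    def pull(self, ticket):
        """Start a ticket only if the WIP limit allows it."""
        if len(self.in_progress) >= self.wip_limit:
            return False  # finish something first
        self.in_progress.append(ticket)
        return True

    def finish(self, ticket):
        """Close a ticket, freeing up a slot for new work."""
        self.in_progress.remove(ticket)
        self.done.append(ticket)


# Hypothetical limit and ticket names, for illustration only.
board = Board(wip_limit=2)
started = [board.pull(ticket) for ticket in ["A", "B", "C"]]
board.finish("A")
retried = board.pull("C")
print(started, retried)
```

The third ticket is only accepted once an earlier one has been closed, which is the behaviour the limit is there to encourage.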

The scrumban approach itself isn’t the key to the success we’ve seen within the team but it’s an enabler for the streaming approach we’re using which has made the real impact.

In software development it’s common for developers and testers to pair program on a problem. This means that two or more people are working together to solve an issue, talking through the approach and evaluating technical decisions together. We’ve expanded the pair programming approach a little and have a small team of developers pair on delivering an epic.

We found with pairing that after the complex parts of the problem were solved the developers wanted to break away onto their own issues instead of having multiple people watch one person write code so we decided to embrace this with the pairing at an epic level.

It allows the developers to work together to chip away at the initial ticket(s) in the epic that have the most complexity. Once they’ve got a shared understanding of the problems and the patterns needed to implement it, they can then work on their own tickets within the epic. This allows for a certain level of parallelisation and they can always pair program again if more complexity is unearthed later on.

The streaming approach does have one side-effect, however. With teams focusing on a particular area, this does mean that, depending on how complex that area is, they may take longer than a sprint to complete the initial ticket.

The scrumban approach helps mitigate the reporting risks of these longer lived tickets as we commit to deliver a set number of points per sprint instead of specific stories so we can rely on completing smaller stories and the completion of longer stories that were started earlier to meet our sprint commitment.
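As a rough sketch of how a points-based commitment treats long-lived tickets, the Python below counts every story closed in a sprint regardless of when it was started. The `Story` type, the sprint numbers and the point values are all made up for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Story:
    name: str
    points: int
    started_sprint: int
    completed_sprint: Optional[int]  # None while still in progress


def points_completed(stories, sprint):
    """Sum the points of stories closed in a sprint, whenever they started."""
    return sum(s.points for s in stories if s.completed_sprint == sprint)


def commitment_met(stories, sprint, committed_points):
    """The commitment is a number of points, not a list of specific stories."""
    return points_completed(stories, sprint) >= committed_points


# Hypothetical backlog, for illustration only.
stories = [
    Story("long story from an earlier stream", 13, started_sprint=3, completed_sprint=5),
    Story("small story", 3, started_sprint=5, completed_sprint=5),
    Story("another small story", 5, started_sprint=5, completed_sprint=5),
    Story("first story of a new stream", 8, started_sprint=5, completed_sprint=None),
]

completed = points_completed(stories, sprint=5)
met = commitment_met(stories, sprint=5, committed_points=20)
print(completed, met)
```

The story that is still in progress doesn’t count against the sprint here, while the long story started two sprints earlier counts in full in the sprint where it closes.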

This works well as we’ve naturally hit a good cadence of stories where the streams have aligned themselves in such a way that we’re often closing off the last, easy to complete stories of one stream as we start a new stream and have to spend a longer time on those stories.

How scrumban has impacted the team

Adopting scrumban has had a massively positive impact on the team in the following ways:

Developer happiness: Being able to focus on an area of the code without the need to context switch has really helped the team understand the problem we’re trying to solve, and owning the technical solution to that problem makes developers feel more in control of their work
Tester happiness: The kick-off meetings we have as a ticket is brought into a sprint make it easier to prepare a test strategy for each ticket and discuss that with the developers, as we’re no longer trying to cram as many tickets as we can into a one hour estimation session
BA happiness: Not having to spend a large proportion of the current sprint preparing for the next has made our BA happier and, similar to the developers, they feel more in control when they don’t need to context switch
Cell lead happiness: I’m happier as I no longer need to spend a large proportion of my time refining stories and I can instead focus on other issues within the team. I’m also able to get closer to the code now and have been able to deliver stories myself
Delivery manager happiness: We’ve met and exceeded every sprint commitment since we introduced scrumban and our show and tells now look a lot more impressive. We’re slowly getting back into the boundaries set out in the original estimate which also makes them happy
Product owner happiness: With the extra time we can spend on tickets at the start of a new stream, the product owner can afford to challenge the work more instead of having to prioritise deadlines over building a quality product

There has been no negative impact on the team from what I’ve seen.

Summary

If you have to use scrum because of reporting needs but find that, like my team, the work you’re doing is too complex because the requirements aren’t well defined, the work itself is hard to break down into ‘sprint friendly’ chunks or things just take longer than expected, then scrumban can help you find better working practices while maintaining that sprint ‘contract’.

The approach that’s worked well for my team has been to adopt a work-streaming approach where a small group within the team works through an epic from the backlog in a pair-programming-like fashion. This allows that subset of the team to build up a shared understanding of the epic and its implementation, and it reduces context switching, which was killing our productivity previously.