Coach Rants: On Refactoring

Nathan Nicholson
Apr 20, 2021

I was recently on a call with a team when the question came up: “Oftentimes we get pushback from stakeholders when we say we need to refactor. How can we convince them that it needs to happen?”

I actually hear this pretty frequently. When stakeholders (or even development teams) don’t understand the hidden cost of a messy codebase, they are just making the best decision they know how. It’s the business’s job to make sure it is fulfilling its mission for as little cost as it can. As technologists who enable the business, it’s our job to make sure it has the best information and tools to facilitate that mission. So let’s bring the cost into the light so that we are better prepared to fulfill our purpose.

Messy desk covered in office supplies.
Photo by Robert Bye on Unsplash

Hidden Cost of Messy Code

To examine the hidden cost of messy code, we must first ask ourselves, “Why does the code exist?” Bear with me here.

Code exists for two reasons:

  1. To fulfill a business capability.
  2. To communicate how that capability was implemented.

The first is obvious to all. You need code for webapps, services, mobile apps, etc. that run the business. That’s why developers are engaged for features, and defects, and incidents, oh my!

The second is a less recognized, more nuanced, and equally critical point. When we write code, we are also communicating with our future contributors about the intent of the codebase and giving them the structure to implement further changes. If the intent is unclear or the structure makes it difficult, future changes are going to be more expensive to make and more prone to error.
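To make that second point concrete, here is a minimal sketch in Python. The discount rule, the names, and the thresholds are all invented for illustration. Both functions compute the same result, but only the second one tells a future contributor what the business rule actually is.

```python
# Hypothetical example: every name and number here is made up for illustration.

def calc(c, t):
    # Works, but the intent hides behind short names and magic numbers.
    if c[2] > 5 and t > 100:
        return t * 0.9
    return t


LOYALTY_YEARS_THRESHOLD = 5    # assumed business rule
LARGE_ORDER_MINIMUM = 100      # assumed business rule
LOYALTY_DISCOUNT_RATE = 0.10   # assumed business rule


def is_loyal_customer(years_as_customer):
    return years_as_customer > LOYALTY_YEARS_THRESHOLD


def apply_loyalty_discount(years_as_customer, order_total):
    """Long-standing customers get a discount on large orders."""
    if is_loyal_customer(years_as_customer) and order_total > LARGE_ORDER_MINIMUM:
        return order_total * (1 - LOYALTY_DISCOUNT_RATE)
    return order_total
```

The second version is longer, but a reader can change the discount rate or the loyalty rule without first reverse-engineering what `c[2]` means.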

The actual expense comes in three primary forms:

  1. Time spent reading and understanding the code.
  2. Time spent implementing new functionality.
  3. Defects and time spent debugging them.

Reading and Understanding

As anyone who has read through messy code can tell you, it’s not a pleasant experience. Mixed domain language, long functions, gigantic and/or entangled classes, and strange code structures make understanding how it works difficult and mentally exhausting.

Woman stressed out looking at her computer.
Photo by JESHOOTS.COM on Unsplash

Yes, as you spend time with a codebase, you become more familiar with it and can navigate it more easily. However, you are always going to be reading it to understand the intricacies of this or that in order to implement functionality. It’s inevitable that you’ll be looking for that one piece of code that needs to change, but it’s buried.

It’s like trying to find a specific wrench socket in your garage. With your workbench in a mess, you end up searching through drawers, boxes, and bags because “I know I had it right here! This is such a mess!” Compare that with walking over to the drawer with the wrench sockets and grabbing the one you need. Potentially hours versus seconds.

All of that time spent reading and understanding code is time that the company is paying expensive engineers for. If you could reduce the time to understand from days to 15 minutes, wouldn’t you? I have seen that drastic a change, and it makes a world of difference in the enjoyment of the team and the satisfaction of stakeholders.
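If it helps to put a number in front of stakeholders, the arithmetic is simple enough to sketch in Python. The function below is a back-of-the-envelope illustration only. Its name and parameters are hypothetical, and the figures in the usage lines are placeholders to be replaced with your own team’s numbers, not measurements.

```python
def weekly_reading_cost(engineers, hours_reading_per_week, hourly_rate):
    """Rough weekly cost of time spent just reading and understanding code."""
    return engineers * hours_reading_per_week * hourly_rate


# Placeholder inputs (assumptions, not data): swap in your own.
before = weekly_reading_cost(engineers=6, hours_reading_per_week=15, hourly_rate=100)
after = weekly_reading_cost(engineers=6, hours_reading_per_week=5, hourly_rate=100)
weekly_savings = before - after
```

The point is not precision. A rough figure turns “the code is messy” into a cost that a stakeholder can weigh against the next feature.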

Implementing New Functionality

This is the obvious one, right? Developers write code. However, what’s not always obvious to people who don’t ship software is the profound difference between adding new functionality to a well-factored codebase and adding it to a messy one. Well-factored code, where everything has a purpose, place, and clear intent, is code that’s easier to modify and test.

As difficult as it is to understand a messy codebase, it is even harder to modify one. Messy codebases are harder to test, and have fewer tests in them as a result. This makes them much more difficult to modify with a high degree of confidence and causes more defects to be produced during development. Those defects impact customers and mean rework instead of actually improving or enhancing the codebase. Everything is connected to and impacted by the state of the code.
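As a small illustration of the testing point, consider this Python sketch. The shipping rule and every name in it are hypothetical. The first function mixes a business decision with the system clock, so a test cannot control its outcome. The second takes the date as a parameter, so its behavior can be pinned down with a plain assertion.

```python
from datetime import date

FREE_SHIPPING_MINIMUM = 50  # assumed threshold, for illustration only
STANDARD_FEE = 5            # assumed fee, for illustration only


def shipping_fee_tangled(order_total):
    # Hard to test: the result depends on the day the test happens to run.
    if date.today().weekday() >= 5:
        return 0
    return 0 if order_total >= FREE_SHIPPING_MINIMUM else STANDARD_FEE


def is_weekend(day):
    # Monday is 0, so 5 and 6 are Saturday and Sunday.
    return day.weekday() >= 5


def shipping_fee(order_total, day):
    # Easy to test: every input is explicit.
    if is_weekend(day) or order_total >= FREE_SHIPPING_MINIMUM:
        return 0
    return STANDARD_FEE
```

A test can now call `shipping_fee(20, date(2021, 4, 24))` and know which branch it is exercising, regardless of when the test runs.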

Warehouse with spotless floors.
Photo by Ruchindra Gunasekara on Unsplash

In a well-run warehouse, the employees have a near-religious zeal for clean floors. This is because dirty floors are expensive and dangerous. Forklifts require more maintenance and cause more accidents when the floor isn’t clean. People could get hurt or die because of a stray piece of pallet on the ground. That’s why you won’t see an employee walk by a piece of trash. They will pick it up and throw it into one of the trash receptacles scattered all over the building, for their own safety and that of their colleagues. You will not find anyone associated with the warehouse telling employees to stop picking up trash and move freight. It’d be irresponsible and dangerous.

Code generally doesn’t have quite the same bodily ramifications, but its cleanliness undoubtedly has a similar impact on delivery. Dirty code is slow and dangerous (defects, bugs, untestable, etc.); clean code is fast and safe.

Defects and Debugging

This is really just another way of saying reading and writing code, but this time a customer is being impacted. All of the above still applies.

  • If it’s hard to read the code, it will be more difficult to find root cause(s).
  • If it’s hard to modify the code, it will be more difficult to fix the issue.
  • If it’s hard to test the code, it will be more difficult to simulate the issue.

Large cockroach on a branch.
Photo by Robert Thiemann on Unsplash

You get the picture. For every one of the scenarios above, there is cost with respect to engineering time, opportunity cost due to customer impact, and a potential monetary cost due to damage of brand/trust.

This is not the situation that anyone wants to be in. Nobody goes to work thinking, “I wonder how I can negatively impact the customer today?” We all want to do a good job of providing the tools and capabilities our business needs to delight customers and acquire currency.

Beware! Pitfalls Ahead!

We’ve talked about some of the reasons to keep a clean code base and how refactoring is a critical tool for that function. Now here’s a curveball. Refactoring isn’t always the answer, and is occasionally a bad idea. Here are some pitfalls to beware of as you are out there cleaning your source code for the greater good.

Over-cleaning the Code Base

I know I was just talking about how important it is to keep the code base clean, but there is such a thing as too clean for a particular context. Warehouses and operating rooms should both be clean, but they have different standards for what clean means in their context.

Spotlessly clean operating theater.
Photo by Arseny Togulev on Unsplash

If some code is stable or unlikely to change much, limit the zeal to refactor it to perfection. It’s not worth it. Only apply engineering rigor to problems that require it. Good enough is fine everywhere else. The skill is in determining where to apply rigor, and where not to, for maximum value delivery.

Large Scale Refactor

Your tech lead determines that a large scale refactor of the system is in order and that it will take 3–6 months to complete. Your customers are distraught, escalation emails are flying, and the team is giddy because the code has been degrading at an accelerated rate and they’ll finally have a chance to “do it right”.

The problem is that often the same team that developed the system that needs to be refactored is the one doing the refactoring. The same skills, habits, tendencies, biases, etc. that led to the system needing a major refactor are all still present, and so the end result will likely need a major refactor too.

Big broken rock on a river bank.
Photo by theverticalstory on Unsplash

It is almost always better to let refactoring happen emergently, as part of the changes made to the system to add functionality. Continuously refactoring based on the types of changes the system needs will create a system that’s less likely to need these big bang refactoring efforts in the first place.

Hardening Sprints

A smaller-scale version of the Large Scale Refactor, a hardening sprint is a smell that we are not baking quality into our development process. If we schedule time to go back and add tests, clean code, etc. we are putting time and cognitive distance between the developers of now and the developers that actually wrote the code.

It is better to bake quality into the software as we write it so that we don’t have to remember all of the things that we were thinking over a month ago when we wrote it initially. Or worse, what someone else was thinking over a month ago. We won’t remember, and it is more likely that other priorities will sneak in and push out any efforts to clean the codebase.

Summary

Rant over. That’s an overview of the costs of messy code and the pitfalls to avoid in the pursuit of keeping it clean.

  • It’s important as developers to be able to communicate why cleaning the codebase is a critical and necessary function.
  • Messy code bases are slow, expensive, and dangerous.
  • Teams of highly trained engineers are expensive. Optimize for their/your time.
  • Cleaning the codebase generally has slim/no direct value to customers. Bake refactoring into your value-adding changes to make the cost easier to bear for feature-happy end-users.
  • Any change to the codebase carries cost and risk. Make changes in order to maximize value throughput to the customer while minimizing risk.
  • Refactoring has some pitfalls to be wary of, but keeping your refactorings small, continuous, and relevant reduces the pitfalls to pinholes.

Nathan Nicholson

Senior Release Engineer and Value Stream Architect at Netlify