23

I've found that many codebases in large companies degrade over time: teams only add new features and never improve the design. Gradually, these codebases decline into "that legacy system no one wants to work on".

However, I do know that one can refactor a codebase: "improving design without changing its behavior" (Michael C. Feathers, 2005, p. 5).

My question is: what conditions are sufficient for successful refactoring? What differentiates a codebase that can be refactored from one that cannot be feasibly refactored?

11
  • 4
    Legacy code is code without tests. If you have plenty of code coverage, you are quite safe refactoring as you see fit without breaking anything, by following the test results. If you don't have tests, you first need to write them, and writing tests is boring and hard, especially for a code base not written with testing in mind.
    – Bakuriu
    Commented Feb 4 at 11:23
  • 3
    It's not always wise to refactor; all refactoring introduces risk, sometimes with only a moderate and often unrealized reward. You have to be extremely strategic about what you choose or don't choose to refactor. There's no one-size-fits-all rule. Usually I create a greenspace within a nasty app to work from. Commented Feb 4 at 17:34
  • 12
    "Legacy code is code without tests" - Hardly... I know plenty of non-legacy code that isn't well tested and plenty of legacy code that's covered in tests Commented Feb 5 at 13:54
  • 1
    @ScottishTapWater I do agree with you on the technicality. But I have seen Bakuriu's definition before, in the same book I referenced in my question: "Working Effectively with Legacy Code". My interpretation of that statement is not to take it literally, but rather as: "code without tests should be treated as legacy code", because with no tests you are less sure of its behaviour and making changes becomes a challenge.
    – Divan
    Commented Feb 5 at 14:50
  • 1
    @DirkBoer I think you could assume that when someone says "tests" they mean "useful tests". It's fair that some tests are not useful, but of course those aren't what is important. It's the tests that are useful that matter. (And some languages have mutation testing that can tell you if your tests are useful, at least in catching errors) Commented Feb 5 at 22:34

7 Answers

48

Refactoring takes time. If not done well, it creates bugs that you need to fix. So it costs. You refactor when the benefits outweigh the cost of refactoring.

There are good times for refactoring: one time is immediately after the code is written. At that time:

  • you know the code
  • you know exactly what it is supposed to do
  • you know how to improve it
  • the software hasn't yet had time to mature and have its bugs shaken out, so bugs added during refactoring are not a big deal

Another time is when you need to change the code's behaviour and the code is in such bad shape that making changes is hard and risky, and so it is more effective to first refactor and put the code into good shape and then make the behaviour changes.

NOT a good time is when you look at the code and don't like it, and that's your only reason to change the code.

(And please look at the other answers describing how to refactor successfully. The important points are: having tests that catch it when you introduce bugs, and finding available tools to help you.)

12
  • 20
    On the other hand, if you look at the code and don't understand it, then spend hours/days working through it to understand it... then it may be a good time to refactor it for clarity so the next developer to come along doesn't lose as much time as you did. Commented Feb 4 at 14:57
  • 1
    Part of the question is: Why are you trying to understand the code? Probably because you plan to change the code's behavior, you are trying to test the code, or you are trying to document the code. If you are just trying to understand it for some other reason then no, don't refactor. This answer covers the case where the code is so bad that making changes is hard. But what if you want to better document/test the code (but not yet change behavior) and that is prohibitively difficult? Is that cause for refactoring?
    – Jagerber48
    Commented Feb 5 at 6:37
  • 2
    Actually, reading some other answers and reflecting on the importance of testing for refactoring, the real kicker question is: say you have a hard-to-understand but important legacy codebase with no or limited testing. Should you refactor in this case? Refactoring is risky because testing is poor, but writing tests is challenging because the code is hard to understand. Should you go through and write tests and refactor as you go? Or try to go through, understand the code, write tests, and then refactor only if you want to change behavior?
    – Jagerber48
    Commented Feb 5 at 6:54
  • 4
    @Jagerber48: You should definitely always write your tests first, then refactor. Not only when the code is difficult to understand, but also when you think that you understand it ;-) When you have hard-to-understand code it's a good strategy to get test data from a production system. That gives you inputs and expected outputs that you can use for some black box tests. Then refactor the hell out of the code, re-run the tests and be surprised by the results. It's incredible how much you can learn this way!
    – Treb
    Commented Feb 5 at 7:11
  • 4
    "NOT a good time is when you look at the code and don’t like it" I reckon that depends on the current level of technical debt and the scope/blast radius of what you're looking to change. In a relatively clean-practice codebase and for a small-scope change, "drive-by fixing" is a very strong approach to keeping technical debt at bay. However, this does require self-discipline as to picking the right battles and limiting the size of your battles. It's not foolproof by any means, but it is in my opinion one of the best ways of continually suppressing the growth of any technical debt.
    – Flater
    Commented Feb 5 at 22:16
26

To enable a codebase to be refactored, the fundamental necessity is test coverage. Ideally fast, automated test coverage. You want to be able to assert, with high confidence after every refactoring step, that you have not introduced a behavioral change. You get that confidence and make those assertions with test coverage. When you're refactoring, you're changing the internal structure, so your unit tests will also need to be refactored. However, your higher-level tests should remain stable. If you don't have robust testing or if your code isn't easily testable, characterization tests are a good way to get some test coverage to enable refactoring.
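To make that concrete, here is a minimal sketch of a characterization test in Python/pytest. Everything named here (the `legacy_billing` module, `compute_invoice_total`, the recorded-output file) is hypothetical; the idea is simply to record what the unmodified legacy code currently does and pin that behavior, so any refactoring step that changes observable behavior fails loudly:

```python
# Characterization-test sketch (pytest): pin the *current* behavior of legacy code,
# without asserting that this behavior is correct.
import json

import pytest

from legacy_billing import compute_invoice_total  # hypothetical legacy module

# Inputs sampled from production logs, or hand-picked to cover interesting branches.
CASES = [
    {"items": [{"price": 10.0, "qty": 3}], "customer": "regular"},
    {"items": [{"price": 99.99, "qty": 1}], "customer": "vip"},
    {"items": [], "customer": "regular"},
]

# Outputs captured once by running the *unmodified* legacy code and saving the results.
with open("recorded_invoice_totals.json") as fh:
    RECORDED = json.load(fh)


@pytest.mark.parametrize("case, expected", list(zip(CASES, RECORDED)))
def test_characterizes_current_behavior(case, expected):
    # Any refactoring step that changes observable behavior will fail this test.
    assert compute_invoice_total(**case) == expected
```

Once the refactoring is done and verified, such tests can be promoted into proper unit or acceptance tests, or discarded.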

But, in my experience, most of the barriers to refactoring aren't technical. Being allowed and encouraged to spend time improving the readability, maintainability, and testability of the codebase isn't often seen as a good investment of time. Instead, there's an ongoing drive to deliver new features for customers. However, investing in refactoring - especially if the codebase is high in technical debt - can make it faster and safer to deliver those new features and reduce the likelihood of defects in them.

25
  • 1
    The necessity is to be able to reliably & controllably achieve equivalence of behavior. This requires some combination of careful analysis and testing. Automated testing is good, but robust analysis can (if needed) shoulder a fair amount of weight. I generally prefer equivalence as a first step, but divergence can be allowable where it is due to bugs being fixed. All divergence however must be rigorously accounted for.
    – Thomas W
    Commented Feb 4 at 22:55
  • 2
    I generally don't trust human analysis on the plain fact that managers can pressure them to deliver results that they want. You aren't going to pressure automated tests to give you a pass unless you screw around with it, but that's not going to fool anyone.
    – Nelson
    Commented Feb 5 at 1:46
  • 1
    @ThomasOwens, I find that is highly contingent on what facilities exist for automatic testing. I mostly write in SQL these days, and it can be practically impossible to programmatically test the workings of database engines, especially for defects which occur only under certain circumstances of concurrent load. Manual analysis is often all you have.
    – Steve
    Commented Feb 5 at 11:14
  • 1
    I strongly agree with this, with the understanding that you don't just mean unit tests. If that's correct, I think it would be good to clarify that in the answer. Unit tests are helpful for refactoring inside a very narrow scope only.
    – JimmyJames
    Commented Feb 5 at 16:04
  • 3
    I'm blanking on a good wording, but it might come to me when I stop thinking actively about it. I think some people may see 'automated' and 'test coverage' (especially) and incorrectly assume you mean 'unit tests', if that helps.
    – JimmyJames
    Commented Feb 5 at 17:04
6

A significant factor is the tooling used by the team and the proficiency with which developers use it. The concept of refactoring was originally developed in the Smalltalk context, and a number of tools were developed to support this activity:

  • a Refactoring browser (part of the Smalltalk IDE) which automates certain low-level refactoring jobs, such as moving methods and instance variables up and down the class hierarchy,
  • a rewrite tool which performs search and replace operations on syntactic entities, enabling more complex refactorings,
  • and last but not least a unit test framework (I'm not sure whether this was developed in conjunction with the other tools), as unit testing is an important factor in ensuring that refactoring does not damage the semantics of the code.

Modern IDEs often have similar tools, but of course these are only applicable to supported languages. I'm not sure whether there are proper tools to refactor COBOL, FORTRAN or PL/I code, so the use of older languages may be a limiting factor for the feasibility of refactoring.
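As a rough illustration of what a rewrite on syntactic entities looks like outside Smalltalk, here is a toy sketch in Python using the standard library's ast module. The rename rule and the sample source are invented; real tools (the refactoring browser, IDE refactorings, or lossless parsers such as LibCST) do the same job with far more care about scope, formatting, and comments:

```python
# Toy syntax-aware rewrite: rename every attribute access named "total" to "grand_total".
# Because it works on the parsed tree rather than on raw text, the occurrence inside the
# string literal below is left alone. Requires Python 3.9+ for ast.unparse.
# Note: ast.unparse discards comments and original formatting; real refactoring tools
# use lossless representations to avoid that.
import ast


class RenameTotal(ast.NodeTransformer):
    def visit_Attribute(self, node: ast.Attribute) -> ast.Attribute:
        self.generic_visit(node)
        if node.attr == "total":
            node.attr = "grand_total"
        return node


source = """
order = Order()
print(order.total())
label = "order.total()"
"""

tree = ast.parse(source)
new_tree = ast.fix_missing_locations(RenameTotal().visit(tree))
print(ast.unparse(new_tree))
# -> print(order.grand_total()), while label = 'order.total()' stays untouched
```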

3
  • 1
    This was actually the first thing that came to my mind when reading the question. Can't believe the question had already received 3 answers, yet all of them overlooked this, which is IMHO one of the most important constraints.
    – Doc Brown
    Commented Feb 3 at 22:27
  • 1
    ... and there is more to say about tooling. An important "tool" is the language itself - for example, type safety helps a lot. I once had to deal with a code base written with some old 4GL tool, where the whole code base was stored in some binary BLOB. This made it almost impossible to "diff" any changes in version control, which makes refactoring inherently harder.
    – Doc Brown
    Commented Feb 3 at 22:36
  • 2
    ... our solution was finally not to refactor - we replaced the 4GL code with a number of C# modules; it was one of the rare occasions where I favored a rewrite over refactoring.
    – Doc Brown
    Commented Feb 3 at 23:02
4

I've found that many codebases in large companies degrade over time

No, I find what degrades is (a) the existence of staff who understand the code, and (b) the correspondence between the understanding which staff have of the situation, and the concepts encoded in the structure and naming scheme of the source code.

What differentiates a codebase that can be refactored, from one that cannot be feasibly refactored?

A codebase that can be refactored is one where staff are retained who are still fully in control of the code: they continue to possess a strong understanding of the code and the business purposes for which it was written, a clear view of exactly how the existing code lacks quality and of the implications of proposed changes and refactorings, and the resources to execute such changes.

A codebase that cannot be refactored may lack any of these ingredients.

In the same way that a lot of rocket engineering from the 1960s is now lost to our understanding, often codebases reach a point where so much understanding has been lost that it makes more sense to start again on producing source code, than to try and adapt what already exists.

Many codebases are prematurely damaged by two things.

One is relentless development and accumulation of complexity in the source code to fulfil too many minor or auxiliary functions, to the extent that the development team cannot remain in control, and cannot continue over time to adapt the source code to serve its major and fundamental functions. This leads either to increasingly slowed development times approaching sclerosis, which stop the software being re-adapted as quickly as would be desirable, or it leads to changes being hacked in which then permanently destroy correctness or which grossly exacerbate the complexity of further change.

The other is the rapid loss of development staff responsible for the creation of source code and who have the most complete understanding of it. Second (or further) rounds of staff are introduced to work on the codebase who have had little or no contact with or supervision from the original creators, and this leads to rapid collapse both of the capability of the staff and of the coherence of the concepts embedded into the source code.

Both cancers can be related, in that excessive complexity makes it difficult to recruit further staff and reproduce the necessary understanding for them to work effectively on the source code, and failure to retain staff who are properly in control often leads to proliferative complexity in the source code which nobody properly understands (not even the creator).

It's why "pressure from above" is often so destructive for software development, because it often leads either to turnover as developers reject unreasonable additional stress, or it leads to modifications of source code that may meet short-term objectives, but which undermines the ability of the staff to maintain a controlling understanding of the source code, and therefore rapidly erodes the ability of the staff to adapt the source code further.

26
  • 2
    I do not agree with this answer as written. I think it merits more explanation as to how code does not degrade over time. To use a simple but clear example, if I wrote something for Flash 20 years ago, that code is clearly less efficient relative to modern techniques today. Same with things like Python: if I wrote something in Python 2, Python 3 offers better readability, more intuitive libraries, and better speeds than the same code I wrote a long time ago in Python 2. Moreover, project goals often change over time, rendering decisions that originally made sense relatively less effective. Commented Feb 4 at 17:39
  • 3
    @GrantCurell, I fail to see how the comparative lower efficiency of older technologies, amounts to a "degradation" of the code itself. The gist of my answer anyway is not about how tools and technologies change over time, but about how, much sooner than that, the organisation loses the facility of having staff who actually comprehend what the code means and who retain enough understanding of how the codebase corresponds to the business operation that they can continue to re-adapt the code to suit that operation. It's this comprehension and understanding that tends to degrade, not the code.
    – Steve
    Commented Feb 4 at 18:05
  • 2
    Supporting @GrantCurell's point; the quote "a lot of rocket engineering from the 1960s is now lost to our understanding" just goes to show how horribly unfounded this answer is. My best guess here is that this answer is built from a limited experience of how things worked at one or a handful of companies and has then wrongly been extrapolated to a blanket philosophy.
    – Flater
    Commented Feb 4 at 21:50
  • 1
    @Flater, I'm not quite sure what you disagree with about the rocket engineering analogy? It's widely accepted that many old designs of rocket technology are now completely defunct, because the manufacturing knowledge and craft techniques on which those designs relied has since been lost. Nowadays, rather than trying to adapt old designs (without understanding the constraints anymore), or rather than trying to recreate those craft understandings (when no knowledge remains), it makes more sense to simply begin the design process itself from scratch, and solve the problems in a modern context.
    – Steve
    Commented Feb 5 at 10:13
  • 1
    @Flater, actually I was specifically thinking of the welding techniques on the F1 rocket engines. Nothing to do with lost data, but the loss of human understanding. Modern designs probably are better, the point is that all the effort that went into the old designs is now worthless, and the new designs have to be designed again from first principles (and those principles re-discovered). This is often what happens with rewrites of source code - developers have to start again in order to regain the understanding that the creators of the legacy code gained in the process of designing the original.
    – Steve
    Commented Feb 6 at 7:39
4

There's this fact/urban legend that if you put a frog in hot water, it will jump out. But if you put a frog in cold water, it will stay in. If you then heat up the water, the frog does not notice and eventually gets boiled to death. This is most likely not true, but it serves as a fantastic analogy here.

Technical debt often behaves like this. If you introduce a developer to a codebase that's already steeped in technical debt, they will balk and either not want to take on the project or want to start it from the ground up. But any project, even one that starts cleanly, accrues technical debt over time, often stemming from small decisions that initially don't appear to have long-term ramifications. This may even be statistically correct, but over time at least some of these decisions have a bigger impact than expected, and they become the precipitation points around which technical debt issues start to form.

My question is: what conditions are sufficient for successful refactoring?

To continue the analogy, you're asking how the frog can make sure it doesn't get boiled to death once it's in the (now cool) pot.

First of all, it requires us to define what we consider to be too hot, and that's already a significant can of worms. So let's sidestep that discussion and conclude that we have some kind of metric in place, whether it's a specific temperature, a quick rise in temperature, ...

Next up, what good does that metric do if we don't give the frog the right tool to track it (a thermometer, a timer, ...)? This means that you need to monitor the situation, for example:

  • The velocity of the team over time. As the technical debt increases, you'll see that the devs need more and more time to deliver the same amount of features (a rough sketch of tracking this follows the list).
  • Static analysis of the codebase, which can indicate a growing list of concerns/warnings.
  • Listening to your developers. Often, technical debt is observed and communicated well before actual action is taken.
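As a toy illustration of the first bullet (the numbers are invented, and any real version would pull data from your issue tracker), the "thermometer" can be as crude as fitting a trend line to cycle time per delivered feature and flagging a sustained rise:

```python
# Crude "thermometer" sketch: flag a sustained rise in cycle time per feature.
# The figures are invented; in practice you would pull them from your issue tracker.
# Requires Python 3.10+ for statistics.linear_regression.
from statistics import linear_regression

# Average days from "started" to "done" per feature, one value per sprint.
cycle_time_days = [3.1, 3.4, 3.2, 3.9, 4.4, 4.8, 5.5, 6.1]
sprints = list(range(1, len(cycle_time_days) + 1))

slope, intercept = linear_regression(sprints, cycle_time_days)

# Arbitrary threshold: more than a quarter-day of extra cycle time per sprint.
if slope > 0.25:
    print(f"Cycle time rising by ~{slope:.2f} days/sprint -- the water is getting warmer.")
else:
    print("Cycle time looks roughly stable.")
```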

Okay, now we have a frog who knows the pot is heating up. But did we make sure that the frog jumped into a pot that it's actually capable of getting out of again?

The equivalent of that part of the analogy is having reliable (up-to-date) documentation of the business requirements and a testing suite (which ideally should be the code equivalent of the requirements documentation). This enables you to refactor any part of your codebase while maintaining confidence that the refactoring isn't changing the actual behavior.
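A hedged sketch of what "the code equivalent of the requirements documentation" can look like: the shipping rule and the `shipping_cost` function below are invented, but the point is that the test is phrased against the documented business rule rather than against internals, so it should survive any amount of refactoring untouched:

```python
# Behaviour-level tests that express a documented business rule, not implementation details.
# The rule ("orders of 50.00 or more ship free, otherwise a flat 4.95") and the module
# `shop.pricing` are hypothetical stand-ins for your requirements documentation.
from decimal import Decimal

from shop.pricing import shipping_cost  # hypothetical function under test


def test_orders_of_fifty_or_more_ship_free():
    assert shipping_cost(order_total=Decimal("50.00")) == Decimal("0.00")


def test_smaller_orders_pay_the_flat_rate():
    assert shipping_cost(order_total=Decimal("49.99")) == Decimal("4.95")
```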


Really, all I've done here in this answer is list the main ingredients to create a dev team that does not let technical debt pile up:

  • Agree on the definition of technical debt and good/bad practice
  • Observe the current state of the technical debt and good/bad practice
  • Prepare to have to fix the current state of the technical debt and good/bad practice

I can't spell the whole process out, because this is way too contextual for every company and situation. If there were a one-size-fits-all solution to this, it would already be a widely distributed guideline. As it currently stands, technical debt requires a balancing of business and technical needs and requires cooperation with the non-technical side of the company who are usually in control of the budgets and requirements, and that's not something you can prescribe in a guideline.

3

It really depends on where you are coming from. There is a good argument for "never refactor" if you just look at it from a time-cost vs. new-feature-benefit point of view.

The common arguments in favour of refactoring tend to be more abstract:

  1. Maintainability. If the code is constantly worked on and kept "up to date", then it's going to be quicker to add features and fix bugs.

  2. Skills up to date. If you keep refactoring to the latest framework, programming language, etc., then your devs are always working with cutting-edge tech and are easier to hire.

-1

My honest answer is that mostly what prevents refactoring is time, i.e. resource availability. Especially in small companies, there simply aren't enough bodies to refactor old stuff; what matters is developing new stuff.

2
  • 10
    That's the usual excuse, and it is always wrong, since it suffers from the misconception that refactoring is mostly an activity which runs separately from feature development. Refactoring is something I do whenever I want to make a change to a code base, and the change is hard, because the code is not in good shape. Or, whenever I made a few changes to a method or class, which made it a little bit messier, and I don't want to leave this mess behind. Often, I switch between these activities multiple times a day, it is inherent part of any effective development work.
    – Doc Brown
    Commented Feb 3 at 22:47
  • 10
    ... but when you think of refactoring as an activity which takes the team two weeks of unproductivity, don't be astonished when your manager isn't excited about it. Sure, sometimes you need that kind of refactoring - but 95% of all refactoring activity should not be subject to any individual time planning apart from other development activities.
    – Doc Brown
    Commented Feb 3 at 22:49
