32

Currently, I'm reading "Code Complete" by Steve McConnell, 24.3 chapter "Specific Refactorings". I've become kind of confused with 4th paragraph, that looks like:

Move an expression inline Replace an intermediate variable that was assigned the result of an expression with the expression itself.

Does that inline expression decrease readability of code compared to a well-named (even if it intermediate) variable? When will it be recommended to use inline expressions and when intermediate variables?

14
  • 22
    This is a valid question. Some practices suggested in older books may be outdated; when it’s counterintuitive, it’s therefore wise to question them :-)
    – Christophe
    Commented Feb 22, 2022 at 11:33
  • 14
    The problem with such rules stated abstractly, is that there are equally cases where moving an inline expression to an intermediate variable will be clearer.
    – Steve
    Commented Feb 22, 2022 at 12:53
  • 50
    Notice that what you have in that book is a catalogue of refactorings - these aren't recommendations, it's simply a list of things that you can do to shuffle code around, often as an intermediate step before you get to some goal (a way to "make the change easy, then make the easy change", to paraphrase Kent Beck). Yes, there's a refactoring named "Move an expression inline [...]" listed in there, but there's also the exact opposite: "Introduce an intermediate variable Assign an expression to an intermediate variable whose name summarizes the purpose of the expression." Commented Feb 22, 2022 at 19:00
  • 4
    @FilipMilovanović So if you try to apply all the rules, you end up in an infinite loop :) Sounds like a good XKCD.
    – Barmar
    Commented Feb 23, 2022 at 15:01
  • 3
    @Barmar: sure if you treat them as rules, but my point was precisely that these aren't rules, it's just a list. E.g., if you want to extract a piece of code into a separate function, you might want to first refactor an inline expression into a variable, move that variable in front of the code you want to extract, then use your refactoring tool to extract the function. The tool can then figure out that you've referenced a variable external to the selection, and create a parameter for it. Once that code is replaced with a function call, you, might decide to get rid of the interm. variable. 1/3 Commented Feb 24, 2022 at 11:16

6 Answers 6

61

Personally I prefer having temporary variables with explicit names (but don't abuse them either).

For me:

void foo()
{
    int const number_of_elements = (function_1() + function_2()) / function_3(); // Or more complex expression

    write(number_of_elements);
}

is clearer than:

void foo()
{
    write((function_1() + function_2()) / function_3());
}

But if you use temporary variables for something like:

void foo()
{
    int const number_of_elements = get_number_of_elements();

    write(number_of_elements);
}

This is less readable than:

void foo()
{
    write(get_number_of_elements());
}

I am not the author so I can't know what he was thinking when writing this line but it is a possibility.

The book being quite old (18 years for the second edition) there might also be historical reasons for that...

13
  • 34
    Nowadays, every decent compiler optimizes unneeded local variables away. This was not necessarily the case when the book was written (for quite a while people even added register to local variables to hint the compiler about optimizations)
    – Christophe
    Commented Feb 22, 2022 at 11:29
  • 25
    @ f222 & @Christophe - the book simply lists various refactorings that one can do on the path towards some final state of the code, they are not intended as recommendations (in fact, a couple of lines later), it lists the exact opposite refactoring ("introduce local/intermediate variable"). Commented Feb 22, 2022 at 19:08
  • 3
    @FilipMilovanović indeed a couple of paragraphs below he also proposes "Introduce an intermediate variable. Assign an expression to an intermediate variable whose name summarizes the purpose of the expression.” which leaves it to the write to chose what’s lost suitable
    – Christophe
    Commented Feb 22, 2022 at 19:19
  • 5
    @Christophe Even 18 years ago, the register keyword was ignored by all major C/C++ compilers because they did a better job than a human. You'd have to go back to the 80's to find to a time where it was meaningful. Commented Feb 23, 2022 at 3:16
  • 7
    With function names so poor, no wonder the intermediate variable version is clearer. Commented Feb 23, 2022 at 13:35
26

I think there are two key factors here:

  • How complex is the expression?
  • How meaningful is the intermediate variable?

Take an example where we have three elements to get a final price:

total = price + tax - discount;

An intermediate variable used only once is probably a bad idea:

taxed_price = price + tax;
total = taxed_price - discount;

But if the "taxed_price" variable is actually used repeatedly in the function, then naming and reusing it might make sense.

Meanwhile, if the individual elements are all themselves expressions, the full expression becomes unwieldy inline:

total = product.get_price() + calculate_tax(jurisdiction, product) - session.get_active_discount();

In this case, there are few enough operators that we can put a line break next to each one, which helps, but it's still hard to see at a glance what the overall calculation is doing

total = product.get_price() 
    + calculate_tax(jurisdiction, product)
    - session.get_active_discount();

In this case, introducing single-use variables makes it easier to read:

price = product.get_price();
tax = calculate_tax(jurisdiction, product);
discount = session.get_active_discount();
total = price + tax - discount;

But the variables are chosen to be meaningful, not just arbitrary break points - this doesn't really help, if taxed_price is only used once:

taxed_price = product.get_price() - calculate_tax(jurisdiction, product);
discount = session.get_active_discount();
total = taxed_price - discount;

And this is almost certainly nonsensical (it's hard to even name the variable, which is always a bad smell):

price = product.get_price();
tax_minus_discount = calculate_tax(jurisdiction, product) - session.get_active_discount();
total = price + tax_minus_discount;
2
  • 2
    Your "unwieldy line" example becomes perfectly ok if you just break it out into three lines, such as total = product.get_price() and + calculate_tax(jurisdiction, product) and - session.get_active_discount();. No need to introduce any variables for that. Commented Feb 23, 2022 at 14:50
  • 3
    @leftaroundabout I've added an example with line breaks, and some commentary on why I think the intermediate variables are still an improvement. The more complex the sub-expressions, the more obvious that would be.
    – IMSoP
    Commented Feb 23, 2022 at 15:40
16

“Refactoring” means changing what the code looks like, without changing what it does. A book about refactoring will tell you about changes you can make and how to make them without affecting what your code does.

You will often find a description how to change A to B, followed by a description how to change B to A - and your job is to find out which is actually an improvement. Sometimes A is better, sometimes B. The book tells you how to make the change, not what’s better.

7

Local immutable variables (const in C++, final inJava, etc.) can add useful context-dependent information for the reader of the code and are useful when debugging interactively (stepping statement-by-statement). They are a trivial for the optimizer to get rid of, so thus can generally be considered a useful “zero cost abstraction”.

Otherwise, when the context is clear and debugger ergonomics is not of a concern, then they are just a syntactic noise (especially if syntactically mutable), and more of a disturbance.

So overall — a matter of taste, I’d say.

13
  • 5
    "They are [...] trivial for the optimizer to get rid of" Beware semantic differences. In C++ a temporary with a nontrivial destructor is destroyed at the end of the statement, whereas a local variable is destructed at the end of its scope. I have fixed bugs caused by someone callously pulling out temporaries into local variables before.
    – TLW
    Commented Feb 23, 2022 at 1:40
  • 1
    C++ compilers don't really need const to optimize. But it's quite useful for humans reading the code.
    – MSalters
    Commented Feb 23, 2022 at 8:53
  • 3
    @MSalters - there are a fair number of cases where const locals affect potential optimizations. For instance if you pass a const pointer or reference to said local to a function it might be mutated under you unless the local itself is const. (C++'s "const is actually an attribute of the storage not the pointer/reference" comes into play here.)
    – TLW
    Commented Feb 24, 2022 at 0:45
  • 1
    ("But I'd never write that code", you say. You might not, but C++ relies heavily on inlined and optimized-out code. Some random static inline function somewhere in a header, when inlined a couple of levels, might very well have that sort of code.)
    – TLW
    Commented Feb 24, 2022 at 0:53
  • 1
    @ojs - not in an invalid state, rather in a valid, but an unspecified state (17.6.5.15).
    – bobah
    Commented Feb 24, 2022 at 7:04
5

No language was specified, so I'll give a few examples.

  • In languages with const correctness (e.g., C or C++) I make more intermediate variables
    • Always make intermediate variables final or const using any/all tools you have
    • in C++, use references (const Type&) if appropriate (be careful with ownership)
    • in C, use pointers (const Type* const ) if appropriate (maybe don't make something a pointer that wasn't already one)
  • Languages without that, like C# or Python, I think there's diminishing returns to having a lot of intermediate variables

Consider creating an inline variable if:

You can add context to the return value of a generic function

  • What image are we reading and why is it important?
// C#
byte[] trainingImage = ReadImage("car1.png");
byte[] validationImage = ReadImage("car2.png");

You want to cache the result of a slow or unreliable operation

  • If you really only need to query something once, do it only once.
  • Don't sprinkle extra slow or unreliable operations throughout your code, then you have to spread that dependency and its error handling throughout your code too.
  • Also remember that remote resources could "change" their answer the next time you contact them, which could lead to unexpected results.
// C#
Response telescopeResponse = RequestHubbleTelescopeData();
// do something with response

If it helps you provide better error messages

Inline expressions can make you tempted to lump all of them into the same recovery strategy even when that might not be a good fit.

Instead of one line that opens a file and parses the data,

  • Open the file in one of the lines, and be prepared to handle errors/exceptions about file I/O, file not found, etc.
  • In a separate line, process the data, and be prepared to handle exceptions about data in the wrong format, missing information, etc.

To document the code (better than comments)

From my answer to Windows API dialogs without using resource files:

Here hopefully it's clear why I am adding 1 to the length.

// C
int textLength_WithNUL = GetWindowTextLength(txtEditHandle) + 1;

From my answer to Get rid of the white space around matlab figure's pdf output

Here I unpack width/height from OldAxisPosition, so hopefully people don't need to look up what each index is.

% matlab

%% Get old axis position and inset offsets 
%Note that the axes and title are contained in the inset
OldAxisPosition = get(AxisHandle,   'Position');
OldAxisInset    = get(AxisHandle,   'TightInset');

OldAxisWidth    = OldAxisPosition(3);
OldAxisHeight   = OldAxisPosition(4);

OldAxisInsetLeft    = OldAxisInset(1);
OldAxisInsetBottom  = OldAxisInset(2);
OldAxisInsetRight   = OldAxisInset(3);
OldAxisInsetTop     = OldAxisInset(4);

To point out intermediate variables that somebody might want to monitor for debugging

I may want to know where the player was prior to the collision, so I can see whether SimulateCollision is working properly.

// C#
Point3D oldPosition = player.Position;
Point3D newPosition = SimulateCollision(player, otherObject);
player.Position = newPosition;

Consider not creating an inline variable if:

  • There isn't a descriptive name for the variable, or the name is very long.
    • Long names can be more annoying in languages without good linting / static type systems.
  • The name of the variable is more or less just repeating the function name and doesn't provide any other value-add
// C#
string productPrice = GetProductPrice();
  • The class property/field/etc. that you're reading from is already contextually appropriate and clear, and the property isn't slow or unreliable
// C#
string customerAddress = customer.Address;
  • You are doing math but the intermediate calculation doesn't make any sense on its own

This is a bit of a judgment call because sometimes I do this when dividing.

// C++
const float squareRootOfXPlus5 = sqrt(x + 5.0);
  • Making an intermediate variable involves an extremely expensive copy operation and there's no practical way to avoid it

  • In duck typed languages (e.g., Python), it may not be worth storing the value of a dict instead of just accessing it by key

# python
customer_name = customer['name']
5
  • 4
    Definitely "To point out intermediate variables that somebody might want to monitor for debugging" The main time I will change something like this in someone else's code is if I need to split out the return values so I can see which one has a problem.
    – Dragonel
    Commented Feb 23, 2022 at 16:39
  • @Dragonel Depends on your language. In Java, we can easily evaluate expressions in the debugger while paused, so no need to prepare temporary variables ahead of time (and then inevitably realize you don't have quite the right one you need and have to go change it, etc.). This has even improved over the past few years with better support for debugging fluent interfaces. Commented Feb 24, 2022 at 21:12
  • @user3067860 That doesn't solve the problem, at all. If you see in your variable that it has the wrong value, but cannot inspect the intermediate values in the debugger, then what? Re-type parts of the expression until you find the sub-expression where a value si wrong? thats cumbersome. Its much esier to read the code and debug it when there are already meaningful intermediate values present as variables.
    – Polygnome
    Commented Feb 25, 2022 at 10:02
  • @Polygnome also, in some cases, even if you did re-type parts of the expression, anything that had side effects would happen over again (not saying that's necessarily a sign of good API design, but I've seen it).
    – jrh
    Commented Feb 25, 2022 at 12:37
  • @Polygnome Select, right click, "evaluate expression". Commented Feb 25, 2022 at 18:12
1
Customer customer = Customer.builder()
  .name("John Doe")
  .age(15)
  .address(Address.builder()
    .number(15)
    .street("Maple street")
    .build()
  )
  .build();

vs.

Address address = Address.builder()
  .number(15)
  .street("Maple street")
  .build();

Customer customer = Customer.builder()
  .name("John Doe")
  .age(15)
  .address(address)
  .build();

Imagine going overboard with the first case (deep builders) - I've seen that, what a mess.

Just treat each case separately, don't inline everything because the book said you can.

Not the answer you're looking for? Browse other questions tagged or ask your own question.