How do I make my code more readable?

Question

I've often come across the same kind of question:

- How do I make my code more readable?

- How can I improve my code?

And usually the answers are similar, regardless of language or problem. That's why I thought I'd make a self-answered community wiki post, where we can collect that knowledge and give a short overview with pointers to more resources.

So, if you're about to ask a similar question, please read some of the answers to this post, as they will probably include information relevant to you.

Relevant: Tools for format and error checking in your programming language, Performance checking tools, Frequently posted documentation - a new zombie weapon?, Create a reference for newbie common mistakes and 'Canonical' questions to help address common issues. — Mast, Commented Jun 15, 2022 at 15:43

2 revs, 2 users 94% · Accepted Answer · 2022-06-15 15:35:18Z

Common problems

The most common problems I see are inappropriate naming, long functions, and insufficient encapsulation.

I'll go over these one by one. Note that most of my opinions on this matter come from Clean Code by Uncle Bob: Book Link | YouTube Lecture Link (which I can only recommend)

Inappropriate naming

I like the philosophy that code should be self documenting. Appropriate naming helps a lot with that, and inappropriate names can confuse the reader - which may be you in 2 weeks. We've all been there.

So how do I name appropriately?

I almost always write out abbreviations unless they are well known in my context. Sorry C devs.
I never name my variables tmp if they live longer than two lines of code.
It doesn't matter which naming convention you use: camelCase, PascalCase, lowercase, lowercase_separated_by_underscores, ... or whatever, as long as you do it consistently. And your team agrees on it, if you're working in a team.

And probably most importantly:

I name my variables after their purpose.
Variables are objects, their names are nouns.
I name my functions after their intent, not their implementation details.
Functions do stuff, their names are verbs.

A small exception to the noun / verb rule is booleans: I usually write them as is_condition, e.g. user_is_logged_in. I try to avoid negative terms in these conditions, e.g. user_is_not_logged_in or user_is_logged_out, although user_is_logged_out is sometimes justifiable.
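To illustrate the naming rules above, here is a minimal sketch. The functions, the order data and the session token are made up for this example; only the naming pattern matters.

```python
# Hard to read: abbreviated names and a long-lived "tmp".
def calc(l):
    tmp = 0
    for p, q in l:
        tmp += p * q
    return tmp


# Easier to read: nouns for variables, a verb for the function.
def calculate_order_total(order_items):
    order_total = 0
    for price, quantity in order_items:
        order_total += price * quantity
    return order_total


# Booleans read like a positively phrased condition.
session_token = "abc123"  # hypothetical value
user_is_logged_in = session_token is not None
```

Both functions do exactly the same thing, but only the second one tells the reader what it is for without reading the body.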

Long functions

Often, when I see code to be reviewed, it has at least one long function. Half the time, this long function consists of blocks of code, each with a comment above it that describes what the block does. Usually, such a block can be extracted into its own function. If you use JetBrains products like IntelliJ, the IDE can do most of the work for you: mark the block to be extracted -> right click -> Refactor -> Extract. If not, refactoring.guru/extract-method has a neat example and explanation.

If you don't have these conveniently placed comments, you'll have to decide for yourself which blocks can be extracted.

I like to use these general rules of thumb:

Is the method longer than 5 lines? -> Extract some of it.
Does the method have more than one level of indentation? E.g. nested ifs or loops? -> Extract until it only has one level of indentation.
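As a rough sketch of what extracting commented blocks can look like (the score report and all function names are invented for this example):

```python
# Before: one function whose comments mark the extractable blocks.
def print_report_long(raw_lines):
    # parse the scores
    scores = []
    for line in raw_lines:
        if line.strip():
            scores.append(int(line))
    # calculate the average
    average = sum(scores) / len(scores)
    # print the result
    print(f"Average score: {average:.1f}")


# After: each commented block became a function named after its intent.
def parse_scores(raw_lines):
    return [int(line) for line in raw_lines if line.strip()]


def calculate_average(scores):
    return sum(scores) / len(scores)


def format_report(average):
    return f"Average score: {average:.1f}"


def print_report(raw_lines):
    scores = parse_scores(raw_lines)
    average = calculate_average(scores)
    print(format_report(average))
```

The comments from the long version turned into function names, and print_report now reads like a summary of the steps. Like the original, this sketch does not handle an empty list of scores.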

Why would I do that? Now I have to look at more than one place for what my code does! I skip around in files and lines all the time now, debugging became a nightmare!

Well, first off, good that you're using a debugger. If you aren't, learn how to.

Second, debugging becomes easier for the same reasons that your code is now easier to read and understand (if you've named appropriately): If you only need to read the name of a function to understand what it does, you can skip reading the implementation, because you know what it does. You only need to read it if you also need to know how it does that. In the same way, parts of your code are now skippable during debugging, because you don't need to confirm every step. I'd argue debugging became easier.

Granted, this approach usually works best when paired with a lot of tests - which come for free when doing TDD - or with being very confident in your code. I'd argue the tests are usually better, but for smaller projects - which I know will stay small - confidence is enough for me.

For more info, refactoring.guru/smells/long-method is a good resource.

Insufficient encapsulation

This usually comes back to the question: Is my method / class doing one thing only? I really like Sandi Metz's squint test when I first look at code, because it tells me multiple things:

nested indentation (see the previous paragraph)
colors in the code

Now why would the colors in the code matter (assuming you use an IDE that colors variables, members, functions and constants differently)?

It matters because the more colorful your code is, the more you mix responsibilities and therefore purposes of your code. The best examples of this are magic numbers or strings: If your code is doing high level calculations and also checking for a specific string in the same line, you've mixed responsibilities.
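A small sketch of the magic number / magic string problem. The 19% VAT, the 10% discount and the "premium" customer type are made-up values for illustration only:

```python
# Mixed responsibilities: a calculation, magic numbers and a string check in one line.
def calculate_price_mixed(net_price, customer_type):
    return net_price * 1.19 * (0.9 if customer_type == "premium" else 1.0)


# Separated: the magic values get names, the string check gets its own function.
VAT_MULTIPLIER = 1.19
PREMIUM_DISCOUNT_MULTIPLIER = 0.9
NO_DISCOUNT_MULTIPLIER = 1.0
PREMIUM_CUSTOMER = "premium"


def get_discount_multiplier(customer_type):
    if customer_type == PREMIUM_CUSTOMER:
        return PREMIUM_DISCOUNT_MULTIPLIER
    return NO_DISCOUNT_MULTIPLIER


def calculate_price(net_price, customer_type):
    return net_price * VAT_MULTIPLIER * get_discount_multiplier(customer_type)
```

calculate_price now only does the calculation; deciding who counts as a premium customer lives in one place.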

Another pattern I regularly see is mixing IO with logic: A master class that takes input, calculates some logic and outputs some intermediary results. The final result usually is printed out somewhere separately - which is good. However, the master class is clearly doing more than one thing. So how do we make our classes and functions do one thing only?

Architecture patterns exist for this very reason - to separate concerns, or responsibilities. Well known examples include Model-View-Controller (MVC) and similar patterns. I like MVVM.

The gist of all these patterns is to separate logic from IO (and persistence if needed). They do this by creating a layer for the logic and a layer for the IO and some intermediary layer that connects the other two layers. It's important to know that the logic layer should know nothing of the IO layer. The logic doesn't need to know how its data gets viewed, be it via command line or JPanel. The IO layer doesn't need to know the data types the logic uses to calculate, it needs to know the numbers and strings it should display. These are sometimes conveniently grouped into structures that the logic uses, and I think it's okay if the IO also uses these structures to display matching data.

However, all interaction between these layers should go through the intermediate layer, which provides intent functions for the IO and updates the IO if changes in the model occur.

E.g.:

IO is a GUI and there's a button to draw a random card to your hand.
The intermediate layer provides a function drawToHand(), which is called by the GUI when the button is pressed.
drawToHand() connects to the model and updates it via another function call, this time in the model. (The intermediate layer does not do application logic; it only transforms data and notifies the other layers.)
When the model update is complete, the intermediate layer informs or updates the IO layer, which then displays the new data.
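The card example could be sketched roughly like this. The class names, the card strings and the console output standing in for the GUI are assumptions of this sketch; draw_to_hand() is just the Python-style spelling of drawToHand().

```python
import random


class CardGame:
    """Model: holds the state and the application logic, knows nothing about IO."""

    def __init__(self, deck, random_source=None):
        self.deck = list(deck)
        self.hand = []
        self._random = random.Random() if random_source is None else random_source

    def draw_random_card_to_hand(self):
        # Sketch only: an empty deck is not handled here.
        card = self._random.choice(self.deck)
        self.deck.remove(card)
        self.hand.append(card)


class ConsoleView:
    """IO layer: only knows how to display strings."""

    def show_hand(self, card_names):
        print("Your hand:", ", ".join(card_names))


class GameController:
    """Intermediate layer: offers intent functions and notifies the view."""

    def __init__(self, game, view):
        self._game = game
        self._view = view

    def draw_to_hand(self):
        self._game.draw_random_card_to_hand()
        self._view.show_hand([str(card) for card in self._game.hand])


deck = ["7 of hearts", "ace of spades", "queen of clubs"]
controller = GameController(CardGame(deck), ConsoleView())
controller.draw_to_hand()  # what the GUI button handler would call
```

Because CardGame never touches the view, ConsoleView could be swapped for a real GUI without changing the model.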

G. Sliepen · Accepted Answer · 2022-06-28 18:56:09Z

Write idiomatic code

Try to write code that follows the same practices as the majority of the other code written for your programming language, including its standard library. This includes both naming and code style. Doing so helps others feel more familiar.

When creating custom functions or classes that do something similar to what is in the standard library of your language, try to make them work exactly the same way. This allows your code to be used as drop-in replacements, and also follows the principle of least surprise.
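In Python, for example, "working the same way" mostly means implementing the special methods that the built-in containers have. A minimal sketch, assuming a hypothetical SortedList class that is not part of the standard library:

```python
import bisect


class SortedList:
    """Hypothetical container that keeps its items in sorted order."""

    def __init__(self, items=()):
        self._items = sorted(items)

    def add(self, item):
        # Named like set.add, because the caller does not choose the position.
        bisect.insort(self._items, item)

    def __len__(self):
        return len(self._items)

    def __iter__(self):
        return iter(self._items)

    def __contains__(self, item):
        return item in self._items

    def __getitem__(self, index):
        return self._items[index]
```

With these methods in place, len(), in, for loops, indexing and functions like max() work on a SortedList just as they do on a list, so readers don't have to learn a new vocabulary.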

Proper code density

As mentioned by lukstru, avoid overly long functions. But apart from that, don't be afraid to add empty lines to your functions and class definitions to clearly separate various sections.

Also try to avoid overly long lines, as this makes the code harder to read in most editors, as you either have to start scrolling horizontally, or lines are wrapped in ugly ways. Some projects mandate a strict line length limit. I would not go that far, but you can often reduce the length of a line by splitting long expressions into multiple expressions, for example by introducing temporary variables for partial results. Sometimes you have names of types, variables or functions that are very long, and this in turn causes very long lines. If your language supports it, consider creating short aliases for them to make the code easier to read.
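A short sketch of both ideas, temporary variables and aliases. The distance function and the Point alias are invented for this example, and the alias syntax assumes Python 3.9 or newer:

```python
import math

# A short alias for a type that would otherwise be spelled out everywhere.
Point = tuple[float, float]


# One long expression on a single line.
def calculate_distance_long(first_point: Point, second_point: Point) -> float:
    return math.sqrt((second_point[0] - first_point[0]) ** 2 + (second_point[1] - first_point[1]) ** 2)


# The same calculation, split up with temporary variables for partial results.
def calculate_distance(first_point: Point, second_point: Point) -> float:
    horizontal_offset = second_point[0] - first_point[0]
    vertical_offset = second_point[1] - first_point[1]
    return math.sqrt(horizontal_offset ** 2 + vertical_offset ** 2)
```

The temporary variables shorten the lines and also name the partial results, which documents the calculation.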

Prefer generic code

Projects often begin small, doing only one thing, like writing a string to a database using a function write_string_to_database(). But projects grow, and maybe you want to write numbers or other things as well. It is easy to create more variations of the first function to handle all these types, but this results in a lot of code that has to be maintained. In such a case, if your language supports it, find more generic ways to structure your code. Perhaps a write_to_database() that can take strings, numbers and other things as parameters.
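As a minimal sketch of such a generic function, here is one possible write_to_database() using Python's sqlite3 module. The entries table and its columns are assumptions made for this example; a real project will have its own schema.

```python
import sqlite3


def write_to_database(connection, name, value):
    """Store a string, number or bytes value under the given name."""
    # sqlite3 converts str, int, float, bytes and None itself,
    # so one function covers all of these types.
    connection.execute(
        "INSERT INTO entries (name, value) VALUES (?, ?)", (name, value)
    )
    connection.commit()


connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE entries (name TEXT, value)")

write_to_database(connection, "user", "Alice")
write_to_database(connection, "age", 42)
write_to_database(connection, "height", 1.75)
```

Instead of write_string_to_database(), write_int_to_database() and so on, there is a single function to maintain.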
