13

Here's a programming/language problem I'd like to hear your thoughts on.

We have developed conventions that most programmers (should) follow that aren't a part of the language's syntax but serve to make code more readable. These are of course always a matter of debate, but there are at least some core concepts that most programmers find agreeable: naming your variables appropriately, naming in general, keeping your lines from getting outrageously long, avoiding long functions, encapsulation, those things.

However, there's a problem that I have yet to find anyone commenting on and that just might be the biggest one of the bunch. It's the problem of arguments being anonymous when you call a function.

Functions stem from mathematics, where f(x) has a clear meaning because a function has a much more rigorous definition than it usually does in programming. Pure functions in mathematics can do a lot less than they can in programming, and they are a much more elegant tool: they usually take only one argument (which is usually a number) and they always return one value (also usually a number). If a function takes multiple arguments, they are almost always just extra dimensions of the function's domain. In other words, one argument isn't more important than the others. They are explicitly ordered, sure, but other than that, they have no semantic ordering.

In programming however, we have more freedom defining functions, and in this case I'd argue it isn't a good thing. A common situation, you have a function defined like this

func DrawRectangleClipped (rectToDraw, fillColor, clippingRect) {}

Looking at the definition, if the function is written correctly, it's perfectly clear what's what. When calling the function, you might even have some intellisense/code completion magic going on in your IDE/editor that will tell you what the next argument should be. But wait. If I need that when I'm actually writing the call, isn't there something we're missing here? The person reading the code doesn't have the benefit of an IDE and, unless they jump to the definition, they have no idea which of the two rectangles passed as arguments is used for what.

The problem goes even further than that. If our arguments come from some local variable, there might be situations where we don't even know what the second argument is since we only see the variable name. Take for example this line of code

DrawRectangleClipped(deserializedArray[0], deserializedArray[1], deserializedArray[2])

This is alleviated to various extents in different languages, but even in strictly typed languages, and even if you name your variables sensibly, the call site never mentions the type of the variable you're passing to the function.

As it usually is with programming, there are a lot of potential solutions to this problem. Many are already implemented in popular languages. Named parameters in C# for example. However, all that I know have significant drawbacks. Naming every parameter on every function call can't possibly lead to readable code. It almost feels like maybe we're outgrowing possibilities that plain text programming gives us. We've moved from JUST text in almost every area, yet we still code the same. More information is needed to be displayed in the code? Add more text. Anyways, this is getting a bit tangential so I'll stop here.

One reply I got to the second code snippet is that you would probably first unpack the array to some named variables and then use those but the variable's name can mean many things and the way it's called doesn't necessarily tell you the way it's supposed to be interpreted in the context of the called function. In the local scope, you might have two rectangles named leftRectangle and rightRectangle because that's what they semantically represent, but it doesn't need to extend to what they represent when given to a function.

In fact, if your variables are named in the context of the called function, then you're introducing less information than you potentially could with that function call, and on some level it does lead to worse code. If you have a procedure that results in a rectangle you store in rectForClipping and then another procedure that provides rectForDrawing, then the actual call to DrawRectangleClipped is just ceremony: a line that means nothing new and is there just so the computer knows what exactly you want, even though you've explained it already with your naming. This isn't a good thing.

I'd really love to hear fresh perspectives on this. I'm sure I'm not the first one to consider this a problem, so how is it solved?

16
  • 2
    I'm confused about what the exact problem is... there seem to be several ideas here, not sure which one is your main point. Commented May 21, 2014 at 16:25
  • 1
    The documentation for the function should tell you what the arguments do. You might object that someone reading the code may not have the documentation, but then they don't really know what the code does, and whatever meaning they extract from reading it is an educated guess. In any context where the reader needs to know that the code is correct, they're going to need the documentation.
    – Doval
    Commented May 21, 2014 at 16:40
  • 3
    @Darwin In functional programming all functions still only have 1 argument. If you need to pass "multiple arguments", the parameter is usually a tuple (if you want them to be ordered) or a record (if you don't want them to be). Additionally, it's trivial to form specialized versions of functions at any time, so you can reduce the number of arguments needed. Since pretty much every functional language provides syntax for tuples and records, bundling up values is painless, and you get composition for free (you can chain functions that return tuples with those that take tuples.)
    – Doval
    Commented May 21, 2014 at 17:10
  • 1
    @Bergi People tend to generalize much more in pure FP so I think the functions themselves are usually smaller and more numerous. I could be way off though. I don't have much experience working on real projects with Haskell and the gang.
    – Darwin
    Commented May 21, 2014 at 22:04
  • 4
    I think the answer is "Don't name your variables 'deserializedArray'" ? Commented May 21, 2014 at 22:51

6 Answers

10

I agree that the way functions are often used can be a confusing part of writing code, and especially reading code.

The answer to this problem partly depends on the language. As you mentioned, C# has named parameters. Objective-C's solution to this problem involves more descriptive method names. For example, stringByReplacingOccurrencesOfString:withString: is a method with clear parameters.

In Groovy, some functions take maps, allowing for a syntax like the following:

restClient.post(path: 'path/to/somewhere',
            body: requestBody,
            requestContentType: 'application/json')

In general, you can solve this issue by limiting the number of parameters you pass to a function. I think 2-3 is a good limit. If it appears that a function needs more parameters, it causes me to re-think the design. But, this can be harder to answer generally. Sometimes you are trying to do too much in a function. Sometimes it makes sense to consider a class for storing your parameters. Also, in practice, I often find that functions which take large numbers of parameters normally have many of them as optional.

Even in a language like Objective-C it makes sense to limit the number of parameters. One reason is that many parameters are optional. For an example, see rangeOfString: and its variations in NSString.

A pattern I often use in Java is to use a fluent-style class as a parameter. For example:

something.draw(new Box().withHeight(5).withWidth(20))

This uses a class as a parameter, and with a fluent-style class, makes for easily readable code.

The above Java snippet also helps where the ordering of parameters may not be so obvious. We normally assume with coordinates that X comes before Y. And I normally see height before width as a convention, but that is still not very clear (something.draw(5, 20)).
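For reference, a fluent-style class along these lines might look roughly like the sketch below. It is written in JavaScript rather than Java, and everything apart from the names Box, withHeight and withWidth is a hypothetical stand-in (including the draw function, which only reports what it received).

```javascript
// Hypothetical Box class: only the names Box, withHeight and withWidth
// come from the snippet above.
class Box {
    constructor() {
        this.height = 0;
        this.width = 0;
    }

    // Each setter returns `this`, which is what allows the calls to be chained.
    withHeight(height) {
        this.height = height;
        return this;
    }

    withWidth(width) {
        this.width = width;
        return this;
    }
}

// Stand-in for something.draw: reports the dimensions it received.
function draw(box) {
    return `${box.width} wide, ${box.height} high`;
}

// The call site labels each number, so the order of the two calls no longer matters.
const drawn = draw(new Box().withHeight(5).withWidth(20));
```

Swapping the two chained calls gives the same Box, which is the point: the reader never has to remember whether height or width comes first.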

I've also seen some functions like drawWithHeightAndWidth(5, 20) but even these can't take too many parameters, or you'd start to lose readability.

2
  • 2
    The order, if you continue on the Java example, can indeed be very tricky. For instance compare the following constructors from awt: Dimension(int width, int height) and GridLayout(int rows, int cols) (the number of rows is the height, meaning GridLayout has the height first and Dimension the width). Commented May 22, 2014 at 13:19
  • 1
    Such inconsistencies have also been heavily criticized in PHP (eev.ee/blog/2012/04/09/php-a-fractal-of-bad-design), for instance: array_filter($input, $callback) versus array_map($callback, $input), strpos($haystack, $needle) versus array_search($needle, $haystack) Commented May 22, 2014 at 13:20
12

Mostly it's solved by good naming of functions, parameters, and arguments. You already explored that and found it had deficiencies, however. Most of those deficiencies are mitigated by keeping functions small, with a small number of parameters, both in the calling context and the called context. Your particular example is problematic because the function you are calling is trying to do several things at once: specify a base rectangle, specify a clipping region, draw it, and fill it with a specific color.

This is kind of like trying to write a sentence using only the adjectives. Put more verbs (function calls) in there, create a subject (object) for your sentence, and it's easier to read:

rect.clip(clipRect).fill(color)

Even if clipRect and color have terrible names (and they shouldn't), you can still discern their types from the context.

Your deserialized example is problematic because the calling context is trying to do too much at once: deserializing and drawing something. You need to assign names that make sense and clearly separate the two responsibilities. At a minimum, something like this:

(rect, clipRect, color) = deserializeClippedRect()
rect.clip(clipRect).fill(color)

A lot of readability problems are caused by trying to be too concise, skipping intermediate stages that humans require to discern context and semantics.
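As a rough illustration of how the chained form could be supported, here is a minimal JavaScript sketch. The Rect class is hypothetical: only the method names clip and fill come from the snippet above, and fill returns a description instead of actually drawing.

```javascript
// Hypothetical Rect type: only the method names clip and fill
// come from the snippet above.
class Rect {
    constructor(x, y, width, height) {
        this.x = x;
        this.y = y;
        this.width = width;
        this.height = height;
    }

    // Returns a new Rect covering the intersection with clipRect,
    // so that further calls can be chained onto the result.
    clip(clipRect) {
        const left = Math.max(this.x, clipRect.x);
        const top = Math.max(this.y, clipRect.y);
        const right = Math.min(this.x + this.width, clipRect.x + clipRect.width);
        const bottom = Math.min(this.y + this.height, clipRect.y + clipRect.height);
        // Clamp to zero when the rectangles do not overlap.
        return new Rect(left, top, Math.max(0, right - left), Math.max(0, bottom - top));
    }

    // Stand-in for real drawing: describes what would be filled.
    fill(color) {
        return `fill ${this.width}x${this.height} at (${this.x}, ${this.y}) with ${color}`;
    }
}

const rect = new Rect(0, 0, 10, 10);
const clipRect = new Rect(5, 5, 10, 10);
const result = rect.clip(clipRect).fill('red');
```

Each call takes a single argument whose role is fixed by the method name, so there is nothing left to mix up at the call site.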

2
  • 1
    I like the idea of stringing multiple function calls to clarify meaning, but isn't that just dancing around the issue? It's basically "I want to write a sentence, but the language I'm using won't let me so I can only use the closest equivalent"
    – Darwin
    Commented May 21, 2014 at 16:59
  • @Darwin IMHO it's not like this could be improved by making the programming language more natural-language-like. Natural languages are very ambiguous and we can only understand them in context and actually can never be sure. Stringing function calls is way better as every term (ideally) has documentation and available sources and we have parentheses and dots making the structure clear.
    – maaartinus
    Commented Jan 23, 2020 at 23:42
3

In practice, it's solved by better design. It is exceptionally uncommon for well-written functions to take more than 2 inputs, and when it does occur, it's uncommon for those inputs not to be aggregable into some cohesive bundle. This makes it pretty easy to break up functions or aggregate parameters so you're not making a function do too much. Once it has two inputs, it becomes easy to name and much clearer about which input is which.

My toy language had the concept of phrases to deal with this, and other more natural language focused programming languages have had other approaches to deal with it, but they all tend to have other downsides. Plus, even phrases are little more than a nice syntax around making functions have better names. It's always going to be hard to make a good function name when it takes a bunch of inputs.

2
  • Phrases really seem like a step forward. I know some languages have similar capabilities but it's FAR from widespread. Not to mention, with all the macro hate coming from C(++) purists that never used macros done right, we might never have features like these in popular languages.
    – Darwin
    Commented May 21, 2014 at 16:45
  • Welcome to the general topic of domain-specific languages, something I really wish more people would understand the advantage of... (+1)
    – Izkata
    Commented May 21, 2014 at 18:36
2

In JavaScript (or ECMAScript), for example, many programmers grew accustomed to passing parameters as a set of named object properties in a single anonymous object.

As a programming practice, it spread from programmers to their libraries, and from there to other programmers who grew to like it, use it, and write more libraries with it.

Example

Instead of calling

function drawRectangleClipped (rectToDraw, fillColor, clippingRect)

like this:

drawRectangleClipped(deserializedArray[0], deserializedArray[1], deserializedArray[2])

, which is a valid and correct style, you call the

function drawRectangleClipped (params)

like this:

drawRectangleClipped({
    rectToDraw: deserializedArray[0], 
    fillColor: deserializedArray[1], 
    clippingRect: deserializedArray[2]
})

, which is valid and correct and nice with regard to your question.
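On the receiving side, the function might unpack its single argument roughly as in the sketch below, assuming ES2015 destructuring is available. The function body and the default value for fillColor are hypothetical and only there for illustration; the real implementation isn't shown here.

```javascript
// Hypothetical implementation: the real body of drawRectangleClipped is not shown above.
// Destructuring in the parameter list names each expected property at the definition,
// and the default value makes fillColor optional for the caller.
function drawRectangleClipped({ rectToDraw, clippingRect, fillColor = 'black' }) {
    // Return a description instead of drawing, so the sketch stays self-contained.
    return `draw ${rectToDraw} clipped to ${clippingRect} in ${fillColor}`;
}

// Property order no longer matters, and every argument is labeled at the call site.
const description = drawRectangleClipped({
    clippingRect: 'viewport',
    rectToDraw: 'button',
});
```

The trade-off is that a misspelled property name silently arrives as undefined, so this style relies on documentation or extra checks more than positional parameters do.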

Of course, there have to be suitable conditions for this: in JavaScript it is much more viable than in, say, C. In JavaScript, this even gave birth to a now widely used structural notation that grew popular as a lighter counterpart to XML. It's called JSON (you may have already heard about it).

2
  • I don't know enough about that language to verify the syntax but, I overall like this post. Seems pretty elegant. +1
    – IT Alex
    Commented Jan 22, 2020 at 20:20
    Quite often, this gets combined with normal arguments, i.e., there are 1-3 arguments followed by params (often containing optional arguments and often itself optional), as in e.g. this function. This makes functions with many arguments pretty easy to grasp (in my example there are 2 mandatory and 6 optional arguments).
    – maaartinus
    Commented Jan 23, 2020 at 23:54
0

You should use Objective-C then. Here is a method definition:

- (id)performSelector:(SEL)aSelector withObject:(id)anObject withObject:(id)anotherObject

And here it is used:

[someObject performSelector:someSelector withObject:someObject2 withObject:someObject3];

I think Ruby has similar constructs, and you can simulate them in other languages with key-value lists.

For complex functions in Java I like to define dummy variables in the function's wording. For your left-right example:

Rectangle referenceRectangle = leftRectangle;
Rectangle targetRectangle = rightRectangle;
doSomeWeirdStuffWithRectangles(referenceRectangle, targetRectangle);

Looks like more coding, but you can for example use leftRectangle and then refactor the code later with "Extract local variable" if you think it will not be understandable to a future maintainer of the code, which might or might not be you.

1
  • About that Java example, I wrote in the question why I think it's not a good solution. What do you think about that?
    – Darwin
    Commented May 22, 2014 at 6:28
0

My approach is to create temporary local variables - but not just call them LeftRectangle and RightRectangle. Rather, I use somewhat longer names to convey more meaning. I often try to differentiate the names as much as possible, e.g. not call both of them something_rectangle, if their roles are not very symmetric.

Example (C++):

auto& connector_source = deserializedArray[0]; 
auto& connector_target = deserializedArray[1]; 
auto& bounding_box = deserializedArray[2]; 
DoWeirdThing(connector_source, connector_target, bounding_box);

and I might even write a one-liner wrapper function or template:

template <typename T1, typename T2, typename T3>
void draw_bounded_connector(
    T1& connector_source, T2& connector_target, const T3& bounding_box)
{
    DoWeirdThing(connector_source, connector_target, bounding_box);
}

(ignore the ampersands if you don't know C++).

If the function does several weird things with no good description - then it probably needs to be refactored!
