25

Say that I have two functions that are essentially identical, where one validates its arguments while the other doesn't. The rationale: sometimes you want to be safe and sometimes you want to go fast.

What's your preferred naming convention to distinguish the two? For example:

list_err_t list_push_a(list_t *ref, list_t *item) {
    if ((ref != NULL) && (item != NULL)) {
        list_push_b(ref, item);
        return LIST_ERR_NONE;
    } else {
        return LIST_ERR_BAD_ARG;
    }
}

void list_push_b(list_t *ref, list_t *item) {
    item->next = ref->next;
    ref->next = item;
}

What would you name list_push_a and list_push_b? I think list_push_safe and list_push_fast is a bit wordy -- one of them should just be list_push. (And note that I'm not asking about CamelCase vs snake_case etc...)

addenda...

There have been some great answers already. I should have mentioned up front that the programming environment in question is low-level embedded devices, where speed is important and resources are scant. For example, raising exceptions is not an option...

11
  • 2
    list_push_if_valid and list_push
    – Fabio
    Commented Oct 25, 2020 at 18:12
  • 13
    As an example of this, the standard library in Rust consistently uses the _unchecked suffix. These functions are usually also marked unsafe, but that's a rather Rust-specific feature. Some examples are slice::get_unchecked (not checking bounds on array access), f64::to_int_unchecked (not checking for NaN and Infinity) and str::from_utf8_unchecked (quite self-explanatory, converts a byte array to a string without checking whether it is valid UTF-8 or not)
    – xenia
    Commented Oct 25, 2020 at 23:28
  • 2
    Most of what I#d write in an answer has already been said, but I would like to leave the principle of least astonishment, also known as law of least surprises here. Most devs would probably assume functions are safe, unless explicitly stated otherwise.
    – Polygnome
    Commented Oct 26, 2020 at 1:26
  • 3
    @BernhardBarker: If speed is of paramount concern, you wouldn't want to chew up your performance checking a parameter. Commented Oct 26, 2020 at 4:15
  • 1
    That said, for your type of case, unless the function is not defined in the header (as static/inline), we tend to just use asserts to validate such preconditions. We don't turn cases like a null being passed to a function that requires non-null arguments to be passed into a runtime error/exception. We assume that's a programmer error as a violation of a precondition. Of course, if it's part of an API and the function definition isn't available in the header, then we do the checking for unsafe versions for lack of a better alternative.
    – user377672
    Commented Oct 26, 2020 at 10:01

9 Answers 9

50

Your typical caller will expect functions to be safe, i.e. to inform about failure in an orderly fashion instead of crashing or giving funny results. You chose to use the traditional C style of error return values for that purpose, and that's probably okay in your situation (generally, I'd prefer exceptions, but I don't know if that's usable for you).

I'd go for calling the safe version list_push(), and the "fast" version list_push_unsafe() or list_push_no_checks(), as the most important fact about the "fast" one isn't its speed, but its unsafe behaviour. And of course clearly describe the difference in the function documentation.

I'd explicitly recommend against using the wording "fast" for the second one, as that doesn't convey a hint about the additional risk. Then the typical caller will see that there are two implementations, a slow one and a fast one, and will of course choose the "better" one: the fast one.

6
  • 1
    Maybe I should just tell it like it is with list_push_dangerous() ;). Thanks! Commented Oct 25, 2020 at 14:21
  • 6
    Another one which also conveys why you might only bother in tight situation, is list_push_raw(). Commented Oct 25, 2020 at 22:31
  • 19
    unsafe is IMO best word choice. It's often used to describe unchecked operations, e.g. Rust unsafe, Java's sun.misc.Unsafe, "memory safety", etc. Commented Oct 26, 2020 at 4:42
  • 1
    @fearless_fool There are real-world libraries out there that go with the dangerous route. I know this is not embedded code but React.js calls its function to insert HTML in the document without checks dangerouslySetInnerHtml()
    – slebetman
    Commented Oct 26, 2020 at 8:04
  • 1
    In C (as in the posted code) the typical caller should reasonably expect all functions to be unsafe though. Especially since the given implementation doesn't really make the safe option safe. Nonzero pointers can point anywhere.
    – moooeeeep
    Commented Oct 26, 2020 at 11:09
17

I would prefer to know when to use one vs the other.

Oh sure you could name one list_push_fast() and the other list_push_safe() but as your newest on-boarded employee I've no idea when to use one vs the other.

First, understand that removing two null checks is an exceedingly trivial optimization. Speed is a poor justification for separating validation. A better justification is that validation has already been performed elsewhere.

So if, as my new boss, you insisted on keeping both I'd name them list_push_checked() and list_push_unchecked(). If we name one simply list_push() my vote is to make that the checked one.

2
  • Does checked mean that the function will check its input, or that it assumes that the input has been checked? Commented Nov 18, 2020 at 0:08
  • @JacobRaihle it means that the function has a checkered past. Commented Nov 18, 2020 at 0:11
4

In a scenario like this, there are not just two, but at least three sensible ways the function could be designed:

  1. If pointer is null, function's behavior is specified as doing nothing.

  2. If pointer is null, function's behavior is specified as trapping.

  3. If pointer is null, function's behavior is undefined.

In some similar scenarios, it may make sense to have semantics that would be specified as either trapping if observable nonsensical behavior would be otherwise unavoidable, but allowing nonsensical behavior to be avoided without a trap. For example, a function which is supposed to compute x*y/z with integer values might guarantee that it will either yield an arithmetically-correct value or trap, but not specify which option it will choose if x*y would overflow. Under such a specification, a function could optimize the case where y==z and z!=0 so that it simply returns x without having to determine whether x*y would overflow.

A variant that can't guarantee anything about corner-cases behavior would be "unsafe" compared with one that could offer some behavioral guarantees, but whether a function would be "safe" in a particular use case would depend upon the caller's requirements. Consider, for example, a malloc() style function. Depending upon circumstances, a function which is specified to always either yield a valid pointer or trap may be "safer" than one which would return null in case of allocation failure, or the fact that the function might trap could mean that it's "unsafe" and the version which would return null would be safer, despite the fact that the caller would need to validate the result.

1
  • A very good point that "safety" and convenience both can be context-dependent. And look, there are widely-used alternatives to malloc() which abort if they fail to succeed. Commented Oct 26, 2020 at 17:44
3

I'm assuming that you have a more complex usecase and you're not trying to optimize the code actually posted (which is 2 cmp statements and if that needs to be optimized out you're likely to find more speed elsewhere). I would explicitly call one version unsafe or unchecked as other suggested.

I would also suggest putting a lint rule into your build. That lint rule should consider any use of the fast version an error. You can then use a lint disable comment on lines where it's actually needed. This will ensure that users don't blindly use the unsafe version and put a little thought into it.

1
  • The lint rule is a good idea.
    – user949300
    Commented Oct 26, 2020 at 16:03
2

Try to make the function fast and safe.

C++ references

In C++ you can use references when the object must be present (there is no null references).

void list_push(list_t &ref, list_t &item) {
    // ref and item can't be null, no need to check
    item.next = ref.next;
    ref.next = &item;
}

int main() {
    list_t list, item;
    // list and item is guaranteed to be there, null checks are a wastefull (but
    // probably optimized away).
    list_push(list, item);

    list_t *pitem = try_get_item();
    // try_get_item() can fail, so null check is needed before dereferencing.
    // Note that the check is only done in the caller, where it can be handled properly.
    if (pitem) {
        list_push(list, *pitem);
    }
}

Use the type system

For example, use not_null from CppCoreGuidelines:

// The function interface now says that it do not want nulls
void list_push(not_null<list_t *> ref, not_null<list_t *> item) {
    // ref and item are asserted to not be null in the not_null constructor.
    // The assert can often be optimized away.
    item->next = ref->next;
    ref->next = item;
}

Use asserts

Available in most languages. Will often be optimized away. Can often be removed completely in release builds after testing.

void list_push(list_t *ref, list_t *item) {
    assert(ref && item);
    item->next = ref->next;
    ref->next = item;
}
1
  • 1
    My environment mandates pure C (not C++), so references and strict typing aren't available. But ASSERTs are an excellent choice for the reasons you give. Commented Nov 18, 2020 at 18:26
1

In C, in this particular case, you don't.

Pushing a null node onto a null list is a bug. No matter what you do, there will be a bug. Returning an error code doesn't make it not a bug. Returning an error code means the program keeps running, but now there is an item missing from a list. Which item? Which list? I don't know that. Only the caller knows that. It could be the list of nuclear weapons to disarm, and it's a bad idea to just skip one.

Does every "safe" function caller check the error code? If so, is that easier than checking that the parameters aren't NULL? If not, then why not check that the parameters aren't NULL, and then you can have one function that is either fast or safe depending on whether you check for NULL first?

If you want to guarantee fail-fast behaviour, you can also assert inside the function that the parameters are not NULL.

Besides, why NULL specifically? What if someone passes (list_t*)14? Or a dangling pointer?

0

There are two possibilities: One is that speed for this code is of no importance. Have you measured it? If not, no importance. Use the safe code. Two is that speed is important, in that case take the safe function and declare it as "inline" or "static". Obviously not calling the "fast" function but incorporating it. And you would check your compiler documentation if there is a way to indicate which result is expected, for example if you expect that in 99% of all cases the two pointers are not NULL.

Another possibility is that calling the function with a NULL pointer is a severe bug. In that case the caller is unlikely to have code checking the return value, and you may be better off using assert or similar to check that the pointers are not NULL. And if you find they are, fix the caller.

On the other hand, you might not even need an assert because your program will crash / fall into the debugger if one of the pointers is NULL.

0

If possible you can separate the actual activity and the checks into two different components. One wraps around the other and external components that want to be always safe use the safe wrapper while components that are so deep in the system all parameters are already validated at that point just go with the inner plain class that holds the unprotected implementation.

Obviously this does not work for all cases, but it might be worthwhile to consider.

0

In this case the "safe" option is still unsafe when you pass in pointers to invalid memory. The only way to have it safe is to call the function correctly. This is true for both options. Especially for C or C++ code, I'd rather have the unchecked option as the default than to have it do some checks that may or may not make it safer by a small amount and claim a false impression of fail safe behavior.

If your use case for the "safe" option is really some function returns an object or null, that I want to push to the list directly if available, then I'd suggest to add the checked variant as a special case. Name them after the use case: list_push_if_not_null().

Note that the return value in case nothing is added isn't actually an error here. It's the expected and successful behavior in case null is given. Consider to omit that return value. If you want to use that info for control flow external to that function, you might want to perform that check external to the function too.

Checking if the first parameter is null doesn't seem to make sense to me here. This is not Java. In C a non-zero pointer might point anywhere. You should rather add a test suite to the client code to make sure you call the function with an actual object of the required type. As a library author you might want to add documentation and example code though.

Also consider to add a different type for list items. For your own sanity. And note that names ending with _t are reserved by the POSIX standard. So if this is your actual code, you might want to rename your types.

Not the answer you're looking for? Browse other questions tagged or ask your own question.