Benefits of pure function

Question

Today i was reading about pure function, got confused with its use:

A function is said to be pure if it returns same set of values for same set of inputs and does not have any observable side effects.

e.g. strlen() is a pure function while rand() is an impure one.

__attribute__ ((pure)) int fun(int i)
{
    return i*i;
}

int main()
{
    int i=10;
    printf("%d",fun(i));//outputs 100
    return 0;
}

http://ideone.com/33XJU

The above program behaves in the same way as in the absence of pure declaration.

What are the benefits of declaring a function as pure[if there is no change in output]?

I don't think this definition of purity is correct - printf, for example, would qualify (calling it twice with the same arguments yields the same return value), but it is not pure. — tdammers, Commented Jun 22, 2012 at 11:51
@tdammers: Indeed, it lacks the ...and no side-effects... part. — Frerich Raabe, Commented Jun 22, 2012 at 12:15
@Ben: where does the entropy come from? We're dealing with (theoretically) deterministic machines here, the only way of getting true entropy into them is from external sources, which means side effects. Of course, we could allow programming languages to define nondeterministic functions, pretending the technical side effects aren't there and the functions really are nondeterministic; but if we do that, most of the practical benefits of tracking purity are lost. — tdammers, Commented Jun 22, 2012 at 13:50
tdammers is correct - the definition of pure given above is incorrect. Pure means that the output depends only on the inputs to the function; also, there must be no observable side-effects. "Same output for same input" is a very inaccurate summary of those requirements. en.wikipedia.org/wiki/Pure_function — Dancrumb, Commented Jun 22, 2012 at 14:04

Philip Kendall · Accepted Answer · 2012-06-22 09:48:38Z

146

pure lets the compiler know that it can make certain optimisations about the function: imagine a bit of code like

for (int i = 0; i < 1000; i++)
{
    printf("%d", fun(10));
}

With a pure function, the compiler can know that it needs to evaluate fun(10) once and once only, rather than 1000 times. For a complex function, that's a big win.

answered Jun 22, 2012 at 9:48

Philip Kendall

4,3041 gold badge24 silver badges43 bronze badges

Ie, you can safely use memoization
– Joel Coehoorn
Commented Jun 22, 2012 at 17:48
@mob What do you mean? Why not?
– Konrad Rudolph
Commented Jun 22, 2012 at 18:17
15

Because you can modify the string (sequence of chars starting from some address) without modifying the input (the pointer to the address where the string starts), i.e., you can't memoize it. It would only be a pure function in a language with immutable strings (Java, say).
– mob
Commented Jun 22, 2012 at 19:35
5

@KonradRudolph: Imagine a length 1000 string. Call strlen on it. Then again. Same thing yes? Now modify the second character to be \0. Does strlen still return 1000 now? The starting address is the same (== input is same) but the function now returns a different value.
– Mike Bailey
Commented Jun 23, 2012 at 16:10
5

@mob That’s a good objection, obviously you are right. I was misled by the fact that even books claim that strlen (in GCC / glibc) is in fact pure. But a look at the glibc implementation showed this to be wrong.
– Konrad Rudolph
Commented Jun 23, 2012 at 16:44

Add a comment |

Community · Accepted Answer · 2017-05-23 12:32:04Z

34

When you say a function is 'pure' you are guaranteeing that it has no externally visible side-effects (and as a comment says, if you lie, bad things can happen). Knowing that a function is 'pure' has benefits for the compiler, which can use this knowledge to do certain optimizations.

Here is what the GCC documentation says about the pure attribute:

pure

Many functions have no effects except the return value and their return value depends only on the parameters and/or global variables. Such a function can be subject to common subexpression elimination and loop optimization just as an arithmetic operator would be. These functions should be declared with the attribute pure. For example,
          int square (int) __attribute__ ((pure));

Philip's answer already shows how knowing a function is 'pure' can help with loop optimizations.

Here is one for common sub-expression elimination (given foo is pure):

a = foo (99) * x + y;
b = foo (99) * x + z;

Can become:

_tmp = foo (99) * x;
a = _tmp + y;
b = _tmp + z;

edited May 23, 2017 at 12:32

CommunityBot

11 silver badge

answered Jun 22, 2012 at 9:49

ArjunShankar

23.5k6 gold badges62 silver badges85 bronze badges

3

I am not sure if any do this, but pure functions also allow the compiler to reorder when the function gets called, should it deem reordering would be beneficial. When there is the possibility for side effects, the compiler needs to be more conservative.
– mpdonadio
Commented Jun 22, 2012 at 17:12
@MPD - Yes, that sounds reasonable. And since a call instruction is a bottleneck for superscalar CPUs, some help from the compiler can help.
– ArjunShankar
Commented Jun 22, 2012 at 17:50
I vaguely recall using a DSP compiler a number of years ago that would use this technique to get return values sooner/later. This allowed it to minimize pipeline stalls.
– mpdonadio
Commented Jun 22, 2012 at 18:53
1

Could "foo(99)" be precalculated since 99 is a const and foo will always return the same result? Maybe in some sort of two-stage compile?
– markwatson
Commented Jun 25, 2012 at 20:41
1

@markwatson - I am not sure. There can be cases when it is simply not possible. e.g. if foo is part of another compilation unit (another C file), or in a pre-compiled library. In both cases, the compiler wont know what foo does, and cannot pre-calculate.
– ArjunShankar
Commented Jun 25, 2012 at 23:08

Add a comment |

Frerich Raabe · Accepted Answer · 2012-06-22 09:56:18Z

29

In addition to possible run-time benefits, a pure function is much easier to reason about when reading code. Furthermore, it's much easier to test a pure function since you know that the return value only depends on the values of the parameters.

answered Jun 22, 2012 at 9:56

Frerich Raabe

93k19 gold badges119 silver badges212 bronze badges

3

+1, your point about testing is an interesting one. No set-up and tear-down required.
– ArjunShankar
Commented Jun 22, 2012 at 9:59

Add a comment |

Giorgio · Accepted Answer · 2012-06-22 15:08:37Z

15

A non-pure function

int foo(int x, int y) // possible side-effects

is like an extension of a pure function

int bar(int x, int y) // guaranteed no side-effects

in which you have, besides the explicit function arguments x, y, the rest of the universe (or anything your computer can communicate with) as an implicit potential input. Likewise, besides the explicit integer return value, anything your computer can write to is implicitly part of the return value.

It should be clear why it is much easier to reason about a pure function than a non-pure one.

answered Jun 22, 2012 at 15:08

Giorgio

5,1636 gold badges44 silver badges71 bronze badges

1

+1: Using the universe as a potential input is a very nice way of explaining the difference between pure and not-pure.
– ArjunShankar
Commented Jun 22, 2012 at 16:52
indeed, this is the idea behind monads.
– Kristopher Micinski
Commented Oct 7, 2012 at 19:26

Add a comment |

Robert Mason · Accepted Answer · 2012-06-22 15:40:24Z

Just as an add-on, I would like to mention that C++11 codifies things somewhat using the constexpr keyword. Example:

#include <iostream>
#include <cstring>

constexpr unsigned static_strlen(const char * str, unsigned offset = 0) {
        return (*str == '\0') ? offset : static_strlen(str + 1, offset + 1);
}

constexpr const char * str = "asdfjkl;";

constexpr unsigned len = static_strlen(str); //MUST be evaluated at compile time
//so, for example, this: int arr[len]; is legal, as len is a constant.

int main() {
    std::cout << len << std::endl << std::strlen(str) << std::endl;
    return 0;
}

The restrictions on the usage of constexpr make it so that the function is provably pure. This way, the compiler can more aggressively optimize (just make sure you use tail recursion, please!) and evaluate the function at compile time instead of run time.

So, to answer your question, is that if you're using C++ (I know you said C, but they are related), writing a pure function in the correct style allows the compiler to do all sorts of cool things with the function :-)

Uri Goren · Accepted Answer · 2015-05-13 18:53:32Z

In general, Pure functions has 3 advantages over impure functions that the compiler can take advantage of:

Caching

Lets say that you have pure function f that is being called 100000 times, since it is deterministic and depends only on its parameters, the compiler can calculate its value once and use it when necessary

Parallelism

Pure functions don't read or write to any shared memory, and therefore can run in separate threads without any unexpected consequence

Passing By Reference

A function f(struct t) gets its argument t by value, and on the other hand, the compiler can pass t by reference to f if it is declared as pure while guaranteeing that the value of t will not change and have performance gains

In addition to the compile time considerations, pure functions can be tested fairly easy: just call them.

No need to construct objects or mock connections to DBs / file system.

Collectives™ on Stack Overflow

Benefits of pure function

6 Answers 6

Caching

Parallelism

Passing By Reference

Not the answer you're looking for? Browse other questions tagged
c
pure-virtual
or ask your own question.

Linked

Hot Network Questions

Collectives™ on Stack Overflow

6 Answers 6

Caching

Parallelism

Passing By Reference

Not the answer you're looking for? Browse other questions tagged cpure-virtual or ask your own question.

Linked

Related

Not the answer you're looking for? Browse other questions tagged
c
pure-virtual
or ask your own question.