Why does GCC call libc's sqrt() without using its result?

Question

Using GCC 6.3, the following C++ code:

#include <cmath>
#include <iostream>

void norm(double r, double i)
{
    double n = std::sqrt(r * r + i * i);
    std::cout << "norm = " << n;
}

generates the following x86-64 assembly:

norm(double, double):
        mulsd   %xmm1, %xmm1
        subq    $24, %rsp
        mulsd   %xmm0, %xmm0
        addsd   %xmm1, %xmm0
        pxor    %xmm1, %xmm1
        ucomisd %xmm0, %xmm1
        sqrtsd  %xmm0, %xmm2
        movsd   %xmm2, 8(%rsp)
        jbe     .L2
        call    sqrt
.L2:
        movl    std::cout, %edi
        movl    $7, %edx
        movl    $.LC1, %esi
        call    std::basic_ostream<char, std::char_traits<char> >& std::__ostream_insert<char, std::char_traits<char> >(std::basic_ostream<char, std::char_traits<char> >&, char const*, long)
        movsd   8(%rsp), %xmm0
        movl    std::cout, %edi
        addq    $24, %rsp
        jmp     std::basic_ostream<char, std::char_traits<char> >& std::basic_ostream<char, std::char_traits<char> >::_M_insert<double>(double)

For the call to std::sqrt, GCC first computes the result with sqrtsd and saves it onto the stack. It then compares the argument against zero and, if the argument is negative, calls the libc sqrt function. But it never saves xmm0 after that call, and before the second call to operator<< it restores the value from the stack (because xmm0 was clobbered by the first call to operator<<).

With a simpler std::cout << n;, it's even more obvious:

subq    $24, %rsp
movsd   %xmm1, 8(%rsp)
call    sqrt
movsd   8(%rsp), %xmm1
movl    std::cout, %edi
addq    $24, %rsp
movapd  %xmm1, %xmm0
jmp     std::basic_ostream<char, std::char_traits<char> >& std::basic_ostream<char, std::char_traits<char> >::_M_insert<double>(double)

Why is GCC not using the xmm0 value computed by libc sqrt?

This is actually a really cool trick they implemented: we finally get the performance of single assembly instructions for calculating transcendental functions in the common case without having to use -fno-math-errno and similar. – Matteo Italia, Commented Apr 9, 2017 at 13:13

Accepted Answer · Ross Ridge (edited by Peter Mortensen, 2017-04-09 07:01:07Z)

77

It doesn't need to call sqrt to compute the result; it's already been calculated by the SQRTSD instruction. It calls sqrt to generate the required behaviour according to the standard when a negative number is passed to sqrt (for example, set errno and/or raise a floating-point exception). The PXOR, UCOMISD, and JBE instructions test whether the argument is less than 0 and skip the call to sqrt if this isn't true.
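As an illustration only (this is not part of the original answer), the control flow described above can be sketched in C++. The names hardware_sqrt and sqrt_as_compiled are hypothetical, and using the SSE2 intrinsic _mm_sqrt_sd as a stand-in for the inline SQRTSD is an assumption; on targets without SSE2 the sketch simply falls back to std::sqrt.

```cpp
#include <cmath>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

// Hypothetical stand-in for the inline SQRTSD instruction.
// It computes the square root but never touches errno.
inline double hardware_sqrt(double x) {
#if defined(__SSE2__)
    return _mm_cvtsd_f64(_mm_sqrt_sd(_mm_setzero_pd(), _mm_set_sd(x)));
#else
    return std::sqrt(x);  // fallback: no SSE2 available on this target
#endif
}

// Rough model of the code GCC emitted for std::sqrt(x) in the question.
double sqrt_as_compiled(double x) {
    double result = hardware_sqrt(x);  // sqrtsd: always executed
    if (x < 0.0) {                     // pxor / ucomisd / jbe
        (void)std::sqrt(x);            // call sqrt: only for its error handling
    }
    return result;                     // the library's return value is not needed
}
```

For a non-negative or NaN argument the library call is skipped entirely. For a negative argument the returned NaN still comes from the hardware path, and the discarded library call is what reports the error (on glibc, for instance, by setting errno).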

edited Apr 9, 2017 at 7:01

Peter Mortensen


answered Apr 9, 2017 at 5:07

Ross Ridge


12

@Benoît Like I said, it doesn't need the result of the sqrt. It's not calling sqrt to obtain the result. It's calling sqrt purely for its side effects, its error handling when the argument to sqrt is less than 0.
– Ross Ridge
Commented Apr 9, 2017 at 5:27
2

Which side effects? The only one I can think of would be setting errno.
– Benoît
Commented Apr 9, 2017 at 5:30
13

@Benoît Isn't that enough? In C++11 it can also (or instead) generate an FE_INVALID floating-point exception. The compiler is simply leaving it up to the library implementation to handle this case.
– Ross Ridge
Commented Apr 9, 2017 at 5:35
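
To make the two side effects mentioned in this comment concrete, here is a small sketch that observes them. The struct SqrtEffects and the function observe_sqrt are hypothetical names chosen for illustration. Which of the two mechanisms an implementation uses depends on its math_errhandling, so no specific combination is assumed here.

```cpp
#include <cerrno>
#include <cfenv>
#include <cmath>

// Hypothetical record of what a single std::sqrt call did.
struct SqrtEffects {
    double value;        // the returned result
    bool errno_is_edom;  // true if errno was set to EDOM
    bool fe_invalid;     // true if FE_INVALID was raised
};

SqrtEffects observe_sqrt(double x) {
    volatile double input = x;  // volatile keeps the call from being folded away
    errno = 0;
    std::feclearexcept(FE_ALL_EXCEPT);

    volatile double result = std::sqrt(input);

    bool edom = (errno == EDOM);
    bool invalid = (std::fetestexcept(FE_INVALID) != 0);
    return SqrtEffects{result, edom, invalid};
}
```

Calling observe_sqrt(-1.0) should report at least one of the two effects on a conforming implementation with default compiler flags, while observe_sqrt(4.0) should report neither. Flags such as -ffast-math or -fno-math-errno can change that.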
6

@TheTechel: In the cases where the argument >= 0, std::sqrt won't be called because it gets jumped over and the sqrtsd assembly instruction does the work. In the other cases (argument < 0), the sqrtsd instruction will be jumped over and std::sqrt will be called (which might or might not try to calculate the root of a negative number). So there will be at most one calculation.
– hoffmale
Commented Apr 9, 2017 at 9:37
2

@hoffmale Eh? Where do you see in the question that the sqrtsd instruction gets jumped over? It looks to me like that one executes unconditionally; it's only the call to sqrt that gets skipped. Which is still enough when optimising for the common case.
– user743382
Commented Apr 9, 2017 at 10:44
