Investigating floating-point implementations of $\operatorname{pow}(x,b)=x^b$ with $x,b\in\Bbb R$ in some libraries, I found that some `pow` implementations are inexact.
The implementation is basically $$ \operatorname{pow}(x,b)= \exp(b\ln x) \tag 1 $$ Now assume $x>0$, and that $\exp$ and $\ln$ each incur the same relative error $\Delta$ with $|\Delta| \ll 1$.
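As a minimal sketch of construction $(1)$ (Python standing in for the C library; `pow_naive` is my own name, not any library's actual code):

```python
import math

def pow_naive(x, b):
    """Compute x**b as exp(b*ln x), the construction in equation (1).

    math.log and math.exp are each accurate to within a small relative
    error; the multiplication b*ln(x) adds one more rounding step.
    """
    assert x > 0
    return math.exp(b * math.log(x))
```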
Then the result of the computation is:
$$\begin{align}
(1+\Delta_\text{pow}) \operatorname{pow}(x,b) &= (1+\Delta)\exp (b(1+\Delta)\ln x) \\
&=(1+\Delta)\exp(b\ln x)\underbrace{\exp(\Delta\cdot b\ln x)}_{\textstyle\approx 1+\Delta\cdot b\ln x} \\
&\approx (1+\Delta)\operatorname{pow}(x,b)(1+\Delta\cdot b\ln x) \\
&\approx (1+\Delta + \Delta\cdot b\ln x)\operatorname{pow}(x,b) \tag 2
\end{align}$$
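The amplification in $(2)$ is easy to observe numerically. The sketch below (my own test harness, not from any library) compares the $\exp(b\ln x)$ construction in double precision against a 50-digit `decimal` reference; the worst relative error over random inputs comes out orders of magnitude above the $\approx 10^{-16}$ unit roundoff:

```python
import math
import random
from decimal import Decimal, getcontext

getcontext().prec = 50   # 50-digit reference arithmetic
random.seed(0)           # reproducible samples

def pow_naive(x, b):
    # the exp(b*ln x) construction from equation (1)
    return math.exp(b * math.log(x))

def rel_err(x, b):
    # reference value of x**b computed in 50-digit arithmetic
    ref = (Decimal(b) * Decimal(x).ln()).exp()
    return abs(Decimal(pow_naive(x, b)) / ref - 1)

# x in [0.5, 2] and |b| <= 700 keeps |b*ln x| below ~485, safely inside
# the double-precision range (exp overflows near 709)
worst = max(rel_err(random.uniform(0.5, 2.0), random.uniform(-700.0, 700.0))
            for _ in range(1000))
print(f"worst relative error: {float(worst):.3e}")
```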
This means that the relative error of `pow` is not $\Delta$ but, in the worst case,
$$
\Delta_\text{pow} = \Delta(1+|b\ln x|) \tag 3
$$
So it is clear why the implementation according to $(1)$ becomes inexact, but I don't see how `pow` could be implemented with a smaller relative error. (The targeted precision is relative because this is in the context of a floating-point implementation.)
Of course I am not asking for a specific implementation, but for a different representation of `pow` with better error behaviour. In the implementation above, the relative error of `pow` is unbounded even when the relative errors of `exp` and `ln` are bounded and well-behaved.
For example, in IEEE double precision the relative error of `exp` and `ln` is on the order of
$$
\Delta \approx 10^{-16},
$$
so the approximation $e^t\approx 1+t$ works well for a wide range of $t=\Delta\cdot b\ln x$. E.g. when $|b\ln x| = 1000 \approx 2^{10}$, the amplification factor in $(3)$ is about $1000$, i.e. the `pow` implementation loses about $\log_2 1000 \approx 10$ bits of precision.