[PATCH 1/2] micro-optimisation: avoid division in main strtod() loop
Dirk Hohndel
dirk at hohndel.org
Sun Feb 16 12:01:22 UTC 2014
I think I'll have to pull the full email into the commit message...
/D
---
From my phone
Linus Torvalds <torvalds at linux-foundation.org> wrote:
>
>From: Linus Torvalds <torvalds at linux-foundation.org>
>Date: Mon, 10 Feb 2014 13:41:00 -0800
>Subject: [PATCH 1/2] micro-optimisation: avoid division in main strtod() loop
>
>Division is expensive, so replace it with multiplication instead. But
>don't multiply by 0.1 (inexact in floating point), multiply by 10 and
>then do one division at the end.
>
>Make sure the final division is at the very end, so that the result
>isn't immediately used. That allow the division to overlap with the
>function return overhead, hiding it further.
>
>Signed-off-by: Linus Torvalds <torvalds at linux-foundation.org>
>---
>
>This is silly, but while thinking about different file formats and doing
>profiling of loading big files, it turned out that "strtod_flags()"
>actually showed up in profiles. Not very high, but at more than 1%.
>
>This makes the common case (no exponent) use only addition and
>multiplication until the very end, and makes the division be the very last
>thing it does, which minimizes the data dependencies on the division.
>
>For my stupid test-case, it cut the cost of strtod_flags() in half
>according to the profile. The half a percent speedup on loading time isn't
>really noticeable or even measurable outside of profiling startup costs,
>but rather than carry this along in my tree or just throw it away, I'm
>sending it out to see if anybody cares.
>
>Note that we could avoid the final division by instead multiplying
>"decimal" with 0.1 rather than multiplying by 10 (and switching the sign
>test over), but that's a fundamentally inexact operation in binary floatig
>point, so doing the "multiply by tens for decimals" ends up keeping
>everything exact as long as possible.
>
>For our use, we probably really don't care, but whatever. End result: this
>should not only speed things up immeasurably, it *might* also make things
>more precise at a level that we really don't care about :^p
>
>I'm really selling this piece of crap, aren't I?
>
> strtod.c | 15 ++++++---------
> 1 file changed, 6 insertions(+), 9 deletions(-)
>
>diff --git a/strtod.c b/strtod.c
>index fe7319887546..457cbcd9a384 100644
>--- a/strtod.c
>+++ b/strtod.c
>@@ -64,12 +64,9 @@ double strtod_flags(const char *str, const char **ptr, unsigned int flags)
> }
> if (c >= '0' && c <= '9') {
> numbers++;
>- if (dot) {
>- decimal /= 10;
>- val += (c - '0') * decimal;
>- } else {
>- val = (val * 10) + (c - '0');
>- }
>+ val = (val * 10) + (c - '0');
>+ if (dot)
>+ decimal *= 10;
> continue;
> }
> if (c != 'e' && c != 'E')
>@@ -111,9 +108,9 @@ double strtod_flags(const char *str, const char **ptr, unsigned int flags)
>
> while (exponent-- > 0) {
> if (esign)
>- val /= 10;
>+ decimal *= 10;
> else
>- val *= 10;
>+ decimal /= 10;
> }
> }
>
>@@ -122,7 +119,7 @@ done:
> goto no_conversion;
> if (ptr)
> *ptr = p-1;
>- return sign ? -val : val;
>+ return (sign ? -val : val) / decimal;
>
> no_conversion:
> if (ptr)
>--
>1.9.0.rc3
>
More information about the subsurface
mailing list