[PATCH 1/2] micro-optimisation: avoid division in main strtod() loop

Dirk Hohndel dirk at hohndel.org
Sun Feb 16 12:01:22 UTC 2014


I think I'll have to pull the full email into the commit message...

/D
---
From my phone

Linus Torvalds <torvalds at linux-foundation.org> wrote:

>
>From: Linus Torvalds <torvalds at linux-foundation.org>
>Date: Mon, 10 Feb 2014 13:41:00 -0800
>Subject: [PATCH 1/2] micro-optimisation: avoid division in main strtod() loop
>
>Division is expensive, so replace it with multiplication instead.  But
>don't multiply by 0.1 (inexact in floating point), multiply by 10 and
>then do one division at the end.
>
>Make sure the final division is at the very end, so that the result
>isn't immediately used.  That allow the division to overlap with the
>function return overhead, hiding it further.
>
>Signed-off-by: Linus Torvalds <torvalds at linux-foundation.org>
>---
>
>This is silly, but while thinking about different file formats and doing 
>profiling of loading big files, it turned out that "strtod_flags()" 
>actually showed up in profiles. Not very high, but at more than 1%.
>
>This makes the common case (no exponent) use only addition and 
>multiplication until the very end, and makes the division be the very last 
>thing it does, which minimizes the data dependencies on the division.
>
>For my stupid test-case, it cut the cost of strtod_flags() in half 
>according to the profile. The half a percent speedup on loading time isn't 
>really noticeable or even measurable outside of profiling startup costs, 
>but rather than carry this along in my tree or just throw it away, I'm 
>sending it out to see if anybody cares.
>
>Note that we could avoid the final division by instead multiplying 
>"decimal" with 0.1 rather than multiplying by 10 (and switching the sign 
>test over), but that's a fundamentally inexact operation in binary floatig 
>point, so doing the "multiply by tens for decimals" ends up keeping 
>everything exact as long as possible.
>
>For our use, we probably really don't care, but whatever. End result: this 
>should not only speed things up immeasurably, it *might* also make things 
>more precise at a level that we really don't care about :^p
>
>I'm really selling this piece of crap, aren't I?
>
> strtod.c | 15 ++++++---------
> 1 file changed, 6 insertions(+), 9 deletions(-)
>
>diff --git a/strtod.c b/strtod.c
>index fe7319887546..457cbcd9a384 100644
>--- a/strtod.c
>+++ b/strtod.c
>@@ -64,12 +64,9 @@ double strtod_flags(const char *str, const char **ptr, unsigned int flags)
> 		}
> 		if (c >= '0' && c <= '9') {
> 			numbers++;
>-			if (dot) {
>-				decimal /= 10;
>-				val += (c - '0') * decimal;
>-			} else {
>-				val = (val * 10) + (c - '0');
>-			}
>+			val = (val * 10) + (c - '0');
>+			if (dot)
>+				decimal *= 10;
> 			continue;
> 		}
> 		if (c != 'e' && c != 'E')
>@@ -111,9 +108,9 @@ double strtod_flags(const char *str, const char **ptr, unsigned int flags)
> 
> 		while (exponent-- > 0) {
> 			if (esign)
>-				val /= 10;
>+				decimal *= 10;
> 			else
>-				val *= 10;
>+				decimal /= 10;
> 		}
> 	}
> 
>@@ -122,7 +119,7 @@ done:
> 		goto no_conversion;
> 	if (ptr)
> 		*ptr = p-1;
>-	return sign ? -val : val;
>+	return (sign ? -val : val) / decimal;
> 
> no_conversion:
> 	if (ptr)
>-- 
>1.9.0.rc3
>


More information about the subsurface mailing list