[RFC PATCH] gas model: use virial cubic polynomial form

Dirk Hohndel dirk at hohndel.org
Sun Mar 13 08:35:45 PDT 2016


Here's another one with no commit message and no SOB :-/

(cleaning up my inbox, now that Subsurface-mobile has been released)

/D

On Fri, Mar 04, 2016 at 11:56:31AM -0800, Linus Torvalds wrote:
> On Fri, Mar 4, 2016 at 1:04 AM, Robert Helling <helling at atdotde.de> wrote:
> >
> > But if you really wanted to tweet things, there are two things that could be
> > factored out: Don’t divide o2 and he by 1000.0, rather do that only once in
> > the end.
> 
> Yes. And make it a "multiply by 0.001" instead - that not only avoids
> the divide, it makes it possible for the compiler to do more madd
> instructions (not that we'll see those on x86 any time soon due to
> compatibility issues, even though they are starting to exist)
> 
> > And instead of doing the weighted average of the Z factors, you
> > could equivalently do the weighted average of the coefficients of the
> > polynomial. That you would need to do only once per gas and save those
> > values as we compute Z for many pressures for the same gas.
> 
> The testing overhead for "is it the same gas" would actually overwhelm
> the calculations. Floating point isn't really *that* expensive in the
> end.
> 
> Now, the whole thing is actually amenable to be written in vector
> form, but that's just too much.
> 
> Anyway, having done a release build (in order to get reasonable
> optimizations), the code generated is actually quite good.. It ends up
> being just 52 instructions, of which 41 are floating point
> instructions (multiply, add, convert from integer, load fp constant).
> The remaining 11 instructions are just the trivial "load o2 and
> helium, check for air, subtract 1000-o2-he".
> 
> It would be even better if we had multiply-add instructions, but it's
> really not bad at all.
> 
> Even vectorizing it might actually just make things worse: turning on
> the vector units can be much more expensive than the couple of
> instructions we'd win.
> 
> I'm attaching the trivial patch to get to that state here, but I'm not
> even sure it's worth it: with optimizations, gcc actually turns our
> *current* code into fine code.
> 
> Yeah, without this patch there's a couple of "divsd" instructions, but
> they don't dominate, so it's actually likely that they schedule fine.
> The latency of a divsd is only 3-4 times the latency of a multiply on
> modern cores - we're talking fewer cycles than a cache miss.
> 
>                 Linus

>  subsurface-core/gas-model.c | 13 ++++++++-----
>  1 file changed, 8 insertions(+), 5 deletions(-)
> 
> diff --git a/subsurface-core/gas-model.c b/subsurface-core/gas-model.c
> index 81765e003002..4612d99c86d8 100644
> --- a/subsurface-core/gas-model.c
> +++ b/subsurface-core/gas-model.c
> @@ -40,22 +40,25 @@ double gas_compressibility_factor(struct gasmix *gas, double bar)
>  		-0.00000004077670019935,
>  		+0.00000000000077707035
>  	};
> -	double o2, he;
> +	int o2, he;
>  	double x1, x2, x3;
>  	double Z;
>  
> -	o2 = get_o2(gas) / 1000.0;
> -	he = get_he(gas) / 1000.0;
> +	o2 = get_o2(gas);
> +	he = get_he(gas);
>  
>  	x1 = bar; x2 = x1*x1; x3 = x2*x1;
>  
>  	Z = virial_m1(o2_coefficients, x1, x2, x3) * o2 +
>  	    virial_m1(he_coefficients, x1, x2, x3) * he +
> -	    virial_m1(n2_coefficients, x1, x2, x3) * (1.0 - o2 - he);
> +	    virial_m1(n2_coefficients, x1, x2, x3) * (1000 - o2 - he);
>  
>  	/*
>  	 * We add the 1.0 at the very end - the linear mixing of the
>  	 * three 1.0 terms is still 1.0 regardless of the gas mix.
> +	 *
> +	 * The "*0.001" is because we multiplied in the gas fractions
> +	 * as whole permille.
>  	 */
> -	return Z + 1.0;
> +	return Z * 0.001 + 1.0;
>  }



More information about the subsurface mailing list