[RFC PATCH] gas model: use virial cubic polynomial form
Dirk Hohndel
dirk at hohndel.org
Sun Mar 13 08:35:45 PDT 2016
Here's another one with no commit message and no SOB :-/
(cleaning up my inbox, now that Subsurface-mobile has been released)
/D
On Fri, Mar 04, 2016 at 11:56:31AM -0800, Linus Torvalds wrote:
> On Fri, Mar 4, 2016 at 1:04 AM, Robert Helling <helling at atdotde.de> wrote:
> >
> > But if you really wanted to tweet things, there are two things that could be
> > factored out: Don’t divide o2 and he by 1000.0, rather do that only once in
> > the end.
>
> Yes. And make it a "multiply by 0.001" instead - that not only avoids
> the divide, it makes it possible for the compiler to do more madd
> instructions (not that we'll see those on x86 any time soon due to
> compatibility issues, even though they are starting to exist)
>
> > And instead of doing the weighted average of the Z factors, you
> > could equivalently do the weighted average of the coefficients of the
> > polynomial. That you would need to do only once per gas and save those
> > values as we compute Z for many pressures for the same gas.
>
> The testing overhead for "is it the same gas" would actually overwhelm
> the calculations. Floating point isn't really *that* expensive in the
> end.
>
> Now, the whole thing is actually amenable to be written in vector
> form, but that's just too much.
>
> Anyway, having done a release build (in order to get reasonable
> optimizations), the code generated is actually quite good.. It ends up
> being just 52 instructions, of which 41 are floating point
> instructions (multiply, add, convert from integer, load fp constant).
> The remaining 11 instructions are just the trivial "load o2 and
> helium, check for air, subtract 1000-o2-he".
>
> It would be even better if we had multiply-add instructions, but it's
> really not bad at all.
>
> Even vectorizing it might actually just make things worse: turning on
> the vector units can be much more expensive than the couple of
> instructions we'd win.
>
> I'm attaching the trivial patch to get to that state here, but I'm not
> even sure it's worth it: with optimizations, gcc actually turns our
> *current* code into fine code.
>
> Yeah, without this patch there's a couple of "divsd" instructions, but
> they don't dominate, so it's actually likely that they schedule fine.
> The latency of a divsd is only 3-4 times the latency of a multiply on
> modern cores - we're talking fewer cycles than a cache miss.
>
> Linus
> subsurface-core/gas-model.c | 13 ++++++++-----
> 1 file changed, 8 insertions(+), 5 deletions(-)
>
> diff --git a/subsurface-core/gas-model.c b/subsurface-core/gas-model.c
> index 81765e003002..4612d99c86d8 100644
> --- a/subsurface-core/gas-model.c
> +++ b/subsurface-core/gas-model.c
> @@ -40,22 +40,25 @@ double gas_compressibility_factor(struct gasmix *gas, double bar)
> -0.00000004077670019935,
> +0.00000000000077707035
> };
> - double o2, he;
> + int o2, he;
> double x1, x2, x3;
> double Z;
>
> - o2 = get_o2(gas) / 1000.0;
> - he = get_he(gas) / 1000.0;
> + o2 = get_o2(gas);
> + he = get_he(gas);
>
> x1 = bar; x2 = x1*x1; x3 = x2*x1;
>
> Z = virial_m1(o2_coefficients, x1, x2, x3) * o2 +
> virial_m1(he_coefficients, x1, x2, x3) * he +
> - virial_m1(n2_coefficients, x1, x2, x3) * (1.0 - o2 - he);
> + virial_m1(n2_coefficients, x1, x2, x3) * (1000 - o2 - he);
>
> /*
> * We add the 1.0 at the very end - the linear mixing of the
> * three 1.0 terms is still 1.0 regardless of the gas mix.
> + *
> + * The "*0.001" is because we multiplied in the gas fractions
> + * as whole permille.
> */
> - return Z + 1.0;
> + return Z * 0.001 + 1.0;
> }
More information about the subsurface
mailing list