Warnings (was: Subsurface libdivecomputer branch merged with upstream..)
bstoeger at mail.tuwien.ac.at
Sun Dec 17 04:38:32 PST 2017
On Sonntag, 17. Dezember 2017 10:03:22 CET Robert Helling wrote:
> > On 16. Dec 2017, at 20:29, Lubomir I. Ivanov <neolit123 at gmail.com> wrote:
> > so yes, "float" is a non-default-sized type, like "short": a
> > variable might fit in a "short", and it's perfectly fine to use the
> > non-default type.
> > while nowadays we don't have much of a size constraint, using the
> > smaller type can be a good indication that we don't actually need
> > the whole 64-bit precision for this variable.
> > the same way one could use a "char" vs an "int".
> > on the other hand, if this causes more trouble than benefit, it's
> > probably best to just convert all the calculations to "double", for
> > consistency, and silence the -Wall warnings.
> from my very limited understanding: RAM is indeed big enough, so there
> is no size constraint coming from there. But what actually is finite
> is CPU cache size. So when the doubled memory footprint makes the
> difference between a calculation fitting in the cache and the CPU
> having to fetch the data from RAM several times, performance will
> differ hugely. Of course I have no idea at all whether this is any
> concern for us (not having done any significant performance tuning
> ever). Plus there are the vector instructions, where the CPU performs
> a number of floating-point operations on different data in parallel:
> there, twice as many operations are done at once for float as for
> double (since twice as many floats fit in a register). Again, no idea
> at all whether the compiler generates this parallelism for us at any
> critical
This is of course all correct, but none of the proposed changes are anywhere
close to a tight loop. In some cases it's only a local variable, which is
not even spilled to RAM.
I tried to look at the assembler output, and the first thing I noticed is
that we don't compile with -O2. Is that on purpose?
Anyway, with -O3, the first case in plannernotes.h produces better code without
casting to float. You could of course get the same result by replacing 1000.0
with 1000.0f, but this seems rather obscure.
I removed two cases concerning binary representation from the patch and made
a PR. Perhaps we can get Linus to have a look at it.