Warnings (was: Subsurface libdivecomputer branch merged with upstream..)

Lubomir I. Ivanov neolit123 at gmail.com
Sun Dec 17 05:51:02 PST 2017


On 17 December 2017 at 15:45, Lubomir I. Ivanov <neolit123 at gmail.com> wrote:
> On 17 December 2017 at 14:38, Berthold Stoeger
> <bstoeger at mail.tuwien.ac.at> wrote:
>> Hi Robert,
>>
>> On Sonntag, 17. Dezember 2017 10:03:22 CET Robert Helling wrote:
>>> Hi,
>>>
>>> > On 16. Dec 2017, at 20:29, Lubomir I. Ivanov <neolit123 at gmail.com> wrote:
>>> >
>>> > so yes, "float" is like a non-default-sized type, e.g. "short": one's
>>> > variable might fit in a "short", and it's perfectly fine to use the
>>> > non-default type.
>>> > while nowadays we don't have much of a size constraint pushing us
>>> > towards the smaller type, using it can be a good indication that we
>>> > don't actually need the whole 64-bit precision for this variable, the
>>> > same way one could use a "char" vs. an "int".
>>> >
>>> > on the other hand, if this causes more trouble than benefit, it's
>>> > probably best to just convert all the calculations to "double", have
>>> > consistency and silence the -Wall warnings.
>>>
>>> from my very limited understanding: RAM is indeed big enough, so there is
>>> no size constraint coming from there. But what is actually finite is the
>>> CPU cache. So when the doubling in memory footprint makes the difference
>>> between a calculation fitting in the cache and the CPU having to fetch the
>>> data from RAM several times, performance will differ hugely. Of course I
>>> have no idea at all whether this is any concern of ours (not having done
>>> any significant performance tuning ever). Plus there are these vector
>>> instructions where the CPU can do a number of floating point operations on
>>> different data in parallel: there the number of things done at once is
>>> twice as high for float as for double (since twice as many floats fit in a
>>> vector register). Again, no idea at all whether the compiler generates
>>> this parallelism for us at any critical point.
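
A hypothetical example of the vectorization point (the function is made
up): built with e.g. -O3 -march=native, GCC can vectorize this loop, and
since a 256-bit AVX register holds 8 floats but only 4 doubles, the float
version handles roughly twice as many elements per vector instruction:

    void scale_samples(float *samples, int n, float factor)
    {
        for (int i = 0; i < n; i++)
            samples[i] *= factor;
    }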
>>
>> This is of course all correct, but none of the proposed changes are anywhere
>> close to a tight loop. In some cases it's only a local variable, which is
>> not even spilled to RAM.
>>
>> I tried to look at the assembler output, and the first thing I noticed is
>> that we don't compile with -O2? Is that on purpose?
>>
>> Anyway, with -O3, the first case in plannernotes.h produces better code without
>> casting to float. You could of course get the same result by replacing 1000.0
>> with 1000.0f, but this seems rather obscure.
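
For reference, a sketch of what the suffix changes (made-up function
names, not the actual plannernotes code): with the double constant the
float operand takes a float -> double -> float round trip, while the 'f'
suffix keeps the whole expression in single precision:

    float scale_dbl(float mm)
    {
        return mm / 1000.0;   /* promote to double, double divide,
                                 truncate back to float on return */
    }

    float scale_flt(float mm)
    {
        return mm / 1000.0f;  /* one single-precision divide */
    }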
>>
>> I removed two cases concerning binary representation from the patch and made a
>> PR. Perhaps we get Linus to have a look at it.
>>
>
> we intentionally don't hand-optimize the C code, so that it stays more
> readable. there are opportunities in many places to help the compiler
> out, but we just don't take them, for the sake of readability.
>
> at the same time, i don't have a good answer about the -Oxx flags -
> e.g. why we don't use them. i have a vague memory that this was
> discussed back in the GTK days.
> from my experience the -Oxx flags can definitely improve areas with
> heavy computation. the only problem i see is that one should probably
> enable -Oxx for both release and debug builds, to catch possible
> compiler bugs too.
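
For illustration, the kind of thing meant here (generic gcc/clang
invocations, not taken from the Subsurface build system):

    gcc -O0 -g -Wall -c planner.c   # typical debug build: no optimizer
    gcc -O2 -g -Wall -c planner.c   # proposed: -O2 in debug too, so
                                    # optimizer-related bugs show up
                                    # during development as well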
>

if we stick to GCC, we can also use per-function optimizations:
https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html#Common-Function-Attributes

see:
- hot
- optimize

clang partially supports this attribute namespace as well.
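
A minimal sketch of how that looks (the function is made up; the
attributes are the GCC ones documented at the link above):

    /* hint that this is a hot spot and force -O3 for just this
       function, even in an otherwise unoptimized build */
    __attribute__((hot))
    __attribute__((optimize("O3")))
    static double sum_pressures(const double *p, int n)
    {
        double sum = 0.0;
        for (int i = 0; i < n; i++)
            sum += p[i];
        return sum;
    }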

lubomir
--