CPU hogging in the current master

Thu Sep 26 01:30:43 UTC 2013

On 26 September 2013 10:24, Robert Helling <helling at atdotde.de> wrote:
>
> On 26.09.2013, at 05:42, Linus Torvalds <torvalds at linux-foundation.org> wrote:
>
> Linus,
>
>> a) the one-liner patch to profile.c just says "we don't bother
>>    calculating TTS at 10s granularity, just do one-minute one"
>>
>> b) stop doing the (very expensive) pow() calculation every time.
>>
>> The two are independent, and it could be two separate commits, but I think
>> both of these fall under the issue of "don't waste CPU time", so here it
>> is as one patch.
>>
>> I think (a) is pretty obvious, no need to expound on that any more.
>>
>> But (b) is slightly more complicated. It re-organizes the saturation
>> calculations to be in my opinion clearer: we used to have the "one second"
>> case completely separate from the "generic interval" case, and this undoes
>> that.
>>
>> It *does* keep the special static cache for the one-second buehlmann
>> factors, and expands that with a *dynamic* cache for each tissue index
>> that contains the previous value of the buehlmann factor for a particular
>> duration.
>>
>> The point is, usually we end up using some fixed duration, so the cache
>> hit ratio is quite high. And doing a memory load from a cache is *much*
>> faster than calculating exponentials.
>>
>> Somebody should double-check that I didn't do anything bad when I
>> re-organized the math, but quite frankly, I think my code is easier to
>> read than the old code. Not that that protects us from typos or thinkos.
>
> I take this as a prompt to comment. I think both the idea and the code are right. I also agree that the case for treating one second specially is probably gone. I did some quick testing comparing the responsiveness of your version with one where I commented out the test for the special one-second case. Of course the latter should be slightly faster when the interval is not one second, while it is slower in the one-second case (as the if condition and the lookup are more complicated). I couldn't notice any difference in snappiness when selecting the trimix dives from the test cases (the case that started this thread), while in the planner, when planning a five-hour dive, the continuous update seems to be a bit faster with the special 1s case. But that difference is so small that I would sacrifice it in the interest of simpler logic in the code.
>
> Since we now compute the TTS only in multiples of minutes, I think there is still a case for not computing it for every sample but only once or twice per minute of bottom time. I would not bother too much, though, since at least for my taste and my recently bought iMac it is already fast enough.
>

i have tested both patches (from linus and robert) and the performance
is similar. it still takes nearly a second to simply navigate
through the test case dives with the arrow keys. to confirm again that
calculate_deco_information() is the cause, i have put a "return;" in
the function, making it a NOP, which makes the UI response snappier. if
i also go in and disable the marble globe update, for example, it's even
better, but not nearly as significantly, side by side, as with the
TTS NOP.
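
as a complement to stubbing the function out, the time spent in a
suspect function can also be measured directly. below is a minimal
sketch using clock() from the C standard library. expensive_work() is a
hypothetical stand-in, not calculate_deco_information(), and
cpu_seconds() is a helper invented for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical stand-in for the function under suspicion. */
static double expensive_work(int n)
{
	volatile double acc = 0.0;	/* volatile keeps the loop from being optimized away */
	for (int i = 1; i <= n; i++)
		acc += 1.0 / i;
	return acc;
}

/* Returns the CPU seconds spent in `calls` invocations of fn(arg). */
static double cpu_seconds(double (*fn)(int), int arg, int calls)
{
	assert(fn != NULL && calls > 0);

	clock_t start = clock();
	for (int i = 0; i < calls; i++)
		fn(arg);
	return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

calling cpu_seconds(expensive_work, 100000, 10) gives a number that can
be compared before and after a patch, instead of judging snappiness by
eye.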

lubomir
--