RFC: Statistics in Subsurface

Dirk Hohndel dirk at hohndel.org
Tue May 12 14:51:14 PDT 2020



> On May 12, 2020, at 2:24 PM, Dirk Hohndel via subsurface <subsurface at subsurface-divelog.org> wrote:
>> I am comfortable with your points of view, above. The 10m or 10min increments could easily be configurable. For instance a person with OW certification (dives to 18m only with almost all dives in the 10-18m range) would probably want at most  a 5m increment in depth. Unless I understand you wrongly (again).  Normally with statistical software (like R) the default increment is determined by the (max-min) range of the data as well as the number of data points being plotted. Of course I would not like an increment of 3.674 m of depth as might be the case when increment is automatically calculated by machine. My only point is that a single fixed increment is possibly restrictive and it would help if there were a simple rule to do some adjustment of the increment.
>> 
> 
> It's those details that tend to make something go from "straight forward" to "crazy tricky".
> No, I definitely don't want 3.764m increments, but some people might want 3.04m increments (ten feet). Getting this almost right for most people is easy (ask the user how many data points they would like, and then round so in their unit system the result is marginally pleasant). Getting exactly what people would want is likely about as painful as my initial over-engineered idea.
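
To make the "almost right" version concrete, here is a minimal sketch of the rounding idea: take the depth range, divide by the number of columns the user asked for, and round up to a pleasant step in the user's own unit. The function names and the 1/2/5/10 rule are my assumptions for illustration, not anything that exists in Subsurface today.

```python
import math

FEET_PER_METER = 3.28084


def nice_step(raw_step):
    """Round a positive step up to 1, 2, 5 or 10 times a power of ten."""
    if raw_step <= 0:
        raise ValueError("raw_step must be positive")
    magnitude = 10 ** math.floor(math.log10(raw_step))
    for factor in (1, 2, 5, 10):
        if raw_step <= factor * magnitude:
            return factor * magnitude
    # Only reached if log10 rounding misjudged the magnitude.
    return 10 * magnitude


def depth_increment(min_depth_m, max_depth_m, wanted_bins, imperial=False):
    """Return a bin width in the display unit (feet if imperial, else meters)."""
    scale = FEET_PER_METER if imperial else 1.0
    span = (max_depth_m - min_depth_m) * scale
    return nice_step(span / wanted_bins)
```

With this rule a 0-18m diver asking for four columns gets 5m steps, and the same diver in imperial units asking for six columns gets ten-foot steps. It will never produce something like 3.674m, but it also cannot know that a particular user would have preferred 3m or 4m, which is the part that gets tricky.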

So here's a bad sketch:

First you pick your values and your grouping. Most of them are box/whiskers; the two that have just a simple # per column could be a plain line graph or bar chart.
Then you pick the grouping (i.e. the x-axis).



Once you pick one, there's a way to refine it further if needed.



For example, when grouping by time with a fixed # of columns, we would create something that feels "semi reasonable" and gets us the right number of columns.
You've been diving for ten years and want seven columns, so... groups of 18 months?
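
A rough sketch of how that "semi reasonable" choice could work: divide the total time span by the requested number of columns and round up to the next pleasant bucket length. The list of pleasant lengths and the function name are assumptions made up for this example.

```python
import math

# Hypothetical list of bucket lengths (in months) that read well on an axis.
PLEASANT_MONTHS = (1, 2, 3, 6, 12, 18, 24, 36, 60, 120)


def months_per_column(total_months, wanted_columns):
    """Pick the smallest pleasant bucket length that needs at most wanted_columns."""
    raw = total_months / wanted_columns
    for months in PLEASANT_MONTHS:
        if months >= raw:
            return months
    # Longer than anything in the table: fall back to whole decades.
    return 120 * math.ceil(raw / 120)


# Ten years of diving, seven columns requested.
bucket = months_per_column(120, 7)
columns = math.ceil(120 / bucket)
print(bucket, columns)
```

For the ten-year, seven-column case this picks 18-month buckets and ends up with exactly seven columns. In other cases rounding up can give fewer columns than requested, which seems like the more pleasant failure mode.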




What am I missing? What could be better?

>> As far as specifying categories like tags I like the present UI where one could specify a number of tags to be included in the filter, giving great flexibility. Again my impression of such a plot possibly differs from yours. I like your binary set idea (a set including compared to a set excluding). But I would more realistically often want to compare (e.g. SAC when comparing two tags "air" and "nitrox"), a use case which does not necessarily imply a binary comparison because it could compare 3 or 4 tags. Does this make sense at all?
>> 
> 
> It makes sense for people who are able to use sets of tags in a meaningful way - one could have a mutually exclusive set of three tags (say, air, nitrox, trimix) and create statistics over them. Of course the results become "strange" if dives potentially have multiple of those tags.
> Again, to get this "mostly right" is fairly easy. To cover all the crazy corner cases is what's hard.
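
To illustrate the "mostly right" version and where it turns strange, here is a small sketch that groups SAC values by a chosen set of tags and reports the dives that carry more than one of them. The dive records are plain dictionaries with made-up field names (`number`, `tags`, `sac`); this is not the actual Subsurface data model.

```python
from collections import defaultdict


def group_by_tags(dives, tags):
    """Return ({tag: [sac, ...]}, [numbers of dives matching several tags])."""
    groups = defaultdict(list)
    overlapping = []
    for dive in dives:
        matched = [tag for tag in tags if tag in dive["tags"]]
        for tag in matched:
            # A dive with several of the tags is counted once per tag.
            groups[tag].append(dive["sac"])
        if len(matched) > 1:
            overlapping.append(dive["number"])
    return dict(groups), overlapping


# Made-up sample data, for illustration only.
dives = [
    {"number": 1, "tags": {"air"}, "sac": 15.0},
    {"number": 2, "tags": {"nitrox"}, "sac": 13.5},
    {"number": 3, "tags": {"air", "nitrox"}, "sac": 14.0},
    {"number": 4, "tags": {"trimix", "deep"}, "sac": 16.0},
]
groups, overlapping = group_by_tags(dives, ["air", "nitrox", "trimix"])
```

Dive 3 ends up in both the "air" and the "nitrox" group, so the per-tag statistics no longer partition the dives. Reporting the overlapping dives at least lets the UI warn about it.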


> 
>> Lastly, I do not like candlestick graphs because the application in econometrics does not include the equivalent of a mean value. It is meant to indicate the limits and sometimes direction of change within a specific time period giving rise to the candle forming the central part of the graph. In my opinion a minimal box and whisker approach is more readily interpretable.
>> 
> 
> I keep saying "candlestick" when I mean to say "box and whiskers". My mistake. You are spot on correct, the error is mine.

And I was wrong in my earlier email. As I said above, for total duration and number of dives, you'd want a bar chart or line graph. For the rest I think box and whiskers is fine.


/D
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Screen Shot 2020-05-12 at 2.42.18 PM.png
Type: image/png
Size: 254860 bytes
Desc: not available
URL: <http://lists.subsurface-divelog.org/pipermail/subsurface/attachments/20200512/55cef45a/attachment-0004.png>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Screen Shot 2020-05-12 at 2.42.26 PM.png
Type: image/png
Size: 152671 bytes
Desc: not available
URL: <http://lists.subsurface-divelog.org/pipermail/subsurface/attachments/20200512/55cef45a/attachment-0005.png>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Screen Shot 2020-05-12 at 2.42.31 PM.png
Type: image/png
Size: 165917 bytes
Desc: not available
URL: <http://lists.subsurface-divelog.org/pipermail/subsurface/attachments/20200512/55cef45a/attachment-0006.png>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Screen Shot 2020-05-12 at 2.50.02 PM.png
Type: image/png
Size: 177967 bytes
Desc: not available
URL: <http://lists.subsurface-divelog.org/pipermail/subsurface/attachments/20200512/55cef45a/attachment-0007.png>

