Feedback and testing of the dive statistics

Peter Zaal peter.zaal at gmail.com
Sun Jan 10 07:27:30 PST 2021


Hi all,

 

Warning: long mail!

 

So I finally got around to doing some proper testing of the new statistics module in Subsurface. I followed its development on the side on this mailing list, and it looked so nice that, although I am a developer myself (but not C/C++) and therefore usually an awful tester, I took a good look at it.

 

I think I have a very interesting log of my real dives. Of my 1900+ dives to date, I now have almost 800, from the last 8-9 years, in Subsurface. In the long run I want to enter all my dives, but this takes a lot of time, since I have to add many of them manually from paper because I don’t have them digitally. But I think this is quite a nice set of data, with all kinds of dives: recreational, technical, lots of dive sites, lots of buddies, all year round. Yes, even during corona we can go out: diving is not prohibited here, and I usually make 1 or sometimes 2 dives a week (80 dives in 2020).

 

Of course I had seen some sneak previews of the statistics on this mailing list and knew a little about what was possible. But *before* going through all the options myself, I made a list of the things I would like to know about my dives, and then checked whether the statistics could provide that data.
I will go through this list one by one and describe what I did, whether the statistics module could provide the data, the things I noticed, the things I was missing, etc. At the end I have a summary of all my findings (I hope), but this way you have some context.

 

So, first impression: before starting on my list, I played around a little. The first thing I noticed was that the filter was also active. This is great! I don’t think I saw any screenshots with that, but it makes it so easy to zoom in on specific data in a chart by filtering on that data. E.g. some of my charts had unexpected labels in the legend, because I had made some errors in my data. I had never noticed this before, but now it stands out, and by filtering I could easily find these dives and correct them. I do have an extra suggestion about this, but more on that later. Having the filter available on screen right away is so much easier than having to close the statistics, filter the dives, reopen the statistics, etc. Great!


My first impression was that the statistics module is absolutely amazing. I can’t say enough about the work that has gone into this. Brilliant. Lots and lots of kudos to Berthold and Willem (if I am correct). It is also blazing fast. Even with 800 dives it feels like every selection I make is instantly drawn on the screen.
I am not a statistics person, but I found it quite easy to understand how it works and to get useful data. So keep in mind that my comments come from a statistical noob. Things that seem obvious to me might not actually be correct, statistically speaking.

 

So, now on with the information I tried to get out of Subsurface with the statistics module.

 

- How many buddies do I dive with each year?
Base=Date yearly, Data=Buddies, Chart=stacked
This more or less gave me the information, but each bar is of course composed of all the buddies in that year. The height of the bar is the total number of buddies for that year (what I wanted to know), but I cannot read off the exact value. E.g. in 2013 this number is between 180 and 200, but I don’t know exactly what it is. And yes, I understand this is not the number of dives, because lots of dives were with multiple buddies. For other data this is more relevant, but it would be great to somehow see the total number of dives for a bar.

One thing I noticed (because of the legend, actually) is that the Buddies variable includes not only the people from the Buddy field, but also the Divemaster(s). This is not consistent with the Filter, where this is called ‘people’.

 

- Number of dives per buddy
Base=Buddies, Data=none, Chart=vertical
So yes, this of course gives me the data I wanted. But looking at this, with about 80 different buddies, I realized that what I really wanted to see is who I dive with the most, then second most, etc. The bars are sorted by Buddy name, but it would be extremely helpful if you could sort bar charts by number of dives. A bit like the pie chart, where only the top 5 are shown (+ other).
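
To make the request concrete, here is a minimal Python sketch of sorting by number of dives with an optional "top X plus Other" cut-off. It is not Subsurface code: the buddy names, the data and the top_buddies helper are all made up for illustration.

```python
from collections import Counter

# Hypothetical dive log: each entry lists the buddies on one dive.
dives = [
    ["Anna", "Bob"],
    ["Anna"],
    ["Carl", "Anna"],
    ["Bob"],
    ["Dave"],
    ["Eve", "Bob"],
]

def top_buddies(dives, top_n):
    """Return (name, dive count) pairs sorted by count, largest first.

    Buddies beyond the first top_n are merged into a single "Other" entry.
    """
    counts = Counter(buddy for dive in dives for buddy in dive)
    # Sort by descending count, then by name so ties have a stable order.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    head, tail = ranked[:top_n], ranked[top_n:]
    if tail:
        head.append(("Other", sum(count for _, count in tail)))
    return head

result = top_buddies(dives, top_n=2)
for name, count in result:
    # Print a crude text bar chart, one row per buddy.
    print(f"{name:6} {'#' * count} {count}")
```

The same ranking would work for any Y-axis value (e.g. summed duration instead of a dive count).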

 

- Dive time per buddy (overall)
Base=Buddies, Data=Duration Sum
The data is provided, but same request: it would be so handy to have this sorted by Duration.

 

- How many dives did I do in each country I dived?
Not possible, the Country is not available as a variable

 

- How many dives per max. depth
Base=Max. depth in 5m steps, Data=none

Easy one

 

- Number of dives per max. depth, but now over time/period
Base=Date yearly, Data=Max.Depth
I was looking for a simple chart with a bar for each year with the total number of dives. What I got was much more than this. The box-whisker gave me all the information, and much more.
What I came to realize is that with this query (and others to come), I was actually looking for a way to select a Max operation (and in other cases sometimes Min), so that I could see the max of max. depth per year or whatever period.
I can select Mean, Median and Sum, but why not Min and Max? This seems very easy to provide: the same as mean, median and sum, just with different math.
Then I looked at the data in the infobox, and it showed Min, Q1, Mean, Q3 and Max. At first I thought Q1 and Q3 meant the 1st and 3rd quarter of the year, which made no sense to me at all: where are Q2 and Q4? But it also didn’t change when selecting Quarterly or Monthly. Then I figured this is probably some statistics thing (I told you I am a noob in statistics), and indeed it is: they are the first and third quartiles. I do understand a little about the difference between mean and median, but are Q1 and Q3 something ‘normal’ people are interested in? Is this really useful information? I have never seen it anywhere else, and to me it just clutters the infobox.

What I did miss in the infobox is the number of dives it covers. If I select an operation like Mean, it does show an infobox with the Count. I would really like to see the Count in the infobox on box-whiskers as well. And btw, vice versa: when I select an operation, I would like to see the Min and Max in the infobox on a bar.
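
For context on Q1 and Q3: they are the first and third quartiles, i.e. the values below which roughly 25% and 75% of the dives in a bin fall. The Python sketch below computes them together with the Count, Min and Max asked for above. It is not Subsurface code: the depths are made-up values, box_summary is a hypothetical helper, and the "inclusive" interpolation method is an assumption, since the method Subsurface uses is not stated here.

```python
import statistics

# Hypothetical maximum depths (in metres) of the dives in one year.
max_depths = [10, 20, 30, 40, 50]

def box_summary(values):
    """Return the numbers a box-whisker plot is drawn from, plus the count."""
    # quantiles(n=4) returns the three cut points Q1, median (Q2) and Q3.
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "count": len(values),
        "min": min(values),
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": max(values),
    }

summary = box_summary(max_depths)
print(summary)
```

Half of the dives lie between Q1 and Q3, which is what the box in a box-whisker plot spans.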

 

- SAC vs suit
Base=Suit type, Data=SAC
Yes, this provides the information, but again: I miss a way to see the exact totals per bar.

 

- Total (and average) dive time over time (yes, I say average instead of mean 😉)
Base=Date yearly, Data=Duration Sum/Mean
But again: the min and max numbers are missing in the infobox of a bar.
Fun fact: my average dive time has been quite steady over the years at around 70 minutes per dive. Even though I have made much longer dives in the last couple of years (2-4 hours), these are obviously just a small number of dives that don’t have much effect on the overall average.

 

- Number of dives over time
Base=Date, Data=none
Easy one

 

- Dive time on oc and cc
Base=Dive mode, Data=Duration sum
Easy one. Sometimes, in cases like this, it would be nicer to see the duration in hours instead of minutes.

 

- Number of dives per location
Base=Dive site, Data=none, Chart=vertical
At first I was confused because I didn’t find the Location variable, but then I found out it is called Dive site.
With hundreds of dive sites this chart is a bit… cluttered. As with an earlier query, it would be so nice if this could be sorted by number of dives. And even better if you could select an ‘Only top X’ option (like the pie chart, which only shows the top 5).
I also noticed that, contrary to other variables, the bars are not sorted by Dive site name, but in some seemingly random order.

 

- SAC vs depth
Base=SAC, Data=Max.Depth
The whiskers show the information, but again, I am missing the Count in the infobox (or number of dives and %).

 

- Number of dives in each water type (fresh, brackish, salt)
Not possible, water type is not a variable you can select

 

- Number of dives vs temperature, over time
Base=Date monthly, Data=Water temperature
What I was expecting to see was the temperature rising and falling over the months. But this is of course the *minimum* water temperature, and (again) unfortunately it is not possible to select a Min (or Max) operation. The box-whisker does provide this (but without the Count), but I was looking for a simple bar chart.
Also, a binning of 20 degrees for temperature seems a bit much and not very useful.
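
As an illustration of why Min and Max look like natural additions next to Mean, Median and Sum, here is a minimal Python sketch in which the operation is just a function applied to each bin. It is not Subsurface code: the temperatures are made up and aggregate_per_bin is a hypothetical helper.

```python
from collections import defaultdict

# Hypothetical (month, minimum water temperature in °C) pairs, one per dive.
dives = [
    ("2020-01", 4.0),
    ("2020-01", 6.0),
    ("2020-07", 19.0),
    ("2020-07", 22.0),
    ("2020-07", 18.0),
]

def aggregate_per_bin(dives, operation):
    """Group values by bin label and reduce each group with `operation`."""
    groups = defaultdict(list)
    for label, value in dives:
        groups[label].append(value)
    # Sort by label so the bars come out in chronological order.
    return {label: operation(values) for label, values in sorted(groups.items())}

coldest = aggregate_per_bin(dives, min)   # Min operation
warmest = aggregate_per_bin(dives, max)   # Max operation
count = aggregate_per_bin(dives, len)     # Count, for the infobox
print(coldest, warmest, count)
```

Each resulting dictionary maps directly onto a simple bar chart with one bar per month.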

 

- Dive time vs. Temperature
Base=Water temperature, Data=Duration
Same: the box-whiskers provide the information, but without the Count. And among the operations, I am missing a Max operation to create a simple bar chart of my maximum dive time per temperature.

 

- Number of cave and tech dives
Not possible, since there is no Tags variable.
I have all my cave and tech dives marked with C1, C2, T1 and/or T2 tags. So what I wanted to do is filter all dives with these tags, and then create statistics on these dives. So that I can easily see how many dives I made of these types e.g. every year.
So it would be great to add Tags as a variable. Probably more fun statistics can be made with that, depending on what you use the tags for.

 

So this concludes my testing. I did find some things that are bugs imho, but it did not crash a single time and overall I am extremely impressed by the statistics!!

 

Summary

 

Feature requests

- Sometimes, looking at the charts, you want to zoom in on some piece. It would be extremely nice if, when a part of a chart is selected (pointed at), e.g. one piece of a bar, you could right-click and select ‘Filter on this’ or something similar. These dives would then automatically be selected. This would of course change the statistics again, but it would be very useful for finding a specific set of dives.

- The dive list on the lower left is disabled. The scrollbars are also disabled, which is not very handy. Even better, it would be great if one could just double-click a dive to go to the edit mode of that dive!
- On stacked charts, provide a way to see the total number of dives per bin. This is really useful information.
- Separate the Buddies variable in ‘real’ Buddies and People (Buddies plus Divemaster(s)). For me, a divemaster is not a ‘real’ buddy and s/he should not count in my statistics. I think most people will see it that way.
Also make the naming consistent with the Filter.
- Provide a way (checkbox?) to sort by No. of dives/Duration/etc. (Y-axis value) to make it easier to find the most interesting data.
- Provide a way to only show the Top X bars; very useful if you have a lot of data/bars

- Add Min and Max to the Operations, this can provide a lot of easy/simple charts with useful insights
- Add Count to the infobox on box-whisker
- Add Min and Max to the infobox on bars when an operation is selected.
- Add Country variable
- Add Water type variable
- Add Tags variable
- Remove the Q1 and Q3 information, I don’t think this is something the users are interested in or have knowledge about.
- Consistently use ‘Dive site’ and not Location. So maybe in the Notes tab, it should be renamed?
- Binning for Water temperature: add 1 degree, remove 20 degrees

 

Bugs
- Binning for Max. Depth has a double ‘in 10 m steps’
- Binning for Dive # has no unit, I think it should be ‘dives’?, e.g. ‘in 5 dives steps’
- Dive sites are not sorted by name on the chart
- When switching between chart types, or hovering over parts that have an extremely large infobox, sometimes parts of the old chart stay visible
- When the dive statistics view is visible and you then select View -> All, the Info view is not shown; instead, the dive statistics view is shown (compressed) in the upper-left

 

Some observations / discussion points:
- The yellow warning icon on some charts is strange. I did read it means ‘it’s not the best chart’, but actually sometimes it provides useful information. I feel the warning icon should not be shown.
- For the chart types there is a grouping of Histogram and Categorical. As said before, I don’t know much about statistics, and I am not really interested in this grouping, I just want to select a type.
What is more, in the beginning I could not see any difference between the Vertical / Horizontal / Box-whisker charts of the Histogram group and those of the Categorical group. When switching between the two, the chart just shifted a bit, but nothing else changed. But then suddenly I had a chart where there was a difference between the two, and it seems this is because there was no data for certain periods. I think that makes sense? Histogram always shows all bins, including bins that have no data, whereas Categorical only shows bars with data (no ‘empty’ bars). If that’s the case, I would rather have 1 type and an extra option ‘show empty data’. But this is just my simple view of course 😉
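
Assuming that reading of the two groups is right (it is a guess from the behaviour described above, not something taken from the Subsurface source), the difference can be sketched in Python as follows. The years and both helper functions are made up for illustration.

```python
from collections import Counter

# Hypothetical dive years; note that there are no dives in 2015.
dive_years = [2013, 2013, 2014, 2016, 2016, 2016]

def categorical_bins(years):
    """One bar per year that actually has dives."""
    return sorted(Counter(years).items())

def histogram_bins(years):
    """One bar per year in the full range, including years without dives."""
    counts = Counter(years)
    # A Counter returns 0 for missing keys, which gives the empty bars.
    return [(year, counts[year]) for year in range(min(years), max(years) + 1)]

categorical = categorical_bins(dive_years)
histogram = histogram_bins(dive_years)
print(categorical)
print(histogram)
```

With data that has no gaps the two lists are identical, which would explain why the charts often look the same.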

 

 

Kind regards,

 

Peter

 

 
