RFC: Statistics in Subsurface

Rick Walsh rickmwalsh at gmail.com
Thu May 14 02:56:31 PDT 2020


I think it's great that statistics are being tackled again, and it's
excellent to have several mock-ups to discuss, so that hopefully some
consensus can be reached before people spend countless hours on code that
ends up rejected because the presentation isn't as desired.

On Thu, 14 May 2020, 17:24 Willem Ferguson via subsurface, <
subsurface at subsurface-divelog.org> wrote:

> I must admit that I do not like any of these three representations. They
> are inappropriate and inaccurate, leading to misinterpretation.
>
> The top graph is normally used to indicate trends in three *independent*
> variables that may or may not be correlated. In the dive case the data
> represent a *single* variable with its min and max values.
>
I would agree if the graph were plotting data from each dive. But as shown,
it is a plot of yearly min, max and median, so I don't think it invites
misinterpretation.
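
To make concrete what "yearly min, max and median" means here, a minimal
sketch in Python follows. It assumes a hypothetical list of (year, depth)
pairs; the function name and the sample values are made up for illustration
and are not taken from Subsurface's code or from the mock-ups.

```python
from collections import defaultdict
from statistics import median


def yearly_summary(dives):
    """Return {year: (min, median, max)} for an iterable of (year, depth_m) pairs."""
    by_year = defaultdict(list)
    for year, depth_m in dives:
        by_year[year].append(depth_m)
    # One summary triple per year: the three lines of the top graph.
    return {
        year: (min(depths), median(depths), max(depths))
        for year, depths in sorted(by_year.items())
    }


# Made-up sample data, for illustration only.
sample_dives = [(2018, 12.0), (2018, 30.0), (2018, 18.0), (2019, 25.0), (2019, 40.0)]
summary = yearly_summary(sample_dives)
```

Each year collapses to one (min, median, max) triple, so the plotted points
are per-year summaries of a single variable rather than per-dive samples.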

> The middle graph is a histogram that would normally also represent three
> *independent* variables that have been sampled on the same x-axis scale.
> Again, in the dive case the min and max values represent the *same*
> variable.
>
> The bottom graph is normally used to indicate the proportion of a total
> that is formed by a specific component. In the case of this specific graph,
> the median would be indicated by the height of the orange bar (i.e.
> vertical distance between the grey-orange border and the orange/blue
> border). The max would be indicated by the height of the blue part of the
> graph, etc. Clearly this is not what is meant.
>
I agree completely.

> I want to make a call that, if we are dealing with representing
> statistics, we actually use the proper statistics representations that we
> are all used to. Most likely that is either some variant of a box and
> whiskers diagram or a vertical bar chart with error bars. If these diagrams
> have been shown once to an uninformed person, the interpretation will
> always be easy. Let's use diagrams for what they are meant to convey and not
> use a sports car to drive offroad. We do not want any statistics related to
> Subsurface to be presented in an unprofessional and inappropriate way.
>
Yes, graphs should be used appropriately.

> As far as the horizontal graphs are concerned, they have a place, but we
> need to understand where they come from, and that is from the old days when
> we tried to print graphs on a mainframe line printer that could not print
> characters vertically. The conventional way to represent histograms or bar
> charts is in the vertical way *unless there is good reason to do
> otherwise*. These days there is no problem in printing labels vertically.
> To have a horizontal bar graph with depth measurements along the vertical
> axis is just totally unorthodox and not up to modern standards.
>
> Kind regards,
>
> willem