RFC: Statistics in Subsurface
Willem Ferguson
willemferguson at zoology.up.ac.za
Thu May 14 00:21:39 PDT 2020
On 2020/05/13 23:11, Dirk Hohndel via subsurface wrote:
> The video that Pedro linked to seemed to indicate that the first chart
> is most likely to be understood, and that the second one was
> harder to see trends in.
>
> Conflicting with that is the desire to more typically show bar graphs
> sideways, as that makes it easier to deal with many data sets (think
> labeling the columns vs. labeling rows)
>
> So all this is super helpful in figuring out how we should visualize
> these things - but not necessarily all leading to the same answers, as
> I'm not sure how well these line graphs work when turned 90 degrees :-)
>
> /D
>
>
>
>> On May 13, 2020, at 12:53 PM, Hartley Horwitz <hhrwtz at gmail.com
>> <mailto:hhrwtz at gmail.com>> wrote:
>>
>> I've attached 3 graphs showing the statistics summary. Once again I
>> showed them to a work colleague. He found the upper 2 graphs easiest
>> to understand.
>>
>> ...Hartley
>>
>> On Wed, May 13, 2020 at 3:24 PM Dirk Hohndel <dirk at hohndel.org
>> <mailto:dirk at hohndel.org>> wrote:
>>
>> That is excellent input!
>>
>> Your final point is one that I had kinda assumed - most of the
>> "more interesting" data no one but a geek will look into. And to
>> them either box and whiskers (so quartiles) or at least floating
>> box with mean (or your version in the first SAC chart below with
>> the 0 based box with the mean as height and with whiskers for
>> min/max) should make sense. But it also makes sense to look for
>> simpler ways to give access to the same data. Can you give an
>> example for the "line graph with 3 lines for min/mean/max"?
>>
>> Thanks
>>
>> /D
>>
>
>
I must admit that I do not like any of these three representations. They
are inappropriate and inaccurate, leading to misinterpretation.
The top graph is normally used to indicate trends in three *independent*
variables that may or may not be correlated. In the dive case the data
represent a *single* variable with its min and max values.
The middle graph is a histogram that would normally also represent three
*independent* variables that have been sampled on the same x-axis scale.
Again, in the dive case the min and max values represent the *same*
variable.
The bottom graph is normally used to indicate the proportion of a total
that is formed by a specific component. In the case of this specific
graph, the median would be indicated by the height of the orange bar
(i.e. vertical distance between the grey/orange border and the
orange/blue border). The max would be indicated by the height of the
blue part of the graph, etc. Clearly this is not what is meant.
I want to make a call that, if we are dealing with representing
statistics, we actually use the proper statistics representations that
we are all used to. Most likely that is either some variant of a box and
whiskers diagram or a vertical bar chart with error bars. If these
diagrams have been shown once to an uninformed person, the
interpretation will always be easy. Let's use diagrams for what they are
meant to convey and not use a sports car to drive offroad. We do not
want any statistics related to Subsurface to be presented in an
unprofessional and inappropriate way.
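To make concrete what these two diagram types would summarise, here is a
minimal sketch in Python. It is not Subsurface code: the function names and
the SAC values are hypothetical, chosen only for illustration. The first
function gives the five values a box and whiskers diagram draws for one
variable, the second gives the mean and standard deviation that a vertical
bar chart with error bars would show.

```python
import statistics


def five_number_summary(values):
    """Return (min, q1, median, q3, max) for ONE variable.

    These are the values a box and whiskers diagram draws.
    """
    ordered = sorted(values)
    # Quartiles computed with the "inclusive" method (Python 3.8+).
    q1, median, q3 = statistics.quantiles(ordered, n=4, method="inclusive")
    return ordered[0], q1, median, q3, ordered[-1]


def mean_and_sd(values):
    """Return (mean, sample standard deviation): bar height and error bar."""
    return statistics.mean(values), statistics.stdev(values)


# Hypothetical SAC rates (l/min), grouped by year. Illustration only.
sac_by_year = {
    2018: [14.0, 15.0, 16.0, 17.0, 18.0],
    2019: [12.0, 13.0, 15.0, 16.0, 19.0],
}

for year, sac in sac_by_year.items():
    low, q1, median, q3, high = five_number_summary(sac)
    mean, sd = mean_and_sd(sac)
    print(f"{year}: min={low} q1={q1} median={median} q3={q3} max={high}")
    print(f"{year}: mean={mean:.2f} sd={sd:.2f}")
```

In both cases min, max and the central value are derived from the *same*
variable and drawn as one glyph per category, rather than as three separate
series.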
As far as the horizontal graphs are concerned, they have a place, but we
need to understand where they come from, and that is from the old days
when we tried to print graphs on a mainframe line printer that could not
print characters vertically. The conventional way to represent
histograms or bar charts is in the vertical way *unless there is good
reason to do otherwise*. These days there is no problem in printing
labels vertically. To have a horizontal bar graph with depth
measurements along the vertical axis is just totally unorthodox and not
up to modern standards.
Kind regards,
willem
-------------- next part --------------
A non-text attachment was scrubbed...
Name: simple_stats.JPG
Type: image/jpeg
Size: 37710 bytes
Desc: not available
URL: <http://lists.subsurface-divelog.org/pipermail/subsurface/attachments/20200514/bd7048ab/attachment-0001.jpe>