RFC: Statistics in Subsurface (Willem Ferguson)

Hartley Horwitz hhrwtz at gmail.com
Thu May 14 13:09:27 PDT 2020


> ---------- Forwarded message ----------
> From: Willem Ferguson <willemferguson at zoology.up.ac.za>
> To: Subsurface Mailing List <subsurface at subsurface-divelog.org>
> Cc:
> Bcc:
> Date: Thu, 14 May 2020 21:04:14 +0200
> Subject: Fwd: Re: RFC: Statistics in Subsurface
>
> ….snip....
> I attach a suggestion that plots the raw data points and shows the
> mean value for each dataset (red bar). This is much more usable than a
> mere report of min, mean, max. For
> instance, for the wetsuit dataset, the bottom two points are probably
> outliers (possibly erroneous cylinder pressures or cylinder type entered
> into the dive log?) and one might consider not using these to interpret
> the data. For wetsuit, it appears that SAC mostly varies between 13 and
> 21, and that the min and max values indicated are not necessarily so
> useful. For the semidry suit data, the data points are much more
> cohesive and the min and max values plotted are possibly more useful. It
> is up to the person looking at the graph whether to use the min and max
> as plotted or to interpret the data some other way. This would provide
> a good impression of the distribution of the SAC data for each suit type
> and still provide mean, max and min values. And I think most persons
> should be able to interpret the diagram easily?
>

I know this plot as a 'beeswarm'.  I'm not sure if that's a universally
used term, but I use them in Spotfire (a large statistics tool).  I
thought I should show the same type of plot with a bit more data,
because that changes the view somewhat.  So here's a collection of data with
more than 100 data points, categorized 7 different ways.  I've cut out the
labels because this isn't data extracted from anything remotely associated
with diving.
[image: image.png]

Is this clear to people?  I love beeswarm for my job. I do find these tend
to balloon outwards (in width) as the data population increases.
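
For anyone who hasn't met the term, here is a rough sketch of why the width
grows.  It is only my own minimal illustration in Python, not how Spotfire
or Subsurface lays out points: the function names, the fixed-height binning
and the `diameter` parameter are all assumptions made for the example.
Points that fall into the same vertical bin are pushed sideways so they
don't overlap, so the fullest bin decides how wide the swarm gets.

```python
from collections import defaultdict


def beeswarm_offsets(values, diameter=1.0):
    """Return one horizontal offset per value.

    Hypothetical helper: values are grouped into vertical bins of height
    `diameter`, and points sharing a bin are placed side by side around
    the centre line instead of on top of each other.
    """
    counts = defaultdict(int)  # number of points already placed in each bin
    offsets = []
    for v in values:
        bin_index = int(v // diameter)
        k = counts[bin_index]
        counts[bin_index] += 1
        if k == 0:
            offsets.append(0.0)  # first point in a bin sits on the centre line
        else:
            # Later points alternate right/left: +1, -1, +2, -2, ...
            step = (k + 1) // 2
            sign = 1 if k % 2 == 1 else -1
            offsets.append(sign * step * diameter)
    return offsets


def swarm_width(offsets, diameter=1.0):
    """Total horizontal extent of the swarm, including one point diameter."""
    return max(offsets) - min(offsets) + diameter
```

With three points in the same bin the swarm is three diameters wide; with
nine it is nine wide.  A real implementation would pack more tightly, but
the same effect applies: the more points share a value range, the wider
that part of the plot becomes.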

I'm not sure beeswarm is so clear to the average diver, but maybe I
underestimate the userbase.

...Hartley

-------------- next part --------------
A non-text attachment was scrubbed...
Name: image.png
Type: image/png
Size: 18760 bytes
Desc: not available
URL: <http://lists.subsurface-divelog.org/pipermail/subsurface/attachments/20200514/9ee19097/attachment-0001.png>

