Accessing historical versions of the cloud copy of the dive log

Linus Torvalds torvalds at linux-foundation.org
Tue Jun 18 10:25:11 PDT 2019


On Mon, Jun 17, 2019 at 1:20 PM Willem Ferguson
<willemferguson at zoology.up.ac.za> wrote:
>
>
> One of the (quite a few) things that differentiates Subsurface from
> other equivalent software is the usage of a git repository as a dive
> log. It is not just another data format, comparable to XML or CSV. git
> is a whole infrastructure that strengthens dive log security immensely.

We actually don't even take full advantage of the git format.

The layout of the git repo means that it's supposed to be easy to
parse incrementally and also update incrementally.

We actually do the latter: when we save changes, we only need to write
the git objects that represent *changed* dives, and that makes our
dive log saving *much* faster when you have lots and lots of dives.
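
To make the "only write what changed" point concrete, here is a minimal sketch of a content-addressed save. The blob id computation is how git names blob objects; everything else (`save_dives`, the in-memory `store`, the made-up dive text) is hypothetical and is not the actual subsurface code or file format. A real save also writes tree and commit objects, which this ignores.

```python
import hashlib

def git_blob_id(content: bytes) -> str:
    """Object id git gives a blob: SHA-1 over a short header plus the content."""
    header = b"blob " + str(len(content)).encode() + b"\0"
    return hashlib.sha1(header + content).hexdigest()

def save_dives(store: dict, dives: dict) -> int:
    """Put each dive into a content-addressed store; return how many objects were new."""
    written = 0
    for name, text in dives.items():
        data = text.encode()
        oid = git_blob_id(data)
        if oid not in store:  # an unchanged dive hashes to an object we already have
            store[oid] = data
            written += 1
    return written

# Hypothetical dive contents, purely for illustration.
store = {}
dives = {f"dive-{n}": f"dive {n}\nduration 40:00 min\n" for n in range(1000)}

first = save_dives(store, dives)    # initial save: every dive is a new object
dives["dive-7"] = "dive 7\nduration 45:00 min\n"
second = save_dives(store, dives)   # after one edit: only that dive is new
```

The second save still has to hash every dive in this sketch; a real implementation would also track which dives are dirty so it can skip that work too.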

But we don't do the former: right now when you open a subsurface git
dive repository, we read *all* the dives from it synchronously. Which
means that it's actually pretty expensive to start up subsurface if
you have several thousand dives.

But the log format means that it should be possible in theory to
lazily only read the files that actually get shown in the UI, the same
way we only save dives that have been changed.
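
As a rough illustration of what lazy reading could look like, the sketch below lists the dive files up front but parses one only when something asks for its contents. The `LazyDive` class, the one-file-per-dive layout and the "key value" line format are all assumptions made up for this example; they are not how subsurface lays out its repository.

```python
import tempfile
from functools import cached_property
from pathlib import Path

parse_calls = 0  # counts how many files actually got parsed

class LazyDive:
    """Remembers where a dive lives; parses it on first access only."""

    def __init__(self, path: Path):
        self.path = path

    @cached_property
    def fields(self) -> dict:
        global parse_calls
        parse_calls += 1
        out = {}
        for line in self.path.read_text().splitlines():
            key, _, value = line.partition(" ")
            out[key] = value
        return out

def open_log(root: Path) -> list:
    """Cheap: lists the files, reads none of them."""
    return [LazyDive(p) for p in sorted(root.glob("*.txt"))]

# Build a throwaway log of 100 hypothetical dives.
root = Path(tempfile.mkdtemp())
for n in range(100):
    (root / f"dive-{n:03}.txt").write_text(f"number {n}\nduration 40:00 min\n")

log = open_log(root)
visible = [d.fields["number"] for d in log[:5]]  # only the rows the UI shows
```

After this runs, only the five dives that were "shown" have been parsed; the other 95 are untouched until something asks for them.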

I have to say that subsurface ends up really bogging down once you
have even just a thousand logged dives (I'm not there yet, but I'm
starting to approach it). And we _should_ be able to do better, but a
lot of our infrastructure was written with the assumption that we have
all the dives parsed already.

I've considered adding some kind of demand-loading logic, but it looks
somewhat painful, and the real performance problem tends to be showing
the dive list, not parsing the dives.

Anyway, I do think our git backend is very cool, and would allow us to
do even more cool things than we already do.

I'm obviously biased, and did the git save format because I like git,
but there are very real technical reasons for it, some of which are
huge and fundamental (using git means that we get merging of
non-linear history almost for free, which is what allows us to do
complex local caching independently of the cloud storage being
available). Doing something like that with a "real" database (which so
many dive log programs use) would be a complete nightmare.
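
The idea behind merging non-linear history can be sketched as a three-way merge: compare the local copy and the cloud copy against their common ancestor, and for each file take whichever side changed it. This is a simplified, file-level illustration with hypothetical names (`merge_logs`, dicts standing in for trees); git itself also merges within file contents, and this is not subsurface's actual merge logic.

```python
def merge_logs(base: dict, local: dict, remote: dict):
    """Three-way merge of {file name: content} maps; returns (merged, conflicts)."""
    merged, conflicts = {}, []
    for name in sorted(set(base) | set(local) | set(remote)):
        b, l, r = base.get(name), local.get(name), remote.get(name)
        if l == r:        # both sides agree (including both deleted)
            pick = l
        elif l == b:      # only the remote side changed this file
            pick = r
        elif r == b:      # only the local side changed this file
            pick = l
        else:             # both changed it differently: needs a real merge
            conflicts.append(name)
            pick = l
        if pick is not None:  # None means the file was deleted
            merged[name] = pick
    return merged, conflicts

# Common ancestor, then offline edits on a laptop and edits made via the cloud.
base = {"dive-1": "depth 18m", "dive-2": "depth 25m"}
local = {"dive-1": "depth 18m, buddy Dirk", "dive-2": "depth 25m",
         "dive-3": "depth 12m"}
remote = {"dive-1": "depth 18m", "dive-2": "depth 26m"}

merged, conflicts = merge_logs(base, local, remote)
```

Here the local edit to dive-1, the local addition of dive-3 and the remote edit to dive-2 all survive, with no conflicts, because each file was changed on only one side.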

So we already take advantage of the git format in various ways, but
there is a lot more we could do in theory.

>   I guess the rhetorical
> question is how could one make git work for a user who is not very
> computer savvy? And the answer to that question looks like "Yes, but it
> would require an enormous coding effort".

Yeah, actually allowing people to look at the history could be
powerful, but really sounds like a major project. I think it might be
better to just make people perhaps more aware of the existing git
functionality.

There are already other git visualization tools - not just gitk - that
people *could* use. I don't know how it would look in a browser on
GitHub, for example. Maybe that would be a useful thing for people who
aren't quite ready to really run git locally.

                     Linus

