RFC: Initial git save format

Linus Torvalds torvalds at linux-foundation.org
Thu Mar 6 13:28:39 PST 2014


So Dirk knows about this effort, and it's been rattling around in my
head for a long while, but it took some time for me to digest how to
actually do it. The attached patch is a *rough* initial
implementation.

I say rough, because:

 (a) I didn't do the git configuration part, particularly not the UI
to set the save file.

Right now, to try this patch out, you need to re-use an old git
repository or create a new one ("git init") outside of subsurface. You
then also need to create a one-line "git pointer" file. It can live
anywhere, and it points to that repository and the specific branch
you want to use for saving.

The one-liner "git pointer" file is just the string "git", some
whitespace, and then "repository:branch".

For example, I already had a git repository to track my (and Dirk's)
XML file in my home directory (~/scuba). It had an existing
checked-out master branch that contains various XML files, and I'm not
going to touch that. But I want to tell subsurface to use the "linus"
branch to save things in the new format, so I do:

    echo git /home/torvalds/scuba:linus > git-test

and now if I tell subsurface to save to that file, it will actually
save to that branch in my scuba repository. It will happily create the
branch if it doesn't exist, and if it does exist, it will save the
changes as a new commit (with the previous commit as a parent).
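
A minimal sketch of how such a one-line pointer could be split into its
repository and branch parts. This is not the code from the attached patch
(which is C); the function name `parse_git_pointer` is hypothetical, and
splitting on the *last* colon is an assumption made here so that a path
containing a colon would still parse.

```python
def parse_git_pointer(line):
    """Split a 'git <repository>:<branch>' line into (repository, branch).

    Assumption: the branch is whatever follows the last ':' on the line.
    """
    parts = line.strip().split(None, 1)  # keyword, then the rest
    if len(parts) != 2 or parts[0] != "git":
        raise ValueError("not a git pointer line: %r" % line)
    repository, sep, branch = parts[1].strip().rpartition(":")
    if not sep or not repository or not branch:
        raise ValueError("expected 'repository:branch' in %r" % line)
    return repository, branch


if __name__ == "__main__":
    print(parse_git_pointer("git /home/torvalds/scuba:linus\n"))
```

With the pointer file from the example above, this yields the repository
path `/home/torvalds/scuba` and the branch name `linus`.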

 (b) the code to actually *load* the data from a git repository does
not exist. This is purely a write-only "save as" operation right now,
so that people can comment on the file formats etc. And I need to be
able to save things in order to test loading. So right now, the way I
test things is:

 - start up subsurface with the regular xml file
 - make random changes
 - do a "save as" into the git-test file
 - "git log -p linus" to see the end result

 (c) the file format *will* change. Right now the file format is a
tree, with each trip getting a subdirectory of its own (remember: not
actually checked out, so you won't see any subdirectories - but they
are tracked as such inside git), and then within that a file for each
dive.

The file format for each dive is fairly sane, and might not need much
changing (it looks a lot like the current XML, except it's a
file-per-dive, and it lacks all the crazy XML syntax). But I don't
save the dive trip notes etc right now at all, so the trip data itself
isn't there, and I am pretty sure that I want to do a deeper directory
hierarchy with at the very least each year getting its own
subdirectory.
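
As an illustration of the current flat naming, here is a hedged sketch that
splits a dive path such as `trip040/2014-01-15-11:12:00-474-36f102b7` (a
name taken from the git commands in this message) into its visible parts.
The field names `number` and `ident` are guesses at what the last two
components mean, and `split_dive_path` is a hypothetical helper, not part
of the patch. The parsed date is what a per-year subdirectory would be
derived from.

```python
import re
from datetime import datetime

# trip directory / date - time - decimal number - hex identifier
DIVE_PATH = re.compile(
    r"^(?P<trip>[^/]+)/"
    r"(?P<date>\d{4}-\d{2}-\d{2})-"
    r"(?P<time>\d{2}:\d{2}:\d{2})-"
    r"(?P<number>\d+)-"
    r"(?P<ident>[0-9a-f]+)$"
)


def split_dive_path(path):
    """Return the components of a 'tripNNN/date-time-number-ident' path."""
    match = DIVE_PATH.match(path)
    if match is None:
        raise ValueError("unrecognized dive path: %r" % path)
    when = datetime.strptime(
        match.group("date") + " " + match.group("time"),
        "%Y-%m-%d %H:%M:%S",
    )
    return {
        "trip": match.group("trip"),
        "when": when,                          # when.year gives the year
        "number": int(match.group("number")),  # assumed: dive number
        "ident": match.group("ident"),         # assumed: some dive id
    }


if __name__ == "__main__":
    info = split_dive_path("trip040/2014-01-15-11:12:00-474-36f102b7")
    print(info["trip"], info["when"].year, info["number"], info["ident"])
```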

So for example, right now I can not only do "git log -p linus" to see
the changes, but also do things like

    git show linus:trip040/2014-01-15-11:12:00-474-36f102b7

to see one particular dive in my last trip, and I get something like

    duration 62:05 min
    gps -10.441307 105.554471
    location "North West Point"
    divemaster "Hama"
    buddy "Dirk"
    suit "2/3mm wetsuit"
    cylinder vol=12.0l workpressure=200.0bar description="12L 200 bar"
    weightsystem weight=4.082kg description="Integrated"
    divecomputer model="Mares Icon HD Net Ready" deviceid=e59d50b9 diveid=86356adf
      depth max=26.2m mean=13.43m
      temperature water=28.2°C
      surface pressure=1.0bar
        0:05min 5.1m 28.4°C
        0:10min 5.9m
        0:15min 6.1m 28.5°C
        0:20min 5.9m 224.0bar
        0:25min 6.4m 28.4°C
        0:30min 6.3m 28.5°C
        0:35min 6.6m 28.6°C
        0:40min 6.6m 28.7°C 222.4bar
        0:45min 6.7m
        0:50min 6.7m
        0:55min 7.0m 28.8°C
    ...

and

    git show --stat linus

shows my last change, which just added a fake new dive in a fake new trip:

    commit 8eadfaa4144cfef2f39a3f2c252f6827c3022489
    Author: Subsurface <subsurface at hohndel.org>
    Date:   Thu Mar 6 12:45:34 2014 -0800

        subsurface commit

     trip041/2014-03-06-12:45:08-475-fa53548b | 12 ++++++++++++
     1 file changed, 12 insertions(+)

so things work, and you can use git to examine things even if
subsurface itself can't read the end result yet.
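
Since loading is not implemented yet, the following is only a rough sketch
of how the per-dive text shown earlier might be read back. It assumes that
attribute lines follow the `keyword key=value ...` shape with double-quoted
strings, and that sample lines carry a `M:SSmin` time followed by values
tagged by their unit suffix (`m`, `°C`, `bar`). The function names
`parse_attributes` and `parse_sample` are hypothetical, and the real
loader may well look different.

```python
import re
import shlex

SAMPLE_TIME = re.compile(r"^(\d+):(\d{2})min$")


def parse_attributes(line):
    """Parse 'keyword key=value ...' into (keyword, {key: value}).

    Values keep their unit suffix as plain strings; quotes are removed.
    """
    tokens = shlex.split(line)
    attrs = {}
    for token in tokens[1:]:
        key, _, value = token.partition("=")
        attrs[key] = value
    return tokens[0], attrs


def parse_sample(line):
    """Parse a sample line such as '0:40min 6.6m 28.7°C 222.4bar'."""
    fields = line.split()
    match = SAMPLE_TIME.match(fields[0]) if fields else None
    if match is None:
        raise ValueError("not a sample line: %r" % line)
    sample = {"seconds": int(match.group(1)) * 60 + int(match.group(2))}
    for field in fields[1:]:
        if field.endswith("°C"):
            sample["temp_c"] = float(field[:-2])
        elif field.endswith("bar"):
            sample["pressure_bar"] = float(field[:-3])
        elif field.endswith("m"):
            sample["depth_m"] = float(field[:-1])
        else:
            raise ValueError("unknown sample field: %r" % field)
    return sample


if __name__ == "__main__":
    print(parse_attributes('weightsystem weight=4.082kg description="Integrated"'))
    print(parse_sample("0:40min 6.6m 28.7°C 222.4bar"))
```

Keying the sample values on their unit suffix means a sample that omits the
temperature or the pressure, as several lines in the example do, simply
produces a smaller dictionary.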

So on the whole I think it's a reasonable starting point, and while it
is not actually useful for real work, it *is* useful for testing and
commenting.

Dirk, while I think this is good enough to apply (and it has my
sign-off), the upsides aren't big until you can load things too.
Please comment, though.

A few questions:

 - This has been tested with the libgit2 that ships in current Fedora 20
(libgit2 version 0.19). How does it work on Windows/OSX? I know libgit2 works
on those platforms, but I don't know how much pain it adds to the
build requirements.

   On F20, all you need to do is "yum install libgit2-devel".

 - should I make a directory per dive, and make each dive computer be
a file of its own? Right now it's "one file per dive", but I could
make it "one directory per dive" (or perhaps per day), with a "dive"
file for the core data and separate files for each dive computer.

 - any particular other comments about the save format?

 - is somebody willing to write the Qt GUI to pick a git repo and
branch, so that the hacky git-pointer file can be removed?

Hmm?

                     Linus
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-Initial-implementation-of-git-save-format.patch
Type: text/x-patch
Size: 25292 bytes
Desc: not available
URL: <http://lists.hohndel.org/pipermail/subsurface/attachments/20140306/61440e13/attachment-0001.bin>

