XML format change
Jef Driesen
jefdriesen at telenet.be
Tue Dec 25 11:32:11 PST 2012
On 24-12-12 19:02, Linus Torvalds wrote:
> On Mon, Dec 24, 2012 at 2:12 AM, Jef Driesen <jefdriesen at telenet.be> wrote:
>>
>> Actually you should also take into account the libdivecomputer backend type,
>> because just the model number and serial number tuple might overlap with
>> those from other backends. But if you encode the model as the full device
>> name (e.g. Suunto Vyper Air) and not just the model number, then you are
>> already doing that of course.
>
> Yes. I originally saved the vendor/device information separately, so we had
>
> dc->vendor = "Suunto";
> dc->product = "Vyper Air";
>
> rather than the current
>
> dc->model = "Suunto Vyper Air";
>
> setup.
>
> It ended up being just extra work (you had to always check both), and
> unlike libdivecomputer, there's no "backend type" for subsurface, so
> there was no upside.
>
> One thing I probably *should* have done is to make the "dc->deviceid"
> be the SHA1SUM of not just the libdivecomputer device ID string, but
> make it the SHA1 of the combination of model string and device ID
> string. That would have been easy to do, and then the deviceid really
> would be unique (well, modulo collisions in just the 32-bit truncated
> space - but when people tend to have a single dive computer, and
> having five would be considered unusual, there just isn't much point
> in worrying about collisions ;)
I wouldn't worry about collisions either :-)
> [ Background for Jef, who probably didn't look at what subsurface
> does: not only do we combine the libdivecomputer vendor/product into a
> single model thing, the dive and device ID strings are not kept as
> strings at all by subsurface. We create the dive ID by calculating the
> SHA1 of your "fingerprint" string, and we do the device ID by
> calculating the SHA1 over your model/firmware/serial numbers. In both
> cases we then just take the 20-byte SHA1 and use the first four bytes
> to create a 32-bit integer. So we've turned the arbitrary
> libdivecomputer information into two 32-bit opaque numbers. That makes
> things *much* easier to work with, and has the same amount of actual
> information. ]
Thanks for the quick overview. I did look at the subsurface code already, but
without digging into all the details.
Just a small remark. I wouldn't include the firmware version in the SHA1. Many
modern devices can be updated, and thus the firmware version isn't fixed. I
think it's pretty annoying to have your device recognized as a new device
after a firmware update, especially for devices like the OSTC that receive
frequent updates. The reason why probably nobody has run into this yet is that
very few backends fill in the firmware version.
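
A minimal sketch of that suggestion, assuming the device ID is hashed from the
model string and serial number only. The function names are hypothetical, and
32-bit FNV-1a stands in for the truncated SHA1 purely to keep the example
self-contained; it is not what subsurface actually uses.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* 32-bit FNV-1a: a stand-in for "SHA1 truncated to 32 bits" (assumption,
 * chosen only so the sketch needs no crypto library). */
static uint32_t fnv1a_update(uint32_t h, const void *data, size_t len)
{
	const unsigned char *p = data;

	for (size_t i = 0; i < len; i++) {
		h ^= p[i];
		h *= 16777619u;	/* FNV prime, arithmetic wraps modulo 2^32 */
	}
	return h;
}

/* Hypothetical helper: hash the model string and the serial number.
 * The firmware version is accepted but deliberately left out of the hash,
 * so a firmware update does not turn the device into a "new" one. */
static uint32_t device_id(const char *model, uint32_t serial, uint32_t firmware)
{
	unsigned char s[4] = {
		(unsigned char)(serial & 0xff),
		(unsigned char)((serial >> 8) & 0xff),
		(unsigned char)((serial >> 16) & 0xff),
		(unsigned char)((serial >> 24) & 0xff),
	};
	uint32_t h = 2166136261u;	/* FNV offset basis */

	(void)firmware;	/* intentionally unused */
	h = fnv1a_update(h, model, strlen(model) + 1);	/* NUL acts as separator */
	h = fnv1a_update(h, s, sizeof(s));
	return h;
}
```

With this, device_id("Suunto Vyper Air", 1234, fw) gives the same value for any
fw, while a different serial number gives a different ID.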
The fact that the firmware version is included in the DC_EVENT_DEVINFO is a bit
of a historic mistake. The idea was that some devices might have a data format that
is dependent on the firmware version (e.g. a new firmware may introduce some new
features). In that case the firmware would be necessary to parse the data.
However, for this purpose, the firmware version from the DC_EVENT_DEVINFO is
useless, because it contains the current firmware version, and not the firmware
version at the time each dive was recorded. All devices that have multiple data
format versions indeed store the version per dive. So the firmware version from
the DC_EVENT_DEVINFO isn't used for anything by libdivecomputer.
BTW, the libdivecomputer serial number is primarily intended to be used as a
device ID. That's why it does not necessarily match the human readable serial
number. Usually the human readable number uses some special encoding (little/big
endian, BCD, ASCII), but for libdc we don't really care. We don't want the
serial number to change if we fix a bug in the serial number decoding :-)
Using a hash for the serial number (or the subsurface deviceid) is something I
considered too. But so far all serial numbers nicely fit into a 32-bit integer,
and there was just no need for any hashing. Someday that may change, but since
we make no promise that the serial number matches the human readable serial
number, that should be no problem.
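
To illustrate the difference between the two views of a serial number, here is
a small sketch. Both function names and the 4-byte packed BCD layout are
assumptions made up for the example, not the encoding of any particular
device. The first function produces a stable 32-bit device ID straight from the
raw bytes; the second decodes the same bytes for display.

```c
#include <stdint.h>

/* Device ID view: take the raw bytes as a little-endian 32-bit integer.
 * No interpretation, so the value never changes if the human readable
 * decoding is fixed later. */
static uint32_t serial_raw_le(const unsigned char b[4])
{
	return (uint32_t)b[0] |
	       ((uint32_t)b[1] << 8) |
	       ((uint32_t)b[2] << 16) |
	       ((uint32_t)b[3] << 24);
}

/* Display view: decode the same bytes as packed BCD, most significant
 * byte first (hypothetical encoding). */
static uint32_t serial_bcd(const unsigned char b[4])
{
	uint32_t n = 0;

	for (int i = 0; i < 4; i++)
		n = n * 100 + (uint32_t)(b[i] >> 4) * 10 + (b[i] & 0x0f);
	return n;
}
```

For the bytes 0x12 0x34 0x56 0x78, the device ID is 0x78563412 while the
decoded number would read 12345678. Only the first one needs to stay fixed.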
Calculating the fingerprint hash isn't really an option for libdivecomputer.
There are some devices (e.g. Uwatec Memomouse and Smart/Galileo) where you have
to send the fingerprint (which in this case is the device timestamp) to the
device, and then the device implements the "download new dives only" feature
internally. Calculating a hash wouldn't work here, because you can't go back
from a hashed fingerprint to the raw timestamp. For other devices, using a hash
is possible, but then there is no real advantage over just treating the
fingerprint as some opaque piece of data. The application simply doesn't have to
care what it actually is.
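
A rough sketch of what "opaque" means in practice: the application stores the
fingerprint bytes of the newest dive it has and only ever compares them
byte for byte. The struct and function below are hypothetical and are not the
libdivecomputer API; the 4-byte fingerprint size is also just an assumption.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical dive record. The fingerprint is just bytes; on some devices
 * it happens to be a timestamp, but the caller never needs to know that. */
struct dive {
	unsigned char fingerprint[4];
};

/* Dives are assumed to be ordered newest first. Return how many of them
 * come before the dive whose fingerprint matches the stored one, i.e. how
 * many are new. If nothing matches, all dives are treated as new. */
static size_t count_new_dives(const struct dive *dives, size_t n,
			      const unsigned char *fp, size_t fp_size)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (fp_size == sizeof(dives[i].fingerprint) &&
		    memcmp(dives[i].fingerprint, fp, fp_size) == 0)
			break;
	}
	return i;
}
```

Because the raw bytes are kept, the same stored value could also be handed
back to a device that does the "download new dives only" filtering itself,
which would not be possible with a hash of it.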
> But if we were to now change subsurface to mix in the model string too
> (not just the model number that libdivecomputer uses for
> DC_EVENT_DEVINFO) into the device ID in subsurface, the existing
> device ID's would change, so it would be slightly inconvenient. Also,
> I do think that it is likely a good idea to always have the model
> information in things like nickname tables, even if it would be
> redundant - just for the human readability. So while having "globally
> unique" device ID numbers (again, modulo collisions that we don't
> really care about) could have been a programming convenience, I
> suspect we're just as well off just always using the <model,deviceid>
> tuple.
Having a human readable name is indeed a bit more friendly, compared to some
abstract model number or hash value.
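
For what it's worth, a lookup keyed on the <model,deviceid> tuple could look
roughly like the sketch below. The struct layout and names are made up for
illustration and are not taken from the subsurface sources.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical nickname table entry, keyed on <model, deviceid>. */
struct nickname_entry {
	const char *model;	/* e.g. "Suunto Vyper Air" */
	uint32_t deviceid;	/* opaque 32-bit device ID */
	const char *nickname;
};

/* Return the nickname for the given tuple, or NULL if there is none.
 * Both fields must match, so two different models that happen to share
 * a device ID are still kept apart. */
static const char *lookup_nickname(const struct nickname_entry *table, size_t n,
				   const char *model, uint32_t deviceid)
{
	for (size_t i = 0; i < n; i++) {
		if (table[i].deviceid == deviceid &&
		    strcmp(table[i].model, model) == 0)
			return table[i].nickname;
	}
	return NULL;
}
```

The model string stays readable in the table, and the device ID only has to be
unique per model.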
Jef