API redesign progress

Mon Jun 25 13:02:01 PDT 2012

Linus,

Could you do me a favor and provide some high level overview of the api 
design that you have in mind? I'm trying to understand your point of 
view, but it's not always crystal clear to me. If you could explain your 
api design with some pseudo code, call sequences, etc I think that would 
help a lot.

Some more comments below.

On 2012-06-24 20:32, Linus Torvalds wrote:
> On Sat, Jun 23, 2012 at 1:45 PM, Jef Driesen <jefdriesen at telenet.be> 
> wrote:
>>
>> Creating the parser in the callback is convenient. In some cases you 
>> can't
>> create a parser upfront (at the time of calling dc_device_open, or 
>> even
>> immediately after it) because the actual model number is required 
>> for
>> parsing the data.
>
> Jef, this is exactly the kind of totally broken thinking that makes
> libdivecomputer interfaces so nasty to use.
>
> The user SHOULD NOT CARE about these kinds of internal 
> libdivecomputer
> issues. The whole *point* of libdivecomputer should be to hide all
> these internal implementation issues, and as long as you continue to
> think that your internal details should matter for the interface, the
> interface will be crap.
>
> Why the hell should I care that you need some model number for the
> parser? I do not even *KNOW* the model number, for chissake! The only
> thing that knows (and cares) is libdivecomputer itself.
>
> So according to your absolutely insane logic, I cannot create the
> parser at device open time, because I don't know if that particular
> device needs the model number. But that means that I should magically
> decide to create the dive parser after I get the DC_EVENT_DEVINFO
> callback. Can you not see that THAT MAKES NO SENSE! From an interface
> standpoint, that is absolutely insane. I don't even *care* about the
> DC_EVENT_DEVINFO - the fact that I actually catch it and print out 
> the
> model number on the progress bar is simply because that (to me) is an
> indicator that libdivecomputer is working at all.

Why would you want to create a parser in advance? You can't really 
parse anything until you get the first dive in the dive callback 
function. And in that case you can equally well create the parser in the 
dive callback. So I don't understand why this is a problem.

If you call dc_parser_new() from inside the dive callback, you don't 
need to care about the model number at all. If a parser needs it, it's 
guaranteed to be available in the device handle (regardless of whether 
you catch the events or not). If it's not required, then no problem 
either. That's what I meant when I said it's convenient: you don't need 
any device specific knowledge, and it will always just work.

The only real issue I see here is that the api doesn't prevent you from 
calling dc_parser_new() immediately after dc_device_open(), in which 
case you may end up with a non functional parser.

Why you use DC_EVENT_DEVINFO as an indicator is a mystery to me. We 
have DC_EVENT_{PROGRESS,WAITING} for that purpose. The primary purpose 
for DEVINFO is for use with the fingerprint feature, which subsurface 
isn't using.

> You fixed the interfaces to the top-level open code, and to the
> top-level parser create code. Please just fix this too.
>
> The fix is damn easy: get rid of the insane and useless
> "dc_parser_new()" and "dc_parser_destroy()" interfaces entirely.
> Create the parser inside of libdivecomputer (whenever it makes sense)
> and don't expose that totally idiotic and pointless interface to the
> user at all. Seriously. There is absolutely *ZERO* upside to ask me 
> to
> create a parser, since I don't care, and I cannot possibly do it at a
> sane time anyway, because for the user, no sane time exists at all.
>
> Just create the parser internally, and hide that information in the
> "dc_device_t" structure that is totally internal to libdivecomputer,
> and is just an opaque object to the users.
>
> Please. Stop these insane interfaces that only complicate the use of
> libdivecomputer, and don't help anybody.

I'm not sure what you propose here. Do you mean to integrate the parser 
object into the device object? How is that supposed to work? For example 
a device can contain more than one dive, so how do you decide which dive 
to parse? Or do you mean the dive callback should be changed to receive 
a parser object directly, pre-populated with the dive data, model 
numbers, and whatever else is necessary. Something like:

int
callback (dc_parser_t *parser, void *userdata)
{
    ...
    dc_parser_get_datetime (parser, ...);
    dc_parser_samples_foreach (parser, ...);
    ...
}

Essentially turning the dc_parser_t into a dc_dive_t object. I 
considered doing this in the early days of the libdivecomputer project, 
but there are a couple of problems with this approach.

With this api, your have no other option than to do the parsing 
directly in the callback function. However some devices (e.g. the 
sensusultra) have very strict timing requirements and you are very 
limited in the amount of processing you can do in the callback without 
messing up the downloading. In this cases it's usually better to save 
the data and process it once the download has finished, or offload it to 
a separate thread. But that's impossible because the lifetime of the 
parser is now limited to the duration of callback function.

Assume we introduce some method to extend the lifetime of the parser to 
remain valid outside the callback (e.g. a deep copy or something). We 
still can't transfer this parser to another thread because it may share 
some state with the device handle. Remember that in the roadmap emails 
[1], I announced that I'm about to introduce a new library context 
object to fix the problems related to this shared state. With the 
context object, you could create one context for the device thread and 
another one for the parser thread. If you then call:

dc_context_new (&ctx_device);
dc_context_new (&ctx_parser);

dc_device_open (&device, ctx_device, ...);

callback (...)
{
    dc_parser_new (&parser, ctx_parser, device);
}

The device and parser won't share any state and everything is fine. But 
if the parser is created directly by the device, there is no way to use 
another context object. The sensusultra timing problem can be fixed with 
internal caching, but the threading problem is a more fundamental one. 
How would you solve this?

[1] http://www.divesoftware.org/libdc/roadmap.html#17

>> You also have to take into account the use-case where you want to 
>> parse data
>> without a device connection available.
>
> No you don't. You can expose damn well any interface you want for 
> that
> case, but that's a totally separate case. That in *no* way means that
> when we have the device available, the user should somehow magically
> know when to create the parser. There is zero point.
>
>> BTW, for this situation I want to introduce some variant of the new
>> dc_parser_new function that doesn't take a device handle. I have 
>> been
>> thinking of something like this:
>>
>> dc_status_t
>> dc_parser_new_for_data (dc_parser_t **parser,
>>                        const dc_event_devinfo_t *devinfo,
>>                        const dc_event_clock_t *clock);
>
> Congratulations on making the interface EVEN MORE BROKEN.
>
> Don't do this. Seriously. Just f*cking don't. Any sane user of the
> libdivecomputer stuff will never ever care. If I opened a device
> through libdivecomputer, you already have ALL that information. It's
> totally pointless to give it to be as a callback, and then expect me
> to just give it back to you. Can't you see how STUPID that is?
>
> Make that interface what you do *internally*. Use it for the 
> pointless
> case where somebody cares. Don't force it for the normal case.

I think you misunderstood me here. This new dc_parser_new_for_data() 
function is supposed to co-exist with the dc_parser_new() function. You 
would only use it for the "offline" parsing use-case, so it's not 
intended as a replacement for the dc_parser_new() function. Isn't that 
what you meant when you wrote I can "expose damn well any interface you 
want" above?

> Even if I were to want to parse a pre-downloaded dump from a file, I
> WOULD NEVER EVER WANT TO USE THAT INTERFACE. I would want to use the
> "open dive computer" interface, and just give you the filename. And
> then libdivecomputer can again internally do the above.
>
> So feel free to do those kinds of changes for your internal flow, but
> if you really think that it's a good idea to expose that kind of
> internal detail to the user, I don't know what to say. Except to say
> that you're wrong. It's a really *bad* idea to make these kinds of
> interfaces. Don't make things worse, make them better.

I'm not convinced a file import interface is the right solution here. 
What if the application doesn't store the data in a raw file (for 
example in a database), or uses some different format (for example the 
individual dives instead of a memory dump)? If an application wants to 
support offline parsing, then it will need some specific knowledge 
anyway. For example in the Uwatec Smarttrak scenario, the data isn't 
stored in the original format, and you have to combine several database 
fields to reconstruct it. You can't do this without any knowledge of the 
format. So the internals are already exposed.

>>>  - The event handling is still horrible. There does not seem to be 
>>> any
>>> sane generic way to just get the event type and meaning.
>>
>> Which events are you referring to? The device events (e.g. 
>> DC_EVENT_*), or
>> the parser events (e.g. DC_SAMPLE_EVENT)? Can you explain in some 
>> more
>> detail?
>
> The DC_SAMPLE_EVENT interface is just crazy.
>
> Why do I have to have this kind of code in there:
>
>
>         static const char *events[] = {
>                 "none", "deco", "rbt", "ascent", "ceiling",
> "workload", "transmitter",
>                 "violation", "bookmark", "surface", "safety stop",
> "gaschange",
>                 "safety stop (voluntary)", "safety stop (mandatory)",
> "deepstop",
>                 "ceiling (safety stop)", "unknown", "divetime", 
> "maxdepth",
>                 "OLF", "PO2", "airtime", "rgbm", "heading", "tissue
> level warning"
>         };
>         const int nr_events = sizeof(events) / sizeof(const char *);
>         ...
>         type = value.event.type;
>         name = "invalid event number";
>         if (type < nr_events)
>                 name = events[type];
>
> where that set of strings is something I had to just mindlessly copy
> from your "example.c".
>
> The 'enum parser_sample_event_t' numbers are totally meaningless to
> me. They don't tell me anything. Same goes for 'enum
> parser_sample_flags_t'. And what does the even 'value' actually do?
> None of this makes any sense, and more importantly, as far as I can
> tell, I *cannot* make sense of them.
>
> Yes, the 'flags' things seems to possibly make sense as a "this 
> begins
> the event" and "this ends it". That's not documented, but at least
> it's *somehow* sensible. But if that is all it does, why is it some
> bitfield thing? What else can it possibly be? What should I do as a
> user to protect myself from you changing the flags?
>
> And the 'value' field? What does it do? It makes no sense to me, and 
> I
> can't tell what it would do. Does it have any meaning? Is it in
> degrees for the heading event? Is it minutes for the airtime event?
> What is it?

The sample events are indeed a big mess. I'm aware of this, and they 
will be removed from the api. I don't know yet how to implement an 
appropriate replacement, so at the moment the idea is to introduce a 
vendor extension type and just pass the raw binary data. Applications 
that insist on using this type of data can still get it with some 
additional effort. But at least we won't be stuck forever with a broken 
interface.

Jef