API redesign progress

Linus Torvalds torvalds at linux-foundation.org
Mon Jun 25 14:17:00 PDT 2012

On Mon, Jun 25, 2012 at 1:02 PM, Jef Driesen <jefdriesen at telenet.be> wrote:
> Why would you want to create a parser in advance? You can't really parse
> anything until you get the first dive in the dive callback function. And in
> that case you can equally well create the parser in the dive callback. So I
> don't understand why this is a problem.

I don't want to create a parser AT ALL.

That's 100% your internal problem, and you should consider it that way.

You should create the parser on demand. You should *not* ask the
application to create it and worry about it, since the application has
absolutely zero use for it.

The app just wants to get the results.

> If you call dc_parser_new() from inside the dive callback, you don't need to
> care about the model number at all. If a parser needs it, it's guaranteed to
> be available in the device handle (regardless of whether you catch the
> events or not). If it's not required, then no problem either. That's what I
> meant when I said it's convenient: you don't need any device specific
> knowledge, and it will always just work.

I currently *do* create the parser in the dive callback.

And it's stupid. It means that I either have to keep track of the old
parser for the next dive callback (why?) or I do what I do now -
create it anew for each dive. Right now I create it for each dive,
because that's what your example.c code does, and it's not at all
clear from anything whether I need to do it or not.

But even if I can cache it across dives, the interface is just totally
moronic. Why does that interface exist at all? Why should I do it,
when you could do it so much better yourself without me ever calling
"dc_parser_new()" at all?

In other words, please answer me this *simple* question: what is the
advantage of me knowing about "create parser" at all? Seriously. I
don't care. There is absolutely *NOTHING* I can do with the
"dc_parser_t" structure.

And if there is nothing I can do with it, I should never even have to
see it, and I shouldn't have to maintain it.

Why don't you just make it a hidden member in the opaque "dc_device_t"
data structure, and you create it invisibly on demand? Why do you
force me to use insane interfaces? What's the advantage of doing

   dc_parser_set_data(parser, data, size);
   dc_parser_get_field(parser, DC_FIELD_DIVETIME, 0, &divetime);

etc? Why should the user care? The user *shouldn't* care.

This whole idiotic "you're passing me values just so that I can
mindlessly pass them back to you" is a completely horrible model.
Those values make no sense to me - why do you even give them to me?
And the data structures you internally create to hold the parser state
for the values equally make no sense to me - so why do you have
interfaces that require that stupid back-and-forth model?

You could hide *everything* in 'dc_device_t'. When I get a divecb
callback event, I shouldn't need to care about your parsers. But even
more fundamentally, I shouldn't even need to care about that
'data/size' thing. It's totally irrelevant to me. I can't do anything
with it anyway. Stop giving it.


So I would suggest that in dive_cb(), you should pass me exactly *one*
thing as an argument: the user-data that I wanted to have and told you
I wanted to have. And then you pass me an opaque "dive" data structure
that contains all the information you have gathered so far. I don't
want to know what it contains, I don't *want* to know about parsers, I
don't want to know about crud like that. It might even be a
dive-computer-specific structure - not generic at all - that your
particular dive computer wants. Do this, instead of giving me "const
unsigned char *data, unsigned int size" that makes no sense, or the
"const unsigned char *fingerprint, unsigned int fsize" stuff etc. None
of that matters.

So the interface I would suggest would look simply like

    int dive_cb(dc_dive_t *dive, void *userdata)

because anything else is just garbage as far as I'm concerned. I can't
use it anyway.

Then, give me a few functions to get the dive data from that device
during that dive. Don't make me create a parser, don't make me do that
"dc_parser_set_data()" insanity (you just *gave* me that data, you
already know it, just remember it yourself in the device data
structure!). Just give me

   int dc_dive_get_field(dc_dive_t *dive, enum dive_field field,
                         int index, void *result);

so that I can get the normal fields (DC_FIELD_DATETIME,
DC_FIELD_DURATION, DC_FIELD_GASMIX etc). Or you might expose them as
separate accessor functions, ie

   /* Return dive date and time */
   void dc_dive_get_datetime(dc_dive_t *dive, dc_datetime_t *datetime);

   /* Return dive duration in seconds */
   long dc_dive_get_duration(dc_dive_t *dive);

   /* return number of gasmixes in 'mix[]' array of size maxmixes */
   int dc_dive_get_gasmix(dc_dive_t *dive, dc_gasmix_t *mix, int maxmixes);

etc. If you need to create a parser, you do it here. But maybe for
some dive computers the "parser" is some static thing that you don't
even need to create at all. Why have that whole parser-dependent model
and expose it? Even if you internally use a parser, just put the
pointer to that parser in that "dc_device_t" data structure (or in the
"dc_dive_t" one - maybe some day you decide that you really *do* want
to do per-dive parser generation).
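As a minimal sketch of what this could look like: every name below (the layout of "dc_dive_t", the accessors, the callback) is hypothetical, taken from the sketch in this mail, not from libdivecomputer's actual API.

```c
#include <assert.h>

/* A minimal sketch of the accessor-style interface.  All names are
 * hypothetical, modeled on this mail -- NOT the real libdivecomputer
 * API. */

typedef struct dc_datetime_t {
    int year, month, day, hour, minute, second;
} dc_datetime_t;

/* Library side: only the library ever sees this layout; the
 * application gets a pointer to it as an opaque handle. */
typedef struct dc_dive_t {
    dc_datetime_t when;
    long duration;              /* seconds */
    const unsigned char *data;  /* raw dive data, kept internally */
    unsigned int size;
} dc_dive_t;

/* The application asks for fields; it never sees a parser. */
void dc_dive_get_datetime(dc_dive_t *dive, dc_datetime_t *datetime)
{
    *datetime = dive->when;
}

long dc_dive_get_duration(dc_dive_t *dive)
{
    return dive->duration;
}

/* Application side: one opaque dive, one userdata pointer. */
static int dive_cb(dc_dive_t *dive, void *userdata)
{
    long *total = userdata;
    *total += dc_dive_get_duration(dive);
    return 1; /* keep enumerating dives */
}
```

The point of the sketch is that the callback body touches only accessors, so whether a parser exists, is cached, or is created per dive is invisible to it.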

> Why you use DC_EVENT_DEVINFO as an indicator is a mystery to me. We have
> DC_EVENT_{PROGRESS,WAITING} for that purpose. The primary purpose for
> DEVINFO is for use with the fingerprint feature, which subsurface isn't
> using.

You even said that you want to make the parser generation function
take that devinfo structure pointer. You said you'd want to introduce
something like

  dc_parser_new_for_data (dc_parser_t **parser,
                          const dc_event_devinfo_t *devinfo,
                          const dc_event_clock_t *clock);

but I don't even *have* that structure until after I get the
DC_EVENT_DEVINFO callback. I don't *care* about it.

And more importantly, I shouldn't care. I shouldn't have to care about
*any* of this.

So why do you make the interfaces have all these totally irrelevant things?

You used to do this with the whole dive computer open thing. You've
cleaned that up, and now we don't have to care about random internal
libdivecomputer implementation issues any more. And that's GREAT.

Now I'm just asking you to basically do the same for the parser stuff
and the event handling: clean it up so that the users don't have to
see your internal implementation issues that are totally irrelevant to
any user.

Even *if* the user actually wants to save some "raw dive data", make a
helper function for that! So even if somebody wants to save the dive
data you got, just add another

  dc_dive_get_raw_divedata(dc_dive_t *dive, void **buffer, unsigned int *size);

and then that can give people the dive data. But the whole "let's give
some raw dive data to the user, just so that the user should then set
up a parser and pass it back to us" model is exactly the wrong way
around. It designs things for the wrong model.
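Such a helper could be almost trivial. A sketch, using the signature from this mail (again hypothetical, not libdivecomputer's real API): since the opaque dive already carries the raw buffer, the helper just hands back what the library already has.

```c
#include <assert.h>

/* Hypothetical sketch -- the function name and signature come from
 * this mail, not from the real libdivecomputer API. */

typedef struct dc_dive_t {
    const unsigned char *data;  /* raw dive data the library read */
    unsigned int size;
} dc_dive_t;

int dc_dive_get_raw_divedata(dc_dive_t *dive, void **buffer,
                             unsigned int *size)
{
    if (!dive || !buffer || !size)
        return -1;
    *buffer = (void *)dive->data;  /* cast away const for the sketch */
    *size = dive->size;
    return 0;
}
```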

> I'm not sure what you propose here. Do you mean to integrate the parser
> object into the device object? How is that supposed to work? For example a
> device can contain more than one dive, so how do you decide which dive to
> parse?

Look at the suggestion above. Put the parser pointer into the
dc_device_t structure (you claimed earlier that it is a "one per
device" thing), and hide it from the user, who doesn't care. Don't
expose it.

The dive data you can pass in the per-dive opaque structure. Again,
the user really really doesn't even want to know the details. You can
hide anything you want in there. Maybe it just contains the raw dive
data. Maybe it contains some extra fields (like your own internal
pointers to the functions to parse it). Maybe it contains a
back-pointer to the "dc_device_t" so that from the dive data you can
always just find the device data, and the device data then contains
the parser data.

See? There are many different implementation models for this, and the
library interface shouldn't expose what you do internally. It should
make sense at a higher level, and by making sense at that higher level
it actually allows you to *change* your implementation.

For example, right now you have this model for the dive callback function:

   static int dive_cb(const unsigned char *data, unsigned int size,
           const unsigned char *fingerprint, unsigned int fsize,
           void *userdata)

and you could literally *inside* libdivecomputer change it to do

    typedef struct dc_dive_t {
        dc_device_t *device;
        const unsigned char *data;
        unsigned int size;
        const unsigned char *fingerprint;
        unsigned int fsize;
    } dc_dive_t;

and then pass me a pointer to that object (without ever actually
exposing the structure to me, so that I couldn't see what it contains)
as the 'dive'.  Now whenever I pass you that 'dive' back to the parse
functions, you have the data right there. No need for
'dc_parser_set_data()' etc insane contortions, and the interface is
simpler not just for me, but for you too. You hardly have to change
any code at all, because you really have all the same data - but
you've cleaned up the interface.
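Concretely, the wrapping step could be nothing more than this. A sketch under the same assumptions: the names mirror the example in this mail, not libdivecomputer's real API.

```c
#include <assert.h>

/* Hypothetical sketch of the wrapping step.  All names mirror the
 * example in this mail, not libdivecomputer's real API. */

typedef struct dc_device_t dc_device_t;  /* stays opaque here */

/* Library-internal: exactly the fields the current callback passes
 * as separate arguments. */
typedef struct dc_dive_t {
    dc_device_t *device;
    const unsigned char *data;
    unsigned int size;
    const unsigned char *fingerprint;
    unsigned int fsize;
} dc_dive_t;

typedef int (*dive_callback_t)(dc_dive_t *dive, void *userdata);

/* Inside the library: bundle the four raw arguments into one
 * opaque object and hand over a single pointer. */
static int report_dive(dc_device_t *device,
                       const unsigned char *data, unsigned int size,
                       const unsigned char *fingerprint, unsigned int fsize,
                       dive_callback_t cb, void *userdata)
{
    dc_dive_t dive = { device, data, size, fingerprint, fsize };
    return cb(&dive, userdata);
}

/* Example callback: because the dive carries its own buffer, a
 * later parse call needs no dc_parser_set_data() round-trip. */
static int size_cb(dc_dive_t *dive, void *userdata)
{
    *(unsigned int *)userdata = dive->size;
    return 0;
}
```

Note that the caller never sees the struct definition, only the forward-declared pointer type, so the layout can change without breaking anybody.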

And now that you have that opaque data structure, if you make some
organizational changes, none of the users of the library need to even
know. They don't care. Maybe the "data" part won't even be a single
buffer for some devices - maybe the dive computer itself uses XML, and
instead of having that "const unsigned char *data/size" thing for
those dive computers, maybe you'll use some xml library for that, and
it has an "xmlNode" entry for the dive instead?

Ok, so hopefully no dive computer ever does anything that insane, but
you know it could happen. It may not be XML, but it might be some
other similar odd encapsulated format. The uemis has that odd base64
encoding; maybe you want the 'dc_dive_t' to hold the pre-decoded
information, etc.

Or if it's a database, it would again not be about "const unsigned
char *data/size" model, but be about some kind of "dive key +
connection information for the database". See? Exposing it as a
"data/size" thing is wrong. Give the user just the opaque pointer, and
you can hide *anything* in there, and hide it in the format that makes
sense for *that* particular dive computer.

THAT is a good model. It's a good model for the user (who sees "ok,
that's a dive, I want to know when it happened and the details of
it"), and it's good for you ("I can put the data in the most simple
form for *this* particular dive computer"), and it's flexible and
simple. It doesn't make the user do extra things.

> Or do you mean the dive callback should be changed to receive a
> parser object directly, pre-populated with the dive data, model numbers, and
> whatever else is necessary. Something like:
> int
> callback (dc_parser_t *parser, void *userdata)
> {

Yes, I think that would actually be an improvement over the current
model, and I don't think it would be wrong either. In many ways, my
"dc_dive_t" *is* exactly that pre-populated parser thing, but I would
suggest against thinking of it as a "parser" thing. It might contain
pointers to the real parser, or more likely actually contain just a
pointer to the dc_device_t, which in turn contains a pointer to the
parser.  Don't name things by how you happened to implement them, name
them by what they represent.

It represents the data for "one dive". Name it that way.

