[PATCH 1/3] Don't allocate an intermediate buffer for most parsed xmlNode contents

Fri Dec 28 08:04:37 PST 2012

I like the patch, but I think it's subtly buggy as-is, and will cause
problems with non-subsurface xml files.

On Fri, Dec 28, 2012 at 5:43 AM, Lubomir I. Ivanov <neolit123 at gmail.com> wrote:
> @@ -1253,25 +1236,13 @@ static void visit_one_node(xmlNode *node)
>         if (!content)
>                 return;
>
> -       /* Trim whitespace at beginning */
> -       while (isspace(*content))
> -               content++;
> -
> -       /* Trim whitespace at end */
> -       len = strlen(content);
> -       while (len && isspace(content[len-1]))
> -               len--;
> -
> -       if (!len)
> -               return;
> -

This basically deletes the "ignore empty nodes" logic.

Which means that if we ever see an empty node - say temperature or
whatever - it will now feed that empty string into our number parsing,
which will make the parser unhappy.

So I actually think you should keep the "strip space at beginning"
logic, not because we need to strip it, but because we want to ignore
empty nodes.

So I'd leave the code something like

    while (isspace(*content))
        content++;
    if (!*content)
        return;

which keeps us ignoring empty xml nodes.

Empty xml nodes really are *very* common in various crap dive computer
XML. You can see the old Suunto XML files in the git history, for
example: they have empty BOOKMARK nodes in every sample (and then have
some data in some of them). The other XML files I've seen do similar
insane things (ie iirc, oddf tends to always have the insane xml nodes
for the weird stuff it supports, but they're generally empty)

Witht hat simple addition to keep existing semantics, ACK on this patch.

             Linus