[PATCH 1/3] Don't allocate an intermediate buffer for most parsed xmlNode contents
Lubomir I. Ivanov
neolit123 at gmail.com
Fri Dec 28 08:18:23 PST 2012
On 28 December 2012 18:04, Linus Torvalds <torvalds at linux-foundation.org>wrote:
> I like the patch, but I think it's subtly buggy as-is, and will cause
> problems with non-subsurface xml files.
>
> On Fri, Dec 28, 2012 at 5:43 AM, Lubomir I. Ivanov <neolit123 at gmail.com>
> wrote:
> > @@ -1253,25 +1236,13 @@ static void visit_one_node(xmlNode *node)
> > if (!content)
> > return;
> >
> > - /* Trim whitespace at beginning */
> > - while (isspace(*content))
> > - content++;
> > -
> > - /* Trim whitespace at end */
> > - len = strlen(content);
> > - while (len && isspace(content[len-1]))
> > - len--;
> > -
> > - if (!len)
> > - return;
> > -
>
> This basically deletes the "ignore empty nodes" logic.
>
> Which means that if we ever see an empty node - say temperature or
> whatever - it will now feed that empty string into our number parsing,
> which will make the parser unhappy.
>
> So I actually think you should keep the "strip space at beginning"
> logic, not because we need to strip it, but because we want to ignore
> empty nodes.
>
> So I'd leave the code something like
>
> while (isspace(*content))
> content++;
> if (!*content)
> return;
>
> which keeps us ignoring empty xml nodes.
>
> Empty xml nodes really are *very* common in various crap dive computer
> XML. You can see the old Suunto XML files in the git history, for
> example: they have empty BOOKMARK nodes in every sample (and then have
> some data in some of them). The other XML files I've seen do similar
> insane things (ie iirc, oddf tends to always have the insane xml nodes
> for the weird stuff it supports, but they're generally empty)
>
> Witht hat simple addition to keep existing semantics, ACK on this patch.
>
aha, i see,
here is a small update for this patch - attachment.
it now uses a method i've noticed yesterday xmlIsBlankNode():
http://www.xmlsoft.org/html/libxml-tree.html#xmlIsBlankNode
commit message update:
-----------------------------------------
visit_one_node(), now also has a xmlIsBlankNode() check, where if
1 is returned, the node contains only whitespace or is empty and should
not be processed.
-----------------------------------------
lubomir
--
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.hohndel.org/pipermail/subsurface/attachments/20121228/fe900f75/attachment-0001.html>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-Don-t-allocate-an-intermediate-buffer-for-most-parse.patch
Type: application/octet-stream
Size: 6738 bytes
Desc: not available
URL: <http://lists.hohndel.org/pipermail/subsurface/attachments/20121228/fe900f75/attachment-0001.obj>
More information about the subsurface
mailing list