[PATCH] DLD upload

Lubomir I. Ivanov neolit123 at gmail.com
Fri Mar 15 04:55:47 PDT 2013


On 15 March 2013 07:01, Miika Turkia <miika.turkia at gmail.com> wrote:
> On Thu, Mar 14, 2013 at 5:53 PM, Dirk Hohndel <dirk at hohndel.org> wrote:
>> Miika Turkia <miika.turkia at gmail.com> writes:
>>>>> I have one question. How should we handle languages with e.g. Cyrillic
>>>>> letters? They are not supported in divelogs.de and display as question
>>>>> marks currently in there. The current encoding of the XMLs in .DLD is
>>>>> iso-8859-1 but utf-8 is not any better. Of course if divelogs.de would
>>>>> support utf-8 we would not have to worry about it...
>>>>
>>>> Actually divelogs.de allows you to input Cyrillic chars. It works for me at
>>>> least when in full edit mode (press "edit dive" at the bottom). When using
>>>> in-place mode it shows question marks.
>>>> The problem arises when you export dives from divelogs.de:
>>>>
>>>> 1. XML declares iso-8859-1 charset
>>>> 2. All Cyrillic characters are represented as numeric references.
>>>> Unfortunately Subsurface imports them as-is.
>>>
>>> Looks like the CDATA around any free form fields was critical. The
>>> patch I just sent should take care of this. (Having the
>>> cdata-section-elements declared seems to imply also that the content
>>> of the mentioned elements is to be pure ascii, so no need for further
>>> character set hacking.)
>>
>> How does this mesh with UTF-8 encodings? ASCII is 7 bit...
>
> The non-ascii characters are represented in character references (e.g. &#1053; for Н).
>
> A problem we currently have in our import of divelogs.de is that these
> character references are not converted to utf-8. And so far I have not
> figured out a way to do that conversion.
>

since we are already using libxml2 this appears to work, but i have no
idea how reliable it is for our needs:

#include <libxml/parser.h>
#include <libxml/parserInternals.h>

...

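/* numeric character references for "АБВГДЕЖ", followed by plain ASCII */
char buf[] = "&#1040;&#1041;&#1042;&#1043;&#1044;&#1045;&#1046; + something else in ASCII";
/* sizeof(buf) - 1: do not hand the terminating NUL to the parser */
xmlParserCtxtPtr ctx = xmlCreateMemoryParserCtxt(buf, sizeof(buf) - 1);
if (ctx) {
	xmlChar *res = xmlStringDecodeEntities(ctx, (const xmlChar *)buf, XML_SUBSTITUTE_REF, 0, 0, 0);
	if (res) {
		gtk_window_set_title(GTK_WINDOW(main_window), (const char *)res);
		xmlFree(res);	/* libxml2 allocations are released with xmlFree() */
	}
	xmlFreeParserCtxt(ctx);
}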

http://www.xmlsoft.org/html/libxml-parserInternals.html#xmlCreateMemoryParserCtxt
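
For comparison, here is a rough sketch of the same conversion done by hand,
without libxml2. decode_char_refs() and utf8_encode() are hypothetical names
used only for illustration, not existing Subsurface or libxml2 functions. The
sketch assumes only numeric references (&#NNN; and &#xHHH;) need handling;
named entities such as &amp; and malformed references are copied through
unchanged.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Encode one code point as UTF-8; returns bytes written, 0 if not encodable. */
static size_t utf8_encode(unsigned long cp, char *out)
{
	if (cp == 0 || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
		return 0;	/* NUL, out of range, or surrogate */
	if (cp < 0x80) {
		out[0] = (char)cp;
		return 1;
	}
	if (cp < 0x800) {
		out[0] = (char)(0xC0 | (cp >> 6));
		out[1] = (char)(0x80 | (cp & 0x3F));
		return 2;
	}
	if (cp < 0x10000) {
		out[0] = (char)(0xE0 | (cp >> 12));
		out[1] = (char)(0x80 | ((cp >> 6) & 0x3F));
		out[2] = (char)(0x80 | (cp & 0x3F));
		return 3;
	}
	out[0] = (char)(0xF0 | (cp >> 18));
	out[1] = (char)(0x80 | ((cp >> 12) & 0x3F));
	out[2] = (char)(0x80 | ((cp >> 6) & 0x3F));
	out[3] = (char)(0x80 | (cp & 0x3F));
	return 4;
}

/*
 * Hypothetical helper: returns a malloc()ed copy of src in which every
 * valid &#NNN; or &#xHHH; reference is replaced by its UTF-8 bytes.
 * The caller frees the result with free().
 */
static char *decode_char_refs(const char *src)
{
	/* A reference is never shorter than the UTF-8 bytes it expands to,
	 * so the output always fits in strlen(src) + 1 bytes. */
	char *dst = malloc(strlen(src) + 1);
	char *d = dst;

	if (!dst)
		return NULL;
	while (*src) {
		if (src[0] == '&' && src[1] == '#') {
			const char *p = src + 2;
			unsigned long cp = 0;
			unsigned base = 10;
			int digits = 0;
			size_t n;

			if (*p == 'x') {
				base = 16;
				p++;
			}
			for (;; p++, digits++) {
				unsigned v;
				if (*p >= '0' && *p <= '9')
					v = (unsigned)(*p - '0');
				else if (base == 16 && *p >= 'a' && *p <= 'f')
					v = (unsigned)(*p - 'a') + 10;
				else if (base == 16 && *p >= 'A' && *p <= 'F')
					v = (unsigned)(*p - 'A') + 10;
				else
					break;
				if (cp <= 0x10FFFF)	/* stop growing once out of range */
					cp = cp * base + v;
			}
			if (digits > 0 && *p == ';' && (n = utf8_encode(cp, d)) > 0) {
				d += n;
				src = p + 1;
				continue;
			}
		}
		*d++ = *src++;	/* ordinary byte, or a reference we could not decode */
	}
	*d = '\0';
	return dst;
}
```

Calling decode_char_refs("&#1053; + ASCII") should give back the two UTF-8
bytes of Н followed by " + ASCII". The libxml2 route is probably still the
better fit for us since it is already linked in and also knows about named
entities.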

lubomir
--

