[PATCH] DLD upload

Miika Turkia miika.turkia at gmail.com
Thu Mar 14 22:01:52 PDT 2013


On Thu, Mar 14, 2013 at 5:53 PM, Dirk Hohndel <dirk at hohndel.org> wrote:
> Miika Turkia <miika.turkia at gmail.com> writes:
>>>> I have one question. How should we handle languages with e.g. Cyrillic
>>>> letters? They are not supported in divelogs.de and display as question
>>>> marks currently in there. The current encoding of the XMLs in .DLD is
>>>> iso-8859-1 but utf-8 is not any better. Of course if divelogs.de would
>>>> support utf-8 we would not have to worry about it...
>>>
>>> Actually divelogs.de allows you to input Cyrillic chars. It works for me at
>>> least when in full edit mode (press "edit dive" at the bottom). When using
>>> in-place mode it shows question marks.
>>> The problem arises when you export dives from divelogs.de:
>>>
>>> 1. XML declares iso-8859-1 charset
>>> 2. All Cyrillic characters are represented as numeric references.
>>> Unfortunately Subsurface imports them as-is.
>>
>> Looks like the CDATA around any free form fields was critical. The
>> patch I just sent should take care of this. (Having the
>> cdata-section-elements declared seems to imply also that the content
>> of the mentioned elements is to be pure ascii, so no need for further
>> character set hacking.)
>
> How does this mesh with UTF-8 encodings? ASCII is 7 bit...

The non-ASCII characters are represented as character references (e.g. &#1053; for Н).
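
To make the "pure ASCII plus character references" idea concrete, here is a small illustration in Python. It is only a sketch of the representation, not the XSLT that produces the .DLD file, and the function name is made up for this example.

```python
def to_ascii_with_charrefs(text: str) -> str:
    """Return `text` as pure ASCII, with every non-ASCII character
    replaced by a decimal numeric character reference (&#NNNN;)."""
    # The "xmlcharrefreplace" error handler is part of the Python
    # standard codec machinery; it emits decimal references.
    return text.encode("ascii", errors="xmlcharrefreplace").decode("ascii")


# Cyrillic capital En (U+041D) becomes the reference &#1053;
sample = to_ascii_with_charrefs("\u041d")
```

The result contains only 7-bit characters, so it is valid in a file declared as iso-8859-1 or as utf-8 alike.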

A problem we currently have in our import from divelogs.de is that these
character references are not converted to UTF-8. So far I have not
figured out a way to do that conversion.
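
For reference, the conversion itself can be sketched outside of our import path. The following Python is only an illustration under the assumption that the references arrive as literal text like "&#1053;" (decimal) or "&#x41D;" (hex); the function name is hypothetical and this is not Subsurface code.

```python
import re

# Matches decimal (&#1053;) and hexadecimal (&#x41D;) numeric references.
_NUMERIC_REF = re.compile(r"&#(?:[xX]([0-9A-Fa-f]+)|([0-9]+));")


def decode_numeric_refs(text: str) -> str:
    """Replace numeric character references with the characters they name.

    Named entities such as &amp; are deliberately left untouched.
    """
    def replace(match):
        hex_digits, dec_digits = match.groups()
        code = int(hex_digits, 16) if hex_digits else int(dec_digits)
        try:
            return chr(code)
        except (ValueError, OverflowError):
            # Not a valid code point: keep the reference as-is.
            return match.group(0)

    return _NUMERIC_REF.sub(replace, text)


# Decoding, then encoding the result as UTF-8 bytes.
utf8_bytes = decode_numeric_refs("&#1053;").encode("utf-8")
```

Only numeric references are handled here, so markup-significant entities stay escaped; whether that is the right trade-off for our import is an open question.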

miika

