[PATCH] DLD upload

Sergey Starosek sergey.starosek at gmail.com
Fri Mar 15 04:47:20 PDT 2013


Miika,


> The non-ascii characters are represented in character references (Н).
>
> A problem we currently have in our import of divelogs.de is that these
> character references are not converted to utf-8. And so far I have not
> figured a way to do that conversion.
>

The current divelogs.xslt template converts NCRs incorrectly:

&#1077; becomes &amp;#1077;

Verified both with command-line xsltproc and by adding xmlDocDump(stderr,
transformed); at the end of test_xslt_transforms().

Modified divelogs.xslt:

diff --git a/xslt/divelogs.xslt b/xslt/divelogs.xslt
index f66ffcc..ef10e2d 100644
--- a/xslt/divelogs.xslt
+++ b/xslt/divelogs.xslt
@@ -47,10 +47,10 @@
         <xsl:for-each select="LOCATION|SITE">
           <xsl:choose>
             <xsl:when test="following-sibling::SITE[1] != ''">
-              <xsl:value-of select="concat(., ' / ')"/>
+              <xsl:value-of disable-output-escaping="yes" select="concat(., ' / ')"/>
             </xsl:when>
             <xsl:otherwise>
-              <xsl:value-of select="."/>
+              <xsl:value-of disable-output-escaping="yes" select="."/>
             </xsl:otherwise>
           </xsl:choose>
         </xsl:for-each>
@@ -78,7 +78,7 @@
       </temperature>

       <buddy>
-        <xsl:value-of select="PARTNER"/>
+        <xsl:value-of disable-output-escaping="yes" select="PARTNER"/>
       </buddy>

       <!-- Helium? -->
@@ -119,7 +119,7 @@
       </xsl:if>

       <notes>
-        <xsl:value-of select="LOGNOTES"/>
+        <xsl:value-of disable-output-escaping="yes" select="LOGNOTES"/>
       </notes>

       <xsl:for-each select="SAMPLE/DEPTH">

At this point, if I transform a divelogs file using xsltproc and import the
resulting XML, I can see Cyrillic characters in Subsurface.
But no luck with direct import (i.e. --import): Subsurface complains about
non-matching nodes "divelog.dives.dive.notes.textnoenc".
Patching visit_one_node() to consider textenc nodes allows the XML to be
imported, but still with NCRs.
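
For reference, the conversion still missing on the direct import path could
look roughly like the sketch below: a standalone decoder that rewrites decimal
and hexadecimal NCRs as UTF-8 bytes. Note that decode_ncrs() and utf8_encode()
are hypothetical helpers written for illustration only; they are not existing
Subsurface or libxml2 functions, and where such a step would belong in the
import code is an open question.

```c
#include <stdlib.h>
#include <string.h>

/* Encode one code point as UTF-8. Returns bytes written, or 0 if the
 * code point is not representable (NUL, surrogates, above U+10FFFF). */
static size_t utf8_encode(unsigned long cp, char *out)
{
	if (cp == 0 || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
		return 0;
	if (cp < 0x80) {
		out[0] = (char)cp;
		return 1;
	}
	if (cp < 0x800) {
		out[0] = (char)(0xC0 | (cp >> 6));
		out[1] = (char)(0x80 | (cp & 0x3F));
		return 2;
	}
	if (cp < 0x10000) {
		out[0] = (char)(0xE0 | (cp >> 12));
		out[1] = (char)(0x80 | ((cp >> 6) & 0x3F));
		out[2] = (char)(0x80 | (cp & 0x3F));
		return 3;
	}
	out[0] = (char)(0xF0 | (cp >> 18));
	out[1] = (char)(0x80 | ((cp >> 12) & 0x3F));
	out[2] = (char)(0x80 | ((cp >> 6) & 0x3F));
	out[3] = (char)(0x80 | (cp & 0x3F));
	return 4;
}

/* Value of digit c in the given base (10 or 16), or -1 if not a digit. */
static int digit_value(char c, int base)
{
	if (c >= '0' && c <= '9')
		return c - '0';
	if (base == 16 && c >= 'a' && c <= 'f')
		return c - 'a' + 10;
	if (base == 16 && c >= 'A' && c <= 'F')
		return c - 'A' + 10;
	return -1;
}

/* Hypothetical helper: return a malloc'd copy of `in` with every
 * well-formed &#NNNN; or &#xHHHH; replaced by its UTF-8 encoding.
 * Anything else is copied through unchanged. The caller frees the result. */
char *decode_ncrs(const char *in)
{
	/* A reference is never shorter than its UTF-8 form, so the
	 * output cannot be longer than the input. */
	char *out = malloc(strlen(in) + 1);
	char *dst = out;

	if (!out)
		return NULL;
	while (*in) {
		if (in[0] == '&' && in[1] == '#') {
			const char *p = in + 2, *start;
			unsigned long cp = 0;
			int base = 10, d;
			size_t n;

			if (*p == 'x' || *p == 'X') {
				base = 16;
				p++;
			}
			start = p;
			/* Stop early once the value is out of range. */
			while (cp <= 0x10FFFF && (d = digit_value(*p, base)) >= 0) {
				cp = cp * base + d;
				p++;
			}
			if (p != start && *p == ';' && (n = utf8_encode(cp, dst)) > 0) {
				dst += n;
				in = p + 1;
				continue;
			}
		}
		*dst++ = *in++;
	}
	*dst = '\0';
	return out;
}
```

Passing "&#1077;" to this sketch should give the two bytes 0xD0 0xB5, the
UTF-8 encoding of U+0435. Named entities such as &amp; are deliberately left
alone, so it would have to run on text content only, never on whole markup.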

I'm completely lost at this point.

Sergey