[PATCH] Fixed potential, rare corruption of unicode characters

Tue Oct 2 09:46:25 PDT 2012

From: "Lubomir I. Ivanov" <neolit123 at gmail.com>

In divelist.c:get_string(), when truncating the string to a maximum
of 60 characters (to be shown in the divelist), make sure we are
counting in UTF-8 characters (one to four bytes each) instead of in
gchar bytes (sizeof 1). Use GLib functions such as g_utf8_strlen() and
g_utf8_strncpy() to do that. The buffer is sized with sizeof(gunichar)
(4 bytes), which covers the longest valid UTF-8 encoding of a character.

This patch fixes a potential problem that occurs when a UTF-8 string is
truncated at a length that strlen() calculated in bytes.

With char = 1 byte, strlen() counts bytes, not characters, so any
multi-byte UTF-8 character makes the byte length larger than the number
of characters.

If the string has a multi-byte UTF-8 character that starts just before
the truncate position (typically 1 byte before it), we are going to
split the bytes forming said character, resulting in an invalid string.

Signed-off-by: Lubomir I. Ivanov <neolit123 at gmail.com>
---
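Note (not part of the commit message): the failure mode described above
can be reproduced without GLib. The sketch below is only an illustration,
not the code in this patch. utf8_prefix_bytes() and utf8_truncate() are
hypothetical helpers that stand in for what g_utf8_strlen() and
g_utf8_strncpy() do here, and they assume the input is valid,
NUL-terminated UTF-8.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not in divelist.c or GLib): number of bytes taken
 * by the first max_chars UTF-8 characters of s.
 * Assumes s is valid, NUL-terminated UTF-8.
 */
size_t utf8_prefix_bytes(const char *s, size_t max_chars)
{
	size_t bytes = 0, chars = 0;

	assert(s != NULL);
	while (s[bytes] != '\0') {
		/* Any byte that is not a continuation byte (10xxxxxx)
		 * starts a new character. */
		if (((unsigned char)s[bytes] & 0xC0) != 0x80) {
			if (chars == max_chars)
				break;	/* stop on a character boundary */
			chars++;
		}
		bytes++;
	}
	return bytes;
}

/*
 * Hypothetical helper: malloc()ed copy of s, cut to at most max_chars
 * characters. Returns NULL if the allocation fails; the caller frees
 * the result.
 */
char *utf8_truncate(const char *s, size_t max_chars)
{
	size_t bytes = utf8_prefix_bytes(s, max_chars);
	char *n = malloc(bytes + 1);

	if (!n)
		return NULL;
	memcpy(n, s, bytes);
	n[bytes] = '\0';
	return n;
}
```

For the three-byte string "a\xC3\xA9" ("a" followed by a two-byte
character), a byte-based cut at 2 keeps 'a' plus a lone 0xC3 lead byte,
which is the corruption this patch addresses. utf8_truncate() with a
limit of 2 keeps all three bytes, and with a limit of 1 keeps only "a".
Unlike the patch, which allocates the worst case of sizeof(gunichar)
bytes per character, this sketch allocates exactly the bytes it copies.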
 divelist.c | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/divelist.c b/divelist.c
index e09eca5..910b0c9 100644
--- a/divelist.c
+++ b/divelist.c
@@ -788,12 +788,11 @@ static void get_string(char **str, const char *s)
 
 	if (!s)
 		s = "";
-	len = strlen(s);
+	len = g_utf8_strlen(s, -1);
 	if (len > 60)
 		len = 60;
-	n = malloc(len+1);
-	memcpy(n, s, len);
-	n[len] = 0;
+	n = malloc(len * sizeof(gunichar) + 1);
+	g_utf8_strncpy(n, s, len);
 	*str = n;
 }
 
-- 
1.7.11.msysgit.0