fixed buffer lengths with snprintf are not utf-8 safe

Lubomir I. Ivanov neolit123 at gmail.com
Wed Oct 17 11:21:41 PDT 2012


when we found some issues related to truncation of utf-8
strings, i suspected that fixed buffer lengths, like in
info.c:show_dive_info(), combined with snprintf may be problematic.
apparently this is true for longer strings with unicode chars.
another example is divelist.c:date_data_func().

each "xx" pair is a two-byte utf-8 char, while a single "x" is a single-byte (ascii) char.

"FDDABBCDDEE"
if the truncate position is the second B, for example, we split
the BB char in half, resulting in an invalid string.
if it's at the first B or at C we are good.
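
a minimal sketch of the case above, assuming the "xx" pairs are two-byte
sequences such as "é" or "ü". the helpers utf8_seq_len(), utf8_is_cut() and
cut_after_snprintf() are hypothetical names for illustration, not existing
subsurface or glib functions, and the check only looks for a sequence that is
cut short at the end of the string; it is not a full utf-8 validator.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Expected byte length of a UTF-8 sequence, judged by its lead byte.
 * Returns 0 for a byte that cannot start a sequence. */
static int utf8_seq_len(unsigned char lead)
{
    if (lead < 0x80)
        return 1;
    if ((lead & 0xE0) == 0xC0)
        return 2;
    if ((lead & 0xF0) == 0xE0)
        return 3;
    if ((lead & 0xF8) == 0xF0)
        return 4;
    return 0;
}

/* Returns 1 if s ends in the middle of a multi-byte sequence, else 0.
 * Other kinds of malformed input are not detected here. */
static int utf8_is_cut(const char *s)
{
    size_t len = strlen(s), i = 0;

    while (i < len) {
        int n = utf8_seq_len((unsigned char)s[i]);
        if (n == 0)
            return 0;
        if (i + (size_t)n > len)
            return 1;
        i += (size_t)n;
    }
    return 0;
}

/* Copy src into a fixed buffer of `size` bytes with snprintf and report
 * whether the copy was cut in the middle of a character. */
static int cut_after_snprintf(const char *src, size_t size)
{
    char buf[64];

    assert(size > 0 && size <= sizeof(buf));
    snprintf(buf, size, "%s", src);
    return utf8_is_cut(buf);
}
```

with "F\xc3\xa9" "A\xc3\xbc" "C\xc3\xa9\xc3\xb6" standing in for "FDDABBCDDEE",
a buffer size of 6 keeps 5 bytes and cuts the BB char, while sizes 5 and 7 end
on a char boundary.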

another thing that can happen, which is (somewhat) worse if the string
is last in the literal pool, is truncating before the \0 char, so that
printing will run on into whatever follows in memory.
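
a small sketch of how truncation could be detected and the terminator forced.
it relies on two facts about the C library rather than on anything in
subsurface: a C99-conforming snprintf always writes the \0 when the size is
non-zero and returns the length it would have needed, while some older
non-conforming variants (for example the msvcrt _snprintf) may leave the
buffer unterminated. copy_truncated() is a hypothetical helper name.

```c
#include <assert.h>
#include <stdio.h>

/* Copy src into buf (size bytes). Returns 1 if the text did not fit,
 * 0 if it was copied completely. The buffer is always terminated. */
static int copy_truncated(char *buf, size_t size, const char *src)
{
    int needed;

    assert(buf != NULL && size > 0);
    needed = snprintf(buf, size, "%s", src);
    /* Redundant with a C99 snprintf, but guards against implementations
     * that do not terminate the buffer on truncation. */
    buf[size - 1] = '\0';
    return needed < 0 || (size_t)needed >= size;
}
```

a return value of 1 tells the caller that the fixed buffer was too small, which
is the point where a larger allocation could be tried instead.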

perhaps the following may be the correct way to go here:
g_utf8_strlen()
g_utf8_strncpy()
malloc(sizeof(gunichar) * n)

malloc() / realloc() would be better than some sort of smart
truncation technique, i.e. checking the unichar at the truncate position.
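
to compare the two options, here is a rough sketch in plain C, not the patch
itself. utf8_safe_len() and format_alloc() are hypothetical helper names.
the first shows the "smart truncation" idea by stepping back to a char
boundary; the second sizes the buffer from the formatted length, assuming a
C99 vsnprintf that returns the needed length when given a NULL buffer. glib's
g_strdup_printf() offers the same kind of allocate-to-fit formatting.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Option 1, smart truncation: largest length <= max_bytes that does not
 * split a multi-byte UTF-8 sequence. Assumes s is valid UTF-8. */
static size_t utf8_safe_len(const char *s, size_t max_bytes)
{
    size_t len = strlen(s);

    if (len <= max_bytes)
        return len;
    len = max_bytes;
    /* If the first dropped byte is a continuation byte (10xxxxxx),
     * the cut is inside a char: step back to its lead byte. */
    while (len > 0 && ((unsigned char)s[len] & 0xC0) == 0x80)
        len--;
    return len;
}

/* Option 2, dynamic allocation: format into a buffer that is exactly
 * large enough. Returns NULL on failure; the caller must free(). */
static char *format_alloc(const char *fmt, ...)
{
    va_list ap, ap2;
    int len;
    char *buf;

    assert(fmt != NULL);
    va_start(ap, fmt);
    va_copy(ap2, ap);
    len = vsnprintf(NULL, 0, fmt, ap);  /* measure only */
    va_end(ap);
    if (len < 0) {
        va_end(ap2);
        return NULL;
    }
    buf = malloc((size_t)len + 1);
    if (buf)
        vsnprintf(buf, (size_t)len + 1, fmt, ap2);
    va_end(ap2);
    return buf;
}
```

the second option never truncates, so no char can be split, at the cost of a
free() at every call site.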
i will try to send a patch later on.

lubomir
--

