[PATCH 2/2] Support UTF8 encoded tags

Maximilian Güntner maximilian.guentner at gmail.com
Thu Nov 14 14:20:56 UTC 2013


2013/11/14 Thiago Macieira <thiago at macieira.org>:
> Details below.
>
> I'm not sure this is correct. The changes that are trying to fix the bug
> *should* have had no effect.
>
> On quinta-feira, 14 de novembro de 2013 19:47:30, Maximilian Güntner wrote:
>> +static void utf8_string(char *buffer, void *_res)
>> +{
>> +     int size;
>> +     char *res;
>> +     while (isspace(*buffer))
>> +             buffer++;
>> +     size = strlen(buffer);
>> +     while (size && isspace(buffer[size-1]))
>> +             size--;
>> +     if (!size)
>> +             return;
>
> Note that there's a possible bug here. I know you've just moved the code
> around, so this is something for the future.
>
> According to the code above, if utf8_string is called on a string containing
> only spaces, the first loop will advance buffer to the end, so strlen(buffer) ==
> 0. The second loop will never execute because of that, and then we return
> without setting _res. That means the caller could see garbage.

I see.
Plus, we don't really need utf8_string() here since the CSV parser
already strips all leading
and following spaces. I should have checked what the function actually
does before using it ;).

>> diff --git a/qt-ui/completionmodels.cpp b/qt-ui/completionmodels.cpp
>> index 31733ad..965ce87 100644
>> --- a/qt-ui/completionmodels.cpp
>> +++ b/qt-ui/completionmodels.cpp
>> @@ -43,7 +43,7 @@ void TagCompletionModel::updateModel()
>>       QStringList list;
>>       struct tag_entry *current_tag_entry = g_tag_list->next;
>>       while (current_tag_entry != NULL) {
>> -             list.append(QString(current_tag_entry->tag->name));
>> +             list.append(QString::fromUtf8(current_tag_entry->tag->name));
>
> This should not be necessary since we set the codec for C strings to UTF-8 in
> main(). However, this doesn't hurt.
>
>> -                     notesBackup[mydive].tags = QString(buf);
>> +                     notesBackup[mydive].tags = QString::fromUtf8(buf);
>
> Same thing as before: it shouldn't be necessary.
>>               taglist_get_tagstring(d->tag_list, buf, 1024);
>> -             ui.tagWidget->setText(QString(buf));
>> +             ui.tagWidget->setText(QString::fromUtf8(buf));
>
> Ditto.
>
>> -                     taglist_add_tag(mydive->tag_list, tag.toAscii().data());
>> +                     taglist_add_tag(mydive->tag_list, tag.toUtf8().data());
>
> Ditto for the reverse: toAscii() is documented to be a misnomer and it
> actually encodes using the codec set for the C strings. In this case, it's
> UTF-8.

Okay. The *real* fix for #230 is the previous patch (1/2), which
changes tr() to trUtf8().
I just used fromUtf8()/toUtf8() to make sure that we *really* use UTF-8.
So you are right, all these changes are pretty redundant.

>> +     completers.tags->setCaseSensitivity(Qt::CaseInsensitive);
>
> Unrelated change? At least, this is not explained.

You got me. I noticed it during testing, fixed it and forgot about it :)

>>                       if (tmp->tag->source != NULL)
>> -                             fprintf(f, "%s", tmp->tag->source);
>> +                             quote(f, tmp->tag->source, 0);
>>                       else
>> -                             fprintf(f, "%s", tmp->tag->name);
>> +                             quote(f, tmp->tag->name, 0);
>
> Unrelated change?

You got me again.

@ Dirk:

I will send a v2 of this patch series that reverts all unnecessary
toUtf8() / fromUtf8() changes
and adds two separate patches for the unrelated changes.

Thanks,

Maximilian


More information about the subsurface mailing list