[PATCH 2/8] Files: add wrappers for open(), fopen(), sqlite3_open()

Thiago Macieira thiago at macieira.org
Wed Dec 18 15:36:56 UTC 2013


On Thursday, 19 December 2013 01:26:34, Lubomir I. Ivanov wrote:
> should i replace it to be?:
> int sz = strlen(utf) * 3;

Just sz = strlen(utf8) + 1;

In the worst case, conversion from UTF-8 to UTF-16 results in the same number 
of code units, or double the number of bytes. That's actually the US-ASCII 
case: each byte becomes one 16-bit word. For everything else, UTF-16 needs 
fewer code units than UTF-8 needs bytes.

You multiply by 3 for the opposite direction, UTF-16 to UTF-8, where in the 
worst case a single 16-bit word becomes three bytes.
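
The sizing rules above can be written down as two small helpers. This is 
only a sketch: the function names are made up for illustration and are not 
part of the patch, and both results include room for the terminator.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Worst-case number of 16-bit code units (terminator included) needed to
 * hold the UTF-16 form of a NUL-terminated UTF-8 string. The worst case is
 * pure US-ASCII, where every byte becomes exactly one code unit.
 * Hypothetical name, for illustration only.
 */
size_t utf16_units_for_utf8(const char *utf8)
{
	assert(utf8 != NULL);
	return strlen(utf8) + 1;
}

/*
 * Worst-case number of bytes (terminator included) needed to hold the
 * UTF-8 form of a UTF-16 string that is 'units' code units long. One code
 * unit expands to at most three bytes; a surrogate pair is two code units
 * and becomes four bytes, which is still within the 3x bound.
 * Hypothetical name, for illustration only.
 */
size_t utf8_bytes_for_utf16(size_t units)
{
	return units * 3 + 1;
}
```

The element count from the first helper still has to be multiplied by 
sizeof(wchar_t) if it is passed to malloc().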

> 
> >> +FILE *subsurface_fopen(const char *path, const char *mode)
> >> +{
> >> +     FILE *ret = NULL;
> >> +     wchar_t *wpath = utf8_to_utf16(path);
> >> +     if (wpath) {
> >> +             wchar_t *wmode = utf8_to_utf16(mode);
> > 
> > This one is going to be expensive... the mode is actually US-ASCII, so a
> > simpler algorithm would be cheaper, but then we have to write it, maintain
> > it, etc. This is fine.
> 
> yeah, but it still requires a wchar buffer at least in MSVCRT:
> http://msdn.microsoft.com/en-us/library/yeby3zcb.aspx
> so i left it like that.

Yup. My point is that you could do an ASCII-to-UTF16 conversion instead:

	size_t i, len = strlen(mode);
	wchar_t wmode[len + 1];
	for (i = 0; i <= len; ++i)	/* <= so the terminating NUL is copied too */
		wmode[i] = mode[i];
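
The same idea could also be packaged as a standalone helper. What follows is 
a sketch, not code from the patch: ascii_to_wchar() is a hypothetical name, 
and it assumes the caller supplies a fixed-size buffer, which avoids the 
variable-length array. It also refuses input that is not actually US-ASCII.

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>

/*
 * Hypothetical helper: widen a NUL-terminated US-ASCII string into dst,
 * which has room for 'cap' wchar_t elements. Returns 0 on success, or -1
 * if the input contains a byte above 0x7F or does not fit with its
 * terminator.
 */
int ascii_to_wchar(const char *src, wchar_t *dst, size_t cap)
{
	size_t i;

	assert(src != NULL && dst != NULL);
	for (i = 0; i < cap; ++i) {
		unsigned char c = (unsigned char)src[i];

		if (c > 0x7F)
			return -1;	/* not US-ASCII, needs a real UTF-8 conversion */
		dst[i] = (wchar_t)c;
		if (c == '\0')
			return 0;	/* terminator copied, done */
	}
	return -1;			/* ran out of room before the terminator */
}
```

A caller would declare something like wchar_t wmode[8], since fopen() mode 
strings are short, and fall back to the full UTF-8 conversion if the helper 
returns -1.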

-- 
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
   Software Architect - Intel Open Source Technology Center
      PGP/GPG: 0x6EF45358; fingerprint:
      E067 918B B660 DBD1 105C  966C 33F5 F005 6EF4 5358