storing to git data files [was Re: Subsurface and Dropbox]

Linus Torvalds torvalds at linux-foundation.org
Wed May 27 14:54:52 PDT 2015


On Wed, May 27, 2015 at 1:56 PM, Dirk Hohndel <dirk at hohndel.org> wrote:
> Any utf8 in your path by any chance?

Argh. That's probably it.

We do everything by byte offsets in "is_git_repository()", but then we also do

        loc = format_string("%.*s", flen, filename);

and that in turn ends up using vsnprintf(). Which would work fine in
the C locale, but I wouldn't be surprised if it does everything wrong
with a non-C locale.

I pretty much hate the nasty localization of the standard C library
stdio routines. I've mentioned that before, and I guess I'll mention
it again. The C standards people got localization badly wrong. They
should have made localized output (and input) use a format modifier,
not changed the traditional C semantics of the stdio code. We've been
hit by the "." vs "," thing on FP output before.

(The standard got it right for printing out the "thousands separator":
you have to explicitly ask for it. But the decimal dot-vs-comma
localization happens whether you ask for it or not)

I'm actually pretty sure that "%.*s" is _supposed_ to always count
bytes according to ANSI: "If the precision is specified, no more than
that many bytes shall be written."

But at the same time, I would also not be AT ALL surprised if some
library got this wrong, and decided that precision is in "characters",
not bytes.

So I think the format_string() thing _should_ work correctly, but I
also think it's quite likely to show "features" of the C library.

Joakim: does it work for you if you just revert commit b770d0a6b71c
("git-access: use the new format_string helpers")?

                       Linus

