Create backup file when writing new xml file?

Lubomir I. Ivanov neolit123 at gmail.com
Mon Feb 17 05:38:57 UTC 2014


On 17 February 2014 05:52, Thiago Macieira <thiago at macieira.org> wrote:
> On Sun 16 Feb 2014, at 17:35:57, Thiago Macieira wrote:
>> Uh... I realise that the question might be about string comparisons. Since
>> we're talking about filenames here, it actually raises the question of what
>> the filesystem driver on OS X and on Windows consider to be the same. Do
>> they restrict to ASCII? Do they apply to the full Unicode range? How about
>> the Turkish dotless i?
>>
>> Sorry, I don't have a ready answer for you. I hate case-insensitive
>> filenames.

on Windows, the "dotless i" from Latin Extended-A in particular is a
weird one: even though case-insensitive names (i.e. the Win32 namespace
and not the also supported case-sensitive POSIX namespace in the
CreateFile API) are the "default mode", you can still create two files
in the same folder, one with a lower case "dotless i" and one with an
upper case "dotless i".

on a quick test no other characters follow this exception, but we can
safely assume that the toUpperCase() logic (which is the way the Win32
namespace handles case-insensitive names) doesn't always work.

what i really don't understand is why we need file name comparison for
a backup save-file like feature.
i think this opens a can of worms on Win32.

>>
>> Instead, I'd recommend that save_backup not take as a parameter the
>> recommended extension. Simply remove the part after the ending dot, whatever
>> it might be, and replace it with .bak.
>

or we can keep the extension in the filename and add a suffix:

my_file.ext.bak
my_file.ext~
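
a minimal sketch of the suffix idea in C. backup_name() is a made-up
helper, not something that exists in Subsurface. since nothing is
stripped, there is no extension parsing or file name comparison
involved, and my_file.xml and my_file.txt can't end up competing for
the same my_file.bak:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* hypothetical helper (not part of Subsurface): builds "<path><suffix>"
 * in `out`. returns 0 on success, -1 if the result does not fit. */
static int backup_name(const char *path, const char *suffix,
                       char *out, size_t outlen)
{
    assert(path && suffix && out);

    int n = snprintf(out, outlen, "%s%s", path, suffix);

    /* snprintf reports the length it wanted; anything >= outlen was cut */
    return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}
```

calling it with ".bak" or "~" as the suffix gives the two names above.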

> Musings on saving files with backups, as a possible feature for QSaveFile:
>
> The simplest operation is to write to a temp file, then rename the original to
> the backup name, then rename the temp to the original name. However, what
> happens if the application crashes between renames? You've got a temp and a
> backup, but no file.
>

i don't think there is much variation to this approach specifically on Windows.
a crash during a rename is a pretty bad thing and should not happen.
it's pretty much worse than a malloc() error, which we don't handle
*that well* in Subsurface.
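
for reference, the temp + two renames sequence from the quoted text
looks roughly like this with POSIX calls. save_with_backup() and its
parameters are invented for illustration, and it assumes POSIX rename()
semantics (the MSVCRT rename() fails when the target already exists, so
this is not portable as-is):

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <unistd.h>

/* hypothetical helper, not an existing Subsurface or Qt function.
 * `tmp` and `backup` should live in the same directory as `path`. */
static int save_with_backup(const char *path, const char *backup,
                            const char *tmp, const char *data)
{
    assert(path && backup && tmp && data);

    FILE *f = fopen(tmp, "w");
    if (!f)
        return -1;

    /* get the new contents onto disk before touching the original */
    int ok = fputs(data, f) != EOF && fflush(f) == 0 &&
             fsync(fileno(f)) == 0;
    if (fclose(f) != 0)
        ok = 0;
    if (!ok) {
        remove(tmp);
        return -1;
    }

    /* rename 1: original -> backup (ENOENT just means first save) */
    if (rename(path, backup) != 0 && errno != ENOENT) {
        remove(tmp);
        return -1;
    }

    /* a crash right here leaves `tmp` and `backup`, but no `path` */

    /* rename 2: temp -> original */
    return rename(tmp, path) == 0 ? 0 : -1;
}
```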

as a working example, a certain Windows software keeps a copy of a
recent save named "backup_filename.ext" at all times in the
same folder, unless the user disables that. both the current and the
"backup_filename.ext" files are locked while the user edits, and
the backup file is updated occasionally, when the user hits "Save".
(to note here, there is also a Win32 API that, if you are
Administrator, lets you tap into a process and make it unlock a specific
file, but that opens a whole new can of worms)

that same software then supports "Open Backup File", but you can just
rename your file back to "filename.ext", without the "backup_" prefix.

> So, my next attempt is to hardlink the original to the backup name, then
> rename the temp file to the original name. Since link(2) doesn't overwrite like
> rename(2) does, I'd need to unlink(2) the backup name before. If the
> application crashes right after the unlink, the backup is gone but the
> original is still there.
>
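
a rough sketch of that unlink + link + rename order, POSIX only.
commit_with_link_backup() is a made-up name and it assumes the temp
file has already been fully written. note that a second process
creating the backup name between the unlink() and the link() makes
link() fail with EEXIST, which this sketch simply reports as an error:

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <unistd.h>

/* hypothetical helper: `tmp` already holds the complete new contents */
static int commit_with_link_backup(const char *path, const char *backup,
                                   const char *tmp)
{
    assert(path && backup && tmp);

    /* link() refuses to overwrite, so the old backup has to go first */
    if (unlink(backup) != 0 && errno != ENOENT)
        return -1;

    /* a crash here loses the old backup, but `path` is untouched */

    /* give the current file a second name;
     * ENOENT means there is no original yet (first save) */
    if (link(path, backup) != 0 && errno != ENOENT)
        return -1;

    /* rename() replaces `path` in one step; `backup` keeps the old data */
    return rename(tmp, path) == 0 ? 0 : -1;
}
```
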
> What happens if there are two processes trying to save to the same file? Should
> they use a lock file or flock(2) or something? If they don't lock, both will
> write to separate temp files. The best case scenario is that they operate in
> order: delete, link, rename. If they do the delete+link in order, it also
> works. But suppose both do unlink(2) and then both proceed to link(2): one
> succeeds and the other fails with -EEXIST. How should this second one react?
>
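
if locking is the answer, one possible shape with flock(2) is below.
this is only a sketch: the function names and the idea of a separate
lock file next to the target are assumptions, and flock() is advisory,
so it only helps if every writer takes the same lock before doing the
delete/link/rename steps:

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/file.h>
#include <unistd.h>

/* hypothetical helpers; `lockpath` is a made-up sidecar file name,
 * e.g. "my_file.ext.lock". returns a descriptor to pass to
 * save_lock_release(), or -1 on error. */
static int save_lock_acquire(const char *lockpath)
{
    assert(lockpath);

    int fd = open(lockpath, O_CREAT | O_RDWR, 0600);
    if (fd < 0)
        return -1;

    /* blocks until no other open of the lock file holds the lock */
    if (flock(fd, LOCK_EX) != 0) {
        close(fd);
        return -1;
    }
    return fd;
}

static void save_lock_release(int fd)
{
    /* closing the last descriptor drops the lock as well;
     * the explicit unlock is just to make the intent visible */
    flock(fd, LOCK_UN);
    close(fd);
}
```

the lock is also dropped by the kernel if the process dies, so a crash
does not leave a stale lock behind, only a stale (empty) lock file.
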
> Another question is what happens if the application crashes while it has the
> temp file open? It will leave the temp file on disk, most likely in the target
> directory, so it's unlikely to get collected by a tmpwatcher. We could use
> O_TMPFILE to avoid that issue, or delete the file soon after creating it, but
> then we can't materialise it later.
>
> linkat(2) with AT_EMPTY_PATH has a warning that says it will only work if the
> process has CAP_DAC_READ_SEARCH, which user processes don't have.
> Alternatively, Lucas found a workaround in [1] that uses AT_SYMLINK_FOLLOW and
> uses /proc/self/fd/%d to locate the original "(deleted)" name and the path,
> but I'm not sure whether this should be allowed in light of the
> CAP_DAC_READ_SEARCH requirement of AT_EMPTY_PATH.
>
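
as far as i can tell the /proc/self/fd route would look something like
the sketch below. it is Linux-only, needs a kernel and filesystem with
O_TMPFILE support plus a mounted /proc, and save_via_tmpfile() is a
name made up for this example. linkat() does not overwrite, so this
only covers giving the unnamed file a name that does not exist yet:

```c
#define _GNU_SOURCE /* for O_TMPFILE */
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* hypothetical helper: writes `data` into an unnamed file inside `dir`
 * and then links it in as `path`. a crash before the linkat() leaves
 * nothing behind, since the file never had a name. */
static int save_via_tmpfile(const char *dir, const char *path,
                            const char *data)
{
    assert(dir && path && data);

    int fd = open(dir, O_TMPFILE | O_WRONLY, 0600);
    if (fd < 0)
        return -1; /* e.g. no O_TMPFILE support on this filesystem */

    size_t len = strlen(data);
    if (write(fd, data, len) != (ssize_t)len || fsync(fd) != 0) {
        close(fd);
        return -1;
    }

    /* /proc/self/fd/N is a symlink to the open file; following it
     * lets linkat() attach a name without AT_EMPTY_PATH */
    char procpath[64];
    snprintf(procpath, sizeof procpath, "/proc/self/fd/%d", fd);
    int rc = linkat(AT_FDCWD, procpath, AT_FDCWD, path, AT_SYMLINK_FOLLOW);

    close(fd);
    return rc == 0 ? 0 : -1;
}
```

to replace an existing file one would presumably link to a temporary
name first and rename() that over the original, which brings the
leftover temp file problem back for the short window between the two
calls.
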
> And by the way, what happens if the target directory gets renamed? Do we need
> to keep a file descriptor open to the directory? Can't we use the file
> descriptor for the file in linkat(2)'s newdirfd?
>
> [1] https://plus.google.com/113877326530782588467/posts/SxQBDV6scHj
>

can't comment much on the QSaveFile ideas, but MinGW / MSVCRT doesn't have:
- link()
(it does have unlink(), though, which is deprecated in favor of _unlink() and _wunlink())
- linkat()
- flock()
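as these are POSIX. technically NTFS does support hard links
(CreateHardLink()) and, since Vista, symlinks (CreateSymbolicLink()),
but there is no Explorer UI for them out of the box; that requires
something called Link Shell Extension, which i haven't tried.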

lubomir
--

