Create backup file when writing new xml file?

Linus Torvalds torvalds at linux-foundation.org
Mon Feb 17 10:22:19 UTC 2014


On Sun, Feb 16, 2014 at 7:52 PM, Thiago Macieira <thiago at macieira.org> wrote:
>
> So, my next attempt is to hardlink the original to the backup name, then
> rename the temp file to the original name. Since link(2) doesn't overwrite like
> rename(2) does, I'd need to unlink(2) the backup name before. If the
> application crashes right after the unlink, the backup is gone but the
> original is still there.

Yup, this is the standard way of doing backups. Technically you should
also make sure to "fsync()" the new file before you rename it from the
temporary name over the final name.
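
A minimal sketch of that sequence in Python, assuming a POSIX file system where rename() replaces the target atomically. The function name save_with_backup and the ".bak" suffix are made up for illustration; they are not from subsurface or Qt.

```python
import os
import tempfile


def save_with_backup(path, data, backup_suffix=".bak"):
    """Replace `path` with `data`, keeping the previous version as a backup."""
    directory = os.path.dirname(os.path.abspath(path))
    backup = path + backup_suffix

    # Write the new contents to a temporary file in the target directory,
    # so the final rename stays within one file system.
    fd, tmp = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())  # get the data on disk before the rename

        if os.path.exists(path):
            # link() does not overwrite, so remove any old backup first.
            try:
                os.unlink(backup)
            except FileNotFoundError:
                pass
            os.link(path, backup)  # backup name now points at the old file

        os.rename(tmp, path)  # replaces the original in one step
    except BaseException:
        # Best-effort cleanup; a hard crash can still leave the temp file.
        try:
            os.unlink(tmp)
        except FileNotFoundError:
            pass
        raise
```

If the process dies between the unlink and the link, the backup is gone but the original is untouched, which matches the window described in the quoted text.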

> What happens if there are two processes trying to save to the same file?

For an application like subsurface, this doesn't matter, but I guess
if you want to make this a generic Qt5 helper function, you might have
to care.

You have two choices:

 - locking - either using flock or using a lock directory (and the
traditional and safe way really is to use a directory, because
"O_CREAT | O_EXCL" is not necessarily actually reliable on NFS)

 - never actually use "unlink()", instead use "link()" to a unique
name, and then you can unlink the original backup file. And then you
can unlink the "backup of the backup" that is under the unique name
*after* you've done everything else, so that you are never in the
situation that two different processes could have unlinked things to
the point where no backup exists.
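
Both choices can be sketched roughly as follows, in Python and assuming POSIX semantics. The names dir_lock and rotate_backup, the ".lock" suffix and the unique-name scheme are hypothetical, chosen only for this example.

```python
import os
import uuid
from contextlib import contextmanager


@contextmanager
def dir_lock(path):
    """Hold a lock by creating a directory; mkdir() fails if it exists."""
    lock = path + ".lock"
    os.mkdir(lock)  # raises FileExistsError if someone else holds the lock
    try:
        yield
    finally:
        os.rmdir(lock)


def rotate_backup(path, backup):
    """Point `backup` at the current `path`, never leaving zero backups."""
    saved = f"{backup}.{os.getpid()}.{uuid.uuid4().hex}"
    try:
        # Keep the old backup alive under a unique name first.
        os.link(backup, saved)
    except FileNotFoundError:
        saved = None  # no old backup, or another process just rotated it

    if saved is not None:
        try:
            os.unlink(backup)
        except FileNotFoundError:
            pass  # another process got there first

    try:
        os.link(path, backup)
    except FileExistsError:
        pass  # another process already installed a new backup

    if saved is not None:
        os.unlink(saved)  # only after everything else is done
```

The lock version serializes savers; the link version lets them race but keeps at least one copy of the old backup on disk until a new backup name exists.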

Eventually (and it looks like it is going to be 3.15), on Linux we'll
also have a new "rename2()" system call that takes an optional
RENAME_NOREPLACE flag (and RENAME_EXCHANGE), which makes things like
this easier.

But that will obviously be Linux-specific. Not that I know how well
the whole hardlink model works under Windows.

> Another question is what happens if the application crashes while it has the
> temp file open? It will leave the temp file on disk, most likely on the target
> directory, so it's unlikely to get collected by a tmpwatcher. We could use
> O_TMPFILE to avoid that issue or deleting the file soon after creating it, but
> then we can't materialise it later.

There's no sane way to guarantee you won't have stale temporary files.
You can minimize them with various tricks, but I don't think it's
really worth worrying about.

> linkat(2) with AT_EMPTY_PATH has a warning that says it will only work if the
> process has CAP_DAC_READ_SEARCH, which user processes don't have.

Yeah, we tried to allow the AT_EMPTY_PATH case so that you could just
do flink(), and then reverted it again because of the security
implications. The security implications of the /proc hack are actually
less, because the kind of people who care don't mount /proc in the
first place.

For now, I'd suggest against playing games. Yes, you'll get temporary
files if things crash. Try to be careful in the sequence that does
this, so that you don't have operations that might cause problems, and
don't call out to user code. Minimize the risk of crashes instead.
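
One way to read that advice, sketched in Python: run any caller-supplied code before the temporary file exists, so the part that can leave debris behind is only a few system calls. The name save_document and the serialize callback are assumptions made for this example.

```python
import os
import tempfile


def save_document(path, document, serialize):
    """Serialize first, then do a short write-and-rename sequence."""
    # Caller-supplied code runs here, before anything is created on disk.
    # If it raises, there is no temporary file to clean up.
    data = serialize(document)

    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        # From here on: only write, fsync and rename.
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())
        os.rename(tmp, path)
    except BaseException:
        # Best effort only; a crash or kill can still leave `tmp` around.
        try:
            os.unlink(tmp)
        except FileNotFoundError:
            pass
        raise
```

This does not remove the possibility of a stale temporary file, it only narrows the window in which one can appear.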

              Linus
