Bug 438076

Summary: mv creates race condition by calling unlink() before rename()
Product: [Fedora] Fedora Reporter: James Ralston <ralston>
Component: coreutilsAssignee: Ondrej Vasik <ovasik>
Status: CLOSED RAWHIDE QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: low Docs Contact:
Priority: high    
Version: rawhideCC: meyering, twaugh
Target Milestone: ---   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2008-03-25 17:25:46 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description James Ralston 2008-03-18 21:38:32 UTC
In shell programming, if one needs to replace a file atomically (such that there
is no point at which the file doesn't exist), but keep a backup of the old file,
this is the standard construct to use:

# Create example files.
$ echo foo >foo
$ echo bar >bar

# Replace the contents of "foo" with the contents of "bar", atomically.
$ rm -f foo.backup   # remove old backup file, if present
$ ln foo foo.backup  # link() is atomic
$ mv bar foo         # mv uses rename(), which is atomic

The reason why this construct works is because mv uses rename() to move files
that lie on the same filesystem, and rename() is guaranteed to atomically
replace the target file, without affecting any other hard links to the file.

However, since at least coreutils-6.9-16.fc8, the mv command will first unlink
the target if the target has multiple hard links.  Here's strace output from the
mv command above:

$ strace mv bar foo
stat("foo", {st_mode=S_IFREG|0600, st_size=4, ...}) = 0
lstat("bar", {st_mode=S_IFREG|0600, st_size=4, ...}) = 0
lstat("foo", {st_mode=S_IFREG|0600, st_size=4, ...}) = 0
stat("foo", {st_mode=S_IFREG|0600, st_size=4, ...}) = 0
geteuid()                               = 4524
getegid()                               = 4524
getuid()                                = 4524
getgid()                                = 4524
access("foo", W_OK)                     = 0
unlink("foo")                           = 0
rename("bar", "foo")                    = 0

By first unlinking the target file, the mv command has created a race condition
where a process attempting to access "foo" could find it missing, thereby
defeating the guaranteed atomicity of rename().

Simply put: the mv command MUST NOT do this!

Although the coreutils mv command does contain a --backup option, this isn't an
acceptable work-around for three reasons.  First, the --backup behavior isn't
atomic, either:

$ echo foo >foo
$ echo bar >bar
$ strace mv --backup bar foo
[...]
rename("foo", "foo~")                   = 0
rename("bar", "foo")                    = 0

To be atomic, the first rename must instead be:

link("foo", "foo~")                     = 0

Second, even --backup worked, portable shell code can't assume that the version
of "mv" available supports it.  It's far simpler to use ln/mv in all cases than
to specifically test for --backup support and use it only if it's present.

And third, there's a huge installed base of shell scripts that already rely on
the ln/mv construct to be atomic.  It is unacceptable for mv to break this
assumption.

The culprit is this code in copy_internal() in src/copy.c (around line 1343):

else if (! S_ISDIR (dst_sb.st_mode)
         && (x->unlink_dest_before_opening
             || (x->preserve_links && 1 < dst_sb.st_nlink)
             || (!x->move_mode
                 && x->dereference == DEREF_NEVER
                 && S_ISLNK (src_sb.st_mode))
             ))
  {
    if (unlink (dst_name) != 0 && errno != ENOENT)
      {
        error (0, errno, _("cannot remove %s"), quote (dst_name));
        return false;
      }
    new_dst = true;
    if (x->verbose)
      printf (_("removed %s\n"), quote (dst_name));
  }

Specifically, the (x->preserve_links && 1 < dst_sb.st_nlink) test is what
triggers the unlink.  Removing that test should prevent the unlink() from being
called, but I'm not certain if that would break the assumptions of other
functions that might call copy_internal().  (For a simple utility, mv's logic is
deceptively complex; I'm not sure I've completely understood all of the link
handling.)

But regardless, however it's implemented in the code, mv MUST NOT first unlink
the target of a simple file-to-file rename(), regardless of whether the target
has multiple hard links.

Comment 1 Jim Meyering 2008-03-19 12:11:49 UTC
Thank you for the bug report.
This is a bug upstream, too, and I'm fixing it there now.
Will keep you informed.

Comment 2 Jim Meyering 2008-03-19 12:47:39 UTC
Thanks again.  I've posted a patch here:
http://thread.gmane.org/gmane.comp.gnu.coreutils.bugs/12985

Comment 3 Ondrej Vasik 2008-03-25 17:25:46 UTC
Thanks both for report and for the patch. Built as coreutils-6.10-14.fc9 ,
closing RAWHIDE

Comment 4 James Ralston 2008-03-25 17:36:21 UTC
Thanks for fixing this so quickly, guys.