Something in the new coreutils appears to break some commands, such as touch, and this is causing many rawhide build failures. Here's just one: http://koji.fedoraproject.org/koji/getfile?taskID=644607&name=build.log See also the thread on fedora-devel-list: http://www.redhat.com/archives/fedora-devel-list/2008-June/thread.html#00153 At least five other packages have been affected by this. (Also reported in fedora-infrastructure: https://fedorahosted.org/fedora-infrastructure/ticket/593 )
Version: coreutils-6.12-1.fc10
As I said on the devel list, the problem is likely caused by the combination of the old RHEL5 xen kernel in koji and gl_futimens(). I'm not able to reproduce it on my machine and am still trying to get an strace of the failure. It would be really helpful to have one. It seems to be some kind of race condition: f-spot, which was reported as failing, passed in a scratch build with coreutils-6.12-1.fc10 without trouble ( http://koji.fedoraproject.org/koji/taskinfo?taskID=644691 ).
Another failing build, LabPlot: http://koji.fedoraproject.org/koji/getfile?taskID=645053&name=build.log Don't know how to get a trace on a "real" build on koji itself, probably needs koji admins.
You can get an strace easily by adding strace in front of the failing touch/cp command (and adding BuildRequires: strace). The strace output will be available in build.log, as it is written to stderr. I tried to do that in a scratch build, but the issue didn't show up in the three scratch builds I have tried so far.
Created attachment 308346 [details] Strace of the failure So far most of the failures have been on the xenbuilder2 machine.
Hi, as I just replied on fedora-devel, this looks like the same problem reported in this thread: http://thread.gmane.org/gmane.comp.gnu.coreutils.bugs/13684 Eric Blake fixed that with this change to gnulib's utimens.c: http://git.savannah.gnu.org/gitweb/?p=gnulib.git;a=commitdiff;h=93f08406537 This is probably the result of configuring/building coreutils on a new kernel and running the resulting binaries on an old kernel; but with the gnulib fix, the tools work around that particular abuse at run-time.
Hi Jim, thanks for joining the bugzilla and suggesting a fix, but Eric Blake's patch does not fix the issue with koji builds. I spotted that patch before I built coreutils-6.12-1.fc10, so it is already applied in Fedora. As you can see in the strace, the koji xen kernel does have the utimensat() call, so no fallback is necessary. However, that call seems to be broken/buggy: it returns 280 instead of 0, and therefore the gl_futimens() function in cp/touch/mv/install fails.
Ahh... thanks for that strace. And you're right that it looks like a kernel problem. I looked at the kernel code in fs/utimes.c's do_utimes function. There are several places where the returned variable, "error", is set to non-literal values:
    error = __user_walk_fd(dfd, filename, (flags & AT_SYMLINK_NOFOLLOW) ? 0 : LOOKUP_FOLLOW, &nd);
    error = vfs_permission(&nd, MAY_WRITE);
    error = notify_change(dentry, &newattrs);
gnulib *could* provide a utimensat wrapper that detects this bogus return value and maps it to 0, but the minimal-impact (configure-time run-test) approach would work only if configured/built on a losing system. The alternative is to make every system pay the price of the extra comparison. Ugly, but probably the only useful work-around. I'll attach a patch.
Created attachment 308374 [details] possible work-around This makes every utimensat-using application incur the cost (albeit small) of detecting and working around the kernel bug. Not pretty, but maybe what we need. Untested.
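The idea behind this work-around can be sketched roughly as follows. This is only an illustration, not the attached patch: the names normalize_utimensat_result and utimensat_checked are hypothetical, and the sketch assumes the bogus return value is always positive (280 in the strace above).

```c
#ifndef _GNU_SOURCE
# define _GNU_SOURCE
#endif
#include <fcntl.h>      /* AT_FDCWD, AT_SYMLINK_NOFOLLOW */
#include <sys/stat.h>   /* utimensat */
#include <time.h>       /* struct timespec */

/* Hypothetical helper: utimensat is documented to return only 0 or -1,
   so map any positive value (such as the 280 seen on the koji xen
   kernel) to 0.  Negative values and 0 pass through unchanged.  */
static int
normalize_utimensat_result (int result)
{
  return result > 0 ? 0 : result;
}

/* Hypothetical wrapper: every caller pays for one extra comparison,
   whether or not the running kernel has the bug.  */
static int
utimensat_checked (int dirfd, const char *path,
                   const struct timespec ts[2], int flags)
{
  return normalize_utimensat_result (utimensat (dirfd, path, ts, flags));
}
```

Note that this sketch takes the positive return value at face value as success; it does not verify that the kernel actually updated the timestamps.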
Thanks for the patch, Jim. That is exactly what I meant by an easy workaround for Fedora on fedora-devel-list. I will use it at least until the koji kernel issue (#442352) is addressed properly.
Ondrej, coreutils-6.12-2.fc10 brought back the same build failures that we've had with 6.12-1. I've asked Jeremy to untag it, to allow builds to succeed.
Matthias, do you at least have an strace of the failure with 6.12-2.fc10? There is now no build log of the new failure and no strace of it, and without any tag I doubt I will be able to find out why it is still failing, since the return value 280 from the futimens call should now be changed to 0. I have no chance of reproducing it outside koji, as the build passes on the rawhide and FC-6 kernels without trouble. Is there any koji dist-tag which would allow me to get an strace with rawhide/dist-f10 packages and coreutils-6.12-2.fc10?
OK, I got the necessary information from strace tests in scratch builds. coreutils-6.12-3.fc10 should fix the problem properly.
Created attachment 308560 [details] Better workaround with testcase The old workaround patch is not correct: it fails to preserve timestamps and does not cover all failure cases. This one contains a testcase and falls back to other system calls. Works OK so far...
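A fallback of this kind might look roughly like the sketch below. It is not the attached patch: the names timespec_to_timeval and set_file_times are hypothetical, and two things are assumptions made for illustration, namely treating a positive return value as "call not supported" (ENOSYS) and using utimes() as the fallback call. It also assumes ts is non-NULL and does not contain UTIME_NOW or UTIME_OMIT.

```c
#ifndef _GNU_SOURCE
# define _GNU_SOURCE
#endif
#include <errno.h>
#include <fcntl.h>      /* AT_FDCWD */
#include <sys/stat.h>   /* utimensat */
#include <sys/time.h>   /* utimes, struct timeval */
#include <time.h>       /* struct timespec */

/* Hypothetical helper: drop from nanosecond to microsecond resolution
   for the older utimes() interface.  */
static struct timeval
timespec_to_timeval (struct timespec ts)
{
  struct timeval tv;
  tv.tv_sec = ts.tv_sec;
  tv.tv_usec = ts.tv_nsec / 1000;
  return tv;
}

/* Hypothetical wrapper: try utimensat first; if it reports ENOSYS or
   returns a bogus positive value, set the times again with utimes()
   instead of trusting the first call.  */
static int
set_file_times (const char *path, const struct timespec ts[2])
{
  int result = utimensat (AT_FDCWD, path, ts, 0);
  if (result == 0)
    return 0;
  if (result > 0)
    errno = ENOSYS;     /* assumption: treat the bogus value as failure */
  if (errno != ENOSYS)
    return -1;          /* a genuine error such as ENOENT or EACCES */

  struct timeval tv[2];
  tv[0] = timespec_to_timeval (ts[0]);
  tv[1] = timespec_to_timeval (ts[1]);
  return utimes (path, tv);
}
```

The difference from simply mapping the positive value to 0 is that the timestamps are always set by a call that reported success, at the cost of sub-microsecond precision on the fallback path.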
Seems to have worked properly for a few days, closing as RAWHIDE.