Bug 718714

Summary: INFO: Namespace unshare failed => build failure
Product: [Fedora] Fedora Reporter: Paul Howarth <paul>
Component: mockAssignee: Clark Williams <williams>
Status: CLOSED ERRATA QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: 15CC: mebrown, williams
Target Milestone: ---Keywords: Reopened
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: mock-1.0.19-1.el5 Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2011-08-16 21:01:51 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments:
Description Flags
fix to allow retries for namespace 'unshare(2)' failures none

Description Paul Howarth 2011-07-04 11:48:48 UTC
Description of problem:
Intermittent build failures with symptoms:

INFO: mock.py version 1.1.11 starting...
State Changed: init plugins
INFO: selinux enabled
State Changed: start
INFO: Namespace unshare failed.
ERROR: Cannot find/open srpm: /home/paul/cfo-build/perl-DateTime/trunk/perl-DateTime-0.7000-2.0.cf.fc15.src.rpm. Error: Unable to open file
WARNING: unable to delete selinux filesystems (/tmp/mock-selinux-plugin.0a3A9G): [Errno 2] No such file or directory: '/tmp/mock-selinux-plugin.0a3A9G'


Version-Release number of selected component (if applicable):
mock-1.1.11-1.fc15

How reproducible:
Maybe once every 25-50 builds.

Steps to Reproduce:
1. Can happen on any build.

Additional info:
This appears to be a retry-able error - running the same mock command immediately after the failed one invariably results in a successful build.

I never saw this with 1.1.10.

This is on a F-15 x86_64 builder with SELinux enforcing.

Comment 1 Paul Howarth 2011-07-21 09:30:39 UTC
Since rebooting my builder a few days ago, I haven't seen a single instance of this problem, so I suspect it was a kernel issue.

However, one change I'd suggest is that "Namespace unshare failed" become a fatal error as failure there inevitably results in a failed build.

Comment 2 Clark Williams 2011-07-21 14:14:08 UTC
Yeah, sounds like a good idea. I'll raise an exception when the unshare fails.

Comment 3 Paul Howarth 2011-07-21 14:30:51 UTC
(In reply to comment #1)
> Since rebooting my builder a few days ago, I haven't seen a single instance of
> this problem, so I suspect it was a kernel issue.

Hmm, spoke too soon. Just had another one.

Comment 4 Paul Howarth 2011-07-22 12:36:50 UTC
(In reply to comment #2)
> Yeah, sounds like a good idea. I'll raise an exception when the unshare fails.

Please make this exception cause mock to exit with a specific exit code that's different from other error cases - I know this one is retry-able and can get my builder script to retry the build if it detects the right exit code.

Comment 5 Fedora Update System 2011-07-26 21:28:41 UTC
mock-1.1.12-1.el6 has been submitted as an update for Fedora EPEL 6.
https://admin.fedoraproject.org/updates/mock-1.1.12-1.el6

Comment 6 Fedora Update System 2011-07-26 21:29:00 UTC
mock-1.1.12-1.fc14 has been submitted as an update for Fedora 14.
https://admin.fedoraproject.org/updates/mock-1.1.12-1.fc14

Comment 7 Fedora Update System 2011-07-26 21:29:18 UTC
mock-1.0.19-1.el5 has been submitted as an update for Fedora EPEL 5.
https://admin.fedoraproject.org/updates/mock-1.0.19-1.el5

Comment 8 Fedora Update System 2011-07-26 21:29:36 UTC
mock-1.1.12-1.fc15 has been submitted as an update for Fedora 15.
https://admin.fedoraproject.org/updates/mock-1.1.12-1.fc15

Comment 9 Paul Howarth 2011-07-27 10:04:04 UTC
With 1.1.12, it's not exiting when the namespace unshare failure happens but instead when a subsequent failure (which is probably a result of the unshare failure) happens. The exit status is 1, which is indistinguishable from other failure modes.

$ mock -r city-fan-fedora-10-i386 /home/paul/cfo-build/perl-version/trunk/perl-version-0.92-1.fc15.src.rpm
INFO: mock.py version 1.1.12 starting...
State Changed: init plugins
INFO: selinux enabled
State Changed: start
INFO: Namespace unshare failed.
ERROR: Cannot find/open srpm: /home/paul/cfo-build/perl-version/trunk/perl-version-0.92-1.fc15.src.rpm. Error: Unable to open file
WARNING: unable to delete selinux filesystems (/tmp/mock-selinux-plugin.6Q7Myj): [Errno 2] No such file or directory: '/tmp/mock-selinux-plugin.6Q7Myj'

Comment 10 Clark Williams 2011-07-27 12:53:08 UTC
ugh, I did all the hard work (creating an Exception, etc) and failed to actually exit in the except. Sigh...

Lemme whip you up a patch.

Comment 11 Clark Williams 2011-07-27 12:55:51 UTC
Please apply this set of changes to mock.py to see if it does what you want.

diff --git a/py/mock.py b/py/mock.py
index f2c9e55..268b727 100755
--- a/py/mock.py
+++ b/py/mock.py
@@ -691,8 +691,11 @@ def main(ret):
     # New namespace starting from here
     try:
         mock.util.unshare(mock.util.CLONE_NEWNS)
+    except UnshareFailed, e:
+        log.error("Namespace unshare failed.")
+        sys.exit(e.resultcode)
     except:
-        log.info("Namespace unshare failed.")
+        raise
 
     # set personality (ie. setarch)
     if config_opts['internal_setarch']:

Comment 12 Paul Howarth 2011-07-27 15:05:57 UTC
That didn't work:

$ mock -r city-fan-fedora-rawhide-i686 --disable-plugin=tmpfs /home/paul/cfo-build/perl-BerkeleyDB/trunk/perl-BerkeleyDB-0.48-2.fc15.src.rpm
INFO: mock.py version 1.1.12 starting...
State Changed: init plugins
INFO: selinux enabled
State Changed: start
ERROR: global name 'UnshareFailed' is not defined
Traceback (most recent call last):
  File "/usr/sbin/mock", line 491, in <module>
  File "/usr/sbin/mock", line 697, in main
NameError: global name 'UnshareFailed' is not defined
WARNING: unable to delete selinux filesystems (/tmp/mock-selinux-plugin.I2t01s): [Errno 2] No such file or directory: '/tmp/mock-selinux-plugin.I2t01s'

So on line 697 of /usr/sbin/mock I tried changing 'UnshareFailed' to 'mock.exception.UnshareFailed' and then I got this:

$ mock -r city-fan-redhat-9-i386 /home/paul/cfo-build/perl-NetAddr-IP/trunk/perl-NetAddr-IP-4.044-2.fc15.src.rpm
INFO: mock.py version 1.1.12 starting...
State Changed: init plugins
INFO: selinux enabled
State Changed: start
ERROR: global name 'UnshareFailed' is not defined
Traceback (most recent call last):
  File "/usr/sbin/mock", line 491, in <module>
  File "/usr/sbin/mock", line 696, in main
  File "<peak.util.decorators.rewrap wrapping mock.util.unshare at 0x01CA3AA0>", line 3, in unshare
    def unshare(flags): return __decorated(flags)
  File "/usr/lib/python2.7/site-packages/mock/trace_decorator.py", line 70, in trace
  File "/usr/lib/python2.7/site-packages/mock/util.py", line 186, in unshare
NameError: global name 'UnshareFailed' is not defined
WARNING: unable to delete selinux filesystems (/tmp/mock-selinux-plugin.sTh2LE): [Errno 2] No such file or directory: '/tmp/mock-selinux-plugin.sTh2LE'

So it looks like it needs slightly more tweaking yet.

Comment 13 Clark Williams 2011-07-27 15:12:51 UTC
Yeah, you could say it needs more tweaking :)

Try this one:

diff --git a/py/mock.py b/py/mock.py
index f2c9e55..f8a4d0f 100755
--- a/py/mock.py
+++ b/py/mock.py
@@ -691,8 +691,11 @@ def main(ret):
     # New namespace starting from here
     try:
         mock.util.unshare(mock.util.CLONE_NEWNS)
+    except mock.exception.UnshareFailed, e:
+        log.error("Namespace unshare failed.")
+        sys.exit(e.resultcode)
     except:
-        log.info("Namespace unshare failed.")
+        raise
 
     # set personality (ie. setarch)
     if config_opts['internal_setarch']:
diff --git a/py/mock/util.py b/py/mock/util.py
index 5fb0f18..33c5f44 100644
--- a/py/mock/util.py
+++ b/py/mock/util.py
@@ -184,7 +184,7 @@ def unshare(flags):
     try:
         res = _libc.unshare(flags)
         if res:
-            raise UnshareFailed(os.strerror(_errno.value))
+            raise mock.exception.UnshareFailed(os.strerror(_errno.value))
     except AttributeError, e:
         pass

Comment 14 Paul Howarth 2011-07-27 15:39:41 UTC
Yes, that's better:

$ mock -r city-fan-fedora-1-i386 /home/paul/cfo-build/perl-NetAddr-IP/trunk/perl-NetAddr-IP-4.044-2.fc15.src.rpm
INFO: mock.py version 1.1.12 starting...
State Changed: init plugins
INFO: selinux enabled
State Changed: start
ERROR: Namespace unshare failed.
WARNING: unable to delete selinux filesystems (/tmp/mock-selinux-plugin.VL2LSu): [Errno 2] No such file or directory: '/tmp/mock-selinux-plugin.VL2LSu'

This time it exited with the right exit code (undocumented by the way) and my build script was able to retry it straight away, with success. Now if only we could get rid of that warning from the mock-selinux-plugin...

Comment 15 Clark Williams 2011-07-27 17:43:32 UTC
See if this does the trick:

diff --git a/py/mock/plugins/selinux.py b/py/mock/plugins/selinux.py
index 0f5bad1..814919f 100644
--- a/py/mock/plugins/selinux.py
+++ b/py/mock/plugins/selinux.py
@@ -67,11 +67,12 @@ class SELinux(object):
 
     decorate(traceLog())
     def _selinuxAtExit(self):
-        try:
-            os.unlink(self.filesystems)
-        except OSError, e:
-            getLog().warning("unable to delete selinux filesystems (%s): %s" % 
-            pass
+        if os.path.exists(self.filesystems):
+            try:
+                os.unlink(self.filesystems)
+            except OSError, e:
+                getLog().warning("unable to delete selinux filesystems (%s): %s
+                pass
 
     decorate(traceLog())
     def _selinuxPreYumHook(self):

Comment 16 Paul Howarth 2011-07-27 18:36:23 UTC
That diff looks a bit mangled so I went with this:

--- /usr/lib/python2.7/site-packages/mock/plugins/selinux.py.orig	2011-07-26 20:30:18.000000000 +0100
+++ /usr/lib/python2.7/site-packages/mock/plugins/selinux.py	2011-07-27 19:08:54.403560565 +0100
@@ -50,7 +50,7 @@
             rootObj.addHook("preyum", self._selinuxPreYumHook)
             rootObj.addHook("postyum", self._selinuxPostYumHook)
         else:
-            getLog().warn("selinux: 'yum' does not support '--setopt' option")
+            getLog().warning("selinux: 'yum' does not support '--setopt' option")
 
     decorate(traceLog())
     def _selinuxCreateFauxFilesystems(self):
@@ -68,11 +68,12 @@
 
     decorate(traceLog())
     def _selinuxAtExit(self):
-        try:
-            os.unlink(self.filesystems)
-        except OSError, e:
-            getLog().warning("unable to delete selinux filesystems (%s): %s" % (self.filesystems, e))
-            pass
+        if os.path.exists(self.filesystems):
+            try:
+                os.unlink(self.filesystems)
+            except OSError, e:
+                getLog().warning("unable to delete selinux filesystems (%s): %s" % (self.filesystems, e))
+                pass
 
     decorate(traceLog())
     def _selinuxPreYumHook(self):

That had the desired effect:

$ mock -r city-fan-fedora-10-x86_64 /home/paul/cfo-build/perl-JSON-XS/trunk/perl-JSON-XS-2.31-1.fc15.src.rpm
INFO: mock.py version 1.1.12 starting...
State Changed: init plugins
INFO: selinux enabled
State Changed: start
ERROR: Namespace unshare failed.

Great - right exit code too! I think it just needs documenting now in mock/exception.py.

Comment 17 Clark Williams 2011-07-27 19:22:19 UTC
Created attachment 515583 [details]
fix to allow retries for namespace 'unshare(2)' failures

proposed patch to allow retries when unshare(2) fails

Comment 18 Fedora Update System 2011-07-30 10:37:58 UTC
Package mock-1.1.12-1.el6:
* should fix your issue,
* was pushed to the Fedora EPEL 6 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=epel-testing mock-1.1.12-1.el6'
as soon as you are able to.
Please go to the following url:
https://admin.fedoraproject.org/updates/mock-1.1.12-1.el6
then log in and leave karma (feedback).

Comment 19 Paul Howarth 2011-08-01 14:41:00 UTC
(In reply to comment #17)
> Created attachment 515583 [details]
> fix to allow retries for namespace 'unshare(2)' failures
> 
> proposed patch to allow retries when unshare(2) fails

Patch looks good.

Just one query: should getLog().warn in the SELinux plugin be getLog().warning?

Comment 20 Clark Williams 2011-08-01 15:13:28 UTC
(In reply to comment #19)
> (In reply to comment #17)
> > Created attachment 515583 [details]
> > fix to allow retries for namespace 'unshare(2)' failures
> > 
> > proposed patch to allow retries when unshare(2) fails
> 
> Patch looks good.
> 
> Just one query: should getLog().warn in the SELinux plugin be getLog().warning?

Well, that getLog class in trace_decorator.py is kinda magic in that it does late bindings and other weird mojo. Both .warn and .warning will work and will generate a warning entry.

I'll change it to warning (as well as one in the root_cache plugin) just for consistency's sake.

Comment 21 Paul Howarth 2011-08-01 15:31:25 UTC
(In reply to comment #20)
> > Just one query: should getLog().warn in the SELinux plugin be getLog().warning?
> 
> Well, that getLog class in trace_decorator.py is kinda magic in that it does
> late bindings and other weird mojo. Both .warn and .warning will work and will
> generate a warning entry.
> 
> I'll change it to warning (as well as one in the root_cache plugin) just for
> consistency's sake.

OK, thanks. I have seen some tracebacks with getlog in them (didn't prevent the build working so I wasn't too bothered) and wondered if that was related - apparently not then. Let's see if I can reproduce one...

Comment 22 Fedora Update System 2011-08-09 01:27:18 UTC
mock-1.1.12-1.fc15 has been pushed to the Fedora 15 stable repository.  If problems still persist, please make note of it in this bug report.

Comment 23 Fedora Update System 2011-08-09 01:32:44 UTC
mock-1.1.12-1.fc14 has been pushed to the Fedora 14 stable repository.  If problems still persist, please make note of it in this bug report.

Comment 24 Paul Howarth 2011-08-09 07:41:19 UTC
Reopening pending release of version including the patches discussed earlier.

I haven't seen any getlog tracebacks over the last week or so, so I'll open a fresh bug if I spot one.

Comment 25 Fedora Update System 2011-08-16 21:01:35 UTC
mock-1.1.12-1.el6 has been pushed to the Fedora EPEL 6 stable repository.  If problems still persist, please make note of it in this bug report.

Comment 26 Fedora Update System 2011-08-16 21:02:39 UTC
mock-1.0.19-1.el5 has been pushed to the Fedora EPEL 5 stable repository.  If problems still persist, please make note of it in this bug report.