Description of problem: import rpm does not work in mod_wsgi application Version-Release number of selected component (if applicable): rpm-python-4.11.2-1.fc20.x86_64 How reproducible: always Steps to Reproduce: 1. install Fedora 20 with httpd and mod_wsgi 2. mkdir -p /usr/share/rhn/wsgi/ 3. cat >/usr/share/rhn/wsgi/bug.py <<EOF import rpm def application(environ, start_response): status = '200 OK' output = 'Hello World!' response_headers = [('Content-type', 'text/plain'), ('Content-Length', str(len(output)))] start_response(status, response_headers) return [output] EOF 4. cat >/etc/httpd/conf.d/zz-spacewalk-server-wsgi.conf <<EOF <IfModule !wsgi_module> LoadModule wsgi_module modules/mod_wsgi.so </IfModule> WSGIPythonPath "/usr/share/rhn" WSGIScriptAlias /BUG /usr/share/rhn/wsgi/bug.py <IfVersion >= 2.4> <Directory /usr/share/rhn/wsgi> Require all granted </Directory> </IfVersion> EOF 5. systemctl restart httpd 6. curl http://localhost/BUG Actual results: Internal server error, /var/log/httpd/error_log shows mod_wsgi (pid=6312): Exception occurred processing WSGI script '/usr/share/rhn/wsgi/bug.py'. Traceback (most recent call last): File "/usr/share/rhn/wsgi/bug.py", line 1, in <module> import rpm File "/usr/lib64/python2.7/site-packages/rpm/__init__.py", line 10, in <module> from rpm.transaction import * File "/usr/lib64/python2.7/site-packages/rpm/transaction.py", line 5, in <module> from rpm._rpm import ts as TransactionSetCore ImportError: cannot import name ts Expected results: Hello World! (without error) Additional info: after downgrade to rpm-python-4.11.1-7.fc20.x86_64 it works again This bug blocks Spacewalk 2.1 release on Fedora 20.
"Hello world!" is what I get with that here, with rpm 4.11.2. Whats likely to make the difference between 4.11.1 and 4.11.2 is that the latter requires rpm initialization to succeed for the module import to succeed, older versions do not. Possible reasons for initialization failure include crypto (ie NSS) initialization failure and/or failure to read default config files. A strace of such a failure should give further clues, but dunno how easy/hard it is to get a strace of mod_wsgi...
Ahh, that should be it, there's 'Failed to initialize NSS library' message in the log: mod_wsgi (pid=6847, process='', application='localhost|/bug'): Loading WSGI script '/usr/share/rhn/wsgi/bug.py'. error: Failed to initialize NSS library mod_wsgi (pid=6847): Target WSGI script '/usr/share/rhn/wsgi/bug.py' cannot be loaded as Python module. mod_wsgi (pid=6847): Exception occurred processing WSGI script '/usr/share/rhn/wsgi/bug.py'. Traceback (most recent call last): ...
Well, the web application uses mod_ssl not mod_nss so it most likely do not initialize nss properly. Is there a way how to force initialization from python before import rpm?
Okay, so crypto init failure it is. Rpm uses its own private NSS context and handles its own initialization, whatever crypto apache mod_foo uses should not be relevant. The question is why is NSS_InitContext() failing? Could be some permission issue (eg NSS needs to load dynamic libraries upon init) within httpd/mod_wsgi or something... try to get a strace, please. Also note that if 4.11.2 is failing this way, then almost certainly 4.11.1 is failing on crypto init as well and rpm without a working crypto is not a happy camper...
Created attachment 865592 [details] strace log of httpd process Here is strace of failing httpd process. 8900 is pid of httpd master process 8902 is the child serving the page
It seems to fail really early on in NSS_InitContext(), around line 859 of the log: 8902 open("/etc/hosts", O_RDONLY|O_CLOEXEC) = 21 [...] 8902 munmap(0x7f18c95c8000, 4096) = 0 --> NSS_InitContext() called 8902 write(2, "error: ", 7) = 7 8902 write(2, "Failed to initialize NSS library\n", 33) = 33 Here's what I see on my systems: open("/etc/hosts", O_RDONLY|O_CLOEXEC) = 14 [...] = 0 munmap(0x7ffa6fce9000, 4096) = 0 --> NSS_InitContext() called gettid() = 4570 rt_sigaction(SIGPIPE, {SIG_IGN, [], SA_RESTORER, 0x7ffa6e7b9750}, NULL, 8) = 0 open("/lib64/libsoftokn3.so", O_RDONLY|O_CLOEXEC) = 14 read(14, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0000N\0\0\0\0\0\0"..., 832) = 832 [...] Setting SIGPIPE to SIG_IGN occurs inside NSPR (which NSS heavily relies on internally) initialization, but the failing case doesn't even get that far before bailing out, not to mention getting to load libsoftokn3 etc. Offhand, I haven't got a clue as to why NSS initialization would fail there, assuming all the libraries are actually present and intact. Like said, this failure is almost certainly also happening with 4.11.1 and 4.11.2 being just the messenger here... but just to rule out the impossible, can you check whether 'Failed to initialize NSS library' error is present in the logs with 4.11.1 as well?
> Like said, this failure is almost certainly also happening with 4.11.1 and > 4.11.2 being just the messenger here... but just to rule out the impossible, > can you check whether 'Failed to initialize NSS library' error is present in > the logs with 4.11.1 as well? Hi Panu, You're right. I can see 'Failed to initialize' even with rpm-4.11.1-7.fc20: mod_wsgi (pid=19782, process='', application='ibm-x3650m4-03-vm01.lab.eng.brq.redhat.com|/bug'): Loading WSGI script '/usr/share/rhn/wsgi/bug.py'. error: Failed to initialize NSS library
Did you check for SELinux issues?
Hi Florian, Yes, I did. There are no AVC denials at all. I've even recompiled policy without dontaudit and then there are 4 AVCs: type=USER_AVC msg=audit(1392972492.064:2395): pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='avc: received policyload notice (seqno=21) exe="/usr/lib/systemd/systemd" sauid=0 hostname=? addr=? terminal=?' type=AVC msg=audit(1392972502.422:2397): avc: denied { rlimitinh } for pid=19945 comm="satidmap.pl" scontext=system_u:system_r:httpd_t:s0 tcontext=system_u:system_r:httpd_sys_script_t:s0 tclass=process type=AVC msg=audit(1392972502.422:2397): avc: denied { siginh } for pid=19945 comm="satidmap.pl" scontext=system_u:system_r:httpd_t:s0 tcontext=system_u:system_r:httpd_sys_script_t:s0 tclass=process type=AVC msg=audit(1392972502.422:2397): avc: denied { noatsecure } for pid=19945 comm="satidmap.pl" scontext=system_u:system_r:httpd_t:s0 tcontext=system_u:system_r:httpd_sys_script_t:s0 tclass=process Neither one of them is related to the mod_wsgi apllication (it runs as httpd_t).
More over if I modify wsgi application to def application(environ, start_response): status = '200 OK' output = 'Hello World!' f = open("/etc/hosts"); output = f.read() f.close() response_headers = [('Content-type', 'text/plain'), ('Content-Length', str(len(output)))] start_response(status, response_headers) return [output] It outputs content of /etc/hosts just fine.
(In reply to Michael Mráka from comment #7) > Hi Panu, > You're right. I can see 'Failed to initialize' even with rpm-4.11.1-7.fc20: Ok, thanks for confirming - rpm 4.11.2 is just the messenger here then, and the new check is serving its purpose: failure to init crypto will just lead into strange/ugly problems later on, so its best to figure out what the actual cause is. Note that /etc/hosts isn't really related to this, it just shows up in the trace because rpm preloads the name services just before initializing crypto (or trying to do so, in this case) I'd need some way of reproducing the issue in order to help. I tested the reproducer steps on my two separate systems of mine, plus a fresh virtualized minimal installation of F20, none of them exhibit the reported problem. Are you able to reproduce this on multiple hosts? I assume you can... What other steps are taking place from F20 installation to hitting this? Perhaps one of the spacewalk (sub)packages, or their dependencies, are altering something in the system setup that causes this - whatever it actually is.
On the reproducer host, something prevents NSS from working in the wsgi environment. The issue is reproducable with just NSS alone, eliminating rpm from the blame list ;) This mirrors exactly the way rpm initializes NSS: --- bug.py --- import nss.nss as nss def application(environ, start_response): flags = (nss.NSS_INIT_READONLY | nss.NSS_INIT_NOCERTDB | nss.NSS_INIT_NOMODDB | nss.NSS_INIT_FORCEOPEN | nss.NSS_INIT_NOROOTINIT | nss.NSS_INIT_OPTIMIZESPACE) ctx = nss.nss_init_context(flags = flags) status = '200 OK' output = 'Hello World!' + ctx response_headers = [('Content-type', 'text/plain'), ('Content-Length', str(len(output)))] start_response(status, response_headers) return [output] --- ...which spits the following, slightly more informative error in the logs: [Fri Feb 21 08:09:33.467775 2014] [:error] [pid 23078] [client ::1:47412] mod_wsgi (pid=23078): Exception occurred processing WSGI script '/usr/share/rhn/wsgi/bug.py'. [Fri Feb 21 08:09:33.467871 2014] [:error] [pid 23078] [client ::1:47412] Traceback (most recent call last): [Fri Feb 21 08:09:33.467939 2014] [:error] [pid 23078] [client ::1:47412] File "/usr/share/rhn/wsgi/bug.py", line 8, in application [Fri Feb 21 08:09:33.468043 2014] [:error] [pid 23078] [client ::1:47412] ctx = nss.nss_init_context(flags = flags) [Fri Feb 21 08:09:33.468121 2014] [:error] [pid 23078] [client ::1:47412] NSPRError: (SEC_ERROR_NO_MODULE) security library: no security module can perform the requested operation. My guess is that there's some sort of security mechanism (or path issue, or something...) preventing NSS from dynamically loading the modules it needs, but this is not an rpm bug.
Possibly related: bug 1006280 ? Would be interesting to see if this still reproduces when the httpd has forking disabled.
Panu, thanks for you investigation. Ales,running httpd in debug mode (/usr/sbin/httpd -DFOREGROUND -X, i.e. only one worker, no fork) the error message has changed: mod_wsgi (pid=24401): Exception occurred processing WSGI script '/usr/share/rhn/wsgi/bug.py'. Traceback (most recent call last): File "/usr/share/rhn/wsgi/bug.py", line 8, in application ctx = nss.nss_init_context(flags = flags) NSPRError: (SEC_ERROR_UNKNOWN_PKCS11_ERROR) Unknown PKCS #11 error.
Reassigning to nss component.
Just FWIW, its not clear to me that this is a bug in NSS either. On my own systems and a fresh F20 test-installation these test-cases work just fine, so this is something specific to the spacewalk setup.
Well, I tracked the issue down to httpd + mod_swgi + mod_perl + rpm + nss combination ;). Both mod_wsgi and mod_perl part of spacewalk web application imports/uses rpm (rpm-python resp. perl-RPM2). And it seems that mod_perl's directive PerlModule is processed in the main httpd parent thus all childs trying to import rpm hit bug 800304. I.e. perl-RPM2 calls nss_init_context() in the main process then rpm-python does the same in the child, after the fork, and it fails.
Oh ... eek. You really dont want to mix up two independent rpm API bindings within a single process, that's going to end up in tears. This NSS issue is likely just a tip of that particular iceberg. libpm has quite a lot of global state, crypto being one of them, macro and rpmrc configuration being another, but there's also signal handlers and whatnot.
> Oh ... eek. You really dont want to mix up two independent rpm API bindings Completely agree, I really don't :). I'm just fixing code which worked for years (maybe 10+) and stopped after the recent rpm change ;).
That's also known as walking on thin ice ;) NSS is certainly more troublesome and picky about its surroundings than beecrypt (which rpm used for crypto in the old days) in many ways, but the crypto library is just one case of all the that could raise from mixing up multiple independent rpm API "users" within a single process.
The issue has been resolved by removing rpm dependency from spacewalk perl stack. See spacewalk master commit 38e7070e857e6f31335c4858978447fac4a6b00d.
Spacewalk 2.2 has been released: https://fedorahosted.org/spacewalk/wiki/ReleaseNotes22