Description of problem:
After upgrading Spacewalk from 2.3 to 2.4, the jabberd sm module segfaults on service restart. The issue is reproducible only when restarting with the init.d script. The status command reports everything as running, but on restart the script reports sm [FAILED] while stopping, and a segfault is logged in /var/log/messages. This issue is affecting production; please advise how to troubleshoot it.

[root@ca01repo00 jabberd]# rpm -qa | grep jabber
jabberpy-0.5-0.21.el6.noarch
jabberd-2.2.14-6.el6.x86_64
spacewalk-setup-jabberd-2.3.2-1.el6.noarch

Version-Release number of selected component (if applicable):

How reproducible:
[root@ca01repo00 jabberd]# service jabberd status
router (pid 31553) is running...
sm (pid 31561) is running...
c2s (pid 31569) is running...
s2s (pid 31577) is running...

[root@ca01repo00 jabberd]# service jabberd restart
Terminating jabberd processes ...
Stopping s2s: [ OK ]
Stopping c2s: [ OK ]
Stopping sm: [FAILED]
Stopping router: [ OK ]
Initializing jabberd processes ...
Starting router: [ OK ]
Starting sm: [ OK ]
Starting c2s: [ OK ]
Starting s2s: [ OK ]

Actual results:
Feb 11 04:46:13 ca01repo00 kernel: sm[31091]: segfault at 18 ip 00007fb270dd22b6 sp 00007fff6f26a898 error 4 in libc-2.12.so[7fb270caa000+18a000]

Expected results:

Additional info:
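For anyone trying to read the kernel line above: it can be decoded into a faulting address, an offset inside libc, and an access type. The following is only a minimal sketch, assuming the x86-64 kernel log format shown in this report; the function name decode_segfault and the returned field names are hypothetical and not part of any Spacewalk or jabberd tool.

```python
import re

# Matches the x86-64 kernel segfault line format shown in this report.
# All numeric fields except the pid are printed in hexadecimal.
SEGFAULT_RE = re.compile(
    r"(?P<proc>\S+)\[(?P<pid>\d+)\]: segfault at (?P<addr>[0-9a-f]+) "
    r"ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) error (?P<err>[0-9a-f]+) "
    r"in (?P<lib>\S+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def decode_segfault(line):
    """Decode one kernel segfault log line into a dict (hypothetical helper)."""
    match = SEGFAULT_RE.search(line)
    if match is None:
        raise ValueError("not a recognised segfault line")
    err = int(match.group("err"), 16)
    ip = int(match.group("ip"), 16)
    base = int(match.group("base"), 16)
    return {
        "process": match.group("proc"),
        "pid": int(match.group("pid")),
        "fault_address": int(match.group("addr"), 16),
        "library": match.group("lib"),
        # Offset of the faulting instruction inside the mapped library.
        "offset": ip - base,
        # x86 page-fault error code bits: 0 = page present,
        # 1 = write access, 2 = fault raised in user mode.
        "page_present": bool(err & 0x1),
        "access": "write" if err & 0x2 else "read",
        "mode": "user" if err & 0x4 else "kernel",
    }


if __name__ == "__main__":
    line = ("Feb 11 04:46:13 ca01repo00 kernel: sm[31091]: segfault at 18 "
            "ip 00007fb270dd22b6 sp 00007fff6f26a898 error 4 in "
            "libc-2.12.so[7fb270caa000+18a000]")
    info = decode_segfault(line)
    print(info["process"], hex(info["fault_address"]),
          info["library"], hex(info["offset"]), info["access"], info["mode"])
```

For the line in this report, the decode gives a user-mode read of address 0x18 on a page that is not mapped, at offset 0x1282b6 inside libc-2.12.so. A fault address that small usually points to a NULL or near-NULL pointer being dereferenced, most likely one passed into a libc function by sm. With matching glibc debuginfo installed, the offset can be looked up with a tool such as addr2line, but a full backtrace from a core dump will generally tell more.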
I believe the issue is caused by osa-dispatcher. It can't connect to the jabber server no matter what I change in rhn.conf. From my perspective this is a BLOCKER: I can't manage systems from Spacewalk.

[root@ca01repo00 jabberd]# hostname
ca01repo00.adm.myorg.com
[root@ca01repo00 jabberd]# nslookup ca01repo00.adm.myorg.com
Server: 172.16.104.26
Address: 172.16.104.26#53

Name: ca01repo00.adm.myorg.com
Address: 172.16.104.36

osa-dispatcher log:

2016/02/11 06:12:01 -04:00 16889 0.0.0.0: rhnSQL/driver_postgresql._execute_wrapper('Executing SQL: "\n select id, password\n from rhnPushDispatcher\n where jabber_id like %(jabber_id)s\n " with bind params: {jabber_id: rhn-dispatcher-sat%}',)
2016/02/11 06:12:01 -04:00 16889 0.0.0.0: osad/jabber_lib.setup_connection('Connecting to', 'ca01repo00.adm.myorg.com')
2016/02/11 06:12:01 -04:00 16889 0.0.0.0: osad/jabber_lib._get_jabber_client
2016/02/11 06:12:01 -04:00 16889 0.0.0.0: osad/jabber_lib._get_jabber_client('Connecting to', 'ca01repo00.adm.myorg.com')
2016/02/11 06:12:01 -04:00 16889 0.0.0.0: osad/jabber_lib.__init__
2016/02/11 06:12:01 -04:00 16889 0.0.0.0: osad/jabber_lib.__init__
2016/02/11 06:12:01 -04:00 16889 0.0.0.0: osad/jabber_lib.check_cert('Loading cert', <X509Name object '/C=CA/ST=ON/L=MyLocale/O=MyOrg/OU=ca01repo00.adm.myorg.com/CN=ca01repo00.adm.myorg.com'>)
2016/02/11 06:12:01 -04:00 16889 0.0.0.0: osad/jabber_lib.connect
2016/02/11 06:12:01 -04:00 16889 0.0.0.0: osad/jabber_lib.connect('Attempting to connect',)
2016/02/11 06:12:01 -04:00 16889 0.0.0.0: osad/jabber_lib.process(300,)
2016/02/11 06:12:01 -04:00 16889 0.0.0.0: osad/jabber_lib.process('before select(); timeout', 300.0)
2016/02/11 06:12:01 -04:00 16889 0.0.0.0: osad/jabber_lib.process('select() returned',)
2016/02/11 06:12:01 -04:00 16889 0.0.0.0: osad/jabber_lib._auth_dispatch(<jabber.xmlstream.Node instance at 0xc74f38>,)
2016/02/11 06:12:01 -04:00 16889 0.0.0.0: osad/jabber_lib.connect('Connected',)
2016/02/11 06:12:01 -04:00 16889 0.0.0.0: osad/jabber_lib.connect('Expecting features stanza, got:', <features><address xmlns = 'http://affinix.com/jabber/address' >172.16.104.36</address><auth xmlns = 'http://jabber.org/features/iq-auth' /><register xmlns = 'http://jabber.org/features/iq-register' /><starttls xmlns = 'urn:ietf:params:xml:ns:xmpp-tls' ><required /></starttls></features>)
2016/02/11 06:12:01 -04:00 16889 0.0.0.0: osad/jabber_lib.connect('starttls node', <jabber.xmlstream.Node instance at 0xc8a050>)
2016/02/11 06:12:01 -04:00 16889 0.0.0.0: osad/jabber_lib.process(None,)
2016/02/11 06:12:01 -04:00 16889 0.0.0.0: osad/jabber_lib.process('before select(); timeout', None)
2016/02/11 06:12:01 -04:00 16889 0.0.0.0: osad/jabber_lib.process('select() returned',)
2016/02/11 06:12:01 -04:00 16889 0.0.0.0: osad/jabber_lib._auth_dispatch(<jabber.xmlstream.Node instance at 0xc8a200>,)
2016/02/11 06:12:01 -04:00 16889 0.0.0.0: osad/jabber_lib.connect('Expecting proceed stanza, got:', <proceed />)
2016/02/11 06:12:01 -04:00 16889 0.0.0.0: osad/jabber_lib.connect('Preparing for TLS handshake',)
2016/02/11 06:12:01 -04:00 16889 0.0.0.0: osad/jabber_lib.connect('ERROR', 'Traceback caught:')
2016/02/11 06:12:01 -04:00 16889 0.0.0.0: osad/jabber_lib.connect('ERROR', 'Traceback (most recent call last):\n File "/usr/share/rhn/osad/jabber_lib.py", line 642, in connect\n ssl.do_handshake()\nWantReadError\n')
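A note on the WantReadError in the traceback above: in pyOpenSSL this exception generally means the handshake could not finish yet because more data is needed from the peer (typical on a non-blocking socket), which is different from a certificate mismatch. As far as I know it is a subclass of SSL.Error, so a broad except clause would catch it as well. The sketch below shows the usual wait-and-retry pattern; it is not the osad code. The names complete_handshake and HandshakeTimeout are hypothetical, and the exception classes are passed in as parameters so the example runs without pyOpenSSL installed.

```python
import select


class HandshakeTimeout(Exception):
    """Raised when the peer stays silent or too many retries are needed."""


def complete_handshake(do_handshake, sock, want_read, want_write,
                       timeout=30.0, max_attempts=100,
                       select_fn=select.select):
    """Retry a non-blocking TLS handshake until it completes.

    do_handshake -- callable that performs one handshake step
    sock         -- object usable with select() (must provide fileno())
    want_read    -- exception class meaning "wait until readable, retry"
    want_write   -- exception class meaning "wait until writable, retry"

    Returns the number of attempts needed. Any other exception raised by
    do_handshake (for example a real certificate error) propagates.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            do_handshake()
            return attempt
        except want_read:
            # The TLS layer needs more bytes from the peer.
            ready, _, _ = select_fn([sock], [], [], timeout)
        except want_write:
            # The TLS layer needs to send bytes but the socket is full.
            _, ready, _ = select_fn([], [sock], [], timeout)
        if not ready:
            raise HandshakeTimeout("no socket activity within %.1fs" % timeout)
    raise HandshakeTimeout("handshake not done after %d attempts" % max_attempts)


# Assumed usage with pyOpenSSL (not verified against jabber_lib.py):
#   from OpenSSL import SSL
#   complete_handshake(conn.do_handshake, conn,
#                      SSL.WantReadError, SSL.WantWriteError)
```

The point of the design is that only the two "try again later" conditions are retried; every other SSL error still surfaces to the caller, so real handshake failures are not hidden.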
I was able to get a backtrace; it might be useful. https://fpaste.org/321806/52890621/
The paste is no longer available: The paste you are looking for does not exist
Please see the updated link. https://fpaste.org/322825/14555419/
Hello everyone, is there any input on this issue? Thank you.
After some troubleshooting with my colleague we found the issue. I am not confident that this is the actual fix, but at least osa-dispatcher now starts and registers.

File: /usr/share/rhn/osad/jabber_lib.py
On line 648 we commented out
    raise SSLHandshakeError, None, sys.exc_info()[2]
and replaced it with pass.

Resulting code:

    except SSL.SSL.Error:
        # Error in the SSL handshake - most likely mismatching CA cert
        log_error("Traceback caught:")
        log_error(extract_traceback())
        #raise SSLHandshakeError, None, sys.exc_info()[2]
        pass

Log output:

Feb 18 12:51:18 ca01repo00 jabberd/c2s[27462]: [26] created user: user=rhn-dispatcher-sat; realm=
Feb 18 12:51:18 ca01repo00 jabberd/c2s[27462]: [26] registration succeeded, requesting user creation: jid=rhn-dispatcher-sat.myorg.com
Feb 18 12:51:18 ca01repo00 jabberd/sm[27454]: created user: jid=rhn-dispatcher-sat.myorg.com
Feb 18 12:51:18 ca01repo00 jabberd/c2s[27462]: [26] legacy authentication succeeded: host=, username=rhn-dispatcher-sat, resource=superclient, TLS negotiated
Feb 18 12:51:18 ca01repo00 jabberd/c2s[27462]: [26] requesting session: jid=rhn-dispatcher-sat.myorg.com/superclient
Feb 18 12:51:18 ca01repo00 jabberd/sm[27454]: session started: jid=rhn-dispatcher-sat.myorg.com/superclient
(In reply to Slava from comment #6)
> Code Result:
>
> except SSL.SSL.Error:
>     # Error in the SSL handshake - most likely mismatching CA cert
>     log_error("Traceback caught:")
>     log_error(extract_traceback())
>     #raise SSLHandshakeError, None, sys.exc_info()[2]
>     pass

Well, if you have trouble with the SSL handshake, you shouldn't ignore the SSL error. I'd rather check the SSL CA certificate or the actual date and time.
The system is up to date in terms of both packages and time, and before the upgrade everything was working fine.
I upgraded to the latest version of glibc, but the segfault is still there, and I think it is caused by osa-dispatcher. I have 3 production proxies running jabber on RHEL 6.7 with no problems; all run as expected.

Feb 24 10:34:17 qa01repo00 kernel: sm[24738]: segfault at 10 ip 00007f7457135c08 sp 00007ffff7bdd568 error 4 in libc-2.12.so[7f745700d000+18a000]
This segfault was occurring when SELinux was running in enforcing mode. Putting this single type into permissive mode is the only workaround I have found so far:

semanage permissive -a osad_t
Spacewalk 2.8 (and older) has already reached its End Of Life. Thank you for reporting this issue; we are sorry that we were not able to fix it before end of life. If you would still like to see this bug fixed and are able to reproduce it against the current version, Spacewalk 2.9, you are encouraged to change the 'version' and re-open it.