Bug 1306563 - Jabber segfault
Summary: Jabber segfault
Keywords:
Status: CLOSED EOL
Alias: None
Product: Spacewalk
Classification: Community
Component: Server
Version: 2.4
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: urgent
Target Milestone: ---
Assignee: Michael Mráka
QA Contact: Red Hat Satellite QA List
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2016-02-11 10:08 UTC by Slava
Modified: 2019-10-21 13:12 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-10-21 13:12:06 UTC
Embargoed:


Description Slava 2016-02-11 10:08:38 UTC
Description of problem:

After upgrading from 2.3 to 2.4, the jabberd sm module segfaults on service restart. The issue is reproducible only when restarting with the init.d script. The status check reports OK, but on restart the script reports sm [FAILED] on stop and a segfault is logged in /var/log/messages.


This issue is affecting production; please advise how to troubleshoot it.

[root@ca01repo00 jabberd]# rpm -qa | grep jabber
jabberpy-0.5-0.21.el6.noarch
jabberd-2.2.14-6.el6.x86_64
spacewalk-setup-jabberd-2.3.2-1.el6.noarch



Version-Release number of selected component (if applicable):


How reproducible:

[root@ca01repo00 jabberd]# service jabberd status
router (pid 31553) is running...
sm (pid 31561) is running...
c2s (pid 31569) is running...
s2s (pid 31577) is running...
[root@ca01repo00 jabberd]# service jabberd restart
Terminating jabberd processes ...
Stopping s2s:                                              [  OK  ]
Stopping c2s:                                              [  OK  ]
Stopping sm:                                               [FAILED]
Stopping router:                                           [  OK  ]
Initializing jabberd processes ...
Starting router:                                           [  OK  ]
Starting sm:                                               [  OK  ]
Starting c2s:                                              [  OK  ]
Starting s2s:                                              [  OK  ]


Actual results:

Feb 11 04:46:13 ca01repo00 kernel: sm[31091]: segfault at 18 ip 00007fb270dd22b6 sp 00007fff6f26a898 error 4 in libc-2.12.so[7fb270caa000+18a000]
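
The kernel segfault line carries enough information to locate the faulting instruction inside libc: subtracting the mapping base (the first number in the brackets) from the instruction pointer gives an offset into libc-2.12.so, and the error code is an x86 page-fault bit mask. The following is a minimal sketch of that decoding, not part of Spacewalk; the function name parse_segfault and the returned field names are assumptions made for illustration.

```python
import re

# Matches the "segfault at ... ip ... sp ... error ... in lib[base+size]"
# message that the Linux kernel writes to /var/log/messages on x86.
SEGFAULT_RE = re.compile(
    r"(?P<comm>\S+)\[(?P<pid>\d+)\]: segfault at (?P<addr>[0-9a-f]+) "
    r"ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) error (?P<error>[0-9a-f]+) "
    r"in (?P<module>\S+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def parse_segfault(line):
    """Decode one kernel segfault line; return None if it does not match.

    parse_segfault is a hypothetical helper written for this report.
    """
    m = SEGFAULT_RE.search(line)
    if m is None:
        return None
    ip = int(m.group("ip"), 16)
    base = int(m.group("base"), 16)
    error = int(m.group("error"), 16)  # the kernel prints this field in hex
    return {
        "comm": m.group("comm"),
        "pid": int(m.group("pid")),
        "fault_addr": int(m.group("addr"), 16),
        "module": m.group("module"),
        # Offset of the faulting instruction inside the mapped library.
        "offset": ip - base,
        # x86 page-fault error code bits.
        "protection": bool(error & 0x1),  # 0 = page not present
        "write": bool(error & 0x2),       # 0 = read access
        "user_mode": bool(error & 0x4),   # 1 = fault raised in user mode
    }
```

For the line above this gives an offset of 0x1282b6 and a fault address of 0x18 from a user-mode read of a page that is not mapped, which would be consistent with a NULL pointer dereference at a small field offset. With the glibc debuginfo package installed, the offset could then be resolved to a function with a tool such as addr2line.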

Expected results:


Additional info:

Comment 1 Slava 2016-02-11 11:50:25 UTC
I believe the issue is caused by osa-dispatcher. It can't connect to the jabber server no matter what I change in rhn.conf. From my perspective this is a BLOCKER: I can't manage systems from Spacewalk.

[root@ca01repo00 jabberd]# hostname
ca01repo00.adm.myorg.com
[root@ca01repo00 jabberd]# nslookup ca01repo00.adm.myorg.com
Server:		172.16.104.26
Address:	172.16.104.26#53

Name:	ca01repo00.adm.myorg.com
Address: 172.16.104.36


2016/02/11 06:12:01 -04:00 16889 0.0.0.0: rhnSQL/driver_postgresql._execute_wrapper('Executing SQL: "\n    select id, password\n      from rhnPushDispatcher\n     where jabber_id like %(jabber_id)s\n    " with bind params: {jabber_id: rhn-dispatcher-sat%}',)
2016/02/11 06:12:01 -04:00 16889 0.0.0.0: osad/jabber_lib.setup_connection('Connecting to', 'ca01repo00.adm.myorg.com')
2016/02/11 06:12:01 -04:00 16889 0.0.0.0: osad/jabber_lib._get_jabber_client
2016/02/11 06:12:01 -04:00 16889 0.0.0.0: osad/jabber_lib._get_jabber_client('Connecting to', 'ca01repo00.adm.myorg.com')
2016/02/11 06:12:01 -04:00 16889 0.0.0.0: osad/jabber_lib.__init__
2016/02/11 06:12:01 -04:00 16889 0.0.0.0: osad/jabber_lib.__init__
2016/02/11 06:12:01 -04:00 16889 0.0.0.0: osad/jabber_lib.check_cert('Loading cert', <X509Name object '/C=CA/ST=ON/L=MyLocale/O=MyOrg/OU=ca01repo00.adm.myorg.com/CN=ca01repo00.adm.myorg.com'>)
2016/02/11 06:12:01 -04:00 16889 0.0.0.0: osad/jabber_lib.connect
2016/02/11 06:12:01 -04:00 16889 0.0.0.0: osad/jabber_lib.connect('Attempting to connect',)
2016/02/11 06:12:01 -04:00 16889 0.0.0.0: osad/jabber_lib.process(300,)
2016/02/11 06:12:01 -04:00 16889 0.0.0.0: osad/jabber_lib.process('before select(); timeout', 300.0)
2016/02/11 06:12:01 -04:00 16889 0.0.0.0: osad/jabber_lib.process('select() returned',)
2016/02/11 06:12:01 -04:00 16889 0.0.0.0: osad/jabber_lib._auth_dispatch(<jabber.xmlstream.Node instance at 0xc74f38>,)
2016/02/11 06:12:01 -04:00 16889 0.0.0.0: osad/jabber_lib.connect('Connected',)
2016/02/11 06:12:01 -04:00 16889 0.0.0.0: osad/jabber_lib.connect('Expecting features stanza, got:', <features><address xmlns = 'http://affinix.com/jabber/address' >172.16.104.36</address><auth xmlns = 'http://jabber.org/features/iq-auth'  /><register xmlns = 'http://jabber.org/features/iq-register'  /><starttls xmlns = 'urn:ietf:params:xml:ns:xmpp-tls' ><required /></starttls></features>)
2016/02/11 06:12:01 -04:00 16889 0.0.0.0: osad/jabber_lib.connect('starttls node', <jabber.xmlstream.Node instance at 0xc8a050>)
2016/02/11 06:12:01 -04:00 16889 0.0.0.0: osad/jabber_lib.process(None,)
2016/02/11 06:12:01 -04:00 16889 0.0.0.0: osad/jabber_lib.process('before select(); timeout', None)
2016/02/11 06:12:01 -04:00 16889 0.0.0.0: osad/jabber_lib.process('select() returned',)
2016/02/11 06:12:01 -04:00 16889 0.0.0.0: osad/jabber_lib._auth_dispatch(<jabber.xmlstream.Node instance at 0xc8a200>,)
2016/02/11 06:12:01 -04:00 16889 0.0.0.0: osad/jabber_lib.connect('Expecting proceed stanza, got:', <proceed />)
2016/02/11 06:12:01 -04:00 16889 0.0.0.0: osad/jabber_lib.connect('Preparing for TLS handshake',)
2016/02/11 06:12:01 -04:00 16889 0.0.0.0: osad/jabber_lib.connect('ERROR', 'Traceback caught:')
2016/02/11 06:12:01 -04:00 16889 0.0.0.0: osad/jabber_lib.connect('ERROR', 'Traceback (most recent call last):\n  File "/usr/share/rhn/osad/jabber_lib.py", line 642, in connect\n    ssl.do_handshake()\nWantReadError\n')
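
The traceback ends in WantReadError rather than in a certificate error. In pyOpenSSL, WantReadError is raised by do_handshake() on a non-blocking socket when the handshake needs more data from the peer, and the call is meant to be retried once the socket is readable. Whether that is what happens at line 642 of jabber_lib.py is not confirmed here. The sketch below only illustrates the retry pattern; WantRead and WantWrite are local stand-ins for OpenSSL.SSL.WantReadError and OpenSSL.SSL.WantWriteError, and handshake_with_retry is a hypothetical helper, not a jabber_lib function.

```python
import select


class WantRead(Exception):
    """Local stand-in for OpenSSL.SSL.WantReadError (assumption)."""


class WantWrite(Exception):
    """Local stand-in for OpenSSL.SSL.WantWriteError (assumption)."""


def handshake_with_retry(conn, sock, timeout=30.0, max_attempts=50):
    """Retry conn.do_handshake() until it completes or we give up.

    conn is any object with a do_handshake() method, sock is the
    underlying socket to wait on. Returns True when the handshake
    finished, False on timeout or when max_attempts is exhausted.
    Any other exception (for example a real certificate failure)
    is deliberately left to propagate to the caller.
    """
    for _attempt in range(max_attempts):
        try:
            conn.do_handshake()
            return True
        except WantRead:
            # The TLS layer needs more bytes from the peer.
            readable, _, _ = select.select([sock], [], [], timeout)
            if not readable:
                return False
        except WantWrite:
            # The TLS layer needs to flush bytes to the peer.
            _, writable, _ = select.select([], [sock], [], timeout)
            if not writable:
                return False
    return False
```

Unlike a blanket except clause, this pattern only absorbs the "try again" conditions, so a genuine handshake failure would still be reported.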

Comment 2 Slava 2016-02-12 16:03:12 UTC
I was able to get a backtrace; it might be useful.

https://fpaste.org/321806/52890621/

Comment 3 Tomas Lestach 2016-02-15 10:26:35 UTC
The paste is no longer available: The paste you are looking for does not exist

Comment 4 Slava 2016-02-15 13:13:25 UTC
Please see updated link.

https://fpaste.org/322825/14555419/

Comment 5 Slava 2016-02-18 15:06:22 UTC
Hello Everyone,
Any input on this issue? Thank you.

Comment 6 Slava 2016-02-18 18:34:27 UTC
After some troubleshooting with my colleague we found a workaround. I am not confident that this is the actual fix, but at least osa-dispatcher started and registered.

In the file

/usr/share/rhn/osad/jabber_lib.py

on line 648, comment out the raise:

#raise SSLHandshakeError, None, sys.exc_info()[2]

and replace it with:

pass


Code Result:

        except SSL.SSL.Error:
            # Error in the SSL handshake - most likely mismatching CA cert
            log_error("Traceback caught:")
            log_error(extract_traceback())
            #raise SSLHandshakeError, None, sys.exc_info()[2]
            pass

Log Output

Feb 18 12:51:18 ca01repo00 jabberd/c2s[27462]: [26] created user: user=rhn-dispatcher-sat; realm=
Feb 18 12:51:18 ca01repo00 jabberd/c2s[27462]: [26] registration succeeded, requesting user creation: jid=rhn-dispatcher-sat.myorg.com
Feb 18 12:51:18 ca01repo00 jabberd/sm[27454]: created user: jid=rhn-dispatcher-sat.myorg.com
Feb 18 12:51:18 ca01repo00 jabberd/c2s[27462]: [26] legacy authentication succeeded: host=, username=rhn-dispatcher-sat, resource=superclient, TLS negotiated
Feb 18 12:51:18 ca01repo00 jabberd/c2s[27462]: [26] requesting session: jid=rhn-dispatcher-sat.myorg.com/superclient
Feb 18 12:51:18 ca01repo00 jabberd/sm[27454]: session started: jid=rhn-dispatcher-sat.myorg.com/superclient

Comment 7 Tomas Lestach 2016-02-19 08:43:06 UTC
(In reply to Slava from comment #6)
> Code Result:
> 
>         except SSL.SSL.Error:
>             # Error in the SSL handshake - most likely mismatching CA cert
>             log_error("Traceback caught:")
>             log_error(extract_traceback())
>             #raise SSLHandshakeError, None, sys.exc_info()[2]
>             pass

Well, if you have trouble with the SSL handshake, you shouldn't ignore the SSL error.
I'd rather check the SSL CA certificate or the actual date+time.
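
One way to rule out a clock or validity-period problem is to compare the certificate's validity dates against the system time. A minimal sketch, assuming the two dates are copied by hand from the output of `openssl x509 -noout -dates -in <cert>`; the helper name cert_time_valid is hypothetical.

```python
import ssl
import time


def cert_time_valid(not_before, not_after, now=None):
    """Return True if `now` falls inside the certificate validity window.

    not_before and not_after are strings in the format OpenSSL prints,
    for example "Jan  5 09:34:43 2018 GMT". now is seconds since the
    epoch and defaults to the current system time.
    """
    if now is None:
        now = time.time()
    start = ssl.cert_time_to_seconds(not_before)
    end = ssl.cert_time_to_seconds(not_after)
    return start <= now <= end
```

A False result with a correct certificate would point at the system clock, while a False result with a correct clock would point at an expired or not yet valid certificate.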

Comment 8 Slava 2016-02-19 15:44:52 UTC
The system is up to date in terms of services and time, and before the upgrade everything was working fine.

Comment 9 Slava 2016-02-24 17:11:36 UTC
I upgraded to the latest version of glibc, but the segfault is still present, and I think it is caused by osa-dispatcher. I have 3 production proxies running jabber on RHEL 6.7 with no problems; all run as expected.

Feb 24 10:34:17 qa01repo00 kernel: sm[24738]: segfault at 10 ip 00007f7457135c08 sp 00007ffff7bdd568 error 4 in libc-2.12.so[7f745700d000+18a000]

Comment 10 Brian Harrington 2016-09-27 01:05:23 UTC
This segfault was occurring when SELinux was running in enforcing mode. Putting this single type into permissive mode is the only way around the issue that I have found so far:

semanage permissive -a osad_t
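
Before or after making a whole domain permissive, it can help to see which denials are actually being logged for it. A minimal sketch, assuming the standard AVC record layout written by auditd; the function name denials_for_domain and the returned field names are hypothetical, and nothing here is specific to Spacewalk.

```python
import re

# Picks the denied permissions, the command name, both security
# contexts and the object class out of one AVC denial record.
AVC_RE = re.compile(
    r'avc:\s+denied\s+\{\s*(?P<perms>[^}]*?)\s*\}.*?'
    r'comm="(?P<comm>[^"]+)".*?'
    r'scontext=(?P<scontext>\S+)\s+'
    r'tcontext=(?P<tcontext>\S+)\s+'
    r'tclass=(?P<tclass>\S+)'
)


def denials_for_domain(lines, domain="osad_t"):
    """Return the AVC denials whose source context type equals `domain`."""
    found = []
    for line in lines:
        m = AVC_RE.search(line)
        if m is None:
            continue
        # A context looks like user:role:type:level; the type is field 3.
        fields = m.group("scontext").split(":")
        if len(fields) < 3 or fields[2] != domain:
            continue
        found.append({
            "perms": m.group("perms").split(),
            "comm": m.group("comm"),
            "tcontext": m.group("tcontext"),
            "tclass": m.group("tclass"),
        })
    return found
```

The function takes any iterable of lines, so on a typical installation it could be fed the open audit log (commonly /var/log/audit/audit.log, read as root) to list what osad_t was denied.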

Comment 11 Michael Mráka 2019-10-21 13:12:06 UTC
Spacewalk 2.8 (and older) has already reached its End Of Life.

Thank you for reporting this issue and we are sorry that we were not
able to fix it before end of life. If you would still like
to see this bug fixed and are able to reproduce it against the current version
of Spacewalk 2.9, you are encouraged to change the 'version' and re-open it.

