Description of problem:
On RHEL7, jabberd (and, because of that, osa-dispatcher) fails to start.

Version-Release number of selected component (if applicable):
Current SWnightly:
jabberd-2.3.6-1.el7.x86_64
osa-dispatcher-5.11.70-1.el7.noarch
glibc-2.17-106.el7_2.6.x86_64

How reproducible:
often

Steps to Reproduce:
1. Restart SW on RHEL7

Actual results:
jabberd-* services failing

Expected results:
Should work

Additional info:
Some impressions: the issue is more frequent when I do `rhn-satellite restart` than when I do `rhn-satellite stop; sleep 5; rhn-satellite start`. When I added strace to the service file /usr/lib/systemd/system/jabberd-c2s.service like this:

ExecStart=/usr/bin/strace -- /usr/bin/c2s -c /etc/jabberd/c2s.xml

I was not able to reproduce the issue. Maybe strace introduced a delay somewhere? Eventually, after some retries, I got a detailed log with "ExecStart=/usr/bin/c2s -D -c /etc/jabberd/c2s.xml".

The workaround is to restart jabberd and osa-dispatcher manually:
# systemctl restart jabberd
# systemctl restart osa-dispatcher
A less-tested workaround is to run `rhn-satellite stop; sleep 5; rhn-satellite start` instead of `rhn-satellite restart`.

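The manual workaround can be wrapped in a small helper so that osa-dispatcher is only restarted once jabberd has restarted successfully. This is a minimal sketch, not part of Spacewalk: the function name restart_osa_stack and the RUN dry-run variable are hypothetical.

```shell
# Hypothetical helper: restart jabberd first, then osa-dispatcher.
# Set RUN=echo to print the commands instead of executing them.
restart_osa_stack() {
    run="${RUN:-}"
    # osa-dispatcher is restarted only if the jabberd restart succeeded.
    $run systemctl restart jabberd &&
    $run systemctl restart osa-dispatcher
}

# Usage (as root):   restart_osa_stack
# Dry run:           RUN=echo; restart_osa_stack
```

The dry-run mode only prints the two systemctl invocations, which makes it easy to check the order before running it as root.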
Note that although the log complains about DB inconsistency, this does not fix the issue:
# rm -rf /var/lib/jabberd/db/*
# rhn-satellite restart

Same problem here with a Spacewalk 2.5 installation (though on CentOS 7): If I restart the Spacewalk VM, jabber doesn't start up and as a result the osa-dispatcher service fails. It works after manually restarting jabberd and osa-dispatcher. Here is the startup log:

jabberd/router[1253]: starting up
jabberd/s2s[1254]: starting up (interval=3, queue=60, keepalive=0, idle=86400)
jabberd/router[1253]: process id is 1253, written to /var/lib/jabberd/pid/router.pid
jabberd/s2s[1254]: process id is 1254, written to /var/lib/jabberd/pid/s2s.pid
jabberd/s2s[1254]: attempting connection to router at ::1, port=5347
jabberd/s2s[1254]: [6] [router] write error: Connection refused (111)
jabberd/s2s[1254]: connection to router closed
jabberd/s2s[1254]: attempting reconnect (3 left)
jabberd/sm[1251]: starting up
jabberd/sm[1251]: process id is 1251, written to /var/lib/jabberd/pid/sm.pid
jabberd/sm[1251]: loading 'db' storage module
jabberd/router[1253]: loaded user table (1 users)
jabberd/router[1253]: loaded filters (0 rules)
jabberd/router[1253]: [::, port=5347] listening for incoming connections
c2s: Sat Jun 25 18:56:35 2016 [notice] modules search path: /usr/lib64/jabberd
c2s: Sat Jun 25 18:56:35 2016 [info] loading 'db' authreg module
sm: unable to join the environment
c2s: unable to allocate memory for the lock table
c2s: PANIC: Cannot allocate memory
c2s: Sat Jun 25 18:56:36 2016 [critical] db: corruption detected! close all jabberd processes and run db_recover
systemd: jabberd-c2s.service: main process exited, code=exited, status=2/INVALIDARGUMENT
systemd: Unit jabberd-c2s.service entered failed state.
systemd: jabberd-c2s.service failed.
systemd: Stopped Jabber Server.
systemd: Stopping Jabber Server...
systemd: Stopping Jabber IM Session Manager...
systemd: Stopping Jabber Server To Server Connector...
jabberd/s2s[1254]: attempting connection to router at ::1, port=5347
jabberd/s2s[1254]: shutting down
jabberd/router[1253]: [::1, port=42448] connect
jabberd/s2s[1254]: connection to router closed
jabberd/router[1253]: [::1, port=42448] disconnect
systemd: Stopped Jabber Server To Server Connector.
jabberd/router[1253]: shutting down
osa-dispatcher: Spacewalk 1296 2016/06/25 18:56:37 +02:00: ('Error connecting to jabber server: Unable to connect to the host and port specified. See https://access.redhat.com/solutions/327903 for more information. ',)
osa-dispatcher: Spacewalk 1296 2016/06/25 18:56:37 +02:00: ('Error caught:',)
osa-dispatcher: ERROR: unhandled exception occurred: (unicode argument expected, got 'str').
systemd: osa-dispatcher.service: control process exited, code=exited status=255
systemd: Failed to start OSA Dispatcher daemon.
systemd: Unit osa-dispatcher.service entered failed state.
systemd: osa-dispatcher.service failed.

- Just executing a 'spacewalk-service restart' doesn't fix the issue.
- What does work, as mentioned before by Jan Hutař, is a 'systemctl restart jabberd' followed by a 'systemctl restart osa-dispatcher'.
- I also tried following the advice at 'https://access.redhat.com/solutions/327903' but it didn't change anything.
- Executing 'spacewalk-setup-jabberd' doesn't help either.

Run `alternatives --config java` and remove all Java versions except 8. After that, my services started working again.
(In reply to Aaron from comment #8)
> Run alternatives --config java
> Remove all Java version but 8 and my services started working again.

Unfortunately this broke the webui.

I found an AVC message while restarting the osa-dispatcher service. It probably isn't related to this bug.

>> systemctl restart osa-dispatcher.service
...
type=SERVICE_STOP msg=audit(1468591032.022:1734): pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=osa-dispatcher comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
type=AVC msg=audit(1468591032.089:1735): avc: denied { write } for pid=31826 comm="osa-dispatcher" name="osad" dev="dm-0" ino=101055545 scontext=system_u:system_r:osa_dispatcher_t:s0 tcontext=system_u:object_r:usr_t:s0 tclass=dir
type=SYSCALL msg=audit(1468591032.089:1735): arch=c000003e syscall=87 success=no exit=-13 a0=207a5e0 a1=ebb8 a2=81b4 a3=7ffd12208a20 items=0 ppid=1 pid=31826 auid=4294967295 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=(none) ses=4294967295 comm="osa-dispatcher" exe="/usr/bin/python2.7" subj=system_u:system_r:osa_dispatcher_t:s0 key=(null)
type=SERVICE_START msg=audit(1468591032.313:1736): pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=osa-dispatcher comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'

I can confirm everything, using 7.2 with the latest updates.
Adding a +1. Workaround confirmed as well: stop jabberd, start jabberd, restart osa-dispatcher.
Can also confirm, using latest CentOS 7 release and Spacewalk 2.5.
Potential fix submitted here: https://github.com/spacewalkproject/spacewalk/pull/462

Before:

[root@spacewalk rhn]# /usr/sbin/spacewalk-service restart
Shutting down spacewalk services...
Stopping RHN Taskomatic...
Stopped RHN Taskomatic.
Stopping cobblerd (via systemctl): [ OK ]
Redirecting to /bin/systemctl stop rhn-search.service
Redirecting to /bin/systemctl stop osa-dispatcher.service
Redirecting to /bin/systemctl stop httpd.service
Redirecting to /bin/systemctl stop tomcat.service
Redirecting to /bin/systemctl stop jabberd.service
Redirecting to /bin/systemctl stop postgresql.service
Done.
Starting spacewalk services...
Redirecting to /bin/systemctl start postgresql.service
Redirecting to /bin/systemctl start jabberd.service
Redirecting to /bin/systemctl start tomcat.service
Waiting for tomcat to be ready ...
Redirecting to /bin/systemctl start httpd.service
Redirecting to /bin/systemctl start osa-dispatcher.service
Job for osa-dispatcher.service failed because the control process exited with error code. See "systemctl status osa-dispatcher.service" and "journalctl -xe" for details.
Redirecting to /bin/systemctl start rhn-search.service
Starting cobblerd (via systemctl): [ OK ]
Starting RHN Taskomatic...
Done.
[root@spacewalk rhn]# systemctl status osa-dispatcher
● osa-dispatcher.service - OSA Dispatcher daemon
   Loaded: loaded (/usr/lib/systemd/system/osa-dispatcher.service; enabled; vendor preset: disabled)
   Active: failed (Result: exit-code) since Wed 2016-09-28 00:39:49 UTC; 27min ago
  Process: 5679 ExecStart=/usr/sbin/osa-dispatcher --pid-file /var/run/osa-dispatcher.pid (code=exited, status=255)
  Process: 5676 ExecStartPre=/bin/rm -f /var/run/osa-dispatcher.pid (code=exited, status=0/SUCCESS)
 Main PID: 4007 (code=killed, signal=TERM)

Sep 28 00:39:48 spacewalk.lab.libcore.so systemd[1]: Starting OSA Dispatcher daemon...
Sep 28 00:39:49 spacewalk.lab.libcore.so osa-dispatcher[5679]: Spacewalk 5679 2016/09/28 00:39:49 -00:00: ('Error connecting to jabber server: Unable to connect to the host and ...ation. ',)
Sep 28 00:39:49 spacewalk.lab.libcore.so osa-dispatcher[5679]: Spacewalk 5679 2016/09/28 00:39:49 -00:00: ('Error caught:',)
Sep 28 00:39:49 spacewalk.lab.libcore.so osa-dispatcher[5679]: ERROR: unhandled exception occurred: (unicode argument expected, got 'str').
Sep 28 00:39:49 spacewalk.lab.libcore.so systemd[1]: osa-dispatcher.service: control process exited, code=exited status=255
Sep 28 00:39:49 spacewalk.lab.libcore.so systemd[1]: Failed to start OSA Dispatcher daemon.
Sep 28 00:39:49 spacewalk.lab.libcore.so systemd[1]: Unit osa-dispatcher.service entered failed state.
Sep 28 00:39:49 spacewalk.lab.libcore.so systemd[1]: osa-dispatcher.service failed.
Hint: Some lines were ellipsized, use -l to show in full.

After:

[root@spacewalk rhn]# /usr/sbin/spacewalk-service restart
Shutting down spacewalk services...
Stopping RHN Taskomatic...
Stopped RHN Taskomatic.
Stopping cobblerd (via systemctl): [ OK ]
Redirecting to /bin/systemctl stop rhn-search.service
Redirecting to /bin/systemctl stop osa-dispatcher.service
Redirecting to /bin/systemctl stop httpd.service
Redirecting to /bin/systemctl stop tomcat.service
Redirecting to /bin/systemctl stop jabberd.service
Redirecting to /bin/systemctl stop postgresql.service
Done.
Starting spacewalk services...
Redirecting to /bin/systemctl start postgresql.service
Redirecting to /bin/systemctl start jabberd.service
Redirecting to /bin/systemctl start tomcat.service
Waiting for tomcat to be ready ...
Redirecting to /bin/systemctl start httpd.service
Redirecting to /bin/systemctl start osa-dispatcher.service
Redirecting to /bin/systemctl start rhn-search.service
Starting cobblerd (via systemctl): [ OK ]
Starting RHN Taskomatic...
Done.
[root@spacewalk rhn]# systemctl status osa-dispatcher
● osa-dispatcher.service - OSA Dispatcher daemon
   Loaded: loaded (/usr/lib/systemd/system/osa-dispatcher.service; enabled; vendor preset: disabled)
  Drop-In: /etc/systemd/system/osa-dispatcher.service.d
           └─10-dependency.conf
   Active: active (running) since Wed 2016-09-28 01:13:53 UTC; 38s ago
  Process: 7684 ExecStart=/usr/sbin/osa-dispatcher --pid-file /var/run/osa-dispatcher.pid (code=exited, status=0/SUCCESS)
  Process: 7681 ExecStartPre=/bin/rm -f /var/run/osa-dispatcher.pid (code=exited, status=0/SUCCESS)
 Main PID: 7686 (osa-dispatcher)
   CGroup: /system.slice/osa-dispatcher.service
           └─7686 /usr/bin/python -s /usr/sbin/osa-dispatcher --pid-file /var/run/osa-dispatcher.pid

Sep 28 01:13:53 spacewalk.lab.libcore.so systemd[1]: Starting OSA Dispatcher daemon...
Sep 28 01:13:53 spacewalk.lab.libcore.so systemd[1]: Started OSA Dispatcher daemon.

I tried the fix as per the pull request (modifying the base unit file) and found it didn't work for me. Interestingly, though, a 60-second pre-start sleep (ExecStartPre) does the job, and osa-dispatcher now starts cleanly on system or service start and restart. It almost seems as if osa-dispatcher is trying to connect to jabberd-c2s before jabberd-c2s has completely started, causing mayhem with both.
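A start-up delay like the one described above could be applied as a systemd drop-in instead of editing the base unit file. This is only a sketch under stated assumptions: it assumes the delay is implemented as ExecStartPre=/bin/sleep, and the helper name write_delay_dropin and the file name 20-startup-delay.conf are hypothetical.

```shell
# Hypothetical helper: write a drop-in that delays osa-dispatcher start-up.
#   $1  drop-in directory (defaults to the osa-dispatcher drop-in directory)
#   $2  delay in seconds (defaults to 60, the value reported above)
write_delay_dropin() {
    dir="${1:-/etc/systemd/system/osa-dispatcher.service.d}"
    delay="${2:-60}"
    mkdir -p "$dir" || return 1
    # ExecStartPre may be given several times; this entry is added to the
    # commands already defined in the base unit.
    cat > "$dir/20-startup-delay.conf" <<EOF
[Service]
ExecStartPre=/bin/sleep $delay
EOF
}

# Usage (as root):
#   write_delay_dropin && systemctl daemon-reload
```

A fixed sleep only hides the ordering problem, so the delay may need tuning per machine.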
Hi, I set up a completely new system with CentOS Linux release 7.2.1511 (Core) and Spacewalk 2.6. The installation process gave me a lot of trouble, and later I saw that jabberd and osa-dispatcher had not started. In the logs I found:

kernel: sm[2824]: segfault at 7f98e9a6cfd8 ip 00007f00e5c66b55 sp 00007ffc413a4bb0 error 4 in libdb-4.8.so[7f00e5ba7000+17d000

After a reboot, tomcat was up but jabberd and osa-dispatcher were not. I can confirm that

systemctl restart jabberd
systemctl restart osa-dispatcher

can help (after 1 or 2 tries). In my humble opinion, sm is the reason the other processes die. This is the solution that worked for me:

# add Restart=always RestartSec=5
/usr/lib/systemd/system/jabberd-sm.service
[Service]
User=jabber
ExecStart=/usr/bin/sm -c /etc/jabberd/sm.xml
Restart=always
RestartSec=5

# add tomcat.service to Requires=
/usr/lib/systemd/system/jabberd.service
Requires=tomcat.service jabberd-router.service jabberd-sm.service jabberd-c2s.service jabberd-s2s.service

# add Restart=always RestartSec=5
/usr/lib/systemd/system/osa-dispatcher.service
[Service]
Type=forking
EnvironmentFile=-/etc/sysconfig/osa-dispatcher
PIDFile=/var/run/osa-dispatcher.pid
ExecStart=/usr/sbin/osa-dispatcher --pid-file /var/run/osa-dispatcher.pid
ExecStartPre=/bin/rm -f /var/run/osa-dispatcher.pid
Restart=always
RestartSec=5

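The Restart=always / RestartSec=5 settings above could also be kept in drop-in files under /etc/systemd/system, so that a package update that replaces the unit files in /usr/lib/systemd/system does not remove them. A minimal sketch, untested against Spacewalk itself: the helper name write_restart_dropin and the file name 30-restart.conf are hypothetical.

```shell
# Hypothetical helper: write a drop-in that makes systemd restart a unit
# 5 seconds after it exits.
#   $1  unit name, e.g. jabberd-sm.service
#   $2  base directory (defaults to /etc/systemd/system)
write_restart_dropin() {
    unit="$1"
    base="${2:-/etc/systemd/system}"
    [ -n "$unit" ] || return 1
    dir="$base/$unit.d"
    mkdir -p "$dir" || return 1
    printf '[Service]\nRestart=always\nRestartSec=5\n' > "$dir/30-restart.conf"
}

# Usage (as root):
#   write_restart_dropin jabberd-sm.service
#   write_restart_dropin osa-dispatcher.service
#   systemctl daemon-reload
```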
I cannot reproduce this bug running Spacewalk nightly on RHEL 7. I tried these steps:

# after fresh installation
spacewalk-service status
# every service is active
spacewalk-service restart
# every service is active
spacewalk-service stop
spacewalk-service start
# everything ok

Tried on two machines, same results.

The patch mentioned in Comment 15, sent as a PR, has been closed; see the resolution:
https://github.com/spacewalkproject/spacewalk/pull/462#issuecomment-307578187

We've decided to close this bug. Feel free to comment or reopen this bug if you can reproduce it against nightly.

Additional info:
# rpm -qa |grep spacewalk |sort
spacewalk-admin-2.6.1-1.el7.noarch
spacewalk-backend-2.7.114-1.el7.noarch
spacewalk-backend-app-2.7.114-1.el7.noarch
spacewalk-backend-applet-2.7.114-1.el7.noarch
spacewalk-backend-config-files-2.7.114-1.el7.noarch
spacewalk-backend-config-files-common-2.7.114-1.el7.noarch
spacewalk-backend-config-files-tool-2.7.114-1.el7.noarch
spacewalk-backend-iss-2.7.114-1.el7.noarch
spacewalk-backend-iss-export-2.7.114-1.el7.noarch
spacewalk-backend-libs-2.7.114-1.el7.noarch
spacewalk-backend-package-push-server-2.7.114-1.el7.noarch
spacewalk-backend-server-2.7.114-1.el7.noarch
spacewalk-backend-sql-2.7.114-1.el7.noarch
spacewalk-backend-sql-postgresql-2.7.114-1.el7.noarch
spacewalk-backend-tools-2.7.114-1.el7.noarch
spacewalk-backend-xml-export-libs-2.7.114-1.el7.noarch
spacewalk-backend-xmlrpc-2.7.114-1.el7.noarch
spacewalk-base-2.7.3-1.el7.noarch
spacewalk-base-minimal-2.7.3-1.el7.noarch
spacewalk-base-minimal-config-2.7.3-1.el7.noarch
spacewalk-branding-2.7.4-1.el7.noarch
spacewalk-certs-tools-2.7.1-1.el7.noarch
spacewalk-common-2.7.2-1.el7.noarch
spacewalk-config-2.7.2-1.el7.noarch
spacewalk-dobby-2.7.3-1.el7.noarch
spacewalk-doc-indexes-2.5.2-1.el7.noarch
spacewalk-html-2.7.3-1.el7.noarch
spacewalk-java-2.7.86-1.el7.noarch
spacewalk-java-config-2.7.86-1.el7.noarch
spacewalk-java-lib-2.7.86-1.el7.noarch
spacewalk-java-postgresql-2.7.86-1.el7.noarch
spacewalk-postgresql-2.7.2-1.el7.noarch
spacewalk-repo-2.6-0.el7.noarch
spacewalk-reports-2.7.5-1.el7.noarch
spacewalk-schema-2.7.24-1.el7.noarch
spacewalk-search-2.7.5-1.el7.noarch
spacewalk-selinux-2.7.1-1.el7.noarch
spacewalk-setup-2.7.9-1.el7.noarch
spacewalk-setup-jabberd-2.7.1-1.el7.noarch
spacewalk-setup-postgresql-2.7.3-1.el7.noarch
spacewalk-taskomatic-2.7.86-1.el7.noarch
spacewalk-usix-2.7.5-1.el7.noarch
spacewalk-utils-2.7.15-1.el7.noarch

This BZ was closed some time during 2.5, 2.6 or 2.7. Adding to 2.7 tracking bug.