Bug 1264901
Summary: | rabbitmq-server fails to start on boot with systemd notify error | |
---|---|---|---
Product: | Red Hat OpenStack | Reporter: | Andrew Blum <ablum>
Component: | rhosp-director | Assignee: | Dmitry Tantsur <dtantsur>
Status: | CLOSED DUPLICATE | QA Contact: | Shai Revivo <srevivo>
Severity: | high | Priority: | medium
Version: | 7.0 (Kilo) | Target Release: | 10.0 (Newton)
Hardware: | x86_64 | OS: | Linux
Doc Type: | Bug Fix | Type: | Bug
Last Closed: | 2016-08-30 12:58:56 UTC | |
CC: | dtantsur, fdinitto, jeckersb, jslagle, mburns, plemenko, rhel-osp-director-maint | |
Description
Andrew Blum
2015-09-21 12:28:48 UTC
Eck, any thoughts?

I believe a slow system start (OVS bridge) is the cause of the problem. I noticed there were several other notify services that were timing out, so I modified /etc/systemd/system.conf:

[root@director ~]# grep ^DefaultTimeout /etc/systemd/system.conf
DefaultTimeoutStartSec=180s

After a reboot I found that only one service failed to start on boot: openstack-ironic-discoverd, which is Type=simple. This service was failing to start with:

Sep 21 15:12:55 director.uc.example.com ironic-discoverd[549]: File "/usr/lib/python2.7/site-packages/keystoneclient/v2_0/client.py", line 196, in get_raw_token_from_identity_service
Sep 21 15:12:55 director.uc.example.com ironic-discoverd[549]: _("Authorization Failed: %s") % e)
Sep 21 15:12:55 director.uc.example.com ironic-discoverd[549]: keystoneclient.openstack.common.apiclient.exceptions.AuthorizationFailure: Authorization Failed: Unable to establish connection to http://10.200.0.1:5000/v2.0/tokens

To deal with openstack-ironic-discoverd I added the network-online.target dependency (since it doesn't make sense to try to connect to the API endpoint before networking is online):

# mkdir -p /etc/systemd/system/openstack-ironic-discoverd.service.d
# cd /etc/systemd/system/openstack-ironic-discoverd.service.d
# echo -e '[Unit]\nAfter=network-online.target\nWants=network-online.target' > require-networking.conf
# systemctl daemon-reload

Can you attach the rabbitmq log (/var/log/rabbitmq/rabbit@<node>.log) as well as the journal (journalctl -u rabbitmq-server)?

Sure. I'll attach an sosreport. Here is the output from the journalctl command:

[root@director ~]# journalctl -u rabbitmq-server
-- Logs begin at Tue 2015-09-22 13:16:12 EDT, end at Tue 2015-09-22 13:29:50 EDT. --
Sep 22 13:17:13 director.uc.example.com systemd[1]: Starting RabbitMQ broker...
Sep 22 13:17:25 director.uc.example.com systemd[1]: rabbitmq-server.service: Got notification message from PID 1248, but reception only permitted for PID 1161
Sep 22 13:18:43 director.uc.example.com systemd[1]: rabbitmq-server.service operation timed out. Terminating.
Sep 22 13:18:44 director.uc.example.com systemd[1]: Failed to start RabbitMQ broker.
Sep 22 13:18:44 director.uc.example.com systemd[1]: Unit rabbitmq-server.service entered failed state.
[root@director ~]#

Thanks for taking a look!
-- Andrew

Created attachment 1075944
sosreport from RHEL-OSP7 director after rabbitmq-server timed out

Created attachment 1075946
rabbit log
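
The journal line "Got notification message from PID 1248, but reception only permitted for PID 1161" means systemd received a readiness notification from a process other than the one it treats as the service's main PID. For a Type=notify unit, systemd defaults to NotifyAccess=main, which accepts notifications only from the main process. As a rough sketch for spotting this pattern in saved journal output, the following pulls the two PIDs out of such lines. The function name notify_pid_mismatches is illustrative and not part of any existing tool; the sample line is copied from the journal output above.

```shell
#!/bin/sh
# Print "sender_pid allowed_pid" for each systemd notify PID-mismatch
# line read from stdin. Lines that do not match are ignored.
notify_pid_mismatches() {
    sed -n 's/.*Got notification message from PID \([0-9][0-9]*\), but reception only permitted for PID \([0-9][0-9]*\).*/\1 \2/p'
}

# Sample line taken from the journal output in this report.
sample='Sep 22 13:17:25 director.uc.example.com systemd[1]: rabbitmq-server.service: Got notification message from PID 1248, but reception only permitted for PID 1161'

printf '%s\n' "$sample" | notify_pid_mismatches
```

On a live system the same function could be fed from the journal, for example `journalctl -u rabbitmq-server | notify_pid_mismatches`.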
After a brief glance, I'm not sure what the deal is. There's nothing during the bootup in the rabbit log, and the journal normally shows the rabbitmq "splash screen", which is one of the first things the server does when it starts booting, but the splash is absent as well. So either the process is not really being started at all by systemd, or it's hanging for some reason *very* early in the startup process. I'll see if I can reproduce this in my dev environment. Honestly I rarely if ever have rabbitmq-server set to start during bootup, instead relying on it to be managed by pacemaker once the system is up.

This bug did not make the OSP 8.0 release. It is being deferred to OSP 10.

Just FYI: this has been fixed in the RabbitMQ service file since rabbitmq-server-3.6.1-2.el7ost (available in RHOS9), so no further override in /etc/systemd/system is required. Could we just drop generating the RabbitMQ service override in Director?

Hi, sorry, I don't know. Mike, John?

*** This bug has been marked as a duplicate of bug 1348700 ***
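
To check whether a local override for a unit is still present, the drop-in directory can be inspected directly. This is a minimal sketch, assuming the conventional <unit>.d/*.conf drop-in layout used earlier in this report. The function name list_dropins and its root-directory argument are hypothetical, added so the check can be pointed at a scratch directory. The demonstration recreates the openstack-ironic-discoverd drop-in from this report, because the contents of the Director-generated RabbitMQ override are not shown here.

```shell
#!/bin/sh
# List drop-in override files for a unit under a given systemd config root.
# Usage: list_dropins ROOT UNIT   (ROOT is typically /etc/systemd/system)
list_dropins() {
    root=$1
    unit=$2
    for f in "$root/$unit.d"/*.conf; do
        # If the glob matched nothing, the literal pattern remains; skip it.
        [ -e "$f" ] && printf '%s\n' "$f"
    done
    return 0
}

# Demonstration against a scratch directory rather than the live system.
demo_root=$(mktemp -d)
mkdir -p "$demo_root/openstack-ironic-discoverd.service.d"
printf '[Unit]\nAfter=network-online.target\nWants=network-online.target\n' \
    > "$demo_root/openstack-ironic-discoverd.service.d/require-networking.conf"

list_dropins "$demo_root" openstack-ironic-discoverd.service
```

For the case discussed here the call would be `list_dropins /etc/systemd/system rabbitmq-server.service`; empty output suggests no override remains in that directory. The systemd-delta tool gives a broader view of overridden and extended units.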