Bug 1106564
| Summary: | Vdsm failed to create rhevm bridge (happened during hosted-engine --deploy) | ||||||||
|---|---|---|---|---|---|---|---|---|---|
| Product: | Red Hat Enterprise Virtualization Manager | Reporter: | Artyom <alukiano> | ||||||
| Component: | vdsm | Assignee: | Nobody <nobody> | ||||||
| Status: | CLOSED INSUFFICIENT_DATA | QA Contact: | Aharon Canan <acanan> | ||||||
| Severity: | high | Docs Contact: | |||||||
| Priority: | unspecified | ||||||||
| Version: | 3.4.0 | CC: | alukiano, bazulay, danken, gklein, iheim, jmoskovc, lpeer, mavital, sbonazzo, ybronhei, yeylon | ||||||
| Target Milestone: | --- | ||||||||
| Target Release: | --- | ||||||||
| Hardware: | x86_64 | ||||||||
| OS: | Linux | ||||||||
| Whiteboard: | integration | ||||||||
| Fixed In Version: | Doc Type: | Bug Fix | |||||||
| Doc Text: | Story Points: | --- | |||||||
| Clone Of: | Environment: | ||||||||
| Last Closed: | 2014-06-23 07:44:31 UTC | Type: | Bug | ||||||
| Regression: | --- | Mount Type: | --- | ||||||
| Documentation: | --- | CRM: | |||||||
| Verified Versions: | Category: | --- | |||||||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||||
| Cloudforms Team: | --- | Target Upstream Version: | |||||||
| Embargoed: | |||||||||
| Attachments: |
|
||||||||
supervdsmd died, since its libvirt connection broke, since libvirtd was restarted.
Could it be that `hosted-engine --deploy` calls `vdsm-tool configure` while vdsm is running? If so, it would be better to stop vdsm, supervdsm, and libvirt first, configure them, and restart them.
MainProcess|Thread-16::DEBUG::2014-06-09 14:56:37,586::supervdsmServer::96::SuperVdsm.ServerCallback::(wrapper) call addNetwork with ('rhevm', {'force': 'False', 'nics': ['eth0'], 'bootproto': 'dhcp', 'bridged': 'True', 'blockingdhcp': 'true', 'ONBOOT': 'yes'}) {}
MainProcess|Thread-16::DEBUG::2014-06-09 14:56:37,587::utils::642::root::(execCmd) '/sbin/ip route show to 0.0.0.0/0 table all' (cwd None)
MainProcess|Thread-16::DEBUG::2014-06-09 14:56:37,605::utils::662::root::(execCmd) SUCCESS: <err> = ''; <rc> = 0
MainProcess|Thread-16::WARNING::2014-06-09 14:56:37,608::libvirtconnection::116::root::(wrapper) connection to libvirt broken. ecode: 1 edom: 7
MainProcess|Thread-16::CRITICAL::2014-06-09 14:56:37,608::libvirtconnection::118::root::(wrapper) taking calling process down.
MainThread::DEBUG::2014-06-09 14:56:37,609::supervdsmServer::424::SuperVdsm.Server::(main) Terminated normally
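For anyone triaging a similar supervdsm.log, here is a minimal Python sketch that pulls out the WARNING and CRITICAL entries. The field layout is an assumption inferred only from the excerpt above, not from vdsm's logging configuration, and the names `LogRecord`, `parse_line` and `severe_records` are hypothetical helpers, not part of vdsm.

```python
import re
from typing import Iterable, List, NamedTuple, Optional

# Layout assumed from the excerpt above (not taken from vdsm's logging config):
# [<process>|]<thread>::<LEVEL>::<timestamp>::<module>::<line>::<logger>::(<function>) <message>
LOG_LINE = re.compile(
    r"^(?:(?P<process>[^|:]+)\|)?"
    r"(?P<thread>[^:]+)::"
    r"(?P<level>[A-Z]+)::"
    r"(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})::"
    r"(?P<module>[^:]+)::"
    r"(?P<lineno>\d+)::"
    r"(?P<logger>[^:]+)::"
    r"\((?P<function>[^)]*)\) "
    r"(?P<message>.*)$"
)


class LogRecord(NamedTuple):
    process: Optional[str]  # None when the line has no "<process>|" prefix
    thread: str
    level: str
    timestamp: str
    module: str
    lineno: int
    logger: str
    function: str
    message: str


def parse_line(line: str) -> Optional[LogRecord]:
    """Return a LogRecord, or None if the line does not fit the assumed layout."""
    match = LOG_LINE.match(line.strip())
    if match is None:
        return None
    fields = match.groupdict()
    fields["lineno"] = int(fields["lineno"])
    return LogRecord(**fields)


def severe_records(lines: Iterable[str]) -> List[LogRecord]:
    """Keep only WARNING, ERROR and CRITICAL entries, in file order."""
    wanted = {"WARNING", "ERROR", "CRITICAL"}
    records = (parse_line(line) for line in lines)
    return [rec for rec in records if rec is not None and rec.level in wanted]


if __name__ == "__main__":
    sample = [
        "MainProcess|Thread-16::WARNING::2014-06-09 14:56:37,608::libvirtconnection::116::root::(wrapper) connection to libvirt broken. ecode: 1 edom: 7",
        "MainProcess|Thread-16::CRITICAL::2014-06-09 14:56:37,608::libvirtconnection::118::root::(wrapper) taking calling process down.",
        "MainThread::DEBUG::2014-06-09 14:56:37,609::supervdsmServer::424::SuperVdsm.Server::(main) Terminated normally",
    ]
    for rec in severe_records(sample):
        print(rec.timestamp, rec.level, rec.module, rec.message)
```

Lines that do not match the assumed layout (tracebacks, wrapped messages) are skipped rather than guessed at.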
This is a question for the HE setup developers.

(In reply to Dan Kenigsberg from comment #1)
> supervdsmd died, since its libvirt connection broke, since libvirtd was restarted.
>
> Could it be that `hosted-engine --deploy` calls `vdsm-tool configure` while vdsm is running? If so, it would be better to stop vdsm, supervdsm, and libvirt first, configure them, and restart them.
It may be. But services are ensured to be configured and started before trying to add the bridge. However, we can change setup to shut down the services before calling configure.

I was also pretty sure that configure took care of shutting them down if found running. Artyom, can you reproduce? Can you also attach the hosted engine logs?

(In reply to Sandro Bonazzola from comment #3)
> I was also pretty sure that configure took care of shutting them down if found running.
You are right - if this is indeed the case, it should be solved there.

Note that libvirtd was down for 12 long minutes.
2014-06-09 11:43:30.862+0000: 9947: debug : virConnectCompareCPU:17135 : conn=0x7ff28c05aff0, xmlDesc=<cpu match="minimum"><model>SandyBridge</model><vendor>Intel</vendor></cpu>, flags=0
2014-06-09 11:56:27.278+0000: 9929: debug : virHookCheck:119 : No hook script /etc/libvirt/hooks/daemon
2014-06-09 11:56:27.278+0000: 9929: debug : virHookCheck:119 : No hook script /etc/libvirt/hooks/qemu
2014-06-09 11:56:27.278+0000: 9929: debug : virHookCheck:119 : No hook script /etc/libvirt/hooks/lxc
2014-06-09 11:56:27.278+0000: 9929: info : virNetlinkEventServiceStopAll:420 : stopping all netlink event services
2014-06-09 11:56:27.450+0000: 20090: info : libvirt version: 0.10.2, package: 29.el6_5.8 (Red Hat, Inc. <http://bugzilla.redhat.com/bugzilla>, 2014-05-14-10:17:57, x86-027.build.eng.bos.redhat.com)
Artyom, supervdsm.log suggests that it was running since 2014-06-06 14:27:25,199. Are you 100% sure that the host had no vdsm on it before deployment?
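The "12 long minutes" figure can be checked mechanically. The following Python sketch finds the longest silence between consecutive entries in a libvirtd log, assuming every entry starts with a timestamp shaped like the ones quoted above; `entry_times` and `largest_gap` are hypothetical helper names, and the sample lines are copied from the excerpt.

```python
import re
from datetime import datetime, timedelta
from typing import Iterable, List, Optional, Tuple

# Timestamp prefix as it appears in the quoted libvirtd lines,
# e.g. "2014-06-09 11:43:30.862+0000: 9947: debug : ..."
STAMP = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}[+-]\d{4}): ")


def entry_times(lines: Iterable[str]) -> List[datetime]:
    """Extract the timezone-aware timestamp of every line that has one."""
    times = []
    for line in lines:
        match = STAMP.match(line)
        if match:
            times.append(datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S.%f%z"))
    return times


def largest_gap(lines: Iterable[str]) -> Optional[Tuple[timedelta, datetime, datetime]]:
    """Return (gap, entry before, entry after) for the longest silence, if any."""
    times = entry_times(lines)
    if len(times) < 2:
        return None
    before, after = max(zip(times, times[1:]), key=lambda pair: pair[1] - pair[0])
    return after - before, before, after


if __name__ == "__main__":
    sample = [
        '2014-06-09 11:43:30.862+0000: 9947: debug : virConnectCompareCPU:17135 : conn=0x7ff28c05aff0, xmlDesc=<cpu match="minimum"><model>SandyBridge</model><vendor>Intel</vendor></cpu>, flags=0',
        "2014-06-09 11:56:27.278+0000: 9929: debug : virHookCheck:119 : No hook script /etc/libvirt/hooks/daemon",
        "2014-06-09 11:56:27.278+0000: 9929: info : virNetlinkEventServiceStopAll:420 : stopping all netlink event services",
    ]
    gap, before, after = largest_gap(sample)
    print(f"longest silence: {gap} (from {before} to {after})")
```

On the quoted lines the longest silence is just under 13 minutes, between the 11:43:30 and 11:56:27 entries. A gap in the log only shows that nothing was written; it does not by itself prove the daemon was down.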
I failed to reproduce exactly the same error, but I ran the deployment process on the same host with the same parameters and encountered a different problem: during rhevm bridge configuration, for some reason the host failed to receive an IP address from DHCP. I will attach the hosted-engine-setup log; I hope it helps to understand what happened, since the parameters should be the same. If I manage to catch the original error, I will update the setup log.
Created attachment 908183 [details]
hosted-engine-setup.log
(In reply to Dan Kenigsberg from comment #6)
> Note that libvirtd was down for 12 long minutes.
>
> 2014-06-09 11:43:30.862+0000: 9947: debug : virConnectCompareCPU:17135 : conn=0x7ff28c05aff0, xmlDesc=<cpu match="minimum"><model>SandyBridge</model><vendor>Intel</vendor></cpu>, flags=0
> 2014-06-09 11:56:27.278+0000: 9929: debug : virHookCheck:119 : No hook script /etc/libvirt/hooks/daemon
> 2014-06-09 11:56:27.278+0000: 9929: debug : virHookCheck:119 : No hook script /etc/libvirt/hooks/qemu
> 2014-06-09 11:56:27.278+0000: 9929: debug : virHookCheck:119 : No hook script /etc/libvirt/hooks/lxc
> 2014-06-09 11:56:27.278+0000: 9929: info : virNetlinkEventServiceStopAll:420 : stopping all netlink event services
> 2014-06-09 11:56:27.450+0000: 20090: info : libvirt version: 0.10.2, package: 29.el6_5.8 (Red Hat, Inc. <http://bugzilla.redhat.com/bugzilla>, 2014-05-14-10:17:57, x86-027.build.eng.bos.redhat.com)
>
> Artyom, supervdsm.log suggests that it was running since 2014-06-06 14:27:25,199. Are you 100% sure that the host had no vdsm on it before deployment?
Yes, it was a clean host; for some reason the vdsm log differs from the libvirt log by 3 hours. I also tried to run a clean deployment on other hosts and it finished successfully, so maybe the problem is related to this specific host.

Sandro - we must call configure with the --force flag in that case, which stops the running services that relate to vdsm before restarting anything else.
Dan - I doubt that configure stops libvirt and not supervdsm; it's the same logic. I think something else happened to libvirt there.

(In reply to Artyom from comment #9)
> > 2014-06-09 11:56:27.450+0000: 20090: info : libvirt version: 0.10.2, package: 29.el6_5.8 (Red Hat, Inc. <http://bugzilla.redhat.com/bugzilla>, 2014-05-14-10:17:57, x86-027.build.eng.bos.redhat.com)
> >
> > Artyom, supervdsm.log suggests that it was running since 2014-06-06 14:27:25,199. Are you 100% sure that the host had no vdsm on it before deployment?
>
> Yes, it was a clean host; for some reason the vdsm log differs from the libvirt log by 3 hours.
The libvirt log is in UTC and your vdsm log is in Israel summer time (GMT+3), but this has nothing to do with the fact that your host was not clean when you started deployment. According to the logs, supervdsm had been running there since 3 days earlier.

Please reopen this bug when it reproduces, and include libvirtd.log, as Yaniv suspects that libvirt has crashed and not restarted. |
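To line the two logs up, the vdsm timestamps can be converted to UTC before comparing them with libvirtd.log. This is a minimal Python sketch assuming a fixed +03:00 offset for the vdsm log, as stated in the comment above; `vdsm_to_utc` and `libvirt_to_utc` are hypothetical helper names.

```python
from datetime import datetime, timedelta, timezone

# Assumption from the comment above: vdsm logged in local time at GMT+3.
VDSM_OFFSET = timezone(timedelta(hours=3))


def vdsm_to_utc(stamp: str, tz: timezone = VDSM_OFFSET) -> datetime:
    """Convert a vdsm-style stamp such as '2014-06-09 14:56:37,608' to UTC."""
    local = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S,%f").replace(tzinfo=tz)
    return local.astimezone(timezone.utc)


def libvirt_to_utc(stamp: str) -> datetime:
    """Parse a libvirt-style stamp such as '2014-06-09 11:56:27.450+0000'."""
    return datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S.%f%z").astimezone(timezone.utc)


if __name__ == "__main__":
    # supervdsm: "taking calling process down"
    supervdsm_exit = vdsm_to_utc("2014-06-09 14:56:37,608")
    # libvirtd: first line logged by the new process (pid 20090)
    libvirtd_start = libvirt_to_utc("2014-06-09 11:56:27.450+0000")
    print("supervdsm exit (UTC):", supervdsm_exit.isoformat())
    print("libvirtd start (UTC):", libvirtd_start.isoformat())
    print("seconds between them:", (supervdsm_exit - libvirtd_start).total_seconds())
```

With the offset applied, supervdsm's "taking calling process down" entry falls about ten seconds after the first line of the new libvirtd process, which is consistent with the observation that supervdsmd died because libvirtd was restarted.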
Created attachment 904762 [details]
vdsm, supervdsm and libvirt logs

Description of problem:
Vdsm failed to create the rhevm bridge; this happened during hosted-engine --deploy at the 'Configuring the management bridge' stage.

Version-Release number of selected component (if applicable):
vdsm-4.14.7-3.el6ev.x86_64
libvirt-0.10.2-29.el6_5.8.x86_64

How reproducible:
50%

Steps to Reproduce:
1. On a clean host (without any vdsm or libvirt packages), install hosted-engine: yum install ovirt-hosted-engine-setup.noarch -y
2. Run hosted-engine --deploy and continue until the 'Configuring the management bridge' stage
3.

Actual results:
Setup failed with an error

Expected results:
Setup succeeds in configuring the rhevm bridge without any errors

Additional info: