Created attachment 971058 [details]
upgrade.tar.gz

Description of problem:
[6.6-3.5] Failed to upgrade hypervisor via RHEVM 3.5

RHEV-H UI:
1. Networking status shows as unknown.
2. NIC status shows as unconfigured.

RHEVM UI pop-up shows:
Host dell-pet105-02.qe.lab.eng.nay.redhat.com is not responding. It will stay in Connecting state for a grace period of 60 seconds and after that an attempt to fence the host will be issued.

Version-Release number of selected component (if applicable):
rhev-hypervisor6-6.6-20141218.0.el6ev
ovirt-node-3.1.0-0.37.20141218gitcf277e1.el6.noarch
vdsm-4.16.8.1-4.el6ev.x86_64
RHEVM vt13.4
rhevm-3.5.0-0.26.el6ev.noarch

How reproducible:
100%

Steps to Reproduce:
1. Install rhev-hypervisor6-6.6-20141218.0.
2. Register the hypervisor to RHEVM.
3. Approve it.
4. Put the host into maintenance.
5. Click the Upgrade button and upgrade the host to the same version.

Actual results:
[6.6-3.5] Failed to upgrade the hypervisor via RHEVM (VT).

Expected results:
Upgrading the hypervisor via RHEVM succeeds.

Additional info:
This is not specific to pet105; I also hit this bug on a USB disk install. I tested 7.0 (1218) upgrading to itself; it does not show this issue.
A first investigation shows that libvirtd and vdsmd are not coming up correctly.
I can reproduce this issue on:
rhev-hypervisor7-7.0-20141218.0.el7ev
ovirt-node-3.1.0-0.37.20141218gitcf277e1.el7.noarch
(In reply to Ying Cui from comment #3)
> I can reproduce this issue on rhev-hypervisor7-7.0-20141218.0.el7ev
> ovirt-node-3.1.0-0.37.20141218gitcf277e1.el7.noarch

Are you sure it is that one?

The problem on 6.6 was that vdsmd and libvirtd were not coming up again.

What is the problem you are seeing on 7.0?
(In reply to Fabian Deutsch from comment #4)
> (In reply to Ying Cui from comment #3)
> > I can reproduce this issue on rhev-hypervisor7-7.0-20141218.0.el7ev
> > ovirt-node-3.1.0-0.37.20141218gitcf277e1.el7.noarch
>
> Are you sure it is that one?
>
> The problem on 6.6 was that vdsmd and libvirtd were not coming up again.
>
> What is the problem you are seeing on 7.0?

Upgrading RHEV-H 7.0 to itself via the RHEVM portal failed. libvirtd was running, but vdsmd was not coming up. The RHEV-H host did not reboot automatically after the upgrade, and the RHEVM portal displayed:

Host dell-per515-02.qe.lab.eng.nay.redhat.com is not responding. It will stay in Connecting state for a grace period of 60 seconds and after that an attempt to fence the host will be issued.

My test steps:
1. Install RHEV-H 7.0 1218 successfully.
2. Register RHEV-H 7.0 to RHEVM successfully.
3. In the RHEVM portal, approve the host so it comes up.
4. Put the host into maintenance.
5. Click the Upgrade button.
6. Select the RHEV-H 7.0 1218 ISO, then click OK.
7. Check /data/updates/ on the RHEV-H host; the ISO has been transferred.

Actual results:
1. RHEV-H did not reboot automatically after the upgrade.
2. The RHEVM portal shows the host as non-responsive.
3. Upgrading RHEV-H 7.0 to itself via RHEVM 3.5 failed.
[root@dell-per515-02 admin]# systemctl status vdsmd
vdsmd.service - Virtual Desktop Server Manager
   Loaded: loaded (/usr/lib/systemd/system/vdsmd.service; enabled)
   Active: inactive (dead) since Mon 2014-12-22 15:50:15 UTC; 3min 51s ago
  Process: 53242 ExecStopPost=/usr/libexec/vdsm/vdsmd_init_common.sh --post-stop (code=exited, status=0/SUCCESS)
  Process: 53024 ExecStart=/usr/share/vdsm/daemonAdapter -0 /dev/null -1 /dev/null -2 /dev/null /usr/share/vdsm/vdsm (code=exited, status=0/SUCCESS)
  Process: 52836 ExecStartPre=/usr/libexec/vdsm/vdsmd_init_common.sh --pre-start (code=exited, status=0/SUCCESS)
 Main PID: 53024 (code=exited, status=0/SUCCESS)
   CGroup: /system.slice/vdsmd.service

Dec 22 15:48:24 dell-per515-02.qe.lab.eng.nay.redhat.com python[53024]: DIGEST-MD5 client step 2
Dec 22 15:48:24 dell-per515-02.qe.lab.eng.nay.redhat.com python[53024]: DIGEST-MD5 ask_user_info()
Dec 22 15:48:24 dell-per515-02.qe.lab.eng.nay.redhat.com python[53024]: DIGEST-MD5 make_client_response()
Dec 22 15:48:24 dell-per515-02.qe.lab.eng.nay.redhat.com python[53024]: DIGEST-MD5 client step 3
Dec 22 15:50:11 dell-per515-02.qe.lab.eng.nay.redhat.com systemd[1]: Stopping Virtual Desktop Server Manager...
Dec 22 15:50:15 dell-per515-02.qe.lab.eng.nay.redhat.com python[53024]: DIGEST-MD5 client mech dispose
Dec 22 15:50:15 dell-per515-02.qe.lab.eng.nay.redhat.com python[53024]: DIGEST-MD5 common mech dispose
Dec 22 15:50:15 dell-per515-02.qe.lab.eng.nay.redhat.com vdsmd_init_common.sh[53242]: vdsm: Running run_final_hooks
Dec 22 15:50:15 dell-per515-02.qe.lab.eng.nay.redhat.com systemd[1]: Stopped Virtual Desktop Server Manager.
Dec 22 15:50:17 dell-per515-02.qe.lab.eng.nay.redhat.com systemd[1]: Stopped Virtual Desktop Server Manager.
[root@dell-per515-02 admin]# systemctl status libvirtd
libvirtd.service - Virtualization daemon
   Loaded: loaded (/usr/lib/systemd/system/libvirtd.service; enabled)
   Active: active (running) since Mon 2014-12-22 15:48:12 UTC; 12min ago
 Main PID: 52835 (libvirtd)
   CGroup: /system.slice/libvirtd.service
           └─52835 /usr/sbin/libvirtd --listen

Dec 22 15:52:02 dell-per515-02.qe.lab.eng.nay.redhat.com libvirtd[52835]: Cannot find 'pm-is-supported' in path: No such file or directory
Dec 22 15:52:02 dell-per515-02.qe.lab.eng.nay.redhat.com libvirtd[52835]: Failed to get host power management capabilities
Dec 22 15:52:02 dell-per515-02.qe.lab.eng.nay.redhat.com libvirtd[52835]: Cannot find 'pm-is-supported' in path: No such file or directory
Dec 22 15:52:02 dell-per515-02.qe.lab.eng.nay.redhat.com libvirtd[52835]: Failed to get host power management capabilities
Dec 22 15:52:02 dell-per515-02.qe.lab.eng.nay.redhat.com libvirtd[52835]: Cannot find 'pm-is-supported' in path: No such file or directory
Dec 22 15:52:02 dell-per515-02.qe.lab.eng.nay.redhat.com libvirtd[52835]: Failed to get host power management capabilities
Dec 22 15:52:03 dell-per515-02.qe.lab.eng.nay.redhat.com libvirtd[52835]: Cannot find 'pm-is-supported' in path: No such file or directory
Dec 22 15:52:03 dell-per515-02.qe.lab.eng.nay.redhat.com libvirtd[52835]: Failed to get host power management capabilities
Dec 22 15:52:03 dell-per515-02.qe.lab.eng.nay.redhat.com libvirtd[52835]: Cannot find 'pm-is-supported' in path: No such file or directory
Dec 22 15:52:03 dell-per515-02.qe.lab.eng.nay.redhat.com libvirtd[52835]: Failed to get host power management capabilities
Created attachment 972079 [details]
sosreport_rhevh7.0_1218
# cd /tmp
# ll
-rw-r--r--. 1 root root 0 Dec 22 15:50 ovirt.log
-rw-r--r--. 1 root root 0 Dec 22 15:50 ovirt_upgraded
Created attachment 972080 [details]
engine_log_rhevh7.0_1218_comment 5
On the surface, the RHEV-H 6 self-upgrade and RHEV-H 7 self-upgrade failures look similar, but if your investigation finds that they do not share a root cause, we can split them into two bugs. Thanks.
Created attachment 972088 [details]
/var/log for comment 5

Some logs are not in the sosreport, so I attached all of /var/log/.
(In reply to Ying Cui from comment #12)
> Created attachment 972088 [details]
> /var/log for comment 5
>
> some logs are not in sosreport, so I pasted all /var/log/

Hi Ying,

We have two different issues here: the original one that you reported, and the other one listed below, which should be fixed by ovirt-node-plugin-vdsm-0.2.0-17. The second one happened because augeas failed when there were unneeded double quotes in /etc/default/ovirt for the MANAGED_BY key.

As a temporary workaround until the next test ISO is available (to test the original report):

- Host is UP in RHEV-M.
- In the TUI, press F2, edit /etc/default/ovirt, and remove the doubled quotes ("") in MANAGED_BY.
  Example, from:
  ""RHEV-M https://IP:443""
  to:
  "RHEV-M https://IP:443"
- Put the host into maintenance in RHEV-M and execute the upgrade.

ovirt-node-upgrade.log
=========================
2014-12-22 15:50:23,706 - ERROR - ovirt-node-upgrade - Error: Upgrade Failed: Unable to save to file!
Traceback (most recent call last):
  File "/usr/sbin/ovirt-node-upgrade", line 364, in run
    self._run_upgrade()
  File "/usr/sbin/ovirt-node-upgrade", line 255, in _run_upgrade
    if not upgrade.ovirt_boot_setup():
  File "/usr/lib/python2.7/site-packages/ovirtnode/install.py", line 687, in ovirt_boot_setup
  File "/usr/lib/python2.7/site-packages/ovirtnode/ovirtfunctions.py", line 371, in disable_firstboot
  File "/usr/lib/python2.7/site-packages/augeas.py", line 385, in save
IOError: Unable to save to file!
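For illustration only, the quote fix in the workaround above can be sketched with sed instead of hand-editing. This is a minimal sketch against a temporary copy of the file; the real file is /etc/default/ovirt, and on RHEV-H any change to it may additionally need to be persisted across reboots:

```shell
# Simulated copy of /etc/default/ovirt showing the doubled quotes
# described in the workaround (the real value on a host will differ).
cat > /tmp/ovirt.default <<'EOF'
MANAGED_BY=""RHEV-M https://IP:443""
EOF

# Collapse the doubled quotes ("" -> ") on the MANAGED_BY line only,
# leaving any other lines untouched.
sed -i '/^MANAGED_BY/ s/""/"/g' /tmp/ovirt.default

# Verify the result: the value should now carry single double quotes.
grep '^MANAGED_BY' /tmp/ovirt.default
```

After this, the MANAGED_BY line parses as an ordinary quoted shell variable, which is what the augeas shellvars handling expects.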
> We have two different issues here: the original one that you reported, and
> the other one listed below, which should be fixed by
> ovirt-node-plugin-vdsm-0.2.0-17. The second one happened because augeas
> failed when there were unneeded double quotes in /etc/default/ovirt for the
> MANAGED_BY key.

Thanks, Douglas, for the detailed explanation. Let me try this workaround.

And I need to double-confirm with you: do I need to split this bug into two? See my comment 11. Thanks.
(In reply to Ying Cui from comment #14)
> Thanks, Douglas, for the detailed explanation.
> Let me try this workaround.
>
> And I need to double-confirm with you: do I need to split this bug into
> two? See my comment 11. Thanks.

Hi Ying,

It's up to you: if you want to make a record of it, yes. On the other hand, we already have the package which should fix it (ovirt-node-plugin-vdsm-0.2.0-17).
> > Thanks, Douglas, for the detailed explanation.
> > Let me try this workaround.

I tested the workaround from comment 13 on the RHEV-H 7.0 and RHEV-H 6.6 builds, and it works well. RHEV-H 7.0 and RHEV-H 6.6 can each be upgraded to themselves via RHEVM, and the host comes Up in RHEVM automatically after upgrading.

rhev-hypervisor7-7.0-20141218.0.el7ev
ovirt-node-3.1.0-0.37.20141218gitcf277e1.el7.noarch
rhev-hypervisor6-6.6-20141218.0.el6ev
ovirt-node-3.1.0-0.37.20141218gitcf277e1.el6.noarch

> > And I need to double-confirm with you: do I need to split this bug into
> > two? See my comment 11. Thanks.
>
> Hi Ying, it's up to you: if you want to make a record of it, yes. On the
> other hand, we already have the package which should fix it
> (ovirt-node-plugin-vdsm-0.2.0-17).

Yes, according to comment 13 there are two different issues here, and the other issue should be fixed in the ovirt-node-plugin-vdsm component, so we'd better open another bug on that component to record this change. Thanks.
Filed new bug 1177216 on the ovirt-node-plugin-vdsm component to track the issue where unneeded double quotes in /etc/default/ovirt for the MANAGED_BY key cause the upgrade to fail.
It looks like the "remaining" parts of this bug are the same cause which is described in bug 1179068. Douglas, what do you think?
(In reply to Fabian Deutsch from comment #18)
> It looks like the "remaining" parts of this bug are the same cause which is
> described in bug 1179068.
>
> Douglas, what do you think?

Hi Fabian,

I checked the original report description and reproduced the issue locally again. Basically, I would say the report happened mainly because of bz#1177216.

For now, I will close this as a duplicate of bz#1177216; if there are other issues around this topic, we can re-open this one.

shaochen/Ying, if you have any questions, please let me know.

*** This bug has been marked as a duplicate of bug 1177216 ***