Created attachment 632793 [details] log_collector

Description of problem:
The RHEV-H upgrade is performed, but the RHEV-M GUI reports that the install failed.

The following error is shown in the GUI:
Host dell-r210ii-08.rhev.lab.eng.brq.redhat.com installation failed. SSH command failed while executing at host '10.34.66.81', refer to logs for further information.

The host is marked as "Install Failed". However, if the user puts the host into maintenance and activates it, the host works properly.

Version-Release number of selected component (if applicable):
Red Hat Enterprise Virtualization Manager Version: '3.1.0-22.el6ev'
both RHEV-H builds: vdsm-4.9-113.4.el6_3.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Add a RHEV-H 20121012.0.el6_3 host to the setup (cluster 3.0)
2. Upgrade the host to RHEV-H 20121023.0.el6_3 via the RHEV-M GUI

Actual results:
The host is upgraded, but the GUI reports that the install failed.

Expected results:
The host is upgraded and the GUI correctly reports the upgrade.

Additional info:
log-collector files attached

engine.log:
2012-10-24 14:20:46,677 ERROR [org.ovirt.engine.core.utils.hostinstall.VdsInstallerSSH] (pool-4-thread-50) SSH error running command 10.34.66.81:'/usr/share/vdsm-reg/vdsm-upgrade': java.io.IOException: Command returned failure code 1 during SSH session '10.34.66.81:22' '/usr/share/vdsm-reg/vdsm-upgrade'
    at org.ovirt.engine.core.utils.ssh.SSHClient.executeCommand(SSHClient.java:442) [engine-utils.jar:]
    at org.ovirt.engine.core.utils.hostinstall.VdsInstallerSSH.executeCommand(VdsInstallerSSH.java:387) [engine-utils.jar:]
    at org.ovirt.engine.core.utils.hostinstall.VdsInstallerSSH.executeCommand(VdsInstallerSSH.java:426) [engine-utils.jar:]
    at org.ovirt.engine.core.bll.OVirtUpgrader.RunStage(OVirtUpgrader.java:54) [engine-bll.jar:]
    at org.ovirt.engine.core.bll.VdsInstaller.Install(VdsInstaller.java:280) [engine-bll.jar:]
    at org.ovirt.engine.core.bll.InstallVdsCommand.executeCommand(InstallVdsCommand.java:110) [engine-bll.jar:]
    at org.ovirt.engine.core.bll.CommandBase.executeWithoutTransaction(CommandBase.java:825) [engine-bll.jar:]
    at org.ovirt.engine.core.bll.CommandBase.executeActionInTransactionScope(CommandBase.java:916) [engine-bll.jar:]
    at org.ovirt.engine.core.bll.CommandBase.runInTransaction(CommandBase.java:1300) [engine-bll.jar:]
    at org.ovirt.engine.core.utils.transaction.TransactionSupport.executeInSuppressed(TransactionSupport.java:168) [engine-utils.jar:]
    at org.ovirt.engine.core.utils.transaction.TransactionSupport.executeInScope(TransactionSupport.java:107) [engine-utils.jar:]
    at org.ovirt.engine.core.bll.CommandBase.execute(CommandBase.java:931) [engine-bll.jar:]
    at org.ovirt.engine.core.bll.CommandBase.executeAction(CommandBase.java:285) [engine-bll.jar:]
    at org.ovirt.engine.core.bll.MultipleActionsRunner.executeValidatedCommands(MultipleActionsRunner.java:182) [engine-bll.jar:]
    at org.ovirt.engine.core.bll.MultipleActionsRunner.RunCommands(MultipleActionsRunner.java:162) [engine-bll.jar:]
    at org.ovirt.engine.core.bll.MultipleActionsRunner$1.run(MultipleActionsRunner.java:84) [engine-bll.jar:]
    at org.ovirt.engine.core.utils.threadpool.ThreadPoolUtil$InternalWrapperRunnable.run(ThreadPoolUtil.java:64) [engine-utils.jar:]
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) [rt.jar:1.7.0_09-icedtea]
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334) [rt.jar:1.7.0_09-icedtea]
    at java.util.concurrent.FutureTask.run(FutureTask.java:166) [rt.jar:1.7.0_09-icedtea]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110) [rt.jar:1.7.0_09-icedtea]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603) [rt.jar:1.7.0_09-icedtea]
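For context, the failing step above boils down to the engine opening an SSH session to the host, running /usr/share/vdsm-reg/vdsm-upgrade, and treating a non-zero exit code as an installation failure. The real code is the Java VdsInstallerSSH/OVirtUpgrader classes shown in the trace; the following is only a minimal Python sketch of that flow, where the use of paramiko and the credentials are assumptions for illustration:

import paramiko

HOST = "10.34.66.81"                                # host address from the log above
UPGRADE_CMD = "/usr/share/vdsm-reg/vdsm-upgrade"    # node-side upgrade script

client = paramiko.SSHClient()
client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
client.connect(HOST, username="root", password="***")  # hypothetical credentials

# Run the upgrade script on the node and wait for its exit status.
stdin, stdout, stderr = client.exec_command(UPGRADE_CMD)
rc = stdout.channel.recv_exit_status()
client.close()

# The 3.1 engine treats any non-zero exit code as "Install Failed", which is
# what the GUI reports here even though the node itself upgraded successfully.
if rc != 0:
    raise RuntimeError("Command returned failure code %d during SSH session" % rc)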
Created attachment 632794 [details] screenshot 1
Dup of bug#849315? To confirm, please attach the logs at /var/log/vdsm-reg/vds_bootstrap_upgrade.* Thanks.
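For anyone collecting these, the requested files can be copied off the node with something like the sketch below (assuming root SSH access to the host and scp in PATH; the host address is the one from the report above):

import subprocess

HOST = "10.34.66.81"   # host address from comment #0

# Copy all vds_bootstrap_upgrade.* logs from the node into the current
# directory; the glob is expanded by the remote shell.
subprocess.check_call([
    "scp",
    "root@%s:/var/log/vdsm-reg/vds_bootstrap_upgrade.*" % HOST,
    ".",
])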
Created attachment 632813 [details] vds_bootstrap_upgrade.20121024_122039
Looks like bug#849315.
Created attachment 632837 [details] vdsm-upgrade script
Confirmed. *** This bug has been marked as a duplicate of bug 849315 ***
This bug is not really a duplicate of 849315, since BZ 849315 solves this problem for RHEV-H hosts with vdsm 4.9.6.x. I encountered this problem with a RHEV-H host that contained vdsm 4.9-113.4.

Please push the fixes used to solve bug 849315 also to the z-stream of 4.9. Please consider also pushing:
http://gerrit.ovirt.org/7301
http://gerrit.ovirt.org/7279
Every bug in one release is likely to be in the previous one... :)

Barak, if you think we should backport this, please flag.
(In reply to comment #8)
> Every bug in one release is likely to be in the previous one... :)
>
> Barak, if you think we should backport this, please flag.

Wouldn't backporting this only cause upgrades from an older RHEV-H to fail? The next 3.0 update of vdsm is most likely to be 3.1 already, so unless there is a very good reason, we wouldn't backport this.
(In reply to comment #9)
> (In reply to comment #8)
> > Every bug in one release is likely to be in the previous one... :)
> >
> > Barak, if you think we should backport this, please flag.
>
> Wouldn't backporting this only cause upgrades from an older RHEV-H to fail?
> The next 3.0 update of vdsm is most likely to be 3.1 already, so unless
> there is a very good reason, we wouldn't backport this.

Currently, upgrading from an older RHEV-H does fail. rhevm-3.0 does not check for the status... so this is natural with this fix.

Our options:

1. Document in the release notes that this is expected behaviour when upgrading an older RHEV-H.

2. Fix and push to the z-stream, hoping that people will upgrade RHEV-H to this version before upgrading to rhevm-3.1.
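A rough sketch of the behavioural difference described in comment #10 (illustrative only; the engine code is Java, and these function names are hypothetical):

def host_status_after_upgrade_3_0(upgrade_exit_code):
    # rhevm-3.0 behaviour as described above: the exit status of
    # /usr/share/vdsm-reg/vdsm-upgrade is not checked, so the host is
    # considered upgraded regardless of what the old node-side script returned.
    return "Up"

def host_status_after_upgrade_3_1(upgrade_exit_code):
    # rhevm-3.1 behaviour: the exit status is now checked, so an older node
    # whose script exits non-zero is reported as "Install Failed" even though
    # the node itself upgraded fine.
    return "Up" if upgrade_exit_code == 0 else "Install Failed"

# Example: an older node whose upgrade script exits with status 1 even though
# the upgrade itself succeeded.
assert host_status_after_upgrade_3_0(1) == "Up"
assert host_status_after_upgrade_3_1(1) == "Install Failed"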
(In reply to comment #10)
> Currently, upgrading from an older RHEV-H does fail.

Does it fail for RHEV-M 3.0 customers, or for 3.1 customers as well?

> rhevm-3.0 does not check for the status... so this is natural with this fix.
>
> Our options:
>
> 1. Document in the release notes that this is expected behaviour when
> upgrading an older RHEV-H.

If it only fails for the first upgrade, then that is one thing. If it always fails for a 3.0 customer with a newer RHEV-H, it is a problem which we should consider fixing ASAP in 3.0.8; we actually try very hard to avoid such regressions. Is there a workaround?

> 2. Fix and push to the z-stream, hoping that people will upgrade RHEV-H to
> this version before upgrading to rhevm-3.1.
(In reply to comment #11)
> (In reply to comment #10)
>
> > Currently, upgrading from an older RHEV-H does fail.
>
> Does it fail for RHEV-M 3.0 customers, or for 3.1 customers as well?
>
> > rhevm-3.0 does not check for the status... so this is natural with this fix.
> > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>
> > Our options:
> >
> > 1. Document in the release notes that this is expected behaviour when
> > upgrading an older RHEV-H.
>
> If it only fails for the first upgrade, then that is one thing.

This is what happens.

> If it always fails for a 3.0 customer with a newer RHEV-H, it is a problem
> which we should consider fixing ASAP in 3.0.8; we actually try very hard to
> avoid such regressions.

It should not. I guess Martin can confirm.

> Is there a workaround?

Yes, ignore the error as in comment #0 and activate the host.

> > 2. Fix and push to the z-stream, hoping that people will upgrade RHEV-H to
> > this version before upgrading to rhevm-3.1.
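For reference, the workaround above (put the host into maintenance, then activate it) can also be scripted against the REST API instead of the GUI. A minimal sketch; the engine URL, credentials and host id are placeholders, and the /deactivate and /activate host actions are assumed to behave as in the RHEV 3.1 REST API:

import requests

ENGINE = "https://rhevm.example.com/api"   # hypothetical engine URL
AUTH = ("admin@internal", "password")      # hypothetical credentials
HOST_ID = "<host-uuid>"                    # id of the host marked "Install Failed"

# Put the host into maintenance, then activate it again.
for action in ("deactivate", "activate"):
    r = requests.post(
        "%s/hosts/%s/%s" % (ENGINE, HOST_ID, action),
        data="<action/>",
        headers={"Content-Type": "application/xml"},
        auth=AUTH,
        verify=False,   # lab setup with a self-signed certificate
    )
    r.raise_for_status()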
If this only happens on the first upgrade to a new RHEV-H, and the same happens for 3.1 users, then what exactly is there to fix? Isn't this the same behavior we'd see for a 3.1 user on the first upgrade from an older RHEV-H?
What I described in this BZ is an upgrade from RHEV-H 20121012.0.el6_3 to 20121023.0.el6_3; both of them have vdsm-4.9-113.4.el6_3.x86_64.
(In reply to comment #13)
> If this only happens on the first upgrade to a new RHEV-H, and the same
> happens for 3.1 users, then what exactly is there to fix? Isn't this the
> same behavior we'd see for a 3.1 user on the first upgrade from an older
> RHEV-H?

I honestly don't understand the question.
Martin,

We cannot avoid this error: even if we upgrade from node1 to node2, node1 has this bug... and as there are no planned node builds without vdsm 4.9.6 (maybe one), it is not worth distributing a z-stream of the node.

I am closing this for now. Thank you.