Bug 847103 - Upgrade script should successfully terminate before reboot occurs
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: ovirt-node
Version: 6.3
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: urgent
Target Milestone: rc
Target Release: 6.4
Assigned To: Fabian Deutsch
QA Contact: Virtualization Bugs
Keywords: TestBlocker, ZStream
Depends On:
Blocks: 854765 855361 863151 867562
Reported: 2012-08-09 13:49 EDT by Alon Bar-Lev
Modified: 2016-04-26 11:02 EDT

See Also:
Fixed In Version: ovirt-node-2.5.0-8.el6
Doc Type: Bug Fix
Doc Text:
On completion of hypervisor upgrades the hypervisor is rebooted. Previously this reboot was performed from within the shell session responsible for performing the upgrade. This resulted in the status of the host being listed in the Manager as "Install Failed" - even when the upgrade was successful. The rhev-hypervisor6 package has been updated. On upgrades the shell session is now allowed to finish gracefully while the reboot is initiated in the background.
Story Points: ---
Clone Of:
Clones: 854765
Environment:
Last Closed: 2013-02-28 11:37:55 EST
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments: None
Description Alon Bar-Lev 2012-08-09 13:49:35 EDT
Currently, ovirt-functions::reboot() executes /sbin/reboot directly. Previous versions used cron to perform the reboot, but slept until the reboot occurred.

As a result, the ssh session that invokes the upgrade script is not terminated, which leaves the TCP session open and forces the engine to wait for a timeout.

The desired behaviour is to allow the script to terminate successfully before the reboot by executing the reboot in the background, either via crond[1], via a sub-shell[2], or by any other magic.

[1] echo "* * * * * sleep 10 && /sbin/reboot" > /var/spool/cron/root
[2] nohup sh -c '( sleep 10 && reboot )' < /dev/null > /dev/null 2>&1 &
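
Editor's sketch (not the ovirt-node patch): the sub-shell approach could be wrapped in a helper roughly like the one below. The function name schedule_reboot and its optional delay and command arguments are hypothetical; they are only there so the helper can be tried with a harmless command instead of a real reboot. The key points are the trailing `&`, which lets the caller return immediately, and the redirections to /dev/null, so the background job does not hold the ssh session's descriptors open.

```shell
#!/bin/sh
# Hypothetical helper, NOT the actual ovirt-functions::reboot() code.
# Usage: schedule_reboot [delay_seconds] [command [args...]]
# Defaults (assumed for illustration): 10 seconds, /sbin/reboot.
schedule_reboot() {
    delay="${1:-10}"
    if [ "$#" -gt 0 ]; then
        shift
    fi
    if [ "$#" -eq 0 ]; then
        set -- /sbin/reboot
    fi
    # nohup makes the job ignore SIGHUP when the session ends; stdin,
    # stdout and stderr are detached from the caller; the trailing '&'
    # puts the job in the background so this function returns at once.
    nohup sh -c 'sleep "$1" && shift && exec "$@"' sh "$delay" "$@" \
        < /dev/null > /dev/null 2>&1 &
}
```

With this shape, an upgrade script would call schedule_reboot as its last step and then exit 0, so the invoking ssh session can close before the reboot actually starts.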
Comment 2 Ilanit Stein 2012-09-03 02:56:51 EDT
This bug causes RHEV-H tests that involve an upgrade to fail. Marking it as a TestBlocker.
Comment 7 Fabian Deutsch 2012-09-03 06:28:10 EDT
IIRC we need to handle this change from a blocking reboot to a non-blocking reboot during some installation cases. Otherwise e.g. a login prompt will be displayed in the auto-installation case.
Comment 8 Alon Bar-Lev 2012-09-03 06:31:54 EDT
(In reply to comment #7)
> IIRC we need to handle this change from a blocking reboot to a non-blocking
> reboot during some installation cases. Otherwise e.g. a login prompt will be
> displayed in the auto-installation case.

I don't understand... which login prompt? Even while sleeping there was one, or people could ssh to machine.
Comment 9 Fabian Deutsch 2012-09-03 07:18:10 EDT
(In reply to comment #8)
> (In reply to comment #7)
> > IIRC we need to handle this change from a blocking reboot to a non-blocking
> > reboot during some installation cases. Otherwise e.g. a login prompt will be
> > displayed in the auto-installation case.
> 
> I don't understand... which login prompt? Even while sleeping there was one,
> or people could ssh to machine.

I wasn't aware that a login prompt appeared in the auto-install case; this should normally be prevented by the sleep.
Comment 10 Alon Bar-Lev 2012-09-03 07:27:29 EDT
(In reply to comment #9)
> (In reply to comment #8)
> > (In reply to comment #7)
> > > IIRC we need to handle this change from a blocking reboot to a non-blocking
> > > reboot during some installation cases. Otherwise e.g. a login prompt will be
> > > displayed in the auto-installation case.
> > 
> > I don't understand... which login prompt? Even while sleeping there was one,
> > or people could ssh to machine.
> 
> I wasn't aware that a login prompt appeared in the auto-install case; this
> should normally be prevented by the sleep.

I am sorry, but I don't understand why the login prompt is an issue... :)
Comment 11 Fabian Deutsch 2012-09-03 08:42:48 EDT
It's not a critical issue but a cosmetic one that we had in the past. I've got an upstream and a downstream patch, which I'm going to test now.
Comment 14 Fabian Deutsch 2012-09-03 09:30:02 EDT
It turns out that one side effect is that, after an auto-install, all subsequent services are started until the login prompt is displayed.
This is mostly cosmetic, but it can distract users.
Comment 17 Mike Burns 2012-09-05 09:09:12 EDT
My understanding is that this issue is solved in the latest builds by a change in the vdsm code to use the python code in the iso image directly rather than the legacy bash scripts. Because of that change, we're not hitting the sleep timeout issue anymore.

Is this correct?  If so, can we mark this testonly?
Comment 18 Alon Bar-Lev 2012-09-05 10:38:36 EDT
Mike,

I tried this[1] image.

1. There is no /data/updates in new image, so image upload fails.

2. After I create /data/updates, update succeeds but no reboot.

3. Another issue: after I approve, vdsm does not come up:

[root@alonbl4 ~]# /etc/init.d/vdsmd start
checking certs..
vdsm: libvirt already configured for vdsm                  [  OK  ]
Starting iscsid: 
vdsm: Failed to define network filters on libvirt          [FAILED]

Do you have rhev-m environment to test?


[1] http://jenkins.virt.bos.redhat.com/jenkins/view/RHEL%206/job/rhev-hypervisor-6-z-stream-rhev-31/
Comment 19 Mike Burns 2012-09-05 14:55:33 EDT
(In reply to comment #18)
> Mike,
> 
> I tried this[1] image.
> 
> 1. There is no /data/updates in new image, so image upload fails.

/data/updates is created by vdsm on startup.  This is caused by (#3) vdsm failing to start

> 
> 2. After I create /data/updates, update succeeds but no reboot.

Partially ovirt-node, partially vdsm.  Patches posted upstream:

http://gerrit.ovirt.org/7777
http://gerrit.ovirt.org/7778

> 
> 3. Another issue: after I approve, vdsm does not come up:

Pure vdsm
> 
> [root@alonbl4 ~]# /etc/init.d/vdsmd start
> checking certs..
> vdsm: libvirt already configured for vdsm                  [  OK  ]
> Starting iscsid: 
> vdsm: Failed to define network filters on libvirt          [FAILED]
> 
> Do you have rhev-m environment to test?
> 
> 
> [1] http://jenkins.virt.bos.redhat.com/jenkins/view/RHEL%206/job/rhev-hypervisor-6-z-stream-rhev-31/
Comment 30 Mike Burns 2012-10-16 07:05:28 EDT
In testing on 6.3.z, QE has found the following:

Old RHEVH   New RHEVH   RHEVM   Result
3.0         3.0         3.0     Pass
3.1         3.1         3.1     Pass
3.0         3.1         3.0     Pass
3.0         3.1         3.1     Fail (upgrade is successful, but shows failure in RHEV-M)
Comment 31 Mike Burns 2012-10-16 07:13:28 EDT
Correction, the following scenario was not tested yet:

3.0         3.1            3.0            Pass
Comment 32 Mike Burns 2012-10-16 07:19:48 EDT
Ying, 

Can you provide an update after testing 3.0 -> 3.1 on RHEVM 3.0?
Comment 33 Ying Cui 2012-10-16 07:37:32 EDT
(In reply to comment #32)
> Ying, 
> 
> Can you provide an update after testing 3.0 -> 3.1 on RHEVM 3.0?

Tested the upgrade from hypervisor6-6.3-20121012.0.el6_3 to hypervisor6-6.3-20121015.0.rhev31.el6_3. It was successful, so 3.0 -> 3.1 on RHEVM 3.0 IC is PASS.
Comment 45 errata-xmlrpc 2013-02-28 11:37:55 EST
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2013-0556.html
