| Summary: | RHEV-H update fails to upgrade from portal and can not be activated | ||||||||
|---|---|---|---|---|---|---|---|---|---|
| Product: | Red Hat Enterprise Virtualization Manager | Reporter: | wdaniel | ||||||
| Component: | rhev-hypervisor | Assignee: | Fabian Deutsch <fdeutsch> | ||||||
| Status: | CLOSED NEXTRELEASE | QA Contact: | Pavel Stehlik <pstehlik> | ||||||
| Severity: | medium | Docs Contact: | |||||||
| Priority: | medium | ||||||||
| Version: | 3.2.0 | CC: | alonbl, benglish, dfediuck, fdeutsch, iheim, wdaniel, yeylon | ||||||
| Target Milestone: | --- | ||||||||
| Target Release: | --- | ||||||||
| Hardware: | x86_64 | ||||||||
| OS: | Linux | ||||||||
| Whiteboard: | node | ||||||||
| Fixed In Version: | Doc Type: | Bug Fix | |||||||
| Doc Text: | Story Points: | --- | |||||||
| Clone Of: | Environment: | ||||||||
| Last Closed: | 2014-03-13 18:16:45 UTC | Type: | Bug | ||||||
| Regression: | --- | Mount Type: | --- | ||||||
| Documentation: | --- | CRM: | |||||||
| Verified Versions: | Category: | --- | |||||||
| oVirt Team: | Node | RHEL 7.3 requirements from Atomic Host: | |||||||
| Cloudforms Team: | --- | Target Upstream Version: | |||||||
| Attachments: |
Description
wdaniel
2013-12-11 20:43:34 UTC
I'm seeing many SQL related IO errors in the logs. Itamar, who could help here? Also HTTPS related errors.

(In reply to wdaniel from comment #0)
> I gave him the workaround of "clicking activate" but RHEV comes back with
> the error message:
> Cannot switch Host to Maintenance mode, Host is not operational.
> I had him check the admin TUI post-upgrade to see if the upgrade was in fact
> successful, but it was not.

Hey Wallace,

when does this message appear?

Before the update? (The host can not be set to maintenance mode to do the upgrade).

Or after the update?

(In reply to Fabian Deutsch from comment #9)
> (In reply to wdaniel from comment #0)
> > I gave him the workaround of "clicking activate" but RHEV comes back with
> > the error message:
> > Cannot switch Host to Maintenance mode, Host is not operational.
> > I had him check the admin TUI post-upgrade to see if the upgrade was in fact
> > successful, but it was not.
>
> Hey Wallace,
>
> when does this message appear?
>
> Before the update? (The host can not be set to maintenance mode to do the
> upgrade).
>
> Or after the update?

Fabian,

In this order of events, the customer went through the regular upgrade process and hit the failure message. It is after the upgrade process that the customer attempts to click "activate" to work around this, and that is when he encounters the "Host is non-op" message.

Also, I wanted to repeat that the customer checked the admin TUI on the RHEV-H and the version reported there is still the old 6.3 version, not the expected upgraded version.

As far as I can tell the failure that he initially runs into (not being able to activate) fits the same criteria as mentioned in this article:

https://access.redhat.com/site/solutions/380313

This being the case, I would imagine that the hypervisor would still be updated. Is the hypervisor supposed to reboot somewhere in the upgrade process? Is there any way to tell on the "non-op" hypervisor where in the upgrade process it has stopped?
Wallace,

yes - maybe Alon can help us here to tell where the logs reside when the update got initiated through RHEV-M.

Does the customer see a backup entry in grub when he reboots the machine? And yes - a reboot is happening after the update was pushed to the machine.

Upgrade messages are written to:
- engine:/var/log/ovirt-engine/engine.log
- host:/var/log/vdsm-reg/vds_bootstrap_upgrade*.log
- host: ovirt-node logs.

(In reply to Fabian Deutsch from comment #12)
> Wallace,
>
> yes - maybe Alon can help us here to tell where the logs reside when the
> update got initiated through RHEV-M.
>
> Does the customer see a backup entry in grub when he reboots the machine?
> And yes - a reboot is happening after the update was pushed to the machine.

Fabian,

The customer has confirmed that the GRUB menu does not show any backup entries post-reboot, and the only entry is the currently installed 6.3 image.

(In reply to wdaniel from comment #14)
> The customer has confirmed that the GRUB menu does not show any backup
> entries post-reboot, and the only entry is the currently installed 6.3 image.

Hey Wallace,

right - that means the update did not happen. Please ask the customer to provide the logs named by Alon in comment 13.

Created attachment 845083 [details]
vds_bootstrap logs
Created attachment 845085 [details]
vdsm logs
(In reply to Fabian Deutsch from comment #15)
> (In reply to wdaniel from comment #14)
> > The customer has confirmed that the GRUB menu does not show any backup
> > entries post-reboot, and the only entry is the currently installed 6.3 image.
>
> Hey Wallace,
>
> right - that means the update did not happen. Please ask the customer to
> provide the logs named by Alon in comment 13.

Fabian, Alon,

The requested files have been attached to the bug, please let me know if there is anything else I can provide to you guys. Thanks!

Comment on attachment 845083 [details]
vds_bootstrap logs
I guess in ovirt-node log there will be more information
Wed, 27 Nov 2013 18:35:56 DEBUG <BSTRAP component='setMountPoint' status='OK' message='Mount succeeded.'/>
Wed, 27 Nov 2013 18:35:56 INFO Using default value for: BOOT_SIZE
Wed, 27 Nov 2013 18:35:56 INFO Using default value for: ROOT_SIZE
Wed, 27 Nov 2013 18:35:56 INFO Using default value for: CONFIG_SIZE
Wed, 27 Nov 2013 18:35:56 INFO Using default value for: LOGGING_SIZE
Wed, 27 Nov 2013 18:35:56 INFO Using default value for: DATA_SIZE
Wed, 27 Nov 2013 18:35:56 INFO Using default value for: SWAP2_SIZE
Wed, 27 Nov 2013 18:35:56 INFO Using default value for: DATA2_SIZE
Wed, 27 Nov 2013 18:35:56 ERROR <BSTRAP component='RHEV_INSTALL' status='FAIL'/>
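
For longer bootstrap logs than the excerpt above, the `<BSTRAP .../>` status lines can be filtered mechanically. The following is only a minimal sketch: it assumes every status line follows the timestamp / level / `<BSTRAP .../>` layout seen in this excerpt, and the names `BootstrapEvent`, `parse_bootstrap_events` and `failed_components` are hypothetical helpers, not part of vdsm or ovirt-node.

```python
import re
from typing import Iterable, List, NamedTuple, Optional


class BootstrapEvent(NamedTuple):
    """One <BSTRAP .../> status line (hypothetical helper type)."""
    timestamp: str
    level: str
    component: str
    status: str
    message: Optional[str]


# Assumed layout, taken from the excerpt in this bug:
#   Wed, 27 Nov 2013 18:35:56 ERROR <BSTRAP component='RHEV_INSTALL' status='FAIL'/>
LINE_RE = re.compile(
    r"^(?P<ts>\w{3}, \d{1,2} \w{3} \d{4} \d{2}:\d{2}:\d{2})\s+"
    r"(?P<level>[A-Z]+)\s+<BSTRAP\s+(?P<attrs>.*?)/>\s*$"
)
# Attributes are single-quoted key='value' pairs.
ATTR_RE = re.compile(r"(\w+)='([^']*)'")


def parse_bootstrap_events(lines: Iterable[str]) -> List[BootstrapEvent]:
    """Return the BSTRAP status lines; plain INFO/DEBUG text lines are skipped."""
    events = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match is None:
            continue
        attrs = dict(ATTR_RE.findall(match.group("attrs")))
        events.append(
            BootstrapEvent(
                timestamp=match.group("ts"),
                level=match.group("level"),
                component=attrs.get("component", ""),
                status=attrs.get("status", ""),
                message=attrs.get("message"),
            )
        )
    return events


def failed_components(events: Iterable[BootstrapEvent]) -> List[str]:
    """Names of components whose status is FAIL, in log order."""
    return [event.component for event in events if event.status == "FAIL"]
```

Run against the excerpt above, this would report `RHEV_INSTALL` as the only failed component. As in the original log, that tells us which step failed but not why.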
Wallace, could you please also attach /var/log/ovirt.log

Fabian, Alon,

Apologies for the file name mix up, it's rare that I have anyone request those specific files. Right now we have sosreports from 2 hosts in the environment where this is happening, and for one host (rhev5) there is no data in that 'ovirt.log' file. On the other (rhev6), there are only the following 3 lines:

2013-11-27 18:35:56,427 - DEBUG - ovirtfunctions - Translating:
2013-11-27 18:35:56,442 - DEBUG - ovirtfunctions - Translating:
2013-11-27 18:35:56,465 - INFO - install - Installing the image.

Is there anything else I can get to you guys?

Hey Wallace,

could you please attach the sosreports from the nodes and the engine?

(In reply to Fabian Deutsch from comment #23)
> Hey Wallace,
>
> could you please attach the sosreports from the nodes and the engine?

Fabian,

I'm happy to get those to you, however each sosreport is ~400MB, which seems to exceed Bugzilla's size limit. Are there any particular folders or files I can pick out and compress to get them to you?

Fabian,

I hadn't heard anything back, but now have the log collector available to download at the following location:

http://file.rdu.redhat.com/~wdaniel/wdaniel/00988528-sosreport-LogCollector-20131202112026.tar.xz

It's 700MB but should have everything you need. Let me know if there is anything else I can get to you.

Hey Daniel,

The RHEV-H version is quite old. I need to see how we can progress with debugging here. This was opened on 3.2 / 6.3, while we already have 3.3 / 6.5 in support.

As mentioned in comment 26 this is too old, and we're not aware of this issue in recent versions.
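
On the question of which files to pick out of a ~400MB sosreport: one possible approach is to repack only the log files named in this thread into a much smaller archive. The sketch below rests on several assumptions: the sosreport is a single tar archive (nested archives, such as those inside a log collector bundle, are not opened), the files sit under the same `var/log/...` paths inside it, and `WANTED`, `is_wanted` and `repack_logs` are hypothetical names introduced for illustration.

```python
import fnmatch
import tarfile
from typing import Iterable, List

# Log locations named in this thread. The leading "*" allows for a top-level
# directory inside the archive (assumption about the sosreport layout).
WANTED = (
    "*/var/log/ovirt-engine/engine.log",
    "*/var/log/vdsm-reg/vds_bootstrap_upgrade*.log",
    "*/var/log/ovirt.log",
)


def is_wanted(name: str, patterns: Iterable[str] = WANTED) -> bool:
    """True if an archive member path matches one of the wanted log patterns."""
    # Prefix "/" so members without a top-level directory also match "*/var/...".
    return any(fnmatch.fnmatchcase("/" + name, pattern) for pattern in patterns)


def repack_logs(src_path: str, dst_path: str,
                patterns: Iterable[str] = WANTED) -> List[str]:
    """Copy only the wanted regular files from src_path into a new .tar.xz."""
    patterns = tuple(patterns)
    kept = []
    # "r:*" lets tarfile detect the compression of the source archive.
    with tarfile.open(src_path, "r:*") as src, \
            tarfile.open(dst_path, "w:xz") as dst:
        for member in src:
            if member.isfile() and is_wanted(member.name, patterns):
                dst.addfile(member, src.extractfile(member))
                kept.append(member.name)
    return kept
```

A call such as `repack_logs("sosreport.tar.xz", "bug-logs.tar.xz")` (both file names are placeholders) returns the list of member paths that were copied, which makes it easy to see whether an expected log was missing from the report.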