Red Hat Bugzilla – Bug 734110
rhevh - upgrade ovirt node fails due to nonexistent breth0
Last modified: 2012-02-21 00:04:23 EST
Description of problem:
When upgrading RHEV-H from rhevh-5.7-20110725.1 to rhevh-5.7-20110824.0 via the RHEV-M UI, the host ends up in a reboot loop. During boot, gathering information for the 'breth0' interface fails. Upgrading from 5.6-11.1 works without problems.
Version-Release number of selected component (if applicable):
Steps to Reproduce:
1. Try to upgrade from 5.7-20110725.1
I already reinstalled the host, so please forgive me for not providing any logs (if you need them, please let me know).
I'm unable to reproduce the reboot loop using the exact steps provided in the description.
Pavel will attempt to reproduce and provide either logs or serial console trace.
Note: I did see the failure when trying to start breth0 and saw that both ifcfg-rhevm and ifcfg-breth0 were persisted, but it did not cause an issue with the upgrade.
We cannot reproduce it in our test environment.
Upgrade rhevh-20110725.1 to rhevh-20110824.0 via RHEV-M UI.
1. Install rhevh-20110725.1 manually via the menu.
rhevh is installed on a local SATA disk.
rhevh is in an iSCSI domain and connects to an iSCSI soft LUN.
2. rhevh-20110725.1 comes up in the RHEV-M UI.
3. Upgrade rhevh-20110725.1 to rhevh-20110824.0 via RHEV-M UI.
4. RHEV-H reboots as expected after the upgrade.
5. rhevh-20110824.0 comes up automatically after the upgrade, and the iSCSI domain is up as well.
This issue is not reproduced.
SM107 is the latest build announced on the mailing list.
SM109 is ready for QA testing by the TLV team but has not been announced yet; the only change is a new bootstrap.
Backtrace from the vmcore found on Pavel's machine:
PID: 9868 TASK: ffff810369a06820 CPU: 7 COMMAND: "iscsiadm"
#0 [ffff81035046f8b0] crash_kexec at ffffffff800afef5
#1 [ffff81035046f970] __die at ffffffff80065127
#2 [ffff81035046f9b0] do_page_fault at ffffffff80067474
#3 [ffff81035046faa0] error_exit at ffffffff8005dde9
[exception RIP: netpoll_send_skb_on_dev+34]
RIP: ffffffff8024121e RSP: ffff81035046fb58 RFLAGS: 00010002
RAX: 0000000000000006 RBX: ffff810377484000 RCX: ffffffff80000000
RDX: ffff810377484000 RSI: ffff81036abbbdc0 RDI: ffffffff8862adc0
RBP: 0000000000000000 R8: 0000000000000000 R9: ffffffff885b7f95
R10: 0000000080000000 R11: 0000000000000004 R12: ffff81036abbbdc0
R13: ffffffff804efdd0 R14: 0000000000000020 R15: ffff81036874520c
ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
#4 [ffff81035046fb70] br_dev_queue_push_xmit at ffffffff885b8154
#5 [ffff81035046fb80] br_forward_finish at ffffffff885b81e1
#6 [ffff81035046fb90] __br_deliver at ffffffff885b8377
#7 [ffff81035046fbb0] br_dev_xmit at ffffffff885b7294
#8 [ffff81035046fbd0] netpoll_send_skb_on_dev at ffffffff802412ab
#9 [ffff81035046fbf0] write_msg at ffffffff8862a0e1
#10 [ffff81035046fc20] __call_console_drivers at ffffffff8009350f
#11 [ffff81035046fc40] release_console_sem at ffffffff80017354
#12 [ffff81035046fc70] vprintk at ffffffff80093d04
#13 [ffff81035046fcf0] printk at ffffffff80093dbb
#14 [ffff81035046fde0] sd_revalidate_disk at ffffffff88123635
#15 [ffff81035046fea0] sd_rescan at ffffffff88122534
#16 [ffff81035046feb0] scsi_rescan_device at ffffffff880199a1
#17 [ffff81035046fec0] store_rescan_field at ffffffff8801b649
#18 [ffff81035046fed0] sysfs_write_file at ffffffff80110ade
#19 [ffff81035046ff10] vfs_write at ffffffff80016b92
#20 [ffff81035046ff40] sys_write at ffffffff8001745b
#21 [ffff81035046ff80] tracesys at ffffffff8005d28d (via system_call)
RIP: 00002b40725897d0 RSP: 00007fff5286bfb8 RFLAGS: 00000246
RAX: ffffffffffffffda RBX: ffffffff8005d28d RCX: ffffffffffffffff
RDX: 0000000000000001 RSI: 000000000044cd08 RDI: 0000000000000003
RBP: 0000000000443461 R8: 0000000000000001 R9: 00002b40725dd100
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000003
R13: 00007fff5286c050 R14: 00007fff5286c4b0 R15: 000000000044cd08
ORIG_RAX: 0000000000000001 CS: 0033 SS: 002b
netpoll in the trace indicates this is netconsole related.
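As a rough illustration of that reading of the trace, the sketch below pulls the numbered frames out of a crash-style backtrace and lists the ones whose symbol starts with `netpoll_`. The helper names (`parse_frames`, `netpoll_frames`) and the prefix heuristic are assumptions made for this example and are not part of any tool mentioned in this bug.

```python
import re

# Matches numbered frame lines in the format shown above, e.g.
#   "#8 [ffff81035046fbd0] netpoll_send_skb_on_dev at ffffffff802412ab"
FRAME_RE = re.compile(r"^\s*#(\d+)\s+\[[0-9a-f]+\]\s+(\S+)\s+at\s+[0-9a-f]+")


def parse_frames(backtrace):
    """Return a list of (frame_number, symbol) pairs found in the text."""
    frames = []
    for line in backtrace.splitlines():
        match = FRAME_RE.match(line)
        if match:
            frames.append((int(match.group(1)), match.group(2)))
    return frames


def netpoll_frames(frames, prefix="netpoll_"):
    """Return the frames whose symbol starts with `prefix` (a simple heuristic)."""
    return [frame for frame in frames if frame[1].startswith(prefix)]


if __name__ == "__main__":
    # A few frames copied from the backtrace above.
    sample = """\
#7 [ffff81035046fbb0] br_dev_xmit at ffffffff885b7294
#8 [ffff81035046fbd0] netpoll_send_skb_on_dev at ffffffff802412ab
#9 [ffff81035046fbf0] write_msg at ffffffff8862a0e1
#13 [ffff81035046fcf0] printk at ffffffff80093dbb
"""
    for number, symbol in netpoll_frames(parse_frames(sample)):
        print(f"frame #{number}: {symbol}")
```

A prefix match is only a hint: it shows that the netpoll path was on the stack, not which console driver or NIC was involved.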
Ying, did you have netconsole configured?
> netpoll in the trace indicates this is netconsole related.
> Ying, did you have netconsole configured?
Yes, I configured netconsole for this testing.
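For anyone who needs to answer the same question on another host, here is a minimal sketch that checks whether a `netconsole` module is listed in `/proc/modules`. It assumes netconsole is built as a loadable module (a built-in netconsole would not appear there), and the helper name `module_loaded` is made up for this example.

```python
def module_loaded(proc_modules_text, name="netconsole"):
    """Return True if `name` is the first field of any line in /proc/modules-style text."""
    for line in proc_modules_text.splitlines():
        fields = line.split()
        if fields and fields[0] == name:
            return True
    return False


if __name__ == "__main__":
    try:
        with open("/proc/modules") as handle:
            print("netconsole loaded:", module_loaded(handle.read()))
    except OSError as exc:
        # /proc/modules is Linux-specific and may be missing elsewhere.
        print("could not read /proc/modules:", exc)
```

This only shows that the module is loaded, not which target or interface it was configured to log through.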
We'll look at this for 5.8; it's likely a kernel issue with netconsole on this particular NIC.
Technical note added. If any revisions are required, please edit the "Technical Notes" field
accordingly. All revisions will be proofread by the Engineering Content Services team.
mburns-reviewed -- no tech note needed
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.
For information on the advisory, and where to find the updated
files, follow the link below.
If the solution does not work for you, open a new bug report.