Bug 1283191
| Summary: | mkdumprd fails on RHEV hosts with running VMs | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 6 | Reporter: | Julio Entrena Perez <jentrena> |
| Component: | kexec-tools | Assignee: | Xunlei Pang <xlpang> |
| Status: | CLOSED ERRATA | QA Contact: | Virtualization Bugs <virt-bugs> |
| Severity: | high | Docs Contact: | |
| Priority: | urgent | | |
| Version: | 6.7 | CC: | bhe, bwoods, chayang, jentrena, juzhang, kdump-team-bugs, lilu, mhuang, mmilgram, nyelle, paul, pdwyer, ruyang, sreber, surkumar, virt-bugs, xlpang, ycui, zhguo |
| Target Milestone: | rc | Keywords: | ZStream |
| Target Release: | --- | | |
| Hardware: | All | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | kexec-tools-2.0.0-292.el6 | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | | |
| | 1305481 | Environment: | |
| Last Closed: | 2016-05-10 19:12:09 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | | | |
| Bug Blocks: | 1172231, 1305481 | | |

Description
Julio Entrena Perez
2015-11-18 12:32:47 UTC
(In reply to Julio Entrena Perez from comment #0)
> Description of problem:
> mkdumprd fails on RHEV hosts with running VMs due to the
>
> # service kdump restart
> Stopping kdump: [ OK ]
> Detected change(s) the following file(s):
>
> /etc/kdump.conf
> Rebuilding /boot/initrd-2.6.32-573.7.1.el6.x86_64kdump.img
> The ifcfg-vnet0 or ifcfg-xxx which contains DEVICE=vnet0 field doesn't exist.
> Failed to run mkdumprd
> Starting kdump: [FAILED]
>
> The vnet<x> network interfaces of the running guests seem to confuse
> mkdumprd:
>
> # ifconfig | egrep -v ^\ \|^$\|^vlan\|^eth1\.\|^lo
> eth0 Link encap:Ethernet HWaddr 00:10:18:73:B8:8D
> rhevm Link encap:Ethernet HWaddr 00:10:18:73:B8:8D
> vnet0 Link encap:Ethernet HWaddr FE:01:A4:AD:FE:CA

Would you mind setting up an ifcfg file for device vnet0? Then you can try it again.

Thanks
Minfei

(In reply to Minfei Huang from comment #1)
>
> Would you mind setting up an ifcfg file for device vnet0? Then you can try it again.

The interfaces are already UP since the VMs are running:

# ifconfig | grep -A2 vnet
vnet0 Link encap:Ethernet HWaddr FE:01:A4:AD:FE:CA
  inet6 addr: fe80::fc01:a4ff:fead:feca/64 Scope:Link
  UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
--
vnet1 Link encap:Ethernet HWaddr FE:01:A4:AD:FE:CB
  inet6 addr: fe80::fc01:a4ff:fead:fecb/64 Scope:Link
  UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1

# touch /etc/kdump.conf
# service kdump restart
Stopping kdump: [ OK ]
Detected change(s) the following file(s):

/etc/kdump.conf
Rebuilding /boot/initrd-2.6.32-573.7.1.el6.x86_64kdump.img
The ifcfg-vnet0 or ifcfg-xxx which contains DEVICE=vnet0 field doesn't exist.
Failed to run mkdumprd
Starting kdump: [FAILED]

(In reply to Julio Entrena Perez from comment #2)
> (In reply to Minfei Huang from comment #1)
> >
> > Would you mind setting up an ifcfg file for device vnet0? Then you can try it again.
>
> The interfaces are already UP since the VMs are running:
>
> # ifconfig | grep -A2 vnet
> vnet0 Link encap:Ethernet HWaddr FE:01:A4:AD:FE:CA
>   inet6 addr: fe80::fc01:a4ff:fead:feca/64 Scope:Link
>   UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
> --
> vnet1 Link encap:Ethernet HWaddr FE:01:A4:AD:FE:CB
>   inet6 addr: fe80::fc01:a4ff:fead:fecb/64 Scope:Link
>   UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
>

Yes, this net device is up. What I mean is to write a persistent configuration in an ifcfg-x file. Kdump uses it to gather enough information to set the device up in the second kernel.

Thanks
Minfei

(In reply to Julio Entrena Perez from comment #2)
> (In reply to Minfei Huang from comment #1)
> >
> > Would you mind setting up an ifcfg file for device vnet0? Then you can try it again.
>

Sorry, I misread your update.
vnet<x> interfaces are tap interfaces created for the vNICs of the running VMs.
Those interfaces are created and destroyed dynamically and consequently don't have an ifcfg file.
mkdumprd should ignore those interfaces.

(In reply to Julio Entrena Perez from comment #4)
> (In reply to Julio Entrena Perez from comment #2)
> > (In reply to Minfei Huang from comment #1)
> > >
> > > Would you mind setting up an ifcfg file for device vnet0? Then you can try it again.
> >
> Sorry, I misread your update.
> vnet<x> interfaces are tap interfaces created for the vNICs of the running
> VMs.
> Those interfaces are created and destroyed dynamically and consequently
> don't have an ifcfg file.
> mkdumprd should ignore those interfaces.

It seems kdump will use the ifcfg file to set up the network. Could you give it a try?

Thanks
Minfei

Hi Cui Ying,

Could you help reproduce this bug on a RHEV host?

It seems the route to the destination reported by ip route goes through the guest's vnet. I think this is wrong; it should go through one of the host's network devices.
Could you please paste the content of your /etc/kdump.conf plus the output of the following command, both taken on the host?
ip route get to <dest ip>
For example, when using ssh kdump to "10.66.129.152", /etc/kdump.conf contains lines like:
ssh root@10.66.129.152
sshkey /root/.ssh/kdumprsa
And the command output looks like:
# ip route get to 10.66.129.152
10.66.129.152 via 10.16.47.254 dev eth0 src 10.16.45.13
cache mtu 1500 advmss 1460 hoplimit 64
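
To check which device the route actually uses, the `dev` field can be pulled out of that output. The following is only a minimal sketch: `route_dev` is a hypothetical helper name, not part of kexec-tools, and the sample text is the output quoted above rather than data from the affected host.

```shell
#!/bin/sh
# Hypothetical helper (not part of kexec-tools): read `ip route get`
# output on stdin and print the word that follows "dev" on the first line.
route_dev() {
    awk 'NR == 1 { for (i = 1; i < NF; i++) if ($i == "dev") { print $(i + 1); exit } }'
}

# Sample copied from the command output quoted in this comment.
sample='10.66.129.152 via 10.16.47.254 dev eth0 src 10.16.45.13
    cache mtu 1500 advmss 1460 hoplimit 64'

printf '%s\n' "$sample" | route_dev
```

On a live host the same helper could be fed directly, as in `ip route get to <dest ip> | route_dev`. A result of the form vnet<n> would point at the problem described in this bug.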
# cat /etc/kdump.conf | grep -v ^#
path /var/crash
core_collector makedumpfile -c --message-level 1 -d 31

Dump is to be produced locally so there's no <dest ip>.

Apologies, wrong host.

# cat /etc/kdump.conf | grep -v ^#
path /var/crash
core_collector makedumpfile -c --message-level 1 -d 31
fence_kdump_nodes rhevm1-375.usersys.redhat.com
fence_kdump_args -p 7410 -i 5

Looks like including the "fence_kdump_nodes" line causes the problem.

"rhevm" is a bridge, and it uses "eth1" as one of its physical interfaces.
If I understand correctly, RHEV dynamically creates a new vnet<n> for every new guest<n> and adds it to the "rhevm" bridge. kdump then simply fails when it finds no ifcfg-vnet<n> file for the corresponding vnet<n> interface.
I think that for such bridges in RHEV we can handle only "eth1" for "rhevm" and ignore all the vnet<n> ports in the bridge; the bridge still works, and so does kdump on RHEV.

Could you please start some VMs on "10.33.20.24 (root/redhat)"? Currently there are no vnet interfaces on the host, and I want to gather some information to make a formal patch.

Thanks!

Sorry, done.

For vnet<x>, how can I identify it as a virtual interface assigned to a VM, for example through some file under /sys/class/net/vnet0/? Does anyone know?

It should be safe to assume that any interface named vnet<x> is related to a virtual machine:
http://libvirt.org/git/?p=libvirt.git;a=blob;f=src/conf/domain_conf.h;h=ae6d546978973539766ebb114e7be3a802c329fa;hb=HEAD#l1084

This is very helpful, thank you!

patch posted:
http://post-office.corp.redhat.com/archives/kexec-kdump-list/2016-January/msg00037.html

Hi,

On RHEV, is the bridge name "rhevm" also safe to rely on, like vnet<n>?

rhevm is the default management network on RHEV hosts. It is used to communicate with the RHEV-M (manager) and it will _usually_ have the main IP address of the host.
If a host has to dump a vmcore over the network, it's _likely_ that this is the interface that should be used for that purpose.

(In reply to Xunlei Pang from comment #32)
> Hi Zhiyi,
>
> It's a known issue, you can refer to the following link:
> https://bugzilla.redhat.com/show_bug.cgi?id=1284605
>
> It's been fixed since Release 2.0.0-294 after this bug, so you try to test it
> using newer release versions after "kexec-tools-2.0.0-294.el6.x86_64"
>
> Thanks.

Thanks Xunlei,

Verified this bug on RHEL 6.8 with kexec-tools-2.0.0-294.el6.x86_64:

[root@dhcp-10-61 ~]# service kdump restart
Stopping kdump: [ OK ]
Detected change(s) the following file(s):

/etc/kdump.conf
Rebuilding /boot/initrd-2.6.32-621.el6.x86_64kdump.img
Starting kdump: [ OK ]

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2016-0734.html

Requesting that the bug be re-opened:

After updating to the new kexec-tools, the customer is still unable to bring up kdump. The customer can bring down the tap device and start kdump; however, after a "yum update kernel" kdump goes down again, and the only workaround is to bring down the tap device once more.

Case # 01620459
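
For readers following the approach discussed in this bug (skip the vnet<n> ports of a bridge and keep only the ports that should have an ifcfg file), the idea can be sketched roughly as below. This is an illustration only, not the code from the posted patch: `is_guest_tap` and `filter_bridge_ports` are hypothetical names, and the sketch assumes, as suggested above, that any interface named "vnet" followed by digits belongs to a guest.

```shell
#!/bin/sh
# Hypothetical helpers, NOT the code from the posted kexec-tools patch.

# Succeed when the name is "vnet" followed only by digits, which this
# sketch assumes marks a tap device created for a guest vNIC.
is_guest_tap() {
    case $1 in
        vnet | vnet*[!0-9]*) return 1 ;;  # bare "vnet" or a non-digit suffix
        vnet*) return 0 ;;                # vnet0, vnet1, vnet12, ...
        *) return 1 ;;
    esac
}

# Print, one per line, the ports that are not guest tap devices.
filter_bridge_ports() {
    for port in "$@"; do
        is_guest_tap "$port" || printf '%s\n' "$port"
    done
}

# Port names as they might be listed under /sys/class/net/rhevm/brif/.
filter_bridge_ports eth1 vnet0 vnet1
```

With the sample ports above, only eth1 is left for ifcfg lookup, which matches the proposal to treat "eth1" as the only relevant port of "rhevm". Matching on the name alone is a simplification. It would also skip a manually configured interface that happens to be called vnet<n>.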