Created attachment 464036
> Description of problem:
The netdump client always hangs on a RHEL3.9 KVM guest when the emulated e1000 device is selected.
Here is the log:
< netdump activated - performing handshake with the server. >
Nothing further is displayed on the console after this message.
See attached Screenshot.png for all messages on the console.
Red Hat Enterprise Linux Version Number: 6.0
Release Number: Partner GA
Kernel Version: 2.6.32-71.el6.x86_64
> Steps to Reproduce:
1. Set up the netdump client and server on RHEL3.9 KVM guests. See the sysreports for details.
2. Execute the following on the client:
# echo c > /proc/sysrq-trigger
> Actual results:
The netdump client hangs.
> Expected results:
Netdump completes and the vmcore is collected on the server as usual.
Created attachment 464539
I installed two guests on a RHEL6 host:
[root@intel-s3e36-01 ~]# virsh list
Id Name State
7 rhel3.9_x86_64_hvm running
8 rhel3.9_i386_hvm running
[root@intel-s3e36-01 ~]# rpm -q kernel
rhel3.9_x86_64_hvm was set as netdump server, rhel3.9_i386_hvm as client.
When I trigger a crash, the client starts to dump but then appears to hang.
Here is my guest xml:
[root@intel-s3e36-01 ~]# cat /etc/libvirt/qemu/rhel3.9_i386_hvm.xml
<type arch='x86_64' machine='rhel6.0.0'>hvm</type>
<disk type='file' device='disk'>
<driver name='qemu' type='raw' cache='none'/>
<target dev='hda' bus='ide'/>
<address type='drive' controller='0' bus='0' unit='0'/>
<controller type='ide' index='0'>
<address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x1'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
<input type='mouse' bus='ps2'/>
<graphics type='vnc' port='-1' autoport='yes'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
<model type='cirrus' vram='9216' heads='1'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0'/>
Both rhel3.9_x86_64_hvm and rhel3.9_i386_hvm use bridged networking and the e1000 NIC model.
Can you use xendump to retrieve a core of the guest after it hangs?
Scratch that. Given it's KVM, can you instead use the qemu gdb service to attach to the hung guest and get a dump or backtrace of it in its hung state?
Triage assignment. If you feel this bug doesn't belong to you, or that it cannot be handled in a timely fashion, please contact me for re-assignment
Created attachment 471457
tcpdump of the netdump packets
192.168.122.55 is the netdump server
192.168.122.135 is the netdump client
I was able to reproduce the issue and made a tcpdump capture (attached in c5).
I can see some malformed packets being sent from the client.
By the same token, I found that not every device has always been supported for netdump.
I found an old document that does not include e1000 in its list of supported devices:
(search for "support")
I haven't found anything explicit for RHEL3, though. Any ideas?
Any input is appreciated; I'll continue working on it.
Created attachment 471534
RHEL3: disable udp checksum check for netpoll
It looks like the e1000 netpoll function fails to checksum received packets properly. Given that this is qemu's e1000 emulation, it might be a bug in the emulation itself (hardware checksums?).
If I disable the UDP checksum validation completely for netdump, it works and the full vmcore reaches the server.
I probably need to find an e1000 expert. :)
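For context on what the patch turns off, here is a minimal sketch of a UDP checksum check (RFC 768, IPv4 pseudo-header) in Python. It is not the RHEL3 netpoll code: the helper names, port 6666, and the payload are illustrative assumptions, and only the two addresses come from the tcpdump above. It assumes a validator that sums over whatever byte count it is handed, so four stray trailing bytes are enough to make an otherwise valid datagram fail.

```python
import socket
import struct


def ones_complement_sum(data: bytes) -> int:
    """16-bit one's-complement sum, padding odd-length input with a zero byte."""
    if len(data) % 2:
        data += b"\x00"
    total = 0
    for (word,) in struct.iter_unpack("!H", data):
        total += word
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return total


def pseudo_header(src_ip: str, dst_ip: str, udp_len: int) -> bytes:
    """IPv4 pseudo-header: source, destination, zero, protocol 17, UDP length."""
    return (socket.inet_aton(src_ip) + socket.inet_aton(dst_ip)
            + struct.pack("!BBH", 0, 17, udp_len))


def build_udp_segment(src_ip, dst_ip, sport, dport, payload: bytes) -> bytes:
    """Build a UDP header plus payload with a correct checksum (hypothetical helper)."""
    length = 8 + len(payload)
    header = struct.pack("!HHHH", sport, dport, length, 0)
    csum = ~ones_complement_sum(
        pseudo_header(src_ip, dst_ip, length) + header + payload) & 0xFFFF
    if csum == 0:  # an all-zero checksum field means "no checksum"
        csum = 0xFFFF
    return struct.pack("!HHHH", sport, dport, length, csum) + payload


def udp_checksum_ok(src_ip, dst_ip, segment: bytes) -> bool:
    """Validate over len(segment); assumes the caller trusts the reported length."""
    data = pseudo_header(src_ip, dst_ip, len(segment)) + segment
    return ones_complement_sum(data) == 0xFFFF


if __name__ == "__main__":
    server, client = "192.168.122.55", "192.168.122.135"
    seg = build_udp_segment(server, client, 6666, 6666, b"handshake!")
    print("exact length:", udp_checksum_ok(server, client, seg))
    print("4 extra bytes:", udp_checksum_ok(server, client, seg + b"\xde\xad\xbe\xef"))
```

Skipping this check, as the patch does, hides the failure but does not explain where the mismatch comes from.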
This request was evaluated by Red Hat Product Management for
inclusion in the current release of Red Hat Enterprise Linux.
Because the affected component is not scheduled to be updated
in the current release, Red Hat is unfortunately unable to
address this request at this time. Red Hat invites you to
ask your support representative to propose this request, if
appropriate and relevant, in the next release of Red Hat
Enterprise Linux. If you would like it considered as an
exception in the current release, please ask your support
This request was erroneously denied for the current release of Red Hat
Enterprise Linux. The error has been fixed and this request has been
re-proposed for the current release.
This is being worked on. Update from Chris Wright:
On Fri, Jan 14, 2011 at 09:58:30AM -0800, Chris Wright wrote:
> The upstream sf driver is the same. But I think I finally have an idea
> of what's going wrong. I spent way too much time chasing down a red
> herring on the tx path only to realize it's just fine.
> I'm building some test qemu-kvm binaries w/ patches to the e1000
> emulation to test today. I'll update you when I've got results from
This is an issue with the length value we put in the rx descriptor. The guest asked for SECRC (strip Ethernet CRC), but we report the length as if we were delivering the full frame including the trailing Ethernet CRC. This should be fixed in anything >= qemu-kvm-0.12.1.2-2.119.el6, and is a duplicate of bugzilla 603413. Please re-open if testing shows that it's not working with newer qemu-kvm. I tested and found I could recreate the problem and fix it with changes similar to the patch associated with bz 603413.
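A minimal sketch of the length rule described above, in Python rather than qemu's C. The function name and the frame sizes are hypothetical. E1000_RCTL_SECRC is the strip-CRC bit in the e1000 Receive Control register, with the value used in the Linux e1000 driver headers. This is an illustration only, not the patch from bz 603413.

```python
E1000_RCTL_SECRC = 0x04000000  # Receive Control register: strip Ethernet CRC
ETH_FCS_LEN = 4                # size of the trailing Ethernet CRC (FCS) in bytes


def rx_descriptor_length(frame_len_without_fcs: int, rctl: int) -> int:
    """Length to report in the rx descriptor (hypothetical helper).

    frame_len_without_fcs is the frame as the emulation receives it,
    i.e. without a trailing CRC. If the guest set SECRC it expects the
    CRC to be stripped, so the reported length must not count it.
    """
    if rctl & E1000_RCTL_SECRC:
        return frame_len_without_fcs
    return frame_len_without_fcs + ETH_FCS_LEN


if __name__ == "__main__":
    # A 60-byte frame, with and without SECRC requested by the guest.
    print(rx_descriptor_length(60, E1000_RCTL_SECRC))
    print(rx_descriptor_length(60, 0))
```

Reporting the larger length when SECRC is set would hand the guest four bytes it did not ask for, which is consistent with the checksum failures seen earlier in this bug.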
*** This bug has been marked as a duplicate of bug 603413 ***
I tried with qemu-kvm-0.12.1.2-2.129 and netdump works properly.
Thanks a lot.