Bug 871337
| Summary: | [Hyper-V]Kernel panic with error 'not syncing:Fatal exception' when probe and unprobe the hv_netvsc module | ||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Product: | Red Hat Enterprise Linux 5 | Reporter: | Shengnan Wang <shwang> | ||||||||||
| Component: | kernel | Assignee: | Haiyang Zhang <haiyangz> | ||||||||||
| Status: | CLOSED WORKSFORME | QA Contact: | Virtualization Bugs <virt-bugs> | ||||||||||
| Severity: | high | Docs Contact: | |||||||||||
| Priority: | high | ||||||||||||
| Version: | 5.9 | CC: | areis, habdi, haiyangz, jasowang, jbian, kys, leiwang, mjenner, qguan, rhod, sforsber, shwang | ||||||||||
| Target Milestone: | rc | ||||||||||||
| Target Release: | --- | ||||||||||||
| Hardware: | Unspecified | ||||||||||||
| OS: | Unspecified | ||||||||||||
| Whiteboard: | |||||||||||||
| Fixed In Version: | Doc Type: | Bug Fix | |||||||||||
| Doc Text: | Story Points: | --- | |||||||||||
| Clone Of: | Environment: | ||||||||||||
| Last Closed: | 2013-04-02 08:09:30 UTC | Type: | Bug | ||||||||||
| Regression: | --- | Mount Type: | --- | ||||||||||
| Documentation: | --- | CRM: | |||||||||||
| Verified Versions: | Category: | --- | |||||||||||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||||||||
| Cloudforms Team: | --- | Target Upstream Version: | |||||||||||
| Embargoed: | |||||||||||||
| Attachments: |
|
||||||||||||
Created attachment 635440 [details]
kernel panic.jpg
It works well in the x86_64 guest (kernel-2.6.18-336.el5). On this point, please see comment 117 in bug 824877.

(In reply to comment #2)
> It works well in the x86_64 guest(kernel-2.6.18-336.el5). About this point,
> please see the comment 117 in bug 824877.

The call trace is different, so this may be a new bug. K.Y., could you have a look at this issue? Thanks

(In reply to comment #4)
> The calltrace is different, maybe a new bug. K.Y., could you have a look at
> this issue?
>

Agree with you, it should be a new bug.

Tested: this problem can be reproduced with the guest on iSCSI storage; it was not found with a guest on local storage. A 500G iSCSI storage was attached to the Hyper-V Server core host through an FC HBA during testing.

Keyword Regression removed.

Jason,

I think we had fixed a similar bug a while ago on RHEL 5.9. Haiyang fixed it. I am going to have Haiyang look at this bug.

(In reply to comment #8)
> Jason,
>
> I think we had fixed a similar bug a while ago on RHEL 5.9. Haiyang fixed
> it. I am going to have Haiyang look at this bug.

Hi KY,
The similar bug that was fixed is the kernel panic with the guest installed on local storage. That case now passes the same test without a kernel panic. As mentioned in comment 7, this is a new bug. It can be reproduced with the guest installed on iSCSI storage.

Shengnan,

Thank you for the bug report. I have some questions about the setup:

Is the Hyper-V host the iSCSI initiator? And, are the guest VHDs saved on the iSCSI disk?

Thanks.

(In reply to comment #10)
> Shengnan,
>
> Thank you for the bug report. I have some questions about the setup:
>
> Is the Hyper-V host the iSCSI initiator? And, are the guest VHDs saved on

Yes, the Hyper-V host has the iSCSI initiator installed and uses it to connect to the iSCSI storage.

> the iSCSI disk?

Also yes, the problem happens when the guest VHD file is saved on the iSCSI disk. No kernel panic was found with a guest on the same host whose VHD file is saved on the local host disk.
>

Hi Haiyang,
Any news? Time is really running out, and we need to take care of this urgently.
Thanks, Ronen.

Yes, we are treating it as top priority. And, our QA team is working to reproduce it. We will keep you updated.

We are using the script provided and have the same setup in our lab, but our tests so far have not replicated this issue...

The same problem was found on *local storage* this time. It reproduces about 40% of the time with kernel -347, on both local and iSCSI storage. (The problem also happens with kernel -346 in the same environment.)

Host CPU: Intel(R) Xeon(R) CPU E5405 @ 2.00GHz
Memory: 32 GB

We built 7 RHEL VMs on the host. The steps are the same as in comment 0.
Two new call traces occurred on the VMs; please review the new attachments for details.

Created attachment 651783 [details]
kernel_panic_error_message_with_new_calltrace
Created attachment 651784 [details]
kernel_panic_error_message_with_new_calltrace_02
Hi habdi,
Did you reproduce the bug? You'd better install more guests to run the same test.

Reproduced with a RHEL5.9-Server-20121123.2 i386 guest (kernel-2.6.18-347.el5) on local storage, about 50% of the time. When the issue occurred, the guest showed a call trace and a kernel panic. The kernel panic information is in the attached file.

Created attachment 654715 [details]
kernel panic when remove and reload the hv_netvsc.jpg
KY,
This looks like a bug that will rarely happen outside the QA lab. We tend to be careful not to include many such bug fixes in RHEL5.10. Do you intend to fix it in RHEL5.10?

KY and I discussed this problem. We are not planning to fix it in the near future. (We didn't reproduce it in house.) |
Description of problem:
After executing the 'modprobe' and 'modprobe -r' commands with the shell script below for about half an hour, the RHEL5.9 (2.6.18-345.el5) x86_64 guest kernel panics. If the guest IP is flood-pinged from another machine while the hv_netvsc module is being probed and unprobed, the issue occurs in about ten minutes.

# cat add_remove_hv_netvsc.sh
#!/bin/bash
while [ "1" = "1" ]
do
    modprobe -r hv_netvsc
    modprobe hv_netvsc
done

Version-Release number of selected component (if applicable):
Host: Windows 2008 R2 Hyper-V
Hyper-V Version: 6.1.7600.16385
Guest: RHEL5.9 x86_64 (2.6.18-345.el5)
Pv drivers:
# lsmod |grep hv_*
hid_base_hv 68177 1 hid_hyperv
hv_netvsc 25665 0
hv_utils 12001 0
hv_storvsc 17601 3
hv_vmbus 30265 4 hid_hyperv,hv_netvsc,hv_utils,hv_storvsc

How reproducible:
100%

Steps to Reproduce:
1. Log in to the guest and check the hv_netvsc module.
2. Probe and unprobe the hv_netvsc module with the script.

Actual results:
At step 2, the guest kernel panics.

Expected results:
The guest keeps working while the module is probed and unprobed.

Additional info:
1. The i386 and i386 PAE (2.6.18-345.el5) guests pass the same test.
2. The x86_64 (2.6.18-344.el5) guest kernel panics in the same test.
3. The x86_64 (2.6.18-343.el5) guest kernel panics in the same test.
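
Because the reproduction rate varied between reports (100%, about 40%, about 50%), a bounded loop can make runs easier to compare than the endless loop above. The following is only a sketch of that idea, not the script used in this bug: the function name cycle_module, the MODPROBE_CMD override, and the cycle count argument are hypothetical names introduced here for illustration.

```shell
#!/bin/bash
# Bounded variant of the load/unload loop from the description.
# cycle_module, MODPROBE_CMD and the COUNT argument are hypothetical;
# they do not appear in the original add_remove_hv_netvsc.sh.

# cycle_module MODULE COUNT
# Unloads and reloads MODULE up to COUNT times, stopping at the first
# command that fails so the failing iteration is reported.
cycle_module() {
    local module="$1"
    local count="$2"
    # Left unquoted on purpose so a dry run such as
    # MODPROBE_CMD="echo modprobe" splits into command and argument.
    local cmd="${MODPROBE_CMD:-modprobe}"
    local i=1

    while [ "$i" -le "$count" ]; do
        if ! $cmd -r "$module"; then
            echo "iteration $i: unload of $module failed" >&2
            return 1
        fi
        if ! $cmd "$module"; then
            echo "iteration $i: load of $module failed" >&2
            return 1
        fi
        i=$((i + 1))
    done
    echo "completed $count load/unload cycles of $module"
}
```

On a test guest this would be sourced and invoked as root, for example `cycle_module hv_netvsc 1000`. Setting `MODPROBE_CMD="echo modprobe"` beforehand prints the commands instead of running them. A kernel panic still ends the run abruptly, so the last iteration reached is only known if progress is logged elsewhere, for example over a serial console.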