Bug 871337

Summary: [Hyper-V]Kernel panic with error 'not syncing:Fatal exception' when probe and unprobe the hv_netvsc module
Product: Red Hat Enterprise Linux 5
Reporter: Shengnan Wang <shwang>
Component: kernel
Assignee: Haiyang Zhang <haiyangz>
Status: CLOSED WORKSFORME
QA Contact: Virtualization Bugs <virt-bugs>
Severity: high
Docs Contact:
Priority: high
Version: 5.9
CC: areis, habdi, haiyangz, jasowang, jbian, kys, leiwang, mjenner, qguan, rhod, sforsber, shwang
Target Milestone: rc
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2013-04-02 08:09:30 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Attachments:
Description    Flags
kernel panic.jpg    none
kernel_panic_error_message_with_new_calltrace    none
kernel_panic_error_message_with_new_calltrace_02    none
kernel panic when remove and reload the hv_netvsc.jpg    none

Description Shengnan Wang 2012-10-30 09:29:27 UTC
Description of problem:
After executing the 'modprobe' and 'modprobe -r' commands with the shell script below for about half an hour, the RHEL5.9 (2.6.18-345.el5) x86_64 guest kernel panics. If the guest IP is flood-pinged from another machine while the hv_netvsc module is being probed and unprobed, the issue occurs in about ten minutes.

# cat add_remove_hv_netvsc.sh
#!/bin/bash
# Unload and reload the hv_netvsc module in an endless loop.

while true
do
     modprobe -r hv_netvsc
     modprobe hv_netvsc
done
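
For longer test runs it can be useful to bound the loop and print progress, so the last completed cycle is visible on the console when the guest panics. The following is a minimal sketch of that idea, not the script used in this report. The function name `cycle_module`, the `MODPROBE` override and the iteration count are illustrative assumptions.

```shell
#!/bin/bash
# Bounded variant of the reproducer above (illustrative only).
# MODPROBE can be overridden, e.g. MODPROBE=echo, for a dry run without root.
MODPROBE="${MODPROBE:-modprobe}"

# cycle_module MODULE COUNT
# Unload and reload MODULE COUNT times. Prints one progress line per
# completed cycle and stops with a non-zero status on the first failure.
cycle_module() {
    local module="$1" count="$2" i=1
    while [ "$i" -le "$count" ]; do
        "$MODPROBE" -r "$module" || return 1
        "$MODPROBE" "$module" || return 1
        echo "cycle $i of $count done"
        i=$((i + 1))
    done
}

# Example, run as root on the guest:
#   cycle_module hv_netvsc 1000
```

The flood ping mentioned in the description would be started separately on the other machine (for example with `ping -f` and the guest address) while this loop runs.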

Version-Release number of selected component (if applicable):
Host: Windows 2008 R2 Hyper-V 
Hyper-V Version: 6.1.7600.16385
Guest: RHEL5.9 x86_64 (2.6.18-345.el5)
Pv drivers: 
# lsmod |grep hv_*
hid_base_hv            68177  1 hid_hyperv
hv_netvsc              25665  0 
hv_utils               12001  0 
hv_storvsc             17601  3 
hv_vmbus               30265  4 hid_hyperv,hv_netvsc,hv_utils,hv_storvsc


How reproducible:
100%

Steps to Reproduce:

1. Log in to the guest and check the hv_netvsc module.
2. Probe and unprobe the hv_netvsc module with the script.
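
The check in step 1 can be sketched as a small helper that looks for an exact module name in the first column of the `lsmod` listing. This is an illustrative assumption about how to perform the check, not the command used in the report, and `module_loaded` is a hypothetical name. An exact first-column match avoids hits on the "used by" column, and it is stricter than a pattern such as `hv_*`, which as a regular expression matches any line containing "hv".

```shell
# module_loaded NAME
# Reads an lsmod-style listing on standard input and succeeds only if NAME
# appears as a module name in the first column.
module_loaded() {
    awk -v name="$1" '$1 == name { found = 1 } END { exit !found }'
}

# Example on the guest:
#   lsmod | module_loaded hv_netvsc && echo "hv_netvsc is loaded"
```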


Actual results:
At step 2, the guest kernel panics.

Expected results:
The guest keeps working while the module is probed and unprobed.


Additional info:
1. The i386 and i386 PAE (2.6.18-345.el5) guests pass the same test.
2. The x86_64 (2.6.18-344.el5) guest kernel panics in the same test.
3. The x86_64 (2.6.18-343.el5) guest kernel panics in the same test.

Comment 1 Shengnan Wang 2012-10-30 09:35:32 UTC
Created attachment 635440 [details]
kernel panic.jpg

Comment 2 Shengnan Wang 2012-10-30 09:46:20 UTC
It works well in the x86_64 guest(kernel-2.6.18-336.el5). About this point, please see the comment 117 in bug 824877.

Comment 4 jason wang 2012-10-30 10:17:30 UTC
(In reply to comment #2)
> It works well in the x86_64 guest(kernel-2.6.18-336.el5). About this point,
> please see the comment 117 in bug 824877.

The calltrace is different, maybe a new bug. K.Y., could you have a look at this issue?

Thanks

Comment 7 Shengnan Wang 2012-10-31 09:58:43 UTC
(In reply to comment #4)

> The calltrace is different, maybe a new bug. K.Y., could you have a look at
> this issue?
> 

I agree, it should be a new bug. Testing shows that the problem can be reproduced with the guest on iSCSI storage; it was not seen with a guest on local storage.

During testing, a 500G iSCSI storage was attached to the Hyper-V Server Core host through an FC HBA.

Keyword Regression removed.

Comment 8 K. Y. Srinivasan 2012-11-01 15:20:58 UTC
Jason,

I think we had fixed a similar bug a while ago on RHEL 5.9. Haiyang fixed it. I am going to have Haiyang look at this bug.

Comment 9 Shengnan Wang 2012-11-05 03:13:03 UTC
(In reply to comment #8)
> Jason,
> 
> I think we had fixed a similar bug a while ago on RHEL 5.9. Haiyang fixed
> it. I am going to have Haiyang look at this bug.

Hi KY,

The similar bug that was fixed is the kernel panic with the guest installed on local storage. That configuration now passes the same test without a kernel panic.

As mentioned in comment 7, this is a new bug. It can be reproduced with the guest installed on iSCSI storage.

Comment 10 Haiyang Zhang 2012-11-05 16:12:43 UTC
Shengnan,

Thank you for the bug report. I have some questions about the setup:

Is the Hyper-V host the iSCSI initiator? And, are the guest VHDs saved on the iSCSI disk?

Thanks.

Comment 11 Qin Guan 2012-11-06 01:05:26 UTC
(In reply to comment #10)
> Shengnan,
> 
> Thank you for the bug report. I have some questions about the setup:
> 
> Is the Hyper-V host the iSCSI initiator?

Yes, the Hyper-V host has the iSCSI initiator installed and uses it to connect to the iSCSI storage.

> And, are the guest VHDs saved on the iSCSI disk?

Also yes, the problem happens when the guest VHD file is saved on the iSCSI disk. No kernel panic was found with a guest on the same host whose VHD file is saved on the local host disk.

Comment 12 Ronen Hod 2012-11-07 06:45:48 UTC
Hi Haiyang,

Any news?
Time is really running out, and we need to take care of this urgently.

Thanks, Ronen.

Comment 13 Haiyang Zhang 2012-11-07 15:25:21 UTC
Yes, we are treating it as top priority. And, our QA team is working to reproduce it. We will keep you updated.

Comment 14 habdi 2012-11-09 14:58:52 UTC
We are using the script provided and have the same setup in our lab, but our tests so far have not replicated this issue...

Comment 17 Bian Jinwei 2012-11-26 06:22:21 UTC
The same problem was found on *local storage* this time.

The reproduction rate is about 40% with kernel -347, on both local and iSCSI storage. (The problem also happens with kernel -346 in the same environment.)

 Host CPU: Intel(R) Xeon(R) CPU E5405 @ 2.00GHz
 Memory: 32 GB
 We built 7 RHEL VMs on the host.

The steps are the same as in comment 0. Two new call traces occurred on the VMs; please see the new attachments for details.

Comment 18 Bian Jinwei 2012-11-26 06:23:59 UTC
Created attachment 651783 [details]
kernel_panic_error_message_with_new_calltrace

Comment 19 Bian Jinwei 2012-11-26 06:25:04 UTC
Created attachment 651784 [details]
kernel_panic_error_message_with_new_calltrace_02

Comment 20 Shengnan Wang 2012-11-30 02:49:50 UTC
Hi habdi,

Did you reproduce the bug? It may help to install more guests and run the same test on them.

Reproduced with a RHEL5.9-Server-20121123.2 i386 guest (kernel-2.6.18-347.el5) on local storage at a rate of 50%. When the issue occurred, the guest printed a call trace and then panicked. The kernel panic information is in the attached file.

Comment 21 Shengnan Wang 2012-11-30 02:51:19 UTC
Created attachment 654715 [details]
kernel panic when remove and reload the hv_netvsc.jpg

Comment 23 Ronen Hod 2013-03-28 08:32:57 UTC
KY,

This looks like a bug that will rarely happen outside the QA lab. We tend to be careful not to include many such bug fixes in RHEL5.10. Do you intend to fix it in RHEL5.10?

Comment 24 Haiyang Zhang 2013-03-28 20:43:56 UTC
KY and I discussed this problem.
We are not planning to fix it in the near future. (We didn't reproduce it in house.)