Bug 715306 - [RHEL5.7] Panic on specific host while testing KVM guest installs
Summary: [RHEL5.7] Panic on specific host while testing KVM guest installs
Keywords:
Status: CLOSED DUPLICATE of bug 703045
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: kernel
Version: 5.7
Hardware: All
OS: Linux
Priority: high
Severity: high
Target Milestone: rc
Target Release: ---
Assignee: Herbert Xu
QA Contact: Red Hat Kernel QE team
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2011-06-22 14:10 UTC by PaulB
Modified: 2018-11-27 21:26 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2011-10-24 20:27:56 UTC
Target Upstream Version:
Embargoed:


Attachments
Screenshot of the panic (62.76 KB, image/png)
2011-09-23 12:31 UTC, Matthias Hensler


Links
System ID Private Priority Status Summary Last Updated
Red Hat Knowledge Base (Legacy) 64062 0 None None None Never

Description PaulB 2011-06-22 14:10:55 UTC
Description of problem:
 While testing the installation of RHEL6.2 KVM guests on a RHEL5.7 (2.6.18-266.el5) KVM host, there was a kernel PANIC.

Version-Release number of selected component (if applicable):
 2.6.18-266.el5

How reproducible:
 I was not able to reproduce this issue.
 [] https://beaker.engineering.redhat.com/jobs/99814
    Testing passed successfully here.

 [] https://beaker.engineering.redhat.com/jobs/99852
    Testing passed successfully here.

Actual results:
 There was a kernel PANIC during testing.
 See here:
  https://beaker.engineering.redhat.com/recipes/202109
  http://beaker-archive.app.eng.bos.redhat.com/beaker-logs/2011/06/992/99245/202109//console.log
  <-SNIP->
   Unable to handle kernel NULL pointer dereference at 00000000000003c8 RIP:  
   [<ffffffff886f0380>] :bridge:__br_deliver+0xcd/0xfc 
   PGD 303ff8d067 PUD 303f302067 PMD 0  
   Oops: 0000 [1] SMP  
   last sysfs file: /devices/pci0000:00/0000:00:00.0/irq 
   CPU 32  
  <-SNIP->
   Pid: 23436, comm: cp Tainted: G     ---- 2.6.18-266.el5 #1 
   RIP: 0010:[<ffffffff886f0380>]  [<ffffffff886f0380>] :bridge:__br_deliver+0xcd/0xfc 
  <-SNIP->
   [<ffffffff886ef294>] :bridge:br_dev_xmit+0xc7/0xdb 
   [<ffffffff802363e6>] dev_hard_start_xmit+0x1b7/0x28a 
   [<ffffffff8002f76f>] dev_queue_xmit+0x1f3/0x2a3 
   [<ffffffff80031e05>] ip_output+0x29a/0x2dd 
   [<ffffffff80034526>] ip_queue_xmit+0x42c/0x486 
   [<ffffffff800221ce>] tcp_transmit_skb+0x646/0x67e 
   [<ffffffff8001bf25>] tcp_rcv_established+0x5a1/0x8bd 
   [<ffffffff8003b44b>] tcp_v4_do_rcv+0x2a/0x2fa 
   [<ffffffff80030c23>] release_sock+0x54/0xc1 
   [<ffffffff80026882>] tcp_sendmsg+0xa03/0xb07 
   [<ffffffff80054e8b>] sock_sendmsg+0xf8/0x14a 
   [<ffffffff800a2e52>] autoremove_wake_function+0x0/0x2e 
   [<ffffffff8804ddc2>] :ext3:ext3_mark_inode_dirty+0x33/0x3c 
   [<ffffffff8022de73>] kernel_sendmsg+0x35/0x47 
   [<ffffffff8862e6ce>] :sunrpc:xs_tcp_send_request+0x10a/0x2c8 
   [<ffffffff8862d5e9>] :sunrpc:xprt_transmit+0xc4/0x1ab 
   [<ffffffff887e50d8>] :nfs:nfs3_xdr_readargs+0x0/0x8d 
   [<ffffffff8862b422>] :sunrpc:call_transmit+0x1ee/0x228 
   [<ffffffff88630bcd>] :sunrpc:__rpc_execute+0x9c/0x2df 
   [<ffffffff887de251>] :nfs:nfs_execute_read+0x41/0x5a 
   [<ffffffff887de823>] :nfs:nfs_pagein_one+0x265/0x285 
   [<ffffffff887dc71e>] :nfs:nfs_coalesce_requests+0x9f/0xd9 
   [<ffffffff887de992>] :nfs:nfs_readpages+0x14f/0x1bb 
   [<ffffffff8001309a>] __do_page_cache_readahead+0xfc/0x17b 
   [<ffffffff800321fa>] blockable_page_cache_readahead+0x53/0xb2 
   [<ffffffff8002eafe>] make_ahead_window+0x82/0x9e 
   [<ffffffff800141bb>] page_cache_readahead+0x17f/0x1af 
   [<ffffffff8000c3b2>] do_generic_mapping_read+0xc6/0x359 
   [<ffffffff8000d2b6>] file_read_actor+0x0/0x159 
   [<ffffffff8000c791>] __generic_file_aio_read+0x14c/0x198 
   [<ffffffff80016f20>] generic_file_aio_read+0x36/0x3b 
   [<ffffffff8000cfdf>] do_sync_read+0xc7/0x104 
   [<ffffffff800a2e52>] autoremove_wake_function+0x0/0x2e 
   [<ffffffff8000b7c7>] vfs_read+0xcb/0x171 
   [<ffffffff80011d5a>] sys_read+0x45/0x6e 
   [<ffffffff8005d28d>] tracesys+0xd5/0xe0 
   Code: 48 8b 80 c8 03 00 00 48 85 c0 74 1f 48 8b 50 38 48 8b 43 18  
   RIP  [<ffffffff886f0380>] :bridge:__br_deliver+0xcd/0xfc 
   RSP <ffff81102ae15468> 
   CR2: 00000000000003c8 
   <0>Kernel panic - not syncing: Fatal exception
  <-SNIP->

Expected results:
 Successful installation of RHEL6.2 KVM guests on RHEL5.7 KVM host without PANICing the RHEL5.7 kernel.


Additional info:
 I was not able to reproduce this issue.
 See system hostname in following comment.

Comment 2 Qian Cai 2011-06-23 11:34:45 UTC
This looks like it has already been fixed upstream.
https://bugzilla.kernel.org/show_bug.cgi?id=16448

Comment 3 PaulB 2011-07-26 14:35:33 UTC
All,
This issue was reproduced here on another host:

(See following comment for system hostname)

https://beaker.engineering.redhat.com/recipes/233791
http://beaker-archive.app.eng.bos.redhat.com/beaker-logs/2011/07/1140/114084/233791//console.log
<-SNIP->
Pid: 5494, comm: cp Tainted: G     ---- 2.6.18-269.el5 #1 
RIP: 0010:[<ffffffff88863380>]  [<ffffffff88863380>] :bridge:__br_deliver+0xcd/0xfc 
RSP: 0018:ffff8102100e1468  EFLAGS: 00010246 
RAX: 0000000000000000 RBX: ffff81020e652500 RCX: 0000000000000000 
RDX: ffff810201f0f0c0 RSI: 0000000000000202 RDI: ffff81022eabe000 
RBP: ffff81020e652500 R08: 0000000000000004 R09: ffff8102100e129c 
R10: 000000000000009c R11: ffffffff8022c9a4 R12: ffff810201f0f118 
R13: ffffffff80360ae0 R14: ffff81020e652000 R15: 0000000000000000 
FS:  00002b0189b1c1f0(0000) GS:ffff810107b9ce40(0000) knlGS:0000000000000000 
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b 
CR2: 00000000000003c8 CR3: 000000021003e000 CR4: 00000000000006e0 
Process cp (pid: 5494, threadinfo ffff8102100e0000, task ffff81022a862820) 
Stack:  0000000000000000 ffff810201f0f0c0 ffff810201f0f0c0 ffffffff88862294 
 ffff81022eb32580 ffff810201f0f0c0 ffffffff805bed10 ffffffff80236a0d 
 ffff81022090a660 ffff81020e652000 ffff810201f0f0c0 ffffffff80360ae0 
Call Trace: 
 [<ffffffff88862294>] :bridge:br_dev_xmit+0xc7/0xdb 
 [<ffffffff80236a0d>] dev_hard_start_xmit+0x1b7/0x28a 
 [<ffffffff8002f76f>] dev_queue_xmit+0x1f3/0x2a3 
 [<ffffffff80031e05>] ip_output+0x29a/0x2dd 
 [<ffffffff80034526>] ip_queue_xmit+0x42c/0x486 
 [<ffffffff800221ce>] tcp_transmit_skb+0x646/0x67e 
 [<ffffffff8001bf25>] tcp_rcv_established+0x5a1/0x8bd 
 [<ffffffff8003b44b>] tcp_v4_do_rcv+0x2a/0x2fa 
 [<ffffffff80032f7c>] __tcp_push_pending_frames+0x76f/0x849 
 [<ffffffff80030c23>] release_sock+0x54/0xc1 
 [<ffffffff80026882>] tcp_sendmsg+0xa03/0xb07 
 [<ffffffff80054ea7>] sock_sendmsg+0xf8/0x14a 
 [<ffffffff800a2e4e>] autoremove_wake_function+0x0/0x2e 
 [<ffffffff8804ddc2>] :ext3:ext3_mark_inode_dirty+0x33/0x3c 
 [<ffffffff8022e499>] kernel_sendmsg+0x35/0x47 
 [<ffffffff887a16ce>] :sunrpc:xs_tcp_send_request+0x10a/0x2c8 
 [<ffffffff887a05e9>] :sunrpc:xprt_transmit+0xc4/0x1ab 
 [<ffffffff889580dc>] :nfs:nfs3_xdr_readargs+0x0/0x8d 
 [<ffffffff8879e422>] :sunrpc:call_transmit+0x1ee/0x228 
 [<ffffffff887a3bcd>] :sunrpc:__rpc_execute+0x9c/0x2df 
 [<ffffffff88951255>] :nfs:nfs_execute_read+0x41/0x5a 
 [<ffffffff88951827>] :nfs:nfs_pagein_one+0x265/0x285 
 [<ffffffff8894f722>] :nfs:nfs_coalesce_requests+0x9f/0xd9 
 [<ffffffff88951996>] :nfs:nfs_readpages+0x14f/0x1bb 
 [<ffffffff8001309a>] __do_page_cache_readahead+0xfc/0x17b 
 [<ffffffff800321fa>] blockable_page_cache_readahead+0x53/0xb2 
 [<ffffffff8002eafe>] make_ahead_window+0x82/0x9e 
 [<ffffffff800141bb>] page_cache_readahead+0x17f/0x1af 
 [<ffffffff8000c3b2>] do_generic_mapping_read+0xc6/0x359 
 [<ffffffff8000d2b6>] file_read_actor+0x0/0x159 
 [<ffffffff8000c791>] __generic_file_aio_read+0x14c/0x198 
 [<ffffffff80016f20>] generic_file_aio_read+0x36/0x3b 
 [<ffffffff8000cfdf>] do_sync_read+0xc7/0x104 
 [<ffffffff800a2e4e>] autoremove_wake_function+0x0/0x2e 
 [<ffffffff8000b7c7>] vfs_read+0xcb/0x171 
 [<ffffffff80011d5a>] sys_read+0x45/0x6e 
 [<ffffffff8005d28d>] tracesys+0xd5/0xe0 
 
 
Code: 48 8b 80 c8 03 00 00 48 85 c0 74 1f 48 8b 50 38 48 8b 43 18  
RIP  [<ffffffff88863380>] :bridge:__br_deliver+0xcd/0xfc 
 RSP <ffff8102100e1468> 
CR2: 00000000000003c8 
 <0>Kernel panic - not syncing: Fatal exception 
<-SNIP->


Note:
This issue seems intermittent: on retesting, I was not able to reproduce the PANIC:
https://beaker.engineering.redhat.com/jobs/114617

-pbunyan
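
The register dump above can be cross-checked against the "Code:" line. The following is a minimal sketch, not part of the original report, and it rests on one assumption: that the 20 bytes on the "Code:" line start at the faulting RIP, which looks plausible here because no byte is marked with angle brackets. The helper name decode_mov_disp32 is hypothetical, and it handles only the single instruction encoding seen in this oops (REX.W + 8B with a base register and a 32-bit displacement).

```python
# Sketch: decode the first instruction of the oops "Code:" line and compare
# the resulting effective address with CR2.
# Assumption: the "Code:" bytes begin at the faulting RIP.

CODE = "48 8b 80 c8 03 00 00 48 85 c0 74 1f 48 8b 50 38 48 8b 43 18"
RAX = 0x0000000000000000  # RAX from the register dump in comment 3
CR2 = 0x00000000000003C8  # faulting address reported by the oops

# Register numbering used by ModRM when no REX.R/REX.B extension bit is set.
REG_NAMES = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi"]


def decode_mov_disp32(code_bytes):
    """Return (dest, base, disp) for 'REX.W 8B modrm disp32' (hypothetical helper)."""
    rex, opcode, modrm = code_bytes[0], code_bytes[1], code_bytes[2]
    if rex != 0x48 or opcode != 0x8B:
        raise ValueError("not a 64-bit MOV r64, r/m64 with a plain REX.W prefix")
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    if mod != 0b10 or rm == 0b100:
        raise ValueError("only the [base + disp32] form without SIB is handled")
    # The displacement is a little-endian signed 32-bit value.
    disp = int.from_bytes(bytes(code_bytes[3:7]), "little", signed=True)
    return REG_NAMES[reg], REG_NAMES[rm], disp


code_bytes = [int(b, 16) for b in CODE.split()]
dest, base, disp = decode_mov_disp32(code_bytes)
effective_address = RAX + disp

print(f"mov {disp:#x}(%{base}),%{dest}")
print(f"effective address = {effective_address:#x}, CR2 = {CR2:#x}")
```

Under that assumption, the faulting instruction loads a field at offset 0x3c8 from a pointer held in RAX. With RAX equal to 0, the load address equals the CR2 value, which is consistent with a NULL structure pointer being dereferenced in __br_deliver.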

Comment 5 Konstantin Khorenko 2011-09-21 16:20:49 UTC
Hello All,

We also have several customers who have hit this issue. We do not know exactly how to reproduce the panic, but there have been multiple crashes:

Unable to handle kernel NULL pointer dereference at 00000000000003c8 RIP:
 [<ffffffff884e5459>] :bridge:__br_deliver+0xe5/0x115
PGD 80e46f067 PUD 81293e067 PMD 0
Oops: 0000 [1] SMP
last sysfs file:
CPU: 7
Modules linked in: ...
Pid: 23239, comm: nsrmmd Tainted: P --- 2.6.18-028stab093.2 #1 028stab093
RIP: 0060:[<ffffffff884e5459>] [<ffffffff884e5459>] :bridge:__br_deliver+0xe5/0x115
RSP: 0068:ffff81080f66d8b8 EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff81082b11c580 RCX: 0000000000000000
RDX: 0000000000000000 RSI: ffffc2000e7c0304 RDI: ffff81082e521000
RBP: ffff81082b11c580 R08: 0000000000360004 R09: ffff81080691fb00
R10: ffff81082b11c000 R11: ffff81083b7043c0 R12: ffff81080691fb58
R13: ffffffff8036fda0 R14: ffff81082b11c000 R15: ffff81080691fb58
FS: 00002b4f2d766250(0000) GS:ffff81083bf56940(0000) knlGS:0000000000000000
CS: 0060 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00000000000003c8 CR3: 000000080e46c000 CR4: 00000000000006e0
Process nsrmmd (pid: 23239, veid=0, threadinfo ffff81080f66c000, task ffff8108394496a0)
Stack: 0000000000000004 ffff81080691fb00 ffff81080691fb00 ffffffff884e4330
 ffff81081fc0c6e0 ffff81080691fb00 ffff81082b11c000 ffffffff80231935
 ffff81081fc0c6e0 ffff81080691fb00 ffff81082b11c000 ffffffff8036fda0
Call Trace:
 [<ffffffff884e4330>] :bridge:br_dev_xmit+0xd3/0xe7
 [<ffffffff80231935>] dev_hard_start_xmit+0x1b7/0x28a
 [<ffffffff80030ca7>] dev_queue_xmit+0x3de/0x4b1
 [<ffffffff800330fd>] ip_output+0x31f/0x365
 [<ffffffff80035acc>] ip_queue_xmit+0x58e/0x5f7
 [<ffffffff80030d42>] dev_queue_xmit+0x479/0x4b1
 [<ffffffff80022b47>] tcp_transmit_skb+0x72a/0x762
 [<ffffffff8001c16a>] tcp_rcv_established+0x64c/0x9bf
 [<ffffffff8003d31c>] tcp_v4_do_rcv+0x4d/0x37b
 [<ffffffff80031f81>] release_sock+0x54/0xc1
 [<ffffffff8001da68>] tcp_recvmsg+0x4f2/0xba6
 [<ffffffff80032d49>] sock_common_recvmsg+0x2d/0x43
 [<ffffffff8008b636>] default_wake_function+0x0/0xe
 [<ffffffff8004819a>] do_sock_read+0xad/0xee
 [<ffffffff802291f9>] sock_readv+0xb7/0xd1
 [<ffffffff800644f6>] thread_return+0x6a/0x177
 [<ffffffff800a3453>] autoremove_wake_function+0x0/0x2e
 [<ffffffff800646ec>] wait_for_completion+0x8f/0xa2
 [<ffffffff8008b636>] default_wake_function+0x0/0xe
 [<ffffffff882138db>] :st:release_buffering+0x3b/0x55
 [<ffffffff800ef421>] do_readv_writev+0xc6/0x1ca
 [<ffffffff800ef626>] sys_readv+0x49/0xe5
 [<ffffffff80060166>] system_call+0x7e/0x83


Code: 48 8b 80 c8 03 00 00 48 85 c0 74 1f 48 8b 50 38 48 8b 43 18
RIP [<ffffffff884e5459>] :bridge:__br_deliver+0xe5/0x115
 RSP <ffff81080f66d8b8> 

2.6.18-028stab093.2 is a Parallels Virtuozzo Containers kernel based on the 2.6.18-274.el5 Red Hat kernel.

--
Best regards,

Konstantin Khorenko,
PVCfL/OpenVZ developer,
Parallels

Comment 6 Matthias Hensler 2011-09-23 12:30:39 UTC
This panic was triggered twice in one week after the machine was rebooted from kernel 2.6.18-238.9.1.el5 to 2.6.18-274.3.1.el5. The host currently runs 3 KVM virtual machines.

The stack trace is incomplete, but it shows the same signature as the others in this bug.
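
One way to compare oops signatures across reports is sketched below. It is an illustration only, not a tool used in this bug: the function name oops_signature and the choice of signature (module, function, and CR2, ignoring the instruction offset, which differs between kernel builds) are assumptions. The sample text is copied from the final RIP and CR2 lines in comments 3 and 5.

```python
import re

# Matches the final "RIP [<addr>] :module:function+0xoff/0xlen" line.
# The "RIP:" lines (with a colon) are deliberately not matched.
RIP_RE = re.compile(
    r"RIP\s+\[<[0-9a-f]+>\]\s+(?::(\w+):)?(\w+)\+0x[0-9a-f]+/0x[0-9a-f]+"
)
CR2_RE = re.compile(r"CR2:\s+([0-9a-f]+)")


def oops_signature(text):
    """Return (module, function, cr2) or None if the oops text is incomplete."""
    rip = RIP_RE.search(text)
    cr2 = CR2_RE.search(text)
    if rip is None or cr2 is None:
        return None
    # The module is None for symbols in the core kernel image.
    return rip.group(1), rip.group(2), int(cr2.group(1), 16)


# Excerpt from comment 3 (2.6.18-269.el5).
RHEL_OOPS = """\
RIP  [<ffffffff88863380>] :bridge:__br_deliver+0xcd/0xfc
 RSP <ffff8102100e1468>
CR2: 00000000000003c8
"""

# Excerpt from comment 5 (2.6.18-028stab093.2).
VZ_OOPS = """\
CR2: 00000000000003c8 CR3: 000000080e46c000 CR4: 00000000000006e0
RIP [<ffffffff884e5459>] :bridge:__br_deliver+0xe5/0x115
 RSP <ffff81080f66d8b8>
"""

print(oops_signature(RHEL_OOPS))
print(oops_signature(VZ_OOPS))
print(oops_signature(RHEL_OOPS) == oops_signature(VZ_OOPS))
```

Both excerpts reduce to the same module, function, and faulting address even though the offsets within __br_deliver differ between the two kernel builds.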

Comment 7 Matthias Hensler 2011-09-23 12:31:29 UTC
Created attachment 524602
Screenshot of the panic

Comment 12 John Newbigin 2011-10-20 22:41:11 UTC
Is #730917 the fix for this bug?

Comment 13 Jarod Wilson 2011-10-24 20:27:56 UTC
(In reply to comment #12)
> Is #730917 the fix for this bug?

Yes. Specifically for RHEL5.8, bug 703045.

*** This bug has been marked as a duplicate of bug 703045 ***

