Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

For bugs related to Red Hat Enterprise Linux 2.1 product line. For Red Hat Enterprise Linux 6 and above, please visit Red Hat JIRA https://issues.redhat.com/secure/CreateIssue!default.jspa?pid=12332745 to report new issues.

Bug 111219 (IT_44189)

Summary:

VMWare GSX Server - KERNEL PANIC

Product:

Red Hat Enterprise Linux 2.1

Reporter:

Raymond Azzopardi <raymond.azzopardi>

Component:

kernel

Assignee:

Larry Woodman <lwoodman>

Status:

CLOSED ERRATA

QA Contact:

Severity:

medium

Docs Contact:

Priority:

medium

Version:

2.1

CC:

djoo, jbaron, mmesser, riel, tao, tbarr

Target Milestone:

---

Target Release:

---

Hardware:

i386

OS:

Linux

Whiteboard:

Fixed In Version:

Doc Type:

Bug Fix

Doc Text:

Story Points:

---

Clone Of:

Environment:

Last Closed:

2004-12-13 20:06:27 UTC

Bug Blocks:

123573

Attachments:

Description	Flags
patch to fix race in smp_call_function	none
final copy of patch to fix race in smp_call_function	none

Description Raymond Azzopardi 2003-11-30 12:27:55 UTC

From Bugzilla Helper:
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; FunWebProducts-MyWay; .NET CLR 1.1.4322)

Description of problem:
I am encountering kernel panics on my Red Hat Linux server. 
The server runs VMWare GSX Server Version 2.5.0 Build 3986 
with 8 environments (Windows 2000 and Windows 2003).

Version-Release number of selected component (if applicable):
kernel 2.4.9-e.12 smp

How reproducible:
Sometimes

Steps to Reproduce:
1. None. The problem occurs on its own, without anything being done 
on the server.

Actual Results:  Reboot server.

Additional info:

Processor swapper (pid '0, stackpage C545F000)
Stack C545E000 C545E000 C545E000 C01139eF F77eBee8 C0105400 C024556A
Call trace [<C0139eF>] SMP_Call_function_interrupt
[<C0105400>] default_idle [KERNEL]0x0 [Kernel]0x2F
[<C024556A>] call_call_function_interrupt [KERNEL]0x5
[<C0105400>] default_idle [KERNEL]0x0
[<C010542e>] default_idle [KERNEL]0x0
[<C0105492>] default_idle [KERNEL]0x2e
[<C011C5e6>] call_console_drivers[KERNEL]0x46
[<C011C756>] call_console_drivers[KERNEL]0xeb
Code 8b 3c 1e 89 04 1e 8b 42 20 89 3C 81 5b 5e 5F C3
<0> KERNEL PANIC: NOT CONTINUING 8d b4 26 00

Comment 1 Arjan van de Ven 2003-11-30 13:25:02 UTC

Can you provide the lsmod output?

Comment 3 Arjan van de Ven 2004-01-05 08:02:46 UTC

Closing due to inactivity on the requested information.

Comment 4 David Joo 2004-05-10 01:37:13 UTC

Arjan,

I have seen another case of this.

These are the panic log details:

Oops: 0000
Kernel 2.4.9-e.37enterprise
CPU:    1
EIP:    0010:[<c0138652>]    Tainted: PF
EFLAGS: 00013002
EIP is at do_ccupdate_local [kernel] 0x22
eax: 00000000   ebx: 00000004   ecx: f7f65efc   edx: ce838000
esi: 00000074   edi: ce838000   ebp: ce839f00   esp: ce839e90
ds: 0018   es: 0018   ss: 0018
Process vmware (pid: 28276, stackpage=ce839000)
Stack: ce838000 f7f84000 ce838000 c0113bef f7f65ef8 c0357120 c024724e c0357120
       f4e465c0 00000001 f7f84000 ce838000 ce839f00 ffffe000 f7f80018 f7f80018
       fffffffa c0119a8d 00000010 00003206 ce839f0c 00003202 f4e465c0 00000000
Call Trace: [<c0113bef>] smp_call_function_interrupt [kernel] 0x2f (0xce839e9c)
[<c024724e>] call_call_function_interrupt [kernel] 0x5 (0xce839ea8)
[<c0119a8d>] schedule [kernel] 0x3ad (0xce839ed4)
[<c0125704>] schedule_timeout [kernel] 0x84 (0xce839f04)
[<c0125670>] process_timeout [kernel] 0x0 (0xce839f1c)
[<c01574fe>] do_select [kernel] 0x20e (0xce839f34)
[<c01578a9>] sys_select [kernel] 0x339 (0xce839f6c)
[<c0146bd6>] sys_read [kernel] 0x96 (0xce839f7c)
[<c01073e3>] system_call [kernel] 0x33 (0xce839fc0)

Can you verify whether this is the same problem, please?

Comment 9 Neil Horman 2004-08-20 11:18:35 UTC

Created attachment 102917 [details]
patch to fix race in smp_call_function

After speaking with VMware, I believe that they have identified a race
condition that may arise under heavy load.  smp_call_function in the 2.4.9
kernel contains a race condition in both the call data assigned to it and the
use of the atomic fields that gate its execution.  This appears to be corrected
both in upstream 2.6 kernels and in the RHEL3 2.4.21 kernel series.  This patch
is a variant of the patch provided by VMware for the problem, and it solves this
issue.  It has been tested by both VMware and myself, and appears to work well.
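
For readers unfamiliar with this class of bug, the following is a minimal
user-space sketch of one way a cross-CPU call handshake can race.  It is not
the attached patch and is not taken from the 2.4.9 source; struct call_data,
bump, clobber and simulate_handshake are hypothetical names chosen for
illustration.  It assumes an initiator that publishes call data from its own
stack and returns as soon as every responder has incremented a "started"
counter.  The sketch replays one fixed interleaving, without threads, to show
why a responder may need to copy the call data before it signals.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a per-call descriptor that the initiating
 * CPU keeps on its own stack and publishes to the other CPUs. */
struct call_data {
    void (*func)(int *info);
    int *info;
    int started;            /* responders that have picked up the call */
};

static void bump(int *counter)    { *counter += 1; }
static void clobber(int *counter) { *counter = -1; }

/* Replays one fixed interleaving of one initiator and one responder.
 * Returns the counter value that the responder's call leaves behind. */
int simulate_handshake(bool snapshot_first)
{
    int counter = 0;
    struct call_data slot = { bump, &counter, 0 };
    void (*func)(int *) = NULL;
    int *info = NULL;

    /* Responder, step 1: optionally copy the fields, then signal. */
    if (snapshot_first) {
        func = slot.func;
        info = slot.info;
    }
    slot.started++;

    /* Initiator: it is not waiting for completion, so once every
     * responder has signalled it returns and its stack slot is reused.
     * The overwrite below models that reuse. */
    if (slot.started == 1)
        slot.func = clobber;

    /* Responder, step 2: a late read now sees the overwritten slot. */
    if (!snapshot_first) {
        func = slot.func;
        info = slot.info;
    }
    assert(func != NULL);
    func(info);
    return counter;
}
```

In this sketch, simulate_handshake(true) returns 1 because the responder runs
the function it was asked to run, while simulate_handshake(false) returns -1
because the responder runs whatever the reused slot holds by then.  In a real
kernel the equivalent failure would be a jump through stale data, and the
ordering would also depend on memory barriers that this sketch does not model.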

Comment 11 Neil Horman 2004-08-20 20:06:06 UTC

Created attachment 102939 [details]
final copy of patch to fix race in smp_call_function

Here's the final copy of the patch, which was acked after various and sundry
clean-ups were performed.

Comment 12 Larry Woodman 2004-09-10 19:49:48 UTC

This patch looks OK and was reviewed on rhkernel-list, so it should be
included in Pensacola-U6.

Larry

Comment 13 Jason Baron 2004-09-13 20:06:18 UTC

This is already in U6...changing status to modified.

Comment 14 Juergen Vigna 2004-10-18 07:41:24 UTC

When will the kernel with this patch be released? I checked the latest
available e-49, and the patch is not included there yet. As I am hit by
this problem too on our VMware RH AS 2.1 server, I would like to know
which kernel version will include it.

Comment 15 John Flanagan 2004-12-13 20:06:28 UTC

An errata has been issued which should help the problem 
described in this bug report. This report is therefore being 
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files, 
please follow the link below. You may reopen this bug report 
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2004-505.html