Bug 487672

Summary: slab corruption with dlm and clvmd on ppc64
Product: Red Hat Enterprise Linux 5
Reporter: Bryn M. Reeves <bmr>
Component: kernel
Assignee: David Teigland <teigland>
Status: CLOSED ERRATA
QA Contact: Red Hat Kernel QE team <kernel-qe>
Severity: urgent
Priority: urgent
Version: 5.3
CC: ccaulfie, cjt, dhoward, emcnabb, jpirko, tao, teigland
Target Milestone: rc
Keywords: ZStream
Hardware: ppc64
OS: Linux
Doc Type: Bug Fix
Last Closed: 2009-09-02 08:12:52 UTC
Bug Blocks: 491677
Attachments:
patch to test

Description Bryn M. Reeves 2009-02-27 12:37:59 UTC
Description of problem:
We're seeing corruption of slab cache data structures when running clvmd on ppc64. The problem has shown itself as corruption of slab-internal list pointers, and running with the debug kernel and red zones enabled confirms that something is writing past the end of the allocated object.

A number of cores have been collected and, although there are several different patterns of corruption, the suspicion is that they are related.

For example, here the upper 32 bits of a list pointer were overwritten with 0x32000000:

list_del corruption. prev->next should be c0000000ed9e0000, but was 32000000ed9e0000
kernel BUG in list_del at lib/list_debug.c:65!
Oops: Exception in kernel mode, sig: 5 [#1]
SMP NR_CPUS=128 NUMA
Modules linked in: lock_dlm gfs2 dlm configfs ipv6 xfrm_nalgo crypto_api snd_powermac snd_seq_dummy snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss snd_pcm snd_page_alloc snd_timer snd soundcore i2c_core parport_pc lp parport sg ibmveth dm_round_robin dm_multipath dm_snapshot dm_zero dm_mirror dm_mod ibmvscsic sd_mod scsi_mod ext3 jbd uhci_hcd ohci_hcd ehci_hcd
NIP: C0000000001BE174 LR: C0000000001BE170 CTR: 80000000001C5840
REGS: c00000000f45f840 TRAP: 0700   Not tainted  (2.6.18-92.1.18.el5)
MSR: 8000000000021032 <ME,IR,DR>  CR: 28000082  XER: 00000006
TASK = c0000000ffc149c0[28] 'events/2' THREAD: c00000000f45c000 CPU: 2
GPR00: C0000000001BE170 C00000000F45FAC0 C0000000005F1BF0 0000000000000058
GPR04: 8000000000001032 0000000000000000 0000000000000000 0000000000000000
GPR08: 0000000000000002 C000000000680200 C0000000FF19BC08 0000000000000000
GPR12: 0000000000004000 C0000000004D5280 0000000000000000 C0000000003E3048
GPR16: 4000000001C00000 C0000000003E1A58 0000000000000000 0000000000638000
GPR20: C0000000004953E8 00000000020953E8 0000000002095658 000000000000000B
GPR24: 0000000000000000 0000000000000000 C0000000EFB32900 C0000000ED9E0358
GPR28: C0000000FC725398 C0000000F9328100 C000000000532918 C0000000ED9E0000
NIP [C0000000001BE174] .list_del+0x44/0xb0
LR [C0000000001BE170] .list_del+0x40/0xb0
Call Trace:
[C00000000F45FAC0] [C0000000001BE170] .list_del+0x40/0xb0 (unreliable)
[C00000000F45FB40] [C0000000000E7570] .free_block+0xdc/0x1d4
[C00000000F45FC00] [C0000000000E772C] .drain_array+0xc4/0x144
[C00000000F45FCB0] [C0000000000E9930] .cache_reap+0x118/0x2d0
[C00000000F45FD50] [C0000000000815E8] .run_workqueue+0xdc/0x168
[C00000000F45FDF0] [C000000000082354] .worker_thread+0x12c/0x19c
[C00000000F45FEE0] [C000000000086E74] .kthread+0x128/0x178
[C00000000F45FF90] [C000000000026FFC] .kernel_thread+0x4c/0x68
Instruction dump:
7c641b78 f8010010 f821ff81 60000000 60000000 e9230008 e8a90000 7fa51800
41be0014 e87e8000 4beae8f9 60000000 <0fe00000> e93f0000 7fe4fb78 e8a90008
<0>Sending IPI to other cpus...


Version-Release number of selected component (if applicable):
2.6.18-92.1.18.el5, 2.6.18-128.1.1.el5

How reproducible:
100% reproducible with clvmd enabled. Disabling clvmd prevents the problem from occurring.

Steps to Reproduce:
1. Configure ppc64 cluster
2. Enable clvmd
3. Boot
  
Actual results:
System quickly panics with backtrace similar to above

Expected results:
No panic

Additional info:

Comment 2 Bryn M. Reeves 2009-02-27 12:45:48 UTC
Looking at the words around the corrupted list pointer in the core from comment #0 we find:

c0000000f93280f0:  0000000000000000 565f766730315130   ........V_vg01Q0
c0000000f9328100:  32000000ed9e0000 c0000000ed9e0000   2...............

This appears to be an LVM2 lock string.
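The pattern above — string data immediately followed by a clobbered list pointer — is what a short overrun into a neighbouring slab object looks like. A minimal userspace sketch (this is illustrative only, not the kernel slab allocator) of a copy running a few bytes past its object into an adjacent one:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Two adjacent 16-byte "objects", as neighbouring slab objects would be.
 * Copying a few bytes too many into the first lands in the second -- the
 * same way the "V_vg01..." lock string ended up adjacent to, and partly
 * over, a slab list pointer in these cores.  (On a big-endian box like
 * ppc64, overwriting the first four bytes of the pointer clobbers its
 * upper 32 bits, as seen in the list_del message.) */
uint64_t overrun_neighbour(void)
{
    unsigned char slab[32];
    uint64_t list_ptr = 0xc0000000ed9e0000ULL;   /* pretend list pointer */

    memcpy(slab + 16, &list_ptr, sizeof(list_ptr));

    unsigned char payload[20] = { 0 };           /* 4 bytes longer than the object */
    memcpy(payload, "V_vg01", 6);
    payload[19] = 0x32;                          /* trailing garbage byte */

    memcpy(slab, payload, sizeof(payload));      /* overruns into slab[16..19] */

    memcpy(&list_ptr, slab + 16, sizeof(list_ptr));
    return list_ptr;                             /* no longer the original value */
}
```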

Comment 3 Bryn M. Reeves 2009-02-27 15:11:50 UTC
There's a VG named vg01 present on these hosts:

$ grep vg01 lvmdump-*/vgs
lvmdump-1/vgs:  vg01 wz--n-  4.00M   1   7   0 16.52G 176.00M 0IQtLj-uPpD-y82v-FsFV-xplc-oNv7-0ovNi3
lvmdump-2/vgs:  vg01 wz--n-  4.00M   1   7   0 16.52G 176.00M 0IQtLj-uPpD-y82v-FsFV-xplc-oNv7-0ovNi3

So the string in comment #2 ("V_vg01Q02\0") appears as though it may have come from:

LVM2.*/lib/locking/cluster_locking.c:
int lock_resource(struct cmd_context *cmd, const char *resource, uint32_t flags)
#endif
{
[...]
        switch (flags & LCK_SCOPE_MASK) {
        case LCK_VG:
                /* If the VG name is empty then lock the unused PVs */
                if (!*resource) /* FIXME Deprecated */
                        dm_snprintf(lockname, sizeof(lockname), "P_orphans");
                else if (*resource == '#')
                        dm_snprintf(lockname, sizeof(lockname), "P_%s", resource + 1);
                else   
                        dm_snprintf(lockname, sizeof(lockname), "V_%s",
                                     resource);

I'm not sure about the "Q02" - is it possible that's part of the resource string?
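The lockname formatting quoted above is easy to reproduce; a userspace sketch using plain snprintf in place of LVM2's dm_snprintf wrapper (equivalent behaviour for this case):

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Builds the cluster lock name the way lock_resource() does. */
void build_lockname(char *lockname, size_t len, const char *resource)
{
    if (!*resource)                 /* empty VG name: lock the unused PVs */
        snprintf(lockname, len, "P_orphans");
    else if (*resource == '#')
        snprintf(lockname, len, "P_%s", resource + 1);
    else
        snprintf(lockname, len, "V_%s", resource);
}
```

A VG named vg01 therefore yields "V_vg01", matching the string found next to the corrupted pointer.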

Comment 4 Bryn M. Reeves 2009-02-27 15:26:23 UTC
I'm having trouble getting sane backtraces from the core that showed the assumed lock strings. E.g. I see this backtrace in the log:

NIP [C0000000001BE174] .list_del+0x44/0xb0
LR [C0000000001BE170] .list_del+0x40/0xb0
Call Trace:
[C00000000F45FAC0] [C0000000001BE170] .list_del+0x40/0xb0 (unreliable)
[C00000000F45FB40] [C0000000000E7570] .free_block+0xdc/0x1d4
[C00000000F45FC00] [C0000000000E772C] .drain_array+0xc4/0x144
[C00000000F45FCB0] [C0000000000E9930] .cache_reap+0x118/0x2d0
[C00000000F45FD50] [C0000000000815E8] .run_workqueue+0xdc/0x168
[C00000000F45FDF0] [C000000000082354] .worker_thread+0x12c/0x19c
[C00000000F45FEE0] [C000000000086E74] .kthread+0x128/0x178
[C00000000F45FF90] [C000000000026FFC] .kernel_thread+0x4c/0x68

This seems completely consistent with, e.g., the LR/NIP values in the register dumps, but looking at the same stack via the bt command shows:

 #0 [c00000000f45fac0] .gen_pool_free at c0000000001be170
 #1 [c00000000f45fb40] .__drain_alien_cache at c0000000000e7570
 #2 [c00000000f45fc00] .drain_freelist at c0000000000e772c
 #3 [c00000000f45fcb0] .s_show at c0000000000e9930
 #4 [c00000000f45fd50] .cleanup_workqueue_thread at c0000000000815e8
 #5 [c00000000f45fdf0] .workqueue_cpu_callback at c000000000082354
 #6 [c00000000f45fee0] .bit_waitqueue at c000000000086e74
 #7 [c00000000f45ff90] .kernel_thread at c000000000026ffc

The addresses are correct but the resolved symbol names are not. Same thing if I disassemble e.g. list_del:

/usr/src/debug/kernel-2.6.18/linux-2.6.18.ppc64/lib/list_debug.c: 61
0xc0000000001bdb9c <.list_del>: beq-    cr7,0xc0000000001bdc0c <list_del+112>
0xc0000000001bdba0 <.list_del+4>:       ld      r9,0(r31)
0xc0000000001bdba4 <.list_del+8>:       addi    r9,r9,1
0xc0000000001bdba8 <.list_del+12>:      cmpd    cr7,r9,r11
0xc0000000001bdbac <.list_del+16>:      beq-    cr7,0xc0000000001bdc0c <list_del+112>
/usr/src/debug/kernel-2.6.18/linux-2.6.18.ppc64/lib/list_debug.c: 63
0xc0000000001bdbb0 <.list_del+20>:      bl      0xc0000000001c4c44 <__pci_bus_find_cap+20>
/usr/src/debug/kernel-2.6.18/linux-2.6.18.ppc64/lib/list_debug.c: 61
0xc0000000001bdbb4 <.list_del+24>:      nop
0xc0000000001bdbb8 <.list_del+28>:      mr      r4,r29

There are apparent calls to PCI related symbols all through the routine which cannot be correct. I think the disassembly is correct (it looks to be doing more or less what I'd expect) but the symbol resolution is borked. I'm going to try to get some assistance on this from the crash folks.

Comment 5 Bryn M. Reeves 2009-02-27 15:30:52 UTC
Mystery in comment #4 solved: it seems to be a CAS bug. CAS uses a fairly crude heuristic to determine the kernel version of a given vmcore image, and in this case it is failing because the vmcore in question has rather a lot of things that look like kernel ident strings in pagecache:

strings 20090120002426-vmcore | grep '2\.6\.18'
[ pages and pages of output ]

I'll report this to the CAS maintainer.

Comment 6 Bryn M. Reeves 2009-02-27 15:37:37 UTC
Getting proper backtraces now:

crash> bt
PID: 28     TASK: c0000000ffc149c0  CPU: 2   COMMAND: "events/2"

 R0:  c0000000001be170    R1:  c00000000f45fac0    R2:  c0000000005f1bf0   
 R3:  0000000000000058    R4:  8000000000001032    R5:  0000000000000000   
 R6:  0000000000000000    R7:  0000000000000000    R8:  0000000000000002   
 R9:  c000000000680200    R10: c0000000ff19bc08    R11: 0000000000000000   
 R12: 0000000000004000    R13: c0000000004d5280    R14: 0000000000000000   
 R15: c0000000003e3048    R16: 4000000001c00000    R17: c0000000003e1a58   
 R18: 0000000000000000    R19: 0000000000638000    R20: c0000000004953e8   
 R21: 00000000020953e8    R22: 0000000002095658    R23: 000000000000000b   
 R24: 0000000000000000    R25: 0000000000000000    R26: c0000000efb32900   
 R27: c0000000ed9e0358    R28: c0000000fc725398    R29: c0000000f9328100   
 R30: c000000000532918    R31: c0000000ed9e0000   
 NIP: c0000000001be174    MSR: 8000000000021032    OR3: c00000000006c9a8
 CTR: 80000000001c5840    LR:  c0000000001be170    XER: 0000000000000006
 CCR: 0000000028000082    MQ:  0000000000000000    DAR: 0000000000000000
 DSISR: 0000000000000000     Syscall Result: 0000000000000000
 NIP [c0000000001be174] .list_del

 #0 [c00000000f45fac0] .list_del at c0000000001be170
 #1 [c00000000f45fb40] .free_block at c0000000000e7570
 #2 [c00000000f45fc00] .drain_array at c0000000000e772c
 #3 [c00000000f45fcb0] .cache_reap at c0000000000e9930
 #4 [c00000000f45fd50] .run_workqueue at c0000000000815e8
 #5 [c00000000f45fdf0] .worker_thread at c000000000082354
 #6 [c00000000f45fee0] .kthread at c000000000086e74
 #7 [c00000000f45ff90] .kernel_thread at c000000000026ffc

Comment 7 David Teigland 2009-02-27 17:19:55 UTC
Do we know whether this is a regression between 5.2 and 5.3?
The area in question is probably the 32/64 compat code in fs/dlm/user.c.
This code is "highly delicate" and was changed in 5.3 by this patch:
http://post-office.corp.redhat.com/archives/rhkernel-list/2008-August/msg00847.html
which was for bug 458760 -- the compat_input() change is the scary one.

Comment 8 David Teigland 2009-02-27 17:38:22 UTC
Looking at the patch mentioned in comment 7,

- compat_input(kbuf, k32buf, count - sizeof(struct dlm_write_request32));
+ compat_input(kbuf, k32buf, count + 1);

But in compat_input() I don't think we account for the extra "1":

+ memcpy(kb->i.lspace.name, kb32->i.lspace.name, count -
+	 offsetof(struct dlm_write_request32, i.lspace.name));

and

+ memcpy(kb->i.lock.name, kb32->i.lock.name, count -
+	 offsetof(struct dlm_write_request32, i.lock.name));

meaning, I think, that we memcpy 1 extra byte.

Comment 9 Bryn M. Reeves 2009-02-27 17:40:38 UTC
I'm beginning to think we might have two problems here; the one that we were seeing with kernels <107.el5 and possibly something slightly different in the later ones relating to this patch.

The cluster where this is being seen triggers the problem very reliably so I'll get a build submitted with that change backed out for testing.

Comment 10 Curtis Taylor 2009-02-27 17:44:35 UTC
(In reply to comment #7)
> Do we know whether this is a regression between 5.2 and 5.3?

The problem happened before going to 5.3.  The test cluster wasn't updated to RHEL5.3 until 23 Feb. (All vmcores before that time, including the first, were from 5.2)

Comment 11 Bryn M. Reeves 2009-02-27 17:46:27 UTC
> The problem happened before going to 5.3.  The test cluster wasn't updated to
> RHEL5.3 until 23 Feb. (All vmcores before that time, including the first, were

But the recent tests have produced somewhat different footprints - as I mentioned in comment #9 it looks like we might have multiple issues here.

I'm building a kernel now with the change Dave pointed out reverted for testing - will update again when that's ready.

Comment 13 Curtis Taylor 2009-02-27 18:21:50 UTC
How is compat_input() not accounting for the new nil byte?  The nil is added with kzalloc(), and it's accounted for in the parameters passed to compat_input(), so compat_input() doesn't have to account for it.

Aren't all the occurrences of the LVM2 lock name before the corruption from vmcores prior to using 2.6.18-128.0.1?  Wouldn't rolling that change back reintroduce the problem it solves?

Comment 14 Curtis Taylor 2009-02-27 18:24:57 UTC
(In reply to comment #13)
> Wouldn't rolling that change back, reintroduce the problem it solves?

That didn't read like I intended... of course rolling it back will reintroduce the problem it solves, but is that what we want to do in this case, when it could be one of the multiple problems encountered?

Comment 15 Bryn M. Reeves 2009-02-27 18:31:24 UTC
Right now I think we're just trying to get some clarity. If the test kernel changes the footprint of the crash (or causes it to revert to the patterns we'd seen in earlier cores) then I'd suggest we fork off a new bugzilla to track the problem introduced by that patch and use this BZ to focus on the original problem that was being seen on this cluster.

Comment 16 Curtis Taylor 2009-02-27 23:59:45 UTC
(In reply to comment #8)
> ... I think, that we memcpy 1 extra byte.

I understand why the patch didn't work as expected:

sizeof(dlm_write_request32) = 0x68
sizeof(dlm_lock_params32) = 0x58
sizeof(dlm_write_request) = 0x78
sizeof(dlm_lock_params) = 0x68
but
offsetof(struct dlm_write_request32,  i.lock.name) = 0x64
offsetof(struct dlm_write_request, i.lock.name) = 0x78

      memcpy(kb->i.lock.name, kb32->i.lock.name, count -
                  offsetof(struct dlm_write_request32, i.lock.name));

In the case of the recent debug kernel vmcore the count passed to device_write is 0x6f and i.lock.namelen=7.  In compat_input() this 

memcpy(kb->i.lock.name, kb32->i.lock.name, count -
                  offsetof(struct dlm_write_request32, i.lock.name));

copies 0xc, not 0x8 bytes.  We memcpy'ed 4 extra bytes and didn't effectively nil terminate name either.  dlm_lock_params32 would need 4 bytes of padding before name for that patch to work.
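The size relations measured above can be reproduced with mock structures of the same shape (illustrative stand-ins, not the real dlm definitions): a larger sibling member in the union pushes sizeof() past offsetof(i.lock.name), so "count - offsetof" copies the tail padding along with the name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Mock structs with the same shape as the dlm request (NOT the real
 * definitions): the union's larger member makes sizeof() bigger than
 * offsetof(i.lock.name), just as measured above (0x68 vs 0x64). */
struct mock_lock_params32 {
    uint32_t flags;
    char name[1];                             /* lock name starts here */
};
struct mock_write_request32 {
    uint32_t cmd;
    union {
        struct mock_lock_params32 lock;
        struct { uint32_t a, b, c; } other;   /* larger member pads the union */
    } i;
};

/* Bytes the buggy compat_input() copies for a given count argument. */
size_t buggy_copy_len(size_t count)
{
    return count - offsetof(struct mock_write_request32, i.lock.name);
}

/* Bytes between the start of name and the end of the struct. */
size_t tail_padding(void)
{
    return sizeof(struct mock_write_request32)
         - offsetof(struct mock_write_request32, i.lock.name);
}
```

With libdlm sending count = sizeof(request32) + namelen and the kernel passing count + 1 onward, the copy overruns the intended namelen + 1 bytes by exactly the tail padding — 4 bytes in the real layout, matching the 4 extra bytes noted above.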

Comment 17 Curtis Taylor 2009-02-28 00:59:22 UTC
An alternative view is that libdlm:dlm/lib/libdlm.c in ls_lock_v6() and ls_lock_v5() is miscalculating len when it assumes 
     offsetof(struct dlm_write_request, i.lock.name) = sizeof(dlm_write_request) and does this:

                req->i.lock.namelen = namelen;
                memcpy(req->i.lock.name, name, namelen);
...
       len = sizeof(struct dlm_write_request_v5) + namelen;
...
       status = write(lsinfo->fd, req, len);

Is it possible to fix libdlm and leave the original dlm kernel patch in place?

Comment 18 Curtis Taylor 2009-02-28 01:57:20 UTC
Perhaps this needs a two-pronged fix.  To keep the slab from getting corrupted, device_write() needs to be changed to:

-     kbuf = kmalloc(count + 1 + (sizeof(struct dlm_write_request) -
-                     sizeof(struct dlm_write_request32)), GFP_KERNEL);
+     kbuf = kmalloc(sizeof(struct dlm_write_request) + 1 + 
+                 (count - offsetof(struct dlm_write_request32, i.lock.name))

At the same time, libdlm needs to be fixed to provide a proper size to the write() system call in ls_lock_v5() and ls_lock_v6().  There, 'len' should be calculated as:
      len = offsetof(struct dlm_write_request, i.lock.name) + namelen
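The difference between the two len formulas (comment 17 vs the proposal above) is visible with a mock struct of the same shape — again an illustrative stand-in, not the real libdlm definition:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Mock of the write request (NOT the real libdlm struct).  sizeof() and
 * offsetof(i.lock.name) differ whenever the union has a larger member or
 * trailing padding, which is exactly why "sizeof + namelen" oversizes the
 * write(). */
struct mock_write_request {
    uint32_t cmd;
    union {
        struct { uint32_t flags; char name[1]; } lock;
        struct { uint64_t a, b; } other;
    } i;
};

size_t len_with_sizeof(size_t namelen)    /* what ls_lock_v5/v6 did */
{
    return sizeof(struct mock_write_request) + namelen;
}

size_t len_with_offsetof(size_t namelen)  /* calculation proposed above */
{
    return offsetof(struct mock_write_request, i.lock.name) + namelen;
}
```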

Comment 19 Curtis Taylor 2009-03-02 18:23:57 UTC
NOTE:  any libdlm change needs to consider the "count < sizeof(struct dlm_write_request32)" condition in device_write().  If the vgname is 1 byte, and the lock name is V_?, device_write() will return EINVAL.
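A sketch of the condition being warned about, using the sizes measured in comment 16 as mock constants (not the real structs): with an offsetof-based len, a very short lock name produces a count below sizeof(struct dlm_write_request32), so the existing sanity check in device_write() rejects the write.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Mock sizes from comment 16 (NOT the real struct definitions):
 * sizeof(struct dlm_write_request32) is 0x68, but the lock name starts
 * at offset 0x64.  With len = offset + namelen, a name like "V_?" gives
 * count = 0x64 + 3 = 0x67, which fails the existing check. */
enum { MOCK_REQ32_SIZE = 0x68, MOCK_NAME_OFFSET = 0x64 };

int mock_count_check(size_t count)
{
    if (count < MOCK_REQ32_SIZE)   /* the "count < sizeof(...)" check */
        return -EINVAL;
    return 0;
}
```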

Comment 20 David Teigland 2009-03-02 19:35:17 UTC
Curtis, thanks for the detailed analysis.  I'll step through all your
calculations myself to verify, and then post kernel/libdlm patches that
we can build for you to test.

Comment 21 David Teigland 2009-03-02 22:35:16 UTC
Given the unobvious way that offsetof works, I'd like to avoid using it at all. I think we should be able to code things in such a way that the particular offset doesn't matter anywhere.  It seems to me that the simplest approach to all this is to calculate the name length right away and pass that around for use when copying, like the following diff.  Is there something I'm missing that would not work with this?

diff --git a/fs/dlm/user.c b/fs/dlm/user.c
index 302cf0c..e219e53 100644
--- a/fs/dlm/user.c
+++ b/fs/dlm/user.c
@@ -84,7 +84,7 @@ struct dlm_lock_result32 {
 
 static void compat_input(struct dlm_write_request *kb,
                         struct dlm_write_request32 *kb32,
-                        size_t count)
+                        int namelen)
 {
        kb->version[0] = kb32->version[0];
        kb->version[1] = kb32->version[1];
@@ -96,8 +96,7 @@ static void compat_input(struct dlm_write_request *kb,
            kb->cmd == DLM_USER_REMOVE_LOCKSPACE) {
                kb->i.lspace.flags = kb32->i.lspace.flags;
                kb->i.lspace.minor = kb32->i.lspace.minor;
-               memcpy(kb->i.lspace.name, kb32->i.lspace.name, count -
-                      offsetof(struct dlm_write_request32, i.lspace.name));
+               memcpy(kb->i.lspace.name, kb32->i.lspace.name, namelen);
        } else if (kb->cmd == DLM_USER_PURGE) {
                kb->i.purge.nodeid = kb32->i.purge.nodeid;
                kb->i.purge.pid = kb32->i.purge.pid;
@@ -115,8 +114,7 @@ static void compat_input(struct dlm_write_request *kb,
                kb->i.lock.bastaddr = (void *)(long)kb32->i.lock.bastaddr;
                kb->i.lock.lksb = (void *)(long)kb32->i.lock.lksb;
                memcpy(kb->i.lock.lvb, kb32->i.lock.lvb, DLM_USER_LVB_LEN);
-               memcpy(kb->i.lock.name, kb32->i.lock.name, count -
-                      offsetof(struct dlm_write_request32, i.lock.name));
+               memcpy(kb->i.lock.name, kb32->i.lock.name, namelen);
        }
 }
 
@@ -525,15 +523,25 @@ static ssize_t device_write(struct file *file, const char 
 #ifdef CONFIG_COMPAT
        if (!kbuf->is64bit) {
                struct dlm_write_request32 *k32buf;
+               int namelen = 0;
+
+               if (count > sizeof(struct dlm_write_request32))
+                       namelen = count - sizeof(struct dlm_write_request32);
+
                k32buf = (struct dlm_write_request32 *)kbuf;
-               kbuf = kmalloc(count + 1 + (sizeof(struct dlm_write_request) -
-                              sizeof(struct dlm_write_request32)), GFP_KERNEL);
-               if (!kbuf)
+
+               /* add 1 after namelen so that the name string is terminated */
+               kbuf = kzalloc(sizeof(struct dlm_write_request) + namelen + 1,
+                              GFP_KERNEL);
+               if (!kbuf) {
+                       kfree(k32buf);
                        return -ENOMEM;
+               }
 
                if (proc)
                        set_bit(DLM_PROC_FLAGS_COMPAT, &proc->flags);
-               compat_input(kbuf, k32buf, count + 1);
+
+               compat_input(kbuf, k32buf, namelen);
                kfree(k32buf);
        }
 #endif

Comment 22 Curtis Taylor 2009-03-03 00:05:53 UTC
(In reply to comment #21)
In general I like the idea of using your calculation of namelen to get the job done.  That change coupled with the kmalloc() change to kzalloc() effectively nil terminates the lock name.  Now, any garbage that must be passed because 
   count = sizeof(struct dlm_write_request32) + namelen
gets washed away.
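A userspace sketch of that approach (mock offsets from comment 16, not the real structs): derive namelen from the write() count once, zero-allocate namelen + 1 extra bytes, and copy exactly namelen bytes from where the name actually starts. The zeroed allocation is what washes away the tail garbage and guarantees nul termination.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Mock layout constants from comment 16 (NOT the real struct sizes). */
enum { MOCK_REQ32_SIZE = 0x68, MOCK_NAME_OFFSET = 0x64 };

/* Sketch of the comment 21 namelen handling in userspace. */
char *extract_name(const char *buf, size_t count)
{
    size_t namelen = 0;

    if (count > MOCK_REQ32_SIZE)
        namelen = count - MOCK_REQ32_SIZE;

    char *name = calloc(1, namelen + 1);       /* kzalloc analogue: zeroed */
    if (!name)
        return NULL;

    /* Copy exactly namelen bytes from the name field's offset; the bytes
     * between name+namelen and the end of the struct are never touched. */
    memcpy(name, buf + MOCK_NAME_OFFSET, namelen);
    return name;                               /* caller frees */
}
```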

Comment 23 Christine Caulfield 2009-03-03 08:51:39 UTC
Yes, I like the idea of passing around namelen. It's far tidier and less prone to error.

Comment 24 David Teigland 2009-03-03 15:08:36 UTC
Created attachment 333877 [details]
patch to test

Here is the patch from comment 21 that we can build and test.

Comment 26 Curtis Taylor 2009-03-04 16:41:40 UTC
Tests with this patch have been a success.  (Cluster nodes that were predictably hitting the slab corruption are no longer seeing any problems and can mount their gfs filesystems.)

Comment 28 David Teigland 2009-03-04 17:06:15 UTC
Yes, I'll begin all the steps to get it posted upstream and internally.

Comment 30 David Teigland 2009-03-04 17:47:51 UTC
patch from comment 24 posted to rhkernel
http://post-office.corp.redhat.com/archives/rhkernel-list/2009-March/msg00072.html

Comment 31 RHEL Program Management 2009-03-04 18:28:40 UTC
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.

Comment 36 Don Zickus 2009-03-09 18:55:27 UTC
in kernel-2.6.18-134.el5
You can download this test kernel from http://people.redhat.com/dzickus/el5

Please do NOT transition this bugzilla state to VERIFIED until our QE team
has sent specific instructions indicating when to do so.  However feel free
to provide a comment indicating that this fix has been verified.

Comment 48 errata-xmlrpc 2009-09-02 08:12:52 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2009-1243.html