Bug 593902

Summary: blkbk generates Call Traces because of aoe bug
Product: Red Hat Enterprise Linux 5 Reporter: Masanori ITOH <masanori.itoh>
Component: kernel Assignee: Red Hat Kernel Manager <kernel-mgr>
Status: CLOSED WONTFIX QA Contact: Red Hat Kernel QE team <kernel-qe>
Severity: medium Docs Contact:
Priority: low    
Version: 5.5 CC: drjones, masanori.itoh, xen-maint
Target Milestone: rc   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2014-06-02 13:16:56 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments:
Description Flags
a backport patch for aoe driver none

Description Masanori ITOH 2010-05-20 02:23:21 UTC
Created attachment 415298 [details]
a backport patch for aoe driver

Description of problem:
  xen blkbk driver generates call traces when used with aoe

Version-Release number of selected component (if applicable):
  kernel-xen-2.6.18-194.el5

How reproducible:
  Always reproducible when aoe devices are attached to Xen domU guests.

Steps to Reproduce:
  1. Set up an AoE server, and an AoE client acting as the Xen dom0.
  2. Create a Xen guest.
  3. Attach aoe devices to the Xen domU guest created above.
  4. You will see a bunch of call traces like the example at the end of this description.
  
Actual results:
  Got a bunch of call trace messages.

Expected results:
  No call trace messages appear.

Additional info:
  This happens because the aoe driver does not follow the block subsystem's initialization rules, and the Xen blkback driver (blkbk) is a victim of that bug. Essentially, the problem was fixed in linux-2.6.32:
  http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=7135a71b19be1faf48b7148d77844d03bc0717d6

I wrote a backport patch, attached as fix_aoe_calltraces_backport.patch.

Note that under some rare conditions the above fix is not enough, and one more aoe patch is needed, which I have posted upstream. I will post that fix in a separate bug report.

The following is an example of call traces which I got:
May  6 18:54:59 cb-blsv3d1 kernel: BUG: warning at lib/kref.c:32/kref_get() (Not tainted)
May  6 18:54:59 cb-blsv3d1 kernel:
May  6 18:54:59 cb-blsv3d1 kernel: Call Trace:
May  6 18:54:59 cb-blsv3d1 kernel:  [<ffffffff80237b0b>] kref_get+0x38/0x3d
May  6 18:54:59 cb-blsv3d1 kernel:  [<ffffffff80259c09>] kobject_get+0x12/0x17
May  6 18:54:59 cb-blsv3d1 kernel:  [<ffffffff8024acad>] blk_get_queue+0x1f/0x26
May  6 18:54:59 cb-blsv3d1 kernel:  [<ffffffff8839390a>] :blkbk:dispatch_rw_block_io+0x4e4/0x5ab
May  6 18:54:59 cb-blsv3d1 kernel:  [<ffffffff80263f77>] __kprobes_text_start+0x317/0x438
May  6 18:54:59 cb-blsv3d1 kernel:  [<ffffffff80286570>] dequeue_task+0x18/0x37
May  6 18:54:59 cb-blsv3d1 kernel:  [<ffffffff802865b7>] deactivate_task+0x28/0x60
May  6 18:54:59 cb-blsv3d1 kernel:  [<ffffffff8026df02>] monotonic_clock+0x35/0x7b
May  6 18:54:59 cb-blsv3d1 kernel:  [<ffffffff80261e83>] thread_return+0x6c/0x113
May  6 18:54:59 cb-blsv3d1 kernel:  [<ffffffff8020622a>] hypercall_page+0x22a/0x1000
May  6 18:54:59 cb-blsv3d1 kernel:  [<ffffffff8033b517>] kobject_cleanup+0x39/0x7e
May  6 18:54:59 cb-blsv3d1 kernel:  [<ffffffff88393d4b>] :blkbk:blkif_schedule+0x37a/0x463
May  6 18:54:59 cb-blsv3d1 kernel:  [<ffffffff883939d1>] :blkbk:blkif_schedule+0x0/0x463
May  6 18:54:59 cb-blsv3d1 kernel:  [<ffffffff80299db3>] keventd_create_kthread+0x0/0xc4
May  6 18:54:59 cb-blsv3d1 kernel:  [<ffffffff80233476>] kthread+0xfe/0x132
May  6 18:54:59 cb-blsv3d1 kernel:  [<ffffffff8025fb2c>] child_rip+0xa/0x12
May  6 18:54:59 cb-blsv3d1 kernel:  [<ffffffff80299db3>] keventd_create_kthread+0x0/0xc4
May  6 18:54:59 cb-blsv3d1 kernel:  [<ffffffff80233378>] kthread+0x0/0x132
May  6 18:54:59 cb-blsv3d1 kernel:  [<ffffffff8025fb22>] child_rip+0x0/0x12

Comment 1 Andrew Jones 2010-05-20 16:02:15 UTC
While the xen blkback driver is the noisy victim here, the suggested patch is all in the aoe driver. Moving this to the kernel component.

Comment 2 RHEL Program Management 2014-03-07 12:47:03 UTC
This bug/component is not included in scope for RHEL-5.11.0 which is the last RHEL5 minor release. This Bugzilla will soon be CLOSED as WONTFIX (at the end of RHEL5.11 development phase (Apr 22, 2014)). Please contact your account manager or support representative in case you need to escalate this bug.

Comment 3 RHEL Program Management 2014-06-02 13:16:56 UTC
Thank you for submitting this request for inclusion in Red Hat Enterprise Linux 5. We've carefully evaluated the request, but are unable to include it in RHEL5 stream. If the issue is critical for your business, please provide additional business justification through the appropriate support channels (https://access.redhat.com/site/support).

Comment 4 Red Hat Bugzilla 2023-09-14 01:21:13 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 1000 days