Bug 294691 - [Xen][5.2] keeping create/destroy causes Dom0 reboots
Status: CLOSED CURRENTRELEASE
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: kernel-xen
Version: 5.1
Hardware: x86_64 Linux
Priority: medium
Severity: high
Assigned To: Bill Burns
QA Contact: Martin Jenner
Depends On:
Blocks: 391501 222082 RHEL5u2_relnotes 409971 448753
Reported: 2007-09-18 09:36 EDT by Flavio Leitner
Modified: 2010-10-22 14:43 EDT (History)
Fixed In Version: RHEL 5.2
Doc Type: Bug Fix
Last Closed: 2008-06-26 07:29:59 EDT


Attachments
Backported patch from upstream to RHEL5 (4.14 KB, patch)
2007-09-18 09:36 EDT, Flavio Leitner

Description Flavio Leitner 2007-09-18 09:36:14 EDT
Description of problem:
  Repeatedly creating and destroying a guest domain causes Dom0 to reboot.
  The problem occurs after about 1000 create/destroy cycles on x86_64.
  We did not observe it on i386 or ia64.

From logs:
Kernel BUG at drivers/xen/xenbus/xenbus_dev.c:112
invalid opcode: 0000 [1] SMP
last sysfs file: /class/net/lo/ifindex
CPU 0
Pid: 30015, comm: xenstore-rm Not tainted 2.6.18-36.el5xen #1
RIP: e030:[<ffffffff8039d780>]  [<ffffffff8039d780>] queue_reply+0x61/0x93
RSP: e02b:ffff8800347f3e88  EFLAGS: 00010206
RAX: 0000000000001030 RBX: ffff880033536000 RCX: 0000000000001020
RDX: 0000000000001000 RSI: ffff880033537021 RDI: ffff880034ffe048
RBP: ffff880034ffc000 R08: ffff880034ffe038 R09: ffff880034ffc024
R10: ffff880034930a80 R11: ffff880033ee00c0 R12: ffff880034ffe048
R13: 0000000000001020 R14: ffff880034ffc024 R15: 000000001ccad100
FS:  00002aaaaaac2e50(0000) GS:ffffffff80599000(0000) knlGS:0000000000000000
CS:  e033 DS: 0000 ES: 0000
Process xenstore-rm (pid: 30015, threadinfo ffff8800347f2000, task ffff880033ee00c0)
Stack:  ffff880033536000  0000000000000001  ffff880034ffc000  ffff880033536000
0000000000000000  ffffffff8039db26  ffff880033ee00c0  000000000000000d
ffff88003478a800  ffff88003478a800
Call Trace:
[<ffffffff8039db26>] xenbus_dev_write+0x16c/0x301
[<ffffffff802162de>] vfs_write+0xce/0x174
[<ffffffff80216b2b>] sys_write+0x45/0x6e
[<ffffffff8025d2f1>] tracesys+0xa7/0xb2


Relevant Code:
102 static void queue_reply(struct xenbus_dev_data *u,
103                         char *data, unsigned int len)
104 {
105         int i;
106
107         mutex_lock(&u->reply_mutex);
108
109         for (i = 0; i < len; i++, u->read_prod++)
110                 u->read_buffer[MASK_READ_IDX(u->read_prod)] = data[i];
111
=>112       BUG_ON((u->read_prod - u->read_cons) > sizeof(u->read_buffer));
113
114         mutex_unlock(&u->reply_mutex);
115
116         wake_up(&u->read_waitq);
117 }


Version-Release number of selected component (if applicable):
kernel-xen-2.6.18-36.el5-x86_64

How reproducible:
Always. After about 1100 create/destroy cycles, we hit this problem every time.

Steps to Reproduce:

  1. Create the guest domain (named test) with virt-install.
 # virt-install --name=test --ram=350 --vcpus=2 \
     --file=/root/test.img --file-size=10 --vnc --debug --noautoconsole \
     --paravirt --location=ftp://10.131.236.20/rhel5.1b_x86_64

  2. After 5 seconds, check memory usage.
 # free

  3. Confirm the test domain is listed.
 # xm list

  4. Destroy the test domain.
 # virsh destroy test

  5. Repeat steps 1 to 4.

  6. After about 1100 iterations, Dom0 reboots.

Actual Results:
 Dom0 reboots

Expected Results:
 Dom0 works fine.

Additional info:
A fix is available in upstream code, with good feedback from the customer:
http://lists.xensource.com/archives/html/xen-changelog/2007-03/msg00446.html
Comment 1 Flavio Leitner 2007-09-18 09:36:14 EDT
Created attachment 198351 [details]
Backported patch from upstream to RHEL5
Comment 7 RHEL Product and Program Management 2008-02-01 17:40:29 EST
This request was evaluated by Red Hat Product Management for
inclusion, but this component is not scheduled to be updated in
the current Red Hat Enterprise Linux release. If you would like
this request to be reviewed for the next minor release, ask your
support representative to set the next rhel-x.y flag to "?".
Comment 8 Don Domingo 2008-02-03 18:34:20 EST
Adding the same release note to the RHEL5.2 release notes under "Known Issues".
Comment 11 Don Domingo 2008-04-01 22:12:13 EDT
Hi,
the RHEL5.2 release notes will be dropped to translation on April 15, 2008, at
which point no further additions or revisions will be entertained.

a mockup of the RHEL5.2 release notes can be viewed at the following link:
http://intranet.corp.redhat.com/ic/intranet/RHEL5u2relnotesmockup.html

please use the aforementioned link to verify if your bugzilla is already in the
release notes (if it needs to be). each item in the release notes contains a
link to its original bug; as such, you can search through the release notes by
bug number.

Cheers,
Don
Comment 12 RHEL Product and Program Management 2008-06-09 18:00:23 EDT
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.
Comment 13 Bill Burns 2008-06-26 07:29:59 EDT
I have tried to reproduce this on RHEL 5.2 x86_64 and the problem appears to
have been solved by other changes. I did over 1150 i386 paravirt guest installs
and then did over 1150 x86_64 paravirt guest installs with no issue. Marking
this as closed in current release. Please try and reproduce on RHEL 5.2 and
reopen if problems are encountered.
 Thanks
Comment 17 Chris Lalancette 2010-07-19 09:49:32 EDT
Clearing out old flags for reporting purposes.

Chris Lalancette
