Description of problem:
Repeatedly creating and destroying a domain causes Dom0 to reboot. The problem occurs after about 1000 create/destroy cycles. We did not observe it on i386 or ia64.

From logs:

Kernel BUG at drivers/xen/xenbus/xenbus_dev.c:112
invalid opcode: 0000 [1] SMP
last sysfs file: /class/net/lo/ifindex
CPU 0
Pid: 30015, comm: xenstore-rm Not tainted 2.6.18-36.el5xen #1
RIP: e030:[<ffffffff8039d780>] [<ffffffff8039d780>] queue_reply+0x61/0x93
RSP: e02b:ffff8800347f3e88 EFLAGS: 00010206
RAX: 0000000000001030 RBX: ffff880033536000 RCX: 0000000000001020
RDX: 0000000000001000 RSI: ffff880033537021 RDI: ffff880034ffe048
RBP: ffff880034ffc000 R08: ffff880034ffe038 R09: ffff880034ffc024
R10: ffff880034930a80 R11: ffff880033ee00c0 R12: ffff880034ffe048
R13: 0000000000001020 R14: ffff880034ffc024 R15: 000000001ccad100
FS: 00002aaaaaac2e50(0000) GS:ffffffff80599000(0000) knlGS:0000000000000000
CS: e033 DS: 0000 ES: 0000
Process xenstore-rm (pid: 30015, threadinfo ffff8800347f2000, task ffff880033ee00c0)
Stack: ffff880033536000 0000000000000001 ffff880034ffc000 ffff880033536000
 0000000000000000 ffffffff8039db26 ffff880033ee00c0 000000000000000d
 ffff88003478a800 ffff88003478a800
Call Trace:
 [<ffffffff8039db26>] xenbus_dev_write+0x16c/0x301
 [<ffffffff802162de>] vfs_write+0xce/0x174
 [<ffffffff80216b2b>] sys_write+0x45/0x6e
 [<ffffffff8025d2f1>] tracesys+0xa7/0xb2

Relevant Code:

   102 static void queue_reply(struct xenbus_dev_data *u,
   103                         char *data, unsigned int len)
   104 {
   105         int i;
   106
   107         mutex_lock(&u->reply_mutex);
   108
   109         for (i = 0; i < len; i++, u->read_prod++)
   110                 u->read_buffer[MASK_READ_IDX(u->read_prod)] = data[i];
   111
=> 112         BUG_ON((u->read_prod - u->read_cons) > sizeof(u->read_buffer));
   113
   114         mutex_unlock(&u->reply_mutex);
   115
   116         wake_up(&u->read_waitq);
   117 }

Version-Release number of selected component (if applicable):
kernel-xen-2.6.18-36.el5-x86_64

How reproducible:
Always, after about 1100 create/destroy cycles.

Steps to Reproduce:
1. Create an HVM domain with virt-install:
   # virt-install --name=test --ram=350 --vcpus=2 --file=/root/test.img --file-size=10 --vnc --debug --noautoconsole --paravirt --location=ftp://10.131.236.20/rhel5.1b_x86_64
2. After 5 seconds, check memory usage:
   # free
3. Check the test domain with xm list:
   # xm list
4. Destroy the test domain:
   # virsh destroy test
5. Repeat steps 1 - 4.
6. After about 1100 iterations, Dom0 reboots.

Actual Results:
Dom0 reboots.

Expected Results:
Dom0 works fine.

Additional info:
A fix is available in the upstream code, and the customer has given good feedback on it:
http://lists.xensource.com/archives/html/xen-changelog/2007-03/msg00446.html
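For anyone reading the BUG_ON at line 112 above, the following is a minimal userspace sketch of the invariant it checks, not the kernel code and not the upstream patch. It assumes a 4096-byte power-of-two buffer with free-running producer/consumer indices; the names RING_SIZE, MASK_IDX, struct ring, ring_used, ring_put_unchecked and ring_put_checked are hypothetical and exist only for illustration. The unchecked variant copies first and tests afterwards, like queue_reply() does, so a reply larger than the free space has already overwritten unread data by the time the check runs. The checked variant is just one possible way to avoid that and may differ from what upstream actually did.

```c
#include <assert.h>
#include <stddef.h>

#define RING_SIZE 4096u                        /* assumed size; must be a power of two */
#define MASK_IDX(idx) ((idx) & (RING_SIZE - 1))

/* Hypothetical stand-in for the read side of struct xenbus_dev_data. */
struct ring {
    char buf[RING_SIZE];
    unsigned int prod;                         /* free-running producer index */
    unsigned int cons;                         /* free-running consumer index */
};

/* Bytes queued but not yet consumed; unsigned wraparound keeps this correct. */
static unsigned int ring_used(const struct ring *r)
{
    return r->prod - r->cons;
}

/*
 * Same shape as the loop in queue_reply(): copy first, check afterwards.
 * Returns 1 if the invariant still holds, 0 if unread data was overwritten
 * (the case where the kernel hits BUG_ON).
 */
static int ring_put_unchecked(struct ring *r, const char *data, unsigned int len)
{
    unsigned int i;

    for (i = 0; i < len; i++, r->prod++)
        r->buf[MASK_IDX(r->prod)] = data[i];

    return ring_used(r) <= sizeof(r->buf);
}

/*
 * Illustrative alternative: refuse the write when it does not fit.
 * Returns 0 on success, -1 if there is not enough free space.
 */
static int ring_put_checked(struct ring *r, const char *data, unsigned int len)
{
    unsigned int i;

    if (len > sizeof(r->buf) - ring_used(r))
        return -1;

    for (i = 0; i < len; i++, r->prod++)
        r->buf[MASK_IDX(r->prod)] = data[i];

    /* Post-condition: the same invariant that line 112 asserts. */
    assert(ring_used(r) <= sizeof(r->buf));
    return 0;
}
```

With a full ring (prod - cons == RING_SIZE), ring_put_checked() rejects even a single extra byte, while ring_put_unchecked() accepts it and reports the broken invariant only after the damage is done.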
Created attachment 198351
Backported patch from upstream to RHEL5
This request was evaluated by Red Hat Product Management for inclusion, but this component is not scheduled to be updated in the current Red Hat Enterprise Linux release. If you would like this request to be reviewed for the next minor release, ask your support representative to set the next rhel-x.y flag to "?".
Adding the same release note to the RHEL5.2 release notes under "Known Issues".
Hi,

The RHEL5.2 release notes will be dropped to translation on April 15, 2008, at which point no further additions or revisions will be entertained.

A mockup of the RHEL5.2 release notes can be viewed at the following link:
http://intranet.corp.redhat.com/ic/intranet/RHEL5u2relnotesmockup.html

Please use the aforementioned link to verify whether your bugzilla is already in the release notes (if it needs to be). Each item in the release notes contains a link to its original bug; as such, you can search through the release notes by bug number.

Cheers,
Don
This request was evaluated by Red Hat Product Management for inclusion in a Red Hat Enterprise Linux maintenance release. Product Management has requested further review of this request by Red Hat Engineering, for potential inclusion in a Red Hat Enterprise Linux Update release for currently deployed products. This request is not yet committed for inclusion in an Update release.
I have tried to reproduce this on RHEL 5.2 x86_64 and the problem appears to have been solved by other changes. I did over 1150 i386 paravirt guest installs and then did over 1150 x86_64 paravirt guest installs with no issue. Marking this as closed in current release. Please try and reproduce on RHEL 5.2 and reopen if problems are encountered. Thanks
Clearing out old flags for reporting purposes. Chris Lalancette