Bug 476158

Summary: rapid consecutive block attach/detach commands garble distributed vbd state

Product: Red Hat Enterprise Linux 5 | Reporter: Gurhan Ozen <gozen>
Component: kernel-xen | Assignee: Xen Maintenance List <xen-maint>
Status: CLOSED CANTFIX | QA Contact: Martin Jenner <mjenner>
Severity: high | Docs Contact:
Priority: high
Version: 5.3 | CC: jburke, lersek, pbonzini, xen-maint
Target Milestone: rc
Target Release: ---
Hardware: All
OS: Linux
Whiteboard:
Fixed In Version: | Doc Type: Bug Fix
Doc Text: | Story Points: ---
Clone Of: | Environment:
Last Closed: 2011-05-17 14:16:49 UTC | Type: ---
Regression: --- | Mount Type: ---
Documentation: --- | CRM:
Verified Versions: | Category: ---
oVirt Team: --- | RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- | Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 514491
Attachments: reproducer (attachment 362527)
Description Gurhan Ozen 2008-12-12 05:24:54 UTC
Additional info: I tested this on an x86_64 host with x86_64 PV and HVM guests and an i386 HVM guest (no i386 PV guest was available on the machine because of another bug). They all crash, and xend has to be restarted before the guest can be destroyed. However, in the case of the i386 HVM guest things get worse: I could not get it destroyed at all, and ended up with errors like

[2008-12-12 00:51:37 xend 1261] DEBUG (XendDomain:208) Cannot recreate information for dying domain 87. Xend will ignore this domain from now on.

Just as an FYI, this is probably a race condition. Gurhan told me (and my own tests confirm) that adding a sleep in the for loop above allows it to work. Probably xenstore is adding and removing nodes very quickly, and we race inside the callback handlers in the guest. We can probably fix it up with a spinlock; after all, attaching a disk is neither a common nor a performance-critical operation.
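For reference, a minimal sketch of the kind of loop described above (the guest name, backing device, and frontend device name are placeholders, not taken from the actual test):

#!/bin/bash
# Hypothetical single-device loop: attach and detach in quick succession.
DOMAIN=rhel5-guest            # placeholder guest name
for i in $(seq 1 1000); do
    xm block-attach "$DOMAIN" phy:/dev/loop0 xvdd r
    # sleep 1                 # a sleep here is what masks the race, as noted above
    xm block-detach "$DOMAIN" xvdd
done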
Chris Lalancette
Created attachment 362527 [details]
reproducer
It is still possible to get errors in block-attach/detach.
The attached script is a bit more complicated than needed because it started as a Windows testcase, but it is enough to reproduce the issue. It will run attach/detach cycles of 5 devices, with decent pauses in the middle.
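In outline, each cycle does something like the following (a rough sketch rather than the attached script itself; the guest name and loop devices match the output below, while the pause lengths are only illustrative):

#!/bin/bash
# Simplified sketch of the reproducer's attach/detach cycle (not the attached script itself).
DOMAIN=RHEL5-32-HVM
FRONTENDS=(xvdc xvdd xvde xvdf xvdg)
cycle=0
while true; do
    cycle=$((cycle + 1))
    echo "Starting cycle $cycle..."
    for i in 0 1 2 3 4; do
        xm block-attach "$DOMAIN" "phy:/dev/loop$((i + 1))" "${FRONTENDS[$i]}" r
    done
    sleep 5                   # "decent pause" before detaching again
    for i in 0 1 2 3 4; do
        xm block-detach "$DOMAIN" "${FRONTENDS[$i]}"
    done
    sleep 5
done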
After 50-150 attach/detach cycles (which means 250-750 attaches and the same number of detaches) it will start producing unhelpful output:
Starting cycle 618...
xm block-attach RHEL5-32-HVM phy:/dev/loop2 xvdd r
Usage: xm block-attach <Domain> <BackDev> <FrontDev> <Mode>
Create a new virtual block device.
xm block-attach RHEL5-32-HVM phy:/dev/loop3 xvde r
Usage: xm block-attach <Domain> <BackDev> <FrontDev> <Mode>
Create a new virtual block device.
xm block-attach RHEL5-32-HVM phy:/dev/loop4 xvdf r
Usage: xm block-attach <Domain> <BackDev> <FrontDev> <Mode>
Create a new virtual block device.
xm block-attach RHEL5-32-HVM phy:/dev/loop5 xvdg r
Usage: xm block-attach <Domain> <BackDev> <FrontDev> <Mode>
In the last run I made, I noticed that one device attach failed on cycle 85, two on cycle 86, three on cycle 87, four on cycle 88, and all five from cycle 89 onwards. Unfortunately I wasn't cunning enough to save a log or a xenstore dump.
The above behavior apparently also showed up in bug 217853, which was fixed by adding -f. It's possible that fixing the guest-side (domU) bug would get rid of the problem once and for all. The protocols among the participants (xenstore, blktapctrl, dom0/domU, etc.; see http://wiki.xensource.com/xenwiki/blktap#head-47b9cac49ceb0351f57917988f1020a435c680a9 for a detailed architecture diagram) seem to be fertile ground for races. We apparently did try to patch up some of those, but I don't think they can all be fixed under this design -- there are too many agents to synchronize in incremental steps.
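For reference, the -f mentioned above is the force flag of xm block-detach (the domain and device names below are just examples):

# Force the detach even when the device teardown does not complete cleanly.
xm block-detach RHEL5-32-HVM xvdd -f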