Description of problem:

I am not sure if this is a bug in the guest kernel or in the virtualization APIs. When legitimate block attach/detach commands are run consecutively, the guest crashes with the following backtrace:

kobject_add failed for xvdaa with -EEXIST, don't try to register things with the same name in the same directory.
Call Trace:
 [<ffffffff801497bf>] kobject_add+0x16e/0x199
 [<ffffffff8005a64d>] exact_lock+0x0/0x14
 [<ffffffff8009d91a>] keventd_create_kthread+0x0/0xc4
 [<ffffffff8010444e>] register_disk+0x43/0x199
 [<ffffffff8009d91a>] keventd_create_kthread+0x0/0xc4
 [<ffffffff80140e92>] add_disk+0x34/0x3d
 [<ffffffff881e3e16>] :xen_vbd:backend_changed+0x10c/0x18f
 [<ffffffff88199115>] :xen_platform_pci:xenbus_read_driver_state+0x26/0x3b
 [<ffffffff88197e51>] :xen_platform_pci:xenwatch_thread+0x0/0x135
 [<ffffffff88197295>] :xen_platform_pci:xenwatch_handle_callback+0x15/0x48
 [<ffffffff88197f6d>] :xen_platform_pci:xenwatch_thread+0x11c/0x135
 [<ffffffff8009db32>] autoremove_wake_function+0x0/0x2e
 [<ffffffff8009d91a>] keventd_create_kthread+0x0/0xc4
 [<ffffffff80032360>] kthread+0xfe/0x132
 [<ffffffff8005dfb1>] child_rip+0xa/0x11
 [<ffffffff8009d91a>] keventd_create_kthread+0x0/0xc4
 [<ffffffff80032262>] kthread+0x0/0x132
 [<ffffffff8005dfa7>] child_rip+0x0/0x11
Unable to handle kernel NULL pointer dereference at 0000000000000010
RIP: [<ffffffff8010748a>] create_dir+0x11/0x1cf
PGD fee8067 PUD 1081e067 PMD 0
Oops: 0000 [1] SMP
last sysfs file: /block/xvdaa/dev
CPU 0
Modules linked in: autofs4 hidp rfcomm l2cap bluetooth sunrpc ipv6 xfrm_nalgo crypto_api dm_multipath scsi_dh video hwmon backlight sbs i2c_ec button battery asus_acpi acpi_memhotplug ac parport_pc lp parport floppy xen_vnif xen_balloon xen_vbd i2c_piix4 i2c_core serio_raw pcspkr xen_platform_pci 8139too 8139cp mii ide_cd cdrom dm_snapshot dm_zero dm_mirror dm_log dm_mod ata_piix libata sd_mod scsi_mod ext3 jbd uhci_hcd ohci_hcd ehci_hcd
Pid: 633, comm: xenwatch Not tainted 2.6.18-126.el5 #1
RIP:
0010:[<ffffffff8010748a>]  [<ffffffff8010748a>] create_dir+0x11/0x1cf
RSP: 0000:ffff81001dfadda0  EFLAGS: 00010282
RAX: ffff810010fc7e70 RBX: ffff81001bc888a0 RCX: ffff81001dfaddd8
RDX: ffff81001bc888a8 RSI: 0000000000000000 RDI: ffff81001bc888a0
RBP: ffff81001bc888a0 R08: 00000000000000a0 R09: 000000000000003f
R10: ffffffff8009d91a R11: 0000000000000000 R12: ffff81001bc888a0
R13: ffff81001dfaddd8 R14: 0000000000000000 R15: ffff810010fc7e70
FS:  00002adc5a2b7240(0000) GS:ffffffff803ac000(0000) knlGS:0000000000000000
CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 0000000000000010 CR3: 00000000110c1000 CR4: 00000000000006e0
Process xenwatch (pid: 633, threadinfo ffff81001dfac000, task ffff81001df15100)
Stack:
 ffff81001bc888a0 ffff81001bc886b0 ffff81001bc888a0 0000000000000282
 0000000000000000 ffffffff80107a13 ffff810010fc7e70 0000000000000000
 ffff81001bc888a0 ffffffff8014972d ffff810010fc7e70 ffff810010e26000
Call Trace:
 [<ffffffff80107a13>] sysfs_create_dir+0x58/0x76
 [<ffffffff8014972d>] kobject_add+0xdc/0x199
 [<ffffffff8009d91a>] keventd_create_kthread+0x0/0xc4
 [<ffffffff8013d139>] blk_register_queue+0x33/0x77
 [<ffffffff881e3e16>] :xen_vbd:backend_changed+0x10c/0x18f
 [<ffffffff88199115>] :xen_platform_pci:xenbus_read_driver_state+0x26/0x3b
 [<ffffffff88197e51>] :xen_platform_pci:xenwatch_thread+0x0/0x135
 [<ffffffff88197295>] :xen_platform_pci:xenwatch_handle_callback+0x15/0x48
 [<ffffffff88197f6d>] :xen_platform_pci:xenwatch_thread+0x11c/0x135
 [<ffffffff8009db32>] autoremove_wake_function+0x0/0x2e
 [<ffffffff8009d91a>] keventd_create_kthread+0x0/0xc4
 [<ffffffff80032360>] kthread+0xfe/0x132
 [<ffffffff8005dfb1>] child_rip+0xa/0x11
 [<ffffffff8009d91a>] keventd_create_kthread+0x0/0xc4
 [<ffffffff80032262>] kthread+0x0/0x132
 [<ffffffff8005dfa7>] child_rip+0x0/0x11
Code: 48 8b 7e 10 48 89 d3 48 81 c7 b8 00 00 00 e8 0c c7 f5 ff fc
RIP [<ffffffff8010748a>] create_dir+0x11/0x1cf RSP <ffff81001dfadda0>
CR2: 0000000000000010
<0>Kernel panic - not syncing: Fatal exception
Version-Release number of selected component (if applicable):

# rpm -qa | grep xen
xen-devel-3.0.3-79.el5
xen-devel-3.0.3-79.el5
xen-libs-3.0.3-79.el5
xen-3.0.3-79.el5
xen-debuginfo-3.0.3-79.el5
kernel-xen-2.6.18-126.el5
xen-debuginfo-3.0.3-79.el5
kernel-xen-devel-2.6.18-126.el5
xen-libs-3.0.3-79.el5

How reproducible:
Very.

Steps to Reproduce:
1. Install a 5.3 dom0 and guest.
2. Open a console to the guest.
3. Create an image file:
   dd if=/dev/zero of=/var/lib/xen/images/block1 bs=256M count=1000
4. Run attach/detach in a tight loop:
   for i in $(seq 1 20); do xm block-attach $guest tap:aio:/var/lib/xen/images/block1 /dev/xvdaa w; xm block-detach $guest /dev/xvdaa; done
5. Look at the console.

Actual results:
The guest crashes.

Expected results:
It should not crash.
Additional info:

I tested this on an x86_64 host with x86_64 PV, x86_64 HVM, and i386 HVM guests (there was no i386 PV guest on the machine due to another bug). They all crash, and xend has to be restarted before the guest can be destroyed. In the case of the i386 HVM guest, things got worse: I could not get it destroyed at all, and ended up with errors like

[2008-12-12 00:51:37 xend 1261] DEBUG (XendDomain:208) Cannot recreate information for dying domain 87. Xend will ignore this domain from now on.
Just as an FYI, this is probably a race condition. Gurhan told me (and my own tests confirm) that adding a sleep in the for loop above allows it to work. xenstore is probably adding and removing nodes very quickly, and we race inside the callback handlers in the guest. We can probably fix it up with a spinlock; after all, attaching a disk is neither a common nor a performance-critical operation.

Chris Lalancette
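For reference, the loop from the steps to reproduce with the sleep workaround folded in can be sketched as below. The guest name and the one-second delay are placeholders (the exact delay needed may vary), and XM defaults to "echo xm" so the sketch only prints the commands; set XM=xm on a real dom0 with the guest running.

```shell
#!/bin/sh
# Sketch of the attach/detach reproducer loop with the sleep workaround.
# Hypothetical defaults: override GUEST, DELAY, and XM as needed.
XM="${XM:-echo xm}"                 # set XM=xm on a real Xen dom0
GUEST="${GUEST:-rhel5-guest}"       # hypothetical guest name
IMG=/var/lib/xen/images/block1

attach_detach_cycle() {
    $XM block-attach "$GUEST" tap:aio:"$IMG" /dev/xvdaa w
    sleep "${DELAY:-1}"             # the pause that avoids the race
    $XM block-detach "$GUEST" /dev/xvdaa
    sleep "${DELAY:-1}"
}

# Only run the full loop when invoked with the "run" argument,
# e.g.:  XM=xm GUEST=myguest sh repro.sh run
if [ "${1:-}" = "run" ]; then
    for i in $(seq 1 20); do
        attach_detach_cycle
    done
fi
```

Without the "run" argument the script only defines the function, so it is safe to source or dry-run anywhere.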
Created attachment 362527 [details]
reproducer

It is still possible to get errors in block-attach/detach. The attached script is a bit more complicated than needed because it started as a Windows testcase, but it is enough to reproduce the issue. It runs attach/detach cycles of 5 devices, with decent pauses in between. After 50-150 attach/detach cycles (which means 250-750 attaches and the same number of detaches) it starts producing unhelpful output:

Starting cycle 618...
xm block-attach RHEL5-32-HVM phy:/dev/loop2 xvdd r
Usage: xm block-attach <Domain> <BackDev> <FrontDev> <Mode>

Create a new virtual block device.

xm block-attach RHEL5-32-HVM phy:/dev/loop3 xvde r
Usage: xm block-attach <Domain> <BackDev> <FrontDev> <Mode>

Create a new virtual block device.

xm block-attach RHEL5-32-HVM phy:/dev/loop4 xvdf r
Usage: xm block-attach <Domain> <BackDev> <FrontDev> <Mode>

Create a new virtual block device.

xm block-attach RHEL5-32-HVM phy:/dev/loop5 xvdg r
Usage: xm block-attach <Domain> <BackDev> <FrontDev> <Mode>

In the last run I made, one device attach failed on cycle 85, two on cycle 86, three on cycle 87, four on cycle 88, and all five from cycle 89 on. Unfortunately I wasn't cunning enough to save a log or a xenstore dump.
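The core of the attached reproducer looks roughly like the sketch below. This is a simplified reconstruction, not the attachment itself: the device/VBD pairs are the ones visible in the log above (the real script uses five devices and many more cycles), the pause lengths are guesses, and XM defaults to "echo xm" so it dry-runs anywhere.

```shell
#!/bin/sh
# Rough sketch of attachment 362527: repeatedly attach and detach several
# loop-backed read-only devices, pausing between the attach and detach
# phases of each cycle. All defaults here are hypothetical placeholders.
XM="${XM:-echo xm}"                 # set XM=xm on a real Xen dom0
GUEST="${GUEST:-RHEL5-32-HVM}"
PAIRS="loop2:xvdd loop3:xvde loop4:xvdf loop5:xvdg"

run_cycles() {
    c=1
    while [ "$c" -le "${1:-10}" ]; do
        echo "Starting cycle $c..."
        for p in $PAIRS; do
            # ${p%:*} is the loop device, ${p#*:} the frontend VBD name
            $XM block-attach "$GUEST" "phy:/dev/${p%:*}" "${p#*:}" r
        done
        sleep "${DELAY:-1}"         # "decent pauses"; real timing unknown
        for p in $PAIRS; do
            $XM block-detach "$GUEST" "${p#*:}"
        done
        sleep "${DELAY:-1}"
        c=$((c + 1))
    done
}

# e.g.:  XM=xm CYCLES=1000 sh cycle.sh run
if [ "${1:-}" = "run" ]; then
    run_cycles "${CYCLES:-1000}"
fi
```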
The above behavior was apparently in bug 217853, which was fixed by adding -f. It's possible that fixing the guest-side (domU) bug would get rid of the problem once and for all.
The protocols among the participants (xenstore, blktapctrl, dom0/domU, etc.; see http://wiki.xensource.com/xenwiki/blktap#head-47b9cac49ceb0351f57917988f1020a435c680a9 for a detailed architecture diagram) seem to be fertile ground for many races. We apparently did try to patch up some of them, but I don't think they can all be fixed under this design: there are too many agents to synchronize in incremental steps.