Bug 730609

Summary: Guest can not init virtio serial port when do hotplug virtio serial frequently
Product: Red Hat Enterprise Linux 6 Reporter: Joy Pu <ypu>
Component: kernelAssignee: Amit Shah <amit.shah>
Status: CLOSED WORKSFORME QA Contact: Virtualization Bugs <virt-bugs>
Severity: medium Docs Contact:
Priority: medium    
Version: 6.2CC: areis, juzhang, mkenneth, shuang, tburke, virt-maint
Target Milestone: rc   
Target Release: ---   
Hardware: Unspecified   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2012-01-12 08:33:11 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Joy Pu 2011-08-15 04:57:31 UTC
Description:
Hot add and delete one serial port 100 times with no sleep will cause the RHEL 6.1 failed to init virtio serial port. If do the some operate with 1s sleep, the guest will be OK.


Version-Release number of selected component (if applicable):
kernel: 
2.6.32-178.el6.x86_64
kvm&qemu: 
gpxe-roms-qemu-0.9.7-6.7.el6.noarch
qemu-kvm-debuginfo-0.12.1.2-2.177.el6.x86_64
qemu-img-0.12.1.2-2.177.el6.x86_64
qemu-kvm-tools-0.12.1.2-2.177.el6.x86_64
qemu-kvm-0.12.1.2-2.177.el6.x86_64


How reproducible:
Always

Steps to Reproduce:
1. boot up a guest
2. hot add and hot delete the virtio serial port from monitor 100 times without sleep:
(qemu)device_del serialport-00
(qemu)device_add virtserialport,bus=virtio-serial-pci0.0,id=serialport-00,name=serialport-00


Actual results:
guest kernel panic
Expected results:
The hot add and delete should be OK

Additional info:
1. call trace:
2011-08-15 12:01:31: ------------[ cut here ]------------
2011-08-15 12:01:31: WARNING: at fs/sysfs/dir.c:491 sysfs_add_one+0xc9/0x130() (Not tainted)
2011-08-15 12:01:31: Hardware name: KVM
2011-08-15 12:01:31: sysfs: cannot create duplicate filename '/devices/pci0000:00/0000:00:05.0/virtio2/virtio-ports/vport0p2/name'
2011-08-15 12:01:31: Modules linked in: sunrpc ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4 iptable_filter ip_tables ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables ipv6 dm_mirror dm_region_hash dm_log ppdev parport_pc parport sg microcode virtio_console i2c_piix4 i2c_core ext4 mbcache jbd2 virtio_net virtio_blk sr_mod cdrom virtio_pci virtio_ring virtio pata_acpi ata_generic ata_piix dm_mod [last unloaded: freq_table]
2011-08-15 12:01:31: Pid: 7, comm: events/0 Not tainted 2.6.32-131.0.15.el6.x86_64 #1
2011-08-15 12:01:31: Call Trace:
2011-08-15 12:01:31: Call Trace:
2011-08-15 12:01:31:  [<ffffffff81067137>] ? warn_slowpath_common+0x87/0xc0
2011-08-15 12:01:31:  [<ffffffff81067226>] ? warn_slowpath_fmt+0x46/0x50
2011-08-15 12:01:31:  [<ffffffff811e7039>] ? sysfs_add_one+0xc9/0x130
2011-08-15 12:01:31:  [<ffffffff811e53e2>] ? sysfs_add_file_mode+0x62/0xb0
2011-08-15 12:01:31:  [<ffffffff811e8611>] ? internal_create_group+0xc1/0x1a0
2011-08-15 12:01:31:  [<ffffffff811e8723>] ? sysfs_create_group+0x13/0x20
2011-08-15 12:01:31:  [<ffffffffa0130fa0>] ? control_work_handler+0x1a0/0x478 [virtio_console]
2011-08-15 12:01:31:  [<ffffffffa0130e00>] ? control_work_handler+0x0/0x478 [virtio_console]
2011-08-15 12:01:31:  [<ffffffff81088830>] ? worker_thread+0x170/0x2a0
2011-08-15 12:01:31:  [<ffffffff8108e160>] ? autoremove_wake_function+0x0/0x40
2011-08-15 12:01:31:  [<ffffffff810886c0>] ? worker_thread+0x0/0x2a0
2011-08-15 12:01:31:  [<ffffffff8108ddf6>] ? kthread+0x96/0xa0
2011-08-15 12:01:31:  [<ffffffff8100c1ca>] ? child_rip+0xa/0x20
2011-08-15 12:01:31:  [<ffffffff8108dd60>] ? kthread+0x0/0xa0
2011-08-15 12:01:31:  [<ffffffff8100c1c0>] ? child_rip+0x0/0x20
2011-08-15 12:01:31: ---[ end trace 1565761bbca940d8 ]---
2011-08-15 12:01:31: virtio-ports vport0p2: Error -17 creating sysfs device attributes
2011-08-15 12:01:31: ------------[ cut here ]------------
2011-08-15 12:01:31: WARNING: at fs/sysfs/dir.c:491 sysfs_add_one+0xc9/0x130() (Tainted: G        W  ----------------  )
2011-08-15 12:01:31: Hardware name: KVM
2011-08-15 12:01:31: sysfs: cannot create duplicate filename '/devices/pci0000:00/0000:00:05.0/virtio2/virtio-ports/vport0p2/name'
2011-08-15 12:01:31: Modules linked in: sunrpc ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4 iptable_filter ip_tables ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables ipv6 dm_mirror dm_region_hash dm_log ppdev parport_pc parport sg microcode virtio_console i2c_piix4 i2c_core ext4 mbcache jbd2 virtio_net virtio_blk sr_mod cdrom virtio_pci virtio_ring virtio pata_acpi ata_generic ata_piix dm_mod [last unloaded: freq_table]
2011-08-15 12:01:31: Pid: 7, comm: events/0 Tainted: G        W  ----------------   2.6.32-131.0.15.el6.x86_64 #1
2011-08-15 12:01:31: Call Trace:
2011-08-15 12:01:31:  [<ffffffff81067137>] ? warn_slowpath_common+0x87/0xc0
2011-08-15 12:01:31:  [<ffffffff81067226>] ? warn_slowpath_fmt+0x46/0x50
2011-08-15 12:01:31:  [<ffffffff811e7039>] ? sysfs_add_one+0xc9/0x130
2011-08-15 12:01:31:  [<ffffffff811e53e2>] ? sysfs_add_file_mode+0x62/0xb0
2011-08-15 12:01:31:  [<ffffffff811e8611>] ? internal_create_group+0xc1/0x1a0
2011-08-15 12:01:31:  [<ffffffff811e8723>] ? sysfs_create_group+0x13/0x20
2011-08-15 12:01:31:  [<ffffffffa0130fa0>] ? control_work_handler+0x1a0/0x478 [virtio_console]
2011-08-15 12:01:31:  [<ffffffffa0130e00>] ? control_work_handler+0x0/0x478 [virtio_console]
2011-08-15 12:01:31:  [<ffffffff81088830>] ? worker_thread+0x170/0x2a0
2011-08-15 12:01:31:  [<ffffffff8108e160>] ? autoremove_wake_function+0x0/0x40
2011-08-15 12:01:31:  [<ffffffff810886c0>] ? worker_thread+0x0/0x2a0
2011-08-15 12:01:31:  [<ffffffff8108ddf6>] ? kthread+0x96/0xa0
2011-08-15 12:01:31:  [<ffffffff8100c1ca>] ? child_rip+0xa/0x20
2011-08-15 12:01:31:  [<ffffffff8108dd60>] ? kthread+0x0/0xa0
2011-08-15 12:01:31:  [<ffffffff8100c1c0>] ? child_rip+0x0/0x20
2011-08-15 12:01:31: ---[ end trace 1565761bbca940d9 ]---
2011-08-15 12:01:32: virtio
2011-08-15 12:01:32: -ports vport0p2: Error -17 creating sysfs device attributes


2. cmdline
/usr/auto/autotest-devel/client/tests/kvm/qemu -name 'vm1' -chardev socket,id=human_monitor_id_humanmonitor1,path=/tmp/monitor-humanmonitor1-20110815-115428-RGJi,server,nowait -mon chardev=human_monitor_id_humanmonitor1,mode=readline -chardev socket,id=serial_id_20110815-115428-RGJi,path=/tmp/serial-20110815-115428-RGJi,server,nowait -device isa-serial,chardev=serial_id_20110815-115428-RGJi -drive file='/usr/auto/autotest-devel/client/tests/kvm/images/RHEL-Server-6.1-64-virtio.qcow2',index=0,if=none,id=drive-virtio-disk1,media=disk,cache=none,format=qcow2,aio=native -device virtio-blk-pci,bus=pci.0,addr=0x4,drive=drive-virtio-disk1,id=virtio-disk1 -device virtio-net-pci,netdev=idlp1oo7,mac=9a:4a:7b:7e:0d:2f,id=ndev00idlp1oo7,bus=pci.0,addr=0x3 -netdev tap,id=idlp1oo7,vhost=on,ifname='t0-115428-RGJi',script='/usr/auto/autotest-devel/client/tests/kvm/scripts/qemu-ifup-switch',downscript='no' -m 3072 -smp 1,cores=1,threads=1,sockets=1 -cpu cpu64-rhel6,+sse2,+x2apic -spice port=8000,disable-ticketing -vga qxl -rtc base=utc,clock=host,driftfix=none -M rhel6.1.0 -boot order=cdn,once=c,menu=off   -usbdevice tablet -no-kvm-pit-reinjection -device virtio-serial-pci,id=virtio-serial-pci0 -chardev socket,path=/tmp/virtio-console-_EZ5gR/0,id=vc0,server,nowait -device virtconsole,chardev=vc0,name=console-0,id=console-0,bus=virtio-serial-pci0.0 -chardev socket,path=/tmp/virtio-console-_EZ5gR/1,id=vc1,server,nowait -device virtconsole,chardev=vc1,name=console-1,id=console-1,bus=virtio-serial-pci0.0 -device virtio-serial-pci,id=virtio-serial-pci1 -chardev socket,path=/tmp/virtio-console-_EZ5gR/2,id=vs2,server,nowait -device virtserialport,chardev=vs2,name=serialport-2,id=serialport-2,bus=virtio-serial-pci1.0 -chardev socket,path=/tmp/virtio-console-_EZ5gR/3,id=vs3,server,nowait -device virtserialport,chardev=vs3,name=serialport-3,id=serialport-3,bus=virtio-serial-pci1.0 -enable-kvm

Comment 2 Amit Shah 2011-09-05 09:33:39 UTC
These are warnings where it looks like creating sysfs files takes some time, while the port is already yanked off.

When you say 'guest cannot init virtio serial port', what do you mean exactly?  Do you not see entries in /dev/vport*?  Something else?

The fix will be to take a lock in the port init routine and ensure the init has finished before we delete the port or add a new one at the same id.

Comment 3 Joy Pu 2011-09-30 06:17:57 UTC
(In reply to comment #2)
> These are warnings where it looks like creating sysfs files takes some time,
> while the port is already yanked off.
> 
> When you say 'guest cannot init virtio serial port', what do you mean exactly? 
> Do you not see entries in /dev/vport*?  Something else?
> 
This is hard to test. When I add "ls -l  /dev/vport*" to check it after hotplug its actions changes like with sleep 1s. I think the problem is just as you said. hotplug a device will return before everything in guest is OK. So hotplug and un-hotplug without sleep will interrupt the init process in guest system.

> The fix will be to take a lock in the port init routine and ensure the init has
> finished before we delete the port or add a new one at the same id.

Comment 5 RHEL Program Management 2011-12-13 04:41:06 UTC
This request was evaluated by Red Hat Product Management for inclusion
in a Red Hat Enterprise Linux maintenance release. Product Management has 
requested further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed 
products. This request is not yet committed for inclusion in an Update release.

Comment 8 Amit Shah 2012-01-06 07:32:34 UTC
> In this version, still can see the warning message from guest serial port log,
> but the scripts will report init and delete virtio serial console succeed.

Can you give more details?

Which warning message do you see?

Also, the script reports init and delete succeeded after 100 non-stop runs?  What used to happen in the original kernel?  Did the script just stop?  Guest freeze?

Comment 9 Joy Pu 2012-01-06 08:22:33 UTC
(In reply to comment #8)
> > In this version, still can see the warning message from guest serial port log,
> > but the scripts will report init and delete virtio serial console succeed.
> 
> Can you give more details?
> 
> Which warning message do you see?
> 
> Also, the script reports init and delete succeeded after 100 non-stop runs? 
> What used to happen in the original kernel?  Did the script just stop?  Guest
> freeze?

Sorry, forget to upgrade guest's kernel. After guest kernel update, the warning message is gone.

In the scripts it only judge the init and delete with monitor command output. So in bug report version of qemu kvm and kernel. guest will show warning message and freeze for a while which will lead the monitor command failed and warning message in guest. After update host kernel and qemu kvm, the guest kernel only show warning message but alive. So only warning message shows but monitor command return no error. After update guest kernel the warning message didn't show up and the monitor command return no error.

This is the test version:
host & guest kernel version:
2.6.32-223.el6.test.x86_64
qemu and kvm version:
rpm -qa|grep qemu
qemu-img-0.12.1.2-2.213.el6.x86_64
qemu-kvm-0.12.1.2-2.213.el6.x86_64
qemu-kvm-debuginfo-0.12.1.2-2.213.el6.x86_64
qemu-kvm-tools-0.12.1.2-2.213.el6.x86_64
gpxe-roms-qemu-0.9.7-6.7.el6.noarch

Comment 10 Amit Shah 2012-01-06 08:52:45 UTC
(In reply to comment #9)
> Sorry, forget to upgrade guest's kernel. After guest kernel update, the warning
> message is gone.

Ah OK, that's much better.

> In the scripts it only judge the init and delete with monitor command output.
> So in bug report version of qemu kvm and kernel. guest will show warning
> message and freeze for a while which will lead the monitor command failed and
> warning message in guest. After update host kernel and qemu kvm, the guest
> kernel only show warning message but alive. So only warning message shows but
> monitor command return no error. After update guest kernel the warning message
> didn't show up and the monitor command return no error.

For this test, only guest kernel update is required; host qemu and host kernel don't matter.

So can you run a last quick test: just test with buggy as well as new build of guest kernel without modifying host packages and let me know if with the new kernel warnings as well as freezes are gone?

Thanks for the testing!

Comment 11 Joy Pu 2012-01-06 11:21:17 UTC
(In reply to comment #10)
> (In reply to comment #9)
> > Sorry, forget to upgrade guest's kernel. After guest kernel update, the warning
> > message is gone.
> 
> Ah OK, that's much better.
> 
> > In the scripts it only judge the init and delete with monitor command output.
> > So in bug report version of qemu kvm and kernel. guest will show warning
> > message and freeze for a while which will lead the monitor command failed and
> > warning message in guest. After update host kernel and qemu kvm, the guest
> > kernel only show warning message but alive. So only warning message shows but
> > monitor command return no error. After update guest kernel the warning message
> > didn't show up and the monitor command return no error.
> 
> For this test, only guest kernel update is required; host qemu and host kernel
> don't matter.
> 
> So can you run a last quick test: just test with buggy as well as new build of
> guest kernel without modifying host packages and let me know if with the new
> kernel warnings as well as freezes are gone?
> 
> Thanks for the testing!

Welcome and had more tests with this case with new guest kernel and both old host kernel and new host kernel.  After update guest kernel, seems guest will not freeze again. But the warning message still can be triggered in both old host kernel and new host kernel.The difference is old host kernel seems more easily to trigger the init failed warning in guest.

With new host kernel and the warning message is still shows up sometime. 1000 times hotplug it shows 7 times.

In the old host kernel. The warning message is easily shows up. For 100 times hotplug, it will show up 8~9 times in average. 

Warning message is :
2012-01-06 19:28:30: ------------[ cut here ]------------
2012-01-06 19:28:30: WARNING: at fs/sysfs/dir.c:512 sysfs_add_one+0xc9/0x130() (Tainted: G        W  ----------------  )
2012-01-06 19:28:30: Hardware name: KVM
2012-01-06 19:28:30: sysfs: cannot create duplicate filename '/devices/pci0000:00/0000:00:05.0/virtio2/virtio-ports/vport0p2/name'
2012-01-06 19:28:30: Modules linked in: sunrpc ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4 iptable_filter ip_tables ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables ipv6 dm_mirror dm_region_hash dm_log uinput ppdev parport_pc parport sg microcode virtio_console virtio_net i2c_piix4 i2c_core ext4 mbcache jbd2 virtio_blk sr_mod cdrom virtio_pci virtio_ring virtio pata_acpi ata_generic ata_piix dm_mod [last unloaded: mperf]
2012-01-06 19:28:30: Pid: 7, comm: events/0 Tainted: G        W  ----------------   2.6.32-223.el6.test.x86_64 #1
2012-01-06 19:28:30: Call Trace:
2012-01-06 19:28:30:  [<ffffffff81069997>] ? warn_slowpath_common+0x87/0xc0
2012-01-06 19:28:30:  [<ffffffff81069a86>] ? warn_slowpath_fmt+0x46/0x50
2012-01-06 19:28:30:  [<ffffffff811ed599>] ? sysfs_add_one+0xc9/0x130
2012-01-06 19:28:30:  [<ffffffff811eb882>] ? sysfs_add_file_mode+0x62/0xb0
2012-01-06 19:28:30:  [<ffffffff811eeb71>] ? internal_create_group+0xc1/0x1a0
2012-01-06 19:28:30:  [<ffffffff811eec83>] ? sysfs_create_group+0x13/0x20
2012-01-06 19:28:30:  [<ffffffffa0134fc0>] ? control_work_handler+0x1a0/0x498 [virtio_console]
2012-01-06 19:28:30:  [<ffffffffa0134e20>] ? control_work_handler+0x0/0x498 [virtio_console]
2012-01-06 19:28:30:  [<ffffffff8108b0d0>] ? worker_thread+0x170/0x2a0
2012-01-06 19:28:30:  [<ffffffff81090a10>] ? autoremove_wake_function+0x0/0x40
2012-01-06 19:28:30:  [<ffffffff8108af60>] ? worker_thread+0x0/0x2a0
2012-01-06 19:28:30:  [<ffffffff810906a6>] ? kthread+0x96/0xa0
2012-01-06 19:28:30:  [<ffffffff8100c14a>] ? child_rip+0xa/0x20
2012-01-06 19:28:30:  [<ffffffff81090610>] ? kthread+0x0/0xa0
2012-01-06 19:28:30:  [<ffffffff8100c140>] ? child_rip+0x0/0x20
2012-01-06 19:28:30: ---[ end trace 93ec87a4da16c595 ]---
2012-01-06 19:28:30: virtio-ports vport0p2: Error -17 creating sysfs device attributes

Comment 12 Amit Shah 2012-01-06 11:35:21 UTC
OK, as long as the guest doesn't freeze, we should be fine.

As for triggering this sysfs warning, I'll have to see if there's a way to wait on the sysfs path really going away at unplug time.

Comment 19 Amit Shah 2012-01-12 08:33:11 UTC
Joy has been testing this bug in various scenarios (smp 1, smp 3) and looks like it's no longer reproducible with RHEL6.1 nor Fedora 16 guests.  We don't know yet why it's no longer reproducible, but the only change was the host s/w was upgraded to RHEL6.2.  We can try reproducing on a RHEL6.1 host, but that'll just be an academic exercise.

I'll close the bug for now; if this shows up again, we can work on it.