Bug 1241886
| Summary: | hot plugged pci devices won't appear unless reboot | ||
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 7 | Reporter: | Zhengtong <zhengtli> |
| Component: | qemu-kvm-rhev | Assignee: | Laurent Vivier <lvivier> |
| Status: | CLOSED ERRATA | QA Contact: | Virtualization Bugs <virt-bugs> |
| Severity: | high | Docs Contact: | |
| Priority: | high | ||
| Version: | 7.2 | CC: | bugproxy, dgibson, dzheng, gsun, hannsj_uhl, knoel, lvivier, michen, mrezanin, qzhang, thuth, virt-maint, zhengtli, zhwang |
| Target Milestone: | rc | ||
| Target Release: | 7.2 | ||
| Hardware: | ppc64le | ||
| OS: | Linux | ||
| Whiteboard: | |||
| Fixed In Version: | qemu-kvm-rhev-2.3.0-19.el7 | Doc Type: | Bug Fix |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2015-12-04 16:49:00 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
| Bug Depends On: | 1182027, 1182040 | ||
| Bug Blocks: | 1172478, 1201513 | ||
Description (Zhengtong, 2015-07-10 10:29:33 UTC)
Is rtas_errd running inside the guest? This is necessary for the guest to see hotplug actions.

(In reply to David Gibson from comment #2)
> Is rtas_errd running inside the guest? This is necessary for the guest to
> see hotplug actions.

Yes, running inside the guest:

[root@dhcp71-167 ~]# ps aux | grep rtas
root 3643 0.0 0.0 5376 4096 ? Ss 06:14 0:00 /usr/sbin/rtas_errd
root 5843 0.0 0.0 110784 2816 pts/0 S+ 06:16 0:00 grep --color=auto rtas

and in boot.log:

[root@dhcp71-167 ~]# cat /var/log/boot.log
...
[  OK  ] Started opal_errd (PowerNV platform error handling) Service.
[  OK  ] Started ppc64-diag rtas_errd (platform error handling) Service.
...

Ok. It looks like you are running a RHEL7.1 guest, which might not have a recent enough rtas_errd to work correctly with hotplug. Does the problem also occur with a RHEL7.2 snapshot? Also, do you see anything new in the output of "dmesg" or "journalctl" after you did the hotplug?

(In reply to David Gibson from comment #4)
> It looks like you are running a RHEL7.1 guest, which might not have a recent
> enough rtas_errd to work correctly with hotplug. Does the problem also
> occur with a RHEL7.2 snapshot?

Yes, after trying again with RHEL7.2, I found it also happens with a RHEL7.2 guest.

I just tried to hotplug a virtio-serial device to my guest with the following command:

virsh qemu-monitor-command --hmp thuth-virtio-le "device_add virtio-serial-pci,id=virtio-serial1"

and I got the following messages in the output of "dmesg":

[ 122.904310] virtio-pci 0000:00:02.0: enabling device (0000 -> 0003)
[ 122.904958] virtio-pci 0000:00:02.0: virtio_pci: leaving for legacy driver
[ 123.044829] virtio_console virtio3: Error -2 initializing vqs
[ 123.044907] virtio_console: probe of virtio3 failed with error -2

Maybe that's a hint? Zhengtong, do you get a similar message?

Zhengtong, could you please also check whether the problem also occurs if you try to hot-plug a second device in the same way? ... for me, it seems like it only happens for the first device that I try to hotplug.

(In reply to Thomas Huth from comment #8)
> Zhengtong, could you please also check whether the problem also occurs if
> you try to hot-plug a second device in the same way? ... for me, it seems
> like it only happens for the first device that I try to hotplug.

This is the dmesg info while I hot-plug 3 devices:

(qemu) device_add virtio-serial-pci,id=virtio-serial1
(qemu) device_add virtio-serial-pci,id=virtio-serial2
(qemu) device_add virtio-serial-pci,id=virtio-serial3

dmesg:

[ 122.793888] RTAS: event: 1, Type: Unknown, Severity: 1
[ 123.156661] pci 0000:00:00.0: [1af4:1003] type 00 class 0x078000
[ 123.157143] pci 0000:00:00.0: reg 0x10: [io 0x10000-0x1001f]
[ 123.157297] pci 0000:00:00.0: reg 0x14: [mem 0x00000000-0x00000fff]
[ 123.159349] pci 0000:00:00.0: BAR 1: assigned [mem 0x100a1000000-0x100a1000fff]
[ 123.159404] pci 0000:00:00.0: BAR 0: assigned [io 0x10000-0x1001f]
[ 123.159838] virtio-pci 0000:00:00.0: enabling device (0000 -> 0003)
[ 123.160623] virtio-pci 0000:00:00.0: virtio_pci: leaving for legacy driver
[ 123.174682] virtio_console virtio1: Error -2 initializing vqs
[ 123.174784] virtio_console: probe of virtio1 failed with error -2
[ 130.109147] RTAS: event: 2, Type: Unknown, Severity: 1
[ 130.446936] pci 0000:00:05.0: [1af4:1003] type 00 class 0x078000
[ 130.447267] pci 0000:00:05.0: reg 0x10: [io 0x10000-0x1001f]
[ 130.447414] pci 0000:00:05.0: reg 0x14: [mem 0x00000000-0x00000fff]
[ 130.449154] pci 0000:00:05.0: BAR 1: assigned [mem 0x100a1001000-0x100a1001fff]
[ 130.449221] pci 0000:00:05.0: BAR 0: assigned [io 0x10040-0x1005f]
[ 130.449726] virtio-pci 0000:00:05.0: enabling device (0000 -> 0003)
[ 130.451221] virtio-pci 0000:00:05.0: virtio_pci: leaving for legacy driver
[ 156.748449] RTAS: event: 3, Type: Unknown, Severity: 1
[ 157.135089] pci 0000:00:06.0: [1af4:1003] type 00 class 0x078000
[ 157.135424] pci 0000:00:06.0: reg 0x10: [io 0x10000-0x1001f]
[ 157.135573] pci 0000:00:06.0: reg 0x14: [mem 0x00000000-0x00000fff]
[ 157.137360] pci 0000:00:06.0: BAR 1: assigned [mem 0x100a1002000-0x100a1002fff]
[ 157.137425] pci 0000:00:06.0: BAR 0: assigned [io 0x10060-0x1007f]
[ 157.137976] virtio-pci 0000:00:06.0: enabling device (0000 -> 0003)
[ 157.139726] virtio-pci 0000:00:06.0: virtio_pci: leaving for legacy driver

It seems the error message only appears for the first hotplugged device.

(In reply to Thomas Huth from comment #8)
> Zhengtong, could you please also check whether the problem also occurs if
> you try to hot-plug a second device in the same way? ... for me, it seems
> like it only happens for the first device that I try to hotplug.

I tried hot-plugging a second device; it works well and shows up without a reboot. So is there anything different about the first hot-plugged device?

(In reply to Zhengtong from comment #10)
> So is there anything different about the first hot-plugged device?

Seems like the guest kernel fails to initialize the first hot-plugged device for some strange reason. But it's good to know that you see the same "Error -2 initializing vqs" in the dmesg output as I see ... that's already a good first hint for debugging this problem. Thanks!

I don't know if it is the same bug, but if I add a NIC, remove it and add it again, I get:

[ 24.770044] pci 0000:00:01.0: BAR 6: assigned [mem 0x100a0000000-0x100a003ffff pref]
[ 24.770100] pci 0000:00:01.0: BAR 1: assigned [mem 0x100a0040000-0x100a0040fff]
[ 24.770154] pci 0000:00:01.0: BAR 0: assigned [io 0x10000-0x1001f]
[ 24.770326] virtio-pci 0000:00:01.0: enabling device (0000 -> 0003)
[ 24.770709] virtio-pci 0000:00:01.0: virtio_pci: leaving for legacy driver
[ 24.771559] virtio_net: probe of virtio3 failed with error -2
[ 52.513670] list_add corruption. prev->next should be next (c000000001540278), but was (null). (prev=c000000037cc2fc8).
[ 52.513785] ------------[ cut here ]------------
[ 52.513812] WARNING: at lib/list_debug.c:33
[ 52.513831] Modules linked in: ip6t_rpfilter ip6t_REJECT ipt_REJECT xt_conntrack ebtable_nat ebtable_broute bridge stp llc ebtable_filter ebtables ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw ip6table_filter ip6_tables iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_security iptable_raw iptable_filter ip_tables virtio_balloon virtio_console xfs libcrc32c sd_mod crc_t10dif crct10dif_common virtio_net ibmvscsi scsi_transport_srp scsi_tgt virtio_pci virtio_ring virtio dm_mirror dm_region_hash dm_log dm_mod
[ 52.514445] CPU: 0 PID: 2364 Comm: drmgr Not tainted 3.10.0-294.el7.ppc64le #1
[ 52.514484] task: c0000000371e3ab0 ti: c000000005e64000 task.ti: c000000005e64000
[ 52.514522] NIP: c0000000004c3784 LR: c0000000004c3780 CTR: c0000000005a6540
[ 52.514560] REGS: c000000005e67890 TRAP: 0700 Not tainted (3.10.0-294.el7.ppc64le)
[ 52.514598] MSR: 8000000000029033 <SF,EE,ME,IR,DR,RI,LE> CR: 28000424 XER: 20000000
[ 52.514691] CFAR: c00000000094ccbc SOFTE: 1
GPR00: c0000000004c3780 c000000005e67b10 c000000001100bb0 0000000000000075
GPR04: c000000001608018 c000000001618c90 6338292e0d0a3030 3030333763633266
GPR08: c000000000ca0bb0 0000000000000000 0000000000000000 3030303030303063
GPR12: 0000000042000442 c00000000fb80000 000000000000000c 0000000000000000
GPR16: 00000100368e9930 0000000010015680 0000000010015458 00000000100153e8
GPR20: 00003ffffa47c5a8 00000100367da928 00003ffffa47c659 00000100368e9970
GPR24: c00000003fff5f40 c0000000373fc009 c000000001070880 c000000037cc8b80
GPR28: c000000001540000 c000000001540278 c000000037cc2fc8 c000000037cc9148
[ 52.515214] NIP [c0000000004c3784] __list_add+0xe4/0x110
[ 52.515241] LR [c0000000004c3780] __list_add+0xe0/0x110
[ 52.515266] Call Trace:
[ 52.515281] [c000000005e67b10] [c0000000004c3780] __list_add+0xe0/0x110 (unreliable)
[ 52.515328] [c000000005e67b90] [c00000000004d624] update_dn_pci_info+0x194/0x2a0
[ 52.515374] [c000000005e67bd0] [c0000000000951fc] pci_dn_reconfig_notifier+0x4c/0x80
[ 52.515426] [c000000005e67c10] [c000000000118da8] blocking_notifier_call_chain+0x98/0x100
[ 52.515482] [c000000005e67c60] [c0000000007771c4] of_attach_node+0x34/0x170
[ 52.515521] [c000000005e67cd0] [c0000000000949e4] ofdt_write+0x604/0x800
[ 52.515562] [c000000005e67d90] [c0000000003a04f4] proc_reg_write+0x84/0x120
[ 52.515602] [c000000005e67dd0] [c0000000002f7350] SyS_write+0x150/0x400
[ 52.515641] [c000000005e67e30] [c00000000000a17c] system_call+0x38/0xb4
[ 52.515679] Instruction dump:
[ 52.515699] e8010010 eba1ffe8 ebc1fff0 ebe1fff8 7c0803a6 4e800020 3c62ffa5 7fa4eb78
[ 52.515765] 38636420 7fc6f378 484894c5 60000000 <0fe00000> 4bffff5c 3c62ffa5 7fa6eb78
[ 52.515832] ---[ end trace ac972e2b37881070 ]---
[ 52.516829] pci 0000:00:06.0: BAR 6: assigned [mem 0x100a0000000-0x100a003ffff pref]
[ 52.516872] pci 0000:00:06.0: BAR 1: assigned [mem 0x100a0040000-0x100a0040fff]
[ 52.516927] pci 0000:00:06.0: BAR 0: assigned [io 0x10000-0x1001f]
[ 52.517098] virtio-pci 0000:00:06.0: enabling device (0000 -> 0003)
[ 52.517571] virtio-pci 0000:00:06.0: virtio_pci: leaving for legacy driver
[ 52.518463] virtio_net: probe of virtio3 failed with error -2

Laurent, that looks like a different bug to me - it's an oops rather than a simple error during initialization. Probably worth filing a new BZ for it.

Classifying this as a host-side bug in qemu-kvm-rhev for the time being. There's a good chance this could be a guest-side kernel bug, or even a guest-side rtas_errd bug. We can refile if that turns out to be the case.

With the qemu monitor command "info qtree" we can see the PCI configuration from the qemu side:
The existing, working network interface is:
dev: virtio-net-pci, id "net0"
addr = 05.0
class Ethernet controller, addr 00:05.0, pci id 1af4:1000 (sub 1af4:0001)
bar 0: i/o at 0x60 [0x7f]
bar 1: mem at 0xc0002000 [0xc0002fff]
bar 6: mem at 0xffffffffffffffff [0x3fffe]
mac = "52:54:00:22:ab:39"
netdev = "hostnet0"
The hotplugged, broken interface is:
dev: virtio-net-pci, id "net1"
addr = 01.0
class Ethernet controller, addr 00:01.0, pci id 1af4:1000 (sub 1af4:0001)
bar 0: i/o at 0xffffffffffffffff [0x1e]
bar 1: mem at 0x80040000 [0x80040fff]
bar 6: mem at 0xffffffffffffffff [0x3fffe]
mac = "52:54:00:68:50:53"
netdev = "hostnet1"
[ 53.483278] pci 0000:00:01.0: BAR 1: assigned [mem 0x100a0040000-0x100a0040fff]
[ 53.483332] pci 0000:00:01.0: BAR 0: assigned [io 0x10000-0x1001f]
[ 53.483506] virtio-pci 0000:00:01.0: enabling device (0000 -> 0003)
[ 53.483928] virtio-pci 0000:00:01.0: virtio_pci: leaving for legacy driver
[ 53.484823] virtio_net: probe of virtio3 failed with error -2
lspci: Region 0: I/O ports at 0000 [size=32]
We can see the BARs don't seem to be initialized correctly.
If we add one more, it is OK:
[ 201.463788] pci 0000:00:06.0: BAR 6: assigned [mem 0x100a0080000-0x100a00bffff pref]
[ 201.463847] pci 0000:00:06.0: BAR 1: assigned [mem 0x100a0041000-0x100a0041fff]
[ 201.463901] pci 0000:00:06.0: BAR 0: assigned [io 0x10080-0x1009f]
[ 201.464069] virtio-pci 0000:00:06.0: enabling device (0000 -> 0003)
[ 201.464710] virtio-pci 0000:00:06.0: virtio_pci: leaving for legacy driver
[ 201.563926] IPv6: ADDRCONF(NETDEV_UP): eth1: link is not ready
lspci: Region 0: I/O ports at 0080 [size=32]
dev: virtio-net-pci, id "net2"
addr = 06.0
class Ethernet controller, addr 00:06.0, pci id 1af4:1000 (sub 1af4:0001)
bar 0: i/o at 0x80 [0x9f]
bar 1: mem at 0x80041000 [0x80041fff]
bar 6: mem at 0xffffffffffffffff [0x3fffe]
mac = "52:54:00:21:a9:df"
netdev = "hostnet2"
After a reboot (with just "net0" and "net1"), both work:
dev: virtio-net-pci, id "net0"
addr = 05.0
class Ethernet controller, addr 00:05.0, pci id 1af4:1000 (sub 1af4:0001)
bar 0: i/o at 0x80 [0x9f]
bar 1: mem at 0xc0081000 [0xc0081fff]
bar 6: mem at 0xffffffffffffffff [0x3fffe]
mac = "52:54:00:22:ab:39"
netdev = "hostnet0"
dev: virtio-net-pci, id "net1"
addr = 01.0
class Ethernet controller, addr 00:01.0, pci id 1af4:1000 (sub 1af4:0001)
bar 0: i/o at 0x20 [0x3f]
bar 1: mem at 0xc0001000 [0xc0001fff]
bar 6: mem at 0xffffffffffffffff [0x3fffe]
bus: virtio-bus
mac = "52:54:00:68:50:53"
netdev = "hostnet1"
I've checked upstream QEMU, and there is the same problem:
be0df8c Merge remote-tracking branch 'remotes/ehabkost/tags/numa-pull-request' into staging
QEMU 2.3.90 monitor - type 'help' for more information
(qemu) device_add virtio-net-pci,id=virtio-net-pci0,mac=52:54:00:12:34:56
[ 98.753067] pci 0000:00:00.0: BAR 6: assigned [mem 0x100a0000000-0x100a003ffff pref]
[ 98.753135] pci 0000:00:00.0: BAR 1: assigned [mem 0x100a0040000-0x100a0040fff]
[ 98.753194] pci 0000:00:00.0: BAR 0: assigned [io 0x10000-0x1001f]
[ 98.753361] virtio-pci 0000:00:00.0: enabling device (0000 -> 0003)
[ 98.753620] virtio-pci 0000:00:00.0: virtio_pci: leaving for legacy driver
[ 98.754373] virtio_net: probe of virtio1 failed with error -2
(qemu) info qtree
...
bar 0: i/o at 0xffffffffffffffff [0x1e]
bar 1: mem at 0x80040000 [0x80040fff]
...
bar 0: i/o at 0x20 [0x3f]
bar 1: mem at 0xc0000000 [0xc0000fff]
So, I previously said that non-hotplugged devices had BARs allocated by SLOF, and hotplugged devices had them allocated by qemu. I've had a closer look, and now I'm a bit confused.
* I think I see the code in SLOF that will do BAR allocation (obviously only for devices present at boot when SLOF runs)
* I'm pretty sure that the guest kernel *won't* do any BAR allocation (for the "pseries" platform, some other platforms will)
* BUT, I haven't managed to find any qemu code that will do BAR allocation for hotplugged devices or otherwise - it will populate the device tree with information about the BARs if they're allocated ("assigned-addresses" property) but I can't spot code that actually sets the BAR registers.
Which would explain this bug, except that it doesn't explain how the BARs are assigned for the second hotplugged device, nor how BAR1 is being allocated for the first hotplugged device.
So clearly I'm missing something.
------- Comment From fnovak.com 2015-07-17 13:19 EDT -------

Do you have the latest updates for:
* librtas
* librtas-devel
* powerpc-utils
* ppc64-diag

(In reply to IBM Bug Proxy from comment #18)

To be able to compare the correct behavior with our faulty one, I'd like to know which version of upstream QEMU works. I have compared our RHEL7 behavior with an up-to-date Fedora and the result is the same:

QEMU 2.4.0-rc1
fedora22 guest (4.0.4-301.fc22):
librtas 3.13.1-fc22
ppc64-diag 2.6.7-2.fc22
powerpc-utils 1.2.24-1.fc22
SLOF git-7d766a3ac9b2474f -> SLOF-0.1.git20150313-1.fc22

(In reply to Laurent Vivier from comment #19)
> I have compared our RHEL7 behavior with an up-to-date fedora and the result
> is the same:
... using a RHEL7.2 current nightly build? Please advise ...
> QEMU 2.4.0-rc1
> fedora22 guest (4.0.4-301.fc22):
> librtas 3.13.1-fc22
ok ...
> ppc64-diag 2.6.7-2.fc22
... please update to the most current 2.6.9 level; see RHBZ 1182027 - [7.2 FEAT] ppc64-diag package update - ppc64/ppc64le ...
> powerpc-utils 1.2.24-1.fc22
... please update to the most current 1.2.26 level; see RHBZ 1182040 - [7.2 FEAT] powerpc-utils package update - ppc64/ppc64le ...
> SLOF git-7d766a3ac9b2474f
> -> SLOF-0.1.git20150313-1.fc22
... could you please update your system to the above package level and retest this bugzilla again? Please advise ...

Same result with RHEL-7.2-20150708:
[ 120.630890] pci 0000:00:04.0: BAR 6: assigned [mem 0x100a0000000-0x100a003ffff pref]
[ 120.630956] pci 0000:00:04.0: BAR 1: assigned [mem 0x100a0040000-0x100a0040fff]
[ 120.631012] pci 0000:00:04.0: BAR 0: assigned [io 0x10000-0x1001f]
[ 120.631188] virtio-pci 0000:00:04.0: enabling device (0000 -> 0003)
[ 120.631665] virtio-pci 0000:00:04.0: virtio_pci: leaving for legacy driver
[ 120.644250] virtio_net: probe of virtio2 failed with error -2
QEMU: qemu-kvm-rhev-2.3.0-10.el7 and upstream QEMU v2.4.0-rc1
GUEST:
kernel 3.10.0-290.el7.ppc64le
librtas 1.3.13-2.el7
ppc64-diag 2.6.9-1.el7
powerpc-utils 1.2.26-1.el7
SLOF git-7d766a3ac9b2474f
------- Comment From mdroth.com 2015-07-17 18:54 EDT -------

I believe you're hitting the issue addressed by this patch:
http://lists.nongnu.org/archive/html/qemu-devel/2014-12/msg03454.html

Some additional discussion on the patch is available here:
http://lists.gnu.org/archive/html/qemu-devel/2015-01/msg01171.html

The gist of it is that it's an acceptable fix for pseries, since pseries uses a dedicated I/O window that has no risk of overlapping a PCI address wherein offset 0 is reserved for some legacy function/port (we don't even have legacy ports on pseries). The patch hasn't been applied upstream, however, because the fix applies to all architectures, and there are concerns that in the case of, say, x86, where BARs can overlap legacy I/O space, guests may rely on 0 BARs being rejected by QEMU to function properly. Doing a full analysis of all the possibilities will require a good amount of time, so for now we've been carrying this patch for pkvm. There are a number of ways to limit the behavior to pseries, but I think from an upstream perspective we'll need to do the full analysis. Not sure how Red Hat would prefer to address this.

> BUT, I haven't managed to find any qemu code that will do BAR allocation for hotplugged devices or otherwise - it will populate the device tree with information about the BARs if they're allocated ("assigned-addresses" property) but I can't spot code that actually sets the BAR registers.

This is correct, we don't currently do BAR assignment in QEMU, but instead rely on the guest to assign them. It's actually the pseries kernel that's picking the 0 addr, and QEMU is telling it that's invalid. We do have the code to populate assigned-addresses and friends because the guest relies on their presence, even when the actual values are ignored. We do plan on switching to QEMU-based BAR assignment and using the rpaphp-based hotplug module in the guest, but there are a number of guest kernel fixes required to use it, so that will be enabled later.
------- Comment From mdroth.com 2015-07-17 19:06 EDT -------

(In reply to comment #9)
> We do have the code to populate assigned-addresses and friends because the
> guest relies on their presence, even when the actual values are ignored.

More accurately, the guest relies on the config space portion of the properties being populated, regardless of BAR assignments/handling.

I confirm this patch fixes the problem: http://patchwork.ozlabs.org/patch/423796/

Ugh. If I'd realised the hotplug code was relying on the guest for BAR assignment, I would have been a lot less keen to apply Nikunj's patches to use the same logic for cold-plugged devices. This is basically a horrid hack, relying on the behaviour of a particular guest, which you only get away with because no-one cares about any guests other than Linux. Under PAPR, BAR assignment is supposed to be handled by the "firmware", which in our case means either qemu or SLOF. But that's a problem that can't be fixed in time for RHEL 7.2, so we need to go with the BAR address 0 fix, but even that has complications upstream.

So, what I think we need to do is this:
1) Make a version of that patch that affects only Power - just using an ugly ifdef - and apply it as downstream only for RHEL 7.2.
2) The fix allowing zero BARs seems correct in general, so pursue the fix upstream, working out how to properly activate/deactivate it.
3) Implement proper BAR assignment in qemu to bring us back closer to PAPR.

Laurent, can you handle (1) please. Michael, with my upstream hat on, I await your patches for (2) and (3).

------- Comment From mdroth.com 2015-07-20 03:51 EDT -------

(In reply to comment #14)
> Ugh. If I'd realised the hotplug code was relying on the guest for BAR
> assignment, I would have been a lot less keen to apply Nikunj's patches to
> use the same logic for cold-plugged devices.

I need to double-check the code, but I think the topic was brought up and the plan was to still do the *actual* PCI device enumeration in SLOF and to still use SLOF's version of assigned-addresses. Nikunj did move some of the bridge enumeration bits over to QEMU, though. When we add the QEMU-based BAR assignment, SLOF can either check for a new device-tree flag like it does now or examine QEMU's version of assigned-addresses to determine if we've already done the assignment.

> This is basically a horrid hack, relying on the behaviour of a particular
> guest, which you only get away with because no-one cares about any guests
> other than Linux. Under PAPR, BAR assignment is supposed to be handled by
> the "firmware" which in our case means either qemu or SLOF.
> But, that's a problem that can't be fixed in time for RHEL 7.2. So we need
> to go with the BAR address 0 fix, but even that has complications upstream.

Agreed. We have patches to enable BAR assignment in QEMU, but it's actually guest kernel fixes for rpaphp that are necessitating the current/alternate approach. Once those issues are addressed, the plan is to enable a graceful switch-over to QEMU-based BAR assignment using the ibm,client-architecture-support flag set by the guest.

> So, what I think we need to do is this:
> 1) Make a version of that patch that affects only Power - just using an ugly
> ifdef - and apply it as downstream only for RHEL7.2
>
> 2) The fix allowing zero BARs seems correct in general, so pursue the fix
> upstream, working out how to properly activate/deactivate it
>
> 3) Implement proper BAR assignment in qemu to bring us back closer to PAPR.
>
> Laurent, can you handle (1) please.

Absolutely, thanks.
(In reply to David Gibson from comment #26)
> 1) Make a version of that patch that affects only Power - just using an
> ugly ifdef - and apply it as downstream only for RHEL7.2

It seems it is not possible to manage this with an ifdef: pci.c is in the common-obj part of qemu and is compiled once for all the targets, so we can't use "#ifdef CONFIG_PPC" inside. I'm trying to do this dynamically by adding a field "accept_addr_0" in PCIBus and checking this value in pci_bar_address() to know if we can accept address 0 for a BAR. "accept_addr_0" is set to false by default in pci_bus_init() and to true in spapr_phb_realize(). Do you think this is an acceptable approach?

Hrm. I suspect the final upstream fix will look something like that (maybe something in MachineClass rather than the PCI bus, though). However, as a downstream-only fix it has the drawback that it needs (small) changes in multiple parts of the code, increasing the chance for conflicts. The other approach would be to use #ifdef __powerpc__. That's a ghastly hack: it's actually testing the type of the host, rather than the guest, which is wrong, but would work in our circumstances since we only support KVM, not TCG. And as a downstream-only hack it's small and local. I'm not sure which is the best way to go here, so I'm thinking we should ask someone like Paolo or Michael Tsirkin to make a taste judgement.

Fix included in qemu-kvm-rhev-2.3.0-19.el7

Reproduced the bug on qemu-kvm-rhev-2.3.0-18.el7.ppc64le with the same steps as comment 0. Also tested virtio-net-pci: after hotplug, there is no eth* interface inside the guest in "ifconfig -a" output, and only after a reboot does the guest get a new IP for the hot-plugged network device.

======================

Verified the bug on qemu-kvm-rhev-2.3.0-19.el7.ppc64le.
CLI:

/usr/libexec/qemu-kvm -name test_qzhang -machine pseries,accel=kvm,usb=off -m 4G -smp 4,sockets=1,cores=4,threads=1 -uuid 8aeab7e2-f341-4f8c-80e8-59e2968d85c2 -realtime mlock=off -nodefaults -monitor stdio -rtc base=utc -device spapr-vscsi,id=scsi0,reg=0x1000 -drive file=RHEL-7.2-LE-0806.qcow2,if=none,id=drive-scsi0-0-0-0,format=qcow2,cache=none -device scsi-hd,bus=scsi0.0,drive=drive-scsi0-0-0-0,bootindex=1,id=scsi0-0-0-0 -drive file=RHEL-7.2-20150806.1-Server-ppc64le-dvd1.iso,if=none,id=drive-scsi0-0-1-0,readonly=on,format=raw -device scsi-cd,bus=scsi0.0,drive=drive-scsi0-0-1-0,id=scsi0-0-1-0 -vnc :10 -msg timestamp=on -usb -device usb-tablet,id=tablet1 -vga std -qmp tcp:0:4666,server,nowait -netdev tap,id=hostnet1,script=/etc/qemu-ifup,vhost=on -device virtio-net-pci,netdev=hostnet1,id=net1,mac=00:54:5a:5f:5b:5c,disable-legacy=off,disable-modern=on

(1) virtio-serial-pci
(qemu) device_add virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x7
(qemu) chardev-add socket,path=/tmp/hello,id=socket1,server,nowait
(qemu) device_add virtserialport,id=port0,chardev=socket1,bus=virtio-serial0.0,name=com.redhat.vsdm1
After hotplugging the device, it works well with no need to reboot the guest. Transferring data between host and guest succeeds.

(2) virtio-net-pci
(qemu) netdev_add tap,id=hostnet2
(qemu) device_add virtio-net-pci,id=net2,mac=00:54:5a:5f:5b:11,netdev=hostnet2,bus=pci.0,addr=0x6
After hotplugging the device, the guest gets the new interface and an IP address. No need to reboot.

(3) virtio-blk-pci
(qemu) __com.redhat_drive_add file=disk.qcow2,format=qcow2,id=disk1
(qemu) device_add virtio-blk-pci,id=disk1,drive=disk1,bus=pci.0,addr=0x8
After hotplugging the device and making a filesystem on it, the block device can be used at once with no need to reboot. dd operations on the disk work well.

(4) virtio-scsi-pci
(qemu) __com.redhat_drive_add file=test.qcow2,format=qcow2,id=disk2
(qemu) device_add virtio-scsi-pci,id=scsi0,vectors=0,bus=pci.0,addr=0x7
(qemu) device_add scsi-hd,ver=sluo,drive=drive-data-disk,bus=scsi0.0,id=data-disk
After hotplugging the device and making a filesystem on it, the block device can be used at once with no need to reboot. dd operations on the disk work well.

(5) virtio-balloon-pci
(qemu) device_add virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x4
The memory balloon device works well (it can enlarge and shrink guest memory) after hotplugging; no need to reboot.

(6) usb-ehci
(qemu) __com.redhat_drive_add file=usb.qcow2,format=qcow2,id=disk3
(qemu) device_add usb-ehci,id=ehci,bus=pci.0,addr=0x3
(qemu) device_add usb-storage,drive=disk3,id=usb3,bus=ehci.0,port=1
After hotplugging the device and making a filesystem on it, the block device can be used at once with no need to reboot. dd operations on the disk work well.

Based on the above, I will set the bug as VERIFIED. I did not test all of the supported PCI devices; if something else needs to be covered, please add a comment here. Thanks.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2015-2546.html