Bug 1330002
| Field | Value |
|---|---|
| Summary: | Q35 machine can not boot from scsi-cd device under switch or rootport |
| Product: | Red Hat Enterprise Linux 7 |
| Reporter: | yduan |
| Component: | qemu-kvm-rhev |
| Assignee: | Marcel Apfelbaum <marcel> |
| Status: | CLOSED NOTABUG |
| QA Contact: | Virtualization Bugs <virt-bugs> |
| Severity: | high |
| Priority: | high |
| Version: | 7.3 |
| CC: | alex.williamson, jinzhao, knoel, laine, lersek, marcel, virt-maint, yanyang |
| Target Milestone: | rc |
| Target Release: | --- |
| Hardware: | x86_64 |
| OS: | Linux |
| Doc Type: | Bug Fix |
| Type: | Bug |
| Last Closed: | 2016-08-07 16:47:40 UTC |
Description (yduan, 2016-04-25 09:17:23 UTC)

Created attachment 1150408 [details]: VM Installation 1

Created attachment 1150409 [details]: VM Installation 2
(In reply to yduan from comment #0)
> Description of problem:
> Q35 machine can not boot from scsi-cd device under switch or rootport
>
> Version-Release number of selected component (if applicable):
> Host:
> kernel: 3.10.0-382.el7.x86_64
> qemu-kvm-rhev: qemu-kvm-rhev-2.5.0-4.el7.x86_64
>
> How reproducible:
> 100%
>
> Steps to Reproduce:
> 1. Install a VM using the following commands:
>
>     ...
>     -machine q35,accel=kvm,usb=off,vmport=off \
>     ...
>     -device ioh3420,bus=pcie.0,id=root.0,slot=1 \
>     -device x3130-upstream,bus=root.0,id=upstream1 \
>     -device xio3130-downstream,bus=upstream1,id=downstream1,chassis=1 \
>     -device virtio-scsi-pci,**bus=downstream1**,id=scsi_pci_bus0 \
>     -drive file=/dev/sdd,format=raw,id=drive_sysdisk,if=none,cache=none,aio=native,werror=stop,rerror=stop \
>     -device scsi-hd,drive=drive_sysdisk,bus=scsi_pci_bus0.0,id=device_sysdisk,bootindex=1 \
>     -device virtio-scsi-pci,**bus=downstream1**,id=scsi_pci_bus1 \
>     -drive file=/home/backup/rhel7.2released/RHEL-7.2-20151030.0-Server-x86_64-dvd1.iso,if=none,media=cdrom,id=drive_cd,readonly=on,format=raw \
>     -device scsi-cd,bus=scsi_pci_bus1.0,drive=drive_cd,id=device_cd,**bootindex=0** \
>     ...
>
> or
>
>     ...
>     -machine q35,accel=kvm,usb=off,vmport=off \
>     ...
>     -device ioh3420,bus=pcie.0,id=root.0,slot=1 \
>     -device virtio-scsi-pci,bus=root.0,id=scsi_pci_bus0 \
>     -drive file=/dev/sdd,format=raw,id=drive_sysdisk,if=none,cache=none,aio=native,werror=stop,rerror=stop \
>     -device scsi-hd,drive=drive_sysdisk,bus=scsi_pci_bus0.0,id=device_sysdisk,bootindex=1 \
>     -device virtio-scsi-pci,bus=root.0,id=scsi_pci_bus1 \
>     -drive file=/home/backup/rhel7.2released/RHEL-7.2-20151030.0-Server-x86_64-dvd1.iso,if=none,media=cdrom,id=drive_cd,readonly=on,format=raw \
>     -device scsi-cd,bus=scsi_pci_bus1.0,drive=drive_cd,id=device_cd,**bootindex=0** \
>     ...
>
> Actual results:
> Install VM correctly.
>
> Expected results:
> Can not install VM correctly and details are as attachment.
> Additional info:
> 1. Reproducible with qemu-kvm-rhev-2.3.0-31.el7_2.12.x86_64.
> 2. Not reproducible with the following commands:
>
>     ...
>     -machine q35,accel=kvm,usb=off,vmport=off \
>     ...
>     -device virtio-scsi-pci,id=scsi_pci_bus0 \
>     -drive file=/dev/sdd,format=raw,id=drive_sysdisk,if=none,cache=none,aio=native,werror=stop,rerror=stop \
>     -device scsi-hd,drive=drive_sysdisk,bus=scsi_pci_bus0.0,id=device_sysdisk,bootindex=1 \
>     -device virtio-scsi-pci,id=scsi_pci_bus1 \
>     -drive file=/home/backup/rhel7.2released/RHEL-7.2-20151030.0-Server-x86_64-dvd1.iso,if=none,media=cdrom,id=drive_cd,readonly=on,format=raw \
>     -device scsi-cd,bus=scsi_pci_bus1.0,drive=drive_cd,id=device_cd,**bootindex=0** \

It should be:

Actual results:
Cannot install the VM correctly; details are in the attachments.

Expected results:
The VM installs correctly.

---

According to Laine's, Marcel's, and Alex's comments in bug 1333238, plus Marcel's internal PCI Express presentation, I think that every downstream port can accept only one PCIe device. Although it is represented as a bridge, it is not actually a bus but part of a point-to-point interconnect; each device sits on its own dedicated bus.

If that's the case, then this is simply incorrect use of the QEMU command line. Note that comment 0 contains two failing examples and one successful example.

Failing example 1: two virtio-scsi HBAs are placed on bus=downstream1. This violates the PCIe spec.

Failing example 2: two virtio-scsi HBAs are placed on bus=root.0, which is a PCIe root port. According to the PCIe spec, all root ports (= ports on a root complex) are downstream ports, hence they too can accept only one device. That's why upstream ports of switches are plugged in there.

Successful example: two virtio-scsi HBAs are placed on the normal PCI bus.

As I said, I think this is simply incorrect QEMU configuration. What could perhaps be improved here is QEMU error reporting -- QEMU should perhaps reject incorrect PCIe topologies.
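For reference, a layout that respects the one-device-per-downstream-port rule would give each virtio-scsi HBA its own downstream port. The following is only a sketch reusing the IDs from comment 0; the second downstream port (`downstream2`, `chassis=2`) is a hypothetical addition, not part of the original report:

```shell
# Sketch only: one endpoint per PCIe downstream port (slot 0 of each).
# downstream2/chassis=2 is invented for illustration.
qemu-kvm \
  -machine q35,accel=kvm,usb=off,vmport=off \
  -device ioh3420,bus=pcie.0,id=root.0,slot=1 \
  -device x3130-upstream,bus=root.0,id=upstream1 \
  -device xio3130-downstream,bus=upstream1,id=downstream1,chassis=1 \
  -device xio3130-downstream,bus=upstream1,id=downstream2,chassis=2 \
  -device virtio-scsi-pci,bus=downstream1,id=scsi_pci_bus0 \
  -device virtio-scsi-pci,bus=downstream2,id=scsi_pci_bus1 \
  ...
```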
Adding Marcel, Alex, and Laine. What does qemu do if you specify the same bus and no addr for two devices? Does it automatically assign one to addr=1 on that bus?

---

It shouldn't do that either for ioh3420 or for xio3130-downstream since, as Laszlo says, those "buses" only allow things to be plugged into "slot 0", aka "the only slot". So it is incorrect configuration to attempt to start qemu with multiple endpoint devices on a *-port -- that part is NOTABUG. However, qemu should probably give an error message when someone tries to do this. Note, though, that libvirt already prevents such a configuration, so the extra utility of a qemu error message is dubious in an environment where the only supported use of qemu is via libvirt. So I'll leave it up to someone else to decide whether to leave this open to track adding an error message to qemu.

---

(In reply to Laszlo Ersek from comment #5)
> According to Laine's, Marcel's, and Alex's comments in bug 1333238, plus
> from Marcel's internal PCI Express presentation, I think that every
> downstream port can accept only one PCIe device. Although it is represented
> as a bridge, it is not actually a bus, but part of a point-to-point
> interconnect. Each device sits on its own dedicated bus.

It's actually a bit more complicated than that. On a PCI bus, there are 8 bits' worth of device address space: 3 bits for function numbers (0-7) and 5 bits for slot numbers (0-31). Standard PCIe only supports slot number 0, but we still support multifunction devices, so we can have up to 8 functions on the bus. Things get complicated further when we add ARI (Alternative Routing-ID Interpretation) into the mix. This is a capability on a downstream port that, when enabled, changes the slot:fn (or more commonly devfn) address space into a simple 8-bit function number space, i.e. 00.1 through 1f.7 are simply functions of device 00.0 (conceptually the function number space is 00.0 - 00.ff, but it maps to the devfn bit definitions).
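The devfn arithmetic described above can be sketched with plain shell arithmetic; this is a toy illustration of the bit layout, not QEMU code:

```shell
# devfn packs slot (5 bits) and function (3 bits) into one byte:
#   devfn = (slot << 3) | function
devfn_encode() { echo $(( ($1 << 3) | $2 )); }   # args: slot function

devfn=$(devfn_encode 31 7)                       # slot 1f, function 7
printf 'no ARI: slot=%02x fn=%x\n' $(( devfn >> 3 )) $(( devfn & 7 ))
# -> no ARI: slot=1f fn=7

# With ARI enabled on the downstream port, the same 8 bits are read as
# a single function number 0-255 of device 00.0:
printf 'ARI:    fn=%02x of device 00.0\n' "$devfn"
# -> ARI:    fn=ff of device 00.0
```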
However, your interpretation is sufficient for the command lines here.

> If that's the case, then this is simply incorrect use of the QEMU command
> line. Note that in comment 0, there are two failing examples, and one
> successful example.
>
> Failing example 1: two virtio-scsi HBAs are placed on bus=downstream1. This
> violates the PCIe spec.

I agree. I expect these come up at addresses 00.0 and 01.0 on bus downstream1, which is invalid (without ARI and multifunction enabled on 00.0).

> Failing example 2: two virtio-scsi HBAs are placed on bus=root.0, which is a
> PCIe root port. According to the PCIe spec, all root ports (= ports on a
> root complex) are downstream ports, hence they can also accept only one
> device. That's why upstream ports of switches are plugged in there.

Agreed; essentially the same error as the first case.

> Successful example: two virtio-scsi HBAs are placed on the normal PCI bus.
>
> As I said, I think this is simply incorrect QEMU configuration. What could
> be improved here perhaps is QEMU error reporting -- QEMU should perhaps
> reject incorrect PCIe topologies.

QEMU allows building all sorts of topologies that aren't really valid. While this one is non-standard, and it is not unexpected that it doesn't work, it isn't even all that broken. Linux has a boot option, pci=pcie_scan_all, that will ignore the slot 0 rule and scan PCIe buses just like it would conventional buses. Clearly we can't expect SeaBIOS/OVMF to do the same, so we can't boot from a device in such a configuration, but apparently real hardware exists that does this too, and it might be convenient in some cases. If we start calling this a broken configuration that QEMU should error on, what about the fact that we're installing a non-PCIe device into a PCIe slot? That's also a spec violation. Where do you draw the line between which spec violations we allow and which we put up with? In any case, this seems like NOTABUG.

---

I expect QEMU to at least give a warning message.
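The rule being debated could be expressed as a small placement check. The sketch below is purely hypothetical: `validate_slot` and the port-type names are invented here for illustration, and QEMU provides no such helper:

```shell
# Hypothetical check: PCIe root ports and switch downstream ports lead a
# point-to-point link, so only slot 0 exists; a conventional PCI bus has
# slots 0-31.
validate_slot() {  # args: port-type slot
  case $1 in
    root-port|downstream-port)
      if [ "$2" -eq 0 ]; then echo ok; else echo "reject: PCIe ports have only slot 0"; fi ;;
    pci-bus)
      if [ "$2" -ge 0 ] && [ "$2" -le 31 ]; then echo ok; else echo "reject: slot out of range"; fi ;;
  esac
}

validate_slot downstream-port 0   # first HBA on downstream1: prints "ok"
validate_slot downstream-port 1   # second HBA on the same port: rejected
validate_slot pci-bus 5           # conventional PCI bus: prints "ok"
```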
Someone from Fujitsu has patches on the mailing lists for that, but they are not ready yet. I also think the benefit of allowing this odd configuration is smaller than what we gain by forbidding the use of multiple slots on a PCIe "bus" (lots of confused users). This may be acceptable to Linux, but it is certainly not widely used. And since libvirt forbids it anyway, maybe we can fail to start QEMU.

---

I'm inclined to close this as WONTFIX, but I'm re-assigning to Marcel in case there is actual error/warning message work to track. If not, Marcel, please feel free to close it.

---

*** Bug 1326259 has been marked as a duplicate of this bug. ***

---

After carefully considering all the comments, I have decided to close this as not a bug. The only issue that remains open is whether QEMU should issue an error message. While I am not against it, since libvirt forbids it anyway, that is not enough to keep this BZ open. If QE thinks QEMU should report an error or warning, please open a low-priority BZ for 7.4.