Bug 1567041
| Field | Value |
|---|---|
| Summary | qemu-guest-agent does not parse PCI bridge links in "build_guest_fsinfo_for_real_device" (q35) |
| Product | Red Hat Enterprise Linux 7 |
| Component | qemu-guest-agent |
| Version | 7.6 |
| Status | CLOSED ERRATA |
| Severity | unspecified |
| Priority | unspecified |
| Reporter | Lili Zhu <lizhu> |
| Assignee | Marc-Andre Lureau <marcandre.lureau> |
| QA Contact | FuXiangChun <xfu> |
| CC | chayang, dyuan, fjin, juzhang, knoel, lersek, michen, xfu, xuzhang |
| Target Milestone | rc |
| Hardware | Unspecified |
| OS | Unspecified |
| Fixed In Version | qemu-guest-agent-2.12.0-2.el7 |
| Doc Type | If docs needed, set a value |
| Last Closed | 2018-10-30 08:08:28 UTC |
| Type | Bug |
Description
Lili Zhu
2018-04-13 10:28:59 UTC
This is a problem with the guest agent, not OVMF.
In the guest agent, the get_pci_driver() function is used to retrieve the
driver for the disk / filesystem. get_pci_driver() fails for the following
"syspath" parameter, for example:
/sys/devices/pci0000:00/0000:00:1e.0/0000:03:01.0/0000:04:05.0/virtio3/host6/target6:0:0/6:0:0:0/block/sda/sda3
And that's justified because there is no "driver" entry in that directory.
Because get_pci_driver() returns NULL, build_guest_fsinfo_for_real_device()
takes the early exit:
    if (!driver) {
        goto cleanup;
    }
and the fs->disk member will not be populated.
Here's the --verbose log from the guest agent:
> read data, count: 31, data: {"execute":"guest-get-fsinfo"}
> process_event: called
> processing command
> Building guest fsinfo for '/'
> parse sysfs path '/sys/devices/virtual/block/dm-1'
> slave device 'sda3'
> parse sysfs path '/sys/devices/pci0000:00/0000:00:1e.0/0000:03:01.0/0000:04:05.0/virtio3/host6/target6:0:0/6:0:0:0/block/sda/sda3'
> Building guest fsinfo for '/boot'
> parse sysfs path '/sys/devices/pci0000:00/0000:00:1e.0/0000:03:01.0/0000:04:05.0/virtio3/host6/target6:0:0/6:0:0:0/block/sda/sda2'
> Building guest fsinfo for '/boot/efi'
> parse sysfs path '/sys/devices/pci0000:00/0000:00:1e.0/0000:03:01.0/0000:04:05.0/virtio3/host6/target6:0:0/6:0:0:0/block/sda/sda1'
> sending data, count: 220
Actually, the syspath that get_pci_driver() operates on is the following string only:

    /sys/devices/pci0000:00/0000:00:1e.0

That's under which qga looks for the "driver" entry. And, it cannot work -- there is no "driver" entry there -- because the PCI device identified like above is not a disk controller. It is a PCI bridge.

The *actual* syspath slice that get_pci_driver() should receive, for investigation, is:

    /sys/devices/pci0000:00/0000:00:1e.0/0000:03:01.0/0000:04:05.0

Under this, a "driver" entry does exist, and it links to ..../virtio-pci.

In short, the issue is that the following code fragment from build_guest_fsinfo_for_real_device() cannot deal with PCI bridges:

    p = strstr(syspath, "/devices/pci");
    if (!p || sscanf(p + 12, "%*x:%*x/%x:%x:%x.%x%n",
                     pci, pci + 1, pci + 2, pci + 3, &pcilen) < 4) {
        g_debug("only pci device is supported: sysfs path \"%s\"", syspath);
        return;
    }

    driver = get_pci_driver(syspath, (p + 12 + pcilen) - syspath, errp);

And, in the Q35 setup at hand, we have two bridges (a DMI-to-PCI bridge, and a PCI-PCI bridge) before we arrive at the disk controller. The scanning should be extended to iterate (in a loop) over the PCI bridge links.

Lili, a request for the future: before filing an RHBZ for the OVMF component, please check whether the issue reproduces with SeaBIOS (using an otherwise identical domain configuration). I see that in this case, you did check i440fx, and that's great. However, you compared the following two configs:

- i440fx + SeaBIOS
versus
- q35 + OVMF

The issue in the guest agent wasn't triggered by the SeaBIOS -> OVMF change, but by the i440fx -> q35 change. Therefore I suggest that in the future you narrow down the issue as much as possible. If you see an issue manifest only with q35+OVMF, then please compare it *first* to q35+SeaBIOS. If the issue disappears, then it is likely related to the OVMF<->SeaBIOS difference. If the issue persists, then you can compare q35+SeaBIOS vs. i440fx+SeaBIOS second. If the issue then disappears, it is likely related to the q35<->i440fx difference. In other words, please change only one element of the setup at each step along the way. That's how we eliminate irrelevant components. Thanks.

Laszlo, sorry for the misunderstanding about Q35. Thanks very much for your detailed explanation.

Sent a patch to the qemu ML: "[PATCH] qemu-ga: make get-fsinfo work over pci bridge"

Posted: [RHEL-7.6 qemu-guest-agent PATCH 0/2] Make get-fsinfo work over pci bridges

Fix included in qemu-guest-agent-2.12.0-2.el7

Reproduced this bug with qemu-guest-agent-2.12.0-1.el7.x86_64.

Steps:
1. Boot a guest whose system disk is attached behind a pci-bridge.

    ...
    -device pci-bridge,bus=pci.0,id=bridge0,chassis_nr=1
    -device virtio-scsi-pci,id=virtio_scsi_pci0,bus=bridge0,addr=0x5
    -drive id=drive_image1,if=none,snapshot=off,aio=threads,cache=none,format=qcow2,file=/home/kvm_autotest_root/images/rhel76-64-virtio-scsi.qcow2
    -device scsi-hd,id=image1,drive=drive_image1
    ...

    {"execute":"guest-get-fsinfo"}
    {"return": [{"name": "sda1", "mountpoint": "/boot", "disk": [], "type": "xfs"}, {"name": "dm-0", "mountpoint": "/", "disk": [], "type": "xfs"}]}

Verified this bug with qemu-guest-agent-2.12.0-2.el7.x86_64:

    {"execute":"guest-get-fsinfo"}
    {"return": [{"name": "sda1", "mountpoint": "/boot", "disk": [{"bus-type": "scsi", "bus": 0, "unit": 0, "pci-controller": {"bus": 1, "slot": 5, "domain": 0, "function": 0}, "target": 0}], "type": "xfs"}, {"name": "dm-0", "mountpoint": "/", "disk": [{"bus-type": "scsi", "bus": 0, "unit": 0, "pci-controller": {"bus": 1, "slot": 5, "domain": 0, "function": 0}, "target": 0}], "type": "xfs"}]}

Based on this result, this bug is fixed.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:3072
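
For readers who want to see what "iterate over the PCI bridge links" could look like, here is a minimal sketch in C. It is not the patch that was posted or merged: the `PciLeaf` type and the `find_last_pci_device()` function are hypothetical names invented for illustration. The idea is simply to keep consuming `/DDDD:BB:SS.F` path components after the root bus, so that the last one matched is the disk controller rather than the first bridge.

```c
#include <stdio.h>
#include <string.h>

/*
 * Illustrative only: PciLeaf and find_last_pci_device() are hypothetical
 * names and are not part of qemu-ga.
 */
typedef struct {
    unsigned int domain, bus, slot, function;
    size_t prefix_len; /* bytes of syspath up to the end of the last PCI address */
} PciLeaf;

/* Returns 0 and fills *out on success, -1 if no PCI device is found. */
int find_last_pci_device(const char *syspath, PciLeaf *out)
{
    const char *p = strstr(syspath, "/devices/pci");
    unsigned int pci[4];
    int n = 0;
    int found = 0;

    if (!p) {
        return -1;
    }
    p += strlen("/devices/pci");

    /*
     * Skip the root bus name, e.g. "0000:00". Suppressed conversions are
     * not counted in the return value, so success is detected through %n.
     */
    if (sscanf(p, "%*x:%*x%n", &n) != 0 || n == 0) {
        return -1;
    }
    p += n;

    /* Consume "/DDDD:BB:SS.F" components (bridges, then the device). */
    while (sscanf(p, "/%x:%x:%x.%x%n",
                  &pci[0], &pci[1], &pci[2], &pci[3], &n) == 4) {
        out->domain = pci[0];
        out->bus = pci[1];
        out->slot = pci[2];
        out->function = pci[3];
        p += n;
        found = 1;
    }
    if (!found) {
        return -1;
    }

    out->prefix_len = (size_t)(p - syspath);
    return 0;
}
```

For the sda3 path quoted in the description, this sketch stops at 0000:04:05.0, and `prefix_len` covers `/sys/devices/pci0000:00/0000:00:1e.0/0000:03:01.0/0000:04:05.0`. That length plays the same role as the `(p + 12 + pcilen) - syspath` argument passed to get_pci_driver() in the fragment quoted above, so the "driver" lookup would land in the controller's directory instead of the bridge's.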