Bug 1724048
Summary: | Fail to migrate a rhel6.10-mt7.6 guest with dimm device
---|---
Product: | Red Hat Enterprise Linux 7
Reporter: | Yanqiu Zhang <yanqzhan>
Component: | qemu-kvm-rhev
Assignee: | Dr. David Alan Gilbert <dgilbert>
Status: | CLOSED ERRATA
QA Contact: | Li Xiaohui <xiaohli>
Severity: | unspecified
Docs Contact: |
Priority: | medium
Version: | 7.7
CC: | chayang, dgilbert, dyuan, fjin, juzhang, kraxel, lizhu, lmen, mrezanin, ngu, virt-maint, yafu, yanqzhan, yuhuang
Target Milestone: | rc
Target Release: | ---
Hardware: | x86_64
OS: | Linux
Whiteboard: |
Fixed In Version: | qemu-kvm-rhev-2.12.0-38.el7
Doc Type: | If docs needed, set a value
Doc Text: |
Story Points: | ---
Clone Of: |
: | 1757482 1757517
Environment: |
Last Closed: | 2020-03-31 14:34:48 UTC
Type: | Bug
Regression: | ---
Mount Type: | ---
Documentation: | ---
CRM: |
Verified Versions: |
Category: | ---
oVirt Team: | ---
RHEL 7.3 requirements from Atomic Host: |
Cloudforms Team: | ---
Target Upstream Version: |
Embargoed: |
Bug Depends On: |
Bug Blocks: | 1757482, 1757517
Attachments: |
Description
Yanqiu Zhang
2019-06-26 07:15:30 UTC
Created attachment 1584640 [details]
qemu_libvirtd_logs
This looks like some type of USB screwup; the source shows:

usb_generic_handle_packet: ctrl buffer too small (61440 > 4096)

and the migration error on the destination shows:

qemu-kvm: Failed to load usb-ptr:dev
error while loading state for instance 0x0 of device '0000:00:01.2/2/usb-ptr

although I'm not seeing what that's got to do with the DIMM.

It looks like upstream commit f30815390adb1ec153327c3832ab378e8bce9808 fixes the load side of the problem, so I suggest we need that. But it doesn't explain why the guest is screwing it up in the first place.

Bouncing this to Gerd for USB goodness.

(In reply to Dr. David Alan Gilbert from comment #4)
> This looks like some type of USB screwup; the source shows:
> usb_generic_handle_packet: ctrl buffer too small (61440 > 4096)
>
> and the migration error on the destination shows:
> qemu-kvm: Failed to load usb-ptr:dev
> error while loading state for instance 0x0 of device
> '0000:00:01.2/2/usb-ptr
>
> although I'm not seeing what that's got to do with the DIMM

Hmm, memory corruption? Or guest RAM not being migrated properly?

(The USB control structures that the ctrl buffer size comes from are in guest RAM.)

(In reply to Gerd Hoffmann from comment #6)
> (In reply to Dr. David Alan Gilbert from comment #4)
> > This looks like some type of USB screwup; the source shows:
> > usb_generic_handle_packet: ctrl buffer too small (61440 > 4096)
> >
> > and the migration error on the destination shows:
> > qemu-kvm: Failed to load usb-ptr:dev
> > error while loading state for instance 0x0 of device
> > '0000:00:01.2/2/usb-ptr
> >
> > although I'm not seeing what that's got to do with the DIMM
>
> Hmm, memory corruption? Or guest RAM not being migrated properly?
>
> (The USB control structures that the ctrl buffer size comes from are in
> guest RAM.)

Except the source is showing 'usb_generic_handle_packet: ctrl buffer too small (61440 > 4096)', which suggests it's already broken before the migration.
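The "ctrl buffer too small (61440 > 4096)" message is consistent with a control transfer whose wLength field decodes to 0xF000. The following is a minimal Python sketch of that kind of bounds check, not QEMU's code: the 4096-byte limit is taken from the log message itself, the SETUP packet layout is the standard 8-byte USB one, and the function names and both sample packets are hypothetical.

```python
import struct

# Assumption: the control-transfer data buffer holds 4096 bytes, matching the
# "> 4096" in the log message quoted in this report.
CTRL_BUF_SIZE = 4096


def setup_wlength(setup_packet: bytes) -> int:
    """Return wLength from an 8-byte USB SETUP packet (little-endian, bytes 6-7)."""
    if len(setup_packet) != 8:
        raise ValueError("a USB SETUP packet is exactly 8 bytes")
    # Layout: bmRequestType (u8), bRequest (u8), wValue, wIndex, wLength (u16 each)
    _req_type, _request, _value, _index, length = struct.unpack("<BBHHH", setup_packet)
    return length


def fits_ctrl_buffer(setup_packet: bytes, buf_size: int = CTRL_BUF_SIZE) -> bool:
    """True when the requested transfer length fits in the control buffer."""
    return setup_wlength(setup_packet) <= buf_size


# An ordinary GET_DESCRIPTOR(device) request asking for 64 bytes.
sane = bytes([0x80, 0x06, 0x00, 0x01, 0x00, 0x00, 0x40, 0x00])
# Hypothetical corrupted packet whose wLength bytes decode to 0xF000.
bogus = bytes([0x80, 0x06, 0x00, 0x01, 0x00, 0x00, 0x00, 0xF0])

for name, pkt in (("sane", sane), ("bogus", bogus)):
    print(name, setup_wlength(pkt), fits_ctrl_buffer(pkt))
```

Because the SETUP packet is read from guest RAM, a garbage wLength like this would point at the guest side (or at the memory the guest handed to the controller) rather than at the migration stream, which matches the observation that the message already appears on the source.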
Another issue appears to be caused by this bug. Could you please take a look? Thank you.

Description of problem:
For the rhel6.10-mt7.6 guest with a dimm device, if a usb keyboard is added, the guest OS in remote-viewer does not respond to keyboard input. The guest OS screen also goes black after ~11 minutes and can never be woken up again.

Version-Release number of selected component (if applicable):
libvirt-4.5.0-23.el7.x86_64
qemu-kvm-rhev-2.12.0-33.el7.x86_64
virt-viewer-5.0-15.el7.x86_64
Guest OS: kernel-2.6.32-754.el6.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Start a guest with the xml in comment 0 and a usb keyboard:
...
<input type='keyboard' bus='ps2'/>
<input type='keyboard' bus='usb'>
  <address type='usb' bus='1' port='1'/>
</input>
...
2. Connect to the graphics with remote-viewer:
# remote-viewer spice://hp-dl***:5900 --debug --spice-debug
3. Try to interact with the guest OS by keyboard in remote-viewer.

Actual results:
1. In step 3, there is no response when typing on the keyboard.
2. spice log and guest OS check:
(remote-viewer:7643): GSpice-DEBUG: 08:34:06.870: spice-widget.c:484 0:0 grab_broken (implicit: 1, keyboard: 0)
(remote-viewer:7643): GSpice-DEBUG: 08:34:06.870: spice-widget.c:486 0:0 grab_broken (SpiceDisplay::GdkWindow 0x558eec186640, event->grab_window: 0x558eec186640)

[root@localhost ~]# dmesg|grep error -i
usb 1-1: device descriptor read/64, error -32
usb 1-1: device descriptor read/64, error -32
usb 1-1: device descriptor read/64, error -32
usb 1-1: device descriptor read/64, error -32
usb 1-1: device not accepting address 4, error -32
usb 1-1: device not accepting address 5, error -32
usb 2-2: device descriptor read/64, error -32
usb 2-2: device descriptor read/64, error -32
usb 2-2: device descriptor read/64, error -32
usb 2-2: device descriptor read/64, error -32
usb 2-2: device not accepting address 4, error -32
usb 2-2: device descriptor read/8, error -32
usb 2-2: device descriptor read/8, error -32
usb 3-1: device descriptor read/64, error -32
usb 3-1: device descriptor read/64, error -32
usb 3-1: device descriptor read/64, error -32
usb 3-1: device descriptor read/64, error -32
usb 3-1: device not accepting address 4, error -32
usb 3-1: device descriptor read/8, error -32
usb 3-1: device descriptor read/8, error -61

[root@localhost ~]# dmesg|grep input -i
input: Power Button as /devices/LNXSYSTM:00/LNXPWRBN:00/input/input0
input: Macintosh mouse button emulation as /devices/virtual/input/input1
input: AT Translated Set 2 keyboard as /devices/platform/i8042/serio0/input/input2
input: ImExPS/2 Generic Explorer Mouse as /devices/platform/i8042/serio1/input/input3

3. After ~11 minutes, the guest OS screen goes black and can never be woken up again.

Expected results:

Additional info:
1. If either the dimm device or the usb keyboard is deleted, keyboard interaction works and the guest OS in virt-viewer stays accessible. If only a usb keyboard is used (ps2 keyboard deleted), the issue also reproduces.
2. If a rhel7.6 guest image is used instead, it also works.
3. The same issue occurs with vnc, so it is not related to the graphics type.
4. If the dimm device is deleted, check in the guest OS:

[root@localhost ~]# dmesg|grep error -i
[root@localhost ~]# dmesg|grep input -i
input: Power Button as /devices/LNXSYSTM:00/LNXPWRBN:00/input/input0
input: Macintosh mouse button emulation as /devices/virtual/input/input1
input: AT Translated Set 2 keyboard as /devices/platform/i8042/serio0/input/input2
input: QEMU QEMU USB Keyboard as /devices/pci0000:00/0000:00:09.7/usb1/1-1/1-1:1.0/input/input3
generic-usb 0003:0627:0001.0001: input,hidraw0: USB HID v1.11 Keyboard [QEMU QEMU USB Keyboard] on usb-0000:00:09.7-1/input0
input: ImExPS/2 Generic Explorer Mouse as /devices/platform/i8042/serio1/input/input4
input: QEMU QEMU USB Tablet as /devices/pci0000:00/0000:00:01.2/usb2/2-2/2-2:1.0/input/input5
generic-usb 0003:0627:0001.0002: input,hidraw1: USB HID v0.01 Mouse [QEMU QEMU USB Tablet] on usb-0000:00:01.2-2/input0

Created attachment 1586232 [details]
logs_for_comment8
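When comparing guest logs between runs (with and without the dimm device, or 6.9 versus 6.10), it can help to tally the USB enumeration failures per port instead of reading them by eye. The sketch below is only an illustration: the regular expression is fitted to the message shapes quoted in comment 8, the errno names assume Linux numbering (32 is EPIPE, which the USB stack commonly reports for a stalled endpoint; 61 is ENODATA), and the function name is hypothetical.

```python
import re
from collections import Counter

# Assumption: Linux errno numbering for the two codes seen in this report.
ERRNO_NAMES = {32: "EPIPE", 61: "ENODATA"}

# Matches lines such as "usb 1-1: device descriptor read/64, error -32".
USB_ERROR_RE = re.compile(r"usb (\d+-\d+): (.+?), error -(\d+)")


def summarize_usb_errors(dmesg_text: str) -> Counter:
    """Count USB error messages per (port, errno) pair."""
    counts = Counter()
    for match in USB_ERROR_RE.finditer(dmesg_text):
        port, _what, err = match.groups()
        counts[(port, int(err))] += 1
    return counts


# A short excerpt in the same shape as the dmesg output quoted above.
sample = """\
usb 1-1: device descriptor read/64, error -32
usb 1-1: device not accepting address 4, error -32
usb 3-1: device descriptor read/8, error -32
usb 3-1: device descriptor read/8, error -61
"""

summary = summarize_usb_errors(sample)
for (port, err), n in sorted(summary.items()):
    print(f"usb {port}: {n} x -{err} ({ERRNO_NAMES.get(err, 'unknown')})")
```

An empty result for a guest booted without the dimm device, as in item 4 of the additional info, would show up here as an empty counter.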
By comment 8 that suggests it's broken USB not related to migration directly; can you retest with 6.9 and see if it's a regression in 6.10?

Note this is actually a report against qemu-kvm-rhev; so flip the package.

Reproduced here. Noticed during bootup in guest:

usb 1-1: new high speed USB device number 2 using ehci_hcd
nommu_map_single: overflow 107c8ea40+8 of device mask ffffffff
(repeated)
usb 1-1: device descriptor read/all, error -32

so the guest USB isn't happy at boot. Guest kernel is 2.6.32-754.el6 from 6.10.

I'm also suspicious about whether the guest has actually seen the DIMM.

That nommu_map_single seems to be the same as https://bugzilla.redhat.com/show_bug.cgi?id=1449012, but note that our qemu *does* have -numa.

Also happens on 6.9.

It doesn't look like this is needed for RHEL7 qemu-kvm (non-rhev 1.5.3), since it doesn't seem to support hot-plug memory.

I can reproduce this bz on a rhel7.8 host (kernel-3.10.0-1101.el7.x86_64 & qemu-kvm-rhev-2.12.0-33.el7.x86_64):

1. Boot a guest with the following command line on the src host:

/usr/libexec/qemu-kvm -M pc-i440fx-rhel7.6.0 \
-cpu SandyBridge \
-enable-kvm \
-m 3G,maxmem=8G,slots=8 \
-object memory-backend-file,mem-path=/dev/hugepages,size=268435456,id=mem0 \
-device pc-dimm,id=dimm0,memdev=mem0,node=0,slot=0 \
-smp 8 \
-nodefaults \
-rtc base=utc,clock=host,driftfix=slew \
-device virtio-scsi-pci,id=scsi0 \
-drive file=rhel6-10-scsi.qcow2,format=qcow2,if=none,id=drive-scsi0-0-0-0,media=disk,cache=none,werror=stop,rerror=stop \
-device scsi-hd,bus=scsi0.0,drive=drive-scsi0-0-0-0,id=scsi0-0-0-0 \
-device virtio-net-pci,mac=6c:0b:84:a4:53:4e,id=netdev1,vectors=4,netdev=net1 \
-netdev tap,id=net1,vhost=on \
-device ich9-usb-ehci1,id=usb1,bus=pci.0,addr=0x9.0x7 \
-device ich9-usb-uhci1,masterbus=usb1.0,firstport=0,bus=pci.0,multifunction=on,addr=0x9 \
-device ich9-usb-uhci2,masterbus=usb1.0,firstport=2,bus=pci.0,addr=0x9.0x1 \
-device ich9-usb-uhci3,masterbus=usb1.0,firstport=4,bus=pci.0,addr=0x9.0x2 \
-device usb-kbd,id=input3,bus=usb1.0,port=1 \
-vnc :3 \
-qmp tcp:0:1234,server,nowait \
-monitor stdio \
-vga qxl \
-boot menu=on

Notes:
(1) the guest is rhel6.10;
(2) mem must be less than 4G, and a dimm device must be used;
(3) use the ich9 ehci/uhci controllers and attach one usb device under the controller.

After step 1, hmp prints:

(qemu) usb_generic_handle_packet: ctrl buffer too small (61440 > 4096)
usb_generic_handle_packet: ctrl buffer too small (61440 > 4096)
...

2. Boot a guest with "-incoming tcp:0:4444" on the dst host, and migrate the guest from the src host to the dst host. Migration fails on the dst host:

(qemu) qemu-kvm: Failed to load usb-kbd:dev
qemu-kvm: error while loading state for instance 0x0 of device '0000:00:09.7/1/usb-kbd'
qemu-kvm: load of migration failed: Invalid argument

Verified this bz on the same hosts (but with qemu-kvm-rhev-2.12.0-38.el7.x86_64): the src hmp still prints the message, but migration finishes successfully. Note that the usb device under the controller still doesn't work. Given the above test results, this bz can be verified considering only the migration part.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2020:1216
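The guest message "nommu_map_single: overflow 107c8ea40+8 of device mask ffffffff" quoted earlier in this report can be checked by hand: the buffer address lies above 4 GiB while the device mask only covers 32 bits. The Python sketch below works through that arithmetic. It is a simplified model, not the kernel's implementation; it assumes the address and mask are printed in hexadecimal and the size in decimal, and the function names are hypothetical.

```python
import re

# Assumption: address and mask are hex, the size after "+" is decimal.
OVERFLOW_RE = re.compile(r"overflow ([0-9a-f]+)\+(\d+) of device mask ([0-9a-f]+)")


def parse_overflow(line: str) -> tuple:
    """Extract (address, size, dma_mask) from a nommu_map_single overflow line."""
    match = OVERFLOW_RE.search(line)
    if match is None:
        raise ValueError("not an overflow message")
    return int(match.group(1), 16), int(match.group(2)), int(match.group(3), 16)


def dma_overflows(addr: int, size: int, dma_mask: int) -> bool:
    """True when the last byte of [addr, addr + size) lies above dma_mask."""
    return addr + size - 1 > dma_mask


LINE = "nommu_map_single: overflow 107c8ea40+8 of device mask ffffffff"
addr, size, mask = parse_overflow(LINE)

print(hex(addr), size, hex(mask), dma_overflows(addr, size, mask))
# Distance of the buffer above the 4 GiB boundary, in bytes.
print(addr - (1 << 32))
```

One plausible reading, offered here as an assumption rather than a conclusion of this report, is that the buffer sits in memory placed above 4 GiB while the USB controller in the guest is limited to 32-bit DMA. That would be consistent with the reproduction note that the base memory must be below 4G and a dimm device must be present.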