Bug 1724048 - Fail to migrate a rhel6.10-mt7.6 guest with dimm device
Summary: Fail to migrate a rhel6.10-mt7.6 guest with dimm device
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: qemu-kvm-rhev
Version: 7.7
Hardware: x86_64
OS: Linux
Priority: medium
Severity: unspecified
Target Milestone: rc
Target Release: ---
Assignee: Dr. David Alan Gilbert
QA Contact: Li Xiaohui
URL:
Whiteboard:
Depends On:
Blocks: 1757482 1757517
Reported: 2019-06-26 07:15 UTC by Yanqiu Zhang
Modified: 2020-03-31 14:37 UTC
CC List: 14 users

Fixed In Version: qemu-kvm-rhev-2.12.0-38.el7
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Clones: 1757482 1757517
Environment:
Last Closed: 2020-03-31 14:34:48 UTC
Target Upstream Version:
Embargoed:


Attachments
qemu_libvirtd_logs (90.78 KB, application/gzip), 2019-06-26 07:34 UTC, Yanqiu Zhang
logs_for_comment8 (60.38 KB, application/gzip), 2019-07-01 10:50 UTC, Yanqiu Zhang


Links
System                 | ID             | Private | Priority | Status | Summary | Last Updated
Red Hat Product Errata | RHSA-2020:1216 | 0       | None     | None   | None    | 2020-03-31 14:37:02 UTC

Description Yanqiu Zhang 2019-06-26 07:15:30 UTC
Description of problem:
On a rhel7.7 host, start a guest (rhel6.10 guest os,
'pc-i440fx-rhel7.6.0' machine type) with a dimm device, then try to
migrate it to another rhel7.7 host. The migration fails.

Version-Release number of selected component (if applicable):
libvirt-4.5.0-23.virtcov.el7.x86_64
qemu-kvm-rhev-2.12.0-33.el7.x86_64
Guest os: kernel-2.6.32-754.el6.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Start a guest with a rhel6.10 image and the following xml on a rhel7.7 host:
<domain type='kvm'>
  <name>rhel6.10-mt7.6</name>
  <uuid>df899f5c-db94-48b2-867a-e0c266b59b7b</uuid>
  <maxMemory slots='8' unit='KiB'>4194304</maxMemory>
  <memory unit='KiB'>1048576</memory>
  <currentMemory unit='KiB'>1048576</currentMemory>
...
  <os>
    <type arch='x86_64' machine='pc-i440fx-rhel7.6.0'>hvm</type>
...
    <numa>
      <cell id='0' cpus='0-7' memory='524288' unit='KiB' memAccess='shared'>
        <distances>
          <sibling id='0' value='10'/>
          <sibling id='1' value='19'/>
        </distances>
      </cell>
      <cell id='1' cpus='8-15' memory='524288' unit='KiB' discard='yes'>
        <distances>
          <sibling id='0' value='19'/>
          <sibling id='1' value='10'/>
        </distances>
      </cell>
    </numa>
  </cpu>
...
    <memory model='dimm' access='private' discard='no'>
      <source>
        <nodemask>0</nodemask>
        <pagesize unit='KiB'>2048</pagesize>
      </source>
      <target>
        <size unit='KiB'>262144</size>
        <node>0</node>
      </target>
      <address type='dimm' slot='0' base='0x100000000'/>
    </memory>
...

2. Try to migrate to another rhel7.7 host:
#  virsh migrate rhel6.10-mt7.6 qemu+ssh://10.73.*.*/system --verbose --live
root.*.*'s password:
Migration: [100 %]error: internal error: qemu unexpectedly closed the
monitor: 2019-06-26T05:43:50.994750Z qemu-kvm: Failed to load
usb-ptr:dev
2019-06-26T05:43:50.994794Z qemu-kvm: error while loading state for
instance 0x0 of device '0000:00:01.2/2/usb-ptr'
2019-06-26T05:43:50.997735Z qemu-kvm: load of migration failed: Invalid argument

Actual results:
As in step 2, migrating a rhel6.10 guest with the rhel7.6 i440fx machine
type and a dimm device between rhel7.7 hosts fails.

Expected results:
Migration should succeed.

Additional info:
1. The same guest can be started on the target host.
2. With a rhel7.6 guest image, migration succeeds.
3. If the dimm device is removed, migration succeeds.

Comment 2 Yanqiu Zhang 2019-06-26 07:34:00 UTC
Created attachment 1584640 [details]
qemu_libvirtd_logs

Comment 4 Dr. David Alan Gilbert 2019-06-26 12:45:28 UTC
This looks like some type of USB screwup; the source shows:
   usb_generic_handle_packet: ctrl buffer too small (61440 > 4096)

and the migration error on the destination shows:
   qemu-kvm: Failed to load usb-ptr:dev
   error while loading state for instance 0x0 of device '0000:00:01.2/2/usb-ptr

although I'm not seeing what that's got to do with the DIMM

Comment 5 Dr. David Alan Gilbert 2019-06-26 12:48:59 UTC
It looks like:
   f30815390adb1ec153327c3832ab378e8bce9808  upstream fixes the load side of the problem - so I suggest we need that.
But it doesn't explain why the guest is screwing it up in the first place.

Bouncing this to Gerd for USB goodness.

Comment 6 Gerd Hoffmann 2019-06-26 14:02:01 UTC
(In reply to Dr. David Alan Gilbert from comment #4)
> This looks like some type of USB screwup; the source shows:
>    usb_generic_handle_packet: ctrl buffer too small (61440 > 4096)
> 
> and the migration error on the destination shows:
>    qemu-kvm: Failed to load usb-ptr:dev
>    error while loading state for instance 0x0 of device
> '0000:00:01.2/2/usb-ptr
> 
> although I'm not seeing what that's got to do with the DIMM

Hmm, memory corruption?  Or guest ram not being migrated properly?

(the usb control structures where the ctrl buffer size comes from are in guest ram).

Comment 7 Dr. David Alan Gilbert 2019-06-26 16:35:52 UTC
(In reply to Gerd Hoffmann from comment #6)
> (In reply to Dr. David Alan Gilbert from comment #4)
> > This looks like some type of USB screwup; the source shows:
> >    usb_generic_handle_packet: ctrl buffer too small (61440 > 4096)
> > 
> > and the migration error on the destination shows:
> >    qemu-kvm: Failed to load usb-ptr:dev
> >    error while loading state for instance 0x0 of device
> > '0000:00:01.2/2/usb-ptr
> > 
> > although I'm not seeing what that's got to do with the DIMM
> 
> Hmm, memory corruption?  Or guest ram not being migrated properly?
> 
> (the usb control structures where the ctrl buffer size comes from are in
> guest ram).

Except the source is showing 'usb_generic_handle_packet: ctrl buffer too small (61440 > 4096)'
so it suggests it's already broken before the migration.

Comment 8 Yanqiu Zhang 2019-07-01 10:41:03 UTC
Another issue appears to be related to this bug. Could you please take a look? Thank you.

Description of problem:
For the rhel6.10-mt7.6 guest with a dimm device, if a usb keyboard is added, the guest os in remote-viewer does not respond to keyboard input. After ~11 minutes the guest os screen goes black and cannot be woken up again.


Version-Release number of selected component (if applicable):
libvirt-4.5.0-23.el7.x86_64
qemu-kvm-rhev-2.12.0-33.el7.x86_64
virt-viewer-5.0-15.el7.x86_64
Guest os: kernel-2.6.32-754.el6.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Start a guest with the xml from comment 0 plus a usb keyboard:
...
    <input type='keyboard' bus='ps2'/>
    <input type='keyboard' bus='usb'>
      <address type='usb' bus='1' port='1'/>
    </input>
...

2. Connect graphics by remote-viewer.
# remote-viewer spice://hp-dl***:5900  --debug --spice-debug

3. Try to interact with the guest os by keyboard in remote-viewer.

Actual results:
1. In step 3, there is no response when typing on the keyboard.
2. Spice log and guest os check:
(remote-viewer:7643): GSpice-DEBUG: 08:34:06.870: spice-widget.c:484 0:0 grab_broken (implicit: 1, keyboard: 0)
(remote-viewer:7643): GSpice-DEBUG: 08:34:06.870: spice-widget.c:486 0:0 grab_broken (SpiceDisplay::GdkWindow 0x558eec186640, event->grab_window: 0x558eec186640)
[root@localhost ~]# dmesg|grep error -i
usb 1-1: device descriptor read/64, error -32
usb 1-1: device descriptor read/64, error -32
usb 1-1: device descriptor read/64, error -32
usb 1-1: device descriptor read/64, error -32
usb 1-1: device not accepting address 4, error -32
usb 1-1: device not accepting address 5, error -32
usb 2-2: device descriptor read/64, error -32
usb 2-2: device descriptor read/64, error -32
usb 2-2: device descriptor read/64, error -32
usb 2-2: device descriptor read/64, error -32
usb 2-2: device not accepting address 4, error -32
usb 2-2: device descriptor read/8, error -32
usb 2-2: device descriptor read/8, error -32
usb 3-1: device descriptor read/64, error -32
usb 3-1: device descriptor read/64, error -32
usb 3-1: device descriptor read/64, error -32
usb 3-1: device descriptor read/64, error -32
usb 3-1: device not accepting address 4, error -32
usb 3-1: device descriptor read/8, error -32
usb 3-1: device descriptor read/8, error -61
[root@localhost ~]# dmesg|grep input -i
input: Power Button as /devices/LNXSYSTM:00/LNXPWRBN:00/input/input0
input: Macintosh mouse button emulation as /devices/virtual/input/input1
input: AT Translated Set 2 keyboard as /devices/platform/i8042/serio0/input/input2
input: ImExPS/2 Generic Explorer Mouse as /devices/platform/i8042/serio1/input/input3
3. After ~11 minutes, the guest os screen goes black and cannot be woken up again.


Expected results:


Additional info:
1. If either the dimm device or the usb keyboard is removed, keyboard interaction works and the guest os in virt-viewer stays accessible.
   With only a usb keyboard (ps2 kbd removed), the issue also reproduces.
2. With a rhel7.6 guest image, it also works.
3. The same issue occurs with vnc, so it is not related to the graphics type.
4. With the dimm device removed, check in the guest os:
[root@localhost ~]# dmesg|grep error -i
[root@localhost ~]# dmesg|grep input -i
input: Power Button as /devices/LNXSYSTM:00/LNXPWRBN:00/input/input0
input: Macintosh mouse button emulation as /devices/virtual/input/input1
input: AT Translated Set 2 keyboard as /devices/platform/i8042/serio0/input/input2
input: QEMU QEMU USB Keyboard as /devices/pci0000:00/0000:00:09.7/usb1/1-1/1-1:1.0/input/input3
generic-usb 0003:0627:0001.0001: input,hidraw0: USB HID v1.11 Keyboard [QEMU QEMU USB Keyboard] on usb-0000:00:09.7-1/input0
input: ImExPS/2 Generic Explorer Mouse as /devices/platform/i8042/serio1/input/input4
input: QEMU QEMU USB Tablet as /devices/pci0000:00/0000:00:01.2/usb2/2-2/2-2:1.0/input/input5
generic-usb 0003:0627:0001.0002: input,hidraw1: USB HID v0.01 Mouse [QEMU QEMU USB Tablet] on usb-0000:00:01.2-2/input0

Comment 9 Yanqiu Zhang 2019-07-01 10:50:32 UTC
Created attachment 1586232 [details]
logs_for_comment8

Comment 10 Dr. David Alan Gilbert 2019-09-26 14:09:15 UTC
Comment 8 suggests this is broken USB, not directly related to migration; can you retest with 6.9 and see if it's a regression in 6.10?

Comment 11 Dr. David Alan Gilbert 2019-10-01 09:50:39 UTC
Note this is actually a report against qemu-kvm-rhev; so flip the package

Comment 12 Dr. David Alan Gilbert 2019-10-01 11:33:56 UTC
Reproduced here.
Noticed during bootup in guest:

usb 1-1: new high speed USB device number 2 using ehci_hcd
nommu_map_single: overflow 107c8ea40+8 of device mask ffffffff
(repeated)
usb 1-1: device descriptor read/all, error -32

so the guest USB isn't happy at boot.
Guest kernel 2.6.32-754.el6 from 6.10
I'm also suspicious if the guest has actually seen the DIMM.
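
The overflow message can be checked against the DIMM placement given in the description. The following is a minimal sketch in Python, assuming the guest reproduced in this comment uses the same DIMM layout as the domain xml in the description (base 0x100000000, size 262144 KiB). The helper names `exceeds_mask` and `in_dimm` are illustrative only and are not taken from the kernel or QEMU sources.

```python
# Values copied from this report. Assumption: the reproducer in comment 12
# uses the DIMM layout from the description's domain xml.
DIMM_BASE = 0x100000000      # <address type='dimm' slot='0' base='0x100000000'/>
DIMM_SIZE = 262144 * 1024    # <size unit='KiB'>262144</size>, in bytes
DMA_ADDR = 0x107c8ea40       # from "nommu_map_single: overflow 107c8ea40+8"
DMA_LEN = 8                  # the "+8" in the same message
DMA_MASK = 0xffffffff        # from "of device mask ffffffff"


def exceeds_mask(addr, length, mask):
    """Return True if the last byte of [addr, addr+length) is above the mask."""
    return addr + length - 1 > mask


def in_dimm(addr, length, base=DIMM_BASE, size=DIMM_SIZE):
    """Return True if [addr, addr+length) lies entirely inside the DIMM range."""
    return base <= addr and addr + length <= base + size


if __name__ == "__main__":
    print("DIMM range: %#x - %#x" % (DIMM_BASE, DIMM_BASE + DIMM_SIZE - 1))
    print("buffer above 32-bit mask:", exceeds_mask(DMA_ADDR, DMA_LEN, DMA_MASK))
    print("buffer inside DIMM range:", in_dimm(DMA_ADDR, DMA_LEN))
```

Under that assumption, the 8-byte buffer named in the message falls inside the DIMM's address range above 4 GiB, which a device restricted to a 32-bit DMA mask cannot address directly.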

Comment 13 Dr. David Alan Gilbert 2019-10-01 11:44:51 UTC
That nommu_map_single seems to be the same as:
   https://bugzilla.redhat.com/show_bug.cgi?id=1449012
but note that our qemu *does* have -numa

Comment 14 Dr. David Alan Gilbert 2019-10-01 13:27:11 UTC
also happens on 6.9

Comment 16 Dr. David Alan Gilbert 2019-10-02 09:20:27 UTC
It doesn't look like this is needed for RHEL7 qemu-kvm (non-rhev 1.5.3) since it doesn't seem to support hot-plug memory.

Comment 20 Li Xiaohui 2019-10-28 11:39:47 UTC
I can reproduce this bz on a rhel7.8 host (kernel-3.10.0-1101.el7.x86_64 & qemu-kvm-rhev-2.12.0-33.el7.x86_64):
1. Boot a guest with the following command line on the src host:
/usr/libexec/qemu-kvm -M pc-i440fx-rhel7.6.0 \
-cpu SandyBridge \
-enable-kvm \
-m 3G,maxmem=8G,slots=8 \
-object memory-backend-file,mem-path=/dev/hugepages,size=268435456,id=mem0 \
-device pc-dimm,id=dimm0,memdev=mem0,node=0,slot=0 \
-smp 8 \
-nodefaults \
-rtc base=utc,clock=host,driftfix=slew \
-device virtio-scsi-pci,id=scsi0 \
-drive file=rhel6-10-scsi.qcow2,format=qcow2,if=none,id=drive-scsi0-0-0-0,media=disk,cache=none,werror=stop,rerror=stop \
-device scsi-hd,bus=scsi0.0,drive=drive-scsi0-0-0-0,id=scsi0-0-0-0 \
-device virtio-net-pci,mac=6c:0b:84:a4:53:4e,id=netdev1,vectors=4,netdev=net1 -netdev tap,id=net1,vhost=on \
-device ich9-usb-ehci1,id=usb1,bus=pci.0,addr=0x9.0x7 \
-device ich9-usb-uhci1,masterbus=usb1.0,firstport=0,bus=pci.0,multifunction=on,addr=0x9 \
-device ich9-usb-uhci2,masterbus=usb1.0,firstport=2,bus=pci.0,addr=0x9.0x1 \
-device ich9-usb-uhci3,masterbus=usb1.0,firstport=4,bus=pci.0,addr=0x9.0x2 \
-device usb-kbd,id=input3,bus=usb1.0,port=1 \
-vnc :3 \
-qmp tcp:0:1234,server,nowait \
-monitor stdio \
-vga qxl \
-boot menu=on
Notes:
(1) guest is rhel6.10;
(2) mem must be less than 4G, with a dimm device;
(3) use the ich9-ehci-uhci controller and attach one usb device under the controller.

After step 1, hmp prints:
(qemu) usb_generic_handle_packet: ctrl buffer too small (61440 > 4096)
usb_generic_handle_packet: ctrl buffer too small (61440 > 4096)
...
2. Boot a guest with "-incoming tcp:0:4444" on the dst host, and migrate the guest from the src to the dst host. Migration fails on the dst host:
(qemu) qemu-kvm: Failed to load usb-kbd:dev
qemu-kvm: error while loading state for instance 0x0 of device '0000:00:09.7/1/usb-kbd'
qemu-kvm: load of migration failed: Invalid argument


Verified this bz on the same hosts (but with qemu-kvm-rhev-2.12.0-38.el7.x86_64): the src hmp still prints the "ctrl buffer too small" messages, but migration finishes successfully. Note that the usb device under the controller still doesn't work.

Based on the above test results, this bz can be marked verified, considering only the migration part.

Comment 27 errata-xmlrpc 2020-03-31 14:34:48 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2020:1216

