Bug 1459831
Summary: | Migration fails with --rdma-pin-all option
---|---
Product: | Red Hat Enterprise Linux 7
Reporter: | Dan Zheng <dzheng>
Component: | qemu-kvm-rhev
Assignee: | Dr. David Alan Gilbert <dgilbert>
Status: | CLOSED NOTABUG
QA Contact: | xianwang <xianwang>
Severity: | medium
Priority: | high
Version: | 7.4
CC: | chayang, dgilbert, dzheng, fjin, juzhang, knoel, michen, qzhang, virt-maint, xianwang, yafu, yanqzhan, zpeng
Target Milestone: | rc
Keywords: | Regression
Hardware: | x86_64
OS: | Linux
Last Closed: | 2017-06-09 09:51:45 UTC
Type: | Bug

Attachments:
Created attachment 1286097
remote host libvirtd

Created attachment 1286098
remote qemu log

Created attachment 1286100
local host qemu

Created attachment 1286101
guest xml

This is a regression problem which does not exist in RHEL 7.3.

Hi Dan,
Can you tell me how much RAM your destination host has and whether it's running any other VMs? Does increasing the 'hard_limit' value in the XML help?

OK. I have to apologize; this is a user error due to a hard_limit that is too small. With the section below, the migration succeeds.

  <memory unit='KiB'>1048576</memory>
  <currentMemory unit='KiB'>1048576</currentMemory>
  <memtune>
    <hard_limit unit='KiB'>3145728</hard_limit>
    <swap_hard_limit unit='KiB'>4194304</swap_hard_limit>
  </memtune>

# virsh migrate --live --migrateuri rdma://192.168.100.2 setusertest --listen-address 0 qemu+ssh://192.168.100.2/system --verbose --rdma-pin-all
root.100.2's password:
Migration: [100 %]

Hi Dan,
Thanks for trying that; can you tell me whether on 7.3 the original hard_limit value worked?

Hi David,
Currently I have no RDMA machines with RHEL 7.3 for testing. I would like to try once I get them ready.

Hi, David,
After confirmation, the configuration in the original XML, shown below, works in RHEL 7.3. Thanks.

  <memory unit='KiB'>2097152</memory>
  <currentMemory unit='KiB'>2097152</currentMemory>
  <memtune>
    <hard_limit unit='KiB'>2097152</hard_limit>
    <swap_hard_limit unit='KiB'>2097152</swap_hard_limit>
  </memtune>

(In reply to Dan Zheng from comment #12)
> Hi, David,
>
> After confirmation, the configuration in the original XML, shown below,
> works in RHEL 7.3. Thanks.
>
> <memory unit='KiB'>2097152</memory>
> <currentMemory unit='KiB'>2097152</currentMemory>
> <memtune>
> <hard_limit unit='KiB'>2097152</hard_limit>
> <swap_hard_limit unit='KiB'>2097152</swap_hard_limit>
> </memtune>

Hi Dan,
Can you try and figure out what component causes it to change; for example, for me with a 7.4 install and a 7.3 qemu it still fails. So can you try with a 7.3 kernel and 7.4 qemu etc. and see which component it is that causes the change. Thanks.

Hi David,
This is a known change: the hard_limit setting differs between RHEL 7.3 and RHEL 7.4. On RHEL 7.4 the memory hard_limit needs to be about 2G larger than the memory, in our testing experience. There are bugs related to this issue, such as BZ1373783. Btw, it's not easy for us to prepare the specific machines and set up the environment.

For more details on the changes, please confirm with the QEMU QEs; that should be a faster way.

Thanks.

(In reply to yanqzhan from comment #14)
> Hi David,
>
> This is a known change: the hard_limit setting differs between RHEL 7.3 and
> RHEL 7.4. On RHEL 7.4 the memory hard_limit needs to be about 2G larger than
> the memory, in our testing experience. There are bugs related to this issue,
> such as BZ1373783. Btw, it's not easy for us to prepare the specific
> machines and set up the environment.
>
> For more details on the changes, please confirm with the QEMU QEs; that
> should be a faster way.

I can't find any more details about it; if it has doubled we need to understand why. bz 1373783 is just a documentation bug, it doesn't help - can you please provide some more information about what is known here.

>
> Thanks.

Sorry, there was a misunderstanding in our group discussion. The larger hard limit requirement already exists in previous products; refer to BZ1160997 and BZ1046833. Maybe the XML configuration in comment 12 (in which memory equals the hard limit) needs further confirmation.

OK, that's fine - I'm only worried if it's a regression where the amount needed suddenly increases a lot.

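For anyone comparing the two memtune configurations quoted in the comments above, here is a minimal Python sketch that reports how much headroom hard_limit leaves over the guest memory size. The function name, the unit table, and the `<domain>` wrapper around the fragments are assumptions made for illustration. The thread does not establish how much headroom is actually required, so the sketch only computes the difference and does not judge it.

```python
import xml.etree.ElementTree as ET

# Multipliers to KiB for the unit spellings handled here (assumption: other
# libvirt unit names are out of scope for this sketch).
_TO_KIB = {"KiB": 1, "MiB": 1024, "GiB": 1024 * 1024}


def _kib(elem):
    """Return the element's value in KiB, treating a missing unit as KiB."""
    return int(elem.text.strip()) * _TO_KIB[elem.get("unit", "KiB")]


def hard_limit_headroom_kib(domain_xml):
    """Return hard_limit minus memory in KiB, or None if no hard_limit is set."""
    root = ET.fromstring(domain_xml)
    memory = root.find("memory")
    hard_limit = root.find("memtune/hard_limit")
    if memory is None:
        raise ValueError("domain XML has no <memory> element")
    if hard_limit is None:
        return None
    return _kib(hard_limit) - _kib(memory)


# Values copied from the comments; the <domain> wrapper is added here only so
# that the fragments parse as a single document.
ORIGINAL = """<domain>
  <memory unit='KiB'>2097152</memory>
  <memtune>
    <hard_limit unit='KiB'>2097152</hard_limit>
  </memtune>
</domain>"""

ADJUSTED = """<domain>
  <memory unit='KiB'>1048576</memory>
  <memtune>
    <hard_limit unit='KiB'>3145728</hard_limit>
  </memtune>
</domain>"""

if __name__ == "__main__":
    for name, xml_text in (("original", ORIGINAL), ("adjusted", ADJUSTED)):
        print(name, hard_limit_headroom_kib(xml_text), "KiB of headroom")
```

The original configuration leaves no headroom at all, while the adjusted one that migrated successfully leaves 2097152 KiB (2 GiB).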
Created attachment 1286096
source host libvirtd

Description of problem:
Migration fails with --rdma-pin-all option.

Version-Release number of selected component (if applicable):
qemu-kvm-rhev-2.9.0-8.el7.x86_64
libvirt-3.2.0-7.el7.x86_64
3.10.0-679.el7.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Start guest and do migration without --rdma-pin-all. Migration succeeds.
2. Start guest and do migration with --rdma-pin-all.

# virsh migrate --live --migrateuri rdma://192.168.0.2 setusertest --listen-address 0 qemu+ssh://192.168.0.2/system --verbose --rdma-pin-all
error: internal error: qemu unexpectedly closed the monitor: 2017-06-08T09:56:14.612417Z qemu-kvm: -chardev pty,id=charserial0: char device redirected to /dev/pts/3 (label charserial0)
dest_init RDMA Device opened: kernel name mlx4_0 uverbs device name uverbs0, infiniband_verbs class device path /sys/class/infiniband_verbs/uverbs0, infiniband class device path /sys/class/infiniband/mlx4_0, transport: (2) Ethernet
Failed to register local dest ram block! : Cannot allocate memory
2017-06-08T09:56:14.737291Z qemu-kvm: rdma migration: error dest registering ram blocks
2017-06-08T09:56:14.737301Z qemu-kvm: error while loading state for instance 0x0 of device 'ram'
2017-06-08T09:56:14.737461Z qemu-kvm: Early error. Sending error.
2017-06-08T09:56:40.666831Z qemu-kvm: load of migration failed: Operation not permitted

Actual results:
See above

Expected results:
Migration with --rdma-pin-all is successful.

Additional info:
Logs are attached.
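
The "Failed to register local dest ram block! : Cannot allocate memory" line in the log above is the kind of error one would expect when the destination qemu process is not allowed to lock enough memory. My understanding, which is not stated anywhere in this report, is that libvirt derives the process's locked-memory limit from hard_limit when one is configured. Under that assumption, a rough Python sketch for inspecting the limit of a running qemu process could look like this. The function names and the sample text are hypothetical; only the "Max locked memory" row of /proc/<pid>/limits is relied on.

```python
def max_locked_memory(limits_text):
    """Return the (soft, hard) 'Max locked memory' limits in bytes.

    `limits_text` is the content of /proc/<pid>/limits. A value of None
    means the kernel reports that limit as 'unlimited'.
    """
    prefix = "Max locked memory"
    for line in limits_text.splitlines():
        if line.startswith(prefix):
            soft, hard = line[len(prefix):].split()[:2]
            return tuple(None if v == "unlimited" else int(v) for v in (soft, hard))
    raise ValueError("no 'Max locked memory' row found")


def can_pin(limits_text, required_bytes):
    """Return True if the soft locked-memory limit covers `required_bytes`."""
    soft, _hard = max_locked_memory(limits_text)
    return soft is None or soft >= required_bytes


# Hypothetical sample in the layout of /proc/<pid>/limits; the numbers are
# illustrative and were not taken from the hosts in this report.
SAMPLE = """Limit                     Soft Limit           Hard Limit           Units
Max cpu time              unlimited            unlimited            seconds
Max locked memory         2147483648           2147483648           bytes
"""

if __name__ == "__main__":
    print(max_locked_memory(SAMPLE))
    print(can_pin(SAMPLE, 2 * 1024**3))
```

On a real host the text would come from reading /proc/<pid>/limits for the destination qemu-kvm process. Note that `required_bytes` would need to cover everything qemu pins, which may be more than the guest's configured memory; this report does not say how much more.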