Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

Bug 1356959

Summary: qemu-kvm segmentation fault/hang when migrating with RDMA on mlx5 card
Product: Red Hat Enterprise Linux 7
Reporter: mazhang <mazhang>
Component: libmlx5
Assignee: Jarod Wilson <jarod>
Status: CLOSED NEXTRELEASE
QA Contact:
Severity: high
Docs Contact:
Priority: high
Version: 7.3
CC: amit.shah, chayang, ddutile, dgibson, dgilbert, dzheng, gsun, jarod, juzhang, jwilson, knoel, mdeng, michen, qizhu, quintela, qzhang, rdma-dev-team, thuth, virt-maint
Target Milestone: rc
Target Release: ---
Hardware: ppc64le
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2017-06-05 16:07:11 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 1364525    
Bug Blocks: 1288337    

Comment 4 mazhang 2016-07-21 06:16:33 UTC
The x86 platform doesn't hit this problem.

Host:
3.10.0-473.el7.x86_64
qemu-kvm-rhev-2.6.0-11.el7.x86_64
rdma-7.2_4.1_rc6-1.el7.noarch
opensm-3.3.19-1.el7.x86_64

Guest:
3.10.0-456.el7.x86_64

Comment 6 Dr. David Alan Gilbert 2016-08-05 09:25:57 UTC
Hi,
  Can you please give the exact command line you used for qemu? The Polarion text only has the x86 command line.

Comment 7 Dr. David Alan Gilbert 2016-08-05 10:45:12 UTC
I can reproduce the segfault using the 1.0.2-1 mlx5 driver you have, but it disappears with the 1.2.1 package; I suspect because of this fix:
    http://www.spinics.net/lists/linux-rdma/msg31163.html

However, it still hangs for me even with that version, so I'm investigating.

Comment 8 Dr. David Alan Gilbert 2016-08-05 15:35:52 UTC
The hang is another libmlx5 bug - I've filed it and made this bug dependent on it; libmlx5 returns the wrong error code when the buffers fill up.
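
For context, a hedged, standalone model of the `ibv_poll_cq()` return convention the caller relies on (a sketch with a fake in-memory CQ, not the real verbs API; `fake_poll_cq` and its types are invented for illustration):

```c
#include <assert.h>
#include <stddef.h>

/* ibv_poll_cq()-style contract that QEMU's RDMA migration loop depends on:
 *   > 0  -> that many completions were drained into wc[]
 *   == 0 -> the completion queue is currently empty; the caller retries
 *   < 0  -> a hard polling error; the caller aborts the migration
 * The hang described here occurred because libmlx5 violated this
 * convention when its internal buffers filled up. */

struct fake_wc { int status; };

struct fake_cq {
    struct fake_wc entries[8];
    size_t head, tail;   /* ring-buffer indices into entries[] */
};

static int fake_poll_cq(struct fake_cq *cq, int num, struct fake_wc *wc)
{
    int n = 0;
    /* Drain up to num completions; never return a negative value
     * merely because the queue is full or empty. */
    while (n < num && cq->head != cq->tail) {
        wc[n++] = cq->entries[cq->head];
        cq->head = (cq->head + 1) % 8;
    }
    return n;
}
```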

Comment 9 mazhang 2016-08-08 08:40:16 UTC
(In reply to Dr. David Alan Gilbert from comment #6)
> Hi,
>   Can you please give the exact command line you used for qemu; the polarion
> text only has the x86 command line.

Hi, David

Unfortunately, the host has been released and I don't have the exact command line, but I remember the command line was copied from Avocado.

So it should be:

/usr/libexec/qemu-kvm \
    -S  \
    -name 'avocado-vt-vm1'  \
    -sandbox off  \
    -machine pseries  \
    -nodefaults  \
    -vga std  \
    -chardev socket,id=qmp_id_qmpmonitor1,path=/var/tmp/avocado_KYPkAI/monitor-qmpmonitor1-20160807-233940-3m86IXrE,server,nowait \
    -mon chardev=qmp_id_qmpmonitor1,mode=control  \
    -chardev socket,id=qmp_id_catch_monitor,path=/var/tmp/avocado_KYPkAI/monitor-catch_monitor-20160807-233940-3m86IXrE,server,nowait \
    -mon chardev=qmp_id_catch_monitor,mode=control  \
    -chardev socket,id=serial_id_serial0,path=/var/tmp/avocado_KYPkAI/serial-serial0-20160807-233940-3m86IXrE,server,nowait \
    -device spapr-vty,reg=0x30000000,chardev=serial_id_serial0 \
    -device pci-ohci,id=usb1,bus=pci.0,addr=03 \
    -device virtio-scsi-pci,id=virtio_scsi_pci0,bus=pci.0,addr=04,disable-legacy=off,disable-modern=on \
    -drive id=drive_image1,if=none,snapshot=off,aio=native,cache=none,format=qcow2,file=/home/RHEL-Server-7.3-ppc64le-virtio-scsi.qcow2 \
    -device scsi-hd,id=image1,drive=drive_image1 \
    -device virtio-net-pci,mac=9a:7d:7e:7f:80:81,id=id7hZITG,vectors=4,netdev=idCUM72j,bus=pci.0,addr=05,disable-legacy=off,disable-modern=on  \
    -netdev tap,id=idCUM72j,vhost=on \
    -m 8192  \
    -smp 8,maxcpus=8,cores=4,threads=1,sockets=2 \
    -device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1  \
    -vnc :0  \
    -rtc base=utc,clock=host  \
    -boot order=cdn,once=c,menu=off,strict=off \
    -enable-kvm \
    -monitor stdio

Comment 10 Dr. David Alan Gilbert 2016-08-24 15:46:00 UTC
While bz 1364525 fixes the original segfault, there's something else going on that I need the mlx5 maintainer to look at. We're seeing:

(qemu) mlx5: rdma-virt-03: got completion with error:
00000000 00000000 00000000 00000000
00000000 00000000 00000000 00000000
00000000 00000000 00000000 00000000
00000000 9d005304 08000033 0004d0d0
ibv_poll_cq wc.status=4 local protection error!
ibv_poll_cq wrid=WRITE RDMA!

Given it works with mlx4 and a couple of other non-mlx cards, I think this is an mlx5 problem, so I'm leaving it for jwilson.
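
For readers decoding the log above: the `wc.status` numbers map onto the `ibv_wc_status` enumeration from `<infiniband/verbs.h>`. The block below is a standalone sketch of the relevant values (the enum names and values match the verbs header; the helper function is invented for illustration):

```c
#include <assert.h>
#include <string.h>

/* Subset of enum ibv_wc_status, as printed by QEMU's RDMA code. */
enum wc_status {
    WC_SUCCESS        = 0,
    WC_LOC_LEN_ERR    = 1,
    WC_LOC_QP_OP_ERR  = 2,
    WC_LOC_EEC_OP_ERR = 3,
    WC_LOC_PROT_ERR   = 4,  /* local protection error: a work request
                             * touched memory outside the bounds or
                             * permissions of its memory region */
    WC_WR_FLUSH_ERR   = 5,  /* work request flushed: the QP entered the
                             * error state before the WR completed */
};

static const char *wc_status_str(enum wc_status s)
{
    switch (s) {
    case WC_SUCCESS:      return "success";
    case WC_LOC_PROT_ERR: return "local protection error";
    case WC_WR_FLUSH_ERR: return "work request flushed error";
    default:              return "other error";
    }
}
```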

Comment 12 Jarod Wilson 2016-09-12 19:42:32 UTC
ddutile theorized on IRC that an upstream mlx5 kernel patch might fix things. I can build a test kernel with the following patch if someone can give it a test run:

commit c9b254955b9f8814966f5dabd34c39d0e0a2b437
Author: Eli Cohen <eli>
Date:   Wed Jun 22 17:27:26 2016 +0300

    IB/mlx5: Fix post send fence logic

    If the caller specified IB_SEND_FENCE in the send flags of the work
    request and no previous work request stated that the successive one
    should be fenced, the work request would be executed without a fence.
    This could result in RDMA read or atomic operations failure due to a MR
    being invalidated. Fix this by adding the mlx5 enumeration for fencing
    RDMA/atomic operations and fix the logic to apply this.

    Fixes: e126ba97dba9 ('mlx5: Add driver for Mellanox Connect-IB adapters')
    Signed-off-by: Eli Cohen <eli>
    Signed-off-by: Leon Romanovsky <leon>
    Signed-off-by: Doug Ledford <dledford>

diff --git a/drivers/infiniband/hw/mlx5/qp.c b/drivers/infiniband/hw/mlx5/qp.c
index ce43422..ce0a7ab 100644
--- a/drivers/infiniband/hw/mlx5/qp.c
+++ b/drivers/infiniband/hw/mlx5/qp.c
@@ -3332,10 +3332,11 @@ static u8 get_fence(u8 fence, struct ib_send_wr *wr)
                        return MLX5_FENCE_MODE_SMALL_AND_FENCE;
                else
                        return fence;
-
-       } else {
-               return 0;
+       } else if (unlikely(wr->send_flags & IB_SEND_FENCE)) {
+               return MLX5_FENCE_MODE_FENCE;
        }
+
+       return 0;
 }

 static int begin_wqe(struct mlx5_ib_qp *qp, void **seg,
diff --git a/include/linux/mlx5/qp.h b/include/linux/mlx5/qp.h
index e4e2988..630f66a 100644
--- a/include/linux/mlx5/qp.h
+++ b/include/linux/mlx5/qp.h
@@ -172,6 +172,7 @@ enum {
 enum {
        MLX5_FENCE_MODE_NONE                    = 0 << 5,
        MLX5_FENCE_MODE_INITIATOR_SMALL         = 1 << 5,
+       MLX5_FENCE_MODE_FENCE                   = 2 << 5,
        MLX5_FENCE_MODE_STRONG_ORDERING         = 3 << 5,
        MLX5_FENCE_MODE_SMALL_AND_FENCE         = 4 << 5,
 };
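
For readers skimming the diff, a hedged, standalone model of the decision the patch changes (the enum names and the visible branches follow the kernel code, but this sketch compiles on its own, the `IB_SEND_FENCE` value is illustrative, and the parts of `get_fence()` outside the diff hunk are simplified):

```c
#include <assert.h>
#include <stdint.h>

#define IB_SEND_FENCE 1  /* illustrative flag value for this sketch */

enum {
    MLX5_FENCE_MODE_NONE            = 0 << 5,
    MLX5_FENCE_MODE_INITIATOR_SMALL = 1 << 5,
    MLX5_FENCE_MODE_FENCE           = 2 << 5,
    MLX5_FENCE_MODE_STRONG_ORDERING = 3 << 5,
    MLX5_FENCE_MODE_SMALL_AND_FENCE = 4 << 5,
};

/* fence: fence mode inherited from the previous work request;
 * send_flags: flags on the current work request. */
static uint8_t get_fence_model(uint8_t fence, unsigned int send_flags)
{
    if (fence) {
        /* A previous WR asked for successive WRs to be fenced. */
        if (send_flags & IB_SEND_FENCE)
            return MLX5_FENCE_MODE_SMALL_AND_FENCE;
        return fence;
    }
    /* The fix: a WR carrying IB_SEND_FENCE with no inherited fence now
     * gets MLX5_FENCE_MODE_FENCE instead of no fence at all. */
    if (send_flags & IB_SEND_FENCE)
        return MLX5_FENCE_MODE_FENCE;
    return MLX5_FENCE_MODE_NONE;
}
```

Before the patch, the last branch fell through to 0, so a fenced RDMA read or atomic could run against a just-invalidated MR.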

Comment 13 Qunfang Zhang 2016-09-13 02:38:05 UTC
Thanks, Jarod. KVM QE can help test the scenario in comment 0 if no other mlx5 kernel tests are needed.

Comment 14 Jarod Wilson 2016-09-15 16:18:51 UTC
(In reply to Qunfang Zhang from comment #13)
> Thanks Jarod, KVM QE can help test this scenario in comment 0 if no other
> mlx5 kernel test needed.

If you can give this test build a spin, it should tell us if that patch does indeed fix the problem:

http://hp-dl360pgen8-07.khw.lab.eng.bos.redhat.com/~jwilson/kernels/testing/el7-mlx/

Comment 15 Qunfang Zhang 2016-09-18 06:24:03 UTC
Hi, Min

Could you help test this bug with the above build? Thanks!

Comment 16 Dr. David Alan Gilbert 2016-09-19 09:53:05 UTC
Jarod:
  Note we have an automated test for that which I put together with mstowell;
https://beaker.engineering.redhat.com/tasks/22701

Dave

Comment 18 Min Deng 2016-09-23 02:18:59 UTC
Hi Developer,
   QE needs to wait for a special machine from Beaker. Once the machine is ready, QE will give a summary for it. Please let me know of any issues. Thanks.
Min Deng

Comment 19 Dr. David Alan Gilbert 2016-09-26 15:12:40 UTC
Hi Jarod,
  Sorry, it's still broken with that kernel:

[root@rdma-virt-03 ~]$ uname -a
Linux rdma-virt-03 3.10.0-506.el7.mlx.x86_64 #1 SMP Thu Sep 15 10:58:58 EDT 2016 x86_64 x86_64 x86_64 GNU/Linux

[root@rdma-virt-03 ~]$ ./rdma-test 
Starting src
   PID TTY          TIME CMD
 13422 pts/0    00:00:00 qemu-kvm
Starting dst
   PID TTY          TIME CMD
 13433 pts/0    00:00:00 qemu-kvm
QEMU 2.6.0 monitor - type 'help' for more information
(qemu) info status
VM status: running
Found: VM status: running
QEMU 2.6.0 monitor - type 'help' for more information
(qemu) info status
VM status: paused (inmigrate)
Found: VM status: paused (inmigrate)
Good - both qemu's running
(qemu) migrate_set_speed 100G
(qemu) migrate rdma:172.31.40.93:4444
source_resolve_host RDMA Device opened: kernel name mlx5_2 uverbs device name uverbs2, infiniband_verbs class device path /sys/class/infiniband_verbs/uverbs2, infiniband class device path /sys/class/infiniband/mlx5_2, transport: (2) Ethernet
dest_init RDMA Device opened: kernel name mlx5_2 uverbs device name uverbs2, infiniband_verbs class device path /sys/class/infiniband_verbs/uverbs2, infiniband class device path /sys/class/infiniband/mlx5_2, transport: (2) Ethernet
(qemu) info migrate
capabilities: xbzrle: off rdma-pin-all: off auto-converge: off zero-blocks: off compress: off events: off postcopy-ram: off 
Migration status: completed
Found: Migration status: completed
(qemu) info status
VM status: running
Found: VM status: running
passed pin_all=false
qemu-kvm: terminating on signal 15 from pid 13413
qemu-kvm: terminating on signal 15 from pid 13413
Starting src
   PID TTY          TIME CMD
 13507 pts/0    00:00:00 qemu-kvm
Starting dst
   PID TTY          TIME CMD
 13518 pts/0    00:00:00 qemu-kvm
QEMU 2.6.0 monitor - type 'help' for more information
(qemu) info status
VM status: running
Found: VM status: running
QEMU 2.6.0 monitor - type 'help' for more information
(qemu) info status
VM status: paused (inmigrate)
Found: VM status: paused (inmigrate)
Good - both qemu's running
(qemu) migrate_set_speed 100G
(qemu) migrate_set_capability rdma-pin-all on
(qemu) migrate rdma:172.31.40.93:4444
source_resolve_host RDMA Device opened: kernel name mlx5_2 uverbs device name uverbs2, infiniband_verbs class device path /sys/class/infiniband_verbs/uverbs2, infiniband class device path /sys/class/infiniband/mlx5_2, transport: (2) Ethernet
dest_init RDMA Device opened: kernel name mlx5_2 uverbs device name uverbs2, infiniband_verbs class device path /sys/class/infiniband_verbs/uverbs2, infiniband class device path /sys/class/infiniband/mlx5_2, transport: (2) Ethernet
mlx5: rdma-virt-03: got completion with error:
00000000 00000000 00000000 00000000
00000000 00000000 00000000 00000000
00000000 00000000 00000000 00000000
00000000 9d005304 08000375 000497d2
ibv_poll_cq wc.status=4 local protection error!
ibv_poll_cq wrid=WRITE RDMA!
qemu-kvm: rdma migration: polling error! -1
ibv_poll_cq wc.status=5 Work Request Flushed Error!

Comment 21 Min Deng 2016-09-28 07:55:04 UTC
(In reply to Jarod Wilson from comment #14)
> (In reply to Qunfang Zhang from comment #13)
> > Thanks Jarod, KVM QE can help test this scenario in comment 0 if no other
> > mlx5 kernel test needed.
> 
> If you can give this test build a spin, it should tell us if that patch does
> indeed fix the problem:
> 
> http://hp-dl360pgen8-07.khw.lab.eng.bos.redhat.com/~jwilson/kernels/testing/
> el7-mlx/

Hi Jarod,
    It seems that the build only includes the x86 platform; could you please provide a ppc64le version? Thanks.
Min

Comment 22 Jarod Wilson 2016-10-03 14:46:57 UTC
(In reply to dengmin from comment #21)
> (In reply to Jarod Wilson from comment #14)
> > (In reply to Qunfang Zhang from comment #13)
> > > Thanks Jarod, KVM QE can help test this scenario in comment 0 if no other
> > > mlx5 kernel test needed.
> > 
> > If you can give this test build a spin, it should tell us if that patch does
> > indeed fix the problem:
> > 
> > http://hp-dl360pgen8-07.khw.lab.eng.bos.redhat.com/~jwilson/kernels/testing/
> > el7-mlx/
> 
> Hi Jarod,
>     It seems that the build only includes the x86 platform; could you
> please provide a ppc64le version? Thanks.

It's already been confirmed that the patch we thought might help didn't, so there's not much point in putting a ppc64le version up there.

Comment 23 Dr. David Alan Gilbert 2017-06-05 16:07:11 UTC
mstowell confirmed that this test works with the -674 kernel on x86.
So I think we're good for 7.4.