Bug 2172919

Summary: passt: Two passt instances of different guests sometimes fail to communicate
Product: Red Hat Enterprise Linux 9 Reporter: Lei Yang <leiyang>
Component: passt Assignee: Stefano Brivio <sbrivio>
Status: CLOSED CURRENTRELEASE QA Contact: Lei Yang <leiyang>
Severity: medium Docs Contact:
Priority: unspecified    
Version: 9.2 CC: aadam, chayang, jinzhao, juzhang, lvivier, pezhang, sbrivio, wquan, yalzhang
Target Milestone: rc Flags: pm-rhel: mirror+
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2023-03-07 17:48:55 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Lei Yang 2023-02-23 14:01:31 UTC
Description of problem:
Two passt instances serving different guests sometimes fail to communicate over TCP when the guests use the address of the default gateway.

Version-Release number of selected component (if applicable):
passt-0^20230222.g4ddbcb9-1.el9.x86_64
qemu-kvm-7.2.0-9.el9.x86_64
kernel-5.14.0-281.el9.x86_64
libvirt-9.0.0-6.el9.x86_64
seabios-bin-1.16.1-1.el9.noarch
edk2-ovmf-20221207gitfff6d81270b5-6.el9.noarch

How reproducible:
2/5

Steps to Reproduce:
1. Create two passt instances and boot the corresponding guests:
 Passt1: $ passt  -f -t 10001 -u 10001 -P passt1.pid
         $ PATH=$PATH:/usr/libexec
         $ qrap 5 qemu-kvm -m 16059 -smp 6 -blockdev '{"node-name": "file_ovmf_code", "driver": "file", "filename": "/usr/share/OVMF/OVMF_CODE.secboot.fd", "auto-read-only": true, "discard": "unmap"}' -blockdev '{"node-name": "drive_ovmf_code", "driver": "raw", "read-only": true, "file": "file_ovmf_code"}' -blockdev '{"node-name": "file_ovmf_vars", "driver": "file", "filename": "/home/test/avocado-vt-vm1_rhel920-64-virtio-scsi_qcow2_filesystem_VARS.fd", "auto-read-only": true, "discard": "unmap"}' -blockdev '{"node-name": "drive_ovmf_vars", "driver": "raw", "read-only": false, "file": "file_ovmf_vars"}' -machine q35,memory-backend=mem-machine_mem,pflash0=drive_ovmf_code,pflash1=drive_ovmf_vars -device '{"id": "pcie-root-port-0", "driver": "pcie-root-port", "multifunction": true, "bus": "pcie.0", "addr": "0x1", "chassis": 1}' -device '{"id": "pcie-pci-bridge-0", "driver": "pcie-pci-bridge", "addr": "0x0", "bus": "pcie-root-port-0"}' -nodefaults -device '{"driver": "VGA", "bus": "pcie.0", "addr": "0x2"}' -m 62464 -object '{"size": 65498251264, "id": "mem-machine_mem", "qom-type": "memory-backend-ram"}' -smp 28,maxcpus=28,cores=14,threads=1,dies=1,sockets=2 -cpu 'Icelake-Server',ds=on,ss=on,dtes64=on,vmx=on,pdcm=on,hypervisor=on,tsc-adjust=on,avx512ifma=on,sha-ni=on,rdpid=on,fsrm=on,md-clear=on,stibp=on,arch-capabilities=on,xsaves=on,ibpb=on,ibrs=on,amd-stibp=on,amd-ssbd=on,rdctl-no=on,ibrs-all=on,skip-l1dfl-vmentry=on,mds-no=on,pschange-mc-no=on,tsx-ctrl=on,hle=off,rtm=off,mpx=off,intel-pt=off,kvm_pv_unhalt=on  -device '{"id": "pcie-root-port-2", "port": 2, "driver": "pcie-root-port", "addr": "0x1.0x2", "bus": "pcie.0", "chassis": 3}' -device '{"id": "virtio_scsi_pci0", "driver": "virtio-scsi-pci", "bus": "pcie-root-port-2", "addr": "0x0"}' -blockdev '{"node-name": "file_image1", "driver": "file", "auto-read-only": true, "discard": "unmap", "aio": "threads", "filename": "/home/test/rhel920-64-virtio-scsi.qcow2", "cache": {"direct": true, "no-flush": false}}' 
-blockdev '{"node-name": "drive_image1", "driver": "qcow2", "read-only": false, "cache": {"direct": true, "no-flush": false}, "file": "file_image1"}' -device '{"driver": "scsi-hd", "id": "image1", "drive": "drive_image1", "write-cache": "on"}' -device '{"id": "pcie-root-port-3", "port": 3, "driver": "pcie-root-port", "addr": "0x1.0x3", "bus": "pcie.0", "chassis": 4}' -device '{"driver": "virtio-net-pci", "id": "net0", "netdev": "hostnet0", "x-txburst": 16384, "bus": "pcie-root-port-3", "addr": "0x0"}' -netdev socket,fd=5,id=hostnet0 -boot menu=off,order=cdn,once=c,strict=off -vnc :0 -boot menu=off,order=cdn,once=c,strict=off -monitor stdio


  Passt2: $ passt  -f -t 10002 -u 10002 -P passt2.pid
          $ PATH=$PATH:/usr/libexec
          $ qrap 5 qemu-kvm -m 16059 -smp 6 -blockdev '{"node-name": "file_ovmf_code", "driver": "file", "filename": "/usr/share/OVMF/OVMF_CODE.secboot.fd", "auto-read-only": true, "discard": "unmap"}' -blockdev '{"node-name": "drive_ovmf_code", "driver": "raw", "read-only": true, "file": "file_ovmf_code"}' -blockdev '{"node-name": "file_ovmf_vars", "driver": "file", "filename": "/home/test/2.fd", "auto-read-only": true, "discard": "unmap"}' -blockdev '{"node-name": "drive_ovmf_vars", "driver": "raw", "read-only": false, "file": "file_ovmf_vars"}' -machine q35,memory-backend=mem-machine_mem,pflash0=drive_ovmf_code,pflash1=drive_ovmf_vars -device '{"id": "pcie-root-port-0", "driver": "pcie-root-port", "multifunction": true, "bus": "pcie.0", "addr": "0x1", "chassis": 1}' -device '{"id": "pcie-pci-bridge-0", "driver": "pcie-pci-bridge", "addr": "0x0", "bus": "pcie-root-port-0"}' -nodefaults -device '{"driver": "VGA", "bus": "pcie.0", "addr": "0x2"}' -m 62464 -object '{"size": 65498251264, "id": "mem-machine_mem", "qom-type": "memory-backend-ram"}' -smp 28,maxcpus=28,cores=14,threads=1,dies=1,sockets=2 -cpu 'Icelake-Server',ds=on,ss=on,dtes64=on,vmx=on,pdcm=on,hypervisor=on,tsc-adjust=on,avx512ifma=on,sha-ni=on,rdpid=on,fsrm=on,md-clear=on,stibp=on,arch-capabilities=on,xsaves=on,ibpb=on,ibrs=on,amd-stibp=on,amd-ssbd=on,rdctl-no=on,ibrs-all=on,skip-l1dfl-vmentry=on,mds-no=on,pschange-mc-no=on,tsx-ctrl=on,hle=off,rtm=off,mpx=off,intel-pt=off,kvm_pv_unhalt=on  -device '{"id": "pcie-root-port-2", "port": 2, "driver": "pcie-root-port", "addr": "0x1.0x2", "bus": "pcie.0", "chassis": 3}' -device '{"id": "virtio_scsi_pci0", "driver": "virtio-scsi-pci", "bus": "pcie-root-port-2", "addr": "0x0"}' -blockdev '{"node-name": "file_image1", "driver": "file", "auto-read-only": true, "discard": "unmap", "aio": "threads", "filename": "/home/test/2.qcow2", "cache": {"direct": true, "no-flush": false}}' -blockdev '{"node-name": "drive_image1", "driver": "qcow2", "read-only": false, 
"cache": {"direct": true, "no-flush": false}, "file": "file_image1"}' -device '{"driver": "scsi-hd", "id": "image1", "drive": "drive_image1", "write-cache": "on"}' -device '{"id": "pcie-root-port-3", "port": 3, "driver": "pcie-root-port", "addr": "0x1.0x3", "bus": "pcie.0", "chassis": 4}' -device '{"driver": "virtio-net-pci", "id": "net0", "netdev": "hostnet0", "x-txburst": 16384, "bus": "pcie-root-port-3", "addr": "0x0"}' -netdev socket,fd=5,id=hostnet0 -boot menu=off,order=cdn,once=c,strict=off -vnc :1 -boot menu=off,order=cdn,once=c,strict=off -monitor stdio

2. Disable the firewall in both guests and on the host.
# systemctl stop firewalld.service || service iptables stop || iptables -F || nft flush ruleset

3. TCP communication between the two guests using the default gateway address sometimes fails.
On guest 2:
# nc -l 10002

On guest1:
# echo "Hello from guest1" | nc 10.73.213.254 10002
Ncat: TIMEOUT

Actual results:
Ncat: TIMEOUT

Expected results:
Two guests can communicate with each other via TCP using the address of the default gateway

Additional info:
1. From the QE perspective, this is probably not a regression: in the past this scenario was always tested with a SeaBIOS guest, so the issue may have always existed. The test passes with a SeaBIOS guest on the same host environment (tested 5 times).
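
Since the failure is intermittent (2/5), a small loop on guest1 can make the reproduction rate easier to measure. The following is only a sketch under stated assumptions: `run_trials` and `try_once` are hypothetical helper names, the 5-second connect timeout and the trial count are arbitrary, and `nc` is assumed to be Ncat. Guest 2 would also need a listener that stays open across attempts, for example `nc -k -l 10002`.

```shell
#!/bin/sh
# Sketch: repeat the guest1 connection attempt and count the failures.
# GATEWAY and PORT come from the steps above; TRIALS and the timeout are
# assumptions chosen for illustration.
GATEWAY=${GATEWAY:-10.73.213.254}
PORT=${PORT:-10002}
TRIALS=${TRIALS:-5}

# run_trials N CMD...: run CMD N times and print "<failed>/<N>".
run_trials() {
    n=$1
    shift
    failed=0
    i=0
    while [ "$i" -lt "$n" ]; do
        # Any non-zero exit status (such as an Ncat timeout) counts as a failure.
        "$@" >/dev/null 2>&1 || failed=$((failed + 1))
        i=$((i + 1))
    done
    echo "$failed/$n"
}

# try_once: one connection attempt from guest1 through the gateway address.
# Assumes Ncat, where -w sets the connect timeout in seconds.
try_once() {
    echo "Hello from guest1" | nc -w 5 "$GATEWAY" "$PORT"
}

# On guest1, run:  run_trials "$TRIALS" try_once
```

The printed value has the same failed/total form as the "How reproducible" field, so results from OVMF and SeaBIOS guests can be compared directly.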

Comment 1 Lei Yang 2023-02-23 14:06:55 UTC
SeaBIOS command line:
qrap 5 qemu-kvm -m 16059 -cpu host -smp 6 -drive id=drive_image1,if=none,snapshot=off,aio=threads,cache=none,format=qcow2,file=/home/test/seabios.qcow2 -device virtio-blk-pci,id=image1,drive=drive_image1,bootindex=0,bus=pci.0 -nographic -serial stdio -nodefaults -device virtio-net-pci,netdev=hostnet0,x-txburst=16384 -netdev socket,fd=5,id=hostnet0

Comment 2 yalzhang@redhat.com 2023-03-02 04:33:57 UTC
I can reproduce it with the new passt package passt-0^20230227.gc538ee8-1.fc39.x86_64. It's a regression.
The same scenario works well on passt-0^20221110.g4129764-1.fc38.x86_64.

Test with: 
libvirt v9.1.0-rc2-4-g541670dd5c
qemu-kvm-7.2.0-7.fc39.x86_64
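
If the difference between these two packages holds up, the range to bisect can be read from the package versions. The sketch below assumes that the `g<hash>` part of the version string is the short hash of the upstream passt commit the snapshot was built from; `snapshot_commit` is a hypothetical helper name, and the script only prints the bisect command rather than running it.

```shell
#!/bin/sh
# snapshot_commit NVR: print the git short hash embedded in a passt package
# version such as "passt-0^20230227.gc538ee8-1.fc39.x86_64".
# Assumption: the "g<hash>" suffix names the upstream commit of the snapshot.
snapshot_commit() {
    printf '%s\n' "$1" |
        sed -n 's/^passt-0\^[0-9]\{8\}\.g\([0-9a-f]\{7,\}\)-.*$/\1/p'
}

# Versions reported above as working (good) and failing (bad).
good=$(snapshot_commit 'passt-0^20221110.g4129764-1.fc38.x86_64')
bad=$(snapshot_commit 'passt-0^20230227.gc538ee8-1.fc39.x86_64')

# Print the command to run inside a checkout of the passt sources.
echo "git bisect start $bad $good"
```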

Comment 3 Stefano Brivio 2023-03-02 09:17:30 UTC
(In reply to yalzhang from comment #2)
> I can reproduce it with the new passt package
> passt-0^20230227.gc538ee8-1.fc39.x86_64. It's a regression.
> The same scenario works well on passt-0^20221110.g4129764-1.fc38.x86_64.

Interesting, thanks for the additional test. I still have to try to reproduce it with a RHEL guest on a RHEL host to debug this further.

Comment 14 Lei Yang 2023-03-03 15:07:14 UTC
(In reply to yalzhang from comment #2)
> I can reproduce it with the new passt package
> passt-0^20230227.gc538ee8-1.fc39.x86_64. It's a regression.
> The same scenario works well on passt-0^20221110.g4129764-1.fc38.x86_64.
> 
> Test with: 
> libvirt v9.1.0-rc2-4-g541670dd5c
> qemu-kvm-7.2.0-7.fc39.x86_64
Hello Yalan

I guess you also reproduced it with an OVMF guest, right?

Thanks
Lei

Comment 19 yalzhang@redhat.com 2023-03-06 16:43:07 UTC
I tested it downstream on RHEL 9.2 with the latest RHEL 9.2 image (OVMF); the issue can *not* be reproduced.
libvirt-9.0.0-8.el9_rc.a7213e6de2.x86_64(scratch build in bz2169244#c5)
qemu-kvm-7.2.0-10.el9.x86_64
passt-0^20230222.g4ddbcb9-1.el9.x86_64

I have also tried the version below and found no issue. I am not sure what happened last time, when I tested it upstream.
passt-0^20230227.gc538ee8-1.fc39.x86_64

Comment 20 yalzhang@redhat.com 2023-03-06 16:51:03 UTC
(In reply to yalzhang from comment #19)
> I have also tried the version below and found no issue. I am not sure what
> happened last time, when I tested it upstream.
> passt-0^20230227.gc538ee8-1.fc39.x86_64

s/fc39.x86_64/el9.x86_64