Bug 2134280
| Summary: | [virtual network][rhel9.2][windows]Enable netkvm driver TxLSO,no packet length >= 1514 after file transfer | ||
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 9 | Reporter: | Lei Yang <leiyang> |
| Component: | wireshark | Assignee: | Michal Ruprich <mruprich> |
| Status: | CLOSED MIGRATED | QA Contact: | František Hrdina <fhrdina> |
| Severity: | medium | Docs Contact: | |
| Priority: | medium | ||
| Version: | 9.2 | CC: | chayang, coli, fhrdina, jinzhao, juzhang, lvivier, mkedzier, virt-maint, wquan, ybendito, yvugenfi |
| Target Milestone: | rc | Keywords: | MigratedToJIRA, Reopened, Triaged |
| Target Release: | --- | Flags: | pm-rhel: mirror+ |
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | Doc Type: | If docs needed, set a value | |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2023-09-21 22:51:28 UTC | Type: | --- |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
|
Description
Lei Yang
2022-10-13 04:41:22 UTC
Test Matrix

Test on the rhel.9.2.0 compose: http://download.eng.pek2.redhat.com/rhel-9/composes/RHEL-9/RHEL-9.2.0-20221006.d.0/
1. kernel-5.14.0-162.6.1.el9_1.x86_64 + qemu-kvm-7.1.0-2.el9.x86_64 -> Test Failed
2. qemu-kvm-7.0.0-13.el9.x86_64.rpm + kernel-5.14.0-174.el9.x86_64.rpm -> Test Failed
3. qemu-kvm-7.0.0-13.el9.x86_64.rpm + kernel-5.14.0-162.6.1.el9_1.x86_64 -> Test Failed

Test on the rhel.9.1.0 compose: http://download.eng.pek2.redhat.com/rhel-9/composes/RHEL-9/RHEL-9.1.0-20221012.1
1. qemu-kvm-7.0.0-13.el9.x86_64.rpm + kernel-5.14.0-162.6.1.el9_1.x86_64 -> Test Pass
2. qemu-kvm-7.1.0-2.el9.x86_64 + kernel-5.14.0-162.6.1.el9_1.x86_64 -> Test Pass
3. qemu-kvm-7.1.0-2.el9.x86_64 + kernel-5.14.0-174.el9.x86_64.rpm -> Test Pass

Based on the above test results, it seems this bug is not related to qemu or the kernel, but it must be a product bug. Temporarily use this component to track the problem until the real cause is found.

(In reply to Lei Yang from comment #1)
...
> Based on the above test result, seems this bug is not related to qemu and
> kernel, but it must be a product bug. Temporarily use this component to track
> the problem until the real cause is found

I think this is more related to Windows, so I changed the SST pool id.

(In reply to Laurent Vivier from comment #2)
> (In reply to Lei Yang from comment #1)
> ...
> > Based on the above test result, seems this bug is not related to qemu and
> > kernel, but it must be a product bug. Temporarily use this component to track
> > the problem until the real cause is found
>
> I think this is more related to Windows so I change the SST pool id.

Marek, could you confirm?

Hi Yan, can you take a look? Thanks!

What happens when the default value is not changed (should be on as well)?

(In reply to Yvugenfi from comment #5)

Hello Yan

> What happens when the default value is not changed (should be on as well)?

What do you mean by default values? During the entire testing process, nothing seems to have changed.
Thanks
Lei

netsh netkvm setparam 0 param=Offload.TxLSO value=1

The default value in INF is 2. In any case, LSO should be enabled.

Best regards,
Yan.

Transfer a file from the guest to the host; large packets make much more sense in that direction.
After you try, please respond.

(In reply to ybendito from comment #8)
> Transfer guest file to host, then large packets make sense much more.
> After you try, please respond.

Hello Yuri

This step is already included in the current test log:

avocado.virttest.virt_vm DEBUG| Attempting to log into 'avocado-vt-vm1' via serial console (timeout 360s)
aexpect.client DEBUG| Sending command: start "" "C:\Program Files\Wireshark\tshark.exe" -n -w c:\temp.pcapng tcp and dst 192.168.122.1 and src 192.168.122.95
aexpect.client DEBUG| Sending command: echo %errorlevel%
aexpect.client DEBUG| Sending command: tasklist /fi "IMAGENAME eq tshark.exe"
aexpect.client DEBUG| Sending command: echo %errorlevel%
avocado.test INFO | Context: Start file transfer
avocado.virttest.virt_vm DEBUG| Attempting to log into 'avocado-vt-vm1' (timeout 360s)
avocado.virttest.virt_vm DEBUG| Found/Verified IP 192.168.122.95 for VM avocado-vt-vm1 NIC 0
avocado.virttest.utils_test INFO | Context: Start file transfer --> Login to guest
avocado.virttest.utils_test INFO | Context: Start file transfer --> Creating 50MB file on host
avocado.utils.process INFO | Running 'dd if=/dev/zero of=/var/tmp/avocado_lut54745/tmp-epqz_1pi bs=10M count=5'
avocado.utils.process DEBUG| [stderr] 5+0 records in
avocado.utils.process INFO | Command 'dd if=/dev/zero of=/var/tmp/avocado_lut54745/tmp-epqz_1pi bs=10M count=5' finished with 0 after 0.023190531s
avocado.utils.process DEBUG| [stderr] 5+0 records out
avocado.utils.process DEBUG| [stderr] 52428800 bytes (52 MB, 50 MiB) copied, 0.0224352 s, 2.3 GB/s
avocado.virttest.utils_test INFO | Context: Start file transfer --> Transferring file host -> guest, timeout: 1000s
aexpect.remote DEBUG| Sending file /var/tmp/avocado_lut54745/tmp-epqz_1pi
aexpect.remote INFO | Copy file from /var/tmp/avocado_lut54745/tmp-epqz_1pi to 192.168.122.95:c:\JciYIChB, elapsed time: 0.09739828109741211
avocado.virttest.utils_test INFO | Context: Start file transfer --> Transferring file guest -> host, timeout: 1000s
aexpect.remote DEBUG| Receiving file /var/tmp/avocado_lut54745/tmp-epqz_1pi
aexpect.remote INFO | Copy file from 192.168.122.95:c:\JciYIChB to /var/tmp/avocado_lut54745/tmp-epqz_1pi, elapsed time: 0.9929146766662598
avocado.virttest.utils_test INFO | Context: Start file transfer --> Compare md5sum between original file and transferred file
avocado.virttest.utils_test INFO | Cleaning temp file on guest
aexpect.client DEBUG| Sending command: del c:\JciYIChB
aexpect.client DEBUG| Sending command: echo %errorlevel%
avocado.test INFO | Context: Stop wireshark
aexpect.client DEBUG| Sending command: taskkill /im tshark.exe /f
aexpect.client DEBUG| Sending command: echo %errorlevel%
avocado.test INFO | Context: Parse wireshark log file
aexpect.client DEBUG| Sending command: "C:\Program Files\Wireshark\tshark.exe" -2 -r c:\temp.pcapng -R "frame.len>1514"
aexpect.client DEBUG| Sending command: echo %errorlevel%
avocado.test INFO | Check length > 1514 packets
avocado.test ERROR|
avocado.test ERROR| Reproduced traceback from: /usr/local/lib/python3.9/site-packages/avocado_framework_plugin_vt-98.0-py3.9.egg/avocado_vt/test.py:274
avocado.test ERROR| Traceback (most recent call last):
avocado.test ERROR|   File "/usr/local/lib/python3.9/site-packages/avocado_framework_plugin_vt-98.0-py3.9.egg/virttest/error_context.py", line 135, in new_fn
avocado.test ERROR|     return fn(*args, **kwargs)
avocado.test ERROR|   File "/root/avocado/data/avocado-vt/virttest/test-providers.d/downloads/io-github-autotest-qemu/qemu/tests/enable_scatter_windows.py", line 204, in run
avocado.test ERROR|     test.fail("No packet length >= 1514, output=%s" % output)
avocado.test ERROR|   File "/usr/local/lib/python3.9/site-packages/avocado_framework-98.0-py3.9.egg/avocado/core/test.py", line 779, in wrapper
avocado.test ERROR|     return func(actual_message)
avocado.test ERROR|   File "/usr/local/lib/python3.9/site-packages/avocado_framework-98.0-py3.9.egg/avocado/core/test.py", line 795, in fail
avocado.test ERROR|     raise exceptions.TestFail(msg)
avocado.test ERROR|
avocado.test ERROR| avocado.core.exceptions.TestFail: No packet length >= 1514, output=

Thanks
Lei

(In reply to Yvugenfi from comment #7)
> netsh netkvm setparam 0 param=Offload.TxLSO value=1
>
> The default value in INF is 2. In any case, LSO should be enabled.
>
> Best regards,
> Yan.

Hello Yan

Since the compose mentioned in comment 1 is no longer available, I tested this scenario with the latest compose 30 times, and there are no issues any more.

Test Result: http://virtqetools.lab.eng.pek2.redhat.com/kvm_autotest_job_log/?jobid=7173190

Based on the above test result, it seems this problem has been fixed, but I cannot confirm it. Therefore, from the QE perspective, QE can continue to track this issue in subsequent tests, and if it cannot be reproduced again, the bug can be closed as "CURRENTRELEASE". What do you think about it? Thanks in advance.

Thanks
Lei

(In reply to Lei Yang from comment #10)
> (In reply to Yvugenfi from comment #7)
> > netsh netkvm setparam 0 param=Offload.TxLSO value=1
> >
> > The default value in INF is 2. In any case, LSO should be enabled.
> >
> > Best regards,
> > Yan.
>
> Hello Yan
>
> I tried to test this scenario with the latest compose 30 times, there is no
> issues any more. Since the compose mentioned in comment 1 is no longer
> available.
>
> Test Result:
> http://virtqetools.lab.eng.pek2.redhat.com/kvm_autotest_job_log/?jobid=7173190
>
> Based on the above test result, seems this problem has been fixed, but I
> cannot confirm it. Therefore from the QE perspective, QE can continue to track
> this issue in subsequent tests, and if it has not been able to reproduce
> again, it can be closed as "CURRENTRELEASE". What do you think about it?
> Thanks in advance.
>
> Thanks
> Lei

Hi Lei,

Good idea. Let's check with subsequent tests as well.
In any case, there were no changes in the area of configuration or LSO support in the driver recently.

Best regards,
Yan.

(In reply to Lei Yang from comment #10)
> (In reply to Yvugenfi from comment #7)
> > netsh netkvm setparam 0 param=Offload.TxLSO value=1
> >
> > The default value in INF is 2. In any case, LSO should be enabled.
> >
> > Best regards,
> > Yan.
>
> Hello Yan
>
> I tried to test this scenario with the latest compose 30 times, there is no
> issues any more. Since the compose mentioned in comment 1 is no longer
> available.
>
> Test Result:
> http://virtqetools.lab.eng.pek2.redhat.com/kvm_autotest_job_log/?jobid=7173190
>
> Based on the above test result, seems this problem has been fixed, but I
> cannot confirm it. Therefore from the QE perspective, QE can continue to track
> this issue in subsequent tests, and if it has not been able to reproduce
> again, it can be closed as "CURRENTRELEASE". What do you think about it?
> Thanks in advance.

Can you still reproduce the problem?
Can we close the BZ?

(In reply to Laurent Vivier from comment #12)
> (In reply to Lei Yang from comment #10)
> > (In reply to Yvugenfi from comment #7)
> > > netsh netkvm setparam 0 param=Offload.TxLSO value=1
> > >
> > > The default value in INF is 2. In any case, LSO should be enabled.
> > >
> > > Best regards,
> > > Yan.
> >
> > Hello Yan
> >
> > I tried to test this scenario with the latest compose 30 times, there is no
> > issues any more. Since the compose mentioned in comment 1 is no longer
> > available.
> >
> > Test Result:
> > http://virtqetools.lab.eng.pek2.redhat.com/kvm_autotest_job_log/?jobid=7173190
> >
> > Based on the above test result, seems this problem has been fixed, but I
> > cannot confirm it. Therefore from the QE perspective, QE can continue to track
> > this issue in subsequent tests, and if it has not been able to reproduce
> > again, it can be closed as "CURRENTRELEASE". What do you think about it?
> > Thanks in advance.
>

Hello Laurent

> Do you reproduce the problem anymore?
> Can we close the BZ?

I tested this scenario with the latest compose 30 times, and there are no issues any more. So this bug should be closed as "CURRENTRELEASE"; please correct me if I'm wrong.

Test Version: http://download.eng.pek2.redhat.com/rhel-9/composes/RHEL-9/RHEL-9.2.0-20221107.5/
Test Log: http://virtqetools.lab.eng.pek2.redhat.com/kvm_autotest_job_log/?jobid=7224375

Hmm, I am a little confused: this problem has reproduced again on the latest RHEL 9.3. I am not sure what caused it, so I am temporarily using this bug to record the problem.

Test Version:
kernel-5.14.0-300.el9.x86_64
qemu-kvm-8.0.0-1.el9.x86_64
edk2-ovmf-20230301gitf80f052277c8-2.el9.noarch
virtio-win-prewhql-0.1-235.iso

(In reply to Lei Yang from comment #14)
> en, I am a little confused, this problem is reproduced again on the latest
> rhel 9.3. But not sure what caused this problem, temporarily use this bug to
> record this problem.
>
> Test Version:
> kernel-5.14.0-300.el9.x86_64
> qemu-kvm-8.0.0-1.el9.x86_64
> edk2-ovmf-20230301gitf80f052277c8-2.el9.noarch
> virtio-win-prewhql-0.1-235.iso

Hi Lei,

If there is a problem, I suggest reopening a bug or filing a new bug.
Otherwise, we might lose the discussion.
Again, there were no guest driver changes related to LSO.
So if opening a new bug, the component should be qemu-kvm.

Best regards,
Yan.

(In reply to Yvugenfi from comment #15)
> (In reply to Lei Yang from comment #14)
> > en, I am a little confused, this problem is reproduced again on the latest
> > rhel 9.3. But not sure what caused this problem, temporarily use this bug to
> > record this problem.
> >
> > Test Version:
> > kernel-5.14.0-300.el9.x86_64
> > qemu-kvm-8.0.0-1.el9.x86_64
> > edk2-ovmf-20230301gitf80f052277c8-2.el9.noarch
> > virtio-win-prewhql-0.1-235.iso
>
> Hi Lei,
>
> If there is a problem, I suggest reopening a bug or filing a new bug.
> Otherwise, we might lose the discussion.
> Again, there were no guest driver changes related to LSO.
> So if opening new bug the component should be qemu-kvm
>
> Best regards,
> Yan.

Hello Yan

Thanks for your update, let me reopen this bug.

Thanks
Lei

Hi Yan

I reviewed this bug again. From the QE perspective, this bug is not related to qemu-kvm or the virtio-win driver, because the checkpoint is: the scp file transfer succeeds and the length of some SSH protocol packets is >= 1514. It looks like there are some issues with openssh. What do you think about it?

Thanks
Lei

(In reply to Lei Yang from comment #18)
> Hi Yan
>
> I review this bug again, from the QE's perspective this bug is not related
> qemu-kvm and virtio-win-driver. Because the checkpoint is: the scp file
> successfully and length of some package SSH protocol >=1514. It looks like
> there are some issues about openssh, what do you think about it?
>
> Thanks
> Lei

Hi Lei,

I am not sure. It might be related to the network configuration on the host (in the virtio-net driver on the guest we don't have any related changes between the versions).
So it might point to some potential problem that can affect the performance.

Best regards,
Yan.

Hi Wenli

Could you please help review this bug? Maybe this is a potential problem that can affect the performance. Thanks in advance.

Test Steps:
1. Boot a guest with virtio-net-pci
2. Change scatter-gather to enable
   For Windows: enable Offload Tx LSO: Device Manager --> click Network adapters --> right-click Properties of Redhat VirtIO Ethernet Adapter --> Properties --> Advanced --> change "Offload Tx LSO" to enable
4. Copy file from guest to host
   For Windows: use winscp to scp a file from guest to host and, at the same time, open wireshark to capture the packets.
5. After step 4
   For Windows: the host can receive the scp file successfully and the length of some packets with SSH protocol is >= 1514

Thanks
Lei

QE tested the Comment 0 steps again. This time QE noticed that c:\temp.pcapng was empty when the problem reproduced, although the file had already been transferred by then, so this may be a problem with the tool itself. Please correct me if I'm wrong.

(In reply to Lei Yang from comment #20)
> Hi Wenli
>
> Could you please help review this bug? Maybe this is a potential problem
> that can affect the performance. Thanks in advance.
>
> Test Steps:
> 1. Boot a guest with virtio-net-pci
> 2. Change scatter-gather to enable
> for windows:
> enable Offload Tx LSO:
> device manager--> click Network adapters--> right-click Properties of Redhat
> VirtIO Ethernet Adapter---> Properties---> Advanced-->change"Offload Tx LSO"
> to enable
> 4. copy file from guest to host
> For windows: use winscp to scp file from guest to host ,and the same time
> ,open wireshark to listening the package.
> 5. After steps 4
> For windows: host can receive the scp file successfully and length of some
> package with SSH protocol >=1514
>
> Thanks
> Lei

Tested manually; LSO on Windows works correctly.

With LSO enabled, the packet length is 16474:
3200 38.142872 10.73.225.119 → 10.73.225.50 SSHv2 16474 Server: Encrypted packet (len=16420)
3201 38.143344 10.73.225.119 → 10.73.225.50 SSHv2 16474 Server: Encrypted packet (len=16420)

With LSO disabled, the packet length is 1000:
28485 9.420404 10.73.225.119 → 10.73.225.50 SSH 1000 Server: Encrypted packet (len=946)
28486 9.420424 10.73.225.119 → 10.73.225.50 SSH 1000 Server: Encrypted packet (len=946)

Issue migration from Bugzilla to Jira is in process at this time. This will be the last message in Jira copied from the Bugzilla bug.
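For reference, the checkpoint discussed in this thread can be sketched in Python. The automated test reads the capture back with the filter `frame.len>1514` and fails whenever that output is empty, so an empty capture file and a capture that only contains small frames look the same to it. The sketch below is not the code from enable_scatter_windows.py; the names `frame_lengths` and `classify_capture` are hypothetical, and it assumes unfiltered tshark summary lines in the default column layout shown in the manual test output (number, time, source, arrow, destination, protocol, length). The threshold of 1514 is taken from the failure message.

```python
import re

# Assumed layout of one tshark summary line, for example:
# "3200 38.142872 10.73.225.119 → 10.73.225.50 SSHv2 16474 Server: ..."
SUMMARY_RE = re.compile(
    r"^\s*(?P<no>\d+)\s+(?P<time>\d+\.\d+)\s+(?P<src>\S+)\s+(?:→|->)\s+"
    r"(?P<dst>\S+)\s+(?P<proto>\S+)\s+(?P<length>\d+)\b"
)


def frame_lengths(tshark_output):
    """Return the frame length of every line that parses as a packet summary."""
    lengths = []
    for line in tshark_output.splitlines():
        match = SUMMARY_RE.match(line)
        if match:
            lengths.append(int(match.group("length")))
    return lengths


def classify_capture(tshark_output, threshold=1514):
    """Classify an unfiltered dump as 'empty', 'no_large_frames' or 'ok'."""
    lengths = frame_lengths(tshark_output)
    if not lengths:
        # Nothing was captured at all: points at the capture tool, not at LSO.
        return "empty"
    if max(lengths) < threshold:
        # Packets were captured, but none reached the threshold.
        return "no_large_frames"
    return "ok"
```

Fed the two sample lines from the manual test, `classify_capture` returns "ok" for the LSO-enabled capture (frame length 16474) and "no_large_frames" for the LSO-disabled capture (frame length 1000). An empty string returns "empty", which corresponds to the case where c:\temp.pcapng contained no packets.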
This BZ has been automatically migrated to the issues.redhat.com Red Hat Issue Tracker. All future work related to this report will be managed there. Due to differences in account names between systems, some fields were not replicated. Be sure to add yourself to Jira issue's "Watchers" field to continue receiving updates and add others to the "Need Info From" field to continue requesting information. To find the migrated issue, look in the "Links" section for a direct link to the new issue location. The issue key will have an icon of 2 footprints next to it, and begin with "RHEL-" followed by an integer. You can also find this issue by visiting https://issues.redhat.com/issues/?jql= and searching the "Bugzilla Bug" field for this BZ's number, e.g. a search like: "Bugzilla Bug" = 1234567 In the event you have trouble locating or viewing this issue, you can file an issue by sending mail to rh-issues. You can also visit https://access.redhat.com/articles/7032570 for general account information. |