Bug 2212683 - [Azure][ARM64][RHEL-9] TCP file transfer corrupted with 0.5% disturbance
Keywords:
Status: NEW
Alias: None
Product: Red Hat Enterprise Linux 9
Classification: Red Hat
Component: kernel
Version: 9.3
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: rc
Target Release: ---
Assignee: Virtualization Maintenance
QA Contact: Li Tian
URL:
Whiteboard:
Depends On: 2211258
Blocks:
 
Reported: 2023-06-06 07:02 UTC by Li Tian
Modified: 2023-08-15 14:40 UTC

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 2211258
Environment:
Last Closed:
Type: Bug
Target Upstream Version:
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Issue Tracker | RHELPLAN-159036 | 0 | None | None | None | 2023-06-06 07:05:00 UTC

Description Li Tian 2023-06-06 07:02:25 UTC
The issue is also present on ARM64 RHEL 9.3 (kernel 5.14.0-316.el9.aarch64).

+++ This bug was initially created as a clone of Bug #2211258 +++

Description of problem:
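TCP should tolerate mild channel corruption (0.5% of packets), since corrupted segments should fail the TCP checksum and be retransmitted. Yet after sending a ~500MB file, the md5sum of the received file does not match that of the original.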

Version-Release number of selected component (if applicable):
4.18.0-492.el8.aarch64

How reproducible:
100%

Steps to Reproduce:
Detailed steps are filed in the Polarion cases below:
https://polarion.engineering.redhat.com/polarion/#/project/RHELVIRT/workitem?id=VIRT-80160
https://polarion.engineering.redhat.com/polarion/#/project/RHELVIRT/workitem?id=VIRT-80159

Actual results:
The file sizes match but the md5sum values do not:
# ls -al server_data 
-rw-r--r--. 1 root root 524288000 May 30 04:22 server_data
# md5sum server_data 
17da58d5050b0a58e3a44685e2b9d522  server_data
# ls -al client_data 
-rw-r--r--. 1 root root 524288000 May 30 04:22 client_data
# md5sum client_data 
7bb64858c2801de10cac8abe4a8968f0  client_data

Expected results:
The md5sum values match.

Additional info:
1. x86_64 does not have this issue.
2. Both IPv4 and IPv6 have this issue.
3. No such issue without channel corruption.
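
To characterize the mismatch shown under "Actual results", it may help to count how many bytes differ and where the first difference is. The following is only a sketch: the function names are hypothetical, and it assumes both files have been brought onto the same host over a path that has no netem corruption applied.

```shell
#!/bin/sh
# Hypothetical helper: count the bytes that differ between two same-sized files.
# Assumes both files are present on the local host.
count_diff_bytes() {
    # cmp -l prints one line per differing byte; wc -l counts those lines.
    # tr strips the padding that some wc implementations add.
    cmp -l "$1" "$2" | wc -l | tr -d ' '
}

# Hypothetical helper: print the 1-based offset of the first differing byte.
# Prints nothing if the files are identical.
first_diff_offset() {
    # The first field of each cmp -l line is the byte offset.
    cmp -l "$1" "$2" | head -n 1 | awk '{print $1}'
}
```

For example, "count_diff_bytes client_data server_data" prints the number of differing bytes, and "first_diff_offset client_data server_data" prints where the first one occurs.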

--- Additional comment from Li Tian on 2023-05-31 11:19:47 CST ---

In case people can't view Polarion links, here are the steps:

1. Create 2 VMs on Azure and make sure checksum offload is turned on -
# ethtool -k eth0|grep checksum

2. Install the packages below -
nmap-ncat iproute-tc kernel-modules-extra

3. On VM#1 open a listening port with 'nc' -
# nc -l 2233 > server_data

4. On VM#2 create a 0.5% channel corruption - 
# tc qdisc add dev eth0 root netem corrupt 0.5%
Then generate a 500MB file and send it to VM#1 -
# dd if=/dev/urandom of=client_data bs=1024k count=500; nc <VM#1_IP> 2233 < client_data

5. Compare the md5sum values of 'server_data' and 'client_data'.
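
The checks in steps 1 and 5 could be scripted roughly as below. This is a sketch, not part of the Polarion cases: both function names are hypothetical, the offload check assumes the usual "rx-checksumming:" / "tx-checksumming:" lines in 'ethtool -k' output, and the md5 check assumes both files are available on one host.

```shell
#!/bin/sh
# Hypothetical helper for step 1: reads `ethtool -k <dev>` output on stdin and
# succeeds only if both rx-checksumming and tx-checksumming are reported "on".
checksum_offload_on() {
    awk '
        $1 == "rx-checksumming:" && $2 == "on" { rx = 1 }
        $1 == "tx-checksumming:" && $2 == "on" { tx = 1 }
        END { exit !(rx && tx) }
    '
}

# Hypothetical helper for step 5: succeeds if two local files have the same MD5.
# Assumes both files have been made available on the same host.
same_md5() {
    a=$(md5sum < "$1" | awk '{print $1}')
    b=$(md5sum < "$2" | awk '{print $1}')
    # Guard against an empty hash (e.g. unreadable file) counting as a match.
    [ -n "$a" ] && [ "$a" = "$b" ]
}
```

Possible usage: "ethtool -k eth0 | checksum_offload_on && echo offload-on" on each VM, and "same_md5 client_data server_data || echo MISMATCH" once both files are on one host. After the test, the netem qdisc from step 4 can be removed with "tc qdisc del dev eth0 root".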

