Bug 1671542

Summary: [RHEL-8.0] rcopy over qedr iw fails
Product: Red Hat Enterprise Linux 8 Reporter: Afom T. Michael <tmichael>
Component: rdma-core Assignee: Honggang LI <honli>
Status: CLOSED WONTFIX QA Contact: zguo <zguo>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: 8.1 CC: bchae, hwkernel-mgr, infiniband-qe, irusskikh, irusskik, jarod, linville, mchopra, rdma-dev-team, zguo
Target Milestone: rc   
Target Release: 8.4   
Hardware: Unspecified   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2021-02-01 07:32:23 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1850705    

Description Afom T. Michael 2019-01-31 21:05:24 UTC
Description of problem:
On RHEL-8 (kernel 4.18.0-64.el8.x86_64), on a system with a "QLogic Corp. FastLinQ QL41000" adapter (qedr, iWARP), running "$ rcopy /tmp/123 172.31.50.24" on the client fails.

Version-Release number of selected component (if applicable):
[root@rdma-qe-25 ~]$ cat /etc/redhat-release 
Red Hat Enterprise Linux release 8.0 Beta (Ootpa)
[root@rdma-qe-25 ~]$ uname -r
4.18.0-64.el8.x86_64
[root@rdma-qe-25 ~]$ rpm -qa | grep rdma
rdma-core-22-2.el8.x86_64
librdmacm-22-2.el8.x86_64
librdmacm-utils-22-2.el8.x86_64
rdma-core-devel-22-2.el8.x86_64
[root@rdma-qe-25 ~]$
[root@rdma-qe-25 ~]$ rpm -qf $(which rcopy)
librdmacm-utils-22-2.el8.x86_64
[root@rdma-qe-25 ~]$
[root@rdma-qe-25 ~]$ ibv_devinfo  | grep -A 16 qedr1
hca_id:	qedr1
	transport:			iWARP (1)
	fw_ver:				8.37.2.0
	node_guid:			f6e9:d4ff:fe61:e6af
	sys_image_guid:			f6e9:d4ff:fe61:e6af
	vendor_id:			0x1077
	vendor_part_id:			32880
	hw_ver:				0x0
	phys_port_cnt:			1
		port:	1
			state:			PORT_ACTIVE (4)
			max_mtu:		4096 (5)
			active_mtu:		4096 (5)
			sm_lid:			0
			port_lid:		0
			port_lmc:		0x00
			link_layer:		Ethernet

[root@rdma-qe-25 ~]$ ibstatus qedr1
Infiniband device 'qedr1' port 1 status:
	default gid:	 f4e9:d461:e6af:0000:0000:0000:0000:0000
	base lid:	 0x0
	sm lid:		 0x0
	state:		 4: ACTIVE
	phys state:	 5: LinkUp
	rate:		 25 Gb/sec (1X EDR)
	link_layer:	 Ethernet

[root@rdma-qe-25 ~]$ ip a s qede_iw
9: qede_iw: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc mq state UP group default qlen 1000
    link/ether f4:e9:d4:61:e6:af brd ff:ff:ff:ff:ff:ff
    inet 172.31.50.25/24 brd 172.31.50.255 scope global noprefixroute qede_iw
       valid_lft forever preferred_lft forever
    inet6 fe80::f6e9:d4ff:fe61:e6af/64 scope link noprefixroute 
       valid_lft forever preferred_lft forever
[root@rdma-qe-25 ~]$


How reproducible:
Always

Steps to Reproduce:
1. On the server, run "$ rcopy".
2. On the client, run "$ rcopy /tmp/123 172.31.50.24", where 172.31.50.24 is the IP address of the server's qedr interface and /tmp/123 is a file.
3. 
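
The client side of the steps above could be wrapped in a small script. This is only a sketch: `repro_client` and the `RCOPY` override variable are hypothetical names introduced here and are not part of rdma-core. The `timeout 3m` wrapper mirrors the invocation shown under "Actual results".

```shell
#!/bin/sh
# Sketch of the client side of the reproduction. Assumes an rcopy server
# was already started on the peer by running "rcopy" with no arguments.
# RCOPY is a hypothetical override so the wrapper can be exercised without
# RDMA hardware; it defaults to the real rcopy binary.
repro_client() {
    repro_src=$1      # local file to send, e.g. /tmp/123
    repro_server=$2   # server address, e.g. 172.31.50.24
    repro_rc=0
    # Bound the run to three minutes, as in the report, so a stuck
    # connection attempt cannot block the test run indefinitely.
    timeout 3m "${RCOPY:-rcopy}" "$repro_src" "$repro_server" || repro_rc=$?
    echo "rcopy exit status: $repro_rc"
    return "$repro_rc"
}
```

A call such as `repro_client /tmp/123 172.31.50.24` prints the exit status of the run; in the failing case recorded in this report, that status was 255.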

Actual results:

[root@rdma-qe-25 ~]$ timeout 3m rcopy /tmp/123 172.31.50.24
rconnect failed
: Operation not supported
[root@rdma-qe-25 ~]$ echo $?
255
[root@rdma-qe-25 ~]$


Expected results:
rcopy completes successfully.

Additional info:
I'm not sure whether this is a bug, especially given the message ": Operation not supported", but I want to at least record it.

Comment 1 Jarod Wilson 2019-11-21 19:38:31 UTC
I suspect we should re-test this with rdma-core v26 now available in RHEL-8.2, and if things are still failing, we ought to get the hardware partner involved here. Can we do a quick re-test with v26?

Comment 2 zguo 2019-11-22 09:45:14 UTC
(In reply to Jarod Wilson from comment #1)
> I suspect we should re-test this with rdma-core v26 now available in
> RHEL-8.2, and if things are still failing, we ought to get the hardware
> partner involved here. Can we do a quick re-test with v26?
 
Will retest when the test machine is available to us.

Comment 3 Afom T. Michael 2019-11-26 12:06:17 UTC
(In reply to zguo from comment #2)
> (In reply to Jarod Wilson from comment #1)
> > I suspect we should re-test this with rdma-core v26 now available in
> > RHEL-8.2, and if things are still failing, we ought to get the hardware
> > partner involved here. Can we do a quick re-test with v26?
>  
> Will retest when the test machine is available to us.

It still fails with a recent RHEL-8.2 build; updating to rdma-core v26 didn't make a difference.

[root@rdma-dev-02 ~]$ uname -r
4.18.0-151.el8.x86_64
[root@rdma-dev-02 ~]$ rpm -qa | grep -E "rdma|infiniband-diags|verbs" | grep -v "kernel-kernel"
libibverbs-26.0-3.el8.x86_64
libibverbs-utils-26.0-3.el8.x86_64
rdma-core-devel-26.0-3.el8.x86_64
librdmacm-26.0-3.el8.x86_64
librdmacm-utils-26.0-3.el8.x86_64
rdma-core-26.0-3.el8.x86_64
infiniband-diags-26.0-3.el8.x86_64
[root@rdma-dev-02 ~]$ ip a s qede_iw
5: qede_iw: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc mq state UP group default qlen 1000
    link/ether 00:0e:1e:d4:22:7f brd ff:ff:ff:ff:ff:ff
    inet 172.31.50.102/24 brd 172.31.50.255 scope global dynamic noprefixroute qede_iw
       valid_lft 2598sec preferred_lft 2598sec
    inet6 fe80::20e:1eff:fed4:227f/64 scope link noprefixroute 
       valid_lft forever preferred_lft forever
[root@rdma-dev-02 ~]$ ping -c3 172.31.50.103
PING 172.31.50.103 (172.31.50.103) 56(84) bytes of data.
64 bytes from 172.31.50.103: icmp_seq=1 ttl=64 time=0.090 ms
64 bytes from 172.31.50.103: icmp_seq=2 ttl=64 time=0.104 ms
64 bytes from 172.31.50.103: icmp_seq=3 ttl=64 time=0.104 ms

--- 172.31.50.103 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 75ms
rtt min/avg/max/mdev = 0.090/0.099/0.104/0.010 ms
[root@rdma-dev-02 ~]$ rcopy /tmp/123 172.31.50.103
rconnect failed
: Operation not supported
[root@rdma-dev-02 ~]$ echo $?
255
[root@rdma-dev-02 ~]$

Comment 4 Honggang LI 2020-01-13 07:18:24 UTC
Hi Manish,

As you are the maintainer of the drivers/infiniband/hw/qedr module, could you please fix this qedr hardware-specific issue?

Thanks

Comment 5 John W. Linville 2020-06-19 17:46:12 UTC
Clearly this one missed RHEL 8.0 (and 8.1, 8.2, 8.3)...moving the target release forward to 8.4 for now, but please re-evaluate this bug for continued relevance, and close it if appropriate (WONTFIX, CANTFIX, NOTABUG, CURRENTRELEASE, etc). Given the age of this bug, I don't see it as much of a priority at this time -- please let me know if I'm missing something important.

Comment 6 Manish Chopra (Marvell) 2020-06-24 19:37:19 UTC
Hi,

Some recommendations from the Marvell tester:

Please try with root or sudo permissions; otherwise it won't work. Also, the file should not be empty.
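
The two preconditions named above (root privileges and a non-empty file) could be checked before invoking rcopy. The following is a minimal sketch; `rcopy_preflight` is a hypothetical helper written for illustration, not something shipped with rdma-core, and the return codes are arbitrary choices.

```shell
#!/bin/sh
# Hypothetical helper (not part of rdma-core): verify the two
# preconditions from the recommendation before running rcopy.
# Returns 0 if both hold, 1 if the file is missing or empty,
# 2 if the caller is not root.
rcopy_preflight() {
    preflight_file=$1
    # "test -s" is true only if the file exists and its size is non-zero.
    if [ ! -s "$preflight_file" ]; then
        echo "preflight: $preflight_file is missing or empty" >&2
        return 1
    fi
    # "id -u" prints the effective user ID; 0 means root (or sudo).
    if [ "$(id -u)" -ne 0 ]; then
        echo "preflight: not running as root; try sudo" >&2
        return 2
    fi
    return 0
}
```

It could be used as `rcopy_preflight /tmp/123 && rcopy /tmp/123 172.31.50.24`, so that rcopy only runs when both checks pass.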

Failed case: running as a regular user (insufficient permissions)
Chelsio/QLogic:

Server (as a regular user):
[venkata@RH77-730C ~]$ rcopy
waiting for connection...client:
Segmentation fault (core dumped)
[venkata@RH77-730C ~]$

Client:
[venkata@RH77-730S builds]$ rcopy  rdma-core-topic-send_with_invalidate.zip 6.6.6.7
rconnect failed
: Connection refused
[venkata@RH77-730S builds]$

Working scenarios: tried with the 8.2 inbox driver and also with the out-of-box (OOB) driver.

QLogic/Chelsio, with root or sudo privileges:

Client:
[venkata@Dell830S ~]$ sudo rcopy OFED-4.17-1.tgz 40.40.42.2
opening...transferring...closing...done
33406975 bytes in 0.04 seconds = 6.60 Gb/sec
[venkata@Dell830S ~]$

Server:
[venkata@Dell830C tmp]$ sudo rcopy
waiting for connection...client: 40.40.42.1
opening: OFED-4.17-1.tgz, transferring...33406975 bytes...closing...done
waiting for connection...^C
[venkata@Dell830C tmp]$ ls OFED-4.17-1.tgz
OFED-4.17-1.tgz
[venkata@Dell830C tmp]$

As the root user:

Server:
[root@Dell830C tmp]# rcopy
waiting for connection...client: 40.40.43.1
opening: iperf-2.0.9-source.tar.gz, transferring...277702 bytes...closing...done
waiting for connection...^C
[root@Dell830C tmp]# ls iperf-2.0.9-source.tar.gz
iperf-2.0.9-source.tar.gz
[root@Dell830C tmp]#

Client:
[root@Dell830S builds]# rcopy iperf-2.0.9-source.tar.gz 40.40.43.2
opening...transferring...closing...done
277702 bytes in 0.02 seconds = 0.14 Gb/sec
[root@Dell830S builds]#

Other method: give the complete destination path on the client.

Server# rcopy
waiting for connection...client: 40.40.43.1
opening: /builds/iperf-2.0.9-source.tar.gz, transferring...277702 bytes...closing...done
waiting for connection...

Client # rcopy iperf-2.0.9-source.tar.gz 40.40.43.2:/builds/iperf-2.0.9-source.tar.gz
opening...transferring...closing...done
277702 bytes in 0.00 seconds = 6.71 Gb/sec
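
The `server:destination` argument used in the client command above could be assembled by a small helper. This is a sketch only: `rcopy_target` is a hypothetical name, and rejecting relative destinations is an assumption made here to mirror the "complete path" advice, not documented rcopy behavior.

```shell
#!/bin/sh
# Hypothetical helper: build the second rcopy argument, either the bare
# server address or the "server:destination" form used above.
rcopy_target() {
    target_host=$1
    target_dest=${2:-}
    if [ -z "$target_dest" ]; then
        # No destination given: pass the bare server address.
        printf '%s\n' "$target_host"
        return 0
    fi
    case $target_dest in
        /*) # Absolute path: join host and destination with a colon.
            printf '%s:%s\n' "$target_host" "$target_dest" ;;
        *)  # Assumption for this sketch: insist on a complete path.
            echo "rcopy_target: destination must be an absolute path" >&2
            return 1 ;;
    esac
}
```

For example, `rcopy iperf-2.0.9-source.tar.gz "$(rcopy_target 40.40.43.2 /builds/iperf-2.0.9-source.tar.gz)"` reproduces the client command shown above.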


Thanks,
Manish

Comment 9 RHEL Program Management 2021-02-01 07:32:23 UTC
After evaluating this issue, there are no plans to address it further or fix it in an upcoming release.  Therefore, it is being closed.  If plans change such that this issue will be fixed in an upcoming release, then the bug can be reopened.

Comment 10 Red Hat Bugzilla 2023-09-15 00:15:32 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 500 days