Bug 1367374 - [RHEL-7.3] Add rpcrdma to support NFSROOT over NFSoRDMA
Summary: [RHEL-7.3] Add rpcrdma to support NFSROOT over NFSoRDMA
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: dracut
Version: 7.3
Hardware: All
OS: Linux
Priority: high
Severity: medium
Target Milestone: rc
Target Release: ---
Assignee: Lukáš Nykrýn
QA Contact: Release Test Team
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2016-08-16 09:49 UTC by Honggang LI
Modified: 2016-11-04 08:06 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-11-04 08:06:36 UTC
Target Upstream Version:
Embargoed:


Attachments
patch to add rpcrdma module (826 bytes, application/mbox)
2016-08-16 09:49 UTC, Honggang LI
nfsroot over NFSoRDMA console log (85.39 KB, text/plain)
2016-08-16 10:50 UTC, Honggang LI


Links
System: Red Hat Product Errata
ID: RHBA-2016:2530
Private: 0
Priority: normal
Status: SHIPPED_LIVE
Summary: dracut bug fix and enhancement update
Last Updated: 2016-11-03 14:17:01 UTC

Description Honggang LI 2016-08-16 09:49:22 UTC
Created attachment 1191180 [details]
patch to add rpcrdma module

Description of problem:

NFSROOT works over IPoIB but fails over NFSoRDMA, because the NFSoRDMA kernel module, rpcrdma, is missing from the initramfs. Please review and apply the attached patch.

# cat /tmp/0453-Add-rpcrdma-module-to-support-NFSROOT-over-NFSoRDMA-.patch 
From d783ff5a50f739d3e40a148c80f99f5ad2d491b7 Mon Sep 17 00:00:00 2001
From: Honggang Li <honli>
Date: Tue, 16 Aug 2016 04:12:02 -0400
Subject: [PATCH] Add rpcrdma module to support NFSROOT over NFSoRDMA for RHEL7

Signed-off-by: Honggang Li <honli>
---
 modules.d/95nfs/module-setup.sh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/modules.d/95nfs/module-setup.sh b/modules.d/95nfs/module-setup.sh
index de5a754..e4ba678 100755
--- a/modules.d/95nfs/module-setup.sh
+++ b/modules.d/95nfs/module-setup.sh
@@ -25,7 +25,7 @@ depends() {
 }
 
 installkernel() {
-    hostonly='' instmods nfs sunrpc ipv6 nfsv2 nfsv3 nfsv4 nfs_acl nfs_layout_nfsv41_files
+    hostonly='' instmods nfs sunrpc ipv6 nfsv2 nfsv3 nfsv4 nfs_acl nfs_layout_nfsv41_files rpcrdma
 }
 
 install() {
-- 
1.8.3.1
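
For a quick local experiment, the one-line change in the patch can also be applied in place. This is only a sketch and not part of the original report: the function name add_rpcrdma is hypothetical, and it assumes GNU sed and an instmods line that looks like the one in the diff above.

```shell
# Hypothetical helper: append rpcrdma to the instmods line of the given
# 95nfs/module-setup.sh, unless it is already listed (safe to re-run).
add_rpcrdma() {
    file=$1
    # Nothing to do if the instmods line already mentions rpcrdma.
    if grep -q 'instmods.*rpcrdma' "$file"; then
        return 0
    fi
    # Strip trailing whitespace on the instmods line and append the module.
    sed -i '/instmods nfs /s/[[:space:]]*$/ rpcrdma/' "$file"
}

# Usage (path as shipped by the dracut package; adjust if yours differs):
#   add_rpcrdma /usr/lib/dracut/modules.d/95nfs/module-setup.sh
```

Rebuilding the package with the attached patch remains the proper route; the helper only mirrors the same edit.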


Version-Release number of selected component (if applicable):
dracut-033-453.el7.src.rpm

How reproducible:
always

Steps to Reproduce:
See the reproducer in comment 1 and comment 2.

Actual results:
NFSROOT over NFSoRDMA hangs.

Expected results:


Additional info:

Comment 1 Honggang LI 2016-08-16 10:09:08 UTC
Reproducer:

[root@ib2-qa-03 tftpboot]# grep -i distro /etc/motd
                           DISTRO=RHEL-7.3-20160729.1

[root@ib2-qa-03 ~]# yum install -y syslinux httpd tftp-server dhcp

[root@ib2-qa-03 ~]#  cp /usr/share/syslinux/pxelinux.0   /var/lib/tftpboot/
[root@ib2-qa-03 ~]#  cp /usr/share/syslinux/chain.c32    /var/lib/tftpboot/
[root@ib2-qa-03 ~]#  cp /usr/share/syslinux/menu.c32     /var/lib/tftpboot/
[root@ib2-qa-03 ~]#  cp /usr/share/syslinux/memdisk      /var/lib/tftpboot/
[root@ib2-qa-03 ~]#  cp /usr/share/syslinux/mboot.c32    /var/lib/tftpboot/
[root@ib2-qa-03 ~]#  ls /var/lib/tftpboot/
chain.c32  mboot.c32  memdisk  menu.c32  pxelinux.0


# You have to set up the GUIDS for opensm because the HCAs on rdma03/04 are connected
# back to back. Without this, the IB ports won't be reset and reinitialized when the
# remote PXE client reboots (the HCA ports lose power).
[root@ib2-qa-03 ~]# grep -v '^#' /etc/sysconfig/opensm 
GUIDS="0x0002c90300b3cff1 0x0002c90300b3cff2"

[root@ib2-qa-03 ~]# cat /etc/xinetd.d/tftp
# default: off
# description: The tftp server serves files using the trivial file transfer \
#	protocol.  The tftp protocol is often used to boot diskless \
#	workstations, download configuration files to network-aware printers, \
#	and to start the installation process for some operating systems.
service tftp
{
	socket_type		= dgram
	protocol		= udp
	wait			= yes
	user			= root
	server			= /usr/sbin/in.tftpd
	server_args		= -s /var/lib/tftpboot
	disable			= no
	per_source		= 11
	cps			= 100 2
	flags			= IPv4
}
[root@ib2-qa-03 ~]# 


[root@ib2-qa-03 ~]# ip addr show mlx4_ib1
8: mlx4_ib1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 65520 qdisc pfifo_fast state UP qlen 256
    link/infiniband 80:00:02:08:fe:80:00:00:00:00:00:00:00:02:c9:03:00:b3:cf:f1 brd 00:ff:ff:ff:ff:12:40:1b:ff:ff:00:00:00:00:00:00:ff:ff:ff:ff
    inet 172.31.2.3/24 brd 172.31.2.255 scope global mlx4_ib1
       valid_lft forever preferred_lft forever
    inet6 fe80::202:c903:b3:cff1/64 scope link 
       valid_lft forever preferred_lft forever
[root@ib2-qa-03 ~]#

[root@ib2-qa-03 ~]# cat /etc/dhcp/dhcpd.conf 
#
# DHCP Server Configuration file.
#   see /usr/share/doc/dhcp*/dhcpd.conf.example
#   see dhcpd.conf(5) man page
#

DHCPDARGS="mlx4_ib1";

option space pxelinux;
option pxelinux.magic code 208 = string;
option pxelinux.configfile code 209 = text;
option pxelinux.pathprefix code 210 = text;
option pxelinux.reboottime code 211 = unsigned integer 32;

subnet 172.31.2.0 netmask 255.255.255.0 {
        option routers 10.0.0.254;
        range 172.31.2.100 172.31.2.101;
	always-broadcast on;


        class "pxeclients" {
                match if substring (option vendor-class-identifier, 0, 9) = "PXEClient";
                next-server 172.31.2.3;
                filename "pxelinux.0";
        }

        host rdma04-ib1 {
		option dhcp-client-identifier = 80:00:02:08:fe:80:00:00:00:00:00:00:00:02:c9:03:00:b3:c7:c1;
                fixed-address 172.31.2.100;
        }
}

[root@ib2-qa-03 ~]# modprobe rpcrdma


[root@ib2-qa-03 ~]# systemctl enable httpd.service
[root@ib2-qa-03 ~]# systemctl enable dhcpd.service
[root@ib2-qa-03 ~]# systemctl enable tftp.service
[root@ib2-qa-03 ~]# systemctl enable nfs.service

[root@ib2-qa-03 ~]# systemctl start tftp.service
[root@ib2-qa-03 ~]# systemctl start httpd.service
[root@ib2-qa-03 ~]# systemctl start dhcpd.service
[root@ib2-qa-03 ~]# systemctl start nfs.service

[root@ib2-qa-03 ~]# echo 'rdma 2050' >> /proc/fs/nfsd/portlist 

[root@ib2-qa-03 ~]#  cat /proc/fs/nfsd/portlist 
rdma 2050
udp 2049
tcp 2049
udp 2049
tcp 2049
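
The RDMA listener above is added by hand to the running nfsd, so it seems worth confirming that it is present before PXE-booting the client. A minimal sketch follows; the function name nfsd_has_rdma is hypothetical, and the optional second argument exists only so the check can be run against a copy of the file.

```shell
# Hypothetical helper: succeed if nfsd lists an RDMA listener on the given port.
# The second argument overrides the portlist path (useful for testing).
nfsd_has_rdma() {
    port=$1
    portlist=${2:-/proc/fs/nfsd/portlist}
    # -x matches the whole line, so "rdma 2050" does not match "rdma 20500".
    grep -qx "rdma $port" "$portlist"
}

# Usage:
#   nfsd_has_rdma 2050 || echo 'rdma 2050' >> /proc/fs/nfsd/portlist
```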


[root@ib2-qa-03 ~]# systemctl status dhcpd.service
[root@ib2-qa-03 ~]# systemctl status tftp.service
[root@ib2-qa-03 ~]# systemctl status httpd.service
[root@ib2-qa-03 ~]# systemctl status nfs.service

[root@ib2-qa-03 ~]# systemctl stop  firewalld.service

[root@ib2-qa-03 ~]# cat /var/lib/tftpboot/pxelinux.cfg/default 
DEFAULT menu.c32
PROMPT 0
TIMEOUT 100
#ONTIMEOUT local
#ONTIMEOUT RHEL-7.3-20160719.1
#ONTIMEOUT RHEL-7.3-20160729.1
ONTIMEOUT RHEL-7.3-20160729.1-NFSROOT

MENU TITLE PXE Menu

MENU seperator
LABEL local
MENU LABEL Boot local hard drive
LOCALBOOT 0

LABEL  RHEL-7.3-20160729.1
KERNEL RHEL-7.3-20160729.1/images/pxeboot/vmlinuz
APPEND initrd=RHEL-7.3-20160729.1/images/pxeboot/initrd.img  inst.repo=http://172.31.2.3/RHEL-7.3-20160729.1/ ks.device=bootif ip=ib0:dhcp biosdevname=0  sshd rd.shell rd.debug=0 rd.neednet=1 rdloaddriver=mlx4_ib,ib_ipoib,ib_iser console=ttyS1,115200n81 inst.text

LABEL  RHEL-7.3-20160719.1
KERNEL RHEL-7.3-20160719.1/images/pxeboot/vmlinuz
APPEND initrd=RHEL-7.3-20160719.1/images/pxeboot/initrd.img  inst.repo=http://172.31.2.3/RHEL-7.3-20160719.1/ ks.device=bootif ip=ib0:dhcp biosdevname=0  sshd rd.shell rd.debug=0 rd.neednet=1 rdloaddriver=mlx4_ib,ib_ipoib,ib_iser console=ttyS1,115200n81 inst.text

LABEL  RHEL-7.3-20160729.1-NFSROOT
KERNEL nfsordma/images/pxeboot/vmlinuz
# NFSROOT over IPoIB, it works.
#APPEND initrd=nfsordma/images/pxeboot/initrd.img root=nfs:172.31.2.3:/var/lib/tftpboot/rhel7/root rw selinux=0 console=ttyS1,115200n81 ip=ib0:dhcp biosdevname=0 sshd rd.shell rd.neednet=1 rdloaddriver=mlx4_ib,ib_ipoib,ib_iser
# NFSROOT over NFSoRDMA
APPEND initrd=nfsordma/images/pxeboot/initrd.img root=nfs:172.31.2.3:/var/lib/tftpboot/rhel7/root,proto=rdma,port=2050,rw,nfsvers=3,vers=3,rsize=4096,wsize=4096 rw selinux=0 console=ttyS1,115200n81 ip=ib0:dhcp biosdevname=0 sshd rd.shell rd.neednet=1 rdloaddriver=mlx4_ib,ib_ipoib,rpcrdma nfsrootdebug

*******************************************************************
rdma03/04 are connected back to back with mlx4 HCAs; in other words,
there is no IB switch between them. With the default rsize and wsize,
the NFSoRDMA client cannot talk to the server after it connects, and
the PXE client hangs with this message:

nfs: server 172.31.2.3 not responding, still trying
nfs: server 172.31.2.3 not responding, still trying

So we have to decrease wsize and rsize.
*******************************************************************
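
To try other rsize/wsize values without retyping the long APPEND line, the root= argument can be generated. This is only a sketch: the function name nfsordma_root_arg is hypothetical, and the option string simply reproduces the one used in the NFSoRDMA entry above with the transfer size made variable.

```shell
# Hypothetical helper: print a root= argument for NFSROOT over NFSoRDMA.
# Arguments: server address, export path, RDMA port, rsize/wsize in bytes.
nfsordma_root_arg() {
    server=$1
    export_path=$2
    port=$3
    size=$4
    # Same option order as the APPEND line in the reproducer.
    printf 'root=nfs:%s:%s,proto=rdma,port=%s,rw,nfsvers=3,vers=3,rsize=%s,wsize=%s\n' \
        "$server" "$export_path" "$port" "$size" "$size"
}

# Usage:
#   nfsordma_root_arg 172.31.2.3 /var/lib/tftpboot/rhel7/root 2050 4096
```

The same options can be tried from an already booted client with mount -t nfs -o proto=rdma,port=2050,vers=3,rsize=4096,wsize=4096 before changing the PXE configuration.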

Comment 2 Honggang LI 2016-08-16 10:25:19 UTC
Reproducer (cont):

[root@ib2-qa-03 log]# cat /etc/exports
/var/lib/tftpboot/rhel7/root *(rw,no_root_squash) 
/nfsordma *(rw,no_root_squash) 
[root@ib2-qa-03 log]# showmount  -e
Export list for ib2-qa-03:
/nfsordma                    *
/var/lib/tftpboot/rhel7/root *


# yum groups -y install "Server with GUI" --releasever=7 --installroot=/var/lib/tftpboot/rhel7/root/

# yum erase firewalld  --releasever=7 --installroot=/var/lib/tftpboot/rhel7/root/
# yum erase plymouth  --releasever=7 --installroot=/var/lib/tftpboot/rhel7/root/

************************************************************************
plymouth has to be erased; otherwise the system hangs with this message:

systemd: Starting Wait for Plymouth Boot Screen to Quit.. (xxx/ no limit).
************************************************************************

Update the root user's password in the NFSROOT filesystem (here, the same hash as on the server).

[root@ib2-qa-03 log]# grep root /etc/shadow | awk -F: '{print $2}'
$6$kjvTz6bX$k6L6zckQjXX5VovWiaA.e9X1/zs7DZ6lhQG8zywdd1ngmsWoL/LEdMYa/y5XdfkFIRHsGEDjuGRgy98za9E9r/

[root@ib2-qa-03 log]# grep root /var/lib/tftpboot/rhel7/root/etc/shadow
root:$6$kjvTz6bX$k6L6zckQjXX5VovWiaA.e9X1/zs7DZ6lhQG8zywdd1ngmsWoL/LEdMYa/y5XdfkFIRHsGEDjuGRgy98za9E9r/:16925:0:99999:7:::

[root@ib2-qa-03 log]# cat /var/lib/tftpboot/rhel7/root/etc/fstab 
none    /tmp        tmpfs   defaults   0 0
tmpfs   /dev/shm    tmpfs   defaults   0 0
sysfs   /sys        sysfs   defaults   0 0
proc    /proc       proc    defaults   0 0

Download dracut-033-453.el7.src.rpm, apply the patch 0453-Add-rpcrdma-module-to-support-NFSROOT-over-NFSoRDMA-.patch, and rebuild it. Then create a local repo for the updated RPMs.

[root@ib2-qa-03 local]# pwd
/var/www/html/local
[root@ib2-qa-03 local]# ls
dracut-033-454.el7.x86_64.rpm                 dracut-fips-033-454.el7.x86_64.rpm
dracut-caps-033-454.el7.x86_64.rpm            dracut-fips-aesni-033-454.el7.x86_64.rpm
dracut-config-generic-033-454.el7.x86_64.rpm  dracut-network-033-454.el7.x86_64.rpm
dracut-config-rescue-033-454.el7.x86_64.rpm   dracut-tools-033-454.el7.x86_64.rpm
dracut-debuginfo-033-454.el7.x86_64.rpm       
[root@ib2-qa-03 local]# createrepo -d .



Create pxeboot images with the updated dracut packages.

# lorax -p RHEL -v 3 -r 7 -i anaconda-dracut -i rdma  -s http://download.eng.pek2.redhat.com/pub/rhel/rel-eng/RHEL-7.3-20160729.1/compose/Server/x86_64/os -s http://download.eng.pek2.redhat.com/pub/rhel/rel-eng/RHEL-7.3-20160729.1/compose/Server-optional/x86_64/os   -s http://localhost/local lorax 2>&1 | tee output.txt

[root@ib2-qa-03 nfsroot]# cp lorax/images/pxeboot/* /var/lib/tftpboot/nfsordma/images/pxeboot/
[root@ib2-qa-03 nfsroot]# ls /var/lib/tftpboot/nfsordma/images/pxeboot/
initrd.img  upgrade.img  vmlinuz
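
Before booting the client, it may be worth checking that the new initrd.img actually carries the module. A minimal sketch, assuming lsinitrd from the dracut package is available; the function name listing_has_rpcrdma is hypothetical, and it only inspects a file listing passed on stdin.

```shell
# Hypothetical helper: read an lsinitrd file listing on stdin and succeed if
# it contains the rpcrdma kernel module (plain or with a compression suffix).
listing_has_rpcrdma() {
    grep -Eq '/rpcrdma\.ko(\.[a-z0-9]+)?$'
}

# Usage:
#   lsinitrd /var/lib/tftpboot/nfsordma/images/pxeboot/initrd.img | listing_has_rpcrdma \
#       && echo "rpcrdma is in the initrd"
```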


Now we have a PXE server and an NFSROOT filesystem. Power on rdma04 to boot it with the remote NFSROOT as its root filesystem.

Comment 3 Honggang LI 2016-08-16 10:50:41 UTC
Created attachment 1191199 [details]
nfsroot over NFSoRDMA console log


Comment 4 Marek Hruscak 2016-08-18 13:07:11 UTC
Hi Honggang,
would you be willing to retest once the fix is present in a compose? I will notify you when it is.

Comment 5 Honggang LI 2016-08-18 13:14:40 UTC
(In reply to Marek Hruscak from comment #4)
> Hi Honggang,
> would you be willing to retest once the fix is present in a compose? I will
> notify you when it is.

Yes, I will. Thanks

Comment 8 Honggang LI 2016-09-01 06:06:01 UTC
Tested RHEL-7.3-20160830.n.0; NFSROOT over NFSoRDMA works.

===========================================================
[  OK  ] Started Realm and Domain Configuration.
         Starting WPA Supplicant daemon...
[  OK  ] Started WPA Supplicant daemon.
[  OK  ] Started Location Lookup Service.

Red Hat Enterprise Linux Server 7.3 Beta (Maipo)
Kernel 3.10.0-498.el7.x86_64 on an x86_64

localhost login: root
Password: 
[root@localhost ~]# mount | grep nfs
172.31.2.3:/var/lib/tftpboot/rhel7/root on / type nfs (rw,relatime,vers=3,rsize=4096,wsize=4096,namlen=255,hard,nolock,proto=rdma,port=2050,timeo=600,retrans=2,sec=sys,mountaddr=172.31.2.3,mountvers=3,mountproto=tcp,local_lock=all,addr=172.31.2.3)
rpc_pipefs on /var/lib/nfs/rpc_pipefs type rpc_pipefs (rw,relatime)
[root@localhost ~]# df -h
Filesystem                               Size  Used Avail Use% Mounted on
172.31.2.3:/var/lib/tftpboot/rhel7/root   50G  6.2G   44G  13% /
devtmpfs                                  16G     0   16G   0% /dev
tmpfs                                     16G  4.0K   16G   1% /dev/shm
tmpfs                                     16G   46M   16G   1% /run
tmpfs                                     16G     0   16G   0% /sys/fs/cgroup
none                                      16G  4.0K   16G   1% /tmp
tmpfs                                    3.2G   12K  3.2G   1% /run/user/990
tmpfs                                    3.2G     0  3.2G   0% /run/user/0
[root@localhost ~]# cat /proc/cmdline 
initrd=nfsordma/images/pxeboot/initrd.img root=nfs:172.31.2.3:/var/lib/tftpboot/rhel7/root,proto=rdma,port=2050,rw,nfsvers=3,vers=3,rsize=4096,wsize=4096 rw selinux=0 console=ttyS1,115200n81 ip=ib0:dhcp biosdevname=0 sshd rd.shell rd.neednet=1 rdloaddriver=mlx4_ib,ib_ipoib,rpcrdma nfsrootdebug BOOT_IMAGE=nfsordma/images/pxeboot/vmlinuz 
[root@localhost ~]#
=============================================================================
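
The check done by eye above (the root filesystem is NFS and its options include proto=rdma) can be scripted for repeated test runs. This is a sketch only; the function name root_is_nfs_over_rdma is hypothetical, and it assumes mount output in the format shown in this comment.

```shell
# Hypothetical helper: read `mount` output on stdin and succeed if the root
# filesystem is an NFS mount whose options include proto=rdma.
root_is_nfs_over_rdma() {
    awk '$2 == "on" && $3 == "/" && $4 == "type" && $5 ~ /^nfs4?$/ &&
         $6 ~ /[(,]proto=rdma[,)]/ { found = 1 }
         END { exit !found }'
}

# Usage:
#   mount | root_is_nfs_over_rdma && echo "root is on NFSoRDMA"
```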

Comment 9 Marek Hruscak 2016-09-05 13:03:13 UTC
Thank you Honggang for verifying.
Setting state to verified based on comment #8

Comment 11 errata-xmlrpc 2016-11-04 08:06:36 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2016-2530.html

