Bug 1367374 - [RHEL-7.3] Add rpcrdma to support NFSROOT over NFSoRDMA
Summary: [RHEL-7.3] Add rpcrdma to support NFSROOT over NFSoRDMA
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: dracut
Version: 7.3
Hardware: All
OS: Linux
Priority: high
Severity: medium
Target Milestone: rc
Target Release: ---
Assignee: Lukáš Nykrýn
QA Contact: Release Test Team
URL:
Whiteboard:
Keywords: OtherQA
Depends On:
Blocks:
 
Reported: 2016-08-16 09:49 UTC by Honggang LI
Modified: 2016-11-04 08:06 UTC (History)
Last Closed: 2016-11-04 08:06:36 UTC


Attachments
patch to add rpcrdma module (826 bytes, application/mbox)
2016-08-16 09:49 UTC, Honggang LI
nfsroot over NFSoRDMA console log (85.39 KB, text/plain)
2016-08-16 10:50 UTC, Honggang LI


External Trackers
Tracker ID Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2016:2530 normal SHIPPED_LIVE dracut bug fix and enhancement update 2016-11-03 14:17:01 UTC

Description Honggang LI 2016-08-16 09:49:22 UTC
Created attachment 1191180 [details]
patch to add rpcrdma module

Description of problem:

NFSROOT works over IPoIB but fails over NFSoRDMA because the NFSoRDMA module, rpcrdma, is missing from the initramfs. Please review and apply the attached patch.

# cat /tmp/0453-Add-rpcrdma-module-to-support-NFSROOT-over-NFSoRDMA-.patch 
From d783ff5a50f739d3e40a148c80f99f5ad2d491b7 Mon Sep 17 00:00:00 2001
From: Honggang Li <honli@redhat.com>
Date: Tue, 16 Aug 2016 04:12:02 -0400
Subject: [PATCH] Add rpcrdma module to support NFSROOT over NFSoRDMA for RHEL7

Signed-off-by: Honggang Li <honli@redhat.com>
---
 modules.d/95nfs/module-setup.sh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/modules.d/95nfs/module-setup.sh b/modules.d/95nfs/module-setup.sh
index de5a754..e4ba678 100755
--- a/modules.d/95nfs/module-setup.sh
+++ b/modules.d/95nfs/module-setup.sh
@@ -25,7 +25,7 @@ depends() {
 }
 
 installkernel() {
-    hostonly='' instmods nfs sunrpc ipv6 nfsv2 nfsv3 nfsv4 nfs_acl nfs_layout_nfsv41_files
+    hostonly='' instmods nfs sunrpc ipv6 nfsv2 nfsv3 nfsv4 nfs_acl nfs_layout_nfsv41_files rpcrdma
 }
 
 install() {
-- 
1.8.3.1
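
To confirm that a rebuilt initramfs actually carries the module, the image contents can be listed with `lsinitrd` (shipped with dracut) and filtered for `rpcrdma`. The following is a minimal sketch, assuming the module file is named `rpcrdma.ko`, optionally compressed. The helper name `initrd_has_module` is hypothetical and not part of dracut.

```shell
# Hypothetical helper: read an initramfs file listing on stdin and succeed
# if it contains the named kernel module (plain, .xz, or .gz compressed).
initrd_has_module() {
    grep -Eq "/$1\.ko(\.xz|\.gz)?\$"
}

# Usage on a real image (the path is only an example):
#   lsinitrd /var/lib/tftpboot/nfsordma/images/pxeboot/initrd.img \
#       | initrd_has_module rpcrdma && echo "rpcrdma present"
```

The helper only inspects the listing, so it can be tried on saved `lsinitrd` output without access to the image itself.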


Version-Release number of selected component (if applicable):
dracut-033-453.el7.src.rpm

How reproducible:
always

Steps to Reproduce:
1.
2.
3.

Actual results:
NFSROOT over NFSoRDMA hangs.

Expected results:


Additional info:

Comment 1 Honggang LI 2016-08-16 10:09:08 UTC
Reproducer:

[root@ib2-qa-03 tftpboot]# grep -i distro /etc/motd
                           DISTRO=RHEL-7.3-20160729.1

[root@ib2-qa-03 ~]# yum install -y syslinux httpd tftp-server dhcp

[root@ib2-qa-03 ~]#  cp /usr/share/syslinux/pxelinux.0   /var/lib/tftpboot/
[root@ib2-qa-03 ~]#  cp /usr/share/syslinux/chain.c32    /var/lib/tftpboot/
[root@ib2-qa-03 ~]#  cp /usr/share/syslinux/menu.c32     /var/lib/tftpboot/
[root@ib2-qa-03 ~]#  cp /usr/share/syslinux/memdisk      /var/lib/tftpboot/
[root@ib2-qa-03 ~]#  cp /usr/share/syslinux/mboot.c32    /var/lib/tftpboot/
[root@ib2-qa-03 ~]#  ls /var/lib/tftpboot/
chain.c32  mboot.c32  memdisk  menu.c32  pxelinux.0


# You have to set up the GUIDS for opensm because the HCAs on rdma03/04 are connected
# back to back. Without this, the IB ports are not reset and reinitialized when the
# remote PXE client reboots (the HCA ports lose power).
[root@ib2-qa-03 ~]# grep -v '^#' /etc/sysconfig/opensm 
GUIDS="0x0002c90300b3cff1 0x0002c90300b3cff2"
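
The port GUIDs in `GUIDS` can be read off the IPoIB hardware address: the last eight bytes of the 20-byte address printed by `ip addr show` are the port GUID. The sketch below shows that conversion. The helper name `lladdr_to_guid` is hypothetical, and the input is the `link/infiniband` address of `mlx4_ib1` from this report.

```shell
# Hypothetical helper: derive the port GUID from a 20-byte IPoIB
# link-layer address by joining its last eight bytes.
lladdr_to_guid() {
    printf '%s\n' "$1" | awk -F: '{
        guid = "0x"
        for (i = NF - 7; i <= NF; i++) guid = guid $i
        print guid
    }'
}

lladdr_to_guid 80:00:02:08:fe:80:00:00:00:00:00:00:00:02:c9:03:00:b3:cf:f1
# prints 0x0002c90300b3cff1
```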

[root@ib2-qa-03 ~]# cat /etc/xinetd.d/tftp
# default: off
# description: The tftp server serves files using the trivial file transfer \
#	protocol.  The tftp protocol is often used to boot diskless \
#	workstations, download configuration files to network-aware printers, \
#	and to start the installation process for some operating systems.
service tftp
{
	socket_type		= dgram
	protocol		= udp
	wait			= yes
	user			= root
	server			= /usr/sbin/in.tftpd
	server_args		= -s /var/lib/tftpboot
	disable			= no
	per_source		= 11
	cps			= 100 2
	flags			= IPv4
}
[root@ib2-qa-03 ~]# 


[root@ib2-qa-03 ~]# ip addr show mlx4_ib1
8: mlx4_ib1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 65520 qdisc pfifo_fast state UP qlen 256
    link/infiniband 80:00:02:08:fe:80:00:00:00:00:00:00:00:02:c9:03:00:b3:cf:f1 brd 00:ff:ff:ff:ff:12:40:1b:ff:ff:00:00:00:00:00:00:ff:ff:ff:ff
    inet 172.31.2.3/24 brd 172.31.2.255 scope global mlx4_ib1
       valid_lft forever preferred_lft forever
    inet6 fe80::202:c903:b3:cff1/64 scope link 
       valid_lft forever preferred_lft forever
[root@ib2-qa-03 ~]#

[root@ib2-qa-03 ~]# cat /etc/dhcp/dhcpd.conf 
#
# DHCP Server Configuration file.
#   see /usr/share/doc/dhcp*/dhcpd.conf.example
#   see dhcpd.conf(5) man page
#

DHCPDARGS="mlx4_ib1";

option space pxelinux;
option pxelinux.magic code 208 = string;
option pxelinux.configfile code 209 = text;
option pxelinux.pathprefix code 210 = text;
option pxelinux.reboottime code 211 = unsigned integer 32;

subnet 172.31.2.0 netmask 255.255.255.0 {
        option routers 10.0.0.254;
        range 172.31.2.100 172.31.2.101;
	always-broadcast on;


        class "pxeclients" {
                match if substring (option vendor-class-identifier, 0, 9) = "PXEClient";
                next-server 172.31.2.3;
                filename "pxelinux.0";
        }

        host rdma04-ib1 {
		option dhcp-client-identifier = 80:00:02:08:fe:80:00:00:00:00:00:00:00:02:c9:03:00:b3:c7:c1;
                fixed-address 172.31.2.100;
        }
}
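
The `host` entry above appears to use the client's 20-byte IPoIB hardware address as its `dhcp-client-identifier`. Assuming that is the case, the value can be pulled from `ip addr show` output on the client with a small filter. The helper name `ib_lladdr` is hypothetical, and whether the PXE firmware presents exactly the same identifier is an assumption that should be checked against the DHCP server log.

```shell
# Hypothetical helper: print the first link/infiniband address found in
# "ip addr show" output read from stdin.
ib_lladdr() {
    awk '$1 == "link/infiniband" { print $2; exit }'
}

# Usage on the client (interface name is an example):
#   ip addr show ib0 | ib_lladdr
```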

[root@ib2-qa-03 ~]# modprobe rpcrdma


[root@ib2-qa-03 ~]# systemctl enable httpd.service
[root@ib2-qa-03 ~]# systemctl enable dhcpd.service
[root@ib2-qa-03 ~]# systemctl enable tftp.service
[root@ib2-qa-03 ~]# systemctl enable nfs.service

[root@ib2-qa-03 ~]# systemctl start tftp.service
[root@ib2-qa-03 ~]# systemctl start httpd.service
[root@ib2-qa-03 ~]# systemctl start dhcpd.service
[root@ib2-qa-03 ~]# systemctl start nfs.service

[root@ib2-qa-03 ~]# echo 'rdma 2050' >> /proc/fs/nfsd/portlist 

[root@ib2-qa-03 ~]#  cat /proc/fs/nfsd/portlist 
rdma 2050
udp 2049
tcp 2049
udp 2049
tcp 2049
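
Because the RDMA listener is added by hand, it is worth checking for it before booting the client. The following is a minimal sketch that looks for an exact `rdma <port>` line in the portlist. The helper name `has_rdma_listener` is hypothetical.

```shell
# Hypothetical helper: succeed if the portlist read from stdin contains
# an RDMA listener on the given port.
has_rdma_listener() {
    grep -qx "rdma $1"
}

# Usage on the NFS server:
#   has_rdma_listener 2050 < /proc/fs/nfsd/portlist || echo "no RDMA listener on 2050"
```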


[root@ib2-qa-03 ~]# systemctl status dhcpd.service
[root@ib2-qa-03 ~]# systemctl status tftp.service
[root@ib2-qa-03 ~]# systemctl status httpd.service
[root@ib2-qa-03 ~]# systemctl status nfs.service

[root@ib2-qa-03 ~]# systemctl stop  firewalld.service

[root@ib2-qa-03 ~]# cat /var/lib/tftpboot/pxelinux.cfg/default 
DEFAULT menu.c32
PROMPT 0
TIMEOUT 100
#ONTIMEOUT local
#ONTIMEOUT RHEL-7.3-20160719.1
#ONTIMEOUT RHEL-7.3-20160729.1
ONTIMEOUT RHEL-7.3-20160729.1-NFSROOT

MENU TITLE PXE Menu

MENU seperator
LABEL local
MENU LABEL Boot local hard drive
LOCALBOOT 0

LABEL  RHEL-7.3-20160729.1
KERNEL RHEL-7.3-20160729.1/images/pxeboot/vmlinuz
APPEND initrd=RHEL-7.3-20160729.1/images/pxeboot/initrd.img  inst.repo=http://172.31.2.3/RHEL-7.3-20160729.1/ ks.device=bootif ip=ib0:dhcp biosdevname=0  sshd rd.shell rd.debug=0 rd.neednet=1 rdloaddriver=mlx4_ib,ib_ipoib,ib_iser console=ttyS1,115200n81 inst.text

LABEL  RHEL-7.3-20160719.1
KERNEL RHEL-7.3-20160719.1/images/pxeboot/vmlinuz
APPEND initrd=RHEL-7.3-20160719.1/images/pxeboot/initrd.img  inst.repo=http://172.31.2.3/RHEL-7.3-20160719.1/ ks.device=bootif ip=ib0:dhcp biosdevname=0  sshd rd.shell rd.debug=0 rd.neednet=1 rdloaddriver=mlx4_ib,ib_ipoib,ib_iser console=ttyS1,115200n81 inst.text

LABEL  RHEL-7.3-20160729.1-NFSROOT
KERNEL nfsordma/images/pxeboot/vmlinuz
# NFSROOT over IPoIB, it works.
#APPEND initrd=nfsordma/images/pxeboot/initrd.img root=nfs:172.31.2.3:/var/lib/tftpboot/rhel7/root rw selinux=0 console=ttyS1,115200n81 ip=ib0:dhcp biosdevname=0 sshd rd.shell rd.neednet=1 rdloaddriver=mlx4_ib,ib_ipoib,ib_iser
# NFSROOT over NFSoRDMA
APPEND initrd=nfsordma/images/pxeboot/initrd.img root=nfs:172.31.2.3:/var/lib/tftpboot/rhel7/root,proto=rdma,port=2050,rw,nfsvers=3,vers=3,rsize=4096,wsize=4096 rw selinux=0 console=ttyS1,115200n81 ip=ib0:dhcp biosdevname=0 sshd rd.shell rd.neednet=1 rdloaddriver=mlx4_ib,ib_ipoib,rpcrdma nfsrootdebug
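
The NFSoRDMA entry differs from the IPoIB one in two places: the option list after the export path and the `rdloaddriver` list, which must name `rpcrdma`. The sketch below shows one way to check the second point by extracting a `key=value` argument from a kernel command line. The helper name `cmdline_value` is hypothetical.

```shell
# Hypothetical helper: print the value of the first key=value argument
# matching the given key in a kernel command line read from stdin.
cmdline_value() {
    tr -s ' ' '\n' |
        awk -v key="$1" 'index($0, key "=") == 1 {
            print substr($0, length(key) + 2)
            exit
        }'
}

drivers=$(printf '%s\n' 'rd.neednet=1 rdloaddriver=mlx4_ib,ib_ipoib,rpcrdma nfsrootdebug' |
    cmdline_value rdloaddriver)
case ",$drivers," in
    *,rpcrdma,*) echo "rpcrdma requested" ;;
    *)           echo "rpcrdma missing" ;;
esac
```

On a booted client the same helper can read `/proc/cmdline` directly, for example `cmdline_value rdloaddriver < /proc/cmdline`.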

*******************************************************************
rdma03/04 are connected back to back with mlx4 HCAs. In other
words, there is no IB switch between rdma03/04. With the default
rsize and wsize, the NFSoRDMA client can't talk to the server after it
connects. The PXE client hangs with this message:

nfs: server 172.31.2.3 not responding, still trying
nfs: server 172.31.2.3 not responding, still trying

So we have to decrease wsize and rsize.
*******************************************************************
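
The reduced sizes travel in the option list appended to the `root=nfs:` argument. As an illustration of how that argument breaks down in the form used in this report (`root=nfs:<server>:<path>,<options>`), here is a minimal sketch. The function `parse_nfsroot` and the `nfs_*` variable names are hypothetical, this is not dracut's own parser, and IPv6 server addresses are not handled.

```shell
# Hypothetical helper: split root=nfs:<server>:<path>,<options> into
# nfs_server, nfs_path, and nfs_options.
parse_nfsroot() {
    arg=${1#root=}
    arg=${arg#nfs:}
    nfs_server=${arg%%:*}        # text before the first colon
    rest=${arg#*:}               # export path plus optional ",options"
    nfs_path=${rest%%,*}
    case $rest in
        *,*) nfs_options=${rest#*,} ;;
        *)   nfs_options= ;;
    esac
}

parse_nfsroot 'root=nfs:172.31.2.3:/var/lib/tftpboot/rhel7/root,proto=rdma,port=2050,rw,nfsvers=3,vers=3,rsize=4096,wsize=4096'
echo "server=$nfs_server"
echo "path=$nfs_path"
echo "options=$nfs_options"
```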

Comment 2 Honggang LI 2016-08-16 10:25:19 UTC
Reproducer (cont):

[root@ib2-qa-03 log]# cat /etc/exports
/var/lib/tftpboot/rhel7/root *(rw,no_root_squash) 
/nfsordma *(rw,no_root_squash) 
[root@ib2-qa-03 log]# showmount  -e
Export list for ib2-qa-03:
/nfsordma                    *
/var/lib/tftpboot/rhel7/root *


# yum groups -y install "Server with GUI" --releasever=7 --installroot=/var/lib/tftpboot/rhel7/root/

# yum erase firewalld  --releasever=7 --installroot=/var/lib/tftpboot/rhel7/root/
# yum erase plymouth  --releasever=7 --installroot=/var/lib/tftpboot/rhel7/root/

************************************************************************
plymouth has to be erased; otherwise, the system hangs with this message:

systemd: Starting Wait for Plymouth Boot Screen to Quit.. (xxx/ no limit).
************************************************************************

Update the root user's password in the NFSROOT tree.

[root@ib2-qa-03 log]# grep root /etc/shadow | awk -F: '{print $2}'
$6$kjvTz6bX$k6L6zckQjXX5VovWiaA.e9X1/zs7DZ6lhQG8zywdd1ngmsWoL/LEdMYa/y5XdfkFIRHsGEDjuGRgy98za9E9r/

[root@ib2-qa-03 log]# grep root /var/lib/tftpboot/rhel7/root/etc/shadow
root:$6$kjvTz6bX$k6L6zckQjXX5VovWiaA.e9X1/zs7DZ6lhQG8zywdd1ngmsWoL/LEdMYa/y5XdfkFIRHsGEDjuGRgy98za9E9r/:16925:0:99999:7:::

[root@ib2-qa-03 log]# cat /var/lib/tftpboot/rhel7/root/etc/fstab 
none    /tmp        tmpfs   defaults   0 0
tmpfs   /dev/shm    tmpfs   defaults   0 0
sysfs   /sys        sysfs   defaults   0 0
proc    /proc       proc    defaults   0 0

Download dracut-033-453.el7.src.rpm, apply the patch 0453-Add-rpcrdma-module-to-support-NFSROOT-over-NFSoRDMA-.patch, and rebuild it. Then create a local repo for the updated RPMs.

[root@ib2-qa-03 local]# pwd
/var/www/html/local
[root@ib2-qa-03 local]# ls
dracut-033-454.el7.x86_64.rpm                 dracut-fips-033-454.el7.x86_64.rpm
dracut-caps-033-454.el7.x86_64.rpm            dracut-fips-aesni-033-454.el7.x86_64.rpm
dracut-config-generic-033-454.el7.x86_64.rpm  dracut-network-033-454.el7.x86_64.rpm
dracut-config-rescue-033-454.el7.x86_64.rpm   dracut-tools-033-454.el7.x86_64.rpm
dracut-debuginfo-033-454.el7.x86_64.rpm       
[root@ib2-qa-03 local]# createrepo -d .



Create pxeboot images with updated dracut packages.

# lorax -p RHEL -v 3 -r 7 -i anaconda-dracut -i rdma  -s http://download.eng.pek2.redhat.com/pub/rhel/rel-eng/RHEL-7.3-20160729.1/compose/Server/x86_64/os -s http://download.eng.pek2.redhat.com/pub/rhel/rel-eng/RHEL-7.3-20160729.1/compose/Server-optional/x86_64/os   -s http://localhost/local lorax 2>&1 | tee output.txt

[root@ib2-qa-03 nfsroot]# cp lorax/images/pxeboot/* /var/lib/tftpboot/nfsordma/images/pxeboot/
[root@ib2-qa-03 nfsroot]# ls /var/lib/tftpboot/nfsordma/images/pxeboot/
initrd.img  upgrade.img  vmlinuz


Now we have a PXE server and an NFSROOT filesystem. Power on rdma04 so that it boots with the remote NFSROOT as its root filesystem.

Comment 3 Honggang LI 2016-08-16 10:50 UTC
Created attachment 1191199 [details]
nfsroot over NFSoRDMA console log


Comment 4 Marek Hruscak 2016-08-18 13:07:11 UTC
Hi Honggang,
would you be willing to retest it once the fix is present in a compose? I will notify you about it.

Comment 5 Honggang LI 2016-08-18 13:14:40 UTC
(In reply to Marek Hruscak from comment #4)
> Hi Honggang,
> would you be willing to retest it once the fix is present in a compose? I will
> notify you about it.

Yes, I will. Thanks

Comment 8 Honggang LI 2016-09-01 06:06:01 UTC
Tested RHEL-7.3-20160830.n.0; NFSROOT over NFSoRDMA works.

===========================================================
[  OK  ] Started Realm and Domain Configuration.
         Starting WPA Supplicant daemon...
[  OK  ] Started WPA Supplicant daemon.
[  OK  ] Started Location Lookup Service.

Red Hat Enterprise Linux Server 7.3 Beta (Maipo)
Kernel 3.10.0-498.el7.x86_64 on an x86_64

localhost login: root
Password: 
[root@localhost ~]# mount | grep nfs
172.31.2.3:/var/lib/tftpboot/rhel7/root on / type nfs (rw,relatime,vers=3,rsize=4096,wsize=4096,namlen=255,hard,nolock,proto=rdma,port=2050,timeo=600,retrans=2,sec=sys,mountaddr=172.31.2.3,mountvers=3,mountproto=tcp,local_lock=all,addr=172.31.2.3)
rpc_pipefs on /var/lib/nfs/rpc_pipefs type rpc_pipefs (rw,relatime)
[root@localhost ~]# df -h
Filesystem                               Size  Used Avail Use% Mounted on
172.31.2.3:/var/lib/tftpboot/rhel7/root   50G  6.2G   44G  13% /
devtmpfs                                  16G     0   16G   0% /dev
tmpfs                                     16G  4.0K   16G   1% /dev/shm
tmpfs                                     16G   46M   16G   1% /run
tmpfs                                     16G     0   16G   0% /sys/fs/cgroup
none                                      16G  4.0K   16G   1% /tmp
tmpfs                                    3.2G   12K  3.2G   1% /run/user/990
tmpfs                                    3.2G     0  3.2G   0% /run/user/0
[root@localhost ~]# cat /proc/cmdline 
initrd=nfsordma/images/pxeboot/initrd.img root=nfs:172.31.2.3:/var/lib/tftpboot/rhel7/root,proto=rdma,port=2050,rw,nfsvers=3,vers=3,rsize=4096,wsize=4096 rw selinux=0 console=ttyS1,115200n81 ip=ib0:dhcp biosdevname=0 sshd rd.shell rd.neednet=1 rdloaddriver=mlx4_ib,ib_ipoib,rpcrdma nfsrootdebug BOOT_IMAGE=nfsordma/images/pxeboot/vmlinuz 
[root@localhost ~]#
=============================================================================
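
The verification above relies on the `proto=rdma` option in the `mount` output for `/`. The following is a minimal sketch of the same check in script form, assuming the `mount` output format shown in this comment. The helper name `root_mount_opts` is hypothetical.

```shell
# Hypothetical helper: print the mount options of an NFS root filesystem
# from "mount" output read on stdin.
root_mount_opts() {
    awk '$2 == "on" && $3 == "/" && $5 == "nfs" {
        gsub(/[()]/, "", $6)
        print $6
        exit
    }'
}

# Usage on the booted client:
#   opts=$(mount | root_mount_opts)
#   case ",$opts," in
#       *,proto=rdma,*) echo "root is mounted over RDMA" ;;
#       *)              echo "root is not mounted over RDMA" ;;
#   esac
```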

Comment 9 Marek Hruscak 2016-09-05 13:03:13 UTC
Thank you, Honggang, for verifying.
Setting state to verified based on comment #8.

Comment 11 errata-xmlrpc 2016-11-04 08:06:36 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2016-2530.html

