Red Hat Bugzilla – Bug 1367374
[RHEL-7.3] Add rpcrdma to support NFSROOT over NFSoRDMA
Last modified: 2016-11-04 04:06:36 EDT
Created attachment 1191180 [details]
patch to add rpcrdma module

Description of problem:
NFSROOT works over IPoIB but fails over NFSoRDMA. The NFSoRDMA module, rpcrdma, is missing. Please review and apply the attached patch.

# cat /tmp/0453-Add-rpcrdma-module-to-support-NFSROOT-over-NFSoRDMA-.patch
From d783ff5a50f739d3e40a148c80f99f5ad2d491b7 Mon Sep 17 00:00:00 2001
From: Honggang Li <honli@redhat.com>
Date: Tue, 16 Aug 2016 04:12:02 -0400
Subject: [PATCH] Add rpcrdma module to support NFSROOT over NFSoRDMA for RHEL7

Signed-off-by: Honggang Li <honli@redhat.com>
---
 modules.d/95nfs/module-setup.sh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/modules.d/95nfs/module-setup.sh b/modules.d/95nfs/module-setup.sh
index de5a754..e4ba678 100755
--- a/modules.d/95nfs/module-setup.sh
+++ b/modules.d/95nfs/module-setup.sh
@@ -25,7 +25,7 @@ depends() {
 }
 
 installkernel() {
-    hostonly='' instmods nfs sunrpc ipv6 nfsv2 nfsv3 nfsv4 nfs_acl nfs_layout_nfsv41_files
+    hostonly='' instmods nfs sunrpc ipv6 nfsv2 nfsv3 nfsv4 nfs_acl nfs_layout_nfsv41_files rpcrdma
 }
 
 install() {
-- 
1.8.3.1

Version-Release number of selected component (if applicable):
dracut-033-453.el7.src.rpm

How reproducible:
always

Steps to Reproduce:
1.
2.
3.

Actual results:
NFSROOT over NFSoRDMA hangs.

Expected results:

Additional info:
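One way to confirm that a dracut tree carries this change is to look for rpcrdma on an instmods line of modules.d/95nfs/module-setup.sh. The following is a minimal sketch for a POSIX shell with awk; the function name instmods_has is made up for illustration and is not part of dracut.

```shell
# instmods_has MODULE: read a dracut module-setup.sh on stdin and succeed
# if MODULE appears as a separate word after "instmods" on some line.
# Hypothetical helper for illustration; not part of dracut.
instmods_has() {
    awk -v mod="$1" '
        /instmods/ {
            seen = 0
            for (i = 1; i <= NF; i++) {
                if ($i == "instmods") { seen = 1; continue }
                if (seen && $i == mod) found = 1
            }
        }
        END { exit(found ? 0 : 1) }
    '
}
```

Assuming the installed script lives under /usr/lib/dracut/modules.d/95nfs/, it could be used as: instmods_has rpcrdma < /usr/lib/dracut/modules.d/95nfs/module-setup.sh && echo present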
Reproducer:

[root@ib2-qa-03 tftpboot]# grep -i distro /etc/motd
DISTRO=RHEL-7.3-20160729.1
[root@ib2-qa-03 ~]# yum install -y syslinux httpd tftp-server dhcp
[root@ib2-qa-03 ~]# cp /usr/share/syslinux/pxelinux.0 /var/lib/tftpboot/
[root@ib2-qa-03 ~]# cp /usr/share/syslinux/chain.c32 /var/lib/tftpboot/
[root@ib2-qa-03 ~]# cp /usr/share/syslinux/menu.c32 /var/lib/tftpboot/
[root@ib2-qa-03 ~]# cp /usr/share/syslinux/memdisk /var/lib/tftpboot/
[root@ib2-qa-03 ~]# cp /usr/share/syslinux/mboot.c32 /var/lib/tftpboot/
[root@ib2-qa-03 ~]# ls /var/lib/tftpboot/
chain.c32 mboot.c32 memdisk menu.c32 pxelinux.0

# You have to set up the GUIDS for opensm because the HCAs on rdma03/04 are
# connected back to back. Without this, the IB ports are not reset and
# initialized again when the remote PXE client reboots (the HCA ports lose
# power).
[root@ib2-qa-03 ~]# grep -v '^#' /etc/sysconfig/opensm
GUIDS="0x0002c90300b3cff1 0x0002c90300b3cff2"

[root@ib2-qa-03 ~]# cat /etc/xinetd.d/tftp
# default: off
# description: The tftp server serves files using the trivial file transfer \
#       protocol. The tftp protocol is often used to boot diskless \
#       workstations, download configuration files to network-aware printers, \
#       and to start the installation process for some operating systems.
service tftp
{
        socket_type     = dgram
        protocol        = udp
        wait            = yes
        user            = root
        server          = /usr/sbin/in.tftpd
        server_args     = -s /var/lib/tftpboot
        disable         = no
        per_source      = 11
        cps             = 100 2
        flags           = IPv4
}
[root@ib2-qa-03 ~]#
[root@ib2-qa-03 ~]# ip addr show mlx4_ib1
8: mlx4_ib1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 65520 qdisc pfifo_fast state UP qlen 256
    link/infiniband 80:00:02:08:fe:80:00:00:00:00:00:00:00:02:c9:03:00:b3:cf:f1 brd 00:ff:ff:ff:ff:12:40:1b:ff:ff:00:00:00:00:00:00:ff:ff:ff:ff
    inet 172.31.2.3/24 brd 172.31.2.255 scope global mlx4_ib1
       valid_lft forever preferred_lft forever
    inet6 fe80::202:c903:b3:cff1/64 scope link
       valid_lft forever preferred_lft forever
[root@ib2-qa-03 ~]#
[root@ib2-qa-03 ~]# cat /etc/dhcp/dhcpd.conf
#
# DHCP Server Configuration file.
#   see /usr/share/doc/dhcp*/dhcpd.conf.example
#   see dhcpd.conf(5) man page
#
DHCPDARGS="mlx4_ib1";

option space pxelinux;
option pxelinux.magic code 208 = string;
option pxelinux.configfile code 209 = text;
option pxelinux.pathprefix code 210 = text;
option pxelinux.reboottime code 211 = unsigned integer 32;

subnet 172.31.2.0 netmask 255.255.255.0 {
    option routers 10.0.0.254;
    range 172.31.2.100 172.31.2.101;
    always-broadcast on;

    class "pxeclients" {
        match if substring (option vendor-class-identifier, 0, 9) = "PXEClient";
        next-server 172.31.2.3;
        filename "pxelinux.0";
    }

    host rdma04-ib1 {
        option dhcp-client-identifier = 80:00:02:08:fe:80:00:00:00:00:00:00:00:02:c9:03:00:b3:c7:c1;
        fixed-address 172.31.2.100;
    }
}

[root@ib2-qa-03 ~]# modprobe rpcrdma
[root@ib2-qa-03 ~]# systemctl enable httpd.service
[root@ib2-qa-03 ~]# systemctl enable dhcpd.service
[root@ib2-qa-03 ~]# systemctl enable tftp.service
[root@ib2-qa-03 ~]# systemctl enable nfs.service
[root@ib2-qa-03 ~]# systemctl start tftp.service
[root@ib2-qa-03 ~]# systemctl start httpd.service
[root@ib2-qa-03 ~]# systemctl start dhcpd.service
[root@ib2-qa-03 ~]# systemctl start nfs.service
[root@ib2-qa-03 ~]# echo 'rdma 2050' >> /proc/fs/nfsd/portlist
[root@ib2-qa-03 ~]# cat /proc/fs/nfsd/portlist
rdma 2050
udp 2049
tcp 2049
udp 2049
tcp 2049
[root@ib2-qa-03 ~]# systemctl status dhcpd.service
[root@ib2-qa-03 ~]# systemctl status tftp.service
[root@ib2-qa-03 ~]# systemctl status httpd.service
[root@ib2-qa-03 ~]# systemctl status nfs.service
[root@ib2-qa-03 ~]# systemctl stop firewalld.service

[root@ib2-qa-03 ~]# cat /var/lib/tftpboot/pxelinux.cfg/default
DEFAULT menu.c32
PROMPT 0
TIMEOUT 100
#ONTIMEOUT local
#ONTIMEOUT RHEL-7.3-20160719.1
#ONTIMEOUT RHEL-7.3-20160729.1
ONTIMEOUT RHEL-7.3-20160729.1-NFSROOT
MENU TITLE PXE Menu
MENU seperator

LABEL local
    MENU LABEL Boot local hard drive
    LOCALBOOT 0

LABEL RHEL-7.3-20160729.1
    KERNEL RHEL-7.3-20160729.1/images/pxeboot/vmlinuz
    APPEND initrd=RHEL-7.3-20160729.1/images/pxeboot/initrd.img inst.repo=http://172.31.2.3/RHEL-7.3-20160729.1/ ks.device=bootif ip=ib0:dhcp biosdevname=0 sshd rd.shell rd.debug=0 rd.neednet=1 rdloaddriver=mlx4_ib,ib_ipoib,ib_iser console=ttyS1,115200n81 inst.text

LABEL RHEL-7.3-20160719.1
    KERNEL RHEL-7.3-20160719.1/images/pxeboot/vmlinuz
    APPEND initrd=RHEL-7.3-20160719.1/images/pxeboot/initrd.img inst.repo=http://172.31.2.3/RHEL-7.3-20160719.1/ ks.device=bootif ip=ib0:dhcp biosdevname=0 sshd rd.shell rd.debug=0 rd.neednet=1 rdloaddriver=mlx4_ib,ib_ipoib,ib_iser console=ttyS1,115200n81 inst.text

LABEL RHEL-7.3-20160729.1-NFSROOT
    KERNEL nfsordma/images/pxeboot/vmlinuz
    # NFSROOT over IPoIB, it works.
    #APPEND initrd=nfsordma/images/pxeboot/initrd.img root=nfs:172.31.2.3:/var/lib/tftpboot/rhel7/root rw selinux=0 console=ttyS1,115200n81 ip=ib0:dhcp biosdevname=0 sshd rd.shell rd.neednet=1 rdloaddriver=mlx4_ib,ib_ipoib,ib_iser
    # NFSROOT over NFSoRDMA
    APPEND initrd=nfsordma/images/pxeboot/initrd.img root=nfs:172.31.2.3:/var/lib/tftpboot/rhel7/root,proto=rdma,port=2050,rw,nfsvers=3,vers=3,rsize=4096,wsize=4096 rw selinux=0 console=ttyS1,115200n81 ip=ib0:dhcp biosdevname=0 sshd rd.shell rd.neednet=1 rdloaddriver=mlx4_ib,ib_ipoib,rpcrdma nfsrootdebug

*******************************************************************
rdma03/04 are connected back to back with mlx4 HCAs. In other words, there is
no IB switch between rdma03/04. With the default rsize and wsize, the NFSoRDMA
client cannot talk to the server after it connects. The PXE client hangs with
this message:

nfs: server 172.31.2.3 not responding, still trying
nfs: server 172.31.2.3 not responding, still trying

So we have to decrease the wsize and rsize.
*******************************************************************
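The NFS options that matter here (proto=rdma, port=2050, rsize, wsize) are carried in the root= argument of the kernel command line. As a rough sketch, and only for the comma-separated form used in this report (root=nfs:<server>:<path>,<opt>=<val>,...), a single option can be pulled out in a POSIX shell as follows. The function name nfsroot_opt is an assumption for illustration; dracut does its own parsing.

```shell
# nfsroot_opt CMDLINE OPTION: print the value of OPTION from the NFS option
# list that follows the first comma in the root=nfs:... argument.
# Returns non-zero if there is no such argument or no such key=value option.
# Hypothetical helper for illustration; dracut does its own parsing.
nfsroot_opt() {
    for arg in $1; do
        case $arg in
            root=nfs:*,*)
                opts=${arg#*,}      # drop "root=nfs:<server>:<path>,"
                old_ifs=$IFS
                IFS=,
                for kv in $opts; do
                    case $kv in
                        "$2"=*)
                            IFS=$old_ifs
                            printf '%s\n' "${kv#*=}"
                            return 0
                            ;;
                    esac
                done
                IFS=$old_ifs
                ;;
        esac
    done
    return 1
}
```

On a booted client this could be called as: nfsroot_opt "$(cat /proc/cmdline)" proto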
Reproducer (cont):

[root@ib2-qa-03 log]# cat /etc/exports
/var/lib/tftpboot/rhel7/root *(rw,no_root_squash)
/nfsordma *(rw,no_root_squash)
[root@ib2-qa-03 log]# showmount -e
Export list for ib2-qa-03:
/nfsordma                    *
/var/lib/tftpboot/rhel7/root *

# yum groups -y install "Server with GUI" --releasever=7 --installroot=/var/lib/tftpboot/rhel7/root/
# yum erase firewalld --releasever=7 --installroot=/var/lib/tftpboot/rhel7/root/
# yum erase plymouth --releasever=7 --installroot=/var/lib/tftpboot/rhel7/root/

************************************************************************
Plymouth has to be erased; otherwise the system hangs with this message:
systemd: Starting Wait for Plymouth Boot Screen to Quit.. (xxx/ no limit).
************************************************************************

Update the password for the root user.

[root@ib2-qa-03 log]# grep root /etc/shadow | awk -F: '{print $2}'
$6$kjvTz6bX$k6L6zckQjXX5VovWiaA.e9X1/zs7DZ6lhQG8zywdd1ngmsWoL/LEdMYa/y5XdfkFIRHsGEDjuGRgy98za9E9r/
[root@ib2-qa-03 log]# grep root /var/lib/tftpboot/rhel7/root/etc/shadow
root:$6$kjvTz6bX$k6L6zckQjXX5VovWiaA.e9X1/zs7DZ6lhQG8zywdd1ngmsWoL/LEdMYa/y5XdfkFIRHsGEDjuGRgy98za9E9r/:16925:0:99999:7:::

[root@ib2-qa-03 log]# cat /var/lib/tftpboot/rhel7/root/etc/fstab
none    /tmp      tmpfs   defaults   0 0
tmpfs   /dev/shm  tmpfs   defaults   0 0
sysfs   /sys      sysfs   defaults   0 0
proc    /proc     proc    defaults   0 0

Download dracut-033-453.el7.src.rpm, apply the patch 0453-Add-rpcrdma-module-to-support-NFSROOT-over-NFSoRDMA-.patch, and rebuild it. Then create a local repo for the updated RPMs.
[root@ib2-qa-03 local]# pwd
/var/www/html/local
[root@ib2-qa-03 local]# ls
dracut-033-454.el7.x86_64.rpm                 dracut-fips-033-454.el7.x86_64.rpm
dracut-caps-033-454.el7.x86_64.rpm            dracut-fips-aesni-033-454.el7.x86_64.rpm
dracut-config-generic-033-454.el7.x86_64.rpm  dracut-network-033-454.el7.x86_64.rpm
dracut-config-rescue-033-454.el7.x86_64.rpm   dracut-tools-033-454.el7.x86_64.rpm
dracut-debuginfo-033-454.el7.x86_64.rpm
[root@ib2-qa-03 local]# createrepo -d .

Create pxeboot images with the updated dracut packages.

# lorax -p RHEL -v 3 -r 7 -i anaconda-dracut -i rdma -s http://download.eng.pek2.redhat.com/pub/rhel/rel-eng/RHEL-7.3-20160729.1/compose/Server/x86_64/os -s http://download.eng.pek2.redhat.com/pub/rhel/rel-eng/RHEL-7.3-20160729.1/compose/Server-optional/x86_64/os -s http://localhost/local lorax 2>&1 | tee output.txt

[root@ib2-qa-03 nfsroot]# cp lorax/images/pxeboot/* /var/lib/tftpboot/nfsordma/images/pxeboot/
[root@ib2-qa-03 nfsroot]# ls /var/lib/tftpboot/nfsordma/images/pxeboot/
initrd.img upgrade.img vmlinuz

Now we have a PXE server and an NFSROOT filesystem. Power on rdma04 to use the remote NFSROOT as its root filesystem.
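Before booting the client, it may be worth checking that the rebuilt initrd.img really contains the rpcrdma module. The sketch below filters a file listing, such as the output of lsinitrd, for a module file name. The function name listing_has_module is hypothetical, and the pattern assumes module files end in .ko, optionally followed by a compression suffix.

```shell
# listing_has_module MODULE: read a file listing on stdin (for example the
# output of lsinitrd) and succeed if some line ends in /MODULE.ko, with an
# optional compression suffix such as .xz.
# Hypothetical helper for illustration; not part of dracut.
listing_has_module() {
    grep -Eq "/$1\.ko(\.[a-z0-9]+)?\$"
}
```

With the image path used above, this could be run as: lsinitrd /var/lib/tftpboot/nfsordma/images/pxeboot/initrd.img | listing_has_module rpcrdma && echo "rpcrdma present"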
Created attachment 1191199 [details]
nfsroot over NFSoRDMA console log
Hi Honggang,
would you be willing to retest once the fix is present in a compose? I will notify you when it is.
(In reply to Marek Hruscak from comment #4)
> Hi Honggang,
> would you be willing to retest once the fix is present in a compose? I will
> notify you when it is.

Yes, I will. Thanks
Tested RHEL-7.3-20160830.n.0; NFSROOT over NFSoRDMA works.

===========================================================
[  OK  ] Started Realm and Domain Configuration.
         Starting WPA Supplicant daemon...
[  OK  ] Started WPA Supplicant daemon.
[  OK  ] Started Location Lookup Service.

Red Hat Enterprise Linux Server 7.3 Beta (Maipo)
Kernel 3.10.0-498.el7.x86_64 on an x86_64

localhost login: root
Password:
[root@localhost ~]# mount | grep nfs
172.31.2.3:/var/lib/tftpboot/rhel7/root on / type nfs (rw,relatime,vers=3,rsize=4096,wsize=4096,namlen=255,hard,nolock,proto=rdma,port=2050,timeo=600,retrans=2,sec=sys,mountaddr=172.31.2.3,mountvers=3,mountproto=tcp,local_lock=all,addr=172.31.2.3)
rpc_pipefs on /var/lib/nfs/rpc_pipefs type rpc_pipefs (rw,relatime)
[root@localhost ~]# df -h
Filesystem                               Size  Used Avail Use% Mounted on
172.31.2.3:/var/lib/tftpboot/rhel7/root   50G  6.2G   44G  13% /
devtmpfs                                  16G     0   16G   0% /dev
tmpfs                                     16G  4.0K   16G   1% /dev/shm
tmpfs                                     16G   46M   16G   1% /run
tmpfs                                     16G     0   16G   0% /sys/fs/cgroup
none                                      16G  4.0K   16G   1% /tmp
tmpfs                                    3.2G   12K  3.2G   1% /run/user/990
tmpfs                                    3.2G     0  3.2G   0% /run/user/0
[root@localhost ~]# cat /proc/cmdline
initrd=nfsordma/images/pxeboot/initrd.img root=nfs:172.31.2.3:/var/lib/tftpboot/rhel7/root,proto=rdma,port=2050,rw,nfsvers=3,vers=3,rsize=4096,wsize=4096 rw selinux=0 console=ttyS1,115200n81 ip=ib0:dhcp biosdevname=0 sshd rd.shell rd.neednet=1 rdloaddriver=mlx4_ib,ib_ipoib,rpcrdma nfsrootdebug BOOT_IMAGE=nfsordma/images/pxeboot/vmlinuz
[root@localhost ~]#
=============================================================================
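The deciding evidence in the output above is proto=rdma and port=2050 in the options of the filesystem mounted on /. For a scripted check, a sketch along these lines could read the same mount output; the function name root_mount_opt is an assumption, and it expects the "<device> on <dir> type <fstype> (<options>)" line format shown above.

```shell
# root_mount_opt OPTION: read `mount` output on stdin and print the value of
# the key=value OPTION for the filesystem mounted on "/".
# Returns non-zero if "/" is not listed or the option is not present.
# Hypothetical helper for illustration.
root_mount_opt() {
    awk -v want="$1" '
        $2 == "on" && $3 == "/" {
            opts = $NF
            gsub(/[()]/, "", opts)          # strip the surrounding parentheses
            n = split(opts, kv, ",")
            for (i = 1; i <= n; i++) {
                eq = index(kv[i], "=")
                if (eq && substr(kv[i], 1, eq - 1) == want) {
                    print substr(kv[i], eq + 1)
                    found = 1
                    exit
                }
            }
        }
        END { exit(found ? 0 : 1) }
    '
}
```

For example: [ "$(mount | root_mount_opt proto)" = rdma ] && echo "root is mounted over RDMA"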
Thank you Honggang for verifying. Setting state to verified based on comment #8
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://rhn.redhat.com/errata/RHBA-2016-2530.html