Caused NFS to hang to the point we had to reboot.

nfs-utils x86_64 1:2.1.1-5.rc4.fc25
kernel at the time was 4.11.5-200.fc25.x86_64

Logs:
Jun 29 02:23:27 dsm rpc.idmapd: rpc.idmapd: conf_reinit: open ("(null)", O_RDONLY) failed
Jun 29 02:23:27 dsm rpc.idmapd: rpc.idmapd: conf_reinit: open ("(null)", O_RDONLY) failed
Jun 29 02:23:27 dsm kernel: nfsd: last server has exited, flushing export cache
Jun 29 02:23:27 dsm systemd: Starting NFS server and services...
Jun 29 02:23:27 dsm kernel: divide error: 0000 [#1] SMP
Jun 29 02:23:27 dsm kernel: Modules linked in: fuse arc4 md4 nls_utf8 cifs ccm rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache cfg80211 rfkill nf_log_ipv4 nf_log_common xt_LOG xt_limit xt_multiport ipt_MASQUERADE nf_nat_masquerade_ipv4 ip6t_REJECT nf_reject_ipv6 nf_conntrack_ipv6 iptable_nat nf_defrag_ipv6 nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 xt_conntrack nf_nat nf_conntrack libcrc32c ip6table_filter ip6_tables iptable_mangle joydev coretemp kvm iTCO_wdt gpio_ich iTCO_vendor_support irqbypass ses ipmi_ssif enclosure dcdbas scsi_transport_sas i5000_edac ipmi_si edac_core lpc_ich ipmi_devintf i5k_amb ipmi_msghandler shpchp tpm_tis tpm_tis_core tpm nfsd auth_rpcgss nfs_acl lockd grace sunrpc binfmt_misc uas usb_storage amdkfd amd_iommu_v2 radeon i2c_algo_bit drm_kms_helper ttm drm ata_generic pata_acpi serio_raw
Jun 29 02:23:27 dsm kernel: megaraid_sas bnx2
Jun 29 02:23:27 dsm kernel: CPU: 4 PID: 7781 Comm: rpc.nfsd Not tainted 4.11.5-200.fc25.x86_64 #1
Jun 29 02:23:27 dsm kernel: Hardware name: Dell Inc. PowerEdge 1950/0TT740, BIOS 2.2.6 02/05/2008
Jun 29 02:23:27 dsm kernel: task: ffff9290e5034880 task.stack: ffffa770ce0c4000
Jun 29 02:23:27 dsm kernel: RIP: 0010:svc_pool_for_cpu+0x2b/0x80 [sunrpc]
Jun 29 02:23:27 dsm kernel: RSP: 0018:ffffa770ce0c7c18 EFLAGS: 00010246
Jun 29 02:23:27 dsm kernel: RAX: 0000000000000000 RBX: ffff9290286a6000 RCX: 0000000000000002
Jun 29 02:23:27 dsm kernel: RDX: 0000000000000000 RSI: 0000000000000004 RDI: ffff928ec3d45500
Jun 29 02:23:27 dsm kernel: RBP: ffffa770ce0c7c18 R08: ffff928ec3d45528 R09: 0000000000018783
Jun 29 02:23:27 dsm kernel: R10: ffffffffc06fb100 R11: 0000000000000000 R12: ffff9290286a6010
Jun 29 02:23:27 dsm kernel: R13: ffff9290286a6018 R14: ffff928ec3d45528 R15: ffff928ec3d45500
Jun 29 02:23:27 dsm kernel: FS: 00007f0f69cacc40(0000) GS:ffff9290efd00000(0000) knlGS:0000000000000000
Jun 29 02:23:27 dsm kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jun 29 02:23:27 dsm kernel: CR2: 00007ffede0fe0a8 CR3: 00000001fb7c4000 CR4: 00000000000006e0
Jun 29 02:23:27 dsm kernel: Call Trace:
Jun 29 02:23:27 dsm kernel: svc_xprt_do_enqueue+0xef/0x260 [sunrpc]
Jun 29 02:23:27 dsm kernel: svc_xprt_received+0x47/0x90 [sunrpc]
Jun 29 02:23:27 dsm kernel: svc_add_new_perm_xprt+0x76/0x90 [sunrpc]
Jun 29 02:23:27 dsm kernel: svc_addsock+0x14b/0x200 [sunrpc]
Jun 29 02:23:27 dsm kernel: ? recalc_sigpending+0x1b/0x50
Jun 29 02:23:27 dsm kernel: ? __getnstimeofday64+0x41/0xd0
Jun 29 02:23:27 dsm kernel: ? do_gettimeofday+0x29/0x90
Jun 29 02:23:27 dsm kernel: write_ports+0x255/0x2c0 [nfsd]
Jun 29 02:23:27 dsm kernel: ? _copy_from_user+0x4e/0x80
Jun 29 02:23:27 dsm kernel: ? write_recoverydir+0x100/0x100 [nfsd]
Jun 29 02:23:27 dsm kernel: nfsctl_transaction_write+0x48/0x80 [nfsd]
Jun 29 02:23:27 dsm kernel: __vfs_write+0x37/0x160
Jun 29 02:23:27 dsm kernel: ? __inet_hash+0xd2/0x260
Jun 29 02:23:27 dsm kernel: vfs_write+0xb5/0x1a0
Jun 29 02:23:27 dsm kernel: SyS_write+0x55/0xc0
Jun 29 02:23:27 dsm kernel: entry_SYSCALL_64_fastpath+0x1a/0xa9
Jun 29 02:23:27 dsm kernel: RIP: 0033:0x7f0f695c8ae0
Jun 29 02:23:27 dsm kernel: RSP: 002b:00007ffede056ba8 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
Jun 29 02:23:27 dsm kernel: RAX: ffffffffffffffda RBX: 0000000000000002 RCX: 00007f0f695c8ae0
Jun 29 02:23:27 dsm kernel: RDX: 0000000000000002 RSI: 00005637ab1dd600 RDI: 0000000000000003
Jun 29 02:23:27 dsm kernel: RBP: 00007ffede056ba0 R08: 0000000000000001 R09: 0000000000000002
Jun 29 02:23:27 dsm kernel: R10: 0000000000000064 R11: 0000000000000246 R12: 0000000000000004
Jun 29 02:23:27 dsm kernel: R13: 00005637abfa37a0 R14: 00005637abfa3720 R15: 00007ffede056658
Jun 29 02:23:27 dsm kernel: Code: 66 66 66 66 90 48 8b 87 98 00 00 00 55 48 89 e5 48 83 78 08 00 74 10 8b 05 97 51 02 00 83 f8 01 74 40 83 f8 02 74 19 31 c0 31 d2 <f7> b7 88 00 00 00 5d 89 d0 48 c1 e0 07 48 03 87 90 00 00 00 c3
Jun 29 02:23:27 dsm kernel: RIP: svc_pool_for_cpu+0x2b/0x80 [sunrpc] RSP: ffffa770ce0c7c18
Jun 29 02:23:27 dsm kernel: ---[ end trace 7a2bfc4aacf7a21e ]---
NFS and NFS-server just won't start now. Also seeing:

Jun 30 21:49:52 ourserver systemd[1]: Starting NFSv4 ID-name mapping service...
Jun 30 21:49:52 ourserver rpc.idmapd[16991]: rpc.idmapd: conf_reinit: open ("(null)", O_RDONLY) failed
Jun 30 21:49:52 ourserver rpc.idmapd[16991]: rpc.idmapd: conf_reinit: open ("(null)", O_RDONLY) failed
Jun 30 21:49:52 ourserver systemd[1]: Started NFSv4 ID-name mapping service.

I also added this to https://bugzilla.linux-nfs.org/show_bug.cgi?id=308
I added the Linux kernel Bugzilla link. Can this be pushed through? We start summer classes Monday, no users can log in, and GRUB does not have any 4.10 kernels.
> and Grub does not have any 4.10 kernels

And why don't you just download whatever kernel builds you need from koji?

https://koji.fedoraproject.org/koji/packageinfo?packageID=8
https://koji.fedoraproject.org/koji/packageinfo?buildStart=50&packageID=8&buildOrder=-completion_time&tagOrder=name&tagStart=100#buildlist
I'm a little skittish as to how to do that and which packages are needed, e.g., debug, core, etc. Is there a doc that shows how to downgrade? I've Googled with mixed results.
Frankly, "rpm -qa | grep kernel" shows you which subpackages are installed, and a bit of yum/dnf/rpm/rpm --force dance is basic stuff; when in doubt, use trial and error.
OK that helped a little:

rpm -qa | grep kernel | grep 7
kernel-devel-4.11.7-200.fc25.x86_64
kernel-core-4.11.7-200.fc25.x86_64
kernel-tools-libs-4.11.7-200.fc25.x86_64
kernel-4.11.7-200.fc25.x86_64
kernel-tools-libs-devel-4.11.7-200.fc25.x86_64
kernel-modules-4.11.7-200.fc25.x86_64
kernel-headers-4.11.7-200.fc25.x86_64
kernel-modules-extra-4.11.7-200.fc25.x86_64
kernel-tools-4.11.7-200.fc25.x86_64

So I downloaded these:

kernel-4.10.16-200.fc25.x86_64.rpm
kernel-core-4.10.16-200.fc25.x86_64.rpm
kernel-devel-4.10.16-200.fc25.x86_64.rpm
kernel-headers-4.10.16-200.fc25.x86_64.rpm
kernel-modules-4.10.16-200.fc25.x86_64.rpm
kernel-modules-extra-4.10.16-200.fc25.x86_64.rpm
kernel-tools-4.10.16-200.fc25.x86_64.rpm
kernel-tools-libs-4.10.16-200.fc25.x86_64.rpm
kernel-tools-libs-devel-4.10.16-200.fc25.x86_64.rpm

But running:

rpm -ivh --oldpackage kernel*
error: Failed dependencies:
kernel-headers < 4.11.7-200.fc25 is obsoleted by (installed) kernel-headers-4.11.7-200.fc25.x86_64

Any suggestions? Thanks for your help so far.
You don't need "kernel-headers" and "kernel-devel" at all for normal operation; they only matter when akmod has to build out-of-tree modules for crappy hardware. You also missed the hint about --force. Look at "man rpm": there is also --nodeps for cases where you know what you are doing, and in the case of a kernel downgrade you usually do.
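For anyone following along, the advice above can be sketched roughly as below. This is only a sketch under assumptions: pick_runtime_kernel_rpms is a hypothetical helper name, and skipping the kernel-tools* packages is my own guess (the comment above only names kernel-headers and kernel-devel as unnecessary). The koji and rpm commands are left commented out so that nothing gets downloaded or installed by accident.

```shell
# Hypothetical helper: from a list of downloaded kernel RPM files, print only
# the ones needed to boot the older kernel.
# Assumption: kernel-headers and kernel-devel are skipped per the advice above;
# skipping kernel-tools* is an extra guess, not something stated in this bug.
pick_runtime_kernel_rpms() {
    for f in "$@"; do
        case "${f##*/}" in
            kernel-headers-*|kernel-devel-*|kernel-tools-*)
                ;;  # not needed to boot, skip
            kernel-[0-9]*|kernel-core-*|kernel-modules-*)
                printf '%s\n' "$f" ;;
        esac
    done
}

# Possible usage (commented out on purpose):
#   koji download-build --arch=x86_64 kernel-4.10.16-200.fc25
#   rpm -ivh --oldpackage $(pick_runtime_kernel_rpms kernel-*.rpm)
```

Installing with -i rather than -U keeps the newer kernel in place, so it should still be selectable from the boot menu if the older one misbehaves.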
I was able to downgrade to 4.10.16-200.fc25.x86_64. All of our servers except the master NIS server seem to be responding OK. I've rebooted them all with this kernel, but the master NIS server still hangs when I try to 'su' as a NIS user. No problems with a local or root user. Here is some systemctl output. One thing I do recall changing is enabling NetworkManager while network was running. Any other ideas? You've been so helpful.

systemctl status nfs
* nfs-server.service - NFS server and services
   Loaded: loaded (/usr/lib/systemd/system/nfs-server.service; enabled; vendor preset: disabled)
  Drop-In: /run/systemd/generator/nfs-server.service.d
           `-order-with-mounts.conf
   Active: active (exited) since Sun 2017-07-02 15:49:03 EDT; 6h ago
  Process: 6252 ExecStopPost=/usr/sbin/exportfs -f (code=exited, status=0/SUCCESS)
  Process: 6249 ExecStopPost=/usr/sbin/exportfs -au (code=exited, status=0/SUCCESS)
  Process: 6246 ExecStop=/usr/sbin/rpc.nfsd 0 (code=exited, status=0/SUCCESS)
  Process: 6268 ExecStart=/usr/sbin/rpc.nfsd $RPCNFSDARGS (code=exited, status=0/SUCCESS)
  Process: 6264 ExecStartPre=/usr/sbin/exportfs -r (code=exited, status=0/SUCCESS)
 Main PID: 6268 (code=exited, status=0/SUCCESS)
    Tasks: 0 (limit: 4915)
   CGroup: /system.slice/nfs-server.service

Jul 02 15:49:03 ourserver systemd[1]: Starting NFS server and services...
Jul 02 15:49:03 ourserver systemd[1]: Started NFS server and services.
[root@dsm localguy]# systemctl status nfs-server
* nfs-server.service - NFS server and services
   Loaded: loaded (/usr/lib/systemd/system/nfs-server.service; enabled; vendor preset: disabled)
  Drop-In: /run/systemd/generator/nfs-server.service.d
           `-order-with-mounts.conf
   Active: active (exited) since Sun 2017-07-02 15:49:03 EDT; 6h ago
  Process: 6252 ExecStopPost=/usr/sbin/exportfs -f (code=exited, status=0/SUCCESS)
  Process: 6249 ExecStopPost=/usr/sbin/exportfs -au (code=exited, status=0/SUCCESS)
  Process: 6246 ExecStop=/usr/sbin/rpc.nfsd 0 (code=exited, status=0/SUCCESS)
  Process: 6268 ExecStart=/usr/sbin/rpc.nfsd $RPCNFSDARGS (code=exited, status=0/SUCCESS)
  Process: 6264 ExecStartPre=/usr/sbin/exportfs -r (code=exited, status=0/SUCCESS)
 Main PID: 6268 (code=exited, status=0/SUCCESS)
    Tasks: 0 (limit: 4915)
   CGroup: /system.slice/nfs-server.service

Jul 02 15:49:03 ourserver systemd[1]: Starting NFS server and services...
Jul 02 15:49:03 ourserver systemd[1]: Started NFS server and services.

systemctl status autofs
* autofs.service - Automounts filesystems on demand
   Loaded: loaded (/usr/lib/systemd/system/autofs.service; enabled; vendor preset: disabled)
   Active: active (running) since Sun 2017-07-02 22:03:11 EDT; 5min ago
  Process: 20094 ExecStart=/usr/sbin/automount $OPTIONS --pid-file /run/autofs.pid (code=exited, status=0/SUCCESS)
 Main PID: 20096 (automount)
    Tasks: 12 (limit: 4915)
   CGroup: /system.slice/autofs.service
           |-20096 /usr/sbin/automount --pid-file /run/autofs.pid
           |-20395 /usr/bin/mount -t nfs -s -o rw,nolock erdos:/home/users /u/erdos
           `-20396 /sbin/mount.nfs erdos:/home/users /u/erdos -s -o rw,nolock

Jul 02 22:03:11 ourserver systemd[1]: Starting Automounts filesystems on demand...
Jul 02 22:03:11 ourserver systemd[1]: Started Automounts filesystems on demand.
systemctl status ypserv
* ypserv.service - NIS/YP (Network Information Service) Server
   Loaded: loaded (/usr/lib/systemd/system/ypserv.service; enabled; vendor preset: disabled)
   Active: active (running) since Sun 2017-07-02 22:03:04 EDT; 6min ago
 Main PID: 19642 (ypserv)
   Status: "Processing requests..."
    Tasks: 1 (limit: 4915)
   CGroup: /system.slice/ypserv.service
           `-19642 /usr/sbin/ypserv -f

Jul 02 22:03:04 ourserver systemd[1]: Starting NIS/YP (Network Information Service) Server...
Jul 02 22:03:04 ourserver systemd[1]: Started NIS/YP (Network Information Service) Server.

systemctl status ypbind
* ypbind.service - NIS/YP (Network Information Service) Clients to NIS Domain Binder
   Loaded: loaded (/usr/lib/systemd/system/ypbind.service; enabled; vendor preset: disabled)
   Active: active (running) since Sun 2017-07-02 22:03:11 EDT; 6min ago
  Process: 20079 ExecStartPost=/usr/libexec/ypbind-post-waitbind (code=exited, status=0/SUCCESS)
  Process: 20043 ExecStartPre=/usr/sbin/setsebool allow_ypbind=1 (code=exited, status=1/FAILURE)
  Process: 20030 ExecStartPre=/usr/libexec/ypbind-pre-setdomain (code=exited, status=0/SUCCESS)
 Main PID: 20053 (ypbind)
   Status: "Processing requests..."
    Tasks: 4 (limit: 4915)
   CGroup: /system.slice/ypbind.service
           `-20053 /usr/sbin/ypbind -n

Jul 02 22:03:11 ourserver systemd[1]: Starting NIS/YP (Network Information Service) Clients to NIS Domain Binder...
Jul 02 22:03:11 ourserver setsebool[20043]: setsebool: SELinux is disabled.
Jul 02 22:03:11 ourserver systemd[1]: Started NIS/YP (Network Information Service) Clients to NIS Domain Binder.
systemctl status network
* network.service - LSB: Bring up/down networking
   Loaded: loaded (/etc/rc.d/init.d/network; generated; vendor preset: disabled)
   Active: active (exited) since Sun 2017-07-02 22:03:12 EDT; 7min ago
     Docs: man:systemd-sysv-generator(8)
  Process: 19588 ExecStop=/etc/rc.d/init.d/network stop (code=exited, status=0/SUCCESS)
  Process: 20031 ExecStart=/etc/rc.d/init.d/network start (code=exited, status=0/SUCCESS)

Jul 02 22:03:11 ourserver systemd[1]: Starting LSB: Bring up/down networking...
Jul 02 22:03:11 ourserver network[20031]: Bringing up loopback interface: [ OK ]
Jul 02 22:03:12 ourserver network[20031]: Bringing up interface eth0: [ OK ]
Jul 02 22:03:12 ourserver systemd[1]: Started LSB: Bring up/down networking.

systemctl status NetworkManager
* NetworkManager.service - Network Manager
   Loaded: loaded (/usr/lib/systemd/system/NetworkManager.service; enabled; vendor preset: enabled)
   Active: active (running) since Sun 2017-07-02 22:03:05 EDT; 7min ago
     Docs: man:NetworkManager(8)
 Main PID: 19893 (NetworkManager)
    Tasks: 3 (limit: 4915)
   CGroup: /system.slice/NetworkManager.service
           `-19893 /usr/sbin/NetworkManager --no-daemon

Jul 02 22:03:06 ourserver NetworkManager[19893]: <info> [1499047386.0875] device (eth0): state change: config -> ip-config (reas
Jul 02 22:03:06 ourserver NetworkManager[19893]: <info> [1499047386.0920] device (eth0): state change: ip-config -> ip-check (re
Jul 02 22:03:06 ourserver NetworkManager[19893]: <info> [1499047386.1033] device (eth0): state change: ip-check -> secondaries (
Jul 02 22:03:06 ourserver NetworkManager[19893]: <info> [1499047386.1038] device (eth0): state change: secondaries -> activated
Jul 02 22:03:06 ourserver NetworkManager[19893]: <info> [1499047386.1040] manager: NetworkManager state is now CONNECTED_LOCAL
Jul 02 22:03:06 ourserver NetworkManager[19893]: <info> [1499047386.1452] manager: NetworkManager state is now CONNECTED_SITE
Jul 02 22:03:06 ourserver NetworkManager[19893]: <info> [1499047386.1453] policy: set 'System eth0' (eth0) as default for IPv4 r
Jul 02 22:03:06 ourserver NetworkManager[19893]: <info> [1499047386.1474] device (eth0): Activation: successful, device activate
Jul 02 22:03:06 ourserver NetworkManager[19893]: <info> [1499047386.3320] manager: NetworkManager state is now CONNECTED_GLOBAL
Jul 02 22:03:11 ourserver NetworkManager[19893]: <info> [1499047391.2042] manager: startup complete
Also started ypserv with debug and got this:

Jul 2 22:38:08 dsm ypserv: #011ypdb_open("ourdomain", "hosts.byname")
Jul 2 22:38:08 dsm ypserv: Found: ourdomain/hosts.byname (0)
Jul 2 22:38:08 dsm ypserv: ypdb_close() called
Jul 2 22:38:08 dsm ypserv: #011-> Error #-3
Jul 2 22:38:11 dsm ypserv: ypproc_domain("ourdomain") [From: ourip:855]
Jul 2 22:38:11 dsm ypserv: connect from ourip
Jul 2 22:38:11 dsm ypserv: #011-> Ok.
Jul 2 22:38:16 dsm ypserv: ypproc_domain("ourdomain") [From: ourip:744]
Jul 2 22:38:16 dsm ypserv: connect from ourip
Jul 2 22:38:16 dsm ypserv: #011-> Ok.
Jul 2 22:38:30 dsm systemd: user: State 'stop-final-sigterm' timed out. Killing.
Jul 2 22:38:30 dsm systemd: user: Killing process 21434 (systemd) with signal SIGKILL.
Jul 2 22:38:30 dsm systemd: Failed to start User Manager for UID 6105.
Jul 2 22:38:30 dsm systemd: user: Unit entered failed state.
Jul 2 22:38:30 dsm systemd: user: Failed with result 'timeout'.

Any idea what Error #-3 is?
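A possible reading of that number, offered as an assumption rather than a confirmed answer: if the ypserv debug line prints the raw YP protocol status (the ypstat enum from rpcsvc/yp_prot.h), then -3 would be YP_NOKEY, meaning the key that was looked up is not in the map (here hosts.byname). The lookup helper below is a hypothetical sketch, not part of ypserv; the numeric values are the ypstat values as I know them from yp_prot.h.

```shell
# Hypothetical helper: translate a numeric YP status into its ypstat name.
# Assumption: the "Error #N" debug line prints the ypstat value unchanged.
yp_status_name() {
    case "$1" in
        1)  echo "YP_TRUE: success" ;;
        2)  echo "YP_NOMORE: no more entries in map" ;;
        0)  echo "YP_FALSE: general failure" ;;
        -1) echo "YP_NOMAP: no such map in domain" ;;
        -2) echo "YP_NODOM: domain not supported" ;;
        -3) echo "YP_NOKEY: no such key in map" ;;
        -4) echo "YP_BADOP: invalid operation" ;;
        -5) echo "YP_BADDB: server database is bad" ;;
        -6) echo "YP_YPERR: server error" ;;
        -7) echo "YP_BADARGS: request arguments bad" ;;
        -8) echo "YP_VERS: server version mismatch" ;;
        *)  echo "unknown status: $1" ;;
    esac
}

yp_status_name -3
```

If that reading holds, the -3 on its own would be a failed host lookup rather than a crash, and the systemd User Manager timeout further down would be a separate symptom.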
I'm not seeing this as an NFS problem... What am I missing?
The divide by zero error was fixed over the summer but the Error #-3 remains a mystery. I believe it's related to firewalld and this BZ: 1488616 which I've linked here. We had to disable firewalld and just use iptables to get NFS to work with NIS.
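For anyone who would rather keep firewalld running, one way to see which ports would have to be opened is to list what the RPC services have registered with rpcbind. The sketch below is an assumption-laden illustration, not a verified fix for this bug: rpc_ports is a hypothetical helper, and the port in the usage comment is an example value only. It parses the five columns that "rpcinfo -p" prints (program, vers, proto, port, service).

```shell
# Hypothetical helper: read "rpcinfo -p" output on stdin and print one unique
# "port/proto service" line per registered RPC service.
rpc_ports() {
    # Keep only data rows (first column numeric, service name present);
    # this also drops the header line.
    awk '$1 ~ /^[0-9]+$/ && $5 != "" { print $4 "/" $3, $5 }' | LC_ALL=C sort -u
}

# Possible usage (commented out on purpose; 834/udp is an example value only):
#   rpcinfo -p | rpc_ports
#   firewall-cmd --permanent --add-port=834/udp
#   firewall-cmd --reload
```

Services such as ypserv and mountd may register on different ports after each restart unless they are pinned, so any port opened this way may need to be rechecked.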
(In reply to RobbieTheK from comment #11)
> The divide by zero error was fixed over the summer but the Error #-3 remains
> a mystery. I believe it's related to firewalld and this BZ: 1488616 which
> I've linked here. We had to disable firewalld and just use iptables to get
> NFS to work with NIS.

Fair enough... I'm going to change the component to firewalld so they can have a look at this.
This message is a reminder that Fedora 26 is nearing its end of life. Approximately 4 (four) weeks from now Fedora will stop maintaining and issuing updates for Fedora 26. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At that time this bug will be closed as EOL if it remains open with a Fedora 'version' of '26'.

Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, simply change the 'version' to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not able to fix it before Fedora 26 is end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora, you are encouraged to change the 'version' to a later Fedora version before this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's lifetime, sometimes those efforts are overtaken by events. Often a more recent Fedora release includes newer upstream software that fixes bugs or makes them obsolete.
Fedora 26 changed to end-of-life (EOL) status on 2018-05-29. Fedora 26 is no longer maintained, which means that it will not receive any further security or bug fix updates. As a result we are closing this bug. If you can reproduce this bug against a currently maintained version of Fedora please feel free to reopen this bug against that version. If you are unable to reopen this bug, please file a new report against the current release. If you experience problems, please add a comment to this bug. Thank you for reporting this bug and we are sorry it could not be fixed.