1428941 – NFS: nfs4_discover_server_trunking unhandled error -512. Exiting with error EIO

Keywords:
Status:	CLOSED DUPLICATE of bug 1428939
Alias:	None
Product:	Red Hat Enterprise Linux 7
Classification:	Red Hat
Component:	nfs-utils
Sub Component:
Version:	7.3
Hardware:	x86_64
OS:	Linux
Priority:	unspecified
Severity:	high
Target Milestone:	rc
Target Release:	---
Assignee:	Steve Dickson
QA Contact:	Yongcheng Yang
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2017-03-03 16:25 UTC by jinjian.1
Modified:	2018-04-17 17:45 UTC
CC List:	10 users
Fixed In Version:
Doc Type:	If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:	2017-04-08 13:02:38 UTC
Target Upstream Version:
Embargoed:

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Red Hat Bugzilla	1427379	1	None	None	None	2021-01-20 06:05:38 UTC

Internal Links: 1427379

Description jinjian.1 2017-03-03 16:25:34 UTC

Description of problem:

Configured 3 NFS mount entries in /etc/fstab like:

10.1.50.6:/    /mnt/efs nfs4    nfsvers=4.1,rsize=1048576,wsize=1048576,hard,timeo=600,retrans=2 1 0
10.1.31.83:/    /mnt/efs-1 nfs4    nfsvers=4.1,rsize=1048576,wsize=1048576,hard,timeo=600,retrans=2 1 0
10.1.38.216:/    /mnt/efs-2 nfs4    nfsvers=4.1,rsize=1048576,wsize=1048576,hard,timeo=600,retrans=2 1 0

After reboot, I found that only the last one mounted successfully and the first two failed. dmesg showed two errors:

[  100.703285] NFS: nfs4_discover_server_trunking unhandled error -512. Exiting with error EIO
[  100.710115] NFS: nfs4_discover_server_trunking unhandled error -512. Exiting with error EIO.


The confusing part is that one mount succeeded, even though I set up 3 identical nfsd servers.

Note: running mount -a mounts all 3 NFS targets successfully.


Version-Release number of selected component (if applicable):

Kernel: 3.10.0-514.el7.x86_64
nfs-utils-1.3.0-0.33.el7.x86_64


How reproducible:


Steps to Reproduce:
1. Put 3 nfs mount entries in /etc/fstab
2. reboot system

Actual results:

Mount failed!

Expected results:

All mounts succeed.

Additional info:

Error messages in /var/log/messages

Feb 27 13:32:13 ip-10-1-34-23 systemd: Started LSB: Initializes the correct repo depending on the region..
Feb 27 13:32:13 ip-10-1-34-23 systemd: Mounted /mnt/efs-2.

.....

Feb 27 13:33:43 ip-10-1-34-23 systemd: mnt-efs\x2d1.mount mounting timed out. Stopping.
Feb 27 13:33:43 ip-10-1-34-23 systemd: mnt-efs.mount mounting timed out. Stopping.
Feb 27 13:33:43 ip-10-1-34-23 kernel: NFS: nfs4_discover_server_trunking unhandled error -512. Exiting with error EIO
Feb 27 13:33:43 ip-10-1-34-23 kernel: NFS: nfs4_discover_server_trunking unhandled error -512. Exiting with error EIO
Feb 27 13:33:43 ip-10-1-34-23 systemd: Mounted /mnt/efs-1.
Feb 27 13:33:43 ip-10-1-34-23 systemd: Mounted /mnt/efs.
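
For triage, log lines like the ones above can be summarized with a short script. This is a minimal sketch in Python and is not part of the original report: the function name summarize and both regular expressions are assumptions based only on the message formats quoted above. The constant reflects that 512 is ERESTARTSYS in the kernel's include/linux/errno.h, which would be consistent with the mount being interrupted when systemd stopped the timed-out mount units.

```python
import re

# 512 is ERESTARTSYS in the kernel's include/linux/errno.h (a kernel-internal
# restart code that normally never reaches user space).
ERESTARTSYS = 512

# Assumed patterns, derived only from the log lines quoted in this report.
TRUNKING_RE = re.compile(
    r"NFS: nfs4_discover_server_trunking unhandled error (-?\d+)"
)
TIMEOUT_RE = re.compile(r"systemd: (\S+\.mount) mounting timed out")


def summarize(lines):
    """Return (error_codes, timed_out_units) found in an iterable of log lines."""
    errors = []
    timed_out = []
    for line in lines:
        match = TRUNKING_RE.search(line)
        if match:
            errors.append(int(match.group(1)))
        match = TIMEOUT_RE.search(line)
        if match:
            timed_out.append(match.group(1))
    return errors, timed_out
```

Passing an open file such as /var/log/messages to summarize() yields the error codes and the mount units that timed out, so the two can be compared by timestamp by hand.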

Comment 2 Steve Dickson 2017-04-08 13:02:38 UTC


*** This bug has been marked as a duplicate of bug 1428939 ***

Comment 3 M1ke 2018-01-26 11:15:05 UTC

This bug has been marked as a duplicate, but the bug it points to has restricted access. For those without access, what was the solution to this?

Comment 4 Steve Dickson 2018-01-29 20:26:37 UTC

(In reply to M1ke from comment #3)
> This bug has been marked as a duplicate but the bug it relates to has been
> restricted access; so for those without access what was the solution to this?

I just added you to the cc list so you should have access to everything

Comment 5 cloud 2018-02-08 13:06:33 UTC

I have the same issue but the related bug is still restricted.

Comment 6 Gyde 2018-02-08 14:26:06 UTC

I have the same issue on:
Kernel: 3.10.0-693.17.1.el7.x86_64
nfs-utils-1.3.0-0.48.el7_4.1.x86_64
Can I also be added to the cc list?

Comment 7 f.dreesen 2018-02-12 11:22:25 UTC

We are running into the same issue.
Is there already a solution to this, and could I also be added to the cc list?

kernel: 3.10.0-693.17.1.el7.x86_64
nfs-utils: 1.3.0-0.48.el7_4.1.x86_64

Comment 8 Yongcheng Yang 2018-02-12 13:20:01 UTC

Just sharing the latest progress on this issue.

This is actually a kernel-side problem and has been resolved via Bug 1506382 in kernel-3.10.0-825.el7.

What's more, backporting to RHEL-7.4.z of kernel-3.10.0-693.19.1.el7 is available now and 7.3.z is still on the way.

Hope this can help.
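
To check a host against the kernel builds named above, something like the following could be used. It is a minimal sketch in Python under stated assumptions: the function names parse_kernel and has_trunking_fix are hypothetical, the comparison is a simplified numeric tuple comparison rather than RPM's own version ordering, and only the two builds mentioned in this comment (3.10.0-825.el7 and the 7.4.z build 3.10.0-693.19.1.el7) are known. Any other stream, including 7.3.z, is reported as unknown.

```python
import re


def parse_kernel(release):
    """Split e.g. '3.10.0-693.21.1.el7.x86_64' into two tuples of ints."""
    match = re.search(r"(\d+(?:\.\d+)*)-(\d+(?:\.\d+)*)\.el7", release)
    if match is None:
        raise ValueError("not an el7 kernel release: %r" % release)
    version = tuple(int(part) for part in match.group(1).split("."))
    rel = tuple(int(part) for part in match.group(2).split("."))
    return version, rel


def has_trunking_fix(release):
    """Return True/False if known from this thread, or None if unknown."""
    version, rel = parse_kernel(release)
    if version != (3, 10, 0):
        return None  # outside the kernel series discussed here
    if rel >= (825,):
        return True  # fixed via Bug 1506382 in kernel-3.10.0-825.el7
    if rel[0] == 693:
        return rel >= (693, 19, 1)  # RHEL-7.4.z backport
    return None  # e.g. the 7.3.z stream, status not stated here
```

Calling has_trunking_fix with the output of uname -r indicates only whether the build includes the change referenced here, not whether the message can no longer appear on that host.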

Comment 9 Tom K 2018-04-09 04:20:24 UTC

Wanted to report that, despite the solution at https://access.redhat.com/solutions/3146191, I still get this on one host. (Reporting here in case having me run more tests could be beneficial.)

Issue happens only on ovirt01 when running systemctl restart autofs.  It's been reproducible each time autofs is restarted so far:

[ ovirt01 (client) - ERROR APPEARS HERE ]
[root@ovirt01 log]# uname -a
Linux ovirt01.nix.mds.xyz 3.10.0-693.21.1.el7.x86_64 #1 SMP Wed Mar 7 19:03:37 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
[root@ovirt01 log]#
[root@ovirt01 log]# rpm -aq|grep -Ei nfs
libnfsidmap-0.25-17.el7.x86_64
glusterfs-gnfs-3.12.6-1.el7.x86_64
nfs-utils-1.3.0-0.48.el7_4.2.x86_64
[root@ovirt01 log]#



[ ipaclient01 (client) - NO -512 ERROR HERE ]
[root@ipaclient01 yum.repos.d]# uname -a
Linux ipaclient01.nix.mds.xyz 3.10.0-693.21.1.el7.x86_64 #1 SMP Wed Mar 7 19:03:37 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
[root@ipaclient01 yum.repos.d]#
[root@ipaclient01 ~]# rpm -aq|grep -Ei nfs
libnfsidmap-0.25-17.el7.x86_64
nfs-utils-1.3.0-0.48.el7_4.2.x86_64
[root@ipaclient01 ~]#



[ nfs01 (server cluster w/ nfs02, NFS Ganesha v2.6) ]
[root@nfs01 ganesha]# uname -a
Linux nfs01.nix.mds.xyz 3.10.0-693.21.1.el7.x86_64 #1 SMP Wed Mar 7 19:03:37 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
[root@nfs01 ganesha]#

[ nfs02 (server cluster w/ nfs01, NFS Ganesha v2.6) ]
[root@nfs02 ~]# uname -a
Linux nfs02.nix.mds.xyz 3.10.0-693.21.1.el7.x86_64 #1 SMP Wed Mar 7 19:03:37 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
[root@nfs02 ~]#



Host ovirt01, where this reoccurs, has the least amount of memory:

[root@ovirt01 log]# cat /proc/meminfo
MemTotal:        3881708 kB
MemFree:          909704 kB
MemAvailable:    1256024 kB
Buffers:            2116 kB
Cached:           603680 kB
SwapCached:            0 kB
Active:          2302224 kB
Inactive:         407032 kB
Active(anon):    2138108 kB
Inactive(anon):    43776 kB
Active(file):     164116 kB
Inactive(file):   363256 kB
Unevictable:       48608 kB
Mlocked:           48640 kB
SwapTotal:       4063228 kB
SwapFree:        4063228 kB
Dirty:                72 kB
Writeback:             0 kB
AnonPages:       2152068 kB
Mapped:           165560 kB
Shmem:             52264 kB
Slab:              93056 kB
SReclaimable:      47420 kB
SUnreclaim:        45636 kB
KernelStack:       10080 kB
PageTables:        27800 kB
NFS_Unstable:          0 kB
Bounce:                0 kB
WritebackTmp:          0 kB
CommitLimit:     6004080 kB
Committed_AS:    4986628 kB
VmallocTotal:   34359738367 kB
VmallocUsed:      157608 kB
VmallocChunk:   34359341052 kB
HardwareCorrupted:     0 kB
AnonHugePages:   1695744 kB
HugePages_Total:       0
HugePages_Free:        0
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB
DirectMap4k:       90048 kB
DirectMap2M:     4104192 kB
[root@ovirt01 log]#

Additional logs (/var/log/messages):

Apr  8 23:46:07 ovirt01 automount[1590]: st_expire: state 1 path /nix.my.dom
Apr  8 23:46:07 ovirt01 automount[1590]: expire_proc: exp_proc = 139973160154880 path /nix.my.dom
Apr  8 23:46:07 ovirt01 automount[1590]: expire_cleanup: got thid 139973160154880 path /nix.my.dom stat 0
Apr  8 23:46:07 ovirt01 automount[1590]: expire_cleanup: sigchld: exp 139973160154880 finished, switching from 2 to 1
Apr  8 23:46:07 ovirt01 automount[1590]: st_ready: st_ready(): state = 2 path /nix.my.dom
Apr  8 23:46:10 ovirt01 ovsdb-server: ovs|10232|stream_ssl|ERR|Private key must be configured to use SSL
Apr  8 23:46:10 ovirt01 ovsdb-server: ovs|10233|stream_ssl|ERR|Certificate must be configured to use SSL
Apr  8 23:46:11 ovirt01 automount[1590]: st_expire: state 1 path /n
Apr  8 23:46:11 ovirt01 automount[1590]: expire_proc: exp_proc = 139973160154880 path /n
Apr  8 23:46:11 ovirt01 automount[1590]: expire_cleanup: got thid 139973160154880 path /n stat 1
Apr  8 23:46:11 ovirt01 automount[1590]: expire_cleanup: sigchld: exp 139973160154880 finished, switching from 2 to 1
Apr  8 23:46:11 ovirt01 automount[1590]: st_ready: st_ready(): state = 2 path /n
Apr  8 23:46:12 ovirt01 systemd: Cannot add dependency job for unit lvm2-lvmetad.socket, ignoring: Unit is masked.
Apr  8 23:46:12 ovirt01 systemd: Stopping Automounts filesystems on demand...
Apr  8 23:46:12 ovirt01 kernel: NFS: nfs4_discover_server_trunking unhandled error -512. Exiting with error EIO
Apr  8 23:46:12 ovirt01 automount[1590]: do_notify_state: signal 15
Apr  8 23:46:12 ovirt01 automount[1590]: master_notify_state_change: sig 15 switching /misc from 1 to 5
Apr  8 23:46:12 ovirt01 automount[1590]: master_notify_state_change: sig 15 switching /net from 1 to 5
Apr  8 23:46:12 ovirt01 automount[1590]: master_notify_state_change: sig 15 switching /n from 1 to 5
Apr  8 23:46:12 ovirt01 automount[1590]: st_prepare_shutdown: state 1 path /n
Apr  8 23:46:12 ovirt01 automount[1590]: expire_proc: exp_proc = 139973157791488 path /n
Apr  8 23:46:12 ovirt01 automount[1590]: st_prepare_shutdown: state 1 path /net
Apr  8 23:46:12 ovirt01 automount[1590]: expire_proc: exp_proc = 139973155428096 path /net
Apr  8 23:46:12 ovirt01 automount[1590]: st_prepare_shutdown: state 1 path /misc
Apr  8 23:46:12 ovirt01 automount[1590]: mount(nfs): nfs: mount failure nfs-c01.nix.my.dom:/n/my.dom on /n/my.dom
Apr  8 23:46:12 ovirt01 automount[1590]: expire_proc: exp_proc = 139973153064704 path /misc
Apr  8 23:46:12 ovirt01 automount[1590]: dev_ioctl_send_fail: token = 96
Apr  8 23:46:12 ovirt01 automount[1590]: failed to mount /n/my.dom
Apr  8 23:46:12 ovirt01 automount[1590]: master_notify_state_change: sig 15 switching /nix.my.dom from 1 to 5
Apr  8 23:46:12 ovirt01 automount[1590]: master_notify_state_change: sig 15 switching /- from 1 to 5
Apr  8 23:46:12 ovirt01 automount[1590]: master_notify_state_change: sig 15 switching /my.dom from 1 to 5
Apr  8 23:46:12 ovirt01 automount[1590]: handle_packet: type = 3
Apr  8 23:46:12 ovirt01 automount[1590]: handle_packet_missing_indirect: token 97, name my.dom, request pid 2917
Apr  8 23:46:12 ovirt01 automount[1590]: dev_ioctl_send_fail: token = 97


Cheers,
TK

Comment 10 Tom K 2018-04-09 04:24:01 UTC

Wanted to report that, despite the solution at https://access.redhat.com/solutions/3146191, I still get this on one host. (Reporting here in case having me run more tests could be beneficial.)

Issue happens only on ovirt01 when running systemctl restart autofs.  It's been reproducible each time autofs is restarted so far:

[ ovirt01 (client) - ERROR APPEARS HERE ]
[root@ovirt01 log]# uname -a
Linux ovirt01.nix.my.dom 3.10.0-693.21.1.el7.x86_64 #1 SMP Wed Mar 7 19:03:37 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
[root@ovirt01 log]#
[root@ovirt01 log]# rpm -aq|grep -Ei nfs
libnfsidmap-0.25-17.el7.x86_64
glusterfs-gnfs-3.12.6-1.el7.x86_64
nfs-utils-1.3.0-0.48.el7_4.2.x86_64
[root@ovirt01 log]#



[ ipaclient01 (client) - NO -512 ERROR HERE ]
[root@ipaclient01 yum.repos.d]# uname -a
Linux ipaclient01.nix.my.dom 3.10.0-693.21.1.el7.x86_64 #1 SMP Wed Mar 7 19:03:37 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
[root@ipaclient01 yum.repos.d]#
[root@ipaclient01 ~]# rpm -aq|grep -Ei nfs
libnfsidmap-0.25-17.el7.x86_64
nfs-utils-1.3.0-0.48.el7_4.2.x86_64
[root@ipaclient01 ~]#



[ nfs01 (server cluster w/ nfs02, NFS Ganesha v2.6) ]
[root@nfs01 ganesha]# uname -a
Linux nfs01.nix.my.dom 3.10.0-693.21.1.el7.x86_64 #1 SMP Wed Mar 7 19:03:37 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
[root@nfs01 ganesha]#

[ nfs02 (server cluster w/ nfs01, NFS Ganesha v2.6) ]
[root@nfs02 ~]# uname -a
Linux nfs02.nix.my.dom 3.10.0-693.21.1.el7.x86_64 #1 SMP Wed Mar 7 19:03:37 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
[root@nfs02 ~]#



Host ovirt01, where this reoccurs, has the least amount of memory:

[root@ovirt01 log]# cat /proc/meminfo
MemTotal:        3881708 kB
MemFree:          909704 kB
MemAvailable:    1256024 kB
Buffers:            2116 kB
Cached:           603680 kB
SwapCached:            0 kB
Active:          2302224 kB
Inactive:         407032 kB
Active(anon):    2138108 kB
Inactive(anon):    43776 kB
Active(file):     164116 kB
Inactive(file):   363256 kB
Unevictable:       48608 kB
Mlocked:           48640 kB
SwapTotal:       4063228 kB
SwapFree:        4063228 kB
Dirty:                72 kB
Writeback:             0 kB
AnonPages:       2152068 kB
Mapped:           165560 kB
Shmem:             52264 kB
Slab:              93056 kB
SReclaimable:      47420 kB
SUnreclaim:        45636 kB
KernelStack:       10080 kB
PageTables:        27800 kB
NFS_Unstable:          0 kB
Bounce:                0 kB
WritebackTmp:          0 kB
CommitLimit:     6004080 kB
Committed_AS:    4986628 kB
VmallocTotal:   34359738367 kB
VmallocUsed:      157608 kB
VmallocChunk:   34359341052 kB
HardwareCorrupted:     0 kB
AnonHugePages:   1695744 kB
HugePages_Total:       0
HugePages_Free:        0
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB
DirectMap4k:       90048 kB
DirectMap2M:     4104192 kB
[root@ovirt01 log]#

Additional logs (/var/log/messages):

Apr  8 23:46:07 ovirt01 automount[1590]: st_expire: state 1 path /nix.my.dom
Apr  8 23:46:07 ovirt01 automount[1590]: expire_proc: exp_proc = 139973160154880 path /nix.my.dom
Apr  8 23:46:07 ovirt01 automount[1590]: expire_cleanup: got thid 139973160154880 path /nix.my.dom stat 0
Apr  8 23:46:07 ovirt01 automount[1590]: expire_cleanup: sigchld: exp 139973160154880 finished, switching from 2 to 1
Apr  8 23:46:07 ovirt01 automount[1590]: st_ready: st_ready(): state = 2 path /nix.my.dom
Apr  8 23:46:10 ovirt01 ovsdb-server: ovs|10232|stream_ssl|ERR|Private key must be configured to use SSL
Apr  8 23:46:10 ovirt01 ovsdb-server: ovs|10233|stream_ssl|ERR|Certificate must be configured to use SSL
Apr  8 23:46:11 ovirt01 automount[1590]: st_expire: state 1 path /n
Apr  8 23:46:11 ovirt01 automount[1590]: expire_proc: exp_proc = 139973160154880 path /n
Apr  8 23:46:11 ovirt01 automount[1590]: expire_cleanup: got thid 139973160154880 path /n stat 1
Apr  8 23:46:11 ovirt01 automount[1590]: expire_cleanup: sigchld: exp 139973160154880 finished, switching from 2 to 1
Apr  8 23:46:11 ovirt01 automount[1590]: st_ready: st_ready(): state = 2 path /n
Apr  8 23:46:12 ovirt01 systemd: Cannot add dependency job for unit lvm2-lvmetad.socket, ignoring: Unit is masked.
Apr  8 23:46:12 ovirt01 systemd: Stopping Automounts filesystems on demand...
Apr  8 23:46:12 ovirt01 kernel: NFS: nfs4_discover_server_trunking unhandled error -512. Exiting with error EIO
Apr  8 23:46:12 ovirt01 automount[1590]: do_notify_state: signal 15
Apr  8 23:46:12 ovirt01 automount[1590]: master_notify_state_change: sig 15 switching /misc from 1 to 5
Apr  8 23:46:12 ovirt01 automount[1590]: master_notify_state_change: sig 15 switching /net from 1 to 5
Apr  8 23:46:12 ovirt01 automount[1590]: master_notify_state_change: sig 15 switching /n from 1 to 5
Apr  8 23:46:12 ovirt01 automount[1590]: st_prepare_shutdown: state 1 path /n
Apr  8 23:46:12 ovirt01 automount[1590]: expire_proc: exp_proc = 139973157791488 path /n
Apr  8 23:46:12 ovirt01 automount[1590]: st_prepare_shutdown: state 1 path /net
Apr  8 23:46:12 ovirt01 automount[1590]: expire_proc: exp_proc = 139973155428096 path /net
Apr  8 23:46:12 ovirt01 automount[1590]: st_prepare_shutdown: state 1 path /misc
Apr  8 23:46:12 ovirt01 automount[1590]: mount(nfs): nfs: mount failure nfs-c01.nix.my.dom:/n/my.dom on /n/my.dom
Apr  8 23:46:12 ovirt01 automount[1590]: expire_proc: exp_proc = 139973153064704 path /misc
Apr  8 23:46:12 ovirt01 automount[1590]: dev_ioctl_send_fail: token = 96
Apr  8 23:46:12 ovirt01 automount[1590]: failed to mount /n/my.dom
Apr  8 23:46:12 ovirt01 automount[1590]: master_notify_state_change: sig 15 switching /nix.my.dom from 1 to 5
Apr  8 23:46:12 ovirt01 automount[1590]: master_notify_state_change: sig 15 switching /- from 1 to 5
Apr  8 23:46:12 ovirt01 automount[1590]: master_notify_state_change: sig 15 switching /my.dom from 1 to 5
Apr  8 23:46:12 ovirt01 automount[1590]: handle_packet: type = 3
Apr  8 23:46:12 ovirt01 automount[1590]: handle_packet_missing_indirect: token 97, name my.dom, request pid 2917
Apr  8 23:46:12 ovirt01 automount[1590]: dev_ioctl_send_fail: token = 97


Cheers,
TK

Comment 11 jas 2018-04-17 17:45:50 UTC

Hi.
I recently upgraded 2 of our NFS servers from RHEL6 to RHEL7.4.  Our clients were already RHEL7.4.  Everything is running the 3.10.0-693.21.1.el7 kernel and the server has the latest nfs-utils.  I'm still receiving the "NFS: nfs4_discover_server_trunking unhandled error -512. Exiting with error EIO" on random machines when I'm booting a whole lab of machines.  When I reboot the host, everything is fine.  This definitely did not happen with the RHEL6 server.

Yongcheng Yang above said "What's more, backporting to RHEL-7.4.z of kernel-3.10.0-693.19.1.el7 is available now and 7.3.z is still on the way." as a fix for this issue.  However, I'm still receiving the error.

Please help as I do not have access to #1428939.
