Bug 1285097
| Field | Value |
|---|---|
| Summary | updated nfs-utils package broke nfsdcltrack |
| Product | Red Hat Enterprise Linux 7 |
| Reporter | Frank Sorenson <fsorenso> |
| Component | nfs-utils |
| Assignee | Steve Dickson <steved> |
| Status | CLOSED ERRATA |
| QA Contact | Yongcheng Yang <yoyang> |
| Severity | high |
| Docs Contact | Marie Hornickova <mdolezel> |
| Priority | urgent |
| Version | 7.2 |
| CC | adrian.fischli, bcodding, bfields, bugzilla.redhat.com, chorn, dgilbert, dossow, dwysocha, eguan, evelu, gdubreui, green, igeorgex, ioan, jbnance, jiyin, j, knweiss, lslysz, luc.lalonde, mark2015, martin, mdolezel, me, miturria, mkolaja, pasteur, redhat.bugs, redhatbugs, redhat, rob.verduijn, sellis, steved, swhiteho, tlavigne, troels, vanhoof, wdh, yoguma |
| Target Milestone | rc |
| Keywords | Patch, Regression, TestCaseProvided, ZStream |
| Target Release | --- |
| Hardware | All |
| OS | Linux |
| Whiteboard | |
| Fixed In Version | nfs-utils-1.3.0-0.23.el7 |
| Doc Type | Bug Fix |
| Story Points | --- |
| Clone Of | |
| | 1309625 (view as bug list) |
| Environment | |
| Last Closed | 2016-11-04 05:01:27 UTC |
| Type | Bug |
| Regression | --- |
| Mount Type | --- |
| Documentation | --- |
| CRM | |
| Verified Versions | |
| Category | --- |
| oVirt Team | --- |
| RHEL 7.3 requirements from Atomic Host | |
| Cloudforms Team | --- |
| Target Upstream Version | |
| Embargoed | |
| Bug Depends On | |
| Bug Blocks | 1203710, 1295577, 1309625 |

Doc Text:
The update of the nfs-utils packages in Red Hat Enterprise Linux 7.2 added incomplete support for NFSv4.1 features. Consequently, the NFSv4 client-tracking callout program (nfsdcltrack) created an incorrect schema for the clients table, so file locks appeared to work but did not persist across a server restart. With this update, the underlying source code has been fixed, and nfsdcltrack can now enter the NFS client data into its database. As a result, NFS clients no longer lose the ability to reclaim their locks after an NFS server restart.
Description
Frank Sorenson, 2015-11-24 21:34:49 UTC
For those who can't see that Red Hat KB article, the solution there is simply to downgrade nfs-utils. Not sure why that gem is hidden behind the paywall.

Also, does anyone know whether this issue actually causes any problems with remote hosts mounting exported filesystems? I'm trying to track down several issues I've been having since the 7.2 update, but aside from the log spamming I'm not sure what this actually breaks.

(In reply to Jason Tibbitts from comment #2)
> Also, does anyone know if this issue actually cause any problems with remote
> hosts mounting exported filesystems? I'm trying to track down several
> issues I'm having since the 7.2 update but aside from the log spamming I'm
> not sure what this actually breaks.

There shouldn't be problems during the 'mount' or during most normal usage of the mounts. This program updates an on-disk (persistent) database that tracks the file locks the NFS clients have been granted. This becomes important in the event of an NFS server restart, as it enables the NFS clients to reclaim these locks for a period of time (while the server denies conflicting lock requests from other NFS clients). This bug prevents nfsdcltrack from entering the relevant NFS client data into the database, so in the event of an NFS server restart, NFS clients will be unable to reclaim these locks.

There are definitely issues with this version. In my case it's when using an IPv6 stack between the NFS server and clients. Mounting and listing work, but accessing a file's content doesn't: the command (such as cat) just freezes. Security is not the issue (tested with both SELinux and the firewall off). A workaround is to move back to the previous package version (tested with nfs-utils-1.3.0-0.8.el7.x86_64.rpm) and to make sure both sides, NFS server and clients, are downgraded.

I've also had this problem.
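The on-disk database the comment above describes is an SQLite file (shown later in this thread at /var/lib/nfs/nfsdcltrack/main.sqlite). As a quick diagnostic sketch, assuming only Python's stdlib sqlite3 module, the helper below (a hypothetical function, not part of nfs-utils) reports whether a given copy of that database already has the has_session column that the updated nfsdcltrack expects:

```python
import sqlite3

def clients_has_session(db_path):
    """Return True if the clients table in db_path has a has_session column."""
    con = sqlite3.connect(db_path)
    try:
        # PRAGMA table_info yields one row per column; field 1 is the name.
        cols = [row[1] for row in con.execute("PRAGMA table_info(clients)")]
    finally:
        con.close()
    return "has_session" in cols

# Example (run against a copy, not the live database):
# clients_has_session("/var/lib/nfs/nfsdcltrack/main.sqlite")
```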
I have a share defined in /etc/exports as follows:

    /path host(ro,insecure)

After the upgrade to 7.2, that host (running Kodibuntu OS) could no longer mount the share; it got 'mount.nfs: access denied by server while mounting x'. Downgrading nfs-utils resolved the issue. I also have other shares defined in /etc/exports and they weren't affected, although they weren't defined with the insecure flag.

I don't see how either of these relates to this bugzilla. This bugzilla has nothing to do with either mounting or IPv6. It's about errors updating the database schema used by the nfsdcltrack utility for tracking file locks.

I can confirm that rolling back the nfs-utils package resolved the error and also allowed Ubuntu-based NFSv4 clients to connect again:

    yum downgrade nfs-utils
    Running transaction
      Installing : 1:nfs-utils-1.3.0-0.8.el7.x86_64    1/2
    warning: /etc/sysconfig/nfs created as /etc/sysconfig/nfs.rpmnew
      Cleanup    : 1:nfs-utils-1.3.0-0.21.el7.x86_64   2/2
      Verifying  : 1:nfs-utils-1.3.0-0.8.el7.x86_64    1/2
      Verifying  : 1:nfs-utils-1.3.0-0.21.el7.x86_64   2/2
    Removed:   nfs-utils.x86_64 1:1.3.0-0.21.el7
    Installed: nfs-utils.x86_64 1:1.3.0-0.8.el7

Prior to downgrading, with a RHEL 7.2 NFS server and an Ubuntu 14.04 LTS client:

    root@mythtv:/mnt# mount -t nfs -o vers=4 fileserver:/mnt/share /mnt/local -v
    mount.nfs: timeout set for Tue Jan 19 21:25:11 2016
    mount.nfs: trying text-based options 'vers=4,addr=192.168.0.10,clientaddr=192.168.0.17'
    mount.nfs: mount(2): Permission denied

Current kernel version of the test NFS server: 3.10.0-229.20.1.el7.x86_64. All packages are current excluding nfs-utils; SELinux enforcing; firewalld enabled. If I upgrade nfs-utils to 1.3.0-0.21.el7, then the Ubuntu 14.04 LTS client can't connect using NFSv4.

*** Bug 1298320 has been marked as a duplicate of this bug. ***

I also got this problem since I upgraded to 7.2. Restarting nfs produced this in the log:

    systemd: Starting NFS server and services...
    Jan 30 12:40:30 cantor nfsdcltrack[11205]: sqlite_query_reclaiming: unable to prepare select statement: no such column: has_session
    Jan 30 12:40:30 cantor kernel: NFSD: starting 90-second grace period

So then I tried:

    # sqlite3 /var/lib/nfs/nfsdcltrack/main.sqlite
    sqlite> .tables
    clients     parameters
    sqlite> .schema clients
    CREATE TABLE clients (id BLOB PRIMARY KEY, time INTEGER);
    sqlite> .schema parameters
    CREATE TABLE parameters (key TEXT PRIMARY KEY, value TEXT);

Just guessing where and what has_session ought to be:

    sqlite> alter table clients add column has_session TINYINT;
    sqlite> .schema clients
    CREATE TABLE clients (id BLOB PRIMARY KEY, time INTEGER, has_session TINYINT);
    sqlite> .exit

Now I no longer get the error:

    systemd: Starting NFS server and services...
    kernel: NFSD: starting 90-second grace period (net ffffffff81a25e00)
    systemd: Started NFS server and services.

And 'flock /myshare/bar sleep 30' followed by a server restart produces this:

    systemd: Starting NFS server and services...
    kernel: NFSD: starting 90-second grace period (net ffffffff81a25e00)
    systemd: Started NFS server and services.
    systemd: Starting Notify NFS peers of a restart...
    sm-notify[11325]: Version 1.3.0 starting
    sm-notify[11325]: Already notifying clients; Exiting!
    systemd: Started Notify NFS peers of a restart.

That's where I think it should have thrown the "insert statement prepare failed" error, but it seems happy. This could be the quick 'n dirty workaround if someone else would try and can confirm it works.

(In reply to Zenon Panoussis from comment #13)
> That's where I think it should have thrown the "insert statement prepare
> failed" error, but it seems happy. This could be the quick 'n dirty
> workaround if someone else would try and can confirm it works.

It works for me! Thanks for working that out.
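Condensing the comment above into one place, here is a minimal sketch of the failure and the workaround, using Python's stdlib sqlite3 against an in-memory database with the same schema. (The real database lives at /var/lib/nfs/nfsdcltrack/main.sqlite; back it up before altering it.)

```python
import sqlite3

con = sqlite3.connect(":memory:")
# Schema as created by the affected nfs-utils build (no has_session column).
con.execute("CREATE TABLE clients (id BLOB PRIMARY KEY, time INTEGER)")

# A query that references has_session fails to prepare against this schema,
# matching the "no such column: has_session" error in the log above.
try:
    con.execute("SELECT id FROM clients WHERE has_session = 1")
except sqlite3.OperationalError as err:
    print(err)  # no such column: has_session

# Workaround from the comment above: add the missing column.
con.execute("ALTER TABLE clients ADD COLUMN has_session INTEGER")

# Now the same query prepares and runs.
rows = con.execute("SELECT id, time, has_session FROM clients").fetchall()
print(rows)  # [] (table is empty, but there is no schema error)
```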
Just FYI, this is the schema from a Fedora 23 machine (where the problem is not present):

    sqlite> .tables
    clients     parameters
    sqlite> .schema clients
    CREATE TABLE clients (id BLOB PRIMARY KEY, time INTEGER, has_session INTEGER);
    sqlite> .schema parameters
    CREATE TABLE parameters (key TEXT PRIMARY KEY, value TEXT);

(In reply to Jason Tibbitts from comment #16)
> Just FYI, this is the schema from a Fedora 23 machine (where the problem is
> not present):
> sqlite> .schema clients
> CREATE TABLE clients (id BLOB PRIMARY KEY, time INTEGER, has_session
> INTEGER);

Uhm, has_session is boolean, and for a moment I thought that using INTEGER is a bug on its own, but it turns out that the TINYINT in my previous comment simply shows my ignorance of sqlite. Its storage is always INTEGER, no matter what kind of integer you specify: https://www.sqlite.org/datatype3.html . But, precisely therefore, you can safely create the column as TINYINT or BIGINT or any other INT you want and the result will be exactly the same.

Just hit that issue too.

I have the same issue:

    May 12 17:49:21 nfs-server nfsdcltrack[10765]: sqlite_insert_client: insert statement prepare failed: table clients has 2 columns but 3 values were supplied

I have to force 'vers=4.0' on the clients for autofs mounts; otherwise clients cannot mount their home directories:

    * -fstype=nfs4,rw,sec=krb5,vers=4.0 nfs-server:/&

However, I don't know if this is related to this issue...
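The type-affinity point above is easy to demonstrate with Python's stdlib sqlite3 (a sketch for illustration, not part of the thread): whichever INT flavor the column is declared as, SQLite stores the value with INTEGER affinity, so TINYINT and INTEGER behave identically.

```python
import sqlite3

con = sqlite3.connect(":memory:")
# Two tables differing only in the declared integer type of the column.
con.execute("CREATE TABLE a (v TINYINT)")
con.execute("CREATE TABLE b (v INTEGER)")
con.execute("INSERT INTO a VALUES (1)")
con.execute("INSERT INTO b VALUES (1)")

# typeof() reports the storage class actually used for the stored value.
print(con.execute("SELECT typeof(v) FROM a").fetchone()[0])  # integer
print(con.execute("SELECT typeof(v) FROM b").fetchone()[0])  # integer
```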
Could this be related?

    May 10 10:10:53 moe-180 kernel: ------------[ cut here ]------------
    May 10 10:10:53 moe-180 kernel: WARNING: at fs/nfsd/nfs4state.c:3853 nfsd4_process_open2+0xb72/0xf70 [nfsd]()
    May 10 10:10:53 moe-180 kernel: Modules linked in: fuse xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat ipt_REJECT tun bridge stp llc ebtable_filter ebtables ip6table_filter ip6_tables cts rpcsec_gss_krb5 nf_conntrack_ipv4 nf_defrag_ipv4 vmw_vsock_vmci_transport vsock xt_conntrack nf_conntrack iptable_filter coretemp kvm_intel kvm ppdev vmw_balloon sg pcspkr parport_pc vmw_vmci i2c_piix4 parport shpchp nfsd nfs_acl lockd grace auth_rpcgss sunrpc ip_tables ext4 mbcache jbd2 sd_mod crc_t10dif crct10dif_generic sr_mod cdrom crct10dif_common ata_generic pata_acpi vmwgfx crc32c_intel mptspi drm_kms_helper serio_raw ata_piix scsi_transport_spi ttm mptscsih mptbase drm vmxnet3 libata i2c_core floppy dm_mirror dm_region_hash dm_log dm_mod
    May 10 10:10:53 moe-180 kernel: CPU: 1 PID: 3708 Comm: nfsd Not tainted 3.10.0-327.13.1.el7.x86_64 #1
    May 10 10:10:53 moe-180 kernel: Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 07/30/2013
    May 10 10:10:53 moe-180 kernel: 0000000000000000 00000000288c7f59 ffff88040d56fc28 ffffffff8163571c
    May 10 10:10:53 moe-180 kernel: ffff88040d56fc60 ffffffff8107b200 ffff8802bf459708 ffff880422e3f3e0
    May 10 10:10:53 moe-180 kernel: ffff880191afdd98 ffff88042719b600 0000000000000000 ffff88040d56fc70
    May 10 10:10:53 moe-180 kernel: Call Trace:
    May 10 10:10:53 moe-180 kernel: [<ffffffff8163571c>] dump_stack+0x19/0x1b
    May 10 10:10:53 moe-180 kernel: [<ffffffff8107b200>] warn_slowpath_common+0x70/0xb0
    May 10 10:10:53 moe-180 kernel: [<ffffffff8107b34a>] warn_slowpath_null+0x1a/0x20
    May 10 10:10:53 moe-180 kernel: [<ffffffffa033be22>] nfsd4_process_open2+0xb72/0xf70 [nfsd]
    May 10 10:10:53 moe-180 kernel: [<ffffffffa032b14a>] nfsd4_open+0x55a/0x850 [nfsd]
    May 10 10:10:53 moe-180 kernel: [<ffffffffa032b917>] nfsd4_proc_compound+0x4d7/0x7f0 [nfsd]
    May 10 10:10:53 moe-180 kernel: [<ffffffffa031712b>] nfsd_dispatch+0xbb/0x200 [nfsd]
    May 10 10:10:53 moe-180 kernel: [<ffffffffa02b2183>] svc_process_common+0x453/0x6f0 [sunrpc]
    May 10 10:10:53 moe-180 kernel: [<ffffffffa02b2523>] svc_process+0x103/0x170 [sunrpc]
    May 10 10:10:53 moe-180 kernel: [<ffffffffa0316ab7>] nfsd+0xe7/0x150 [nfsd]
    May 10 10:10:53 moe-180 kernel: [<ffffffffa03169d0>] ? nfsd_destroy+0x80/0x80 [nfsd]
    May 10 10:10:53 moe-180 kernel: [<ffffffff810a5aef>] kthread+0xcf/0xe0
    May 10 10:10:53 moe-180 kernel: [<ffffffff810a5a20>] ? kthread_create_on_node+0x140/0x140
    May 10 10:10:53 moe-180 kernel: [<ffffffff81645e18>] ret_from_fork+0x58/0x90
    May 10 10:10:53 moe-180 kernel: [<ffffffff810a5a20>] ? kthread_create_on_node+0x140/0x140
    May 10 10:10:53 moe-180 kernel: ---[ end trace 37abbe18e83e49c4 ]---

(In reply to Luc Lalonde from comment #24)
> Could this be related:

That's unlikely to be related. You might be seeing bug 1300023, which should be fixed in kernel-3.10.0-351.el7.
Moving to VERIFIED per comment 26; we will continue to run the corresponding automated case in the future.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2016-2383.html