Bug 663068

Summary: genesis w/flocks on NFS can't verify write
Product: Red Hat Enterprise Linux 5
Component: kernel
Version: 5.6
Hardware: All
OS: Linux
Status: CLOSED DUPLICATE
Severity: medium
Priority: medium
Reporter: Nate Straz <nstraz>
Assignee: Jeff Layton <jlayton>
QA Contact: Cluster QE <mspqa-list>
CC: adas, bfields, bmarzins, dhowells, jlayton, rpeterso, rwheeler, steved, swhiteho
Target Milestone: rc
Target Release: ---
Doc Type: Bug Fix
Clone Of: 245024
Last Closed: 2011-05-26 17:17:56 UTC

Attachments: console logs from west-09 and west-10 with soft lockup messages

Description Nate Straz 2010-12-14 15:34:39 UTC
+++ This bug was initially created as a clone of Bug #245024 +++

Description of problem:

There is still a race which can cause our test case genesis to fail when running on NFS.

The failure happens during a recreate_file() step which performs:
 1. unlink file
 2. creat file
 3. write buffer to file
 4. lseek to beginning of file
 5. read buffer from file
 6. check read buffer
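
For reference, here is a minimal sketch of that sequence. This is hypothetical code, not the actual genesis source: it opens the file with O_RDWR (creat() alone would return a write-only descriptor, so the read-back in step 5 would fail) and takes an flock(), matching the -L flock option used in the reproducer command line further down.

/*
 * Hypothetical sketch of the recreate_file() steps above (not the actual
 * genesis source).  Error handling is abbreviated.
 */
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/file.h>
#include <unistd.h>

#define BUFSZ 1024

static void recreate_file(const char *path, const char *wbuf)
{
	char rbuf[BUFSZ];
	ssize_t nbytes;
	int fd;

	unlink(path);                                       /* 1. unlink file */

	fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0644);  /* 2. creat file */
	if (fd < 0 || flock(fd, LOCK_EX) != 0) {
		perror(path);
		exit(1);
	}

	if (write(fd, wbuf, BUFSZ) != BUFSZ) {               /* 3. write buffer to file */
		perror("write");
		exit(1);
	}

	lseek(fd, 0, SEEK_SET);                               /* 4. lseek to beginning */

	nbytes = read(fd, rbuf, BUFSZ);                       /* 5. read buffer from file */
	if (nbytes != BUFSZ) {
		fprintf(stderr, "Short read (nbytes is %zd) on %s\n", nbytes, path);
		exit(1);
	}

	if (memcmp(wbuf, rbuf, BUFSZ) != 0) {                 /* 6. check read buffer */
		fprintf(stderr, "DATA COMPARISON ERROR %s\n", path);
		exit(1);
	}

	flock(fd, LOCK_UN);
	close(fd);
}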

There are two failure modes we are seeing:

During step 5 above we get a short read:

creat_file(): Short read(3, buf, 1024) (nbytes is 0) on genquick_6/udiehtlcyxdvubptsaaudgtm
genesis (22025) exited with status 1

-- or --

During step 6, the read returned incorrect data:

*** DATA COMPARISON ERROR genreg_70/ecenlfgmsbbesbwiykuuphkg ***
Corrupt regions follow - unprintable chars are represented as '.'
-----------------------------------------------------------------
corrupt bytes starting at file offset 1111451
    1st 32 expected bytes:  4392:west-15:genesis*ecenlfgmsb:
    1st 32 actual bytes:    ................................

genesis (14384) exited with status 1


I have hit this with GFS, ext3 and ext4 as backing for NFS.

Using the following genesis command line I'm able to hit this bug every few hours.

genesis -S 25522 -i 0 -n 20 -d 2 -p 3  -L flock -k -w /mnt/nfs

Comment 1 Abhijith Das 2011-01-18 17:11:26 UTC
*** Bug 223263 has been marked as a duplicate of this bug. ***

Comment 2 Robert Peterson 2011-01-21 19:13:54 UTC
Spoke with Jeff Layton who agreed to take it for now, since
we can recreate it outside of GFS and GFS2.  Let me, Abhi, or
Nate know if you want help recreating it.

Comment 6 Jeff Layton 2011-01-27 18:46:02 UTC
I wonder whether this may be a duplicate of bug 672981. The problem sounds like it may be somewhat related. Once we've had some time to go over that patch, I'll put it in my test kernels and perhaps we can see whether it helps this issue too.

Comment 7 Jeff Layton 2011-01-28 20:55:42 UTC
Nate, I've added Trond's patch to the jtltest.131 kernels on my people.redhat.com page:

    http://people.redhat.com/jlayton/

...if you could rerun this test with those kernels, that would be great. If it happens to fix this issue, then perhaps we can get this into 5.7 after all.

Comment 9 Jeff Layton 2011-02-01 12:51:44 UTC
Nate ran the test over the weekend. It made it through most of the weekend and then failed at ~1am Monday morning. While Trond's patch seems to help, it doesn't seem to be a complete fix.

Comment 10 Jeff Layton 2011-02-01 12:52:18 UTC
The next step is to re-run this with RHEL6 clients to see if this is still an upstream issue. Nate is doing that this week and should have results by week's end...

Comment 11 Nate Straz 2011-02-04 14:49:05 UTC
This week I have run the genesis load with the same RHEL5.6 server running kernel-2.6.18-238.el5.  I've run RHEL6.0 clients running kernel-2.6.32-71.el6.x86_64 and RHEL6.1 clients running kernel-2.6.32-94.el6.x86_64.

I have not hit the data comparison error at all during the four days of testing.
I have hit the short read several times on RHEL6.0 and RHEL6.1 using both ext3 and GFS under the NFS export.

I hit a new symptom on the RHEL6.1 clients.  After hitting a short read, one of the genesis processes got stuck on a CPU on both systems (one w/ ext3 under NFS and one w/ GFS under NFS).


west-09, running NFS on ext3:

creat_file(): Short read(3, buf, 1024) (nbytes is 0) on gendir_0/udiehtlcyxdvubptsaaudgtm
genesis (32625) exited with status 1
Killing all pids.
genesis (32626) terminated with signal 15:
genesis (32627) terminated with signal 15:
genesis (32628) terminated with signal 15:
BUG: soft lockup - CPU#2 stuck for 61s! [genesis:32630]
Modules linked in: nfs fscache nfsd lockd nfs_acl auth_rpcgss exportfs sunrpc cpufreq_ondemand powernow_k8 freq_table ipv6 dm_mirror dm_region_hash dm_log dcdbas serio_raw qla2xxx scsi_transport_fc scsi_tgt amd64_edac_mod edac_core edac_mce_amd k8temp hwmon tg3 sg i2c_piix4 shpchp ext4 mbcache jbd2 sd_mod crc_t10dif sr_mod cdrom sata_svw ata_generic pata_acpi pata_serverworks radeon ttm drm_kms_helper drm i2c_algo_bit i2c_core dm_mod [last unloaded: scsi_wait_scan]
CPU 2:
Modules linked in: nfs fscache nfsd lockd nfs_acl auth_rpcgss exportfs sunrpc cpufreq_ondemand powernow_k8 freq_table ipv6 dm_mirror dm_region_hash dm_log dcdbas serio_raw qla2xxx scsi_transport_fc scsi_tgt amd64_edac_mod edac_core edac_mce_amd k8temp hwmon tg3 sg i2c_piix4 shpchp ext4 mbcache jbd2 sd_mod crc_t10dif sr_mod cdrom sata_svw ata_generic pata_acpi pata_serverworks radeon ttm drm_kms_helper drm i2c_algo_bit i2c_core dm_mod [last unloaded: scsi_wait_scan]
Pid: 32630, comm: genesis Not tainted 2.6.32-94.el6.x86_64 #1 PowerEdge SC1435
RIP: 0010:[<ffffffff81117f00>]  [<ffffffff81117f00>] clear_page_dirty_for_io+0x60/0x100
RSP: 0000:ffff880119ec7b58  EFLAGS: 00000246
RAX: ffff880118672848 RBX: ffff880119ec7b68 RCX: ffff880028035b98
RDX: 0000000000000000 RSI: 0000000000000002 RDI: ffffea000397a9d8
RBP: ffffffff8100bc8e R08: ffff880028035b98 R09: fed8b06043dc8407
R10: 00000000002fc970 R11: 0000000000000400 R12: ffff880119ec7ac0
R13: ffffffff8108a130 R14: ffff8801190b2b30 R15: 0000000000000001
FS:  00007f874964a700(0000) GS:ffff880028280000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00007f1e57302c08 CR3: 0000000118231000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Call Trace:
 [<ffffffffa0478fd0>] ? nfs_wb_page+0x60/0xd0 [nfs]
 [<ffffffffa0466c1e>] ? nfs_launder_page+0x4e/0x80 [nfs]
 [<ffffffff8111be52>] ? invalidate_inode_pages2_range+0x322/0x3b0
 [<ffffffff8111bef7>] ? invalidate_inode_pages2+0x17/0x20
 [<ffffffffa046af88>] ? nfs_revalidate_mapping+0x128/0x170 [nfs]
 [<ffffffffa0467c77>] ? nfs_file_read+0x77/0x130 [nfs]
 [<ffffffff81165c8a>] ? do_sync_read+0xfa/0x140
 [<ffffffff8118f4f4>] ? writeback_single_inode+0x154/0x2c0
 [<ffffffff8108a0f0>] ? autoremove_wake_function+0x0/0x40
 [<ffffffff8120537b>] ? selinux_file_permission+0xfb/0x150
 [<ffffffff811f87e6>] ? security_file_permission+0x16/0x20
 [<ffffffff811666b5>] ? vfs_read+0xb5/0x1a0
 [<ffffffff810cca12>] ? audit_syscall_entry+0x272/0x2a0
 [<ffffffff811667f1>] ? sys_read+0x51/0x90
 [<ffffffff8100b172>] ? system_call_fastpath+0x16/0x1b


west-10, running NFS on GFS:

creat_file(): Short read(3, buf, 1024) (nbytes is 0) on gendir_0/udiehtlcyxdvubptsaaudgtm
genesis (30566) exited with status 1
Killing all pids.
genesis (30567) terminated with signal 15:
genesis (30568) terminated with signal 15:
genesis (30569) terminated with signal 15:
BUG: soft lockup - CPU#2 stuck for 61s! [genesis:30570]
Modules linked in: nfs fscache nfsd lockd nfs_acl auth_rpcgss exportfs sunrpc cpufreq_ondemand powernow_k8 freq_table ipv6 dm_mirror dm_region_hash dm_log dcdbas serio_raw amd64_edac_mod edac_core edac_mce_amd k8temp hwmon qla2xxx scsi_transport_fc scsi_tgt tg3 sg i2c_piix4 shpchp ext4 mbcache jbd2 sd_mod crc_t10dif sr_mod cdrom sata_svw pata_acpi ata_generic pata_serverworks radeon ttm drm_kms_helper drm i2c_algo_bit i2c_core dm_mod [last unloaded: scsi_wait_scan]
CPU 2:
Modules linked in: nfs fscache nfsd lockd nfs_acl auth_rpcgss exportfs sunrpc cpufreq_ondemand powernow_k8 freq_table ipv6 dm_mirror dm_region_hash dm_log dcdbas serio_raw amd64_edac_mod edac_core edac_mce_amd k8temp hwmon qla2xxx scsi_transport_fc scsi_tgt tg3 sg i2c_piix4 shpchp ext4 mbcache jbd2 sd_mod crc_t10dif sr_mod cdrom sata_svw pata_acpi ata_generic pata_serverworks radeon ttm drm_kms_helper drm i2c_algo_bit i2c_core dm_mod [last unloaded: scsi_wait_scan]
Pid: 30570, comm: genesis Not tainted 2.6.32-94.el6.x86_64 #1 PowerEdge SC1435
RIP: 0010:[<ffffffff814c51e0>]  [<ffffffff814c51e0>] out_of_line_wait_on_bit_lock+0x0/0x90
RSP: 0018:ffff880112f6db00  EFLAGS: 00000202
RAX: 00000000ffffffff RBX: ffff880112f6db68 RCX: 0000000000000082
RDX: ffffffffa0482aa0 RSI: 0000000000000007 RDI: ffff880114da45a0
RBP: ffffffff8100bc8e R08: ffff880028032b80 R09: fecd25ff96756807
R10: 00000000002e018c R11: 0000000000000400 R12: ffffffffa0482aa0
R13: ffff880114da45a0 R14: 0000000000000007 R15: 0000000000000400
FS:  00007fd1fc79e700(0000) GS:ffff880028280000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000003fd78672b0 CR3: 0000000112d2b000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Call Trace:
 [<ffffffffa0490e5a>] ? nfs_commit_inode+0xaa/0x1c0 [nfs]
 [<ffffffffa0490fe9>] ? nfs_wb_page+0x79/0xd0 [nfs]
 [<ffffffffa047ec1e>] ? nfs_launder_page+0x4e/0x80 [nfs]
 [<ffffffff8111be52>] ? invalidate_inode_pages2_range+0x322/0x3b0
 [<ffffffff8111bef7>] ? invalidate_inode_pages2+0x17/0x20
 [<ffffffffa0482f88>] ? nfs_revalidate_mapping+0x128/0x170 [nfs]
 [<ffffffffa047fc77>] ? nfs_file_read+0x77/0x130 [nfs]
 [<ffffffff81165c8a>] ? do_sync_read+0xfa/0x140
 [<ffffffff8118f4f4>] ? writeback_single_inode+0x154/0x2c0
 [<ffffffff8108a0f0>] ? autoremove_wake_function+0x0/0x40
 [<ffffffff8120537b>] ? selinux_file_permission+0xfb/0x150
 [<ffffffff811f87e6>] ? security_file_permission+0x16/0x20
 [<ffffffff811666b5>] ? vfs_read+0xb5/0x1a0
 [<ffffffff810cca12>] ? audit_syscall_entry+0x272/0x2a0
 [<ffffffff811667f1>] ? sys_read+0x51/0x90
 [<ffffffff8100b172>] ? system_call_fastpath+0x16/0x1b

Comment 12 Nate Straz 2011-02-04 15:17:20 UTC
Created attachment 477043 [details]
console logs from west-09 and west-10 with soft lockup messages

I looked at the logs more closely and there are a bunch of back traces from several processes in the logs.  I've attached the complete console logs from the past two days for inspection.

Comment 13 Jeff Layton 2011-02-04 15:26:40 UTC
I think the RHEL6 problems may be related to bug 672305, but it's hard to be sure. It may be good to open a separate bug for that...

Comment 14 Jeff Layton 2011-03-08 19:02:42 UTC
It's another longshot, but I have some patches in my latest test kernel that I suspect may help this. The existing code can end up returning too early from fsync, close or setattr -- before writeback is done.
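
As a self-contained illustration of that ordering (hypothetical stub code, not the RHEL 5 NFS client; the names start_writeback, wait_for_writeback and commit_unstable_writes are made up for this sketch):

/*
 * Hypothetical sketch of the ordering described above.  The stubs only
 * print what a real NFS client would do; the point is that an
 * fsync/close/setattr-like path should return only after the writes have
 * completed and been committed.  Returning right after queueing them
 * ("too early") leaves a window where a later cache invalidation can
 * discard data the caller already believes is stable.
 */
#include <stdio.h>

static int start_writeback(void)        { puts("queue WRITE rpcs");     return 0; }
static int wait_for_writeback(void)     { puts("WRITE rpcs completed"); return 0; }
static int commit_unstable_writes(void) { puts("COMMIT completed");     return 0; }

static int fsync_like_path(void)
{
	int ret;

	if ((ret = start_writeback()) != 0)
		return ret;
	/* Returning here, before the two waits below, is the bug pattern. */
	if ((ret = wait_for_writeback()) != 0)
		return ret;
	return commit_unstable_writes();
}

int main(void)
{
	return fsync_like_path();
}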

Nate, would you be able to rerun this test on the kernels here?

    http://people.redhat.com/jlayton/

Even if it doesn't help, it'll be good to rule out these problems as the cause.

Comment 15 Nate Straz 2011-03-09 22:05:06 UTC
I ran the test kernel for a day with three processes and didn't hit anything.  I'm going to restart with six processes and let it run through the weekend.

Comment 16 Jeff Layton 2011-03-11 12:50:17 UTC
Thanks Nate. I'll set this to needinfo to remind me that we're waiting on results...

Comment 17 Nate Straz 2011-03-14 14:43:10 UTC
I ran 2.6.18-246.el5.jtltest.135 on five client systems for a week and was not able to reproduce the data comparison error, nor the short read.

Thumbs up!

Comment 18 Jeff Layton 2011-05-17 11:16:09 UTC
Hi Nate, I think all of the relevant patches for this are now in the latest 5.7 builds. Could you re-test this with a -262.el5 kernel or later?

Comment 19 Nate Straz 2011-05-20 15:18:04 UTC
Got the load started on a -262.el5 kernel.  It ran all night but I'm going to let it go through the weekend.

Comment 20 Nate Straz 2011-05-23 14:38:31 UTC
I figured out this morning that I had a problem with my command line and only one process was running on each NFS client instead of three.  I restarted the test this morning.

Comment 21 Nate Straz 2011-05-26 15:53:13 UTC
I've run the load from 8 NFS clients against one server, all running -262.el5, for four days.  I have not reproduced the data corruption issues I was seeing before.

Since this is in the 5.7 kernel, shouldn't all of the 5.7 flags be set?

Comment 22 Jeff Layton 2011-05-26 17:17:56 UTC
Excellent news. Well, I think the right thing to do here is to go ahead and close this as a duplicate of 441730, since that patchset is what I suspect ultimately fixed this. I'll go ahead and do that; please reopen this one if that's a problem.

*** This bug has been marked as a duplicate of bug 441730 ***