Created attachment 1025317 [details]
dmesg log from node1

Description of problem:
OOM kill for nfs-ganesha process

glusterfsd invoked oom-killer: gfp_mask=0x201da, order=0, oom_adj=0, oom_score_adj=0
glusterfsd cpuset=/ mems_allowed=0
Pid: 31429, comm: glusterfsd Not tainted 2.6.32-504.12.2.el6.x86_64 #1
Call Trace:
 [<ffffffff810d40c1>] ? cpuset_print_task_mems_allowed+0x91/0xb0
 [<ffffffff81127300>] ? dump_header+0x90/0x1b0
 [<ffffffff8122eb5c>] ? security_real_capable_noaudit+0x3c/0x70
 [<ffffffff81127782>] ? oom_kill_process+0x82/0x2a0
 [<ffffffff811276c1>] ? select_bad_process+0xe1/0x120
 [<ffffffff81127bc0>] ? out_of_memory+0x220/0x3c0
 [<ffffffff811344df>] ? __alloc_pages_nodemask+0x89f/0x8d0
 [<ffffffff8116c69a>] ? alloc_pages_current+0xaa/0x110
 [<ffffffff811246f7>] ? __page_cache_alloc+0x87/0x90
 [<ffffffff811240de>] ? find_get_page+0x1e/0xa0
 [<ffffffff81125697>] ? filemap_fault+0x1a7/0x500
 [<ffffffff8114eae4>] ? __do_fault+0x54/0x530
 [<ffffffff8114f0b7>] ? handle_pte_fault+0xf7/0xb00
 [<ffffffff8114fcea>] ? handle_mm_fault+0x22a/0x300
 [<ffffffff8104d0d8>] ? __do_page_fault+0x138/0x480
 [<ffffffff81041e98>] ? pvclock_clocksource_read+0x58/0xd0
 [<ffffffff8153003e>] ? do_page_fault+0x3e/0xa0
 [<ffffffff8152d3f5>] ? page_fault+0x25/0x30

[root@nfs2 ~]# gluster volume status
Status of volume: gluster_shared_storage
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.70.37.148:/rhs/brick1/d1r1-share   49156     0          Y       3549
Brick 10.70.37.77:/rhs/brick1/d1r2-share    49155     0          Y       3329
Brick 10.70.37.76:/rhs/brick1/d2r1-share    49155     0          Y       3081
Brick 10.70.37.69:/rhs/brick1/d2r2-share    49155     0          Y       3346
Brick 10.70.37.148:/rhs/brick1/d3r1-share   49157     0          Y       3566
Brick 10.70.37.77:/rhs/brick1/d3r2-share    49156     0          Y       3346
Brick 10.70.37.76:/rhs/brick1/d4r1-share    49156     0          Y       3098
Brick 10.70.37.69:/rhs/brick1/d4r2-share    49156     0          Y       3363
Brick 10.70.37.148:/rhs/brick1/d5r1-share   49158     0          Y       3583
Brick 10.70.37.77:/rhs/brick1/d5r2-share    49157     0          Y       3363
Brick 10.70.37.76:/rhs/brick1/d6r1-share    49157     0          Y       3115
Brick 10.70.37.69:/rhs/brick1/d6r2-share    49157     0          Y       3380
Self-heal Daemon on localhost               N/A       N/A        Y       25893
Self-heal Daemon on 10.70.37.69             N/A       N/A        Y       28389
Self-heal Daemon on 10.70.37.77             N/A       N/A        Y       4784
Self-heal Daemon on 10.70.37.148            N/A       N/A        Y       22717

Task Status of Volume gluster_shared_storage
------------------------------------------------------------------------------
There are no active volume tasks

Status of volume: vol2
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.70.37.148:/rhs/brick1/d1r1         49153     0          Y       22219
Brick 10.70.37.77:/rhs/brick1/d1r2          49152     0          Y       4321
Brick 10.70.37.76:/rhs/brick1/d2r1          49152     0          Y       25654
Brick 10.70.37.69:/rhs/brick1/d2r2          49152     0          Y       27914
Brick 10.70.37.148:/rhs/brick1/d3r1         49154     0          Y       18842
Brick 10.70.37.77:/rhs/brick1/d3r2          49153     0          Y       4343
Brick 10.70.37.76:/rhs/brick1/d4r1          49153     0          Y       25856
Brick 10.70.37.69:/rhs/brick1/d4r2          49153     0          Y       27934
Brick 10.70.37.148:/rhs/brick1/d5r1         49155     0          Y       22237
Brick 10.70.37.77:/rhs/brick1/d5r2          49154     0          Y       4361
Brick 10.70.37.76:/rhs/brick1/d6r1          49154     0          Y       25874
Brick 10.70.37.69:/rhs/brick1/d6r2          49154     0          Y       27952
Self-heal Daemon on localhost               N/A       N/A        Y       25893
Self-heal Daemon on 10.70.37.77             N/A       N/A        Y       4784
Self-heal Daemon on 10.70.37.69             N/A       N/A        Y       28389
Self-heal Daemon on 10.70.37.148            N/A       N/A        Y       22717

Task Status of Volume vol2
------------------------------------------------------------------------------
There are no active volume tasks
Version-Release number of selected component (if applicable):
glusterfs-3.7.0beta2-0.0.el6.x86_64
nfs-ganesha-2.2.0-0.el6.x86_64

How reproducible:
OOM kill seen on the first run.

Steps to Reproduce (a rough command sketch follows at the end of this comment):
1. Create a volume of type 6x2 (distribute-replicate) and start it.
2. Start nfs-ganesha after completing the prerequisites.
3. Mount the volume with vers=3.
4. Start executing the posix testsuite.

Actual results:
OOM kill seen for the nfs-ganesha process.

Expected results:
No OOM kill should be seen, as the test executed is only the posix testsuite.

Additional info:
BZ 1221489 was filed against the posix testsuite.
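For reference, a rough command-line sketch of the reproduction steps above. The volume name (vol2) and brick list are taken from the status output in the description; the mount point, the server used for mounting, and driving the suite with prove are my assumptions. Step 2 (starting nfs-ganesha) depends on the site-specific prerequisites and is omitted here.

# 1. create and start a 6x2 distribute-replicate volume
gluster volume create vol2 replica 2 \
    10.70.37.148:/rhs/brick1/d1r1 10.70.37.77:/rhs/brick1/d1r2 \
    10.70.37.76:/rhs/brick1/d2r1  10.70.37.69:/rhs/brick1/d2r2 \
    10.70.37.148:/rhs/brick1/d3r1 10.70.37.77:/rhs/brick1/d3r2 \
    10.70.37.76:/rhs/brick1/d4r1  10.70.37.69:/rhs/brick1/d4r2 \
    10.70.37.148:/rhs/brick1/d5r1 10.70.37.77:/rhs/brick1/d5r2 \
    10.70.37.76:/rhs/brick1/d6r1  10.70.37.69:/rhs/brick1/d6r2
gluster volume start vol2

# 3. mount the nfs-ganesha export with NFSv3 (mount point is illustrative)
mount -t nfs -o vers=3 10.70.37.148:/vol2 /mnt/vol2

# 4. run the posix testsuite from the mount point
cd /mnt/vol2
prove -r /opt/qa/tools/posix-testsuite/tests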
Created attachment 1025318 [details] sosreport of node1
Hi Saurabh,

I see the following messages on this node:

May 14 13:34:52 nfs1 setroubleshoot: SELinux is preventing /usr/sbin/glusterfsd from create access on the fifo_file fstest_a1b17951bbc68e452b02abd3cf4cf15a. For complete SELinux messages. run sealert -l cc5df174-cc02-4d36-a558-0fad09188e85
May 14 13:34:52 nfs1 setroubleshoot: SELinux is preventing /usr/sbin/glusterfsd from setattr access on the fifo_file fstest_a1b17951bbc68e452b02abd3cf4cf15a. For complete SELinux messages. run sealert -l 47bc3821-9053-4a1f-a3ea-003f1143cc54
May 14 13:34:52 nfs1 setroubleshoot: SELinux is preventing /usr/sbin/glusterfsd from getattr access on the fifo_file /rhs/brick1/d5r1/dir1/fstest_2f02ef6b423108211157842c608fbe71/fstest_a1b17951bbc68e452b02abd3cf4cf15a. For complete SELinux messages. run sealert -l 733b075a-5f77-40c9-9ce7-064b380001b2
May 14 13:34:52 nfs1 setroubleshoot: SELinux is preventing /usr/sbin/glusterfsd from link access on the fifo_file fstest_a1b17951bbc68e452b02abd3cf4cf15a. For complete SELinux messages. run sealert -l b8a0356e-54f1-43b7-833d-89c3033a65bb
May 14 13:34:52 nfs1 setroubleshoot: SELinux is preventing /usr/sbin/glusterfsd from getattr access on the fifo_file /rhs/brick1/d5r1/dir1/fstest_2f02ef6b423108211157842c608fbe71/fstest_a1b17951bbc68e452b02abd3cf4cf15a. For complete SELinux messages. run sealert -l 733b075a-5f77-40c9-9ce7-064b380001b2
May 14 13:34:53 nfs1 setroubleshoot: SELinux is preventing glusterfsd from read access on the fifo_file ec1a1d0d-0f95-4c14-a291-fe780016614f. For complete SELinux messages. run sealert -l 26cfead6-e5fb-4381-8bb3-03edae2b21d9

Can you please try this in permissive mode?
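As a side note, all of the raw AVC denials behind these setroubleshoot alerts can be listed in one go on the affected node; a minimal sketch using the standard audit tooling:

ausearch -m avc -ts today | audit2allow -w   # show the denials and why access was denied
sealert -l '*'                               # or dump every setroubleshoot alert in full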
SELinux is already in permissive mode.

[root@nfs1 ~]# sestatus
SELinux status:                 enabled
SELinuxfs mount:                /selinux
Current mode:                   permissive
Mode from config file:          permissive
Policy version:                 24
Policy from config file:        targeted
Even in permissive mode, we see that read access is being prevented. Please disable SELinux and try once. I am currently running the test suite on my 4-node setup.
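In case it helps, a minimal sketch of fully disabling SELinux on the nodes (setenforce only toggles enforcing/permissive at runtime; a config change plus a reboot is needed for a real disable):

# runtime toggle only (the node is already permissive per the sestatus output)
setenforce 0

# persistent disable: set SELINUX=disabled in the config and reboot
sed -i 's/^SELINUX=.*/SELINUX=disabled/' /etc/selinux/config
reboot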
Saurabh,

If it is glusterfsd that is being prevented from reading the attrs, what makes you file the bug against nfs-ganesha? Wouldn't this impact glusterfsd in general? If so, can you check with other folks who may already have a solution for glusterfs attr access with SELinux enabled? If not, I agree we can try and work on a policy - my only point being that this doesn't seem to be a ganesha issue.

Meghana - please don't mark the comments as private if they do not contain any confidential info.

Thanks,
Anand
I can think of one possible RCA for this issue, though I have not yet checked the logs/core. Since upcall is enabled, gfapi queues all the upcall notifications received from the server. But since the NFS-Ganesha patch which polls and cleans up these entries has not yet been merged, those queued entries may have consumed a lot of memory, resulting in the OOM kill.

Please disable upcall ('gluster vol set features.cache-invalidation off') and run the test suite again, just to confirm whether this is the case.

Thanks!
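For completeness, a sketch of turning the upcall-driven cache invalidation off on the data volume and confirming the setting afterwards (volume name vol2 as in the status output above; re-enable with 'on' once the run is done):

gluster volume set vol2 features.cache-invalidation off
gluster volume info vol2 | grep cache-invalidation   # should now show: features.cache-invalidation: off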
Created attachment 1033153 [details] posix-compliance-tests-v3-with-upcall
Created attachment 1033154 [details] posix-compliance-tests-v3-without-upcall
Created attachment 1033155 [details] posix-compliance-tests-v4-with-upcall
Created attachment 1033156 [details] posix-compliance-tests-v4-without-upcall
With the latest upstream gluster and nfs-ganesha (v2.3 dev-5) sources, I do not see any cores or OOM issues while running the posix-compliance tests. However, there are some test failures, and the results are the same with features.cache-invalidation on or off.

v3 Mount -
Test Summary Report
-------------------
/opt/qa/tools/posix-testsuite/tests/chown/00.t (Wstat: 0 Tests: 171 Failed: 1)
  Failed test:  77
Files=185, Tests=1962, 104 wallclock secs ( 0.64 usr  0.20 sys +  3.04 cusr  0.56 csys =  4.44 CPU)
Result: FAIL

v4 Mount -
Test Summary Report
-------------------
/opt/qa/tools/posix-testsuite/tests/chown/00.t (Wstat: 0 Tests: 171 Failed: 1)
  Failed test:  77
/opt/qa/tools/posix-testsuite/tests/open/07.t (Wstat: 0 Tests: 23 Failed: 3)
  Failed tests:  5, 7, 9
Files=185, Tests=1962, 106 wallclock secs ( 0.67 usr  0.17 sys +  3.06 cusr  0.57 csys =  4.47 CPU)
Result: FAIL

Looking into those specific test failures.
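To dig into just these failures, the individual scripts can be re-run in verbose mode from the mount point (a sketch; the mount point is illustrative and assumes the suite is driven with prove):

cd /mnt/vol2
prove -v /opt/qa/tools/posix-testsuite/tests/chown/00.t
prove -v /opt/qa/tools/posix-testsuite/tests/open/07.t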
With respect to the chown tests, I see the same failure with Gluster-NFS as well.

/opt/qa/tools/posix-testsuite/tests/chown/00.t
Failed test: 77

<<<<<<<
# when non-super-user calls chown(2) successfully, set-uid and set-gid bits are
# removed, except when both uid and gid are equal to -1.
# 64
expect 0 create ${n0} 0644
expect 0 chown ${n0} 65534 65533
expect 0 chmod ${n0} 06555
expect 06555 lstat ${n0} mode
expect 0 -u 65534 -g 65533,65532 chown ${n0} 65534 65532
expect 0555,65534,65532 lstat ${n0} mode,uid,gid
expect 0 chmod ${n0} 06555
expect 06555 lstat ${n0} mode
expect 0 -u 65534 -g 65533,65532 -- chown ${n0} -1 65533
expect 0555,65534,65533 lstat ${n0} mode,uid,gid
expect 0 chmod ${n0} 06555
expect 06555 lstat ${n0} mode
expect 0 -u 65534 -g 65533,65532 -- chown ${n0} -1 -1
case "${os}" in
Linux)
    echo "os = ${os}"
    expect 0555,65534,65533 lstat ${n0} mode,uid,gid
    ;;
*)
    expect 06555,65534,65533 lstat ${n0} mode,uid,gid
    ;;
esac
>>>>>>>>

The last 'expect' in this block is test #77. From the packet trace, I observed that the NFS client never sends the final chown to the server (the 'expect 0 -u 65534 -g 65533,65532 -- chown ${n0} -1 -1' step), and as a result the mode bits are never changed. This does not appear to be a server issue.

Attached the packet trace (nfsv3_chown.pcap).
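For anyone wanting to reproduce the capture, a rough sketch of how such a trace can be gathered while re-running the failing script (the interface name and mount point are assumptions; SETATTR is NFSv3 procedure 2):

tcpdump -i eth0 -w /tmp/nfsv3_chown.pcap port 2049 &
( cd /mnt/vol2 && prove -v /opt/qa/tools/posix-testsuite/tests/chown/00.t )
kill %1
# list the SETATTR calls; none is seen for the final chown(${n0}, -1, -1) step
tshark -r /tmp/nfsv3_chown.pcap -Y 'nfs.procedure_v3 == 2'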
>>>>
/opt/qa/tools/posix-testsuite/tests/open/07.t (Wstat: 0 Tests: 23 Failed: 3)
  Failed tests:  5, 7, 9

These are expected failures with NFSv4. I got the same results using kernel-NFS as well.

expect 0 -u 65534 -g 65534 create ${n1} 0644
expect 0 -u 65534 -g 65534 chmod ${n1} 0477
expect EACCES -u 65534 -g 65534 open ${n1} O_RDONLY,O_TRUNC   (FAIL)
expect 0 -u 65534 -g 65534 chmod ${n1} 0747
expect EACCES -u 65533 -g 65534 open ${n1} O_RDONLY,O_TRUNC   (FAIL)
expect 0 -u 65534 -g 65534 chmod ${n1} 0774
expect EACCES -u 65533 -g 65533 open ${n1} O_RDONLY,O_TRUNC   (FAIL)

The issue here is that the test tries to open the file with O_RDONLY|O_TRUNC, which should ideally return EACCES. But the NFSv4 client maps this to an OPEN with OPEN4_SHARE_ACCESS_READ, and since the caller has read permission, the server rightly grants it.

That means both the chown (00.t - #77) and open (07.t - #5, #7, #9) tests should be marked as Known_Issues.