Bug 1415981

Summary: NFS segfault seen while creating multiple directory levels with different file size
Product: [Red Hat Storage] Red Hat Ceph Storage
Reporter: Hemanth Kumar <hyelloji>
Component: RGW
Assignee: Matt Benjamin (redhat) <mbenjamin>
Status: CLOSED ERRATA
QA Contact: Hemanth Kumar <hyelloji>
Severity: high
Docs Contact:
Priority: high
Version: 2.2
CC: cbodley, ceph-eng-bugs, hnallurv, kbader, mbenjamin, owasserm, sweil, tserlin
Target Milestone: rc   
Target Release: 2.2   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version:
  RHEL: ceph-10.2.5-26.el7cp, nfs-ganesha-2.4.2-5.el7cp
  Ubuntu: ceph_10.2.5-18redhat1, nfs-ganesha_2.4.2-5redhat1
Doc Type: If docs needed, set a value
Last Closed: 2017-03-14 15:48:22 UTC
Type: Bug

Description Hemanth Kumar 2017-01-24 10:04:43 UTC
Description of problem:
-----------------------
Installed crefi on the NFS Client where the NFS Share is mounted.
(https://github.com/vijaykumar-koppad/Crefi)

Ran a command on the mount point that creates a directory layout 10 levels deep with 10 directories at each level, and creates files of size 100K in each directory:

    crefi --multi -b 10 -d 10 -n 1000 --size=100 /hell/folder1/

A segfault was seen and the nfs-ganesha daemon crashed.
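
For anyone without crefi at hand, the directory layout described above can be approximated with a short script. This is only a sketch under assumptions: `create_tree` and its parameters are hypothetical names, it builds a full tree (so the directory count grows as breadth^depth), and crefi's actual layout and flag semantics may differ. Use small values when trying it against an NFS mount.

```python
import os
import tempfile


def create_tree(root, breadth, depth, files_per_dir, file_size_bytes):
    """Create `breadth` subdirectories per level, `depth` levels below root,
    and write `files_per_dir` files of `file_size_bytes` into every directory.

    Hypothetical stand-in for the crefi workload, not crefi's algorithm.
    Returns (directories_created, files_created), counting root itself.
    """
    dirs_created = 0
    files_created = 0
    payload = b"\0" * file_size_bytes
    stack = [(root, 0)]
    while stack:
        path, level = stack.pop()
        os.makedirs(path, exist_ok=True)
        dirs_created += 1
        # Each file creation on an NFS mount becomes an OPEN with create.
        for i in range(files_per_dir):
            with open(os.path.join(path, f"file{i:04d}"), "wb") as fh:
                fh.write(payload)
            files_created += 1
        if level < depth:
            for j in range(breadth):
                child = os.path.join(path, f"level{level:02d}_dir{j:02d}")
                stack.append((child, level + 1))
    return dirs_created, files_created


if __name__ == "__main__":
    # Small demo in a temporary directory; point `root` at the mount to test.
    with tempfile.TemporaryDirectory() as tmp:
        print(create_tree(os.path.join(tmp, "folder1"), breadth=2, depth=3,
                          files_per_dir=2, file_size_bytes=100 * 1024))
```

The point of such a workload is to create many distinct file handles in a short time; whether that alone triggers the crash on a given build is not confirmed here.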


    -5> 2017-01-24 09:44:17.933817 7fd402fe5700  1 -- 10.8.128.20:0/3422557612 --> 10.8.128.16:6800/21443 -- osd_op(client.14300.0:385802 5.4f687131 2017-01-24-09-e825fa53-71c5-4242-8456-6dce7981f130.14300.1-folder1 [append 0~326] snapc 0=[] ack+ondisk+write+known_if_redirected e68) v7 -- ?+0 0x7fd378006d30 con 0x7fd575685710
    -4> 2017-01-24 09:44:17.933837 7fd402fe5700  2 req 0:0.003875:: :put_obj:http status=200
    -3> 2017-01-24 09:44:17.933842 7fd402fe5700  1 ====== process_request req done req=0x7fd402fe2cf0 http_status=200 ======
    -2> 2017-01-24 09:44:17.935233 7fd5271f8700  1 -- 10.8.128.20:0/3422557612 <== osd.3 10.8.128.16:6800/21443 107250 ==== osd_op_reply(385802 2017-01-24-09-e825fa53-71c5-4242-8456-6dce7981f130.14300.1-folder1 [append 0~326] v68'37227 uv37227 ack = 0) v7 ==== 186+0+0 (723282027 0 0) 0x7fd5000bc5a0 con 0x7fd575685710
    -1> 2017-01-24 09:44:17.935273 7fd5271f8700  1 -- 10.8.128.20:0/3422557612 <== osd.3 10.8.128.16:6800/21443 107251 ==== osd_op_reply(385802 2017-01-24-09-e825fa53-71c5-4242-8456-6dce7981f130.14300.1-folder1 [append 0~326] v68'37227 uv37227 ondisk = 0) v7 ==== 186+0+0 (953469729 0 0) 0x7fd5000bc5a0 con 0x7fd575685710
     0> 2017-01-24 09:44:17.935835 7fd402fe5700 -1 *** Caught signal (Segmentation fault) **
 in thread 7fd402fe5700 thread_name:ganesha.nfsd

 ceph version 10.2.5-7.el7cp (59e9fee4a935fdd2bc8197e07596dc4313c410a3)
 1: (()+0x56ecda) [0x7fd56539ecda]
 2: (()+0xf370) [0x7fd571bf4370]
 3: (rgw::RGWFileHandle::reclaim()+0x11f) [0x7fd5653381cf]
 4: (cohort::lru::LRU<std::mutex>::evict_block()+0x102) [0x7fd565345a82]
 5: (rgw::RGWLibFS::lookup_fh(rgw::RGWFileHandle*, char const*, unsigned int)+0x362) [0x7fd56534dd22]
 6: (rgw::RGWLibFS::create(rgw::RGWFileHandle*, char const*, stat*, unsigned int, unsigned int)+0x50d) [0x7fd56533d51d]
 7: (rgw_create()+0x7b) [0x7fd56533d9eb]
 8: (rgw_fsal_open2()+0x1a5) [0x7fd56ea8e695]
 9: (mdcache_open2()+0x34f) [0x7fd57373424f]
 10: (open2_by_name()+0x119) [0x7fd573667029]
 11: (fsal_open2()+0x118) [0x7fd573669df8]
 12: (()+0x2b4c6) [0x7fd5736554c6]
 13: (nfs4_op_open()+0xaa9) [0x7fd57369da89]
 14: (nfs4_Compound()+0x63d) [0x7fd57368ffcd]
 15: (nfs_rpc_execute()+0x5bc) [0x7fd57368117c]
 16: (()+0x587da) [0x7fd5736827da]
 17: (()+0xe2459) [0x7fd57370c459]
 18: (()+0x7dc5) [0x7fd571becdc5]
 19: (clone()+0x6d) [0x7fd5712bb73d]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.



Version-Release number of selected component (if applicable):
------------------------------------------------------------
ceph version 10.2.5-7.el7cp

Actual results:
-----------------
Segfault seen while creating multiple directory levels.

Expected results:
------------------
Directory and file creation should complete successfully, without crashing nfs-ganesha.

Comment 9 Hemanth Kumar 2017-02-08 13:51:38 UTC
Hit the crash again in the latest builds:
ceph-radosgw-10.2.5-22.el7cp.x86_64
nfs-ganesha-2.4.2-4.el7cp.x86_64
nfs-ganesha-rgw-2.4.2-4.el7cp.x86_64


     0> 2017-02-08 13:34:36.439879 7fb2c47b8700 -1 *** Caught signal (Segmentation fault) **
 in thread 7fb2c47b8700 thread_name:ganesha.nfsd

 ceph version 10.2.5-22.el7cp (5cec6848b914e87dd6178e559dedae8a37cc08a3)
 1: (()+0x57322a) [0x7fb442ea822a]
 2: (()+0xf370) [0x7fb44f6fd370]
 3: (rgw::RGWFileHandle::reclaim()+0x1d7) [0x7fb442e42477]
 4: (cohort::lru::LRU<std::mutex>::evict_block()+0x102) [0x7fb442e50092]
 5: (rgw::RGWLibFS::lookup_fh(rgw::RGWFileHandle*, char const*, unsigned int)+0x393) [0x7fb442e571e3]
 6: (rgw::RGWLibFS::create(rgw::RGWFileHandle*, char const*, stat*, unsigned int, unsigned int)+0x327) [0x7fb442e47877]
 7: (rgw_create()+0x7b) [0x7fb442e47ceb]
 8: (rgw_fsal_open2()+0x1a5) [0x7fb44c597885]
 9: (mdcache_open2()+0x34f) [0x7fb45123e14f]
 10: (open2_by_name()+0x119) [0x7fb45116fe49]
 11: (fsal_open2()+0x118) [0x7fb451172c18]
 12: (()+0x2b386) [0x7fb45115e386]
 13: (nfs4_op_open()+0xaa9) [0x7fb4511a68a9]
 14: (nfs4_Compound()+0x63d) [0x7fb451198ded]
 15: (nfs_rpc_execute()+0x5bc) [0x7fb451189f9c]
 16: (()+0x585fa) [0x7fb45118b5fa]
 17: (()+0xe2289) [0x7fb451215289]
 18: (()+0x7dc5) [0x7fb44f6f5dc5]
 19: (clone()+0x6d) [0x7fb44edc473d]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.



Reopening the BZ.
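
To check that the two backtraces in this report follow the same call path, the frame lines can be parsed and compared by symbol name. The following is a minimal sketch: `parse_frames` and `resolved_symbols` are hypothetical helper names, the regular expression assumes the frame format exactly as printed above, and only three frames from each trace are included as sample input.

```python
import re

# Matches frame lines such as:
#   3: (rgw::RGWFileHandle::reclaim()+0x11f) [0x7fd5653381cf]
FRAME_RE = re.compile(
    r"^\s*(\d+): \((.*)\+(0x[0-9a-fA-F]+)\) \[(0x[0-9a-fA-F]+)\]\s*$"
)


def parse_frames(backtrace):
    """Return a list of (index, symbol, offset, address) tuples.

    Unresolved frames, printed as "()", get symbol None.
    """
    frames = []
    for line in backtrace.splitlines():
        match = FRAME_RE.match(line)
        if not match:
            continue  # skip log lines, the version banner, and the NOTE line
        index, symbol, offset, address = match.groups()
        frames.append((int(index), None if symbol == "()" else symbol,
                       int(offset, 16), int(address, 16)))
    return frames


def resolved_symbols(frames):
    """Symbol names of the resolved frames, innermost first."""
    return [symbol for _, symbol, _, _ in frames if symbol is not None]


# Frames 2-4 copied from the two backtraces in this report.
FIRST = """
 2: (()+0xf370) [0x7fd571bf4370]
 3: (rgw::RGWFileHandle::reclaim()+0x11f) [0x7fd5653381cf]
 4: (cohort::lru::LRU<std::mutex>::evict_block()+0x102) [0x7fd565345a82]
"""
SECOND = """
 2: (()+0xf370) [0x7fb44f6fd370]
 3: (rgw::RGWFileHandle::reclaim()+0x1d7) [0x7fb442e42477]
 4: (cohort::lru::LRU<std::mutex>::evict_block()+0x102) [0x7fb442e50092]
"""

if __name__ == "__main__":
    same_path = (resolved_symbols(parse_frames(FIRST))
                 == resolved_symbols(parse_frames(SECOND)))
    print("same call path:", same_path)
```

Comparing symbols rather than addresses matters because the offsets within `reclaim()` differ between the two builds (0x11f and 0x1d7) even though both crashes are reached through `evict_block()`.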

Comment 15 Hemanth Kumar 2017-02-23 10:08:09 UTC
No Segfault seen while creating multiple directory levels.
Moving to Verified

Comment 17 errata-xmlrpc 2017-03-14 15:48:22 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2017-0514.html