Description of problem:
ganesha.nfsd crashes on an RGW node when the keyring file path is not specified. The ganesha.nfsd process tries to read the keyring from an incorrect path and then crashes.

Version-Release number of selected component (if applicable):
10.2.2-38

How reproducible:
Always

Steps to Reproduce:
1. Configure nfs-ganesha on an RGW node.
2. Without specifying a keyring path in [client.rgw.host], start the ganesha process. The process crashes because it looks for the keyring in /var/lib/ceph/radosgw/-rgw.magna059/keyring.

Actual results:
# /usr/bin/ganesha.nfsd -f /etc/ganesha/ganesha.conf -F
2016-08-19 14:42:14.796291 7f45f4f1b0c0 -1 auth: unable to find a keyring on /var/lib/ceph/radosgw/-rgw.magna059/keyring: (2) No such file or directory
2016-08-19 14:42:14.797848 7f45f4f1b0c0 -1 monclient(hunting): authenticate NOTE: no keyring found; disabled cephx authentication
2016-08-19 14:42:14.798319 7f45f4f1b0c0 -1 Couldn't init storage provider (RADOS)
*** Caught signal (Segmentation fault) **
 in thread 7f45f4f1b0c0 thread_name:ganesha.nfsd
 ceph version 10.2.2-38.el7cp (119a68752a5671253f9daae3f894a90313a6b8e4)
 1: (()+0x554daa) [0x7f45e7a92daa]
 2: (()+0xf100) [0x7f45f3504100]
 3: (std::string::assign(std::string const&)+0x20) [0x7f45dbc9efc0]
 4: (rgw_obj::rgw_obj(rgw_bucket&, std::string const&)+0x75) [0x7f45e7817805]
 5: (rgw_get_system_obj(RGWRados*, RGWObjectCtx&, rgw_bucket&, std::string const&, ceph::buffer::list&, RGWObjVersionTracker*, std::chrono::time_point<ceph::time_detail::real_clock, std::chrono::duration<unsigned long, std::ratio<1l, 1000000000l> > >*, std::map<std::string, ceph::buffer::list, std::less<std::string>, std::allocator<std::pair<std::string const, ceph::buffer::list> > >*, rgw_cache_entry_info*)+0x89) [0x7f45e781cc29]
 6: (rgw_get_user_info_from_index(RGWRados*, std::string&, rgw_bucket&, RGWUserInfo&, RGWObjVersionTracker*, std::chrono::time_point<ceph::time_detail::real_clock, std::chrono::duration<unsigned long, std::ratio<1l, 1000000000l> > >*)+0x367) [0x7f45e7a1ca77]
 7: (rgw::RGWLibFS::authorize(RGWRados*)+0x48) [0x7f45e7a44ff8]
 8: (rgw_mount()+0x5d) [0x7f45e7a366ed]
 9: (()+0x2260) [0x7f45f117d260]
 10: (()+0xf65f6) [0x7f45f502d5f6]
 11: (()+0x11a695) [0x7f45f5051695]
 12: (()+0x119a20) [0x7f45f5050a20]
 13: (load_config_from_parse()+0xd0) [0x7f45f5052100]
 14: (ReadExports()+0x5b) [0x7f45f502e35b]
 15: (main()+0xafc) [0x7f45f4f6425c]
 16: (__libc_start_main()+0xf5) [0x7f45f2af5b15]
 17: (()+0x2d331) [0x7f45f4f64331]
2016-08-19 14:42:14.800282 7f45f4f1b0c0 -1 *** Caught signal (Segmentation fault) **
 in thread 7f45f4f1b0c0 thread_name:ganesha.nfsd
 ceph version 10.2.2-38.el7cp (119a68752a5671253f9daae3f894a90313a6b8e4)
 1: (()+0x554daa) [0x7f45e7a92daa]
 2: (()+0xf100) [0x7f45f3504100]
 3: (std::string::assign(std::string const&)+0x20) [0x7f45dbc9efc0]
 4: (rgw_obj::rgw_obj(rgw_bucket&, std::string const&)+0x75) [0x7f45e7817805]
 5: (rgw_get_system_obj(RGWRados*, RGWObjectCtx&, rgw_bucket&, std::string const&, ceph::buffer::list&, RGWObjVersionTracker*, std::chrono::time_point<ceph::time_detail::real_clock, std::chrono::duration<unsigned long, std::ratio<1l, 1000000000l> > >*, std::map<std::string, ceph::buffer::list, std::less<std::string>, std::allocator<std::pair<std::string const, ceph::buffer::list> > >*, rgw_cache_entry_info*)+0x89) [0x7f45e781cc29]
 6: (rgw_get_user_info_from_index(RGWRados*, std::string&, rgw_bucket&, RGWUserInfo&, RGWObjVersionTracker*, std::chrono::time_point<ceph::time_detail::real_clock, std::chrono::duration<unsigned long, std::ratio<1l, 1000000000l> > >*)+0x367) [0x7f45e7a1ca77]
 7: (rgw::RGWLibFS::authorize(RGWRados*)+0x48) [0x7f45e7a44ff8]
 8: (rgw_mount()+0x5d) [0x7f45e7a366ed]
 9: (()+0x2260) [0x7f45f117d260]
 10: (()+0xf65f6) [0x7f45f502d5f6]
 11: (()+0x11a695) [0x7f45f5051695]
 12: (()+0x119a20) [0x7f45f5050a20]
 13: (load_config_from_parse()+0xd0) [0x7f45f5052100]
 14: (ReadExports()+0x5b) [0x7f45f502e35b]
 15: (main()+0xafc) [0x7f45f4f6425c]
 16: (__libc_start_main()+0xf5) [0x7f45f2af5b15]
 17: (()+0x2d331) [0x7f45f4f64331]

I am not able to reproduce this issue when librgw.so/Ceph is from master. I will attempt to reproduce it on upstream Jewel.
A possible reproducer for this was found and fixed in upstream nfs-ganesha: commit edf4f579 ("RGW: failing to bind to librados should be caught") is the candidate fix.

QE, can you retest with a recent nfs-ganesha?
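
For readers following along, the general pattern the commit title points at (catch a failed bind to the storage backend instead of continuing into code that dereferences it) can be sketched as below. This is only an illustration, not the actual patch: `Store`, `init_store`, and `mount_export` are hypothetical names, and the real code paths in librgw and the nfs-ganesha RGW FSAL may be structured differently.

```cpp
#include <cerrno>
#include <iostream>
#include <memory>
#include <string>

// Hypothetical stand-in for the RADOS-backed store handle (not the real RGWRados).
struct Store {
    std::string user_index;
};

// Hypothetical init step: returns nullptr when no keyring is available,
// analogous to the "Couldn't init storage provider (RADOS)" log line.
std::unique_ptr<Store> init_store(bool keyring_found) {
    if (!keyring_found) {
        return nullptr;
    }
    return std::unique_ptr<Store>(new Store{"users.uid"});
}

// Hypothetical mount step: checks the handle before using it and reports
// a negative errno to the caller instead of dereferencing a null pointer.
int mount_export(const Store* store) {
    if (store == nullptr) {
        std::cerr << "storage provider not initialized; refusing to mount\n";
        return -EINVAL;
    }
    // The handle is valid here, so reading from it is safe.
    return store->user_index.empty() ? -ENOENT : 0;
}
```

With this shape, a caller that fails to initialize the store gets an error code back from the mount step and can log and exit cleanly, rather than crashing with a segmentation fault later in the authorization path.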
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://rhn.redhat.com/errata/RHBA-2017-0514.html