Description of problem:
ganesha.nfsd crashes on an RGW node when the keyring file path is not specified. The ganesha.nfsd process looks for the keyring at an incorrect path, fails to initialize the RADOS storage provider, and then segfaults.
Version-Release number of selected component (if applicable):
10.2.2-38
How reproducible:
Always
Steps to Reproduce:
1. Configure nfs-ganesha on an RGW node
2. Without specifying a keyring path in the [client.rgw.host] section, start the ganesha.nfsd process. The process crashes because it looks for the keyring at /var/lib/ceph/radosgw/-rgw.magna059/keyring.
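To confirm the precondition in step 2 before starting ganesha.nfsd, a small pre-flight check can report whether a readable keyring exists at the path the process will try. This is only a diagnostic sketch: the helper name find_keyring is hypothetical, and the candidate path is copied from the log in this report, so it will differ on other hosts.

```python
import os


def find_keyring(candidates):
    """Return the first candidate path that is a readable regular file, else None.

    Hypothetical helper for illustration; it is not part of Ceph or nfs-ganesha.
    """
    for path in candidates:
        if os.path.isfile(path) and os.access(path, os.R_OK):
            return path
    return None


if __name__ == "__main__":
    # Path taken from the log output in this report (assumption: adjust per host).
    candidates = ["/var/lib/ceph/radosgw/-rgw.magna059/keyring"]
    found = find_keyring(candidates)
    if found is None:
        print("no readable keyring found at:", ", ".join(candidates))
    else:
        print("readable keyring found:", found)
```

If the check reports no readable keyring, the run should hit the same "unable to find a keyring" message shown under Actual results.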
Actual results:
# /usr/bin/ganesha.nfsd -f /etc/ganesha/ganesha.conf -F
2016-08-19 14:42:14.796291 7f45f4f1b0c0 -1 auth: unable to find a keyring on /var/lib/ceph/radosgw/-rgw.magna059/keyring: (2) No such file or directory
2016-08-19 14:42:14.797848 7f45f4f1b0c0 -1 monclient(hunting): authenticate NOTE: no keyring found; disabled cephx authentication
2016-08-19 14:42:14.798319 7f45f4f1b0c0 -1 Couldn't init storage provider (RADOS)
*** Caught signal (Segmentation fault) **
in thread 7f45f4f1b0c0 thread_name:ganesha.nfsd
ceph version 10.2.2-38.el7cp (119a68752a5671253f9daae3f894a90313a6b8e4)
1: (()+0x554daa) [0x7f45e7a92daa]
2: (()+0xf100) [0x7f45f3504100]
3: (std::string::assign(std::string const&)+0x20) [0x7f45dbc9efc0]
4: (rgw_obj::rgw_obj(rgw_bucket&, std::string const&)+0x75) [0x7f45e7817805]
5: (rgw_get_system_obj(RGWRados*, RGWObjectCtx&, rgw_bucket&, std::string const&, ceph::buffer::list&, RGWObjVersionTracker*, std::chrono::time_point<ceph::time_detail::real_clock, std::chrono::duration<unsigned long, std::ratio<1l, 1000000000l> > >*, std::map<std::string, ceph::buffer::list, std::less<std::string>, std::allocator<std::pair<std::string const, ceph::buffer::list> > >*, rgw_cache_entry_info*)+0x89) [0x7f45e781cc29]
6: (rgw_get_user_info_from_index(RGWRados*, std::string&, rgw_bucket&, RGWUserInfo&, RGWObjVersionTracker*, std::chrono::time_point<ceph::time_detail::real_clock, std::chrono::duration<unsigned long, std::ratio<1l, 1000000000l> > >*)+0x367) [0x7f45e7a1ca77]
7: (rgw::RGWLibFS::authorize(RGWRados*)+0x48) [0x7f45e7a44ff8]
8: (rgw_mount()+0x5d) [0x7f45e7a366ed]
9: (()+0x2260) [0x7f45f117d260]
10: (()+0xf65f6) [0x7f45f502d5f6]
11: (()+0x11a695) [0x7f45f5051695]
12: (()+0x119a20) [0x7f45f5050a20]
13: (load_config_from_parse()+0xd0) [0x7f45f5052100]
14: (ReadExports()+0x5b) [0x7f45f502e35b]
15: (main()+0xafc) [0x7f45f4f6425c]
16: (__libc_start_main()+0xf5) [0x7f45f2af5b15]
17: (()+0x2d331) [0x7f45f4f64331]
2016-08-19 14:42:14.800282 7f45f4f1b0c0 -1 *** Caught signal (Segmentation fault) **
in thread 7f45f4f1b0c0 thread_name:ganesha.nfsd
ceph version 10.2.2-38.el7cp (119a68752a5671253f9daae3f894a90313a6b8e4)
1: (()+0x554daa) [0x7f45e7a92daa]
2: (()+0xf100) [0x7f45f3504100]
3: (std::string::assign(std::string const&)+0x20) [0x7f45dbc9efc0]
4: (rgw_obj::rgw_obj(rgw_bucket&, std::string const&)+0x75) [0x7f45e7817805]
5: (rgw_get_system_obj(RGWRados*, RGWObjectCtx&, rgw_bucket&, std::string const&, ceph::buffer::list&, RGWObjVersionTracker*, std::chrono::time_point<ceph::time_detail::real_clock, std::chrono::duration<unsigned long, std::ratio<1l, 1000000000l> > >*, std::map<std::string, ceph::buffer::list, std::less<std::string>, std::allocator<std::pair<std::string const, ceph::buffer::list> > >*, rgw_cache_entry_info*)+0x89) [0x7f45e781cc29]
6: (rgw_get_user_info_from_index(RGWRados*, std::string&, rgw_bucket&, RGWUserInfo&, RGWObjVersionTracker*, std::chrono::time_point<ceph::time_detail::real_clock, std::chrono::duration<unsigned long, std::ratio<1l, 1000000000l> > >*)+0x367) [0x7f45e7a1ca77]
7: (rgw::RGWLibFS::authorize(RGWRados*)+0x48) [0x7f45e7a44ff8]
8: (rgw_mount()+0x5d) [0x7f45e7a366ed]
9: (()+0x2260) [0x7f45f117d260]
10: (()+0xf65f6) [0x7f45f502d5f6]
11: (()+0x11a695) [0x7f45f5051695]
12: (()+0x119a20) [0x7f45f5050a20]
13: (load_config_from_parse()+0xd0) [0x7f45f5052100]
14: (ReadExports()+0x5b) [0x7f45f502e35b]
15: (main()+0xafc) [0x7f45f4f6425c]
16: (__libc_start_main()+0xf5) [0x7f45f2af5b15]
17: (()+0x2d331) [0x7f45f4f64331]
Comment 4 Matt Benjamin (redhat)
2016-09-08 02:00:10 UTC
I am not able to reproduce this issue when librgw.so/Ceph is at master. I will attempt to reproduce it on upstream Jewel.
Comment 5 Matt Benjamin (redhat)
2016-09-20 18:17:33 UTC
A possible reproducer for this was found and fixed on upstream nfs-ganesha:
upstream nfs-ganesha commit edf4f579 ("RGW: failing to bind to librados should be caught") is the candidate fix
QE, can you folks retest with a recent nfs-ganesha?
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.
For information on the advisory, and where to find the updated
files, follow the link below.
If the solution does not work for you, open a new bug report.
https://rhn.redhat.com/errata/RHBA-2017-0514.html