Bug 1661583

Summary: ceph-volume crashes while making OSD
Product: [Fedora] Fedora
Reporter: Tomasz Torcz <tomek>
Component: ceph
Assignee: Boris Ranto <branto>
Status: CLOSED RAWHIDE
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: rawhide
CC: branto, danmick, david, fedora, i, josef, kkeithle, ramkrsna, steve
Target Milestone: ---
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2019-03-13 11:22:18 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Tomasz Torcz 2018-12-21 16:39:41 UTC
Description of problem:
# ceph-volume lvm prepare --data /dev/sdd           
Running command: /bin/ceph-authtool --gen-print-key                                                                                                                           
Running command: /bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring -i - osd new 7faf689b-b1dd-4f5b-8d9a-dcb063949dda     
Running command: /usr/sbin/vgcreate --force --yes ceph-db3893ef-93db-4b3f-a80e-11cca7911ba1 /dev/sdd                                                                          
 stdout: Physical volume "/dev/sdd" successfully created.                                                                                                                     
 stdout: Volume group "ceph-db3893ef-93db-4b3f-a80e-11cca7911ba1" successfully created
Running command: /usr/sbin/lvcreate --yes -l 100%FREE -n osd-block-7faf689b-b1dd-4f5b-8d9a-dcb063949dda ceph-db3893ef-93db-4b3f-a80e-11cca7911ba1
 stdout: Logical volume "osd-block-7faf689b-b1dd-4f5b-8d9a-dcb063949dda" created.                                                                                             
Running command: /bin/ceph-authtool --gen-print-key                                                                                                                           
Running command: /bin/mount -t tmpfs tmpfs /var/lib/ceph/osd/ceph-4
Running command: /usr/sbin/restorecon /var/lib/ceph/osd/ceph-4
Running command: /bin/chown -h ceph:ceph /dev/ceph-db3893ef-93db-4b3f-a80e-11cca7911ba1/osd-block-7faf689b-b1dd-4f5b-8d9a-dcb063949dda                                        
Running command: /bin/chown -R ceph:ceph /dev/dm-0                                                                                                                            
Running command: /bin/ln -s /dev/ceph-db3893ef-93db-4b3f-a80e-11cca7911ba1/osd-block-7faf689b-b1dd-4f5b-8d9a-dcb063949dda /var/lib/ceph/osd/ceph-4/block                      
Running command: /bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring mon getmap -o /var/lib/ceph/osd/ceph-4/activate.monmap
 stderr: /bin/ceph:128: DeprecationWarning: Using or importing the ABCs from 'collections' instead of from 'collections.abc' is deprecated, and in 3.8 it will stop working
  import rados                                                                                                                                                                
got monmap epoch 8
Running command: /bin/ceph-authtool /var/lib/ceph/osd/ceph-4/keyring --create-keyring --name osd.4 --add-key AQBi/BxcgL4tNRAA1ncksjAiwRFwsCZXvLbgAw==                         
 stdout: creating /var/lib/ceph/osd/ceph-4/keyring                                                                                                                            
 stdout: added entity osd.4 auth auth(key=AQBi/BxcgL4tNRAA1ncksjAiwRFwsCZXvLbgAw== with 0 caps)                                                                               
Running command: /bin/chown -R ceph:ceph /var/lib/ceph/osd/ceph-4/keyring
Running command: /bin/chown -R ceph:ceph /var/lib/ceph/osd/ceph-4/                                         
Running command: /bin/ceph-osd --cluster ceph --osd-objectstore bluestore --mkfs -i 4 --monmap /var/lib/ceph/osd/ceph-4/activate.monmap --keyfile - --osd-data /var/lib/ceph/osd/ceph-4/ --osd-uuid 7faf689b-b1dd-4f5b-8d9a-dcb063949dda --setuser ceph --setgroup ceph                                                                                      
 stdout: /usr/include/c++/8/bits/basic_string.h:1048: std::__cxx11::basic_string<_CharT, _Traits, _Alloc>::const_reference std::__cxx11::basic_string<_CharT, _Traits, _Alloc>::operator[](std::__cxx11::basic_string<_CharT, _Traits, _Alloc>::size_type) const [with _CharT = char; _Traits = std::char_traits<char>; _Alloc = std::allocator<char>; std::__cxx11::basic_string<_CharT, _Traits, _Alloc>::const_reference = const char&; std::__cxx11::basic_string<_CharT, _Traits, _Alloc>::size_type = long unsigned int]: Assertion '__pos <= size()' failed.
 stderr: 2018-12-21 15:46:00.788 7fe6fa91a740 -1 bluestore(/var/lib/ceph/osd/ceph-4/) _read_fsid unparsable uuid                                                              
 stderr: *** Caught signal (Aborted) **
 stderr: in thread 7fe6fa91a740 thread_name:ceph-osd                                                                                                                          
 stderr: ceph version 14.0.1 (5f51cd286b747b1729006a5b98fb08b1b646237a) nautilus (dev)                                                                                        
 stderr: 1: (()+0x13030) [0x7fe6fb05e030]                                                                                                                                     
 stderr: 2: (gsignal()+0x10f) [0x7fe6fab6800f]                                                                                                                                
 stderr: 3: (abort()+0x127) [0x7fe6fab52895]                                                                                                                                  
 stderr: 4: (trim(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)+0x1d0) [0x555d1a6f8220]                                             
 stderr: 5: (get_str_map(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > >*, char const*)+0x200) [0x555d1a6f85d0]
 stderr: 6: (BlueStore::_open_db(bool, bool)+0x12de) [0x555d1a3f033e]
 stderr: 7: (BlueStore::mkfs()+0x102f) [0x555d1a4473ef]                                                                                                                       
 stderr: 8: (OSD::mkfs(CephContext*, ObjectStore*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, uuid_d, int)+0x174) [0x555d19f6aa54]
 stderr: 9: (main()+0x15b9) [0x555d19e73259]                                                                                                                                  
 stderr: 10: (__libc_start_main()+0xf3) [0x7fe6fab53ee3]                                                                                                                      
 stderr: 11: (_start()+0x2e) [0x555d19f4a84e]                                         
 stderr: 2018-12-21 15:46:01.590 7fe6fa91a740 -1 *** Caught signal (Aborted) **                                                                  
 stderr: in thread 7fe6fa91a740 thread_name:ceph-osd                                                                                                                          
 stderr: ceph version 14.0.1 (5f51cd286b747b1729006a5b98fb08b1b646237a) nautilus (dev)                
 stderr: 1: (()+0x13030) [0x7fe6fb05e030]
 stderr: 2: (gsignal()+0x10f) [0x7fe6fab6800f]                
 stderr: 3: (abort()+0x127) [0x7fe6fab52895]                                                                                                                                  
 stderr: 4: (trim(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)+0x1d0) [0x555d1a6f8220]                                             
 stderr: 5: (get_str_map(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > >*, char const*)+0x200) [0x555d1a6f85d0]
 stderr: 6: (BlueStore::_open_db(bool, bool)+0x12de) [0x555d1a3f033e]
 stderr: 7: (BlueStore::mkfs()+0x102f) [0x555d1a4473ef]                                                                                                                       
 stderr: 8: (OSD::mkfs(CephContext*, ObjectStore*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, uuid_d, int)+0x174) [0x555d19f6aa54]
 stderr: 9: (main()+0x15b9) [0x555d19e73259]                                                                                                                                  
 stderr: 10: (__libc_start_main()+0xf3) [0x7fe6fab53ee3]                 
 stderr: 11: (_start()+0x2e) [0x555d19f4a84e]                                                              
 stderr: NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.                                                                           
 stderr: -15> 2018-12-21 15:46:00.788 7fe6fa91a740 -1 bluestore(/var/lib/ceph/osd/ceph-4/) _read_fsid unparsable uuid                                                         
 stderr: 0> 2018-12-21 15:46:01.590 7fe6fa91a740 -1 *** Caught signal (Aborted) **                                                                                             
 stderr: in thread 7fe6fa91a740 thread_name:ceph-osd                                                                                                                           
 stderr: ceph version 14.0.1 (5f51cd286b747b1729006a5b98fb08b1b646237a) nautilus (dev)                                                                                         
 stderr: 1: (()+0x13030) [0x7fe6fb05e030]                                                    
 stderr: 2: (gsignal()+0x10f) [0x7fe6fab6800f]                                                                                                                                
 stderr: 3: (abort()+0x127) [0x7fe6fab52895]
 stderr: 4: (trim(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)+0x1d0) [0x555d1a6f8220]                                             
 stderr: 5: (get_str_map(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > >*, char const*)+0x200) [0x555d1a6f85d0]
 stderr: 6: (BlueStore::_open_db(bool, bool)+0x12de) [0x555d1a3f033e]                                                                                                         
 stderr: 7: (BlueStore::mkfs()+0x102f) [0x555d1a4473ef]                                                                                                                        
 stderr: 8: (OSD::mkfs(CephContext*, ObjectStore*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, uuid_d, int)+0x174) [0x555d19f6aa54]
 stderr: 9: (main()+0x15b9) [0x555d19e73259]                                                                                                                                   
 stderr: 10: (__libc_start_main()+0xf3) [0x7fe6fab53ee3]                                                                                                                      
 stderr: 11: (_start()+0x2e) [0x555d19f4a84e]                        
 stderr: NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.                                                                          
--> Was unable to complete a new OSD, will rollback changes                                                                                                                    
Running command: /bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring osd purge-new osd.4 --yes-i-really-mean-it           
 stderr: /bin/ceph:128: DeprecationWarning: Using or importing the ABCs from 'collections' instead of from 'collections.abc' is deprecated, and in 3.8 it will stop working   
  import rados                                                                        
purged osd.4                                                                                                                                     
-->  RuntimeError: Command failed with exit code 250: /bin/ceph-osd --cluster ceph --osd-objectstore bluestore --mkfs -i 4 --monmap /var/lib/ceph/osd/ceph-4/activate.monmap --keyfile - --osd-data /var/lib/ceph/osd/ceph-4/ --osd-uuid 7faf689b-b1dd-4f5b-8d9a-dcb063949dda --setuser ceph --setgroup ceph
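
The backtrace above aborts inside trim(), reached from get_str_map() via BlueStore::_open_db(), and the stdout line shows the libstdc++ assertion '__pos <= size()' from basic_string::operator[]. That check fires when a string is indexed past its end, and it is only active when the library's assertions are enabled at build time. A trim helper can hit it if it indexes before checking bounds, for instance when handed an empty string. This is an assumption about the failure mode, not a reading of the Ceph source. As a minimal sketch, a bounds-safe trim might look like the following; safe_trim is a hypothetical name and this is not the upstream fix.

```cpp
#include <cctype>
#include <string>

// Hypothetical bounds-safe trim; NOT the Ceph implementation.
// Every index is checked against the bounds before the string is read,
// so an empty or all-whitespace input never touches s[pos] with pos >= size().
std::string safe_trim(const std::string& s) {
    std::size_t start = 0;
    std::size_t end = s.size();  // one past the last character to keep

    // Skip leading whitespace; the bound is tested before indexing.
    while (start < end && std::isspace(static_cast<unsigned char>(s[start]))) {
        ++start;
    }
    // Skip trailing whitespace; end - 1 is valid because end > start >= 0.
    while (end > start && std::isspace(static_cast<unsigned char>(s[end - 1]))) {
        --end;
    }
    return s.substr(start, end - start);
}
```

Using a half-open range [start, end) avoids computing size() - 1, which wraps around to a huge unsigned value when the string is empty.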


Version-Release number of selected component (if applicable):
ceph-osd-14.0.1-2.fc30.x86_64

How reproducible:
Always

Steps to Reproduce:
1. Try to create an OSD using ceph-volume, for example: ceph-volume lvm prepare --data /dev/sdd

Actual results:


Expected results:


Additional info:

Comment 1 Tomasz Torcz 2019-01-09 19:56:29 UTC
BTW, I've downgraded the packages to ceph-osd-13.2.4-0.el7.x86_64 from the upstream Ceph repository. They work: new OSDs can be created and the old ones function.

Comment 2 Tomasz Torcz 2019-03-07 20:37:57 UTC
According to the upstream bug you forwarded this report to (thanks for that, btw), the fix was merged to master on March 3rd. The Ceph package was rebased to 14.1 in February. Would it be possible to include the fix - https://github.com/ceph/ceph/pull/26698 - in Fedora's package?

Comment 3 Boris Ranto 2019-03-13 11:22:18 UTC
This should be fixed in the latest rebase to v14.1.1.