Bug 2043513
| Summary: | [Tracker for Ceph BZ 2044836] mon is in CLBO after upgrading to 4.10-113 | |||
|---|---|---|---|---|
| Product: | [Red Hat Storage] Red Hat OpenShift Data Foundation | Reporter: | Vijay Avuthu <vavuthu> | |
| Component: | ceph | Assignee: | Patrick Donnelly <pdonnell> | |
| Status: | CLOSED ERRATA | QA Contact: | Vijay Avuthu <vavuthu> | |
| Severity: | urgent | Docs Contact: | ||
| Priority: | unspecified | |||
| Version: | 4.10 | CC: | bniver, madam, mhackett, mmuench, muagarwa, ocs-bugs, odf-bz-bot, pbalogh, pdonnell, shan, tnielsen | |
| Target Milestone: | --- | Keywords: | Automation, Regression, UpgradeBlocker | |
| Target Release: | ODF 4.10.0 | |||
| Hardware: | Unspecified | |||
| OS: | Unspecified | |||
| Whiteboard: | ||||
| Fixed In Version: | 4.10.0-147 | Doc Type: | No Doc Update | |
| Doc Text: | Story Points: | --- | ||
| Clone Of: | ||||
| : | 2044836 (view as bug list) | Environment: | ||
| Last Closed: | 2022-04-13 18:51:56 UTC | Type: | Bug | |
| Regression: | --- | Mount Type: | --- | |
| Documentation: | --- | CRM: | ||
| Verified Versions: | Category: | --- | ||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
| Cloudforms Team: | --- | Target Upstream Version: | ||
| Embargoed: | ||||
| Bug Depends On: | 2044836 | |||
| Bug Blocks: | ||||
|
Description
Vijay Avuthu
2022-01-21 12:22:26 UTC
I think we will need the Ceph team to help debug this issue. The mon is in CrashLoopBackOff with an error from the mon process (copied below). I don't see any crashes registered, and I don't see that any must-gather Ceph commands were run, so debugging this may be hard.
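For reference, a minimal sketch of how a log excerpt like the one below can be pulled from a mon pod stuck in CrashLoopBackOff; the namespace and label are the ODF/Rook defaults, and the pod name is illustrative:

$ oc -n openshift-storage get pods -l app=rook-ceph-mon
# The useful output is in the previous container instance, since the
# current one restarts before logging anything
$ oc -n openshift-storage logs rook-ceph-mon-b-&lt;hash&gt; --previous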
2022-01-20T16:42:30.296755514Z debug -43> 2022-01-20T16:42:30.287+0000 7f5cc51a0700 5 AuthRegistry(0x55896cf84140) adding auth protocol: cephx
2022-01-20T16:42:30.296755514Z debug -42> 2022-01-20T16:42:30.287+0000 7f5cc51a0700 5 AuthRegistry(0x55896cf84140) adding auth protocol: cephx
2022-01-20T16:42:30.296755514Z debug -41> 2022-01-20T16:42:30.287+0000 7f5cc51a0700 5 AuthRegistry(0x55896cf84140) adding auth protocol: cephx
2022-01-20T16:42:30.296755514Z debug -40> 2022-01-20T16:42:30.287+0000 7f5cc51a0700 5 AuthRegistry(0x55896cf84140) adding auth protocol: none
2022-01-20T16:42:30.296755514Z debug -39> 2022-01-20T16:42:30.287+0000 7f5cc51a0700 5 AuthRegistry(0x55896cf84140) adding con mode: secure
2022-01-20T16:42:30.296763563Z debug -38> 2022-01-20T16:42:30.287+0000 7f5cc51a0700 5 AuthRegistry(0x55896cf84140) adding con mode: crc
2022-01-20T16:42:30.296763563Z debug -37> 2022-01-20T16:42:30.287+0000 7f5cc51a0700 5 AuthRegistry(0x55896cf84140) adding con mode: secure
2022-01-20T16:42:30.296771171Z debug -36> 2022-01-20T16:42:30.287+0000 7f5cc51a0700 5 AuthRegistry(0x55896cf84140) adding con mode: crc
2022-01-20T16:42:30.296771171Z debug -35> 2022-01-20T16:42:30.287+0000 7f5cc51a0700 5 AuthRegistry(0x55896cf84140) adding con mode: secure
2022-01-20T16:42:30.296778803Z debug -34> 2022-01-20T16:42:30.287+0000 7f5cc51a0700 5 AuthRegistry(0x55896cf84140) adding con mode: crc
2022-01-20T16:42:30.296786459Z debug -33> 2022-01-20T16:42:30.287+0000 7f5cc51a0700 5 AuthRegistry(0x55896cf84140) adding con mode: crc
2022-01-20T16:42:30.296793797Z debug -32> 2022-01-20T16:42:30.287+0000 7f5cc51a0700 5 AuthRegistry(0x55896cf84140) adding con mode: secure
2022-01-20T16:42:30.296801328Z debug -31> 2022-01-20T16:42:30.287+0000 7f5cc51a0700 5 AuthRegistry(0x55896cf84140) adding con mode: crc
2022-01-20T16:42:30.296801328Z debug -30> 2022-01-20T16:42:30.287+0000 7f5cc51a0700 5 AuthRegistry(0x55896cf84140) adding con mode: secure
2022-01-20T16:42:30.296808629Z debug -29> 2022-01-20T16:42:30.287+0000 7f5cc51a0700 5 AuthRegistry(0x55896cf84140) adding con mode: crc
2022-01-20T16:42:30.296816004Z debug -28> 2022-01-20T16:42:30.287+0000 7f5cc51a0700 5 AuthRegistry(0x55896cf84140) adding con mode: secure
2022-01-20T16:42:30.296823269Z debug -27> 2022-01-20T16:42:30.287+0000 7f5cc51a0700 2 auth: KeyRing::load: loaded key file /etc/ceph/keyring-store/keyring
2022-01-20T16:42:30.296823269Z debug -26> 2022-01-20T16:42:30.288+0000 7f5cc51a0700 0 starting mon.b rank 1 at public addrs [v2:172.30.242.117:3300/0,v1:172.30.242.117:6789/0] at bind addrs [v2:10.128.2.39:3300/0,v1:10.128.2.39:6789/0] mon_data /var/lib/ceph/mon/ceph-b fsid 370885ac-8dec-4d95-8350-0deb0752c15b
2022-01-20T16:42:30.296830648Z debug -25> 2022-01-20T16:42:30.288+0000 7f5cc51a0700 5 AuthRegistry(0x55896cf84a40) adding auth protocol: cephx
2022-01-20T16:42:30.296830648Z debug -24> 2022-01-20T16:42:30.288+0000 7f5cc51a0700 5 AuthRegistry(0x55896cf84a40) adding auth protocol: cephx
2022-01-20T16:42:30.296839985Z debug -23> 2022-01-20T16:42:30.288+0000 7f5cc51a0700 5 AuthRegistry(0x55896cf84a40) adding auth protocol: cephx
2022-01-20T16:42:30.296839985Z debug -22> 2022-01-20T16:42:30.288+0000 7f5cc51a0700 5 AuthRegistry(0x55896cf84a40) adding auth protocol: none
2022-01-20T16:42:30.296847622Z debug -21> 2022-01-20T16:42:30.288+0000 7f5cc51a0700 5 AuthRegistry(0x55896cf84a40) adding con mode: secure
2022-01-20T16:42:30.296847622Z debug -20> 2022-01-20T16:42:30.288+0000 7f5cc51a0700 5 AuthRegistry(0x55896cf84a40) adding con mode: crc
2022-01-20T16:42:30.296855188Z debug -19> 2022-01-20T16:42:30.288+0000 7f5cc51a0700 5 AuthRegistry(0x55896cf84a40) adding con mode: secure
2022-01-20T16:42:30.296862684Z debug -18> 2022-01-20T16:42:30.288+0000 7f5cc51a0700 5 AuthRegistry(0x55896cf84a40) adding con mode: crc
2022-01-20T16:42:30.296862684Z debug -17> 2022-01-20T16:42:30.288+0000 7f5cc51a0700 5 AuthRegistry(0x55896cf84a40) adding con mode: secure
2022-01-20T16:42:30.296869994Z debug -16> 2022-01-20T16:42:30.288+0000 7f5cc51a0700 5 AuthRegistry(0x55896cf84a40) adding con mode: crc
2022-01-20T16:42:30.296869994Z debug -15> 2022-01-20T16:42:30.288+0000 7f5cc51a0700 5 AuthRegistry(0x55896cf84a40) adding con mode: crc
2022-01-20T16:42:30.296877556Z debug -14> 2022-01-20T16:42:30.288+0000 7f5cc51a0700 5 AuthRegistry(0x55896cf84a40) adding con mode: secure
2022-01-20T16:42:30.296884850Z debug -13> 2022-01-20T16:42:30.288+0000 7f5cc51a0700 5 AuthRegistry(0x55896cf84a40) adding con mode: crc
2022-01-20T16:42:30.296884850Z debug -12> 2022-01-20T16:42:30.288+0000 7f5cc51a0700 5 AuthRegistry(0x55896cf84a40) adding con mode: secure
2022-01-20T16:42:30.296892184Z debug -11> 2022-01-20T16:42:30.288+0000 7f5cc51a0700 5 AuthRegistry(0x55896cf84a40) adding con mode: crc
2022-01-20T16:42:30.296899315Z debug -10> 2022-01-20T16:42:30.288+0000 7f5cc51a0700 5 AuthRegistry(0x55896cf84a40) adding con mode: secure
2022-01-20T16:42:30.296899315Z debug -9> 2022-01-20T16:42:30.288+0000 7f5cc51a0700 2 auth: KeyRing::load: loaded key file /etc/ceph/keyring-store/keyring
2022-01-20T16:42:30.296906626Z debug -8> 2022-01-20T16:42:30.288+0000 7f5cc51a0700 5 adding auth protocol: cephx
2022-01-20T16:42:30.296906626Z debug -7> 2022-01-20T16:42:30.288+0000 7f5cc51a0700 5 adding auth protocol: cephx
2022-01-20T16:42:30.296914084Z debug -6> 2022-01-20T16:42:30.288+0000 7f5cc51a0700 10 log_channel(cluster) update_config to_monitors: true to_syslog: false syslog_facility: daemon prio: info to_graylog: false graylog_host: 127.0.0.1 graylog_port: 12201)
2022-01-20T16:42:30.296921160Z debug -5> 2022-01-20T16:42:30.288+0000 7f5cc51a0700 10 log_channel(audit) update_config to_monitors: true to_syslog: false syslog_facility: local0 prio: info to_graylog: false graylog_host: 127.0.0.1 graylog_port: 12201)
2022-01-20T16:42:30.296928465Z debug -4> 2022-01-20T16:42:30.289+0000 7f5cc51a0700 1 mon.b@-1(???) e3 preinit fsid 370885ac-8dec-4d95-8350-0deb0752c15b
2022-01-20T16:42:30.296935725Z debug -3> 2022-01-20T16:42:30.289+0000 7f5cc51a0700 0 mon.b@-1(???).mds e27 new map
2022-01-20T16:42:30.296935725Z debug -2> 2022-01-20T16:42:30.290+0000 7f5cc51a0700 0 mon.b@-1(???).mds e27 print_map
2022-01-20T16:42:30.296935725Z e27
2022-01-20T16:42:30.296935725Z enable_multiple, ever_enabled_multiple: 1,1
2022-01-20T16:42:30.296935725Z default compat: compat={},rocompat={},incompat={1=base v0.20,2=client writeable ranges,3=default file layouts on dirs,4=dir inode in separate object,5=mds uses versioned encoding,6=dirfrag is stored in omap,7=mds uses inline data,8=no anchor table,9=file layout v2,10=snaprealm v2}
2022-01-20T16:42:30.296935725Z legacy client fscid: 1
2022-01-20T16:42:30.296935725Z
2022-01-20T16:42:30.296935725Z Filesystem 'ocs-storagecluster-cephfilesystem' (1)
2022-01-20T16:42:30.296935725Z fs_name ocs-storagecluster-cephfilesystem
2022-01-20T16:42:30.296935725Z epoch 27
2022-01-20T16:42:30.296935725Z flags 32
2022-01-20T16:42:30.296935725Z created 2022-01-20T12:27:41.132859+0000
2022-01-20T16:42:30.296935725Z modified 2022-01-20T15:55:22.452393+0000
2022-01-20T16:42:30.296935725Z tableserver 0
2022-01-20T16:42:30.296935725Z root 0
2022-01-20T16:42:30.296935725Z session_timeout 60
2022-01-20T16:42:30.296935725Z session_autoclose 300
2022-01-20T16:42:30.296935725Z max_file_size 1099511627776
2022-01-20T16:42:30.296935725Z required_client_features {}
2022-01-20T16:42:30.296935725Z last_failure 0
2022-01-20T16:42:30.296935725Z last_failure_osd_epoch 982
2022-01-20T16:42:30.296935725Z compat compat={},rocompat={},incompat={1=base v0.20,2=client writeable ranges,3=default file layouts on dirs,4=dir inode in separate object,5=mds uses versioned encoding,6=dirfrag is stored in omap,7=mds uses inline data,8=no anchor table,9=file layout v2,10=snaprealm v2}
2022-01-20T16:42:30.296935725Z max_mds 1
2022-01-20T16:42:30.296935725Z in 0
2022-01-20T16:42:30.296935725Z up {0=74228}
2022-01-20T16:42:30.296935725Z failed
2022-01-20T16:42:30.296935725Z damaged
2022-01-20T16:42:30.296935725Z stopped
2022-01-20T16:42:30.296935725Z data_pools [3]
2022-01-20T16:42:30.296935725Z metadata_pool 2
2022-01-20T16:42:30.296935725Z inline_data disabled
2022-01-20T16:42:30.296935725Z balancer
2022-01-20T16:42:30.296935725Z standby_count_wanted 1
2022-01-20T16:42:30.296935725Z [mds.ocs-storagecluster-cephfilesystem-a{0:74228} state up:active seq 5 join_fscid=1 addr [v2:10.131.0.39:6800/1133637502,v1:10.131.0.39:6801/1133637502] compat {c=[1],r=[1],i=[77f]}]
2022-01-20T16:42:30.296935725Z [mds.ocs-storagecluster-cephfilesystem-b{0:74306} state up:standby-replay seq 1 join_fscid=1 addr [v2:10.129.2.70:6800/115663078,v1:10.129.2.70:6801/115663078] compat {c=[1],r=[1],i=[7ff]}]
2022-01-20T16:42:30.296935725Z
2022-01-20T16:42:30.296935725Z
2022-01-20T16:42:30.296935725Z
2022-01-20T16:42:30.296950797Z debug -1> 2022-01-20T16:42:30.291+0000 7f5cc51a0700 -1 /builddir/build/BUILD/ceph-16.2.7/src/mds/FSMap.cc: In function 'void FSMap::sanity(bool) const' thread 7f5cc51a0700 time 2022-01-20T16:42:30.290637+0000
2022-01-20T16:42:30.296950797Z /builddir/build/BUILD/ceph-16.2.7/src/mds/FSMap.cc: 868: FAILED ceph_assert(info.compat.writeable(fs->mds_map.compat))
2022-01-20T16:42:30.296950797Z
2022-01-20T16:42:30.296950797Z ceph version 16.2.7-31.el8cp (2cfe2e2a505bfa00c184623965dbdb21ed9ff6aa) pacific (stable)
2022-01-20T16:42:30.296950797Z 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x158) [0x7f5cbc322c82]
2022-01-20T16:42:30.296950797Z 2: /usr/lib64/ceph/libceph-common.so.2(+0x276e9c) [0x7f5cbc322e9c]
2022-01-20T16:42:30.296950797Z 3: (FSMap::sanity(bool) const+0x2a8) [0x7f5cbc871788]
2022-01-20T16:42:30.296950797Z 4: (MDSMonitor::update_from_paxos(bool*)+0x39a) [0x55896ac1b7aa]
2022-01-20T16:42:30.296950797Z 5: (PaxosService::refresh(bool*)+0x10e) [0x55896ab3c29e]
2022-01-20T16:42:30.296950797Z 6: (Monitor::refresh_from_paxos(bool*)+0x18c) [0x55896a9ed2cc]
2022-01-20T16:42:30.296950797Z 7: (Monitor::init_paxos()+0x10c) [0x55896a9ed5dc]
2022-01-20T16:42:30.296950797Z 8: (Monitor::preinit()+0xd30) [0x55896aa1aaa0]
2022-01-20T16:42:30.296950797Z 9: main()
2022-01-20T16:42:30.296950797Z 10: __libc_start_main()
2022-01-20T16:42:30.296950797Z 11: _start()
2022-01-20T16:42:30.296950797Z
2022-01-20T16:42:30.296950797Z debug 0> 2022-01-20T16:42:30.293+0000 7f5cc51a0700 -1 *** Caught signal (Aborted) **
2022-01-20T16:42:30.296950797Z in thread 7f5cc51a0700 thread_name:ceph-mon
2022-01-20T16:42:30.296950797Z
2022-01-20T16:42:30.296950797Z ceph version 16.2.7-31.el8cp (2cfe2e2a505bfa00c184623965dbdb21ed9ff6aa) pacific (stable)
2022-01-20T16:42:30.296950797Z 1: /lib64/libpthread.so.0(+0x12c20) [0x7f5cba062c20]
2022-01-20T16:42:30.296950797Z 2: gsignal()
2022-01-20T16:42:30.296950797Z 3: abort()
2022-01-20T16:42:30.296950797Z 4: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x1a9) [0x7f5cbc322cd3]
2022-01-20T16:42:30.296950797Z 5: /usr/lib64/ceph/libceph-common.so.2(+0x276e9c) [0x7f5cbc322e9c]
2022-01-20T16:42:30.296950797Z 6: (FSMap::sanity(bool) const+0x2a8) [0x7f5cbc871788]
2022-01-20T16:42:30.296950797Z 7: (MDSMonitor::update_from_paxos(bool*)+0x39a) [0x55896ac1b7aa]
2022-01-20T16:42:30.296950797Z 8: (PaxosService::refresh(bool*)+0x10e) [0x55896ab3c29e]
2022-01-20T16:42:30.296950797Z 9: (Monitor::refresh_from_paxos(bool*)+0x18c) [0x55896a9ed2cc]
2022-01-20T16:42:30.296950797Z 10: (Monitor::init_paxos()+0x10c) [0x55896a9ed5dc]
2022-01-20T16:42:30.296950797Z 11: (Monitor::preinit()+0xd30) [0x55896aa1aaa0]
2022-01-20T16:42:30.296950797Z 12: main()
2022-01-20T16:42:30.296950797Z 13: __libc_start_main()
2022-01-20T16:42:30.296950797Z 14: _start()
2022-01-20T16:42:30.296950797Z NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
2022-01-20T16:42:30.296964906Z
2022-01-20T16:42:30.296992315Z --- logging levels ---
2022-01-20T16:42:30.296992315Z 0/ 5 none
2022-01-20T16:42:30.297000801Z 0/ 1 lockdep
2022-01-20T16:42:30.297000801Z 0/ 1 context
2022-01-20T16:42:30.297000801Z 1/ 1 crush
2022-01-20T16:42:30.297008537Z 1/ 5 mds
2022-01-20T16:42:30.297008537Z 1/ 5 mds_balancer
2022-01-20T16:42:30.297016192Z 1/ 5 mds_locker
2022-01-20T16:42:30.297016192Z 1/ 5 mds_log
2022-01-20T16:42:30.297016192Z 1/ 5 mds_log_expire
2022-01-20T16:42:30.297023991Z 1/ 5 mds_migrator
2022-01-20T16:42:30.297023991Z 0/ 1 buffer
2022-01-20T16:42:30.297031807Z 0/ 1 timer
2022-01-20T16:42:30.297031807Z 0/ 1 filer
2022-01-20T16:42:30.297031807Z 0/ 1 striper
2022-01-20T16:42:30.297039569Z 0/ 1 objecter
2022-01-20T16:42:30.297039569Z 0/ 5 rados
2022-01-20T16:42:30.297047277Z 0/ 5 rbd
2022-01-20T16:42:30.297047277Z 0/ 5 rbd_mirror
2022-01-20T16:42:30.297054704Z 0/ 5 rbd_replay
2022-01-20T16:42:30.297054704Z 0/ 5 rbd_pwl
2022-01-20T16:42:30.297054704Z 0/ 5 journaler
2022-01-20T16:42:30.297062378Z 0/ 5 objectcacher
2022-01-20T16:42:30.297069833Z 0/ 5 immutable_obj_cache
2022-01-20T16:42:30.297069833Z 0/ 5 client
2022-01-20T16:42:30.297069833Z 1/ 5 osd
2022-01-20T16:42:30.297079198Z 0/ 5 optracker
2022-01-20T16:42:30.297079198Z 0/ 5 objclass
2022-01-20T16:42:30.297079198Z 1/ 3 filestore
2022-01-20T16:42:30.297087171Z 1/ 3 journal
2022-01-20T16:42:30.297087171Z 0/ 0 ms
2022-01-20T16:42:30.297094968Z 1/ 5 mon
2022-01-20T16:42:30.297094968Z 0/10 monc
2022-01-20T16:42:30.297102520Z 1/ 5 paxos
2022-01-20T16:42:30.297102520Z 0/ 5 tp
2022-01-20T16:42:30.297102520Z 1/ 5 auth
2022-01-20T16:42:30.297110195Z 1/ 5 crypto
2022-01-20T16:42:30.297110195Z 1/ 1 finisher
2022-01-20T16:42:30.297118068Z 1/ 1 reserver
2022-01-20T16:42:30.297118068Z 1/ 5 heartbeatmap
2022-01-20T16:42:30.297125715Z 1/ 5 perfcounter
2022-01-20T16:42:30.297125715Z 1/ 5 rgw
2022-01-20T16:42:30.297125715Z 1/ 5 rgw_sync
2022-01-20T16:42:30.297133388Z 1/10 civetweb
2022-01-20T16:42:30.297133388Z 1/ 5 javaclient
2022-01-20T16:42:30.297141161Z 1/ 5 asok
2022-01-20T16:42:30.297141161Z 1/ 1 throttle
2022-01-20T16:42:30.297148825Z 0/ 0 refs
2022-01-20T16:42:30.297148825Z 1/ 5 compressor
2022-01-20T16:42:30.297148825Z 1/ 5 bluestore
2022-01-20T16:42:30.297156553Z 1/ 5 bluefs
2022-01-20T16:42:30.297156553Z 1/ 3 bdev
2022-01-20T16:42:30.297164207Z 1/ 5 kstore
2022-01-20T16:42:30.297164207Z 4/ 5 rocksdb
2022-01-20T16:42:30.297164207Z 4/ 5 leveldb
2022-01-20T16:42:30.297171880Z 4/ 5 memdb
2022-01-20T16:42:30.297171880Z 1/ 5 fuse
2022-01-20T16:42:30.297179561Z 2/ 5 mgr
2022-01-20T16:42:30.297179561Z 1/ 5 mgrc
2022-01-20T16:42:30.297187170Z 1/ 5 dpdk
2022-01-20T16:42:30.297187170Z 1/ 5 eventtrace
2022-01-20T16:42:30.297187170Z 1/ 5 prioritycache
2022-01-20T16:42:30.297194819Z 0/ 5 test
2022-01-20T16:42:30.297194819Z 0/ 5 cephfs_mirror
2022-01-20T16:42:30.297202426Z 0/ 5 cephsqlite
2022-01-20T16:42:30.297202426Z -2/-2 (syslog threshold)
2022-01-20T16:42:30.297209918Z 99/99 (stderr threshold)
2022-01-20T16:42:30.297209918Z --- pthread ID / name mapping for recent threads ---
2022-01-20T16:42:30.297234902Z 140036001363712 / rocksdb:dump_st
2022-01-20T16:42:30.297244513Z 140036169725696 / admin_socket
2022-01-20T16:42:30.297251760Z 140036420536064 / ceph-mon
2022-01-20T16:42:30.297251760Z max_recent 10000
2022-01-20T16:42:30.297251760Z max_new 10000
2022-01-20T16:42:30.297259441Z log_file /var/lib/ceph/crash/2022-01-20T16:42:30.293729Z_ff2cc9e7-cf68-4c5a-b384-336b8997364e/log
2022-01-20T16:42:30.297259441Z --- end dump of recent events ---
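The final log line above names an on-disk crash dump under /var/lib/ceph/crash. As a hedged sketch (the toolbox deployment name is the Rook default), registered crashes can normally be listed and inspected from the toolbox pod, though these commands themselves require a mon quorum, which is part of what is broken here:

$ oc -n openshift-storage rsh deploy/rook-ceph-tools
# Inside the toolbox: list crashes registered with the cluster, then
# dump one in detail
$ ceph crash ls
$ ceph crash info &lt;crash-id&gt;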
Transferring this to the Ceph component to get help debugging this on the Ceph side. I have created a Ceph bug, but a quick Google search tells me that there is an existing issue around this; see the release notes in https://github.com/ceph/ceph/pull/44131 and this conversation in the community: https://www.spinics.net/lists/ceph-users/msg70110.html

Rook is already applying "mon_mds_skip_sanity" before the upgrade; we can see that from the operator logs (a manual equivalent is sketched at the end of this report):

2022-01-20T15:53:10.400253921Z 2022-01-20 15:53:10.400208 I | ceph-cluster-controller: upgrading ceph cluster to "16.2.7-31 pacific"
2022-01-20T15:53:10.400253921Z 2022-01-20 15:53:10.400230 I | ceph-cluster-controller: cluster "openshift-storage": version "16.2.7-31 pacific" detected for image "quay.io/rhceph-dev/rhceph@sha256:77a11bd0eca26a1315c384f1d7f0d7a1f6dd0631e464cd0b1e2cee929f558d9d"
2022-01-20T15:53:10.460664304Z 2022-01-20 15:53:10.460626 I | op-config: setting "mon"="mon_mds_skip_sanity"="1" option to the mon configuration database
2022-01-20T15:53:10.784778923Z 2022-01-20 15:53:10.784735 I | op-config: successfully set "mon"="mon_mds_skip_sanity"="1" option to the mon configuration database

Still, it looks like mon-b is failing the upgrade. Patrick, do you have any idea what could be wrong? Thanks!

*** Bug 2043510 has been marked as a duplicate of this bug. ***

Verified with the versions below:
===========================
The upgrade was successful from ocs-registry:4.9.3-2 to ocs-registry:4.10.0-164.
Job: https://ocs4-jenkins-csb-odf-qe.apps.ocp-c1.prod.psi.redhat.com/job/qe-deploy-ocs-cluster-prod/3311/console
Moving to Verified.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Important: Red Hat OpenShift Data Foundation 4.10.0 enhancement, security & bug fix update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:1372
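Regarding the "mon_mds_skip_sanity" workaround the operator applies above: a hedged sketch of how the same option could be checked or set by hand from the rook-ceph-tools pod. The commands are standard `ceph config` usage, but the namespace and deployment name are the ODF/Rook defaults and may differ per cluster; note also that `ceph config` needs a working mon quorum, so this only helps while at least some mons are still up:

$ oc -n openshift-storage rsh deploy/rook-ceph-tools
# Check whether the option is already in the mon configuration database
$ ceph config get mon mon_mds_skip_sanity
# Set it explicitly; this is what Rook's op-config step above does
$ ceph config set mon mon_mds_skip_sanity true
# For a mon that cannot join quorum, the equivalent override can instead be
# passed on the daemon command line (illustrative, not the exact Rook wiring):
#   ceph-mon ... --mon-mds-skip-sanity=true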