Bug 2149286 - Mon crash is observed while upgrading RHCS 5.3 to RHCS 6.0 in disconnected environment
Keywords:
Status: CLOSED DUPLICATE of bug 2142141
Alias: None
Product: Red Hat Ceph Storage
Classification: Red Hat Storage
Component: RADOS
Version: 6.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: urgent
Target Milestone: ---
Target Release: 6.0
Assignee: Neha Ojha
QA Contact: Pawan
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2022-11-29 12:45 UTC by Manisha Saini
Modified: 2022-11-29 16:27 UTC
CC List: 14 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-11-29 16:27:02 UTC
Embargoed:



Links
System ID Private Priority Status Summary Last Updated
Red Hat Issue Tracker RHCEPH-5705 0 None None None 2022-11-29 12:52:13 UTC

Description Manisha Saini 2022-11-29 12:45:47 UTC
Description of problem:
=====================
During the upgrade from RHCS 5.3 + RHEL 8 to RHCS 6.0 + RHEL 9, 2 mon daemons crashed.

[ceph: root@ceph-msaini-x723pp-node1-installer /]# ceph -s
  cluster:
    id:     d5ae140e-6abe-11ed-b7fe-fa163e6e007c
    health: HEALTH_WARN
            noout,noscrub,nodeep-scrub flag(s) set
            2 daemons have recently crashed
 
  services:
    mon: 3 daemons, quorum ceph-msaini-x723pp-node1-installer,ceph-msaini-x723pp-node2,ceph-msaini-x723pp-node3 (age 19m)
    mgr: ceph-msaini-x723pp-node2.kaolcj(active, since 20m), standbys: ceph-msaini-x723pp-node3.watgoz, ceph-msaini-x723pp-node1-installer.vctwbi
    mds: 1/1 daemons up
    osd: 10 osds: 10 up (since 14m), 10 in (since 2h)
         flags noout,noscrub,nodeep-scrub
 
  data:
    volumes: 1/1 healthy
    pools:   3 pools, 65 pgs
    objects: 24 objects, 455 KiB
    usage:   237 MiB used, 200 GiB / 200 GiB avail
    pgs:     65 active+clean

==================

Crash logs
----------

{
    "assert_condition": "m < ranks.size()",
    "assert_file": "/builddir/build/BUILD/ceph-16.2.10/src/mon/MonMap.h",
    "assert_func": "const entity_addrvec_t& MonMap::get_addrs(unsigned int) const",
    "assert_line": 404,
    "assert_msg": "/builddir/build/BUILD/ceph-16.2.10/src/mon/MonMap.h: In function 'const entity_addrvec_t& MonMap::get_addrs(unsigned int) const' thread 7f522627e700 time 2022-11-24T03:10:53.697663+0000\n/builddir/build/BUILD/ceph-16.2.10/src/mon/MonMap.h: 404: FAILED ceph_assert(m < ranks.size())\n",
    "assert_thread_name": "safe_timer",
    "backtrace": [
        "/lib64/libpthread.so.0(+0x12cf0) [0x7f522f310cf0]",
        "gsignal()",
        "abort()",
        "(ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x1a9) [0x7f52315d4129]",
        "/usr/lib64/ceph/libceph-common.so.2(+0x2782f2) [0x7f52315d42f2]",
        "(Elector::send_peer_ping(int, utime_t const*)+0x448) [0x55d889c704d8]",
        "(Elector::ping_check(int)+0x30f) [0x55d889c70dff]",
        "(Context::complete(int)+0xd) [0x55d889befded]",
        "(CommonSafeTimer<std::mutex>::timer_thread()+0x10f) [0x7f52316ca64f]",
        "(CommonSafeTimerThread<std::mutex>::entry()+0x11) [0x7f52316cb9e1]",
        "/lib64/libpthread.so.0(+0x81ca) [0x7f522f3061ca]",
        "clone()"
    ],
    "ceph_version": "16.2.10-76.el8cp",
    "crash_id": "2022-11-24T03:10:53.701535Z_035bf4ff-7559-4051-a094-1f07645daa9b",
    "entity_name": "mon.ceph-msaini-x723pp-node2",
    "os_id": "rhel",
    "os_name": "Red Hat Enterprise Linux",
    "os_version": "8.7 (Ootpa)",
    "os_version_id": "8.7",
    "process_name": "ceph-mon",
    "stack_sig": "e38d028eff1f898de137a7e5338724648f80af760a5035ff9224340dd867cf40",
    "timestamp": "2022-11-24T03:10:53.701535Z",
    "utsname_hostname": "ceph-msaini-x723pp-node2",
    "utsname_machine": "x86_64",
    "utsname_release": "4.18.0-372.35.1.el8_6.x86_64",
    "utsname_sysname": "Linux",
    "utsname_version": "#1 SMP Thu Nov 10 14:47:06 EST 2022"
}
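To make the crash metadata easier to scan, the following is a minimal Python sketch that reduces a crash record like the one above to a one-line summary and the list of Ceph callers that led into the assert. It uses a trimmed copy of the JSON from this report. The function name summarize_crash and the frame-filtering rule (keep frames that start with "(" and carry a "+0x" offset) are assumptions made for illustration, not part of any Ceph tooling. Note that the metadata reports ceph_version 16.2.10-76.el8cp, so the crashing mon was still running the pre-upgrade build.

```python
import json

# Trimmed copy of the crash metadata from this report (several fields omitted).
CRASH_JSON = """
{
    "assert_condition": "m < ranks.size()",
    "assert_func": "const entity_addrvec_t& MonMap::get_addrs(unsigned int) const",
    "assert_line": 404,
    "assert_thread_name": "safe_timer",
    "backtrace": [
        "/lib64/libpthread.so.0(+0x12cf0) [0x7f522f310cf0]",
        "gsignal()",
        "abort()",
        "(ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x1a9) [0x7f52315d4129]",
        "/usr/lib64/ceph/libceph-common.so.2(+0x2782f2) [0x7f52315d42f2]",
        "(Elector::send_peer_ping(int, utime_t const*)+0x448) [0x55d889c704d8]",
        "(Elector::ping_check(int)+0x30f) [0x55d889c70dff]",
        "(Context::complete(int)+0xd) [0x55d889befded]"
    ],
    "ceph_version": "16.2.10-76.el8cp",
    "entity_name": "mon.ceph-msaini-x723pp-node2"
}
"""


def summarize_crash(meta):
    """Return (summary line, symbolized callers below the assert handler).

    Hypothetical helper: the filtering rule is an assumption based on the
    frame layout seen in this report.
    """
    symbols = []
    for frame in meta.get("backtrace", []):
        # Symbolized frames look like "(Symbol(args)+0xOFF) [0xADDR]".
        if frame.startswith("(") and "+0x" in frame:
            symbols.append(frame[1:frame.rindex("+0x")])
    # Drop the assert handler itself so the first entry is the real caller.
    callers = [s for s in symbols if "__ceph_assert_fail" not in s]
    summary = (
        f"{meta['entity_name']} ({meta['ceph_version']}): "
        f"ceph_assert({meta['assert_condition']}) in {meta['assert_func']} "
        f"[thread {meta['assert_thread_name']}]"
    )
    return summary, callers


summary, callers = summarize_crash(json.loads(CRASH_JSON))
print(summary)
for caller in callers:
    print("  called from:", caller)
```

Read this way, the assert in MonMap::get_addrs was reached from Elector::send_peer_ping, which was invoked by Elector::ping_check on the safe_timer thread.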



Version-Release number of selected component (if applicable):

[cephuser@ceph-msaini-x723pp-node1-installer crash]$ rpm -qa | grep ceph
cephadm-17.2.5-14.el9cp.noarch
cephadm-ansible-2.10.0-1.el9cp.noarch
libcephfs2-17.2.5-14.el9cp.x86_64
python3-ceph-argparse-17.2.5-14.el9cp.x86_64
python3-cephfs-17.2.5-14.el9cp.x86_64
python3-ceph-common-17.2.5-14.el9cp.x86_64
ceph-common-17.2.5-14.el9cp.x86_64

[ceph: root@ceph-msaini-x723pp-node1-installer /]# ceph --version
ceph version 17.2.5-14.el9cp (2b72f4dcf7331ea8d6bbccdd62be4eb8291ddc7a) quincy (stable)


How reproducible:
1/1


Steps to Reproduce:
1. Create RHCS 5.3 + RHEL 8 cluster
2. Perform Host OS upgrade from RHEL 8 to RHEL 9
3. Perform RHCS upgrade from RHCS 5.3 to RHCS 6.0

Actual results:
==============
While performing the RHCS upgrade, 2 mon daemons crashed.

Expected results:
============
Upgrade should be successful. No crash should be observed.


Additional info:
=================
[ceph: root@ceph-msaini-x723pp-node1-installer /]# ceph crash ls
ID                                                                ENTITY                        NEW  
2022-11-24T03:10:53.628839Z_aab8e787-7bb5-4ee8-b3ae-61c3f98e8cf6  mon.ceph-msaini-x723pp-node3   *   
2022-11-24T03:10:53.701535Z_035bf4ff-7559-4051-a094-1f07645daa9b  mon.ceph-msaini-x723pp-node2   *   
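The crash IDs above embed the crash timestamp, so the spacing between the two mon crashes can be read directly from the listing. The sketch below is a rough illustration in Python. It assumes the ID format shown in this report (a UTC timestamp, an underscore, then a UUID), and parse_crash_ls is a hypothetical helper name.

```python
from datetime import datetime

# Output of "ceph crash ls" as pasted in this report.
CRASH_LS = """\
ID                                                                ENTITY                        NEW
2022-11-24T03:10:53.628839Z_aab8e787-7bb5-4ee8-b3ae-61c3f98e8cf6  mon.ceph-msaini-x723pp-node3   *
2022-11-24T03:10:53.701535Z_035bf4ff-7559-4051-a094-1f07645daa9b  mon.ceph-msaini-x723pp-node2   *
"""


def parse_crash_ls(text):
    """Return a time-sorted list of (timestamp, entity, crash_id) tuples."""
    rows = []
    for line in text.splitlines()[1:]:  # skip the header row
        parts = line.split()
        if len(parts) < 2:
            continue
        crash_id, entity = parts[0], parts[1]
        # Assumption: the ID is "<timestamp>Z_<uuid>", as in this report.
        stamp = crash_id.split("_", 1)[0]
        when = datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%S.%fZ")
        rows.append((when, entity, crash_id))
    return sorted(rows)


crashes = parse_crash_ls(CRASH_LS)
gap = (crashes[-1][0] - crashes[0][0]).total_seconds()
for when, entity, _ in crashes:
    print(when.isoformat(), entity)
print(f"time between first and last crash: {gap:.6f} s")
```

For the two entries listed here, the mons on node3 and node2 crashed about 0.07 seconds apart on 2022-11-24.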



[ceph: root@ceph-msaini-x723pp-node1-installer /]# ceph orch ls
NAME           PORTS        RUNNING  REFRESHED  AGE  PLACEMENT                                                                             
alertmanager   ?:9093,9094      1/1  8m ago     6d   count:1                                                                               
crash                           6/6  8m ago     6d   *                                                                                     
grafana        ?:3000           1/1  8m ago     6d   count:1                                                                               
mds.test                        1/1  8m ago     5d   ceph-msaini-x723pp-node5;count:1                                                      
mgr                             3/3  8m ago     5d   ceph-msaini-x723pp-node1-installer;ceph-msaini-x723pp-node2;ceph-msaini-x723pp-node3  
mon                             3/3  8m ago     5d   label:mon                                                                             
node-exporter  ?:9100           6/6  8m ago     6d   *                                                                                     
osd                              10  8m ago     -    <unmanaged>                                                                           
prometheus     ?:9095           1/1  8m ago     6d   count:1                        


[ceph: root@ceph-msaini-x723pp-node1-installer /]# ceph orch ps
NAME                                              HOST                                PORTS        STATUS         REFRESHED  AGE  MEM USE  MEM LIM  VERSION          IMAGE ID      CONTAINER ID  
alertmanager.ceph-msaini-x723pp-node1-installer   ceph-msaini-x723pp-node1-installer  *:9093,9094  running (14m)     3m ago   4d    20.8M        -  0.21.0           ad9feb60a543  0f5654ed7768  
crash.ceph-msaini-x723pp-node1-installer          ceph-msaini-x723pp-node1-installer               running (19m)     3m ago   4d    6589k        -  17.2.5-14.el9cp  03f2e202c063  411be06477de  
crash.ceph-msaini-x723pp-node2                    ceph-msaini-x723pp-node2                         running (19m)     4m ago   2d    6643k        -  17.2.5-14.el9cp  03f2e202c063  cf89a858dd8e  
crash.ceph-msaini-x723pp-node3                    ceph-msaini-x723pp-node3                         running (19m)     4m ago   2d    6639k        -  17.2.5-14.el9cp  03f2e202c063  c39770bf5949  
crash.ceph-msaini-x723pp-node4                    ceph-msaini-x723pp-node4                         running (18m)     4m ago   2d    6597k        -  17.2.5-14.el9cp  03f2e202c063  79fd2d7fa8ef  
crash.ceph-msaini-x723pp-node5                    ceph-msaini-x723pp-node5                         running (18m)     3m ago   2d    6597k        -  17.2.5-14.el9cp  03f2e202c063  06b8acd32241  
crash.ceph-msaini-x723pp-node6                    ceph-msaini-x723pp-node6                         running (17m)     3m ago   2d    6605k        -  17.2.5-14.el9cp  03f2e202c063  c286ff0dff0c  
grafana.ceph-msaini-x723pp-node1-installer        ceph-msaini-x723pp-node1-installer  *:3000       running (13m)     3m ago   4d    50.3M        -  8.3.5            a283f9df3197  f5f8864e9aa9  
mds.test.ceph-msaini-x723pp-node5.xsqhie          ceph-msaini-x723pp-node5                         running (15m)     3m ago   2d    17.7M        -  17.2.5-14.el9cp  03f2e202c063  51fc05151741  
mgr.ceph-msaini-x723pp-node1-installer.vctwbi     ceph-msaini-x723pp-node1-installer  *:8443,9283  running (21m)     3m ago   4d     393M        -  17.2.5-14.el9cp  03f2e202c063  701b2d896e4a  
mgr.ceph-msaini-x723pp-node2.kaolcj               ceph-msaini-x723pp-node2            *:8443,9283  running (22m)     4m ago   2d     456M        -  17.2.5-14.el9cp  03f2e202c063  caaaeb254f86  
mgr.ceph-msaini-x723pp-node3.watgoz               ceph-msaini-x723pp-node3            *:8443,9283  running (21m)     4m ago   2d     393M        -  17.2.5-14.el9cp  03f2e202c063  ce547b20ff1a  
mon.ceph-msaini-x723pp-node1-installer            ceph-msaini-x723pp-node1-installer               running (21m)     3m ago   4d     113M    2048M  17.2.5-14.el9cp  03f2e202c063  28eae7a73fb6  
mon.ceph-msaini-x723pp-node2                      ceph-msaini-x723pp-node2                         running (20m)     4m ago   2d    58.8M    2048M  17.2.5-14.el9cp  03f2e202c063  8e7a298fea47  
mon.ceph-msaini-x723pp-node3                      ceph-msaini-x723pp-node3                         running (20m)     4m ago   2d    58.3M    2048M  17.2.5-14.el9cp  03f2e202c063  7cdf5ef6c698  
node-exporter.ceph-msaini-x723pp-node1-installer  ceph-msaini-x723pp-node1-installer  *:9100       running (14m)     3m ago   4d    17.4M        -  1.0.1            c8af8d642c9a  513797c98ec1  
node-exporter.ceph-msaini-x723pp-node2            ceph-msaini-x723pp-node2            *:9100       running (14m)     4m ago   2d    18.3M        -  1.0.1            c8af8d642c9a  6b1a5bb8b7ec  
node-exporter.ceph-msaini-x723pp-node3            ceph-msaini-x723pp-node3            *:9100       running (14m)     4m ago   2d    17.8M        -  1.0.1            c8af8d642c9a  7962c582d2c1  
node-exporter.ceph-msaini-x723pp-node4            ceph-msaini-x723pp-node4            *:9100       running (14m)     4m ago   2d    19.4M        -  1.0.1            c8af8d642c9a  4ab4d7990ef5  
node-exporter.ceph-msaini-x723pp-node5            ceph-msaini-x723pp-node5            *:9100       running (14m)     3m ago   2d    17.8M        -  1.0.1            c8af8d642c9a  35bcc4c384af  
node-exporter.ceph-msaini-x723pp-node6            ceph-msaini-x723pp-node6            *:9100       running (14m)     3m ago   2d    18.6M        -  1.0.1            c8af8d642c9a  fd14a6809bc7  
osd.0                                             ceph-msaini-x723pp-node4                         running (17m)     4m ago   2d    64.4M    1460M  17.2.5-14.el9cp  03f2e202c063  50cf6f7234be  
osd.1                                             ceph-msaini-x723pp-node4                         running (17m)     4m ago   2d    63.4M    1460M  17.2.5-14.el9cp  03f2e202c063  05ffb0814eec  
osd.2                                             ceph-msaini-x723pp-node4                         running (17m)     4m ago   2d    62.1M    1460M  17.2.5-14.el9cp  03f2e202c063  fb9fb6327257  
osd.3                                             ceph-msaini-x723pp-node5                         running (16m)     3m ago   2d    58.1M    4096M  17.2.5-14.el9cp  03f2e202c063  321cf8a6e8fe  
osd.4                                             ceph-msaini-x723pp-node5                         running (16m)     3m ago   2d    57.4M    4096M  17.2.5-14.el9cp  03f2e202c063  11520a66c2af  
osd.5                                             ceph-msaini-x723pp-node5                         running (16m)     3m ago   2d    59.7M    4096M  17.2.5-14.el9cp  03f2e202c063  637ddcf43925  
osd.6                                             ceph-msaini-x723pp-node5                         running (16m)     3m ago   2d    60.1M    4096M  17.2.5-14.el9cp  03f2e202c063  ebdf95d80ea8  
osd.7                                             ceph-msaini-x723pp-node6                         running (15m)     3m ago   2d    57.4M    1460M  17.2.5-14.el9cp  03f2e202c063  26dfdf22acbd  
osd.8                                             ceph-msaini-x723pp-node6                         running (15m)     3m ago   2d    58.0M    1460M  17.2.5-14.el9cp  03f2e202c063  3f4cd55a393f  
osd.9                                             ceph-msaini-x723pp-node6                         running (15m)     3m ago   2d    56.3M    1460M  17.2.5-14.el9cp  03f2e202c063  5600443447e4  
prometheus.ceph-msaini-x723pp-node1-installer     ceph-msaini-x723pp-node1-installer  *:9095       running (14m)     3m ago   4d     107M        -  2.22.2           ec2d358ca73c  f17a75761522
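As a quick cross-check of the listing above, the following Python sketch groups the VERSION column by daemon type to confirm that the Ceph daemons all report the same build after the upgrade. It uses a handful of rows copied from the output above with the column spacing condensed. It assumes that columns are separated by at least two spaces and that VERSION is the third column from the end, which holds for the text output in this report but is not a documented format. The name versions_by_type and the CEPH_TYPES set are hypothetical.

```python
import re
from collections import defaultdict

# A few rows from "ceph orch ps" in this report (column spacing condensed).
ORCH_PS = """\
NAME  HOST  PORTS  STATUS  REFRESHED  AGE  MEM USE  MEM LIM  VERSION  IMAGE ID  CONTAINER ID
alertmanager.ceph-msaini-x723pp-node1-installer  ceph-msaini-x723pp-node1-installer  *:9093,9094  running (14m)  3m ago  4d  20.8M  -  0.21.0  ad9feb60a543  0f5654ed7768
mgr.ceph-msaini-x723pp-node2.kaolcj  ceph-msaini-x723pp-node2  *:8443,9283  running (22m)  4m ago  2d  456M  -  17.2.5-14.el9cp  03f2e202c063  caaaeb254f86
mon.ceph-msaini-x723pp-node2  ceph-msaini-x723pp-node2  running (20m)  4m ago  2d  58.8M  2048M  17.2.5-14.el9cp  03f2e202c063  8e7a298fea47
mon.ceph-msaini-x723pp-node3  ceph-msaini-x723pp-node3  running (20m)  4m ago  2d  58.3M  2048M  17.2.5-14.el9cp  03f2e202c063  7cdf5ef6c698
osd.0  ceph-msaini-x723pp-node4  running (17m)  4m ago  2d  64.4M  1460M  17.2.5-14.el9cp  03f2e202c063  50cf6f7234be
"""

# Daemon types shipped in the Ceph image (illustrative subset).
CEPH_TYPES = {"mon", "mgr", "osd", "mds", "crash"}


def versions_by_type(text):
    """Map daemon type -> set of versions seen for that type."""
    found = defaultdict(set)
    for line in text.splitlines()[1:]:  # skip the header row
        # Assumption: columns are separated by two or more spaces.
        cols = re.split(r"\s{2,}", line.strip())
        if len(cols) < 4:
            continue
        daemon_type = cols[0].split(".", 1)[0]
        # PORTS may be empty, so index VERSION from the end of the row.
        found[daemon_type].add(cols[-3])
    return found


by_type = versions_by_type(ORCH_PS)
ceph_versions = set().union(
    *(vers for dtype, vers in by_type.items() if dtype in CEPH_TYPES)
)
print("Ceph daemon versions:", sorted(ceph_versions))
print("mixed versions" if len(ceph_versions) > 1 else "all Ceph daemons on one version")
```

In the full listing above every mon, mgr, mds, osd, and crash daemon reports 17.2.5-14.el9cp, which matches the target RHCS 6.0 build.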

