Bug 1244322

Summary: [CentOS 6.6 UPGRADE]: Monitor crash after upgrade to RHEL 7.1 CEPH-1.3.0
Product: [Red Hat Storage] Red Hat Ceph Storage
Reporter: shylesh <shmohan>
Component: Documentation
Assignee: ceph-docs <ceph-docs>
Status: CLOSED CURRENTRELEASE
QA Contact: ceph-qe-bugs <ceph-qe-bugs>
Severity: high
Docs Contact:
Priority: unspecified
Version: 1.3.0
CC: ceph-eng-bugs, dzafman, flucifre, hnallurv, kchai, kdreyer, ngoswami, shmohan, sjust
Target Milestone: rc
Target Release: 1.3.1
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2015-12-18 09:59:24 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description shylesh 2015-07-17 18:58:02 UTC
Description of problem:
I am following the upgrade path CentOS 6.6 Ceph 1.2.2 ----> Ceph 1.2.3 ----> RHEL 6.6 Ceph 1.2.3 ----> RHEL 7.1 Ceph 1.3.0, but after upgrading to RHEL 7.1 Ceph 1.3.0 the monitor crashes repeatedly when I try to start it.

Version-Release number of selected component (if applicable):
ceph-mon-0.94.1-13.el7cp.x86_64
ceph-0.94.1-13.el7cp.x86_64


How reproducible:
always

Steps to Reproduce:

1. Created a cluster with 3 MONs, 3 OSD nodes, and 1 admin/calamari node on CentOS 6.6 with Ceph 1.2.2.

2. Upgraded the cluster from CentOS 6.6 Ceph 1.2.2 -----> CentOS 6.6 Ceph 1.2.3 while I/O was in progress. The upgrade was successful.

3. Upgraded the same cluster from CentOS 6.6 Ceph 1.2.3 -------> RHEL 6.6 Ceph 1.2.3; everything was fine.

4. Upgraded the same cluster from RHEL 6.6 Ceph 1.2.3 -------> RHEL 7.1 Ceph 1.3.0. Since there are 3 MONs, I upgraded them one by one.

5. The calamari node was upgraded first, then I picked the first MON. Due to the bug
https://bugzilla.redhat.com/show_bug.cgi?id=1230679 I couldn't do an ISO upgrade on mon1, so I did a CDN upgrade, which moved it to RHEL 7.1 Ceph 1.3.0 Async.

6. Then picked mon2 for upgrade and was able to do an ISO-based upgrade by following the https://bugzilla.redhat.com/show_bug.cgi?id=1230679#c12 workaround. So mon2 is now on RHEL 7.1 Ceph 1.3.0, but after the upgrade the monitor is not able to start and keeps crashing.


7. Now the cluster is in mixed mode:
MON1 -----> RHEL 7.1 Ceph 1.3.0 Async
MON2 -----> RHEL 7.1 Ceph 1.3.0
MON3 -----> RHEL 6.6 Ceph 1.2.3
All OSDs --> RHEL 6.6 Ceph 1.2.3




Actual results:
The monitor is crashing continuously.

Expected results:
The monitor should start successfully and rejoin the quorum after the upgrade.

Additional info:
--- begin dump of recent events ---
   -24> 2015-07-17 12:27:00.340588 7f63b22c97c0  5 asok(0x3750000) register_command perfcounters_dump hook 0x36c0050
   -23> 2015-07-17 12:27:00.340618 7f63b22c97c0  5 asok(0x3750000) register_command 1 hook 0x36c0050
   -22> 2015-07-17 12:27:00.340625 7f63b22c97c0  5 asok(0x3750000) register_command perf dump hook 0x36c0050
   -21> 2015-07-17 12:27:00.340632 7f63b22c97c0  5 asok(0x3750000) register_command perfcounters_schema hook 0x36c0050
   -20> 2015-07-17 12:27:00.340638 7f63b22c97c0  5 asok(0x3750000) register_command 2 hook 0x36c0050
   -19> 2015-07-17 12:27:00.340642 7f63b22c97c0  5 asok(0x3750000) register_command perf schema hook 0x36c0050
   -18> 2015-07-17 12:27:00.340647 7f63b22c97c0  5 asok(0x3750000) register_command perf reset hook 0x36c0050
   -17> 2015-07-17 12:27:00.340651 7f63b22c97c0  5 asok(0x3750000) register_command config show hook 0x36c0050
   -16> 2015-07-17 12:27:00.340657 7f63b22c97c0  5 asok(0x3750000) register_command config set hook 0x36c0050
   -15> 2015-07-17 12:27:00.340662 7f63b22c97c0  5 asok(0x3750000) register_command config get hook 0x36c0050
   -14> 2015-07-17 12:27:00.340667 7f63b22c97c0  5 asok(0x3750000) register_command config diff hook 0x36c0050
   -13> 2015-07-17 12:27:00.340671 7f63b22c97c0  5 asok(0x3750000) register_command log flush hook 0x36c0050
   -12> 2015-07-17 12:27:00.340676 7f63b22c97c0  5 asok(0x3750000) register_command log dump hook 0x36c0050
   -11> 2015-07-17 12:27:00.340681 7f63b22c97c0  5 asok(0x3750000) register_command log reopen hook 0x36c0050
   -10> 2015-07-17 12:27:00.344113 7f63b22c97c0  0 ceph version 0.94.1 (e4bfad3a3c51054df7e537a724c8d0bf9be972ff), process ceph-mon, pid 4973
    -9> 2015-07-17 12:27:00.370331 7f63b22c97c0  5 asok(0x3750000) init /var/run/ceph/ceph-mon.magna105.asok
    -8> 2015-07-17 12:27:00.370346 7f63b22c97c0  5 asok(0x3750000) bind_and_listen /var/run/ceph/ceph-mon.magna105.asok
    -7> 2015-07-17 12:27:00.370438 7f63b22c97c0  5 asok(0x3750000) register_command 0 hook 0x36b80b8
    -6> 2015-07-17 12:27:00.370447 7f63b22c97c0  5 asok(0x3750000) register_command version hook 0x36b80b8
    -5> 2015-07-17 12:27:00.370453 7f63b22c97c0  5 asok(0x3750000) register_command git_version hook 0x36b80b8
    -4> 2015-07-17 12:27:00.370459 7f63b22c97c0  5 asok(0x3750000) register_command help hook 0x36c00b0
    -3> 2015-07-17 12:27:00.370464 7f63b22c97c0  5 asok(0x3750000) register_command get_command_descriptions hook 0x36c0150
    -2> 2015-07-17 12:27:00.370497 7f63ae387700  5 asok(0x3750000) entry start
    -1> 2015-07-17 12:27:00.435792 7f63ae387700  5 asok(0x3750000) AdminSocket: request 'get_command_descriptions' '' to 0x36c0150 returned 1496 bytes
     0> 2015-07-17 12:27:00.441372 7f63b22c97c0 -1 *** Caught signal (Segmentation fault) **
 in thread 7f63b22c97c0

 ceph version 0.94.1 (e4bfad3a3c51054df7e537a724c8d0bf9be972ff)
 1: /usr/bin/ceph-mon() [0x901862]
 2: (()+0xf130) [0x7f63b1951130]




The other two MONs are in quorum:

[root@magna105 ceph]# ceph quorum_status
{"election_epoch":48,"quorum":[1,2],"quorum_names":["magna107","magna108"],"quorum_leader_name":"magna107","monmap":{"epoch":1,"fsid":"46baf039-51de-4ff3-9b65-322193452957","modified":"0.000000","created":"0.000000","mons":[{"rank":0,"name":"magna105","addr":"10.8.128.105:6789\/0"},{"rank":1,"name":"magna107","addr":"10.8.128.107:6789\/0"},{"rank":2,"name":"magna108","addr":"10.8.128.108:6789\/0"}]}}


The cluster is in active+clean state.

Comment 2 shylesh 2015-07-20 13:31:05 UTC
I purged the packages on the crashing node and did a fresh install, but the mon still fails to start, with the same crash as above.

Comment 3 Ken Dreyer (Red Hat) 2015-07-22 13:27:10 UTC
Is /usr/bin/ceph-mon segfaulting here? Kefu, can you please look into this?

Comment 5 Kefu Chai 2015-07-22 14:36:44 UTC
Ken, yeah, it seems like it segfaults at startup. I will look into it.

Shylesh, could you please post more details, ideally the log file of the crashed monitor? It should have the backtrace and more log messages. Thanks.
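
(For reference, assuming the default log locations, something like the following should surface the backtrace:)

## the mon log normally carries the crash dump at the end
$ tail -n 200 /var/log/ceph/ceph-mon.magna105.log

## or run the daemon in the foreground with verbose mon logging
# ceph-mon -i magna105 --public-addr 10.8.128.105 -d --debug-mon 20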

Comment 7 Nilamdyuti 2015-07-22 17:33:12 UTC
Hi Shylesh,

Did you upgrade Ceph 1.2.3 on RHEL 6.6 to Ceph 1.3 on RHEL 7.1 in a single step? That is, did you enable the Ceph 1.3 repos along with the RHEL 7 repos and then run the Preupgrade Assistant?

If that is what happened, it is not the right way to do it. The upgrade should go from RHEL 6.6 to RHEL 7.1 first, i.e., the OS upgrade only, not the Ceph upgrade. Once the OS has been upgraded to RHEL 7.1, then upgrade Ceph from 1.2.3 to 1.3.
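
(Hedging on the exact tooling for this release, and with the repo URL as a placeholder, that order is something like:)

## step 1: OS only, RHEL 6.6 -> RHEL 7.1, leaving Ceph at 1.2.3
# preupg
# redhat-upgrade-tool --network 7.1 --instrepo <rhel7-repo-url>

## step 2: only once the OS is on RHEL 7.1, upgrade Ceph 1.2.3 -> 1.3
# yum update ceph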

That your cluster is currently in a mixed state, i.e., all OSDs on Ceph 1.2.3/RHEL 6.6 and two MONs on Ceph 1.3/RHEL 7.1, suggests that you might have done the two upgrades (OS + Ceph) in one step.

Comment 8 shylesh 2015-07-22 17:50:38 UTC
(In reply to Nilamdyuti from comment #7)
> Hi Shylesh,
> 
> Did you upgrade Ceph 1.2.3 on RHEL 6.6 to Ceph 1.3 on RHEL 7.1 in a
> single step? That is, did you enable the Ceph 1.3 repos along with the
> RHEL 7 repos and then run the Preupgrade Assistant?
> 
> If that is what happened, it is not the right way to do it. The upgrade
> should go from RHEL 6.6 to RHEL 7.1 first, i.e., the OS upgrade only,
> not the Ceph upgrade. Once the OS has been upgraded to RHEL 7.1, then
> upgrade Ceph from 1.2.3 to 1.3.
> 
> That your cluster is currently in a mixed state, i.e., all OSDs on
> Ceph 1.2.3/RHEL 6.6 and two MONs on Ceph 1.3/RHEL 7.1, suggests that
> you might have done the two upgrades (OS + Ceph) in one step.

Nilam,

No, I upgraded RHEL first and then upgraded Ceph. One MON is on 1.3.0 Async because we couldn't do an ISO upgrade due to a bug in calamari that won't allow you to get packages from the ISO mount, so in that case we did a CDN upgrade and it accidentally ended up on 1.3.0 Async. Mon2 is the one on which we did the proper upgrade from the ISO (also after upgrading RHEL to 7.1), but there the monitor process is crashing. Mon3 I haven't touched yet, so it's still on 1.2.3.

Comment 9 Nilamdyuti 2015-07-22 18:11:07 UTC
(In reply to shylesh from comment #8)
> (In reply to Nilamdyuti from comment #7)
> > Hi Shylesh,
> > 
> > Did you upgrade Ceph 1.2.3 on RHEL 6.6 to Ceph 1.3 on RHEL 7.1 in a
> > single step? That is, did you enable the Ceph 1.3 repos along with
> > the RHEL 7 repos and then run the Preupgrade Assistant?
> > 
> > If that is what happened, it is not the right way to do it. The
> > upgrade should go from RHEL 6.6 to RHEL 7.1 first, i.e., the OS
> > upgrade only, not the Ceph upgrade. Once the OS has been upgraded to
> > RHEL 7.1, then upgrade Ceph from 1.2.3 to 1.3.
> > 
> > That your cluster is currently in a mixed state, i.e., all OSDs on
> > Ceph 1.2.3/RHEL 6.6 and two MONs on Ceph 1.3/RHEL 7.1, suggests that
> > you might have done the two upgrades (OS + Ceph) in one step.
> 
> Nilam,
> 
> No, I upgraded RHEL first and then upgraded Ceph. One MON is on 1.3.0
> Async because we couldn't do an ISO upgrade due to a bug in calamari
> that won't allow you to get packages from the ISO mount, so in that
> case we did a CDN upgrade and it accidentally ended up on 1.3.0 Async.
> Mon2 is the one on which we did the proper upgrade from the ISO (also
> after upgrading RHEL to 7.1), but there the monitor process is
> crashing. Mon3 I haven't touched yet, so it's still on 1.2.3.

Okay. I get it. Thanks for the clarification! :)

Comment 10 Ken Dreyer (Red Hat) 2015-07-22 18:56:17 UTC
Since there's nothing upstream for this today, let's re-target to 1.3.2. (If the fix is trivial and we can land it before the 1.3.1 dev freeze, we'll do so.)

Comment 12 Kefu Chai 2015-07-23 13:35:10 UTC
(gdb) bt
#0  0x0000000000000000 in ?? ()
#1  0x0000000000880db9 in LevelDBStore::LevelDBWholeSpaceIteratorImpl::lower_bound (this=0x32e8110, prefix=..., to=...)
    at os/LevelDBStore.h:253
Python Exception <type 'exceptions.IndexError'> list index out of range:
#2  0x000000000087f8aa in LevelDBStore::get (this=0x3339080, prefix=..., keys=std::set with 1 elements, out=0x7fffffffcdd0)
    at os/LevelDBStore.cc:194
#3  0x0000000000564b96 in MonitorDBStore::get (this=this@entry=0x33391e0, prefix="monitor", key="magic", bl=...)
    at mon/MonitorDBStore.h:497
#4  0x000000000054bf2b in main (argc=<optimized out>, argv=0x7fffffffe3d8) at ceph_mon.cc:521
(gdb) f 3
#3  0x0000000000564b96 in MonitorDBStore::get (this=this@entry=0x33391e0, prefix="monitor", key="magic", bl=...)
    at mon/MonitorDBStore.h:497
497	    db->get(prefix, k, &out);
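
(For the record, a trace like the one above can be reproduced by running the mon under gdb; assuming the matching debuginfo packages are installed, roughly:)

# gdb --args /usr/bin/ceph-mon -i magna105 --public-addr 10.8.128.105
(gdb) run
...
(gdb) bt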

Comment 13 Federico Lucifredi 2015-07-23 17:30:00 UTC
Let's see what we can do to get the fix into 1.3.1 (we have a longer development phase than expected when Ken pushed), as it will be painful not to have this worked out until 1.3.2... we would most likely be forced to fix it in an errata, so it may as well be in the release to begin with.

Comment 14 Kefu Chai 2015-07-24 08:08:53 UTC
The crash was caused by leveldb-1.7.0-2.el6.x86_64. It is not necessarily a bug in leveldb; chances are that this particular package simply fails to work on RHEL 7. For example, the ABI could have changed in glibc.

After upgrading leveldb to leveldb.x86_64 0:1.12.0-5.el7cp, I get the following error:

# ceph-mon -i magna105 --public-addr 10.8.128.105
2015-07-24 03:34:06.418136 7f143bcb87c0 -1 unable to read magic from mon data

So I am wondering what we have in the mon store:

$ strings /var/lib/ceph/mon/ceph-magna105/store.db/MANIFEST-000038
leveldb.BytewiseComparator

The MANIFEST should contain the important metadata of this leveldb, but it turns out it has barely anything in it.

A working mon store, by contrast, has the following tables:

$ strings ~/dev/ceph/src/dev/mon.a/store.db/MANIFEST-000004
leveldb.BytewiseComparatorM
mkfs
keyring
monitor
magic

Here, monitor/magic is what was being retrieved when the monitor crashed.

And worse, the .sst file (which is now .ldb in recent leveldb) is missing:

$ ls /var/lib/ceph/mon/ceph-magna105/store.db/
000039.log  CURRENT  LOCK  LOG  LOG.old  MANIFEST-000038

So a wild guess is that the monitor store was nuked by the crash.

Shylesh, could you try the upgrade again? This time please make sure that the system is fully updated from the RHEL 7 repo, or at least that leveldb is upgraded. I'd suggest doing that update right after the RHEL 6.6 Ceph 1.2.3 ----> RHEL 7.1 Ceph 1.3.0 step, just to avoid other failures due to possible ABI incompatibility.
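
(Concretely, something along these lines should show whether the el6 leveldb is still in place, and pull in the el7 build if the right repo is enabled; the version shown is the bad one from this node:)

$ rpm -q leveldb
leveldb-1.7.0-2.el6.x86_64
# yum update leveldb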

Comment 15 shylesh 2015-07-24 10:25:43 UTC
(In reply to Kefu Chai from comment #14)
> The crash was caused by leveldb-1.7.0-2.el6.x86_64. It is not
> necessarily a bug in leveldb; chances are that this particular package
> simply fails to work on RHEL 7. For example, the ABI could have changed
> in glibc.
> 
> After upgrading leveldb to leveldb.x86_64 0:1.12.0-5.el7cp, I get the
> following error:
> 
> # ceph-mon -i magna105 --public-addr 10.8.128.105
> 2015-07-24 03:34:06.418136 7f143bcb87c0 -1 unable to read magic from mon data
> 
> So I am wondering what we have in the mon store:
> 
> $ strings /var/lib/ceph/mon/ceph-magna105/store.db/MANIFEST-000038
> leveldb.BytewiseComparator
> 
> The MANIFEST should contain the important metadata of this leveldb, but
> it turns out it has barely anything in it.
> 
> A working mon store, by contrast, has the following tables:
> 
> $ strings ~/dev/ceph/src/dev/mon.a/store.db/MANIFEST-000004
> leveldb.BytewiseComparatorM
> mkfs
> keyring
> monitor
> magic
> 
> Here, monitor/magic is what was being retrieved when the monitor
> crashed.
> 
> And worse, the .sst file (which is now .ldb in recent leveldb) is
> missing:
> 
> $ ls /var/lib/ceph/mon/ceph-magna105/store.db/
> 000039.log  CURRENT  LOCK  LOG  LOG.old  MANIFEST-000038
> 
> So a wild guess is that the monitor store was nuked by the crash.
> 
> Shylesh, could you try the upgrade again? This time please make sure
> that the system is fully updated from the RHEL 7 repo, or at least that
> leveldb is upgraded. I'd suggest doing that update right after the
> RHEL 6.6 Ceph 1.2.3 ----> RHEL 7.1 Ceph 1.3.0 step, just to avoid other
> failures due to possible ABI incompatibility.

This MON is already on RHEL 7.1 1.3.0. What I understood from your comment is: "I have to reinstall RHEL 6.6 with Ceph 1.2.3 on this node again, then upgrade to RHEL 7.1 first, and then upgrade Ceph from 1.2.3 to 1.3.0" -- correct me if I am wrong.

Comment 16 Kefu Chai 2015-07-24 14:24:21 UTC
Shylesh, 

The issue was caused by leveldb-1.7.0-2.el6. Please note this is an el6 package.

My guess is that to make ceph-mon work, we need to upgrade all of its dependencies to their RHEL 7 versions.

In this case, leveldb-1.7.0-2.el6.x86_64 fails, so I am wondering whether the other packages you installed as dependencies of ceph-mon *before* upgrading the system to RHEL 7 will work for us.

So a safe bet is to upgrade all ceph-mon dependencies to their el7 versions; a quick check follows below. Please ping me if you are confused.
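
(A quick way to spot leftover el6 packages, assuming the usual release-tag naming convention:)

# rpm -qa | grep '\.el6' | sort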

Comment 17 Kefu Chai 2015-07-24 14:51:57 UTC
Shylesh, we probably need a step #7 in addition to the recipe you put in https://bugzilla.redhat.com/show_bug.cgi?id=1244322#c0:

1. Created a cluster with 3 MONs, 3 OSD nodes, and 1 admin/calamari node on CentOS 6.6 with Ceph 1.2.2.

2. Upgraded the cluster from CentOS 6.6 Ceph 1.2.2 -----> CentOS 6.6 Ceph 1.2.3 while I/O was in progress. The upgrade was successful.

3. Upgraded the same cluster from CentOS 6.6 Ceph 1.2.3 -------> RHEL 6.6 Ceph 1.2.3; everything was fine.

4. Upgraded the same cluster from RHEL 6.6 Ceph 1.2.3 -------> RHEL 7.1 Ceph 1.3.0. Since there are 3 MONs, I upgraded them one by one.

5. The calamari node was upgraded first, then I picked the first MON. Due to the bug
https://bugzilla.redhat.com/show_bug.cgi?id=1230679 I couldn't do an ISO upgrade on mon1, so I did a CDN upgrade, which moved it to RHEL 7.1 Ceph 1.3.0 Async.

6. Then picked mon2 for upgrade and was able to do an ISO-based upgrade by following the https://bugzilla.redhat.com/show_bug.cgi?id=1230679#c12 workaround. So mon2 is now on RHEL 7.1 Ceph 1.3.0, but after the upgrade the monitor is not able to start and keeps crashing.

7. Upgrade all dependencies of ceph-mon to their latest (el7) versions:

something like:

yum update `yum deplist ceph-mon 2>/dev/null | grep provider | awk '{print $2}' | uniq`

but some of them are not installable; for example, gperftools-libs.
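
(A hedged refinement of the same idea, skipping providers that are not actually installed on the node; this assumes the usual "provider:" lines in yum deplist output:)

for p in $(yum deplist ceph-mon 2>/dev/null | awk '/provider/ {print $2}' | sort -u); do
    rpm -q "$p" >/dev/null 2>&1 && echo "$p"
done | xargs -r yum update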

Comment 18 Ken Dreyer (Red Hat) 2015-07-24 17:27:01 UTC
Thanks a ton Kefu for tracking down that leveldb issue.

There's nothing in the ceph RPM that will cause yum to upgrade leveldb from 1.7 to 1.12 when it upgrades ceph.

To be on the safe side, we should update the docs [1] to say "Run yum update on each node" after the "ceph-deploy install" operations (and before starting Ceph back up). The "CDN" instructions already include a "yum update" command, but the "ISO" instructions do not. This bug shows me that it's not safe to update the ceph package alone with ceph-deploy, because there could still be el6 packages on the nodes.
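
(Roughly, the resulting per-node ISO sequence would look like the sketch below; the init invocation assumes the sysvinit script that manages the daemons on RHEL 7.1 in this release:)

## after "ceph-deploy install" has upgraded the ceph packages on a node,
## and before starting the daemons back up:
# yum update
# /etc/init.d/ceph start mon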

I'm curious about this leveldb change in particular. Will it be possible to read the old data with leveldb 1.12? Your comment 15 makes me think it is not possible to read leveldb 1.7's old data.


[1] https://access.redhat.com/beta/documentation/en/red-hat-ceph-storage-13-installation-guide-for-rhel-x86-64/chapter-26-upgrading-v123-to-v13-for-iso-based-installations

Comment 19 shylesh 2015-07-24 17:54:30 UTC
Kefu,

It worked since you had already upgraded leveldb to leveldb.x86_64 1.12.0-5.el7cp, and I upgraded gperftools (not sure if that really had anything to do with it).

Now the monitor starts successfully, but ps shows output like:
#root     15460  1.1  0.1 301252 64544 ?        Sl   13:45   0:02 ceph-mon -i magna105 --public-addr 10.8.128.105


on magna105, compared to:

#root     25813  0.1  0.2 311092 76200 ?        Sl   Jul17  14:24 /usr/bin/ceph-mon -i magna107 --pid-file /var/run/ceph/mon.magna107.pid -c /etc/ceph/ceph.conf --cluster ceph -f

on magna107. Is this OK?

While upgrading the next node I will make sure that all dependency packages are upgraded to el7; otherwise I have to do it manually. Not sure why ceph-deploy install didn't do it even though it was pointing to the right repo.

Let me know if I have to check something else.

Comment 20 Kefu Chai 2015-07-27 07:55:55 UTC
> Will it be possible to read the old data with leveldb 1.12? Your comment 15 makes me think it is not possible to read leveldb 1.7's old data.

## leveldb-1.7
$ ls mon.b/store.db/
000005.sst  000006.log  CURRENT  LOCK  LOG  LOG.old  MANIFEST-000004

## leveldb-1.18 (1.18 in my case)
$ ls mon.b/store.db/
000005.ldb  000006.log  CURRENT  LOCK  LOG  LOG.old  MANIFEST-000004

Good question; please see https://github.com/google/leveldb/releases:

> New sstables will have the file extension .ldb. .sst files will continue to be recognized.

Comment 21 Kefu Chai 2015-07-27 13:31:13 UTC
Shylesh,

> but ps shows output like:
> #root     15460  1.1  0.1 301252 64544 ?        Sl   13:45   0:02 ceph-mon -i magna105 --public-addr 10.8.128.105
> 

That one was probably started by me. I killed it and restarted the mon using the sysv init script.
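
(With the stock sysvinit script, that is something like:)

# /etc/init.d/ceph start mon.magna105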


and the ps output now looks like the one from magna107 =)

root      5970  1.1  0.0 256604 21260 ?        Sl   09:29   0:00 /usr/bin/ceph-mon -i magna105 --pid-file /var/run/ceph/mon.magna105.pid -c /etc/ceph/ceph.conf --cluster ceph -f

Comment 23 Ken Dreyer (Red Hat) 2015-07-29 13:55:40 UTC
This is a doc change -> resetting component for that.

Comment 25 shylesh 2015-07-30 13:10:58 UTC
*** Bug 1247711 has been marked as a duplicate of this bug. ***

Comment 27 shylesh 2015-08-21 10:20:57 UTC
After upgrading leveldb, the monitor works fine. As per comment 22, a yum update would also bring the mon and osd dependencies to the latest version. Hence marking this as verified.

Comment 28 Anjana Suparna Sriram 2015-12-18 09:59:24 UTC
Fixed in the 1.3.1 release.