Bug 1253639 - ceph does not upgrade from ceph 1.2.3 to 1.3.0 in Ubuntu with the latest Debian packages
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Ceph Storage
Classification: Red Hat Storage
Component: Build
Version: 1.3.0
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: rc
Target Release: 1.3.2
Assignee: Alfredo Deza
QA Contact: ceph-qe-bugs
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2015-08-14 11:05 UTC by rakesh-gm
Modified: 2016-01-05 00:16 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-01-05 00:16:22 UTC
Embargoed:



Description rakesh-gm 2015-08-14 11:05:36 UTC
Description of problem:
Ceph does not upgrade from 1.2.3 to 1.3.0 on Ubuntu 14.04 using the Debian packages of Ceph-1.3-Ubuntu-14.04-20150813.t.0.


Documentation followed: https://gitlab.cee.redhat.com/ngoswami/red-hat-ceph-storage-installation-guide-ubuntu/blob/v1.3/red-hat-ceph-storage-upgrade.adoc

As usual, the following substitutions were made in the commands:

`enterprise` to `enterprise-testing`
`1.3` to `Ceph-1.3-Ubuntu-14.04-20150813.t.0`
and the username and password were filled in.
-----------



Actual results:
The upgrade was tested on the admin node and one mon node.
When the version is checked with `dpkg -l | grep ceph`, the output is:

ii  ceph                                0.80.8.1-1trusty                 amd64        distributed storage and file system
ii  ceph-common                         0.80.8.1-1trusty                 amd64        common utilities to mount and interact with a ceph storage cluster
ii  ceph-deploy                         1.5.27.1trusty                   all          Ceph-deploy is an easy to use configuration tool
rc  libapache2-mod-fastcgi              2.4.7~0910052141-ceph1           amd64        Apache 2 FastCGI module for long-running CGI scripts
rc  libcephfs1                          0.80.9-0ubuntu0.14.04.2          amd64        Ceph distributed file system client library
ii  python-ceph                         0.80.8.1-1trusty                 amd64        Python libraries for the Ceph distributed filesystem
 

The last lines of output after `sudo apt-get update` are:

[magna033][DEBUG ] Reading package lists...
[magna033][INFO  ] Running command: sudo env DEBIAN_FRONTEND=noninteractive DEBIAN_PRIORITY=critical apt-get --assume-yes -q --no-install-recommends install ceph
[magna033][DEBUG ] Reading package lists...
[magna033][DEBUG ] Building dependency tree...
[magna033][DEBUG ] Reading state information...
[magna033][DEBUG ] ceph is already the newest version.
[magna033][DEBUG ] 0 upgraded, 0 newly installed, 0 to remove and 6 not upgraded.
[magna033][INFO  ] Running command: sudo ceph --version
[magna033][DEBUG ] ceph version 0.80.8.1 (00b6fb027bb6fccda6716ef577464d30d96959cc)
 
Expected results:

The version should be 0.94.1.7 after the upgrade, but it is still 0.80.8.1, as seen above.
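
The check above can be scripted. This is only a minimal sketch, assuming the `ceph --version` output format shown in the log; the helper names `ceph_version` and `as_tuple` are made up for illustration and are not part of any Ceph tool.

```python
import re


def ceph_version(output):
    """Pull the dotted version number out of `ceph --version` output."""
    match = re.search(r"ceph version (\d+(?:\.\d+)*)", output)
    return match.group(1) if match else None


def as_tuple(version):
    """Turn '0.94.1.7' into (0, 94, 1, 7) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


# Output line taken from the ceph-deploy log above.
found = ceph_version("ceph version 0.80.8.1 (00b6fb027bb6fccda6716ef577464d30d96959cc)")
expected = "0.94.1.7"
if as_tuple(found) < as_tuple(expected):
    print("not upgraded: found %s, expected %s" % (found, expected))
```

Comparing integer tuples rather than strings avoids ordering mistakes such as "0.9" sorting after "0.80".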

Additional info:

Comment 2 Tanay Ganguly 2015-08-14 11:59:39 UTC
It is not getting upgraded for me either: the version stays at 0.94.1.4 instead of moving to 0.94.1.7.

ceph -v
ceph version 0.94.1.4 (944951aae2783417100ff0f1078a20bcdcdb605d)

ii  ceph                                 0.94.1.4-1trusty                      amd64        distributed storage and file system
ii  ceph-common                          0.94.1.4-1trusty                      amd64        common utilities to mount and interact with a ceph storage cluster
ii  ceph-deploy                          1.5.27.1trusty                        all          Ceph-deploy is an easy to use configuration tool
ii  ceph-fs-common                       0.94.2-1trusty                        amd64        common utilities to mount and interact with a ceph file system
ii  ceph-mds                             0.94.2-1trusty                        amd64        metadata server for the ceph distributed file system
rc  libapache2-mod-fastcgi               2.4.7~0910052141-ceph1                amd64        Apache 2 FastCGI module for long-running CGI scripts
ii  libcephfs1                           0.94.1.4-1trusty                      amd64        Ceph distributed file system client library
ii  python-cephfs                        0.94.1.4-1trusty                      amd64        Python libraries for the Ceph libcephfs library

Comment 3 rakesh-gm 2015-08-14 12:19:08 UTC
The logs and output from upgrading one of the mons are here: http://pastebin.test.redhat.com/305443

magna006 - admin 
magna089 - mon

Comment 4 rakesh-gm 2015-08-14 12:22:57 UTC
Note the last lines of http://pastebin.test.redhat.com/305443:

Reading package lists... Done                                                  
N: Ignoring file 'ceph.log' in directory '/etc/apt/sources.list.d/' as it has an invalid filename extension

I am guessing this will not cause any issue, but I am pointing it out just in case.

Comment 5 Alfredo Deza 2015-08-14 14:38:36 UTC
After reading the documentation that was followed to reproduce this issue and trying it on a new, separate node with ceph-deploy, I found that a step seems to have been omitted.

From the doc link provided (https://gitlab.cee.redhat.com/ngoswami/red-hat-ceph-storage-installation-guide-ubuntu/blob/v1.3/red-hat-ceph-storage-upgrade.adoc), the "Monitor Node" section says:

  Execute the following steps:

      Remove existing Ceph repositories in monitor node:

      cd /etc/apt/sources.list.d/
      sudo rm -rf calamari-minion.list ceph.list


However, after logging into magna089 I still see ceph.list there:

    ubuntu@magna089:/etc/apt/sources.list.d$ ls -l
    total 16
    -rw-r--r-- 1 root root  69 Aug 10 07:49 apt_mirror_sepia_ceph_com_blkin.list
    -rw-r--r-- 1 root root  55 Aug 14 08:04 ceph.list
    -rw-r--r-- 1 root root 101 Aug 10 07:49 gitbuilder_ceph_com_libapache_mod_fastcgi_deb_trusty_x86_64_basic_ref_master.list
    -rw------- 1 root root 145 Aug 14 08:02 Monitor.list

According to the docs, 'ceph.list' should *not* be there. The contents of that file show that it is pointing to the previous release:

    $ cat ceph.list
    deb http://10.8.128.6/static/ceph/0.80.8.1 trusty main
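
A check like this can be automated. The following is a rough sketch only: it assumes repo URLs embed the version as `/ceph/<version>`, as the ceph.list above does, and the function name `stale_sources` is made up for illustration.

```python
import re


def stale_sources(files, wanted_version):
    """Return (filename, line) pairs whose repo URL names another Ceph version.

    `files` maps a file name to its text. Only names ending in .list are
    considered, matching what apt reads from sources.list.d here.
    """
    # Assumption: the version appears in the URL path as /ceph/<version>.
    version_re = re.compile(r"/ceph/(\d+(?:\.\d+)+)")
    stale = []
    for name, text in sorted(files.items()):
        if not name.endswith(".list"):
            continue
        for line in text.splitlines():
            line = line.strip()
            if not line.startswith("deb "):
                continue
            match = version_re.search(line)
            if match and match.group(1) != wanted_version:
                stale.append((name, line))
    return stale


# File contents copied from the `cat ceph.list` output above.
files = {"ceph.list": "deb http://10.8.128.6/static/ceph/0.80.8.1 trusty main\n"}
for name, line in stale_sources(files, "0.94.1.7"):
    print("%s still points to an old release: %s" % (name, line))
```

On a real node the `files` mapping would be built by reading each file under /etc/apt/sources.list.d/; it is passed in as a dict here so the sketch has no side effects.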

Comment 6 rakesh-gm 2015-08-14 16:27:55 UTC
Thanks, Alfredo, for pointing that out. I had been removing ceph.list on the admin node and not on the monitor node. After your comment I carefully executed the steps again and observed the behavior.

After removing ceph.list, the directory looks like this:

ubuntu@magna089:/etc/apt/sources.list.d$ ls -l
total 12
-rw-r--r-- 1 root root  69 Aug 10 07:49 apt_mirror_sepia_ceph_com_blkin.list
-rw-r--r-- 1 root root 101 Aug 10 07:49 gitbuilder_ceph_com_libapache_mod_fastcgi_deb_trusty_x86_64_basic_ref_master.list
-rw------- 1 root root 145 Aug 14 11:58 Monitor.

Then I went ahead and executed the commands as per the doc.

Until I ran ceph-deploy with --no-adjust-repos, there was no ceph.list on the monitor node. As soon as I ran this command from the admin node, ceph.list was created on the monitor node.

-rw-r--r-- 1 root root  69 Aug 10 07:49 apt_mirror_sepia_ceph_com_blkin.list
-rw-r--r-- 1 root root  55 Aug 14 12:06 ceph.list
-rw-r--r-- 1 root root 101 Aug 10 07:49 gitbuilder_ceph_com_libapache_mod_fastcgi_deb_trusty_x86_64_basic_ref_master.list
-rw------- 1 root root 145 Aug 14 12:05 Monitor.list
ubuntu@magna089:/etc/apt/sources.list.d$ 

According to the doc, ceph.list should not be created when --no-adjust-repos is used.
And, as you pointed out above, ceph.list points to the old version again.

ubuntu@magna089:/etc/apt/sources.list.d$ sudo cat ceph.list 
deb http://10.8.128.6/static/ceph/0.80.8.1 trusty main

ubuntu@magna089:/etc/apt/sources.list.d$ dpkg -l | grep ceph
ii  ceph                                0.80.8.1-1trusty                 amd64        distributed storage and file system
ii  ceph-common                         0.80.8.1-1trusty                 amd64        common utilities to mount and interact with a ceph storage cluster
ii  libapache2-mod-fastcgi              2.4.7~0910052141-ceph1           amd64        Apache 2 FastCGI module for long-running CGI scripts
rc  libcephfs1                          0.80.8.1-1trusty                 amd64        Ceph distributed file system client library
ii  python-ceph                         0.80.8.1-1trusty  

Am I missing something here again, or is this really turning out to be a bug?

Comment 7 Alfredo Deza 2015-08-14 16:38:48 UTC
I think you now have a machine in an inconsistent state. You mention that you are running a command, but I am not sure which command it is. A full paste would be useful.

I would like to see how you are removing the ceph.list file, what command you run afterwards, and then the files in /etc/apt/sources.list.d/ and their contents.

On a clean system I am completely unable to replicate this behavior.

Comment 8 rakesh-gm 2015-08-14 17:07:34 UTC
Since you are not able to reproduce this on a clean system, let me try it again on a clean set of machines.

Comment 9 Alfredo Deza 2015-08-14 18:53:08 UTC
tganguly mentioned that magna094 still has the wrong version. I found that the Ubuntu version of Ceph had somehow been installed, so I helped get it to the right version (manually):

    apt-get remove --purge ceph
    apt-get remove --purge ceph-*
    apt-get clean
    apt-get autoremove
    
    ls /etc/apt/sources.list.d

This shows ceph.list, which points to the wrong place, so I removed it:

    sudo rm -f ceph.list

I found OSD.list there, and it has the right contents.

    apt-get update
    sudo apt-cache policy ceph
    ceph:
     Installed: (none)
     Candidate: 0.94.1.7-1trusty
    sudo env DEBIAN_FRONTEND=noninteractive DEBIAN_PRIORITY=critical apt-get --assume-yes -q --no-install-recommends install ceph
    ceph --version
    ceph version 0.94.1.7 (ede3a62fccbeb02d395769e7c75e637492be0ced)
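
Before reinstalling, the candidate version can be checked programmatically. This is a minimal sketch, assuming the `apt-cache policy` output layout shown above; `parse_policy` is a made-up helper name.

```python
def parse_policy(output):
    """Extract the Installed and Candidate fields from `apt-cache policy <pkg>` output."""
    fields = {}
    for line in output.splitlines():
        # Split on the first colon only, so version strings keep any later colons.
        key, sep, value = line.strip().partition(":")
        if sep and key in ("Installed", "Candidate"):
            fields[key] = value.strip()
    return fields


# Sample text copied from the session above.
sample = """ceph:
 Installed: (none)
 Candidate: 0.94.1.7-1trusty
"""
policy = parse_policy(sample)
if not policy.get("Candidate", "").startswith("0.94."):
    print("apt would not install Hammer; check /etc/apt/sources.list.d/")
```

If the candidate is still a 0.80.x build at this point, a stale entry in sources.list.d is the first thing to look for.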

Comment 11 rakesh-gm 2015-08-16 04:42:52 UTC
Note that this is with deb packages and not ISO. I have not tested ISO.

Comment 12 Nilamdyuti 2015-08-17 11:47:44 UTC
Hi Rakesh,

I think the issue is caused by the old cephdeploy.conf file in your working directory, i.e., the directory that holds the Ceph configuration file and from which you run the ceph-deploy commands. You are trying to upgrade RHCS v1.2.3 to RHCS v1.3 using online repos (this is new in v1.3; previously installation was ISO-based only). Even if you set the online repos correctly on the admin and monitor nodes, ceph-deploy still picks up the 'baseurl' from the local 'cephdeploy.conf' (it is created when you run ice_setup in an ISO-based installation). As a result, ceph-deploy creates ceph.list on the monitor node with the same baseurl as on the admin node, i.e., one pointing to Ceph version 0.80.8.1 (Firefly).

Please remove the old 'cephdeploy.conf' file from your working directory on the admin node and start again from the beginning. It should then install the 0.94 (Hammer) version. I have also added this information to the doc. See: https://gitlab.cee.redhat.com/ngoswami/red-hat-ceph-storage-installation-guide-ubuntu/blob/v1.3/red-hat-ceph-storage-upgrade.adoc

Note that removing the cephdeploy.conf file is only required when you are upgrading from an ISO-based 1.2.3 install to an online-repo-based 1.3 install. For an ISO-to-ISO upgrade, the ice_setup program automatically updates the old cephdeploy.conf file with the new information.
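
To see which baseurl a leftover cephdeploy.conf would hand to ceph-deploy, the file can be read as an INI file. This is a rough sketch: the `[ceph]` section name and the sample contents are hypothetical (the URL is borrowed from the ceph.list shown in comment 5), and `repo_baseurls` is a made-up helper.

```python
import configparser


def repo_baseurls(conf_text):
    """Return {section: baseurl} for every section that defines a baseurl."""
    # Interpolation is disabled so '%' characters in URLs are read literally.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(conf_text)
    return {
        section: parser.get(section, "baseurl")
        for section in parser.sections()
        if parser.has_option(section, "baseurl")
    }


# Hypothetical contents of an old cephdeploy.conf left by an ISO-based install.
sample = """
[ceph-deploy-global]

[ceph]
baseurl = http://10.8.128.6/static/ceph/0.80.8.1
default = true
"""
for section, url in repo_baseurls(sample).items():
    print("[%s] baseurl = %s" % (section, url))
```

Any baseurl that still names the Firefly release means the file should be removed before rerunning ceph-deploy for this kind of upgrade.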

Let me know if this works for you.

Comment 13 Alfredo Deza 2015-08-17 11:51:30 UTC
I logged into magna033 and saw the correct Ceph version.

ubuntu@magna033:/etc/apt/sources.list.d$ ceph --version
ceph version 0.94.1.7 (ede3a62fccbeb02d395769e7c75e637492be0ced)

Comment 14 Harish NV Rao 2015-08-17 13:48:51 UTC
Alfredo,

From comment 12: 

>>Please remove the old 'cephdeploy.conf' file from your working directory in admin node

Can this issue be fixed via code?

Regards,
Harish

Comment 15 rakesh-gm 2015-08-17 14:24:15 UTC
After a lot of research we found that .cephdeploy.conf has to be removed; Nilam confirmed this too and updated the docs. Once we had the resolution, the upgrade went through.

The pastebin is here: http://pastebin.test.redhat.com/305753.

And Alfredo, by the time you checked the machines, they had already been upgraded!

Comment 16 Alfredo Deza 2015-08-17 14:29:49 UTC
(In reply to Harish NV Rao from comment #14)
> Alfredo,
> 
> From comment 12: 
> 
> >>Please remove the old 'cephdeploy.conf' file from your working directory in admin node
> 
> Can this issue be fixed via code?
> 
> Regards,
> Harish

No, because having one in the current working directory implies that it holds the configuration you want to use. If ceph-deploy doesn't find one there, it will use the one in $HOME/.cephdeploy.conf, and if that doesn't exist, it will create an empty one for the user.
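
The lookup order can be illustrated as follows. This is only a sketch of the behavior described here, not ceph-deploy's actual code; `pick_config` is a made-up name, and it assumes the empty file is created at the $HOME location.

```python
import os


def pick_config(cwd, home, exists=os.path.exists):
    """Return (path, must_create) following the lookup order described above."""
    # 1. A file in the working directory wins whenever it is present.
    local = os.path.join(cwd, "cephdeploy.conf")
    if exists(local):
        return local, False
    # 2. Otherwise fall back to the per-user file, creating it if missing
    #    (assumption: the empty file is created at this path).
    user = os.path.join(home, ".cephdeploy.conf")
    return user, not exists(user)


path, must_create = pick_config(os.getcwd(), os.path.expanduser("~"))
print(path, "(would be created)" if must_create else "(already exists)")
```

Because the working-directory file always takes precedence, a stale copy there silently overrides anything configured elsewhere, which is why it has to be removed by hand.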

