Description of problem:
Ceph does not upgrade from 1.2.3 to 1.3.0 on Ubuntu 14.04 using the Debian packages of Ceph-1.3-Ubuntu-14.04-20150813.t.0.

The document followed is:
https://gitlab.cee.redhat.com/ngoswami/red-hat-ceph-storage-installation-guide-ubuntu/blob/v1.3/red-hat-ceph-storage-upgrade.adoc

As usual, the following were replaced in the commands:
    `enterprise` with `enterprise-testing`
    `1.3` with `Ceph-1.3-Ubuntu-14.04-20150813.t.0`
along with the username and password.

-----------
Actual results:
The upgrade was tested on the admin node and one mon node. Checking the version with `dpkg -l | grep ceph` gives this output:

    ii  ceph                    0.80.8.1-1trusty         amd64  distributed storage and file system
    ii  ceph-common             0.80.8.1-1trusty         amd64  common utilities to mount and interact with a ceph storage cluster
    ii  ceph-deploy             1.5.27.1trusty           all    Ceph-deploy is an easy to use configuration tool
    rc  libapache2-mod-fastcgi  2.4.7~0910052141-ceph1   amd64  Apache 2 FastCGI module for long-running CGI scripts
    rc  libcephfs1              0.80.9-0ubuntu0.14.04.2  amd64  Ceph distributed file system client library
    ii  python-ceph             0.80.8.1-1trusty         amd64  Python libraries for the Ceph distributed filesystem

The last lines after `sudo apt-get update` are:

    [magna033][DEBUG ] Reading package lists...
    [magna033][INFO ] Running command: sudo env DEBIAN_FRONTEND=noninteractive DEBIAN_PRIORITY=critical apt-get --assume-yes -q --no-install-recommends install ceph
    [magna033][DEBUG ] Reading package lists...
    [magna033][DEBUG ] Building dependency tree...
    [magna033][DEBUG ] Reading state information...
    [magna033][DEBUG ] ceph is already the newest version.
    [magna033][DEBUG ] 0 upgraded, 0 newly installed, 0 to remove and 6 not upgraded.
    [magna033][INFO ] Running command: sudo ceph --version
    [magna033][DEBUG ] ceph version 0.80.8.1 (00b6fb027bb6fccda6716ef577464d30d96959cc)

Expected results:
The version should be 0.94.1.7 after the upgrade, but it is still 0.80.8.1, as seen above.

Additional info:
It is not getting upgraded for me either, in my case from 0.94.1.4 to 0.94.1.7.

    ceph -v
    ceph version 0.94.1.4 (944951aae2783417100ff0f1078a20bcdcdb605d)

    ii  ceph                    0.94.1.4-1trusty        amd64  distributed storage and file system
    ii  ceph-common             0.94.1.4-1trusty        amd64  common utilities to mount and interact with a ceph storage cluster
    ii  ceph-deploy             1.5.27.1trusty          all    Ceph-deploy is an easy to use configuration tool
    ii  ceph-fs-common          0.94.2-1trusty          amd64  common utilities to mount and interact with a ceph file system
    ii  ceph-mds                0.94.2-1trusty          amd64  metadata server for the ceph distributed file system
    rc  libapache2-mod-fastcgi  2.4.7~0910052141-ceph1  amd64  Apache 2 FastCGI module for long-running CGI scripts
    ii  libcephfs1              0.94.1.4-1trusty        amd64  Ceph distributed file system client library
    ii  python-cephfs           0.94.1.4-1trusty        amd64  Python libraries for the Ceph libcephfs library
The logs and output from upgrading one of the mons are here: http://pastebin.test.redhat.com/305443

    magna006 - admin
    magna089 - mon
Note the last lines of http://pastebin.test.redhat.com/305443:

    Reading package lists... Done
    N: Ignoring file 'ceph.log' in directory '/etc/apt/sources.list.d/' as it has an invalid filename extension

I am guessing this will not cause any issue, but I am pointing it out just in case.
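For context on that notice, here is a minimal sketch of the filename rule that sources.list(5) describes for files in /etc/apt/sources.list.d/: the name must end in .list and may contain only letters, digits, underscore, hyphen and period. The helper names `apt_reads` and `ignored_files` are hypothetical, and the newer `.sources` extension is deliberately not modelled.

```python
import re

# Assumption: this mirrors the documented rule for one-line-style source
# files only; it is not APT's actual implementation.
_LIST_NAME = re.compile(r"^[A-Za-z0-9_.-]+\.list$")


def apt_reads(filename):
    """Return True if APT would consider this file name a sources file."""
    return bool(_LIST_NAME.match(filename))


def ignored_files(filenames):
    """Return the names APT would skip with an 'Ignoring file' notice."""
    return [name for name in filenames if not apt_reads(name)]


if __name__ == "__main__":
    print(ignored_files(["ceph.log", "ceph.list", "Monitor.list"]))
```

Under this rule a stray ceph.log is simply skipped, which is consistent with the guess that the notice is harmless.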
After reading the documentation that was followed to reproduce this issue and trying it on a new, separate node with ceph-deploy, I found that there seems to be a step that was omitted.

From the doc link provided (https://gitlab.cee.redhat.com/ngoswami/red-hat-ceph-storage-installation-guide-ubuntu/blob/v1.3/red-hat-ceph-storage-upgrade.adoc), the "Monitor Node" section says:

    Execute the following steps:
    Remove existing Ceph repositories in monitor node:

    cd /etc/apt/sources.list.d/
    sudo rm -rf calamari-minion.list ceph.list

However, after logging into magna089 I still see ceph.list there:

    ubuntu@magna089:/etc/apt/sources.list.d$ ls -l
    total 16
    -rw-r--r-- 1 root root  69 Aug 10 07:49 apt_mirror_sepia_ceph_com_blkin.list
    -rw-r--r-- 1 root root  55 Aug 14 08:04 ceph.list
    -rw-r--r-- 1 root root 101 Aug 10 07:49 gitbuilder_ceph_com_libapache_mod_fastcgi_deb_trusty_x86_64_basic_ref_master.list
    -rw------- 1 root root 145 Aug 14 08:02 Monitor.list

According to the docs, 'ceph.list' should *not* be there. The contents of that file show that it is pointing to the previous release:

    $ cat ceph.list
    deb http://10.8.128.6/static/ceph/0.80.8.1 trusty main
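The check done by hand above (read each .list file and see whether it still points at the old release) can be sketched as follows. This is only an illustration: `deb_entries` and `stale_entries` are hypothetical helpers, and matching on the release string in the URI is an assumption that happens to fit the repository layout shown in ceph.list.

```python
def deb_entries(text):
    """Return (uri, suite, components) for each active 'deb' line."""
    entries = []
    for raw in text.splitlines():
        # Drop trailing comments, then split on whitespace.
        fields = raw.split("#", 1)[0].split()
        if not fields or fields[0] != "deb":
            continue
        rest = fields[1:]
        if rest and rest[0].startswith("["):
            # Skip an optional "[ option=value ... ]" group.
            while rest and not rest[0].endswith("]"):
                rest = rest[1:]
            rest = rest[1:]
        if len(rest) >= 2:
            entries.append((rest[0], rest[1], rest[2:]))
    return entries


def stale_entries(sources, old_release):
    """sources maps file name to file contents.

    Return (file name, uri) pairs whose URI mentions old_release.
    """
    hits = []
    for name in sorted(sources):
        for uri, _suite, _components in deb_entries(sources[name]):
            if old_release in uri:
                hits.append((name, uri))
    return hits


if __name__ == "__main__":
    # Contents of ceph.list as shown on magna089.
    sources = {
        "ceph.list": "deb http://10.8.128.6/static/ceph/0.80.8.1 trusty main\n",
    }
    for name, uri in stale_entries(sources, "0.80.8.1"):
        print(name, "still points at", uri)
```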
Thanks, Alfredo, for pointing it out. I was cleaning ceph.list on the admin node and not on the monitor node. After your comment I carefully executed the steps and observed the behavior.

The following is the output after cleaning ceph.list:

    ubuntu@magna089:/etc/apt/sources.list.d$ ls -l
    total 12
    -rw-r--r-- 1 root root  69 Aug 10 07:49 apt_mirror_sepia_ceph_com_blkin.list
    -rw-r--r-- 1 root root 101 Aug 10 07:49 gitbuilder_ceph_com_libapache_mod_fastcgi_deb_trusty_x86_64_basic_ref_master.list
    -rw------- 1 root root 145 Aug 14 11:58 Monitor.

Then I went ahead and executed the commands as per the doc. Until I ran ceph-deploy with --no-adjust-repos, there was no ceph.list on the monitor node; as soon as I ran this command from the admin node, ceph.list was created on the monitor node.

    -rw-r--r-- 1 root root  69 Aug 10 07:49 apt_mirror_sepia_ceph_com_blkin.list
    -rw-r--r-- 1 root root  55 Aug 14 12:06 ceph.list
    -rw-r--r-- 1 root root 101 Aug 10 07:49 gitbuilder_ceph_com_libapache_mod_fastcgi_deb_trusty_x86_64_basic_ref_master.list
    -rw------- 1 root root 145 Aug 14 12:05 Monitor.list
    ubuntu@magna089:/etc/apt/sources.list.d$

As per the doc, this ceph.list should not be created with --no-adjust-repos. And, as you pointed out above, ceph.list has the old version again:

    ubuntu@magna089:/etc/apt/sources.list.d$ sudo cat ceph.list
    deb http://10.8.128.6/static/ceph/0.80.8.1 trusty main
    ubuntu@magna089:/etc/apt/sources.list.d$ dpkg -l | grep ceph
    ii  ceph                    0.80.8.1-1trusty        amd64  distributed storage and file system
    ii  ceph-common             0.80.8.1-1trusty        amd64  common utilities to mount and interact with a ceph storage cluster
    ii  libapache2-mod-fastcgi  2.4.7~0910052141-ceph1  amd64  Apache 2 FastCGI module for long-running CGI scripts
    rc  libcephfs1              0.80.8.1-1trusty        amd64  Ceph distributed file system client library
    ii  python-ceph             0.80.8.1-1trusty

Am I missing something here again, or is this really turning out to be a bug?
I think you now have a machine in an inconsistent state. You mention you are running a command, but I am not sure which command it is. A full paste would be useful. I would like to see how you are removing the ceph.list file, what command you are running afterwards, and then the files in /etc/apt/sources.list.d/ and their contents.

On a clean system I am completely unable to replicate this behavior.
Since you are not able to reproduce this on a clean system, let me try it again on a clean set of machines.
tganguly mentioned that magna094 still has the wrong version. I found that somehow the Ubuntu version of Ceph was installed, so I helped get it to the right version (manually):

    apt-get remove --purge ceph
    apt-get remove --purge ceph-*
    apt-get clean
    apt-get autoremove

ls /etc/apt/sources.list.d shows that ceph.list points to the wrong place:

    sudo rm -f ceph.list

I find OSD.list there, which has the right contents.

    apt-get update
    sudo apt-cache policy ceph
    ceph:
      Installed: (none)
      Candidate: 0.94.1.7-1trusty
    sudo env DEBIAN_FRONTEND=noninteractive DEBIAN_PRIORITY=critical apt-get --assume-yes -q --no-install-recommends install ceph
    ceph --version
    ceph version 0.94.1.7 (ede3a62fccbeb02d395769e7c75e637492be0ced)
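The `apt-cache policy` step is the useful checkpoint in that sequence: it shows which version APT would install before anything is installed. A minimal sketch of reading that output programmatically follows; `parse_policy` and `upgrade_pending` are hypothetical names, and only the `Installed:` and `Candidate:` lines shown above are parsed.

```python
def parse_policy(output):
    """Extract the Installed and Candidate versions from apt-cache policy text."""
    info = {}
    for line in output.splitlines():
        key, sep, value = line.strip().partition(":")
        if sep and key in ("Installed", "Candidate"):
            info[key.lower()] = value.strip()
    return info


def upgrade_pending(info):
    """True when the installed version differs from the candidate.

    Simplification: the versions are compared as plain strings, not with
    Debian version ordering.
    """
    return info.get("installed") != info.get("candidate")


if __name__ == "__main__":
    sample = (
        "ceph:\n"
        "  Installed: (none)\n"
        "  Candidate: 0.94.1.7-1trusty\n"
    )
    info = parse_policy(sample)
    print(info, upgrade_pending(info))
```

If the candidate still showed a 0.80 version at this point, that would suggest a stale repository entry is still being read.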
Note that this is with deb packages and not ISO. I have not tested ISO.
Hi Rakesh,

I think the issue is caused by the old cephdeploy.conf file in your working directory, i.e., the directory that holds the Ceph configuration file and from which you run the ceph-deploy commands.

You are trying to upgrade RHCS v1.2.3 to RHCS v1.3 using online repos (this is new for v1.3; previously it was only ISO based). Even if you set the online repos correctly on the admin and monitor nodes, ceph-deploy still picks up the 'baseurl' in the local 'cephdeploy.conf' (it gets created when you run ice_setup in an ISO based installation). Because of this, ceph-deploy also creates ceph.list on the monitor node with the same baseurl as on the admin node, i.e., with the contents for Ceph version 0.80.8.1 (Firefly).

Please remove the old 'cephdeploy.conf' file from your working directory on the admin node and try again from the beginning. It should install the 0.94 (Hammer) version.

I have also added this info to the doc. See: https://gitlab.cee.redhat.com/ngoswami/red-hat-ceph-storage-installation-guide-ubuntu/blob/v1.3/red-hat-ceph-storage-upgrade.adoc

Note that this step to remove the cephdeploy.conf file is only required when you are upgrading from an ISO based 1.2.3 install to an online repo based 1.3 install. For an ISO to ISO upgrade, the ice_setup program will automatically update the old cephdeploy.conf file with the new information.

Let me know if this works for you.
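To see which 'baseurl' a leftover cephdeploy.conf carries before deleting it, something like the following sketch could be used. It assumes the file is INI-style with repository sections that each hold a `baseurl` key; the section name `[ceph]` and the sample value are illustrative only, not taken from a real file, and `repo_baseurls` is a hypothetical helper.

```python
import configparser


def repo_baseurls(conf_text):
    """Return {section: baseurl} for every section that defines a baseurl."""
    # interpolation=None so that '%' characters in URLs are left untouched.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(conf_text)
    return {
        section: parser[section]["baseurl"]
        for section in parser.sections()
        if parser.has_option(section, "baseurl")
    }


if __name__ == "__main__":
    # Hypothetical contents; the URL reuses the one seen in ceph.list.
    sample = (
        "[ceph-deploy-global]\n"
        "\n"
        "[ceph]\n"
        "baseurl = http://10.8.128.6/static/ceph/0.80.8.1\n"
    )
    for section, url in repo_baseurls(sample).items():
        print(section, "->", url)
```

A baseurl that still names the 0.80.8.1 repository would explain why ceph.list is recreated with the old release.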
I logged into magna033 and saw the correct Ceph version.

    ubuntu@magna033:/etc/apt/sources.list.d$ ceph --version
    ceph version 0.94.1.7 (ede3a62fccbeb02d395769e7c75e637492be0ced)
Alfredo,

From comment 12:

>>Please remove the old 'cephdeploy.conf' file from your working directory in admin node

Can this issue be fixed via code?

Regards,
Harish
After a lot of research we found out that .cephdeploy.conf has to be removed; Nilam confirmed this too and updated the docs. Once we had the resolution, the upgrade went through. The pastebin is here: http://pastebin.test.redhat.com/305753

And Alfredo, by the time you checked the machines, they had already been upgraded.
(In reply to Harish NV Rao from comment #14)
> Alfredo,
>
> From comment 12:
>
> >>Please remove the old 'cephdeploy.conf' file from your working directory in admin node
>
> Can this issue be fixed via code?
>
> Regards,
> Harish

No, because having one in the current working directory implies that it holds configuration you want. If ceph-deploy doesn't find one there, it will use the one in $HOME/.cephdeploy.conf, and if that doesn't exist, it will create an empty one for the user.
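The lookup order described above can be sketched as follows. This is an illustration of the described behavior, not ceph-deploy's actual code: `locate_conf` is a hypothetical helper, the `exists` parameter is there only to make the function easy to exercise, and the final "create an empty file" step is represented by returning None.

```python
import os


def locate_conf(cwd, home, exists=os.path.isfile):
    """Return the configuration file that would be used, or None.

    Order, as described in the comment above:
      1. cephdeploy.conf in the current working directory
      2. .cephdeploy.conf in the user's home directory
    None stands for the case where an empty file would be created.
    """
    candidates = [
        os.path.join(cwd, "cephdeploy.conf"),
        os.path.join(home, ".cephdeploy.conf"),
    ]
    for path in candidates:
        if exists(path):
            return path
    return None


if __name__ == "__main__":
    print(locate_conf(os.getcwd(), os.path.expanduser("~")))
```

Because the working-directory file wins whenever it exists, a stale copy left behind by an ISO based install shadows everything else, which is why removing it by hand was the fix here.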