I opened an upstream PR to address this, https://github.com/ceph/ceph-ansible/pull/1070
The PR has been merged and the fix pushed downstream.
devel_ack set in error
Created attachment 1217481 [details] ansible playbook log
[gmeno@localhost bcwc_pcie]$ for x in 061 063 067 070 077 080 086; do echo "magna$x"; ssh -n "shadow_man@magna$x.ceph.redhat.com" "ceph --version"; done
magna061
ceph version 10.2.3-5redhat1xenial (18d1172bab519c6230c96389a3e52f375741b986)
magna063
ceph version 10.2.3-5redhat1xenial (18d1172bab519c6230c96389a3e52f375741b986)
magna067
ceph version 10.2.3-5redhat1xenial (18d1172bab519c6230c96389a3e52f375741b986)
magna070
ceph version 10.2.3-5redhat1xenial (18d1172bab519c6230c96389a3e52f375741b986)
magna077
ceph version 10.2.3-5redhat1xenial (18d1172bab519c6230c96389a3e52f375741b986)
magna080
ceph version 10.2.3-5redhat1xenial (18d1172bab519c6230c96389a3e52f375741b986)
magna086
ceph version 10.2.2-29redhat1xenial (8562bd783c1cd38a2bec7a423846482cbf5523e9)

[gmeno@localhost bcwc_pcie]$ ssh shadow_man.redhat.com
Welcome to Ubuntu 16.04.1 LTS (GNU/Linux 4.4.0-45-generic x86_64)
 * Documentation:  https://help.shadow_man.com
 * Management:     https://landscape.canonical.com
 * Support:        https://shadow_man.com/advantage
Last login: Fri Nov 4 21:22:08 2016 from 10.3.112.43

shadow_man@magna061:~$ sudo ceph --cluster slave -s
    cluster c6f3e44b-94a1-475d-9b21-cf75800884a2
     health HEALTH_WARN
            clock skew detected on mon.magna063, mon.magna067
            pool us-west.rgw.buckets.data has many more objects per pg than average (too few pgs?)
            noout,noscrub,nodeep-scrub,sortbitwise flag(s) set
            Monitor clock skew detected
     monmap e1: 3 mons at {magna061=10.8.128.61:6789/0,magna063=10.8.128.63:6789/0,magna067=10.8.128.67:6789/0}
            election epoch 18, quorum 0,1,2 magna061,magna063,magna067
      fsmap e9: 1/1/1 up {0=magna086=up:active}
     osdmap e74: 9 osds: 9 up, 9 in
            flags noout,noscrub,nodeep-scrub,sortbitwise
      pgmap v94136: 544 pgs, 16 pools, 251 GB data, 65190 objects
            756 GB used, 7535 GB / 8291 GB avail
                 544 active+clean
  client io 40594 B/s rd, 80519 kB/s wr, 39 op/s rd, 213 op/s wr
shadow_man@magna061:~$
Since I can't proceed on this using build ceph-ansible-1.0.5-41.el7scon, moving back to Assigned.
There is nothing more to do here. This should be testable once we've resolved the regression introduced in https://bugzilla.redhat.com/show_bug.cgi?id=1392238
Moving back to assigned because of https://bugzilla.redhat.com/show_bug.cgi?id=1393582
Created attachment 1219711 [details] ansible playbook log
Hi Andrew,

I am seeing it again on the next OSD node, i.e. magna077:

TASK: [restart ceph osds (systemd)] *******************************************
<magna077> ESTABLISH CONNECTION FOR USER: root
<magna077> REMOTE_MODULE service name= state=restarted
<magna077> EXEC ssh -C -tt -vvv -o ControlMaster=auto -o ControlPersist=60s -o ControlPath="/root/.ansible/cp/ansible-ssh-%h-%p-%r" -o KbdInteractiveAuthentication=no -o PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey -o PasswordAuthentication=no -o ConnectTimeout=10 magna077 /bin/sh -c 'mkdir -p $HOME/.ansible/tmp/ansible-tmp-1478871836.53-69755247935010 && echo $HOME/.ansible/tmp/ansible-tmp-1478871836.53-69755247935010'
<magna077> PUT /tmp/tmp6Ao9Hu TO /root/.ansible/tmp/ansible-tmp-1478871836.53-69755247935010/service
<magna077> EXEC ssh -C -tt -vvv -o ControlMaster=auto -o ControlPersist=60s -o ControlPath="/root/.ansible/cp/ansible-ssh-%h-%p-%r" -o KbdInteractiveAuthentication=no -o PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey -o PasswordAuthentication=no -o ConnectTimeout=10 magna077 /bin/sh -c 'sudo -k && sudo -H -S -p "[sudo via ansible, key=indjejevdwpvjcmkuysrwlzlquoliswh] password: " -u root /bin/sh -c '"'"'echo BECOME-SUCCESS-indjejevdwpvjcmkuysrwlzlquoliswh; LANG=en_US.UTF-8 LC_CTYPE=en_US.UTF-8 /usr/bin/python /root/.ansible/tmp/ansible-tmp-1478871836.53-69755247935010/service; rm -rf /root/.ansible/tmp/ansible-tmp-1478871836.53-69755247935010/ >/dev/null 2>&1'"'"''
changed: [magna077] => (item=1) => {"changed": true, "enabled": true, "item": "1", "name": "ceph-osd@1", "state": "started"}
<magna077> ESTABLISH CONNECTION FOR USER: root
<magna077> REMOTE_MODULE service name= state=restarted
<magna077> EXEC ssh -C -tt -vvv -o ControlMaster=auto -o ControlPersist=60s -o ControlPath="/root/.ansible/cp/ansible-ssh-%h-%p-%r" -o KbdInteractiveAuthentication=no -o PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey -o PasswordAuthentication=no -o ConnectTimeout=10 magna077 /bin/sh -c 'mkdir -p $HOME/.ansible/tmp/ansible-tmp-1478871837.21-250822556852926 && echo $HOME/.ansible/tmp/ansible-tmp-1478871837.21-250822556852926'
<magna077> PUT /tmp/tmphoXZAt TO /root/.ansible/tmp/ansible-tmp-1478871837.21-250822556852926/service
<magna077> EXEC ssh -C -tt -vvv -o ControlMaster=auto -o ControlPersist=60s -o ControlPath="/root/.ansible/cp/ansible-ssh-%h-%p-%r" -o KbdInteractiveAuthentication=no -o PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey -o PasswordAuthentication=no -o ConnectTimeout=10 magna077 /bin/sh -c 'sudo -k && sudo -H -S -p "[sudo via ansible, key=jynmtvrvfeqiwwcilwbiagszvfcocqrp] password: " -u root /bin/sh -c '"'"'echo BECOME-SUCCESS-jynmtvrvfeqiwwcilwbiagszvfcocqrp; LANG=en_US.UTF-8 LC_CTYPE=en_US.UTF-8 /usr/bin/python /root/.ansible/tmp/ansible-tmp-1478871837.21-250822556852926/service; rm -rf /root/.ansible/tmp/ansible-tmp-1478871837.21-250822556852926/ >/dev/null 2>&1'"'"''
changed: [magna077] => (item=3) => {"changed": true, "enabled": true, "item": "3", "name": "ceph-osd@3", "state": "started"}
<magna077> ESTABLISH CONNECTION FOR USER: root
<magna077> REMOTE_MODULE service name= state=restarted
<magna077> EXEC ssh -C -tt -vvv -o ControlMaster=auto -o ControlPersist=60s -o ControlPath="/root/.ansible/cp/ansible-ssh-%h-%p-%r" -o KbdInteractiveAuthentication=no -o PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey -o PasswordAuthentication=no -o ConnectTimeout=10 magna077 /bin/sh -c 'mkdir -p $HOME/.ansible/tmp/ansible-tmp-1478871839.89-147190929468924 && echo $HOME/.ansible/tmp/ansible-tmp-1478871839.89-147190929468924'
<magna077> PUT /tmp/tmp8DD3FE TO /root/.ansible/tmp/ansible-tmp-1478871839.89-147190929468924/service
<magna077> EXEC ssh -C -tt -vvv -o ControlMaster=auto -o ControlPersist=60s -o ControlPath="/root/.ansible/cp/ansible-ssh-%h-%p-%r" -o KbdInteractiveAuthentication=no -o PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey -o PasswordAuthentication=no -o ConnectTimeout=10 magna077 /bin/sh -c 'sudo -k && sudo -H -S -p "[sudo via ansible, key=wpctwcohsaqarinahtbpdkxccnlcqbia] password: " -u root /bin/sh -c '"'"'echo BECOME-SUCCESS-wpctwcohsaqarinahtbpdkxccnlcqbia; LANG=en_US.UTF-8 LC_CTYPE=en_US.UTF-8 /usr/bin/python /root/.ansible/tmp/ansible-tmp-1478871839.89-147190929468924/service; rm -rf /root/.ansible/tmp/ansible-tmp-1478871839.89-147190929468924/ >/dev/null 2>&1'"'"''
changed: [magna077] => (item=8) => {"changed": true, "enabled": true, "item": "8", "name": "ceph-osd@8", "state": "started"}

Result from run 9 is: {u'cmd': u'test "$(ceph pg stat --cluster slave | sed \'s/^.*pgs://;s/active+clean.*//;s/ //\')" -eq "$(ceph pg stat --cluster slave | sed \'s/pgs.*//;s/^.*://;s/ //\')" && ceph health --cluster slave | egrep -sq "HEALTH_OK|HEALTH_WARN"', u'end': u'2016-11-11 13:45:39.346054', u'stdout': u'', u'changed': True, 'attempts': 9, u'start': u'2016-11-11 13:45:38.920396', u'delta': u'0:00:00.425658', u'stderr': u'', u'rc': 1, u'warnings': []}
<magna061> REMOTE_MODULE command test "$(ceph pg stat --cluster slave | sed 's/^.*pgs://;s/active+clean.*//;s/ //')" -eq "$(ceph pg stat --cluster slave | sed 's/pgs.*//;s/^.*://;s/ //')" && ceph health --cluster slave | egrep -sq "HEALTH_OK|HEALTH_WARN" #USE_SHELL
<magna061> EXEC ssh -C -tt -vvv -o ControlMaster=auto -o ControlPersist=60s -o ControlPath="/root/.ansible/cp/ansible-ssh-%h-%p-%r" -o KbdInteractiveAuthentication=no -o PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey -o PasswordAuthentication=no -o ConnectTimeout=10 magna061 /bin/sh -c 'mkdir -p $HOME/.ansible/tmp/ansible-tmp-1478871949.37-84447399544665 && echo $HOME/.ansible/tmp/ansible-tmp-1478871949.37-84447399544665'
<magna061> PUT /tmp/tmpWUKyM4 TO /root/.ansible/tmp/ansible-tmp-1478871949.37-84447399544665/command
<magna061> EXEC ssh -C -tt -vvv -o ControlMaster=auto -o ControlPersist=60s -o ControlPath="/root/.ansible/cp/ansible-ssh-%h-%p-%r" -o KbdInteractiveAuthentication=no -o PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey -o PasswordAuthentication=no -o ConnectTimeout=10 magna061 /bin/sh -c 'sudo -k && sudo -H -S -p "[sudo via ansible, key=fjbvyitpqgbhvwooeundgdiolzyyfwce] password: " -u root /bin/sh -c '"'"'echo BECOME-SUCCESS-fjbvyitpqgbhvwooeundgdiolzyyfwce; LANG=en_US.UTF-8 LC_CTYPE=en_US.UTF-8 /usr/bin/python /root/.ansible/tmp/ansible-tmp-1478871949.37-84447399544665/command; rm -rf /root/.ansible/tmp/ansible-tmp-1478871949.37-84447399544665/ >/dev/null 2>&1'"'"''
Result from run 10 is: {u'cmd': u'test "$(ceph pg stat --cluster slave | sed \'s/^.*pgs://;s/active+clean.*//;s/ //\')" -eq "$(ceph pg stat --cluster slave | sed \'s/pgs.*//;s/^.*://;s/ //\')" && ceph health --cluster slave | egrep -sq "HEALTH_OK|HEALTH_WARN"', u'end': u'2016-11-11 13:45:49.961370', u'stdout': u'', u'changed': True, 'attempts': 10, u'start': u'2016-11-11 13:45:49.462353', u'delta': u'0:00:00.499017', u'stderr': u'', u'rc': 1, u'warnings': []}
failed: [magna077 -> magna061] => {"attempts": 10, "changed": true, "cmd": "test \"$(ceph pg stat --cluster slave | sed 's/^.*pgs://;s/active+clean.*//;s/ //')\" -eq \"$(ceph pg stat --cluster slave | sed 's/pgs.*//;s/^.*://;s/ //')\" && ceph health --cluster slave | egrep -sq \"HEALTH_OK|HEALTH_WARN\"", "delta": "0:00:00.499017", "end": "2016-11-11 13:45:49.961370", "rc": 1, "start": "2016-11-11 13:45:49.462353", "warnings": []}
FATAL: all hosts have already failed -- aborting

I have the entire ansible log with -vvvv if you need it.
Hi Andrew,

This is my setup exactly:

Ubuntu 16.04
Master cluster: 3 MON, 3 OSD, 1 RGW + MDS + RBD mirror
Slave cluster: 3 MON, 3 OSD, 1 RGW + MDS + RBD mirror
RBD mirror configured. RGW multisite configured, with multipart IO in progress when the rolling_update is run.

On magna006 I am using the default /usr/share/ceph-ansible directory, with all the standard group_vars options.

Also, we tried the rolling_update on a normal 3 MON, 3 OSD cluster on RHEL, and we did not see this issue. It may be specific to Ubuntu or a multisite setup.

Thanks,
Tejas
I had purged the older cluster fully before creating the slave cluster, so I don't think there is any trace of the old cluster. But apparently I forgot to clean the log directory. Let me run the same rolling_update on the master cluster just to confirm.
For me these are two different issues: the description and comment #21 describe different problems. The initial description reports a problem with the RGW upgrade, which happens after all of the OSD upgrades, whereas the problem mentioned in comment #21 occurs during the OSD upgrade itself. So in my view these are two separate issues, and it would be better to track them separately.
(In reply to Rachana Patel from comment #31)
> For me these are two different issues: the description and comment #21
> describe different problems. The initial description reports a problem
> with the RGW upgrade, which happens after all of the OSD upgrades,
> whereas the problem mentioned in comment #21 occurs during the OSD
> upgrade itself. So in my view these are two separate issues, and it
> would be better to track them separately.

Please refer to https://bugzilla.redhat.com/show_bug.cgi?id=1380195 (Bug 1380195 - [ceph-ansible] : rolling update is failing if cluster takes time to achieve OK state after OSD upgrade), which looks like a similar issue.
(In reply to Rachana Patel from comment #33)
> (In reply to Rachana Patel from comment #31)
> > For me these are two different issues: the description and comment #21
> > describe different problems. The initial description reports a problem
> > with the RGW upgrade, which happens after all of the OSD upgrades,
> > whereas the problem mentioned in comment #21 occurs during the OSD
> > upgrade itself. So in my view these are two separate issues, and it
> > would be better to track them separately.
>
> Please refer to https://bugzilla.redhat.com/show_bug.cgi?id=1380195
> (Bug 1380195 - [ceph-ansible] : rolling update is failing if cluster
> takes time to achieve OK state after OSD upgrade), which looks like a
> similar issue.

That does look similar, but the code we're testing does have the fix for 1380195 in it. Also, I don't think these OSDs ever achieve an OK state. They stay out until the service is restarted manually.
Would you please provide the logs requested in c38?
Hi Sam,

I have repro'd the issue, with all the debug options that you asked for, right from the creation of the cluster. Currently the setup is in the failed state with 1 OSD down.

Here are the setup details:

Master:
Mon : magna008 magna016 magna019
OSD : magna031 magna046 magna052
RGW + RBD mirror + MDS : magna058

Slave:
Mon : magna061 magna063 magna067
OSD : magna070 magna077 magna080
RGW + RBD mirror + MDS : magna086

The rolling update was run on the master, and there is an OSD down on magna052.

root@magna058:~# ceph -s --cluster master
    cluster 30fed536-557d-40b7-95e4-e95ca3e6196f
     health HEALTH_WARN
            clock skew detected on mon.magna016, mon.magna019
            197 pgs degraded
            197 pgs stuck unclean
            197 pgs undersized
            recovery 8680/49353 objects degraded (17.588%)
            pool us-east.rgw.buckets.data has many more objects per pg than average (too few pgs?)
            1/9 in osds are down
            noout,noscrub,nodeep-scrub,sortbitwise flag(s) set
            Monitor clock skew detected
     monmap e1: 3 mons at {magna008=10.8.128.8:6789/0,magna016=10.8.128.16:6789/0,magna019=10.8.128.19:6789/0}
            election epoch 16, quorum 0,1,2 magna008,magna016,magna019
      fsmap e9: 1/1/1 up {0=magna058=up:active}
     osdmap e76: 9 osds: 8 up, 9 in; 28 remapped pgs
            flags noout,noscrub,nodeep-scrub,sortbitwise
      pgmap v4516: 536 pgs, 15 pools, 62300 MB data, 16451 objects
            174 GB used, 8117 GB / 8291 GB avail
            8680/49353 objects degraded (17.588%)
                 339 active+clean
                 197 active+undersized+degraded
  client io 932 B/s rd, 1 op/s rd, 0 op/s wr

root@magna058:~# ceph osd tree --cluster master
ID WEIGHT  TYPE NAME         UP/DOWN REWEIGHT PRIMARY-AFFINITY
-1 8.09720 root default
-2 2.69907     host magna046
 2 0.89969         osd.2          up  1.00000          1.00000
 5 0.89969         osd.5          up  1.00000          1.00000
 8 0.89969         osd.8          up  1.00000          1.00000
-3 2.69907     host magna052
 1 0.89969         osd.1          up  1.00000          1.00000
 4 0.89969         osd.4        down  1.00000          1.00000
 6 0.89969         osd.6          up  1.00000          1.00000
-4 2.69907     host magna031
 0 0.89969         osd.0          up  1.00000          1.00000
 3 0.89969         osd.3          up  1.00000          1.00000
 7 0.89969         osd.7          up  1.00000          1.00000

Thanks,
Tejas
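As a manual workaround while the root cause is investigated (as noted earlier, the affected OSDs stay out until their service is restarted by hand), here is a minimal sketch for the osd.4 / magna052 case shown above, assuming systemd-managed OSDs:

  # On magna052: restart the stuck OSD daemon and check that the unit is healthy
  sudo systemctl restart ceph-osd@4
  sudo systemctl status ceph-osd@4 --no-pager

  # From a monitor node: confirm osd.4 is reported up again and watch the PGs recover
  ceph osd tree --cluster master
  ceph -s --cluster master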
Summary of what seems to be happening right now: Tejas was testing the original patch and noticed that some OSDs get started as part of rolling_update but never get marked up. This leads to a situation where a cluster MAY fail to arrive at an active and clean state. The issue appears to be limited to Ubuntu and only affects a few OSDs; see https://bugzilla.redhat.com/show_bug.cgi?id=1391675#c40. We're trying to identify the root cause today and will communicate either a fix or a workaround ASAP.
ceph-ansible was restarting daemons twice in quick succession in some cases. This exposed a bug in the OSD/Monitor interactions, tracked in https://bugzilla.redhat.com/show_bug.cgi?id=1394928. We are going to patch ceph-ansible to make it restart the daemons only once during a rolling upgrade: https://github.com/ceph/ceph-ansible/pull/1093
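For anyone trying to confirm whether a particular daemon was hit by the back-to-back restarts, a minimal sketch on the OSD node, assuming systemd-managed daemons and journald (the unit name ceph-osd@1 is just an example; substitute your own OSD id):

  # Show the unit's current state and when it last entered the active state
  sudo systemctl show ceph-osd@1 -p ActiveState -p ActiveEnterTimestamp

  # List recent start/stop events for the unit to spot two restarts in quick succession
  sudo journalctl -u ceph-osd@1 --since "2 hours ago" | egrep -i "start|stopp"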
In bug 1394929 we've updated rolling_update to stop the services before upgrading, and then ensure they are started back up at the end. This should unblock QE verification of this bug.
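Roughly, the per-OSD-node sequence the updated rolling_update is meant to enforce, sketched here as plain shell purely for illustration (not the actual playbook tasks; the noout flag, apt packaging, and the ceph-osd.target unit are assumptions based on the jewel/Ubuntu setup in this bug):

  # 1. keep CRUSH from rebalancing while this node's OSDs are down
  ceph osd set noout --cluster master

  # 2. stop all OSD daemons on the node being upgraded
  sudo systemctl stop ceph-osd.target

  # 3. upgrade the packages while the daemons are stopped
  sudo apt-get update && sudo apt-get install -y --only-upgrade ceph-osd

  # 4. start the daemons back up once, then wait for the cluster to go active+clean
  sudo systemctl start ceph-osd.target
  ceph -s --cluster master

  # 5. once every node is done, clear the flag
  ceph osd unset noout --cluster master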
Tejas,

Could you please try again with the latest build, ceph-ansible-1.0.5-44.el7scon?

Thanks,
Andrew
Verified in the latest build, ceph-ansible-1.0.5-44.el7scon, using the ISO and repos. Moving to Verified.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2016:2817