Bug 1391675 - [ceph-ansible]: Rolling update on ubuntu fails in "TASK: [restart rados gateway server(s)"
Summary: [ceph-ansible]: Rolling update on ubuntu fails in "TASK: [restart rados gateway server(s)"
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Storage Console
Classification: Red Hat Storage
Component: ceph-ansible
Version: 2
Hardware: Unspecified
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: 2
Assignee: Andrew Schoen
QA Contact: ceph-qe-bugs
URL:
Whiteboard:
Depends On: 1393582
Blocks:
 
Reported: 2016-11-03 18:08 UTC by Tejas
Modified: 2016-11-22 23:42 UTC
CC: 11 users

Fixed In Version: ceph-ansible-1.0.5-44.el7scon
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-11-22 23:42:40 UTC
Embargoed:


Attachments
ansible playbook log (152.69 KB, text/plain), 2016-11-04 18:23 UTC, Tejas
ansible playbook log (356.58 KB, text/plain), 2016-11-11 11:31 UTC, Tejas


Links
System ID Private Priority Status Summary Last Updated
Red Hat Bugzilla 1394929 0 unspecified CLOSED During rolling upgrade all systemd services should be stopped before upgrade 2021-02-22 00:41:40 UTC
Red Hat Product Errata RHBA-2016:2817 0 normal SHIPPED_LIVE ceph-iscsi-ansible and ceph-ansible bug fix update 2017-04-18 19:50:43 UTC

Internal Links: 1394929

Comment 2 Andrew Schoen 2016-11-03 19:35:08 UTC
I opened an upstream PR to address this, https://github.com/ceph/ceph-ansible/pull/1070

Comment 3 Andrew Schoen 2016-11-03 20:19:37 UTC
The PR has been merged and the fix pushed downstream.

Comment 4 John Poelstra 2016-11-03 22:24:07 UTC
devel_ack set in error

Comment 11 Tejas 2016-11-04 18:23:38 UTC
Created attachment 1217481 [details]
ansible playbook log

Comment 12 Christina Meno 2016-11-04 21:41:03 UTC
[gmeno@localhost bcwc_pcie]$ for x in 061 063 067 070 077 080 086; do echo "magna$x"; ssh -n "shadow_man@magna$x.ceph.redhat.com" "ceph --version"; done
magna061
ceph version 10.2.3-5redhat1xenial (18d1172bab519c6230c96389a3e52f375741b986)
magna063
ceph version 10.2.3-5redhat1xenial (18d1172bab519c6230c96389a3e52f375741b986)
magna067
ceph version 10.2.3-5redhat1xenial (18d1172bab519c6230c96389a3e52f375741b986)
magna070
ceph version 10.2.3-5redhat1xenial (18d1172bab519c6230c96389a3e52f375741b986)
magna077
ceph version 10.2.3-5redhat1xenial (18d1172bab519c6230c96389a3e52f375741b986)
magna080
ceph version 10.2.3-5redhat1xenial (18d1172bab519c6230c96389a3e52f375741b986)
magna086
ceph version 10.2.2-29redhat1xenial (8562bd783c1cd38a2bec7a423846482cbf5523e9)
[gmeno@localhost bcwc_pcie]$ ssh shadow_man.redhat.com
Welcome to Ubuntu 16.04.1 LTS (GNU/Linux 4.4.0-45-generic x86_64)

 * Documentation:  https://help.shadow_man.com
 * Management:     https://landscape.canonical.com
 * Support:        https://shadow_man.com/advantage
Last login: Fri Nov  4 21:22:08 2016 from 10.3.112.43
shadow_man@magna061:~$ sudo ceph --cluster slave -s
    cluster c6f3e44b-94a1-475d-9b21-cf75800884a2
     health HEALTH_WARN
            clock skew detected on mon.magna063, mon.magna067
            pool us-west.rgw.buckets.data has many more objects per pg than average (too few pgs?)
            noout,noscrub,nodeep-scrub,sortbitwise flag(s) set
            Monitor clock skew detected 
     monmap e1: 3 mons at {magna061=10.8.128.61:6789/0,magna063=10.8.128.63:6789/0,magna067=10.8.128.67:6789/0}
            election epoch 18, quorum 0,1,2 magna061,magna063,magna067
      fsmap e9: 1/1/1 up {0=magna086=up:active}
     osdmap e74: 9 osds: 9 up, 9 in
            flags noout,noscrub,nodeep-scrub,sortbitwise
      pgmap v94136: 544 pgs, 16 pools, 251 GB data, 65190 objects
            756 GB used, 7535 GB / 8291 GB avail
                 544 active+clean
  client io 40594 B/s rd, 80519 kB/s wr, 39 op/s rd, 213 op/s wr
shadow_man@magna061:~$
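
The version checks above report the package installed on each node; during a rolling upgrade the running daemons can still be on the old binary until they are restarted. A minimal diagnostic sketch for checking what the daemons themselves report, assuming an admin keyring for the "slave" cluster on the node running it (the loop over "ceph osd ls" and "ceph tell" is an illustration, not part of the playbook):

# What the running OSD daemons report, vs. the installed package shown above
for id in $(ceph osd ls --cluster slave); do
    printf 'osd.%s: ' "${id}"
    ceph tell "osd.${id}" version --cluster slave
done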

Comment 14 Tejas 2016-11-09 09:51:34 UTC
Since I can't proceed on this using build:
ceph-ansible-1.0.5-41.el7scon

Moving back to Assigned.

Comment 15 Andrew Schoen 2016-11-09 15:25:31 UTC
There is nothing more to do here. This should be testable once we've resolved the regression introduced in https://bugzilla.redhat.com/show_bug.cgi?id=1392238

Comment 18 Tejas 2016-11-10 07:20:31 UTC
Moving back to assigned because of 
https://bugzilla.redhat.com/show_bug.cgi?id=1393582

Comment 22 Tejas 2016-11-11 11:31:19 UTC
Created attachment 1219711 [details]
ansible playbook log

Comment 24 Tejas 2016-11-11 13:48:21 UTC
Hi Andrew,

    I am seeing it again on the next OSD node, i.e. magna077:

TASK: [restart ceph osds (systemd)] ******************************************* 
<magna077> ESTABLISH CONNECTION FOR USER: root
<magna077> REMOTE_MODULE service name= state=restarted
<magna077> EXEC ssh -C -tt -vvv -o ControlMaster=auto -o ControlPersist=60s -o ControlPath="/root/.ansible/cp/ansible-ssh-%h-%p-%r" -o KbdInteractiveAuthentication=no -o PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey -o PasswordAuthentication=no -o ConnectTimeout=10 magna077 /bin/sh -c 'mkdir -p $HOME/.ansible/tmp/ansible-tmp-1478871836.53-69755247935010 && echo $HOME/.ansible/tmp/ansible-tmp-1478871836.53-69755247935010'
<magna077> PUT /tmp/tmp6Ao9Hu TO /root/.ansible/tmp/ansible-tmp-1478871836.53-69755247935010/service
<magna077> EXEC ssh -C -tt -vvv -o ControlMaster=auto -o ControlPersist=60s -o ControlPath="/root/.ansible/cp/ansible-ssh-%h-%p-%r" -o KbdInteractiveAuthentication=no -o PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey -o PasswordAuthentication=no -o ConnectTimeout=10 magna077 /bin/sh -c 'sudo -k && sudo -H -S -p "[sudo via ansible, key=indjejevdwpvjcmkuysrwlzlquoliswh] password: " -u root /bin/sh -c '"'"'echo BECOME-SUCCESS-indjejevdwpvjcmkuysrwlzlquoliswh; LANG=en_US.UTF-8 LC_CTYPE=en_US.UTF-8 /usr/bin/python /root/.ansible/tmp/ansible-tmp-1478871836.53-69755247935010/service; rm -rf /root/.ansible/tmp/ansible-tmp-1478871836.53-69755247935010/ >/dev/null 2>&1'"'"''
changed: [magna077] => (item=1) => {"changed": true, "enabled": true, "item": "1", "name": "ceph-osd@1", "state": "started"}
<magna077> ESTABLISH CONNECTION FOR USER: root
<magna077> REMOTE_MODULE service name= state=restarted
<magna077> EXEC ssh -C -tt -vvv -o ControlMaster=auto -o ControlPersist=60s -o ControlPath="/root/.ansible/cp/ansible-ssh-%h-%p-%r" -o KbdInteractiveAuthentication=no -o PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey -o PasswordAuthentication=no -o ConnectTimeout=10 magna077 /bin/sh -c 'mkdir -p $HOME/.ansible/tmp/ansible-tmp-1478871837.21-250822556852926 && echo $HOME/.ansible/tmp/ansible-tmp-1478871837.21-250822556852926'
<magna077> PUT /tmp/tmphoXZAt TO /root/.ansible/tmp/ansible-tmp-1478871837.21-250822556852926/service
<magna077> EXEC ssh -C -tt -vvv -o ControlMaster=auto -o ControlPersist=60s -o ControlPath="/root/.ansible/cp/ansible-ssh-%h-%p-%r" -o KbdInteractiveAuthentication=no -o PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey -o PasswordAuthentication=no -o ConnectTimeout=10 magna077 /bin/sh -c 'sudo -k && sudo -H -S -p "[sudo via ansible, key=jynmtvrvfeqiwwcilwbiagszvfcocqrp] password: " -u root /bin/sh -c '"'"'echo BECOME-SUCCESS-jynmtvrvfeqiwwcilwbiagszvfcocqrp; LANG=en_US.UTF-8 LC_CTYPE=en_US.UTF-8 /usr/bin/python /root/.ansible/tmp/ansible-tmp-1478871837.21-250822556852926/service; rm -rf /root/.ansible/tmp/ansible-tmp-1478871837.21-250822556852926/ >/dev/null 2>&1'"'"''
changed: [magna077] => (item=3) => {"changed": true, "enabled": true, "item": "3", "name": "ceph-osd@3", "state": "started"}
<magna077> ESTABLISH CONNECTION FOR USER: root
<magna077> REMOTE_MODULE service name= state=restarted
<magna077> EXEC ssh -C -tt -vvv -o ControlMaster=auto -o ControlPersist=60s -o ControlPath="/root/.ansible/cp/ansible-ssh-%h-%p-%r" -o KbdInteractiveAuthentication=no -o PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey -o PasswordAuthentication=no -o ConnectTimeout=10 magna077 /bin/sh -c 'mkdir -p $HOME/.ansible/tmp/ansible-tmp-1478871839.89-147190929468924 && echo $HOME/.ansible/tmp/ansible-tmp-1478871839.89-147190929468924'
<magna077> PUT /tmp/tmp8DD3FE TO /root/.ansible/tmp/ansible-tmp-1478871839.89-147190929468924/service
<magna077> EXEC ssh -C -tt -vvv -o ControlMaster=auto -o ControlPersist=60s -o ControlPath="/root/.ansible/cp/ansible-ssh-%h-%p-%r" -o KbdInteractiveAuthentication=no -o PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey -o PasswordAuthentication=no -o ConnectTimeout=10 magna077 /bin/sh -c 'sudo -k && sudo -H -S -p "[sudo via ansible, key=wpctwcohsaqarinahtbpdkxccnlcqbia] password: " -u root /bin/sh -c '"'"'echo BECOME-SUCCESS-wpctwcohsaqarinahtbpdkxccnlcqbia; LANG=en_US.UTF-8 LC_CTYPE=en_US.UTF-8 /usr/bin/python /root/.ansible/tmp/ansible-tmp-1478871839.89-147190929468924/service; rm -rf /root/.ansible/tmp/ansible-tmp-1478871839.89-147190929468924/ >/dev/null 2>&1'"'"''
changed: [magna077] => (item=8) => {"changed": true, "enabled": true, "item": "8", "name": "ceph-osd@8", "state": "started"}






Result from run 9 is: {u'cmd': u'test "$(ceph pg stat --cluster slave | sed \'s/^.*pgs://;s/active+clean.*//;s/ //\')" -eq "$(ceph pg stat --cluster slave | sed \'s/pgs.*//;s/^.*://;s/ //\')" && ceph health --cluster slave | egrep -sq "HEALTH_OK|HEALTH_WARN"', u'end': u'2016-11-11 13:45:39.346054', u'stdout': u'', u'changed': True, 'attempts': 9, u'start': u'2016-11-11 13:45:38.920396', u'delta': u'0:00:00.425658', u'stderr': u'', u'rc': 1, u'warnings': []}
<magna061> REMOTE_MODULE command test "$(ceph pg stat --cluster slave | sed 's/^.*pgs://;s/active+clean.*//;s/ //')" -eq "$(ceph pg stat --cluster slave | sed 's/pgs.*//;s/^.*://;s/ //')" && ceph health --cluster slave | egrep -sq "HEALTH_OK|HEALTH_WARN" #USE_SHELL
<magna061> EXEC ssh -C -tt -vvv -o ControlMaster=auto -o ControlPersist=60s -o ControlPath="/root/.ansible/cp/ansible-ssh-%h-%p-%r" -o KbdInteractiveAuthentication=no -o PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey -o PasswordAuthentication=no -o ConnectTimeout=10 magna061 /bin/sh -c 'mkdir -p $HOME/.ansible/tmp/ansible-tmp-1478871949.37-84447399544665 && echo $HOME/.ansible/tmp/ansible-tmp-1478871949.37-84447399544665'
<magna061> PUT /tmp/tmpWUKyM4 TO /root/.ansible/tmp/ansible-tmp-1478871949.37-84447399544665/command
<magna061> EXEC ssh -C -tt -vvv -o ControlMaster=auto -o ControlPersist=60s -o ControlPath="/root/.ansible/cp/ansible-ssh-%h-%p-%r" -o KbdInteractiveAuthentication=no -o PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey -o PasswordAuthentication=no -o ConnectTimeout=10 magna061 /bin/sh -c 'sudo -k && sudo -H -S -p "[sudo via ansible, key=fjbvyitpqgbhvwooeundgdiolzyyfwce] password: " -u root /bin/sh -c '"'"'echo BECOME-SUCCESS-fjbvyitpqgbhvwooeundgdiolzyyfwce; LANG=en_US.UTF-8 LC_CTYPE=en_US.UTF-8 /usr/bin/python /root/.ansible/tmp/ansible-tmp-1478871949.37-84447399544665/command; rm -rf /root/.ansible/tmp/ansible-tmp-1478871949.37-84447399544665/ >/dev/null 2>&1'"'"''
Result from run 10 is: {u'cmd': u'test "$(ceph pg stat --cluster slave | sed \'s/^.*pgs://;s/active+clean.*//;s/ //\')" -eq "$(ceph pg stat --cluster slave | sed \'s/pgs.*//;s/^.*://;s/ //\')" && ceph health --cluster slave | egrep -sq "HEALTH_OK|HEALTH_WARN"', u'end': u'2016-11-11 13:45:49.961370', u'stdout': u'', u'changed': True, 'attempts': 10, u'start': u'2016-11-11 13:45:49.462353', u'delta': u'0:00:00.499017', u'stderr': u'', u'rc': 1, u'warnings': []}
failed: [magna077 -> magna061] => {"attempts": 10, "changed": true, "cmd": "test \"$(ceph pg stat --cluster slave | sed 's/^.*pgs://;s/active+clean.*//;s/ //')\" -eq \"$(ceph pg stat --cluster slave  | sed 's/pgs.*//;s/^.*://;s/ //')\" && ceph health --cluster slave | egrep -sq \"HEALTH_OK|HEALTH_WARN\"", "delta": "0:00:00.499017", "end": "2016-11-11 13:45:49.961370", "rc": 1, "start": "2016-11-11 13:45:49.462353", "warnings": []}

FATAL: all hosts have already failed -- aborting



I have the entire ansible log with -vvvv if you need it.
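
For reference, the task that fails here keeps retrying a shell check which parses "ceph pg stat" and only passes when every PG is active+clean and the overall health is HEALTH_OK or HEALTH_WARN. A sketch of the same check, split out from the task output above so it can be run by hand on a monitor or admin node of the slave cluster (diagnostic only, not a change to the playbook):

total=$(ceph pg stat --cluster slave | sed 's/pgs.*//;s/^.*://;s/ //')
clean=$(ceph pg stat --cluster slave | sed 's/^.*pgs://;s/active+clean.*//;s/ //')
echo "total=${total} active+clean=${clean}"
test "${clean}" -eq "${total}" \
  && ceph health --cluster slave | egrep -sq "HEALTH_OK|HEALTH_WARN" \
  && echo "check passes" \
  || echo "check fails with rc=1, as in the task result above"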

Comment 26 Tejas 2016-11-11 17:07:33 UTC
Hi Andrew,

   This is my setup exactly:

Ubuntu 16.04

Master cluster:
3 MON
3 OSD
1 RGW + MDS + RBD mirror

Slave cluster:
3 MON
3 OSD
1 RGW + MDS + RBD mirror

RBD mirror configured.
RGW multisite configured, and multipart IO in progress when the rolling_update is run.

On magna006 I am using the default /usr/share/ceph-ansible directory, with all the standard group_vars options.

Also, we tried the rolling_update on a normal 3 MON, 3 OSD cluster on RHEL and did not see this issue. It may be specific to Ubuntu or a multisite setup.

Thanks,
Tejas
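
For completeness, a run like the one described above is typically started from the admin node (magna006 here). The playbook path and inventory location below are assumptions based on the default /usr/share/ceph-ansible layout mentioned in the comment, not details taken from this report:

cd /usr/share/ceph-ansible
# -vvvv matches the verbosity of the attached logs; the inventory path is an assumption
ansible-playbook -vvvv -i /etc/ansible/hosts rolling_update.yml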

Comment 28 Tejas 2016-11-11 17:16:22 UTC
I had purged the older cluster fully before creating the slave cluster. I don't think there is any trace of the old cluster, but apparently I forgot to clean the log directory.

Let me run the same rolling_update on the master cluster just to confirm.

Comment 31 Rachana Patel 2016-11-11 21:51:18 UTC
To me these are two different issues:
- the description and comment #21 describe different problems

The initial description is about the RGW upgrade, which runs after all the OSD upgrades, while the problem in comment #21 is with the OSD upgrade itself, so in my view these are two different issues and should be tracked separately.

Comment 33 Rachana Patel 2016-11-11 21:55:30 UTC
(In reply to Rachana Patel from comment #31)
> To me these are two different issues:
> - the description and comment #21 describe different problems
> 
> The initial description is about the RGW upgrade, which runs after all the
> OSD upgrades, while the problem in comment #21 is with the OSD upgrade
> itself, so in my view these are two different issues and should be tracked
> separately.

Please refer to https://bugzilla.redhat.com/show_bug.cgi?id=1380195

Bug 1380195 - [ceph-ansible] : rolling update is failing if cluster takes time to achieve OK state after OSD upgrade 

It looks like a similar issue.

Comment 34 Andrew Schoen 2016-11-11 21:59:34 UTC
(In reply to Rachana Patel from comment #33)
> (In reply to Rachana Patel from comment #31)
> > To me these are two different issues:
> > - the description and comment #21 describe different problems
> > 
> > The initial description is about the RGW upgrade, which runs after all the
> > OSD upgrades, while the problem in comment #21 is with the OSD upgrade
> > itself, so in my view these are two different issues and should be tracked
> > separately.
> 
> Please refer to https://bugzilla.redhat.com/show_bug.cgi?id=1380195
> 
> Bug 1380195 - [ceph-ansible] : rolling update is failing if cluster takes
> time to achieve OK state after OSD upgrade 
> 
> It looks like a similar issue.

That does look similar, but the code we're testing does have the fix for 1380195 in it. Also, I don't think these OSDs ever achieve an OK state. They stay out until the service is restarted manually.

Comment 39 Christina Meno 2016-11-14 04:05:27 UTC
Would you please provide the logs requested in c38?

Comment 40 Tejas 2016-11-14 11:30:00 UTC
Hi Sam,

   I have reproduced the issue, with all the debug options that you asked for, right from the creation of the cluster. Currently the setup is in the failed state with one OSD down. Here are the setup details:

Master:
Mon : magna008 magna016 magna019
OSD: magna031 magna046 magna052
RGW + RBD mirror + MDS : magna058

Slave: 
Mon : magna061 magna063 magna067
OSD: magna070 magna077 magna080
RGW + RBD mirror + MDS : magna086

The rolling update was run on master, and there is an OSD down on magna052.

root@magna058:~# ceph -s --cluster master
    cluster 30fed536-557d-40b7-95e4-e95ca3e6196f
     health HEALTH_WARN
            clock skew detected on mon.magna016, mon.magna019
            197 pgs degraded
            197 pgs stuck unclean
            197 pgs undersized
            recovery 8680/49353 objects degraded (17.588%)
            pool us-east.rgw.buckets.data has many more objects per pg than average (too few pgs?)
            1/9 in osds are down
            noout,noscrub,nodeep-scrub,sortbitwise flag(s) set
            Monitor clock skew detected 
     monmap e1: 3 mons at {magna008=10.8.128.8:6789/0,magna016=10.8.128.16:6789/0,magna019=10.8.128.19:6789/0}
            election epoch 16, quorum 0,1,2 magna008,magna016,magna019
      fsmap e9: 1/1/1 up {0=magna058=up:active}
     osdmap e76: 9 osds: 8 up, 9 in; 28 remapped pgs
            flags noout,noscrub,nodeep-scrub,sortbitwise
      pgmap v4516: 536 pgs, 15 pools, 62300 MB data, 16451 objects
            174 GB used, 8117 GB / 8291 GB avail
            8680/49353 objects degraded (17.588%)
                 339 active+clean
                 197 active+undersized+degraded
  client io 932 B/s rd, 1 op/s rd, 0 op/s wr
root@magna058:~# 
root@magna058:~# 
root@magna058:~# ceph osd tree --cluster master
ID WEIGHT  TYPE NAME         UP/DOWN REWEIGHT PRIMARY-AFFINITY 
-1 8.09720 root default                                        
-2 2.69907     host magna046                                   
 2 0.89969         osd.2          up  1.00000          1.00000 
 5 0.89969         osd.5          up  1.00000          1.00000 
 8 0.89969         osd.8          up  1.00000          1.00000 
-3 2.69907     host magna052                                   
 1 0.89969         osd.1          up  1.00000          1.00000 
 4 0.89969         osd.4        down  1.00000          1.00000 
 6 0.89969         osd.6          up  1.00000          1.00000 
-4 2.69907     host magna031                                   
 0 0.89969         osd.0          up  1.00000          1.00000 
 3 0.89969         osd.3          up  1.00000          1.00000 
 7 0.89969         osd.7          up  1.00000          1.00000 


Thanks,
Tejas
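
osd.4 on magna052 is the daemon that never came back up. The manual restart mentioned in comment 34, followed by a recovery check, might look like the sketch below; these are standard systemd and ceph CLI commands, not steps taken from the playbook:

ssh magna052 'sudo journalctl -u ceph-osd@4 --no-pager | tail -n 50'   # why it stopped or failed to start
ssh magna052 'sudo systemctl restart ceph-osd@4'
ceph osd tree --cluster master | grep -w down || echo "all OSDs up"
ceph -s --cluster master   # degraded/undersized PGs should drain back to active+clean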

Comment 42 Christina Meno 2016-11-14 16:57:04 UTC
Summary of what seems to be happening right now:

Tejas was testing the original patch and noticed that some OSDs get started as part of rolling_update but never get marked up -- this leads to a situation where a cluster MAY fail to arrive at an active and clean state.

This issue appears to be limited to Ubuntu and only affects a few OSDs; see https://bugzilla.redhat.com/show_bug.cgi?id=1391675#c40

We're trying to identify the root cause today and will communicate either a fix or workaround ASAP.
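
"Started but never marked up" can be confirmed by comparing systemd's view of the unit with the monitors' view in the osdmap; a sketch using osd.4 on magna052 from comment 40 as the example (standard ceph/systemd tooling, not commands from the playbook):

ssh magna052 'sudo systemctl is-active ceph-osd@4'    # systemd thinks the unit is running
ceph osd dump --cluster master | grep '^osd.4 '       # monitors' view: up/down and in/out state
ssh magna052 'sudo ceph daemon osd.4 status'          # admin-socket view of the running daemon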

Comment 43 Ken Dreyer (Red Hat) 2016-11-14 21:05:24 UTC
ceph-ansible was restarting daemons twice in quick succession in some cases.

This exposed a bug in the OSD/Monitor interactions, https://bugzilla.redhat.com/show_bug.cgi?id=1394928

We are going to patch ceph-ansible in order to make it restart the daemons only once during a rolling upgrade. https://github.com/ceph/ceph-ansible/pull/1093

Comment 44 Ken Dreyer (Red Hat) 2016-11-14 21:26:51 UTC
In bug 1394929 we've updated rolling_update to stop the services before upgrading, and then ensure they are started back up at the end. This should unblock QE verification of this bug.
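
The change described here amounts to a single stop/upgrade/start cycle per node instead of repeated restarts. A plain-shell sketch of that sequence for one OSD node, under the assumption that the jewel ceph-osd.target units and the apt package names below are what the Ubuntu nodes use (an illustration of the intended order, not the playbook tasks themselves):

ceph osd set noout --cluster master                 # the playbook already sets this (see the 'flags' lines above)
ssh magna052 'sudo systemctl stop ceph-osd.target'  # stop once, before the package upgrade
ssh magna052 'sudo apt-get update && sudo apt-get install --only-upgrade -y ceph ceph-osd'
ssh magna052 'sudo systemctl start ceph-osd.target' # single start at the end, no extra restart
# wait for all PGs to be active+clean (the pg stat check above) before the next node, then:
ceph osd unset noout --cluster master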

Comment 45 Andrew Schoen 2016-11-14 22:23:27 UTC
Tejas,

Could you please try again with the latest build ceph-ansible-1.0.5-44.el7scon?

Thanks,
Andrew

Comment 49 Tejas 2016-11-15 05:36:55 UTC
Verified in the latest build:
ceph-ansible-1.0.5-44.el7scon

using ISO and repos.
Moving to Verified.

Comment 51 errata-xmlrpc 2016-11-22 23:42:40 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2016:2817

