Bug 1632160 - OSD restart should attempt the same amount of time for each OSD restart
Summary: OSD restart should attempt the same amount of time for each OSD restart
Keywords:
Status: CLOSED DUPLICATE of bug 1632157
Alias: None
Product: Red Hat Ceph Storage
Classification: Red Hat Storage
Component: Ceph-Ansible
Version: 3.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: z5
Target Release: 3.*
Assignee: Guillaume Abrioux
QA Contact: ceph-qe-bugs
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2018-09-24 09:09 UTC by Guillaume Abrioux
Modified: 2018-09-24 09:46 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause: The 'RETRIES' counter in restart_osd_daemon.sh was set once at the start of the script and never reset between calls to the check_pgs() function. Consequence: The counter, which defaults to 40, was shared by every OSD restart, so the script tried at most 40 times in total for all the OSDs on a node. Fix: The counter is now reset between calls to the check_pgs() function. Result: The script retries the same number of times for every OSD restart.
Clone Of:
Environment:
Last Closed: 2018-09-24 09:46:41 UTC
Embargoed:




Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | ceph ceph-ansible pull 3160 | 0 | None | None | None | 2018-09-24 09:12:42 UTC

Description Guillaume Abrioux 2018-09-24 09:09:32 UTC
Description of problem:

The 'RETRIES' counter is not reset between calls to check_pgs() in the restart_osd_daemon.sh script.
It defaults to 40 attempts, so the script waits for at most 40 intervals of 30s in total across *all* the OSDs on a host, instead of up to 40 intervals for each OSD.
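
The behaviour described above can be illustrated with a minimal, self-contained sketch. It is not the actual restart_osd_daemon.sh: the names MAX_RETRIES, ATTEMPTS, pgs_are_clean, check_pgs_shared and check_pgs_reset are hypothetical, the health check is a stub that always reports "not clean" (the worst case), and the delay is set to 0 instead of 30s so the demo finishes immediately.

```shell
#!/bin/sh
# Illustrative sketch only; names below are assumptions, not the real script.
MAX_RETRIES=40   # default retry budget mentioned in this report
DELAY=0          # the report mentions 30s waits; 0 keeps the demo fast

# Hypothetical stand-in for the real PG health check: never reports clean.
pgs_are_clean() { return 1; }

# Shared-counter pattern: RETRIES is set once, outside the function,
# so whatever one OSD consumes is no longer available to the next.
check_pgs_shared() {
  ATTEMPTS=0
  while [ "$RETRIES" -ne 0 ]; do
    if pgs_are_clean; then return 0; fi
    ATTEMPTS=$((ATTEMPTS + 1))
    sleep "$DELAY"
    RETRIES=$((RETRIES - 1))
  done
  return 1
}

# Reset pattern: the budget is restored before every check.
check_pgs_reset() {
  RETRIES=$MAX_RETRIES
  check_pgs_shared
}

# Simulate three OSD restarts on one host with each pattern.
RETRIES=$MAX_RETRIES
shared_result=""
for osd in 0 1 2; do
  check_pgs_shared || true
  shared_result="${shared_result:+$shared_result }osd.$osd=$ATTEMPTS"
done

reset_result=""
for osd in 0 1 2; do
  check_pgs_reset || true
  reset_result="${reset_result:+$reset_result }osd.$osd=$ATTEMPTS"
done

echo "shared counter: $shared_result"
echo "reset counter:  $reset_result"
```

With the shared counter, the first OSD uses all 40 attempts and the following OSDs get none; with the reset, every OSD gets the full 40.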


How reproducible:
100%


Steps to Reproduce:
1. Deploy a cluster.
2. Make a change so the 'restart osds daemon' handler is triggered.
3. Relaunch the playbook.

Actual results:
The playbook retries up to 40 times in total, shared across all the OSDs on a node.

Expected results:
We should retry for the same amount of time after each OSD restart.

Comment 5 Guillaume Abrioux 2018-09-24 09:46:41 UTC

*** This bug has been marked as a duplicate of bug 1632157 ***

