Bug 1650184 - [RFE] Do not execute client role tasks serially but in parallel
Summary: [RFE] Do not execute client role tasks serially but in parallel
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Ceph Storage
Classification: Red Hat
Component: Ceph-Ansible
Version: 3.1
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: z1
Target Release: 3.2
Assignee: Guillaume Abrioux
QA Contact: ceph-qe-bugs
URL:
Whiteboard:
Depends On:
Blocks: 1578730
Reported: 2018-11-15 14:35 UTC by Giulio Fidente
Modified: 2019-10-02 08:50 UTC (History)

Fixed In Version: RHEL: ceph-ansible-3.2.4-1.el7cp Ubuntu: ceph-ansible_3.2.4-2redhat1
Doc Type: Enhancement
Doc Text:
Previously, the `rolling_update.yml` playbook executed the client roles on one node at a time. With this update, users can set the number of client nodes processed in one batch with the new `client_update_batch` variable, which makes upgrading client nodes much faster. If no value is passed, it defaults to the value of `ansible_forks`, which is `5` by default.
Clone Of:
Environment:
Last Closed: 2019-01-31 10:36:36 UTC
Target Upstream Version:


Links
System                  ID                           Private  Priority  Status  Summary  Last Updated
Github                  ceph ceph-ansible pull 3470  0        None      None    None     2019-01-02 16:46:05 UTC
Github                  ceph ceph-ansible pull 3471  0        None      None    None     2019-01-21 13:18:34 UTC
Github                  ceph ceph-ansible pull 3512  0        None      None    None     2019-01-18 08:07:28 UTC
Github                  ceph ceph-ansible pull 3519  0        None      None    None     2019-01-21 13:16:18 UTC
Red Hat Product Errata  RHBA-2019:0223               0        None      None    None     2019-01-31 10:36:39 UTC

Description Giulio Fidente 2018-11-15 14:35:21 UTC
In ceph-ansible 3.1 the client role tasks are executed serially on each client node.

This causes tasks like scale up or upgrade to take a lot of time depending on how many client nodes are using the cluster; it should be possible instead to execute the client role tasks in parallel on all nodes at the same time.

Comment 3 Lukas Bezdicka 2018-11-15 15:15:59 UTC
We should provide a set of new playbooks in which we try something like:
  serial: "{{ ((groups['<group>'] | length)  * 0.2) | round(0,'ceil') | int }}"
Clients could even run fully in parallel, since it makes no sense to containerize and upgrade nodes one by one when you have hundreds of them.
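
For illustration, the expression above could sit in a play like the following. This is a minimal sketch, not taken from ceph-ansible: the group name `clients`, the play name, and the debug task are placeholders chosen for the example.

```yaml
# Hypothetical play; "clients" and the debug task are placeholders.
- name: process client nodes in batches of 20%
  hosts: clients
  gather_facts: false
  # Batch size is 20% of the group, rounded up
  # (for example, 12 hosts give batches of 3).
  serial: "{{ ((groups['clients'] | length) * 0.2) | round(0, 'ceil') | int }}"
  tasks:
    - name: show which hosts share the current batch
      debug:
        msg: "{{ ansible_play_batch }}"
```

Ansible also accepts a percentage string such as `serial: "20%"`, which avoids the arithmetic when its rounding behaviour is acceptable.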

Comment 4 John Fulton 2018-12-20 15:40:01 UTC
- This is about rolling update; we see "serial: 1" here: https://github.com/ceph/ceph-ansible/blob/master/infrastructure-playbooks/rolling_update.yml#L798-L803
- Can we override this value for the client role, e.g. by customizing the inventory?

Comment 5 John Fulton 2019-01-02 14:18:05 UTC
As per a conversation with Seb: 

- the ceph-ansible team will remove "serial: 1" from rolling_update.yml playbook for the clients
- it will be put into 3.2
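
Replacing the hard-coded value could look roughly like the sketch below. It assumes the `client_update_batch` variable and the `ansible_forks` fallback named in the Doc Text; the `clients` group name and the single debug task are placeholders, and the actual playbook may differ.

```yaml
# Sketch only; not copied from rolling_update.yml.
- name: upgrade ceph client node
  hosts: clients            # placeholder group name
  # Batch size comes from client_update_batch when it is set,
  # otherwise from the number of Ansible forks.
  serial: "{{ client_update_batch | default(ansible_forks) }}"
  tasks:
    - name: placeholder for the client role tasks
      debug:
        msg: "updating {{ inventory_hostname }}"
```

With such a play, the batch size could be set per run through an extra variable, for example `ansible-playbook infrastructure-playbooks/rolling_update.yml -e client_update_batch=10`.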

Comment 12 Giulio Fidente 2019-01-16 16:38:56 UTC
Looks like the fix introduced a new issue [1]

2019-01-16 17:32:34,485 p=28500 u=mistral |  ERROR! The field 'serial' has an invalid value, which includes an undefined variable. The error was: 'ansible_forks' is undefined

The error appears to have been in '/usr/share/ceph-ansible/infrastructure-playbooks/rolling_update.yml': line 737, column 3, but may be elsewhere in the file depending on the exact syntax problem.

The offending line appears to be:

- name: upgrade ceph client node
  ^ here

exception type: <class 'ansible.errors.AnsibleUndefinedVariable'>
exception: 'ansible_forks' is undefined

1. https://paste.fedoraproject.org/paste/QV7A0FhAl4t4uQMjXUK3Tw

Comment 14 Giulio Fidente 2019-01-17 17:36:14 UTC
(In reply to Giulio Fidente from comment #12)
> Looks like the fix introduced a new issue [1]
>
> 2019-01-16 17:32:34,485 p=28500 u=mistral |  ERROR! The field 'serial' has an invalid value, which includes an undefined variable. The error was: 'ansible_forks' is undefined
>
> The error appears to have been in '/usr/share/ceph-ansible/infrastructure-playbooks/rolling_update.yml': line 737, column 3, but may be elsewhere in the file depending on the exact syntax problem.
>
> The offending line appears to be:
>
> - name: upgrade ceph client node
>   ^ here
>
> exception type: <class 'ansible.errors.AnsibleUndefinedVariable'>
> exception: 'ansible_forks' is undefined
>
> 1. https://paste.fedoraproject.org/paste/QV7A0FhAl4t4uQMjXUK3Tw

I think the problem is that ansible_forks was added in ansible 2.5

Lukas, do you know what version of ansible was installed on the undercloud when the run failed?
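
If the installed Ansible release does not define `ansible_forks`, one defensive option is a nested `default` filter. The sketch below is a hypothetical workaround, not necessarily what the linked pull requests implemented; the `clients` group name is a placeholder, and the literal `5` mirrors the default fork count quoted in the Doc Text.

```yaml
# Hypothetical guard; tasks omitted for brevity.
- name: upgrade ceph client node
  hosts: clients            # placeholder group name
  # Falls back to 5 when neither variable is defined, so an
  # undefined ansible_forks no longer aborts templating of 'serial'.
  serial: "{{ client_update_batch | default(ansible_forks | default(5)) }}"
```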

Comment 27 Guillaume Abrioux 2019-01-22 15:02:38 UTC
Hi Tejas,

yes, the default behaviour is to update clients in parallel.

Comment 30 errata-xmlrpc 2019-01-31 10:36:36 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:0223
