Bug 1540881 - [CEE/SD] monitor_interface with "-" in the name fails with "msg": "'dict object' has no attribute u'ansible_bond-monitor-interface'"
Summary: [CEE/SD] monitor_interface with "-" in the name fails with "msg": "'dict obj...
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Ceph Storage
Classification: Red Hat
Component: Ceph-Ansible
Version: 3.0
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: z2
Target Release: 3.2
Assignee: Rishabh Dave
QA Contact: Vasishta
Docs Contact: John Brier
URL:
Whiteboard:
Depends On:
Blocks: 1629656
 
Reported: 2018-02-01 09:19 UTC by Tomas Petr
Modified: 2019-05-02 14:58 UTC (History)
20 users

Fixed In Version: RHEL: ceph-ansible-3.2.9-1.el7cp Ubuntu: ceph-ansible_3.2.9-2redhat1
Doc Type: Bug Fix
Doc Text:
.Ceph Ansible no longer fails if network interface names include dashes
When `ceph-ansible` makes an inventory of network interfaces, any dash (`-`) in an interface name must be converted to an underscore (`_`) before the name can be used. In some cases this conversion did not occur and the Ceph installation failed. With this update to {product}, all dashes in the names of network interfaces are converted in the facts, and installation completes successfully.
Clone Of:
Environment:
Last Closed: 2019-04-30 15:56:43 UTC


Attachments


Links
System ID Priority Status Summary Last Updated
Red Hat Product Errata RHSA-2019:0911 None None None 2019-04-30 15:57:00 UTC
Github ceph ceph-ansible pull 2078 None None None 2018-02-01 09:32:30 UTC
Github ceph ceph-ansible pull 3640 None None None 2019-02-27 12:12:19 UTC
Github ceph ceph-ansible pull 3657 None None None 2019-03-01 07:46:13 UTC

Description Tomas Petr 2018-02-01 09:19:54 UTC
Description of problem:
We have set the network interface bond-monitor-interface as the monitor_interface / public network.

setting in all.yml
monitor_interface: bond-monitor-interface
public_network: 192.168.1.0/24
cluster_network: 192.168.2.0/28

The Ansible deploy fails with:
fatal: [mons-0]: FAILED! => {"msg": "'dict object' has no attribute u'ansible_bond-monitor-interface'"}

looking at the output of
ansible all -i mons-0 -m setup -c local > file.txt
        "ansible_bond_monitor_interface": {   <------------
            "active": true, 
            "device": "bond-monitor-interface",    <------------
            "features": {
                     .....
            }, 
            "hw_timestamp_filters": [], 
            "ipv4": {
                "address": "192.168.1.2", 
                "broadcast": "192.168.1.255", 
                "netmask": "255.255.255.0", 
                "network": "192.168.1.0"
            }, 
            "ipv6": [
                ...
            ], 
            "lacp_rate": "fast", 
            "macaddress": "aa:bb:cc:dd:ee:ff", 
            "miimon": "0", 
            "mode": "802.3ad", 
            "mtu": 9000, 
            "promisc": false, 
            "slaves": [
                "eth0", 
                "eth1"
            ], 
            "speed": 20000, 
            "timestamping": [
                "rx_software", 
                "software"
            ], 
            "type": "bonding"
        }, 
        ....
        "ansible_interfaces": [
            "lo", 
            "bond-monitor-interface ", 
            "eth0", 
            "eth1"
        ],


Version-Release number of selected component (if applicable):
ceph-ansible-3.0.14-1.el7cp.noarch
ansible-2.4.2.0-2.el7.noarch

How reproducible:
always

Steps to Reproduce:
1. create a network interface with "-" in the name
2. set it as monitor_interface in all.yml
3. deploy the Ceph cluster with ceph-ansible
4. watch it fail

Actual results:
if there is a "-" in the interface name, Ansible changes it to "_" in the corresponding fact name, so the interface cannot be found
fatal: [mons-0]: FAILED! => {"msg": "'dict object' has no attribute u'ansible_bond-monitor-interface'"}

Expected results:
interface is properly recognized

Additional info:
unsure whether this is a ceph-ansible or an Ansible problem
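The mismatch can be sketched in plain Python (hypothetical data, not ceph-ansible code): Ansible converts dashes to underscores when naming interface facts, while the failing lookup builds its key from the raw interface name.

```python
# Minimal sketch of the mismatch described above. Ansible stores interface
# facts under names with dashes converted to underscores, while the "device"
# field keeps the original name (see the setup-module output above).
facts = {
    "ansible_bond_monitor_interface": {
        "device": "bond-monitor-interface",
        "active": True,
    },
}

monitor_interface = "bond-monitor-interface"  # value from all.yml

# The failing lookup: the key is built from the raw name, dashes included,
# which produces the "'dict object' has no attribute" error.
broken_key = "ansible_" + monitor_interface
assert broken_key not in facts

# The fixed lookup: normalize dashes to underscores before building the key.
fixed_key = "ansible_" + monitor_interface.replace("-", "_")
assert facts[fixed_key]["active"] is True
```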

Comment 3 leseb 2018-02-01 09:32:30 UTC
The fix is already upstream. This will be in 3.1.

Comment 5 Servesha 2019-01-17 04:49:47 UTC
Hello Sebastian,

I tried to reproduce the issue in my lab environment with ceph-ansible version 3.2.
I encountered the error below during deployment:

TASK [ceph-validate : fail if br-ex is not active on servesha-ceph-test2] *********************************************
task path: /usr/share/ceph-ansible/roles/ceph-validate/tasks/check_eth_mon.yml:8
Tuesday 15 January 2019  04:47:00 -0500 (0:00:00.077)       0:00:21.305 ******* 
META: noop
META: noop
fatal: [servesha-ceph-test2]: FAILED! => {
    "msg": "The conditional check 'not hostvars[inventory_hostname]['ansible_' + monitor_interface]['active']' failed. The error was: error while evaluating conditional (not hostvars[inventory_hostname]['ansible_' + monitor_interface]['active']): 'ansible.vars.hostvars.HostVarsVars object' has no attribute u'ansible_br-ex'\n\nThe error appears to have been in '/usr/share/ceph-ansible/roles/ceph-validate/tasks/check_eth_mon.yml': line 8, column 3, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n\n- name: \"fail if {{ monitor_interface }} is not active on {{ inventory_hostname }}\"\n  ^ here\nWe could be wrong, but this one looks like it might be an issue with\nmissing quotes.  Always quote template expression brackets when they\nstart a value. For instance:\n\n    with_items:\n      - {{ foo }}\n\nShould be written as:\n\n    with_items:\n      - \"{{ foo }}\"\n"
}

# rpm -qa | grep ansible
ansible-2.6.11-1.el7ae.noarch
ceph-ansible-3.2.0-1.el7cp.noarch


The bug was expected to be fixed in version 3.1, but it is still present in version 3.2.


Regards,
Servesha

Comment 6 leseb 2019-01-21 09:13:10 UTC
Indeed, it appears that's still a problem. Rishabh, please look into this when you have a moment. Thanks.

Comment 7 Tomas Petr 2019-02-18 17:31:16 UTC
(In reply to Servesha from comment #5)


So the task that causes this failure is the ceph-validate task
TASK [ceph-validate : fail if br-ex is not active on servesha-ceph-test2]
in
./roles/ceph-validate/tasks/check_eth_mon.yml
and
./roles/ceph-validate/tasks/check_eth_rgw.yml


This check was added in the ceph-ansible 3.2 beta:
https://github.com/ceph/ceph-ansible/commit/235d1b3f557dcd9164d392050382398e1cda7084#diff-3f1cf80769de29dc34cad67d08a71ee9


I think this can be fixed by applying the same change as in the fix for the original issue:
https://github.com/ceph/ceph-ansible/pull/2078/files


like this (and the same for check_eth_rgw.yml):
-----------
# cat ./roles/ceph-validate/tasks/check_eth_mon.yml
---
- name: "fail if {{ monitor_interface }} does not exist on {{ inventory_hostname }}"
  fail:
    msg: "{{ monitor_interface }} does not exist on {{ inventory_hostname }}"
  when:
    - monitor_interface not in ansible_interfaces

- name: "fail if {{ monitor_interface }} is not active on {{ inventory_hostname }}"
  fail:
    msg: "{{ monitor_interface }} is not active on {{ inventory_hostname }}"
  when:
    - not hostvars[inventory_hostname]['ansible_' + (monitor_interface | replace('-', '_'))]['active']

- name: "fail if {{ monitor_interface }} does not have any ip v4 address on {{ inventory_hostname }}"
  fail:
    msg: "{{ monitor_interface }} does not have any IPv4 address on {{ inventory_hostname }}"
  when:
    - ip_version == "ipv4"
    - hostvars[inventory_hostname]['ansible_' + (monitor_interface | replace('-', '_'))]['ipv4'] is not defined

- name: "fail if {{ monitor_interface }} does not have any ip v6 address on {{ inventory_hostname }}"
  fail:
    msg: "{{ monitor_interface }} does not have any IPv6 address on {{ inventory_hostname }}"
  when:
    - ip_version == "ipv6"
    - hostvars[inventory_hostname]['ansible_' + (monitor_interface | replace('-', '_'))]['ipv6'] is not defined

-----------

Seb, can you confirm my thoughts?

Comment 8 Tomas Petr 2019-02-18 17:37:56 UTC
btw, all I did was replace
['ansible_' + monitor_interface]
with
['ansible_' + (monitor_interface | replace('-', '_'))]
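The `replace` filter used here is a standard Jinja2 builtin. As a plain-Python sketch (not ceph-ansible code), the expression behaves like the helper below; note it is a no-op for names without dashes, so the fixed lookup works for both kinds of interface names:

```python
def fact_key(interface: str) -> str:
    """Build the Ansible fact key for an interface name, mirroring the
    Jinja2 expression 'ansible_' + (monitor_interface | replace('-', '_'))."""
    return "ansible_" + interface.replace("-", "_")

# Dashed names are normalized to match the fact naming scheme.
assert fact_key("bond-monitor-interface") == "ansible_bond_monitor_interface"
assert fact_key("br-ex") == "ansible_br_ex"
# Names without dashes pass through unchanged, so existing setups keep working.
assert fact_key("eth0") == "ansible_eth0"
```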

Comment 11 Servesha 2019-03-06 15:52:39 UTC
Hello,

I made the changes as mentioned upstream (in the file ./roles/ceph-validate/tasks/check_eth_mon.yml). After making the changes, the playbook fails at the same task (TASK [ceph-validate : fail if br-ex is not active on servesha-ceph-test2]) as it did previously.

Best regards,
Servesha

Comment 13 Servesha 2019-03-12 10:08:05 UTC
Hello gabrioux,

I have a bridge created on node ssd1. ssd2 is my admin node and also a mon, mgr, and osd node. ssd3 is purely an osd node.
I have made the changes in the file /usr/share/ceph-ansible/roles/ceph-validate/tasks/check_eth_mon.yml as mentioned upstream.

Expected result: an error occurs while deploying a monitor on the node that contains the br-ex network interface.

Here are the details of my containerized cluster:

10.74.253.96 ssd2 - admin
10.74.254.21 ssd3
10.74.250.60 ssd1


ssd1 won't be accessible through ssh since it has br-ex.

Best regards,
Servesha

Comment 14 Dimitri Savineau 2019-03-13 18:24:42 UTC
@Servesha

The patch is working fine; however, the task is failing because your br-ex interface is down (active=false).
Could you try to set the interface up and rerun ceph-ansible?

$ ip link set br-ex up

Comment 15 Servesha 2019-03-18 14:18:30 UTC
Hello Dimitri,

Yeah sure. I will try and rerun the playbook.

Regards,
Servesha

Comment 16 Dimitri Savineau 2019-03-26 19:33:36 UTC
Any update on this ?

Comment 17 Servesha 2019-03-28 10:46:01 UTC
Hello Dimitri , 

I was using that environment to test a different task, so it is not available now. But since the patch is working, I have notified the customer to test the workaround and let us know the results. The case is now "WOC".

Thank you 

Best Regards,
Servesha

Comment 23 errata-xmlrpc 2019-04-30 15:56:43 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2019:0911

Comment 25 Rishabh Dave 2019-05-02 14:58:40 UTC
Hi, sorry for the late reply. I've made a slight change in the last sentence. I've replaced "converted in the inventory" by "converted in the facts" since that is more accurate.

