Bug 1645049 - Ansible Networking - nuking Neutron server.log in case of connection failure + logged message
Summary: Ansible Networking - nuking Neutron server.log in case of connection failure ...
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: python-networking-ansible
Version: 14.0 (Rocky)
Hardware: Unspecified
OS: Linux
Priority: medium
Severity: medium
Target Milestone: Upstream M2
Target Release: ---
Assignee: Dan Radez
QA Contact: Arkady Shtempler
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2018-11-01 10:18 UTC by Arkady Shtempler
Modified: 2019-08-29 16:43 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-08-29 16:43:21 UTC
Target Upstream Version:
Embargoed:



Description Arkady Shtempler 2018-11-01 10:18:26 UTC
When an incorrect IP address is set for the switch:


neutron-ml2-ansible.yaml:
resource_registry:
  OS::TripleO::Services::NeutronCorePlugin: OS::TripleO::Services::NeutronCorePluginML2Ansible
parameter_defaults:
  NeutronMechanismDrivers: openvswitch,ansible
  NeutronTypeDrivers: local,vxlan,vlan,flat
  NeutronNetworkType: vlan
  ML2HostConfigs:
    switch1:
      ansible_network_os: junos
      ansible_host: 10.9.95.26 # nonexistent IP
      ansible_user: ansible
      ansible_ssh_pass: N3tAutomation!
      #manage_vlans: false
                  
There are two issues:

1 - Loop
The error message is logged to the Neutron server.log roughly every 2-3 seconds:

For example:
2018-11-01 07:20:40.866 34 ERROR neutron.plugins.ml2.managers  fatal: [switch1]: FAILED! => {"msg": "[Errno -2] Name or service not known"}
[root@overcloud-controller-0 heat-admin]# cat /var/log/containers/neutron/server.log | grep 'Name or service not known' | grep switch1 | grep 2018
2018-11-01 06:55:29.903 33 ERROR neutron.plugins.ml2.managers  fatal: [switch1]: FAILED! => {"msg": "[Errno -2] Name or service not known"}
2018-11-01 06:55:32.359 33 ERROR neutron.plugins.ml2.managers  fatal: [switch1]: FAILED! => {"msg": "[Errno -2] Name or service not known"}
2018-11-01 06:55:34.744 33 ERROR neutron.plugins.ml2.managers  fatal: [switch1]: FAILED! => {"msg": "[Errno -2] Name or service not known"}
2018-11-01 06:55:37.545 33 ERROR neutron.plugins.ml2.managers  fatal: [switch1]: FAILED! => {"msg": "[Errno -2] Name or service not known"}
2018-11-01 06:55:39.930 33 ERROR neutron.plugins.ml2.managers  fatal: [switch1]: FAILED! => {"msg": "[Errno -2] Name or service not known"}
2018-11-01 06:55:42.275 33 ERROR neutron.plugins.ml2.managers  fatal: [switch1]: FAILED! => {"msg": "[Errno -2] Name or service not known"}
2018-11-01 06:55:44.665 33 ERROR neutron.plugins.ml2.managers  fatal: [switch1]: FAILED! => {"msg": "[Errno -2] Name or service not known"}
2018-11-01 06:55:47.126 33 ERROR neutron.plugins.ml2.managers  fatal: [switch1]: FAILED! => {"msg": "[Errno -2] Name or service not known"}
2018-11-01 06:55:49.641 33 ERROR neutron.plugins.ml2.managers  fatal: [switch1]: FAILED! => {"msg": "[Errno -2] Name or service not known"}
2018-11-01 06:55:52.005 33 ERROR neutron.plugins.ml2.managers  fatal: [switch1]: FAILED! => {"msg": "[Errno -2] Name or service not known"}
2018-11-01 06:55:53.426 29 ERROR neutron.plugins.ml2.managers  fatal: [switch1]: FAILED! => {"msg": "[Errno -2] Name or service not known"}
2018-11-01 06:55:56.054 29 ERROR neutron.plugins.ml2.managers  fatal: [switch1]: FAILED! => {"msg": "[Errno -2] Name or service not known"}
2018-11-01 06:55:58.478 29 ERROR neutron.plugins.ml2.managers  fatal: [switch1]: FAILED! => {"msg": "[Errno -2] Name or service not known"}

As far as I know, a retry mechanism in such cases usually starts with a short delay and increases it after each failure until a final/fatal decision is made.
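
To illustrate the kind of retry behaviour described above, here is a minimal sketch of exponential backoff with a capped number of attempts. It is not taken from networking-ansible or Neutron; the function name retry_with_backoff and all of its parameters are hypothetical, and treating OSError as the retryable failure is an assumption.

```python
import time


def retry_with_backoff(operation, max_attempts=5, base_delay=1.0,
                       factor=2.0, max_delay=60.0, sleep=time.sleep):
    """Call operation() until it succeeds or max_attempts is reached.

    Hypothetical helper: the delay starts at base_delay and is multiplied
    by factor after each failure, capped at max_delay.
    """
    delay = base_delay
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except OSError:
            # Assumption: connection problems surface as OSError subclasses.
            if attempt == max_attempts:
                raise  # final/fatal decision: stop retrying and propagate
            sleep(delay)
            delay = min(delay * factor, max_delay)
```

With these defaults a permanently unreachable switch would be tried five times, with waits of 1, 2, 4 and 8 seconds between attempts, and would then fail once instead of logging an error every 2-3 seconds.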

2 - More meaningful message
The actual message is: "Name or service not known"

I think we should improve the experience for operators. First, look
into why the loop retries so many times; second, provide a more
meaningful message.
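
As a rough idea of what a more meaningful message could look like, the sketch below turns a low-level socket error into an operator-facing sentence that names the switch, the configured address, and the setting to check. The function describe_connection_failure and its wording are hypothetical and are not part of networking-ansible; only the standard-library exception types are real.

```python
import socket


def describe_connection_failure(host_name, address, exc):
    """Build an operator-friendly message for a failed switch connection.

    Hypothetical helper: host_name is the ML2HostConfigs key (e.g. switch1),
    address is the configured ansible_host value.
    """
    if isinstance(exc, socket.gaierror):
        reason = "the address could not be resolved"
    elif isinstance(exc, socket.timeout):
        reason = "the connection timed out"
    elif isinstance(exc, ConnectionRefusedError):
        reason = "the connection was refused"
    else:
        reason = "the connection failed"
    return (
        f"Cannot reach switch '{host_name}' at {address}: {reason} ({exc}). "
        "Check the ansible_host value in ML2HostConfigs."
    )
```

A message of this shape tells the operator which switch entry is affected and where to look, rather than only repeating the resolver error.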

Comment 1 Arkady Shtempler 2019-08-29 16:43:21 UTC
Not reproducible

