1)- after a fresh deployment of overcloud, openvswitch-agent is showed as dead, making the any tenant instance deployment fails by not able to bind the port. After reboot of compute node, the patchport is present again, and all operations on the VM level (deployment, reboot, shutdown, etc) are working again. Investigating the issue looks still in the same family than https://bugzilla.redhat.com/show_bug.cgi?id=1451082 for -close inherited filedescriptors after forking -Allow rootwrap-daemon to timeout and exit If the client side abnormally exits, its rootwrap daemon cannot receive a shutdown message and will be left forever. Let it timeout and exit to save such cases. openstack-neutron-9.3.1-2.1.el7ost.noarch openstack-neutron-openvswitch-9.3.1-2.1.el7ost.noarch openstack-neutron-sriov-nic-agent-9.3.1-2.1.el7ost.noarch openvswitch-2.6.1-10.git20161206.el7fdp.x86_64 python-neutron-9.3.1-2.1.el7ost.noarch python-openvswitch-2.6.1-10.git20161206.el7fdp.noarch python-ryu-4.9-2.1.el7ost.noarch python-ryu-common-4.9-2.1.el7ost.noarch python-oslo-rootwrap-5.1.1-1.el7ost.noarc Neutron and python-ryu are marked for last available erratas for RHOSP 10. [openvswitch-agent.log] - Server was deployed and rootwrap daemon process with pid=8775 1 2017-08-22 10:47:56.512 8542 INFO neutron.common.config [-] Logging enabled! 2 2017-08-22 10:47:56.512 8542 INFO neutron.common.config [-] /usr/bin/neutron-openvswitch-agent version 9.3.1 3 2017-08-22 10:47:56.514 8542 WARNING oslo_config.cfg [-] Option "rabbit_hosts" from group "oslo_messaging_rabbit" is deprecated for removal. Its value may be silently ignored in the future. 4 2017-08-22 10:47:56.515 8542 WARNING oslo_config.cfg [-] Option "rabbit_password" from group "oslo_messaging_rabbit" is deprecated for removal. Its value may be silently ignored in the future. 5 2017-08-22 10:47:56.515 8542 WARNING oslo_config.cfg [-] Option "rabbit_userid" from group "oslo_messaging_rabbit" is deprecated for removal. Its value may be silently ignored in the future. 6 2017-08-22 10:47:56.517 8542 INFO ryu.base.app_manager [-] loading app neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.ovs_ryuapp 7 2017-08-22 10:47:56.821 8542 INFO ryu.base.app_manager [-] loading app ryu.app.ofctl.service 8 2017-08-22 10:47:56.822 8542 INFO ryu.base.app_manager [-] loading app ryu.controller.ofp_handler 9 2017-08-22 10:47:56.827 8542 INFO ryu.base.app_manager [-] instantiating app neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.ovs_ryuapp of OVSNeutronAgentRyuApp 10 2017-08-22 10:47:56.827 8542 INFO ryu.base.app_manager [-] instantiating app ryu.controller.ofp_handler of OFPHandler 11 2017-08-22 10:47:56.828 8542 INFO ryu.base.app_manager [-] instantiating app ryu.app.ofctl.service of OfctlService 12 2017-08-22 10:47:57.031 8542 INFO oslo_rootwrap.client [-] Spawned new rootwrap daemon process with pid=8775 -- Then at 14:50 logging was enabled that includes restarting services 329 2017-08-22 14:50:57.114 24253 INFO neutron.common.config [-] Logging enabled! --first signals ryu.lib.hub was no able to connect as pid=8775 was still using the socket. 508 2017-08-22 14:50:57.318 24253 INFO ryu.base.app_manager [-] loading app ryu.app.ofctl.service 509 2017-08-22 14:50:57.319 24253 INFO ryu.base.app_manager [-] loading app ryu.controller.ofp_handler 510 2017-08-22 14:50:57.319 24253 INFO ryu.base.app_manager [-] instantiating app neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.ovs_ryuapp of OVSNeutronAgentRyuApp 511 2017-08-22 14:50:57.320 24253 INFO ryu.base.app_manager [-] instantiating app ryu.controller.ofp_handler of OFPHandler 512 2017-08-22 14:50:57.320 24253 INFO ryu.base.app_manager [-] instantiating app ryu.app.ofctl.service of OfctlService 513 2017-08-22 14:50:57.323 24253 DEBUG neutron.callbacks.manager [-] Subscribe: <function init_handler at 0x5a4a488> Open vSwitch agent after_init subscribe /usr/lib/python2.7/site-packages/neutron/callbacks/manager.py:42 514 2017-08-22 14:50:57.323 24253 DEBUG neutron.agent.linux.utils [-] Running command: ['ip', 'addr', 'show', 'to', '172.17.2.13'] create_process /usr/lib/python2.7/site-packages/neutron/agent/linux/utils.py:89 515 2017-08-22 14:50:57.350 24253 ERROR ryu.lib.hub [-] hub: uncaught exception: Traceback (most recent call last): 516 File "/usr/lib/python2.7/site-packages/ryu/lib/hub.py", line 54, in _launch 517 return func(*args, **kwargs) 518 File "/usr/lib/python2.7/site-packages/ryu/controller/controller.py", line 97, in __call__ 519 self.ofp_ssl_listen_port) 520 File "/usr/lib/python2.7/site-packages/ryu/controller/controller.py", line 120, in server_loop 521 datapath_connection_factory) 522 File "/usr/lib/python2.7/site-packages/ryu/lib/hub.py", line 117, in __init__ 523 self.server = eventlet.listen(listen_info) 524 File "/usr/lib/python2.7/site-packages/eventlet/convenience.py", line 43, in listen 525 sock.bind(addr) 526 File "/usr/lib64/python2.7/socket.py", line 224, in meth 527 return getattr(self._sock,name)(*args) 528 error: [Errno 98] Address already in use 529 This is mentioned there is a previos pid using the socket. this was associated with pid being in sleep state 8775 0.0 0.0 193388 2832 ?S 10:47 0:00 sudo neutron-rootwrap-daemon /etc/neutron/rootwrap.conf (this is a sudo wrapper allowing neutron run ovs client and similars operations at root) Noticed the State: S (sleeping), Pocess in S or D state is usually in a blocking system call, such as reading or writing to a file or the network, waiting for a called program to finish, or while waiting on semaphores or other synchronization primitives. It will go into the sleep state while waiting. on our scenario the pid was waiting on the child 8776 strace -p 8775 Process 8775 attached select(13, [6 12], [], NULL, NULL ps -ef | grep -i 8775 root 8775 1 0 10:47 ?00:00:00 sudo neutron-rootwrap-daemon /etc/neutron/rootwrap.conf root 8776 8775 0 10:47 ?00:00:00 /usr/bin/python2 /usr/bin/neutron-rootwrap-daemon /etc/neutron/rootwrap.conf Checking the socket was in used for 8775 [root@overcloud-compute-0neutron]# ss -lnp|grep 6633 tcp LISTEN 51 50 127.0.0.1:6633 *:* users:(("sudo",pid=8775,fd=5)) [root@overcloud-compute-0neutron]# netstat -tulpn |grep 6633 tcp 51 0 127.0.0.1:6633 0.0.0.0:* LISTEN 8775/sudo [root@overcloud-compute-0neutron]# ps -aux | grep 8775 root 8775 0.0 0.0 193388 2832 ?S 10:47 0:00 sudo neutron-rootwrap-daemon /etc/neutron/rootwrap.conf After killing the child one and freeing the socket service was able to start again, neutron agent-list showing openvswitch.agent being alive again as bein able to run the OMMAND=/bin/neutron-rootwrap /etc/neutron/rootwrap.conf ovsdb-client monitor Interface name,ofport,external_ids --format=json Aug 22 15:55:47 overcloud-compute-0 sudo[28614]: neutron : TTY=unknown ; PWD=/ ; USER=root ; COMMAND=/bin/neutron-rootwrap-daemon /etc/neutron/rootwrap.conf Aug 22 15:55:47 overcloud-compute-0 sudo[28622]: neutron : TTY=unknown ; PWD=/ ; USER=root ; COMMAND=/bin/neutron-rootwrap /etc/neutron/rootwrap.conf ovsdb-client monitor Interface name,ofport,external_ids --format=json --OVS proccesses being restarted correctly. [root@overcloud-compute-0neutron]# ps -ef | grep ovs root 28334 1 0 15:50 ?00:00:00 ovsdb-server /etc/openvswitch/conf.db -vconsole:emer -vsyslog:err -vfile:info --remote=punix:/var/run/openvswitch/db.sock --private-key=db:Open_vSwitch,SSL,private_key --certificate=db:Open_vSwitch,SSL,certificate --bootstrap-ca-cert=db:Open_vSwitch,SSL,ca_cert --no-chdir --log-file=/var/log/openvswitch/ovsdb-server.log --pidfile=/var/run/openvswitch/ovsdb-server.pid --detach root 28368 1 0 15:50 ?00:00:00 ovs-vswitchd unix:/var/run/openvswitch/db.sock -vconsole:emer -vsyslog:err -vfile:info --mlockall --no-chdir --log-file=/var/log/openvswitch/ovs-vswitchd.log --pidfile=/var/run/openvswitch/ovs-vswitchd.pid --detach root 28622 28589 0 15:55 ?00:00:00 sudo neutron-rootwrap /etc/neutron/rootwrap.conf ovsdb-client monitor Interface name,ofport,external_ids --format=json root 28624 28622 0 15:55 ?00:00:00 /usr/bin/python2 /usr/bin/neutron-rootwrap /etc/neutron/rootwrap.conf ovsdb-client monitor Interface name,ofport,external_ids --format=json root 28628 28624 0 15:55 ?00:00:00 /bin/ovsdb-client monitor Interface name,ofport,external_ids --format=json -Rest of the logs entries: 1584 2017-08-22 15:55:42.361 28589 INFO ryu.base.app_manager [-] loading app ryu.app.ofctl.service 1585 2017-08-22 15:55:42.362 28589 INFO ryu.base.app_manager [-] loading app ryu.controller.ofp_handler 1586 2017-08-22 15:55:42.362 28589 INFO ryu.base.app_manager [-] instantiating app neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.ovs_ryuapp of OVSNeutronAgentRyuApp 1587 2017-08-22 15:55:42.362 28589 INFO ryu.base.app_manager [-] instantiating app ryu.controller.ofp_handler of OFPHandler 1588 2017-08-22 15:55:42.363 28589 INFO ryu.base.app_manager [-] instantiating app ryu.app.ofctl.service of OfctlService 1589 2017-08-22 15:55:42.364 28589 DEBUG neutron.callbacks.manager [-] Subscribe: <function init_handler at 0x4d04488> Open vSwitch agent after_init subscribe /usr/lib/python2.7/site-packages/neutron/callbacks/manager.py:42 1590 2017-08-22 15:55:42.365 28589 DEBUG neutron.agent.linux.utils [-] Running command: ['ip', 'addr', 'show', 'to', '172.17.2.13'] create_process /usr/lib/python2.7/site-packages/neutron/agent/linux/utils.py:89 1591 2017-08-22 15:55:42.375 28589 DEBUG neutron.agent.linux.utils [-] Exit code: 0 execute /usr/lib/python2.7/site-packages/neutron/agent/linux/utils.py:150 1592 2017-08-22 15:55:42.440 28589 DEBUG neutron.agent.ovsdb.impl_idl [-] Running txn command(idx=0): AddBridgeCommand(datapath_type=system, may_exist=True, name=br-int) do_commit /usr/lib/python2.7/site-packages/neutron/agent/ovsdb/impl_idl.py:98 1593 2017-08-22 15:55:42.440 28589 DEBUG neutron.agent.ovsdb.impl_idl [-] Transaction caused no change do_commit /usr/lib/python2.7/site-packages/neutron/agent/ovsdb/impl_idl.py:125 root@overcloud-compute-0neutron]# systemctl restart neutron-openvswitch-agent.service [root@overcloud-compute-0neutron]# systemctl -l status neutron-openvswitch-agent.service ● neutron-openvswitch-agent.service - OpenStack Neutron Open vSwitch Agent Loaded: loaded (/usr/lib/systemd/system/neutron-openvswitch-agent.service; enabled; vendor preset: disabled) Active: active (running) since Tue 2017-08-22 15:55:41 UTC; 14s ago Process: 28582 ExecStartPre=/usr/bin/neutron-enable-bridge-firewall.sh (code=exited, status=0/SUCCESS) Main PID: 28589 (neutron-openvsw) CGroup: /system.slice/neutron-openvswitch-agent.service ├─28589 /usr/bin/python2 /usr/bin/neutron-openvswitch-agent --config-file /usr/share/neutron/neutron-dist.conf --config-file /etc/neutron/neutron.conf --config-file /etc/neutron/plugins/ml2/openvswitch_agent.ini --config-dir /etc/neutron/conf.d/common --config-dir /etc/neutron/conf.d/neutron-openvswitch-agent --log-file /var/log/neutron/openvswitch-agent.log ├─28614 sudo neutron-rootwrap-daemon /etc/neutron/rootwrap.conf ├─28615 /usr/bin/python2 /usr/bin/neutron-rootwrap-daemon /etc/neutron/rootwrap.conf ├─28622 sudo neutron-rootwrap /etc/neutron/rootwrap.conf ovsdb-client monitor Interface name,ofport,external_ids --format=json ├─28624 /usr/bin/python2 /usr/bin/neutron-rootwrap /etc/neutron/rootwrap.conf ovsdb-client monitor Interface name,ofport,external_ids --format=json └─28628 /bin/ovsdb-client monitor Interface name,ofport,external_ids --format=json Aug 22 15:55:41 overcloud-compute-0 neutron-enable-bridge-firewall.sh[28582]: net.bridge.bridge-nf-call-arptables = 1 Aug 22 15:55:41 overcloud-compute-0 neutron-enable-bridge-firewall.sh[28582]: net.bridge.bridge-nf-call-iptables = 1 Aug 22 15:55:41 overcloud-compute-0 neutron-enable-bridge-firewall.sh[28582]: net.bridge.bridge-nf-call-ip6tables = 1 Aug 22 15:55:41 overcloud-compute-0 systemd[1]: Started OpenStack Neutron Open vSwitch Agent. Aug 22 15:55:41 overcloud-compute-0 neutron-openvswitch-agent[28589]: Guru meditation now registers SIGUSR1 and SIGUSR2 by default for backward compatibility. SIGUSR1 will no longer be registered in a future release, so please use SIGUSR2 to generate reports. Aug 22 15:55:41 overcloud-compute-0 neutron-openvswitch-agent[28589]: Option "verbose" from group "DEFAULT" is deprecated for removal. Its value may be silently ignored in the future. Aug 22 15:55:41 overcloud-compute-0 neutron-openvswitch-agent[28589]: Option "rpc_backend" from group "DEFAULT" is deprecated for removal. Its value may be silently ignored in the future. Aug 22 15:55:42 overcloud-compute-0 neutron-openvswitch-agent[28589]: Option "notification_driver" from group "DEFAULT" is deprecated. Use option "driver" from group "oslo_messaging_notifications". Aug 22 15:55:47 overcloud-compute-0 sudo[28614]: neutron : TTY=unknown ; PWD=/ ; USER=root ; COMMAND=/bin/neutron-rootwrap-daemon /etc/neutron/rootwrap.conf Aug 22 15:55:47 overcloud-compute-0 sudo[28622]: neutron : TTY=unknown ; PWD=/ ; USER=root ; COMMAND=/bin/neutron-rootwrap /etc/neutron/rootwrap.conf ovsdb-client monitor Interface name,ofport,external_ids --format=json
Archiving this bugzilla as duplicated of https://bugzilla.redhat.com/show_bug.cgi?id=1451082 covered officially as per: https://access.redhat.com/errata/RHBA-2017:2653.