Description of problem:
During the first OS boot of RHEL 8 after the leapp OS upgrade step of the 13->16.1.6 FFU, openvswitch failed to start with the following error:

error: Starting ovsdb-server ovsdb-server: /var/run/openvswitch/ovsdb-server.pid.tmp: create failed (Permission denied)

This issue was traced to the following systemd service files:

/etc/systemd/system/ovsdb-server.service
/etc/systemd/system/ovs-vswitchd.service

After removing these files, openvswitch could start normally. This environment has been upgraded many times and it's unclear when these files were inserted. The theory is that a previous openvswitch rpm or tripleo deployment placed the files (the environment has been upgraded since OSP 8). The upgrade should handle this situation, perhaps with a validation to protect against this failure.

Version-Release number of selected component (if applicable):
OSP 16.1.6 FFU

How reproducible:
100% with these config files

Additional info:

# cat /etc/systemd/system/ovsdb-server.service
[Unit]
Description=Open vSwitch Database Unit
After=syslog.target network-pre.target
Before=network.target network.service
Wants=ovs-delete-transient-ports.service
PartOf=openvswitch.service

[Service]
Type=forking
Restart=on-failure
EnvironmentFile=/etc/openvswitch/default.conf
EnvironmentFile=-/etc/sysconfig/openvswitch
ExecStartPre=/usr/bin/chown ${OVS_USER_ID} /var/run/openvswitch
ExecStartPre=/bin/sh -c 'rm -f /run/openvswitch/useropts; if [ "$${OVS_USER_ID/:*/}" != "root" ]; then /usr/bin/echo "OVSUSER=--ovs-user=${OVS_USER_ID}" > /run/openvswitch/useropts; fi'
EnvironmentFile=-/run/openvswitch/useropts
ExecStart=/usr/local/bin/ovs-ctl \
          --no-ovs-vswitchd --no-monitor --system-id=random \
          ${OVSUSER} \
          start $OPTIONS
ExecStop=/usr/local/bin/ovs-ctl --no-ovs-vswitchd stop
ExecReload=/usr/local/bin/ovs-ctl --no-ovs-vswitchd \
           ${OVSUSER} \
           --no-monitor restart $OPTIONS
RuntimeDirectory=openvswitch
RuntimeDirectoryMode=0755

# cat /etc/systemd/system/ovs-vswitchd.service
[Unit]
Description=Open vSwitch Forwarding Unit
After=ovsdb-server.service network-pre.target systemd-udev-settle.service
Before=network.target network.service
Requires=ovsdb-server.service
ReloadPropagatedFrom=ovsdb-server.service
AssertPathIsReadWrite=/var/run/openvswitch/db.sock
PartOf=openvswitch.service

[Service]
Type=forking
Restart=on-failure
Environment=HOME=/var/run/openvswitch
EnvironmentFile=/etc/openvswitch/default.conf
EnvironmentFile=-/etc/sysconfig/openvswitch
EnvironmentFile=-/run/openvswitch/useropts
ExecStartPre=-/bin/sh -c '/usr/bin/chown :$${OVS_USER_ID##*:} /dev/hugepages'
ExecStartPre=-/usr/bin/chmod 0775 /dev/hugepages
ExecStart=/usr/local/bin/ovs-ctl \
          --no-ovsdb-server --no-monitor --system-id=random \
          ${OVSUSER} \
          start $OPTIONS
ExecStop=/usr/local/bin/ovs-ctl --no-ovsdb-server stop
ExecReload=/usr/local/bin/ovs-ctl --no-ovsdb-server \
           --no-monitor --system-id=random \
           ${OVSUSER} \
           restart $OPTIONS
TimeoutSec=300
RuntimeDirectoryMode=0775
UMask=0002
Alex - this needs to be converted into a known issue for OSP 16.1 and 16.2. Generically speaking, any environment that was upgraded from a release earlier than OSP 13 through to OSP 13 may contain some /etc/systemd/system/ovs* files. If they are there, they need to be removed before starting the overcloud upgrade process, unless the customer put them there on purpose. We cannot remove them automatically because the customer may have placed those overrides there intentionally for other purposes. Having systemd service unit overrides is a perfectly valid thing to do if you know what you're doing.
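A manual pre-upgrade check for these files could look like the sketch below. It is illustrative only: the function name list_ovs_unit_overrides and its optional directory argument are hypothetical, and it only lists candidates. It deliberately removes nothing, because the overrides may be intentional.

```shell
#!/bin/bash
# Sketch: list local Open vSwitch unit files that may shadow the packaged ones.
# The function name and the optional directory argument are hypothetical.
list_ovs_unit_overrides() {
    local unit_dir="${1:-/etc/systemd/system}"
    # Nothing to report if the directory does not exist.
    [ -d "$unit_dir" ] || return 0
    # Match only top-level regular files named ovs* (symlinks and
    # drop-in directories are not listed).
    find "$unit_dir" -maxdepth 1 -type f -name 'ovs*' | sort
}

# Example: review the output by hand before removing anything.
# list_ovs_unit_overrides
```

Any file it prints should be reviewed before deletion. Running systemd-delta --type=overridden is another way to see which packaged units are shadowed by files under /etc.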
Flipping back to engineering and closing as WONTFIX. This is now documented, but a fix can't be implemented in the code because automating removal of files in /etc/systemd/system/ is undesirable. Docs point to this (engineering) BZ for more info.
*** Bug 2091818 has been marked as a duplicate of this bug. ***