Bug 1738896 - ovsdb-server.service fails with /var/run/openvswitch/ovsdb-server.pid.tmp: create failed (Permission denied)
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: openvswitch
Version: 31
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: urgent
Target Milestone: ---
Assignee: Flavio Leitner
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Duplicates: 1745049
Depends On:
Blocks:
Reported: 2019-08-08 11:30 UTC by Daniel Berrangé
Modified: 2020-02-14 09:05 UTC (History)

Fixed In Version: openvswitch-2.12.0-1.fc31
Clone Of:
Environment:
Last Closed: 2019-09-17 02:18:52 UTC
Type: Bug
Embargoed:



Description Daniel Berrangé 2019-08-08 11:30:25 UTC
Description of problem:
On Fedora 31 I'm unable to start openvswitch

# systemctl start openvswitch
A dependency job for openvswitch.service failed. See 'journalctl -xe' for details.

Journal reports

Aug 08 12:24:41 localhost.localdomain systemd[1]: ovsdb-server.service: Failed with result 'exit-code'.
Aug 08 12:24:41 localhost.localdomain systemd[1]: Failed to start Open vSwitch Database Unit.


The log file /var/log/openvswitch/ovsdb-server.log shows

2019-08-08T11:27:38.240Z|00001|vlog|INFO|opened log file /var/log/openvswitch/ovsdb-server.log
2019-08-08T11:27:38.248Z|00002|daemon_unix|EMER|/var/run/openvswitch/ovsdb-server.pid.tmp: create failed (Permission denied)

I ran setenforce 0 and it still fails, so this is not SELinux related.

The referenced directory, /var/run/openvswitch, does not even exist.

It works correctly on Fedora 30, so this is a regression in rawhide.

Version-Release number of selected component (if applicable):
openvswitch-2.11.1-3.fc31.x86_64

How reproducible:
Always

Steps to Reproduce:
1. # systemctl start openvswitch
2.
3.

Actual results:
Fails to start

Expected results:


Additional info:

Comment 1 Ben Cotton 2019-08-13 16:47:37 UTC
This bug appears to have been reported against 'rawhide' during the Fedora 31 development cycle.
Changing version to '31'.

Comment 2 Flavio Leitner 2019-08-21 14:39:09 UTC
Looks like systemd has changed the behavior again:

This is the service unit, hacked to show details during service startup:

# cat /etc/systemd/system/ovsdb-server.service 
[Unit]
Description=Open vSwitch Database Unit
After=syslog.target network-pre.target
Before=network.target network.service
Wants=ovs-delete-transient-ports.service
PartOf=openvswitch.service

[Service]
Type=forking
Restart=on-failure
EnvironmentFile=/etc/openvswitch/default.conf
EnvironmentFile=-/etc/sysconfig/openvswitch
ExecStartPre=/usr/bin/chown ${OVS_USER_ID} /var/run/openvswitch /var/log/openvswitch
ExecStartPre=/bin/sh -c 'rm -f /run/openvswitch/useropts; if [ "$${OVS_USER_ID/:*/}" != "root" ]; then /usr/bin/echo "OVSUSER=--ovs-user=${OVS_USER_ID}" > /run/openvswitch/useropts; fi'
ExecStartPre=/usr/bin/ls -lad /var/run/openvswitch /var/log/openvswitch
ExecStartPre=/usr/bin/echo "OVS_USER_ID: ${OVS_USER_ID}  OVSUSER: ${OVSUSER}"
EnvironmentFile=-/run/openvswitch/useropts
ExecStart=/usr/share/openvswitch/scripts/ovs-ctl \
          --no-ovs-vswitchd --no-monitor --system-id=random \
          ${OVSUSER} \
          start $OPTIONS
ExecStop=/usr/share/openvswitch/scripts/ovs-ctl --no-ovs-vswitchd stop
ExecReload=/usr/share/openvswitch/scripts/ovs-ctl --no-ovs-vswitchd \
           ${OVSUSER} \
           --no-monitor restart $OPTIONS
RuntimeDirectory=openvswitch
RuntimeDirectoryMode=0755


and this is the output:
systemd[1]: Starting Open vSwitch Database Unit...
ls[3246]: drwxr-x---. 2 openvswitch hugetlbfs 4096 Aug 21 10:21 /var/log/openvswitch
ls[3246]: drwxr-xr-x  2 root        root        60 Aug 21 10:25 /var/run/openvswitch
echo[3247]: OVS_USER_ID: openvswitch:hugetlbfs  OVSUSER: --ovs-user=openvswitch:hugetlbfs
ovsdb-server[3294]: ovs|00002|daemon_unix|EMER|/var/run/openvswitch/ovsdb-server.pid.tmp: create failed (Permission denied)
ovs-ctl[3248]: Starting ovsdb-server ovsdb-server: /var/run/openvswitch/ovsdb-server.pid.tmp: create failed (Permission denied)


Therefore, even though the OvS service chowns the runtime directory in ExecStartPre, its ownership is reset to root:root before the daemon starts.
This is likely another instance of bz#1508495.
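
The ownership reset seen in the ls output above can also be checked with a small helper. This is only a sketch under stated assumptions: the function name owner_matches is made up for illustration (it is not part of Open vSwitch or systemd), and it relies on GNU coreutils stat, as shipped on Fedora.

```shell
#!/bin/sh
# Hypothetical helper (illustration only): succeed if DIR is owned by the
# expected "user:group" pair, otherwise report the mismatch on stderr.
# Assumes GNU coreutils stat, which supports -c with %U (user) and %G (group).
owner_matches() {
    dir=$1
    expected=$2
    # Return 2 if the directory cannot be inspected (for example, missing).
    actual=$(stat -c '%U:%G' "$dir") || return 2
    if [ "$actual" != "$expected" ]; then
        echo "$dir is owned by $actual, expected $expected" >&2
        return 1
    fi
}

# Example, using the values from the output above:
#   owner_matches /var/run/openvswitch openvswitch:hugetlbfs
```

Run as an extra ExecStartPre step, a check like this would presumably fail in the same place where ls shows root:root.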

Comment 3 Flavio Leitner 2019-08-23 14:26:22 UTC
*** Bug 1745049 has been marked as a duplicate of this bug. ***

Comment 4 Zbigniew Jędrzejewski-Szmek 2019-08-26 14:40:35 UTC
So the original fix was merged in https://github.com/systemd/systemd/commit/30c81ce2cef97b05e143c7adf4cd1b1c5fb59932,
but then https://github.com/systemd/systemd/commit/206e9864de undid that fix, because people
(other people) were complaining that directory ownership is not updated.

Dunno, the best solution might be to stop using RuntimeDirectory for this service and create the
directory in ExecStartPre.
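
As a rough sketch of that suggestion, the directory could be created by a script step instead of RuntimeDirectory=. The helper name make_runtime_dir and its arguments are hypothetical, and this is not the change that was eventually applied to Open vSwitch; it only illustrates the idea.

```shell
#!/bin/sh
# Hypothetical helper (illustration only): create a runtime directory with
# the given mode and, when running as root, hand it to "user:group".
make_runtime_dir() {
    dir=$1
    owner=$2
    mode=${3:-0755}
    mkdir -p "$dir" || return 1
    chmod "$mode" "$dir" || return 1
    # Only root may give the directory to another user, so skip chown otherwise.
    if [ "$(id -u)" -eq 0 ]; then
        chown "$owner" "$dir" || return 1
    fi
}

# Example, assuming OVS_USER_ID holds a value such as openvswitch:hugetlbfs:
#   make_runtime_dir /run/openvswitch "$OVS_USER_ID" 0755
```

With the directory created this way and the RuntimeDirectory= lines dropped from the unit, systemd would not be managing the directory, so it should have no reason to reset its ownership.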

Comment 5 Flavio Leitner 2019-08-27 16:23:40 UTC
Aaron, what do you think about comment#4?

Comment 6 Aaron Conole 2019-08-27 17:01:08 UTC
I think we probably need to fix our usage of the systemd scripts to set up the user and pass the capabilities.

The runtime directory was fixed with:

7a65e5a9252a ("rhel: let *-ctl handle runtime directory")

Probably we can backport it if needed.

Comment 7 Vladimir Benes 2019-09-02 11:29:48 UTC
(In reply to Aaron Conole from comment #6)
> I think we probably need to fix our usage of the systemd scripts to set up
> the user and pass the capabilities.
> 
> The runtime directory was fixed with:
> 
> 7a65e5a9252a ("rhel: let *-ctl handle runtime directory")
> 
> Probably we can backport it if needed.

Any update here?

Comment 8 Zbigniew Jędrzejewski-Szmek 2019-09-03 09:43:39 UTC
After "sleeping" on this for a bit, I think changing the scripts is the only long-term solution.
The change in 206e9864de makes a lot of sense and is not going to be reverted.

Comment 9 Flavio Leitner 2019-09-03 23:29:22 UTC
Hi,


OvS 2.12 is about to be released upstream, and the plan is to push it to Rawhide next.
I tested the current branch-2.12 and it works; see the output below:

[root@localhost ~]# cat /etc/fedora-release 
Fedora release 31 (Thirty One)
[root@localhost ~]# rpm -q openvswitch 
openvswitch-2.12.90-1.fc31.x86_64
[root@localhost ~]# systemctl start openvswitch 
[root@localhost ~]# systemctl status  openvswitch 
● openvswitch.service - Open vSwitch
   Loaded: loaded (/usr/lib/systemd/system/openvswitch.service; disabled; vendor preset: disabled)
   Active: active (exited) since Tue 2019-09-03 20:27:48 -03; 5s ago
  Process: 44776 ExecStart=/bin/true (code=exited, status=0/SUCCESS)
 Main PID: 44776 (code=exited, status=0/SUCCESS)
      CPU: 2ms

Sep 03 20:27:48 localhost.localdomain systemd[1]: Starting Open vSwitch...
Sep 03 20:27:48 localhost.localdomain systemd[1]: Started Open vSwitch.
[root@localhost ~]# systemctl status  ovsdb-server
● ovsdb-server.service - Open vSwitch Database Unit
   Loaded: loaded (/usr/lib/systemd/system/ovsdb-server.service; static; vendor preset: disabled)
   Active: active (running) since Tue 2019-09-03 20:27:48 -03; 13s ago
  Process: 44660 ExecStartPre=/usr/bin/chown ${OVS_USER_ID} /var/run/openvswitch /var/log/openvswitch (code=exited, status=0/SUCCESS)
  Process: 44661 ExecStartPre=/bin/sh -c rm -f /run/openvswitch.useropts; /usr/bin/echo "OVS_USER_ID=${OVS_USER_ID}" > /run/openvswitch.user>
  Process: 44664 ExecStartPre=/bin/sh -c if [ "$${OVS_USER_ID/:*/}" != "root" ]; then /usr/bin/echo "OVS_USER_OPT=--ovs-user=${OVS_USER_ID}">
  Process: 44666 ExecStart=/usr/share/openvswitch/scripts/ovs-ctl --no-ovs-vswitchd --no-monitor --system-id=random ${OVS_USER_OPT} start $O>
 Main PID: 44712 (ovsdb-server)
    Tasks: 1
   Memory: 4.1M
      CPU: 364ms
   CGroup: /system.slice/ovsdb-server.service
           └─44712 ovsdb-server /etc/openvswitch/conf.db -vconsole:emer -vsyslog:err -vfile:info --remote=punix:/var/run/openvswitch/db.sock>

Sep 03 20:27:47 localhost.localdomain systemd[1]: Starting Open vSwitch Database Unit...
Sep 03 20:27:48 localhost.localdomain ovs-ctl[44666]: [35B blob data]
Sep 03 20:27:48 localhost.localdomain ovs-vsctl[44713]: ovs|00001|vsctl|INFO|Called as ovs-vsctl --no-wait -- init -- set Open_vSwitch . db->
Sep 03 20:27:48 localhost.localdomain ovs-vsctl[44718]: ovs|00001|vsctl|INFO|Called as ovs-vsctl --no-wait set Open_vSwitch . ovs-version=2.>
Sep 03 20:27:48 localhost.localdomain ovs-ctl[44666]: [49B blob data]
Sep 03 20:27:48 localhost.localdomain ovs-vsctl[44723]: ovs|00001|vsctl|INFO|Called as ovs-vsctl --no-wait set Open_vSwitch . external-ids:h>
Sep 03 20:27:48 localhost.localdomain ovs-ctl[44666]: [44B blob data]
Sep 03 20:27:48 localhost.localdomain systemd[1]: Started Open vSwitch Database Unit.

fbl

Comment 10 Vladimir Benes 2019-09-04 07:54:09 UTC
What is the plan for F31? We have the same broken behavior there. We need a different fix for F31, as a rebase is very likely not the way to go there.

Comment 11 Flavio Leitner 2019-09-04 11:03:55 UTC
(In reply to Vladimir Benes from comment #10)
> What is the plan for F31? We have the same broken behavior there. We need
> a different fix for F31, as a rebase is very likely not the way to go there.

The test in comment#9 was run on F31/Rawhide (before the final freeze).

Comment 12 Fedora Update System 2019-09-14 05:30:51 UTC
FEDORA-2019-22194b7c56 has been submitted as an update to Fedora 31. https://bodhi.fedoraproject.org/updates/FEDORA-2019-22194b7c56

Comment 13 Fedora Update System 2019-09-15 01:13:30 UTC
openvswitch-2.12.0-1.fc31 has been pushed to the Fedora 31 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2019-22194b7c56

Comment 14 Fedora Update System 2019-09-17 02:18:52 UTC
openvswitch-2.12.0-1.fc31 has been pushed to the Fedora 31 stable repository. If problems still persist, please make note of it in this bug report.

