Bug 1803920 - ovs-appctl fails "read-only" operations if it doesn't have write access to ovs-vswitchd pid file
Keywords:
Status: ASSIGNED
Alias: None
Product: Red Hat Enterprise Linux Fast Datapath
Classification: Red Hat
Component: openvswitch
Version: FDP 20.A
Hardware: Unspecified
OS: Unspecified
Priority: low
Severity: low
Target Milestone: ---
Target Release: ---
Assignee: Timothy Redaelli
QA Contact: qding
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2020-02-17 17:03 UTC by Dan Winship
Modified: 2023-07-13 07:25 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:
Target Upstream Version:
Embargoed:

Links
System ID Private Priority Status Summary Last Updated
Red Hat Issue Tracker FD-451 0 None None None 2022-02-22 05:45:12 UTC

Description Dan Winship 2020-02-17 17:03:58 UTC
sh-4.2# ovs-appctl ofproto/trace br0 in_port=32,tcp6,ipv6_src=fd01:0:0:2::1f,ipv6_dst=fd01:0:0:2::1e
2020-02-17T17:01:37Z|00001|daemon_unix|WARN|/var/run/openvswitch/ovs-vswitchd.pid: open: Read-only file system
ovs-appctl: cannot read pidfile "/var/run/openvswitch/ovs-vswitchd.pid" (Read-only file system)


If you manually specify the ctl file with "-t /var/run/openvswitch/ovs-vswitchd.4034.ctl", it works fine.

The problem seems to be that ovs-appctl is using a helper function to read the pid file that is also used by code that needs read/write access to the pid file...

Comment 1 Timothy Redaelli 2020-03-10 17:59:15 UTC
Hi,

I have a patch that I'd like to send upstream, but I'd like to know when it's useful,
so I can put that in the commit message and know what to reply if they ask me about it.

I mean, the .ctl file and the .pid file are in the same directory and owned by the same user,
so if you have write access to the .ctl file you should also have write access to the .pid file.

Can you please explain the use case in more detail?

Thank you

Comment 2 Dan Winship 2020-03-10 19:01:47 UTC
I don't have write access to the .ctl file; you don't need write access to a socket to be able to connect to it.

You can reproduce yourself by doing something like:

    sudo podman run -it --privileged --mount type=bind,src=/var/run/openvswitch,dst=/var/run/openvswitch,ro=true fedora:31 /bin/bash

then inside the container:

    # dnf install -y openvswitch
    ...

    # ovs-appctl ofproto/trace br0 in_port=32,tcp6,ipv6_src=fd01:0:0:2::1f,ipv6_dst=fd01:0:0:2::1e
    ovs-appctl: cannot read pidfile "/var/run/openvswitch/ovs-vswitchd.pid" (Read-only file system)

    # ovs-appctl -t /var/run/openvswitch/ovs-vswitchd.$(cat /var/run/openvswitch/ovs-vswitchd.pid).ctl ofproto/trace br0 in_port=32,tcp6,ipv6_src=fd01:0:0:2::1f,ipv6_dst=fd01:0:0:2::1e
    br0: unknown bridge
    ovs-appctl: /var/run/openvswitch/ovs-vswitchd.1159105.ctl: server returned an error

(which is expected since I don't actually have a br0, but it shows that it was actually talking to vswitchd)

Comment 3 Dan Winship 2020-03-10 19:07:50 UTC
Oh, as for "when it's useful"; in OpenShift, the pod running openshift-sdn needs to be able to make OVS-related calls, so it mounts the host's /var/run/openvswitch into its pod. But just as a general "minimum privilege" sort of thing, it mounts it read-only, since it only needs read access.

openshift-sdn never calls ovs-appctl, so it doesn't actually hit this bug itself. I was just trying to debug a problem on the node, and was doing it from the context of the openshift-sdn pod, and was surprised to discover that certain commands that I would have expected to work actually didn't work.

Comment 4 Dan Winship 2020-05-08 13:07:25 UTC
Related problem: in ovn-kubernetes, where we do use ovs-appctl/ovn-appctl, I discovered a bunch of functions like this:

	pid, err := ioutil.ReadFile(runner.ovnRunDir + "ovn-northd.pid")
	if err != nil {
		return "", "", fmt.Errorf("failed to run the command since failed to get ovn-northd's pid: %v", err)
	}

	cmdArgs = []string{
		"-t",
		runner.ovnRunDir + fmt.Sprintf("ovn-northd.%s.ctl", strings.TrimSpace(string(pid))),
	}
	cmdArgs = append(cmdArgs, args...)

I pointed out that this was silly; you should be able to just say "ovn-appctl -t ovn-northd", and ovn-appctl will read the pid file itself and find the correct control socket. But in fact, that only works if you're in the same PID namespace as the process you're trying to connect to. If you're in a different PID namespace, then read_pidfile() screws things up:

    $ ovn-appctl -t ovn-controller list-commands
    2020-05-06T20:42:07Z|00001|daemon_unix|WARN|/var/run/ovn/ovn-controller.pid: stale pidfile for pid 60
 being deleted by pid 0
    ovn-appctl: cannot read pidfile "/var/run/ovn/ovn-controller.pid" (No such process)

If it just read the pidfile and used the pid to find the right socket file, then everything would work. But because it tries to do irrelevant "clever" things, it screws everything up.

Comment 5 Surya Seetharaman 2021-08-04 10:25:10 UTC
I saw the same errors:

sh-4.4# ovn-appctl -t ovnsb_db vlog/set dbg
2021-08-04T10:24:14Z|00001|daemon_unix|WARN|/var/run/ovn/ovnsb_db.pid: stale pidfile for pid 80
 being deleted by pid 0
ovn-appctl: cannot read pidfile "/var/run/ovn/ovnsb_db.pid" (No such process)

even though the process clearly exists.
