sh-4.2# ovs-appctl ofproto/trace br0 in_port=32,tcp6,ipv6_src=fd01:0:0:2::1f,ipv6_dst=fd01:0:0:2::1e
2020-02-17T17:01:37Z|00001|daemon_unix|WARN|/var/run/openvswitch/ovs-vswitchd.pid: open: Read-only file system
ovs-appctl: cannot read pidfile "/var/run/openvswitch/ovs-vswitchd.pid" (Read-only file system)

If you manually specify the ctl file with "-t /var/run/openvswitch/ovs-vswitchd.4034.ctl", it works fine.

The problem seems to be that ovs-appctl is using a helper function to read the pid file that is also used by code that needs read/write access to the pid file...
Hi, I have a patch that I'd like to send upstream, but I'd like to know when it's useful, so that I can put that in the commit message and know what to reply if the maintainers ask. I mean, the .ctl file and the .pid file are in the same path and owned by the same user, so if you have write access to the .ctl file you should also have write access to the pid file. Can you please explain in more detail? Thank you
I don't have write access to the .ctl file; you don't need write access to a socket to be able to connect to it. You can reproduce yourself by doing something like:

sudo podman run -it --privileged --mount type=bind,src=/var/run/openvswitch,dst=/var/run/openvswitch,ro=true fedora:31 /bin/bash

then inside the container:

# dnf install -y openvswitch
...
# ovs-appctl ofproto/trace br0 in_port=32,tcp6,ipv6_src=fd01:0:0:2::1f,ipv6_dst=fd01:0:0:2::1e
ovs-appctl: cannot read pidfile "/var/run/openvswitch/ovs-vswitchd.pid" (Read-only file system)
# ovs-appctl -t /var/run/openvswitch/ovs-vswitchd.$(cat /var/run/openvswitch/ovs-vswitchd.pid).ctl ofproto/trace br0 in_port=32,tcp6,ipv6_src=fd01:0:0:2::1f,ipv6_dst=fd01:0:0:2::1e
br0: unknown bridge
ovs-appctl: /var/run/openvswitch/ovs-vswitchd.1159105.ctl: server returned an error

(which is expected since I don't actually have a br0, but it shows that it was actually talking to vswitchd)
Oh, as for "when it's useful"; in OpenShift, the pod running openshift-sdn needs to be able to make OVS-related calls, so it mounts the host's /var/run/openvswitch into its pod. But just as a general "minimum privilege" sort of thing, it mounts it read-only, since it only needs read access. openshift-sdn never calls ovs-appctl, so it doesn't actually hit this bug itself. I was just trying to debug a problem on the node, and was doing it from the context of the openshift-sdn pod, and was surprised to discover that certain commands that I would have expected to work actually didn't work.
Related problem: in ovn-kubernetes, where we do use ovs-appctl/ovn-appctl, I discovered a bunch of functions like this:

pid, err := ioutil.ReadFile(runner.ovnRunDir + "ovn-northd.pid")
if err != nil {
	return "", "", fmt.Errorf("failed to run the command since failed to get ovn-northd's pid: %v", err)
}
cmdArgs = []string{
	"-t",
	runner.ovnRunDir + fmt.Sprintf("ovn-northd.%s.ctl", strings.TrimSpace(string(pid))),
}
cmdArgs = append(cmdArgs, args...)

I pointed out that this was silly; you should be able to just say "ovn-appctl -t ovn-northd", and ovn-appctl will read the pid file itself and find the correct control socket. But in fact, that only works if you're in the same PID namespace as the process you're trying to connect to. If you're in a different PID namespace, then read_pidfile() screws things up:

$ ovn-appctl -t ovn-controller list-commands
2020-05-06T20:42:07Z|00001|daemon_unix|WARN|/var/run/ovn/ovn-controller.pid: stale pidfile for pid 60 being deleted by pid 0
ovn-appctl: cannot read pidfile "/var/run/ovn/ovn-controller.pid" (No such process)

If it just read the pidfile and used the pid to find the right socket file, then everything would work. But because it tries to do irrelevant "clever" things, it screws everything up.
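For illustration, "just read the pidfile and use the pid to find the socket" could look roughly like the Go sketch below. It is not the ovn-kubernetes code and not what ovs-appctl does internally. The helper names controlSocketPath and appctlArgs are made up for this example, and the "<rundir>/<target>.<pid>.ctl" naming is taken from the paths shown earlier in this bug. The pidfile is only ever opened read-only, and nothing checks whether the pid is visible in the caller's PID namespace.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strconv"
	"strings"
)

// controlSocketPath (hypothetical helper) turns raw pidfile contents into
// "<runDir>/<target>.<pid>.ctl". It only parses the number; it does not
// check that the process exists, so it is indifferent to PID namespaces.
func controlSocketPath(runDir, target string, pidfile []byte) (string, error) {
	pid, err := strconv.Atoi(strings.TrimSpace(string(pidfile)))
	if err != nil || pid <= 0 {
		return "", fmt.Errorf("unexpected pidfile contents %q for %s", pidfile, target)
	}
	return filepath.Join(runDir, fmt.Sprintf("%s.%d.ctl", target, pid)), nil
}

// appctlArgs (hypothetical helper) reads "<runDir>/<target>.pid" read-only
// and returns the arguments to pass to ovs-appctl/ovn-appctl.
func appctlArgs(runDir, target string, args ...string) ([]string, error) {
	contents, err := os.ReadFile(filepath.Join(runDir, target+".pid"))
	if err != nil {
		return nil, fmt.Errorf("reading pidfile for %s: %w", target, err)
	}
	sock, err := controlSocketPath(runDir, target, contents)
	if err != nil {
		return nil, err
	}
	return append([]string{"-t", sock}, args...), nil
}

func main() {
	// Example values only; the run directory depends on the deployment.
	args, err := appctlArgs("/var/run/ovn", "ovn-controller", "list-commands")
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		return
	}
	fmt.Println("ovn-appctl", strings.Join(args, " "))
}
```

Because os.ReadFile opens the file read-only and no locking or liveness check is attempted, a lookup along these lines should not be affected by a read-only bind mount or by a pid that is not visible in the caller's namespace.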
I saw the same errors:

sh-4.4# ovn-appctl -t ovnsb_db vlog/set dbg
2021-08-04T10:24:14Z|00001|daemon_unix|WARN|/var/run/ovn/ovnsb_db.pid: stale pidfile for pid 80 being deleted by pid 0
ovn-appctl: cannot read pidfile "/var/run/ovn/ovnsb_db.pid" (No such process)

while it clearly exists.