We have observed that ppc64le RHCOS builds fail to modify the hugetlbfs group to add the openvswitch user, which causes OVS startup to fail because it cannot chown directories to the right group/user.

This appears to be because RPM's /bin/sh does not "set -e" and therefore errors returned by useradd/groupadd/usermod get ignored and do not terminate the %pre script.

%pre
getent group openvswitch >/dev/null || groupadd -r openvswitch
getent passwd openvswitch >/dev/null || \
    useradd -r -g openvswitch -d / -s /sbin/nologin \
    -c "Open vSwitch Daemons" openvswitch

%ifarch %{dpdkarches}
getent group hugetlbfs >/dev/null || groupadd hugetlbfs
usermod -a -G hugetlbfs openvswitch
%endif
exit 0

This causes issues like:

[2023-05-02T18:30:05.713Z] openvswitch3.1.prein: usermod.rpmostreesave: /etc/passwd.6: lock file already used
[2023-05-02T18:30:05.713Z] openvswitch3.1.prein: usermod.rpmostreesave: cannot lock /etc/passwd; try again later.

Appending "|| exit 1" to those commands should help surface such errors.
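To illustrate the difference, here is a minimal sketch that does not touch real users or groups. The names failing_step, current_pre and proposed_pre are hypothetical stand-ins: failing_step plays the role of a usermod call that fails, and the two functions mimic the scriptlet without and with "|| exit 1".

```shell
#!/bin/sh
# Hypothetical stand-in for a usermod call that fails (e.g. on a lock-file error).
failing_step() { return 1; }

# Mimics the current %pre: the failure is ignored and the trailing "exit 0" wins.
# The body runs in a subshell so that "exit" only ends the function.
current_pre() (
    failing_step
    exit 0
)

# Mimics the proposed change: "|| exit 1" stops the scriptlet on failure.
proposed_pre() (
    failing_step || exit 1
    exit 0
)

current_pre && echo "current: scriptlet reported success despite the failure"
proposed_pre || echo "proposed: scriptlet reported failure"
```

Whether a failing %pre is acceptable for the package is a separate question; the sketch only shows how the exit status changes.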
Is it safer to add set -euo pipefail at the top of the pre script?
(In reply to Mark Hamzy from comment #1)
> Is it safer to add set -euo pipefail at the top of the pre script?

I'm not a bash expert so it may well look/work nicer to do that instead. I'll leave it to OVS team.
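For comparison, a rough sketch of how "set -euo pipefail" would behave, assuming /bin/sh is bash (pipefail is a bash option, not plain POSIX sh). The scriptlet text is a hypothetical stand-in, with "false" playing the role of a failing usermod. It runs in a separate bash process because bash ignores "set -e" inside a function that is called on the left side of "||".

```shell
#!/bin/bash
# Hypothetical stand-in for the %pre body; "false" plays the failing usermod.
scriptlet='
set -euo pipefail
true || false    # guarded "a || b" list: not affected by set -e
false            # unguarded failure: the script aborts here
exit 0           # never reached
'

# Run the body in its own bash process and capture its exit status.
rc=0
bash -c "$scriptlet" || rc=$?
echo "strict scriptlet exit status: $rc"
```

Note that the existing "getent ... || groupadd ..." lines would keep working as before under "set -e", since only unguarded failures abort the script.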
I wonder if there is anything related to that in the packaging guidelines.
(In reply to Flavio Leitner from comment #3)
> I wonder if there is anything related to that in the packaging guidelines.

There isn't; OVS appears to use it correctly. Unfortunately it looks like a long-running shadow-utils issue that we may only see in RHCOS/FCOS:

https://github.com/coreos/fedora-coreos-tracker/issues/1250
Applicable Fedora packaging guidelines are https://docs.fedoraproject.org/en-US/packaging-guidelines/UsersAndGroups/#_rationale_for_some_of_the_implementation_choices which says:

---
The exit 0 at the end will result in the %pre scriptlet passing through even if the user/group creation fails for some reason. This is suboptimal but has less potential for system wide breakage than allowing it to fail. If the user/group aren't available at the time the package's payload is unpacked, rpm will fall back to setting those files owned by root.
---

So there is a tradeoff that may or may not be appropriate for OVS packages in a *non*-RHCOS/FCOS context.
On Fedora, the openvswitch package uses a sysusers file to create the group and the user "dynamically". I guess I can update the RHEL spec file to use that too, which should work in your scenario. What do you think? Can you try to build the Fedora openvswitch spec file and see if you still have the problem?
*** Bug 2196275 has been marked as a duplicate of this bug. ***
Added openvswitch.sysusers and openvswitch-hugetlbfs.sysusers to follow the new Fedora guidelines (https://docs.fedoraproject.org/en-US/packaging-guidelines/UsersAndGroups/#_dynamic_allocation)
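For readers unfamiliar with the format, a sketch of what such sysusers fragments might look like. The file names come from the comment above, but the contents are an assumption derived from the original %pre scriptlet and the sysusers.d(5) line format, not copied from the package. The script only writes the fragments to a temporary directory and prints them.

```shell
#!/bin/sh
# Illustration only: contents are assumed, not taken from the actual package.
outdir=$(mktemp -d)

# "u" creates the user and a group of the same name with a dynamic ID.
cat > "$outdir/openvswitch.sysusers" <<'EOF'
#Type Name        ID GECOS                  Home Shell
u     openvswitch -  "Open vSwitch Daemons" /    /sbin/nologin
EOF

# "g" creates the group, "m" adds the user to it.
cat > "$outdir/openvswitch-hugetlbfs.sysusers" <<'EOF'
#Type Name        ID
g     hugetlbfs   -
m     openvswitch hugetlbfs
EOF

cat "$outdir"/*.sysusers
```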
Hi,

I have a couple of questions:
- Is this issue specific to ppc64le?
- Is this something that can be reproduced manually and, if so, what are the steps?

Thanks,
Rick
Moving needinfo to the reporter.
The fix was confirmed to work fine in RHEL, but does not work in RHCOS because some systemd-sysusers macro bits are missing, as described in:

https://github.com/openshift/os/issues/1274#issuecomment-1597742858

https://github.com/openshift/os/pull/1318 was the OpenShift workaround until the systemd RPM macros can be fixed.

We can call this bug VERIFIED as the problem is not with OVS on RHEL, but with systemd macros on RHCOS.
Marking BZ Verified per comment 13.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (openvswitch3.1 bug fix and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2023:3989