Bug 2160557

Summary: backport https://github.com/ostreedev/ostree/pull/2519/commits/cb731294837736e957ee595ce11ab115277dbb36 [rhel-8.6.0.z]
Product: Red Hat Enterprise Linux 8
Reporter: RHEL Program Management Team <pgm-rhel-tools>
Component: ostree
Assignee: Colin Walters <walters>
Status: CLOSED ERRATA
QA Contact: HuijingHei <hhei>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: ---
CC: aaradhak, dornelas, hhei, walters
Target Milestone: rc
Keywords: Triaged, ZStream
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: ostree-2022.1-3.el8_6
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: 2160075
Environment:
Last Closed: 2023-03-07 13:54:59 UTC
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 2160075
Bug Blocks:

Comment 4 Colin Walters 2023-01-13 13:40:30 UTC
There are a few levels at which this bug can be verified.  First, we'll need to queue an update.  The simplest way is e.g.

rpm-ostree kargs --append=foo=bar


Now, to synthesize the failure mode:

# Use strace 

We can use e.g. strace to cause the `sync` system call to hang, like `strace --inject=sync:delay_exit=1000s ...`; the simplest is to run `systemctl edit ostree-finalize-staged.service`, then add:

[Service]
ExecStop=
ExecStop=strace --inject=sync:delay_exit=400s ostree admin finalize-staged

or so.  This will cause systemd to time out and kill the service when we do `systemctl reboot`.  The patched ostree should successfully apply the karg; the unpatched one should not.
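
The drop-in above can also be written without an interactive editor, which may help when scripting the reproduction. This is a minimal sketch, assuming the override lives at `/etc/systemd/system/ostree-finalize-staged.service.d/override.conf` (the usual place `systemctl edit` stores a drop-in). The function name `write_finalize_dropin` and its directory argument are hypothetical, not part of ostree or systemd.

```shell
# Hypothetical helper: write the ExecStop override as a systemd drop-in.
# $1 is the unit directory root; it defaults to /etc/systemd/system.
write_finalize_dropin() {
    dropin_root="${1:-/etc/systemd/system}"
    dropin_dir="$dropin_root/ostree-finalize-staged.service.d"
    mkdir -p "$dropin_dir"
    # The empty ExecStop= clears the packaged command before replacing it.
    cat > "$dropin_dir/override.conf" <<'EOF'
[Service]
ExecStop=
ExecStop=strace --inject=sync:delay_exit=400s ostree admin finalize-staged
EOF
}
```

Run as root, something like `write_finalize_dropin && systemctl daemon-reload` followed by `systemctl reboot` should behave the same as the `systemctl edit` route.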

# Set up a network filesystem that fails to sync

Here you could set up e.g. an NFS mount, then use a firewall rule or other change to make the remote NFS server unreachable, so that sync() fails.
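
As a rough sketch only of what that could look like: the server address, export path, and mount point below are placeholders, and the helper `nfs_blackhole_cmds` is hypothetical. It only prints the commands so they can be reviewed before being run as root. It assumes NFS over TCP port 2049 and that `iptables` is available on the host; whether sync() then hangs or fails will depend on the mount options.

```shell
# Hypothetical helper: print (not run) the commands for an NFS mount
# followed by a firewall rule that drops traffic to the NFS server.
# $1 = server address, $2 = export path, $3 = local mount point.
nfs_blackhole_cmds() {
    nfs_server="$1"
    nfs_export="$2"
    nfs_mountpoint="$3"
    echo "mount -t nfs ${nfs_server}:${nfs_export} ${nfs_mountpoint}"
    # Assumption: dropping outbound TCP/2049 is enough to cut off the server.
    echo "iptables -I OUTPUT -d ${nfs_server} -p tcp --dport 2049 -j DROP"
}
```

For example, `nfs_blackhole_cmds 192.0.2.10 /export /mnt/nfs` prints the two commands for a server at a documentation-range address.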

# Set up OpenShift with Ceph (ODS)

This is obviously the full real test, but much more involved.

Comment 5 HuijingHei 2023-01-16 08:45:14 UTC
Verify passed with ostree-2022.1-3.el8_6.x86_64 on rhcos-411.86.202301102150-0-qemu.x86_64.qcow2

1) Test with old ostree-2022.1-2.el8.x86_64 using strace, reboot and check the karg is not applied

a) Setup
# rpm -q ostree
ostree-2022.1-2.el8.x86_64
# rpm-ostree kargs --append=foo=bar
# systemctl edit ostree-finalize-staged.service, then add

[Service]
ExecStop=
ExecStop=strace --inject=sync:delay_exit=400s ostree admin finalize-staged

b) Reboot and check the karg is not applied
# systemctl reboot

[root@cosa-devsh ~]# cat /proc/cmdline | grep foo
[root@cosa-devsh ~]# rpm-ostree status
State: idle
Warning: failed to finalize previous deployment
         ostree-finalize-staged.service: Failed with result 'timeout'.
         check `journalctl -b -1 -u ostree-finalize-staged.service`
Deployments:
* 31180811d9f312d82de7f243665176ee2d6a55cd5b4d0d59d869a49ec6623815
                   Version: 411.86.202301102150-0 (2023-01-10T21:53:17Z)


2) Upgrade to ostree-2022.1-3.el8_6.x86_64 with the same steps, reboot and check the karg is applied 
# rpm-ostree override replace ostree-2022.1-3.el8_6.x86_64.rpm ostree-libs-2022.1-3.el8_6.x86_64.rpm ostree-grub2-2022.1-3.el8_6.x86_64.rpm
# reboot

a) Setup
# rpm-ostree kargs --append=foo=bar
# systemctl edit ostree-finalize-staged.service, then add

[Service]
ExecStop=
ExecStop=strace --inject=sync:delay_exit=400s ostree admin finalize-staged

b) Reboot and check the karg is applied
# systemctl reboot

[root@cosa-devsh ~]# cat /proc/cmdline | grep foo
BOOT_IMAGE=(hd0,gpt3)/ostree/rhcos-bd733914976ee5d3d4dac5da3a0e10a10c75d2ecd289fc1cfe1f70aeb0e88db6/vmlinuz-4.18.0-372.36.1.el8_6.x86_64 random.trust_cpu=on console=tty0 console=ttyS0,115200n8 ignition.platform.id=qemu ostree=/ostree/boot.1/rhcos/bd733914976ee5d3d4dac5da3a0e10a10c75d2ecd289fc1cfe1f70aeb0e88db6/0 root=UUID=544a3180-8c4e-471c-86bd-c26724bb11a7 rw rootflags=prjquota boot=UUID=50910abb-dc78-46b2-bad7-f11ff8a51015 foo=bar
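
The `grep foo` check above matches any substring, so a scripted verification might prefer an exact whole-argument match. A minimal sketch, where the helper name `has_karg` is hypothetical:

```shell
# Hypothetical helper: succeed only if the karg ($1) appears as a whole,
# space-delimited argument in the kernel command line ($2).
has_karg() {
    case " $2 " in
        *" $1 "*) return 0 ;;
        *) return 1 ;;
    esac
}
```

After the reboot, `has_karg foo=bar "$(cat /proc/cmdline)" && echo applied` would report whether the staged karg made it into the booted deployment.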

Comment 11 errata-xmlrpc 2023-03-07 13:54:59 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (ostree bug fix and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2023:1135