Bug 2055258 - randomizing the rootfs UUID is racing with mounting the rootfs
Summary: randomizing the rootfs UUID is racing with mounting the rootfs
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: RHCOS
Version: 4.9
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: medium
Target Milestone: ---
Target Release: 4.11.0
Assignee: Jonathan Lebon
QA Contact: Michael Nguyen
URL:
Whiteboard:
Depends On:
Blocks: 2055259
Reported: 2022-02-16 14:29 UTC by Micah Abbott
Modified: 2022-07-13 14:59 UTC

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of:
Clones: 2055259
Environment:
Last Closed: 2022-02-16 15:47:36 UTC
Target Upstream Version:
Embargoed:


Description Micah Abbott 2022-02-16 14:29:21 UTC
While testing a hotfix'ed RHEL 8.4 kernel, we observed the following backtrace when running the metal4K iso-install scenario:

```
[   11.581955] systemd[1]: Mounting /sysroot...
[   11.729982] SGI XFS with ACLs, security attributes, quota, no debug enabled
[   11.740353] XFS (dm-4): Mounting V5 Filesystem
[   11.748288] XFS (dm-4): Internal error !uuid_equal(&mp->m_sb.sb_uuid, &head->h_fs_uuid) at line 279 of file fs/xfs/xfs_log_recover.c.  Caller xlog_header_check_mount+0x35/0xb0 [xfs]
[   11.751400] CPU: 0 PID: 960 Comm: mount Not tainted 4.18.0-305.39.1.el8_4.x86_64 #1
[   11.752868] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
[   11.754380] Call Trace:
[   11.754923]  dump_stack+0x5c/0x80
[   11.755604]  xfs_corruption_error+0x8b/0x90 [xfs]
[   11.756546]  ? xlog_header_check_mount+0x35/0xb0 [xfs]
[   11.757559]  ? xlog_do_io+0x8d/0x130 [xfs]
[   11.758376]  xlog_header_check_mount+0x5f/0xb0 [xfs]
[   11.759362]  ? xlog_header_check_mount+0x35/0xb0 [xfs]
[   11.760377]  xlog_find_verify_log_record+0x115/0x230 [xfs]
[   11.761454]  xlog_find_head+0x1c3/0x390 [xfs]
[   11.762319]  xlog_find_tail+0x44/0x350 [xfs]
[   11.763171]  ? try_to_wake_up+0x1cc/0x590
[   11.763966]  xlog_recover+0x2b/0x160 [xfs]
[   11.764785]  ? xfs_trans_ail_init+0xbd/0xd0 [xfs]
[   11.765717]  xfs_log_mount+0x280/0x2a0 [xfs]
[   11.766566]  xfs_mountfs+0x44f/0x8c0 [xfs]
[   11.767393]  xfs_fc_fill_super+0x318/0x560 [xfs]
[   11.768308]  ? xfs_mount_free+0x30/0x30 [xfs]
[   11.769174]  get_tree_bdev+0x186/0x260
[   11.769911]  vfs_get_tree+0x25/0xb0
[   11.770609]  do_mount+0x2e2/0x950
[   11.771271]  ? memdup_user+0x4b/0x70
[   11.771966]  ksys_mount+0xb6/0xd0
[   11.772622]  __x64_sys_mount+0x21/0x30
[   11.773376]  do_syscall_64+0x5b/0x1a0
[   11.774110]  entry_SYSCALL_64_after_hwframe+0x65/0xca
[   11.775097] RIP: 0033:0x7f9bcc5a292e
[   11.775801] Code: 48 8b 0d 5d 15 2c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 49 89 ca b8 a5 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 2a 15 2c 00 f7 d8 64 89 01 48
[   11.779370] RSP: 002b:00007ffeaa990d78 EFLAGS: 00000246 ORIG_RAX: 00000000000000a5
[   11.780822] RAX: ffffffffffffffda RBX: 0000556c27db3a70 RCX: 00007f9bcc5a292e
[   11.782198] RDX: 0000556c27db5c80 RSI: 0000556c27db3c50 RDI: 0000556c27db4960
[   11.783566] RBP: 00007f9bcd34f184 R08: 0000000000000000 R09: 0000000000000002
[   11.784934] R10: 00000000c0ed0000 R11: 0000000000000246 R12: 0000000000000000
[   11.786308] R13: 00000000c0ed0000 R14: 0000556c27db4960 R15: 0000556c27db5c80
[   11.787685] XFS (dm-4): Corruption detected. Unmount and run xfs_repair
[   11.788963] XFS (dm-4): log has mismatched uuid - can't recover
[   11.790110] XFS (dm-4): failed to find log head
[   11.790992] XFS (dm-4): log mount/recovery failed: error -117
[   11.792532] XFS (dm-4): log mount failed
[   11.795059] ignition-ostree-firstboot-uuid[952]: Clearing log and setting UUID
[   11.797491] ignition-ostree-firstboot-uuid[952]: writing all SBs
[   11.798745] ignition-ostree-firstboot-uuid[952]: new UUID = 642c68d7-863e-4f1a-ace9-373f424e5ba6
[FAILED] Failed to mount /sysroot.
```
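
To make the ordering in the log above explicit, here is a minimal Python sketch that parses three lines copied from it and compares their timestamps. The regular expression and the names `LOG`, `parse`, `mount_start` and `uuid_rewrite` are illustrative assumptions, not part of any RHCOS tooling. Console timestamps record when a message was logged, which is not necessarily when the corresponding write reached the disk, so this only shows that the two operations overlapped in time.

```python
import re

# Three lines copied verbatim from the console log in this report.
LOG = """\
[   11.581955] systemd[1]: Mounting /sysroot...
[   11.792532] XFS (dm-4): log mount failed
[   11.795059] ignition-ostree-firstboot-uuid[952]: Clearing log and setting UUID
"""

# Matches "[   <seconds>.<micros>] <message>".
LINE_RE = re.compile(r"^\[\s*(?P<ts>\d+\.\d+)\]\s+(?P<msg>.*)$")


def parse(log):
    """Return a list of (timestamp_seconds, message) tuples."""
    events = []
    for line in log.splitlines():
        match = LINE_RE.match(line)
        if match:
            events.append((float(match.group("ts")), match.group("msg")))
    return events


events = parse(LOG)
mount_start = next(ts for ts, msg in events if "Mounting /sysroot" in msg)
uuid_rewrite = next(ts for ts, msg in events if "setting UUID" in msg)

# The mount of /sysroot began before the UUID rewrite was logged.
print(f"mount began {uuid_rewrite - mount_start:.3f}s before the UUID rewrite was logged")
```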


The same trace was previously reported upstream at https://github.com/coreos/fedora-coreos-tracker/issues/619#issuecomment-683925234

Backporting https://github.com/coreos/fedora-coreos-config/pull/1357 to RHCOS 4.9 is expected to fix the race condition.
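
As a rough illustration of why the race produces the `uuid_equal` failure in the trace, the following Python sketch models a filesystem as two UUID fields: one in the superblock and one stamped in the log header. It is a deliberately simplified model, not the kernel or `xfs_admin` implementation. The class `Disk`, the functions `randomize_uuid_steps` and `try_mount`, and the order of the two writes (taken from the order of the "Clearing log and setting UUID" and "writing all SBs" messages) are all assumptions made for illustration.

```python
import uuid
from dataclasses import dataclass


@dataclass
class Disk:
    sb_uuid: uuid.UUID   # UUID stored in the superblock
    log_uuid: uuid.UUID  # UUID stamped in the log header


def randomize_uuid_steps(disk, new_uuid):
    """Rewrite the UUID in two separate writes, yielding after each one
    so that a caller can interleave a mount attempt (hypothetical model)."""
    disk.log_uuid = new_uuid  # "Clearing log and setting UUID"
    yield
    disk.sb_uuid = new_uuid   # "writing all SBs"
    yield


def try_mount(disk):
    """Mimic the check named in the trace: the superblock UUID must
    equal the UUID found in the log header."""
    return disk.sb_uuid == disk.log_uuid


old, new = uuid.uuid4(), uuid.uuid4()

# Racy ordering: the mount runs between the two writes.
disk = Disk(sb_uuid=old, log_uuid=old)
steps = randomize_uuid_steps(disk, new)
next(steps)
racy_ok = try_mount(disk)

# Serialized ordering: the mount runs only after the rewrite has finished.
disk = Disk(sb_uuid=old, log_uuid=old)
for _ in randomize_uuid_steps(disk, new):
    pass
serialized_ok = try_mount(disk)

print("racy mount succeeds:", racy_ok)
print("serialized mount succeeds:", serialized_ok)
```

The serialized case only illustrates the ordering property a fix would need to guarantee. It does not describe what the linked pull request actually changes.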

Comment 1 Jonathan Lebon 2022-02-16 15:47:36 UTC
This patch is already in 4.11.

