
Bug 2026827

Summary: Failed to switch root: Failed to determine whether root path '/sysroot' contains an OS tree: Input/output error
Product: Red Hat Satellite
Reporter: Ram Nainsingh Tiruwa <ramsingh>
Component: Provisioning
Assignee: satellite6-bugs <satellite6-bugs>
Status: CLOSED NOTABUG
QA Contact: Satellite QE Team <sat-qe-bz-list>
Severity: high
Docs Contact:
Priority: high
Version: 6.9.6
CC: inecas, lstejska, lzap, saydas, sganar, sshtein, stephane_lapie
Target Milestone: Unspecified
Keywords: Triaged
Target Release: Unused
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2023-05-16 08:38:15 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Ram Nainsingh Tiruwa 2021-11-26 04:32:10 UTC
Description of problem:

Tried provisioning a RHEL 8.5 system from Satellite and got the following error:

Failed to switch root: Failed to determine whether root path '/sysroot' contains an OS tree: Input/output error

Log messages when booting:

 kernel: EXT4-fs error (device dm-0): ext4_find_entry:1446: inode #1841: comm systemd: reading directory lblock 0
 kernel: Buffer I/O error on dev dm-0, logical block 0, lost sync page write
 systemctl[1580]: Failed to switch root: Failed to determine whether root path '/sysroot' contains an OS tree: Input/output error
 kernel: EXT4-fs (dm-0): I/O error while writing superblock
 systemd[1]: initrd-switch-root.service: Main process exited, code=exited, status=1/FAILURE
 systemd[1]: initrd-switch-root.service: Failed with result 'exit-code'.
 systemd[1]: Failed to start Switch Root.
 systemd[1]: Startup finished in 7.058s (kernel) + 0 (initrd) + 17.902s (userspace) = 24.961s.
 systemd[1]: initrd-switch-root.service: Triggering OnFailure= dependencies.
 systemd[1]: Starting Setup Virtual Console...
 systemd[1]: systemd-vconsole-setup.service: Succeeded.
 systemd[1]: Started Setup Virtual Console.
 systemd[1]: Started Emergency Shell.
 systemd[1]: Reached target Emergency Mode.
 systemd[1]: Received SIGRTMIN+21 from PID 973 (plymouthd).
 kernel: EXT4-fs error (device dm-0): ext4_find_entry:1446: inode #2605: comm plymouthd: reading directory lblock 0

Steps to reproduce: sync the RHEL 8.5 kickstart repository and try building a host.

Comment 4 Sayan Das 2022-02-15 15:44:11 UTC
FYI, I have never faced this issue with RHEL 8.5 and Satellite 6.9/6.10/7.0.

I simply make sure to use at least 4 GB of RAM and 12+ GB of storage during deployment.

Comment 5 Stéphane Lapie 2022-02-25 00:06:58 UTC
Also confirming the exact same problem in a PXE kickstart scenario on a VM on vSphere infrastructure, when building eight RHEL 8.5 VMs at once.
Of note, this happened without Red Hat Satellite, in a very plain PXE/TFTP/HTTP Cobbler deployment environment.

I will try to provide a more meaningful log file as is suggested,
but in the meantime here is the transcript of what shows on the VM console when it happens (which is the exact output that led me to this bug report).

-------

[   59.305552] EXT4-fs error (device dm-0): ext4_find_entry:1446: inode #1841: comm systemd: reading directory lblock 0
[   59.305900] Buffer I/O error on dev dm-0, logical block 0, lost sync page write
[   59.306252] EXT4-fs (dm-0): I/O error while writing superblock
[FAILED] Failed to start Switch Root.
See 'systemctl status initrd-switch-root.service' for details.
[   59.338477] EXT4-fs error (device dm-0): ext4_find_entry:1446: inode #2605: comm plymouthd: reading directory lblock 0
[   59.339756] Buffer I/O error on dev dm-0, logical block 0, lost sync page write
[   59.341017] EXT4-fs (dm-0): I/O error while writing superblock
[   59.342352] EXT4-fs error (device dm-0): ext4_find_entry:1446: inode #12: comm plymouthd: reading directory lblock 0
[   59.344740] Buffer I/O error on dev dm-0, logical block 0, lost sync page write
[   59.345938] EXT4-fs (dm-0): I/O error while writing superblock
Warning: /dev/root does not exist

Generating "/run/initramfs/rdsosreport.txt"
[   59.374744] Buffer I/O error on dev dm-0, logical block 786416, async page read

Entering emergency mode. Exit the shell to continue.
Type "journalctl" to view system logs.
You might want to save "/run/initramfs/rdsosreport.txt" to a USB stick or /boot
after mounting them and attach it to a bug report.


:/# [   63.968192] JBD2: Error -5 detected when updating journal superblock for dm-0-8.
[   63.969668] Aborting journal on device dm-0-8.
[   63.971224] JBD2: Error -5 detected when updating journal superblock for dm-0-8.

Comment 6 Stéphane Lapie 2022-02-25 02:22:11 UTC
:/run/initramfs# dmsetup deps /dev/dm-0
2 dependencies  : (7, 2) (7, 1)
:/run/initramfs# cat /proc/partitions | grep -w 7
   7        0     690660 loop0
   7        1    3145728 loop1
   7        2   33554432 loop2
:/run/initramfs# losetup -l
NAME       SIZELIMIT OFFSET AUTOCLEAR RO BACK-FILE                        DIO LOG-SEC
/dev/loop1         0      0         0  1 /Live/OS/rootfs.img                0     512
/dev/loop2         0      0         0  0 /overlay                           0     512
/dev/loop0         0      0         0  1 /tmp/curl_fetch_url1/install.img   0     512
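
For readers unfamiliar with the output above: each "(major, minor)" pair printed by dmsetup deps names a block device, and major number 7 is the Linux loop driver, which is why the grep on /proc/partitions lists only loop devices. A minimal sketch of that mapping follows; the function name deps_to_devices is hypothetical and not part of any tool used in this report.

```shell
#!/bin/sh
# Illustrative helper (hypothetical name): turn one line of `dmsetup deps`
# output into device names. Major 7 is the loop driver, so (7, N) is
# /dev/loopN; other majors are printed as "major:minor".
deps_to_devices() {
    # Drop everything up to the first ':' and strip the punctuation,
    # leaving a flat list such as "7 2 7 1".
    pairs=$(printf '%s\n' "${1#*:}" | tr -d '(),')
    # Word splitting is intended here: one positional parameter per number.
    set -- $pairs
    while [ "$#" -ge 2 ]; do
        if [ "$1" -eq 7 ]; then
            echo "/dev/loop$2"
        else
            echo "$1:$2"
        fi
        shift 2
    done
}

# Line copied from the transcript above; prints /dev/loop2 then /dev/loop1.
deps_to_devices "2 dependencies  : (7, 2) (7, 1)"
```

Combined with the losetup listing, this gives loop2 (backed by /overlay, read-write) and loop1 (backed by /Live/OS/rootfs.img, read-only) as the two devices underneath dm-0.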

dm-0 is therefore a device-mapper overlay built on loop1 (the read-only /Live/OS/rootfs.img) and loop2 (the writable /overlay file), and writes to that overlay are failing.
This is further confirmed by the following kernel messages:

[   59.169880] kickstart.machine kernel: loop: Write error at byte offset 29339648, length 4096.
[   59.170352] kickstart.machine kernel: loop: Write error at byte offset 29343744, length 4096.
[   59.170776] kickstart.machine kernel: loop: Write error at byte offset 29347840, length 4096.
[   59.171191] kickstart.machine kernel: loop: Write error at byte offset 29351936, length 4096.
[   59.171598] kickstart.machine kernel: loop: Write error at byte offset 29356032, length 4096.
[   59.171987] kickstart.machine kernel: loop: Write error at byte offset 29360128, length 4096.
[   59.172484] kickstart.machine kernel: blk_update_request: I/O error, dev loop2, sector 57304 op 0x1:(WRITE) flags 0x800 phys_seg 1 prio class 0
[   59.172884] kickstart.machine kernel: blk_update_request: I/O error, dev loop2, sector 57312 op 0x1:(WRITE) flags 0x800 phys_seg 1 prio class 0
[   59.173283] kickstart.machine kernel: blk_update_request: I/O error, dev loop2, sector 57320 op 0x1:(WRITE) flags 0x800 phys_seg 1 prio class 0
[   59.173665] kickstart.machine kernel: blk_update_request: I/O error, dev loop2, sector 57328 op 0x1:(WRITE) flags 0x800 phys_seg 1 prio class 0
[   59.174038] kickstart.machine kernel: blk_update_request: I/O error, dev loop2, sector 57336 op 0x1:(WRITE) flags 0x800 phys_seg 1 prio class 0
[   59.174417] kickstart.machine kernel: blk_update_request: I/O error, dev loop2, sector 57344 op 0x1:(WRITE) flags 0x800 phys_seg 1 prio class 0
[   59.174825] kickstart.machine kernel: loop: Write error at byte offset 29364224, length 4096.
[   59.175198] kickstart.machine kernel: loop: Write error at byte offset 29368320, length 4096.
[   59.175558] kickstart.machine kernel: loop: Write error at byte offset 29372416, length 4096.
[   59.175902] kickstart.machine kernel: loop: Write error at byte offset 29376512, length 4096.
[   59.176403] kickstart.machine kernel: blk_update_request: I/O error, dev loop2, sector 57352 op 0x1:(WRITE) flags 0x800 phys_seg 1 prio class 0
[   59.176759] kickstart.machine kernel: blk_update_request: I/O error, dev loop2, sector 57360 op 0x1:(WRITE) flags 0x800 phys_seg 1 prio class 0
[   59.177113] kickstart.machine kernel: blk_update_request: I/O error, dev loop2, sector 57368 op 0x1:(WRITE) flags 0x800 phys_seg 1 prio class 0
[   59.177467] kickstart.machine kernel: blk_update_request: I/O error, dev loop2, sector 57376 op 0x1:(WRITE) flags 0x800 phys_seg 1 prio class 0

I now wonder whether this could be a free-memory problem, given that I was building VMs with a small memory footprint (2 GB of RAM, which seems to be below the minimum system requirement for RHEL 8).
That does not quite explain why the failure is not consistent, but a failure to write to a sparse overlay file behind a loop device would indicate a lack of space on the ramdisk, and therefore a lack of available memory, depending on how much data is logged. That amount could be a little higher when the install process is slow because many VMs are being built at the same time.
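
As a cross-check of the kernel messages above, the "loop: Write error at byte offset" lines and the "blk_update_request ... dev loop2, sector" lines can be shown to refer to the same blocks: the kernel counts sectors in units of 512 bytes, and /proc/partitions reports sizes in 1 KiB blocks. The sketch below only redoes that arithmetic with values copied from this report; the function names are hypothetical.

```shell
#!/bin/sh
# Convert a byte offset reported by the loop driver into a 512-byte sector
# number, the unit used in blk_update_request messages.
offset_to_sector() {
    echo $(( $1 / 512 ))
}

# Convert a size from /proc/partitions (1 KiB blocks) into whole MiB.
kib_to_mib() {
    echo $(( $1 / 1024 ))
}

offset_to_sector 29339648   # first loop write error   -> 57304
offset_to_sector 29360128   # sixth loop write error   -> 57344
kib_to_mib 33554432         # loop2, backed by /overlay     -> 32768 (32 GiB)
kib_to_mib 3145728          # loop1, backed by rootfs.img   -> 3072 (3 GiB)
```

The sector numbers match the first and sixth blk_update_request lines, so both message types describe the same failed writes on loop2. The writes fail roughly 28 MiB into a backing file that advertises 32 GiB, which is consistent with (though not proof of) the sparse-file explanation: the file is far larger than what the ramdisk can actually hold.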

Comment 7 Stéphane Lapie 2022-02-25 13:53:58 UTC
Comparing build results: with 4 GB of RAM I have so far had 0 failures out of 40 VMs (built 8 at a time), whereas with 2 GB at least one or two VMs would show the above error.

Additionally, trying to create a new file in the emergency shell (with an "echo test > test" command) fails with a "No space left on device" error, which lends credence to a shortage of disk space being the root cause.

Now, I wonder if this is in part caused by updates to the access time of files when reading stuff from the file system...
Of course, having the recommended minimal amount of memory is a must,
but it might make sense to have the loop device be mounted with the option "noatime",
thus avoiding modifications that are not needed?

Comment 8 Lukas Zapletal 2022-03-01 10:22:10 UTC
Hello,

Anaconda in RHEL 8 requires a minimum of 3 GB of memory:

https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/8/html/performing_a_standard_rhel_installation/system-requirements-reference_installing-rhel

With Satellite, a lot is going on during the installation post scriptlet, so we recommend 4 GB of RAM or more.

The consequences of Anaconda running out of memory are usually bad and very random.
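
One way to catch an undersized VM early, rather than through random I/O errors, is a memory check at the start of the installation, for example from a kickstart %pre section. The following is only a sketch under stated assumptions: it is not part of any Satellite or Cobbler template, the function names are hypothetical, and the 3584 MiB threshold is an arbitrary margin chosen because MemTotal in /proc/meminfo reports somewhat less than the RAM assigned to the VM.

```shell
#!/bin/sh
# Hypothetical pre-flight memory check (not an official template).

# Print MemTotal in KiB from a meminfo-format file (default: /proc/meminfo).
mem_total_kib() {
    while read -r key value rest; do
        if [ "$key" = "MemTotal:" ]; then
            echo "$value"
            return 0
        fi
    done < "${1:-/proc/meminfo}"
    return 1
}

# Succeed if MemTotal is at least $1 KiB; $2 is an optional meminfo path.
has_enough_memory() {
    [ "$(mem_total_kib "$2")" -ge "$1" ]
}

# Assumption: 3584 MiB leaves headroom below the recommended 4 GB, since
# the kernel reserves part of the installed RAM before reporting MemTotal.
MIN_KIB=$((3584 * 1024))

if has_enough_memory "$MIN_KIB"; then
    echo "memory check passed"
else
    echo "WARNING: less than ~4 GB of RAM; the installation may fail randomly" >&2
fi
```

Whether to only warn or to abort the installation at this point is a design choice; the sketch warns, since the thread shows that 2 GB builds succeed some of the time.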

Comment 9 Leos Stejskal 2023-05-16 08:38:15 UTC
As per the previous comments, I'm closing this as not a bug.