This bug has been migrated to another issue tracking site. It has been closed here and may no longer be monitored.

If you would like to get updates for this issue, or to participate in it, you may do so at Red Hat Issue Tracker.
Bug 2218408 - Enforcing Selinux in RHEL 8.8 RAW failed VM reboot in Azure [NEEDINFO]
Summary: Enforcing Selinux in RHEL 8.8 RAW failed VM reboot in Azure
Keywords:
Status: CLOSED MIGRATED
Alias: None
Product: Red Hat Enterprise Linux 8
Classification: Red Hat
Component: selinux-policy
Version: 8.8
Hardware: Unspecified
OS: Linux
Priority: unspecified
Severity: unspecified
Target Milestone: rc
Target Release: ---
Assignee: Zdenek Pytela
QA Contact: Milos Malik
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2023-06-29 03:19 UTC by yzho
Modified: 2023-08-17 12:58 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2023-08-17 12:57:59 UTC
Type: Bug
Target Upstream Version:
Embargoed:
mmalik: needinfo? (yzho)
zpytela: needinfo? (yzho)


Attachments
log (64.00 KB, application/octet-stream)
2023-06-30 01:13 UTC, yzho


Links
System                 ID               Private  Priority  Status  Summary  Last Updated
Red Hat Issue Tracker  RHEL-1465        0        None      None    None     2023-08-17 12:57:58 UTC
Red Hat Issue Tracker  RHELPLAN-161147  0        None      None    None     2023-06-29 03:20:19 UTC

Description yzho 2023-06-29 03:19:05 UTC
Description of problem:
In RHEL 8.8 RAW, the SELinux config defaults to 'disabled'. If it is changed to 'enforcing' and the VM is rebooted, no auto-relabeling happens and the reboot fails.

The strange part is: 1. In RHEL 8.7 RAW, this failure does not happen; the VM auto-relabels and the reboot succeeds. 2. The failure occurs only on Azure VMs, not on AWS.

What could cause this difference?

Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1. Deploy an Azure VM with the Marketplace image RHEL 8.8 RAW
2. Set SELINUX=enforcing in /etc/selinux/config
3. Reboot the VM; the reboot fails
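
A minimal sketch of step 2, assuming a POSIX shell run as root. `set_selinux_mode` is a hypothetical helper name, not part of any RHEL tool; the config path is a parameter so the edit can be tried on a copy of the file first.

```shell
#!/bin/sh
# Hypothetical helper: rewrite the active SELINUX= line in an SELinux config file.
# Usage: set_selinux_mode <enforcing|permissive|disabled> [config-path]
set_selinux_mode() {
    mode="$1"
    config="${2:-/etc/selinux/config}"
    case "$mode" in
        enforcing|permissive|disabled) ;;
        *) echo "invalid mode: $mode" >&2; return 1 ;;
    esac
    tmp=$(mktemp) || return 1
    # Match only "SELINUX=", so the SELINUXTYPE= line is left alone.
    sed "s/^SELINUX=.*/SELINUX=$mode/" "$config" > "$tmp" || { rm -f "$tmp"; return 1; }
    # Write back in place so the file keeps its owner, mode, and label.
    cat "$tmp" > "$config"
    status=$?
    rm -f "$tmp"
    return "$status"
}
```

On the VM this would be `set_selinux_mode enforcing` followed by a reboot. The helper only edits the file; it does not schedule a relabel, which in comment 9 is done separately with "fixfiles onboot".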

Actual results:
Azure VM failed to reboot

Expected results:
According to the Red Hat documentation, this seems to be expected, but it is confusing that the reboot succeeded in 8.7 and on an AWS VM.

Additional info:

Comment 1 Milos Malik 2023-06-29 07:46:06 UTC
Please be more specific.

What exactly fails?
 * the load of SELinux policy
 * the whole boot sequence
 * important systemd services

Thank you.

Comment 2 yzho 2023-06-30 01:13:06 UTC
Created attachment 1973299 [details]
log

Comment 3 yzho 2023-06-30 01:13:58 UTC
(In reply to Milos Malik from comment #1)
> Please be more specific.
> 
> What exactly fails?
>  * the load of SELinux policy
>  * the whole boot sequence
>  * important systemd services
> 
> Thank you.

Hi Milos,

It seems to be important systemd services, but I'm not 100% sure. I just attached the log; can you please take a look?

Thanks

Comment 4 yzho 2023-07-10 18:44:40 UTC
Hi, any updates?

Comment 5 Milos Malik 2023-07-14 08:38:50 UTC
Please boot the machine with the enforcing=0 kernel parameter.

Once the machine boots, please collect SELinux denials that appeared via the following command:

# ausearch -m avc -m user_avc -m selinux_err -i -ts boot

After looking into the attached log, we still don't know what the main cause is.

Thank you.

Comment 6 Milos Malik 2023-07-14 08:52:40 UTC
If the ausearch command does not produce any output, please search for SELinux denials in the systemd journal:

# journalctl -b | grep -i -e avc -e selinux_err

Thank you.
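
A minimal sketch that chains the two commands from comments 5 and 6, assuming a POSIX shell run as root after booting with enforcing=0. `filter_denials` and `collect_denials` are hypothetical helper names; only the ausearch and journalctl invocations come from the comments.

```shell
#!/bin/sh
# Keep only lines that mention AVC or SELINUX_ERR records (same filter as comment 6).
filter_denials() {
    grep -i -e avc -e selinux_err
}

# Hypothetical helper: try the audit log first, then fall back to the journal.
collect_denials() {
    # Empty output or a non-zero exit from ausearch is treated as "nothing found".
    if out=$(ausearch -m avc -m user_avc -m selinux_err -i -ts boot 2>/dev/null) \
        && [ -n "$out" ]; then
        printf '%s\n' "$out"
    else
        journalctl -b 2>/dev/null | filter_denials
    fi
}
```

Running `collect_denials > denials.txt` would produce a file that can be attached to the bug. If neither source contains a denial, the file is empty and the function returns a non-zero status from grep.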

Comment 7 Zdenek Pytela 2023-07-25 09:48:34 UTC
Hello yzho,

We need additional data to troubleshoot further. From the log provided, we can see that the journald service, and subsequently some others, failed to start, but we don't know the reason, and it is not clear whether the system recovered afterwards. You can also run:

systemctl status systemd-journald.service
systemctl list-units --state failed 
semodule -lfull | grep -v ^100

to gather more information.
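
A minimal sketch for saving the output of these three commands in one report, assuming a POSIX shell run as root. `capture`, `collect_boot_diagnostics`, and the default report file name are hypothetical; `--no-pager` is added so systemctl does not wait for input.

```shell
#!/bin/sh
# Hypothetical helper: append a command line, its output, and its exit status to a report.
capture() {
    report="$1"; shift
    {
        printf '### %s\n' "$*"
        "$@" 2>&1
        printf '### exit status: %s\n\n' "$?"
    } >> "$report"
}

# Run the three commands from comment 7; a failing command does not stop the rest.
collect_boot_diagnostics() {
    report="${1:-selinux-boot-report.txt}"
    capture "$report" systemctl --no-pager status systemd-journald.service
    capture "$report" systemctl --no-pager list-units --state=failed
    # Drop priority-100 modules so only modules at other priorities remain.
    capture "$report" sh -c 'semodule -lfull | grep -v "^100"'
}
```

Recording the exit status next to each output keeps an empty section unambiguous: it shows whether the command found nothing or failed to run.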

As every important service is expected to start without an error on a clean system, it is possible that, e.g., some related configuration change was made, unusual hardware is in place, or some other service interferes with the default setup.

Comment 9 Yuxin Sun 2023-08-16 05:59:05 UTC
Hi Zdenek,

I can reproduce this issue with the image RedHat:rhel-raw:8_8:8.8.2023300515.

1. After setting SELINUX=enforcing in /etc/selinux/config, running "fixfiles onboot", and rebooting the VM, it starts with many failed services and then automatically reboots again.
2. If I enter the GRUB menu, add "enforcing=0" to the kernel parameters, and then boot, the VM starts and autorelabels the whole file system. Then it automatically reboots again.
3. After that, the VM boots up successfully and the SELinux status is Enforcing.
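
A minimal sketch for checking which of these steps a running VM is in, assuming a POSIX shell. `boot_state` is a hypothetical helper name; it relies on "fixfiles onboot" leaving a /.autorelabel file (as comment 12 describes) and on the kernel command line being readable from /proc/cmdline. Both paths are parameters so the function can be tried on copies.

```shell
#!/bin/sh
# Hypothetical helper: report the enforcing=0 override and any pending relabel.
# Usage: boot_state [cmdline-file] [root-dir]
boot_state() {
    cmdline_file="${1:-/proc/cmdline}"
    root="${2:-/}"
    if grep -qw 'enforcing=0' "$cmdline_file"; then
        echo "kernel override: enforcing=0 (permissive for this boot)"
    else
        echo "kernel override: none"
    fi
    # The marker file is expected to exist until the relabel has run.
    if [ -e "${root%/}/.autorelabel" ]; then
        echo "relabel: pending"
    else
        echo "relabel: not pending"
    fi
}
```

After step 3, `boot_state` should report no override and no pending relabel, and `getenforce` should print Enforcing.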

Comment 11 Zdenek Pytela 2023-08-16 09:27:02 UTC
Hello,

From the log I can see the system booted successfully, but some units failed.

[FAILED] Failed to start Remount Root and Kernel File Systems.
[FAILED] Failed to mount Kernel Debug File System.
[FAILED] Failed to start Load Kernel Modules.
[FAILED] Failed to mount Huge Pages File System.
[FAILED] Failed to mount POSIX Message Queue File System.
[FAILED] Failed to start Monitoring of LVM2 …sing dmeventd or progress polling.
[FAILED] Failed to start Journal Service.
[FAILED] Failed to start Apply Kernel Variables.
[FAILED] Failed to start Create Static Device Nodes in /dev.
[FAILED] Failed to start Load/Save Random Seed.
[FAILED] Failed to start Create Volatile Files and Directories.
[FAILED] Failed to start Update UTMP about System Boot/Shutdown.
[FAILED] Failed to start Journal Service.
[FAILED] Failed to start Journal Service.
[FAILED] Failed to start Journal Service.
[FAILED] Failed to start Journal Service.
[FAILED] Failed to start Journal Service.
[FAILED] Failed to start Azure temporary res…sk dataloss warning file creation.
[FAILED] Failed to start Azure temporary res…sk dataloss warning file creation.

In the previous report there were considerably more failed services. No reason is logged, though. Can you get more information from the system? Are there any AVC denials, either in the audit log or in the journal?

Comment 12 Yuxin Sun 2023-08-16 14:08:34 UTC
(In reply to Zdenek Pytela from comment #11)

Hi Zdenek,

When this issue happens, the VM cannot boot up. It keeps rebooting. (Please see step 1 in comment 9.)
Only when I edit the GRUB menu while booting and add "enforcing=0" can the VM boot up, and because I added /.autorelabel in step 1, it relabels the whole file system and then reboots. (See comment 9, steps 2 and 3.)

So, if I don't set "enforcing=0" in the kernel parameters, the VM cannot boot up, and I may not be able to collect logs in that state.

Comment 14 Zdenek Pytela 2023-08-17 12:28:37 UTC
Hello,

I am afraid there is still no data we can work with. Please gather the SELinux denials (see #c5 and #c6), which should be logged even when the system fails to start all services.

One finding: reading #c9 again, it looks like the image is mislabeled. Can you check how it was created? Can you reach out to its maintainer for further explanation?
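
A minimal sketch for estimating how much of the image is mislabeled, assuming a POSIX shell run as root on a boot where SELinux is active (for example with enforcing=0). `count_mislabeled` is a hypothetical helper name, and the count assumes restorecon reports each affected file on its own line.

```shell
#!/bin/sh
# Hypothetical helper: count files whose label differs from what the loaded policy expects.
# restorecon flags: -n dry run (change nothing), -v report each file, -R recurse.
count_mislabeled() {
    restorecon -n -v -R "$@" 2>/dev/null | grep -c ''
}
```

For example, `count_mislabeled /etc /var` prints a single number; a large value on a freshly deployed VM would be consistent with the image having been built with SELinux disabled.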

