2051285 – reconfiguring fstab causes system lockout, even after running systemctl daemon-reload


Keywords:
Status:	CLOSED EOL
Alias:	None
Product:	Fedora
Classification:	Fedora
Component:	audit
Sub Component:
Version:	35
Hardware:	x86_64
OS:	Linux
Priority:	unspecified
Severity:	high
Target Milestone:	---
Assignee:	Steve Grubb
QA Contact:	Fedora Extras Quality Assurance
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2022-02-06 22:40 UTC by Leslie Satenstein
Modified:	2022-12-13 16:35 UTC
CC List:	22 users
Fixed In Version:
Clone Of:
Environment:
Last Closed:	2022-12-13 16:35:23 UTC
Type:	Bug
Embargoed:
Dependent Products:


Description Leslie Satenstein 2022-02-06 22:40:48 UTC

Description of problem:
I have an installation made from the Workstation Live ISO. It successfully creates
/boot/efi, /boot and /.
On reboot, and after creating the administrator/user account, I need to add some partitions.

To add the partitions I do the following:
sudo su -
cd /
mkdir Development LinuxStuff share Backup music ISOs 
chmod 1777 Development LinuxStuff share Backup music ISOs 
chown leslie:leslie Development LinuxStuff share Backup music ISOs 

This part is successful; the mount points are created.

I then back up the existing /etc/fstab and append the following to it:

UUID=50ed96c9-b5b8-452f-ad9c-9b5e59168f0b /LinuxStuff  xfs    defaults,noatime                                                    0  0     #/dev/nvme0n1p5 LinuxStuff
UUID=0bbd3f52-6d7a-456d-8475-e35137d6932f /Development xfs    defaults,noatime                                                    0  0     #/dev/nvme0n1p4 Development
UUID=b521e017-013f-40ca-bce9-aa3bc8c35c15 /share       btrfs  defaults,noatime,compress=zstd:1,autodefrag,commit=120              0  0     #/dev/nvme0n1p8 sharebtrfs
UUID=5e18134e-c709-4332-8a50-6db7d690c665 /Backup      ext4   defaults,noatime,noauto,user                                        0  0     #/dev/sdc1      Backupsdc
UUID=ef53596d-8ec6-42c7-ad98-10bfc187bc62 /music       btrfs  defaults,noatime,compress=zstd:1,autodefrag,commit=120,noauto,user  0  0     #/dev/sdc2      Music
UUID=0ece8188-e470-4e8a-875c-c849e358cbf5 /ISOs        ext4   defaults,noatime                                                    0  0     #/dev/sdc3      ISOs


After appending these lines I run systemctl daemon-reload.
I then mount /LinuxStuff, /Development, /share, /Backup, /music and /ISOs.

The result with Fedora 35 was the following:
#
# /etc/fstab
# Created by anaconda on Mon Jan 31 19:57:03 2022
#
# Accessible filesystems, by reference, are maintained under '/dev/disk/'.
# See man pages fstab(5), findfs(8), mount(8) and/or blkid(8) for more info.
#
# After editing this file, run 'systemctl daemon-reload' to update systemd
# units generated from this file.
#
#<file system>                            <mount>      <type> <options>                 <dmp pass>  <xref>          <label/uuid>
UUID=79336fe7-19c0-45b1-bbde-d3047bcfc0d2 /            btrfs  defaults,noatime                 0 0 #/dev/nvme0n1p14 
UUID=0bbd3f52-6d7a-456d-8475-e35137d6932f /Development xfs    defaults,noatime                 0 0 #/dev/nvme0n1p4  Development
UUID=0ece8188-e470-4e8a-875c-c849e358cbf5 /ISOs        ext4   defaults,noatime,nodiscard       1 2 #/dev/sdc3       ISOs
UUID=50ed96c9-b5b8-452f-ad9c-9b5e59168f0b /LinuxStuff  xfs    defaults,noatime,nodiscard       0 0 #/dev/nvme0n1p5  LinuxStuff
UUID=5e18134e-c709-4332-8a50-6db7d690c665 /Backup      ext4   defaults,noatime,noauto,user     1 2 #/dev/sdc1       Backup
UUID=31fb2e7a-70d0-408b-8c4d-f7b60590c860 /boot        ext4   defaults,noatime                 1 2 #/dev/nvme0n1p11 Fed35Boot
UUID=4AA4-2B0C                            /boot/efi    vfat   umask=0077,shortname=winnt       0 2 #/dev/nvme0n1p10 Fed35EFI
UUID=ef53596d-8ec6-42c7-ad98-10bfc187bc62 /music       btrfs  subvolid=5,noauto,user           0 0 #/dev/sdc2       Music
UUID=b521e017-013f-40ca-bce9-aa3bc8c35c15 /share       btrfs  subvolid=5                       0 0 #/dev/nvme0n1p8  sharebtrfs

On the first setup the system would not mount any partitions.
I tried again by reinstalling; after appending the entries I mounted each partition successfully and checked that I could access all the contents.
On reboot: partition lockout.

I am also unable to return to the initial fstab; the system audit service does the blocking.

The above decorated fstab worked with Fedora 33,34,and only with Fedora 35, if I run error recovery.  Error recovery (from the USB iso file) would correct some settings and also add /.autorelabel 
It would take a good 25 minutes, but then the system would come up with the new fstab and added partitions.

That error recovery routine has disappeared. I manually added /.autorelabel, but it will not run because / is read-only.

So, please, 
1) What has changed in the procedure so that the decoration to the right of the dump and pass fields is no longer allowed?
2) How do I add partitions to my system without having the above decorations at the right end of the fstab lines?
3) I have tried various methods and ownerships for the partitions added to /, all with failure.

There are two ways that I achieved the final fstab above.
I added the partitions during installation of Fedora 34, 35 and now Rawhide (36). That works.
When the error recovery procedure existed and was invoked, I could walk through the procedure and successfully create the fstab with the extra partitions.
That error recovery is not on the Live Workstation ISO and not on the Everything ISO.

Is the only way to add a partition to a Workstation fstab to do it during the installation process?

Aside from reading /etc/fstab and remembering its checksum, what else is systemctl daemon-reload supposed to do that it is not doing?


 


Version-Release number of selected component (if applicable):


How reproducible:



Steps to Reproduce:
1. Create a basic Fedora 35 or Fedora 36 (Rawhide) installation
2. Add the additional mount points to /
3. Add the additional fstab lines to /etc/fstab
4. Run sudo systemctl daemon-reload
5. Reboot

Actual results:
Locked up system

Expected results:
A workable system.
How do I run that hidden emergency recovery routine?

 


Additional info:

The program to audit/validate the fstab (all fields in all lines checked, and the decorations appended) is freely available under the GPL3 license. I am the author.
 
I do want to begin testing Rawhide as an end-user. I have been testing every release of Fedora for the past 18 years, beginning in 2004

Comment 1 Leslie Satenstein 2022-02-06 22:43:06 UTC

Same issue with Rawhide.

Comment 2 Zbigniew Jędrzejewski-Szmek 2022-02-07 11:51:40 UTC

I don't see anything wrong with the fstab. From systemd side, it is systemd-fstab-generator that
parses that file. I checked locally, and it doesn't seem to have any issues with it, all units
are generated as expected.

> with Fedora 35, if I run error recovery. Error recovery (from the USB iso file) would correct some settings and also add /.autorelabel 
> It would take a good 25 minutes, but then the system would come up with the new fstab and added partitions.

If I understand correctly, there was an issue with mounting partitions during boot, but
a selinux relabel executed from the live cd resolved the issue. This could be a selinux problem
in the sense of the policy forbidding something that it shouldn't, or it could be a selinux
problem in the sense that something is labelled wrongly. Please attach the boot logs from the
failed boot, without that it's really hard to say anything.
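
One way to gather the requested boot logs is sketched below. It assumes the journal is stored persistently and that the failed boot was the one immediately before the current one; if / was read-only during that boot, the journal may not have been written at all. The function name `save_previous_boot_log` and the default output file name are hypothetical.

```shell
#!/bin/sh
# Save warnings and errors from the previous boot to a file that can be
# attached to the bug report. Function name and default file name are
# illustrative, not part of any existing tool.
save_previous_boot_log() {
    out=${1:-previous-boot.log}
    # --boot=-1         select the boot before the current one
    # --priority=warning  keep messages of priority warning and more severe
    journalctl --boot=-1 --priority=warning --no-pager > "$out"
}
```

Calling `save_previous_boot_log /tmp/failed-boot.log` after booting back into a working system would produce a single text file for attachment.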

Comment 3 Leslie Satenstein 2022-02-08 19:39:28 UTC

ERROR MESSAGE I encountered

When the problem occurs, root (/) is in RO mode. The error message that scrolls by is similar to "systemd auditd cannot write to /".
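
The read-only state of / can be confirmed from a shell rather than inferred from scrolling messages. The sketch below reads a mounts table in the /proc/mounts format and reports whether a mount point is `ro` or `rw`. The function name `mount_mode` and its optional second argument are hypothetical.

```shell
#!/bin/sh
# Print "ro" or "rw" for a mount point, reading a table in /proc/mounts
# format (device, mount point, type, options, ...).
# Returns non-zero if the mount point is not listed.
mount_mode() {
    mountpoint=$1
    table=${2:-/proc/mounts}
    awk -v mp="$mountpoint" '
        $2 == mp { opts = $4 }          # keep the last matching entry
        END {
            if (opts == "") exit 1      # mount point not found
            n = split(opts, o, ",")
            mode = "rw"
            for (i = 1; i <= n; i++)
                if (o[i] == "ro") mode = "ro"
            print mode
        }' "$table"
}
```

If `mount_mode /` prints `ro`, the usual way to make root writable from a rescue shell is `mount -o remount,rw /`; whether that is enough to recover from the situation described here is not established in this report.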

Today I succeeded in mounting my extra partitions along with the necessary three. To succeed, I had to do it via a fresh Fedora installation, and only by using the anaconda option to identify the partitions to mount. 

I need a quick way to do recovery, and clear instructions on how to do it.
Please note: I was able to do recovery using the Fedora 34 ISO, but not with the Fedora 35 or Rawhide ISOs.
With F34, a /.autorelabel was installed. Recovery took about 30 minutes. I need something that runs in less time.

Here is a quick way to test.
a) Create a new Fedora 34 or Rawhide installation.
b) On reboot, append some additional partitions. Do not forget to add the mount points to /.
c) Run systemctl daemon-reload.
d) Reboot.

When I do this, I get the error message similar to the above.

Comment 4 Zbigniew Jędrzejewski-Szmek 2022-02-10 07:40:45 UTC

Please at least attach a photo of the error… Right now this is not very actionable for us.

Comment 5 Leslie Satenstein 2022-02-10 15:03:13 UTC

Good day Zbigniew 

I have been doing much research on my own.  I think I have the information for you to do a follow up.

Summary
I do a basic install of Fedora Workstation as identified in the opening post.

If I reboot immediately after saving the combined /etc/fstab, even after doing systemctl daemon-reload, the system boots with a systemd auditd message about not being able to write (/ is RO).

However, if before rebooting I mount every partition listed in /etc/fstab and then reboot, the system comes up OK.

In the past, when I did not mount every partition, I could take the Workstation ISO and perform Error Recovery.
The Error Recovery was removed; in its place is the mount with basic settings.

By coincidence, I found that the Error Recovery software is part of the Everything ISO. That Error Recovery software needs to be added to the Workstation ISO as a second error recovery option.

Check out the Fedora 35 Workstation (GNOME) ISO, the Fedora 35 Everything ISO,
the Rawhide/Fedora 36 Workstation (GNOME) ISO and the Rawhide/Fedora 36 Everything ISO.

You can use my initial entry to duplicate the problems I encountered.

By the way, the documentation does not match reality. It describes something called "Linux Rescue", but I was unable to follow through with it.

Currently I am using Rawhide36 with full /etc/fstab, as shown in the opening post.

Comment 6 Zbigniew Jędrzejewski-Szmek 2022-02-10 15:06:31 UTC

There is no "systemd auditd message": systemd is one thing, and auditd is another thing. They both
generate messages, but not through one another… Please show the exact message.

Comment 7 Leslie Satenstein 2022-02-10 15:19:13 UTC

The repeated message was about / being RO and auditd not being able to write.

(dmesg had several thousand lines. I tried to locate the message with grep, but could not.)

But that is from memory. As I now know how to use the Everything ISO to do error recovery, I will be using the undocumented Everything ISO for that purpose.

To duplicate my issue
======================
Make yourself one or two partitions with names like test1, test2, add these to /etc/fstab without mounting them, and reboot. I believe you will be able to reproduce the problem I experienced.
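
Before rebooting with new entries, it can help to see which of them systemd will treat as required at boot. As documented in systemd.mount(5), an fstab entry with neither `noauto` nor `nofail` is required by local-fs.target, so a failure to mount it can send the boot into emergency mode. Whether that is what happened in this report is not established. The sketch below lists the mount points of such entries; the function name `required_at_boot` is hypothetical.

```shell
#!/bin/sh
# List the mount points of fstab entries that carry neither "noauto"
# nor "nofail" in their options field (field 4).
required_at_boot() {
    fstab=${1:-/etc/fstab}
    awk '
        /^[ \t]*#/ || NF < 4 { next }       # skip comments, blank and short lines
        {
            n = split($4, o, ",")
            skip = 0
            for (i = 1; i <= n; i++)
                if (o[i] == "noauto" || o[i] == "nofail") skip = 1
            if (!skip) print $2
        }' "$fstab"
}
```

Run against the fstab from the opening post, this would list /, /Development, /ISOs, /LinuxStuff, /boot, /boot/efi and /share, and omit /Backup and /music, which are marked `noauto`.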

Recovery, whether via the recovery software or by reinstallation, takes one hour.

Comment 8 Zbigniew Jędrzejewski-Szmek 2022-02-10 16:17:51 UTC

So this seems to be a problem with auditd… In general daemons need to be ready to run without
writable root, there are too many recovery scenarios where this is needed.

Comment 9 Steve Grubb 2022-02-10 20:48:53 UTC

Not sure *how* the audit daemon can cause lockout. It passively monitors events based on its rules. It should exit at one of the failures.

I think that if you boot with audit=0 for the kernel command line, auditd isn't started.

Comment 10 Leslie Satenstein 2022-02-11 01:19:53 UTC

I did a test today with Fedora36beta
I could reinstall Fedora 35 and likely would get the same thing.

Here is what I got from dmesg

Failed to start audit.service - security auditing
see systemctl stat service

on reboot message 
Failed to search for file packagekit daemon-dissappeared.

Comment 11 Steve Grubb 2022-02-11 01:34:36 UTC

I read through the auditd startup code. If it has problems writing logs to disk, it does whatever disk_error_action is. By default, it suspends logging. Once it does that, it won't try writing until it's unsuspended. But if it runs into any big problems during boot, it exits. Based on comment #10, it sounds like it exited. It should have written a reason to syslog. But if auditd exited, it's not causing the lockout.
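
To see which disk_error_action is actually configured on a given system, the setting can be read from the audit daemon's configuration file. The sketch below assumes the usual `key = value` layout and the default path /etc/audit/auditd.conf. The function name `auditd_setting` is hypothetical, and matching the key case-insensitively is a convenience of this sketch.

```shell
#!/bin/sh
# Print the value of one "key = value" setting from an auditd.conf-style
# file. Comment lines are ignored; surrounding whitespace is trimmed.
auditd_setting() {
    key=$1
    conf=${2:-/etc/audit/auditd.conf}
    awk -F= -v key="$key" '
        /^[ \t]*#/ { next }                 # skip comment lines
        {
            k = $1; v = $2
            gsub(/[ \t]/, "", k)            # strip whitespace around the key
            gsub(/^[ \t]+|[ \t]+$/, "", v)  # trim the value
            if (tolower(k) == tolower(key)) print v
        }' "$conf"
}
```

For example, `auditd_setting disk_error_action` prints the configured action, which can then be compared with the behaviour described above.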

Comment 12 Leslie Satenstein 2022-02-11 02:22:20 UTC

Hi Steve
I took my wife's cellphone to photograph the screen with the messages. I see that you have found it. My phone's camera is kaput.
All the transcribed error messages, it seems, were written to stderr; the messages in dmesg were written to stdout.
When I grepped dmesg, I could not find the error message.
One kernel patch would be to write stderr messages to a dmesg.stderr.

For recovery, part 2 of my research and report.

Fedora 33 and, I think, 34 ISOs have an error recovery routine that can be invoked on boot. That routine undoes the suspend and also writes a /.autorelabel which, as you can guess, runs through every partition listed in /etc/fstab. Recovery for me took 1 hour. It was actually faster for me to reinstall Linux than to wait for the /.autorelabel to complete.

I did mention that I found the true error recovery routine on the Everything.iso

So. good luck.
I am a grandpa (he) end-user, age 81, I program in C, and am starting rust. I started with Fedora around 2004.

If I was to create a patch, I would add the patch to "systemctl daemon-reload", or create a new daemon-unlock or some other clever way.
The error recovery with Everything ISO, can give you a clue as to how to recover from the lockout.

I am in Montreal (EST). My code adds the xref to the right of dmp pass (see post 1), validates every field and columnizes the 6 fstab parameters.
I wrote the source for the formatter/cross-referencer you see in post #1, and it is freely (gcc license) available on request.

Comment 13 Zbigniew Jędrzejewski-Szmek 2022-02-12 07:37:03 UTC

Attach the photos of the screen you took. This will help us debug the issue.

Comment 14 Leslie Satenstein 2022-02-13 21:05:30 UTC

This is the best I can do. The image of the screen is 7.4 MB; Gmail allows only 5 MB attachments.


==================
Starting systemd-hostnamed.service - Hostname Service
Starting audit.service
[FAILED] Failed to start audit.service - Security Auditing service
See systemctl status audit.service for details. 

The message appeared 6 times. [FAILED] is in red; the line with it is not in dmesg.
===============================

When you boot the USB with Workstation, one option is recovery, but
the USB with the Workstation ISO does not have the recovery entry. It just boots
without NVIDIA or other graphics.

THE RECOVERY ROUTINE resides on the Everything ISO. I used it to recover
my system.

Comment 15 Steve Grubb 2022-02-14 01:49:31 UTC

I still have no idea what is happening. Saying auditd is locking out someone is like saying syslog is locking out someone. Auditd just records what it's given. My review of the code is that auditd should run into too many problems and exit (as the message above indicates). At that point it's gone and not the problem. What component should this bz be reassigned to?

Comment 16 Leslie Satenstein 2022-02-15 15:42:01 UTC

Hi Steve.
As I mentioned on post 13, I was able to find the recovery routine on the Everything ISO.

This recovery routine must also be made available on the Workstation ISO.

Boot recovery from the Workstation ISO, then do it from the Everything ISO, and you will understand what I mean.

The system recovery from the Everything ISO is doing something extra, aside from creating a /.autorelabel.

Want to duplicate the bug?
=========================
Create a system using the Workstation ISO. After the reboot, add two extra partitions from another disk or partition. (I have a 6 disk + 1 NVMe system; you do not need the same number.) Do not run systemctl daemon-reload after modifying the fstab; just reboot.


If you have modified diagnostic software I could use, send it to me and I will use it. I tried to grep the error message from dmesg without success.

The messages during boot stream by very quickly. My guess is that audit.service emits that message when it tries to write to the partitions, which are in RO mode (one error message per partition).

I can also mount the RO partition at /mnt read-write. Would that help you?


I will be installing Rawhide later today, and will see what I can do for you and for zbigniew. 

If you are in North America, I can phone you to discuss the next step to resolve this issue.  Voice contact is more productive, particularly if I am in front of the desktop and the problem occurs real-time.

Comment 17 Leslie Satenstein 2022-02-18 15:46:10 UTC

Steve, the problem is with systemd. 

Since then I am very discouraged with problem resolution.

Here is how I think you can create the same problem.

Create a formatted ext4 partition (or one with another filesystem).
Create the entry for it in /etc/fstab and the mount point in /, but do not mount the partition.
Run systemctl daemon-reload.

What happens after a reboot?

Comment 18 Steve Grubb 2022-02-18 20:50:39 UTC

Leslie, the problem you are running into is one of the harder ones to troubleshoot. This is because not enough of the system is running to provide easy troubleshooting steps. In the past, we'd say do a null modem setup and capture the output to another system. No one has rs-232 anymore.

I work on auditd upstream. I do not have anything to do with system boot. I've been trying to think about how I might reproduce this. (I don't have a spare computer laying around.) I think I could use a USB flash drive and USB pass-through to simulate adding a partition to a virtual machine. Then I might be able to use the virt serial console to capture the boot messages to another terminal. I'll give it a shot next week.

The only thing I can help with is sorting out what auditd is doing in this scenario. If it's not auditd, someone else needs to help you out. I'm sure that if we can figure out which component is at fault, problem resolution should come faster. Right now, we don't have data that is actionable.

Comment 19 Leslie Satenstein 2022-03-06 22:21:43 UTC

Hi Steve, 

I have discovered a problem that needs a change to "systemctl daemon-reload", to some other software, or to some documentation.

Situation: in the past, prior to Fedora 36 (refer to the initial post), I was and am able to add additional information to the right of the first five columns of /etc/fstab.
I could add to or modify /etc/fstab, followed by "sudo systemctl daemon-reload" as per the instructions in the header of /etc/fstab.

In so doing, SELinux flags an issue, but nothing appears on the terminal. An SELinux message may appear on the terminal if there were already several others.

What is the consequence if the SELinux flag is not reset?

On reboot, the system mounts every partition in RO mode.

When I take the Recovery ISO and run it, it stops at chroot /mnt/sysimage. There is no instruction as to what to do.
I reboot, and the grub menu invokes the autorelabel routine. BUT THE PARTITIONS ARE MOUNTED RO.
When the autorelabel routine tries to write what it requires, it cannot (the partitions are RO), and the system reboots.

We have a reboot, an autorelabel against RO partitions, followed by a reboot, followed by ......

Here is what I do to avoid the problem.
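1) Update /etc/fstab (with new partitions or my stuff at the right).
2) Run sudo systemctl daemon-reload.
3) Run the following script.
#!/bin/bash
# Build a local SELinux policy module from the audit records logged for
# systemd-gpt-auto-generator (command name truncated to 'systemd-gpt-aut'), then install it.
sudo ausearch -c 'systemd-gpt-aut' --raw | audit2allow -M my-systemdgptaut
sudo semodule -X 300 -i my-systemdgptaut.pp
# Do the same for systemd-fstab-generator ('systemd-fstab-g').
sudo ausearch -c 'systemd-fstab-g' --raw | audit2allow -M my-systemdfstabg
sudo semodule -X 300 -i my-systemdfstabg.pp
echo done
4) Reboot to a functioning system.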

Below is a copy of my current /etc/fstab with my xrefs to the right.
To duplicate my reboot-loop problem:
add some text to the right of the 5th column of the fstab,
do the "sudo systemctl daemon-reload",
and reboot.

If a patch is going to be created to allow extra info to the right of the fstab lines, could it be programmed to ignore the # and everything to the right of the #?
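
Whether the trailing "#..." text is actually what triggers the failure is not established in this report (comment 2 found no problem parsing the file). One way to test it is to generate a copy of the fstab with the trailing decorations removed and compare the behaviour. The sketch below writes the stripped version to standard output and does not modify the input file; the function name `strip_fstab_xrefs` is hypothetical.

```shell
#!/bin/sh
# Print an fstab with any trailing "# ..." text removed from entry lines.
# Full-line comments and blank lines are passed through unchanged.
strip_fstab_xrefs() {
    awk '
        /^[ \t]*#/ || NF == 0 { print; next }   # keep comment and blank lines
        { sub(/[ \t]+#.*$/, ""); print }         # drop trailing "# ..." text
    ' "${1:-/etc/fstab}"
}
```

For example, `strip_fstab_xrefs /etc/fstab > /tmp/fstab.plain` produces a candidate file that can be inspected and diffed against the original before deciding whether to install it.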

The most recent fstab with xrefs to the right 

#
# /etc/fstab
# Created by anaconda on Sat Mar  5 17:57:27 2022
#
# Accessible filesystems, by reference, are maintained under '/dev/disk/'.
# See man pages fstab(5), findfs(8), mount(8) and/or blkid(8) for more info.
#
# After editing this file, run 'systemctl daemon-reload' to update systemd
# units generated from this file.
#
#<file system>                            <mount>      <type> <options>                                                  <dmp pass>  <xref>            <label/uuid>
UUID=f93b235a-4b14-49f1-9846-e319fd8ad808 /            btrfs  subvol=root01,compress=zstd:1                                      0 0 #/dev/nvme0n1p12  
UUID=9628b758-2443-4be3-84ee-bf4bb4265364 /boot        ext4   defaults                                                           1 2 #/dev/nvme0n1p11  
UUID=7C6A-3A4D                            /boot/efi    vfat   umask=0077,shortname=winnt                                         0 2 #/dev/nvme0n1p3   F35EFI
UUID=f93b235a-4b14-49f1-9846-e319fd8ad808 /home        btrfs  subvol=home01,compress=zstd:1                                      0 0 #/dev/nvme0n1p12  
#
UUID=b521e017-013f-40ca-bce9-aa3bc8c35c15 /share       btrfs  subvolid=5                                                         0 0 #/dev/nvme0n1p8   sharebtrfs
UUID=50ed96c9-b5b8-452f-ad9c-9b5e59168f0b /LinuxStuff  xfs    defaults,noatime                                                   0 0 #/dev/nvme0n1p5   LinuxStuff
UUID=0bbd3f52-6d7a-456d-8475-e35137d6932f /Development xfs    defaults,noatime                                                   0 0 #/dev/nvme0n1p4   Development
UUID=5e18134e-c709-4332-8a50-6db7d690c665 /Backup      ext4   defaults,noatime,noauto,user                                       0 0 #/dev/sdc1        Backup
UUID=ef53596d-8ec6-42c7-ad98-10bfc187bc62 /music       btrfs  defaults,noatime,compress=zstd:1,autodefrag,commit=120,noauto,user 0 0 #/dev/sdc2        Music
UUID=0ece8188-e470-4e8a-875c-c849e358cbf5 /ISOs        ext4   defaults,noatime                                                   0 0 #/dev/sdc3        ISOs
#
UUID=f11cfb14-95ea-48ef-b2bc-004a0633eb3f /scratch     ext4   defaults,noatime,noauto,user                                       0 0 #/dev/sde1        scratch
UUID=480b0cc0-b2f8-4738-9955-a778aaf33daa /slash       xfs    defaults,noatime,noauto,user                                       0 0 #/dev/sde2        slash
UUID=ef17c619-0def-485d-baf6-825f48876cde /backup      ext4   defaults,noatime,noauto,owner                                      0 0 #/dev/sde4        backup
UUID=f6c0819e-462a-47bb-9720-fec6935e540c /spare       xfs    defaults,noatime,noauto,user                                       0 0 #/dev/sde3        spare



============ GPL3 SOFTWARE (C SOURCE AND GIT REPOSITORY) AVAILABLE FOR FREE
If you make improvements, please provide feedback.

FYI, the utility I wrote columnizes the fstab info and creates a thorough cross-reference. By thorough, I mean that
LVM, btrfs and others are supported.
Lines beginning with /dev/xxx are cross-referenced with the UUID= and the label.
Lines beginning with UUID=, PARTUUID=, LABEL=xxxx and PARTLABEL=xxx are cross-referenced.
Not handled: network-mounted partitions.
================== end

Comment 20 Leslie Satenstein 2022-09-05 00:18:58 UTC

I have added the following to run on shutdown of my system:

sudo systemctl daemon-reload

This always ensures that the next boot will pick up the latest version of /etc/fstab.

Comment 21 Ben Cotton 2022-11-29 17:50:14 UTC

This message is a reminder that Fedora Linux 35 is nearing its end of life.
Fedora will stop maintaining and issuing updates for Fedora Linux 35 on 2022-12-13.
It is Fedora's policy to close all bug reports from releases that are no longer
maintained. At that time this bug will be closed as EOL if it remains open with a
'version' of '35'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, change the 'version' 
to a later Fedora Linux version.

Thank you for reporting this issue and we are sorry that we were not 
able to fix it before Fedora Linux 35 is end of life. If you would still like 
to see this bug fixed and are able to reproduce it against a later version 
of Fedora Linux, you are encouraged to change the 'version' to a later version
prior to this bug being closed.

Comment 22 Ben Cotton 2022-12-13 16:35:23 UTC

Fedora Linux 35 entered end-of-life (EOL) status on 2022-12-13.

Fedora Linux 35 is no longer maintained, which means that it
will not receive any further security or bug fix updates. As a result we
are closing this bug.

If you can reproduce this bug against a currently maintained version of Fedora Linux
please feel free to reopen this bug against that version. Note that the version
field may be hidden. Click the "Show advanced fields" button if you do not see
the version field.

If you are unable to reopen this bug, please file a new report against an
active release.

Thank you for reporting this bug and we are sorry it could not be fixed.

