.The `rsyslog` logging service now starts at boot of the rescue system
Previously, the `rsyslog` service for message logging did not automatically start in the rescue system. The `/dev/log` socket kept receiving messages during the recovery process with no service listening on it. Consequently, the socket filled up with messages and the recovery process got stuck. For example, the `grub2-mkconfig` command, which regenerates the GRUB configuration, produces a large number of log messages that grows with the number of mounted file systems. If you used ReaR to recover a system with many mounted file systems, these log messages filled the `/dev/log` socket and the recovery process froze.
With this fix, the `systemd` units in the rescue system now include the sockets target in the boot procedure to start the logging socket at boot. As a result, the `rsyslog` service starts in the rescue environment when required, and the processes that need to log messages during recovery are no longer stuck. The recovery process completes successfully and you can find the log messages in the `/var/log/messages` file in the rescue RAM disk.
Description of problem:
With RHEL9, the /dev/log inode is supposed to be a symlink to /run/systemd/journal/dev-log.
But when booting the ReaR ISO, this is not the case: it is a regular socket with nothing listening on it.
This causes no harm unless programs log to /dev/log. In that case the socket fills up, and once it is full, programs hang.
Any program can be affected, but usually it is grub2-mkconfig and its children (including os-prober), executing in the chroot after recovery:
-------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------
++ chroot /mnt/local /bin/bash --login -c 'grub2-mkconfig -o /boot/grub2/grub.cfg'
Generating grub configuration file ...
--> HANG
-------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------
In this scenario, the hang happens when there are many mount points, which leads os-prober to scan all of them and send many debug messages such as "debug: /dev/mapper/vg-lvname is not an HFS+ partition: exiting" through /dev/log.
The exact root cause of the broken /dev/log socket is ReaR's use of templates for some systemd units, e.g. /usr/share/rear/skel/default/usr/lib/systemd/system/syslog.socket
This template is not in sync with systemd's units on RHEL9, which causes the issue.
The workaround consists of 2 operations, to be performed before recovering:
1. Tell ReaR to copy the standard systemd units to the ReaR ISO:
-------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------
COPY_AS_IS+=( /usr/lib/systemd/system/systemd-journald-dev-log.socket /usr/lib/systemd/system/systemd-journald.socket /usr/lib/systemd/system/systemd-journald.service /usr/lib/systemd/system/sockets.target.wants/systemd-journald-dev-log.socket )
-------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------
2. Delete /usr/share/rear/skel/default/usr/lib/systemd/system/syslog.socket
The proper solution is likely to remove all templates that stand in for systemd units and copy the real systemd units to the ISO instead.
Version-Release number of selected component (if applicable):
rear-2.6-15
How reproducible:
Always
Steps to Reproduce:
1. Create a VM with many filesystems
/dev/mapper/rhel-root / xfs defaults 0 0
UUID=01d8a9ea-ee10-4ec2-b839-bac3c7e36db6 /boot xfs defaults 0 0
/dev/mapper/rhel-datamntpoint1 /datamntpoint1 xfs defaults 0 0
/dev/mapper/rhel-datamntpoint10 /datamntpoint10 xfs defaults 0 0
/dev/mapper/rhel-datamntpoint11 /datamntpoint11 xfs defaults 0 0
/dev/mapper/rhel-datamntpoint12 /datamntpoint12 xfs defaults 0 0
/dev/mapper/rhel-datamntpoint13 /datamntpoint13 xfs defaults 0 0
/dev/mapper/rhel-datamntpoint14 /datamntpoint14 xfs defaults 0 0
/dev/mapper/rhel-datamntpoint15 /datamntpoint15 xfs defaults 0 0
/dev/mapper/rhel-datamntpoint16 /datamntpoint16 xfs defaults 0 0
/dev/mapper/rhel-datamntpoint17 /datamntpoint17 xfs defaults 0 0
/dev/mapper/rhel-datamntpoint18 /datamntpoint18 xfs defaults 0 0
/dev/mapper/rhel-datamntpoint19 /datamntpoint19 xfs defaults 0 0
/dev/mapper/rhel-datamntpoint2 /datamntpoint2 xfs defaults 0 0
/dev/mapper/rhel-datamntpoint20 /datamntpoint20 xfs defaults 0 0
/dev/mapper/rhel-datamntpoint21 /datamntpoint21 xfs defaults 0 0
/dev/mapper/rhel-datamntpoint22 /datamntpoint22 xfs defaults 0 0
/dev/mapper/rhel-datamntpoint23 /datamntpoint23 xfs defaults 0 0
/dev/mapper/rhel-datamntpoint24 /datamntpoint24 xfs defaults 0 0
/dev/mapper/rhel-datamntpoint25 /datamntpoint25 xfs defaults 0 0
/dev/mapper/rhel-datamntpoint26 /datamntpoint26 xfs defaults 0 0
/dev/mapper/rhel-datamntpoint3 /datamntpoint3 xfs defaults 0 0
/dev/mapper/rhel-datamntpoint4 /datamntpoint4 xfs defaults 0 0
/dev/mapper/rhel-datamntpoint5 /datamntpoint5 xfs defaults 0 0
/dev/mapper/rhel-datamntpoint6 /datamntpoint6 xfs defaults 0 0
/dev/mapper/rhel-datamntpoint7 /datamntpoint7 xfs defaults 0 0
/dev/mapper/rhel-datamntpoint8 /datamntpoint8 xfs defaults 0 0
/dev/mapper/rhel-datamntpoint9 /datamntpoint9 xfs defaults 0 0
/dev/mapper/rhel-swap none swap defaults 0 0
2. Create a ReaR backup
3. Restore the backup
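The fstab entries in step 1 follow a regular pattern, so they can be generated rather than typed. A minimal sketch, assuming the volume group name `rhel` and the `datamntpointN` naming from the listing above; `gen_data_fstab` is a hypothetical helper name, and creating the logical volumes, file systems and mount point directories is not shown:

```shell
# Print fstab entries for N data mount points, following the naming
# used in the reproducer (rhel-datamntpoint1 ... rhel-datamntpointN).
# Both the count and the volume group name are parameters; the defaults
# mirror the report.
gen_data_fstab() {
    local count
    local vg
    local i
    count=${1:-26}
    vg=${2:-rhel}
    i=1
    while [ "$i" -le "$count" ]; do
        printf '/dev/mapper/%s-datamntpoint%d /datamntpoint%d xfs defaults 0 0\n' \
            "$vg" "$i" "$i"
        i=$((i + 1))
    done
}

# Print the 26 data entries; review them before appending to /etc/fstab.
gen_data_fstab 26
```

The entries come out in numeric order rather than the lexical order shown in the report, which makes no difference to mounting.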
Actual results:
Hang while executing grub2-mkconfig
Expected results:
No hang, /dev/log socket being a symlink
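Whether /dev/log is in the expected state can be checked from a shell in the rescue system. A minimal sketch; `classify_log_node` is a hypothetical helper name, not something ReaR provides:

```shell
# Report what kind of filesystem object a path is.
# Prints one of: symlink, socket, other, missing.
# The symlink test must come first, because -S follows symlinks.
classify_log_node() {
    if [ -L "$1" ]; then
        echo symlink
    elif [ -S "$1" ]; then
        echo socket
    elif [ -e "$1" ]; then
        echo other
    else
        echo missing
    fi
}

classify_log_node /dev/log
```

According to this report, `symlink` is the expected result on an installed RHEL 9 system, and `socket` is what the affected rescue system shows. Note that `socket` alone does not say whether anything is listening on it.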
(In reply to Renaud Métrich from comment #0)
> Description of problem:
>
> With RHEL9, the /dev/log inode is supposed to be a symlink to
> /run/systemd/journal/dev-log.
Thank you for the analysis. Is it a new problem in RHEL 9, or has it existed in RHEL 8 as well?
I see a similar situation in RHEL 8:
# ls -l /dev/log
lrwxrwxrwx. 1 root root 28 Feb 22 04:16 /dev/log -> /run/systemd/journal/dev-log
I don't know if this affects RHEL8.
For sure the good inode is:
# ls -l /dev/log
lrwxrwxrwx. 1 root root 28 Feb 22 04:16 /dev/log -> /run/systemd/journal/dev-log
I am curious, though: how does having the correct systemd unit outside the chroot help the program running in the chroot? Is it because /run is shared, so that connecting to /run/systemd/journal/dev-log in the chroot actually connects to the daemon that runs outside?
Hi Renaud, thank you again for the analysis. I have looked into the details of systemd unit startup in the rescue system. IMO, your proposed workaround (copying all the logging-related systemd units) is not well suited for inclusion upstream, as ReaR needs to support many distros and these details vary among them. At the least, it would require a lot of difficult testing on all the supported distros. Therefore, I propose a less invasive solution.
I found that there are multiple problems with the current systemd units. Nothing wants basic.target, and therefore the services/sockets that it contains never get started (this affects the /dev/log socket and the rsyslogd service that listens on it). Moreover, if I fix this, the socket starts very early, and for some reason this does not work. If I order it after basic system initialization, everything starts working: the socket gets started, and when something attempts to log to it, rsyslogd is spawned and sends the messages to /var/log/messages. (/dev/log is not a symlink to /run/systemd/journal/dev-log, but I don't think that is a big problem.)
By the way, I can also reproduce the problem using a simple for loop:
for i in `seq 1 1000`; do echo foo$i; done
This hangs when the problem occurs, because the socket gets filled.
With my fixes to the systemd units, it is fine: the output goes to /var/log/messages. I can also see the output from grub2-mkconfig (actually, from os-prober) there. So the problem you are seeing should be fixed. The changes are on my branch: https://github.com/pcahyna/rear/tree/rsyslog . What do you think?
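When testing a fix like this, it helps if the reproducer reports a hang instead of blocking the terminal. A minimal sketch, assuming GNU coreutils `timeout` is available in the environment where it runs; `probe_with_deadline` is a hypothetical helper name:

```shell
# Run a command under a deadline and classify the outcome.
# Prints "ok" if it finished successfully, "hang" if the deadline
# expired, and "failed" for any other non-zero exit status.
probe_with_deadline() {
    local seconds
    local rc
    seconds=$1
    shift
    rc=0
    timeout "$seconds" "$@" >/dev/null 2>&1 || rc=$?
    case $rc in
        0)   echo ok ;;
        124) echo hang ;;    # GNU timeout exits with 124 when the deadline expires
        *)   echo failed ;;
    esac
}
```

A possible use in the rescue system is `probe_with_deadline 30 sh -c 'for i in $(seq 1 1000); do logger "foo$i"; done'`. This assumes the messages are sent to /dev/log with `logger` from util-linux, which the loop as quoted above does not show, and the 30-second deadline is an arbitrary choice.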
Regarding RHEL 8, I see that the logs go into the systemd journal by default, so it seems that the problem does not occur there and so I won't touch it.
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.
For information on the advisory (rear bug fix and enhancement update), and where to find the updated
files, follow the link below.
If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHBA-2023:6571