Bug 1709179

Summary: journald: never block when sending messages on NOTIFY_SOCKET socket [rhel-7.4.z]
Product: Red Hat Enterprise Linux 7 Reporter: RAD team bot copy to z-stream <autobot-eus-copy>
Component: systemd Assignee: David Tardon <dtardon>
Status: CLOSED ERRATA QA Contact: Frantisek Sumsal <fsumsal>
Severity: urgent Docs Contact:
Priority: urgent    
Version: 7.5 CC: chorn, dtardon, fsumsal, jamacku, jsynacek, lnykryn, mmatsuya, msekleta, ovasik, sbroz, systemd-maint-list, systemd-maint
Target Milestone: rc Keywords: Reopened, Triaged, ZStream
Target Release: --- Flags: pm-rhel: mirror+
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: systemd-219-42.el7_4.21 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: 1511565 Environment:
Last Closed: 2022-12-06 07:23:07 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 1511565    
Bug Blocks:    

Description RAD team bot copy to z-stream 2019-05-13 07:06:25 UTC
This bug has been copied from bug #1511565 and has been proposed to be backported to 7.4 z-stream (EUS).

Comment 2 Jan Synacek 2019-05-13 08:16:00 UTC
This can't be fixed as there is no NOTIFY_SOCKET in 7.4. It was added in 7.5 in systemd-219-48 and the accuracy further fixed in 219-49, but the latest 7.4 version is 219-42.15.

Comment 7 Christian Horn 2022-11-02 00:48:02 UTC
We have no real reproducer for this situation, but our partner
has used SystemTap to observe the interval between watchdog
notifications, comparing the rhel7.4 systemd with
a test package which Stepan has built:

~~~
I tried "03348542-systemd-test-packages.tar.xz" in our reproduction environment.

I used SystemTap to probe the "sendto" system call to check the interval between Watchdog notifications.

In the case of systemd-219-42.el7_4.20,
the transmission interval is often almost exactly 3 minutes.

    Tue Oct 18 20:39:42 2022 JST
    sendto is called: 11, "WATCHDOG=1", 10, MSG_DONTWAIT, NULL, 0
    PID:  14963 EXECNAME:  systemd-journal
    PPID: 1 PEXECNAME: systemd
    0x7f4b962bbbfd : send+0x1d/0xb0 [/usr/lib64/libpthread-2.17.so]
    0x557dbbb7072e : dispatch_notify_event.3887+0x18e/0x2c0 [/usr/lib/systemd/systemd-journald]
    0x557dbbb6b2c0 : source_dispatch.12334.3025+0x1c0/0x320 [/usr/lib/systemd/systemd-journald]
    0x557dbbb6c2ea : sd_event_dispatch+0x6a/0x1b0 [/usr/lib/systemd/systemd-journald]
    0x557dbbb6835c : main+0x7ac/0x12a0 [/usr/lib/systemd/systemd-journald]
    0x7f4b95f01555 : __libc_start_main+0xf5/0x1c0 [/usr/lib64/libc-2.17.so]
    0x557dbbb68e8c : _start+0x29/0x3d [/usr/lib/systemd/systemd-journald]

    Tue Oct 18 20:42:42 2022 JST
    sendto is called: 11, "WATCHDOG=1", 10, MSG_DONTWAIT, NULL, 0
    PID:  14966 EXECNAME:  systemd-journal
    PPID: 1 PEXECNAME: systemd
    0x7f9b9312fbfd : send+0x1d/0xb0 [/usr/lib64/libpthread-2.17.so]
    0x55ad1760272e : dispatch_notify_event.3887+0x18e/0x2c0 [/usr/lib/systemd/systemd-journald]
    0x55ad175fd2c0 : source_dispatch.12334.3025+0x1c0/0x320 [/usr/lib/systemd/systemd-journald]
    0x55ad175fe2ea : sd_event_dispatch+0x6a/0x1b0 [/usr/lib/systemd/systemd-journald]
    0x55ad175fa35c : main+0x7ac/0x12a0 [/usr/lib/systemd/systemd-journald]
    0x7f9b92d75555 : __libc_start_main+0xf5/0x1c0 [/usr/lib64/libc-2.17.so]
    0x55ad175fae8c : _start+0x29/0x3d [/usr/lib/systemd/systemd-journald]

    Tue Oct 18 20:45:43 2022 JST
    sendto is called: 11, "WATCHDOG=1", 10, MSG_DONTWAIT, NULL, 0
    PID:  14969 EXECNAME:  systemd-journal
    PPID: 1 PEXECNAME: systemd
    0x7f3785cbdbfd : send+0x1d/0xb0 [/usr/lib64/libpthread-2.17.so]
    0x55625bed672e : dispatch_notify_event.3887+0x18e/0x2c0 [/usr/lib/systemd/systemd-journald]
    0x55625bed12c0 : source_dispatch.12334.3025+0x1c0/0x320 [/usr/lib/systemd/systemd-journald]
    0x55625bed22ea : sd_event_dispatch+0x6a/0x1b0 [/usr/lib/systemd/systemd-journald]
    0x55625bece35c : main+0x7ac/0x12a0 [/usr/lib/systemd/systemd-journald]
    0x7f3785903555 : __libc_start_main+0xf5/0x1c0 [/usr/lib64/libc-2.17.so]
    0x55625becee8c : _start+0x29/0x3d [/usr/lib/systemd/systemd-journald]

In the case of systemd-219-42.el7_4.20.case03348542.test.0.x86_64,
the transmission interval is stable at 2 minutes.
This behavior is similar to systemd in RHEL7.5 and later, which include the fix.

    Tue Nov  1 09:54:58 2022 JST
    sendto is called: 11, "WATCHDOG=1", 10, MSG_DONTWAIT, NULL, 0
    PID:  339 EXECNAME:  systemd-journal
    PPID: 1 PEXECNAME: systemd
    0x7f0e8e121bfd : send+0x1d/0xb0 [/usr/lib64/libpthread-2.17.so]
    0x55ebc4b8872e : dispatch_notify_event.3887+0x18e/0x2c0 [/usr/lib/systemd/systemd-journald]
    0x55ebc4b832c0 : source_dispatch.12334.3025+0x1c0/0x320 [/usr/lib/systemd/systemd-journald]
    0x55ebc4b842ea : sd_event_dispatch+0x6a/0x1b0 [/usr/lib/systemd/systemd-journald]
    0x55ebc4b8035c : main+0x7ac/0x12a0 [/usr/lib/systemd/systemd-journald]
    0x7f0e8dd67555 : __libc_start_main+0xf5/0x1c0 [/usr/lib64/libc-2.17.so]
    0x55ebc4b80e8c : _start+0x29/0x3d [/usr/lib/systemd/systemd-journald]

    Tue Nov  1 09:56:28 2022 JST
    sendto is called: 11, "WATCHDOG=1", 10, MSG_DONTWAIT, NULL, 0
    PID:  339 EXECNAME:  systemd-journal
    PPID: 1 PEXECNAME: systemd
    0x7f0e8e121bfd : send+0x1d/0xb0 [/usr/lib64/libpthread-2.17.so]
    0x55ebc4b8872e : dispatch_notify_event.3887+0x18e/0x2c0 [/usr/lib/systemd/systemd-journald]
    0x55ebc4b832c0 : source_dispatch.12334.3025+0x1c0/0x320 [/usr/lib/systemd/systemd-journald]
    0x55ebc4b842ea : sd_event_dispatch+0x6a/0x1b0 [/usr/lib/systemd/systemd-journald]
    0x55ebc4b8035c : main+0x7ac/0x12a0 [/usr/lib/systemd/systemd-journald]
    0x7f0e8dd67555 : __libc_start_main+0xf5/0x1c0 [/usr/lib64/libc-2.17.so]
    0x55ebc4b80e8c : _start+0x29/0x3d [/usr/lib/systemd/systemd-journald]

    Tue Nov  1 09:58:28 2022 JST
    sendto is called: 11, "WATCHDOG=1", 10, MSG_DONTWAIT, NULL, 0
    PID:  339 EXECNAME:  systemd-journal
    PPID: 1 PEXECNAME: systemd
    0x7f0e8e121bfd : send+0x1d/0xb0 [/usr/lib64/libpthread-2.17.so]
    0x55ebc4b8872e : dispatch_notify_event.3887+0x18e/0x2c0 [/usr/lib/systemd/systemd-journald]
    0x55ebc4b832c0 : source_dispatch.12334.3025+0x1c0/0x320 [/usr/lib/systemd/systemd-journald]
    0x55ebc4b842ea : sd_event_dispatch+0x6a/0x1b0 [/usr/lib/systemd/systemd-journald]
    0x55ebc4b8035c : main+0x7ac/0x12a0 [/usr/lib/systemd/systemd-journald]
    0x7f0e8dd67555 : __libc_start_main+0xf5/0x1c0 [/usr/lib64/libc-2.17.so]
    0x55ebc4b80e8c : _start+0x29/0x3d [/usr/lib/systemd/systemd-journald]

For now, it looks like it has been fixed correctly.
I will try to run it for a little longer in our reproduction environment to see if there is an issue.
~~~

In the absence of a real reproducer, I guess this is the next best thing
we can get.
The comment in #6 also suggests that the patch might be appropriate
material for the rhel7.4.z stream?

( Bugzilla forces me to set a target release, and only accepts 7.9.z,
  so I am setting that.  Either this bz will eventually be flagged for 7.4.z,
  or it will just contain the investigation and the fix will be delivered
  via a clone of the bz with the fixes which went into rhel7.5GA.)
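
As a cross-check of the intervals discussed in this comment, the timestamps in the quoted SystemTap output can simply be subtracted from each other. Below is a minimal Python sketch of that arithmetic. The names `intervals`, `UNPATCHED` and `PATCHED` are made up for illustration, and the timestamps are copied from the traces above with the "JST" suffix dropped, since only the differences matter:

```python
from datetime import datetime

# Timestamps copied from the quoted SystemTap output (illustrative list names).
UNPATCHED = [  # systemd-219-42.el7_4.20
    "Tue Oct 18 20:39:42 2022",
    "Tue Oct 18 20:42:42 2022",
    "Tue Oct 18 20:45:43 2022",
]
PATCHED = [  # systemd-219-42.el7_4.20.case03348542.test.0
    "Tue Nov  1 09:54:58 2022",
    "Tue Nov  1 09:56:28 2022",
    "Tue Nov  1 09:58:28 2022",
]


def intervals(stamps):
    """Return the gaps, in whole seconds, between consecutive timestamps."""
    # Collapse repeated spaces ("Nov  1") before parsing.
    times = [
        datetime.strptime(" ".join(s.split()), "%a %b %d %H:%M:%S %Y")
        for s in stamps
    ]
    return [int((b - a).total_seconds()) for a, b in zip(times, times[1:])]


if __name__ == "__main__":
    print("unpatched:", intervals(UNPATCHED))
    print("patched:  ", intervals(PATCHED))
```

Run against the quoted data, this gives gaps of 180 and 181 seconds for systemd-219-42.el7_4.20, and 90 and 120 seconds for the test package.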

Comment 10 Plumber Bot 2022-11-18 15:18:15 UTC
fix merged to github rhel-7.4 branch -> https://github.com/redhat-plumbers/systemd-rhel7/pull/142

Comment 18 errata-xmlrpc 2022-12-06 07:23:07 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (systemd bug fix and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2022:8798