Bug 1778744

Summary: Failed to create new system journal: Argument list too long
Product: Red Hat Enterprise Linux 7 Reporter: Florencia Fotorello <ffotorel>
Component: systemdAssignee: David Tardon <dtardon>
Status: CLOSED ERRATA QA Contact: Frantisek Sumsal <fsumsal>
Severity: urgent Docs Contact:
Priority: urgent    
Version: 7.7CC: bdeering, dchong, dtardon, duge, jsynacek, redhat, rmetrich, sbroz
Target Milestone: rcKeywords: Patch, Reproducer
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2020-09-29 20:32:27 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Florencia Fotorello 2019-12-02 12:22:55 UTC
Description of problem:
systemd-journald sometimes fails when SystemMaxUse is reached:

~~~
[ 1402.272539] systemd-journald[26673]: Failed to create new system journal: Argument list too long
[ 1402.282373] systemd-journald[26673]: /var/log/journal/2b6d0a4a6e70465097482efdc5053898/system.journal: Journal file corrupted, rotating.
[ 1402.313215] systemd-journald[26673]: Failed to create new system journal: Argument list too long
[ 1402.325012] systemd-journal[26673]: segfault at 7ffe7d346f38 ip 00007f554495c617 sp 00007ffe7d346e20 error 6 in libc-2.17.so[7f5544914000+1c3000]
[ 1402.361699] systemd-journald[26676]: File /var/log/journal/2b6d0a4a6e70465097482efdc5053898/system.journal corrupted or uncleanly shut down, renaming and replacing.
[ 1402.398772] systemd-journald[26676]: Failed to open system journal: Argument list too long
[ 1402.414520] systemd-journald[26676]: /var/log/journal/2b6d0a4a6e70465097482efdc5053898/system.journal: Journal file corrupted, rotating.
[ 1402.463389] systemd-journald[26676]: Failed to create new system journal: Argument list too long
[ 1402.464196] systemd-journald[26676]: Didn't flush runtime journal since rotation of system journal wasn't successful.
[ 1402.478435] systemd-journald[26676]: /var/log/journal/2b6d0a4a6e70465097482efdc5053898/system.journal: Journal file corrupted, rotating.
[ 1402.478578] systemd-journald[26676]: Failed to create new runtime journal: No such file or directory
[ 1402.520236] systemd-journald[26676]: Failed to create new system journal: Argument list too long
[ 1402.520854] systemd-journald[26676]: Didn't flush runtime journal since rotation of system journal wasn't successful.
[ 1402.520871] systemd-journald[26676]: Assertion 'f' failed at src/journal/journal-file.c:132, function journal_file_close(). Aborting.
[ 1402.556377] systemd-journald[26678]: /var/log/journal/2b6d0a4a6e70465097482efdc5053898/system.journal: Journal file corrupted, rotating.
[ 1402.589207] systemd-journald[26678]: Failed to create new system journal: Argument list too long
[ 1402.601023] systemd-journald[26678]: /var/log/journal/2b6d0a4a6e70465097482efdc5053898/system.journal: Journal file corrupted, rotating.
[ 1402.633406] systemd-journald[26678]: Failed to create new system journal: Argument list too long
[ 1402.646062] systemd-journald[26678]: /var/log/journal/2b6d0a4a6e70465097482efdc5053898/system.journal: Journal file corrupted, rotating.
[ 1402.679718] systemd-journald[26678]: Failed to create new system journal: Argument list too long
~~~

Version-Release number of selected component (if applicable):
systemd-219-67.el7_7.2.x86_64

How reproducible:
Always using the below steps, in a lab and in the customer's environment.


Steps to Reproduce:
1. Configuration file:

~~~
[root@myserver ~]# cat /etc/systemd/journald.conf
[Journal]
Storage=persistent
Compress=yes
RateLimitBurst=10000
SystemMaxUse=600M
SystemKeepFree=6G
SystemMaxFileSize=1G
~~~

2. Create the following script:

~~~
[root@myserver ~]# cat testmessage.sh 
#!/bin/bash
FILEDATE=`date +%T`
LOGFILE=/var/log/results.$FILEDATE.log
touch $LOGFILE

n=1000000
i=0

journalctl --vacuum-time=1m
systemctl restart systemd-journald

while [ $i -lt $n ]
do
   echo "Test message #$i" | tee -a $LOGFILE
   sudo -u user1 logger "This is a test message - user 1 #$i" 
   sudo -u user2 logger "This is a test message - user 2 #$i" 
   sudo -u user3 logger "This is a test message - user 3 #$i" 
   date  >> $LOGFILE 
   df -h >> $LOGFILE
   ls -lR /var/log/journal >> $LOGFILE
   ls -lR /var/log/journal | wc -l >> $LOGFILE
   journalctl --disk-usage >> $LOGFILE
   i=`expr $i + 1`
done
~~~

3. Run the script and wait until journald fails. 

Actual results:
The console is unusable, flooded with "Failed to create new system journal: Argument list too long" messages.

~~~
# dmesg | grep journal
[ 1402.272539] systemd-journald[26673]: Failed to create new system journal: Argument list too long
[ 1402.282373] systemd-journald[26673]: /var/log/journal/2b6d0a4a6e70465097482efdc5053898/system.journal: Journal file corrupted, rotating.
[ 1402.313215] systemd-journald[26673]: Failed to create new system journal: Argument list too long
[ 1402.325012] systemd-journal[26673]: segfault at 7ffe7d346f38 ip 00007f554495c617 sp 00007ffe7d346e20 error 6 in libc-2.17.so[7f5544914000+1c3000]
[ 1402.361699] systemd-journald[26676]: File /var/log/journal/2b6d0a4a6e70465097482efdc5053898/system.journal corrupted or uncleanly shut down, renaming and replacing.
[ 1402.398772] systemd-journald[26676]: Failed to open system journal: Argument list too long
[ 1402.414520] systemd-journald[26676]: /var/log/journal/2b6d0a4a6e70465097482efdc5053898/system.journal: Journal file corrupted, rotating.
[ 1402.463389] systemd-journald[26676]: Failed to create new system journal: Argument list too long
[ 1402.464196] systemd-journald[26676]: Didn't flush runtime journal since rotation of system journal wasn't successful.
[ 1402.478435] systemd-journald[26676]: /var/log/journal/2b6d0a4a6e70465097482efdc5053898/system.journal: Journal file corrupted, rotating.
[ 1402.478578] systemd-journald[26676]: Failed to create new runtime journal: No such file or directory
[ 1402.520236] systemd-journald[26676]: Failed to create new system journal: Argument list too long
[ 1402.520854] systemd-journald[26676]: Didn't flush runtime journal since rotation of system journal wasn't successful.
[ 1402.520871] systemd-journald[26676]: Assertion 'f' failed at src/journal/journal-file.c:132, function journal_file_close(). Aborting.
[ 1402.556377] systemd-journald[26678]: /var/log/journal/2b6d0a4a6e70465097482efdc5053898/system.journal: Journal file corrupted, rotating.
[ 1402.589207] systemd-journald[26678]: Failed to create new system journal: Argument list too long
[ 1402.601023] systemd-journald[26678]: /var/log/journal/2b6d0a4a6e70465097482efdc5053898/system.journal: Journal file corrupted, rotating.
[ 1402.633406] systemd-journald[26678]: Failed to create new system journal: Argument list too long
[ 1402.646062] systemd-journald[26678]: /var/log/journal/2b6d0a4a6e70465097482efdc5053898/system.journal: Journal file corrupted, rotating.
[ 1402.679718] systemd-journald[26678]: Failed to create new system journal: Argument list too long
~~~

Expected results:
- No errors when there is space in the filesystem.

Additional info:
- Initial status:

~~~
Test message #0
Thu Nov 28 10:13:30 EST 2019
Filesystem             Size  Used Avail Use% Mounted on
/dev/mapper/rhel-root  8.0G  1.8G  6.3G  23% /

Archived and active journals take up 120.0M on disk.
~~~

- Final status:

~~~
Test message #17586
Thu Nov 28 08:34:05 EST 2019
Filesystem             Size  Used Avail Use% Mounted on
/dev/mapper/rhel-root  8.0G  2.1G  6.0G  26% /

Archived and active journals take up 360.0M on disk.
~~~

- When systemd-journald is restarted, there is an error:

~~~
[root@myserver ~]# systemctl restart systemd-journald

** (pkttyagent:26715): WARNING **: 08:45:11.009: Unable to register authentication agent: Timeout was reached
Error registering authentication agent: Timeout was reached (g-io-error-quark, 24)
Failed to restart systemd-journald.service: Connection timed out
See system logs and 'systemctl status systemd-journald.service' for details.


[root@myserver ~]# systemctl status systemd-journald.service
● systemd-journald.service - Journal Service
   Loaded: loaded (/usr/lib/systemd/system/systemd-journald.service; static; vendor preset: disabled)
   Active: activating (start) since Thu 2019-11-28 08:44:32 EST; 1min 11s ago
     Docs: man:systemd-journald.service(8)
           man:journald.conf(5)
 Main PID: 26711 (systemd-journal)
   CGroup: /system.slice/systemd-journald.service
           └─26711 /usr/lib/systemd/systemd-journald
[root@myserver ~]# 
~~~

Comment 4 Stepan Broz 2020-03-11 16:40:53 UTC
Reproduction scenario (basically an alternative to the original reproducer):

1. Configure systemd-journald to use permanent or auto (default) storage and create "/var/log/journal" directory.

    # mkdir /var/log/journal

2. Edit "/etc/systemd/journald.conf" and use the following configuration:

--8<--------8<--------8<--------8<------
[Journal]
RateLimitInterval=0
RateLimitBurst=0
SystemMaxUse=100G
SystemMaxFileSize=1G
MaxRetentionSec=1d
MaxFileSec=30
--8<--------8<--------8<--------8<------

NOTE: Use SystemMaxUse really large, larger than available disk space on "/var" (or "/" if you don't have dedicated "/var" filesystem).

3. Restart systemd-journald.

    # systemctl restart systemd-journald

4. Create a really large file to fill-up disk space so that less than 15% free is left in "/var".

    # dd if=/dev/zero of=bigfile bs=1M count=<you need to use your own calculation, I've used 10240 for 12.8G of available space>

5. Wait for journal rotation.

6. Rotation fails, errord on console, systemd-journald in indefinite recursion loop, systemd-journald crashes.

Comment 10 David Tardon 2020-04-28 06:24:23 UTC
*** Bug 1692407 has been marked as a duplicate of this bug. ***

Comment 11 Lukáš Nykrýn 2020-04-29 11:20:45 UTC
fix merged to github master branch -> https://github.com/systemd-rhel/rhel-7/pull/107

Comment 17 errata-xmlrpc 2020-09-29 20:32:27 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Low: systemd security and bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2020:4007

Comment 19 Dushyant 2021-03-02 15:46:39 UTC
Still, the customer is facing the issue with systemd-219-78.el7_9.3.x86_64 on RHEL7.9

--8<--------8<--------8<--------8<------
/etc/systemd/journald.conf:
[Journal]
Storage=persistent
Compress=yes
RateLimitBurst=10000
SystemMaxUse=600M
SystemKeepFree=5.6G
SystemMaxFileSize=1G
--8<--------8<--------8<--------8<------

Output of journalctl --verify

File corruption detected at /var/log/journal/e1acb366491c40dbbb362347d855d92d/user-1002~:0015d0 (of 8388608 bytes, 0%).
FAIL: /var/log/journal/e1acb366491c40dbbb362347d855d92d/user-1002~ (Bad message)
PASS: /var/log/journal/e1acb366491c40dbbb362347d855d92d/user-1002.journal
PASS: /var/log/journal/e1acb366491c40dbbb362347d855d92d/system~
0015d0: Unused data (entry_offset==0)
0000d0: Data number mismatch
File corruption detected at /var/log/journal/e1acb366491c40dbbb362347d855d92d/user-1001:0015d0 (of 8388608 bytes, 0%).
FAIL: /var/log/journal/e1acb366491c40dbbb362347d855d92d/user-1001 (Bad message)
0015d0: Unused data (entry_offset==0)


[ 1010.266193] systemd-journald[2306]: Failed to write entry (13 items, 429 bytes) despite vacuuming, ignoring: Bad message
[ 1010.267833] systemd-journald[2306]: /var/log/journal/e1acb366491c40dbbb362347d855d92d/system.journal: Journal file corrupted, rotating.
[ 1010.271599] systemd-journald[2306]: Failed to create new system journal: Argument list too long
[ 1010.273421] systemd-journald[2306]: Failed to write entry (13 items, 396 bytes) despite vacuuming, ignoring: Bad message
[ 1010.277275] systemd-journald[2306]: /var/log/journal/e1acb366491c40dbbb362347d855d92d/user-1002.journal: Journal file corrupted, rotating.
[ 1010.281075] systemd-journald[2306]: Failed to create new system journal: Argument list too long
[ 1010.286775] systemd-journald[2306]: Failed to create new user journal: Argument list too long
[ 1010.289336] systemd-journald[2306]: Failed to write entry (13 items, 366 bytes) despite vacuuming, ignoring: Bad message

Comment 26 Dean K. Gibson 2022-03-06 21:30:54 UTC
I am running version 219 of systemd.

The above problem can also occur if there are too many files that systemd-journald sees, not only in the local /var/log/journal/123... directory, but also in (for example) the /var/log/journal/remote directory.  Apparently this is a per-directory max of somewhere around 160(?) files.

Scenario:  This (lots of error messages to tty1) started happening about a month ago. Clearing all the journal files in the local /var/log/journal/123... directory & restarting did not fix the problem.  Then I manually purged most of the /var/log/journal/remote directory, & then journaling to the local /var/log/journal/123... directory worked.

Note that I was not running systemd-journal-remote was not running on the problem system;  the /var/log/journal/remote directory was being populated by using RSYNC from another host.

Unfortunately (at least on version 219), "journalctl -D /var/log/journal/remote --vacuum-time=5weeks" DOES NOTHING.  Purging has to be done manually.

Also unfortunately, since I do not have access to https://access.redhat.com/errata/RHSA-2020:4007, I have no idea whether that fix is relevant to the above problem.

Comment 27 Dean K. Gibson 2022-03-06 21:35:33 UTC
Forgot:  Here is my journald.conf:

# -- /etc/systemd/journald.conf --

# The default setting for MaxFileSec is *INSANE*, as it allows one
# old entry in a journal to cause the entire journal to be deleted!

[Journal]
Storage                 = persistent
SplitMode               = none
SystemMaxUse            = 8G
SystemKeepFree          = 8G
SystemMaxFileSize       = 1G
MaxRetentionSec         = 5weeks
MaxFileSec              = 1day
ForwardToSyslog         = no