Bug 742624 - Enabling syslog-ng interferes with desktop login to KDE
Summary: Enabling syslog-ng interferes with desktop login to KDE
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: syslog-ng
Version: 15
Hardware: Unspecified
OS: Unspecified
unspecified
unspecified
Target Milestone: ---
Assignee: Matthias Runge
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
: 755682 (view as bug list)
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2011-09-30 19:41 UTC by Greg
Modified: 2011-12-24 20:52 UTC (History)
7 users (show)

Fixed In Version: syslog-ng-3.2.5-2.fc15
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2011-12-24 20:51:43 UTC
Type: ---
Embargoed:


Attachments (Terms of Use)

Description Greg 2011-09-30 19:41:13 UTC
Description of problem:

When I enable syslog-ng, even with the default config, it apparently interferes with the logout and login processes for KDE which hangs and does not recover until I ssh into the box and kill syslog-ng


Version-Release number of selected component (if applicable):

3.2.4

How reproducible:

Login to Desktop KDE - disable the default rsyslogd daemon and enable syslog-ng daemon and start it up.

Steps to Reproduce:
1.  Login to Desktop KDE
2. Kill and disable rsyslogd and enable syslog-ng and start it using systemctl
3. Logout of KDE
4. Try to login
  
Actual results:

Can't log back into KDE until the syslog-ng process is killed - the login process hangs

Expected results:

Should be able to log into KDE without any problems

Additional info:

Comment 1 Matthias Runge 2011-10-04 07:21:31 UTC
sounds really strange.

Comment 2 Douglas E. Warner 2011-10-04 12:23:55 UTC
I agree; it /is/ really strange.

Unfortunately, I experienced this late last week as well.  Somehow systemd and syslog-ng are not getting along currently, and when syslog-ng is running it seems like all syslog messages are blocked which causes pam to timeout.

This took me until Monday to figure out what it was because I had to reinstall my OS to get standard functionality again as I couldn't login to my system in any multi-user mode (since those were the ones that syslog-ng was enabled) and pam wasn't logging anything for the failed logins because systemd thought syslog-ng was handling the messages when apparently it wasn't.

It wasn't until after the reinstall (when everything worked fine) and I updated my system that things broke again for me, and I found that disabling syslog-ng got things working again.

Unless we have an idea on what might be going on we might want to bump the epoch on an older version that has the old init script until we have time to figure out why systemd/syslog-ng aren't getting along.

Comment 3 Matthias Runge 2011-10-04 18:45:10 UTC
OK, until now, I'm aware of you two affected. 

which version of syslog-ng are you using? (rpm -q syslog-ng)

Comment 4 Douglas E. Warner 2011-10-04 18:58:50 UTC
I removed it to get my system logging in again, but I can help debug (which is just difficult given no logging seems to work :P).

Sep 30 10:17:10 Installed: syslog-ng-3.2.4-8.fc15.x86_64

$ rpm -q systemd syslog-ng
systemd-26-9.fc15.x86_64
package syslog-ng is not installed

Comment 5 Greg 2011-10-04 19:01:28 UTC
rpm -q systemd syslog-ng
systemd-26-9.fc15.x86_64
syslog-ng-3.2.4-8.fc15.x86_64

Comment 6 Jose Pedro Oliveira 2011-10-05 03:07:31 UTC
Douglas and Greg,

Could you downgrade syslog-ng 3.2.4 to release 3 [1] and try to reproduce the problems you have described?

jpo

[1] - syslog-ng-3.2.4-3.fc15 RPMS are still available in koji
      (http://koji.fedoraproject.org/koji/buildinfo?buildID=244150)

      Changes from release 3 to release 8:
      * built with libdbi
      * patch syslog-ng-3.2.4-chain-hostnames-processing.patch (#713965)
      * syslog-ng.service configuration file:
        - Sockets setting (#734569)
        - StandardOutput setting (#734591)

Comment 7 Greg 2011-10-05 18:32:39 UTC
syslog-ng-3.2.4-7.fc15.x86_64 - caused my system to hang on reboot, I did not wait forever, but for at least a couple of minutes, and during that period the system was unresponsive and SSH was not listening.

Downgrade to syslog-ng-3.2.4-3.fc15.x86_64 and this version seems to be working without any hitches as far as I can tell.

Comment 8 Jose Pedro Oliveira 2011-10-06 02:33:42 UTC
In the past few days I have also experienced several strange problems in a Fedora 15 system that appear to be related to syslog-ng-3.2.4-8:

 * yum update transactions that never terminated (but that appeared to install 
   the new packages correctly)
 * several hanged crond processes
 * unable to reboot the machine via the reboot command 

All these problems appear to have disappear when I downgraded syslog-ng to the previous stable release (syslog-ng-3.2.4-3).

/jpo

Comment 9 Matthias Runge 2011-10-06 06:47:44 UTC
(In reply to comment #7)
> syslog-ng-3.2.4-7.fc15.x86_64 - caused my system to hang on reboot, I did not
> wait forever, but for at least a couple of minutes, and during that period the
> system was unresponsive and SSH was not listening.
> 
> Downgrade to syslog-ng-3.2.4-3.fc15.x86_64 and this version seems to be working
> without any hitches as far as I can tell.
Yes, there is a bug in syslog-ng-3.2.4-7 in combination with systemd, which causes system boot takes very long (until something times out). This is fixed in version -8. 

Sadly, I can't reproduce the problem not being able to log into KDE, when using syslog-ng-3.2.4-8

Doug and Greg, are you using a kde login manager? (I just tested gdm..)

jpo, I didn't see your issues either and am using syslog-ng on my three daily work machines.

Comment 10 Douglas E. Warner 2011-10-06 12:26:35 UTC
(In reply to comment #9)
> Doug and Greg, are you using a kde login manager? (I just tested gdm..)

I'm using gdm, but I also wasn't able to login to normal text login when I booted w/ "systemd.unit=multi-user.target".  I would type my username in, hit enter, and never be prompted for my password; the login would just time out after 60 seconds.  My impression was that pam was blocked trying to log output.

Comment 11 Greg 2011-10-06 18:00:20 UTC
I am using KDM

I upgraded again from 3.2.4-3 to 3.2.4-8 via yum and restarted the system which hung up during boot and the last message printed to the screen was "Started Display Manager" - I left the system in this state for about 5 minutes, no progression

I then ctrl-alt-delete and restart fails with the message "Show Plymouth Reboot Screen Failed"

I believe I was originally using GDM before I reinstalled my system and decided to try KDM.

Comment 12 Jose Pedro Oliveira 2011-10-07 15:26:26 UTC
Hi,

The problems are being caused by the changes done in the syslog-ng.service file:
removing the StandardOutput line and undoing the changes to the Sockets line fixes
the problem for me with syslog-ng-3.2.4-8 [1].  

----------
diff -ruN syslog-ng-3.2.4-6.1.fc15.src/syslog-ng.service syslog-ng-3.2.4-8.fc15.src/syslog-ng.service
--- syslog-ng-3.2.4-6.1.fc15.src/syslog-ng.service      2011-06-30 12:03:19.000000000 +0100
+++ syslog-ng-3.2.4-8.fc15.src/syslog-ng.service        2011-09-14 08:23:38.000000000 +0100
@@ -5,7 +5,8 @@
 EnvironmentFile=-/etc/sysconfig/syslog-ng
 ExecStartPre=/bin/systemctl stop systemd-kmsg-syslogd.service
 ExecStart=/sbin/syslog-ng $SYSLOGNG_OPTIONS
-Sockets=syslog-ng.socket
+Sockets=syslog.socket
+StandardOutput=null
-----------

[1] - The systemd version shipped with F15 appears to be to old for these changes. Maybe they only work correctly with systemd v35+ (as documented in http://www.freedesktop.org/wiki/Software/systemd/syslog).

Comment 13 Matthias Runge 2011-10-08 18:00:56 UTC
OK, jpo, thank you for sorting this out. 

Sadly, I'm still unable to reproduce this. I thinks, this bug is was introduced just with the 

StandardOutput=null

line.

I'd be tankful if someone of you guys could just remove this line from 
/lib/systemd/system/syslog-ng.service 
and retest.

I'd build a fixed version, if this succeeds.

Thanks

Comment 14 Jose Pedro Oliveira 2011-10-10 15:58:10 UTC
(In reply to comment #13)
> OK, jpo, thank you for sorting this out. 
> 
> Sadly, I'm still unable to reproduce this. I thinks, this bug is was introduced
> just with the 
> 
> StandardOutput=null
> 
> line.
> 
> I'd be tankful if someone of you guys could just remove this line from 
> /lib/systemd/system/syslog-ng.service 
> and retest.

Removing only the "StandardOutput" line doesn't solve the problem for me.

I believe there's already a patch for this problem in the syslog-ng 3.2 branch:
http://git.balabit.hu/?p=bazsi/syslog-ng-3.2.git;a=commitdiff;h=2f214c4f87d944aa28d53e331a67b1fd88d9840f
I'll try it out ...

/jpo

Comment 15 Jose Pedro Oliveira 2011-10-10 17:26:50 UTC
syslog-ng 3.2.4 built with the above patch doesn't solve the problem in my F15 x86_64 and F16 Beta x86_64 VMs.  I'm only able to fix the problem by changing the Sockets line in the syslog-ng.service file back to "Sockets=syslog-ng.socket".

 * Syslog-ng 3.2.4 SRPM with patch mentioned in the previous comment:

   syslog-ng-3.2.4-9.fc17
   http://koji.fedoraproject.org/koji/buildinfo?buildID=267857

Anyone willing to try the above syslog-ng release?

/jpo

Comment 16 Greg 2011-10-10 18:22:30 UTC
Sure I'll give it a try but won't be able to get to it for a day or so.

Comment 17 Greg 2011-10-17 19:20:42 UTC
I tried syslog-ng-3.2.4-9.fc17.x86_64

The system hangs on reboot and does not reply to SSH connection requests, I have to boot into rescue mode and uninstall that version and revert to version 3.2.4-3 which is the only version that seems to work for me.

Comment 18 Jose Pedro Oliveira 2011-10-23 03:20:13 UTC
I've just build syslog-ng-3.2.4-10.20111022.fc17. This is a pre-3.2.5 build using the tarball announced in the mail https://lists.balabit.hu/pipermail/syslog-ng/2011-October/017462.html. The syslog-ng.service file installed contains the line "Sockets=syslog.socket" (instead of the old line "Sockets=syslog-ng.socket").

I've rebuilt this version for Fedora 15 and installed in one F15 VM: I still detect a couple of problems (long reboot/poweroff times )and at least one time the network interface didn't initialized) but still haven't seen frozen startups
(but systemd also had a couple of updates recently...).  Changing the Sockets option back to "Sockets=syslog-ng.socket" solved the detected problems.

/jpo

Comment 19 Jose Pedro Oliveira 2011-10-25 19:15:08 UTC
Rawhide build with a new 3.2.5 pre-release tarball (a change in syslog-ng.service file keeping it synced with the one in syslog-ng 3.3 branch) :

 * syslog-ng-3.2.4-11.20111023.fc17
   http://koji.fedoraproject.org/koji/buildinfo?buildID=270588

Comment 20 Jose Pedro Oliveira 2011-11-01 20:56:52 UTC
Syslog-ng 3.2.5 final:

 * syslog-ng-3.2.5-1.fc17
   http://koji.fedoraproject.org/koji/buildinfo?buildID=271879

Note: The syslog-ng.service file installed contains the line "Sockets=syslog.socket" (instead of the old "working" line "Sockets=syslog-ng.socket").

Comment 21 Jose Pedro Oliveira 2011-11-02 10:43:34 UTC
(In reply to comment #20)
> Syslog-ng 3.2.5 final:
> 
>  * syslog-ng-3.2.5-1.fc17
>    http://koji.fedoraproject.org/koji/buildinfo?buildID=271879
> 
> Note: The syslog-ng.service file installed contains the line
> "Sockets=syslog.socket" (instead of the old "working" line
> "Sockets=syslog-ng.socket").

syslog-ng-3.2.5-1.fc17.src.rpm - scratch builds for Fedora 16 and Fedora 15:

 * F16: http://koji.fedoraproject.org/koji/taskinfo?taskID=3479993
 * F15: http://koji.fedoraproject.org/koji/taskinfo?taskID=3480016

Comment 22 Jose Pedro Oliveira 2011-11-04 14:45:31 UTC
Fedora 15 x86_64 VM with syslog-ng-3.2.5-1.fc15 and systemd-26-10 (or systemd-26-13 from updates-testing) still stalls for several minutes during boot:

   Last lines printed before stalling:
   ---------
   ...
   Starting Termninate Plymouth Boot Screen...
   Starting Wait for Plymouth Boot Screen to Quit...
   ---------

After 4/5 minutes it manages to present a login console (runlevel 3) where it is possible to make a successful login (but the system won't accept network logins via ssh).

The system boots without delays if I change the Sockets option from 
  Sockets=syslog.socket
to 
  Sockets=syslog-ng.socket

/jpo

Comment 23 Jose Pedro Oliveira 2011-11-04 14:59:12 UTC
Similar problems in a F16 VM with syslog-ng-3.2.5-1.fc16.x86_64 and
systemd-37-2.fc16.x86_64: boot process a bit slower (but faster than the F15 with problems) and no network after boot.

Fix: Change the Sockets option back to syslog-ng.socket

Comment 24 Greg 2011-11-11 17:58:07 UTC
Fix: Change the Sockets option back to syslog-ng.socket works for me - thanks everyone for looking into this and providing me incentive to familiarize myself a bit more with systemd.

Greg

Comment 25 Balazs Scheidler 2011-12-13 19:40:28 UTC
Can anyone check what happens with syslog-ng in the stuck case? An strace would help a lot to see where it hangs.

I'm not sure how strace could be applied to the syslog-ng process during startup,
but please make sure that "-s 256 -f -ttT" strace options are used, and strace
output is redirected to a file using "-o".

There was a related change in 3.2.5 (not enabling nonblocking I/O on systemd acquired sockets), but that should've fixed this one.

Thanks.

Comment 26 Michal Schmidt 2011-12-14 02:51:13 UTC
*** Bug 755682 has been marked as a duplicate of this bug. ***

Comment 27 Michal Schmidt 2011-12-14 03:34:00 UTC
This is suspicious:

# netstat -xap | grep /dev/log
unix  18     [ ]         DGRAM                    13346  1/init              /dev/log
unix  2      [ ACC ]     STREAM     LISTENING     15281  886/syslog-ng       /dev/log
unix  3      [ ]         STREAM     CONNECTED     15332  886/syslog-ng       /dev/log
unix  3      [ ]         STREAM     CONNECTED     15305  886/syslog-ng       /dev/log

The unix socket created by the unit syslog.socket and passed to syslog-ng.service is a datagram socket. Does syslog-ng ignore the socket fd it receives from systemd because it does not like it being a datagram socket?

Comment 28 Balazs Scheidler 2011-12-14 07:08:45 UTC
hmm... this depends on the configuration file for syslog-ng. if that contains
unix-stream() source driver, it'll try to locate a SOCK_STREAM socket, if that's
unix-dgram() it'll look for SOCK_DGRAM

This could explain the problem as in the case syslog-ng doesn't find the 
systemd socket, it'll open its own /dev/log

Which means that part of the system will connect to the now-defunct /dev/log, which is probably not polled by systemd. These processes will block.

Anything that connects /dev/log after syslog-ng has started will log properly though.

The system() source for syslog-ng (essentially a macro-like thing that expands to the proper set of source drivers for each platform) uses unix-dgram() by default.

This is how to use it:

@version: 3.3
@include 'scl.conf'

source s_local {
    system();
};

destination ... { ... };

log { source(s_local); destination(...); };

The scl.conf portion is important as that defines system as an SCL block.

Comment 29 Michal Schmidt 2011-12-14 11:32:51 UTC
The default syslog-ng.conf as shipped in the Fedora package has:

...
source s_sys {
    file ("/proc/kmsg" program_override("kernel: "));
    unix-stream ("/dev/log");
    internal();
    # udp(ip(0.0.0.0) port(514));
};
...

Changing "unix-stream" to "unix-dgram" makes it work.

I tried using the "system();" macro instead and I hit a problem that is caused by a packaging bug: The directory /usr/share/syslog-ng/include/scl is not shipped in the main package, but in -devel subpackage. It should be in the main package if "@include 'scl.conf'" is expected to work. And the SELinux policy needs a few additions, because currently it does not expect that syslog-ng needs to execute scripts of the scl plugins (such as "generate-system-source.sh").

Comment 30 Michal Schmidt 2011-12-14 11:46:35 UTC
Balazs,
I have a suggestion about the check done in modules/afsocket/afunix.c:afunix_sd_acquire_socket().
Rather than ignoring the socket fd when it does not match the expectations set by the config file, it would be safer to consider that an error and abort. That way the symptoms of the misconfiguration would be much more obvious than random processes hanging.

Comment 31 Matthias Runge 2011-12-14 11:56:08 UTC
Michal, Balazs, 

thank you for debugging and helping me on the road. 

Michal, I just tested your changes, and they work for me. I can't say, if they fix the original problem.

During testing it I encountered some problems. It looks like misconfiguring syslog-ng (and preventing start) prevents system from booting. (systemd seems to require a logging mechanism. If it doesn't find one, it stops/waits forever).

Comment 32 Michal Schmidt 2011-12-14 12:17:58 UTC
(In reply to comment #31)
> During testing it I encountered some problems. It looks like misconfiguring
> syslog-ng (and preventing start) prevents system from booting. (systemd seems
> to require a logging mechanism. If it doesn't find one, it stops/waits
> forever).

How exactly can I reproduce this situation?
If syslog-ng exits, I'd expect systemd to spawn systemd-kmsg-syslogd.service again which will read /dev/log and write the messages to the kernel's dmesg buffer.

Comment 33 Balazs Scheidler 2011-12-14 14:23:20 UTC
(In reply to comment #30)
> Balazs,
> I have a suggestion about the check done in
> modules/afsocket/afunix.c:afunix_sd_acquire_socket().
> Rather than ignoring the socket fd when it does not match the expectations set
> by the config file, it would be safer to consider that an error and abort. That
> way the symptoms of the misconfiguration would be much more obvious than random
> processes hanging.

This patch was commited against syslog-ng 3.3:

commit 606c8cc0b10aa4e877f70726b707402d100bb0f9
Author: Balazs Scheidler <bazsi>
Date:   Wed Dec 14 15:21:05 2011 +0100

    afunix: clarified error messages in case of a failure
    
    On Fedora systems, syslog-ng was configured to use unix-stream() /dev/log
    whereas systemd supplied a unix-dgram() one, which caused difficult to
    diagnose problems.
    
    This patch adds further logging to this case and causes syslog-ng to fail
    with an error message if it finds that /dev/log is using an incorrect
    socket type.
    
    Reported-By: Michal Schmidt <mschmidt>
    Signed-off-by: Balazs Scheidler <bazsi>

Comment 34 Michal Schmidt 2011-12-14 16:00:09 UTC
Thank you Balazs, the patch works fine in my testing.
With correct configuration syslog-ng starts without problems.
With the mismatched socket type between syslog-ng and systemd it quits immediately (with exit status 2) and thus allows systemd-kmsg-syslogd.service to take over. No process hangs.

Comment 35 Balazs Scheidler 2011-12-14 19:37:14 UTC
Great to hear, thanks.

Comment 36 Fedora Update System 2011-12-14 19:57:37 UTC
syslog-ng-3.2.5-2.fc16 has been submitted as an update for Fedora 16.
https://admin.fedoraproject.org/updates/syslog-ng-3.2.5-2.fc16

Comment 37 Fedora Update System 2011-12-14 19:58:14 UTC
syslog-ng-3.2.5-2.fc15 has been submitted as an update for Fedora 15.
https://admin.fedoraproject.org/updates/syslog-ng-3.2.5-2.fc15

Comment 38 Fedora Update System 2011-12-16 19:50:17 UTC
Package syslog-ng-3.2.5-2.fc16:
* should fix your issue,
* was pushed to the Fedora 16 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing syslog-ng-3.2.5-2.fc16'
as soon as you are able to.
Please go to the following url:
https://admin.fedoraproject.org/updates/FEDORA-2011-17158/syslog-ng-3.2.5-2.fc16
then log in and leave karma (feedback).

Comment 39 Fedora Update System 2011-12-24 20:51:43 UTC
syslog-ng-3.2.5-2.fc16 has been pushed to the Fedora 16 stable repository.  If problems still persist, please make note of it in this bug report.

Comment 40 Fedora Update System 2011-12-24 20:52:15 UTC
syslog-ng-3.2.5-2.fc15 has been pushed to the Fedora 15 stable repository.  If problems still persist, please make note of it in this bug report.


Note You need to log in before you can comment on or make changes to this bug.