Bug 1232665

Summary: Open a socket to /dev/log raises Connection refused
Product: Red Hat Enterprise Linux 7
Reporter: yisun
Component: rsyslog
Assignee: Tomas Heinrich <theinric>
Status: CLOSED INSUFFICIENT_DATA
QA Contact: BaseOS QE Security Team <qe-baseos-security>
Severity: high
Priority: high
Version: 7.1
CC: alukiano, bazulay, dyuan, ecohen, gklein, lpeer, lsurette, michal.skrivanek, mikolaj, msekleta, pvrabec, stirabos, xuzhang, yanyang, ybronhei, yeylon, yisun
Target Milestone: pre-dev-freeze
Hardware: x86_64
OS: Linux
Doc Type: Bug Fix
Type: Bug
Last Closed: 2015-08-09 12:07:49 UTC
Bug Blocks: 1154205, 1172230

Description yisun 2015-06-17 09:20:06 UTC
Description of problem:


Version-Release number of selected component (if applicable):
vdsm-4.16.20-1.el7ev.x86_64
libvirt-1.2.16-1.el7.x86_64
kernel-3.10.0-229.el7.x86_64


How reproducible:
100%

Steps to Reproduce:
1. Have a host with libvirt-1.2.16-1.el7.x86_64 installed (let's say hostname=HOST1).
2. Prepare a vdsm yum repo file pointing to vdsm-4.16.20-1.el7ev.x86_64.
3. In the RHEVM portal, try to add HOST1 as a new host.
4. HOST1's registration fails. The last error message in RHEVM is "Host HOST1 installation failed. Host not reachable."
5. Log in to HOST1 and check the vdsmd daemon: it is not running and keeps trying to activate.

Actual results:
vdsmd failed to activate. 

Expected results:
vdsmd should be in a running state.

Additional info:
===========================
/var/log/messages:
===========================
Jun 17 17:17:40 dhcp-66-83-202 systemd: Starting Virtual Desktop Server Manager...
Jun 17 17:17:40 dhcp-66-83-202 vdsmd_init_common.sh: vdsm: Running mkdirs
Jun 17 17:17:40 dhcp-66-83-202 vdsmd_init_common.sh: vdsm: Running configure_coredump
Jun 17 17:17:40 dhcp-66-83-202 vdsmd_init_common.sh: vdsm: Running configure_vdsm_logs
Jun 17 17:17:40 dhcp-66-83-202 vdsmd_init_common.sh: vdsm: Running wait_for_network
Jun 17 17:17:40 dhcp-66-83-202 vdsmd_init_common.sh: vdsm: Running run_init_hooks
Jun 17 17:17:40 dhcp-66-83-202 vdsmd_init_common.sh: vdsm: Running upgraded_version_check
Jun 17 17:17:40 dhcp-66-83-202 vdsmd_init_common.sh: vdsm: Running check_is_configured
Jun 17 17:17:40 dhcp-66-83-202 vdsmd_init_common.sh: libvirt is already configured for vdsm
Jun 17 17:17:40 dhcp-66-83-202 vdsmd_init_common.sh: vdsm: Running validate_configuration
Jun 17 17:17:40 dhcp-66-83-202 vdsmd_init_common.sh: SUCCESS: ssl configured to true. No conflicts
Jun 17 17:17:40 dhcp-66-83-202 vdsmd_init_common.sh: vdsm: Running prepare_transient_repository
Jun 17 17:17:41 dhcp-66-83-202 vdsmd_init_common.sh: vdsm: Running syslog_available
Jun 17 17:17:41 dhcp-66-83-202 vdsmd_init_common.sh: vdsm: Running nwfilter
Jun 17 17:17:41 dhcp-66-83-202 vdsmd_init_common.sh: vdsm: Running dummybr
Jun 17 17:17:41 dhcp-66-83-202 vdsmd_init_common.sh: vdsm: Running load_needed_modules
Jun 17 17:17:41 dhcp-66-83-202 vdsmd_init_common.sh: vdsm: Running tune_system
Jun 17 17:17:41 dhcp-66-83-202 vdsmd_init_common.sh: vdsm: Running test_space
Jun 17 17:17:41 dhcp-66-83-202 vdsmd_init_common.sh: vdsm: Running test_lo
Jun 17 17:17:41 dhcp-66-83-202 systemd: Started Virtual Desktop Server Manager.
Jun 17 17:17:42 dhcp-66-83-202 python: detected unhandled Python exception in '/usr/share/vdsm/vdsm'


===========================
abrt log:
===========================
# cat backtrace 
socket.py:224:meth:error: [Errno 111] Connection refused

Traceback (most recent call last):
  File "/usr/share/vdsm/vdsm", line 220, in <module>
    run(**argDict)
  File "/usr/share/vdsm/vdsm", line 94, in run
    lconfig.fileConfig(loggerConfFile, disable_existing_loggers=False)
  File "/usr/lib64/python2.7/logging/config.py", line 78, in fileConfig
    handlers = _install_handlers(cp, formatters)
  File "/usr/lib64/python2.7/logging/config.py", line 156, in _install_handlers
    h = klass(*args)
  File "/usr/lib64/python2.7/logging/handlers.py", line 760, in __init__
    self._connect_unixsocket(address)
  File "/usr/lib64/python2.7/logging/handlers.py", line 788, in _connect_unixsocket
    self.socket.connect(address)
  File "/usr/lib64/python2.7/socket.py", line 224, in meth
    return getattr(self._sock,name)(*args)
error: [Errno 111] Connection refused

Local variables in innermost frame:
self: <socket._socketobject object at 0x29d94b0>
args: ('/dev/log',)
name: 'connect'

Comment 1 yisun 2015-06-17 09:31:01 UTC
I also tried an older vdsm version: vdsm-4.16.8.1-6.el7ev.
It works well with RHEL 7.1, but it shows a similar issue with libvirt 1.2.16-1, which is in RHEL 7.2.

Comment 2 Yaniv Bronhaim 2015-06-21 13:18:19 UTC
It doesn't look related to the libvirt version; the exception is about the logger, which fails to connect to /dev/log. Does /dev/log exist on your system? This is the socket for syslog prints.
Please attach /var/log/messages or the output of "journalctl -a".

Also yum.log or "rpm -qa" output can help.
An easier way to investigate is to reach me on IRC (I'm on the #ovirt and #vdsm freenode channels).
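
As an illustrative aside (not part of the original exchange), a minimal Python check for that question might look like:

>>> import os, stat
>>> st = os.stat('/dev/log')      # raises OSError if the path does not exist
>>> stat.S_ISSOCK(st.st_mode)     # expect True when /dev/log really is a unix socket
True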

Comment 6 Yaniv Bronhaim 2015-07-01 11:30:57 UTC
The issue is related to neither vdsm nor libvirt. I'm still checking what causes it, but for some reason this server cannot open a socket to /dev/log:

>>> import socket
>>> s = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
>>> s.connect('/dev/log')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib64/python2.7/socket.py", line 224, in meth
    return getattr(self._sock,name)(*args)
socket.error: [Errno 111] Connection refused

Vdsm reports the same error while loading logger.conf. Changing the conf to use a different socket works.

I'll try a bit more to figure out what's going on there, but it's surely not a vdsm bug. Changing the title accordingly.
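
For illustration only, a sketch of that kind of workaround in plain Python (the UDP address is an assumption, not what vdsm's logger.conf actually uses):

>>> import logging, logging.handlers
>>> # Point the syslog handler at a UDP address instead of the /dev/log unix
>>> # socket; UDP is connectionless, so this does not hit ECONNREFUSED even
>>> # while /dev/log is broken (messages are dropped if nothing listens).
>>> handler = logging.handlers.SysLogHandler(address=('localhost', 514))
>>> log = logging.getLogger('repro')
>>> log.addHandler(handler)
>>> log.error('syslog handler test message')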


The host runs:
Red Hat Enterprise Linux Server release 7.1 (Maipo)
rsyslog-7.4.7-7.el7_0.x86_64
python-2.7.5-16.el7.x86_64

Comment 8 Yaniv Bronhaim 2015-07-12 14:02:39 UTC
*** Bug 1242213 has been marked as a duplicate of this bug. ***

Comment 9 Michal Sekletar 2015-07-29 15:09:34 UTC
I am confused; can you explain:

1) Why is this bug reported against rsyslog but assigned to systemd-maintainers?
2) Why are you creating a stream socket? /dev/log is a datagram socket on RHEL 5/6/7. AFAIK only some ancient syslog implementations create /dev/log as a stream socket.
3) In case the component should really be systemd, as the assignee suggests, I cannot reproduce the described behavior using the provided code sample. Instead I get the following error on all RHEL versions.

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib64/python2.7/socket.py", line 224, in meth
    return getattr(self._sock,name)(*args)
socket.error: [Errno 91] Protocol wrong type for socket

This is expected because of 2). However, if I create a DGRAM socket, no exception is thrown and I can send log messages via the socket, as in the sketch below.
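
For reference, a sketch of the datagram variant (illustrative, not taken from the original report):

>>> import socket
>>> s = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM)
>>> s.connect('/dev/log')
>>> # <14> is an RFC 3164 priority prefix: facility user (1*8) + severity info (6)
>>> n = s.send(b'<14>dgram-test: hello from the reproducer')
>>> s.close()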

Comment 10 Yaniv Bronhaim 2015-08-01 06:59:12 UTC
1) /dev/log is managed as part of systemd-journald-dev-log.socket in el7 (AFAIU), and I found another bugzilla about an rsyslog bug that is assigned to systemd-maintainers, so I assumed this was the right address. Apologies if not; please reassign to the relevant people if you can.
2) I checked the logging.py code that failed in our program with the connection refused error, and that is how it opens the socket to /dev/log. Besides that, "nc -U /dev/log" and 'logger "echo"' also raise connection refused when the problem appears.
3) Currently I can't reproduce it and I don't have an environment with the issue to share, so it's hard for me to tell more or give more info, but we certainly had a few cases that led to this scenario, and we couldn't run any application that tries to log to syslog (only on el7).

Before I close the bug until the next time I see this, is there anything you would suggest collecting (logs/strace/coredump?) if we see the issue again?

Comment 11 Yaniv Bronhaim 2015-08-01 07:28:47 UTC
Probably [1] talks about the same issue, although it's on a Debian list. Are there any replies that I missed?

[1] https://www.mail-archive.com/debian-bugs-dist@lists.debian.org/msg1339754.html

Comment 12 Yaniv Bronhaim 2015-08-05 14:24:43 UTC
I couldn't reproduce it, and nobody reported anything about this issue during the past week.
Michal, if you have any lead or idea about it, please share. Otherwise I'll close this bug until it appears again.

Comment 13 Yaniv Bronhaim 2015-08-09 12:07:49 UTC
I haven't seen it for a while. I'll reopen the issue if it occurs again.

Comment 14 Simone Tiraboschi 2015-08-20 18:54:07 UTC
It's probably an issue in systemd-journald.
I faced it on one of my virtual machines that was suddenly powered off.

/dev/log is still there but is unusable.

So on the next system boot systemd-journald fails, and so does vdsmd, because it's not able to open /dev/log.
The bad part is that it fails without giving any hint, so it's not easy to debug. Adding some debug hints there could help somebody else.

Manually removing /dev/log and restarting systemd-journald was enough to be able to restart vdsmd.
No issue on the next reboot.
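
As an illustration of that failure mode (a self-contained sketch with made-up paths, not taken from the affected host), a socket file whose owning process is gone refuses connections with exactly this errno:

import os
import socket
import tempfile

# Simulate a daemon that bound a unix socket and then died without
# unlinking the socket file -- the state /dev/log was left in here.
path = os.path.join(tempfile.mkdtemp(), 'stale.sock')
server = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM)
server.bind(path)    # creates the socket file, as journald does for /dev/log
server.close()       # the "crash": the listener is gone, the file stays behind

client = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM)
try:
    client.connect(path)   # fails with [Errno 111] Connection refused
except socket.error as err:
    print(err)
finally:
    client.close()
    os.unlink(path)        # manual cleanup, analogous to 'rm /dev/log' below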


[root@c71he12 ovirt-hosted-engine-ha]# systemctl status systemd-journald-dev-log.socket
● systemd-journald-dev-log.socket - Journal Socket (/dev/log)
   Loaded: loaded (/usr/lib/systemd/system/systemd-journald-dev-log.socket; static; vendor preset: disabled)
   Active: inactive (dead) since gio 2015-08-20 18:37:50 UTC; 2min 44s ago
     Docs: man:systemd-journald.service(8)
           man:journald.conf(5)
   Listen: /run/systemd/journal/dev-log (Datagram)

ago 20 18:37:50 c71he12.localdomain systemd[1]: Stopping Journal Socket (/dev/log).
ago 20 18:37:50 c71he12.localdomain systemd[1]: Socket service systemd-journald.service already active, refusing.
ago 20 18:37:50 c71he12.localdomain systemd[1]: Failed to listen on Journal Socket (/dev/log).
ago 20 18:38:28 c71he12.localdomain systemd[1]: Socket service systemd-journald.service already active, refusing.
ago 20 18:38:28 c71he12.localdomain systemd[1]: Failed to listen on Journal Socket (/dev/log).
ago 20 18:40:23 c71he12.localdomain systemd[1]: Socket service systemd-journald.service already active, refusing.
ago 20 18:40:23 c71he12.localdomain systemd[1]: Failed to listen on Journal Socket (/dev/log).
ago 20 18:40:25 c71he12.localdomain systemd[1]: Socket service systemd-journald.service already active, refusing.
ago 20 18:40:25 c71he12.localdomain systemd[1]: Failed to listen on Journal Socket (/dev/log).

[root@c71he12 ovirt-hosted-engine-ha]# rm /dev/log
[root@c71he12 ovirt-hosted-engine-ha]# systemctl restart systemd-journald
[root@c71he12 ovirt-hosted-engine-ha]# systemctl status systemd-journald
● systemd-journald.service - Journal Service
   Loaded: loaded (/usr/lib/systemd/system/systemd-journald.service; static; vendor preset: disabled)
   Active: active (running) since gio 2015-08-20 18:40:55 UTC; 16s ago
     Docs: man:systemd-journald.service(8)
           man:journald.conf(5)
 Main PID: 20806 (systemd-journal)
   Status: "Processing requests..."
   CGroup: /system.slice/systemd-journald.service
           └─20806 /usr/lib/systemd/systemd-journald

ago 20 18:40:55 c71he12.localdomain systemd-journal[20806]: Runtime journal is using 8.0M (max allowed 1.5G, trying to leave 2.3G free of 15.6G available → current limit 1.5G).
ago 20 18:40:55 c71he12.localdomain systemd-journal[20806]: Runtime journal is using 8.0M (max allowed 1.5G, trying to leave 2.3G free of 15.6G available → current limit 1.5G).
ago 20 18:40:55 c71he12.localdomain systemd-journal[20806]: Journal started




[root@c71he12 ovirt-hosted-engine-ha]# systemctl restart vdsmd
[root@c71he12 ovirt-hosted-engine-ha]# systemctl status vdsmd
● vdsmd.service - Virtual Desktop Server Manager
   Loaded: loaded (/usr/lib/systemd/system/vdsmd.service; enabled; vendor preset: disabled)
   Active: active (running) since gio 2015-08-20 18:41:42 UTC; 6s ago
  Process: 20783 ExecStopPost=/usr/libexec/vdsm/vdsmd_init_common.sh --post-stop (code=exited, status=0/SUCCESS)
  Process: 20819 ExecStartPre=/usr/libexec/vdsm/vdsmd_init_common.sh --pre-start (code=exited, status=0/SUCCESS)
 Main PID: 20895 (vdsm)
   CGroup: /system.slice/vdsmd.service
           ├─20895 /usr/bin/python /usr/share/vdsm/vdsm
           ├─20997 /usr/libexec/ioprocess --read-pipe-fd 40 --write-pipe-fd 39 --max-threads 10 --max-queued-requests 10
           ├─21032 /usr/libexec/ioprocess --read-pipe-fd 48 --write-pipe-fd 47 --max-threads 10 --max-queued-requests 10
           ├─21039 /usr/libexec/ioprocess --read-pipe-fd 56 --write-pipe-fd 55 --max-threads 10 --max-queued-requests 10
           ├─21046 /usr/libexec/ioprocess --read-pipe-fd 65 --write-pipe-fd 64 --max-threads 10 --max-queued-requests 10
           ├─21053 /usr/libexec/ioprocess --read-pipe-fd 73 --write-pipe-fd 72 --max-threads 10 --max-queued-requests 10
           ├─21060 /usr/libexec/ioprocess --read-pipe-fd 80 --write-pipe-fd 78 --max-threads 10 --max-queued-requests 10
           ├─21084 /usr/libexec/ioprocess --read-pipe-fd 91 --write-pipe-fd 90 --max-threads 10 --max-queued-requests 10
           └─21091 /usr/libexec/ioprocess --read-pipe-fd 98 --write-pipe-fd 97 --max-threads 10 --max-queued-requests 10

ago 20 18:41:42 c71he12.localdomain vdsmd_init_common.sh[20819]: vdsm: Running test_space
ago 20 18:41:42 c71he12.localdomain vdsmd_init_common.sh[20819]: vdsm: Running test_lo
ago 20 18:41:42 c71he12.localdomain systemd[1]: Started Virtual Desktop Server Manager.
ago 20 18:41:42 c71he12.localdomain python[20895]: DIGEST-MD5 client step 2
ago 20 18:41:42 c71he12.localdomain python[20895]: DIGEST-MD5 parse_server_challenge()
ago 20 18:41:42 c71he12.localdomain python[20895]: DIGEST-MD5 ask_user_info()
ago 20 18:41:42 c71he12.localdomain python[20895]: DIGEST-MD5 client step 2
ago 20 18:41:42 c71he12.localdomain python[20895]: DIGEST-MD5 ask_user_info()
ago 20 18:41:42 c71he12.localdomain python[20895]: DIGEST-MD5 make_client_response()
ago 20 18:41:42 c71he12.localdomain python[20895]: DIGEST-MD5 client step 3