Bug 1299005
Summary: | skyring crashes when /tmp/.skyring-event socket exists | ||||||
---|---|---|---|---|---|---|---|
Product: | [Red Hat Storage] Red Hat Storage Console | Reporter: | Martin Bukatovic <mbukatov> | ||||
Component: | core | Assignee: | Shubhendu Tripathi <shtripat> | ||||
Status: | CLOSED ERRATA | QA Contact: | Martin Bukatovic <mbukatov> | ||||
Severity: | high | Docs Contact: | |||||
Priority: | urgent | ||||||
Version: | 2 | CC: | japplewh, mkudlej, nthomas, sankarshan, shtripat | ||||
Target Milestone: | beta | ||||||
Target Release: | 2 | ||||||
Hardware: | Unspecified | ||||||
OS: | Unspecified | ||||||
Whiteboard: | |||||||
Fixed In Version: | rhscon-ceph-0.0.23-1.el7scon.x86_64, rhscon-core-0.0.24-1.el7scon.x86_64, rhscon-ui-0.0.39-1.el7scon.noarch | Doc Type: | Bug Fix | ||||
Doc Text: | Story Points: | --- | |||||
Clone Of: | Environment: | ||||||
Last Closed: | 2016-08-23 19:45:54 UTC | Type: | Bug | ||||
Regression: | --- | Mount Type: | --- | ||||
Documentation: | --- | CRM: | |||||
Verified Versions: | Category: | --- | |||||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||
Cloudforms Team: | --- | Target Upstream Version: | |||||
Embargoed: | |||||||
Attachments: |
|
Description
Martin Bukatovic
2016-01-15 16:38:04 UTC
This is still a valid issue. The socket has been moved into /var/run/.skyring-event and skyring service has a proper systemd unit file now, but the issue remains: ~~~ # rpm -q rhscon-core rhscon-core-0.0.17-1.el7scon.x86_64 # ls -l /var/run/.skyring-event srwxr-xr-x. 1 root root 0 May 16 07:21 /var/run/.skyring-event # journalctl -u skyring --since "11:22:50" -- Logs begin at Tue 2016-05-10 10:36:54 CEST, end at Tue 2016-05-17 17:01:01 CEST. -- May 17 11:22:50 dhcp46-190.example.com systemd[1]: skyring.service holdoff time over, scheduling restart. May 17 11:22:50 dhcp46-190.example.com systemd[1]: Started Skyring Daemon. May 17 11:22:50 dhcp46-190.example.com systemd[1]: Starting Skyring Daemon... May 17 11:22:51 dhcp46-190.example.com skyring[27755]: 2016-05-17T11:22:51.341+02:00 ERROR auth.go:197 AddUser] Username: admin already exists May 17 11:22:51 dhcp46-190.example.com skyring[27755]: 2016-05-17T11:22:51.341+02:00 ERROR auth.go:41 AddDefaultUser] Unable to create default User:user already exists May 17 11:22:51 dhcp46-190.example.com skyring[27755]: 2016-05-17T11:22:51.344+02:00 INFO main.go:214 start] Starting the providers May 17 11:22:51 dhcp46-190.example.com skyring[27755]: 2016-05-17T11:22:51.345+02:00 DEBUG provider-conf.go:57 LoadProviderConfig] File Name: ceph.conf May 17 11:22:51 dhcp46-190.example.com skyring[27755]: 2016-05-17T11:22:51.345+02:00 DEBUG provider-conf.go:57 LoadProviderConfig] File Name: ceph.dat May 17 11:22:51 dhcp46-190.example.com skyring[27755]: 2016-05-17T11:22:51.345+02:00 DEBUG provider-conf.go:57 LoadProviderConfig] File Name: ceph.evt May 17 11:22:51 dhcp46-190.example.com skyring[27755]: 2016-05-17T11:22:51.345+02:00 DEBUG provider-conf.go:86 LoadProviderConfig] Collection: [{{ceph bigfin 10.1} [] {ceph-installer true false}}] May 17 11:22:51 dhcp46-190.example.com skyring[27755]: 2016-05-17T11:22:51.345+02:00 INFO skyring.go:98 StartProviders] Config:%!(EXTRA []conf.ProviderInfo=[{{ceph bigfin 10.1} [] {ceph-installer true false}}]) May 17 11:22:51 dhcp46-190.example.com skyring[27755]: 2016-05-17T11:22:51.345+02:00 DEBUG skyring.go:101 StartProviders] Config:%!(EXTRA conf.ProviderInfo={{ceph bigfin 10.1} [] {ceph-installer true false}}) May 17 11:22:51 dhcp46-190.example.com skyring[27755]: 2016-05-17T11:22:51.348+02:00 INFO main.go:221 start] Starting event listener May 17 11:22:51 dhcp46-190.example.com skyring[27755]: 2016-05-17T11:22:51.357+02:00 CRITICAL listener.go:146 StartListener] listen error:listen unix /var/run/.skyring-event: bind: address already in use May 17 11:22:51 dhcp46-190.example.com systemd[1]: skyring.service: main process exited, code=exited, status=1/FAILURE May 17 11:22:51 dhcp46-190.example.com systemd[1]: Unit skyring.service entered failed state. May 17 11:22:51 dhcp46-190.example.com systemd[1]: skyring.service failed. May 17 11:22:56 dhcp46-190.example.com systemd[1]: Stopped Skyring Daemon. ~~~ I'm sorry, actually, it seems that skyring didn't crash, but just stopped as it should in this case. That said, we may need to check if it makes sense to use systemd to track the socket so that the socket would not be forgotten in a case of crash (which is usually the reason why there is some socket file left) and create another BZ for it. Moving the BZ state back. Using rhscon-core-0.0.28-1.el7scon.x86_64 ~~~ # systemctl stop skyring # socat UNIX-LISTEN:/var/run/.skyring-event - ^C # ls -l /var/run/.skyring-event srwxr-xr-x. 1 root root 0 Jul 1 18:59 /var/run/.skyring-event # systemctl start skyring # systemctl status skyring ● skyring.service - Skyring Daemon Loaded: loaded (/usr/lib/systemd/system/skyring.service; enabled; vendor preset: disabled) Active: failed (Result: exit-code) since Fri 2016-07-01 19:01:48 CEST; 2s ago Docs: man:skyring(8) Process: 327 ExecStart=/usr/bin/skyring $OPTIONS (code=exited, status=1/FAILURE) Main PID: 327 (code=exited, status=1/FAILURE) Jul 01 19:01:46 dhcp-126-78.lab.eng.brq.redhat.com systemd[1]: Started Skyring Daemon. Jul 01 19:01:46 dhcp-126-78.lab.eng.brq.redhat.com systemd[1]: Starting Skyring Daemon... Jul 01 19:01:48 dhcp-126-78.lab.eng.brq.redhat.com skyring[327]: 2016-07-01T19:01:48.211+02:00 INFO main.go:214 start] Starting the providers Jul 01 19:01:48 dhcp-126-78.lab.eng.brq.redhat.com skyring[327]: 2016-07-01T19:01:48.212+02:00 INFO skyring.go:98 StartProviders] Config:%!(EXTR...r:3]}]) Jul 01 19:01:48 dhcp-126-78.lab.eng.brq.redhat.com skyring[327]: 2016-07-01T19:01:48.216+02:00 INFO main.go:221 start] Starting event listener Jul 01 19:01:48 dhcp-126-78.lab.eng.brq.redhat.com skyring[327]: 2016-07-01T19:01:48.217+02:00 CRITICAL listener.go:146 StartListener] listen error:... in use Jul 01 19:01:48 dhcp-126-78.lab.eng.brq.redhat.com systemd[1]: skyring.service: main process exited, code=exited, status=1/FAILURE Jul 01 19:01:48 dhcp-126-78.lab.eng.brq.redhat.com systemd[1]: Unit skyring.service entered failed state. Jul 01 19:01:48 dhcp-126-78.lab.eng.brq.redhat.com systemd[1]: skyring.service failed. Hint: Some lines were ellipsized, use -l to show in full. ~~~ To sum this up: skyring fails to start, but no longer crashes. Checking the logs in more details: ~~~ 2016-07-01T19:01:48.211+02:00 INFO main.go:214 start] Starting the providers 2016-07-01T19:01:48.212+02:00 INFO skyring.go:98 StartProviders] Config:%!(EXTRA []conf.ProviderInfo=[{{ceph bigfin 10.1} [] {ceph-installer true false} map[max_metadata_on_ssd:4 min_monitors_in_cluster:3]}]) 2016-07-01T19:01:48.216+02:00 INFO main.go:221 start] Starting event listener 2016-07-01T19:01:48.217+02:00 CRITICAL listener.go:146 StartListener] listen error:listen unix /var/run/.skyring-event: bind: address already in use ~~~ We see that error about existing socket file is reported properly. That said, the start/stop of the daemon is not logged (but there is another bugzilla for that: BZ 1343651) >> VERIFIED Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHEA-2016:1754 |