Bug 1265798

Summary: "systemctl start ntpd.service" hung
Product: Red Hat Enterprise Linux 7
Reporter: Jan Stancek <jstancek>
Component: systemd
Assignee: systemd-maint
Status: CLOSED DUPLICATE
QA Contact: qe-baseos-daemons
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: 7.2
CC: jburke, jstancek, lnykryn, systemd-maint-list
Target Milestone: rc   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2015-09-25 12:00:47 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Attachments:
systemctl core (captured with gcore) (flags: none)

Description Jan Stancek 2015-09-23 19:05:52 UTC
Description of problem:
"systemctl start ntpd.service" didn't complete in ~2 hours.

ntpd is still reported as "inactive" with no sign that it's starting:

# systemctl status ntpd
● ntpd.service - Network Time Service
   Loaded: loaded (/usr/lib/systemd/system/ntpd.service; disabled; vendor preset: disabled)
   Active: inactive (dead) since Wed 2015-09-23 11:17:10 EDT; 3h 27min ago
  Process: 19146 ExecStart=/usr/sbin/ntpd -u ntp:ntp $OPTIONS (code=exited, status=0/SUCCESS)
 Main PID: 19147 (code=exited, status=0/SUCCESS)

Sep 23 10:50:34 ibm-p720-01-lp5.rhts.eng.bos.redhat.com ntpd[19147]: 0.0.0.0 0615 05 clock_sync
Sep 23 11:00:30 ibm-p720-01-lp5.rhts.eng.bos.redhat.com ntpd[19147]: Listen normally on 9 eth0 2002:102:304:1234:5ef3:fcff:fe85:b71 UDP 123
Sep 23 11:00:30 ibm-p720-01-lp5.rhts.eng.bos.redhat.com ntpd[19147]: new interface(s) found: waking up resolver
Sep 23 11:05:40 ibm-p720-01-lp5.rhts.eng.bos.redhat.com ntpd[19147]: Deleting interface #9 eth0, 2002:102:304:1234:5ef3:fcff:fe85:b71#123, interface sta...10 secs
Sep 23 11:09:28 ibm-p720-01-lp5.rhts.eng.bos.redhat.com ntpd[19147]: Listen normally on 10 eth0 2002:102:304:1234:5ef3:fcff:fe85:b71 UDP 123
Sep 23 11:09:28 ibm-p720-01-lp5.rhts.eng.bos.redhat.com ntpd[19147]: new interface(s) found: waking up resolver
Sep 23 11:14:42 ibm-p720-01-lp5.rhts.eng.bos.redhat.com ntpd[19147]: Deleting interface #10 eth0, 2002:102:304:1234:5ef3:fcff:fe85:b71#123, interface st...14 secs
Sep 23 11:17:09 ibm-p720-01-lp5.rhts.eng.bos.redhat.com ntpd[19147]: ntpd exiting on signal 15
Sep 23 11:17:09 ibm-p720-01-lp5.rhts.eng.bos.redhat.com systemd[1]: Stopping Network Time Service...
Sep 23 11:17:10 ibm-p720-01-lp5.rhts.eng.bos.redhat.com systemd[1]: Stopped Network Time Service.
Hint: Some lines were ellipsized, use -l to show in full.

Here's part of the process tree:

23658 ?        S      0:00              \_ make run
 9840 ?        S      0:00              |   \_ /bin/bash ./runtest.sh
15165 ?        S      0:00              |       \_ /bin/systemctl start ntpd.service

I managed to capture the journalctl output and a core from systemctl before the harness watchdog rebooted the system.
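
For reference, a capture like the one described above could be scripted roughly as follows. This is only a sketch, not what was actually run on the affected machine: the function name capture_hang_data and the default output directory are hypothetical, and it assumes gcore (from gdb) and coreutils timeout are installed. The list-jobs call is wrapped in timeout because systemctl may itself block if PID 1 is not responding.

```shell
#!/bin/sh
# Sketch: collect diagnostics from a hung "systemctl start" client.
# capture_hang_data and the default output directory are hypothetical names.

capture_hang_data() {
    pid="$1"
    out_dir="${2:-/var/tmp/systemctl-hang}"

    mkdir -p "$out_dir" || return 1

    # Journal of the current boot, including messages from systemd (PID 1).
    journalctl -b --no-pager > "$out_dir/journal.txt" 2>&1

    # Jobs currently queued in systemd; give up after 30 seconds so that
    # an unresponsive PID 1 does not block the capture itself.
    timeout 30 systemctl list-jobs --no-pager > "$out_dir/list-jobs.txt" 2>&1

    # Core of the hung client; gcore writes <prefix>.<pid>.
    gcore -o "$out_dir/systemctl-core" "$pid" > "$out_dir/gcore.log" 2>&1
}
```

With the process tree shown above, it would be invoked as `capture_hang_data 15165`, or with the PID looked up via `pgrep -o -f 'systemctl start ntpd.service'`.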

Version-Release number of selected component (if applicable):
systemd-219-14.el7

How reproducible:
single instance

Steps to Reproduce:
unknown

Actual results:
systemctl start hangs

Expected results:
systemctl doesn't hang

Additional info:

Comment 2 Jan Stancek 2015-09-23 19:07:34 UTC
Created attachment 1076288
systemctl core (captured with gcore)

Comment 4 Lukáš Nykrýn 2015-09-25 07:35:20 UTC
Hmm, these lines are the real issue:

Sep 23 12:36:35 ibm-p720-01-lp5.rhts.eng.bos.redhat.com systemd[1]: Looping too fast. Throttling execution a little.

Do you have a reproducer?
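
To see how often PID 1 logged that warning, something like the following might help. It is a minimal sketch: the helper name count_looping_warnings is hypothetical, and it simply counts matching lines in whatever journal text is piped into it.

```shell
#!/bin/sh
# Sketch: count "Looping too fast" warnings in journal text read from stdin.
# count_looping_warnings is a hypothetical helper name.

count_looping_warnings() {
    # grep -c prints the number of matching lines (0 if there are none).
    grep -c 'Looping too fast'
}
```

For the current boot this could be fed with `journalctl -b _PID=1 --no-pager | count_looping_warnings`.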

Comment 5 Jan Stancek 2015-09-25 07:42:53 UTC
(In reply to Lukáš Nykrýn from comment #4)
> Hmm those lines are the real issue:
> 
> Sep 23 12:36:35 ibm-p720-01-lp5.rhts.eng.bos.redhat.com systemd[1]: Looping
> too fast. Throttling execution a little.
> 
> Do you have a reproducer?

No. Any hints on what data to grab in case I see it again?

Comment 6 Lukáš Nykrýn 2015-09-25 07:59:12 UTC
Unfortunately, no. The sequence of events in a reproducer could lead us to the culprit.

Comment 7 Lukáš Nykrýn 2015-09-25 12:00:47 UTC

*** This bug has been marked as a duplicate of bug 1266479 ***