Bug 769792

Summary: puppet consumes CPU in ERESTARTNOHAND loop
Product: [Fedora] Fedora EPEL
Reporter: Ade Rixon <ade.rixon>
Component: puppet
Assignee: Jeroen van Meeuwen <vanmeeuwen+fedora>
Status: CLOSED DEFERRED
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: low
Priority: unspecified
Version: el5
CC: k.georgiou, ktdreyer, tmz, vanmeeuwen+fedora
Hardware: All   
OS: Linux   
Doc Type: Bug Fix
Last Closed: 2013-03-18 15:03:10 UTC

Description Ade Rixon 2011-12-22 09:25:39 UTC
Description of problem:
The Puppet client (0.25.5 and latest 2.6.12) consumes low but measurable amounts of CPU while idle. Running strace on it shows the following loop:

select(4, [3], [], [], {10, 130000})    = ? ERESTARTNOHAND (To be restarted)
--- SIGVTALRM (Virtual timer expired) @ 0 (0) ---
rt_sigreturn(0x1a)                      = -1 EINTR (Interrupted system call)
select(4, [3], [], [], {10, 110000})    = ? ERESTARTNOHAND (To be restarted)
--- SIGVTALRM (Virtual timer expired) @ 0 (0) ---
rt_sigreturn(0x1a)                      = -1 EINTR (Interrupted system call)
(ad infinitum)

Over time, this leads Puppet to consume an inordinate amount of system I/O wait time (approaching 100% in top) and require a restart.
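The pattern in the trace (a blocking select() that is interrupted by a periodic timer signal and then re-entered with a slightly shorter timeout) can be reproduced outside Puppet. The following is only a minimal sketch in Python, not Puppet or Ruby code. It assumes a 10 ms tick and uses SIGALRM with ITIMER_REAL as a stand-in for the SIGVTALRM seen in the trace; the function name count_wakeups and its parameters are made up for illustration. Python 3.5 and later retry an interrupted select() with the remaining timeout, which resembles the shrinking timeouts in the strace output.

```python
import os
import select
import signal
import time


def count_wakeups(duration=0.3, interval=0.01):
    """Block in select() for `duration` seconds while a periodic timer fires.

    Returns (wakeups, elapsed_seconds, ready_fds). Must run in the main thread.
    """
    wakeups = 0

    def on_tick(signum, frame):
        # Each delivery interrupts the blocked select() once.
        nonlocal wakeups
        wakeups += 1

    # Nothing is ever written to the pipe, so select() can only time out.
    read_fd, write_fd = os.pipe()
    previous = signal.signal(signal.SIGALRM, on_tick)
    # Assumption: a 10 ms wall-clock tick stands in for the scheduler timer.
    signal.setitimer(signal.ITIMER_REAL, interval, interval)
    try:
        start = time.monotonic()
        ready, _, _ = select.select([read_fd], [], [], duration)
        elapsed = time.monotonic() - start
    finally:
        signal.setitimer(signal.ITIMER_REAL, 0)  # disarm the timer
        signal.signal(
            signal.SIGALRM,
            previous if previous is not None else signal.SIG_DFL,
        )
        os.close(read_fd)
        os.close(write_fd)
    return wakeups, elapsed, ready


if __name__ == "__main__":
    wakeups, elapsed, ready = count_wakeups()
    print(f"select() was interrupted {wakeups} times in {elapsed:.2f}s")
```

Running this under strace should show select() returning early on every tick, so an "idle" process still wakes up many times per second. Whether this is the exact mechanism behind the Puppet loop is not confirmed here.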

This appears to be Puppet bug #1539 around network partitioning, which is allegedly caused by a Ruby pthreads issue (#2553) that may or may not be addressed in later 1.8.7 releases. As Puppet is normally expected to be a long-running process, is there any chance of backporting an upstream fix to RHEL?

Version-Release number of selected component (if applicable):
0.25.5, 2.6.12

How reproducible:
Install, configure and run Puppet client against remote Puppet master.

Steps to Reproduce:
1. service puppet start
2. strace -p `pgrep puppet`
Actual results:
The same select()/SIGVTALRM loop shown in the description repeats indefinitely.

Expected results:
Process sleeping when not executing a Puppet run.
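To check whether the process is really sleeping between runs, its accumulated CPU time can be sampled twice from /proc and compared. This is a rough sketch that assumes a Linux /proc filesystem; the helper names parse_cpu_ticks, cpu_seconds and idle_cpu_usage are hypothetical, and the 60-second window is an arbitrary choice.

```python
import os
import time


def parse_cpu_ticks(stat_line):
    """Return (utime, stime) in clock ticks from a /proc/<pid>/stat line."""
    # The command name is wrapped in parentheses and may contain spaces,
    # so cut at the last ')' before splitting on whitespace.
    fields = stat_line.rsplit(")", 1)[1].split()
    # fields[0] is the process state (field 3 in proc(5));
    # utime and stime are fields 14 and 15, i.e. indexes 11 and 12 here.
    return int(fields[11]), int(fields[12])


def cpu_seconds(pid):
    """Total user plus system CPU time consumed by `pid`, in seconds."""
    with open(f"/proc/{pid}/stat") as handle:
        utime, stime = parse_cpu_ticks(handle.read())
    return (utime + stime) / os.sysconf("SC_CLK_TCK")


def idle_cpu_usage(pid, window=60.0):
    """CPU seconds consumed by `pid` during a `window`-second observation."""
    before = cpu_seconds(pid)
    time.sleep(window)
    return cpu_seconds(pid) - before
```

A truly sleeping process should report a value at or very near zero for the window; a process stuck in the loop above would be expected to show a small but steadily nonzero figure.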

Additional info:

Comment 1 Todd Zullinger 2013-03-18 15:03:10 UTC
If this bug still affects current releases, please reopen and change the product/component to RHEL/ruby.  Thanks.