Bug 2349352 - Puppet agent errors
Summary: Puppet agent errors
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: puppet
Version: 42
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: urgent
Target Milestone: ---
Assignee: Breno
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2025-03-03 04:24 UTC by Ian Dall
Modified: 2025-04-24 03:39 UTC (History)

Fixed In Version: puppet-8.10.0-1.fc43 puppet-8.10.0-1.fc42
Clone Of:
Environment:
Last Closed: 2025-04-15 19:07:15 UTC
Type: ---
Embargoed:


Attachments
Backtrace (369.08 KB, text/plain)
2025-03-03 04:27 UTC, Ian Dall

Description Ian Dall 2025-03-03 04:24:59 UTC
Puppet agent fails, with ruby backtrace, for all (at least most) actions. This is possibly a puppet problem, but downgrading ruby, with the same puppet version fixes the problem.  

Reproducible: Always

Steps to Reproduce:
1. Run `puppet agent -t`
Actual Results:  
See attached backtrace.

Expected Results:  
Successful application of puppet actions

I'm using puppet-8.6.0-2.fc41. Using puppet-8.6.0-3.fc42 exhibits the same problem, but that version of puppet also has problems even with the downgraded ruby.

Comment 1 Ian Dall 2025-03-03 04:27:05 UTC
Created attachment 2078627
Backtrace

This is the stderr output of `puppet agent -t`.

Comment 2 Vít Ondruch 2025-03-03 14:25:35 UTC
It would probably be better if the Puppet folks looked into this.

(In reply to Ian Dall from comment #0)
> Puppet agent fails, with ruby backtrace, for all (at least most) actions.
> This is possibly a puppet problem, but downgrading ruby, with the same
> puppet version fixes the problem.  

Could you please provide details about the Ruby versions you were testing against?

Comment 3 Ian Dall 2025-03-03 22:07:37 UTC
Sorry about missing details.

The issue was experienced with ruby-3.4.2-23.fc42.x86_64

Downgrading to ruby-3.3.5-14.fc41.x86_64 is known to work.

Comment 4 Ewoud Kohl van Wijngaarden 2025-03-04 10:32:22 UTC
The specific backtrace piece:

Error: Execution of 'journalctl -n 50 --since '5 minutes ago' -u sshd --no-pager' returned 1: /usr/share/ruby/vendor_ruby/puppet/util.rb:481:in 'Dir.foreach': Bad file descriptor - closedir (Errno::EBADF)
	from /usr/share/ruby/vendor_ruby/puppet/util.rb:481:in 'block in Puppet::Util.safe_posix_fork'

That is https://github.com/puppetlabs/puppet/blob/e227c27540975c25aa22d533a52424a9d2fc886a/lib/puppet/util.rb#L481

      begin
        Dir.foreach('/proc/self/fd') do |f|
          if f != '.' && f != '..' && f.to_i >= 3
            begin
              IO.new(f.to_i).close
            rescue
              nil
            end
          end
        end
      rescue Errno::ENOENT, Errno::ENOTDIR # /proc/self/fd not found, /proc/self not a dir
        3.upto(256) { |fd|
          begin
            IO.new(fd).close
          rescue
            nil
          end
        }
      end

I have no idea why a file descriptor would be bad here.

Comment 5 Vít Ondruch 2025-03-04 15:01:57 UTC
Maybe the `Errno::EBADF` needs to be rescued now?

Comment 6 Vít Ondruch 2025-03-04 15:10:47 UTC
Maybe this?

https://github.com/ruby/ruby/pull/11393

Comment 7 Ian Dall 2025-03-04 21:32:46 UTC
(In reply to Vít Ondruch from comment #6)
> Maybe this?
> 
> https://github.com/ruby/ruby/pull/11393

That looks promising, but EBADF is only one class of errors in the backtrace. The others all have the pattern 

`'Puppet::Util::Execution.execute': Could not unmask <service>:  (Puppet::Error)`

which I assume implements the equivalent of `systemctl unmask`, which in turn basically involves operations on the file system (removing a symbolic link). The EBADF fixes are in dir.c, and if unmask calls dir.c functions then maybe these errors are fixed as well.

Comment 8 Vít Ondruch 2025-03-05 09:53:09 UTC
I'd say this is actually one backtrace. So fixing the EBADF should fix both issues. But I might be wrong.

Comment 9 Ewoud Kohl van Wijngaarden 2025-03-05 12:55:42 UTC
(In reply to Vít Ondruch from comment #6)
> Maybe this?
> 
> https://github.com/ruby/ruby/pull/11393

That looks very promising indeed.

What I'm wondering about is that IO.new(f.to_i).close is guarded by rescue without any specific exception. Also, line 481 is Dir.foreach('/proc/self/fd') so I'm wondering if the loop itself is closing the directory opened by Dir.foreach and then it raises the exception. That would also imply that the code was never really correct to begin with.
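
A quick way to check that hypothesis without closing anything else is to ask whether the descriptor behind an open directory handle appears in its own listing. This is only a sketch: it assumes Linux with /proc mounted, and `dir_fd_listed?` is a hypothetical helper name, not something from Puppet.

```ruby
# Hypothetical helper, not part of Puppet: reports whether the descriptor
# backing an open Dir handle shows up in its own /proc/self/fd listing.
# Assumes Linux with /proc mounted.
def dir_fd_listed?
  dir = Dir.new('/proc/self/fd')
  begin
    # each_child skips '.' and '..'; the remaining entries are fd numbers.
    dir.each_child.map(&:to_i).include?(dir.fileno)
  ensure
    # Close only the handle this helper opened.
    dir.close
  end
end

puts "directory's own fd listed: #{dir_fd_listed?}"
```

If the helper reports true, a loop that closes every descriptor numbered 3 or higher will also close the handle it is iterating over.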

I tried a small reproducer:

#!/usr/bin/env ruby

Dir.foreach('/proc/self/fd') do |f|
  if f != '.' && f != '..' && f.to_i >= 3
    begin
      puts "Closing #{f}"
      IO.new(f.to_i).close
    rescue
      nil
    end
  end
end

On my Fedora 41 with Ruby 3.3 this passes but on Rawhide it fails with:

# ruby test.rb 
Closing 3
Closing 4
Closing 5
test.rb:3:in 'Dir.foreach': Bad file descriptor - closedir (Errno::EBADF)
	from test.rb:3:in '<main>'
test.rb:3:in 'Dir.foreach': Bad file descriptor - readdir (Errno::EBADF)
	from test.rb:3:in '<main>'

And that is exactly the behavior we're seeing here.

If I modify the code to exclude the fd of the open directory:

#!/usr/bin/env ruby

d = Dir.new('/proc/self/fd')
d.each_child do |f|
  if f != '.' && f != '..' && f.to_i >= 3 && f.to_i != d.fileno
    begin
      puts "Closing #{f}"
      IO.new(f.to_i).close
    rescue
      nil
    end
  end
end

Then it passes on Rawhide. Notice we no longer close fd 5:

# ruby test.rb 
Closing 3
Closing 4
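
The exclusion can also be sketched as a helper that only computes which descriptors would be closed, which makes the behaviour easy to inspect without closing anything. The name `closable_fds` and its `min_fd` parameter are hypothetical and not part of Puppet; the sketch assumes Linux with /proc mounted.

```ruby
# Hypothetical helper, not part of Puppet: returns the descriptors numbered
# min_fd or higher, leaving out the one used to read /proc/self/fd itself.
# Assumes Linux with /proc mounted.
def closable_fds(min_fd = 3)
  Dir.open('/proc/self/fd') do |dir|
    own_fd = dir.fileno
    # each_child skips '.' and '..', so every entry is a descriptor number.
    dir.each_child.map(&:to_i).select { |fd| fd >= min_fd && fd != own_fd }
  end
end

# Usage: open an extra descriptor and confirm it is reported.
file = File.open('/dev/null')
puts "would close: #{closable_fds.inspect} (includes #{file.fileno})"
file.close
```

Separating "which descriptors" from "close them" keeps the directory handle out of the close loop by construction.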

Comment 10 Cédric Bellegarde 2025-03-21 18:29:08 UTC
I can confirm that this patch fixes the issue:

--- /usr/share/ruby/vendor_ruby/puppet/util.rb	1970-01-01 01:00:00.000000000 +0100
+++ util.rb	2025-03-21 18:09:53.326824547 +0100
@@ -478,8 +478,9 @@
       $stderr = STDERR
 
       begin
-        Dir.foreach('/proc/self/fd') do |f|
-          if f != '.' && f != '..' && f.to_i >= 3
+        d = Dir.new('/proc/self/fd')
+        d.each_child do |f|
+          if f != '.' && f != '..' && f.to_i >= 3 && f.to_i != d.fileno
             begin
               IO.new(f.to_i).close
             rescue


On Silverblue, as a workaround:

# mount --bind util.rb /usr/share/ruby/vendor_ruby/puppet/util.rb

Comment 11 Cédric Bellegarde 2025-04-14 10:19:21 UTC
Any hope for a fix before Fedora 42 release?

We depend on Puppet at my university.

Comment 12 Fedora Update System 2025-04-15 17:45:29 UTC
FEDORA-2025-43731b849e (puppet-8.10.0-1.fc43) has been submitted as an update to Fedora 43.
https://bodhi.fedoraproject.org/updates/FEDORA-2025-43731b849e

Comment 13 Fedora Update System 2025-04-15 19:07:15 UTC
FEDORA-2025-43731b849e (puppet-8.10.0-1.fc43) has been pushed to the Fedora 43 stable repository.
If the problem still persists, please make note of it in this bug report.

Comment 14 Fedora Update System 2025-04-15 21:19:11 UTC
FEDORA-2025-13f2e70c1c (puppet-8.10.0-1.fc42) has been submitted as an update to Fedora 42.
https://bodhi.fedoraproject.org/updates/FEDORA-2025-13f2e70c1c

Comment 15 Fedora Update System 2025-04-16 01:04:25 UTC
FEDORA-2025-13f2e70c1c has been pushed to the Fedora 42 testing repository.
Soon you'll be able to install the update with the following command:
`sudo dnf upgrade --enablerepo=updates-testing --refresh --advisory=FEDORA-2025-13f2e70c1c`
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2025-13f2e70c1c

See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.

Comment 16 Fedora Update System 2025-04-24 03:39:05 UTC
FEDORA-2025-13f2e70c1c (puppet-8.10.0-1.fc42) has been pushed to the Fedora 42 stable repository.
If the problem still persists, please make note of it in this bug report.
