Bug 1109513 - firewall-cmd makes the machine unreachable after installing polkit
Summary: firewall-cmd makes the machine unreachable after installing polkit
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: dbus
Version: 7.0
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: rc
Target Release: ---
Assignee: David King
QA Contact: Desktop QE
URL:
Whiteboard:
Depends On: 1098866 1099031
Blocks:
Reported: 2014-06-14 18:20 UTC by Attila Fazekas
Modified: 2019-09-11 17:37 UTC (History)
CC List: 19 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 1099031
Environment:
Last Closed: 2019-09-11 17:37:40 UTC
Target Upstream Version:
Embargoed:


Attachments
firewalld-polit-dbus.txt (21.16 KB, text/plain)
2014-06-14 18:20 UTC, Attila Fazekas

Description Attila Fazekas 2014-06-14 18:20:44 UTC
Created attachment 908818
firewalld-polit-dbus.txt

+++ This bug was initially created as a clone of Bug #1099031 +++

+++ This bug was initially created as a clone of Bug #1098866 +++

Description of problem:

I am having this problem on a Rackspace "Fedora 20 (Heisenbug) (PVHVM)" image.

I can see "firewall-cmd --state" is hanging, and it causes libvirtd startup to hang as well.

You can replicate this with 

"firewall-cmd --state && yum install -y libvirt && firewall-cmd --state"

The last firewall-cmd command will hang.

I've found that restarting dbus *or* firewalld after installing libvirt will solve the problem, so something about the libvirt install gets it into a funny state.  

The next weird thing happened while I was trying to diagnose this: I strace'd firewalld while running firewall-cmd and saw it catching an exception.  After killing the hung firewall-cmd process and running it again, I started getting that exception on the command line, e.g.

---
[root@cloud-server-01 ~]# firewall-cmd --state


^\Quit
[root@cloud-server-01 ~]# firewall-cmd --state
Error: Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/dbus/service.py", line 707, in _message_cb
    retval = candidate_method(self, *args, **keywords)
  File "/usr/lib/python2.7/site-packages/slip/dbus/service.py", line 153, in wrapped_method
    reply_handler=reply_handler, error_handler=error_handler)
  File "/usr/lib/python2.7/site-packages/slip/dbus/polkit.py", line 292, in IsSystemBusNameAuthorizedAsync
    details)
  File "/usr/lib/python2.7/site-packages/slip/dbus/polkit.py", line 276, in IsSystemBusNameAuthorizedAsync
    timeout=method_call_no_timeout)
  File "/usr/lib/python2.7/site-packages/dbus/proxies.py", line 137, in __call__
    **keywords)
  File "/usr/lib/python2.7/site-packages/dbus/connection.py", line 584, in call_async
    message.append(signature=signature, *args)
ValueError: Unable to guess signature from an empty dict

---

I will attach an strace of the first "good" run and one of the run that hangs after libvirt install.  I'll also attach lsof output of firewalld before & after

--- Additional comment from Ian Wienand on 2014-05-19 01:42:59 EDT ---



--- Additional comment from Ian Wienand on 2014-05-19 03:34:08 EDT ---



--- Additional comment from Ian Wienand on 2014-05-19 03:35:11 EDT ---



--- Additional comment from Ian Wienand on 2014-05-19 03:35:45 EDT ---



--- Additional comment from Daniel Berrange on 2014-05-19 06:01:53 EDT ---

Can you check whether there are any suspect kernel messages in dmesg. There have been some kernel bugs I've seen in F20, which cause firewalld (and indeed iptables in general) to hang - usually there is a kernel bug logged to dmesg when this happens.

--- Additional comment from Ian Wienand on 2014-05-19 06:30:10 EDT ---

(In reply to Daniel Berrange from comment #5)
> Can you check whether there are any suspect kernel messages in dmesg.

Nothing in dmesg, but yes there is something in the logs that looks suspicious

---
May 19 10:24:23 cloud-server-01 dbus-daemon[282]: dbus[282]: [system] Rejected send message, 1 matched rules; type="method_call", sender=":1.4" (uid=0 pid=277 comm="/usr/bin/python /usr/sbin/firewalld --nofork --nop") interface="org.freedesktop.DBus.Introspectable" member="Introspect" error name="(unset)" requested_reply="0" destination=":1.16" (uid=998 pid=2314 comm="/usr/lib/polkit-1/polkitd --no-debug ")
May 19 10:24:23 cloud-server-01 dbus[282]: [system] Rejected send message, 1 matched rules; type="method_call", sender=":1.4" (uid=0 pid=277 comm="/usr/bin/python /usr/sbin/firewalld --nofork --nop") interface="org.freedesktop.DBus.Introspectable" member="Introspect" error name="(unset)" requested_reply="0" destination=":1.16" (uid=998 pid=2314 comm="/usr/lib/polkit-1/polkitd --no-debug ")
---
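
A minimal sketch for pulling the interesting fields out of a "Rejected send message" line like the ones above, which can help when comparing many such entries. The names `parse_rejection`, `_FIELD` and `_PROC` are hypothetical, and the patterns assume the exact layout of the lines quoted in this comment (sender details first, destination details second); other dbus-daemon versions may format the message differently.

```python
import re

# key="value" pairs; a key may consist of two words, e.g. error name="(unset)".
_FIELD = re.compile(r'([a-z_]+(?: [a-z_]+)?)="([^"]*)"')
# Parenthesised process details, e.g. (uid=0 pid=277 comm="...").
_PROC = re.compile(r'\(uid=(\d+) pid=(\d+) comm="([^"]*)"\)')


def parse_rejection(line):
    """Return the fields of a 'Rejected send message' line, or None."""
    if "Rejected send message" not in line:
        return None
    # Process details in order of appearance (assumed: sender, then destination).
    processes = [
        {"uid": int(uid), "pid": int(pid), "comm": comm.strip()}
        for uid, pid, comm in _PROC.findall(line)
    ]
    # Drop the process details first so comm="..." is not mistaken for a field.
    fields = dict(_FIELD.findall(_PROC.sub("", line)))
    return {"fields": fields, "processes": processes}


# Second log line from the comment above, copied verbatim.
SAMPLE = (
    'May 19 10:24:23 cloud-server-01 dbus[282]: [system] Rejected send message, '
    '1 matched rules; type="method_call", sender=":1.4" '
    '(uid=0 pid=277 comm="/usr/bin/python /usr/sbin/firewalld --nofork --nop") '
    'interface="org.freedesktop.DBus.Introspectable" member="Introspect" '
    'error name="(unset)" requested_reply="0" destination=":1.16" '
    '(uid=998 pid=2314 comm="/usr/lib/polkit-1/polkitd --no-debug ")'
)

if __name__ == "__main__":
    info = parse_rejection(SAMPLE)
    print(info["fields"]["member"], "rejected for", info["fields"]["destination"])
```

For the sample line this identifies firewalld (pid 277) as the sender and polkitd (pid 2314) as the destination of the rejected Introspect call.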

--- Additional comment from Ian Wienand on 2014-05-19 06:46:11 EDT ---

Ok, so trying again, installing the polkit package causes the problem to appear

---
[root@cloud-server-01 ~]# yum install polkit

 ...

  Installing : mozjs17-17.0.0-8.fc20.x86_64                                                                                          1/3 
  Installing : polkit-0.112-2.fc20.x86_64                                                                                            2/3 
  Installing : polkit-pkla-compat-0.1-3.fc20.x86_64                                                                                  3/3 
[root@cloud-server-01 ~]# firewall-cmd --state
 ... hang...
---

to be clear -- replication is 

1) boot Rackspace "Fedora 20 (Heisenbug) (PVHVM)"
2) yum install polkit
3) firewall-cmd --state will hang after this until firewalld or dbus is restarted
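
Because step 3 blocks forever, an automated check needs a timeout. The following is a rough sketch of such a check, not something used in this report: the helper name `hangs` and the 10 second default are assumptions, and any command can be passed in place of `firewall-cmd --state`.

```python
import subprocess


def hangs(argv, timeout_s=10.0):
    """Return True if the command is still running after timeout_s seconds."""
    try:
        # subprocess.run kills the child itself before raising TimeoutExpired.
        subprocess.run(
            argv,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        return True
    return False
```

On an affected machine, `hangs(["firewall-cmd", "--state"])` would be expected to return False before step 2 and True after it, until firewalld or dbus is restarted.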

--- Additional comment from Miloslav Trmač on 2014-05-19 18:56:02 EDT ---

Thanks for your report.

This is something specific to firewalld (or the python-slip logic):

Directly after installing polkit, (pkaction) works fine (i.e. polkit basically works), and then
> dbus-send  --system --dest=org.freedesktop.PolicyKit1 --print-reply /org/freedesktop/PolicyKit1/Authority org.freedesktop.DBus.Introspectable.Introspect
works fine, and the same call using --dest=:1.$the_right_unique_name works as well.

OTOH firewalld/python-slip detects NameOwnerChanged, determines $the_right_unique_name, but AFAICS the same Introspect method call fails with an error.
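
The dbus-send comparison above (well-known name versus unique name) could be scripted roughly as follows. This is only a sketch: `introspect_argv` and `introspect_ok` are hypothetical helper names, the unique name has to be supplied by the caller, and a zero exit status from dbus-send is assumed to mean the call was answered.

```python
import subprocess

AUTHORITY_PATH = "/org/freedesktop/PolicyKit1/Authority"
INTROSPECT = "org.freedesktop.DBus.Introspectable.Introspect"


def introspect_argv(dest):
    """Build the dbus-send command line quoted above for one destination."""
    return [
        "dbus-send",
        "--system",
        f"--dest={dest}",
        "--print-reply",
        AUTHORITY_PATH,
        INTROSPECT,
    ]


def introspect_ok(dest, timeout_s=10.0):
    """Return True if dbus-send exits with status 0 within timeout_s seconds."""
    try:
        result = subprocess.run(
            introspect_argv(dest),
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,
        )
    except (subprocess.TimeoutExpired, FileNotFoundError):
        # A hang or a missing dbus-send binary both count as "did not work".
        return False
    return result.returncode == 0
```

Calling `introspect_ok("org.freedesktop.PolicyKit1")` and then `introspect_ok` with the unique name reported by GetNameOwner mirrors the two manual calls; note that this runs as a fresh bus client each time, so it does not reproduce what the long-running firewalld process sees.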

(For reproducing, note that even a minimal Fedora install includes polkit: you’ll have to remove it (and possibly set up init.d/network instead of NM)).

--- Additional comment from Jiri Popelka on 2014-05-20 06:12:37 EDT ---

Looks like a python-slip problem; we've seen a similar bug, #895067, in the past.

--- Additional comment from Nils Philippsen on 2014-05-20 10:59:54 EDT ---

I don't think this is a bug in python-slip -- merely triggered by its trying to find out if polkitd exists on the dbus system bus.

Here's my take on what likely happens:

(In reply to Ian Wienand from comment #0)
> --- Additional comment from Ian Wienand on 2014-05-19 06:30:10 EDT ---
> 
> (In reply to Daniel Berrange from comment #5)
> > Can you check whether there are any suspect kernel messages in dmesg.
> 
> Nothing in dmesg, but yes there is something in the logs that looks
> suspicious
> 
> ---
> May 19 10:24:23 cloud-server-01 dbus-daemon[282]: dbus[282]: [system]
> Rejected send message, 1 matched rules; type="method_call", sender=":1.4"
> (uid=0 pid=277 comm="/usr/bin/python /usr/sbin/firewalld --nofork --nop")
> interface="org.freedesktop.DBus.Introspectable" member="Introspect" error
> name="(unset)" requested_reply="0" destination=":1.16" (uid=998 pid=2314
> comm="/usr/lib/polkit-1/polkitd --no-debug ")
> May 19 10:24:23 cloud-server-01 dbus[282]: [system] Rejected send message, 1
> matched rules; type="method_call", sender=":1.4" (uid=0 pid=277
> comm="/usr/bin/python /usr/sbin/firewalld --nofork --nop")
> interface="org.freedesktop.DBus.Introspectable" member="Introspect" error
> name="(unset)" requested_reply="0" destination=":1.16" (uid=998 pid=2314
> comm="/usr/lib/polkit-1/polkitd --no-debug ")
> ---

Dbus-python usually introspects methods before invoking them -- see http://dbus.freedesktop.org/doc/dbus-python/doc/tutorial.txt "Data Types" -- to figure out how to convert the python arguments.

This introspection is prohibited by dbus-daemon and I can only guess why:

1) When firewalld is first started, /etc/dbus-1/system.d/org.freedesktop.PolicyKit1.conf doesn't yet exist (because polkitd isn't installed).
2) Therefore dbus-daemon doesn't yet know about the org.freedesktop.PolicyKit1 destination and that anybody may send to it.
3) In order to be able to provide a fallback if polkit is not available (root may do everything, other users nothing), firewalld (via slip.dbus) "pings" the org.freedesktop.PolicyKit1.Authority interface which polkitd would provide, were it running.
4) Assumption: dbus-daemon prohibits that ping, and caches the result ("<firewalld process> may not send to <polkit destination>").
5) The polkit package gets installed, /etc/dbus-1/system.d/org.freedesktop.PolicyKit1.conf is available.
6) Firewalld (again via slip.dbus) notices polkitd appearing on the bus (via the NameOwnerChanged signal), successfully attempts to get hold of the org.freedesktop.PolicyKit1.Authority interface object (that may not even touch the bus, I'm not sure about dbus internals here), but the subsequent introspection of the method fails due to 4.
7) Restarting a) dbus-daemon or b) firewalld causes a) the complete bus policy to get reevaluated or b) firewalld being a new process makes dbus-daemon not use any cached information, therefore things work again.

If this is what happens (and it's the best explanation I have ;-), then dbus-daemon should forget about the cached permissions from step 4 when it applies the polkit dbus policy (/etc/dbus-1/system.d/org.freedesktop.PolicyKit1.conf) elsewhere. Changing component to dbus.
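
To make the guessed sequence in steps 1 to 7 easier to follow, here is a toy model of it. It only illustrates the hypothesis; it is not how dbus-daemon is actually implemented, and `ToyBus`, its methods and the connection names used below are all made up for illustration.

```python
class ToyBus:
    """Toy model of the hypothesised per-connection decision cache."""

    def __init__(self):
        self.allowed = set()  # destinations permitted by the loaded policy
        self._cache = {}      # (connection, destination) -> cached decision

    def load_policy(self, destination, flush_cache=False):
        """Model a new policy file appearing (step 5)."""
        self.allowed.add(destination)
        if flush_cache:
            # The behaviour proposed above: forget stale decisions.
            self._cache.clear()

    def may_send(self, connection, destination):
        """Decide once per (connection, destination) and remember it (step 4)."""
        key = (connection, destination)
        if key not in self._cache:
            self._cache[key] = destination in self.allowed
        return self._cache[key]


def scenario(flush_cache):
    bus = ToyBus()
    polkit = "org.freedesktop.PolicyKit1"
    before = bus.may_send("firewalld-old", polkit)    # step 3: ping is refused
    bus.load_policy(polkit, flush_cache=flush_cache)  # step 5: polkit installed
    after = bus.may_send("firewalld-old", polkit)     # step 6: same process
    fresh = bus.may_send("firewalld-new", polkit)     # step 7b: restarted process
    return before, after, fresh
```

In this model `scenario(False)` leaves the old firewalld connection refused while a restarted one is allowed, matching the observed symptoms, and `scenario(True)` shows the effect of flushing the cache when the policy is applied.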

--- Additional comment from Colin Walters on 2014-05-20 11:18:45 EDT ---

Did you try sending SIGHUP to dbus-daemon after #5 ?  

The messy thing here is that dbus tries to use inotify but it's going to race with the new polkit process starting.

--- Additional comment from Miloslav Trmač on 2014-05-20 11:56:20 EDT ---

(In reply to Nils Philippsen from comment #3)
> 4) Assumption: dbus-daemon prohibits that ping, and caches the result
> ("<firewalld process> may not send to <polkit destination>").
> 5) The polkit package gets installed,
> /etc/dbus-1/system.d/org.freedesktop.PolicyKit1.conf is available.
> 6) Firewalld (again via slip.dbus) notices polkitd appearing on the bus (via
> the NameOwnerChanged signal), successfully attempts to get hold of the
> org.freedesktop.PolicyKit1.Authority interface object (that may not even
> touch the bus, I'm not sure about dbus internals here), but the subsequent
> introspection of the method fails due to 4.

The 4) cache might be an explanation.  I was looking at strace() of firewalld, and firewalld does get a NameOwnerChanged signal and does look up the “new” owner of PolicyKit1 by calling GetNameOwner, and only _then_ calls Introspect.

And I was able to use dbus-send to _successfully_ call Introspect even before invoking the firewalld command=>server code, which rules out a race in rereading the configuration or in name ownership; there was something different between a dbus-send caller and firewalld caller[1], and that cache might be an explanation.

[1] Or perhaps my reading of the strace was wrong and I wasn’t sending the right dbus-send command.

--- Additional comment from Ian Wienand on 2014-06-11 18:30:23 EDT ---

Sorry but this has become a high-priority issue.

The problem is that the Rackspace image used by upstream OpenStack has firewalld installed, while other cloud images don't.  Thus we need work-arounds in devstack to work in all situations.

Upstream is not too happy about having Fedora-specific work-arounds [1], and this is causing breakage.

[1] https://review.openstack.org/#/c/99047/1

--- Additional comment from Nils Philippsen on 2014-06-12 05:30:12 EDT ---

Let me ask the heretical question: why is polkit installed after the fact and not shipped with it?

--- Additional comment from Ian Wienand on 2014-06-12 06:02:16 EDT ---

(In reply to Nils Philippsen from comment #7)
> Let me ask the heretical question: why is polkit installed after the fact
> and not shipped with it?

I don't know; it's the standard Rackspace F20 image ("Fedora 20 (Heisenbug) (PVHVM)").  It seems to differ from the standard image in a few ways; perhaps it has something to do with issues they found running on Xen.


--- --- --- --- --- --- ---
I tried a similar thing on rhel-guest-image-7.0-20140506.1.

I had to uninstall polkit at the beginning, and at the end the machine became unreachable!!!

See firewalld-polit-dbus.txt for how to reproduce the issue.

Comment 3 andrew 2016-04-23 19:25:17 UTC
This also happens on the latest RHEL 7.2.

This happens when polkit is not installed as an initial package with kickstart.

# firewall-cmd --state && yum install -y polkit && firewall-cmd --state

Second firewall-cmd hangs indefinitely until dbus or firewalld is restarted.

