Bug 1098866 - firewall-cmd hang causes libvirtd startup to fail
Keywords:
Status: CLOSED EOL
Alias: None
Product: Fedora
Classification: Fedora
Component: libvirt
Version: 20
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Libvirt Maintainers
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks: 1099031 1109513
Reported: 2014-05-19 05:41 UTC by Ian Wienand
Modified: 2015-06-29 20:42 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Clones: 1099031
Environment:
Last Closed: 2015-06-29 20:42:12 UTC
Type: Bug
Embargoed:


Attachments
strace of a successful run of firewall-cmd --state (339.68 KB, text/plain)
2014-05-19 05:42 UTC, Ian Wienand
strace of a hanging run of firewall-cmd --state (340.55 KB, text/plain)
2014-05-19 07:34 UTC, Ian Wienand
lsof of firewalld before command hangs (7.86 KB, text/plain)
2014-05-19 07:35 UTC, Ian Wienand
lsof of firewalld after command hangs (7.86 KB, text/plain)
2014-05-19 07:35 UTC, Ian Wienand

Description Ian Wienand 2014-05-19 05:41:10 UTC
Description of problem:

I am having this problem on a Rackspace "Fedora 20 (Heisenbug) (PVHVM)" image.

I can see "firewall-cmd --state" is hanging, and it causes libvirtd startup to hang as well.

You can replicate this with 

"firewall-cmd --state && yum install -y libvirt && firewall-cmd --state"

The last firewall-cmd command will hang.

I've found that restarting dbus *or* firewalld after installing libvirt will solve the problem, so something about the libvirt install gets it into a funny state.  
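The hang described above can be detected without blocking the shell by running the command under a timeout. The following is a minimal sketch, not part of firewalld or libvirt: the helper name `probe` and the 10-second default are assumptions chosen for illustration.

```python
import subprocess


def probe(cmd, timeout_s=10.0):
    """Run cmd and classify the outcome as 'ok', 'failed', 'hung' or 'missing'.

    'hung' means the command did not exit within timeout_s seconds, which is
    the symptom reported here for `firewall-cmd --state`.
    """
    try:
        result = subprocess.run(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run() kills the child process before re-raising.
        return "hung"
    except FileNotFoundError:
        # The executable is not installed or not on PATH.
        return "missing"
    # A non-zero exit is reported separately from a hang.
    return "ok" if result.returncode == 0 else "failed"
```

Calling `probe(["firewall-cmd", "--state"])` before and after the libvirt install would distinguish a hang from an ordinary failure; a result of `"hung"` after the install matches the behaviour reported here.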

The next odd thing happened while I was diagnosing this: I strace'd firewalld while running firewall-cmd and saw it catch an exception.  After I killed the hung firewall-cmd process and ran it again, the exception started appearing on the command line, e.g.

---
[root@cloud-server-01 ~]# firewall-cmd --state


^\Quit
[root@cloud-server-01 ~]# firewall-cmd --state
Error: Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/dbus/service.py", line 707, in _message_cb
    retval = candidate_method(self, *args, **keywords)
  File "/usr/lib/python2.7/site-packages/slip/dbus/service.py", line 153, in wrapped_method
    reply_handler=reply_handler, error_handler=error_handler)
  File "/usr/lib/python2.7/site-packages/slip/dbus/polkit.py", line 292, in IsSystemBusNameAuthorizedAsync
    details)
  File "/usr/lib/python2.7/site-packages/slip/dbus/polkit.py", line 276, in IsSystemBusNameAuthorizedAsync
    timeout=method_call_no_timeout)
  File "/usr/lib/python2.7/site-packages/dbus/proxies.py", line 137, in __call__
    **keywords)
  File "/usr/lib/python2.7/site-packages/dbus/connection.py", line 584, in call_async
    message.append(signature=signature, *args)
ValueError: Unable to guess signature from an empty dict

---

I will attach an strace of the first "good" run and one of the run that hangs after the libvirt install.  I'll also attach lsof output of firewalld from before and after the hang.

Comment 1 Ian Wienand 2014-05-19 05:42:59 UTC
Created attachment 896978 [details]
strace of a successful run of firewall-cmd --state

Comment 2 Ian Wienand 2014-05-19 07:34:08 UTC
Created attachment 897031 [details]
strace of a hanging run of firewall-cmd --state

Comment 3 Ian Wienand 2014-05-19 07:35:11 UTC
Created attachment 897032 [details]
lsof of firewalld before command hangs

Comment 4 Ian Wienand 2014-05-19 07:35:45 UTC
Created attachment 897033 [details]
lsof of firewalld after command hangs

Comment 5 Daniel Berrangé 2014-05-19 10:01:53 UTC
Can you check whether there are any suspect kernel messages in dmesg? I've seen some kernel bugs in F20 that cause firewalld (and indeed iptables in general) to hang. Usually a kernel bug is logged to dmesg when this happens.

Comment 6 Ian Wienand 2014-05-19 10:30:10 UTC
(In reply to Daniel Berrange from comment #5)
> Can you check whether there are any suspect kernel messages in dmesg.

Nothing in dmesg, but there is something in the logs that looks suspicious:

---
May 19 10:24:23 cloud-server-01 dbus-daemon[282]: dbus[282]: [system] Rejected send message, 1 matched rules; type="method_call", sender=":1.4" (uid=0 pid=277 comm="/usr/bin/python /usr/sbin/firewalld --nofork --nop") interface="org.freedesktop.DBus.Introspectable" member="Introspect" error name="(unset)" requested_reply="0" destination=":1.16" (uid=998 pid=2314 comm="/usr/lib/polkit-1/polkitd --no-debug ")
May 19 10:24:23 cloud-server-01 dbus[282]: [system] Rejected send message, 1 matched rules; type="method_call", sender=":1.4" (uid=0 pid=277 comm="/usr/bin/python /usr/sbin/firewalld --nofork --nop") interface="org.freedesktop.DBus.Introspectable" member="Introspect" error name="(unset)" requested_reply="0" destination=":1.16" (uid=998 pid=2314 comm="/usr/lib/polkit-1/polkitd --no-debug ")
---
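To make the rejected call in the log lines above easier to read, the quoted `key="value"` fields can be pulled out with a regular expression. This is only a sketch for reading such lines: `parse_rejection` and the `sender_comm` / `destination_comm` key names are made up for illustration and are not part of dbus.

```python
import re

# Matches key="value" pairs such as sender=":1.4" or member="Introspect".
PAIR_RE = re.compile(r'(\w+)="([^"]*)"')


def parse_rejection(line):
    """Return the quoted fields of a dbus 'Rejected send message' log line.

    'comm' appears twice (once after sender, once after destination), so it
    is stored as 'sender_comm' or 'destination_comm' depending on which
    endpoint was seen last. Returns None for unrelated lines.
    """
    if "Rejected send message" not in line:
        return None
    fields = {}
    last_endpoint = None
    for key, value in PAIR_RE.findall(line):
        if key in ("sender", "destination"):
            last_endpoint = key
        if key == "comm" and last_endpoint is not None:
            key = last_endpoint + "_comm"
        fields[key] = value
    return fields
```

Applied to either line above, the result shows firewalld (sender `:1.4`) having an `Introspect` call on `org.freedesktop.DBus.Introspectable` rejected when sent to polkitd (destination `:1.16`).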

Comment 7 Ian Wienand 2014-05-19 10:46:11 UTC
OK, trying again: installing the polkit package is what causes the problem to appear.

---
[root@cloud-server-01 ~]# yum install polkit

 ...

  Installing : mozjs17-17.0.0-8.fc20.x86_64                                                                                          1/3 
  Installing : polkit-0.112-2.fc20.x86_64                                                                                            2/3 
  Installing : polkit-pkla-compat-0.1-3.fc20.x86_64                                                                                  3/3 
[root@cloud-server-01 ~]# firewall-cmd --state
 ... hang...
---

To be clear, the steps to reproduce are:

1) boot Rackspace "Fedora 20 (Heisenbug) (PVHVM)"
2) yum install polkit
3) firewall-cmd --state will hang after this until firewalld or dbus is restarted

Comment 8 Ian Wienand 2014-05-19 10:52:28 UTC
I cloned this to bug #1099031 for the polkit issue.  However, it would be nice if libvirtd gave more help here: it produced no useful logs and did not appear to apply any timeout to the command, so the failure was never logged.

Comment 9 Fedora End Of Life 2015-05-29 11:53:20 UTC
This message is a reminder that Fedora 20 is nearing its end of life.
Approximately 4 (four) weeks from now Fedora will stop maintaining
and issuing updates for Fedora 20. It is Fedora's policy to close all
bug reports from releases that are no longer maintained. At that time
this bug will be closed as EOL if it remains open with a Fedora 'version'
of '20'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not 
able to fix it before Fedora 20 is end of life. If you would still like 
to see this bug fixed and are able to reproduce it against a later version 
of Fedora, you are encouraged to change the 'version' to a later Fedora 
version before this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events. Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

Comment 10 Fedora End Of Life 2015-06-29 20:42:12 UTC
Fedora 20 changed to end-of-life (EOL) status on 2015-06-23. Fedora 20 is
no longer maintained, which means that it will not receive any further
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of
Fedora please feel free to reopen this bug against that version. If you
are unable to reopen this bug, please file a new report against the
current release. If you experience problems, please add a comment to this
bug.

Thank you for reporting this bug and we are sorry it could not be fixed.
