Bug 1544937 - libvirtd and firewalld need to be able to be restarted in co-ordination with each other, preserving post firewalld effective transaction set virtual client connections
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: libvirt
Version: 7.4
Hardware: Unspecified
OS: Linux
Priority: unspecified
Severity: unspecified
Target Milestone: rc
Target Release: ---
Assignee: Laine Stump
QA Contact: yalzhang@redhat.com
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2018-02-13 19:41 UTC by R P Herrold
Modified: 2018-04-19 19:04 UTC (History)
CC List: 11 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 1348434
Environment:
Last Closed: 2018-04-19 19:04:05 UTC
Target Upstream Version:
Embargoed:


Attachments
log extract on affected unit, which I will use to hunt with (7.15 KB, text/plain)
2018-02-16 17:52 UTC, R P Herrold

Description R P Herrold 2018-02-13 19:41:59 UTC
libvirtd and firewalld need to be able to be restarted in co-ordination with each other, preserving post firewalld effective transaction set virtual client connections

This is wordy, but is the needed outcome.


environment is: 
[root@router ~]# rpm -q libvirt systemd firewalld
libvirt-3.2.0-14.el7_4.7.x86_64
systemd-219-42.el7_4.7.x86_64
firewalld-0.4.4.4-6.el7.noarch


old subject was:

internal error: Unable to apply rule 'The name org.fedoraproject.FirewallD1 was not provided by any .service files'


+++ This bug was initially created as a clone of Bug #1348434 +++

After several days of work, libvirt no longer wanted to start VMs.

root@ovh1 ~ # virsh start 138_xy.com_abcd
error: Failed to start domain 138_xy.com_abcd
error: The name org.fedoraproject.FirewallD1 was not provided by any .service files

I stumbled upon a piece of advice on the internet: "Solution : libvirtd restart is needed between firewalld stopped and iptables started." https://github.com/redhat-cip/infra-virt#troubleshooting 


==============

RPH 

it is worse than that -- that writeup states:

| You will need to disable firewalld and enable libvirt:
| 
| sudo systemctl enable libvirtd
| sudo systemctl start libvirtd
| 
| sudo service iptables save
| sudo systemctl disable firewalld
| sudo systemctl enable iptables
| sudo systemctl stop firewalld
| sudo systemctl start iptables


it was perhaps once acceptable as a transition rule to advocate disabling firewalld, but manual maintenance of iptables rules is now deprecated

---------------

Prior comment string:
We discussed this at the time firewalld support was added to libvirt, and decided that the occurrence of starting/stopping firewalld is quite rare, and the solution (restarting libvirtd) is simple. 


RPH: ??? the starting and stopping of firewalld is quite common locally, and is done under programmatic control, so it is invisible to an end user


later:

reasons? We could throw in a hint but if a user deliberately disables firewalld, and libvirtd throws an error like that, I think it's a pretty good indication that a libvirtd restart is worth a shot.

Plus, on that linked page with the troubleshooting section, the steps at the top of the page should stick in a libvirtd restart after disabling firewalld. Their steps basically enforce hitting that error.

Closing as DEFERRED, but if someone has suggestions for improving the error that avoid that ambiguity I suggest sending a patch to libvir-list

--- Additional comment from Damian Nowak on 2016-06-22 22:08:01 EDT ---

> if a user deliberately disables firewalld, and libvirtd throws an error like that, I think it's a pretty good indication that a libvirtd restart is worth a shot.

How can a user be aware that changing something in the system may affect a different service, like libvirt? What if sysadmin #1 makes the change while sysadmin #2 gets that error? (Actually my case) This type of message belongs to a category of wtf-like messages that only Google can help with.

Why not something like that?

root@ovh1 ~ # virsh start 138_xy.com_abcd
error: Failed to start domain 138_xy.com_abcd
error: The name org.fedoraproject.FirewallD1 was not provided by any .service files
error: If you disabled Firewalld after libvirtd start, please restart libvirtd

It makes everyone's life easier at the expense of a couple extra lines of code. Libvirt error messages are typically very informative so it would play well with the rest.

RPH: I dislike this suggestion as it just papers over, rather than solves, the problem of a needed idempotent and 'correct' conditional restart of the firewalld AND the libvirtd

--- Additional comment from R P Herrold on 2018-02-13 14:08:39 EST ---

This problem still occurs, and its cause is 'invisible' (we manipulate the firewalld, for network ACL reasons).  I found the 'workaround' only by a google search with the error message 

impairing / interrupting libvirtd connections (mostly interior networks facing) with a restart is a problem

--- Additional comment from R P Herrold on 2018-02-13 14:09:20 EST ---

I became aware of the issue when it showed up in my comment 13 at:

https://bugzilla.redhat.com/show_bug.cgi?id=1486803

===========

proposed resolution:

I would expect a rule, in a systemd stanza, by which firewalld.service triggers a conditional restart of libvirtd.service; it would probably be owned by firewalld, but would need changes in libvirtd (see next para)

along with a libvirtd 'smart enough' to maintain and not drop connections, through a firewalld restart at any time.  

Isn't this the point of systemd: to permit describing state transitions in environments filled with Atomic, docker, and VMs, and to get rid of "hidden gotchas"?

Comment 2 Daniel Berrangé 2018-02-14 08:43:16 UTC
We absolutely should *not* restart libvirtd automatically when firewalld is restarted. When libvirtd has decided to use firewalld, it should not matter if firewalld is restarted. Libvirt can carry on using it once it has started again. The situation described in the original bug was the one where the host is initially not running firewalld at all, and the user switches from using the "iptables" init script to using firewalld. Alternatively, the reverse, where the host is using firewalld, and it is then stopped & uninstalled and the iptables initscript activated. Those iptables<->firewalld switch scenarios are the ones libvirt doesn't intend to support without a restart.

Comment 3 R P Herrold 2018-02-14 17:45:08 UTC
@Daniel Berrange 

Concur as to your comment #2 ... I was really surprised to run into the issue, but seemingly it arises with manipulations of the firewalld (and a side application [fail2ban] ex EPEL) also working on the iptables doing ACD's

[root@router ~]# rpm -qa fail\*
fail2ban-server-0.9.7-1.el7.noarch
fail2ban-firewalld-0.9.7-1.el7.noarch
fail2ban-systemd-0.9.7-1.el7.noarch
fail2ban-sendmail-0.9.7-1.el7.noarch
fail2ban-0.9.7-1.el7.noarch

[root@router ~]# rpm -qa libvir\* fire\* | sort
firefox-52.6.0-1.el7.centos.x86_64
firewall-applet-0.4.4.4-6.el7.noarch
firewall-config-0.4.4.4-6.el7.noarch
firewalld-0.4.4.4-6.el7.noarch
firewalld-filesystem-0.4.4.4-6.el7.noarch
libvirt-3.2.0-14.el7_4.7.x86_64
libvirt-client-3.2.0-14.el7_4.7.x86_64
libvirt-daemon-3.2.0-14.el7_4.7.x86_64
libvirt-daemon-config-network-3.2.0-14.el7_4.7.x86_64
libvirt-daemon-config-nwfilter-3.2.0-14.el7_4.7.x86_64
libvirt-daemon-driver-interface-3.2.0-14.el7_4.7.x86_64
libvirt-daemon-driver-lxc-3.2.0-14.el7_4.7.x86_64
libvirt-daemon-driver-network-3.2.0-14.el7_4.7.x86_64
libvirt-daemon-driver-nodedev-3.2.0-14.el7_4.7.x86_64
libvirt-daemon-driver-nwfilter-3.2.0-14.el7_4.7.x86_64
libvirt-daemon-driver-qemu-3.2.0-14.el7_4.7.x86_64
libvirt-daemon-driver-secret-3.2.0-14.el7_4.7.x86_64
libvirt-daemon-driver-storage-3.2.0-14.el7_4.7.x86_64
libvirt-daemon-driver-storage-core-3.2.0-14.el7_4.7.x86_64
libvirt-daemon-driver-storage-disk-3.2.0-14.el7_4.7.x86_64
libvirt-daemon-driver-storage-gluster-3.2.0-14.el7_4.7.x86_64
libvirt-daemon-driver-storage-iscsi-3.2.0-14.el7_4.7.x86_64
libvirt-daemon-driver-storage-logical-3.2.0-14.el7_4.7.x86_64
libvirt-daemon-driver-storage-mpath-3.2.0-14.el7_4.7.x86_64
libvirt-daemon-driver-storage-rbd-3.2.0-14.el7_4.7.x86_64
libvirt-daemon-driver-storage-scsi-3.2.0-14.el7_4.7.x86_64
libvirt-glib-1.0.0-1.el7.x86_64
libvirt-libs-3.2.0-14.el7_4.7.x86_64
libvirt-python-3.2.0-3.el7_4.1.x86_64

Comment 4 Laine Stump 2018-02-15 01:58:13 UTC
NB: libvirt listens on dbus for notifications of firewalld being restarted, and reloads all of the rules it needs (without itself restarting) when this happens. Also, whenever libvirtd is restarted, it removes and reloads all of its iptables rules as part of the startup. So as long as firewalld isn't permanently stopped, but just restarted, there shouldn't be any problem.

So are you saying that you get this message in a situation where firewalld has not been stopped, but is just being restarted? Can you provide an exact set of steps to reproduce the problem?
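
For readers following along, the restart-versus-stop distinction above can be sketched in Python. This is only an illustration, not libvirt's code: it assumes the daemon is fed the three arguments of the standard D-Bus NameOwnerChanged signal (name, old owner, new owner), and the RuleReloader class and its fields are hypothetical names made up for this sketch.

```python
FIREWALLD_BUS_NAME = "org.fedoraproject.FirewallD1"


def classify_owner_change(name, old_owner, new_owner):
    """Interpret the arguments of a D-Bus NameOwnerChanged signal."""
    if name != FIREWALLD_BUS_NAME:
        return "ignore"
    if not old_owner and new_owner:
        return "appeared"   # firewalld came (back) onto the bus
    if old_owner and not new_owner:
        return "vanished"   # firewalld left the bus
    if old_owner and new_owner:
        return "replaced"   # ownership moved to a new connection
    return "ignore"


class RuleReloader:
    """Hypothetical stand-in for the daemon's rule bookkeeping."""

    def __init__(self):
        self.reload_count = 0
        self.firewalld_present = True

    def on_name_owner_changed(self, name, old_owner, new_owner):
        event = classify_owner_change(name, old_owner, new_owner)
        if event == "vanished":
            # Nothing to reload yet; only remember that firewalld is gone.
            self.firewalld_present = False
        elif event in ("appeared", "replaced"):
            # Re-add the rules without restarting the daemon itself.
            self.firewalld_present = True
            self.reload_count += 1
        return event
```

In this model a plain restart shows up as "vanished" followed shortly by "appeared", while a permanent stop is a "vanished" with nothing after it, which is the case where later firewall calls fail.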

Comment 5 R P Herrold 2018-02-16 17:50:11 UTC
cron and systemd are doing it -- I have to dig a bit

I also see the 'race condition' bug on ip6tables -- I was not previously aware that that was present

perhaps that is in play as well

Comment 6 R P Herrold 2018-02-16 17:52:07 UTC
Created attachment 1397099
log extract on affected unit, which I will use to hunt with

This is troubling:

[herrold@centos-7 tmp]$ nl firewalld-presence.txt | grep -i "perhaps"
    48  Feb 13 14:13:04 router ip6tables.init: ip6tables: Flushing firewall rules: Another app is currently holding the xtables lock. Perhaps you want to use the -w option?
    58  Feb 13 14:14:06 router ip6tables.init: ip6tables: Flushing firewall rules: Another app is currently holding the xtables lock. Perhaps you want to use the -w option?
[herrold@centos-7 tmp]$
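
The same hunt can be done programmatically. The following is a minimal sketch, assuming raw syslog-style lines in the format shown above (without the leading `nl` numbers); the function name and the regular expression are illustrative, not part of any existing tool.

```python
import re

# Matches "<Mon> <day> <HH:MM:SS> <host> <unit>: ... holding the xtables lock"
LOCK_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+(?P<host>\S+)\s+"
    r"(?P<unit>[^:\s]+):.*holding the xtables lock"
)


def find_lock_contention(lines):
    """Return (line number, timestamp, unit) for each xtables lock complaint."""
    hits = []
    for number, line in enumerate(lines, start=1):
        match = LOCK_RE.search(line)
        if match:
            hits.append((number, match.group("stamp"), match.group("unit")))
    return hits
```

Called as `find_lock_contention(open("firewalld-presence.txt"))`, it lists which unit lost the race and when, which makes it easier to line the hits up against firewalld start and stop events in the same log.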

Comment 7 R P Herrold 2018-02-16 17:54:43 UTC
that race bug I mentioned in Comment 5 is:

https://bugzilla.redhat.com/show_bug.cgi?id=1486803

Comment 8 R P Herrold 2018-02-16 17:59:34 UTC
the other race bug was: 

https://bugzilla.redhat.com/show_bug.cgi?id=1477413

which is already cross-referenced in an UN-numbered comment:

by: Lukas Vrabec 2018-02-14 07:40:55 EST

Comment 9 Laine Stump 2018-04-19 19:04:05 UTC
(Note that libvirt already uses "-w" on iptables and ip6tables commands (and --concurrent on ebtables commands) if the local binaries of those applications support it.)
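
As a rough illustration of that behaviour (a sketch only, not libvirt's implementation), the lock flag can be added to a command line only when the local binary is known to accept it. How support is detected is left out here and passed in as a boolean; build_command and LOCK_FLAGS are hypothetical names.

```python
# Lock-related flags named in the comment above.
LOCK_FLAGS = {
    "iptables": "-w",
    "ip6tables": "-w",
    "ebtables": "--concurrent",
}


def build_command(tool, args, lock_flag_supported):
    """Build an argv list, adding the tool's lock flag only if supported."""
    if tool not in LOCK_FLAGS:
        raise ValueError("unknown firewall tool: %s" % tool)
    command = [tool]
    if lock_flag_supported:
        command.append(LOCK_FLAGS[tool])
    command.extend(args)
    return command
```

With the flag present, a second caller waits for the xtables lock instead of failing with the "Another app is currently holding the xtables lock" message seen in the attached log.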

I'm not seeing anything for us to do here. If there is some event that's causing firewalld to stop running, that is beyond libvirt's control (it should be fixed, but the fix will be elsewhere). Since we've stated that we don't want libvirt to automatically switch backends without restarting the daemon, I think this should just be closed.

