Created attachment 1652988
interrupted ssh session

Description of problem:
If you are connected remotely to a Fedora 31 box through ssh and update the system via 'dnf update', the network is taken down when the list of updated packages contains the 'cockpit-ws' package. The system has to be rebooted in order to get networking up again. This problem was not present in earlier Fedora versions (30 and older).

Version-Release number of selected component (if applicable):
cockpit-ws-210-1.fc31.x86_64

How reproducible:
Log in to Fedora 31 remotely and install the 'cockpit-ws' package update using 'dnf update'.

Steps to Reproduce:
1. log in remotely (ssh)
2. when an updated 'cockpit-ws' package is available, execute 'dnf update'
3. wait until the ssh connection is interrupted

Actual results:
Networking is down; you can't connect to the box through ssh again.

Expected results:
Networking state should be preserved or restored.

Additional info:
There are probably more packages "suffering" from the same problem.
Just about the only thing in cockpit's %post script that could do this is

  # firewalld only partially picks up changes to its services files without this
  test -f /usr/bin/firewall-cmd && firewall-cmd --reload --quiet || true

Do you get the same effect with just calling "firewall-cmd --reload"? If so, then you apparently made some runtime modifications to your firewall state that aren't reflected in its configuration.
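One way to check for that kind of drift is to compare what firewalld reports for a zone with and without `--permanent`. The following is only a minimal sketch: the function name `zone_drift` and the `FIREWALL_CMD` override variable are made up for illustration and are not part of firewalld, and it assumes that only the first line of the `--list-all` output (the zone name, which carries an "(active)" marker in the runtime view) differs for cosmetic reasons.

```shell
#!/bin/sh
# Sketch: compare firewalld's permanent and runtime view of one zone.
# FIREWALL_CMD is a hypothetical override hook (not a firewalld feature);
# by default the real firewall-cmd binary is used.
FIREWALL_CMD=${FIREWALL_CMD:-firewall-cmd}

# Print a unified diff of permanent vs. runtime settings for zone $1.
# Returns 0 when both views match and 1 when they differ.
zone_drift() {
    zone=$1
    permanent=$(mktemp) || return 2
    runtime=$(mktemp) || { rm -f "$permanent"; return 2; }

    # Drop the header line: the runtime view may append "(active)" to it.
    "$FIREWALL_CMD" --permanent --zone="$zone" --list-all | tail -n +2 >"$permanent"
    "$FIREWALL_CMD" --zone="$zone" --list-all | tail -n +2 >"$runtime"

    # Lines prefixed with "+" exist only at runtime, "-" only on disk.
    diff -u "$permanent" "$runtime"
    status=$?

    rm -f "$permanent" "$runtime"
    return "$status"
}
```

Calling `zone_drift external` as root would then print nothing if the runtime state matches the permanent configuration. Note that this only compares firewalld's own bookkeeping; it cannot tell whether the kernel's packet filter rules actually match what firewalld reports.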
I'm definitely NOT doing any runtime modification to the firewall state. All firewall settings are persisted using firewall-cmd. Once I have physical access to the affected Fedora 31 server (later today), I will log in on the console and check the networking / firewall state.
(followup)
I have logged in on the affected server and I tried several commands:

# firewall-cmd --reload
firewalld[1547]: WARNING: ZONE_ALREADY_SET: 'enp6s0f0' already bound to 'external'

# firewall-cmd --complete-reload

I had to restart firewall to "fix" the state of firewall:

# systemctl restart firewalld.service

then machine started to accept packets on relevant ports again...

current firewall config:

# firewall-cmd --get-active-zones
external
  interfaces: enp6s0f0
trusted
  interfaces: enp6s0f1 tun0

# firewall-cmd --info-zone external
external (active)
  target: default
  icmp-block-inversion: no
  interfaces: enp6s0f0
  sources:
  services: cockpit http openvpn plexmediaserver smtp ssh
  ports: 8080/tcp 6036/tcp
  protocols:
  masquerade: yes
  forward-ports: port=8080:proto=tcp:toport=80:toaddr=192.168.1.105
        port=6036:proto=tcp:toport=:toaddr=192.168.1.105
  source-ports:
  icmp-blocks:
  rich rules:

# cat /etc/firewalld/zones/external.xml
<?xml version="1.0" encoding="utf-8"?>
<zone>
  <short>External</short>
  <description>For use on external networks. You do not trust the other computers on networks to not harm your computer. Only selected incoming connections are accepted.</description>
  <interface name="enp6s0f0"/>
  <service name="ssh"/>
  <service name="smtp"/>
  <service name="cockpit"/>
  <service name="http"/>
  <service name="openvpn"/>
  <service name="plexmediaserver"/>
  <port port="8080" protocol="tcp"/>
  <port port="6036" protocol="tcp"/>
  <masquerade/>
  <forward-port port="8080" protocol="tcp" to-port="80" to-addr="192.168.1.105"/>
  <forward-port port="6036" protocol="tcp" to-addr="192.168.1.105"/>
</zone>

I can't see anything wrong with my firewalld config...
more info: if the firewall config is reloaded completely, then firewall is reconfigured properly and networking is working fine:

# firewall-cmd --reload
Warning: ZONE_ALREADY_SET: 'enp6s0f0' already bound to 'external'
success

firewall state not configured properly

# firewall-cmd --complete-reload
Warning: ZONE_ALREADY_SET: 'enp6s0f0' already bound to 'external'
success

firewall state is ok

from firewall-cmd manpage:

--reload
    Reload firewall rules and keep state information. Current permanent configuration will become new runtime configuration, i.e. all runtime only changes done until reload are lost with reload if they have not been also in permanent configuration.

    Note: Runtime changes applied via the direct interface are not affected and will therefore stay in place until firewalld daemon is restarted completely.

--complete-reload
    Reload firewall completely, even netfilter kernel modules. This will most likely terminate active connections, because state information is lost. This option should only be used in case of severe firewall problems. For example if there are state information problems that no connection can be established with correct firewall rules.

    Note: Runtime changes applied via the direct interface are not affected and will therefore stay in place until firewalld daemon is restarted completely.

I'm no expert here so I can't tell if --reload or --complete-reload should be used in cockpit-ws's %post scriptlet.
I ssh'ed into a Fedora 31 VM, installed cockpit 210, downgraded (with dnf) to 209, and upgraded back to 210. Can you try this in isolation?

  dnf install https://kojipkgs.fedoraproject.org//packages/cockpit/209/1.fc31/x86_64/cockpit-ws-209-1.fc31.x86_64.rpm
  dnf update cockpit-ws

Your previous update included a lot of other packages which might cause trouble. Does this reproduce the hang?

As it wasn't the firewall reloading, the only other thing that it does is

  systemctl try-restart cockpit.socket cockpit.service

I wouldn't know how that could kill the network, but let's cover all bases. After it hangs, can you please ssh in again and grab a recent journal (journalctl --since '10 minutes ago'), and paste it here?
It looks like "my" trouble is with "firewall-cmd --reload" used in the %post script of the cockpit-ws package, not with the package itself. If I run "firewall-cmd --reload" alone on my Fedora 31 box, I can't connect to it by ssh anymore; all ports (ssh, dhcp, ...) are disabled. The only remedy is to log in on the console and execute "firewall-cmd --complete-reload" or "systemctl restart firewalld.service"; then the firewall is restored to its permanent configuration. Maybe we should close the issue, as cockpit is not the primary cause of the problem...
I found the real problem by googling for "Warning: ZONE_ALREADY_SET: 'enp6s0f0' already bound to 'external'": https://access.redhat.com/solutions/4586771

"firewall-cmd --reload" now works as it should (maybe the command shouldn't print "success" in the first place).

My problem is solved now, so I'm closing the bug. Thank you for your time, Martin ;-)