Description of problem:
Cockpit CI's recent Fedora 37 image refresh [1] detected a regression with podman's port conflict handling.

Version-Release number of selected component (if applicable):
The image build log [2] has a complete list of package updates. Of these, I confirmed the kernel-core 6.0.15-300.fc37 -> 6.0.16-300.fc37 update to be the one that triggers the regression. However, I am filing this against podman for your initial triaging, as I don't know if that's a kernel regression or some new feature that podman needs to be adapted to.

podman-4.3.1-1.fc37.x86_64
crun-1.7.2-3.fc37.x86_64
kernel-core-6.0.16-300.fc37.x86_64

How reproducible:
Always

Steps to Reproduce:
podman run -d -p 5000:5000 --name c1 registry:2
podman run -d -p 5000:5000 --name c2 registry:2

Actual results:
Both commands succeed

Expected results:
Until recently, the second command failed with

Error: c2 listen tcp 0.0.0.0:5000: bind: address already in use

Now both containers start and "podman ps" claims that they are both forwarding local port 5000 to the container. But this (naturally) only works for the first container.

Additional info:
[1] https://github.com/cockpit-project/bots/pull/4248
[2] https://cockpit-logs.us-east-1.linodeobjects.com/image-refresh-logs/fedora-37-20230106-230101.log
Might be an issue with the kernel. I did something similar with nc -l 5001 & nc -l 5001
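For anyone who wants to check a kernel without podman or nc, the same double-listen test can be sketched in a few lines of Python. This is only a rough reproducer, not something taken from the report: the function name second_bind_is_rejected and the choice of an ephemeral port (instead of 5001) are assumptions made so the check does not collide with a port already in use.

```python
import errno
import socket


def second_bind_is_rejected(host="0.0.0.0"):
    """Return True if a second listener on the same port fails with EADDRINUSE.

    Hypothetical helper for illustration; not part of podman, nc, or the kernel.
    """
    first = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    second = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        # Let the kernel pick a free port, then listen on it.
        first.bind((host, 0))
        first.listen(1)
        port = first.getsockname()[1]
        try:
            # Same address and port, no SO_REUSEADDR/SO_REUSEPORT set:
            # this bind is expected to be refused.
            second.bind((host, port))
        except OSError as exc:
            return exc.errno == errno.EADDRINUSE
        # The bind went through, so the conflict was not detected.
        return False
    finally:
        first.close()
        second.close()


if __name__ == "__main__":
    if second_bind_is_rejected():
        print("OK: second bind failed with EADDRINUSE")
    else:
        print("Conflict not detected: second bind on the same port succeeded")
```

On a kernel that handles the conflict correctly, the function should return True; a False result would match the behaviour described in this bug.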
This does seem to be a problem in the kernel. I am also seeing it in 6.0.17-300.fc37.x86_64 and 6.0.18-300.fc37.x86_64. I see it when I start two ssh sessions to a server, both using X11 forwarding. The second and subsequent sessions get allocated the same port number for the forwarding port. On kernel 6.0.15, with debug enabled, sshd reports "Address already in use" as it cycles through ports from 6010 onwards. On the faulty kernels, sshd just allocates port 6010 for all sessions.
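The port cycling described for sshd relies on bind() failing for ports that are already taken. A minimal sketch of that pattern follows; it is not sshd's actual code, and allocate_listener, the attempts limit, the loopback address, and the base port 46010 used in the example run are all assumptions chosen for illustration.

```python
import socket


def allocate_listener(first_port, attempts=100, host="127.0.0.1"):
    """Try ports upwards from first_port and return (socket, port) for the first free one.

    Hypothetical helper modelling a "next free port" scan; it depends on
    bind() raising an error when the port is already in use.
    """
    for port in range(first_port, first_port + attempts):
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        try:
            sock.bind((host, port))
            sock.listen(1)
        except OSError:
            # Port is busy (or otherwise unusable): move on to the next one.
            sock.close()
            continue
        return sock, port
    raise RuntimeError("no free port found in the scanned range")


if __name__ == "__main__":
    # Two "sessions" scanning from the same base port.
    a, port_a = allocate_listener(46010)
    b, port_b = allocate_listener(46010)
    print(port_a, port_b)
    a.close()
    b.close()
```

If bind() never reports the conflict, every caller of such a loop stops at the first port it tries, which is consistent with all sessions ending up on 6010.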
Ack, this starts looking serious! Reassigning to the kernel then.
This is no longer happening in kernel-6.1.4-200.fc37. See https://lists.fedoraproject.org/archives/list/devel@lists.fedoraproject.org/thread/7VPNMC77YC3SI5LFYKUA4B5MTFPLTLVB/
*** Bug 2159802 has been marked as a duplicate of this bug. ***
This bug breaks X11 forwarding in ssh and also breaks some in-house software. For us it is quite a severe showstopper. Fortunately, it seems easy to fix. In the function inet_csk_get_port (net/ipv4/inet_connection_sock.c:370), ret should be initialised to -EADDRINUSE, not 1. Patching this seems to fix the problem. The suggestion is from https://lore.kernel.org/stable/CAFsF8vL4CGFzWMb38_XviiEgxoKX0GYup=JiUFXUOmagdk9CRg@mail.gmail.com/ which Miro pointed to.
Created attachment 1937748: patch to fix it
Here's the actual patch.
Current F37/F38 kernels seem fine.