Bug 1557048

Summary: [docker] cannot create networks simultaneously
Product: Red Hat Enterprise Linux 7
Reporter: Carl George <carl>
Component: docker
Assignee: Daniel Walsh <dwalsh>
Status: CLOSED WONTFIX
QA Contact: atomic-bugs <atomic-bugs>
Severity: unspecified
Priority: unspecified
Version: 7.4
CC: amurdaca, lsm5, pasik
Target Milestone: rc
Keywords: Extras
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Fixed In Version: docker-1.13.1-58.git87f2fab.el7
Last Closed: 2021-02-15 07:37:58 UTC
Type: Bug

Description Carl George 2018-03-15 21:17:04 UTC
Description of problem:
In both RHEL and Fedora, the docker daemon is not able to accept simultaneous requests to create docker networks.  I first observed this behavior in the drone CI software [1], which uses separate docker networks for each task in a matrix build.  One of the matrix tasks will succeed, but the others will fail with this error:

Error response from daemon: unable to remove jump to DOCKER-ISOLATION rule in FORWARD chain: (COMMAND_FAILED: '/usr/sbin/iptables -w2 -D FORWARD -j DOCKER-ISOLATION' failed: iptables: No chain/target/match by that name. )

I was also able to reproduce this error just by simultaneously running multiple `docker network create ...` commands via tmux synchronized panes (see Steps to Reproduce).


Version-Release number of selected component (if applicable):
docker-1.13.1-53.git774336d.el7
docker-1.13.1-44.git584d391.fc27


How reproducible:
Always when multiple commands are run simultaneously.  Not reproducible when only one command is run.


Steps to reproduce:
1. start tmux session
2. tmux command: split-window
3. in each pane, type `docker network create ...` with a unique network name
4. tmux command: set-window-option synchronize-panes
5. hit enter
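
For anyone without tmux at hand, the steps above can be approximated with a short script. This is only a sketch, not part of the original report: the function name `create_networks_concurrently`, the injectable `run` parameter, and the network names in the usage note are all hypothetical. It assumes the `docker` CLI is on the PATH and uses a thread barrier to release all `docker network create` calls at roughly the same moment, which imitates synchronized panes but does not guarantee the race is hit on every run.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from threading import Barrier


def create_networks_concurrently(names, run=subprocess.run):
    """Issue one `docker network create <name>` per name, all at the same moment.

    `run` is injectable (an assumption of this sketch) so the function can be
    exercised without a docker daemon. Returns a list of
    (name, returncode, stderr) tuples in the same order as `names`.
    """
    names = list(names)
    if not names:
        return []

    # The barrier holds every worker until all of them are ready, roughly
    # imitating tmux synchronized panes sending Enter to every pane at once.
    barrier = Barrier(len(names))

    def create(name):
        barrier.wait()
        proc = run(
            ["docker", "network", "create", name],
            capture_output=True,
            text=True,
        )
        return name, proc.returncode, proc.stderr.strip()

    # One thread per network so that no request waits behind another.
    with ThreadPoolExecutor(max_workers=len(names)) as pool:
        return list(pool.map(create, names))
```

Calling `create_networks_concurrently(["repro-net-a", "repro-net-b"])` on an affected host and printing the returned tuples should show which requests failed and with what daemon error. Any networks that were created can be cleaned up afterwards with `docker network rm`.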


Actual results:
One network is created; the other fails with the error shown above.


Expected results:
Both networks are created.


How to fix:
I was able to track this down to an upstream issue [2], which has already been fixed in docker/libnetwork, and the fix is included in docker 17.04.  It boils down to this commit [3] from docker/libnetwork#1658 [4].  I created a patch [5] based on that commit.  It applies cleanly to the RHEL SRPM and produces RPMs [6] that do not suffer from this bug.  I'm happy to send this patch as a pull request directly to projectatomic/docker if so desired.


Additional info:
[1]: https://discourse.drone.io/t/matrix-builds-network-concurrency-issue/1936
[2]: https://github.com/moby/moby/issues/25393
[3]: https://github.com/docker/libnetwork/commit/fe741120dbacc1c720a80812de1d65811695f5d6
[4]: https://github.com/docker/libnetwork/pull/1658
[5]: https://github.com/carlwgeorge/docker/commit/660a257754ab0e43e1e778baf8a625f1542fd334
[6]: https://copr.fedorainfracloud.org/coprs/carlwgeorge/docker/

Comment 2 Daniel Walsh 2018-03-16 11:26:00 UTC
Can you open a PR against projectatomic/docker to fix this issue?

Comment 3 Carl George 2018-03-16 12:29:48 UTC
Absolutely.  I noticed the Fedora package builds the docker-1.13.1 branch, and RHEL builds the docker-1.13.1-rhel branch.  Should I send separate pull requests for each?  Or just one pull request and let y'all cherry-pick to the other later?

Comment 4 Daniel Walsh 2018-03-16 13:35:02 UTC
Both, thanks.

Comment 6 Carl George 2018-04-11 21:01:40 UTC
This was released in RHEL 7.5, but I don't see an option to close the bug.

Comment 9 RHEL Program Management 2021-02-15 07:37:58 UTC
After evaluating this issue, there are no plans to address it further or fix it in an upcoming release.  Therefore, it is being closed.  If plans change such that this issue will be fixed in an upcoming release, then the bug can be reopened.