Bug 1336496

Summary: different ways of blocking a packet using ip6tables lead to different results
Product: [Fedora] Fedora Reporter: Pavel Šimerda (pavlix) <psimerda>
Component: kernel Assignee: Kernel Maintainer List <kernel-maint>
Status: CLOSED INSUFFICIENT_DATA QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: 23 CC: gansalmon, itamar, jonathan, kernel-maint, madhu.chinakonda, mchehab, psimerda, psutter
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2016-10-26 16:43:33 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Bug Depends On:    
Bug Blocks: 883152    
Description Flags
a test script
a test script
a test script
output of the script as a tarball (script included) none

Description Pavel Šimerda (pavlix) 2016-05-16 15:59:28 UTC
Description of problem:

When working on the userspace networking test suite[1], I discovered suspicious behavior of the IPv6 firewall. Slightly different ip6tables rules that should block IPv6 packets produce very different results. In one case the program learns immediately that it was rejected and gets the chance to fall back to IPv4; in the other case it doesn't.

Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1. (optional) Configure a TCP service, e.g. a local SSH server.
2. (optional) Use a testing tool to connect to it.
3. Block outgoing TCP packets on firewall and retest.
4. Block all outgoing packets on firewall and retest.

Actual results:

3. Quick failure.
4. Quick failure.

In both cases most tools will fall back to IPv4.

Expected results:

3. Quick failure as expected.
4. Delay taking many seconds.

Additional info:

Test script will be included together with its output on my system.

[1]: https://github.com/pavlix/network-testing

Comment 1 Pavel Šimerda (pavlix) 2016-05-16 16:03:04 UTC
Created attachment 1158014 [details]
a test script

Comment 2 Pavel Šimerda (pavlix) 2016-05-16 16:03:49 UTC
Sorry, the actual/expected results were swapped.

Comment 3 Pavel Šimerda (pavlix) 2016-05-16 16:21:56 UTC
Created attachment 1158016 [details]
a test script

Comment 5 Pavel Šimerda (pavlix) 2016-05-16 16:39:16 UTC
For a quick assessment of the bug, the time output should be enough: the first two runs (no firewall, TCP rejected) finish quickly, while the last run (all rejected) takes 10 seconds in my case.

$ grep '' *.time

0.02user 0.02system 0:00.05elapsed 86%CPU (0avgtext+0avgdata 7064maxresident)k
0inputs+56outputs (0major+736minor)pagefaults 0swaps

0.03user 0.02system 0:01.08elapsed 5%CPU (0avgtext+0avgdata 7108maxresident)k
0inputs+56outputs (0major+732minor)pagefaults 0swaps

0.03user 0.03system 0:10.08elapsed 0%CPU (0avgtext+0avgdata 7104maxresident)k
0inputs+56outputs (0major+733minor)pagefaults 0swaps

Comment 6 Pavel Šimerda (pavlix) 2016-05-16 16:45:16 UTC
The most notable difference in strace is:

$ diff -u 02-tcp-rejected.strace 03-all-rejected.strace

 fcntl(4, F_GETFL)                       = 0x2 (flags O_RDWR)
 fcntl(4, F_SETFL, O_RDWR|O_NONBLOCK)    = 0
 connect(4, {sa_family=AF_INET6, sin6_port=htons(22), inet_pton(AF_INET6, "::1", &sin6_addr), sin6_flowinfo=0, sin6_scope_id=0}, 28) = -1 EINPROGRESS (Operation now in progress)
-select(5, [4], [4], [4], {10, 0})       = 2 (in [4], out [4], left {8, 998426})
-getsockopt(4, SOL_SOCKET, SO_ERROR, [111], [4]) = 0
+select(5, [4], [4], [4], {10, 0})       = 0 (Timeout)
 fcntl(5, F_GETFL)                       = 0x2 (flags O_RDWR)
 fcntl(5, F_SETFL, O_RDWR|O_NONBLOCK)    = 0
 connect(5, {sa_family=AF_INET, sin_port=htons(22), sin_addr=inet_addr("")}, 1

In one case an error is returned immediately through the sequence of connect, select and getsockopt. In the other case the select actually times out, which also explains the 10 seconds when using nc for testing.
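To make the connect, select and getsockopt sequence from the strace easier to follow, here is a minimal Python sketch of the same pattern. It is only an illustration, not the code nc or the test suite actually runs; the function name `probe` and its return values are made up for this example, and the 10 second default simply mirrors the timeout seen in the trace.

```python
import errno
import select
import socket


def probe(host, port, family=socket.AF_INET6, timeout=10.0):
    """Hypothetical helper: non-blocking connect, then wait for the result.

    Returns ("connected", 0), ("failed", errno_value) or ("timeout", None).
    """
    sock = socket.socket(family, socket.SOCK_STREAM)
    try:
        sock.setblocking(False)  # same effect as fcntl(F_SETFL, O_NONBLOCK)
        rc = sock.connect_ex((host, port))
        if rc == 0:
            return ("connected", 0)
        if rc != errno.EINPROGRESS:
            return ("failed", rc)  # error reported directly by connect()
        # Wait until the socket reports a result or the timeout expires.
        readable, writable, errored = select.select(
            [sock], [sock], [sock], timeout
        )
        if not (readable or writable or errored):
            # Nothing came back at all: the "all rejected" case in the diff.
            return ("timeout", None)
        # The kernel stores the outcome of the connect attempt in SO_ERROR.
        err = sock.getsockopt(socket.SOL_SOCKET, socket.SO_ERROR)
        if err == 0:
            return ("connected", 0)
        return ("failed", err)
    finally:
        sock.close()
```

Calling `probe("::1", 22)` corresponds to the IPv6 attempt in the trace. The `[111]` read by getsockopt in the first variant is ECONNREFUSED on Linux, so the caller learns about the rejection right away and can move on to IPv4; in the second variant only the timeout branch is reached.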

Comment 7 Pavel Šimerda (pavlix) 2016-05-16 16:49:57 UTC
Created attachment 1158044 [details]
a test script

Comment 8 Pavel Šimerda (pavlix) 2016-05-16 16:51:19 UTC
Created attachment 1158045 [details]
output of the script as a tarball (script included)

Comment 9 Laura Abbott 2016-09-23 19:24:18 UTC
*********** MASS BUG UPDATE **************
We apologize for the inconvenience.  There is a large number of bugs to go through and several of them have gone stale.  Due to this, we are doing a mass bug update across all of the Fedora 23 kernel bugs.
Fedora 23 has now been rebased to 4.7.4-100.fc23.  Please test this kernel update (or newer) and let us know if your issue has been resolved or if it is still present with the newer kernel.
If you have moved on to Fedora 24 or 25, and are still experiencing this issue, please change the version to Fedora 24 or 25.
If you experience different issues, please open a new bug report for those.

Comment 10 Laura Abbott 2016-10-26 16:43:33 UTC
*********** MASS BUG UPDATE **************
This bug is being closed with INSUFFICIENT_DATA as there has not been a response in 4 weeks. If you are still experiencing this issue, please reopen and attach the relevant data from the latest kernel you are running and any data that might have been requested previously.

Comment 11 Pavel Šimerda (pavlix) 2016-11-10 07:11:46 UTC
So this is apparently the very classic problem with blocked ICMP messages.