Bug 1477088 - restraint should not be re-trying IPv6 on every request
Status: CLOSED WONTFIX
Product: Restraint
Classification: Community
Component: general
Version: 0.1.30
Hardware: Unspecified Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: 0.1.31
Target Release: ---
Assigned To: beaker-dev-list
QA Contact: tools-bugs
Depends On:
Blocks:
Reported: 2017-08-01 04:26 EDT by Jan Stancek
Modified: 2017-08-03 03:57 EDT

Last Closed: 2017-08-03 03:57:58 EDT
Type: Bug


Description Jan Stancek 2017-08-01 04:26:13 EDT
Description of problem:
The test system has an IPv6 address from auto-configuration and there is a DNS AAAA record for the LC (lab controller), but the LC itself can't be reached over IPv6.

Every new request (log upload, etc.) from restraint to the LC now incurs a ~60-second delay before it falls back to IPv4:

...
[pid  5283] 04:07:47 connect(7, {sa_family=AF_INET6, sin6_port=htons(8000), inet_pton(AF_INET6, "2620:52:0:aa1:216:3eff:fe60:cd5c", &sin6_addr), sin6_flowinfo=0, sin6_scope_id=0}, 28) = -1 EINPROGRESS (Operation now in progress)
[pid  5283] 04:07:47 eventfd2(0, EFD_CLOEXEC|EFD_NONBLOCK) = 8
[pid  5283] 04:07:47 write(3, "\0\0\0\0\0\0\0\1", 8) = 8
[pid  5283] 04:07:47 write(3, "\0\0\0\0\0\0\0\1", 8) = 8
[pid  5283] 04:07:47 poll([{fd=3, events=POLLIN}, {fd=4, events=POLLIN}, {fd=5, events=POLLIN}, {fd=7, events=POLLOUT}, {fd=8, events=POLLIN}], 5, 60000) = 1 ([{fd=3, revents=POLLIN}])
[pid  5283] 04:07:47 read(3, "\0\0\0\0\0\0\0\3", 16) = 8
[pid  5283] 04:07:47 poll([{fd=3, events=POLLIN}, {fd=4, events=POLLIN}, {fd=5, events=POLLIN}, {fd=7, events=POLLOUT}, {fd=8, events=POLLIN}], 5, 60000 <unfinished ...>
...
[pid  5283] 04:08:47 <... poll resumed> ) = 0 (Timeout)
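
The pattern in the trace (connect to the AAAA address, wait for the poll timeout, then fall back) can be reproduced outside restraint with a minimal Python sketch. This is not restraint's code; the function name `connect_with_fallback`, its parameters, and the injectable `resolve`/`connect` hooks are assumptions made for illustration. It tries every resolved address in order and records how long each attempt took, which makes the per-request cost of an unreachable IPv6 address visible.

```python
import socket
import time


def connect_with_fallback(host, port, per_attempt_timeout=60.0,
                          resolve=socket.getaddrinfo,
                          connect=socket.create_connection):
    """Try each resolved address in order and return (sock, attempts).

    `attempts` is a list of (family, address, seconds_spent, error_or_None).
    Hypothetical helper for illustration only; not part of restraint.
    """
    attempts = []
    for family, _type, _proto, _canon, sockaddr in resolve(
            host, port, type=socket.SOCK_STREAM):
        started = time.monotonic()
        try:
            # sockaddr[:2] is (address, port) for both AF_INET and AF_INET6.
            # Note: this drops any IPv6 scope id, which is fine for global
            # addresses like the one in the trace.
            sock = connect(sockaddr[:2], timeout=per_attempt_timeout)
        except OSError as exc:
            # An unreachable address costs up to per_attempt_timeout seconds
            # on every call, because nothing is remembered between calls.
            attempts.append(
                (family, sockaddr[0], time.monotonic() - started, exc))
            continue
        attempts.append(
            (family, sockaddr[0], time.monotonic() - started, None))
        return sock, attempts
    raise OSError("all addresses failed: %r" % (attempts,))
```

Calling it as `connect_with_fallback("lc.example.test", 8000)` (a placeholder hostname) on an affected system would be expected to show one failed AF_INET6 attempt lasting about the timeout, followed by a successful AF_INET attempt.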

Version-Release number of selected component (if applicable):
restraint-0.1.30-1.el7_2

How reproducible:
high

Actual results:
The combined delays don't allow the /distribution/install task to complete, and the system hits the EWD (external watchdog).

Expected results:
If IPv6 can't be used, restraint stops retrying it on every request.
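
One way to read this expectation is as a short-lived memory of address families that recently failed. The following Python sketch illustrates that idea only; restraint is not implemented this way, and the class name `FamilyCooldown`, the `retry_after` parameter, and the 300-second default are all hypothetical. Failed families are moved to the end of the candidate list rather than dropped, so they remain a last resort.

```python
import socket
import time


class FamilyCooldown:
    """Remember address families that recently failed (illustrative only)."""

    def __init__(self, retry_after=300.0, clock=time.monotonic):
        self.retry_after = retry_after  # seconds before a family is retried first
        self.clock = clock              # injectable for testing
        self._failed_at = {}            # family -> time of last failure

    def mark_failed(self, family):
        """Record that a connect attempt for this family just failed."""
        self._failed_at[family] = self.clock()

    def usable(self, family):
        """True if the family has not failed within the cooldown window."""
        failed = self._failed_at.get(family)
        return failed is None or self.clock() - failed >= self.retry_after

    def order(self, addrinfos):
        """Reorder getaddrinfo()-style tuples: usable families first."""
        good = [ai for ai in addrinfos if self.usable(ai[0])]
        bad = [ai for ai in addrinfos if not self.usable(ai[0])]
        return good + bad


# Example: after IPv6 fails once, IPv4 candidates are tried first.
cooldown = FamilyCooldown()
candidates = [
    (socket.AF_INET6, socket.SOCK_STREAM, 6, "", ("2001:db8::1", 8000, 0, 0)),
    (socket.AF_INET, socket.SOCK_STREAM, 6, "", ("192.0.2.1", 8000)),
]
cooldown.mark_failed(socket.AF_INET6)
reordered = cooldown.order(candidates)
```

The state lives in one process's memory, so a design like this only helps requests issued by the same process; separate uploader processes would each have to learn about the failure on their own.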

Additional info:
Comment 2 PaulB 2017-08-02 14:56:58 EDT
Jan,
Thank you :)

Best,
-pbunyan
Comment 3 Dan Callaghan 2017-08-03 03:28:47 EDT
I think it's reasonable for restraint to try IPv6 on every request. I mean, there is an AAAA record and a route; it just doesn't work. Ideally we would treat a problem like this as a critical network failure which brings down the whole lab, and just get the network configuration fixed up. If we had the same problem on IPv4 it would be treated that way...
Comment 4 Dan Callaghan 2017-08-03 03:29:27 EDT
I guess another possible workaround is to disable IPv6 on the system in kickstart %post, until that particular lab network problem is fixed.
Comment 5 Jan Stancek 2017-08-03 03:57:58 EDT
Using beah or a kernel command line option to disable IPv6 as a workaround is also possible.

Artem also pointed out that this would be difficult to achieve in restraint, because some log/result uploads run in separate processes.

I'm closing this and we'll follow up with lab admins.
