Bug 1477088 - restraint should not be re-trying IPv6 on every request
Summary: restraint should not be re-trying IPv6 on every request
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Restraint
Classification: Retired
Component: general
Version: 0.1.30
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: 0.1.31
Assignee: beaker-dev-list
QA Contact: tools-bugs
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2017-08-01 08:26 UTC by Jan Stancek
Modified: 2017-08-03 07:57 UTC

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-08-03 07:57:58 UTC
Embargoed:



Description Jan Stancek 2017-08-01 08:26:13 UTC
Description of problem:
The test system has an IPv6 address from auto-configuration, and there is a DNS AAAA record for the LC, but the LC itself can't be reached over IPv6.

Every new request (log upload, etc.) from restraint to the LC now incurs a ~60-second delay before it falls back to IPv4:

...
[pid  5283] 04:07:47 connect(7, {sa_family=AF_INET6, sin6_port=htons(8000), inet_pton(AF_INET6, "2620:52:0:aa1:216:3eff:fe60:cd5c", &sin6_addr), sin6_flowinfo=0, sin6_scope_id=0}, 28) = -1 EINPROGRESS (Operation now in progress)
[pid  5283] 04:07:47 eventfd2(0, EFD_CLOEXEC|EFD_NONBLOCK) = 8
[pid  5283] 04:07:47 write(3, "\0\0\0\0\0\0\0\1", 8) = 8
[pid  5283] 04:07:47 write(3, "\0\0\0\0\0\0\0\1", 8) = 8
[pid  5283] 04:07:47 poll([{fd=3, events=POLLIN}, {fd=4, events=POLLIN}, {fd=5, events=POLLIN}, {fd=7, events=POLLOUT}, {fd=8, events=POLLIN}], 5, 60000) = 1 ([{fd=3, revents=POLLIN}])
[pid  5283] 04:07:47 read(3, "\0\0\0\0\0\0\0\3", 16) = 8
[pid  5283] 04:07:47 poll([{fd=3, events=POLLIN}, {fd=4, events=POLLIN}, {fd=5, events=POLLIN}, {fd=7, events=POLLOUT}, {fd=8, events=POLLIN}], 5, 60000 <unfinished ...>
...
[pid  5283] 04:08:47 <... poll resumed> ) = 0 (Timeout)

Version-Release number of selected component (if applicable):
restraint-0.1.30-1.el7_2

How reproducible:
high

Actual results:
The combined delays prevent the /distribution/install task from completing, and the system hits the EWD.

Expected results:
If IPv6 can't be used, restraint should stop retrying it on every request.
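
The expected behaviour could be sketched roughly as follows. This is a minimal illustration in Python, not restraint's actual code: the function connect_preferring_last_good, the _last_good_family cache, and the 5-second timeout are all hypothetical names and values chosen for the example. It tries each resolved address in turn and remembers which address family connected last, so later requests try that family first.

```python
import socket

# Hypothetical per-process cache: (host, port) -> address family that last worked.
_last_good_family = {}


def _default_connect(family, sockaddr, timeout):
    """Open a TCP connection to one resolved address with a bounded timeout."""
    sock = socket.socket(family, socket.SOCK_STREAM)
    sock.settimeout(timeout)
    try:
        sock.connect(sockaddr)
    except OSError:
        sock.close()
        raise
    return sock


def connect_preferring_last_good(host, port, timeout=5.0,
                                 resolve=socket.getaddrinfo,
                                 connect=_default_connect):
    """Connect to host:port, trying the family that worked last time first."""
    infos = resolve(host, port, type=socket.SOCK_STREAM)
    preferred = _last_good_family.get((host, port))
    # Stable sort: addresses of the remembered family move to the front,
    # everything else keeps the resolver's original order.
    infos = sorted(infos, key=lambda info: info[0] != preferred)
    last_error = None
    for family, _type, _proto, _canon, sockaddr in infos:
        try:
            conn = connect(family, sockaddr, timeout)
        except OSError as exc:  # includes timeouts
            last_error = exc
            continue
        _last_good_family[(host, port)] = family
        return conn
    raise last_error or OSError("no addresses found for %s" % host)
```

Note that the cache in this sketch lives in a single process, so it would not help uploads that run in other processes.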

Additional info:

Comment 2 PaulB 2017-08-02 18:56:58 UTC
Jan,
Thank you :)

Best,
-pbunyan

Comment 3 Dan Callaghan 2017-08-03 07:28:47 UTC
I think it's reasonable for restraint to try IPv6 on every request. After all, there is an AAAA record and a route; it just doesn't work. Ideally we would treat a problem like this as a critical network failure which brings down the whole lab, and just get the network configuration fixed up. If we had the same problem on IPv4 it would be treated that way...

Comment 4 Dan Callaghan 2017-08-03 07:29:27 UTC
I guess another possible workaround is to disable IPv6 on the system in kickstart %post, until that particular lab network problem is fixed.

Comment 5 Jan Stancek 2017-08-03 07:57:58 UTC
Using beah, or a kernel command line option to disable IPv6, is also a possible workaround.
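
One way to confirm that such a workaround took effect on a Linux test system is sketched below in Python. The helper name ipv6_disabled is hypothetical. The sketch assumes that the /proc/sys/net/ipv6 directory is absent when the kernel was booted with ipv6.disable=1, and that conf/all/disable_ipv6 reads "1" when IPv6 was turned off via sysctl; both assumptions should be checked against the kernel in use.

```python
import os


def ipv6_disabled(proc_root="/proc/sys/net/ipv6"):
    """Return True if IPv6 appears to be disabled on this Linux system."""
    if not os.path.isdir(proc_root):
        # Assumption: no IPv6 sysctl tree means the stack was never
        # initialised (e.g. booted with ipv6.disable=1).
        return True
    path = os.path.join(proc_root, "conf", "all", "disable_ipv6")
    try:
        with open(path) as handle:
            return handle.read().strip() == "1"
    except OSError:
        # The flag could not be read; do not claim IPv6 is disabled.
        return False
```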

Artem also pointed out that this would be difficult to achieve in restraint, because some log/result uploads run in separate processes.

I'm closing this and we'll follow up with lab admins.

