1477088 – restraint should not be re-trying IPv6 on every request

Summary: restraint should not be re-trying IPv6 on every request

Keywords:
Status:	CLOSED WONTFIX
Alias:	None
Product:	Restraint
Classification:	Retired
Component:	general
Sub Component:
Version:	0.1.30
Hardware:	Unspecified
OS:	Unspecified
Priority:	unspecified
Severity:	unspecified
Target Milestone:	0.1.31
Assignee:	beaker-dev-list
QA Contact:	tools-bugs
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2017-08-01 08:26 UTC by Jan Stancek
Modified:	2017-08-03 07:57 UTC
CC List:	9 users
Fixed In Version:
Clone Of:
Environment:
Last Closed:	2017-08-03 07:57:58 UTC
Embargoed:

Description Jan Stancek 2017-08-01 08:26:13 UTC

Description of problem:
The test system has an IPv6 address from auto-configuration and there is a DNS AAAA record for the LC (lab controller), but the LC itself can't be reached over IPv6.

Every new request (log upload, etc.) from restraint to the LC now incurs a ~60 second delay before it falls back to IPv4:

...
[pid  5283] 04:07:47 connect(7, {sa_family=AF_INET6, sin6_port=htons(8000), inet_pton(AF_INET6, "2620:52:0:aa1:216:3eff:fe60:cd5c", &sin6_addr), sin6_flowinfo=0, sin6_scope_id=0}, 28) = -1 EINPROGRESS (Operation now in progress)
[pid  5283] 04:07:47 eventfd2(0, EFD_CLOEXEC|EFD_NONBLOCK) = 8
[pid  5283] 04:07:47 write(3, "\0\0\0\0\0\0\0\1", 8) = 8
[pid  5283] 04:07:47 write(3, "\0\0\0\0\0\0\0\1", 8) = 8
[pid  5283] 04:07:47 poll([{fd=3, events=POLLIN}, {fd=4, events=POLLIN}, {fd=5, events=POLLIN}, {fd=7, events=POLLOUT}, {fd=8, events=POLLIN}], 5, 60000) = 1 ([{fd=3, revents=POLLIN}])
[pid  5283] 04:07:47 read(3, "\0\0\0\0\0\0\0\3", 16) = 8
[pid  5283] 04:07:47 poll([{fd=3, events=POLLIN}, {fd=4, events=POLLIN}, {fd=5, events=POLLIN}, {fd=7, events=POLLOUT}, {fd=8, events=POLLIN}], 5, 60000 <unfinished ...>
...
[pid  5283] 04:08:47 <... poll resumed> ) = 0 (Timeout)
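
As a rough illustration of the pattern the trace is consistent with (this is not restraint's actual code), the sketch below tries each resolved address in order with a per-attempt timeout and remembers nothing between calls. The names `real_connect` and `connect_with_fallback`, and the injectable `connect` parameter, are hypothetical and exist only for this example.

```python
import socket


def real_connect(family, sockaddr, timeout):
    """Open a TCP connection to one address, giving up after `timeout` seconds."""
    sock = socket.socket(family, socket.SOCK_STREAM)
    try:
        sock.settimeout(timeout)
        sock.connect(sockaddr)
    except OSError:
        sock.close()
        raise
    return sock


def connect_with_fallback(addresses, timeout=60.0, connect=real_connect):
    """Try each (family, sockaddr) pair in order and return the first success.

    Every unreachable address can cost up to `timeout` seconds before the
    next one is tried, and nothing is remembered between calls.
    """
    last_error = OSError("no addresses to try")
    for family, sockaddr in addresses:
        try:
            return connect(family, sockaddr, timeout)
        except OSError as exc:  # socket.timeout is an OSError subclass
            last_error = exc
    raise last_error
```

With the AAAA address listed first and a 60 second timeout, every call to a function like this would wait out the full IPv6 timeout before reaching the IPv4 address, which matches the delay seen above.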

Version-Release number of selected component (if applicable):
restraint-0.1.30-1.el7_2

How reproducible:
high

Actual results:
The combined delays prevent the /distribution/install task from completing, and the system hits the EWD (external watchdog).

Expected results:
If IPv6 can't be used, restraint stops retrying it on every request.
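
One way the expected behaviour could look, as a minimal sketch only: keep a small in-memory record of address families that recently failed and try those last until the record expires. The class `FamilyFailureCache`, its `ttl` default, and the `(family, sockaddr)` tuple layout are all assumptions made for illustration; nothing here is taken from restraint.

```python
import socket
import time


class FamilyFailureCache:
    """Remember address families that recently failed so they are tried last."""

    def __init__(self, ttl=300.0, clock=time.monotonic):
        self.ttl = ttl          # seconds a failure is remembered (assumed value)
        self.clock = clock      # injectable clock, handy for testing
        self._failed_at = {}    # family -> time of the last recorded failure

    def mark_failed(self, family):
        """Record that a connection attempt for this family just failed."""
        self._failed_at[family] = self.clock()

    def is_suspect(self, family):
        """True while a failure for this family is younger than the TTL."""
        failed_at = self._failed_at.get(family)
        return failed_at is not None and self.clock() - failed_at < self.ttl

    def order(self, addresses):
        """Return (family, sockaddr) pairs with suspect families moved last.

        sorted() is stable, so the resolver's order is otherwise preserved.
        """
        return sorted(addresses, key=lambda addr: self.is_suspect(addr[0]))
```

A caller would pass the resolver's results through `order()` before connecting and call `mark_failed()` when an attempt times out. Because the state lives in process memory, each process that uploads logs or results would keep its own copy unless the record were shared some other way.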

Additional info:

Comment 2 PaulB 2017-08-02 18:56:58 UTC

Jan,
Thank you :)

Best,
-pbunyan

Comment 3 Dan Callaghan 2017-08-03 07:28:47 UTC

I think it's reasonable for restraint to try IPv6 on every request. I mean, there is an AAAA record and a route; it just doesn't work. Ideally we would treat a problem like this as a critical network failure which brings down the whole lab, and just get the network configuration fixed up. If we had the same problem on IPv4 it would be treated that way...

Comment 4 Dan Callaghan 2017-08-03 07:29:27 UTC

I guess another possible workaround is to disable IPv6 on the system in kickstart %post, until that particular lab network problem is fixed.

Comment 5 Jan Stancek 2017-08-03 07:57:58 UTC

Using beah, or a kernel command line option to disable IPv6, is also possible as a workaround.

Artem also pointed out that this would be difficult to achieve in restraint, because some log/result uploads run in separate processes.

I'm closing this and we'll follow up with lab admins.
