Description of problem:
While running automation tests, all setupNetworks operations fail.
Error from the engine:
Reason: Bad Request
Detail: [Unexpected exception]
The open FD count reaches 1025 and then we start getting this error.
ll /proc/8344/fd | wc -l
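The same count can be taken programmatically and compared against the process limit. This is a minimal sketch, assuming a Linux host with /proc and Python 3; the function names are illustrative and are not part of VDSM. Note that if `ll` is the usual alias for `ls -l`, its output begins with a "total" line, so `wc -l` reports one more than the number of descriptors.

```python
import os
import resource


def count_open_fds(pid):
    """Return the number of entries in /proc/<pid>/fd (Linux only)."""
    return len(os.listdir(f"/proc/{pid}/fd"))


def soft_fd_limit(pid):
    """Return the soft RLIMIT_NOFILE of a process.

    Reading another process's limit needs sufficient privileges
    (same user or CAP_SYS_RESOURCE).
    """
    soft, _hard = resource.prlimit(pid, resource.RLIMIT_NOFILE)
    return soft


if __name__ == "__main__":
    pid = os.getpid()  # replace with the VDSM pid, e.g. 8344 in this report
    print(f"pid {pid}: {count_open_fds(pid)} open fds, "
          f"soft limit {soft_fd_limit(pid)}")
```

Run as root (or as the user owning the target process) when inspecting a pid other than your own.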
Version-Release number of selected component (if applicable):
Steps to Reproduce:
1. Run the host_network_api tests a few times, or run network tier1.
Created attachment 1066477
Created attachment 1066479
vdsm logs - host is host_mixed_1 - 10.35.128.28
I've looked at the VDSM logs, and VDSM runs out of its allowed 1024 file descriptors.
Following the open FDs during several runs of the tests, VDSM constantly leaks FDs at a relatively steady pace while the tests are active. Furthermore, the leak is limited to a single type: VDSM is leaking TCP sockets.
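One way to see which descriptor type is growing is to resolve each entry in /proc/<pid>/fd and group by target. This is a rough sketch, assuming Linux and Python 3; `classify_fds` is an illustrative name and not a VDSM function. It only separates sockets from other descriptors; telling TCP from other socket types would additionally need /proc/net/tcp or a tool such as lsof.

```python
import os
from collections import Counter


def classify_fds(pid):
    """Count open descriptors of a process by coarse type (Linux only)."""
    kinds = Counter()
    fd_dir = f"/proc/{pid}/fd"
    for name in os.listdir(fd_dir):
        try:
            target = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            # The descriptor was closed between listdir() and readlink().
            continue
        if target.startswith("socket:"):
            kinds["socket"] += 1
        elif target.startswith("pipe:"):
            kinds["pipe"] += 1
        elif target.startswith("anon_inode:"):
            kinds["anon_inode"] += 1
        else:
            kinds["file"] += 1
    return kinds


if __name__ == "__main__":
    # Sample the current process; pass the VDSM pid to watch the leak.
    print(dict(classify_fds(os.getpid())))
```

Sampling this periodically during a test run should show the "socket" count climbing while the other types stay flat, if the leak is as described.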
I tried to trace its syscalls and came across multiple accept(2) calls whose descriptors were never closed during the whole syscall trace (1~2 minutes). I'd suggest continuing the investigation there.
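For illustration only, the general pattern that avoids this kind of leak is to tie each accepted socket's lifetime to a scope, so it is closed even when the handler raises. This is a generic sketch and not VDSM's actual code; `handle_one` and `handler` are hypothetical names.

```python
import socket


def handle_one(listener, handler):
    """Accept one connection, run handler on it, and always close it."""
    conn, _addr = listener.accept()
    with conn:  # socket is closed on exit, including on exceptions
        handler(conn)
    return conn  # returned already closed; conn.fileno() is -1
```

A server loop that calls `accept()` and drops the connection object on an error path without closing it keeps the descriptor open until garbage collection, or indefinitely if a reference survives.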
It seems that it still happens randomly. We need to determine the steps to reproduce the issue again. It is related to the setupNetworks BZ #1262051.
Please provide the steps to reproduce.
Marked as a GA blocker for now, since there are no clear repro steps and the frequency seems to be down. Not a beta1 blocker.
I have access to the env so working on it now.
This isn't a regression. Removing regression flag.
Cloned also to 3.5.Z.
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.
For information on the advisory, and where to find the updated
files, follow the link below.
If the solution does not work for you, open a new bug report.