Bug 1608568 - line coverage tests: bug-1432542-mpx-restart-crash.t times out consistently
Summary: line coverage tests: bug-1432542-mpx-restart-crash.t times out consistently
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: GlusterFS
Classification: Community
Component: tests
Version: mainline
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: bugs@gluster.org
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks: 1608564
Reported: 2018-07-25 20:14 UTC by Shyamsundar
Modified: 2018-10-23 15:15 UTC (History)

Fixed In Version: glusterfs-5.0
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 1608564
Environment:
Last Closed: 2018-10-23 15:15:30 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:


Description Shyamsundar 2018-07-25 20:14:39 UTC
+++ This bug was initially created as a clone of Bug #1608564 +++

The nightly line coverage tests have been failing consistently for the past few weeks. The failures are as follows:

2 test(s) failed 
./tests/basic/sdfs-sanity.t
./tests/bugs/core/bug-1432542-mpx-restart-crash.t

1 test(s) generated core 
./tests/basic/sdfs-sanity.t

a) ./tests/bugs/core/bug-1432542-mpx-restart-crash.t

This test is timing out. My thought is to increase the timeout for this test, as the line coverage tests seem to take more time (assuming lcov instrumentation slows things down).

For example, the times taken by the following tests in centos7 regression builds are:
./tests/bugs/index/bug-1559004-EMLINK-handling.t  -  896 seconds
./tests/bugs/core/bug-1432542-mpx-restart-crash.t  -  309 seconds
./tests/basic/afr/lk-quorum.t  -  225 seconds

On lcov tests these take:
./tests/bugs/index/bug-1559004-EMLINK-handling.t  -  1063 seconds
./tests/bugs/core/bug-1432542-mpx-restart-crash.t  -  400 seconds (timeout)
./tests/basic/afr/lk-quorum.t  -  267 seconds

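As can be seen, an lcov run adds roughly 20 to 30 seconds for every 100 seconds of a normal run (about 19 seconds for the two tests that completed, and at least 29 seconds for the mpx test, which hit the timeout).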
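The overhead estimate can be checked directly from the timings listed above. The following is a minimal sketch; the dictionary layout and the helper name overhead_per_100s are illustrative choices, not part of the Gluster test framework. Note that the mpx figure is a lower bound, since that run was cut off at the timeout.

```python
# Timings copied from the report: (centos7 seconds, lcov seconds).
# The mpx test was killed at 400 s, so its lcov figure is only a lower bound.
timings = {
    "bug-1559004-EMLINK-handling.t": (896, 1063),
    "bug-1432542-mpx-restart-crash.t": (309, 400),
    "lk-quorum.t": (225, 267),
}


def overhead_per_100s(normal, lcov):
    """Extra seconds an lcov run adds per 100 seconds of a normal run."""
    return (lcov - normal) * 100.0 / normal


# Hypothetical helper output: one overhead figure per test.
overheads = {
    name: overhead_per_100s(normal, lcov)
    for name, (normal, lcov) in timings.items()
}

for name, extra in overheads.items():
    print(f"{name}: +{extra:.1f} s per 100 s of normal run")
```

With these inputs the two completed tests come out at about 18.6 and 18.7 extra seconds per 100 seconds, and the mpx test at 29.4 or more.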

We need to reproduce this locally and check whether increasing the timeout for the mpx test resolves (a).
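As a rough aid for picking a new value, the sketch below projects a timeout from the normal runtime. The function name suggest_timeout, the 30% overhead, and the 1.5x headroom factor are all assumptions made for illustration; the report does not state which value was eventually chosen.

```python
import math


def suggest_timeout(normal_seconds, overhead_fraction, headroom):
    """Project a timeout: normal runtime, scaled by the assumed
    instrumentation overhead, then multiplied by a safety headroom."""
    projected = normal_seconds * (1 + overhead_fraction)
    return math.ceil(projected * headroom)


# 309 s is the centos7 runtime of the mpx test from the report.
# 0.30 and 1.5 are assumed values, not figures from the report.
new_timeout = suggest_timeout(309, 0.30, 1.5)
print(new_timeout)
```

Even without headroom, 309 seconds scaled by 30% already exceeds the 400 second limit, which is consistent with the test timing out under lcov.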

Comment 1 Worker Ant 2018-08-06 16:58:15 UTC
REVIEW: https://review.gluster.org/20568 (tests: Increase timeout for mpx restart crash test) posted (#3) for review on master by Shyamsundar Ranganathan

Comment 2 Worker Ant 2018-08-09 03:17:07 UTC
COMMIT: https://review.gluster.org/20568 committed in master by "Atin Mukherjee" <amukherj> with a commit message- tests: Increase timeout for mpx restart crash test

In lcov based regression testing environments, all tests take
more time than what occurs in centos7 regressions. Possibly
due to code instrumentation for lcov purposes.

Due to this, the test bug-1432542-mpx-restart-crash.t constantly
times out. This patch increases its timeout to enable
lcov tests to pass on a more regular basis.

It was also noted by Nithya that the test at times triggered an
OOM kill on the regression machines. In order to reduce the runtime
memory footprint of the test, FUSE mounts are unmounted as
soon as the required test is complete.

Fixes: bz#1608568

Change-Id: I37f8d4b45807a69c52c7c7df4923c0fc33fab4e4
Signed-off-by: ShyamsundarR <srangana>

Comment 3 Shyamsundar 2018-10-23 15:15:30 UTC
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-5.0, please open a new bug report.

glusterfs-5.0 has been announced on the Gluster mailing lists [1]. Packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] https://lists.gluster.org/pipermail/announce/2018-October/000115.html
[2] https://www.gluster.org/pipermail/gluster-users/
