Bug 1321606 - accPolicy stress tests crashing the server
Summary: accPolicy stress tests crashing the server
Keywords:
Status: CLOSED WORKSFORME
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: 389-ds-base
Version: 7.2
Hardware: x86_64
OS: Linux
Priority: urgent
Severity: urgent
Target Milestone: rc
Target Release: ---
Assignee: Noriko Hosoi
QA Contact: Viktor Ashirov
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2016-03-28 15:21 UTC by Sankar Ramalingam
Modified: 2016-04-11 16:06 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-04-11 16:06:48 UTC
Target Upstream Version:


Attachments
Email received from abrt crash report for accPolicy stress tests (63.35 KB, text/plain)
2016-03-28 15:21 UTC, Sankar Ramalingam

Description Sankar Ramalingam 2016-03-28 15:21:27 UTC
Created attachment 1140893 [details]
Email received from abrt crash report for accPolicy stress tests

Description of problem: Account policy stress tests cause the server to hang and slapd to crash.


Version-Release number of selected component (if applicable): 389-ds-base-1.3.4.0-29


How reproducible: Consistently on RHEL7.2


Steps to Reproduce:
1. Run accPolicy stress tests by cloning a beaker job.
Eg: https://beaker.engineering.redhat.com/jobs/1274745
2. Server crashes.
3. If you don't observe a crash, try running it manually on the same machine.
    Modify engage.cfg file to remove cleanup tests and Uninstall test suite.
4. I have yet to determine which test is crashing the server. I will update this bug with more information.

Actual results: Server hangs and slapd crashes.


Expected results: No server crash.


Additional info: It fails on RHEL7.x (observed on RHEL7.2). It is not reproducible on RHEL6.x.
Also, there was a crash from "/usr/lib/systemd/systemd-logind".

Attaching the crash e-mail.

Comment 7 Sankar Ramalingam 2016-03-29 18:17:14 UTC
The system isn't accessible to me either. I suspect the server crashes might have been the reason. I rebooted the server from the Beaker UI.

Comment 8 Sankar Ramalingam 2016-03-29 18:29:23 UTC
The machine is now accessible. Also, I would like to add more information about the test case which is causing these failures...

I suspect that test case accPolicy_21 is what causes the server to hang/crash. It does the following:
1. Set accountInactivityLimit: 31536600 # approximately one year, in seconds
2. Set the system date 1 year ahead.
3. Check whether the account is inactivated.
4. Then, change the date back to the original using the ntpd service.
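
For reference, the arithmetic behind steps 1 to 3 can be sketched as follows. This is only an illustration, not the test suite's code and not the plugin's implementation: the `is_inactive` helper, the "elapsed time greater than the limit" rule, and the last login timestamp are all assumptions made for the example.

```python
from datetime import datetime, timedelta, timezone

SECONDS_PER_DAY = 24 * 60 * 60
ACCOUNT_INACTIVITY_LIMIT = 31536600  # seconds, value quoted in step 1


def is_inactive(last_login, now, limit_seconds=ACCOUNT_INACTIVITY_LIMIT):
    """Assumed rule: inactive once more than limit_seconds have elapsed."""
    return (now - last_login).total_seconds() > limit_seconds


# How far the configured limit exceeds a 365-day year, in seconds.
excess_over_365_days = ACCOUNT_INACTIVITY_LIMIT - 365 * SECONDS_PER_DAY

# Hypothetical last login time, chosen only for illustration.
last_login = datetime(2016, 3, 28, 15, 21, 27, tzinfo=timezone.utc)
clock_plus_365_days = last_login + timedelta(days=365)
clock_past_limit = last_login + timedelta(seconds=ACCOUNT_INACTIVITY_LIMIT + 1)

print(excess_over_365_days)                           # 600
print(is_inactive(last_login, clock_plus_365_days))   # False
print(is_inactive(last_login, clock_past_limit))      # True
```

Under that assumed rule, a clock moved exactly 365 days forward is still 600 seconds short of the configured limit, so the outcome of step 3 depends on how far ahead the date is actually set.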

Patch - https://code.engineering.redhat.com/gerrit/70858

With patch set 1, which excludes the accPolicy_21 test case, the execution completed without any problems.
Bkr job - https://beaker.engineering.redhat.com/jobs/1281583

With patch set 2, which adds accPolicy_21 back to the execution, the run hangs.
Bkr job - https://beaker.engineering.redhat.com/jobs/1282568

Comment 9 Noriko Hosoi 2016-03-29 19:19:45 UTC
Thanks, Sankar.

I could log in to ibm-x3650m4-02-vm-04.lab.eng.bos.redhat.com.

Did you observe a crash or a hang on this host/test env?

If so, could you please tell me where I can find it?

The error log /var/log/dirsrv/slapd-deftestinst/errors looks clean and I don't see any ns-slapd related logs in /var/log/messages.

No core files are found in /var/log/dirsrv/slapd-deftestinst.

I also checked /var/*/abrt, but I don't see anything there...

Would it be possible to leave a core on the system?
http://www.port389.org/docs/389ds/FAQ/faq.html#debugging-crashes
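
As a rough aid for the checks above, here is a minimal sketch that reports whether the core file size limit permits dumps and lists core-like files in a directory. The helper names are hypothetical, and the limit it reads is that of the Python process running the script, not of ns-slapd itself, so treat the output only as a hint; the FAQ linked above remains the reference for enabling core files.

```python
import glob
import os
import resource


def core_limit_allows_dumps(soft_limit):
    """A soft RLIMIT_CORE of 0 suppresses core files; other values permit them."""
    return soft_limit != 0


def find_core_files(directory):
    """List regular files directly inside directory whose names start with 'core'."""
    pattern = os.path.join(directory, "core*")
    return sorted(p for p in glob.glob(pattern) if os.path.isfile(p))


if __name__ == "__main__":
    # Limit of the current process only; ns-slapd may run with a different one.
    soft, _hard = resource.getrlimit(resource.RLIMIT_CORE)
    print("core dumps permitted for this process:", core_limit_allows_dumps(soft))
    # Directory taken from this comment; an absent directory yields an empty list.
    print("core files:", find_core_files("/var/log/dirsrv/slapd-deftestinst"))
```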

If it is a hang, how could you tell?

Thanks.

Comment 10 Sankar Ramalingam 2016-04-11 08:59:35 UTC
Hi Noriko, sorry for the late response. I made a few more attempts over the last two weeks, but couldn't reproduce the crash. I guess it's specific to the test environment/machine. So, this can be closed as not reproducible.

Comment 11 Noriko Hosoi 2016-04-11 16:06:48 UTC
(In reply to Sankar Ramalingam from comment #10)
> Hi Noriko, sorry for the late response. I made a few more attempts over the
> last two weeks, but couldn't reproduce the crash. I guess it's specific to
> the test environment/machine. So, this can be closed as not reproducible.

Thank you for retesting this case, Sankar!

Closing this bug...

