Bug 2231559
| Summary: | CRIT - connection_table_move_connection_out_of_active_list - conn 0 is already OUT of the active list (refcnt is 0) | ||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Product: | Red Hat Enterprise Linux 9 | Reporter: | Viktor Ashirov <vashirov> | ||||||||||
| Component: | 389-ds-base | Assignee: | LDAP Maintainers <idm-ds-dev-bugs> | ||||||||||
| Status: | NEW --- | QA Contact: | LDAP QA Team <idm-ds-qe-bugs> | ||||||||||
| Severity: | unspecified | Docs Contact: | |||||||||||
| Priority: | medium | ||||||||||||
| Version: | 9.3 | CC: | idm-ds-dev-bugs, jachapma, tbordaz | ||||||||||
| Target Milestone: | rc | Keywords: | Triaged | ||||||||||
| Target Release: | --- | ||||||||||||
| Hardware: | Unspecified | ||||||||||||
| OS: | Unspecified | ||||||||||||
| Whiteboard: | |||||||||||||
| Fixed In Version: | Doc Type: | If docs needed, set a value | |||||||||||
| Doc Text: | Story Points: | --- | |||||||||||
| Clone Of: | Environment: | ||||||||||||
| Last Closed: | Type: | Bug | |||||||||||
| Regression: | --- | Mount Type: | --- | ||||||||||
| Documentation: | --- | CRM: | |||||||||||
| Verified Versions: | Category: | --- | |||||||||||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||||||||
| Cloudforms Team: | --- | Target Upstream Version: | |||||||||||
| Embargoed: | |||||||||||||
| Attachments: |
|
||||||||||||
|
Description
Viktor Ashirov
2023-08-11 21:33:04 UTC
Created attachment 1983126 [details]
pstack after the test was stopped
Created attachment 1983127 [details]
pstack of unresponsive server
I left the server running after the test was stopped. After some time I found it unresponsive and consuming 100% CPU.
Hi Viktor, I would be very interested to know the size of the connection sub tables on the system you used for this issue. cat /var/log/dirsrv/[YOUR INSTANCE]/errors | grep connection_table_new eg [15/Aug/2023:06:32:52.962909140 -0400] - INFO - connection_table_new - Number of connection sub-tables 4, each containing 15984 slots. Thanks Hi Jamie, I have the same number: [11/Aug/2023:17:08:23.654787326 -0400] - INFO - connection_table_new - Number of connection sub-tables 4, each containing 15984 slots. Thanks Looking at 'pstack of the unresponsive server', it is looking like the test of the turbo mode may contribute. For example all workers may entered in turbo mode and entering in infinite loop to evaluate its rank. Suprisingly all connection should be close so active list should be empty. Thread 27 (Thread 0x7f9b309ff640 (LWP 24356) "ns-slapd"): #0 table_iterate_function (arg=<synthetic pointer>, conn=<optimized out>) at ldap/servers/slapd/connection.c:1410 #1 connection_table_iterate_active_connections (f=<optimized out>, arg=<synthetic pointer>, ct=0x7f9bb9c32870) at ldap/servers/slapd/conntable.c:343 #2 connection_find_our_rank (conn=<optimized out>, conn=0x7f9b3b90ed70, our_rank=<synthetic pointer>, connection_count=<synthetic pointer>) at ldap/servers/slapd/connection.c:1425 #3 connection_enter_leave_turbo (new_turbo_flag=<synthetic pointer>, current_turbo_flag=0, conn=0x7f9b3b90ed70) at ldap/servers/slapd/connection.c:1458 #4 connection_threadmain (arg=<optimized out>) at ldap/servers/slapd/connection.c:1661 #5 0x00007f9bbede9c34 in _pt_root () at target:/lib64/libnspr4.so #6 0x00007f9bbe69f822 in start_thread () at target:/lib64/libc.so.6 #7 0x00007f9bbe63f450 in clone3 () at target:/lib64/libc.so.6 If reproduced, samples of top -H may say if only one worker is active or all of them (more likely). Could you try to reproduce with turbo mode disabled ? Created attachment 1983591 [details]
pstack with turbo mode disabled
With turbo mode disabled the server still hangs.
I'll make another build with https://github.com/389ds/389-ds-base/pull/5893 to see if it makes any difference. Created attachment 1983610 [details]
pstack with turbo mode removed
Test still hands with turbo mode completely removed.
|