Bug 821176
| Summary: | ns-slapd segfault in libreplication-plugin after IPA upgrade from 2.1.3 to 2.2.0 |
|---|---|
| Product: | Red Hat Enterprise Linux 6 |
| Component: | 389-ds-base |
| Version: | 6.3 |
| Status: | CLOSED ERRATA |
| Severity: | high |
| Priority: | high |
| Reporter: | Scott Poore <spoore> |
| Assignee: | Rich Megginson <rmeggins> |
| QA Contact: | IDM QE LIST <seceng-idm-qe-list> |
| CC: | jgalipea, mkosek, nhosoi, nkinder, rmeggins |
| Target Milestone: | rc |
| Hardware: | Unspecified |
| OS: | Unspecified |
| Fixed In Version: | 389-ds-base-1.2.10.2-12.el6 |
| Doc Type: | Bug Fix |
| Doc Text: | This bug was introduced by the fix for Bug 819643, "Database RUV could mismatch the one in changelog under the stress", which is in the same errata. |
| Type: | Bug |
| Last Closed: | 2012-06-20 07:15:36 UTC |
Description
Scott Poore
2012-05-12 19:13:51 UTC
Are you getting a core dump? Can you get a backtrace of the crashed 389-ds instance?

Created attachment 584409 [details]
Stack trace

I ran into a similar problem. This patch is supposed to fix the bug: the current llist code fails to update list->tail at the right place. I'm going to rebuild 389-ds-base with this patch and the others in 1.2.10.2-12 once our reliability test passes.
```diff
diff --git a/ldap/servers/plugins/replication/llist.c b/ldap/servers/plugins/rep
index e80f532..05cfa48 100644
--- a/ldap/servers/plugins/replication/llist.c
+++ b/ldap/servers/plugins/replication/llist.c
@@ -165,14 +165,14 @@ void* llistRemoveCurrentAndGetNext (LList *list, void **it
     if (node)
     {
         prevNode->next = node->next;
+        if (list->tail == node) {
+            list->tail = prevNode;
+        }
         _llistDestroyNode (&node, NULL);
         node = prevNode->next;
         if (node) {
             return node->data;
         } else {
-            if (list->head->next == NULL) {
-                list->tail = NULL;
-            }
             return NULL;
         }
     }
```
```
Thread 1 (Thread 0x7fc84e1fc700 (LWP 18031)):
#0 0x00007fc86b0e8276 in csnplInsert (csnpl=0x7fc838008090, csn=0x7fc8280014a0) at ldap/servers/plugins/replication/csnpl.c:155
rc = <value optimized out>
csnplnode = 0x30000
csn_str = "\000\000\000\000\000\000\000\000\246A\020k\310\177\000\000p\344\t\001"
#1 0x00007fc86b1051ac in ruv_add_csn_inprogress (ruv=0x147f5f0, csn=0x7fc8280014a0) at ldap/servers/plugins/replication/repl5_ruv.c:1438
replica = 0x7fc8380044e0
csn_str = "\024\000\000\000\000\000\000\000\300\067C\001\000\000\000\000p\203M\001"
rc = 0
#2 0x00007fc86b0fa08c in process_operation (pb=<value optimized out>, csn=0x7fc8280014a0) at ldap/servers/plugins/replication/repl5_plugins.c:1316
r_obj = 0x1442c30
r = <value optimized out>
ruv_obj = 0x11e8120
ruv = <value optimized out>
rc = <value optimized out>
#3 0x00007fc86b0fa683 in multimaster_preop_modify (pb=0x14d8370) at ldap/servers/plugins/replication/repl5_plugins.c:452
csn = 0x7fc8280014a0
target_uuid = 0x7fc828000e40 "d30b348a-9c4c11e1-b596ca1b-778d212c"
drc = <value optimized out>
ctrlp = 0x7fc828002b90
sessionid = "conn=19 op=9", '\000' <repeats 12 times>, " j\0
[...]
```
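For reference, a backtrace like the one above is typically captured from the core file with gdb. A sketch of the procedure, assuming an el6 389-ds install (the binary path and core file location are examples; adjust to your system):

```shell
# let ns-slapd dump core, then reproduce the crash
ulimit -c unlimited

# install matching debug symbols so frames resolve to source lines
debuginfo-install 389-ds-base

# open the core against the ns-slapd binary and dump every thread
gdb /usr/sbin/ns-slapd /path/to/core
(gdb) thread apply all bt full
```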
Upstream ticket: https://fedorahosted.org/389/ticket/359

Verified.
Version ::
ipa-server-2.2.0-14.el6.x86_64
389-ds-base-1.2.10.2-12.el6.x86_64
Automated Test Results ::
Automation not yet run from Beaker; this was executed manually.
On MASTER:
```
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
:: [ LOG ] :: upgrade_bz_821176: ns-slapd segfault in libreplication-plugin after IPA upgrade from 2.1.3 to 2.2.0
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
:: [15:34:05] :: Machine in recipe is MASTER
:: [15:34:06] :: Restarting IPA services
Restarting Directory Service
Shutting down dirsrv:
PKI-IPA... [ OK ]
TESTRELM-COM... [ OK ]
Starting dirsrv:
PKI-IPA... [ OK ]
TESTRELM-COM... [ OK ]
Restarting KDC Service
Stopping Kerberos 5 KDC: [ OK ]
Starting Kerberos 5 KDC: [ OK ]
Restarting KPASSWD Service
Stopping Kerberos 5 Admin Server: [ OK ]
Starting Kerberos 5 Admin Server: [ OK ]
Restarting DNS Service
Stopping named: . [ OK ]
Starting named: [ OK ]
Restarting MEMCACHE Service
Stopping ipa_memcached: [ OK ]
Starting ipa_memcached: [ OK ]
Restarting HTTP Service
Stopping httpd: [ OK ]
Starting httpd: [Thu May 17 15:34:23 2012] [warn] worker ajp://localhost:9447/ already used by another worker
[Thu May 17 15:34:23 2012] [warn] worker ajp://localhost:9447/ already used by another worker
[ OK ]
Restarting CA Service
Stopping pki-ca: [ OK ]
Starting pki-ca: [ OK ]
:: [ PASS ] :: Running 'ipactl restart'
result_server not set, assuming developer mode.
Setting 192.168.122.101 to state upgrade_bz_821176.18.1
:: [ PASS ] :: Running 'rhts-sync-set -s 'upgrade_bz_821176.18.1' -m 192.168.122.101'
result_server not set, assuming developer mode.
Enter STATE:STATE:etc. when the following machines
['192.168.122.102']
are in one of these states: ['upgrade_bz_821176.18.2']
:: [ PASS ] :: Running 'rhts-sync-block -s 'upgrade_bz_821176.18.2' 192.168.122.102'
:: [15:36:17] :: Checking /var/log/messages for ns-slapd segfault
:: [ PASS ] :: BZ 821176 not found. No ns-slapd segfault found in /var/log/messages
:: [15:36:17] :: Checking /var/log/dirsrv/slapd-TESTRELM-COM/errors for LDAP error
:: [ PASS ] :: BZ 821176 not found...didn't find LDAP error in dirsrv log
result_server not set, assuming developer mode.
Setting 192.168.122.101 to state upgrade_bz_821176.18.3
:: [ PASS ] :: Running 'rhts-sync-set -s 'upgrade_bz_821176.18.3' -m 192.168.122.101'
```
On REPLICA:
```
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
:: [ LOG ] :: upgrade_bz_821176: ns-slapd segfault in libreplication-plugin after IPA upgrade from 2.1.3 to 2.2.0
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
:: [15:34:14] :: Machine in recipe is SLAVE
result_server not set, assuming developer mode.
Enter STATE:STATE:etc. when the following machines
['192.168.122.101']
are in one of these states: ['upgrade_bz_821176.18.1']
:: [ PASS ] :: Running 'rhts-sync-block -s 'upgrade_bz_821176.18.1' 192.168.122.101'
:: [15:36:04] :: Running ipa-replica-manage force-sync to make sure that works
ipa: INFO: Setting agreement cn=meTospoore-dvm2.testrelm.com,cn=replica,cn=dc\3Dtestrelm\2Cdc\3Dcom,cn=mapping tree,cn=config schedule to 2358-2359 0 to force synch
ipa: INFO: Deleting schedule 2358-2359 0 from agreement cn=meTospoore-dvm2.testrelm.com,cn=replica,cn=dc\3Dtestrelm\2Cdc\3Dcom,cn=mapping tree,cn=config
:: [ PASS ] :: Running 'ipa-replica-manage force-sync --from=spoore-dvm1.testrelm.com --password=Secret123'
result_server not set, assuming developer mode.
Setting 192.168.122.102 to state upgrade_bz_821176.18.2
:: [ PASS ] :: Running 'rhts-sync-set -s 'upgrade_bz_821176.18.2' -m 192.168.122.102'
result_server not set, assuming developer mode.
Enter STATE:STATE:etc. when the following machines
['192.168.122.101']
are in one of these states: ['upgrade_bz_821176.18.3']
:: [ PASS ] :: Running 'rhts-sync-block -s 'upgrade_bz_821176.18.3' 192.168.122.101'
```
Manual Test Results ::
```
# grep -i segfault /var/log/messages
#
# grep -i "NSMMReplicationPlugin.*Warning: unable to send endReplication extended operation.*Can't contact LDAP server" /var/log/dirsrv/slapd-TESTRELM-COM/errors
#
```
Technical note added. If any revisions are required, please edit the "Technical Notes" field
accordingly. All revisions will be proofread by the Engineering Content Services team.
New Contents:
This bug was introduced by the fix for Bug 819643 - "Database RUV could mismatch the one in changelog under the stress" which is in the same errata.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHSA-2012-0813.html