Bug 1791300
Summary: | sporadic sssd_be crash on s390x | ||
---|---|---|---|
Product: | Red Hat Enterprise Linux 8 | Reporter: | Jan Stancek <jstancek> |
Component: | sssd | Assignee: | Pavel Březina <pbrezina> |
Status: | CLOSED ERRATA | QA Contact: | Steeve Goveas <sgoveas> |
Severity: | unspecified | Docs Contact: | |
Priority: | unspecified | ||
Version: | 8.2 | CC: | atikhono, dlavu, ed.dickson, grajaiya, jhrozek, lslebodn, mzidek, pbrezina, sgoveas, thuth, tscherf |
Target Milestone: | rc | Keywords: | Triaged |
Target Release: | 8.2 | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | sync-to-jira | ||
Fixed In Version: | sssd-2.4.0-1.el8 | Doc Type: | If docs needed, set a value |
Last Closed: | 2021-05-18 15:03:54 UTC | Type: | Bug |
Bug Depends On: | 1881992 | ||
Bug Blocks: |
Description
Jan Stancek
2020-01-15 13:30:06 UTC
#0  0x000002aa08d969c2 in dp_client_register (mem_ctx=<optimized out>, sbus_req=<optimized out>, provider=0x2aa1583a700, name=0x2aa1589dc60 "nss") at src/providers/data_provider/dp_client.c:107
#1  0x000003ffb508cdaa in _sbus_sss_invoke_in_s_out__step (ev=<optimized out>, te=<optimized out>, tv=<error reading variable: value has been optimized out>, private_data=<optimized out>) at src/sss_iface/sbus_sss_invokers.c:682
#2  0x000003ffb4f0cee8 in tevent_common_invoke_timer_handler (te=te@entry=0x2aa1586e020, current_time=..., removed=removed@entry=0x0) at ../../tevent_timed.c:370

(gdb) frame 0
#0  0x000002aa08d969c2 in dp_client_register (mem_ctx=<optimized out>, sbus_req=<optimized out>, provider=0x2aa1583a700, name=0x2aa1589dc60 "nss") at src/providers/data_provider/dp_client.c:107
107         dp_cli->name = talloc_strdup(dp_cli, name);
(gdb) p name
$2 = 0x2aa1589dc60 "nss"
(gdb) p dp_cli
$3 = (struct dp_client *) 0x0

```
    cli_conn = sbus_server_find_connection(dp_sbus_server(provider),
                                           sbus_req->sender->name);
    if (cli_conn == NULL) {
        DEBUG(SSSDBG_CRIT_FAILURE, "Unknown client: %s\n",
              sbus_req->sender->name);
        return ENOENT;
    }

    dp_cli = sbus_connection_get_data(cli_conn, struct dp_client);

    dp_cli->name = talloc_strdup(dp_cli, name);
```

(gdb) p cli_conn
$5 = <optimized out>

So `cli_conn != NULL`, but either `cli_conn->data == NULL` or the type of `cli_conn->data` is not `struct dp_client`.

(gdb) p *provider
$9 = {uid = 0, gid = 0, be_ctx = 0x2aa15839060, ev = 0x2aa15820860, sbus_server = 0x2aa15852520, sbus_conn = 0x2aa158551f0, clients = {0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}, terminating = false, requests = {index = 0, num_active = 0, active = 0x0}, modules = 0x2aa1586aff0, targets = 0x2aa1586b0b0}
(gdb) p *provider->sbus_server
$11 = {ev = 0x2aa15820860, server = 0x2aa1583dcf0, symlink = 0x2aa1583d7f0 "/var/lib/sss/pipes/private/sbus-dp_implicit_files", watch_ctx = 0x2aa15838070, router = 0x2aa15851e80, data_slot = 0, last_activity = 0x0, names = 0x2aa15838d30, match_rules = 0x2aa15852610, max_connections = 1000, uid = 0, gid = 0, on_connection = 0x2aa1583dae0, disconnecting = false, name = {major = 1, minor = 2}}

(gdb) p sizeof(struct talloc_chunk)
$15 = 88
(gdb) p *((struct talloc_chunk *)((char *)provider->sbus_server - 96))
$17 = {flags = 3404304376, next = 0x0, prev = 0x2aa15855190, parent = 0x0, child = 0x2aa15861560, refs = 0x0, destructor = 0x3ffb5029d20 <sbus_server_destructor>, name = 0x3ffb5034f08 "struct sbus_server", size = 112, limit = 0x0, pool = 0x2aa15852280}
(gdb) p *((struct talloc_chunk *)((char *)provider->sbus_server->names - 96))
$18 = {flags = 3404304368, next = 0x2aa1583da80, prev = 0x2aa158525b0, parent = 0x0, child = 0x2aa158980e0, refs = 0x0, destructor = 0x0, name = 0x3ffb5d6e436 "src/util/util.c:374", size = 168, limit = 0x0, pool = 0x0}

=> pointers are valid (not freed)

(gdb) frame 1
(gdb) p state
$20 = <optimized out>
(gdb) p *((struct _sbus_sss_invoke_in_s_out__state*)req->data)->sbus_req->sender
$26 = {name = 0x2aa1587b3c0 "sssd.nss", uid = 0}

cli_conn = sbus_server_find_connection = sss_ptr_hash_lookup(provider->sbus_server->names, sbus_req->sender->name == "sssd.nss")

(gdb) p *provider->sbus_server->names
$14 = {p = 0, maxp = 4, entry_count = 4, bucket_count = 4, segment_count = 1, min_load_factor = 1, max_load_factor = 5, directory_size = 4, directory_size_shift = 2, segment_size = 4, segment_size_shift = 2, delete_callback = 0x3ffb5d59ef8 <sss_ptr_hash_delete_cb>, delete_pvt = 0x2aa1583e210, halloc = 0x3ffb5d490b0 <hash_talloc>, hfree = 0x3ffb5d490a0 <hash_talloc_free>, halloc_pvt = 0x2aa1583d200, directory = 0x2aa1583f610, statistics = {hash_accesses = 14, hash_collisions = 2, table_expansions = 0, table_contractions = 0}}

(gdb) p *provider->sbus_server->names->directory[0][0]
$45 = {entry = {key = {type = HASH_KEY_STRING, {str = 0x2aa15869aa0 "sssd.domain_implicit_5ffiles", c_str = 0x2aa15869aa0 "sssd.domain_implicit_5ffiles", ul = 2929528838816}}, value = {type = HASH_VALUE_PTR, {ptr = 0x2aa15869880, ..}}}, next = 0x2aa15869720}

-- the key of this entry, "sssd.domain_implicit_5ffiles", looks strange

(gdb) p *provider->sbus_server->names->directory[0][0]->next
$57 = {entry = {key = {type = HASH_KEY_STRING, {str = 0x2aa1586a5a0 ":1.2", c_str = 0x2aa1586a5a0 ":1.2", ul = 2929528841632}}, value = {type = HASH_VALUE_PTR, {ptr = 0x2aa15866f60, ...}}}, next = 0x0}
(gdb) p *provider->sbus_server->names->directory[0][1]
$46 = {entry = {key = {type = HASH_KEY_STRING, {str = 0x2aa15867760 ":1.1", c_str = 0x2aa15867760 ":1.1", ul = 2929528829792}}, value = {type = HASH_VALUE_PTR, {ptr = 0x2aa15867560, ...}}}, next = 0x0}
(gdb) p *provider->sbus_server->names->directory[0][3]
$47 = {entry = {key = {type = HASH_KEY_STRING, {str = 0x2aa158764a0 "sssd.nss", c_str = 0x2aa158764a0 "sssd.nss", ul = 2929528890528}}, value = {type = HASH_VALUE_PTR, {ptr = 0x2aa15898140, ...}}}, next = 0x0}

(gdb) p *((struct sss_ptr_hash_value *)provider->sbus_server->names->directory[0][3]->entry->value->ptr)
$55 = {spy = 0x2aa15872a60, ptr = 0x2aa158615c0}
(gdb) p *((struct talloc_chunk *)((char *)((struct sss_ptr_hash_value *)provider->sbus_server->names->directory[0][3]->entry->value->ptr)->ptr - 96))
$60 = {flags = 3404304368, next = 0x2aa15859e40, prev = 0x0, parent = 0x2aa158524c0, child = 0x2aa158e7eb0, refs = 0x0, destructor = 0x3ffb50147c8 <sbus_connection_destructor>, name = 0x3ffb502d0ca "struct sbus_connection", size = 128, limit = 0x0, pool = 0x0}

=> pointer is valid (not freed) and has the proper type `struct sbus_connection`

(gdb) p *((struct sbus_connection *)((struct sss_ptr_hash_value *)provider->sbus_server->names->directory[0][3]->entry->value->ptr)->ptr)
$63 = {ev = 0x2aa15820860, connection = 0x2aa15862930, type = SBUS_CONNECTION_CLIENT, address = 0x0, wellknown_name = 0x2aa1586f2f0 "sssd.nss", unique_name = 0x2aa1586a4c0 ":1.2", disconnecting = false, access = 0x2aa158654a0, destructor = 0x2aa15865520, requests = 0x2aa15863240, reconnect = 0x2aa15863710, router = 0x2aa158637a0, watch = 0x2aa158614f0, data = 0x0, senders = 0x2aa15861070, last_activity = 0x0}

=> dp_client *dp_cli == sbus_connection *cli_conn->data == NULL

I think this is a duplicate of https://bugzilla.redhat.com/show_bug.cgi?id=1775766#c14

Other relevant tickets:
* bz 1768670
* bz 1770467
* bz 1684824

Hi,

In RHEL-8.3 on s390x we hit a problem with sssd_be while provisioning a VM. The VM crashed with the following error in the dmesg log:

User process fault: interruption code 003b ilc:3 in sssd_be[2aa21b80000+37000]
Failing address: 0000000000000000 TEID: 0000000000000400
Fault in primary space mode while using user ASCE.
AS:00000003e4e001c7 R3:0000000000000024

Can you provide the status of this bug? Thanks

Upstream PR: https://github.com/SSSD/sssd/pull/5299

Pushed PR: https://github.com/SSSD/sssd/pull/5299
* `master`
    * 4a84f8e18ea5604ac7e69849dee492718fd96296 - dp: fix potential race condition in provider's sbus server

Pushed PR: https://github.com/SSSD/sssd/pull/5344
* `master`
    * 7fbcaa8feeb968711ff52f51705c45062fd81394 - be: remove accidental sleep

*** Bug 1895794 has been marked as a duplicate of this bug. ***

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (sssd bug fix and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:1666
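
For readers following the analysis in the description: the crash comes down to a name lookup that succeeds while the connection's private `data` pointer is still NULL. The sketch below reproduces that state in isolation and shows a defensive guard. It is only an illustration under stated assumptions: every `demo_*` name is hypothetical, the structs are heavily simplified stand-ins for `struct sbus_connection` and `struct dp_client`, and the guard is not the change merged in PR 5299, which is described there as fixing a race condition in the provider's sbus server.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for struct dp_client. */
struct demo_client {
    const char *name;
};

/* Hypothetical stand-in for struct sbus_connection. */
struct demo_connection {
    const char *wellknown_name;  /* e.g. "sssd.nss" */
    void *data;                  /* private data; NULL until something attaches it */
};

/* Find a connection by name. This plays the role that
 * sbus_server_find_connection() has in the report; it does not
 * reproduce the real hash-table lookup. */
static struct demo_connection *
demo_find_connection(struct demo_connection *conns, size_t count,
                     const char *name)
{
    assert(name != NULL);

    for (size_t i = 0; i < count; i++) {
        if (conns[i].wellknown_name != NULL
                && strcmp(conns[i].wellknown_name, name) == 0) {
            return &conns[i];
        }
    }

    return NULL;
}

/* Register a client name on the sender's connection.
 * Returns ENOENT for an unknown sender, EAGAIN when the connection
 * exists but has no private data yet (the state seen in the core dump),
 * and 0 on success. */
static int
demo_client_register(struct demo_connection *conns, size_t count,
                     const char *sender, const char *name)
{
    struct demo_connection *conn = demo_find_connection(conns, count, sender);
    if (conn == NULL) {
        return ENOENT;
    }

    struct demo_client *cli = conn->data;
    if (cli == NULL) {
        /* Without this check the next line would dereference NULL. */
        return EAGAIN;
    }

    cli->name = name;
    return 0;
}
```

A guard of this kind only turns the segfault into an error return. It does not explain why the data was missing when the registration request arrived, which is what the upstream fix targets.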