Bug 1660939
Summary: | coredump on unlock after applying updates | ||
---|---|---|---|
Product: | [Fedora] Fedora | Reporter: | Patrick C. F. Ernzer <pcfe> |
Component: | sssd | Assignee: | Michal Zidek <mzidek> |
Status: | CLOSED WORKSFORME | QA Contact: | Fedora Extras Quality Assurance <extras-qa> |
Severity: | unspecified | Docs Contact: | |
Priority: | unspecified | ||
Version: | 29 | CC: | abokovoy, jhrozek, lslebodn, mzidek, pbrezina, pcfe, rharwood, sbose, ssorce |
Target Milestone: | --- | ||
Target Release: | --- | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | If docs needed, set a value | |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2019-07-08 15:45:13 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: |
Description
Patrick C. F. Ernzer
2018-12-19 16:31:42 UTC
gdb tells me the following:

```
Core was generated by `/usr/libexec/sssd/sssd_be --domain default --uid 0 --gid 0 --logger=files'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x0000560b1625e1a1 in dp_client_register (mem_ctx=<optimized out>, sbus_req=<optimized out>,
    provider=0x560b1762e230, name=0x560b176a0530 "autofs")
    at src/providers/data_provider/dp_client.c:107
107         dp_cli->name = talloc_strdup(dp_cli, name);
(gdb) list
102             return ENOENT;
103         }
104
105         dp_cli = sbus_connection_get_data(cli_conn, struct dp_client);
106
107         dp_cli->name = talloc_strdup(dp_cli, name);
108         if (dp_cli->name == NULL) {
109             talloc_free(dp_cli);
110             return ENOMEM;
111         }
(gdb) p dp_cli
$1 = (struct dp_client *) 0x0
```

So cli_conn is not NULL, but cli_conn->data is NULL. I'm not sure whether this is an expected state and just a NULL check is missing, or whether it is unexpected and more investigation is needed into how we got into this state. Pavel knows the SBus code best, so I set Needinfo for him.

In the update to sssd-2.0.0-5.fc29.x86_64 only the SBus timeout was changed, from (IIRC) 25s, the DBus default, to 120s. I wonder if there is maybe some dependent timeout which has to be increased as well?

No, it is not expected.
I thought this was a race condition in the initialization code, where we set the on-connection function after the server is already created (dp_client_init creates the dp_cli):

```c
static void dp_init_done(struct tevent_req *subreq)
{
    struct dp_init_state *state;
    struct tevent_req *req;
    errno_t ret;

    req = tevent_req_callback_data(subreq, struct tevent_req);
    state = tevent_req_data(req, struct dp_init_state);

    ret = sbus_server_create_and_connect_recv(state->provider, subreq,
                                              &state->provider->sbus_server,
                                              &state->provider->sbus_conn);
    talloc_zfree(subreq);
    if (ret != EOK) {
        tevent_req_error(req, ret);
        return;
    }

    sbus_server_set_on_connection(state->provider->sbus_server,
                                  dp_client_init, state->provider);
```

However, responders are started well past this point, in sss_monitor_service_init, which is called after dp_init_done (in dp_initialized). I even tried this with setting some delay before dp_init_done is called, but it only proved that responders will not start that soon. Unfortunately the sssd logs are empty, so they do not tell us anything. I suppose this was a one-time event and it is not reproducible, right? I'm afraid we can't do much without logs (ideally level 0x3ff0).

Yes, I have not seen the problem since the initial occurrence. Since there was nothing useful in the logs I provided and I do not have a reproducer for you, I'll close this now.