Bug 1697202
| Summary: | fail to stop daemon-server | ||
|---|---|---|---|
| Product: | [Community] LVM and device-mapper | Reporter: | dukaitian <dukaitian> |
| Component: | lvm2 | Assignee: | Zdenek Kabelac <zkabelac> |
| lvm2 sub component: | lvmetad | QA Contact: | cluster-qe <cluster-qe> |
| Status: | POST --- | Docs Contact: | |
| Severity: | urgent | ||
| Priority: | high | CC: | agk, heinzm, jbrassow, msnitzer, prajnoha, zkabelac |
| Version: | 2.02.178 | Flags: | pm-rhel: lvm-technical-solution? pm-rhel: lvm-test-coverage? |
| Target Milestone: | --- | ||
| Target Release: | --- | ||
| Hardware: | x86_64 | ||
| OS: | Linux | ||
| Whiteboard: | |||
| Fixed In Version: | 2.02.187 | Doc Type: | If docs needed, set a value |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | Type: | Bug | |
This has been fixed upstream with stable-2.02 branch commit: 61358d92cbf202dbb483d63a63d5adf0463bb934
https://www.redhat.com/archives/lvm-devel/2019-September/msg00106.html
Description of problem:
Found a bug in the function daemon_start() in libdaemon/server/daemon-server.c.
On the stable-2.02 branch, run the lvmetad daemon without the -t option. If the daemon receives SIGTERM from systemd while it is still serving a client (a _client_thread is running), _shutdown_requested is set to 1 but s.threads->next is not NULL, so the while loop does not break. On the next iteration select() is called with no timeout and blocks. Sometimes this causes machine shutdown to time out.

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1. Run process 1:
    while :
    do
        lvs
    done
2. Run process 2:
    while :
    do
        service lvm2-lvmetad stop
        if [ $? -ne 0 ];then
            echo "fail to stop"
            break
        fi
        sleep 2
    done
3. Soon process 2 blocks. Press CTRL+C to stop process 1; process 2 is still blocked.
Attaching gdb to the lvm2-lvmetad daemon shows it blocked in the select() call in daemon_start(), even though at this point _shutdown_requested is 1 and s.threads->next is NULL (the _client_thread thread has already quit).

Actual results:

Expected results:
When process 1 has stopped, process 2 should go on to stop lvmetad successfully.

Additional info:
Suggested patch:

diff --git a/libdaemon/server/daemon-server.c b/libdaemon/server/daemon-server.c
index a2216ac..c753473 100644
--- a/libdaemon/server/daemon-server.c
+++ b/libdaemon/server/daemon-server.c
@@ -559,6 +559,8 @@ void daemon_start(daemon_state s)
 	thread_state _threads = { .next = NULL };
 	unsigned timeout_count = 0;
 	fd_set in;
+	struct timeval slect_timeout = { .tv_sec = 1, .tv_usec = 0 };
+	struct timeval *ptimeout;
 
 	/*
 	 * Switch to C locale to avoid reading large locale-archive file used by
@@ -643,9 +645,15 @@ void daemon_start(daemon_state s)
 
 	while (!failed) {
 		_reset_timeout(s);
+		slect_timeout.tv_sec = 1;
+		slect_timeout.tv_usec = 0;
 		FD_ZERO(&in);
 		FD_SET(s.socket_fd, &in);
-		if (select(FD_SETSIZE, &in, NULL, NULL, _get_timeout(s)) < 0 && errno != EINTR)
+		ptimeout = _get_timeout(s);
+		if (_shutdown_requested && !ptimeout)
+			ptimeout = &slect_timeout;
+
+		if (select(FD_SETSIZE, &in, NULL, NULL, ptimeout) < 0 && errno != EINTR)
 			perror("select error");
 		if (FD_ISSET(s.socket_fd, &in)) {
 			timeout_count = 0;