Bug 1697202 - fail to stop daemon-server
Summary: fail to stop daemon-server
Keywords:
Status: POST
Alias: None
Product: LVM and device-mapper
Classification: Community
Component: lvm2
Version: 2.02.178
Hardware: x86_64
OS: Linux
Priority: high
Severity: urgent
Target Milestone: ---
Target Release: ---
Assignee: Zdenek Kabelac
QA Contact: cluster-qe
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2019-04-08 04:34 UTC by dukaitian
Modified: 2023-08-10 15:40 UTC

Fixed In Version: 2.02.187
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:
Embargoed:
pm-rhel: lvm-technical-solution?
pm-rhel: lvm-test-coverage?



Description dukaitian 2019-04-08 04:34:54 UTC
Description of problem:

Found a bug in the daemon_start function in libdaemon/server/daemon-server.c.
In the stable-2.02 branch, the problem occurs when the lvmetad daemon runs without the -t option and receives SIGTERM from systemd while it is still busy with a sub process. _shutdown_requested is set to 1, but s.threads->next is not NULL, so the while loop does not break. On the next iteration, select is called without a timeout and blocks. This sometimes causes machine shutdown to time out.
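
The condition described above can be modeled in a few lines of C. This is only a simplified sketch, not code from lvm2: the struct loop_state type and both helper functions are hypothetical names introduced for illustration, and the exit test is an assumption inferred from the description (shutdown requested and no client thread left).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/time.h>

/* Hypothetical, simplified view of what the main loop looks at. */
struct loop_state {
    bool shutdown_requested;       /* models _shutdown_requested */
    bool client_threads_alive;     /* models s.threads->next != NULL */
    const struct timeval *timeout; /* NULL models running without -t */
};

/* Assumed exit test: shutdown was requested and no client thread remains. */
static bool loop_may_exit(const struct loop_state *st)
{
    assert(st != NULL);
    return st->shutdown_requested && !st->client_threads_alive;
}

/*
 * If the loop cannot exit yet and no timeout is set, the next select()
 * call has nothing to wake it up except a new connection or a signal.
 */
static bool next_select_is_unbounded(const struct loop_state *st)
{
    return !loop_may_exit(st) && st->timeout == NULL;
}
```

In this model, the reported hang corresponds to shutdown_requested = true, client_threads_alive = true and timeout = NULL: the loop goes around once more and enters an unbounded select, and nothing re-checks the exit test when the client thread later quits.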

Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1. Run process 1, for example:
while :
do
    lvs
done
2. Run process 2, for example:
while : 
do 
    service lvm2-lvmetad stop
    if [ $? -ne 0 ];then 
        echo "fail to stop"
        break
    fi
    sleep 2
done
3. Soon, process 2 blocks. Press "CTRL+C" to stop process 1; process 2 remains blocked. Attaching gdb to the lvm2-lvmetad daemon shows that it is blocked in the select call in daemon_start, even though at this point _shutdown_requested is 1 and s.threads->next is NULL (the _client_thread thread has already quit).

Actual results:


Expected results:
When process 1 has stopped, process 2 should continue and stop lvmetad successfully.

Additional info:
Suggested patch:
diff --git a/libdaemon/server/daemon-server.c b/libdaemon/server/daemon-server.c
index a2216ac..c753473 100644
--- a/libdaemon/server/daemon-server.c
+++ b/libdaemon/server/daemon-server.c
@@ -559,6 +559,8 @@ void daemon_start(daemon_state s)
        thread_state _threads = { .next = NULL };
        unsigned timeout_count = 0;
        fd_set in;
+        struct timeval slect_timeout = { .tv_sec = 1, .tv_usec = 0 };
+        struct timeval *ptimeout;
 
        /*
         * Switch to C locale to avoid reading large locale-archive file used by
@@ -643,9 +645,15 @@ void daemon_start(daemon_state s)
 
        while (!failed) {
                _reset_timeout(s);
+                slect_timeout.tv_sec = 1;
+                slect_timeout.tv_usec = 0;
                FD_ZERO(&in);
                FD_SET(s.socket_fd, &in);
-                if (select(FD_SETSIZE, &in, NULL, NULL, _get_timeout(s)) < 0 && errno != EINTR)
+                ptimeout = _get_timeout(s);
+                if (_shutdown_requested && !ptimeout)
+                        ptimeout = &slect_timeout;
+
+                if (select(FD_SETSIZE, &in, NULL, NULL, ptimeout) < 0 && errno != EINTR)
                        perror("select error");
                if (FD_ISSET(s.socket_fd, &in)) {
                        timeout_count = 0;
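
The idea behind the suggested patch can be tried outside of lvm2 with a small standalone sketch. Everything here is hypothetical and for illustration only: choose_timeout, wait_readable, run_shutdown_loop and demo_idle_shutdown are not lvm2 functions, a pipe stands in for the daemon socket, the client threads are modeled by a simple counter, and the fallback timeout is shortened from 1 second to a few milliseconds so the demo finishes quickly.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/*
 * Hypothetical helper: `configured` models the value returned by
 * _get_timeout(s) and is NULL when no idle timeout is configured.
 */
static struct timeval *choose_timeout(bool shutdown_requested,
                                      struct timeval *configured,
                                      struct timeval *fallback)
{
    if (shutdown_requested && !configured)
        return fallback; /* never wait unbounded once shutdown is pending */
    return configured;
}

/* Returns 1 if fd is readable, 0 on timeout or EINTR, -1 on other errors. */
static int wait_readable(int fd, struct timeval *timeout)
{
    fd_set in;
    int r;

    assert(fd >= 0 && fd < FD_SETSIZE);
    FD_ZERO(&in);
    FD_SET(fd, &in);
    r = select(fd + 1, &in, NULL, NULL, timeout);
    if (r < 0)
        return errno == EINTR ? 0 : -1;
    return (r > 0 && FD_ISSET(fd, &in)) ? 1 : 0;
}

/*
 * Simplified loop with shutdown already requested. The last client thread
 * is modeled as going away after `polls_until_threads_gone` wake-ups.
 * Returns the number of select() calls made before the loop exits.
 */
static int run_shutdown_loop(int fd, int polls_until_threads_gone, long poll_usec)
{
    const bool shutdown_requested = true;
    int polls = 0;

    for (;;) {
        /* Re-initialized on every pass: select() may modify the timeval. */
        struct timeval fallback = { .tv_sec = 0, .tv_usec = poll_usec };
        struct timeval *t = choose_timeout(shutdown_requested, NULL, &fallback);

        if (wait_readable(fd, t) < 0)
            return -1;
        polls++;
        if (shutdown_requested && polls >= polls_until_threads_gone)
            break;
    }
    return polls;
}

/* Runs the loop on an idle pipe; nothing is ever written to it. */
static int demo_idle_shutdown(void)
{
    int fds[2];
    int polls;

    if (pipe(fds) < 0)
        return -1;
    polls = run_shutdown_loop(fds[0], 3, 10000);
    close(fds[0]);
    close(fds[1]);
    return polls;
}
```

Because the wait is bounded, the loop re-checks its exit condition on every wake-up even though the descriptor stays idle. With a NULL timeout the first select call in this demo would never return.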

Comment 1 Zdenek Kabelac 2020-10-06 13:51:22 UTC
This has been fixed upstream in the stable-2.02 branch by commit 61358d92cbf202dbb483d63a63d5adf0463bb934

https://www.redhat.com/archives/lvm-devel/2019-September/msg00106.html

