Bug 1096269 - lost signals when sending lots of signals using --sig-proxy to docker
Summary: lost signals when sending lots of signals using --sig-proxy to docker
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: docker
Version: 7.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: rc
Target Release: ---
Assignee: Matthew Heon
QA Contact: Virtualization Bugs
URL:
Whiteboard:
Depends On: 1087697 1087700
Blocks: 1109938
 
Reported: 2014-05-09 14:32 UTC by Lukas Doktor
Modified: 2019-03-06 00:58 UTC (History)

Fixed In Version: docker-1.0
Doc Type: Bug Fix
Doc Text:
Clone Of: 1087700
Environment:
autotest-docker:docker_cli/kill
Last Closed: 2014-09-18 20:45:24 UTC
Target Upstream Version:


Links
System: Red Hat Product Errata
ID: RHBA-2014:1266
Private: 0
Priority: normal
Status: SHIPPED_LIVE
Summary: docker bug fix and enhancement update
Last Updated: 2014-09-19 00:45:12 UTC

Description Lukas Doktor 2014-05-09 14:32:47 UTC
+++ This bug was initially created as a clone of Bug #1087700 +++

Description of problem:
When I send lots of signals to a running docker client started with --sig-proxy (actual kill signals, not `docker kill`), most of them get lost.

Version-Release number of selected component (if applicable):
docker-0.10.0-8.el7.x86_64
docker-io-0.9.1-1.fc21.x86_64


How reproducible:
always

Steps to Reproduce:
1. /usr/bin/docker -D run --tty=false --rm -i --name test_eoly localhost:5000/ldoktor/fedora:latest bash -c 'for NUM in `seq 1 64`; do trap "echo Received $NUM, ignoring..." $NUM; done; while :; do sleep 1; done'
2. ps ax | grep docker   (note the PID of the `docker run` process; it is used as $PID in step 3)
3. for AAA in `seq 1 32`; do [ $AAA -ne 9 ] && [ $AAA -ne 20 ] && [ $AAA -ne 19 ] && kill -s $AAA $PID; done

Actual results:
Output of the docker is:
Received 1, ignoring...
Received 2, ignoring...


Expected results:
Messages for all of the `Received $NUM, ignoring...` printed (order doesn't matter)

Additional info:
Skipping 9 (SIGKILL), 19 (SIGSTOP) and 20 (SIGTSTP), as they are a bit too special.
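
To compare a run against the expected results, the output can be checked mechanically. The following is a minimal Go sketch, not part of the original test: the function name missingSignals is made up for illustration, and the expected set is assumed to be exactly what step 3 sends (1..32 without 9, 19 and 20). Order of the lines does not matter, and debug lines are ignored.

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

// missingSignals (hypothetical helper) scans container output for
// "Received N, ignoring..." lines and returns, in ascending order, the
// signal numbers sent in step 3 (1..32 without 9, 19 and 20) that never
// appeared in the output.
func missingSignals(output string) []int {
	seen := make(map[int]bool)
	sc := bufio.NewScanner(strings.NewReader(output))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if !strings.HasPrefix(line, "Received ") || !strings.HasSuffix(line, ", ignoring...") {
			continue // debug output or anything else
		}
		num := strings.TrimSuffix(strings.TrimPrefix(line, "Received "), ", ignoring...")
		if n, err := strconv.Atoi(num); err == nil {
			seen[n] = true
		}
	}

	var missing []int
	for n := 1; n <= 32; n++ {
		if n == 9 || n == 19 || n == 20 {
			continue // not sent by the reproducer
		}
		if !seen[n] {
			missing = append(missing, n)
		}
	}
	return missing
}

func main() {
	// The "Actual results" from this report.
	actual := "Received 1, ignoring...\nReceived 2, ignoring...\n"
	missing := missingSignals(actual)
	fmt.Printf("%d signals missing: %v\n", len(missing), missing)
}
```

For the actual results above, this reports 27 of the 29 sent signals as missing.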

--- Additional comment from Lukas Doktor on 2014-05-05 04:10:09 EDT ---

The same results with upstream docker dc9c28f/0.10.0:

Output:
Received 1, ignoring...
[debug] stdcopy.go:111 framesize: 24
Received 2, ignoring...

Daemon output:
2014/05/05 10:08:45 POST /v1.10/containers/b01a849cb45ebe94c3a61fa021a5464186345d5b159faee4ea9d5da39fb36de5/kill?signal=HUP
[/home/medic/Work/Projekty/Docker/root|fa3816b6] +job kill(b01a849cb45ebe94c3a61fa021a5464186345d5b159faee4ea9d5da39fb36de5, HUP)
[/home/medic/Work/Projekty/Docker/root|fa3816b6] -job kill(b01a849cb45ebe94c3a61fa021a5464186345d5b159faee4ea9d5da39fb36de5, HUP) = OK (0)
2014/05/05 10:08:45 POST /v1.10/containers/b01a849cb45ebe94c3a61fa021a5464186345d5b159faee4ea9d5da39fb36de5/kill?signal=INT
[/home/medic/Work/Projekty/Docker/root|fa3816b6] +job kill(b01a849cb45ebe94c3a61fa021a5464186345d5b159faee4ea9d5da39fb36de5, INT)
[/home/medic/Work/Projekty/Docker/root|fa3816b6] -job kill(b01a849cb45ebe94c3a61fa021a5464186345d5b159faee4ea9d5da39fb36de5, INT) = OK (0)

Comment 3 Daniel Walsh 2014-05-19 20:01:52 UTC
I added a 1-second sleep before sending each signal and got:

# /usr/bin/docker run --sig-proxy --rm --tty=false -i fedora bash -c 'for NUM in `seq 1 64`; do trap "echo Received $NUM, ignoring..." $NUM; done; while :; do sleep 1; done'
Received 1, ignoring...
Received 2, ignoring...
Received 3, ignoring...
Received 4, ignoring...
Received 5, ignoring...
Received 6, ignoring...
Received 7, ignoring...
Received 8, ignoring...
Received 10, ignoring...
Received 11, ignoring...
Received 12, ignoring...
Received 13, ignoring...
Received 14, ignoring...
Received 15, ignoring...
Received 16, ignoring...
Received 21, ignoring...
Received 22, ignoring...
Received 23, ignoring...
Received 24, ignoring...
Received 25, ignoring...
Received 26, ignoring...
Received 28, ignoring...
Received 29, ignoring...
Received 30, ignoring...
Received 31, ignoring...

for AAA in `seq 1 32`; do [ $AAA -ne 9 ] && [ $AAA -ne 20 ] && [ $AAA -ne 19 ] && sleep 1 && echo $AAA && kill -s $AAA 2041; done

Comment 4 Daniel Walsh 2014-05-19 20:03:41 UTC
These are missing

#define	SIGCHLD		17	/* Child status has changed (POSIX).  */
#define	SIGCONT		18	/* Continue (POSIX).  */

Comment 5 Daniel Walsh 2014-05-19 20:05:46 UTC
#define	SIGPROF		27	/* Profiling alarm clock (4.2 BSD).  */
Also missing.

Comment 6 Daniel Walsh 2014-05-19 20:12:27 UTC
bash -c 'for NUM in `seq 1 64`; do trap "echo Received $NUM, ignoring..." $NUM; done; while :; do sleep 1; done'
Received 1, ignoring...
Received 2, ignoring...
Received 3, ignoring...
Received 4, ignoring...
Received 5, ignoring...
Received 6, ignoring...
Received 7, ignoring...
Received 8, ignoring...
Received 10, ignoring...
Received 11, ignoring...
Received 12, ignoring...
Received 13, ignoring...
Received 14, ignoring...
Received 15, ignoring...
Received 16, ignoring...
Received 18, ignoring...
Received 21, ignoring...
Received 22, ignoring...
Received 23, ignoring...
Received 24, ignoring...
Received 25, ignoring...
Received 26, ignoring...
Received 27, ignoring...
Received 28, ignoring...
Received 29, ignoring...
Received 30, ignoring...
Received 31, ignoring...
Unknown signal 32

Running the test directly against bash shows only 17 missing.

I have no idea why 18 and 27 don't show up with docker.

Comment 7 Matthew Heon 2014-06-18 15:21:54 UTC
I've identified the root cause of this. Docker uses a buffer to store incoming signals before sending them to the container (https://github.com/dotcloud/docker/blob/master/api/client/commands.go#L538). In current versions of Docker this buffer has size 1, so multiple signals arriving near-simultaneously will overwrite one another. I've submitted a pull request to increase the size of the buffer (https://github.com/dotcloud/docker/pull/6508).

Comment 8 Daniel Walsh 2014-06-23 12:39:11 UTC
Is this fixed in docker-1.0 for RHEL7?

Comment 9 Matthew Heon 2014-06-23 12:40:05 UTC
No, patch is not in docker-1.0

Comment 10 Daniel Walsh 2014-06-23 19:26:11 UTC
OK, let's get it in.

Comment 11 Matthew Heon 2014-06-25 14:12:51 UTC
Patch is in our builds of docker-1.0

Comment 12 Lukas Doktor 2014-07-18 07:59:31 UTC
I'm sorry to report that the problem persists on docker-1.0.0-10.el7.x86_64:

output without additional sleep:
[debug] stdcopy.go:111 framesize: 48
Received 1, ignoring...
Received 8, ignoring...

output with sleep 0.1:
[debug] stdcopy.go:111 framesize: 24
Received 1, ignoring...
[debug] stdcopy.go:111 framesize: 218
Received 3, ignoring...
Received 4, ignoring...
Received 5, ignoring...
Received 6, ignoring...
Received 7, ignoring...
Received 8, ignoring...
Received 10, ignoring...
Received 11, ignoring...
Received 2, ignoring...
[debug] stdcopy.go:111 framesize: 125
Received 12, ignoring...
Received 13, ignoring...
Received 14, ignoring...
Received 15, ignoring...
Received 16, ignoring...
[debug] stdcopy.go:111 framesize: 225
Received 21, ignoring...
Received 22, ignoring...
Received 23, ignoring...
Received 24, ignoring...
Received 25, ignoring...
Received 26, ignoring...
Received 28, ignoring...
Received 29, ignoring...
Received 30, ignoring...
[debug] stdcopy.go:111 framesize: 25
Received 31, ignoring...

The received signal numbers differ between runs, but without the additional sleep the number of received signals is always between 1 and 4.

Comment 14 Lukas Doktor 2014-07-21 07:22:53 UTC
OK, today I tried docker-1.1.1-1.el7.x86_64 and it works perfectly. Thanks.

Comment 16 errata-xmlrpc 2014-09-18 20:45:24 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2014-1266.html

