Bug 1331003

Summary: docker load/import: write unix /var/run/docker.sock: broken pipe
Product: Red Hat Enterprise Linux 7 Reporter: Ed Santiago <santiago>
Component: docker-latestAssignee: Antonio Murdaca <amurdaca>
Status: CLOSED ERRATA QA Contact: atomic-bugs <atomic-bugs>
Severity: high Docs Contact:
Priority: unspecified    
Version: 7.2CC: ajia, amurdaca, dwalsh, jhonce, lsm5, lsu, mpatel, vbatts
Target Milestone: rc   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2016-06-23 16:19:59 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Ed Santiago 2016-04-27 12:38:00 UTC
docker load and docker import are failing:

    # docker-latest load -i /tmp/foo.docker
    An error occurred trying to connect: Post http:///var/run/docker.sock/v1.22/images/load: write unix /var/run/docker.sock: broken pipe

    # docker import /tmp/foo.docker-export simple_simple_eu1w_dfwo
    An error occurred trying to connect: Post http:///var/run/docker.sock/v1.22/images/create?fromSrc=-&message=&repo=simple_simple_eu1w_dfwo&tag=: write unix /var/run/docker.sock: broken pipe

Noticed in docker-latest-1.10.3-15 but might also be in -14 (sorry, my test logs didn't capture that). Still present in -17.

    # rpm -qa|grep docker|sort
    docker-1.9.1-37.el7.x86_64
    docker-common-1.9.1-37.el7.x86_64
    docker-forward-journald-1.9.1-37.el7.x86_64
    docker-latest-1.10.3-17.el7.x86_64
    docker-rhel-push-plugin-1.10.3-17.el7.x86_64
    docker-selinux-1.9.1-37.el7.x86_64
    docker-utils-1.9.1-28.el7.x86_64

It's not stdin-related as I first thought. Doesn't seem to be SELinux (setenforce 0 has no effect). My apologies: I know this isn't much to go on.

Comment 2 Ed Santiago 2016-04-28 12:53:35 UTC
Issue still present in docker-latest-1.10.3-19.el7.x86_64

Comment 3 Daniel Walsh 2016-05-03 15:30:19 UTC
Works on docker-1.9?

Any thing in the journal?

journalctl -b -u docker-latest

Comment 4 Antonio Murdaca 2016-05-03 15:36:12 UTC
I can see a panic in the journald logs. Investigating.

Comment 5 Ed Santiago 2016-05-03 15:49:14 UTC
docker-1.9: works fine. I've never seen this failure in any docker-1.9 build (up to and including today's docker-1.9.1-39.el7)

journal: good call. I'm seeing a panic too. Will upload on request, but it sounds like this is reproducible. Thanks.

Comment 6 Antonio Murdaca 2016-05-03 15:53:12 UTC
Interestingly I can only get this reproduced in rhel with:

[root@rhel01 ~]# docker load --input busybox.tar
Error response from daemon: Empty archive
[root@rhel01 ~]# docker load --input busybox.tar
An error occurred trying to connect: Post http:///var/run/docker.sock/v1.21/images/load: write unix /var/run/docker.sock: broken pipe
[root@rhel01 ~]# docker load --input busybox.tar
Error response from daemon: Empty archive
[root@rhel01 ~]# docker load --input busybox.tar
An error occurred trying to connect: Post http:///var/run/docker.sock/v1.21/images/load: write unix /var/run/docker.sock: broken pipe
[root@rhel01 ~]# docker load --input busybox.tar
An error occurred trying to connect: Post http:///var/run/docker.sock/v1.21/images/load: write unix /var/run/docker.sock: broken pipe
[root@rhel01 ~]# docker load --input busybox.tar
An error occurred trying to connect: Post http:///var/run/docker.sock/v1.21/images/load: write unix /var/run/docker.sock: broken pipe
[root@rhel01 ~]# docker load --input busybox.tar
An error occurred trying to connect: Post http:///var/run/docker.sock/v1.21/images/load: write unix /var/run/docker.sock: broken pipe
[root@rhel01 ~]# docker load --input busybox.tar
Error response from daemon: Empty archive
[root@rhel01 ~]#


and the journald panic also

Comment 7 Ed Santiago 2016-05-03 15:57:00 UTC
> Interestingly I can only get this reproduced in rhel with [running commands twice]

Interestingly, same here with docker-latest-1.10.3-21.el7. That's why it took me so long to reply! When I filed the bz, with -17, I remember it being 100% deterministic. I.e. I ran many repetitions and variations of docker load/import and never saw the "Empty archive" message.

Comment 8 Antonio Murdaca 2016-05-03 15:59:17 UTC
Ed, you're saying that with docker-latest-1.10.3-17.el7.x86_64 you didn't see the "Empty archive" error?

Comment 9 Antonio Murdaca 2016-05-03 15:59:46 UTC
asking because I'm testing exactly with docker-latest-1.10.3-17.el7.x86_64 and I can see Empty archive

Comment 10 Ed Santiago 2016-05-03 16:03:26 UTC
I'm saying I have no recollection of seeing that in my command-line invocations, which struck me as unusual. But reviewing journald I see many such log messages.

Comment 11 Antonio Murdaca 2016-05-03 16:07:47 UTC
Something odd is going on. The error seems legit but seems like is an issue in the save|export functions.

```
[root@rhel01 ~]# docker save -o busybox.tar busybox
[ 2053.085181] XFS (dm-1): Mounting V4 Filesystem
[ 2053.095722] XFS (dm-1): Ending clean mount
[ 2053.137433] XFS (dm-1): Unmounting Filesystem
[ 2053.190315] XFS (dm-1): Mounting V4 Filesystem
[ 2053.194086] XFS (dm-1): Ending clean mount
[ 2053.218520] XFS (dm-1): Unmounting Filesystem
[root@rhel01 ~]# tar xf busybox.tar 
tar: manifest.json: implausibly old time stamp 1969-12-31 19:00:00
tar: repositories: implausibly old time stamp 1969-12-31 19:00:00
```

Comment 12 Antonio Murdaca 2016-05-03 16:09:43 UTC
adding --warning=no-timestamp made the untar successful but in fedora the exported|saved tar(s) do not need the extra arg in the tar command

Comment 13 Daniel Walsh 2016-05-03 17:35:09 UTC
Yes.

Comment 14 Antonio Murdaca 2016-05-04 09:31:16 UTC
This is not caused by any of our patches (git bisecting doesn't give us anything) - there must be something in the way we share the socket maybe.

Comment 15 Antonio Murdaca 2016-05-04 09:51:38 UTC
Found it. There's something broken in the authorization plugin framework. I'll see because probably we need to backport some patches which didn't make 1.10.x.

Comment 16 Antonio Murdaca 2016-05-04 10:32:29 UTC
This bug is also present in Fedora (F23 docker in updates-testing, and F24). Lokesh should we clone this bug?

However, as said, this was an issue in the authz framework which docker uses and we see this happening in RHEL docker-latest because we enable by default the rhel-push-plugin plugin.

However again, the fix for RHEL is at https://github.com/projectatomic/docker/pull/153
(Fix for fedora https://github.com/projectatomic/docker/pull/152)

After we merge it we will rebuild docker-latest (Lokesh) - in the meantime I'll rebuild docker in F23/F24 also after the PR is merged for the Fedora branch.

To test this out:

Failing before my patch:

$ systemctl start docker-latest
$ docker-latest pull docker.io/busybox
$ docker-latest save -o busybox.tar docker.io/busybox
$ docker load --input busybox.tar
[ watch it failing ]

$ docker run -d docker.io/busybox true
[ container ID ]
$ docker export [ container ID ] > image.tar
$ docker import image.tar
[ watch it failing ]

After my patch the above failures are gone.

Comment 17 Alex Jia 2016-05-05 04:06:26 UTC
The patches haven't been shipped into docker-latest-1.10.3-22.el7.x86_64 yet.

Comment 19 Luwen Su 2016-06-12 14:02:26 UTC
In docker-latest-1.10.3-40.el7.x86_64, works fine

Comment 21 errata-xmlrpc 2016-06-23 16:19:59 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2016:1275