Bug 1119849 - su - postgres Results in System Error inside Fedora 20/rawhide containers
Summary: su - postgres Results in System Error inside Fedora 20/rawhide containers
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: docker-io
Version: 20
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Lokesh Mandvekar
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Duplicates: 1120567 1121345
Depends On:
Blocks: 1128208
Reported: 2014-07-15 16:13 UTC by Devan Goodwin
Modified: 2014-08-23 01:54 UTC (History)

Fixed In Version: docker-io-1.1.2-3.fc19
Doc Type: Bug Fix
Doc Text:
Clone Of:
: 1128208 (view as bug list)
Environment:
Last Closed: 2014-07-30 07:01:53 UTC
Type: Bug
Embargoed:


Description Devan Goodwin 2014-07-15 16:13:33 UTC
Description of problem:

When attempting to use postgresql in a docker container, errors occur because su to the postgres user fails.

Looks similar to: https://bugzilla.redhat.com/show_bug.cgi?id=1098120#c15 but the fix for libselinux should have been in Fedora images long ago.

Version-Release number of selected component (if applicable):

On host:

(root@lenovo ~) $ rpm -qa | grep docker
docker-registry-0.7.1-2.fc20.noarch
docker-io-1.0.0-6.fc20.x86_64
(root@lenovo ~) $ getenforce 
Disabled
(root@lenovo ~) $ rpm -qa | grep libselinux
libselinux-python-2.2.1-6.fc20.x86_64
libselinux-utils-2.2.1-6.fc20.x86_64
libselinux-2.2.1-6.fc20.x86_64
libselinux-devel-2.2.1-6.fc20.x86_64
(root@lenovo ~) $ 



How reproducible: Very.


Steps to Reproduce:

Build the following Dockerfile on a fully updated Fedora 20 system:

FROM fedora:20
MAINTAINER Devan Goodwin <dgoodwin>
RUN yum -y update
RUN yum -y install postgresql postgresql-server
RUN su - postgres -c "ls /"
CMD ["/bin/bash"]

Actual results:

Complete!
 ---> 7b3287af76e9
Removing intermediate container d869af784195
Step 4 : RUN su - postgres -c "ls /"
 ---> Running in 72da359c2576
su: System error
2014/07/15 12:57:45 The command [/bin/sh -c su - postgres -c "ls /"] returned a non-zero code: 1



Expected results:

The su command should succeed. As it stands, the su error is blocking use of postgres in a number of ways, starting with the initdb phase.


Additional info:

The postgres user does exist at this point.

The problem also occurs when building from the fedora:rawhide image.

Similar failures can be seen if you attempt to build the fedora-cloud Postgresql container (https://github.com/fedora-cloud/Fedora-Dockerfiles/tree/master/postgres); I believe these boil down to the same issue, however.

Comment 1 Lokesh Mandvekar 2014-07-15 18:16:58 UTC
how about:

RUN sudo -u postgres ls /

instead of the "RUN su..."

This does the job (but with an annoying "sudo: unable to send audit message: Operation not permitted" before the output). Gotta check if there are better ways though, anyone?

Comment 2 Marek Goldmann 2014-07-16 08:39:50 UTC
As Lokesh said, use sudo. The su command failure is not limited to Fedora images; it fails on other images too:

Ubuntu:

$ docker run -it --rm ubuntu bash
root@b3c384ec1959:/# useradd test
root@b3c384ec1959:/# su - test -c "ls /"
su: System error
root@b3c384ec1959:/# 

CentOS:

$ docker run -it --rm centos bash
bash-4.2# useradd test
bash-4.2# su - test -c "ls /"
su: System error
bash-4.2# 

This may be related to this: https://github.com/dotcloud/docker/issues/6345

Comment 3 Devan Goodwin 2014-07-16 11:26:49 UTC
Thanks for the quick response.

My use of su is just to demonstrate the problem in the simplest form I could think of; the actual issue is the inability to deploy or use postgresql in the container, as I suspect its scripts are using su. initdb.log shows "system error", so my suspicion is that those scripts are using the same approach.

I can try to modify the postgres scripts myself for the time being and see where that gets me, but there is definitely something nasty going on. It does look related to the github issue above.

Comment 4 Devan Goodwin 2014-07-16 14:04:17 UTC
Tried manually modifying postgresql-setup to get through initdb but no luck:

[root@c1de208601a7 /]# cat /var/lib/pgsql/initdb.log
sudo: unable to send audit message: Operation not permitted
sudo: sorry, you must have a tty to run sudo

Comment 5 Devan Goodwin 2014-07-16 14:38:04 UTC
I've been going back through old docker-io versions trying to see if this appeared recently; it seems to appear in everything I tested.

I can do this operation fine on a RHEL 7 host, also docker 1.0.0.

This leads me to believe it's (a) not something in the Fedora 20 container itself, and (b) some other Fedora component besides docker-io.

Comment 6 Devan Goodwin 2014-07-16 14:52:19 UTC
On the advice of https://github.com/dotcloud/docker/issues/6345

I tried rebooting into an older kernel-3.14.6-200.fc20.x86_64 (as opposed to kernel-3.15.4-200.fc20.x86_64)

Problem goes away. 

Will re-assign to kernel package.
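
A quick way to tell which side of that kernel boundary a host is on is to compare the running release against 3.15. The sketch below is only illustrative and not from the report: kernel_at_least is a hypothetical helper name, and the 3.15 threshold is an assumption drawn solely from the two kernels named above (3.14.6 worked, 3.15.4 did not). Distribution kernels with backported patches may behave differently.

```shell
# Hypothetical helper (not from the report): succeed when kernel release $1
# is at least major.minor version $2, e.g. kernel_at_least "$(uname -r)" 3.15.
kernel_at_least() {
    have_major=${1%%.*}
    have_rest=${1#*.}
    have_minor=${have_rest%%[!0-9]*}   # keep only the leading digits of the minor
    want_major=${2%%.*}
    want_minor=${2#*.}
    [ "$have_major" -gt "$want_major" ] ||
        { [ "$have_major" -eq "$want_major" ] && [ "$have_minor" -ge "$want_minor" ]; }
}

# Assumption: 3.15 is used as the threshold only because this report saw the
# failure on 3.15.4 and not on 3.14.6.
if kernel_at_least "$(uname -r)" 3.15; then
    echo "kernel $(uname -r): at or above 3.15, where this report saw the failure"
else
    echo "kernel $(uname -r): older than 3.15"
fi
```

Only the major and minor numbers are compared, so the patch level and the distribution suffix (such as -200.fc20.x86_64) are ignored.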

Comment 7 Josh Boyer 2014-07-16 15:09:22 UTC
SELinux/audit guys, any clues on this one?  Looks like a problem with 3.15.y?  The github issue seems to point to https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=543bc6a1a987672b79d6ebe8e2ab10471d8f1047, but that isn't conclusive.

Comment 8 Marek Goldmann 2014-07-17 09:40:53 UTC
*** Bug 1120567 has been marked as a duplicate of this bug. ***

Comment 9 Lars Kellogg-Stedman 2014-07-22 18:49:56 UTC
*** Bug 1121345 has been marked as a duplicate of this bug. ***

Comment 10 Lars Kellogg-Stedman 2014-07-22 18:51:07 UTC
I just closed 1121345 as a duplicate of this bug, but I wanted to include some information here:

This behavior appears to have been introduced by kernel commit 33faba7fa7f2288d2f8aaea95958b2c97bf9ebfb (https://github.com/torvalds/linux/commits/33faba7fa7f2288d2f8aaea95958b2c97bf9ebfb).

Specifically, this check is failing in kernel/audit.c, in audit_netlink_ok():

case AUDIT_FIRST_USER_MSG2 ... AUDIT_LAST_USER_MSG2:
    if (!netlink_capable(skb, CAP_AUDIT_WRITE))
        err = -EPERM;
    break;

Comment 11 Lars Kellogg-Stedman 2014-07-22 19:40:04 UTC
Okay:

Prior to commit 33faba7fa7f2288d2f8aaea95958b2c97bf9ebfb, audit events were only accepted in the root network namespace, and attempts to send audit events in other namespaces always resulted in ECONNREFUSED, which, as documented in https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=543bc6a1a987672b79d6ebe8e2ab10471d8f1047, is a non-fatal error that will allow the sending process to continue on its merry way.  Running strace on docker using a kernel from 2f2ad10 (the immediately prior commit) yields, for example:

539   sendto(3, "p\0\0\0L\4\5\0\1\0\0\0\0\0\0\0op=PAM:authentication acct=\"root\" exe=\"/usr/bin/su\" hostname=? a"..., 112, 0, {sa_family=AF_NETLINK, pid=0, groups=00000000}, 12) = -1 ECONNREFUSED (Connection refused)

With commit 33faba7fa7f2288d2f8aaea95958b2c97bf9ebfb, audit events are now accepted inside of all network namespaces.  Since processes are now able to connect to the audit socket, they no longer receive a simple ECONNREFUSED and must instead pass the capability checks in kernel/audit.c.  This means that they need to have CAP_AUDIT_WRITE, and without it they will get an EPERM and will probably exit with an error.
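
One way to see whether a process holds CAP_AUDIT_WRITE is to look at its effective capability mask. The sketch below is illustrative and not from the report: has_cap_audit_write is a hypothetical helper name. It assumes the standard Linux capability numbering, in which CAP_AUDIT_WRITE is capability 29, and that the CapEff field of /proc/<pid>/status is a hexadecimal bitmask.

```shell
# Hypothetical helper (not part of docker or the kernel): print "yes" when the
# hexadecimal capability mask $1 has bit 29 (CAP_AUDIT_WRITE) set, else "no".
has_cap_audit_write() {
    if [ $(( (0x$1 >> 29) & 1 )) -eq 1 ]; then
        echo yes
    else
        echo no
    fi
}

# Inspect the current process via the CapEff line of /proc/self/status.
# On systems without /proc this prints nothing.
cap_eff=$(awk '/^CapEff:/ {print $2}' /proc/self/status 2>/dev/null || true)
if [ -n "$cap_eff" ]; then
    echo "CapEff=$cap_eff CAP_AUDIT_WRITE=$(has_cap_audit_write "$cap_eff")"
fi
```

Run inside a container, a "no" here would suggest that attempts by su to send an audit message hit the EPERM path described above.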

Comment 12 Lars Kellogg-Stedman 2014-07-22 20:14:27 UTC
I spoke to rgb and eparis about this issue, and the general consensus seems to be that if you want to write audit messages you need CAP_AUDIT_WRITE, and you were just lucky it was working prior to 33faba7.

There's a docker bug open at:

  https://github.com/dotcloud/docker/issues/6345

I will be updating that bug with this recommendation.

Comment 13 Lars Kellogg-Stedman 2014-07-23 13:57:08 UTC
This problem is fixed by:

  https://github.com/dotcloud/docker/pull/7179

I am moving this back to the docker-io component.

Comment 14 Fedora Update System 2014-07-24 23:49:19 UTC
docker-io-1.0.0-9.fc19 has been submitted as an update for Fedora 19.
https://admin.fedoraproject.org/updates/docker-io-1.0.0-9.fc19

Comment 15 Fedora Update System 2014-07-24 23:49:46 UTC
docker-io-1.0.0-9.fc20 has been submitted as an update for Fedora 20.
https://admin.fedoraproject.org/updates/docker-io-1.0.0-9.fc20

Comment 16 Fedora Update System 2014-07-26 00:01:50 UTC
Package docker-io-1.0.0-9.fc20:
* should fix your issue,
* was pushed to the Fedora 20 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing docker-io-1.0.0-9.fc20'
as soon as you are able to.
Please go to the following url:
https://admin.fedoraproject.org/updates/FEDORA-2014-8877/docker-io-1.0.0-9.fc20
then log in and leave karma (feedback).

Comment 17 Fedora Update System 2014-07-30 07:01:53 UTC
docker-io-1.0.0-9.fc20 has been pushed to the Fedora 20 stable repository.  If problems still persist, please make note of it in this bug report.

Comment 18 Fedora Update System 2014-08-01 08:05:36 UTC
docker-io-1.1.2-2.fc19 has been submitted as an update for Fedora 19.
https://admin.fedoraproject.org/updates/docker-io-1.1.2-2.fc19

Comment 19 Fedora Update System 2014-08-05 15:55:54 UTC
docker-io-1.1.2-3.fc19 has been submitted as an update for Fedora 19.
https://admin.fedoraproject.org/updates/docker-io-1.1.2-3.fc19

Comment 20 Fedora Update System 2014-08-23 01:54:35 UTC
docker-io-1.1.2-3.fc19 has been pushed to the Fedora 19 stable repository.  If problems still persist, please make note of it in this bug report.

