Bug 862917 - Varnish 3.0.3 hangs on startup
Summary: Varnish 3.0.3 hangs on startup
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: selinux-policy
Version: 17
Hardware: x86_64
OS: Linux
unspecified
high
Target Milestone: ---
Assignee: Miroslav Grepl
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On: 861493
Blocks:
TreeView+ depends on / blocked
 
Reported: 2012-10-03 21:16 UTC by Tom Moertel
Modified: 2012-12-20 15:37 UTC (History)
5 users (show)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of: 861493
Environment:
Last Closed: 2012-12-20 15:37:07 UTC
Type: Bug
Embargoed:


Attachments (Terms of Use)
Command dump showing a successful start of varnishd (2.32 KB, text/plain)
2012-10-09 22:01 UTC, Ingvar Hagelund
no flags Details

Description Tom Moertel 2012-10-03 21:16:35 UTC
Description of problem:

The varnish-3.0.3-1.fc17 package causes varnish to hang when started by systemd if you have any custom VCL rules (which every non-trivial use will have). Downgrading to 3.0.2-2.fc17 solves the problem, as does starting varnish by hand instead of via systemd.


Steps to Reproduce:

1. Install varnish-3.0.3-1.fc17.x86_64.

2. Edit /etc/varnish/default.vcl to add any logic you want, as long as it's not a no-op.  For example:

    # cat >>/etc/varnish/default.vcl <<EOF
    sub vcl_recv {
        set req.url = regsub(req.url, " +$", "");
    }
    EOF
  
3. Try to start varnish:

   # systemctl start varnish.service



Actual results:

The startup process for varnishd hangs, and systemd eventually reports the following in /var/log/messages:

    systemd[1]: varnish.service operation timed out. Terminating.
    systemd[1]: Unit varnish.service entered failed state.



Expected results:

Varnishd starts promptly and reliably.



Additional info:

You can make the test recipe produce the expected results by doing any of the following:

* Removing all VCL logic from /etc/varnish/default.vcl. (Obviously, this is not a desirable solution since any non-trival use of Varnish will require custom VCL rules.)

* Downgrading to the 3.0.2-2.fc17 package.

* Starting varnishd by hand (as root), using the exact same ExecStart command that systemd uses:
  # /usr/sbin/varnishd -P /var/run/varnish.pid -f /etc/varnish/default.vcl -a :80 -T 127.0.0.1:6082 -t 120 -w 1,1000,120 -u varnish -g varnish -S /etc/varnish/secret -s file,/var/lib/varnish/varnish_storage.bin,1G

These three pieces of evidence suggest that something changed between 3.0.2-2.fc17 and 3.0.3-1.fc17 that affects how Varnish compiles VCL rules into C and ultimately into a binary.  That the 3.0.3-1.fc17 varnishd actually starts as expected when started by hand, but not by systemctl, suggests that the problematic part of the change is triggered by some constraint in the systemd environment that is loosened in the root user's environment.  (SELinux?)

Comment 1 Tom Moertel 2012-10-04 15:42:15 UTC
Could it be related to using a private /tmp directory in recent varnish packages?  I know that PrivateTmp=true was introduced in varnish-3.0.2-2.fc17 [1], which works, but perhaps some change introduced in 3.0.3-1.fc17 depends upon normal /tmp?

[1] https://bugzilla.redhat.com/show_bug.cgi?id=782539

Comment 2 Ingvar Hagelund 2012-10-09 22:01:36 UTC
Created attachment 624289 [details]
Command dump showing a successful start of varnishd

I'm not able to reproduce this on fedora 17/x86_64 with selinux enforcing. See attached command dump.

Ingvar

Comment 3 Tom Moertel 2012-10-09 22:35:20 UTC
Thanks for taking a look at this problem.  When I reproduced it, I did so on a nearly bare Amazon EC2 m1.small x86_64 instance running Fedora 17 (Fedora Provided ami-2ea50247 [1]) after a yum update and reboot.  Is it possible that varnish-3.0.3-1.fc17 has some additional, undeclared dependency that exists within your test environment but not in a bare environment in which a person yum installs varnish-3.0.3-1.fc17?

Links:
[1] https://fedoraproject.org/wiki/Cloud_images

Comment 4 Tore Anderson 2012-10-10 10:01:39 UTC
By Ingvar's request:

type=AVC msg=audit(1349863214.323:1515870): avc:  denied  { execmem } for  pid=11549 comm="varnishd" scontext=system_u:system_r:varnishd_t:s0 tcontext=system_u:system_r:varnishd_t:s0 tclass=process
type=SYSCALL msg=audit(1349863214.323:1515870): arch=c000003e syscall=9 success=no exit=-13 a0=0 a1=10000 a2=7 a3=22 items=0 ppid=11546 pid=11549 auid=4294967295 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=(none) ses=4294967295 comm="varnishd" exe="/usr/sbin/varnishd" subj=system_u:system_r:varnishd_t:s0 key=(null)
type=AVC msg=audit(1349863214.493:1515871): avc:  denied  { execmem } for  pid=11563 comm="varnishd" scontext=system_u:system_r:varnishd_t:s0 tcontext=system_u:system_r:varnishd_t:s0 tclass=process
type=SYSCALL msg=audit(1349863214.493:1515871): arch=c000003e syscall=9 success=no exit=-13 a0=0 a1=10000 a2=7 a3=22 items=0 ppid=11562 pid=11563 auid=4294967295 uid=991 gid=987 euid=991 suid=991 fsuid=991 egid=987 sgid=987 fsgid=987 tty=(none) ses=4294967295 comm="varnishd" exe="/usr/sbin/varnishd" subj=system_u:system_r:varnishd_t:s0 key=(null)

Tore

Comment 5 Ingvar Hagelund 2012-10-10 10:07:43 UTC
Tom, see the selinux audit message above. Could this be related to your problem?

Ingvar

Comment 6 Ingvar Hagelund 2012-10-10 10:15:24 UTC
selinux-policy, please have a look at this.

Ingvar

Comment 7 Daniel Walsh 2012-10-10 12:38:20 UTC
Is varnishd using some kind of java or JIT operation here. Execmem is caused when you have both executable and writeable memory at the same time.

Here is further explanation.

http://www.akkadia.org/drepper/selinux-mem.html

Comment 8 Ingvar Hagelund 2012-10-10 13:57:18 UTC
Daniel,
There is no java involved, but varnish does something that may look like jit. varnishd calls gcc to compile its vcl configuration to a shared library, which in turn is loaded. Both these operations happens at run time.

For those who do not know varnish, this may seem a bit strange, but it is among the things that makes varnish very fast. FAQ: https://www.varnish-cache.org/docs/3.0/faq/general.html#why

Ingvar

Comment 9 Tom Moertel 2012-10-10 14:17:39 UTC
Ingvar, yes I get the same audit message when running 3.0.3-1.fc17 while tailing /var/log/audit/audit.log:


[root@localhost ~]# rpm -q varnish
varnish-3.0.3-1.fc17.x86_64

[root@localhost ~]# systemctl start varnish.service
type=AVC msg=audit(1349878121.947:28109): avc:  denied  { execmem } for  pid=11974 comm="varnishd" scontext=system_u:system_r:varnishd_t:s0 tcontext=system_u:system_r:varnishd_t:s0 tclass=process
type=SYSCALL msg=audit(1349878121.947:28109): arch=c000003e syscall=9 success=no exit=-13 a0=0 a1=10000 a2=7 a3=22 items=0 ppid=11971 pid=11974 auid=4294967295 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=(none) ses=4294967295 comm="varnishd" exe="/usr/sbin/varnishd" subj=system_u:system_r:varnishd_t:s0 key=(null)


But the audit message is not produced under 3.0.2-2.fc17:

# yum remove varnish varnish-libs
# yum install varnish-3.0.2-2.fc17.x86_64

[root@localhost ~]# rpm -q varnish
varnish-3.0.2-2.fc17.x86_64

[root@localhost ~]# systemctl start varnish.service
type=SERVICE_START msg=audit(1349878217.984:28120): pid=0 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg=' comm="varnish" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'


So I would say that we have probably identified the cause of the problem  :-)

Comment 10 Ingvar Hagelund 2012-10-15 11:54:27 UTC
Upstream reports that 3.0.3 has started using jit-functionality in pcre, so this might also be related. selinux-policy, advice please.

Ingvar

Comment 11 Daniel Walsh 2012-10-15 14:10:55 UTC
We added execmem to F18.

Comment 12 Miroslav Grepl 2012-10-17 07:01:14 UTC
Fixed.

commit d436068f0dd4e3df288cf534a5e183ff9d79d8b2
Author: Miroslav Grepl <mgrepl>
Date:   Wed Oct 17 09:00:48 2012 +0200

    Allow varnish execmem

Comment 13 Fedora Update System 2012-10-17 12:34:52 UTC
selinux-policy-3.10.0-156.fc17 has been submitted as an update for Fedora 17.
https://admin.fedoraproject.org/updates/selinux-policy-3.10.0-156.fc17

Comment 14 Fedora Update System 2012-10-18 00:26:10 UTC
Package selinux-policy-3.10.0-156.fc17:
* should fix your issue,
* was pushed to the Fedora 17 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing selinux-policy-3.10.0-156.fc17'
as soon as you are able to.
Please go to the following url:
https://admin.fedoraproject.org/updates/FEDORA-2012-16347/selinux-policy-3.10.0-156.fc17
then log in and leave karma (feedback).

Comment 15 Fedora Update System 2012-12-20 15:37:09 UTC
selinux-policy-3.10.0-156.fc17 has been pushed to the Fedora 17 stable repository.  If problems still persist, please make note of it in this bug report.


Note You need to log in before you can comment on or make changes to this bug.