Bug 862917 - Varnish 3.0.3 hangs on startup
Varnish 3.0.3 hangs on startup
Status: CLOSED ERRATA
Product: Fedora
Classification: Fedora
Component: selinux-policy (Show other bugs)
17
x86_64 Linux
unspecified Severity high
: ---
: ---
Assigned To: Miroslav Grepl
Fedora Extras Quality Assurance
:
Depends On: 861493
Blocks:
  Show dependency treegraph
 
Reported: 2012-10-03 17:16 EDT by Tom Moertel
Modified: 2012-12-20 10:37 EST (History)
5 users (show)

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: 861493
Environment:
Last Closed: 2012-12-20 10:37:07 EST
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments (Terms of Use)
Command dump showing a successful start of varnishd (2.32 KB, text/plain)
2012-10-09 18:01 EDT, Ingvar Hagelund
no flags Details

  None (edit)
Description Tom Moertel 2012-10-03 17:16:35 EDT
Description of problem:

The varnish-3.0.3-1.fc17 package causes varnish to hang when started by systemd if you have any custom VCL rules (which every non-trivial use will have). Downgrading to 3.0.2-2.fc17 solves the problem, as does starting varnish by hand instead of via systemd.


Steps to Reproduce:

1. Install varnish-3.0.3-1.fc17.x86_64.

2. Edit /etc/varnish/default.vcl to add any logic you want, as long as it's not a no-op.  For example:

    # cat >>/etc/varnish/default.vcl <<EOF
    sub vcl_recv {
        set req.url = regsub(req.url, " +$", "");
    }
    EOF
  
3. Try to start varnish:

   # systemctl start varnish.service



Actual results:

The startup process for varnishd hangs, and systemd eventually reports the following in /var/log/messages:

    systemd[1]: varnish.service operation timed out. Terminating.
    systemd[1]: Unit varnish.service entered failed state.



Expected results:

Varnishd starts promptly and reliably.



Additional info:

You can make the test recipe produce the expected results by doing any of the following:

* Removing all VCL logic from /etc/varnish/default.vcl. (Obviously, this is not a desirable solution since any non-trival use of Varnish will require custom VCL rules.)

* Downgrading to the 3.0.2-2.fc17 package.

* Starting varnishd by hand (as root), using the exact same ExecStart command that systemd uses:
  # /usr/sbin/varnishd -P /var/run/varnish.pid -f /etc/varnish/default.vcl -a :80 -T 127.0.0.1:6082 -t 120 -w 1,1000,120 -u varnish -g varnish -S /etc/varnish/secret -s file,/var/lib/varnish/varnish_storage.bin,1G

These three pieces of evidence suggest that something changed between 3.0.2-2.fc17 and 3.0.3-1.fc17 that affects how Varnish compiles VCL rules into C and ultimately into a binary.  That the 3.0.3-1.fc17 varnishd actually starts as expected when started by hand, but not by systemctl, suggests that the problematic part of the change is triggered by some constraint in the systemd environment that is loosened in the root user's environment.  (SELinux?)
Comment 1 Tom Moertel 2012-10-04 11:42:15 EDT
Could it be related to using a private /tmp directory in recent varnish packages?  I know that PrivateTmp=true was introduced in varnish-3.0.2-2.fc17 [1], which works, but perhaps some change introduced in 3.0.3-1.fc17 depends upon normal /tmp?

[1] https://bugzilla.redhat.com/show_bug.cgi?id=782539
Comment 2 Ingvar Hagelund 2012-10-09 18:01:36 EDT
Created attachment 624289 [details]
Command dump showing a successful start of varnishd

I'm not able to reproduce this on fedora 17/x86_64 with selinux enforcing. See attached command dump.

Ingvar
Comment 3 Tom Moertel 2012-10-09 18:35:20 EDT
Thanks for taking a look at this problem.  When I reproduced it, I did so on a nearly bare Amazon EC2 m1.small x86_64 instance running Fedora 17 (Fedora Provided ami-2ea50247 [1]) after a yum update and reboot.  Is it possible that varnish-3.0.3-1.fc17 has some additional, undeclared dependency that exists within your test environment but not in a bare environment in which a person yum installs varnish-3.0.3-1.fc17?

Links:
[1] https://fedoraproject.org/wiki/Cloud_images
Comment 4 Tore Anderson 2012-10-10 06:01:39 EDT
By Ingvar's request:

type=AVC msg=audit(1349863214.323:1515870): avc:  denied  { execmem } for  pid=11549 comm="varnishd" scontext=system_u:system_r:varnishd_t:s0 tcontext=system_u:system_r:varnishd_t:s0 tclass=process
type=SYSCALL msg=audit(1349863214.323:1515870): arch=c000003e syscall=9 success=no exit=-13 a0=0 a1=10000 a2=7 a3=22 items=0 ppid=11546 pid=11549 auid=4294967295 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=(none) ses=4294967295 comm="varnishd" exe="/usr/sbin/varnishd" subj=system_u:system_r:varnishd_t:s0 key=(null)
type=AVC msg=audit(1349863214.493:1515871): avc:  denied  { execmem } for  pid=11563 comm="varnishd" scontext=system_u:system_r:varnishd_t:s0 tcontext=system_u:system_r:varnishd_t:s0 tclass=process
type=SYSCALL msg=audit(1349863214.493:1515871): arch=c000003e syscall=9 success=no exit=-13 a0=0 a1=10000 a2=7 a3=22 items=0 ppid=11562 pid=11563 auid=4294967295 uid=991 gid=987 euid=991 suid=991 fsuid=991 egid=987 sgid=987 fsgid=987 tty=(none) ses=4294967295 comm="varnishd" exe="/usr/sbin/varnishd" subj=system_u:system_r:varnishd_t:s0 key=(null)

Tore
Comment 5 Ingvar Hagelund 2012-10-10 06:07:43 EDT
Tom, see the selinux audit message above. Could this be related to your problem?

Ingvar
Comment 6 Ingvar Hagelund 2012-10-10 06:15:24 EDT
selinux-policy, please have a look at this.

Ingvar
Comment 7 Daniel Walsh 2012-10-10 08:38:20 EDT
Is varnishd using some kind of java or JIT operation here. Execmem is caused when you have both executable and writeable memory at the same time.

Here is further explanation.

http://www.akkadia.org/drepper/selinux-mem.html
Comment 8 Ingvar Hagelund 2012-10-10 09:57:18 EDT
Daniel,
There is no java involved, but varnish does something that may look like jit. varnishd calls gcc to compile its vcl configuration to a shared library, which in turn is loaded. Both these operations happens at run time.

For those who do not know varnish, this may seem a bit strange, but it is among the things that makes varnish very fast. FAQ: https://www.varnish-cache.org/docs/3.0/faq/general.html#why

Ingvar
Comment 9 Tom Moertel 2012-10-10 10:17:39 EDT
Ingvar, yes I get the same audit message when running 3.0.3-1.fc17 while tailing /var/log/audit/audit.log:


[root@localhost ~]# rpm -q varnish
varnish-3.0.3-1.fc17.x86_64

[root@localhost ~]# systemctl start varnish.service
type=AVC msg=audit(1349878121.947:28109): avc:  denied  { execmem } for  pid=11974 comm="varnishd" scontext=system_u:system_r:varnishd_t:s0 tcontext=system_u:system_r:varnishd_t:s0 tclass=process
type=SYSCALL msg=audit(1349878121.947:28109): arch=c000003e syscall=9 success=no exit=-13 a0=0 a1=10000 a2=7 a3=22 items=0 ppid=11971 pid=11974 auid=4294967295 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=(none) ses=4294967295 comm="varnishd" exe="/usr/sbin/varnishd" subj=system_u:system_r:varnishd_t:s0 key=(null)


But the audit message is not produced under 3.0.2-2.fc17:

# yum remove varnish varnish-libs
# yum install varnish-3.0.2-2.fc17.x86_64

[root@localhost ~]# rpm -q varnish
varnish-3.0.2-2.fc17.x86_64

[root@localhost ~]# systemctl start varnish.service
type=SERVICE_START msg=audit(1349878217.984:28120): pid=0 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg=' comm="varnish" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'


So I would say that we have probably identified the cause of the problem  :-)
Comment 10 Ingvar Hagelund 2012-10-15 07:54:27 EDT
Upstream reports that 3.0.3 has started using jit-functionality in pcre, so this might also be related. selinux-policy, advice please.

Ingvar
Comment 11 Daniel Walsh 2012-10-15 10:10:55 EDT
We added execmem to F18.
Comment 12 Miroslav Grepl 2012-10-17 03:01:14 EDT
Fixed.

commit d436068f0dd4e3df288cf534a5e183ff9d79d8b2
Author: Miroslav Grepl <mgrepl@redhat.com>
Date:   Wed Oct 17 09:00:48 2012 +0200

    Allow varnish execmem
Comment 13 Fedora Update System 2012-10-17 08:34:52 EDT
selinux-policy-3.10.0-156.fc17 has been submitted as an update for Fedora 17.
https://admin.fedoraproject.org/updates/selinux-policy-3.10.0-156.fc17
Comment 14 Fedora Update System 2012-10-17 20:26:10 EDT
Package selinux-policy-3.10.0-156.fc17:
* should fix your issue,
* was pushed to the Fedora 17 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing selinux-policy-3.10.0-156.fc17'
as soon as you are able to.
Please go to the following url:
https://admin.fedoraproject.org/updates/FEDORA-2012-16347/selinux-policy-3.10.0-156.fc17
then log in and leave karma (feedback).
Comment 15 Fedora Update System 2012-12-20 10:37:09 EST
selinux-policy-3.10.0-156.fc17 has been pushed to the Fedora 17 stable repository.  If problems still persist, please make note of it in this bug report.

Note You need to log in before you can comment on or make changes to this bug.