Description of problem:

If you fill up your dom0 disk space, then along with other badness, xenstore will start spewing AVCs to the serial console. This is because two problems happen simultaneously: the xenstore daemon attempts a "setrlimit" somewhere (and fails because of SELinux policy), and the audit daemon stops processing messages because there is no disk space left to write them. So you end up with:

    audit: audit_backlog=326 > audit_backlog_limit=320
    audit: audit_lost=1798 audit_rate_limit=0 audit_backlog_limit=320
    audit: backlog limit exceeded

on the serial console. The xenstore denial messages look like:

    type=AVC msg=audit(1186107301.242:5201): avc: denied { sys_resource } for pid=2998 comm="xenstored" capability=24 scontext=system_u:system_r:xenstored_t:s0 tcontext=system_u:system_r:xenstored_t:s0 tclass=capability

I'm not quite sure what to do about it. Running out of disk space is surely bad for other reasons, but I'm wondering whether we can either allow xenstore to do the setrlimit by modifying the policy, or make xenstore not do the setrlimit at all. Interestingly, despite being out of disk space, xenstore continues happily along, creating and destroying domains, etc.; the only real problem seems to be the flooding of the serial console.
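For anyone triaging a similar denial, here is a minimal sketch of pulling the capability number out of an AVC line and giving it a name. It is not part of xenstore or the audit tools; the function name `parse_capability_avc` and the tiny lookup table are made up for illustration, and the table covers only a few of the numbers defined in `<linux/capability.h>`. Keep in mind that CAP_SYS_RESOURCE guards more than setrlimit, so the capability name alone does not identify the system call.

```python
import re

# Small, illustrative subset of capability numbers from <linux/capability.h>.
CAP_NAMES = {
    21: "CAP_SYS_ADMIN",
    23: "CAP_SYS_NICE",
    24: "CAP_SYS_RESOURCE",
    25: "CAP_SYS_TIME",
}

# Matches the permission, command name and capability number of a
# capability-class AVC denial such as the one quoted in this report.
AVC_RE = re.compile(
    r'denied\s+\{\s*(?P<perm>[^}]+?)\s*\}'
    r'.*?comm="(?P<comm>[^"]+)"'
    r'.*?capability=(?P<cap>\d+)'
)


def parse_capability_avc(line):
    """Return a dict describing a capability denial, or None if no match."""
    match = AVC_RE.search(line)
    if match is None:
        return None
    cap = int(match.group("cap"))
    return {
        "perm": match.group("perm"),
        "comm": match.group("comm"),
        "capability": cap,
        "name": CAP_NAMES.get(cap, "unknown"),
    }


SAMPLE = (
    'type=AVC msg=audit(1186107301.242:5201): avc: denied { sys_resource } '
    'for pid=2998 comm="xenstored" capability=24 '
    'scontext=system_u:system_r:xenstored_t:s0 '
    'tcontext=system_u:system_r:xenstored_t:s0 tclass=capability'
)

if __name__ == "__main__":
    print(parse_capability_avc(SAMPLE))
```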
Are you sure sys_resource maps to setrlimit? I don't find any calls to setrlimit in the xenstored source code or in the other Xen libraries it uses. Are there any other likely system calls to check for?

The TDB storage used by xenstored is (fortunately) atomically updated, and writes are transactional. So if xenstored fails to expand or create the TDB datafile, it will abort the transaction. Everything in Xen automatically retries transactions if they're important, so anything using xenstore will presumably keep trying until some space is freed on the filesystem. Also note that freed storage is not released, so if you destroy and then restart a domain, chances are it will not need to allocate more disk space.
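The retry behaviour described above can be sketched roughly as follows. This is only an illustration of the pattern, not xenstore's client API: `TransactionAborted`, `run_with_retry` and its parameters are hypothetical names, and a real client would choose its own back-off and give-up policy.

```python
import time


class TransactionAborted(Exception):
    """Hypothetical error raised when the store aborts a transaction."""


def run_with_retry(transaction, attempts=5, delay=0.0):
    """Call transaction() until it succeeds or attempts run out.

    The last failure is re-raised so the caller can see that the store
    never accepted the write (for example, because the disk stayed full).
    """
    for attempt in range(1, attempts + 1):
        try:
            return transaction()
        except TransactionAborted:
            if attempt == attempts:
                raise
            time.sleep(delay)  # wait before trying again


if __name__ == "__main__":
    state = {"calls": 0}

    def flaky_write():
        # Simulates a store that rejects the first two attempts.
        state["calls"] += 1
        if state["calls"] < 3:
            raise TransactionAborted("no space left")
        return "committed"

    print(run_with_retry(flaky_write), "after", state["calls"], "calls")
```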
Dan,

No, I'm not 100% sure of that; it's what Dan Walsh initially told me. Yeah, I also took a quick gander through the source code and didn't see anything like that. It may be part of some library that xenstore is using, or it might just be another error.

Chris Lalancette
Unless there's more info here, it's unclear what change is required in xenstore; I can't find any syscalls that would trigger the reported audit message.
Well, now I know what's happening, and this isn't really Xen's fault (aside from not giving up when the disk is full). extX filesystems check CAP_SYS_RESOURCE when a filesystem is full to see whether the process is allowed to use the reserved blocks. In this case the process in question is trying to write to a full disk. According to the capabilities check it has CAP_SYS_RESOURCE (because it is a root process), but according to SELinux it does NOT have the capability (because SELinux enforces least privilege). So the problem is not in Xen.

Upstream I solved this issue a different way: I created a way for ext3 to check for CAP_SYS_RESOURCE without emitting any kind of warning. I could probably backport a similar change, but seeing as how things like this have existed ever since SELinux came along, I kinda feel like we can just leave it alone and tell people to stop filling their partitions :)
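To make the reserved-block explanation concrete, here is a small sketch. `os.statvfs` is the real Python call; `reserved_blocks`, `describe` and `may_allocate` are illustrative names, and `may_allocate` is a deliberately simplified model of the decision described above, not the kernel's actual code (which may apply further conditions). The difference between `f_bfree` and `f_bavail` is the number of blocks that are free but not available to unprivileged writers.

```python
import os


def reserved_blocks(st):
    """Free blocks that unprivileged processes are not allowed to use."""
    return st.f_bfree - st.f_bavail


def describe(path="/"):
    """Summarise free versus reserved blocks for the filesystem at path."""
    st = os.statvfs(path)
    return {
        "free_total": st.f_bfree,
        "free_unprivileged": st.f_bavail,
        "reserved": reserved_blocks(st),
    }


def may_allocate(free_blocks, reserved, has_cap_sys_resource):
    """Simplified model: dipping into the reserve needs the capability.

    has_cap_sys_resource stands for the combined answer of every check in
    force, so an SELinux denial makes it False even for a root process.
    """
    if free_blocks > reserved:
        return True  # plenty of ordinary space left, no capability check
    return has_cap_sys_resource


if __name__ == "__main__":
    print(describe("/"))
```

In this model the AVC shows up only in the second branch: once ordinary space is gone, every write attempt asks for the capability, and every SELinux denial produces another audit message.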
Actually, this may have gone away since we switched xenstore over to using tmpfs. Even if not, it's not something we really need to go crazy over, and after all this time we haven't had customer complaints about it. Finally, I'm not really inclined to go and test it again. I'm going to close as WONTFIX, and if a customer runs into it in the future, we can deal with it in a new BZ.

Chris Lalancette