Bug 1339989
| Field | Value |
|---|---|
| Summary | Mariadb fails to start when PrivateTmp is in place and /var/tmp is a symlink |
| Product | Red Hat Enterprise Linux 7 |
| Component | systemd |
| Version | 7.2 |
| Status | CLOSED WONTFIX |
| Severity | high |
| Priority | high |
| Reporter | Zdenek Pytela <zpytela> |
| Assignee | Jan Synacek <jsynacek> |
| QA Contact | qe-baseos-daemons |
| CC | bblaskov, databases-maint, ffotorel, fkrska, hhorak, jsynacek, litnialex, lnykryn, mschena, msekleta, systemd-maint-list, systemd-maint |
| Target Milestone | rc |
| Target Release | --- |
| Hardware | Unspecified |
| OS | Unspecified |
| Doc Type | If docs needed, set a value |
| Story Points | --- |
| Last Closed | 2017-04-11 08:31:40 UTC |
| Type | Bug |
| Regression | --- |
| Mount Type | --- |
| Documentation | --- |
| Category | --- |
| oVirt Team | --- |
| Cloudforms Team | --- |
| Bug Depends On | 1421181 |
| Bug Blocks | 1298243, 1346768, 1383699, 1393867 |

Description
Zdenek Pytela
2016-05-26 10:04:45 UTC
(In reply to Zdenek Pytela from comment #0)
> Description of problem:
> Mariadb fails to start with PrivateTmp=True when /var/tmp is a symlink:
>
> [ERROR] mysqld: Can't create/write to file '/var/tmp/ibjbEG7V' (Errcode: 2)
> 160524 13:56:46 InnoDB: Error: unable to create temporary file; errno: 2

Maybe a selinux issue? /var/tmp is there, so ENOENT makes no sense.

> other services like ypbind and ntp are reported to work even with PrivateTmp
> and symlink

This alone means that the problem is not on the systemd side.

I have rechecked with cups.service and that too works well when /var/tmp/ is a symlink.

Selinux is not involved. Actually it works with mariadb in case we have:

PrivateTmp=false

But as soon as the service is turned to:

PrivateTmp=true

then the service has problem with writing to /var/tmp, this is the error that mariadb daemon reports:

[ERROR] mysqld: Can't create/write to file '/var/tmp/ib1ivPT8' (Errcode: 2)

Anyway, it seems the cause of the problem is PrivateTmp feature in systemd, so moving back to the component where I believe this can be fixed.

For the customer, we can probably provide a work-around: by creating the following file and reloading systemd configuration, the PrivateTmp feature can be turned off for mariadb service (but customer should be aware of consequences):

[root ~]# cat /etc/systemd/system/mariadb.service.d/mariadb.conf
[Service]
PrivateTmp=false
[root ~]# systemctl daemon-reload
[root ~]# systemctl start mariadb.service

I've found another workaround for this bug that doesn't involve turning off PrivateTmp. I've found that the very fact that /var/tmp is a symlink actually doesn't matter at all. What seems to trigger the problem is that the symlink's target is somewhere in /tmp. As a workaround, just point the symlink anywhere *but* under /tmp. For example:

# mkdir -p /mariadb/tmp
# mv /var/tmp /var/tmp-old
# ln -s /mariadb/tmp /var/tmp
# systemctl start mariadb && echo $?
0
# tree /mariadb/tmp/
/mariadb/tmp/
└── systemd-private-05f04ad970d74e24a5531fb97d853b30-mariadb.service-PjgL07
    └── tmp

Honzo, thanks for the other workaround, customer has just reported this change works for them. We are now awaiting the bug resolution decision.

The following is a strace snippet of systemd right after it forks the new process with PrivateTmp enabled. In this setup, /var/tmp is a symlink to /tmp/varrr. My commentary is inlined.

[pid 128] unshare(CLONE_NEWNS) = 0
[pid 128] mount(NULL, "/", NULL, MS_REC|MS_SLAVE, NULL) = 0

1) The new process now has its own mount namespace and rebinds / as a slave to make any mount changes not visible in the parent.

[pid 128] open("/", O_RDONLY|O_NOFOLLOW|O_CLOEXEC|0x200000) = 3
[pid 128] openat(3, "tmp", O_RDONLY|O_NOFOLLOW|O_CLOEXEC|0x200000) = 4
[pid 128] fstat(4, {st_mode=S_IFDIR|S_ISVTX|0777, st_size=200, ...}) = 0
[pid 128] close(3) = 0
[pid 128] close(4) = 0
[pid 128] mount("/tmp/systemd-private-2ff4d3d07c9b4b9bb0eb70222a437ed4-mariadb.service-jwVs90/tmp", "/tmp", NULL, MS_BIND|MS_REC, NULL) = 0

2) Here, the new process mounts its new /tmp.

[pid 128] open("/", O_RDONLY|O_NOFOLLOW|O_CLOEXEC|0x200000) = 3
[pid 128] openat(3, "var", O_RDONLY|O_NOFOLLOW|O_CLOEXEC|0x200000) = 4
[pid 128] fstat(4, {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
[pid 128] close(3) = 0
[pid 128] openat(4, "tmp", O_RDONLY|O_NOFOLLOW|O_CLOEXEC|0x200000) = 3
[pid 128] fstat(3, {st_mode=S_IFLNK|0777, st_size=10, ...}) = 0
[pid 128] readlinkat(4, "tmp", "/tmp/varrr", 99) = 10

3) The /var/tmp symlink is traversed and resolved to /tmp/varrr.

[pid 128] close(4) = 0
[pid 128] open("/", O_RDONLY|O_NOFOLLOW|O_CLOEXEC|0x200000) = 4
[pid 128] close(3) = 0
[pid 128] openat(4, "tmp", O_RDONLY|O_NOFOLLOW|O_CLOEXEC|0x200000) = 3
[pid 128] fstat(3, {st_mode=S_IFDIR|S_ISVTX|0777, st_size=40, ...}) = 0
[pid 128] close(4) = 0
[pid 128] openat(3, "varrr", O_RDONLY|O_NOFOLLOW|O_CLOEXEC|0x200000) = -1 ENOENT (No such file or directory)

4) Here, it is checked whether the symlink target exists and it doesn't, because the /tmp has been already remounted and is now empty.

[pid 128] close(3) = 0
[pid 128] socket(PF_LOCAL, SOCK_DGRAM|SOCK_CLOEXEC, 0) = 3
[pid 128] getsockopt(3, SOL_SOCKET, SO_SNDBUF, [212992], [4]) = 0
[pid 128] setsockopt(3, SOL_SOCKET, 0x20 /* SO_??? */, [8388608], 4) = -1 EPERM (Operation not permitted)
[pid 128] setsockopt(3, SOL_SOCKET, SO_SNDBUF, [8388608], 4) = 0
[pid 128] setsockopt(3, SOL_SOCKET, SO_SNDTIMEO, "\n\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0", 16) = 0
[pid 128] connect(3, {sa_family=AF_LOCAL, sun_path="/run/systemd/journal/socket"}, 29) = 0
[pid 128] sendmsg(3, {msg_name(0)=NULL, msg_iov(9)=[{"PRIORITY=3\nSYSLOG_FACILITY=3\nCOD"..., 128}, {"MESSAGE_ID=641257651c1b4ec9a8624"..., 43}, {"\n", 1}, {"UNIT=mariadb.service", 20}, {"\n", 1}, {"MESSAGE=mariadb.service: Failed "..., 121}, {"\n", 1}, {"EXECUTABLE=/usr/libexec/mariadb-"..., 46}, {"\n", 1}], msg_controllen=0, msg_flags=0}, MSG_NOSIGNAL) = 362
[pid 128] exit_group(226) = ?
[pid 128] +++ exited with 226 +++

5) The rest fails.

The code is general and both /tmp and /var/tmp are mounted into a new namespace. We cannot reverse the order of mounting for the very same reason as in 4). We cannot use mount() with unresolved symlink, since it gets resolved anyway. We could, in theory, make this a very specific case and just create the /tmp/varrr directory in the newly created namespace, but that's pretty bad because it destroys the generality of the algorithm, plus may break many cases that I can't come up with off the top of my head right now.

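The ordering problem described in steps 2) to 4) can be imitated without root privileges. The following is only a toy model, assuming plain directories under a scratch root in place of a real mount namespace: the function name simulate_private_tmp is made up for illustration, "mounting" is replaced by renaming directories, and a relative symlink stands in for the absolute /var/tmp -> /tmp/varrr link. It is not what systemd itself executes.

```shell
#!/bin/sh
# Toy model only: plain directories under a scratch root stand in for a real
# mount namespace, so no root privileges or mount() calls are involved.
# simulate_private_tmp is a hypothetical name, not a systemd facility.
simulate_private_tmp() {
    root=$(mktemp -d) || return 2
    mkdir -p "$root/tmp/varrr" "$root/var"
    # Relative link stands in for the absolute /var/tmp -> /tmp/varrr link.
    ln -s ../tmp/varrr "$root/var/tmp"

    # Step 2 stand-in: put an empty private directory in place of /tmp.
    mv "$root/tmp" "$root/tmp.host"
    mkdir "$root/tmp"

    # Steps 3-4 stand-in: follow /var/tmp and check that its target exists.
    if [ -d "$root/var/tmp" ]; then
        result=0
    else
        result=1   # target vanished, analogous to the ENOENT in the trace
    fi
    rm -rf "$root"
    return "$result"
}

if simulate_private_tmp; then
    echo "symlink target still reachable"
else
    echo "symlink target missing after /tmp was replaced"
fi
```

In this model the check fails because the link is followed only after /tmp has been replaced, which mirrors why pointing the symlink anywhere outside /tmp avoids the problem.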
From my point of view, we don't support /var/tmp being a symlink into /tmp/<anywhere> together with PrivateTmp enabled. We should put a warning somewhere in the documentation.

Development Management has reviewed and declined this request. You may appeal this decision by reopening this request.

(In reply to Jan Synacek from comment #23)
> We should put a warning somewhere in the documentation.

I realized that there is no need to document this specific case. We already have
https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/7/html-single/Storage_Administration_Guide/index.html#s1-filesystem-fhs
which says that RHEL uses FHS. Symlinking /var/tmp -> /tmp is a violation of the FHS and therefore not supported.

Honzo, although I can accept the inner reasons for declining to deal with the problem, it seems that it needs more proper explanation; I suppose it's really worth mentioning somewhere in our documentation. Unfortunately, I cannot see where exactly it clashes with FHS. Here are the only things which seem to be relevant:

FHS 2.3, Chapter 5. The /var Hierarchy says:

Requirements
The following directories, or symbolic links to directories, are required in /var.

Directory    Description
...
tmp          Temporary files preserved between system reboots

Could you please provide us with explanation, possibly just as the bugzilla "Doc Text" field oneliner?

It clashes in that /var/tmp is supposed to be persistent between reboots, but /tmp is not. Thus FHS indirectly (also) says, that /var/tmp cannot be a symlink to /tmp.

Honzo, I am sorry, but I cannot find any strong foundation for such a statement. It's true that /tmp and /var/tmp have different purpose, may have different setup, and programs should have different assumptions. However, I cannot see why /tmp should not be treated the same as /var/tmp, e. g. when being pointed from /var/tmp. Also, FHS does not say anything about enforcing the recommendations. We already know that this setup is incompatible with PrivateTmp feature, don't have it properly documented though. We are looking for any good resource.

This arguing about the interpretation of FHS back and forth is useless, really. I already said everything in comment 23 and I'm not going to change my mind about this particular case. As for documenting this explicitly, I suggest writing a kbase article.
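
For anyone auditing hosts for the unsupported layout discussed in this bug, a check along the following lines might help. It is a minimal sketch, assuming GNU readlink with the -f option is available; is_symlink_into is a hypothetical helper name and is not part of systemd or MariaDB.

```shell
#!/bin/sh
# Hypothetical helper (not part of systemd or MariaDB): succeed when PATH is
# a symlink whose fully resolved target lies inside directory DIR.
is_symlink_into() {
    path=$1
    dir=$2
    [ -L "$path" ] || return 1
    # readlink -f (GNU coreutils) canonicalizes the path, following symlinks.
    target=$(readlink -f -- "$path") || return 1
    dir=$(readlink -f -- "$dir") || return 1
    case "$target/" in
        "$dir"/*) return 0 ;;
        *) return 1 ;;
    esac
}

if is_symlink_into /var/tmp /tmp; then
    echo "/var/tmp is a symlink into /tmp: units with PrivateTmp=true may fail to start"
fi
```

The check only reports the layout; whether to repoint the symlink or disable PrivateTmp for a given unit remains the administrator's decision.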