Bug 858265

Summary: RFE: possibly automatically turn off tmpfs on /tmp if root fs is large and writable, and RAM is limited
Product: [Fedora] Fedora
Reporter: Richard W.M. Jones <rjones>
Component: systemd
Assignee: systemd-maint
Status: CLOSED WONTFIX
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: high
Priority: high
Version: rawhide
CC: cheerleone, eblake, johannbg, lnykryn, metherid, msekleta, pachoramos1, pahan, plautrba, psklenar, rvokal, sct, sergio, systemd-maint, tgunders, vpavlin
Keywords: Reopened
Hardware: Unspecified
OS: Unspecified
Doc Type: Bug Fix
Last Closed: 2014-07-16 11:06:16 EDT
Type: Bug
Bug Blocks: 873273, 826015

Description Richard W.M. Jones 2012-09-18 09:27:38 EDT
Description of problem:

When installing a virtual machine with (as is usual with VMs)
limited memory and hard disk space, the default maximum size
of /tmp is now half of RAM.  Usually this means about 256 MB.
Since lots of programs store arbitrary sized data in /tmp, it
can easily run out.

Even if the limit is increased, tmp-on-tmpfs still limits us to
the size of RAM + swap, which with a 10 GB disk, means about 2.5 GB
(2 GB swap, plus 512 MB of RAM).

Thus you can easily run out of space in /tmp and still have plenty
of disk space available.
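
The arithmetic above can be sketched as follows. This is only an illustration of the numbers quoted in the description (a default limit of half of RAM, and a hard ceiling of RAM plus swap); the function names are made up for this example and are not part of systemd.

```python
MIB = 1024 * 1024

def default_tmpfs_limit(ram_bytes):
    """Default /tmp size cap as described in this report: half of RAM."""
    return ram_bytes // 2

def max_tmpfs_capacity(ram_bytes, swap_bytes):
    """Ceiling even if the limit is raised: tmpfs pages live in RAM or swap."""
    return ram_bytes + swap_bytes

ram = 512 * MIB    # the VM from the description
swap = 2048 * MIB  # 2 GB of swap on a 10 GB disk

print(default_tmpfs_limit(ram) // MIB, "MB default /tmp limit")
print(max_tmpfs_capacity(ram, swap) // MIB, "MB absolute ceiling for /tmp")
```

With 512 MB of RAM and 2 GB of swap this gives 256 MB and 2560 MB (about 2.5 GB), matching the figures above, while most of the 10 GB disk stays out of reach for /tmp.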
Comment 1 Petr Sklenar 2012-11-14 10:34:54 EST
I have a similar issue with tmpfs on /tmp.

The current size of /tmp causes plenty of scripts to fail for lack of space. That's why tmpfs on /tmp is not usable for any "bigger" script, as I cannot be sure that the directory is large enough.

Could $TMPDIR be set to /var/tmp as the system default, so that mktemp and other tools use disk space for temporary files?
Comment 2 Bill Nottingham 2012-11-14 14:05:44 EST
Can you quantify 'many' and 'bigger' a little here?
Comment 3 Petr Sklenar 2012-11-15 03:45:23 EST
They are test scripts, each of which uses /tmp for about 100 MB of temporary data. If all of them (about 20) run in parallel, there is no space left on the device.

There is also an internal logging system that produces logs of about 2/3 GB and copies them via /tmp to other places.

We know the tests are not the best designed, but this worked for ages; now we have to change the paths to /var/tmp.

I can understand that there are some advantages for a single-user desktop, but not for a server with higher load and more users who save arbitrary data into /tmp.
Comment 4 Richard W.M. Jones 2012-11-15 03:56:14 EST
This just brings us back to the problem that this change
requires everyone to analyse their scripts/programs and
decide if each individual use of /tmp creates a "small"
file or a file of potentially unbounded size[1].

However the solution is NOT to force everyone to use /var/tmp,
and even less to set TMPDIR=/var/tmp (which just means that /tmp
would hardly ever be used).

The solution is to forget about tmp-on-tmpfs as a default.
Revert this change.  Any program that really needs to use
memory/tmpfs can use /dev/shm which has been around for years.

[1] https://rwmj.wordpress.com/2012/09/12/tmpfs-considered-harmful/
Comment 5 Cheer Leone 2012-12-05 23:41:17 EST
If tmpfs had a pass-through mechanism, this situation could be avoided.

desc:
  100MB temp fs /tmp primarily in ram with a pass-through to /var/tmp

logic:
  1x 80MB file stored in /tmp
  attempt to save 40MB file, silently log error, send file to /var/tmp

  if we ls /tmp, tmpfs should provide results for /tmp and /var/tmp

  /var/tmp should be cleared on reboot
  /tmp will obviously be cleared on reboot due to being RAM based.

Problem solved.
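
A toy model of the pass-through logic described above, as a minimal sketch. tmpfs has no such mechanism; the class and method names are hypothetical, and plain dictionaries stand in for /tmp and /var/tmp.

```python
class PassThroughTmp:
    """Toy model: a size-capped RAM store that spills to disk when full."""

    def __init__(self, ram_capacity_mb):
        self.ram_capacity_mb = ram_capacity_mb
        self.ram_files = {}   # name -> size in MB, stands in for /tmp
        self.disk_files = {}  # name -> size in MB, stands in for /var/tmp
        self.log = []         # "silently log error" from the description

    def ram_used_mb(self):
        return sum(self.ram_files.values())

    def save(self, name, size_mb):
        """Store in RAM if it fits, otherwise log and pass through to disk."""
        if self.ram_used_mb() + size_mb <= self.ram_capacity_mb:
            self.ram_files[name] = size_mb
            return "/tmp"
        self.log.append(f"{name}: {size_mb} MB does not fit in RAM, sent to disk")
        self.disk_files[name] = size_mb
        return "/var/tmp"

    def listdir(self):
        """A listing of /tmp shows entries from both backing stores."""
        return sorted(set(self.ram_files) | set(self.disk_files))

    def reboot(self):
        """Both stores are cleared on reboot."""
        self.ram_files.clear()
        self.disk_files.clear()


tmp = PassThroughTmp(ram_capacity_mb=100)
print(tmp.save("first.dat", 80))   # fits in RAM
print(tmp.save("second.dat", 40))  # would exceed 100 MB, goes to disk
print(tmp.listdir())
```

The sketch ignores everything that makes this hard in a real filesystem, such as files that grow after creation or renames between the two stores.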
Comment 6 Lennart Poettering 2013-01-14 15:14:08 EST
I am pretty sure this is primarily a documentation issue. We should document well how people can change the default tmpfs values.

*** This bug has been marked as a duplicate of bug 895109 ***
Comment 7 Stephen Tweedie 2013-01-14 15:25:54 EST
(In reply to comment #6)
> I am pretty sure this is primarily a documentation issue. We should document
> well how people can change the default tmpfs values.

I'm not so convinced.  In reality, disk- and memory-based filesystems perform very differently; I've personally been using a substantial (larger-than-default) tmpfs for /tmp recently, but that only performs well for me with ssd swap and tuned /proc/sys/vm/swappiness.  Relying on tmpfs tuning size also requires partitioning additional swap space in advance.  

We can't assume that tmpfs will always be a good drop-in replacement for a block-based filesystem.  Documenting the tmpfs options is good (and necessary if tmpfs is default) but is a separate issue from whether tmpfs is always right in the first place; virt systems with very limited VM are a very good example where it may well be wrong, and this bz request is a separate, legitimate question.
Comment 8 Lennart Poettering 2013-01-15 17:43:00 EST
So I take it you are basically asking for a logic that automatically disables /tmp on tmpfs if the root dir is writable, RAM is limited and disk plenty, is that correct?

I am not a big fan of such automatisms that expose entirely different configurations based on arbitrarily selected heuristics. Before we do anything like this I'd really prefer some more data coming in from users who need this. We shouldn't create baroque, opaque, surprising automatisms such as this one without having a really good reason, and a really good data set that proves this is useful and required.
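
For concreteness, the kind of rule under discussion might look like the sketch below. The thresholds (less than 1 GiB of RAM, free disk at least four times RAM) are invented purely for illustration; nothing in this report or in systemd defines them, which is exactly the arbitrariness in question.

```python
GIB = 1024 ** 3

def should_use_disk_tmp(root_writable, ram_bytes, root_free_bytes,
                        ram_threshold=1 * GIB, disk_to_ram_ratio=4):
    """Return True if /tmp should stay on the root filesystem.

    Hypothetical heuristic: root is writable, RAM is limited, disk is plentiful.
    Both default thresholds are assumptions made up for this example.
    """
    if not root_writable:
        return False  # a read-only root cannot host /tmp
    ram_is_limited = ram_bytes < ram_threshold
    disk_is_plentiful = root_free_bytes >= disk_to_ram_ratio * ram_bytes
    return ram_is_limited and disk_is_plentiful


# A small VM: 512 MiB of RAM, 8 GiB free on a writable root.
print(should_use_disk_tmp(True, GIB // 2, 8 * GIB))
# A large machine: 8 GiB of RAM, 100 GiB free.
print(should_use_disk_tmp(True, 8 * GIB, 100 * GIB))
```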
Comment 9 Richard W.M. Jones 2013-01-15 23:40:04 EST
(In reply to comment #8)
> So I take it you are basically asking for a logic that automatically
> disables /tmp on tmpfs if the root dir is writable, RAM is limited and disk
> plenty, is that correct?
> 
> I am not a big fan of such automatisms that expose entirely different
> configurations based on arbitrarily selected heuristics. Before we do
> anything like this I'd really prefer some more data coming in from users who
> need this. We shouldn't create baroque, opaque, surprising automatisms such
> as this one without having a really good reason, and a really good data set
> that proves this is useful and required.

As has been explained to you numerous times, the problem is that
you provided no data in the first place that tmp-on-tmpfs was
necessary.  Now that you've succeeded in the politics of breaking
Fedora in this way, please don't push the onus of proof on anyone else.
Comment 10 Lennart Poettering 2013-01-16 13:07:21 EST
We made our case, and convinced FESCO with that. You made yours and you didn't. 

Now the change has been implemented, and if you can make a good case (and hard data can give you a good case), then we can alter it.
Comment 11 Richard W.M. Jones 2013-01-16 14:22:23 EST
So funny.
Comment 12 Richard W.M. Jones 2013-01-16 14:39:01 EST
http://meetbot.fedoraproject.org/fedora-meeting/2012-04-02/fesco.2012-04-02-17.00.log.html

from "17:40:06 <mitr> #topic #834 F18 Feature: /tmp on tmpfs - http://fedoraproject.org/wiki/Features/tmp-on-tmpfs"

There is such a lot of nonsense in that, eg:

17:48:39 <mezcalero> you know, we are really not on our own with this
17:48:44 <mezcalero> this is not where we are pioneering
17:48:49 <mjg59> mitr: Yes, there are benefits
17:48:53 <mezcalero> this is simply where we are following, debian and commercial unixes
17:49:01 <mezcalero> and ubuntu is doing this too

(Debian and Ubuntu reverted this)

17:49:50 <kay> just like solaris has from the beginning ...

(Yeah, that worked out well ...)

17:50:41 <mezcalero> mitr: yet, the change has been made in debian already

(It's not)

17:51:09 <mezcalero> mitr: but the slightest opposition is not reason to already give up

(Ignoring massive use cases like virtualization)

17:53:21 <mezcalero> so, let me stress a couple of things: a) we will fix what breaks. b) if it's too much we will revert this before the release. c) people can easily opt-out from this, and we document it

(Except, I pointed out the huge number of packages that are broken by this change, but nothing was done, and it's NOT easy to opt out -- it requires a reboot)

And on and on, nonsense, with NO actual measurements at all.

18:12:52 <mitr> #agreed tmp-on-tmpfs is accepted (+5 -3)

Great result.
Comment 13 Nicolas Mailhot 2013-04-11 07:56:08 EDT
The only thing that can make /tmp on tmpfs safe from the apps' POV is autoprovisioning of swap files under memory pressure 
(add swap files when needed, remove them when pressure ceases, clean them up on reboot)

And that would be actually useful for more things than /tmp on tmpfs, and probably the kind of housekeeping task an expanded init system like systemd should take care of.
Comment 14 Eric Blake 2013-05-13 16:57:22 EDT
I hit this again today, when my VM slowed to a painfully slow crawl during a 'git gc --aggressive' that exhausted /tmp. As finally documented in http://www.freedesktop.org/wiki/Software/systemd/APIFileSystems, I am now doing this in all of my VMs which are intentionally <1G memory but with 10G disks:

systemctl mask tmp.mount

but it would be SOOOO much nicer if systemd would automatically avoid /tmp on tmpfs in VMs in the first place.
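
To check whether the mask took effect after a reboot, one option is to look at which filesystem type backs /tmp. The sketch below parses text in the /proc/mounts format; the helper name and the sample data are made up for this example.

```python
def tmp_filesystem_type(mounts_text, mount_point="/tmp"):
    """Return the filesystem type mounted at mount_point, or None.

    None means nothing is mounted there, so the directory simply lives on
    its parent filesystem (the situation wanted for a disk-backed /tmp).
    """
    fstype = None
    for line in mounts_text.splitlines():
        fields = line.split()
        # /proc/mounts columns: device, mount point, type, options, dump, pass
        if len(fields) >= 3 and fields[1] == mount_point:
            fstype = fields[2]  # a later mount shadows an earlier one
    return fstype


sample = """\
/dev/vda1 / ext4 rw,relatime 0 0
tmpfs /tmp tmpfs rw,nosuid,nodev 0 0
"""
print(tmp_filesystem_type(sample))
```

On a live system the same function could be fed the contents of /proc/mounts, for example `tmp_filesystem_type(open("/proc/mounts").read())`.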
Comment 15 Fedora End Of Life 2013-12-21 10:06:05 EST
This message is a reminder that Fedora 18 is nearing its end of life.
Approximately 4 (four) weeks from now Fedora will stop maintaining
and issuing updates for Fedora 18. It is Fedora's policy to close all
bug reports from releases that are no longer maintained. At that time
this bug will be closed as WONTFIX if it remains open with a Fedora 
'version' of '18'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version prior to Fedora 18's end of life.

Thank you for reporting this issue and we are sorry that we may not be 
able to fix it before Fedora 18 is end of life. If you would still like 
to see this bug fixed and are able to reproduce it against a later version 
of Fedora, you are encouraged to change the 'version' to a later Fedora 
version prior to Fedora 18's end of life.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events. Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.
Comment 16 Lennart Poettering 2014-07-03 15:47:46 EDT
OK, let's be honest, I don't think this makes any sense. If people are concerned about RAM being limited, they should set up a swap partition, it's a much better idea. 

Supporting /tmp on physical disk is a worthy thing to do, but any algorithm of trying to be smart and then enabling completely different behaviour is just wrong.

If you have limited RAM, then the answer is to use a swapfile, not disable tmpfs. 

Closing.
Comment 17 Richard W.M. Jones 2014-07-03 16:25:26 EDT
Reopening as it's still a real bug that affects VMs, and the
solution suggested in comment 16 is not in fact a solution.
Comment 18 Tom Gundersen 2014-07-10 10:18:42 EDT
Richard, care to elaborate on the problem with using swapfiles?
Comment 19 Richard W.M. Jones 2014-07-10 10:34:54 EDT
Care to say what the solution is?  Are you genuinely proposing
adding lots of swap files as a way to work around this?  How
would that be done automatically for VMs?  Would the swap files
be created and removed on demand?
Comment 20 Lennart Poettering 2014-07-16 11:06:16 EDT
If you put together a VM image, where you previously reserved a certain amount of bytes for /tmp, instead simply create a new swap partition of the same size, and use that. That's all.

The performance will be much better (since it relieves the kernel from having to always sync things to disk), and things will be simpler, too.
Comment 21 Richard W.M. Jones 2014-07-16 11:26:15 EDT
The swap file uses regular disk space, instead of sharing disk space
as files on a regular /tmp would do.

I can't believe you wouldn't know this, but I'm sick of arguing
with you.
Comment 22 Eric Blake 2014-07-16 12:39:22 EDT
(In reply to Lennart Poettering from comment #20)
> If you put together a VM image, where you previously reserved a certain
> amount of bytes for /tmp, instead simply create a new swap partition of the
> same size, and use that. That's all.
> 
> The performance will be much better (since it relieves the kernel from
> having to always sync things to disk), and things will be simpler, too.

I agree with Richard here.  Your proposal is broken.  In a system where /tmp is backed by disk, I can _share_ the storage for /tmp with the REST of my system, all by keeping /tmp as part of the same mount point as the rest of the system.  Thus, if I create a guest VM with 10G of disk, I can SHARE the 10G between /tmp and / and /home (assuming I do a single mount point).  This works whether I need 1G temp files and 9G normal files, or 5G temp files and 5G normal files.

But with your proposal, I have to carve out space IN ADVANCE that is reserved just for swap (and therefore /tmp).  If I want 2G for /tmp, I have then limited myself to 8G for the rest of my file system.  I can't get either scenario listed above (1G temp and 9G normal? Nope, 9G exceeds the 8G of the normal partition.  5G temp and 5G normal? Nope, 5G /tmp exceeds the size of 2G swap).

The argument here is that for VMs, we do NOT want to be in the business of artificially partitioning the system, because we do not necessarily know the workload that will be used on that VM, and therefore cannot accurately predict an optimal partitioning when we would much rather expose a SINGLE mount point for the entire system, including /tmp.
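
The two layouts can be compared with a few lines of arithmetic. This is a simplified sketch of the scenarios above: it treats swap as the only backing store for /tmp and ignores the RAM contribution; the function names are illustrative only.

```python
def fits_shared(disk_gb, tmp_gb, normal_gb):
    """Single mount point: /tmp and everything else share the whole disk."""
    return tmp_gb + normal_gb <= disk_gb

def fits_partitioned(disk_gb, swap_gb, tmp_gb, normal_gb):
    """Swap carved out in advance: /tmp is capped by swap, the rest by the root."""
    root_gb = disk_gb - swap_gb
    return tmp_gb <= swap_gb and normal_gb <= root_gb


# 10G disk, with 2G reserved for swap in the partitioned layout.
for tmp_gb, normal_gb in [(1, 9), (5, 5)]:
    print(f"{tmp_gb}G temp + {normal_gb}G normal:",
          "shared" if fits_shared(10, tmp_gb, normal_gb) else "no shared",
          "/",
          "partitioned" if fits_partitioned(10, 2, tmp_gb, normal_gb) else "no partitioned")
```

Both workloads fit on the shared 10G disk and neither fits once 2G is set aside for swap, which is the point being made here.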
Comment 23 Sergio Monteiro Basto 2014-08-04 11:07:56 EDT
(In reply to Richard W.M. Jones from comment #21)
> The swap file uses regular disk space, instead of sharing disk space
> as files on a regular /tmp would do.
> 
> I can't believe you wouldn't know this, but I'm sick of arguing
> with you.

Isn't the problem that tmpfs uses memory instead of disk space? If a VM (or computer) has limited RAM, we shouldn't use tmpfs, should we?

I hit another problem when an application (Digikam) wanted to write more than 2 GB to /tmp; the solution was to use /var/tmp instead of /tmp ...
Comment 24 Lennart Poettering 2014-08-04 16:05:00 EDT
(In reply to Eric Blake from comment #22)
> (In reply to Lennart Poettering from comment #20)
> > If you put together a VM image, where you previously reserved a certain
> > amount of bytes for /tmp, instead simply create a new swap partition of the
> > same size, and use that. That's all.
> > 
> > The performance will be much better (since it relieves the kernel from
> > having to always sync things to disk), and things will be simpler, too.
> 
> I agree with Richard here.  Your proposal is broken.  In a system where /tmp
> is backed by disk, I can _share_ the storage for /tmp with the REST of my
> system, all by keeping /tmp as part of the same mount point as the rest of
> the system.  Thus, if I create a guest VM with 10G of disk, I can SHARE the
> 10G between /tmp and / and /home (assuming I do a single mount point).  This
> works whether I need 1G temp files and 9G normal files, or 5G temp files and
> 5G normal files.
> 
> But with your proposal, I have to carve out space IN ADVANCE that is
> reserved just for swap (and therefore /tmp).  If I want 2G for /tmp, I have
> then limited myself to 8G for the rest of my file system.  I can't get
> either scenario listed above (1G temp and 9G normal? Nope, 9G exceeds the 8G
> of the normal partition.  5G temp and 5G normal? Nope, 5G /tmp exceeds the
> size of 2G swap).
> 
> The argument here is that for VMs, we do NOT want to be in the business of
> artificially partitioning the system, because we do not necessarily know the
> workload that will be used on that VM, and therefore cannot accurately
> predict an optimal partitioning when we would much rather expose a SINGLE
> mount point for the entire system, including /tmp.

Well, you have to size at least RAM and the image's rootfs anyway. Whether you have to size two or three parameters, where's the big difference? You could even simply not use a swap partition, and instead increase the RAM size and allow the host system to swap it out where necessary (not that I would recommend that, though).
Comment 25 Lennart Poettering 2014-08-04 16:07:01 EDT
(In reply to Sergio Monteiro Basto from comment #23)
> (In reply to Richard W.M. Jones from comment #21)
> > The swap file uses regular disk space, instead of sharing disk space
> > as files on a regular /tmp would do.
> > 
> > I can't believe you wouldn't know this, but I'm sick of arguing
> > with you.
> 
> Isn't the problem that tmpfs uses memory instead of disk space? If a VM (or
> computer) has limited RAM, we shouldn't use tmpfs, should we?

No, this is not how this works. tmpfs is backed by swappable memory. If you are low on RAM, add more swap. tmpfs will be in RAM when it is available, and paged out if it isn't. That's exactly like any other memory.