Bug 1031243

Summary: glibc: [RFE] build-locale-archive should have a --prefix option.
Product: [Fedora] Fedora
Reporter: Andrew J. Schorr <aschorr>
Component: glibc
Assignee: glibc team <glibc-bugzilla>
Status: CLOSED NOTABUG
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified
Priority: unspecified
Version: rawhide
CC: aschorr, codonell, fweimer, jakub, law, mnewsome, pfrankli, schwab, zbyszek
Keywords: FutureFeature
Hardware: Unspecified
OS: Unspecified
Doc Type: Bug Fix
Last Closed: 2019-06-05 15:01:16 UTC
Type: Bug

Attachments:
Script to shrink locale-archive safely on a running system (flags: none)

Description Andrew J. Schorr 2013-11-16 04:11:44 UTC
Description of problem:
When a minimal install is desired, one may want to reduce the size of /usr/lib/locale/locale-archive.  This can be done by using the "localedef --delete-from-archive" command followed by rebuilding the archive with build-locale-archive.  The problem is that the current behavior of build-locale-archive makes it impossible to do this cleanly.  If build-locale-archive had a --prefix option as localedef does, then one could easily build a new archive off to the side and then install it atomically.  The current build-locale-archive seems to run only on the live files, so it is impossible to rebuild cleanly without setting up a chroot.

Version-Release number of selected component (if applicable):
glibc-common-2.17-19.fc19.x86_64


How reproducible:
Run build-locale-archive and note the race conditions.

Steps to Reproduce:
1. localedef --list-archive | grep -iv ^en_us.utf | xargs localedef --delete-from-archive
2. mv /usr/lib/locale/locale-archive /usr/lib/locale/locale-archive.tmpl
3. build-locale-archive

Actual results:
Processes that have /usr/lib/locale/locale-archive mmapped will crash.  This includes crond and various shells and other processes.

Expected results:
There should be a way to build a new archive off to the side and install it atomically.

Additional info:

Comment 1 Carlos O'Donell 2013-11-16 07:12:37 UTC
I agree that adding a --prefix option to build-locale-archive would be nice, but it's orthogonal to the issue at hand.

There is nothing wrong with build-locale-archive, and it is the steps you have used that have broken your system. I will try to explain.

Every process that uses the locale-archive loads it via mmap with MAP_PRIVATE. Doing so creates a reference to the file in the filesystem.

The glibc upgrade process is very careful to first unlink locale-archive and then recreate it from the template. This is critically important because, when the unlink happens, any old process using the old list of locales continues to run correctly. Linux will not remove the old file data until the reference counts drop to zero.

There is no problem with build-locale-archive (except for your missing feature). The problem is in your process. Your use of `mv' in step 2 boils down to a rename syscall and is not equivalent to cp/rm. Therefore all existing processes with mmaps that reference locale-archive now have mmaps that reference locale-archive.tmpl. This is now dangerous. When you run build-locale-archive it will eventually truncate locale-archive.tmpl and that will cause undefined behaviour for all the processes in your system that have mmaps to that data. You must not execute an mv in step 2, instead you must use cp and rm.

Does that explain why what you did is incorrect?

I've retitled the RFE. I agree it would be nice for testing to support some alternate --prefix.
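
The difference between a rename and a copy followed by an unlink, as described in this comment, can be observed on scratch files. The following is only an illustrative sketch: it works in a temporary directory, never touches /usr/lib/locale, and the helper name inode_of is made up for the demonstration. It uses an open file descriptor as a stand-in for an mmap, since both keep the underlying file data alive after the name is removed.

```shell
#!/bin/sh
# Scratch-file demonstration only; nothing under /usr/lib/locale is touched.
workdir=$(mktemp -d)
printf 'old archive contents\n' > "$workdir/archive"

# Hypothetical helper: print the inode number of a file (first field of ls -i).
inode_of() { ls -i "$1" | awk '{print $1}'; }

inode_before=$(inode_of "$workdir/archive")

# mv within one directory is a rename: the new name refers to the same inode,
# so anything that had the old name open or mapped now sees the renamed file.
mv "$workdir/archive" "$workdir/archive.tmpl"
inode_after_mv=$(inode_of "$workdir/archive.tmpl")

# cp creates a separate file with its own inode.
cp "$workdir/archive.tmpl" "$workdir/archive.copy"
inode_copy=$(inode_of "$workdir/archive.copy")

# An open descriptor keeps the data readable after the name is unlinked.
exec 3< "$workdir/archive.copy"
rm "$workdir/archive.copy"
still_readable=$(cat <&3)
exec 3<&-

echo "before=$inode_before after_mv=$inode_after_mv copy=$inode_copy"
echo "read after unlink: $still_readable"
rm -rf "$workdir"
```

The inode printed before and after the mv is the same, while the copy gets a different one. That is why truncating the renamed template affects processes that mapped the original archive, whereas unlinking leaves their view intact.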

Comment 2 Andrew J. Schorr 2013-11-16 13:22:11 UTC
Hi Carlos,

Thanks for the prompt and lengthy explanation.  I was actually aware that using "cp" and "rm" would fix the problem with crashes, but I don't think it changes the need for a --prefix option.

First of all, where is any of this documented?  Why isn't there a man page for build-locale-archive?  And why doesn't localedef's man page document the --list-archive and --delete-from-archive options?  At least localedef has a usage message, whereas build-locale-archive does not.

Because of this lack of documentation, one can only figure this out by googling and getting pointers from other folks who have devised the faulty method above.  After getting that tip, I downloaded the source code and realized that the crashes were caused by using "mv" instead of "cp/rm".

But there are still 2 flaws with the "cp/rm" approach:

1. What happens to any program that tries to map the locale-archive after it was removed and before build-locale-archive finishes rebuilding a good copy?

2. The "cp/rm" approach requires a lot of disk space.  One must have free disk space equal to the size of the current locale-archive file plus enough free space to contain the new, smaller file.  On a flash-based system, this might be a problem.

With a --prefix option, I think one should be able to build a new version of locale-archive under /tmp and then install it atomically.  This will eliminate the race condition problem for programs that try to start between the removal of the old file and the completion of building the new one.  And it reduces the additional disk space requirements to the size of the new (smaller) file.

That is why I think --prefix is needed.  And some documentation of this process somewhere would be great!

Thanks,
Andy

Comment 3 Andrew J. Schorr 2013-11-16 20:33:30 UTC
I think this is the best that can be done without having a --prefix option for build-locale-archive.  It at least seems to avoid the need for having enough disk space to copy the 100MB+ locale-archive file (error checking omitted):

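# Work on a private copy of the archive under a temporary prefix.
tmpdir=$(mktemp -dt shrinkla.XXXXXXXXXX)
fn=/usr/lib/locale/locale-archive
nfn=${tmpdir}${fn}
mkdir -p "$(dirname "$nfn")"
cp -p "$fn" "$nfn"
# Delete every locale except en_US.utf* from the copy.
localedef --list-archive | grep -iv '^en_us.utf' | xargs localedef --delete-from-archive --prefix "$tmpdir"
# Expose the trimmed copy as the template through a symlink, then rebuild.
mv "$nfn" "${nfn}.tmpl"
rm -f "${fn}.tmpl"
ln -s "${nfn}.tmpl" "${fn}.tmpl"
rm -f "$fn"
build-locale-archive
# Put an empty template back and clean up.
rm -f "${fn}.tmpl"
touch "${fn}.tmpl"
rm -rf "$tmpdir"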

It still suffers from a race condition (there is a period of time when there is no locale-archive file installed), but it solves the disk space problem.

Is there any way to improve on this without having a build-locale-archive --prefix option?

Note that a reboot is probably required to reclaim the disk space for the deleted (huge) locale-archive file still mapped by running processes.  Or one could use lsof to identify such processes and restart them individually.

Regards,
Andy

Comment 4 Carlos O'Donell 2013-11-17 01:15:11 UTC
(In reply to Andrew J. Schorr from comment #2)
> First of all, where is any of this documented?  Why isn't there a man page
> for build-locale-archive?  And why doesn't localedef's man page document the
> --list-archive and --delete-from-archive options?  At least localedef has a
> usage message, whereas build-locale-archive does not.

There is no man page for build-locale-archive because it's a distribution specific tool used by Fedora that is not intended for use by users. However on principle I agree that all tools, internal or external, should have a --help that lists options if any and the version of the tool (so anyone supporting you can know exactly what tool it was). The documentation issue is an orthogonal bug and please feel free to file a bug for that too. However, keep in mind that build-locale-archive was not intended for use by users.

The localedef man page documents only the portable POSIX options that are required for localedef. Non-portable GNU extensions are documented only in `localedef --help' output. The traditional manual page that would describe all of these options is normally a part of the linux kernel man pages project. In this case however the man pages project doesn't have a manual for localedef (which is odd actually since Michael Kerrisk is very thorough). You could file a bug against glibc to document localedef in the locales portion of the manual, or a bug against man-pages to get a localedef manual that describes all the options under Linux (even the non-portable ones).
 
> Because of this lack of documentation, one can only figure this out by
> googling and getting pointers from other folks who have devised the faulty
> method above.  After getting that tip, I downloaded the source code and
> realized that the crashes were caused by using "mv" instead of "cp/rm".

That's not true. You can email the maintainers of the package or file a bug as you just did and we'd be happy to talk about it. I'm glad that my comments helped.

> But there are still 2 flaws with the "cp/rm" approach:
> 
> 1. What happens to any program that tries to map the locale-archive after it
> was removed and before build-locale-archive finishes rebuilding a good copy?

Don't do the `rm'.

In the "cp/rm" approach there is obviously a race where any new application starting up won't find the locale archive and will fall back to loading the requested locales from their on-disk storage. The problem there might be that if you don't have any locales in on-disk storage then only the builtin locales are available.

Therefore you should just adjust the archive to contain the required locales, copy it to the template, re-run build-locale-archive, and allow it to take care of the rest. There are a few bugs with build-locale-archive, but we can fix those as we go. For example, build-locale-archive also has a similar race condition in that it unlinks the original archive, then creates it again atomically with a minimal empty header. Therefore any application that happens to try to use a locale in the middle of this process, about a 20ms window on my new-ish laptop, will have problems using any locales not in the archive and not in the filesystem.

> 2. The "cp/rm" approach requires a lot of disk space.  One must have free
> disk space equal to the size of the current locale-archive file plus enough
> free space to contain the new, smaller file.  On a flash-based system, this
> might be a problem.

It gets worse if you want to reduce the race window where only POSIX and C are available. To do that you need to:

* Copy the original locale archive to the template.
* Remove the locales from the template you don't need.
* Rebuild a temporary locale archive from the template (to make it smaller)
* Use rename to atomically move the temporary locale archive into position as locale-archive.

In the middle you need space for 3x the size of the locale archive.

My opinion is that this small race window is unacceptable and is actually a bug.

> With a --prefix option, I think one should be able to build a new version of
> locale-archive under /tmp and then install it atomically.  This will
> eliminate the race condition problem for programs that try to start between
> the removal of the old file and the completion of building the new one.  And
> it reduces the additional disk space requirements to the size of the new
> (smaller) file.

I can agree that with --prefix you could create a new 3rd file somewhere else, say in /tmp, and then use rename to atomically move it into place and never have a window where the locales might be missing.
 
> That is why I think --prefix is needed.  And some documentation of this
> process somewhere would be great!

It's not needed, but it's nice. If build-locale-archive used rename it would work to just copy the locale archive to the template.

Right so in summary:
* Add --prefix, --inputfile, and FILE support like localedef has (or does now after my recent upstream fix)
* Fix unlink early bug and use rename from within build-locale-archive.
  - Takes 3x size but has no race
  - Makes running `/usr/sbin/build-locale-archive' just work as long as the default template is present.

Comments?
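
Since the race-free variant needs several copies of the archive on disk at once, a free-space check before starting may be useful. The following is a rough sketch only: the helper name has_room_for and its arguments are invented for illustration, and the multiplier to use is left to the caller because it depends on how many copies exist at the same time.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the filesystem holding DIR has at least
# FACTOR times the on-disk size of FILE available.
has_room_for() {
    file=$1
    dir=$2
    factor=$3
    # du -k reports the space the file occupies, in kilobytes.
    size_kb=$(du -k "$file" | awk '{print $1}')
    # df -P prints one POSIX-format line per filesystem; field 4 is free KB.
    free_kb=$(df -Pk "$dir" | awk 'NR==2 {print $4}')
    [ "$free_kb" -ge $((size_kb * factor)) ]
}

# Example on a scratch file rather than the real archive.
scratch=$(mktemp -d)
printf 'placeholder\n' > "$scratch/archive"
if has_room_for "$scratch/archive" "$scratch" 2; then
    room_status=enough
else
    room_status=short
fi
echo "free space check: $room_status"
rm -rf "$scratch"
```

For the real archive the call would look something like has_room_for /usr/lib/locale/locale-archive /usr/lib/locale 2, assuming two additional full-size copies are created next to the original.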

Comment 5 Carlos O'Donell 2013-11-17 01:16:57 UTC
(In reply to Andrew J. Schorr from comment #3)
> Is there any way to improve on this without having a build-locale-archive
> --prefix option?

No. That's probably the best option.

> Note that a reboot is probably required to reclaim the disk space for the
> deleted (huge) locale-archive file still mapped by running processes.  Or
> one could use lsof to identify such processes and restart them individually.

Right.
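
As an alternative to lsof, processes that still map a deleted archive can be found by scanning /proc/PID/maps, where Linux appends " (deleted)" to the path of an unlinked file. This is a sketch under that assumption; the function name stale_archive_pids and its optional proc-root argument are made up here, and reading the maps of other users' processes generally requires root.

```shell
#!/bin/sh
# Hypothetical helper: print the PIDs whose memory maps still reference a
# deleted locale-archive (or locale-archive.tmpl).
# $1 optionally overrides the proc root, which is handy for dry runs.
stale_archive_pids() {
    proc_root=${1:-/proc}
    for maps in "$proc_root"/[0-9]*/maps; do
        # Skip unreadable entries and processes that exited meanwhile.
        [ -r "$maps" ] || continue
        if grep -q 'locale-archive.* (deleted)$' "$maps" 2>/dev/null; then
            pid_dir=${maps%/maps}
            echo "${pid_dir##*/}"
        fi
    done
}

# Typical use, as root:
#   stale_archive_pids
```

Each PID printed belongs to a process that keeps the old file data alive; restarting those processes releases the disk space without a reboot.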

Comment 6 Andrew J. Schorr 2013-11-17 13:36:33 UTC
Hi Carlos,

Thanks again for your thorough response.  Everything you say makes sense.  It would be a big improvement if there were a way to build a new locale-archive off to the side and then install it atomically.  The only aspects I don't understand are adding --inputfile and FILE support to build-locale-archive.  I'm not sure how that comes into play, but I defer to your far superior knowledge of what's going on here.  At the end of the day, it would be very nice to give folks a recipe for how to shrink locale-archive atomically without trashing running processes.  I think my formula above works but for the race condition.  I have a script to do this that I can attach, but I'd like to add some disk space checks.  Is there any way of knowing in advance how large the locale-archive file will be if I remove all locales except for one (e.g. en_US.utf8)?  If I had that info, I could calculate the disk space requirements properly for the new file.

Thanks,
Andy

Comment 7 Andrew J. Schorr 2013-11-17 17:02:43 UTC
Created attachment 825208 [details]
Script to shrink locale-archive safely on a running system

This script allows you to remove all but one locale from the locale-archive file.  It does this safely on a running system with minimal disk space requirements.  I think the only defect is a race condition due to a short window of time when there is no locale-archive file present.  To eliminate this problem will require enhancements to the build-locale-archive program.

Regards,
Andy

Comment 8 Fedora End Of Life 2015-01-09 20:36:28 UTC
This message is a notice that Fedora 19 is now at end of life. Fedora 
has stopped maintaining and issuing updates for Fedora 19. It is 
Fedora's policy to close all bug reports from releases that are no 
longer maintained. Approximately 4 (four) weeks from now this bug will
be closed as EOL if it remains open with a Fedora 'version' of '19'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not 
able to fix it before Fedora 19 is end of life. If you would still like 
to see this bug fixed and are able to reproduce it against a later version 
of Fedora, you are encouraged to change the 'version' to a later Fedora 
version before this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events. Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

Comment 9 Zbigniew Jędrzejewski-Szmek 2015-11-19 01:55:51 UTC
FWIW, I think that the script in comment #7 is overly complicated. To keep things simple and robust, build-locale-archive should open /usr/lib/locale/locale-archive in read mode, open /usr/lib/locale/locale-archive.XXXXXX for writing, write whatever locales it is supposed to write, and atomically rename /usr/lib/locale/locale-archive.XXXXXX over /usr/lib/locale/locale-archive. This should all be built into the binary itself, with no need to wrap it in a script. Also, /tmp/ should not be used for this, because /tmp is usually a different filesystem, and a rename across filesystems is no longer atomic.

--prefix would be nice, but is completely orthogonal.
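
The write-to-a-sibling-then-rename pattern described in this comment can be sketched in shell. This is an illustration only, not what build-locale-archive does: the function name atomic_install is invented, and the demonstration runs on scratch files. The temporary file is created next to the destination so that the final mv is a rename within one filesystem.

```shell
#!/bin/sh
# Hypothetical helper: install SRC as DST so that readers see either the old
# file or the complete new one, never a missing or partial file.
atomic_install() {
    src=$1
    dst=$2
    # Create the temporary file in the destination directory, so the final
    # mv is a same-filesystem rename rather than a copy.
    tmp=$(mktemp "${dst}.XXXXXX") || return 1
    if cp -p "$src" "$tmp" && mv -f "$tmp" "$dst"; then
        return 0
    fi
    rm -f "$tmp"
    return 1
}

# Demonstration on scratch files.
demo=$(mktemp -d)
printf 'old\n' > "$demo/locale-archive"
printf 'new\n' > "$demo/rebuilt"
atomic_install "$demo/rebuilt" "$demo/locale-archive"
installed=$(cat "$demo/locale-archive")
file_count=$(ls "$demo" | wc -l)
echo "installed contents: $installed"
rm -rf "$demo"
```

Processes that already mapped the old file keep their view of it, because the rename only replaces the directory entry; the old data goes away once the last reference is dropped.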

Comment 10 Carlos O'Donell 2016-01-08 15:13:18 UTC
You can now set the %_install_langs rpm macro to control exactly
which locales are installed by build-locale-archive. This is a
temporary solution to the problem you are facing. You can set
the macro globally and that will control which locales are installed
during glibc upgrades (when locale-archive is rebuilt). This should
be enough to solve your problem of trying to reduce locale archive
size.

The Fedora rawhide build-locale-archive has --help,
supports specifying the template file (the source of
the locales you will install from) and the archive file
(the target file you will install locales into).
~~~
Usage: build-locale-archive [OPTION]... [TEMPLATE-FILE] [ARCHIVE-FILE]
 Builds a locale archive from a template file.
 Options:
  -h, --help                 Print this usage message.
  -v, --verbose              Verbose execution.
  -l, --install-langs=LIST   Only include locales given in LIST into the 
                             locale archive.  LIST is a colon separated list
                             of locale prefixes, for example "de:en:ja".
                             The special argument "all" means to install
                             all languages and it must be present by itself.
                             If "all" is present with any other language it
                             will be treated as the name of a locale.
                             If the --install-langs option is present with an
                             empty string for an argument e.g. "", then no
                             locales are installed, similarly if "none" is
                             used.  The special argument "none" means to
                             install zero locales in addition to the builtin
                             locales.  If "none" is present with any other
                             language it will be treated as the name of a
                             locale.
                             If the --install-langs option is missing, all
                             locales are installed. The colon separated list
                             can contain any strings matching the beginning of
                             locale names.
                             If a string does not contain a "_", it is added.
                             Examples:
                               --install-langs="en"
                                 installs en_US, en_US.iso88591,
                                 en_US.iso885915, en_US.utf8,
                                 en_GB ...
                               --install-langs="en_US.utf8"
                                 installs only en_US.utf8.
                               --install-langs="ko"
                                 installs ko_KR, ko_KR.euckr,
                                 ko_KR.utf8 but *not* kok_IN
                                 because "ko" does not contain
                                 "_" and it is silently added
                               --install-langs="ko:kok"
                                 installs ko_KR, ko_KR.euckr,
                                 ko_KR.utf8, kok_IN, and
                                 kok_IN.utf8.
                               --install-langs="POSIX" will
                                 install *no* locales at all
                                 because POSIX matches none of
                                 the locales. Actually, any string
                                 matching nothing will do that.
                                 POSIX and C will always be
                                 available because they are
                                 builtin.
                             Aliases are installed as well,
                             i.e. --install-langs="de"
                             will install not only every locale starting with
                             "de" but also the aliases "deutsch"
                             and "german" although the latter does not
                             start with "de".

  If the arguments TEMPLATE-FILE and ARCHIVE-FILE are not given the locations
  where the glibc used expects these files are used by default.
~~~
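
The prefix-matching rule in the usage text above can be modeled with a small shell function. This is a sketch of the documented behavior only, not the tool's actual code: the function name locale_selected is invented, and alias handling (for example "deutsch" and "german") is not modeled.

```shell
#!/bin/sh
# Hypothetical model of the documented --install-langs matching.
# Returns 0 if locale NAME ($1) is selected by the colon separated LIST ($2).
locale_selected() {
    _name=$1
    _rest=$2
    # "all" by itself selects everything.
    [ "$_rest" = "all" ] && return 0
    # An empty list or "none" by itself selects nothing.
    case $_rest in ''|none) return 1 ;; esac
    while [ -n "$_rest" ]; do
        # Take the next colon separated entry off the front of the list.
        _prefix=${_rest%%:*}
        case $_rest in
            *:*) _rest=${_rest#*:} ;;
            *)   _rest= ;;
        esac
        [ -z "$_prefix" ] && continue
        # An entry without "_" gets one appended, so "ko" becomes "ko_".
        case $_prefix in
            *_*) ;;
            *)   _prefix="${_prefix}_" ;;
        esac
        # Select the locale if its name starts with the prefix.
        case $_name in
            "$_prefix"*) return 0 ;;
        esac
    done
    return 1
}

for candidate in ko_KR.utf8 kok_IN en_US.utf8; do
    if locale_selected "$candidate" "ko"; then
        echo "$candidate: selected by ko"
    else
        echo "$candidate: not selected by ko"
    fi
done
```

With the list "ko" only ko_KR.utf8 is selected, which mirrors the help text's remark that "ko" does not pull in kok_IN.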

We don't yet have --prefix, but I hope that %_install_langs helps.

Comment 11 Florian Weimer 2019-06-05 15:01:16 UTC
The binary has been removed from rawhide.