Bug 1094072

Summary: corrupted rpm database when exec of build-locale-archive fails in lua postinstall scriptlet
Product: [Fedora] Fedora
Reporter: Christian Iseli <Christian.Iseli>
Component: rpm
Assignee: Packaging Maintenance Team <packaging-team-maint>
Status: CLOSED CURRENTRELEASE
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: low
Docs Contact:
Priority: unspecified
Version: 22
CC: codonell, ffesti, fweimer, jakub, jzeleny, law, lkardos, mnewsome, novyjindrich, packaging-team-maint, pfrankli, pknirsch, pmatilai
Target Milestone: ---   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2016-02-19 13:40:46 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Christian Iseli 2014-05-04 23:05:44 UTC
Description of problem:

In the lua postinstall scriptlet of glibc-common, which looks like so:
if posix.access("/etc/ld.so.cache") then
  if posix.stat("/usr/lib/locale/locale-archive.tmpl", "size") > 0 then
    pid = posix.fork()
    if pid == 0 then
      posix.exec("/usr/sbin/build-locale-archive")
    elseif pid > 0 then
      posix.wait(pid)
    end
  end
end

If the posix.exec("/usr/sbin/build-locale-archive") call fails for any reason, the forked child process keeps running and repeats the remaining operations of the parent rpm process, which can leave the rpm database corrupted.
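
To illustrate the pattern outside of rpm, here is a minimal Python sketch. It is not the glibc scriptlet and not rpm's API: Python's os module stands in for rpm's Lua posix module, and run_helper is a hypothetical name. One difference to keep in mind is that Python's os.execv raises OSError on failure, whereas in the scriptlet above a failed exec simply lets the child carry on. The point of the sketch is the line after the exec attempt: the child must terminate itself there instead of returning into the caller's code.

```python
import os


def run_helper(path):
    """Fork, try to exec `path` in the child, and return the child's exit code.

    Returns None if the child was killed by a signal.
    """
    pid = os.fork()
    if pid == 0:
        # Child process.
        try:
            os.execv(path, [path])
        except OSError:
            pass
        # Only reached when exec failed. Exit immediately so the child
        # never runs the rest of the parent's program. 127 is the
        # conventional "command not found" status used by shells.
        os._exit(127)
    # Parent process: wait for the child and decode its status.
    _, status = os.waitpid(pid, 0)
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    return None
```

Called with a path that does not exist, for example run_helper("/nonexistent/build-locale-archive"), the child ends with status 127 and only the parent continues.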

Version-Release number of selected component (if applicable):

2.19.90-1

How reproducible:

In principle this should not happen, but I hit a local corner case where the trigger code on glibc was launched before the /usr/sbin/build-locale-archive file was installed. It took me a while to figure out why yum was failing in the mock --init step with a strange DB_LOCK error.

Steps to Reproduce:

Sorry, no easy way here.

May I suggest adding a graceful exit of some sort if the exec call fails?

Comment 1 Jaroslav Reznik 2015-03-03 15:45:44 UTC
This bug appears to have been reported against 'rawhide' during the Fedora 22 development cycle.
Changing version to '22'.

More information and reason for this action is here:
https://fedoraproject.org/wiki/Fedora_Program_Management/HouseKeeping/Fedora22

Comment 2 Panu Matilainen 2015-03-09 11:10:58 UTC
Eek. Ffesti & lkardos, you might want to see if rpm can somehow protect itself in such a situation. This is similar in spirit to http://rpm.org/ticket/167.

Comment 3 Florian Weimer 2016-02-05 15:43:39 UTC
(In reply to Christian Iseli from comment #0)

> May I suggest to add a graceful exit of some sort if the exec call step
> fails ?

I don't see a way to do this in the glibc spec file because RPM does not expose the exit functions:

$ rpm --eval '%{lua: print(posix.fork, posix.exit, posix._exit)}'
function: 0x7fa8cb031490	nil	nil

Reassigning to RPM.  I think the RPM Lua script interpreter needs to detect forks and call _exit outside the script.  I assume that exit and _exit are not exposed for a reason.
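
The idea of detecting the fork outside the script can be sketched in Python as follows. This is only an illustration of the suggestion above, under the assumption that comparing process IDs before and after the script is an acceptable way to detect a fork; it does not describe how rpm implements anything. The names run_script_guarded and leaky_script, and the exit status 126, are made up for the example.

```python
import os


def run_script_guarded(script):
    """Run `script`; if we come back in a forked child, exit right away."""
    parent_pid = os.getpid()
    try:
        script()
    finally:
        if os.getpid() != parent_pid:
            # We are a child that escaped the script (for example after
            # a failed exec). Leave without running any cleanup code
            # that belongs to the parent.
            os._exit(126)


def leaky_script(state):
    """Hypothetical script that forks and lets the child fall through."""
    pid = os.fork()
    if pid > 0:
        state["child"] = pid  # recorded in the parent only
    # The child returns from here, just like after a failed exec.
```

With this wrapper the parent returns normally from run_script_guarded, while the stray child is terminated with status 126 before it can touch anything else.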

Comment 4 Christian Iseli 2016-02-05 16:45:25 UTC
Wouldn't os.exit work ?

Comment 5 Christian Iseli 2016-02-06 08:20:23 UTC
chris: rpm --eval '%{lua: print(posix.fork, posix.exit, posix._exit, os.exit)}'
function: 0x7fe4632c8dc0	nil	nil	function: 0x7fe4632c98c0

Comment 6 Florian Weimer 2016-02-06 10:06:50 UTC
(In reply to Christian Iseli from comment #4)
> Wouldn't os.exit work ?

Ah, no, because it runs atexit handlers, which has unknown consequences.  These atexit handlers will be run *again* when the parent process exits, so this is usually not what is needed.
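
The double execution of exit handlers can be reproduced with a small Python sketch. It uses Python's atexit module as a stand-in for C-level atexit handlers, so it only mirrors the behaviour described here and says nothing about rpm internals; count_handler_runs is a hypothetical helper. It starts a fresh interpreter that registers one handler, forks, and lets the child leave either through a normal exit or through os._exit.

```python
import subprocess
import sys

# Program run in a separate interpreter. The handler writes straight to
# file descriptor 1, so no buffered output is duplicated by the fork.
SCRIPT = """
import atexit, os, sys
atexit.register(lambda: os.write(1, b"ran;"))
pid = os.fork()
if pid == 0:
    {child_exit}
os.waitpid(pid, 0)
"""


def count_handler_runs(child_exit):
    """Return how many times the exit handler ran across parent and child."""
    code = SCRIPT.format(child_exit=child_exit)
    result = subprocess.run(
        [sys.executable, "-c", code], capture_output=True, check=True
    )
    return result.stdout.count(b"ran;")
```

count_handler_runs("sys.exit(0)") counts two runs of the handler (once in the child, once in the parent), while count_handler_runs("os._exit(0)") counts one, because os._exit skips exit handlers in the child.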

Comment 7 Ľuboš Kardoš 2016-02-19 13:40:46 UTC
This is fixed in rpm-4.13.x which is in F23.

Upstream commit:
https://github.com/rpm-software-management/rpm/commit/2d418ad3c11bcf0261d0022ac177d13284a8d5fb