Bug 518418 - Package rename shuts down server, results in unconfigured package
Summary: Package rename shuts down server, results in unconfigured package
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: 389-ds-base
Version: 11
Hardware: All
OS: Linux
Priority: low
Severity: high
Target Milestone: ---
Assignee: Rich Megginson
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Duplicates: 518526
Depends On:
Blocks: 389_1.2.2
Reported: 2009-08-20 10:52 UTC by Mitchell Berger
Modified: 2011-04-25 23:27 UTC (History)

Fixed In Version: 1.2.2-1.fc11
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2009-08-27 02:09:21 UTC
Type: ---
Embargoed:


Attachments
patch (3.71 KB, patch)
2009-08-24 16:41 UTC, Rich Megginson
new patch - uses lua (4.95 KB, patch)
2009-08-24 23:17 UTC, Rich Megginson

Description Mitchell Berger 2009-08-20 10:52:27 UTC
Description of problem:

fedora-ds-base was renamed to 389-ds-base last weekend.  More accurately,
last weekend was the first time 389-ds-base packages were built for
F10 and F11 because the software version was updated.  This update
caused (at least) two problems for anyone who had been using fedora-ds-base:

1. The old fedora-ds-base package has this fairly common preun scriptlet:

%preun
if [ $1 = 0 ]; then
        /sbin/service %{pkgname} stop >/dev/null 2>&1 || :
        /sbin/chkconfig --del %{pkgname}
fi

Thus when 389-ds-base was installed, because it Obsoletes fedora-ds-base,
fedora-ds-base believed that it was being removed instead of upgraded,
and so it dutifully shut down the directory server, with nothing in
place to restart it.  It is not okay to shut down (and leave stopped)
users' daemons when they update their systems.

2. Both the fedora-ds-base and 389-ds-base spec files contain this
definition:

%define pkgname   dirsrv

And their post scriptlets both contain this line:

/sbin/chkconfig --add %{pkgname}

As a result, when 389-ds-base was installed, it "added" the dirsrv
service, which was a no-op since that service was already installed
because it was part of the old fedora-ds-base package.  And then later
in the update transaction when the old fedora-ds-base package is cleaned
up, as you can see from its preun scriptlet above, it proceeds to
delete the dirsrv service.

There are two unacceptable consequences that result.  One is that not
only was the directory server shut down as described above, but it
won't ever be run again, even on reboot, because it's no longer
symlinked in the rc.d directories.  Another is that a user can't
even 'chkconfig dirsrv on' because the service doesn't exist, and
won't appear in 'chkconfig --list' output - the effect of the post
scriptlet for 389-ds-base is undone, and effectively the package is
broken because fedora-ds-base rips the service out from under it on
its way out the door.
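The scriptlet ordering described above can be reproduced without RPM. The following is a minimal sketch, assuming a scratch directory stands in for the chkconfig service registry. The function names (svc_add, svc_del, simulate_rename) are hypothetical and only model the effect of 'chkconfig --add' and 'chkconfig --del'; nothing here touches the real system.

```shell
#!/bin/sh
# Toy model of the update transaction. A scratch directory stands in for
# the chkconfig registry (assumption for illustration only).

# Model of "chkconfig --add NAME": registering twice is a no-op.
svc_add() { : > "$1/$2"; }

# Model of "chkconfig --del NAME".
svc_del() { rm -f "$1/$2"; }

# Replay the scriptlet order reported in this bug:
#   1. fedora-ds-base is already installed (service registered)
#   2. 389-ds-base %post runs "chkconfig --add dirsrv" (no-op)
#   3. fedora-ds-base %preun runs with $1 = 0 and deletes the service
simulate_rename() {
    registry=$1
    svc_add "$registry" dirsrv      # state before the update
    svc_add "$registry" dirsrv      # new package %post
    svc_del "$registry" dirsrv      # old package %preun, $1 = 0
    if [ -e "$registry/dirsrv" ]; then
        echo "dirsrv registered"
    else
        echo "dirsrv missing"
    fi
}

registry=$(mktemp -d)
simulate_rename "$registry"
rm -rf "$registry"
```

Running the sketch prints "dirsrv missing": because the old package's %preun runs after the new package's %post, the deletion wins.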

The missing dirsrv service could be remedied by immediately pushing
a no-change update to 389-ds-base, because the post scriptlet will
run again and add the service.  However, irreparable damage has been
done because you've lost the users' configuration of which runlevels
to run the service in.

The loss of configuration information and the service could have
been avoided by renaming the service to match the package name
(the post scriptlet for 389-ds-base would have had to examine and
copy the dirsrv runlevels).  I don't see an easy way to have avoided
the server shutdown except to have not renamed the package at all.

While I understand that the software has been renamed, and thus
its new name should be used in Rawhide, I believe it was an error
to allow this rename to enter Fedora 10 and 11 because it was easy
to predict that it would cause these problems.  I also see in
Bodhi that you pushed these updates straight to production with no
testing.  That testing would very likely have pointed out that the
update was going to break users' systems and lose their configuration
data.  Since there did not seem to be immediate urgency, skipping
testing doesn't seem to have been appropriate.

Additional info:

I see from bug 287241 that this is not the first time the name of this
package has been changed resulting in breaking users' installs because
of the pkgname dirsrv macro.  Why was this allowed to happen again?

Comment 1 Rich Megginson 2009-08-20 17:55:31 UTC
Sorry about that.  It did not occur to me that renaming the package would trigger the preun package deletion case - that was quite unexpected.  It is very important (to the 389 project team, anyway) that the package is renamed to 389.  (There is a long and boring history of the naming - the short version is that it never should have been called Fedora Directory Server).

This is what really bothers me:
> I don't see an easy way to have avoided
> the server shutdown except to have not renamed the package at all.

There must be a way, in the %preun, to differentiate between the package rename case and the package removal case?

Comment 2 Nathan Kinder 2009-08-20 19:53:35 UTC
*** Bug 518526 has been marked as a duplicate of this bug. ***

Comment 3 Rich Megginson 2009-08-20 20:04:34 UTC
Looks like no - there is no way, because all the fedora-ds-base package sees is the removal event - it has no way to know that it is being removed in order to replace it with 389.

I suppose I could push out an update to fedora-ds-base (and fedora-ds-admin - the bug affects both packages) - the update would just remove the preun script.  If someone really wanted to remove fedora-ds-base and admin without an upgrade, they would have to manually do the chkconfig --del.  This seems to me the lesser of two evils.  Would this be an acceptable solution to avoid this problem in the future?

Comment 4 Mitchell Berger 2009-08-21 06:22:39 UTC
There seem to be a handful of problems to consider with that solution:

If you were to push a new fedora-ds-base, you would have to bump its
version to at least 1.2.1-1 in order for it to be of any interest at
all to package managers, since 389-ds-base is visible and Obsoletes
any earlier version.  Since that's a change in the version and not
just the release, you'd presumably have to actually push the new
version of the directory server in that package, and not just a scriptlet
update (otherwise the package can't be version 1.2.1).

Then the question is whether its version should be 1.2.1-1 or something
higher.  If it's 1.2.1-1, then you have two packages offering to provide
fedora-ds-base-1.2.1-1.  I'm not sure whether yum on a system running
fedora-ds-base-1.2.0-4 would choose to take the fedora-ds-base update or
the 389-ds-base replacement, but either way fails: if it takes the
fedora-ds-base update, then it's never caused to change to 389-ds-base, and
if it takes the 389-ds-base replacement, then the fedora-ds-base scriptlet
never gets updated before the package is removed.  If you push a
fedora-ds-base with a higher version, I assume that yum will take that
update in preference to the 389-ds-base replacement.  Presumably
389-ds-base would later have to be bumped (and have its Obsoletes bumped
as well) in order to cause the switch.  But how much later?  You'll never
know when everyone's system has taken the fedora-ds-base transitional
update, so you won't know when it's safe to push the new 389-ds-base.

For systems that have already taken the recent 389-ds-base update,
would they be caused to upgrade back to fedora-ds-base?  I'm not sure
whether updates only happen for installed real packages or if they
also apply to metapackages that are Provided by real packages.  You
certainly don't want anyone who has 389-ds-base switching back now;
their systems would break again.

Even if you could arrange for what you propose, I don't think that
simply removing the preun scriptlet is correct - for anyone who was
genuinely trying to uninstall the package, it would probably cause the
daemon to crash (perhaps not terrible if you're getting rid of it), and
would remove the /etc/rc.d/init.d/dirsrv service file without
deconfiguring it, so the runlevels would end up with dangling symlinks
to a nonexistent service.  I have some vague recollection that chkconfig
only works when the service file exists, but the manpage doesn't seem
to mention that (maybe it only applies to --add?).  It seems like just
removing that scriptlet would leave you with an effectively broken package.

Thinking harder about this, it is actually possible to communicate
information between the packages during the update transaction, though
doing so is sort of ugly.  The way I've come up with for doing this
would be to have the 389-ds-base %pre script check to see if it's being
installed instead of upgraded (that is, $1 == 1 instead of 2), and if
so, see whether the dirsrv service already exists.  If it does, then
389-ds-base knows that fedora-ds-base had already been installed.  It
can then go about querying the chkconfig state of dirsrv, and save it
to some temporary file.  Then, in the infrequently used %posttrans
scriptlet for 389-ds-base, which will run after the fedora-ds-base
package has been removed and gone through its %preun and %postun, you
could restore the dirsrv service and configuration.
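The detection step in that proposal can be sketched as follows. This is an illustration under stated assumptions, not the scriptlet that was eventually shipped: the function names (is_rename_replacement, pre_action) and the INITDIR override are hypothetical, and the override exists only so the logic can be exercised outside a real system.

```shell
#!/bin/sh
# Sketch of the decision a 389-ds-base %pre scriptlet could make.
# INITDIR is overridable for testing (assumption); the default is the
# init script directory mentioned in this thread.
INITDIR=${INITDIR:-/etc/rc.d/init.d}

# $1 is the argument RPM passes to %pre:
#   1 = first install of this package name, 2 or more = upgrade.
# Succeeds only when this is a first install AND the dirsrv init script
# already exists, i.e. an older package that owns dirsrv is being replaced.
is_rename_replacement() {
    [ "$1" -eq 1 ] && [ -e "$INITDIR/dirsrv" ]
}

# Hypothetical %pre body: report what the scriptlet should do next.
pre_action() {
    if is_rename_replacement "$1"; then
        echo "save-runlevels"
    else
        echo "nothing-to-save"
    fi
}
```

In a spec file the %pre section would call pre_action "$1" (or inline the test) and save the runlevel state only in the "save-runlevels" case; the matching restore belongs in %posttrans, which runs after the old package's %preun and %postun.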

How to query the chkconfig state is arguably more ugly.  chkconfig
doesn't offer a machine-readable state readout that you can pipe back
into it.  You can query the on/off state for each level by looking at
the return code of 'chkconfig --level X dirsrv', but there's no exposed
way to find the start/stop priorities.  The best thought I've heard of
for how to completely capture the state would be to tar up
/etc/rc.d/rc?.d/[SK]??dirsrv as a temp file, and then untar it in
the %posttrans.  (I warned you it was an ugly idea.)
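One way to capture the same information is sketched below. Note the substitution: instead of a tar archive, this sketch records the link names in a plain text list and validates each entry before recreating it. The function names and the RCROOT override are hypothetical, and the relative link target ../init.d/dirsrv is an assumption about how the links are laid out on the system.

```shell
#!/bin/sh
# RCROOT is overridable for testing (assumption); the thread refers to
# the runlevel directories as /etc/rc.d/rc?.d.
RCROOT=${RCROOT:-/etc/rc.d}

# Print one line per runlevel link, e.g. "rc3.d/S21dirsrv" (the priority
# value is illustrative). The S/K prefix and the two-digit priority are
# preserved because they are part of the link name itself.
save_runlevel_links() {
    for link in "$RCROOT"/rc?.d/[SK]??dirsrv; do
        # An unmatched glob stays literal; -L filters that case out.
        [ -L "$link" ] || continue
        dir=${link%/*}
        echo "${dir##*/}/${link##*/}"
    done
}

# Read the saved list on stdin and recreate each link.
# Entries that do not match the expected shape are skipped, so the list
# cannot be used to create links anywhere else.
restore_runlevel_links() {
    while IFS= read -r entry; do
        case $entry in
            rc[0-6].d/[SK][0-9][0-9]dirsrv)
                ln -sf ../init.d/dirsrv "$RCROOT/$entry" ;;
            *)
                echo "skipping unexpected entry: $entry" >&2 ;;
        esac
    done
}
```

A %pre scriptlet would redirect the output of save_runlevel_links to a state file, and %posttrans would feed that file back into restore_runlevel_links. Where the state file lives and how it is protected is left open here.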

If this had been planned far ahead, perhaps the daemon shutdown could
have been cleanly avoided by an update to fedora-ds-base that would
have its %preun scriptlet check for (and subsequently remove) a flag
file left behind by the 389-ds-base %pre if it saw that it was being
installed and the dirsrv service existed.  It could choose to only
shut down the daemon if that flag file is absent.  Again, though,
there would never be a way for you to know when everyone had received
that update and it was safe to deploy the 389-ds-base name change.
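The flag-file handshake described in this paragraph could look roughly like the sketch below. It is hypothetical throughout: the thread does not name a flag path, so FLAG is an invented location, and the function names and the INITSCRIPT override are likewise assumptions made so the logic can be run in isolation.

```shell
#!/bin/sh
# FLAG is a hypothetical path; the thread does not specify one.
FLAG=${FLAG:-/var/run/dirsrv-rename-in-progress}
# INITSCRIPT is overridable for testing (assumption).
INITSCRIPT=${INITSCRIPT:-/etc/rc.d/init.d/dirsrv}

# Would run in the 389-ds-base %pre. $1 = 1 means first install.
# Leaves the flag only when the dirsrv service already exists.
pre_leave_flag() {
    if [ "$1" -eq 1 ] && [ -e "$INITSCRIPT" ]; then
        : > "$FLAG"
    fi
}

# Would run in an updated fedora-ds-base %preun. $1 = 0 means erase.
# Prints the action to take and consumes the flag if it is present.
preun_decide() {
    [ "$1" -eq 0 ] || { echo "upgrade: nothing to do"; return 0; }
    if [ -e "$FLAG" ]; then
        rm -f "$FLAG"
        echo "rename: keep service running"
    else
        echo "erase: stop and deregister service"
    fi
}
```

As the comment points out, this only helps systems that received the updated fedora-ds-base before the rename arrived; a system still carrying the old %preun would take the "erase" path regardless.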

Comment 5 Rich Megginson 2009-08-21 15:30:08 UTC
The 389-ds-base and 389-ds-admin packages will detect if their fedora-ds counterparts are installed in their %pre sections.  The %pre section will save the run level configuration somehow (temp file?  global lua variable?) and set a flag to denote that this is an upgrade replacement.  The %posttrans section will check to see if the flag exists, and if so, restore the run level config and start the server.

Will that solve the problems, except that there seems to be no way to avoid shutting down the servers?

Comment 6 Rich Megginson 2009-08-24 16:41:40 UTC
Created attachment 358482
patch

Comment 7 Rich Megginson 2009-08-24 23:17:12 UTC
Created attachment 358509
new patch - uses lua

It occurred to me that using tar and untarring the file is a tremendous security hole, with no obvious way to secure it.  Therefore, I'm scrapping that solution, and going with a solution that uses the built-in lua interpreter.  This also allows me to save the information in a global variable in the %pre section that I can access in the %posttrans section, which should be very secure.

Comment 8 Fedora Update System 2009-08-25 20:22:18 UTC
389-ds-base-1.2.2-1.fc11 has been submitted as an update for Fedora 11.
http://admin.fedoraproject.org/updates/389-ds-base-1.2.2-1.fc11

Comment 9 Fedora Update System 2009-08-25 20:23:28 UTC
389-ds-base-1.2.2-1.fc10 has been submitted as an update for Fedora 10.
http://admin.fedoraproject.org/updates/389-ds-base-1.2.2-1.fc10

Comment 10 Fedora Update System 2009-08-25 20:25:01 UTC
389-admin-1.1.8-4.fc11 has been submitted as an update for Fedora 11.
http://admin.fedoraproject.org/updates/389-admin-1.1.8-4.fc11

Comment 11 Fedora Update System 2009-08-25 20:25:37 UTC
389-admin-1.1.8-4.fc10 has been submitted as an update for Fedora 10.
http://admin.fedoraproject.org/updates/389-admin-1.1.8-4.fc10

Comment 12 Fedora Update System 2009-08-27 02:09:15 UTC
389-ds-base-1.2.2-1.fc10 has been pushed to the Fedora 10 stable repository.  If problems still persist, please make note of it in this bug report.

Comment 13 Fedora Update System 2009-08-27 02:14:14 UTC
389-admin-1.1.8-4.fc11 has been pushed to the Fedora 11 stable repository.  If problems still persist, please make note of it in this bug report.

Comment 14 Fedora Update System 2009-08-27 02:16:18 UTC
389-admin-1.1.8-4.fc10 has been pushed to the Fedora 10 stable repository.  If problems still persist, please make note of it in this bug report.

Comment 15 Fedora Update System 2009-08-27 02:17:59 UTC
389-ds-base-1.2.2-1.fc11 has been pushed to the Fedora 11 stable repository.  If problems still persist, please make note of it in this bug report.

