Bug 171038

Summary: up2date -u hangs when running preinstall scripts for Update 2
Product: Red Hat Enterprise Linux 4
Reporter: Karl E. Kelley <kekelley>
Component: up2date
Assignee: Adrian Likins <alikins>
Status: CLOSED DUPLICATE
QA Contact: Fanny Augustin <fmoquete>
Severity: medium
Docs Contact:
Priority: medium
Version: 4.0
CC: shillman
Target Milestone: ---
Target Release: ---
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2005-10-18 19:42:50 UTC
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Attachments:
Description Flags
/var/log/up2date covering the period of the events described. none

Description Karl E. Kelley 2005-10-17 16:21:05 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.12) Gecko/20050920 Firefox/1.0.7

Description of problem:
This happened while installing the updates for Update 2, after
"up2date up2date" and "up2date rpm" had completed correctly.  up2date -u
would download several rpms and then quit for various reasons; retrying
it would bring down a few more.  Eventually all of them were downloaded
and up2date tried to install them, but the install hung near the end:
it simply stopped doing I/O.  I noticed that the preinstall script for
nfs-utils was still running, stuck in the useradd command for the
rpcuser login name.  It was impossible even to switch to a different
console via <ALT>-Fx, but I had an ssh window open from a different
machine, so I could look at various things about the system.
The process was impossible to kill, and eventually the machine quit
talking altogether and had to be powered off and back on.

After recovering from the violent reboot, I tried up2date -u again
and found that rpm had been left in a less than useful state, with
both the old and new rpms still installed as far as rpm was concerned.
up2date failed to update because it thought that an rpm had been
installed that was part of a group, while the rest of the group was not
installed.  I fixed that by installing the new rpms, sometimes with
--force, and finally up2date -u was happy and continued.  It then hung
again at a similar place, this time in the preinstall script for nscd,
while adding the nscd login id.  Again the machine had to be power
cycled to recover from the problem.  When it came back up, I went
through and fixed every rpm that wasn't installed correctly, working
from the list in /var/log/up2date, which I will attach to this report.
I noticed that shadow-utils, the rpm that owns useradd, was in this
update set, so I made sure that it was installed correctly and updated
as part of the fixes.  I also issued the useradd commands manually, and
they didn't hang outside of up2date.  After fixing the rpms up
completely, which took quite some time, I again issued up2date -u and
it finally completed.
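
The "both the old and new rpms still installed" state described above
can be spotted by grouping the installed package list by name and
architecture and looking for entries with more than one version.  The
following is only a rough sketch, not what was actually run on the
affected machine: it assumes a modern Python 3 interpreter (not the
Python shipped with RHEL 4), the function names are made up for
illustration, and packages such as kernel and gpg-pubkey can
legitimately appear more than once and would need to be ignored by hand.

```python
from collections import defaultdict
import subprocess


def find_duplicate_packages(lines):
    """Return {(name, arch): [version-release, ...]} for packages listed more than once.

    Each input line is expected to look like 'name arch version-release'.
    Lines that do not have exactly three fields are skipped.
    """
    versions = defaultdict(list)
    for line in lines:
        parts = line.split()
        if len(parts) != 3:
            continue
        name, arch, version_release = parts
        versions[(name, arch)].append(version_release)
    return {key: sorted(v) for key, v in versions.items() if len(v) > 1}


def query_installed():
    """Ask rpm for every installed package, one 'name arch version-release' per line."""
    result = subprocess.run(
        ["rpm", "-qa", "--qf", "%{NAME} %{ARCH} %{VERSION}-%{RELEASE}\n"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.splitlines()


def report(lines):
    """Format the duplicates as 'name.arch: v1, v2' strings, sorted by name."""
    dupes = find_duplicate_packages(lines)
    return [f"{name}.{arch}: {', '.join(vrs)}"
            for (name, arch), vrs in sorted(dupes.items())]
```

On a machine with rpm available, `report(query_installed())` would list
the candidates; each one still has to be checked by hand before the
older entry is removed or the newer one reinstalled.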

This appears to be a timing problem with the rpm scripts for different
packages.

Version-Release number of selected component (if applicable):
rpm-4.3.3-11_nonptl, up2date-4.4.50.4

How reproducible:
Couldn't Reproduce

Steps to Reproduce:
1. Obviously this happened at least twice, but it isn't really
reproducible without a machine that can be restored to an exact state
so that up2date -u can be repeated.  I don't really have such machines
at my disposal.

Additional info:

I will enclose a copy of /var/log/up2date for the days involved in this
problem.  It ends when the second hang occurs and does not include the
final recovery, which was mostly done with rpm commands and not
up2date.  It does include the list of packages that were being updated
for Update 2.

Comment 1 Karl E. Kelley 2005-10-17 16:23:43 UTC
Created attachment 120063 [details]
/var/log/up2date covering the period of the events described.

Comment 2 Suzanne Hillman 2005-10-18 16:02:09 UTC
Karl - this looks like it might be the same as bug 170087. Do you agree?

Comment 3 Karl E. Kelley 2005-10-18 16:53:42 UTC
Looks like it probably is the same as bug 170087.  I didn't notice 100%
CPU usage, though it certainly could have been at 100% CPU.  I didn't
think to search other components, and probably should have searched for
shadow-utils, since it obviously was involved.  Sorry.

Comment 4 Suzanne Hillman 2005-10-18 19:42:50 UTC
No apologies necessary; just wanted to make sure I was correct before duping
this against that bug. :)

*** This bug has been marked as a duplicate of 170087 ***