Bug 223931 - rpm has problems with the system date is very wrong
rpm has problems with the system date is very wrong
Status: CLOSED RAWHIDE
Product: Fedora
Classification: Fedora
Component: rpm (Show other bugs)
6
All Linux
medium Severity medium
: ---
: ---
Assigned To: Panu Matilainen
:
Depends On:
Blocks:
  Show dependency treegraph
 
Reported: 2007-01-22 23:42 EST by Kevin Fenzi
Modified: 2008-01-25 12:32 EST (History)
1 user (show)

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2008-01-25 12:32:59 EST
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments (Terms of Use)
output from rpm (22.77 KB, text/plain)
2007-01-23 15:38 EST, Kevin Fenzi
no flags Details

  None (edit)
Description Kevin Fenzi 2007-01-22 23:42:50 EST
Description of problem:

On a freshly install fc6 ppc system with a dead cmos/pram battery, after the 
install, the machine comes up thinking the date is something like: 
Thu Dec 31 17:45:43 MST 1903
When an update to the glibc in glibc-updates is run, rpm removes the /lib/ld-
2.5.so file, making the dynamic linker very unhappy. 

  Updating  : glibc                        ##################### [  6/426] 
error: unpacking of archive failed on file /lib/ld-2.5.so: cpio: open

If you first set the date correctly with ntpdate the update happens just fine. 

Version-Release number of selected component (if applicable):
rpm-4.4.2-32

How reproducible:
Set the date to something very wrong and old. 
Try updating a package like glibc. 
Watch your dynamic linker go into la la land when /lib/ld-2.5.so is removed. 

Additional info:

This could be a cpio problem, or the way rpm handles unpacking the cpio 
archive.
Comment 1 Jeff Johnson 2007-01-23 07:46:18 EST
Odd. The failure is while trying to open a file, there's no obvious
time dependency there.

Running rpm -Uvv --fsmdebug ... should display steps in installing.

Add --rpmiodebug to see every I/O operation performed by rpm.
Comment 2 Kevin Fenzi 2007-01-23 14:43:35 EST
ok. I will see about trying those commands from comment #1 and post results. 

Thanks for the response. 
Comment 3 Kevin Fenzi 2007-01-23 15:33:32 EST
ok. I pulled the plug on the machine, rebooted and confirmed the date was back 
to 1903. Then I tried to rpm --replacepkgs -Uvv --fsmdebug --rpmiodebug on it

I have attached as much of it as was still in my terminal buffer. 

The end of it's pretty interesting: 

D: process   100755  1 (   0,   0)    148960 /lib/ld-2.5.so
*** Fopen ufdio path /lib/ld-2.5.so fmode w.ufdio
*** ufdOpen(/lib/ld-2.5.so,0x241,0666)
D:     Fopen (/lib/ld-2.5.so, "w") wfd (nil) wrbuf 0x10625020
D: undo      100755  1 (   0,   0)    148960 /lib/ld-2.5.so
D:    Unlink (/lib/ld-2.5.so)
--> fd  0x1065a3b8 -- 4 persist (fsm) at fsm.c:616      | LIBIO 0x105f2920(-1) 
fdno -1 | GZD 0x105f6518 fdno 8 | FD -1 fp (nil)
==> Fclose(0x1065a3b8)  | LIBIO 0x105f2920(-1) fdno -1 | GZD 0x105f6518 fdno 8 
| FD -1 fp (nil)
--> fd  0x1065a3b8 ++ 4 Fclose at rpmio.c:2840  | LIBIO 0x105f2920(-1) fdno -1 
| GZD 0x105f6518 fdno 8 | FD -1 fp (nil)
==>     gzdClose(0x1065a3b8) zerror 0   | LIBIO 0x105f2920(-1) fdno -1 | GZD 
0x105f6518 fdno 8 | FD -1 fp (nil)
==>     gzdClose(0x1065a3b8) rc 0       | LIBIO 0x105f2920(-1) fdno -1 | GZD 
0x105f6518 fdno 8 | FD -1 fp (nil)
GZDIO:       1 reads,     8192 total bytes in 0.000849 secs
--> fd  0x1065a3b8 -- 4 open (gzdClose) at rpmio.c:2497         | LIBIO 
0x105f2920(-1) fdno -1 | GZD 0x105f6518 fdno 8 | FD -1 fp (nil)
--> fd  0x1065a3b8 -- 3 fopencookie (Fclose) at rpmio.c:2901    | LIBIO 
0x105f2920(-1) fdno -1 | GZD 0x105f6518 fdno 8 | FD -1 fp (nil)
==>     fdClose(0x1065a3b8) rc fffffffe         | FD -1 fp (nil)
--> fd  0x1065a3b8 -- 2 open (fdClose) at rpmio.c:471   | FD -1 fp (nil)
--> fd  0x1065a3b8 -- 1 Fclose at rpmio.c:2918  | FD -1 fp (nil)
error: unpacking of archive failed on file /lib/ld-2.5.so: cpio: open failed - 
Bad file descriptor
--> fd  0x105f6940 -- 3 persist (showProgress) at rpminstall.c:140      | LIBIO 
0x1065b900(-1) fdno -1 | UFD 5 fp 0x1065b900
==> Fclose(0x105f6940)  | LIBIO 0x1065b900(-1) fdno -1 | UFD 5 fp 0x1065b900
--> fd  0x105f6940 ++ 3 Fclose at rpmio.c:2840  | LIBIO 0x1065b900(-1) fdno -1 
| UFD 5 fp 0x1065b900
==>     fdClose(0x105f6940) rc 0        | UFD -1 fp 0x1065b900
--> fd  0x105f6940 -- 3 open (fdClose) at rpmio.c:471   | UFD -1 fp 0x1065b900
==>     fdClose(0x105f6940) rc fffffffe         | LIBIO 0x1065b900(-1) fdno -1 
| UFD -1 fp (nil)
--> fd  0x105f6940 -- 2 open (fdClose) at rpmio.c:471   | LIBIO 0x1065b900(-1) 
fdno -1 | UFD -1 fp (nil)
--> fd  0x105f6940 -- 1 Fclose at rpmio.c:2918  | UFD -1 fp (nil)
Comment 4 Kevin Fenzi 2007-01-23 15:38:33 EST
Created attachment 146350 [details]
output from rpm

Here's the output from the rpm command (or as much of it as I could get)
Comment 5 Jeff Johnson 2007-01-23 19:28:59 EST
Weirder and weirder. The portion of the spew relevant to your problem is here ...
...
==> Fread(0x10625020,1,1,0x1065a3b8)    | LIBIO 0x105f2920(-1) fdno -1 | GZD 0x105f6518 fdno 8 
| FD -1 fp (nil)
D:     Fread (wrbuf, 1, cfd)    rdnb 1
==> Ferror(0x1065a3b8) rc 0     | LIBIO 0x105f2920(-1) fdno -1 | GZD 0x105f6518 fdno 8 | FD -1 fp 
(nil)
==> Fread(0x10625020,1,110,0x1065a3b8)  | LIBIO 0x105f2920(-1) fdno -1 | GZD 0x105f6518 fdno 
8 | FD -1 fp (nil)
D:     Fread (wrbuf, 110, cfd)  rdnb 110
==> Ferror(0x1065a3b8) rc 0     | LIBIO 0x105f2920(-1) fdno -1 | GZD 0x105f6518 fdno 8 | FD -1 fp 
(nil)
==> Fread(0x10625020,1,16,0x1065a3b8)   | LIBIO 0x105f2920(-1) fdno -1 | GZD 0x105f6518 fdno 
8 | FD -1 fp (nil)
D:     Fread (wrbuf, 16, cfd)   rdnb 16
==> Ferror(0x1065a3b8) rc 0     | LIBIO 0x105f2920(-1) fdno -1 | GZD 0x105f6518 fdno 8 | FD -1 fp 
(nil)
==> Fread(0x10625020,1,2,0x1065a3b8)    | LIBIO 0x105f2920(-1) fdno -1 | GZD 0x105f6518 fdno 8 
| FD -1 fp (nil)
D:     Fread (wrbuf, 2, cfd)    rdnb 2
==> Ferror(0x1065a3b8) rc 0     | LIBIO 0x105f2920(-1) fdno -1 | GZD 0x105f6518 fdno 8 | FD -1 fp 
(nil)
D: process   100755  1 (   0,   0)    148960 /lib/ld-2.5.so 
*** Fopen ufdio path /lib/ld-2.5.so fmode w.ufdio
*** ufdOpen(/lib/ld-2.5.so,0x241,0666)
D:     Fopen (/lib/ld-2.5.so, "w") wfd (nil) wrbuf 0x10625020
D: undo      100755  1 (   0,   0)    148960 /lib/ld-2.5.so 
D:    Unlink (/lib/ld-2.5.so) 
...

That *should have been a temporary file name with transaction id (which is just a time stamp)
appended. The path should *not* have been the actual path.

So there's something fishy with the generation of temporary file names if the date is really weird.

Would you mind repeating the above with same --rpmiodebug --fsmdebug -vv using strace strace -o /
tmp/xxx ... and appending the output? That should be enough to pin this sucker down ... 
Comment 6 Kevin Fenzi 2007-01-23 19:39:27 EST
ok. I can try that when I get time to do another re-install. ;) 

How is the transaction id generated? Is it using seconds since '1970-01-01 
00:00:00 UTC' ?

I make the 1903 date above as '-2082842057' using that. Perhaps the negative 
number there is making it ignore the timestamp?

Comment 7 Jeff Johnson 2007-01-23 19:57:31 EST
Ah this is likely the core of the problem:

    memset(fsm->sufbuf, 0, sizeof(fsm->sufbuf));
    if (fsm->goal == FSM_PKGINSTALL) {
        if (ts && rpmtsGetTid(ts) > 0)
            sprintf(fsm->sufbuf, ";%08x", (unsigned)rpmtsGetTid(ts));
    }

The code above formats the transaction id time stamp for creating temporaray file names,
the weird date is failing the test, no suffix, no tempname, and the Unlink on the error path
nukes the original rather than the temp file name.

I'll try to address those issues.

The only thing I still don't understand is why the Fopen() failed, that is what triggered
the Unlink call.

An strace would be very interesting ...
Comment 8 Jeff Johnson 2007-01-23 20:12:09 EST
This looks like most of the fix (although I should probably prevent the Unlink if no suffix):
Index: fsm.c
===============================================================
====
RCS file: /cvs/devel/rpm/lib/fsm.c,v
retrieving revision 2.112.2.29
diff -u -b -B -w -p -r2.112.2.29 fsm.c
--- fsm.c       20 Jan 2007 21:24:19 -0000      2.112.2.29
+++ fsm.c       24 Jan 2007 01:10:19 -0000
@@ -613,7 +613,7 @@ fprintf(stderr, "\tcpio vectors set\n");
 
     memset(fsm->sufbuf, 0, sizeof(fsm->sufbuf));
     if (fsm->goal == FSM_PKGINSTALL) {
-       if (ts && rpmtsGetTid(ts) > 0)
+       if (ts && rpmtsGetTid(ts) != 0)
            sprintf(fsm->sufbuf, ";%08x", (unsigned)rpmtsGetTid(ts));
     }
 
Comment 9 Jeff Johnson 2007-01-23 20:58:01 EST
Fixed (handle negative tid's, don't unlink if not a temp file) in rpm cvs, will be in rpm-4.4.8-0.12
when built.

UPSTREAM

And I'd still like to know why the Fopen() failed, the strace should tell all ...
Comment 10 David Woodhouse 2007-01-24 11:10:50 EST
(In reply to comment #9)
> And I'd still like to know why the Fopen() failed, the strace should tell all ...

Presumably -ETXTBUSY
Comment 11 Jeff Johnson 2007-01-24 19:09:19 EST
-ETXTBUSY from open? Hmm, has that changed in the last 4 years or so? rpm used to just
write into *.so files directly and Fopen is just open(2), as always. ETXTBUSY on hpux from
unlink(2) is/was a known problem in rpm, fixed like 1998 or so.

So is -ETXTBUSY from open(2) a change in linux kernels within the last 4 years?
Comment 12 Jeff Johnson 2007-01-24 19:39:47 EST
Bingo. man 2 open tells all (as always). Thanks for the ptr ...

  ETXTBSY
              pathname refers to an executable image which is currently  being
              executed and write access was requested.
Comment 13 Kevin Fenzi 2007-01-24 20:08:04 EST
So, do you still need me to run an strace?
I'm getting pretty tired of re-installing that box. ;) 

I'll try and find time to do so if you really need the strace though...
Comment 14 Jeff Johnson 2007-01-31 08:57:59 EST
No, I needed to understand why the problem was happening, rpm used to write
directly to *.so files, not to a temp file, without any problem whatsoever.

The difference is that /bin/rpm used to be statically linked, and so ETXTBUSY
was never encountered.

So no strace needed.

Change yr battery instead ;-)
Comment 15 Red Hat Bugzilla 2007-08-21 01:31:14 EDT
User pnasrat@redhat.com's account has been closed
Comment 16 Panu Matilainen 2007-08-22 02:33:41 EDT
Reassigning to owner after bugzilla made a mess, sorry about the noise...
Comment 17 Panu Matilainen 2008-01-25 12:32:59 EST
Fixed in 4.4.2.3-0.1.rc1, thanks to Jeff for the fix...

Note You need to log in before you can comment on or make changes to this bug.