Bug 72043 - glibc-2.2.90-24: system hangs during shutdown
Summary: glibc-2.2.90-24: system hangs during shutdown
Keywords:
Status: CLOSED RAWHIDE
Alias: None
Product: Red Hat Linux
Classification: Retired
Component: initscripts
Version: 8.0
Hardware: i686
OS: Linux
Priority: high
Severity: high
Target Milestone: ---
Assignee: Bill Nottingham
QA Contact: Brian Brock
URL:
Whiteboard:
Duplicates: 71559 72949 73152 75700
Depends On:
Blocks: 67217 79578
Reported: 2002-08-20 20:58 UTC by Joachim Frieben
Modified: 2014-03-17 02:30 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2003-01-14 05:10:40 UTC
Embargoed:



Description Joachim Frieben 2002-08-20 20:58:42 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.0.1) Gecko/20020809

Description of problem:
Prevents system from unmounting "/usr" partition during shutdown. The system
needs a hard reset to reboot!

Version-Release number of selected component (if applicable):
2.2.90-24

How reproducible:
Always

Steps to Reproduce:
1. Shut down system
2. Watch procedure continue until the system partitions are about to be unmounted.

Actual Results:  System issues the following message:

"Unmounting file systems: umount2: Device or resource busy"
"umount: /dev/sda5 : not mounted"
"umount : /usr : Illegal seek"

The system is then stuck.

Expected Results:  The system should shut down normally.

Additional info:

The system is a PR440FX based dual Pentium Pro workstation with SCSI peripherals
only. The reported issue has never been observed in the past. Downgrading to
"glibc-2.2.90-17" from "Limbo 2" cures this problem.

Comment 1 Bill Nottingham 2002-08-22 01:02:57 UTC
*** Bug 71559 has been marked as a duplicate of this bug. ***

Comment 2 Bill Nottingham 2002-08-22 01:04:01 UTC
This works fine with -23. Presumably something to do with locale archives?

Comment 3 Jakub Jelinek 2002-08-22 07:10:48 UTC
Yes, the question is what programs are running at umount /usr time with
non-C locale.
If it is just some program started from halt script or something similar
that late, it should be easy enough to fix - export LOCPATH=/usr/lib/locale
(this means locales still work but locale-archive is not used).
If it is bash running halt script, we could add
#!/bin/sh
if [ -z "$LOCPATH" ]; then export LOCPATH=/usr/lib/locale; exec /etc/rc.d/init.d/halt; fi
or something.
I'll try putting fuser -v /usr/lib/locale/locale-archive
before the umount commands in /etc/rc.d/init.d/halt
I don't have /usr mounted separately and no space to install that though...
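For anyone wanting to answer the "which programs still hold /usr" question without fuser, here is a minimal sketch that scans the per-process maps files instead. The function name list_mappers and the optional second argument (an alternative proc root, there only so the function can be tried on sample data) are assumptions for illustration, not anything from initscripts.

```shell
# Hypothetical helper: print the PIDs whose memory maps reference a path
# starting with the given prefix. The second argument defaults to /proc.
list_mappers() {
    prefix=$1
    proc_root=${2:-/proc}
    for maps in "$proc_root"/[0-9]*/maps; do
        # Skip processes whose maps file we are not allowed to read.
        [ -r "$maps" ] || continue
        # The pathname column in maps is preceded by whitespace, so the
        # leading space keeps the match anchored at the start of a path.
        if grep -q -F -- " $prefix" "$maps" 2>/dev/null; then
            pid=${maps%/maps}
            echo "${pid##*/}"
        fi
    done
}
```

Calling list_mappers /usr/lib/locale just before the umount commands in the halt script would list the processes that still map something under that directory. This is a plain prefix match, so /usr would also match a hypothetical /usrlocal.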

Comment 4 Roland McGrath 2002-08-22 08:11:50 UTC
I reproduced this without a separate /usr partition by moving
/usr/lib/locale/locale-archive to a small partition (presumably a -o loop
filesystem would work too) and replacing it with a symlink to that.
Shutdown now barfs on unmounting /foobar as reported for /usr.

I would not expect LOCPATH=/usr/lib/locale to behave any differently
because in that case it will mmap the individual files and have the 
same issues with the filesystem.  But obviously that didn't happen before.
The difference must be MAP_SHARED vs MAP_PRIVATE in the mmap of the archive.
I had it in mind that it didn't matter which under PROT_READ, but in fact the
kernel has to hold on to the file in case we ever did mprotect to PROT_WRITE.
I have checked in a fix to glibc mainline to use MAP_PRIVATE for the archive,
which should make it behave the same as mmap'ing the individual files has done.

Comment 5 Bill Nottingham 2002-08-28 21:17:57 UTC
Still doesn't work in -26, FWIW.

Comment 6 Bill Nottingham 2002-08-29 18:34:27 UTC
*** Bug 72949 has been marked as a duplicate of this bug. ***

Comment 7 Bill Nottingham 2002-08-29 19:40:28 UTC
Works ok with 2.2.91-1.

Comment 8 Markku Kolkka 2002-08-31 12:40:02 UTC
*** Bug 73152 has been marked as a duplicate of this bug. ***

Comment 9 Daniel Hammer 2002-08-31 19:38:11 UTC
When will 2.2.91-1 finally hit the public rawhide tree?

Comment 10 Vesa-Matti Sarenius 2002-10-21 05:38:08 UTC
The same problem appears with glibc-2.2.93-5. 
So this bug really should not be considered closed.

Comment 11 Roland McGrath 2002-11-06 09:45:01 UTC
This failure mode does indeed persist in 8.0.
I cannot see how it is glibc's fault, though.
It seems like the kernel's fault for not letting the filesystem
be unmounted when the only references to it are read-only mmap's
(the file descriptors are already closed).  If the kernel is not
supposed to let you unmount the partition, then I think the halt
script needs to work around the fact that /usr may still be referenced.
I think it's trying to do that with NOLOCALE=1 before /etc/init.d/functions.
Adding "unset LANG" at the top of /etc/init.d/halt fixes it for me.
I suspect that should be done in /etc/init.d/functions in the NOLOCALE case.
Probably this bug should be reopened and reassigned to initscripts.

Comment 12 Jakub Jelinek 2002-11-06 10:02:42 UTC
The weird thing is that before locale-archive the individual LC_ files were mapped
exactly like that, r--p in /proc/<pid>/maps.
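To check how a given file is mapped, the permission column of a maps file can be pulled out directly. The sketch below is only an illustration: the function name mapping_flags is made up, and it assumes the mapped path contains no whitespace. In the permission field the fourth character is p for a private (copy-on-write) mapping and s for a shared one.

```shell
# Hypothetical helper: print the permission field (e.g. r--p or r--s) of
# every mapping of FILE listed in MAPS_FILE.
# Usage: mapping_flags FILE MAPS_FILE
mapping_flags() {
    file=$1
    maps=$2
    # Field 2 is the permission string, field 6 the mapped pathname.
    awk -v f="$file" '$6 == f { print $2 }' "$maps"
}
```

For example, mapping_flags /usr/lib/locale/locale-archive /proc/$$/maps would show whether the current shell maps the archive privately or shared.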

Comment 13 Jakub Jelinek 2002-11-06 10:25:28 UTC
In fact, if I:
dd if=/dev/zero of=localefs bs=1024k count=100
echo y | mke2fs -m 0 localefs
mount -o loop localefs /mnt/floppy
cp -a /usr/lib/locale/en_US /usr/lib/locale/locale-archive /mnt/floppy
and on another vt
LC_ALL=en_US LOCPATH=/mnt/floppy /bin/sh
then /mnt/floppy cannot be umounted.
Which means I don't understand why this ever worked.

Concerning /etc/init.d/halt, /etc/rc.d/rc is already supposed to unset it:
        if [ "$subsys" = "halt" -o "$subsys" = "reboot" ]; then
                unset LANG
                unset LC_ALL
                exec $i start
        fi
It would obviously be better to export LC_ALL=C, not unset those two vars,
so that even in presence of some other LC_ variable it doesn't use locale-archive
or locale files.
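A minimal sketch of that suggestion, written as a standalone function so it can be tried outside of rc. The name prepare_locale_for is hypothetical, and this is not the change that was eventually committed to initscripts. Because LC_ALL overrides LANG and every individual LC_ variable, forcing it to C covers the case where some other LC_ variable is still set.

```shell
# Hypothetical helper modelled on the rc fragment above: for the halt and
# reboot subsystems, force the C locale rather than only unsetting
# LANG and LC_ALL.
prepare_locale_for() {
    subsys=$1
    if [ "$subsys" = "halt" ] || [ "$subsys" = "reboot" ]; then
        unset LANG
        LC_ALL=C
        export LC_ALL
    fi
}
```

In rc this would be called as prepare_locale_for "$subsys" right before the exec $i start line; for any other subsystem the environment is left untouched.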

Comment 14 Roland McGrath 2002-11-06 22:18:02 UTC
I could have sworn the "unset LANG" was what made it work, but I think
I was confused by something else at the time.  I can no longer reproduce
this with a real partition for /usr/lib/locale.  My tests using a loopback
filesystem turned out to be a red herring, because /etc/init.d/netfs would
try to unmount it and lose before it got to /etc/init.d/halt.

I think at this point someone other than me and Jakub should try to reproduce it
on 8.0.

Comment 15 Bill Nottingham 2003-01-14 05:10:40 UTC
Setting LC_ALL=C in rc is done as of 7.03-1; this *should* solve the problem.

Comment 16 Bill Nottingham 2003-01-14 05:10:55 UTC
*** Bug 75700 has been marked as a duplicate of this bug. ***

