Bug 638091
Summary: | Kernel panic at boot after glibc update
---|---
Product: | [Fedora] Fedora
Component: | glibc
Version: | 14
Hardware: | All
OS: | Linux
Status: | CLOSED ERRATA
Severity: | urgent
Priority: | high
Reporter: | Artem <artem.goncharov>
Assignee: | Andreas Schwab <schwab>
QA Contact: | Fedora Extras Quality Assurance <extras-qa>
CC: | bruce, deadletterfile, fweimer, jakub, jan.kratochvil, jcm, jean01brard, jeff, jlaska, jlayton, joachim.backes, kdudka, madko, martin.nad89, m.a.young, mrunge, ricardo.arguello, rjones, schwab, stephent98, thenzl, tomek
Fixed In Version: | glibc-2.12.90-14
Doc Type: | Bug Fix
Last Closed: | 2010-09-30 06:16:00 UTC
Bug Blocks: | 538277

Description
Artem
2010-09-28 07:45:26 UTC
I can see the problem only when booting with nomodeset (I am copying this from paper notes, so there may be mistakes):

dracut: Switching root
init[1]: segfault at 3d03c1fbf8 ip 0000003d03a033c7 sp 00007fff4ad80140 error 7 in ld-2.12.90.so[3d03a00000+1f000]
init used greatest stack depth: 3069 bytes left
Kernel panic - ....
Pid:1, comm: init not tainted kernel 2.6.35.4-28.fc14.x86_64

I tried booting an older kernel, but the result is the same.

Same problem here; the glibc-2.12.90-13 update broke everything :(

*** Bug 638114 has been marked as a duplicate of this bug. ***

I'm unable to reproduce. Please provide a backtrace.

Well, it's hard, since I had already downgraded to get a working system, but while I still had the broken one I wrote down at least the function names in the backtrace, so here they are (without addresses):

panic
do_exit
do_group_exit
get_signal_to_deliver
do_signal
printk
bad_area_access_error+0x47/0x4e
lockdep_sys_exit
do_notify_resume
retint_signal

That is basically all of it (except for the addresses). Is that sufficient, or do you need the addresses? If it is not enough information, please tell me how I can get a complete backtrace (maybe by adding some option in grub?).

I've had a similar problem on two machines that I've upgraded with this glibc version. It seems to work for a while after the upgrade and then everything starts segfaulting. I booted a rescue CD and ran rpm -V glibc. Several libraries, including ld-*.so, were corrupt with bad sizes and MD5 sums. I reinstalled the same glibc package and it corrected the problem, but now I'm wondering -- what corrupted the files in the first place? prelink problems maybe?

As a quick test, I just ran /etc/cron.daily/prelink after fixing one of these machines and sure enough, everything started segfaulting again almost immediately. That seems to be triggering whatever the problem is.

For the record, I hit this problem as well. I fixed my system by booting the F14 RC3 live CD and using "yum --installroot=... downgrade glibc\*" to downgrade glibc to 2.12.90-11.

Paul Frields posted a good transcription of the boot panic here:
http://lists.fedoraproject.org/pipermail/test/2010-September/094213.html

Created attachment 450204
screenshot with kernel panic 2.6.35.4-28.fc14.i686.PAE #1
screenshot showing kernel panic after dracut, init.
This is on my laptop after the last update.
Could not reproduce in a VM.
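
The segfault line quoted in the description can be decoded with a little arithmetic. The following is a minimal sketch, not part of any official tooling: `decode_segfault` and `SEGFAULT_RE` are hypothetical names, and the regular expression assumes the message layout exactly as it was transcribed in this bug. The bit meanings are those of the standard x86 page-fault error code (bit 0: protection violation on a present page, bit 1: write access, bit 2: user mode).

```python
import re

# Matches the kernel's user-space segfault message as quoted in this report.
# All numeric fields are printed in hexadecimal.
SEGFAULT_RE = re.compile(
    r"segfault at (?P<addr>[0-9a-f]+) ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) "
    r"error (?P<err>[0-9a-f]+) in (?P<obj>\S+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def decode_segfault(line):
    """Return offsets relative to the mapped object and the decoded error bits."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        raise ValueError("not a segfault line")
    addr = int(m.group("addr"), 16)
    ip = int(m.group("ip"), 16)
    base = int(m.group("base"), 16)
    err = int(m.group("err"), 16)
    return {
        "object": m.group("obj"),
        "ip_offset": ip - base,         # faulting instruction, relative to the mapping base
        "fault_offset": addr - base,    # faulting address, relative to the mapping base
        "mapping_size": int(m.group("size"), 16),
        "protection_violation": bool(err & 1),  # page was present, access not permitted
        "write": bool(err & 2),                 # fault was caused by a write
        "user_mode": bool(err & 4),             # fault happened in user mode
    }
```

For the line in the description this gives an instruction offset of 0x33c7 and a fault offset of 0x21fbf8 relative to the ld-2.12.90.so base, and "error 7" decodes to a user-mode write that hit a protection violation. The fault offset is larger than the reported mapping size of 0x1f000, so the write went to an address outside the mapping the kernel names; which mapping it actually belongs to cannot be told from this line alone.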
I'm manually copying it from the problematic console, so there may be some typing errors:

dracut: Switching root
init[1]: segfault at 381421fbf8 ip 00000038140033c7 sp 00007fffa38ddd30 error 7 in ld-2.12.90.so[3814000000+1f000]
Kernel panic - not syncing: Attempted to kill init!
Pid: 1, comm: init Tainted: G W 2.6.35.4-28.fc14.x86_64 #1
Call Trace:
[<ffffffff8149b08b>] panic+0x8b/0x110
[<ffffffff81054a89>] do_exit+0x7b/0x7d0
[<ffffffff81055474>] do_group_exit+0x88/0xb6
[<ffffffff810628f4>] get_signal_to_deliver+0x3d6/0x3f5
[<ffffffff8107e44f>] ? lock_release+0x19a/0x1a6
[<ffffffff81008f91>] do_signal+0x72/0x690
[<ffffffff81042b7c>] ? mmdrop+0x1a/0x2a
[<ffffffff8149b178>] ? printk+0x68/0x70
[<ffffffff8103f052>] ? need_resched+0x23/0x2d
[<ffffffff8149b85f>] ? schedule+0x5dd/0x5f7
[<ffffffff8107f15e>] ? lockdep_sys_exit+0x20/0x76
[<ffffffff810095f0>] do_notify_resume+0x28/0x86
[<ffffffff8149e39b>] retint_signal+0x4d/0x92

(In reply to comment #7)
> As a quick test, I just ran /etc/cron.daily/prelink after fixing one of these
> machines and sure enough, everything started segfaulting again almost
> immediately. That seems to be triggering whatever the problem is.

Excellent suggestion. Reproduced in a VM by running "anacron -fn" after updating. Now reboot, ls, and ps segfault; pwd, true, and echo do not.

glibc-2.12.90-14 has been submitted as an update for Fedora 14.
https://admin.fedoraproject.org/updates/glibc-2.12.90-14

*** Bug 638210 has been marked as a duplicate of this bug. ***

*** Bug 638202 has been marked as a duplicate of this bug. ***

*** Bug 638203 has been marked as a duplicate of this bug. ***

(In reply to comment #6)
> I've had a similar problem on two machines that I've upgraded with this glibc
> version. It seems to work for a while after the upgrade and then everything
> starts segfaulting.

Exactly what I reported in bug 638114 (the system worked for a while, then everything started segfaulting). The system could no longer be rebooted.

> I booted to a rescue cd and did a rpm -V glibc. Several
> libraries, including ld-*.so were corrupt with bad sizes and MD5 sums.
> I reinstalled the same glibc package and it corrected the problem, but now I'm
> wondering -- what corrupted the files in the first place? prelink problems
> maybe?

Seems to be OK with glibc-2.12.90-14.

(In reply to comment #17)
> seems to be ok with glibc-2.12.90-14

I can confirm this. No more crash messages in /var/log/messages.

*** Bug 638208 has been marked as a duplicate of this bug. ***

glibc-2.12.90-14 has been pushed to the Fedora 14 testing repository. If problems still persist, please make note of it in this bug report.
If you want to test the update, you can install it with
su -c 'yum --enablerepo=updates-testing update glibc'. You can provide feedback for this update here:
https://admin.fedoraproject.org/updates/glibc-2.12.90-14

The command below, run from the Live CD, shows only 90-13. I assume I am in error?

yum --enablerepo=updates-testing --installroot=... list glibc
Installed Packages
glibc.x86_64 2.12.90-13 @updates-testing
Available Packages
glibc.i686 2.12.90-13 updates-testing

(In reply to comment #20)
> glibc-2.12.90-14 has been pushed to the Fedora 14 testing repository. If
> problems still persist, please make note of it in this bug report.
> If you want to test the update, you can install it with
> su -c 'yum --enablerepo=updates-testing update glibc'. You can provide
> feedback for this update here:
> https://admin.fedoraproject.org/updates/glibc-2.12.90-14

(In reply to comment #21)
> Command below from Live! CD shows only 90-13. I assume I am in error?
...

glibc-2.12.90-14 hasn't propagated to the mirrors yet:

$ sudo repoquery --releasever=14 --enablerepo=fedora --enablerepo='updates*' --disablerepo='rpm*' glibc
glibc-0:2.12.90-13.i686
glibc-0:2.12.90-13.x86_64

It's here if you don't want to wait:
http://koji.fedoraproject.org/koji/buildinfo?buildID=197187

glibc-0:2.12.90-14.x86_64 works OK, but restart works only with root privileges; as a regular user it mostly hangs. Maybe it is a bug in systemd, but with glibc-0:2.12.90-10.x86_64 systemd works OK. I'm not 100% sure.

glibc-2.12.90-14 works for me as well, thanks for the fast fix.

Also confirming that glibc-2.12.90-14 is working on my laptop and in a VM.

Here is how I recovered after glibc-2.12.90-13 was installed:

yum isn't on the F14-Beta-RC3 DVDs (nor is it on the F11, F12, or F13 DVDs), so the rescue shell tries to run /mnt/sysimage/usr/bin/yum, and that fails with

ImportError: No module named yummain

However, rpm is on the discs, so I used rpm to downgrade to the glibc* packages on the DVD:

1. Boot the F14 installer DVD in rescue mode (no networking, target system mounted).
2. mkdir /mnt/dvd
3. mount /dev/sr0 /mnt/dvd
4. cd /mnt/dvd/Packages
5. rpm --root /mnt/sysimage --oldpackage -Uv glibc-2*.rpm glibc-common-2*.rpm
6. chroot /mnt/sysimage
7. ls # to verify that the old packages fix the segfault problem
8. exit
9. reboot
10. Run /etc/cron.daily/prelink to verify that the problem does not recur. (anacron will run prelink, but after a random delay, which can be disabled in /etc/anacrontab. That's why the segfaults began appearing at a random time after booting.)

Andreas: Any chance of a summary in this bug describing what happened?

glibc-2.12.90-14 has been pushed to the Fedora 14 stable repository. If problems still persist, please make note of it in this bug report.
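
For anyone repeating the rpm -V glibc check mentioned earlier in this bug, the output can be filtered for the files whose contents actually changed. This is only a rough sketch under stated assumptions: `changed_content` and `VERIFY_LINE` are hypothetical names, and the pattern assumes the usual rpm verify layout of a flags column (or the word "missing"), an optional one-letter file-type marker such as "c" for config files, and then the path. In the flags column, "S" means the file size differs and "5" means the digest differs.

```python
import re

# One line of `rpm -V` output: flags (or "missing"), optional marker, absolute path.
VERIFY_LINE = re.compile(
    r"^(?P<flags>missing|[.?0-9A-Za-z]{8,9})\s+(?:(?P<attr>[cdglr])\s+)?(?P<path>/.*)$"
)


def changed_content(verify_output):
    """Return paths that are missing or whose size (S) or digest (5) differs."""
    hits = []
    for line in verify_output.splitlines():
        m = VERIFY_LINE.match(line.strip())
        if m is None:
            continue  # skip blank lines and anything not in the expected layout
        flags = m.group("flags")
        if flags == "missing" or "S" in flags or "5" in flags:
            hits.append(m.group("path"))
    return hits
```

Feeding it the text printed by rpm -V glibc (or rpm --root /mnt/sysimage -V glibc from a rescue environment) would list only the files with size or digest mismatches, which is the symptom reported above for ld-*.so. Lines that differ only in modification time are ignored.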