Bug 157691 - smp boot hangs, hyperthreading
Status: CLOSED DUPLICATE of bug 158413
Product: Fedora
Classification: Fedora
Component: kernel
Version: 4
Hardware: i386 Linux
Priority: medium
Severity: medium
Assigned To: Dave Jones
QA Contact: Brian Brock
Blocks: FC4Target

Reported: 2005-05-13 15:51 EDT by Richard Hitt
Modified: 2015-01-04 17:19 EST
Doc Type: Bug Fix
Last Closed: 2005-05-23 01:02:49 EDT
Description Richard Hitt 2005-05-13 15:51:46 EDT
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.5) Gecko/20041217

Description of problem:
On a Dell Dimension 4700 with Intel Hyper-Threading, the SMP kernel hangs at boot time; the non-SMP kernel does not hang.  I installed fc4beta3 yesterday, 5/12/05, as an upgrade from fc3, and the installation went smoothly.  The hang at boot happened right after the message "reading all disks - this should take a while" (I have one lvm2 partition on my 250G disk).  After more than ten minutes there was no change.  Ctrl-Alt-Delete successfully rebooted the machine.  I then chose the non-SMP kernel from grub and it booted successfully.  On that boot approximately the same message appeared and boot continued immediately.

Sorry, I haven't tried to reproduce it; send email to rbh00@utsglobal.com if you'd like me to try.

Version-Release number of selected component (if applicable):


How reproducible:
Didn't try


Additional info:

I sure hope this can be fixed before fc4 reaches general availability.  I would be happy to apply a patch to the kernel and rebuild to test your fix; I'm a competent kernel builder.

I'd be happy to try anything you suggest.  The box is at home; I'm at work here in Mountain View and so it'd be ~6 hours before I could try it.  Email address above is read both here at work and at home.
Comment 1 Dave Jones 2005-05-17 21:10:52 EDT
What disk controller is in this box?
Comment 2 Richard Hitt 2005-05-17 23:10:51 EDT
From /etc/sysconfig/hwconf (retyped):
class: HD
bus: SCSI
detached: 0
device: sda
driver: ignore
desc: "Ata Maxtor 7Y250M0"
host: 0
id: 0
channel: 0
lun: 0
generic: sg0

lsmod (run on the UP kernel, not the SMP one, of course) shows these drivers
loaded:  libata, sd_mod, scsi_mod, ata_piix, dm_mod, jbd, ext3, dm_mirror,
dm_zero, dm_snapshot, and mii; that last one is probably unrelated.  uname -r
gives 2.6.11-1.1286_FC4.

This is a 250-Gigabyte disk; the default installation caused Fedora to make one
huge LVM2 partition out of it.  My only change was to increase swap size to
2048M from 512M in anticipation of boosting my RAM to 1GB from 256MB.  But I'm
still at 256MB RAM.

Separately, I tried and failed to do an "rpmbuild --rebuild" of the kernel
source RPM, both for the 1286 kernel and for the later one I fetched using
"up2date --get-source kernel", which yielded kernel-2.6.11-1.1305.FC4.src.rpm in
/var/spool/up2date.  The failure was the same in both cases:  after the normal
"Installing ..." message came this:  "error: Architecture is not included:
i386".  The output of "uname -m" is i686.

Of course that's a separate bug; but it would prevent me from trying either a
new source tree or a patch to my source tree.
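
A possible workaround for the "Architecture is not included: i386" error, given here as an untested sketch: that message usually means the spec file limits the architectures it will build for, and rpmbuild defaulted to i386.  rpmbuild accepts a --target option, so naming the machine architecture reported by "uname -m" may let the rebuild proceed.  This assumes i686 is among the architectures the kernel spec allows.  The helper name rebuild_cmd is hypothetical; it only prints the command so it can be reviewed before it is run.

```shell
# Hypothetical helper: print (do not run) a rebuild command for a source RPM.
# Assumption: the spec file permits the chosen architecture.
rebuild_cmd() {
    srpm="$1"
    # Default to the machine architecture when no second argument is given.
    arch="${2:-$(uname -m)}"
    printf 'rpmbuild --rebuild --target %s %s\n' "$arch" "$srpm"
}

# Example with the source RPM path mentioned above.
rebuild_cmd /var/spool/up2date/kernel-2.6.11-1.1305.FC4.src.rpm i686
```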

Nevertheless, thanks a whole lot for fedora and for your work on it.  Let me
know how I can help further.

Richard Hitt
Comment 3 Richard Hitt 2005-05-18 15:47:49 EDT
Here is a sane reformatting of my original "description":

On a Dell Dimension 4700 with Intel Hyper-Threading, the SMP kernel hangs at
boot time; the non-SMP kernel does not hang.  I installed fc4beta3 yesterday,
5/12/05, as an upgrade from fc3, and the installation went smoothly.  The hang
at boot happened right after the message "reading all disks - this should take
a while" (I have one lvm2 partition on my 250G disk).  After more than ten
minutes there was no change.  Ctrl-Alt-Delete successfully rebooted the
machine.  I then chose the non-SMP kernel from grub and it booted successfully.
On that boot approximately the same message appeared and boot continued
immediately.

Also, regarding reproducibility:  I can reproduce this problem 100% of the time,
and I've tried 3 or 4 times now.
Comment 4 Michael 2005-05-23 00:08:43 EDT
I have also reproduced this on a Dell Dimension 4700 with FC4test3. Dies with smp kernel, works fine with no smp support.

Additional data:
- It happens with both SATA and EIDE drives. That is, I removed the SATA drive and plugged in an EIDE drive ... behavior is the same.
- Turning off SMP in the BIOS did not do anything ... I didn't really expect it to.
- It happens regardless of whether you are using lvm or not ... it was hanging with an lvm2 message on the screen.
- Under a minimal install without lvm there is more logging data that comes out during the startup process. It is dying after saying
 ...
 kjournald starting.  Commit interval 5 seconds
 EXT3-fs: mounted filesystem with ordered data mode
 switching to new root
 unmounting old /proc
 unmounting old /sys
 <<< dies here >>>

Let me know if I can do anything to help. 
Comment 5 Warren Togami 2005-05-23 01:01:21 EDT
Richard and Michael,
http://people.redhat.com/wtogami/temp/kernel-smp-2.6.11-1.1267_FC4.i686.rpm
http://people.redhat.com/wtogami/temp/kernel-smp-2.6.11-1.1276_FC4.i686.rpm
Please test these two old FC4 kernels.  You may need to install one at a time
with --oldpackage and --nodeps.  I suspect 1267 will boot while 1276 gets stuck.

Please report back your results.
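
The one-at-a-time test sequence can be sketched as follows, assuming both RPMs above have been downloaded to the current directory.  The helper name kernel_install_cmd is hypothetical, and it only prints each command rather than running it; add --nodeps only if rpm reports dependency problems.

```shell
# Hypothetical helper: print (do not run) the install command for one test kernel.
# --oldpackage lets rpm install a package older than the one already installed.
kernel_install_cmd() {
    printf 'rpm -i --oldpackage kernel-smp-2.6.11-1.%s_FC4.i686.rpm\n' "$1"
}

# Test one build at a time: install it, reboot into it from grub,
# note whether it boots, then move on to the next build.
for build in 1267 1276; do
    kernel_install_cmd "$build"
done
```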
Comment 6 Warren Togami 2005-05-23 01:02:49 EDT

*** This bug has been marked as a duplicate of 158413 ***
Comment 7 Richard Hitt 2005-05-23 04:11:15 EDT
Hi, Warren.  It is as you predicted.  The 1267 kernel works fine, and I verified
smp by looking at the gkrellm screen, which shows two CPUs.  The 1276 kernel
fails, in exactly the mode I described earlier.  Then I booted 1329-smp and it
failed the same way.  Finally I booted 1329 (non-smp), and I'm up under that
kernel now.  I used the command "rpm -i --oldpackage" to install the two rpms;
I didn't need --nodeps so I didn't use it.

Happy to test a fixed kernel when you have one ready.  Thanks for your work.
