Bug 144726
Summary: | boot hangs starting cups | | |
---|---|---|---|
Product: | [Fedora] Fedora | Reporter: | Tony Albery <alberyt> |
Component: | kernel | Assignee: | Dave Jones <davej> |
Status: | CLOSED NOTABUG | QA Contact: | |
Severity: | medium | Docs Contact: | |
Priority: | medium | | |
Version: | 4 | CC: | bill-bugzilla.redhat.com, bugzilla, jamesrwelch, jay.hilliard, john_flanery, johnsteinman, kjmiller, missile-29, musuruan, pfrields, remco, r.nedbal, robatino, srhegde, twaugh, wtogami, zing |
Hardware: | i386 | | |
OS: | Linux | | |
Doc Type: | Bug Fix | | |
Last Closed: | 2006-03-30 08:58:23 UTC | | |
Description
Tony Albery
2005-01-10 23:02:53 UTC
*** Bug 144898 has been marked as a duplicate of this bug. ***

I have the same problem on a HP x4000 workstation (dual 1700MHz Xeon). After hanging on the "starting cups" message for a minute or two, I get the text screen, and the last message is the swap message.

The same thing on Compaq w6000, dual Xeon 2G, and also on x4000 dual Xeon 2.2G. It seems to be related to SMP because on my laptop P4 2G everything works fine. The systems are freezing. I even tried to recompile the "Vanilla" kernel from kernel.org and add the latest cups-1.2.23. The same behavior. Definitely something w/ the SMP in 2.6.10

I don't think it's related to SMP. I'm using _single_ Athlon XP on nForce2 motherboard and cups hangs during startup.

I am getting the same hang at boot time after running "yum" to upgrade a fresh installation. I was able to boot using the original kernel so the CUPS code appears to be working with the old kernel.
rpm -qa | grep cups
hal-cups-utils-0.5.2-8
cups-1.1.22-0.rc1.8.3
libgnomecups-0.1.12-5
cups-libs-1.1.22-0.rc1.8.3
rpm -qa | grep kernel
kernel-2.6.9-1.667
kernel-utils-2.4-13.1.39
kernel-2.6.10-1.737_FC3

I started having this problem when I updated to Fedora2 Kernel 2.6.10.x and cups 1.1.20-11.9. If I revert to kernel 2.6.9.x, my system will boot properly.

Updated to kernel 2.6.10-1.741_FC3. The same thing. Tried to disable cups from startup and tried to start it up after boot. The same thing, the machine froze in ~2 mins. BUT: I had 2 other terms open, 1 w/ top and I saw modprobe taking the CPU load to 99%. A ps -ef in the available term said modprobe -q -- cha_major_6_0 and modprobe -q -- parport-lowlevel. Hope this helps.

Bug persists with kernel 2.6.10-1.741_FC3

I also have the problem with FC2 and the latest cups-1.1.20-11.10 with Fedora Core (2.6.10-1.9_FC2) and (2.6.10-1.8_FC2). Fedora Core (2.6.9-1.11_FC2) will boot with that latest version of cups.

The same for a fresh FC3 install after yum update to 2.6.10. If I disable cups, then kudzu hangs. Seems to be hardware dependent as the same installation runs perfectly on my centrino notebook

Bug persists with kernel 2.6.10-1.760_FC3.

same symptoms on a Compaq Presario single 2.5GHz P4

Same problem exists on my Compaq Presario Laptop 1800 Series when I upgrade to 2.6.9-1.11_FC2. This bug appears to be in the new kernels in both FC2 and FC3. The only way I can get my systems to boot is to disable CUPS or boot the older kernel. Dave, is there anything we can do to help you isolate this problem or obtain more information?

Dave, I just updated to 2.6.10.14_FC2 on my Presario Laptop 1800 and my AMD system, the problem still exists under this release. I haven't tried this on FC3; I had dropped back to FC2 to continue my development. This is now becoming an issue as it is slowing down product development under FC2. I have disabled CUPS as a work around for the time being.

Nothing changed with latest 2.6.10-1.766_FC3 kernel (system hangs while starting cups). I'm still forced to use old 2.6.9-1.724_FC3 that has security issues, but it's the latest kernel that works. And I'm not very happy with that.

happens to me too on an eMachines T2245 with a HP Deskjet 3745 printer. :( Using old kernel works but I don't really want to use an old kernel. :S

Just installed the latest kernel patch and the problem is still present. I booted the system with CUPS disabled and started top ("top -d 1"), then started CUPS. What I found is the system isn't hung but it does appear that modprobe is taking up 99.9% of the cpu so I believe we still have a "modprobe" problem in the kern.

I have the same problem on my Compaq Presario 1800 laptop. All 2.6.10 kernels I've tried to date (up to -1.770_FC3) exhibit this problem. Old 2.6.9-1.724_FC3 works fine every time. Agree that it is hardware dependent because I have 3 other machines of varying type and vintage and they all work fine. BTW, I did include a chunk of /var/log/messages from a failed boot in bug 147299 because I thought this bug and that one were the same.

Ok, last night I tried passing acpi=off to the kernel in grub.conf and 2.6.10 kernels work fine. Never had to do this with 2.6.9 kernels. What changed with acpi?

Passing acpi=off on my PC does not help (Athlon XP, nForce2 chipset). I'm still forced to use 2.6.9-1.724_FC3 kernel.

I don't know that this is pertinent, but I thought any extra information can't hurt. I have the 2.6.10-1.770_FC3 kernel and I have the same problem with cups hanging (cups-1.1.22-0.rc1.8.5, cups-libs-1.1.22-0.rc1.8.5) in all the kernels I've tried since 2.6.8-1.521 in FC2. I had been booting the working kernel while I was still using FC2. When I switched to FC3, I just commented out cups startup. I have VMWare on this system. Since I was using a new kernel, I needed to re-configure VMWare. One step of the process of re-configuring involves doing a "/sbin/modprobe parport_pc" which would also hang the system. Now, the only other subsystem I know that commonly uses the parallel port is the printing system. This leads me to suspect that the culprit for the cups hang is somewhere in the kernel's handling of the parallel port. I hope this helps.

problem persists with kernel 2.6.11-1.14_FC3

This problem persists and is now affecting other utilities like network settings. I have isolated the command line and arguments that cause "modprobe" on my Compaq 1800 laptop to appear to hang. "modprobe" is using 99.3% of the CPU when the following command line is executed:
/sbin/modprobe -s -q parport_pc
This is now preventing any development on the updated kernel on this laptop. We have a problem here that appears to be platform specific and it needs to be fixed soon.

Thanks for chasing this down. Few more questions..
do you have any parport options in /etc/modprobe.conf ?
if you hit ctrl-scroll lock, you should get backtraces of every process running. Can you paste the modprobe trace ?
Do you actually have anything connected to your parallel port ?

do you have any parport options in /etc/modprobe.conf ?
> cat modprobe.conf
alias eth0 e100
alias snd-card-0 snd-es1938
install snd-es1938 /sbin/modprobe --ignore-install snd-es1938 && /usr/sbin/alsactl restore >/dev/null 2>&1 || :
remove snd-es1938 { /usr/sbin/alsactl store >/dev/null 2>&1 || : ; }; /sbin/modprobe -r --ignore-remove snd-es1938
alias usb-controller uhci-hcd
if you hit ctrl-scroll lock, you should get backtraces of every process running.
Can you paste the modprobe trace ?
I'll try this the next time I reboot on the failing kernel.
Do you actually have anything connected to your parallel port ? NO
I tried the ctrl-scroll lock but nothing happened. I could be doing it wrong.

*** Bug 155927 has been marked as a duplicate of this bug. ***

http://forums.fedoraforum.org/showthread.php?p=246689 (cross referencing more complaints)

Thanks for finding the parport_pc loading problem. Since I don't have anything on the parallel port, I simply removed the parport_pc.ko and cups works fine now.

Regarding comment #21 mentioning the kernel's handling of the parallel port, this may be related to bug #145151. Both bugs are triggered by exactly the same kernel versions (2.6.10 and up).

Regarding comment #29, bug #145151 involves the kernel not finding a printer which is connected to the parallel port. In that case, a workaround is to add 2 lines to modprobe.conf:
alias parport_lowlevel parport_pc
options parport_pc io=0x378 irq=7
(assuming only one parallel port printer). This tells the kernel that a device is there, even though it thinks there isn't. Is it possible to add something similar to modprobe.conf to tell it that there is nothing on the parallel port, even if it thinks there is? Seems to me that removing a module is a little excessive.

(In reply to comment #31)
> Seems to me that removing a module is a little
> excessive.
Not if it is broken. How you work around the problem is an individual choice. What works for me might not work for you. The real solution is to fix the bug. Is this something LKML is working on? Or is it Fedora only?

I've not seen it mentioned upstream, but I'd be very surprised if it's a Fedora specific bug, as we don't have any patches to the parport driver.

The kernel change log shows a 'fix' to parport in 2.6.10. The 2.6.11 change log shows another fix regarding module parameter passing. I'll try to get a .11 kernel and test that.

The workaround given in comment #29 worked for me. Now I am able to boot with cups started, I am able to go to internet-druid etc etc.

*** Bug 159096 has been marked as a duplicate of this bug. ***

*** Bug 159124 has been marked as a duplicate of this bug. ***

(In reply to comment #35)
> The workaround given in comment #29 worked for me.
> Now I am able to boot with cups started, I am able to goto internet-druid etc etc.
Sorry the comment #31 worked for me.

(In reply to comment #38)
> (In reply to comment #35)
> > The workaround given in comment #29 worked for me.
> > Now I am able to boot with cups started, I am able to goto internet-druid etc etc.
>
I am sorry about this. I think these comment numbers are dynamic, the number keeps changing each time I load the web page. The work around that worked for me is to add the following two lines to my modprobe.conf
alias parport_lowlevel parport_pc
options parport_pc io=0x378 irq=7

The above also worked for me. Tks.

The work around to add the following lines to modprobe.conf:
alias parport_lowlevel parport_pc
options parport_pc io=0x378 irq=7
worked for my Compaq Presario 1800 Laptop. This doesn't fix the real problem however and I would think before the code goes to RedHat Enterprise the problem should be corrected. We shouldn't have to use the work around.

(In reply to comment #41)
> The work around to add the following line to modprobe.conf:
>
> alias parport_lowlevel parport_pc
> options parport_pc io=0x378 irq=7
>
> worked for my Compaq Presario 1800 Laptop. This doesn't fix the real problem
> however and I would think before the code goes to RedHat Enterprise the problem
> should be corrected. We shouldn't have to use the work around.
>
Sorry for posting it here even though I don't use FC but this seems to be the most comprehensive discussion of the kernel freezing when cups is started. On my dual-Opteron server, vanilla kernels 2.6.8.1-2.6.11.11 freeze when cups is started. A ***WORKAROUND*** is to do
chmod -x /usr/lib/cups/backand/serial
Then it becomes possible to start the cups service. Of course, it doesn't help if you have got a serial printer. Apparently there is something wrong with the kernel serial driver in the latest 2.6 kernels.

(In reply to comment #42)
>
> chmod -x /usr/lib/cups/backand/serial
>
Sorry for the typo, this should read
chmod -x /usr/lib/cups/backend/serial

An update has been released for Fedora Core 3 (kernel-2.6.12-1.1372_FC3) which may contain a fix for your problem. Please update to this new kernel, and report whether or not it fixes your problem. If you have updated to Fedora Core 4 since this bug was opened, and the problem still occurs with the latest updates for that release, please change the version field of this bug to 'fc4'. Thank you.

Updated to kernel-2.6.12-1.1372_FC3. On load came up with message saying failed to detect Canon BJC250 printer which is on parallel port. 1st time I have seen this message. Gave me option to remove from configuration which I accepted. Subsequent attempt to start cups caused PC to hang as usual. Had to force power down to reboot.

I'm also continuing to see the cups hang on kernel-2.6.12-1.1372_FC3. The modprobe.conf workaround works.

Got this problem only after upgrading from FC3 to FC4.
Kernels are customized vanillas (2.6.11 with FC3,
2.6.13 with FC4), not FC.
FC4: cups-1.1.23-15
FC3: cups-1.1.22-0.rc1.8
My case is unusual as I don't use the parallel port for
printing - and I changed rc.sysinit to make it easier to
use it for something else:
> # start network connection over parallel port
> #
> action "Enable network connection over parallel port (PLIP): " /sbin/ifconfig plip0 laptop pointopoint desktop netmask 255.255.255.255 up
This is a convenience to enable connection to an old machine
using the parallel port without logging in as root.
Normally 'desktop' isn't connected (so there's a timeout
message in the bootup sequence). This all worked fine with
FC3 and before.
Yes, that's a hack, but it's no excuse for CUPS to hang...
Stack trace (copied by hand):
interruptible_sleep_on
default_wake_function
parport_claim_or_block
lp_open
exact_match
chrdev_open
dentry_open
filp_open
audit_syscall_entry
get_unused_fd
get_name
sys_open
syscall_call
(The "claim or block" bit looks a bit lazy/greedy/
aggressive - surely CUPS should allow for the parallel
port being used for something else.)
Because it was a hack, I took out the 'ifconfig' from
rc.sysinit and fixed things up properly by creating
a file /etc/sysconfig/network-scripts/ifcfg-plip:
DEVICE=plip0
IPADDR=laptop
REMIP=desktop
NETWORK=desktop
NETMASK=255.255.255.255
This didn't work either because (of course) the network
service is started before CUPS. Now, adding "ONBOOT=no"
to the above gets around this problem but means that the
connection must be brought up manually whenever needed.
Either that, or add a new service that runs after CUPS
and just does 'ifup plip' or something...
Incidentally, taking a cue from Jan Merkaon's tip, I also
tried 'fixing' the problem by doing
chmod -x /usr/lib/cups/backend/parallel
This worked as far as startup was concerned but caused a
printer config GUI tool to hang later when I tried to set
up a USB printer so I had to chmod +x it back again.
And the printer config tool also hangs if the plip
connection is up when add printer is attempted. Which
makes the extra service hack above less desirable too.
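
A rough sketch of the ifcfg-plip file described in this comment with the "ONBOOT=no" line added. Everything except that last line is copied from the comment; 'laptop' and 'desktop' are the commenter's own host names, and the file location is the one given above.

```shell
# /etc/sysconfig/network-scripts/ifcfg-plip (path as given in the comment above)
# 'laptop' and 'desktop' are the commenter's host names, not real addresses.
DEVICE=plip0
IPADDR=laptop
REMIP=desktop
NETWORK=desktop
NETMASK=255.255.255.255
# Keep the network service from bringing plip0 up at boot, before CUPS starts.
ONBOOT=no
```

With this setting the connection is not started at boot, so it would have to be brought up by hand when needed, for example with the 'ifup plip' mentioned in the comment.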
2.6.14-1.1637_FC4 has been released as an update for FC4. Please retest with this update, as a large amount of code has been changed in this release, which may have fixed your problem. Thank you.

This is a mass-update to all currently open kernel bugs. A new kernel update has been released (Version: 2.6.15-1.1830_FC4) based upon a new upstream kernel release. Please retest against this new kernel, as a large number of patches go into each upstream release, possibly including changes that may address this problem. This bug has been placed in NEEDINFO_REPORTER state. Due to the large volume of inactive bugs in bugzilla, if this bug is still in this state in two weeks time, it will be closed. Should this bug still be relevant after this period, the reporter can reopen the bug at any time. Any other users on the Cc: list of this bug can request that the bug be reopened by adding a comment to the bug. If this bug is a problem preventing you from installing the release this version is filed against, please see bug 169613. Thank you.

The experience I reported above actually resulted from a separate problem arising from an inadvertent local change to the kernel config combined with the fact that lp and plip won't normally work together if both are loaded. So, at least as far as I'm concerned, this bug can be closed. Thanks.
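
For anyone landing on this bug later, the modprobe.conf workaround that several commenters reported could be applied with something like the sketch below. It is only an illustration: the function name add_parport_workaround is made up here, the io=0x378 irq=7 values are the ones quoted in the comments and assume a single parallel port at that address, and the function writes to whatever file it is given so it can be tried on a scratch copy first.

```shell
# Hypothetical helper (not part of any package): append the two workaround
# lines quoted in this bug to a modprobe.conf-style file, skipping any line
# that is already present so repeated runs do not duplicate entries.
add_parport_workaround() {
    conf="$1"
    for line in \
        "alias parport_lowlevel parport_pc" \
        "options parport_pc io=0x378 irq=7"
    do
        # -x: match the whole line, -F: fixed string, -q: no output.
        # A missing file makes grep fail, so the line is then appended.
        grep -qxF -- "$line" "$conf" 2>/dev/null || printf '%s\n' "$line" >> "$conf"
    done
}
```

Running it as root against /etc/modprobe.conf would be the real use. As the comments note, this is a workaround and not a fix, and it did not help every reporter.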