Description of problem:
running pm-suspend causes segmentation fault

Version-Release number of selected component (if applicable):
kernel 2.6.15-1.1860_FC5
This is new behaviour with this version

How reproducible:
100% for me

Steps to Reproduce:
1. service syslog stop (probably un-necessary)
2. pm-suspend

Actual results:
# pm-suspend
Freezing cpus ...
int3: 0000 [1] SMP
last sysfs file: /power/state
CPU 1
Modules linked in: ipv6 ppdev autofs4 rfcomm l2cap bluetooth sunrpc ip_conntrack_netbios_ns ipt_REJECT xt_state ip_conntrack nfnetlink xt_tcpudp iptable_filter ip_tables x_tables raid0 video button battery ac lp parport_pc parport nvram ohci1394 ieee1394 uhci_hcd ehci_hcd saa7134 video_buf compat_ioctl32 v4l2_common v4l1_compat ir_kbd_i2c ir_common videodev e100 mii snd_hda_intel snd_hda_codec snd_seq_dummy snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss snd_pcm i2c_i801 i2c_core snd_timer snd hw_random soundcore snd_page_alloc dm_snapshot dm_zero dm_mirror dm_mod ext3 jbd raid1 ahci libata sd_mod scsi_mod
Pid: 2925, comm: pm-suspend Not tainted 2.6.15-1.1860_FC5 #1
RIP: 0010:[<ffffffff8055644b>] <ffffffff8055644b>{pageset_cpuup_callback+1}
RSP: 0018:ffff81003e117db0 EFLAGS: 00000282
RAX: 0000000000000001 RBX: ffffffff803c74e0 RCX: 0000000000000001
RDX: 0000000000000001 RSI: 0000000000000005 RDI: ffffffff803c74e0
RBP: 0000000000000001 R08: ffffffff8053aa68 R09: 0000000000000004
R10: 0000000000000002 R11: 0000000000000004 R12: 0000000000000005
R13: 0000000000000003 R14: 0000000000000003 R15: ffff81003e117f50
FS: 00002b6beb43dd30(0000) GS:ffff81003fe16268(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00002b6bee7da000 CR3: 0000000031d96000 CR4: 00000000000006e0
Process pm-suspend (pid: 2925, threadinfo ffff81003e116000, task ffff8100339f6080)
Stack: ffffffff803403ae 0000000000000001 0000000000000001 0000000000000003
       ffffffff8014b7bf ffff81003e117e38 ffffffff801465fd 0000000000000296
       0000000000000296 0000000000000000
Call Trace:
 <ffffffff803403ae>{notifier_call_chain+28}
 <ffffffff8014b7bf>{cpu_down+96}
 <ffffffff801465fd>{remove_wait_queue+17}
 <ffffffff80254435>{vt_waitactive+150}
 <ffffffff8015359f>{disable_nonboot_cpus+82}
 <ffffffff801505e1>{enter_state+161}
 <ffffffff801507e1>{state_store+113}
 <ffffffff801be79b>{sysfs_write_file+201}
 <ffffffff80180af0>{vfs_write+206}
 <ffffffff801810a2>{sys_write+69}
 <ffffffff8010ab34>{tracesys+209}
Code: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc
RIP <ffffffff8055644b>{pageset_cpuup_callback+1} RSP <ffff81003e117db0>
Segmentation fault

Additional info:
Machine is not dead at this point, but monitor on console is in DPMS sleep and won't wake up, trying a 2nd pm-suspend ties up the console (waiting for lock?) but still doesn't kill machine entirely ...
Just to be clear, this didn't happen in 2.6.15-1.1859_FC5 but does in 2.6.15-1.1860_FC5
also present in 2.6.15-1.1861_FC5
Been trying most rawhide kernels, still exists in 2.6.15-1.1872_FC5

# init 3
INIT: Switching to runlevel: 3
INIT: Sending processes the TERM signal
Starting readahead_early: Starting background readahead: [ OK ]
Starting irqbalance: [ OK ]
Starting lm_sensors: [ OK ]
# modprobe -r button
# service syslog stop
Shutting down kernel logger: [ OK ]
Shutting down system logger: [ OK ]
# pm-suspend
Freezing cpus ...
int3: 0000 [1] SMP
last sysfs file: /power/state
CPU 0
Modules linked in: radeon drm ipv6 ppdev autofs4 rfcomm l2cap sunrpc ip_conntrack_netbios_ns ipt_REJECT xt_state ip_conntrack nfnetlink xt_tcpudp iptable_filter ip_tables x_tables video battery ac lp parport_pc parport nvram hci_usb bluetooth ehci_hcd ohci1394 ieee1394 uhci_hcd snd_hda_intel saa7134 snd_hda_codec video_buf snd_seq_dummy compat_ioctl32 v4l2_common v4l1_compat snd_seq_oss snd_seq_midi_event ir_kbd_i2c snd_seq e100 snd_seq_device ir_common snd_pcm_oss snd_mixer_oss mii videodev snd_pcm snd_timer snd i2c_i801 hw_random soundcore i2c_core snd_page_alloc dm_snapshot dm_zero dm_mirror dm_mod ext3 jbd ahci libata sd_mod scsi_mod
Pid: 3373, comm: pm-suspend Not tainted 2.6.15-1.1872_FC5 #1
RIP: 0010:[<ffffffff80558435>] <ffffffff80558435>{pageset_cpuup_callback+1}
RSP: 0018:ffff81002802fdb0 EFLAGS: 00000286
RAX: 0000000000000001 RBX: ffffffff803c8560 RCX: 0000000000000001
RDX: 0000000000000001 RSI: 0000000000000005 RDI: ffffffff803c8560
RBP: 0000000000000001 R08: ffffffff8053cae8 R09: 0000000000000004
R10: 0000000000000002 R11: 0000000000000004 R12: 0000000000000005
R13: 0000000000000003 R14: 0000000000000003 R15: ffff81002802ff50
FS: 00002aee15c8cd30(0000) GS:ffffffff8051a000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00002aee19023000 CR3: 00000000292a1000 CR4: 00000000000006e0
Process pm-suspend (pid: 3373, threadinfo ffff81002802e000, task ffff810026b38040)
Stack: ffffffff80341296 0000000000000001 0000000000000001 0000000000000003
       ffffffff8014b803 ffff81002802fe38 ffffffff80146641 0000000000000296
       0000000000000296 0000000000000000
Call Trace:
 <ffffffff80341296>{notifier_call_chain+28}
 <ffffffff8014b803>{cpu_down+96}
 <ffffffff80146641>{remove_wait_queue+17}
 <ffffffff80255149>{vt_waitactive+150}
 <ffffffff801535e3>{disable_nonboot_cpus+82}
 <ffffffff80150625>{enter_state+161}
 <ffffffff80150825>{state_store+113}
 <ffffffff801bf5c3>{sysfs_write_file+201}
 <ffffffff80180d38>{vfs_write+206}
 <ffffffff801812ea>{sys_write+69}
 <ffffffff8010a906>{system_call+126}
Code: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc
RIP <ffffffff80558435>{pageset_cpuup_callback+1} RSP <ffff81002802fdb0>
Segmentation fault
just checked in a fix for this, it'll show up at http://people.redhat.com/davej/kernels/Fedora/devel in an hour or so.
Ah, so "sexing up" the title got your attention; I was beginning to think you'd gone on the "jolly" to NZ after all ;-)

I have a davej.repo file that is normally disabled, but which I use with

yum update --enablerepo davej

in situations like this. It has worked in the past, but tonight it refused; it looks like your repodata was updated at the same time as the .RPMs. I tried cleaning the metadata but no joy, so in the end I installed with

rpm -i http://people.redhat.com/davej/kernels/Fedora/devel/RPMS.kernel/kernel-2.6.15-1.1881_FC5.x86_64.rpm

and all went OK.

I can now do a pm-suspend (without switching to runlevel 3, stopping syslog or doing rmmod button, as I've needed to in the past). From a serial console I see

Freezing cpus ...
Breaking affinity for irq 4
Breaking affinity for irq 14
Breaking affinity for irq 66
CPU 1 is now offline
migration_cost=9
CPU1 is down
Stopping tasks: ===========================================================|

then the monitor goes into DPMS sleep and the machine powers off :-)

The machine wakes up in response to pressing a key on the PS/2 keyboard, but the monitor stays in DPMS sleep, ethernet doesn't reply to pings, the keyboard LEDs are *NOT* flashing (which they used to do) and I get a big splurge of rubbish on the COM1: port, which is not intelligible at either 9600 or 115200 baud.

Should COM1: be reset to the same speed it had when suspended? Is it likely to be a kernel panic/oops that is being sent there?

I can't seem to find the "tricks" about vbetool and acpi=bios_mode3, or whatever it is, that I was going to try to get video back ...

p.s. I am running a newer BIOS flash than when testing a couple of weeks ago, which claims to have improved S3 resuming. Getting there ...
The serial splurge is an odd one. Hmm. From memory, I think the 8250/serial layer lacks suspend/resume hooks to reinit the device, which could explain this. Although... during the kernel boot /before/ we do the resume, the 8250 should have been set up. So unless you changed some params after initial boot, but before suspend, I'm puzzled.

The fact that we're dumping anything at all is also a little disturbing. I'm concerned that it's another oops.

acpi_sleep=s3_bios is probably the boot command line option you were trying to remember? There are some handy hints in Documentation/power/video.txt in the kernel source (or the kernel-doc rpm).
Given that the initial problem is fixed, I suppose I ought to open a new bug for the resume?

I am setting console=ttyS0,115200 on the command line, so as you say I would expect it to be reset by the kernel before resuming. Is it worth trying any/all speeds? Are there any command line options to force a different driver, to treat the UART as a "less ancient" flavour with better suspend/resume?

Previously the PS/2 LEDs flashed, which I thought was a panic indicator; since I'm not getting the flashing, does that mean much?

I'll dig through the docs and play with the command line settings ...
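
When experimenting with boot options such as console=ttyS0,115200 and acpi_sleep=s3_bios, it can help to confirm what the running kernel was actually booted with before each suspend test. The following is only a minimal sketch: cmdline_param is a hypothetical helper name (not part of pm-utils or any package), and it assumes a POSIX shell and a readable /proc/cmdline.

```shell
#!/bin/sh
# Hypothetical helper (an illustration, not an existing tool):
# print the value of a key=value parameter from a kernel command line.
# $1 = parameter name, $2 = optional command-line string
# (defaults to the contents of /proc/cmdline).
cmdline_param() (
    set -f                      # no filename globbing while word-splitting
    key=$1
    line=${2-$(cat /proc/cmdline)}
    for word in $line; do
        case $word in
            "$key"=*)
                # strip everything up to and including the first '='
                printf '%s\n' "${word#*=}"
                exit 0
                ;;
        esac
    done
    exit 1                      # parameter not present
)
```

For example, cmdline_param acpi_sleep prints s3_bios if that option reached the kernel and returns non-zero otherwise; passing a string as the second argument lets you check a candidate command line without rebooting.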