Bug 224597
Summary: | atd dies when it tries to process queue | ||
---|---|---|---|
Product: | [Fedora] Fedora | Reporter: | david.hagood |
Component: | at | Assignee: | Marcela Mašláňová <mmaslano> |
Status: | CLOSED DUPLICATE | QA Contact: | |
Severity: | high | Docs Contact: | |
Priority: | high | ||
Version: | 7 | CC: | dougmc+bugzilla, mark, rh-bugzilla, steve |
Target Milestone: | --- | ||
Target Release: | --- | ||
Hardware: | All | ||
OS: | Linux | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | Bug Fix | |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2007-06-27 09:53:02 UTC | Type: | --- |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: |
Description
david.hagood
2007-01-26 16:16:38 UTC
Also, the man page indicates that atd will take a "-n" parameter to prevent it from forking, and a "-d" to send debug info to stderr rather than syslog - neither of which work. Either the man page should be updated or the program corrected to support the commands. I have at-3.1.10-9.fc7 I can't reproduce your problem. Only problem is glibc detected when I use: atd -d PAM module was fixed -> thanks for comment Sorry, but the problem persists for me. I now get the following in syslog: atd[9053]: Module is unknown when the job times out, and atd dies. You have to use the -d option to see anything at all, otherwise the daemon just silently dies. Try this: service atd stop gdb --args /usr/sbin/atd -d GNU gdb Red Hat Linux (6.6-4.fc7rh) Copyright (C) 2006 Free Software Foundation, Inc. GDB is free software, covered by the GNU General Public License, and you are welcome to change it and/or distribute copies of it under certain conditions. Type "show copying" to see the conditions. There is absolutely no warranty for GDB. Type "show warranty" for details. This GDB was configured as "i386-redhat-linux-gnu"... (no debugging symbols found) Using host libthread_db library "/lib/libthread_db.so.1". (gdb) run Starting program: /usr/sbin/atd -d (at this point, from another console/Xterm, run the echo true | at "now+1 min" command. Then switch back to the console/Xterm with the debug session. After one minute....) Program received signal SIGHUP, Hangup. 0x00e26402 in __kernel_vsyscall () (gdb) bt #0 0x00e26402 in __kernel_vsyscall () #1 0x00283530 in __nanosleep_nocancel () from /lib/libc.so.6 #2 0x0028337f in __sleep (seconds=0) at ../sysdeps/unix/sysv/linux/sleep.c:138 #3 0x80002da2 in main () from /usr/sbin/atd (gdb) cont Continuing. *** glibc detected *** /usr/sbin/atd: free(): invalid pointer: 0xbfef856f *** Module is unknown ======= Backtrace: ========= /lib/libc.so.6[0x25b5ed] /lib/libc.so.6(cfree+0x90)[0x25ec40] /usr/sbin/atd[0x8000181a] /usr/sbin/atd[0x8000294f] /usr/sbin/atd(main+0x3ce)[0x80002d8e] /lib/libc.so.6(__libc_start_main+0xe0)[0x20aec0] /usr/sbin/atd[0x800014f1] ======= Memory map: ======== 001ea000-001f4000 r-xp 00000000 ee:00 40108131 /lib/libpam.so.0.81.6 001f4000-001f5000 rwxp 00009000 ee:00 40108131 /lib/libpam.so.0.81.6 001f5000-00340000 r-xp 00000000 ee:00 40108097 /lib/libc-2.5.90.so 00340000-00342000 r-xp 0014b000 ee:00 40108097 /lib/libc-2.5.90.so 00342000-00343000 rwxp 0014d000 ee:00 40108097 /lib/libc-2.5.90.so 00343000-00346000 rwxp 00343000 00:00 0 00391000-003a3000 r-xp 00000000 ee:00 3178513 /lib/libaudit.so.0.0.0 003a3000-003a5000 rwxp 00011000 ee:00 3178513 /lib/libaudit.so.0.0.0 0076c000-00777000 r-xp 00000000 ee:00 40108178 /lib/libgcc_s-4.1.2-20070222.so.1 00777000-00778000 rwxp 0000a000 ee:00 40108178 /lib/libgcc_s-4.1.2-20070222.so.1 007cc000-007e7000 r-xp 00000000 ee:00 3178504 /lib/ld-2.5.90.so 007e7000-007e8000 r-xp 0001a000 ee:00 3178504 /lib/ld-2.5.90.so 007e8000-007e9000 rwxp 0001b000 ee:00 3178504 /lib/ld-2.5.90.so 00e26000-00e27000 r-xp 00e26000 00:00 0 [vdso] 00fbb000-00fbe000 r-xp 00000000 ee:00 3178522 /lib/libdl-2.5.90.so 00fbe000-00fbf000 r-xp 00002000 ee:00 3178522 /lib/libdl-2.5.90.so 00fbf000-00fc0000 rwxp 00003000 ee:00 3178522 /lib/libdl-2.5.90.so 80000000-80004000 r-xp 00000000 ee:00 8896612 /usr/sbin/atd 80004000-80005000 rw-p 00004000 ee:00 8896612 /usr/sbin/atd 819f6000-81a17000 rw-p 819f6000 00:00 0 b7d00000-b7d21000 rw-p b7d00000 00:00 0 b7d21000-b7e00000 ---p b7d21000 00:00 0 b7e92000-b7ec7000 r--s 00000000 ee:00 2310146 /var/db/nscd/group b7ec7000-b7efc000 r--s 00000000 ee:00 2310145 /var/db/nscd/passwd b7efc000-b7efe000 rw-p b7efc000 00:00 0 b7f0f000-b7f10000 rw-p b7f0f000 00:00 0 bfee4000-bfef9000 rw-p bfee4000 00:00 0 [stack] Program received signal SIGABRT, Aborted. 0x00e26402 in __kernel_vsyscall () (gdb) bt #0 0x00e26402 in __kernel_vsyscall () #1 0x0021df60 in *__GI_raise (sig=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64 #2 0x0021f7b1 in *__GI_abort () at abort.c:88 #3 0x0025380b in __libc_message (do_abort=2, fmt=0x318c04 "*** glibc detected *** %s: %s: 0x%s ***\n") at ../sysdeps/unix/sysv/linux/libc_fatal.c:170 #4 0x0025b5ed in _int_free (av=0x343120, mem=0xbfef856f) at malloc.c:5788 #5 0x0025ec40 in *__GI___libc_free (mem=0xbfef856f) at malloc.c:3566 #6 0x8000181a in ?? () from /usr/sbin/atd #7 0x8000294f in ?? () from /usr/sbin/atd #8 0x80002d8e in main () from /usr/sbin/atd (gdb) Also, PLEASE check the ATD manpage as it lists several options that ATD does not accept, such as the "-n" (don't fork) option. The debuginfo RPM for at does not seem to contain any files at all, so it is not possible to debug what is going on in atd itself. BTW - I had updated to the same version of AT as you when I generated that last comment. it's because of --- atd.c --- | char newname[256]; | | free(newname); The do-not-fork option works, but requires an additional argument erroneously. Option string should be | getopt(argc, argv, "sdl:b:n") but not | getopt(argc, argv, "sdl:b:n:") I'm seeing the same error in the atd that comes with F7, with all updates applied. The system was updated from FC6, and there were at jobs in the queue at the time -- but new at jobs cause it to die too. When the time comes from atd to run a program, it does, but then atd immediately dies. atd -d fails with the same error as given above -- # atd -d *** glibc detected *** atd: free(): invalid pointer: 0xbfd46bef *** ======= Backtrace: ========= /lib/libc.so.6[0xe39f41] /lib/libc.so.6(cfree+0x90)[0xe3d580] atd[0x8000181a] ... I guess for now my workaround is to replace running `atd' with a script that just runs at -d in a loop -- #!/bin/sh while : ; do at -d done and run that instead. (In reply to comment #7) > at -d er, atd -d ... Atd crashes for me as well, after running a job. In /var/log/messages I get: atd[2621]: Open of jobfile failed: Success Same problem here using Fedora 7 x86_64 # rpm -q at at-3.1.10-11.fc7 # service atd restart;(echo "/bin/true"|at now);sleep 60;service atd status Stopping atd: [FAILED] Starting atd: [ OK ] job 6 at Thu Jun 21 10:32:00 2007 Jun 21 10:32:51 d490f7x64 atd[7572]: Open of jobfile failed: Success atd dead but pid file exists Same problem here using Fedora 7 i386 # rpm -q at at-3.1.10-11.fc7 In var/log/messages: Open of jobfile failed: Success # service atd status atd dead but pid file exists *** This bug has been marked as a duplicate of 243064 *** |