Bug 67414
Summary: | atd doesnt clean up old jobs
---|---
Product: | [Retired] Red Hat Linux
Component: | at
Version: | 7.2
Hardware: | i686
OS: | Linux
Status: | CLOSED RAWHIDE
Severity: | medium
Priority: | medium
Reporter: | Need Real Name <nkk>
Assignee: | Jens Petersen <petersen>
QA Contact: | Aaron Brown <abrown>
CC: | tao
Doc Type: | Bug Fix
Last Closed: | 2002-07-19 09:25:19 UTC

Description
Need Real Name
2002-06-24 18:51:26 UTC
This system was actually running 7.3, not 7.2.

1. "=" does not mean that the job has completed successfully. "=" means that the job is currently running (big difference). If your job (whatever it was) consisted of a waiting / looping / runaway process, you would see the at job remaining in the "=" (running) queue. It would appear that this is what was really happening on the machine in question. You can verify through the ps and top commands just what processes are currently doing what (e.g. "ps -xaf"). When you say that the job had completed normally, are you _sure_ that the job had completely terminated (did you check the running processes)?

2. The problem with old jobs remaining and new jobs not running is a curious one. Even an old job in the "=" queue (i.e. currently running) will not block new jobs from running. Two cases which I suppose could cause problems like this:
   a) the file system runs out of space
   b) more than one at daemon (atd) is running (see below).

3. The issue with the atd process not terminating when "service atd stop" is issued, and a second atd process starting with "service atd start", can be explained as follows: if a job which has been spawned by the atd process is still running when you do a "service atd stop", atd continues to run until the spawned job terminates. If the old atd process is still running when you start the at daemon again (start a new atd process), you will then have two atd processes running concurrently. I wonder what kind of unpredictable behaviour this could generate, and whether you have seen some of this behaviour yourself.

4. The /var/spool/at job files being zero length... I believe this must be the result of at becoming confused? I have not seen this before, and am still not clear how this condition occurs.

Created attachment 65869 [details]
patch to correct some of atd's bad behavior.

I have applied the patch. Issue 1 is solved; however, issue 2 still remains in my case.

Confirmed again. Both of them are solved.

See 8.1.3-30.
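
For anyone who wants to repeat the checks suggested in points 2 to 4 of the analysis above, here is a minimal Python sketch. It is not part of the attached patch or of at itself. The function names are hypothetical, and the process count assumes a Linux kernel that exposes /proc/<pid>/comm (newer kernels than the one shipped with Red Hat Linux 7.3). The spool directory may also hold bookkeeping files or subdirectories besides job files, so any zero-length hit should be inspected by hand, and reading the directory typically needs root.

```python
import os
import shutil


def find_zero_length_files(spool_dir):
    """Return the sorted names of regular files in spool_dir that are 0 bytes long."""
    empty = []
    for entry in os.scandir(spool_dir):
        # Skip subdirectories and symlinks; only plain files are considered.
        if entry.is_file(follow_symlinks=False):
            if entry.stat(follow_symlinks=False).st_size == 0:
                empty.append(entry.name)
    return sorted(empty)


def count_processes_named(name, proc_root="/proc"):
    """Count processes whose command name, read from <pid>/comm, equals name."""
    count = 0
    for pid in os.listdir(proc_root):
        if not pid.isdigit():
            continue  # entries such as "self" or "meminfo" are not processes
        try:
            with open(os.path.join(proc_root, pid, "comm")) as fh:
                if fh.read().strip() == name:
                    count += 1
        except OSError:
            # The process exited in the meantime, or comm is not readable.
            continue
    return count


def free_bytes(path):
    """Return the number of free bytes on the file system holding path."""
    return shutil.disk_usage(path).free


if __name__ == "__main__":
    spool = "/var/spool/at"  # path taken from point 4 of the thread
    print("atd processes running:", count_processes_named("atd"))
    try:
        print("free bytes on spool file system:", free_bytes(spool))
        print("zero-length files in spool:", find_zero_length_files(spool))
    except OSError as exc:
        print("could not inspect", spool, "-", exc)
```

More than one atd process, a spool file system with no free space, or zero-length files in the spool directory would each match one of the conditions described in the thread.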