Bug 661966 - Jobs dropped due to falling out of allowed hour range should not be locked
Summary: Jobs dropped due to falling out of allowed hour range should not be locked
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: cronie
Version: rawhide
Hardware: Unspecified
OS: Unspecified
Priority: low
Severity: medium
Target Milestone: ---
Assignee: Marcela Mašláňová
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2010-12-10 07:17 UTC by Marcela Mašláňová
Modified: 2010-12-23 19:59 UTC (History)

Fixed In Version: cronie-1.4.5-4.fc14
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2010-12-23 19:59:14 UTC


Attachments
problem (23.24 KB, text/plain)
2010-12-10 07:18 UTC, Marcela Mašláňová
lock (30.38 KB, text/plain)
2010-12-10 07:19 UTC, Marcela Mašláňová
pstree (6.76 KB, text/plain)
2010-12-10 07:20 UTC, Marcela Mašláňová
Log showing daily and weekly active at the same time (10.24 KB, text/plain)
2010-12-10 14:14 UTC, Anders Blomdell

Description Marcela Mašláňová 2010-12-10 07:17:37 UTC
Description of problem:
Occasionally (this has only been observed when the start of a job is delayed
past the next execution of cron) the weekly anacron task locks out the daily
task. Attached you will find the output, captured at such an instance, of:

  pstree -p
  lslk
  /var/log/cron

Version-Release number of selected component (if applicable):
cronie-anacron-1.4.5-2.fc14.i686

Comment 1 Marcela Mašláňová 2010-12-10 07:18:24 UTC
Created attachment 467907 [details]
problem

Comment 2 Marcela Mašláňová 2010-12-10 07:19:48 UTC
Created attachment 467908 [details]
lock

Comment 3 Marcela Mašláňová 2010-12-10 07:20:13 UTC
Created attachment 467909 [details]
pstree

Comment 4 Tomas Mraz 2010-12-10 08:13:02 UTC
Well, this is clearly a bug in the 99-raid-check script, which hangs for some reason, and it should be fixed in the package that owns this script. Please open a bug against that package.

On the other hand, we could add a feature to anacron, such as a nowait flag, so that anacron does not wait for a job marked with this flag to finish and instead marks it as finished immediately after its child process forks.

Comment 5 Anders Blomdell 2010-12-10 08:57:10 UTC
No, it's not a bug in 99-raid-check; it simply takes 3 days to complete on heavily loaded 2TB disks. That is fine with me, since it leaves 4 days until the next run.

Comment 6 Tomas Mraz 2010-12-10 09:52:54 UTC
Then it cannot be run from anacron, at least not before the nowait feature is added, and it also cannot be run from the cron.weekly directory, but only directly with its own entry in /etc/anacrontab. The other possibility is to handle the spawning of the long-running process directly in the 99-raid-check script.
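
For illustration only, the "spawn the long-running process directly" idea could look roughly like the sketch below. It is written in Python rather than in the shell used by the real 99-raid-check script, and `spawn_detached` is a hypothetical helper name, not part of cronie or of that script. The point is simply that the parent returns as soon as the child has been started, instead of waiting for it to finish.

```python
import subprocess
import sys


def spawn_detached(argv):
    """Start argv in its own session and return its PID without waiting.

    Hypothetical helper: the parent (the script that anacron runs) can
    exit right away while the child keeps running in the background.
    """
    proc = subprocess.Popen(
        argv,
        stdin=subprocess.DEVNULL,    # do not inherit the parent's stdin
        stdout=subprocess.DEVNULL,   # discard output instead of holding pipes open
        stderr=subprocess.DEVNULL,
        start_new_session=True,      # POSIX: detach from the parent's session
    )
    return proc.pid


# Stand-in for the real long-running command.
child_pid = spawn_detached([sys.executable, "-c", "pass"])
print("started child", child_pid)
```

With this approach the job would be reported as finished while the work is still in progress, so anything that relies on the job's completion would need its own check.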

Comment 7 Anders Blomdell 2010-12-10 12:35:57 UTC
Then why does it work in all the weeks when the start delay does not exceed the next invocation of anacron?

I.e., we have only seen it lock out daily jobs when 'random(RANDOM_DELAY) + cron.weekly.delay > 60' (numerically: random(45) + 25 > 60).

Your comment seems to imply that no anacron task may last longer than the shortest period?
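
As a rough illustration of how often that condition can hold, the arithmetic can be enumerated as below. This is only a sketch: it assumes the random part is a uniformly distributed whole number of minutes from 0 to RANDOM_DELAY inclusive, which may not match anacron's actual random number generation, and `overrun_probability` is a hypothetical helper name.

```python
from fractions import Fraction


def overrun_probability(random_delay_max, job_delay, period=60):
    """Chance that random delay + job delay exceeds the period (minutes).

    Assumption: the random delay is a uniform integer in
    0..random_delay_max inclusive.
    """
    outcomes = range(random_delay_max + 1)
    hits = sum(1 for r in outcomes if r + job_delay > period)
    return Fraction(hits, len(outcomes))


# Values from the comment above: RANDOM_DELAY=45, cron.weekly delay=25.
p = overrun_probability(45, 25)
print(p, round(float(p), 3))
```

Under that assumption only random delays of 36 to 45 minutes push the weekly start past the 60-minute mark, which is consistent with the problem showing up in some weeks and not in others.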

Comment 8 Tomas Mraz 2010-12-10 13:29:56 UTC
If it takes more than one day to complete, then it will block the daily jobs the next day anyway, regardless of the random delay.

Comment 9 Anders Blomdell 2010-12-10 14:14:44 UTC
Created attachment 467976 [details]
Log showing daily and weekly active at the same time

Also shows that the weird locking behavior does not always occur.

Comment 10 Tomas Mraz 2010-12-10 15:15:56 UTC
OK, now I see what the problem is: it happens when the weekly jobs are started in an anacron instance that is started at 2am or earlier in the day. In that case the daily job falls out of the allowed hour range, yet its timestamp file is still locked. That is the bug; the file should not have been locked in that case.
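
The intended behaviour can be sketched as follows. This is a simplified Python model, not cronie's actual C code: the function names, the way the run hour is computed, the allowed range of 3-22, the daily delay of 5 minutes and the random delay of 40 minutes are all assumptions chosen for illustration (only the weekly delay of 25 comes from this report).

```python
def falls_in_range(start_hour, delay_minutes, allowed_hours):
    """True if start hour plus whole hours of delay is inside the allowed range.

    Assumption: the range is treated as first <= hour < last.
    """
    first, last = allowed_hours
    run_hour = start_hour + delay_minutes // 60
    return first <= run_hour < last


def consider_job(name, start_hour, delay_minutes, allowed_hours, try_lock):
    """Decide whether a job runs; only jobs that are in range get locked."""
    if not falls_in_range(start_hour, delay_minutes, allowed_hours):
        return False  # dropped: its timestamp file is never locked
    return try_lock(name)


locked = []


def fake_lock(name):
    """Stand-in for locking the job's timestamp file."""
    locked.append(name)
    return True


# anacron started at 02:00, random delay of 40 minutes, allowed hours 3-22.
ran_daily = consider_job("cron.daily", 2, 5 + 40, (3, 22), fake_lock)
ran_weekly = consider_job("cron.weekly", 2, 25 + 40, (3, 22), fake_lock)
print(ran_daily, ran_weekly, locked)
```

In this model the daily job is dropped without touching its lock, so a later anacron instance is free to run it even while the weekly job is still busy.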

Comment 11 Marcela Mašláňová 2010-12-13 09:38:40 UTC
(In reply to comment #9)
> Created attachment 467976 [details]
> Log showing daily and weekly active at the same time
> 
> Also shows that the weird locking behavior does not always occur.

Could you test the update and let us know?

Comment 12 Anders Blomdell 2010-12-13 11:14:49 UTC
Where do I find the update?

Was only able to find the stable ones in https://admin.fedoraproject.org/updates

Comment 13 Marcela Mašláňová 2010-12-13 11:39:30 UTC
Rawhide doesn't have updates, packages are just synced on mirrors. It should be fixed by release 1.4.6-5.

Comment 14 Tomas Mraz 2010-12-13 17:26:20 UTC
I built a F14 package here in koji:
http://koji.fedoraproject.org/koji/buildinfo?buildID=209038
You can download it from there.

Comment 15 Fedora Update System 2010-12-14 08:52:01 UTC
cronie-1.4.5-3.fc14 has been submitted as an update for Fedora 14.
https://admin.fedoraproject.org/updates/cronie-1.4.5-3.fc14

Comment 16 Fedora Update System 2010-12-15 09:01:41 UTC
cronie-1.4.5-3.fc14 has been pushed to the Fedora 14 testing repository. If problems still persist, please make note of it in this bug report.
If you want to test the update, you can install it with
  su -c 'yum --enablerepo=updates-testing update cronie'
You can provide feedback for this update here: https://admin.fedoraproject.org/updates/cronie-1.4.5-3.fc14

Comment 17 Fedora Update System 2010-12-16 14:20:36 UTC
cronie-1.4.5-4.fc14 has been submitted as an update for Fedora 14.
https://admin.fedoraproject.org/updates/cronie-1.4.5-4.fc14

Comment 18 Fedora Update System 2010-12-23 19:58:57 UTC
cronie-1.4.5-4.fc14 has been pushed to the Fedora 14 stable repository.  If problems still persist, please make note of it in this bug report.

