Bug 2180875 - Recurring tasks not rescheduled to future during upgrade. [NEEDINFO]
Summary: Recurring tasks not rescheduled to future during upgrade.
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat Satellite
Classification: Red Hat
Component: Tasks Plugin
Version: 6.13.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: Unspecified
Assignee: satellite6-bugs
QA Contact: Satellite QE Team
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2023-03-22 13:43 UTC by Lukáš Hellebrandt
Modified: 2024-04-03 11:49 UTC
CC List: 8 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2024-04-03 11:49:36 UTC
Target Upstream Version:
Embargoed:
rlavi: needinfo? (ehelms)


Links
System ID Private Priority Status Summary Last Updated
Red Hat Issue Tracker SAT-16916 0 None None None 2023-04-03 13:19:49 UTC

Description Lukáš Hellebrandt 2023-03-22 13:43:18 UTC
Description of problem:
Recurring tasks do not get rescheduled to the future during upgrade. As a result, once the upgrade finishes, their next occurrence may lie in the past, so they never run. Once disabled, they can't be enabled again. Error:

ERF28-1357 [ForemanTasks::RecurringLogicCancelledException]: Cannot update a cancelled Recurring Logic.

Version-Release number of selected component (if applicable):
Sat 6.13.0 snap 15.0 and upgrade path 6.12.3 snap 2.0 -> 6.13.0 snap 15.0. Not a regression.

How reproducible:
Deterministic

Steps to Reproduce:
1) Create a recurring task (All hosts -> <host> -> Schedule a job) with cron "*/2 * * * *"
2) Create a sync plan (Content -> Sync plans -> Create sync plan) on some repo with the same cron
3) Run the upgrade to 6.13
4) Go to Recurring logics and disable both
5) Enable them again

Actual results:
The sync plan created in 2) was rescheduled to the future during upgrade.
The recurring task created in 1) was NOT rescheduled to the future during upgrade. After disabling it manually, it can't be enabled again due to the above error. It doesn't run at the specified time.

Expected results:
The recurring task created in 1) is also rescheduled to the future during upgrade
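
For illustration only, here is a minimal Python sketch of the difference between a next occurrence derived from a stale reference time and one derived from the current time, for the cron "*/2 * * * *" used in the reproducer. This is not how Satellite or dynflow computes occurrences; the function name and the timestamps are assumptions made up for the example.

```python
from datetime import datetime, timedelta


def next_every_n_minutes(after: datetime, n: int = 2) -> datetime:
    """Return the first minute boundary strictly after `after` whose
    minute is a multiple of n, i.e. the cron pattern '*/n * * * *'."""
    candidate = after.replace(second=0, microsecond=0) + timedelta(minutes=1)
    while candidate.minute % n != 0:
        candidate += timedelta(minutes=1)
    return candidate


# Hypothetical timestamps: the task last ran before the upgrade started,
# and the upgrade finished roughly an hour and a half later.
last_run = datetime(2023, 3, 22, 13, 40)
now = datetime(2023, 3, 22, 15, 7, 30)

stale = next_every_n_minutes(last_run)   # derived from the old reference
rescheduled = next_every_n_minutes(now)  # derived from the current time

print("stale next occurrence is in the past:", stale < now)
print("rescheduled next occurrence is in the future:", rescheduled > now)
```

A next occurrence computed from the pre-upgrade reference time is already behind the clock when services come back, which matches the symptom described above.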

Comment 1 Adam Ruzicka 2023-03-22 14:32:22 UTC
This is interesting, there should be no difference between recurring tasks and sync plans as sync plans use recurring tasks as the backend. Do you still have the machine?

Comment 2 Lukáš Hellebrandt 2023-03-28 10:21:59 UTC
Answer provided on Slack

Comment 3 Adam Ruzicka 2023-06-05 09:07:18 UTC
I might have a hunch what's going on, but it is a bit of a stretch. I believe what I wrote in https://bugzilla.redhat.com/show_bug.cgi?id=2131839#c15 still holds, however sync plans are handled in a special way during the upgrade. Before the upgrade starts, all sync plans are disabled and then re-enabled again once the upgrade is done. We don't do anything like this for other things using recurring logics. The upgrade itself isn't exactly gentle and services get restarted quite a lot.

Right now the root cause seems to be that services are restarted before the fix for 2131839 is deployed which triggers https://bugzilla.redhat.com/show_bug.cgi?id=2131839#c10 . And because things get broken before the upgrade, deploying a fixed version during the upgrade doesn't really help. Sync plans seem to be immune to this because they are disabled in this stage.

There are still some gaps I cannot really explain right now, but confirming or refuting this theory should be quite straightforward, albeit a bit time consuming.

Comment 4 Lukáš Hellebrandt 2023-06-07 16:47:56 UTC
I tried with 6.13.1 -> 6.14 and wasn't able to reproduce.

I used the reproducer from the OP on one machine. On another machine, before the upgrade, I stopped Satellite services during the time when the tasks were supposed to run. After starting the services again, everything got rescheduled to the future properly. Even after upgrading this second instance, everything still works.

Comment 5 Adam Ruzicka 2023-06-08 12:48:56 UTC
That somewhat confirms I was on to something in #3 and that the fix for the original BZ still helps here. 

The easiest way out seems to be delivering updated dynflow (or just picking the fix for the original BZ) into all currently supported Satellite versions. However, one would have to either update dynflow by hand before continuing with the rest of the upgrade or disable (probably) all recurring logics prior to the upgrade, so the bug wouldn't get triggered when trying to deploy the fixed version. Alternatively, foreman-maintain could be made to do that.

Or, considering no one really complained about the issue (specifically about REX), we could treat this as resolved in the current release, although disabling all recurring logics during upgrade might not be a bad idea.
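
The "disable before the upgrade, re-enable afterwards" idea can be sketched roughly as below. This is a toy Python model under stated assumptions, not foreman-maintain code: the RecurringLogic class and the recurring_logics_paused helper are hypothetical names, and no real Satellite API is called.

```python
from contextlib import contextmanager
from dataclasses import dataclass


@dataclass
class RecurringLogic:
    """Hypothetical stand-in for a recurring logic record."""
    name: str
    enabled: bool = True


@contextmanager
def recurring_logics_paused(logics):
    """Disable every currently enabled logic for the duration of the block,
    then re-enable only the ones this helper disabled."""
    paused = [logic for logic in logics if logic.enabled]
    for logic in paused:
        logic.enabled = False
    try:
        yield paused
    finally:
        # Runs even if the wrapped step raises, so nothing stays disabled.
        for logic in paused:
            logic.enabled = True


logics = [
    RecurringLogic("remote execution job"),
    RecurringLogic("sync plan"),
    RecurringLogic("disabled by the user", enabled=False),
]

with recurring_logics_paused(logics):
    # The upgrade steps, including service restarts, would run here.
    print([logic.enabled for logic in logics])

print([logic.enabled for logic in logics])
```

Remembering which logics were enabled beforehand keeps a logic that the user had disabled on purpose from being switched on by the re-enable step.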

Comment 7 Peter Vreman 2023-07-12 12:13:09 UTC
My case is attached to this BZ. It contains a simple reproducer on 6.12 for the rh_cloud jobs:

My reproducer and the results were in Feb-2023 on Sat6.12
Test from yesterday-today overnight:
- Feb 01 14:11  - Stop Satellite 'satellite-maintain service stop'
<kept it stopped over night>
- Feb 02 09:29 - Started Satellite 'satellite-maintain service start'

Result in recurring logic (see also the attached screenshot)
- Last occurrence 'Feb 02 09:30'
- Next occurrence 'Feb 01 xx:yy'
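
A state like the one above (next occurrence earlier than the last occurrence) can be checked for mechanically. The following Python sketch only illustrates that check; the Occurrences record, the find_stale helper and all timestamps are hypothetical, and the stale time in particular is a placeholder, not the actual value from the screenshot.

```python
from datetime import datetime
from typing import NamedTuple


class Occurrences(NamedTuple):
    """Hypothetical record of what the Recurring logics page shows."""
    name: str
    last_at: datetime
    next_at: datetime


def find_stale(rows, now):
    """Return the names of logics whose next occurrence is not in the
    future, or is earlier than their last occurrence."""
    return [
        row.name
        for row in rows
        if row.next_at <= now or row.next_at < row.last_at
    ]


# Placeholder data shaped like the result above.
rows = [
    Occurrences("rh_cloud job", datetime(2023, 2, 2, 9, 30), datetime(2023, 2, 1, 23, 0)),
    Occurrences("healthy job", datetime(2023, 2, 2, 9, 30), datetime(2023, 2, 3, 9, 30)),
]

print(find_stale(rows, now=datetime(2023, 2, 2, 10, 0)))
```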


Day1 During the day stop Satellite '

Comment 10 Adam Ruzicka 2024-04-03 11:49:36 UTC
It seems we missed the boat with this a little bit. This should be fixed in dynflow that is shipped with 6.13, so all upgrades from 6.13+ should be safe. The only place where this can manifest is when updating from 6.12 to 6.13, but at this point, we can't really fix it there. With that being said, I'll go ahead and close this.

