Bug 1473595 - Messages in resource manager queue are not persisted after restarting Qpid and cause tasks never start
Status: CLOSED ERRATA
Product: Red Hat Satellite 6
Classification: Red Hat
Component: Pulp
Version: 6.2.10
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: GA
Target Release: --
Assigned To: satellite6-bugs
QA Contact: Jitendra Yejare
Keywords: Triaged
Reported: 2017-07-21 05:00 EDT by Hao Chang Yu
Modified: 2018-02-21 11:54 EST (History)

Last Closed: 2018-02-21 11:54:37 EST
Type: Bug




External Trackers
Tracker ID Priority Status Summary Last Updated
Pulp Redmine 2927 High CLOSED - CURRENTRELEASE Messages in resource manager queue are not persisted after restarting Qpid and cause tasks never start 2017-10-17 10:02 EDT

Description Hao Chang Yu 2017-07-21 05:00:10 EDT
Description of problem:
Messages in the resource_manager queue are lost when the Qpid broker is restarted, which leaves the corresponding tasks in the "waiting" state forever.

Steps to Reproduce:
1. systemctl stop pulp_resource_manager
2. Sync a repo in the Satellite web ui
3. systemctl restart qpidd
4. systemctl start pulp_resource_manager

Before restarting Qpid:
queue             dur  autoDel  excl  msg  msgIn  msgOut  bytes  bytesIn  bytesOut  cons  bind
==============================================================================================
resource_manager  Y                   1    20     19      1.59k  32.6k    31.0k     0     2

The msg column shows 1 message waiting in the queue.

After restarting Qpid:
queue             dur  autoDel  excl  msg  msgIn  msgOut  bytes  bytesIn  bytesOut  cons  bind
==============================================================================================
resource_manager  Y                   0    0      0       0      0        0         0     2

The msg column is now 0: the queued message was lost.
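
The before/after comparison above can be scripted. The following is a minimal sketch, not part of the original report: the helper name `queue_depth` is hypothetical, and it assumes the column layout shown in the two listings (the last eight columns are msg, msgIn, msgOut, bytes, bytesIn, bytesOut, cons, bind). Because the autoDel and excl columns can be blank, the msg field is counted from the right.

```shell
#!/bin/sh
# Print the "msg" column for one queue from `qpid-stat -q` style output on stdin.
# autoDel and excl may be blank, so count from the right-hand side:
# msg msgIn msgOut bytes bytesIn bytesOut cons bind  ->  msg is field NF-7.
queue_depth() {
    awk -v q="$1" '$1 == q { print $(NF - 7); exit }'
}

# Sample rows copied from the listings in this report.
before=' resource_manager                                 Y                      1    20     19    1.59k  32.6k    31.0k        0     2'
after='resource_manager                                 Y                      0     0      0       0      0        0         0     2'

printf '%s\n' "$before" | queue_depth resource_manager   # message count before the restart
printf '%s\n' "$after"  | queue_depth resource_manager   # message count after the restart
```

On a live system the same function would read the real queue listing on stdin, with whatever broker connection options the deployment requires; those options are not shown in this report.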


Actual results:
Tasks remain in the waiting state forever.

Expected results:
Tasks should proceed after restarting Qpid.
Comment 1 David Davis 2017-07-21 11:13:05 EDT
The issue in https://pulp.plan.io/issues/2861 is caused by resource_manager removing its task from the queue during a warm shutdown. 

This issue is different in that the resource manager is already shut down. I have opened a new upstream issue:

https://pulp.plan.io/issues/2927
Comment 2 pulp-infra@redhat.com 2017-07-21 11:31:52 EDT
The Pulp upstream bug status is at NEW. Updating the external tracker on this bug.
Comment 3 pulp-infra@redhat.com 2017-07-21 11:31:55 EDT
The Pulp upstream bug priority is at Normal. Updating the external tracker on this bug.
Comment 5 pulp-infra@redhat.com 2017-08-04 11:01:46 EDT
The Pulp upstream bug priority is at High. Updating the external tracker on this bug.
Comment 6 pulp-infra@redhat.com 2017-08-17 23:20:56 EDT
The Pulp upstream bug status is at ASSIGNED. Updating the external tracker on this bug.
Comment 7 pulp-infra@redhat.com 2017-09-11 15:46:30 EDT
The Pulp upstream bug status is at MODIFIED. Updating the external tracker on this bug.
Comment 8 pulp-infra@redhat.com 2017-09-11 16:07:45 EDT
All upstream Pulp bugs are at MODIFIED+. Moving this bug to POST.
Comment 9 pulp-infra@redhat.com 2017-10-17 10:02:09 EDT
The Pulp upstream bug status is at CLOSED - CURRENTRELEASE. Updating the external tracker on this bug.
Comment 10 Jitendra Yejare 2017-12-20 06:12:07 EST
While verifying this bug using the steps in the description, I made the following observations:

After stopping pulp_resource_manager, I tried to sync a repo in Satellite, but the sync failed with the message below and did not stay in the pending state:
```
There was an issue with the backend service pulp: Not all necessary pulp workers running at https://qeblade36.rhq.lab.eng.bos.redhat.com/pulp/api/v2/.
```
Hence the sync does not start. Is this expected?

I still restarted qpidd and checked the queue before and after the restart.
After the restart, the message count is 0.

On starting pulp_resource_manager, the repo sync did not resume because the task was already in the error state. Is this expected? I doubt it.

Is all of this behavior correct?
May I mark this bug verified?
Comment 11 David Davis 2017-12-20 10:24:32 EST
Katello has checks in place so that it doesn't send tasks to pulp unless all pulp processes are running. So the steps in the original comment aren't going to work. To bypass the checks, we can just use pulp-admin to do the sync.

First, install and set up pulp-admin as described here:

https://access.redhat.com/solutions/1295653


Now run these steps to verify this bug:

systemctl stop pulp_resource_manager

pulpAdminPassword=$(grep ^default_password /etc/pulp/server.conf | cut -d' ' -f2)

pulp-admin -u admin -p $pulpAdminPassword rpm repo sync run --repo-id 58ec9596-497b-42ea-acf3-6580062e0924

systemctl restart qpidd

systemctl start pulp_resource_manager
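
The password lookup in the steps above splits the line on a single space with `cut`, so it depends on exactly one space following the colon. A slightly more tolerant variant is sketched below. It is only an illustration: the function name `read_default_password` is hypothetical, the demo file and its value are made up, and it assumes the same `default_password: value` line format that the original command relies on.

```shell
#!/bin/sh
# Print the value of the first "default_password:" line in the given file.
# Removing the key with sed tolerates any amount of whitespace after the colon.
read_default_password() {
    sed -n 's/^default_password:[[:space:]]*//p' "$1" | head -n 1
}

# Demonstration against a throwaway file with a made-up value.
conf=$(mktemp)
printf '[server]\ndefault_password:   example-secret\n' > "$conf"
read_default_password "$conf"
rm -f "$conf"
```

On the Satellite host the call would be `pulpAdminPassword=$(read_default_password /etc/pulp/server.conf)`, using the path from the steps above.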



To verify this bug, run this and look at the last sync task:

pulp-admin -u admin -p $pulpAdminPassword tasks list -a


You should see something like this with either the state Running or Successful:

Operations:  sync
Resources:   58ec9596-497b-42ea-acf3-6580062e0924 (repository)
State:       Running
Start Time:  2017-12-20T15:22:53Z
Finish Time: Incomplete
Task Id:     b97c7b82-141a-4c76-8abb-7146d1f1d58a
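
The check described above (find the last sync task and confirm that its state is Running or Successful) can also be done mechanically. This is a rough sketch rather than an official tool: the helper name `last_sync_state` is hypothetical, and it assumes the task listing uses the `Operations:` and `State:` labels shown in the sample output, with `Operations:` appearing before `State:` within each task.

```shell
#!/bin/sh
# Print the State of the last task whose Operations line mentions "sync",
# reading `pulp-admin ... tasks list -a` style output on stdin.
last_sync_state() {
    awk '
        $1 == "Operations:"       { is_sync = ($0 ~ /sync/) }
        $1 == "State:" && is_sync { state = $2 }
        END                       { if (state != "") print state }
    '
}

# Sample task copied from the output shown in this comment.
sample='Operations:  sync
Resources:   58ec9596-497b-42ea-acf3-6580062e0924 (repository)
State:       Running
Start Time:  2017-12-20T15:22:53Z
Finish Time: Incomplete
Task Id:     b97c7b82-141a-4c76-8abb-7146d1f1d58a'

state=$(printf '%s\n' "$sample" | last_sync_state)
case "$state" in
    Running|Successful) echo "PASS: last sync task is $state" ;;
    *)                  echo "FAIL: last sync task is ${state:-unknown}" ;;
esac
```

In practice the function would read the output of the `tasks list -a` command above instead of the embedded sample.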
Comment 12 Jitendra Yejare 2017-12-20 11:03:00 EST
Verified!

@ Satellite 6.3 snap 29

As per comment 11, I installed pulp-admin on the Satellite to test this bug; without it, the bug is difficult to test.

Steps:
(on Satellite)

1. systemctl stop pulp_resource_manager
2. pulpAdminPassword=$(grep ^default_password /etc/pulp/server.conf | cut -d' ' -f2)
3. pulp-admin -u admin -p $pulpAdminPassword rpm repo sync run --repo-id <repo_uuid>
```
+----------------------------------------------------------------------+
    Synchronizing Repository [58ec9596-497b-42ea-acf3-6580062esa887]
+----------------------------------------------------------------------+

This command may be exited via ctrl+c without affecting the request.


[/]
Waiting to begin...
```
-> The sync task is now in the pending state because pulp_resource_manager is stopped
4. Restart the qpidd daemon (systemctl restart qpidd)
5. systemctl start pulp_resource_manager
-> Now the sync task should start running
```
-- Continue from where it stopped earlier
Downloading metadata...
[\]
... completed

Downloading repository content...
[-]
[==================================================] 100%
RPMs:       0/0 items
Delta RPMs: 0/0 items

... completed

Downloading distribution files...
[==================================================] 100%
Distributions: 0/0 items
... completed

Importing errata...
[-]
... completed

Importing package groups/categories...
[-]
... completed

Cleaning duplicate packages...
[-]
... completed

Task Succeeded

Task Succeeded
```



Qpid_Queue Before qpidd restart:
```
resource_manager                                                               Y                      1     3      2    1.32k  4.04k    2.72k        0     2
```

Qpid_Queue After qpidd restart:
```
resource_manager                                                               Y                      1     1      0    1.32k  1.32k       0         0     2
```

Hence, moving the state to Verified.
Comment 13 pm-sat@redhat.com 2018-02-21 11:54:37 EST
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2018:0336
