
Bug 1940356

Summary: repo sync failed with "Could not create the repository:\n There was an issue with the backend service pulp3" error
Product: Red Hat Satellite
Component: Satellite Maintain
Version: 6.9.0
Status: CLOSED ERRATA
Severity: medium
Priority: unspecified
Reporter: Imaan <ikaur>
Assignee: Amit Upadhye <aupadhye>
QA Contact: Gaurav Talreja <gtalreja>
CC: apatel, aupadhye, bmbouter, dalley, kgaikwad, smallamp, ttereshc
Target Milestone: 6.10.0
Target Release: Unused
Keywords: Performance, Triaged
Hardware: Unspecified
OS: Unspecified
Fixed In Version: rubygem-foreman_maintain-0.8.1
Doc Type: If docs needed, set a value
Last Closed: 2021-11-16 13:48:05 UTC
Type: Bug

Description Imaan 2021-03-18 09:52:55 UTC
Description of problem:

repo sync failed with "Could not create the repository:\n  There was an issue with the backend service pulp3: Not all necessary pulp workers running at https://satellite server IP /pulp/api/v3.", "stderr_lines": ["Could not create the repository:", "  There was an issue with the backend service pulp3: Not all necessary pulp workers running at https://satellite server IP /pulp/api/v3."], "stdout": "", "stdout_lines": []}"

While doing performance testing, we hit the above issue. 

Version-Release number of selected component (if applicable):

Satellite 6.9 


Steps to Reproduce:

1. Enable pulp3 services on the Satellite 6.9 server:

# systemctl start pulpcore-resource-manager pulpcore-api pulpcore-content pulpcore-worker@1 pulpcore-worker@2 pulpcore-worker@3 pulpcore-worker@4

# systemctl enable pulpcore-resource-manager pulpcore-api pulpcore-content pulpcore-worker@1 pulpcore-worker@2 pulpcore-worker@3 pulpcore-worker@4

# foreman-rake katello:pulp3_content_switchover


# sed -i -e 's/pulpcore::service_ensure: false/pulpcore::service_ensure: true/g' -e 's/pulpcore::service_enable: false/pulpcore::service_enable: true/g' /usr/share/foreman-installer/config/foreman.hiera/scenario/satellite.yaml

# satellite-installer  --foreman-proxy-content-proxy-pulp-isos-to-pulpcore=true --katello-use-pulp-2-for-file=false --katello-use-pulp-2-for-docker=false --katello-use-pulp-2-for-yum=false --foreman-proxy-content-proxy-pulp-yum-to-pulpcore=true

# satellite-maintain service restart

These steps follow this doc: https://docs.google.com/document/d/1eV8cTq_e7uRl-H9CP1yWZxEajFfv0JZdmcRo-p5irSc/edit#

2. Perform sync operation. 

Actual results:

The sync process failed with the above-mentioned error while creating the repos.


Expected results:

It should run without any failure. 

Additional info:

The sync process worked well last week. I have been testing for the last two weeks; last week the entire test run went well, the pulp3 workers were running, and I did not hit this issue. This week it is failing with the error above.

Comment 5 Tanya Tereshchenko 2021-03-19 12:07:46 UTC
Daniel, I can't agree or disagree, there is not enough info here, imo.

I read the steps to reproduce this way: you start the services, then run a sync, and it fails.
It's not clear whether there was a high load or not, or at which point the Katello(?) sync failed.
For Pulp, the failing step could be a sync or it could be a publish.


@Imaan, did the sync fail immediately or after some time under heavy load?
What exactly were you syncing and what else was happening on the machine?
Just one repo? How big? Many syncs in parallel? Any context will be helpful.

Thanks.

Comment 6 Brian Bouterse 2021-03-22 20:34:59 UTC
Can you look into two things?

1) On the system where this is happening, can you post the Pulp Status API? That is the response from Pulp at /pulp/api/v3/status/

2) Can you confirm you're using snap_6.9.0_18.0? I'm asking because I want to make sure your Pulp worker timeouts are tolerant to slow I/O on spinny disks. That Satellite bug is tracked here: https://bugzilla.redhat.com/show_bug.cgi?id=1929344
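
For item 1, a minimal Python sketch of fetching the status document might look like the following. The host name in the usage note is a placeholder, the helper names are hypothetical, and the sketch assumes the status endpoint answers a plain GET over HTTPS with a certificate the client already trusts; adjust for your environment.

```python
import json
import urllib.request


def status_url(base_url):
    """Build the Pulp 3 status URL from a server base URL."""
    return base_url.rstrip("/") + "/pulp/api/v3/status/"


def fetch_status(base_url, timeout=10):
    """GET the status document and decode the JSON body into a dict."""
    with urllib.request.urlopen(status_url(base_url), timeout=timeout) as resp:
        return json.load(resp)
```

Calling `fetch_status("https://satellite.example.com")` (placeholder host) would return the parsed response, which can then be pasted into the bug with `json.dumps(..., indent=4)`.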

Comment 7 Imaan 2021-03-23 09:52:53 UTC
Hello,

GET /pulp/api/v3/status/ : 

HTTP 200 OK
Allow: GET, HEAD, OPTIONS
Content-Type: application/json
Vary: Accept

{
    "versions": [
        {
            "component": "pulpcore",
            "version": "3.7.3"
        },
        {
            "component": "pulp_2to3_migration",
            "version": "0.9.1"
        },
        {
            "component": "pulp_rpm",
            "version": "3.9.0"
        },
        {
            "component": "pulp_file",
            "version": "1.3.0"
        },
        {
            "component": "pulp_container",
            "version": "2.1.0"
        },
        {
            "component": "pulp_certguard",
            "version": "1.0.3"
        }
    ],
    "online_workers": [
        {
            "pulp_created": "2021-03-22T13:32:20.693647Z",
            "pulp_href": "/pulp/api/v3/workers/8b6968d2-58e0-4844-bfd0-44c7e7b73763/",
            "name": "resource-manager",
            "last_heartbeat": "2021-03-23T09:47:57.228500Z"
        }
    ],
    "online_content_apps": [
        {
            "name": "715@satellite",
            "last_heartbeat": "2021-03-23T09:48:41.536370Z"
        },
        {
            "name": "727@satellite",
            "last_heartbeat": "2021-03-23T09:48:41.542385Z"
        }
    ],
    "database_connection": {
        "connected": true
    },
    "redis_connection": {
        "connected": true
    },
    "storage": {
        "total": 246192013312,
        "used": 5944152064,
        "free": 240247861248
    }
}


2. version/snap details - 

katello_version = katello-3.18.1-3.el7sat.noarch
satellite_version = satellite-6.9.0-1.el7sat.noarch

Comment 8 Brian Bouterse 2021-03-23 14:07:00 UTC
Imaan, I can see your installation has a resource manager started, but not pulp workers. I believe this is due to a known issue of the katello commands not restarting the pulp workers in some cases. https://bugzilla.redhat.com/show_bug.cgi?id=1907801

Until that is fixed, the workaround I recommend is this: every time you restart services, also run `systemctl start pulpcore-worker@1 pulpcore-worker@2`. QE has used this successfully.
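
A minimal Python sketch of that workaround, assuming it runs as root on the Satellite server. The helper names are hypothetical, and the default of two workers is taken from the command above (the reproduction steps in this report started four), so pass the count that matches your system.

```python
import subprocess


def worker_units(count):
    """Return the unit names pulpcore-worker@1 .. pulpcore-worker@<count>."""
    return [f"pulpcore-worker@{i}" for i in range(1, count + 1)]


def restart_with_workers(count=2, run=subprocess.run):
    """Restart Satellite services, then explicitly start the Pulp workers.

    `run` is injectable so the command sequence can be inspected without
    touching the system; by default it is subprocess.run.
    """
    run(["satellite-maintain", "service", "restart"], check=True)
    run(["systemctl", "start", *worker_units(count)], check=True)
```

Starting a unit that is already active is harmless under systemd, so running the second command after every restart is safe even when the workers did come back on their own.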

Also, here's a look at my dev system, which shows more entries in online_workers. When your system is healthy, it should show that too:

```
{
	"versions": [{
		"component": "core",
		"version": "3.12.0.dev"
	}, {
		"component": "file",
		"version": "1.7.0.dev"
	}],
	"online_workers": [{
		"pulp_created": "2021-03-23T14:04:56.806901Z",
		"pulp_href": "/pulp/api/v3/workers/87f5be7d-2c20-4b6d-87b1-395a01794af0/",
		"name": "159449.example.com",
		"last_heartbeat": "2021-03-23T14:04:56.818243Z"
	}, {
		"pulp_created": "2021-03-19T21:18:23.484040Z",
		"pulp_href": "/pulp/api/v3/workers/bbb75f54-7d80-499e-b660-f9e2c1753fad/",
		"name": "resource-manager",
		"last_heartbeat": "2021-03-23T14:04:56.909092Z"
	}, {
		"pulp_created": "2021-03-23T14:04:56.925023Z",
		"pulp_href": "/pulp/api/v3/workers/f94f1ae4-fefb-4b74-9260-5bea9a54799e/",
		"name": "159448.example.com",
		"last_heartbeat": "2021-03-23T14:04:56.953304Z"
	}],
	"online_content_apps": [{
		"name": "159452.example.com",
		"last_heartbeat": "2021-03-23T14:04:59.281581Z"
	}, {
		"name": "159460.example.com",
		"last_heartbeat": "2021-03-23T14:05:00.096379Z"
	}, {
		"name": "159457.example.com",
		"last_heartbeat": "2021-03-23T14:05:00.110513Z"
	}, {
		"name": "159455.example.com",
		"last_heartbeat": "2021-03-23T14:05:00.210949Z"
	}, {
		"name": "159458.example.com",
		"last_heartbeat": "2021-03-23T14:05:00.258613Z"
	}, {
		"name": "159453.example.com",
		"last_heartbeat": "2021-03-23T14:05:00.259160Z"
	}, {
		"name": "159459.example.com",
		"last_heartbeat": "2021-03-23T14:05:00.307559Z"
	}, {
		"name": "159456.example.com",
		"last_heartbeat": "2021-03-23T14:05:00.358097Z"
	}],
	"database_connection": {
		"connected": true
	},
	"redis_connection": {
		"connected": true
	},
	"storage": {
		"total": 42006183936,
		"used": 3660034048,
		"free": 36181942272
	}
}
```
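
The difference between the two status responses can be expressed as a small check. This is only a sketch, not how Katello performs its own check: it relies solely on the field names visible in the responses above (`online_workers`, `name`) and assumes that any entry not named `resource-manager` is a regular worker. The function names and the `minimum` parameter are hypothetical.

```python
def regular_workers(status):
    """Names of online workers other than the resource manager."""
    return [
        w.get("name")
        for w in status.get("online_workers", [])
        if w.get("name") != "resource-manager"
    ]


def workers_healthy(status, minimum=1):
    """True when a resource manager and at least `minimum` workers are online."""
    names = [w.get("name") for w in status.get("online_workers", [])]
    return "resource-manager" in names and len(regular_workers(status)) >= minimum


# Trimmed from the two responses in this report.
failing = {"online_workers": [{"name": "resource-manager"}]}
healthy = {"online_workers": [
    {"name": "159449.example.com"},
    {"name": "resource-manager"},
    {"name": "159448.example.com"},
]}

print(workers_healthy(failing))  # False
print(workers_healthy(healthy))  # True
```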

Comment 9 Imaan 2021-04-15 06:33:34 UTC
The workaround mentioned in comment 8 worked for the performance team. Thank you.

Comment 11 Amit Upadhye 2021-06-14 13:00:22 UTC
Hello Gaurav,

The issue is already fixed. Please test this on 0.8.1 (the most recent version) of foreman-maintain.

Thank You,
Amit Upadhye.

Comment 12 Gaurav Talreja 2021-06-24 09:29:04 UTC
Verified.

Tested on Satellite 6.9.3 Snap 5
Version: rubygem-foreman_maintain-0.8.2-1.el7sat.noarch

Observation:
Followed all steps of the pulp2to3 migration on Satellite 6.9, and after the service restart pulpcore-worker@* was running.

Note: This needed to be tested on Satellite 6.9 for the pulp2to3 migration, even though target_milestone is set to 6.10.

Comment 15 errata-xmlrpc 2021-11-16 13:48:05 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Satellite 6.10 Satellite Maintenance Release), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:4697