Bug 1940356
| Summary: | repo sync failed with "Could not create the repository:\n There was an issue with the backend service pulp3" error | ||
|---|---|---|---|
| Product: | Red Hat Satellite | Reporter: | Imaan <ikaur> |
| Component: | Satellite Maintain | Assignee: | Amit Upadhye <aupadhye> |
| Status: | CLOSED ERRATA | QA Contact: | Gaurav Talreja <gtalreja> |
| Severity: | medium | Docs Contact: | |
| Priority: | unspecified | ||
| Version: | 6.9.0 | CC: | apatel, aupadhye, bmbouter, dalley, kgaikwad, smallamp, ttereshc |
| Target Milestone: | 6.10.0 | Keywords: | Performance, Triaged |
| Target Release: | Unused | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | rubygem-foreman_maintain-0.8.1 | Doc Type: | If docs needed, set a value |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2021-11-16 13:48:05 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
|
Description
Imaan
2021-03-18 09:52:55 UTC
Daniel, I can't agree or disagree; there is not enough info here, imo. I read the steps to reproduce this way: you start services, then run a sync, and it fails. It's not clear whether there was high load or not, or at which point the Katello(?) sync failed. For Pulp it can be the sync, or it can be the publish.

@Imaan, did the sync fail immediately or after some time under heavy load? What exactly were you syncing and what else was happening on the machine? Just one repo? How big? Many syncs in parallel? Any context will be helpful. Thanks.

Can you look into two things?
1) On the system where this is happening, can you post the Pulp Status API? That is the response from Pulp at /pulp/api/v3/status/
2) Can you confirm you're using snap_6.9.0_18.0? I'm asking because I want to make sure your Pulp worker timeouts are tolerant of slow I/O on spinning disks. That Satellite bug is tracked here: https://bugzilla.redhat.com/show_bug.cgi?id=1929344

Hello,
GET /pulp/api/v3/status/ :
HTTP 200 OK
Allow: GET, HEAD, OPTIONS
Content-Type: application/json
Vary: Accept
{
"versions": [
{
"component": "pulpcore",
"version": "3.7.3"
},
{
"component": "pulp_2to3_migration",
"version": "0.9.1"
},
{
"component": "pulp_rpm",
"version": "3.9.0"
},
{
"component": "pulp_file",
"version": "1.3.0"
},
{
"component": "pulp_container",
"version": "2.1.0"
},
{
"component": "pulp_certguard",
"version": "1.0.3"
}
],
"online_workers": [
{
"pulp_created": "2021-03-22T13:32:20.693647Z",
"pulp_href": "/pulp/api/v3/workers/8b6968d2-58e0-4844-bfd0-44c7e7b73763/",
"name": "resource-manager",
"last_heartbeat": "2021-03-23T09:47:57.228500Z"
}
],
"online_content_apps": [
{
"name": "715@satellite",
"last_heartbeat": "2021-03-23T09:48:41.536370Z"
},
{
"name": "727@satellite",
"last_heartbeat": "2021-03-23T09:48:41.542385Z"
}
],
"database_connection": {
"connected": true
},
"redis_connection": {
"connected": true
},
"storage": {
"total": 246192013312,
"used": 5944152064,
"free": 240247861248
}
}
2. version/snap details -
katello_version = katello-3.18.1-3.el7sat.noarch
satellite_version = satellite-6.9.0-1.el7sat.noarch
Imaan, I can see your installation has a resource manager started, but no Pulp workers. I believe this is due to a known issue where the katello commands do not restart the Pulp workers in some cases: https://bugzilla.redhat.com/show_bug.cgi?id=1907801

Until that is fixed, the workaround I recommend is this: every time you restart services, also run `systemctl start pulpcore-worker@1 pulpcore-worker@2`. QE has used this successfully.

Also, here's a look at my dev system, which shows more entries in online_workers. When your system is healthy it should show that too:

```
{
  "versions": [
    { "component": "core", "version": "3.12.0.dev" },
    { "component": "file", "version": "1.7.0.dev" }
  ],
  "online_workers": [
    {
      "pulp_created": "2021-03-23T14:04:56.806901Z",
      "pulp_href": "/pulp/api/v3/workers/87f5be7d-2c20-4b6d-87b1-395a01794af0/",
      "name": "159449.example.com",
      "last_heartbeat": "2021-03-23T14:04:56.818243Z"
    },
    {
      "pulp_created": "2021-03-19T21:18:23.484040Z",
      "pulp_href": "/pulp/api/v3/workers/bbb75f54-7d80-499e-b660-f9e2c1753fad/",
      "name": "resource-manager",
      "last_heartbeat": "2021-03-23T14:04:56.909092Z"
    },
    {
      "pulp_created": "2021-03-23T14:04:56.925023Z",
      "pulp_href": "/pulp/api/v3/workers/f94f1ae4-fefb-4b74-9260-5bea9a54799e/",
      "name": "159448.example.com",
      "last_heartbeat": "2021-03-23T14:04:56.953304Z"
    }
  ],
  "online_content_apps": [
    { "name": "159452.example.com", "last_heartbeat": "2021-03-23T14:04:59.281581Z" },
    { "name": "159460.example.com", "last_heartbeat": "2021-03-23T14:05:00.096379Z" },
    { "name": "159457.example.com", "last_heartbeat": "2021-03-23T14:05:00.110513Z" },
    { "name": "159455.example.com", "last_heartbeat": "2021-03-23T14:05:00.210949Z" },
    { "name": "159458.example.com", "last_heartbeat": "2021-03-23T14:05:00.258613Z" },
    { "name": "159453.example.com", "last_heartbeat": "2021-03-23T14:05:00.259160Z" },
    { "name": "159459.example.com", "last_heartbeat": "2021-03-23T14:05:00.307559Z" },
    { "name": "159456.example.com", "last_heartbeat": "2021-03-23T14:05:00.358097Z" }
  ],
  "database_connection": { "connected": true },
  "redis_connection": { "connected": true },
  "storage": { "total": 42006183936, "used": 3660034048, "free": 36181942272 }
}
```

The workaround mentioned in comment 8 worked for the performance team. Thank you.

Hello Gaurav,
The issue is already fixed. Please test this on foreman-maintain 0.8.1 (the most recent version).
Thank you,
Amit Upadhye.

Verified.
Tested on Satellite 6.9.3 Snap 5
Version: rubygem-foreman_maintain-0.8.2-1.el7sat.noarch
Observation: Followed all steps of the pulp2to3 migration on Satellite 6.9, and after the service restart pulpcore-worker@* was running.
Note: This needed to be tested on Satellite 6.9 for the pulp2to3 migration, even though target_milestone is set to 6.10.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Satellite 6.10 Satellite Maintenance Release), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHBA-2021:4697 |
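For anyone triaging a similar report, the diagnosis in this thread (a resource manager online but no Pulp workers) can be read straight from the `online_workers` section of the status response. The following is a minimal sketch, not part of any fix shipped for this bug: the helper name `task_workers` is hypothetical, and it assumes the resource manager always reports under the name `resource-manager`, as it does in both outputs posted above. Fetching the JSON from a live `/pulp/api/v3/status/` endpoint is left out; the two samples are trimmed copies of the responses in this thread.

```python
import json


def task_workers(status: dict) -> list:
    """Return the names of online workers other than the resource manager.

    Assumption: the resource manager is always named "resource-manager",
    as seen in the status outputs posted in this bug.
    """
    return [
        worker["name"]
        for worker in status.get("online_workers", [])
        if worker.get("name") != "resource-manager"
    ]


# Trimmed copy of the status reported on the affected Satellite.
reported = json.loads("""
{
  "online_workers": [
    {"name": "resource-manager",
     "last_heartbeat": "2021-03-23T09:47:57.228500Z"}
  ]
}
""")

# Trimmed copy of the status from the healthy dev system.
healthy = json.loads("""
{
  "online_workers": [
    {"name": "159449.example.com",
     "last_heartbeat": "2021-03-23T14:04:56.818243Z"},
    {"name": "resource-manager",
     "last_heartbeat": "2021-03-23T14:04:56.909092Z"},
    {"name": "159448.example.com",
     "last_heartbeat": "2021-03-23T14:04:56.953304Z"}
  ]
}
""")

for label, status in (("reported", reported), ("healthy", healthy)):
    workers = task_workers(status)
    if workers:
        print(f"{label}: {len(workers)} task worker(s) online: {workers}")
    else:
        print(f"{label}: no task workers online; "
              "try starting pulpcore-worker@1 and pulpcore-worker@2")
```

An empty result for the reported system matches the symptom here: syncs have no worker to pick them up until the `pulpcore-worker@` units are started.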