Bug 812499 - Unable to remove "Red Hat Content Provider" if something goes wrong
Status: CLOSED NOTABUG
Product: Red Hat Satellite 6
Classification: Red Hat
Component: WebUI
Version: 6.0.0
Hardware: Unspecified
OS: Unspecified
Priority: low
Severity: high
Target Milestone: Unspecified
Target Release: --
Assigned To: Katello Bug Bin
QA Contact: Katello QA List
Keywords: Triaged
Depends On:
Blocks:
Reported: 2012-04-13 23:12 EDT by Justin Clift
Modified: 2015-07-13 00:35 EDT
CC List: 5 users

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2014-03-12 18:56:37 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments
Screenshot showing the error message and missing view contents. (170.53 KB, image/png)
2012-04-13 23:12 EDT, Justin Clift


External Trackers
Tracker ID Priority Status Summary Last Updated
Red Hat Bugzilla 846828 None None None 2012-11-12 09:24:48 EST

Description Justin Clift 2012-04-13 23:12:29 EDT
Created attachment 577437
Screenshot showing the error message and missing view contents.

Description of problem:

  A CloudForms SE server suffered a power outage whilst it was adding a RH Content Provider manifest.

  After the box restarted (filesystem recovery was ok), whenever the user goes to the "Red Hat Content Provider" page, the normal contents are missing.  Instead, the page shows only a "500 Internal Server Error".

  This seems to leave no way, through the web UI, to remove a (likely) broken RH Content Provider definition.

  Screenshot attached.

Version-Release number of selected component (if applicable):

  It's a recent puddle build:

  $ rpm -qa | grep -i katello
  katello-selinux-0.1.10-1.el6.noarch
  katello-0.1.309-1.el6.noarch
  katello-cli-common-0.1.107-1.el6.noarch
  katello-candlepin-cert-key-pair-1.0-1.noarch
  katello-qpid-broker-key-pair-1.0-1.noarch
  katello-cli-0.1.107-1.el6.noarch
  katello-common-0.1.309-1.el6.noarch
  katello-glue-candlepin-0.1.309-1.el6.noarch
  katello-glue-foreman-0.1.309-1.el6.noarch
  katello-certs-tools-1.0.4-1.el6.noarch
  katello-configure-0.1.107-1.el6.noarch
  katello-glue-pulp-0.1.309-1.el6.noarch
  katello-all-0.1.309-1.el6.noarch
  katello-qpid-client-key-pair-1.0-1.noarch


How reproducible:

  Unknown.

Steps to Reproduce:
1. Begin adding a RH Content Provider manifest... whilst in the installation process (waiting for Katello to finish processing the manifest), suffer a power outage.
2. Restart the box.
3. Go to the RH Content Providers tab.  It shows "500 Internal Server Error" instead of the normal contents.


Expected results:

  Normal RH Content Provider tab contents should be there.

Additional info:

  I took a snapshot of the disk for this BZ, so can extract logs or whatever if needed.
Comment 1 James Laska 2012-04-18 11:11:28 EDT
Nice bug!  The concerning thing with this bug is there is no workaround.

Escalating as a blocker for visibility.  If we can identify a workaround, I support resolving this in a future release, and adding a 1.0 release note.
Comment 2 Mike McCune 2012-04-18 13:45:52 EDT
FYI, you cannot delete the Red Hat Provider.  It is baked into every org in CFSE and is created at org creation time.

You will have to reset your database using:

/usr/share/katello/script/katello-reset-dbs

or restore from backup.

There is no way to remove the Red Hat Content Provider, even during normal operations.  Since it is hard to predict the state of your database after an outage like the one you experienced, it would be hard to know exactly what it would take to correct the situation your DB is in.

In 1.1 we could look into how to handle this type of situation better but there isn't a whole lot we can do for 1.0.*
Comment 3 Mike McCune 2012-05-09 11:36:37 EDT
For 1.1 we should look into better transaction management for long-running jobs, so that they can roll back and recover from situations like this, where there is a power outage or some massive breakage during execution.

Let's investigate how to clean up and recover broken data.
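
To illustrate the kind of transaction management meant here, below is a minimal sketch in Python. It is not Katello code, and nothing in this bug says how Katello would implement it. It assumes a long-running job can be split into named steps, each with an undo action, and that a small journal file on disk survives a power outage. The class name JournaledJob and the (name, do, undo) step shape are hypothetical.

```python
import json
import os


class JournaledJob:
    """Runs named steps in order and records each completed step on disk.

    Hypothetical illustration only: the journal lets a later process see
    which steps finished before a crash, so they can be undone or skipped.
    """

    def __init__(self, journal_path):
        self.journal_path = journal_path

    def _load(self):
        # Names of the steps that completed, oldest first.
        if not os.path.exists(self.journal_path):
            return []
        with open(self.journal_path) as fh:
            return json.load(fh)

    def _save(self, done):
        # Write a temp file, then rename, so the journal is never half-written.
        tmp = self.journal_path + ".tmp"
        with open(tmp, "w") as fh:
            json.dump(done, fh)
            fh.flush()
            os.fsync(fh.fileno())
        os.replace(tmp, self.journal_path)

    def _clear(self):
        if os.path.exists(self.journal_path):
            os.remove(self.journal_path)

    def run(self, steps):
        """steps is a list of (name, do, undo) tuples.

        Steps already listed in the journal are skipped, so calling run()
        again after a crash resumes where the job stopped. If a step raises,
        every completed step is undone and the exception is re-raised.
        """
        done = self._load()
        try:
            for name, do, _undo in steps:
                if name in done:
                    continue
                do()
                done.append(name)
                self._save(done)
        except Exception:
            self.rollback(steps)
            raise
        self._clear()

    def rollback(self, steps):
        """Undo the journaled steps in reverse order, then drop the journal."""
        undo_by_name = {name: undo for name, _do, undo in steps}
        done = self._load()
        while done:
            undo_by_name[done[-1]]()
            done.pop()
            self._save(done)
        self._clear()
```

After a restart, a leftover journal file means the job was interrupted: calling rollback(steps) undoes what had completed, and calling run(steps) resumes instead. A step that was in flight when the power failed is not in the journal, so in this sketch each do action would have to be safe to repeat.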
Comment 4 Justin Clift 2012-05-09 12:49:17 EDT
If there's a way to reset (just) the RH Provider (without everything else), that might make for a decent workaround.  The admin would just need to do (in this case) the manifest import again.

Though, it kind of sounds like this would be future work, with Mike's mentioned "more transactional" approach being a more complete (likely better) goal. ;)
Comment 6 Lukas Zapletal 2013-02-05 04:48:47 EST
There is a simple script/tool which compares the repositories in Katello with those in Pulp and prints what needs to be deleted to put Katello and Pulp back in sync. You can run it like this:

RAILS_ENV=production /usr/share/katello/script/katello-check

We can extend this script if the output is not helpful, so GSS can take action when this happens. Please note this tool is not documented and is not intended to be used by users.

Ping me if it does not work or makes no sense to you, and I can investigate the box directly and extend this script to cover this special case.
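
For readers unfamiliar with this kind of check, the comparison can be sketched as a set difference. This is only an illustration in Python, not the actual katello-check implementation; the function name reconcile and the repository IDs in the usage lines are made up.

```python
def reconcile(katello_repos, pulp_repos):
    """Return the repository IDs that exist on only one side.

    Hypothetical sketch: both arguments are plain iterables of repo IDs.
    """
    katello = set(katello_repos)
    pulp = set(pulp_repos)
    return {
        "only_in_katello": sorted(katello - pulp),
        "only_in_pulp": sorted(pulp - katello),
    }


# Made-up IDs, for illustration only.
report = reconcile(
    ["rhel-6-server", "rhel-6-optional"],
    ["rhel-6-server", "orphaned-repo"],
)
for repo_id in report["only_in_pulp"]:
    print("candidate for deletion in Pulp:", repo_id)
for repo_id in report["only_in_katello"]:
    print("missing from Pulp:", repo_id)
```

The sketch only reports differences and deletes nothing, leaving the decision to whoever investigates the box.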
Comment 7 Lukas Zapletal 2013-02-11 10:29:55 EST
I can't reproduce. There is no general advice on how to recover: there can be so many states during things like a manifest import. It depends on when you suffer the power failure. There can be data inconsistency in Candlepin or Pulp or both.

I really cannot investigate all the possibilities. We need to improve our orchestration code and totally change our approach to orchestration. If you encounter any data inconsistency, we need access to the box to investigate the particular case.

So the general advice is: back up and recover in this case.
Comment 9 Lukas Zapletal 2014-03-12 06:26:58 EDT
Together with org deletion, this is still relevant, but we are changing our orchestration layer and this should be re-evaluated after the migration is done.
Comment 10 Bryan Kearney 2014-03-12 18:56:37 EDT
Providers have been hidden. This is no longer relevant.
