Bug 812499 - Unable to remove "Red Hat Content Provider" if something goes wrong
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat Satellite
Classification: Red Hat
Component: WebUI
Version: 6.0.0
Hardware: Unspecified
OS: Unspecified
Priority: low
Severity: high
Target Milestone: Unspecified
Assignee: Katello Bug Bin
QA Contact: Katello QA List
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2012-04-14 03:12 UTC by Justin Clift
Modified: 2015-07-13 04:35 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2014-03-12 22:56:37 UTC
Target Upstream Version:
Embargoed:


Attachments
Screenshot showing the error message and missing view contents. (170.53 KB, image/png)
2012-04-14 03:12 UTC, Justin Clift


Links
System ID Private Priority Status Summary Last Updated
Red Hat Bugzilla 846828 0 unspecified CLOSED [RFE] Uploading a new manifest from a different distributor fails 2021-02-22 00:41:40 UTC

Internal Links: 875883

Description Justin Clift 2012-04-14 03:12:29 UTC
Created attachment 577437
Screenshot showing the error message and missing view contents.

Description of problem:

  A CloudForms SE server suffered a power outage whilst it was adding a RH Content Provider manifest.

  After the box restarted (filesystem recovery was ok), the normal contents are missing whenever the user goes to the "Red Hat Content Provider" page.  The page shows a "500 Internal Server Error" instead.

  This seems to leave no way, through the web UI, to remove a (likely) broken RH Content Provider definition.

  Screenshot attached.

Version-Release number of selected component (if applicable):

  It's a recent puddle build:

  $ rpm -qa | grep -i katello
  katello-selinux-0.1.10-1.el6.noarch
  katello-0.1.309-1.el6.noarch
  katello-cli-common-0.1.107-1.el6.noarch
  katello-candlepin-cert-key-pair-1.0-1.noarch
  katello-qpid-broker-key-pair-1.0-1.noarch
  katello-cli-0.1.107-1.el6.noarch
  katello-common-0.1.309-1.el6.noarch
  katello-glue-candlepin-0.1.309-1.el6.noarch
  katello-glue-foreman-0.1.309-1.el6.noarch
  katello-certs-tools-1.0.4-1.el6.noarch
  katello-configure-0.1.107-1.el6.noarch
  katello-glue-pulp-0.1.309-1.el6.noarch
  katello-all-0.1.309-1.el6.noarch
  katello-qpid-client-key-pair-1.0-1.noarch


How reproducible:

  Unknown.

Steps to Reproduce:
1. Begin adding a RH Content Provider manifest.  Whilst in the installation process (waiting for Katello to finish processing the manifest), suffer a power outage.
2. Restart the box.
3. Go to the RH Content Providers tab.  It shows "500 Internal Server Error" instead of the normal contents.


Expected results:

  Normal RH Content Provider tab contents should be there.

Additional info:

  I took a snapshot of the disk for this BZ, so can extract logs or whatever if needed.

Comment 1 James Laska 2012-04-18 15:11:28 UTC
Nice bug!  The concerning thing with this bug is there is no workaround.

Escalating as a blocker for visibility.  If we can identify a workaround, I support resolving this in a future release, and adding a 1.0 release note.

Comment 2 Mike McCune 2012-04-18 17:45:52 UTC
FYI, you cannot delete the Red Hat Provider.  It is baked into every org in CFSE and is created at org creation time.

You will have to reset your database using:

/usr/share/katello/script/katello-reset-dbs

or restore from backup.

There is no way to remove the Red Hat Content Provider, even during normal operations.  Since it is hard to predict the state of your database after an outage like the one you experienced, it is hard to know exactly what it would take to correct the situation your DB is in.

In 1.1 we could look into how to handle this type of situation better, but there isn't a whole lot we can do for 1.0.*

Comment 3 Mike McCune 2012-05-09 15:36:37 UTC
For 1.1 we should look into better transaction management for long-running jobs, so that they can roll back and recover from situations like this, where there is a power outage or some massive breakage during execution.

Let's investigate how to clean up and recover broken data.
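
As an illustration only of the rollback idea mentioned above (this is not Katello's orchestration code, and the name run_with_rollback is hypothetical), a long-running job can be modelled as a list of steps, each paired with an undo action. If a step fails, the steps already completed are undone in reverse order:

```python
def run_with_rollback(steps):
    """Run (do, undo) pairs in order.

    If any do() raises, call undo() for every step that already
    completed, newest first, then re-raise the original error.
    """
    completed_undos = []
    try:
        for do, undo in steps:
            do()
            completed_undos.append(undo)
    except Exception:
        for undo in reversed(completed_undos):
            undo()
        raise
```

This sketch only covers failures inside a running process. Surviving a power outage would additionally need progress to be recorded durably, so that a restart can tell which steps had completed.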

Comment 4 Justin Clift 2012-05-09 16:49:17 UTC
If there's a way to reset (just) the RH Provider (without everything else), that might make for a decent workaround.  The admin would just need to do (in this case) the manifest import again.

Though, it kind of sounds like this would be future work, with Mike's mentioned "more transactional" approach being a more complete (likely better) goal. ;)

Comment 6 Lukas Zapletal 2013-02-05 09:48:47 UTC
There is a simple script/tool that compares the repositories in Katello with those in Pulp and prints what needs to be deleted to put Katello and Pulp back in sync. You can run it like this:

RAILS_ENV=production /usr/share/katello/script/katello-check

We can extend this script if the output is not helpful so GSS can take actions when this happens. Please note this tool is not documented and it is not intended to be used by users.

Ping me if it does not work or makes no sense to you, and I can investigate the box directly and extend this script to cover this special case.
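
For readers unfamiliar with this kind of consistency check, the comparison amounts to a set difference between the repository IDs each side knows about. The sketch below is not the katello-check script and does not query Katello or Pulp; the function name find_orphans and the sample IDs are hypothetical:

```python
def find_orphans(katello_repo_ids, pulp_repo_ids):
    """Return the repository IDs that exist on only one side."""
    katello = set(katello_repo_ids)
    pulp = set(pulp_repo_ids)
    return {
        "only_in_katello": sorted(katello - pulp),
        "only_in_pulp": sorted(pulp - katello),
    }


if __name__ == "__main__":
    # Made-up IDs, purely for illustration.
    report = find_orphans(["repo-a", "repo-b"], ["repo-b", "repo-c"])
    for side, repo_ids in report.items():
        for repo_id in repo_ids:
            print(f"{side}: {repo_id}")
```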

Comment 7 Lukas Zapletal 2013-02-11 15:29:55 UTC
I can't reproduce this. There is no general advice on how to recover, because an operation like manifest import can be interrupted in so many different states. It depends on when the power failure hits. The data inconsistency can be in Candlepin, in Pulp, or in both.

I really cannot investigate all the possibilities. We need to improve our orchestration code and totally change our approach to orchestration. If you encounter any data inconsistency, we need access to the box to investigate the particular case.

So the general advice in this case is: back up and recover.

Comment 9 Lukas Zapletal 2014-03-12 10:26:58 UTC
Together with org deletion, this is still relevant, but we are changing our orchestration layer and this should be re-evaluated after the migration is done.

Comment 10 Bryan Kearney 2014-03-12 22:56:37 UTC
Providers have been hidden. This is no longer relevant.

