Bug 1388201

Summary: candlepin content_id changes cause manifest import/refresh failure
Product: Red Hat Satellite Reporter: sthirugn <sthirugn>
Component: Candlepin Assignee: Barnaby Court <bcourt>
Status: CLOSED ERRATA QA Contact: Sanket Jagtap <sjagtap>
Severity: urgent Docs Contact:
Priority: urgent    
Version: 6.3.0 CC: bcourt, bkearney, cdonnell, chrobert, egolov, fgarciad, hannsj_uhl, kdixon, ktordeur, ldelouw, mruzicka, mtenheuv, ohadlevy, rvdwees, shughes, sjagtap, smajumda, wburrows, xdmoon, zhunting
Target Milestone: Unspecified Keywords: PrioBumpGSS
Target Release: Unused   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: candlepin-0.9.54.13-1 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
: 1388234 1388236 1388241 1388551 1393439 1393442 1393444 Environment:
Last Closed: 2016-11-10 08:14:48 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 1388234    
Bug Blocks: 1385841, 1388551, 1393442    
Attachments:
Description                    Flags
production.log                 none
candlepin error.log            none
Default org manifest upload    none
OrgA manifest upload           none

Description sthirugn@redhat.com 2016-10-24 17:54:57 UTC
Description of problem:
Manifest upload on a second org failed

Version-Release number of selected component (if applicable):
# rpm -qa | grep satellite
satellite-installer-6.3.0-1.el7sat.noarch
satellite-cli-6.3.0-1.0.git.7.fb12bf2.el7sat.noarch
tfm-rubygem-foreman_theme_satellite-1.0.0-1.git.2.94b76fc.el7.noarch
satellite-6.3.0-1.0.git.7.fb12bf2.el7sat.noarch

How reproducible:
Always

Steps to Reproduce:
1. Upload a manifest in org - `Default Organization`
2. Create a second manifest from the same portal account and upload to second org.

Actual results:
Manifest upload failed.

See attached errors from:
* production.log:
2016-10-24 12:50:04  [app] [E] Error during manifest import: {"displayMessage"=>"Failed to import archive", "requestUuid"=>"398feb37-f6bd-4709-998a-630f0e8f4a95"}
2016-10-24 12:50:04  [foreman-tasks/action] [E] Failed to import archive (Katello::Errors::CandlepinError)


* /var/log/candlepin/error.log:
2016-10-24 12:50:04,662 [thread=http-bio-8443-exec-7] [req=398feb37-f6bd-4709-998a-630f0e8f4a95, org=secondorg] ERROR org.hibernate.engine.jdbc.spi.SqlExceptionHelper - ERROR: duplicate key value violates unique constraint "cp_content_label_key"
  Detail: Key (label)=(rhel-7-server-containers) already exists.
2016-10-24 12:50:04,693 [thread=http-bio-8443-exec-7] [req=398feb37-f6bd-4709-998a-630f0e8f4a95, org=secondorg] ERROR org.candlepin.sync.Importer - Failed to import archive
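
The constraint violation in the log above can be reproduced in miniature. The following is a minimal sketch, not Candlepin's actual schema or import code: it assumes Python's built-in sqlite3 as a stand-in for PostgreSQL, a simplified two-column cp_content table, and made-up content ids (1001, 2002). Going by the bug summary, it also assumes that the second manifest carries the same label under a different content id.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified, hypothetical stand-in for cp_content: label must be unique.
conn.execute("CREATE TABLE cp_content (id TEXT PRIMARY KEY, label TEXT UNIQUE)")


def import_manifest(conn, rows):
    """Insert all content rows in one transaction; roll back on any failure."""
    with conn:  # commits on success, rolls back if an exception is raised
        conn.executemany("INSERT INTO cp_content (id, label) VALUES (?, ?)", rows)


# First org: the label is new, so the import succeeds.
import_manifest(conn, [("1001", "rhel-7-server-containers")])

# Second org: same label under a different (made-up) id, so the unique
# constraint on label rejects the row and the whole import is rolled back.
try:
    import_manifest(conn, [("2002", "rhel-7-server-containers")])
    second_import_failed = False
except sqlite3.IntegrityError as exc:
    second_import_failed = True
    print("Failed to import archive:", exc)

row_count = conn.execute("SELECT COUNT(*) FROM cp_content").fetchone()[0]
```

Only the first organization's row remains after the failed import, which matches the symptom reported here: the first org keeps working while the second upload is rejected.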


Expected results:
Manifest upload works successfully.

Additional info:

Comment 1 sthirugn@redhat.com 2016-10-24 17:55:18 UTC
Created attachment 1213534 [details]
production.log

Comment 2 sthirugn@redhat.com 2016-10-24 17:55:40 UTC
Created attachment 1213535 [details]
candlepin error.log

Comment 7 Matt Ruzicka 2016-10-24 18:21:45 UTC
Would we want to make sure users run a backup before modifying the DB as a "Step 0" or is it reasonably safe?

Comment 8 sthirugn@redhat.com 2016-10-24 18:26:51 UTC
It appears that the workaround mentioned in Comment 6 can break linkages. The candlepin team is working on a fix.

Comment 10 Barnaby Court 2016-10-24 19:28:09 UTC
A safer alternative would be the following:

Step 1:
Remove the uniqueness constraint on the cp_content label column and change it to a regular index.

# sudo su - postgres
# psql
postgres=# \connect candlepin
candlepin=# alter table cp_content drop constraint cp_content_label_key;
candlepin=# create index cp_content_label_key on cp_content (label);

Step 2: upload the second org manifest again

Step 3: refresh the first org manifest just to make sure it works.
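
The effect of Step 1 can be illustrated with a small sketch. It is an illustration under stated assumptions, not the PostgreSQL procedure itself: Python's built-in sqlite3 stands in for PostgreSQL, the table is a simplified cp_content, and the ids 1001 and 2002 are made up. SQLite has no ALTER TABLE ... DROP CONSTRAINT, so uniqueness is modeled here as a unique index that is then swapped for a regular one of the same name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cp_content (id TEXT PRIMARY KEY, label TEXT)")
# Starting state: uniqueness on label, modeled as a unique index.
conn.execute("CREATE UNIQUE INDEX cp_content_label_key ON cp_content (label)")
conn.execute("INSERT INTO cp_content VALUES ('1001', 'rhel-7-server-containers')")

# Step 1: remove the uniqueness rule and keep a regular index for lookups.
conn.execute("DROP INDEX cp_content_label_key")
conn.execute("CREATE INDEX cp_content_label_key ON cp_content (label)")

# Step 2: a row with the same label but a different id is now accepted.
conn.execute("INSERT INTO cp_content VALUES ('2002', 'rhel-7-server-containers')")
conn.commit()

ids = [
    row[0]
    for row in conn.execute(
        "SELECT id FROM cp_content WHERE label = ? ORDER BY id",
        ("rhel-7-server-containers",),
    )
]
print(ids)
```

Keeping an index under the same name preserves lookup performance on label, but the object is no longer a constraint, which matters for any later migration that expects to drop a constraint with that name.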

Comment 17 Sanket Jagtap 2016-11-02 09:19:06 UTC
Build : Satellite 6.2.4 Snap 1

rpm -qa | grep satellite
tfm-rubygem-foreman_theme_satellite-0.1.33-1.el7sat.noarch
satellite-cli-6.2.4-1.0.el7sat.noarch
satellite-installer-6.2.0.12-1.el7sat.noarch
satellite-6.2.4-1.0.el7sat.noarch

Verification steps:
1. Upload a manifest in org - `Default Organization`
2. Create a second manifest from the same portal account and upload it to the second org - orgA.

The 2 manifests for the 2 organizations were uploaded successfully

Please see the screenshots

Comment 18 Sanket Jagtap 2016-11-02 09:20:19 UTC
Created attachment 1216410 [details]
Default org manifest upload

Comment 19 Sanket Jagtap 2016-11-02 09:21:05 UTC
Created attachment 1216411 [details]
OrgA manifest upload

Comment 20 Ohad Levy 2016-11-03 14:05:40 UTC
Follow-up question: I've applied the workaround, but now the upgrade procedure fails:
Upgrade Step: migrate_candlepin...
[ INFO 2016-11-03 14:04:29 verbose] Upgrade Step: migrate_candlepin...

########## ERROR ############
Error running command: liquibase --driver=org.postgresql.Driver --classpath=/usr/share/java/postgresql-jdbc.jar:/var/lib/tomcat/webapps/candlepin/WEB-INF/classes/ --changeLogFile=db/changelog/changelog-update.xml --url=jdbc:postgresql:candlepin --username=candlepin  --password=UwAUXGvbHF3xAQcDeNsXSLah5bo5DoSb migrate -Dcommunity=False
Status code: 65280
Command output: Liquibase update Failed: Migration failed for change set db/changelog/20161025100925-remove-unique-content-name-constraint.xml::20161025100925-1::awood:
     Reason: liquibase.exception.DatabaseException: Error executing SQL ALTER TABLE public.cp_content DROP CONSTRAINT cp_content_label_key: ERROR: constraint "cp_content_label_key" of relation "cp_content" does not exist

Migrating candlepin database
Traceback (most recent call last):
  File "/usr/share/candlepin/cpdb", line 245, in <module>
    dbsetup.update()
  File "/usr/share/candlepin/cpdb", line 69, in update
    self._run_liquibase("db/changelog/changelog-update.xml")
  File "/usr/share/candlepin/cpdb", line 92, in _run_liquibase
    self.community))
  File "/usr/share/candlepin/cpdb", line 38, in run_command
    error_out(command, status, output)
  File "/usr/share/candlepin/cpdb", line 46, in error_out
    raise Exception("Error running command")
Exception: Error running command

[ERROR 2016-11-03 14:04:38 verbose] 
########## ERROR ############
Error running command: liquibase --driver=org.postgresql.Driver --classpath=/usr/share/java/postgresql-jdbc.jar:/var/lib/tomcat/webapps/candlepin/WEB-INF/classes/ --changeLogFile=db/changelog/changelog-update.xml --url=jdbc:postgresql:candlepin --username=candlepin  --password=UwAUXGvbHF3xAQcDeNsXSLah5bo5DoSb migrate -Dcommunity=False
Status code: 65280
Command output: Liquibase update Failed: Migration failed for change set db/changelog/20161025100925-remove-unique-content-name-constraint.xml::20161025100925-1::awood:
     Reason: liquibase.exception.DatabaseException: Error executing SQL ALTER TABLE public.cp_content DROP CONSTRAINT cp_content_label_key: ERROR: constraint "cp_content_label_key" of relation "cp_content" does not exist

Migrating candlepin database
Traceback (most recent call last):
  File "/usr/share/candlepin/cpdb", line 245, in <module>
    dbsetup.update()
  File "/usr/share/candlepin/cpdb", line 69, in update
    self._run_liquibase("db/changelog/changelog-update.xml")
  File "/usr/share/candlepin/cpdb", line 92, in _run_liquibase
    self.community))
  File "/usr/share/candlepin/cpdb", line 38, in run_command
    error_out(command, status, output)
  File "/usr/share/candlepin/cpdb", line 46, in error_out
    raise Exception("Error running command")
Exception: Error running command

Upgrade step migrate_candlepin failed. Check logs for more information.


What is the recommended approach to resolve it?

Comment 22 Barnaby Court 2016-11-04 18:41:49 UTC
Hi, 

Today the simplest option is to recreate the old constraint. The upgrade fails because the migration finds the database in a state it does not expect.

If the installer had a way to check whether the manual patch was applied, there is a manual liquibase task that could be used to mark the offending changeset as already run.

http://www.liquibase.org/documentation/ant/marknextchangesetran_ant_task.html

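The manual SQL would be the reverse of the workaround. Because the workaround left cp_content_label_key as a regular index rather than a constraint, drop that index and re-add the unique constraint (this will fail if duplicate labels have been imported in the meantime):
drop index cp_content_label_key;
alter table cp_content add constraint cp_content_label_key unique (label);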
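
One way a wrapper script could recognize this situation is to look for the failing changeset in the Liquibase output quoted in Comment 20. The sketch below is an assumption-laden illustration, not part of the installer or cpdb: the function name diagnose and both regular expressions are hypothetical and were written only against the two log lines shown in this report.

```python
import re

# Matches "Migration failed for change set <file>::<id>::<author>:"
CHANGESET_RE = re.compile(
    r"Migration failed for change set "
    r"(?P<file>\S+?)::(?P<id>[^:\s]+)::(?P<author>[^:\s]+):"
)
# Matches the PostgreSQL error for dropping a constraint that is not there.
MISSING_CONSTRAINT_RE = re.compile(
    r'constraint "(?P<constraint>[^"]+)" of relation "(?P<table>[^"]+)" '
    r"does not exist"
)


def diagnose(output):
    """Return changeset and constraint details, or None if the text does not match."""
    changeset = CHANGESET_RE.search(output)
    missing = MISSING_CONSTRAINT_RE.search(output)
    if not (changeset and missing):
        return None
    return {**changeset.groupdict(), **missing.groupdict()}


# Sample text copied from the command output in Comment 20.
sample = (
    "Command output: Liquibase update Failed: Migration failed for change set "
    "db/changelog/20161025100925-remove-unique-content-name-constraint.xml"
    "::20161025100925-1::awood:\n"
    "     Reason: liquibase.exception.DatabaseException: Error executing SQL "
    "ALTER TABLE public.cp_content DROP CONSTRAINT cp_content_label_key: "
    'ERROR: constraint "cp_content_label_key" of relation "cp_content" '
    "does not exist\n"
)
result = diagnose(sample)
print(result)
```

A match on both patterns suggests the manual workaround was applied before the upgrade, so the constraint should be restored (or the changeset marked as run) before retrying.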

Comment 33 errata-xmlrpc 2016-11-10 08:14:48 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2016:2699