Bug 2069135

Summary: After restore from 6.10.2 (and older) backup to 6.10.3 candlepin is broken
Product: Red Hat Satellite Reporter: Lukas Pramuk <lpramuk>
Component: Satellite Maintain Assignee: Evgeni Golov <egolov>
Status: CLOSED ERRATA QA Contact: Lukas Pramuk <lpramuk>
Severity: medium Docs Contact:
Priority: unspecified    
Version: 6.10.3 CC: apatel, aupadhye, egolov, ehelms, gtalreja, kgaikwad, zhunting
Target Milestone: 6.11.0 Keywords: Triaged
Target Release: Unused   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: rubygem-foreman_maintain-1.0.9 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
: 2076294 (view as bug list) Environment:
Last Closed: 2022-07-05 14:34:49 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Lukas Pramuk 2022-03-28 10:57:23 UTC
Description of problem:
After restore from 6.10.2 (and older) backup to 6.10.3 candlepin is broken
Restoring from 6.10.3 backup to 6.10.3 is ok.
There seems to be a breaking change for candlepin in 6.10.3.


Version-Release number of selected component (if applicable):
Satellite 6.10.3
candlepin-4.0.15-1.el7sat.noarch


How reproducible:
deterministic

Steps to Reproduce:
1. Install 6.10.2 (using internal repo in order to be able to pin at 6.10.2)

2. Create/Have a 6.10.2 backup (or older)
# satellite-maintain backup offline /var/backup

3. Upgrade to 6.10.3 (used the CDN, as 6.10.3 is the latest released version)

3a. Check candlepin status after upgrade (so far good)
# hammer ping
...
candlepin:
    Status:          ok
    Server Response: Duration: 50ms
candlepin_auth:
    Status:          ok
    Server Response: Duration: 46ms
candlepin_events:
    Status:          ok
    message:         0 Processed, 0 Failed
    Server Response: Duration: 0ms

4. Restore from the 6.10.2 backup

# satellite-maintain restore /var/backup/satellite-backup-*
...
| All services started                                                [OK]
--------------------------------------------------------------------------------
Run daemon reload:                                                    [OK]
--------------------------------------------------------------------------------

5. Check candlepin status after restore from the backup
# hammer ping
...
candlepin:
    Status:          FAIL
    Server Response: Message: 404 Not Found
candlepin_auth:
    Status:          FAIL
    Server Response: Message: Katello::Errors::CandlepinNotRunning
candlepin_events:
    Status:          FAIL
    message:         Not running
    Server Response: Duration: 2ms

>>> candlepin app is not running and restarting the tomcat service doesn't help it recover !!!

Actual results:
candlepin app not running

Expected results:
candlepin app runs without errors

Additional info:

Mar 27, 2022 6:22:15 PM org.apache.catalina.core.StandardContext listenerStart
SEVERE: Exception sending context initialized event to listener instance of class org.candlepin.guice.CandlepinContextListener
com.google.inject.CreationException: Unable to create injector, see the following errors:

1) Error in custom provider, java.lang.NullPointerException
  while locating com.google.inject.persist.jpa.JpaPersistService
  while locating javax.persistence.EntityManager
  at org.candlepin.policy.js.JsRunnerProvider.<init>(JsRunnerProvider.java:85)
  at org.candlepin.guice.CandlepinModule.configure(CandlepinModule.java:281)
  while locating org.candlepin.policy.js.JsRunnerProvider
Caused by: java.lang.NullPointerException
        at com.google.inject.persist.jpa.JpaPersistService.begin(JpaPersistService.java:78)
        at com.google.inject.persist.jpa.JpaPersistService.get(JpaPersistService.java:54)
        at com.google.inject.persist.jpa.JpaPersistService.get(JpaPersistService.java:37)
        at com.google.inject.internal.ProviderInternalFactory.provision(ProviderInternalFactory.java:85)
        at com.google.inject.internal.BoundProviderFactory.provision(BoundProviderFactory.java:77)
        at com.google.inject.internal.ProviderInternalFactory.circularGet(ProviderInternalFactory.java:59)
        at com.google.inject.internal.BoundProviderFactory.get(BoundProviderFactory.java:61)
        at com.google.inject.internal.InjectorImpl$1.get(InjectorImpl.java:1094)
        at org.candlepin.model.AbstractHibernateCurator.currentSession(AbstractHibernateCurator.java:712)
        at org.candlepin.model.RulesCurator.getDbRules(RulesCurator.java:84)
        at org.candlepin.model.RulesCurator.updateDbRules(RulesCurator.java:91)
        at org.candlepin.policy.js.JsRunnerProvider.<init>(JsRunnerProvider.java:90)
        at org.candlepin.policy.js.JsRunnerProvider$$FastClassByGuice$$db9cfd63.newInstance(<generated>)
        at com.google.inject.internal.DefaultConstructionProxyFactory$FastClassProxy.newInstance(DefaultConstructionProxyFactory.java:89)
        at com.google.inject.internal.ConstructorInjector.provision(ConstructorInjector.java:114)
        at com.google.inject.internal.ConstructorInjector.construct(ConstructorInjector.java:91)
        at com.google.inject.internal.ConstructorBindingImpl$Factory.get(ConstructorBindingImpl.java:306)
        at com.google.inject.internal.ProviderToInternalFactoryAdapter.get(ProviderToInternalFactoryAdapter.java:40)
        at com.google.inject.internal.SingletonScope$1.get(SingletonScope.java:168)
        at com.google.inject.internal.InternalFactoryToProviderAdapter.get(InternalFactoryToProviderAdapter.java:39)
        at com.google.inject.internal.InternalInjectorCreator.loadEagerSingletons(InternalInjectorCreator.java:213)
        at com.google.inject.internal.InternalInjectorCreator.injectDynamically(InternalInjectorCreator.java:184)
        at com.google.inject.internal.InternalInjectorCreator.build(InternalInjectorCreator.java:111)
        at com.google.inject.Guice.createInjector(Guice.java:87)
        at org.jboss.resteasy.plugins.guice.GuiceResteasyBootstrapServletContextListener.contextInitialized(GuiceResteasyBootstrapServletContextListener.java:56)
        at org.candlepin.guice.CandlepinContextListener.contextInitialized(CandlepinContextListener.java:137)
        at org.apache.catalina.core.StandardContext.listenerStart(StandardContext.java:5127)
        at org.apache.catalina.core.StandardContext.startInternal(StandardContext.java:5643)
        at org.apache.catalina.util.LifecycleBase.start(LifecycleBase.java:145)
        at org.apache.catalina.core.ContainerBase.addChildInternal(ContainerBase.java:899)
        at org.apache.catalina.core.ContainerBase.addChild(ContainerBase.java:875)
        at org.apache.catalina.core.StandardHost.addChild(StandardHost.java:652)
        at org.apache.catalina.startup.HostConfig.deployDirectory(HostConfig.java:1260)
        at org.apache.catalina.startup.HostConfig$DeployDirectory.run(HostConfig.java:2002)
        at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
        at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
        at java.base/java.lang.Thread.run(Thread.java:829)

2) Error injecting constructor, javax.persistence.PersistenceException: [PersistenceUnit: default] Unable to build Hibernate SessionFactory
  at org.candlepin.common.guice.JPAInitializer.<init>(JPAInitializer.java:29)
  at org.candlepin.guice.CandlepinModule.configureJPA(CandlepinModule.java:389)
  while locating org.candlepin.common.guice.JPAInitializer
Caused by: javax.persistence.PersistenceException: [PersistenceUnit: default] Unable to build Hibernate SessionFactory
        at org.hibernate.jpa.boot.internal.EntityManagerFactoryBuilderImpl.persistenceException(EntityManagerFactoryBuilderImpl.java:1314)
        at org.hibernate.jpa.boot.internal.EntityManagerFactoryBuilderImpl.build(EntityManagerFactoryBuilderImpl.java:1240)
        at org.hibernate.jpa.HibernatePersistenceProvider.createEntityManagerFactory(HibernatePersistenceProvider.java:56)
        at javax.persistence.Persistence.createEntityManagerFactory(Persistence.java:55)
        at com.google.inject.persist.jpa.JpaPersistService.start(JpaPersistService.java:110)
        at org.candlepin.common.guice.JPAInitializer.<init>(JPAInitializer.java:30)
        at org.candlepin.common.guice.JPAInitializer$$FastClassByGuice$$80e35e6b.newInstance(<generated>)
        at com.google.inject.internal.DefaultConstructionProxyFactory$FastClassProxy.newInstance(DefaultConstructionProxyFactory.java:89)
        at com.google.inject.internal.ConstructorInjector.provision(ConstructorInjector.java:114)
        at com.google.inject.internal.ConstructorInjector.construct(ConstructorInjector.java:91)
        at com.google.inject.internal.ConstructorBindingImpl$Factory.get(ConstructorBindingImpl.java:306)
        at com.google.inject.internal.ProviderToInternalFactoryAdapter.get(ProviderToInternalFactoryAdapter.java:40)
        at com.google.inject.internal.SingletonScope$1.get(SingletonScope.java:168)
        at com.google.inject.internal.InternalFactoryToProviderAdapter.get(InternalFactoryToProviderAdapter.java:39)
        at com.google.inject.internal.InternalInjectorCreator.loadEagerSingletons(InternalInjectorCreator.java:213)
        at com.google.inject.internal.InternalInjectorCreator.injectDynamically(InternalInjectorCreator.java:184)
        at com.google.inject.internal.InternalInjectorCreator.build(InternalInjectorCreator.java:111)
        at com.google.inject.Guice.createInjector(Guice.java:87)
        at org.jboss.resteasy.plugins.guice.GuiceResteasyBootstrapServletContextListener.contextInitialized(GuiceResteasyBootstrapServletContextListener.java:56)
        at org.candlepin.guice.CandlepinContextListener.contextInitialized(CandlepinContextListener.java:137)
        at org.apache.catalina.core.StandardContext.listenerStart(StandardContext.java:5127)
        at org.apache.catalina.core.StandardContext.startInternal(StandardContext.java:5643)
        at org.apache.catalina.util.LifecycleBase.start(LifecycleBase.java:145)
        at org.apache.catalina.core.ContainerBase.addChildInternal(ContainerBase.java:899)
        at org.apache.catalina.core.ContainerBase.addChild(ContainerBase.java:875)
        at org.apache.catalina.core.StandardHost.addChild(StandardHost.java:652)
        at org.apache.catalina.startup.HostConfig.deployDirectory(HostConfig.java:1260)
        at org.apache.catalina.startup.HostConfig$DeployDirectory.run(HostConfig.java:2002)
        at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
        at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
        at java.base/java.lang.Thread.run(Thread.java:829)
Caused by: org.hibernate.tool.schema.spi.SchemaManagementException: Schema-validation: missing table [cp_system_locks]
        at org.hibernate.tool.schema.internal.AbstractSchemaValidator.validateTable(AbstractSchemaValidator.java:121)
        at org.hibernate.tool.schema.internal.GroupedSchemaValidatorImpl.validateTables(GroupedSchemaValidatorImpl.java:42)
        at org.hibernate.tool.schema.internal.AbstractSchemaValidator.performValidation(AbstractSchemaValidator.java:89)
        at org.hibernate.tool.schema.internal.AbstractSchemaValidator.doValidation(AbstractSchemaValidator.java:68)
        at org.hibernate.tool.schema.spi.SchemaManagementToolCoordinator.performDatabaseAction(SchemaManagementToolCoordinator.java:192)
        at org.hibernate.tool.schema.spi.SchemaManagementToolCoordinator.process(SchemaManagementToolCoordinator.java:73)
        at org.hibernate.internal.SessionFactoryImpl.<init>(SessionFactoryImpl.java:320)
        at org.hibernate.boot.internal.SessionFactoryBuilderImpl.build(SessionFactoryBuilderImpl.java:462)
        at org.hibernate.jpa.boot.internal.EntityManagerFactoryBuilderImpl.build(EntityManagerFactoryBuilderImpl.java:1237)
        ... 31 more

2 errors
        at com.google.inject.internal.Errors.throwCreationExceptionIfErrorsExist(Errors.java:554)
        at com.google.inject.internal.InternalInjectorCreator.injectDynamically(InternalInjectorCreator.java:188)
        at com.google.inject.internal.InternalInjectorCreator.build(InternalInjectorCreator.java:111)
        at com.google.inject.Guice.createInjector(Guice.java:87)
        at org.jboss.resteasy.plugins.guice.GuiceResteasyBootstrapServletContextListener.contextInitialized(GuiceResteasyBootstrapServletContextListener.java:56)
        at org.candlepin.guice.CandlepinContextListener.contextInitialized(CandlepinContextListener.java:137)
        at org.apache.catalina.core.StandardContext.listenerStart(StandardContext.java:5127)
        at org.apache.catalina.core.StandardContext.startInternal(StandardContext.java:5643)
        at org.apache.catalina.util.LifecycleBase.start(LifecycleBase.java:145)
        at org.apache.catalina.core.ContainerBase.addChildInternal(ContainerBase.java:899)
        at org.apache.catalina.core.ContainerBase.addChild(ContainerBase.java:875)
        at org.apache.catalina.core.StandardHost.addChild(StandardHost.java:652)
        at org.apache.catalina.startup.HostConfig.deployDirectory(HostConfig.java:1260)
        at org.apache.catalina.startup.HostConfig$DeployDirectory.run(HostConfig.java:2002)
        at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
        at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
        at java.base/java.lang.Thread.run(Thread.java:829)

Comment 2 Evgeni Golov 2022-03-28 12:34:35 UTC
I *love* reading Java stacktraces. The real problem should be:
Caused by: org.hibernate.tool.schema.spi.SchemaManagementException: Schema-validation: missing table [cp_system_locks]
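
For anyone who enjoys these traces less: a rough sketch of how the relevant line can be pulled out of a long Java stack trace. It relies on the heuristic that the innermost cause is the last "Caused by:" line; with several numbered Guice errors in one trace, as here, that is only a starting point. The helper names are hypothetical.

```python
def cause_chain(trace):
    """Return every 'Caused by:' message in the trace, in order of appearance."""
    prefix = "Caused by:"
    return [
        line.strip()[len(prefix):].strip()
        for line in trace.splitlines()
        if line.lstrip().startswith(prefix)
    ]


def root_cause(trace):
    """Return the last 'Caused by:' message, or None if the trace has none."""
    chain = cause_chain(trace)
    return chain[-1] if chain else None
```

Applied to the trace in the description, `root_cause` would return the SchemaManagementException message about the missing `cp_system_locks` table.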

This is because you restored a backup from candlepin 4.0.9, but you have candlepin 4.0.15 installed.

This *usually* works, but is not guaranteed to, as there might be DB differences.
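
To illustrate the mismatch, here is a small sketch that compares two dotted version strings numerically. How the backup's candlepin version is obtained is not covered in this report, so both versions are passed in as plain strings; the function names are hypothetical.

```python
def version_tuple(version):
    """Turn a dotted version such as '4.0.15' into a tuple of integers."""
    return tuple(int(part) for part in version.split("."))


def backup_is_older(backup_version, installed_version):
    """True when the backup was taken from an older version than the installed one."""
    return version_tuple(backup_version) < version_tuple(installed_version)
```

The numeric comparison matters: compared as plain strings, "4.0.9" sorts after "4.0.15", which would hide exactly the case reported here.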

With https://projects.theforeman.org/issues/33281 foreman-maintain started to run the installer after restoring DBs, precisely to handle this case: the restored DB and the installed packages disagree on versions.

I have verified on your reproducer box that removing /var/lib/candlepin/.puppet-candlepin-rpm-version and running the installer is sufficient to make Candlepin happy again (as the installer did run the migrations for us).

I am not sure why I had to wipe /var/lib/candlepin/.puppet-candlepin-rpm-version -- need to dig a bit.
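
The manual workaround (remove the marker file, then run the installer) could be sketched roughly as below. This is an illustration, not what foreman-maintain does. The marker path is the one named above; using `satellite-installer` as the installer command is an assumption, since the comment only says "running the installer". The function name is hypothetical, and the default is a dry run that changes nothing.

```python
import subprocess
from pathlib import Path

# Marker file named in the comment above.
MARKER = Path("/var/lib/candlepin/.puppet-candlepin-rpm-version")
# Assumption: "the installer" is taken to be satellite-installer on a Satellite box.
INSTALLER_CMD = ["satellite-installer"]


def rerun_candlepin_migrations(marker=MARKER, installer_cmd=INSTALLER_CMD, dry_run=True):
    """Remove the version marker if present, then run the installer.

    Returns the list of planned actions; they are only executed when
    dry_run is False.
    """
    actions = []
    if marker.exists():
        actions.append(f"remove {marker}")
        if not dry_run:
            marker.unlink()
    actions.append("run " + " ".join(installer_cmd))
    if not dry_run:
        # check=True raises CalledProcessError if the installer exits non-zero.
        subprocess.run(list(installer_cmd), check=True)
    return actions
```

Calling `rerun_candlepin_migrations()` with the defaults only reports what would be done; passing `dry_run=False` as root performs the two steps.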

For the "run installer" part I am attaching the upstream Redmine issue that made f-maintain do exactly this.

Comment 3 Evgeni Golov 2022-03-28 13:54:17 UTC
Also needs to go to 6.11 @Brad ;)

Comment 4 Bryan Kearney 2022-03-28 16:05:14 UTC
Upstream bug assigned to egolov

Comment 5 Bryan Kearney 2022-03-28 16:05:16 UTC
Moving this bug to POST for triage into Satellite since the upstream issue https://projects.theforeman.org/issues/33281 has been resolved.

Comment 6 Lukas Pramuk 2022-04-25 16:08:48 UTC
FailedQA.

The current fix does not mitigate the issue, as was shown during the sat-6.10.z+ BZ clone verification.
See https://bugzilla.redhat.com/show_bug.cgi?id=2076294#c2

Comment 7 Evgeni Golov 2022-04-26 13:31:44 UTC
https://github.com/theforeman/foreman_maintain/pull/609 / https://projects.theforeman.org/issues/34821 is done and needs backporting to the right f-m branch now.

Comment 8 Lukas Pramuk 2022-05-26 13:09:19 UTC
VERIFIED.

@Satellite 6.11.0 Snap22
rubygem-foreman_maintain-1.0.10-1.el7sat.noarch
foreman-installer-3.1.2.6-1.el7sat.noarch

using the original reproducer adapted to the 6.11 context, as the same breaking candlepin change from 6.10.3 landed in 6.11.0 Snap8

 => After restore from 6.11.0 Snap7 (and older) backup to 6.11.0 Snap8 (and newer) candlepin is broken

1) Install 6.11.0 Snap7

2) Create a 6.11.0 Snap7 backup (have to use latest foreman_maintain 1.0.10 to get around BZ 2052493 fixed in Snap20)
# satellite-maintain backup offline /var/backup

3) Restore the backup on Satellite 6.11.0 Snap22 (have to restore on another machine as upgrade 7.0.0(=6.11.0) Snap7 -> 6.11.0 Snap22 is not possible)

# satellite-change-hostname $SOURCE -y -u admin -p changeme
# satellite-maintain restore /var/backup/satellite-backup-*
...
/ All services started                                                [OK]      
--------------------------------------------------------------------------------
Run daemon reload:                                                    [OK]
--------------------------------------------------------------------------------
Procedures::Installer::Upgrade:                                       [OK]
--------------------------------------------------------------------------------
Execute upgrade:run rake task:                                        [OK]
--------------------------------------------------------------------------------

4) Check candlepin status after restore from the backup
# hammer ping
database:         
    Status:          ok
    Server Response: Duration: 0ms
candlepin:        
    Status:          ok
    Server Response: Duration: 62ms
candlepin_auth:   
    Status:          ok
    Server Response: Duration: 38ms
candlepin_events: 
    Status:          ok
    message:         0 Processed, 0 Failed
    Server Response: Duration: 0ms
katello_events:   
    Status:          ok
    message:         0 Processed, 0 Failed
    Server Response: Duration: 1ms
pulp3:            
    Status:          ok
    Server Response: Duration: 186ms
pulp3_content:    
    Status:          ok
    Server Response: Duration: 243ms
foreman_tasks:    
    Status:          ok
    Server Response: Duration: 5ms

>>> candlepin status is green now, as the cpdb migration is run upon restore

Comment 11 errata-xmlrpc 2022-07-05 14:34:49 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: Satellite 6.11 Release), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:5498