Bug 2069135 - After restore from 6.10.2 (and older) backup to 6.10.3 candlepin is broken
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Satellite
Classification: Red Hat
Component: Satellite Maintain
Version: 6.10.3
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: 6.11.0
Assignee: Evgeni Golov
QA Contact: Lukas Pramuk
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2022-03-28 10:57 UTC by Lukas Pramuk
Modified: 2022-07-19 11:07 UTC (History)

Fixed In Version: rubygem-foreman_maintain-1.0.9
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Clones: 2076294
Environment:
Last Closed: 2022-07-05 14:34:49 UTC
Target Upstream Version:
Embargoed:



Links
System ID Private Priority Status Summary Last Updated
Foreman Issue Tracker 33281 0 Normal Closed Restore should run "upgrade" steps after the data restore finished 2022-03-28 12:34:35 UTC
Foreman Issue Tracker 34686 0 Normal New --reset-data does not remove /var/lib/candlepin/.puppet-candlepin-rpm-version 2022-03-28 12:45:37 UTC
Foreman Issue Tracker 34821 0 Normal New restoring an older candlepin DB results in broken system 2022-04-26 11:06:18 UTC
Red Hat Product Errata RHSA-2022:5498 0 None None None 2022-07-05 14:35:03 UTC

Description Lukas Pramuk 2022-03-28 10:57:23 UTC
Description of problem:
After restore from 6.10.2 (and older) backup to 6.10.3 candlepin is broken
Restoring from 6.10.3 backup to 6.10.3 is ok.
There seems to be a breaking change for candlepin in 6.10.3.


Version-Release number of selected component (if applicable):
Satellite 6.10.3
candlepin-4.0.15-1.el7sat.noarch


How reproducible:
deterministic

Steps to Reproduce:
1. Install 6.10.2 (using internal repo in order to be able to pin at 6.10.2)

2. Create/Have a 6.10.2 backup (or older)
# satellite-maintain backup offline /var/backup

3. Upgrade to 6.10.3 (used CDN as 6.10.3 is the latest released version)

3a. Check candlepin status after upgrade (so far good)
# hammer ping
...
candlepin:
    Status:          ok
    Server Response: Duration: 50ms
candlepin_auth:
    Status:          ok
    Server Response: Duration: 46ms
candlepin_events:
    Status:          ok
    message:         0 Processed, 0 Failed
    Server Response: Duration: 0ms

4. Restore from the 6.10.2 backup

# satellite-maintain restore /var/backup/satellite-backup-*
...
| All services started                                                [OK]
--------------------------------------------------------------------------------
Run daemon reload:                                                    [OK]
--------------------------------------------------------------------------------

5. Check candlepin status after restore from the backup
# hammer ping
...
candlepin:
    Status:          FAIL
    Server Response: Message: 404 Not Found
candlepin_auth:
    Status:          FAIL
    Server Response: Message: Katello::Errors::CandlepinNotRunning
candlepin_events:
    Status:          FAIL
    message:         Not running
    Server Response: Duration: 2ms

>>> candlepin app not running and tomcat service restart doesn't help to recover !!!

Actual results:
candlepin app not running

Expected results:
candlepin app runs without errors

Additional info:

Mar 27, 2022 6:22:15 PM org.apache.catalina.core.StandardContext listenerStart
SEVERE: Exception sending context initialized event to listener instance of class org.candlepin.guice.CandlepinContextListener
com.google.inject.CreationException: Unable to create injector, see the following errors:

1) Error in custom provider, java.lang.NullPointerException
  while locating com.google.inject.persist.jpa.JpaPersistService
  while locating javax.persistence.EntityManager
  at org.candlepin.policy.js.JsRunnerProvider.<init>(JsRunnerProvider.java:85)
  at org.candlepin.guice.CandlepinModule.configure(CandlepinModule.java:281)
  while locating org.candlepin.policy.js.JsRunnerProvider
Caused by: java.lang.NullPointerException
        at com.google.inject.persist.jpa.JpaPersistService.begin(JpaPersistService.java:78)
        at com.google.inject.persist.jpa.JpaPersistService.get(JpaPersistService.java:54)
        at com.google.inject.persist.jpa.JpaPersistService.get(JpaPersistService.java:37)
        at com.google.inject.internal.ProviderInternalFactory.provision(ProviderInternalFactory.java:85)
        at com.google.inject.internal.BoundProviderFactory.provision(BoundProviderFactory.java:77)
        at com.google.inject.internal.ProviderInternalFactory.circularGet(ProviderInternalFactory.java:59)
        at com.google.inject.internal.BoundProviderFactory.get(BoundProviderFactory.java:61)
        at com.google.inject.internal.InjectorImpl$1.get(InjectorImpl.java:1094)
        at org.candlepin.model.AbstractHibernateCurator.currentSession(AbstractHibernateCurator.java:712)
        at org.candlepin.model.RulesCurator.getDbRules(RulesCurator.java:84)
        at org.candlepin.model.RulesCurator.updateDbRules(RulesCurator.java:91)
        at org.candlepin.policy.js.JsRunnerProvider.<init>(JsRunnerProvider.java:90)
        at org.candlepin.policy.js.JsRunnerProvider$$FastClassByGuice$$db9cfd63.newInstance(<generated>)
        at com.google.inject.internal.DefaultConstructionProxyFactory$FastClassProxy.newInstance(DefaultConstructionProxyFactory.java:89)
        at com.google.inject.internal.ConstructorInjector.provision(ConstructorInjector.java:114)
        at com.google.inject.internal.ConstructorInjector.construct(ConstructorInjector.java:91)
        at com.google.inject.internal.ConstructorBindingImpl$Factory.get(ConstructorBindingImpl.java:306)
        at com.google.inject.internal.ProviderToInternalFactoryAdapter.get(ProviderToInternalFactoryAdapter.java:40)
        at com.google.inject.internal.SingletonScope$1.get(SingletonScope.java:168)
        at com.google.inject.internal.InternalFactoryToProviderAdapter.get(InternalFactoryToProviderAdapter.java:39)
        at com.google.inject.internal.InternalInjectorCreator.loadEagerSingletons(InternalInjectorCreator.java:213)
        at com.google.inject.internal.InternalInjectorCreator.injectDynamically(InternalInjectorCreator.java:184)
        at com.google.inject.internal.InternalInjectorCreator.build(InternalInjectorCreator.java:111)
        at com.google.inject.Guice.createInjector(Guice.java:87)
        at org.jboss.resteasy.plugins.guice.GuiceResteasyBootstrapServletContextListener.contextInitialized(GuiceResteasyBootstrapServletContextListener.java:56)
        at org.candlepin.guice.CandlepinContextListener.contextInitialized(CandlepinContextListener.java:137)
        at org.apache.catalina.core.StandardContext.listenerStart(StandardContext.java:5127)
        at org.apache.catalina.core.StandardContext.startInternal(StandardContext.java:5643)
        at org.apache.catalina.util.LifecycleBase.start(LifecycleBase.java:145)
        at org.apache.catalina.core.ContainerBase.addChildInternal(ContainerBase.java:899)
        at org.apache.catalina.core.ContainerBase.addChild(ContainerBase.java:875)
        at org.apache.catalina.core.StandardHost.addChild(StandardHost.java:652)
        at org.apache.catalina.startup.HostConfig.deployDirectory(HostConfig.java:1260)
        at org.apache.catalina.startup.HostConfig$DeployDirectory.run(HostConfig.java:2002)
        at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
        at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
        at java.base/java.lang.Thread.run(Thread.java:829)

2) Error injecting constructor, javax.persistence.PersistenceException: [PersistenceUnit: default] Unable to build Hibernate SessionFactory
  at org.candlepin.common.guice.JPAInitializer.<init>(JPAInitializer.java:29)
  at org.candlepin.guice.CandlepinModule.configureJPA(CandlepinModule.java:389)
  while locating org.candlepin.common.guice.JPAInitializer
Caused by: javax.persistence.PersistenceException: [PersistenceUnit: default] Unable to build Hibernate SessionFactory
        at org.hibernate.jpa.boot.internal.EntityManagerFactoryBuilderImpl.persistenceException(EntityManagerFactoryBuilderImpl.java:1314)
        at org.hibernate.jpa.boot.internal.EntityManagerFactoryBuilderImpl.build(EntityManagerFactoryBuilderImpl.java:1240)
        at org.hibernate.jpa.HibernatePersistenceProvider.createEntityManagerFactory(HibernatePersistenceProvider.java:56)
        at javax.persistence.Persistence.createEntityManagerFactory(Persistence.java:55)
        at com.google.inject.persist.jpa.JpaPersistService.start(JpaPersistService.java:110)
        at org.candlepin.common.guice.JPAInitializer.<init>(JPAInitializer.java:30)
        at org.candlepin.common.guice.JPAInitializer$$FastClassByGuice$$80e35e6b.newInstance(<generated>)
        at com.google.inject.internal.DefaultConstructionProxyFactory$FastClassProxy.newInstance(DefaultConstructionProxyFactory.java:89)
        at com.google.inject.internal.ConstructorInjector.provision(ConstructorInjector.java:114)
        at com.google.inject.internal.ConstructorInjector.construct(ConstructorInjector.java:91)
        at com.google.inject.internal.ConstructorBindingImpl$Factory.get(ConstructorBindingImpl.java:306)
        at com.google.inject.internal.ProviderToInternalFactoryAdapter.get(ProviderToInternalFactoryAdapter.java:40)
        at com.google.inject.internal.SingletonScope$1.get(SingletonScope.java:168)
        at com.google.inject.internal.InternalFactoryToProviderAdapter.get(InternalFactoryToProviderAdapter.java:39)
        at com.google.inject.internal.InternalInjectorCreator.loadEagerSingletons(InternalInjectorCreator.java:213)
        at com.google.inject.internal.InternalInjectorCreator.injectDynamically(InternalInjectorCreator.java:184)
        at com.google.inject.internal.InternalInjectorCreator.build(InternalInjectorCreator.java:111)
        at com.google.inject.Guice.createInjector(Guice.java:87)
        at org.jboss.resteasy.plugins.guice.GuiceResteasyBootstrapServletContextListener.contextInitialized(GuiceResteasyBootstrapServletContextListener.java:56)
        at org.candlepin.guice.CandlepinContextListener.contextInitialized(CandlepinContextListener.java:137)
        at org.apache.catalina.core.StandardContext.listenerStart(StandardContext.java:5127)
        at org.apache.catalina.core.StandardContext.startInternal(StandardContext.java:5643)
        at org.apache.catalina.util.LifecycleBase.start(LifecycleBase.java:145)
        at org.apache.catalina.core.ContainerBase.addChildInternal(ContainerBase.java:899)
        at org.apache.catalina.core.ContainerBase.addChild(ContainerBase.java:875)
        at org.apache.catalina.core.StandardHost.addChild(StandardHost.java:652)
        at org.apache.catalina.startup.HostConfig.deployDirectory(HostConfig.java:1260)
        at org.apache.catalina.startup.HostConfig$DeployDirectory.run(HostConfig.java:2002)
        at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
        at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
        at java.base/java.lang.Thread.run(Thread.java:829)
Caused by: org.hibernate.tool.schema.spi.SchemaManagementException: Schema-validation: missing table [cp_system_locks]
        at org.hibernate.tool.schema.internal.AbstractSchemaValidator.validateTable(AbstractSchemaValidator.java:121)
        at org.hibernate.tool.schema.internal.GroupedSchemaValidatorImpl.validateTables(GroupedSchemaValidatorImpl.java:42)
        at org.hibernate.tool.schema.internal.AbstractSchemaValidator.performValidation(AbstractSchemaValidator.java:89)
        at org.hibernate.tool.schema.internal.AbstractSchemaValidator.doValidation(AbstractSchemaValidator.java:68)
        at org.hibernate.tool.schema.spi.SchemaManagementToolCoordinator.performDatabaseAction(SchemaManagementToolCoordinator.java:192)
        at org.hibernate.tool.schema.spi.SchemaManagementToolCoordinator.process(SchemaManagementToolCoordinator.java:73)
        at org.hibernate.internal.SessionFactoryImpl.<init>(SessionFactoryImpl.java:320)
        at org.hibernate.boot.internal.SessionFactoryBuilderImpl.build(SessionFactoryBuilderImpl.java:462)
        at org.hibernate.jpa.boot.internal.EntityManagerFactoryBuilderImpl.build(EntityManagerFactoryBuilderImpl.java:1237)
        ... 31 more

2 errors
        at com.google.inject.internal.Errors.throwCreationExceptionIfErrorsExist(Errors.java:554)
        at com.google.inject.internal.InternalInjectorCreator.injectDynamically(InternalInjectorCreator.java:188)
        at com.google.inject.internal.InternalInjectorCreator.build(InternalInjectorCreator.java:111)
        at com.google.inject.Guice.createInjector(Guice.java:87)
        at org.jboss.resteasy.plugins.guice.GuiceResteasyBootstrapServletContextListener.contextInitialized(GuiceResteasyBootstrapServletContextListener.java:56)
        at org.candlepin.guice.CandlepinContextListener.contextInitialized(CandlepinContextListener.java:137)
        at org.apache.catalina.core.StandardContext.listenerStart(StandardContext.java:5127)
        at org.apache.catalina.core.StandardContext.startInternal(StandardContext.java:5643)
        at org.apache.catalina.util.LifecycleBase.start(LifecycleBase.java:145)
        at org.apache.catalina.core.ContainerBase.addChildInternal(ContainerBase.java:899)
        at org.apache.catalina.core.ContainerBase.addChild(ContainerBase.java:875)
        at org.apache.catalina.core.StandardHost.addChild(StandardHost.java:652)
        at org.apache.catalina.startup.HostConfig.deployDirectory(HostConfig.java:1260)
        at org.apache.catalina.startup.HostConfig$DeployDirectory.run(HostConfig.java:2002)
        at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
        at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
        at java.base/java.lang.Thread.run(Thread.java:829)
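
The decisive line in a trace like this is easy to miss. As a rough sketch (not part of the original report), the shell function below pulls the names of tables that Hibernate schema validation reported as missing out of a log. The function name `missing_tables` is made up for illustration, and which log file holds the trace depends on the Tomcat/Candlepin setup, so the file is passed as an argument instead of being hard-coded.

```shell
# Print the distinct tables that Hibernate schema validation reported as missing.
# Reads the log file given as $1, or stdin when no file is given.
# (Hypothetical helper, not part of satellite-maintain or candlepin.)
missing_tables() {
    grep -o 'Schema-validation: missing table \[[^]]*]' "${1:--}" \
        | sed -e 's/.*\[//' -e 's/]$//' \
        | sort -u
}
```

Fed the trace above, this would print `cp_system_locks`.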

Comment 2 Evgeni Golov 2022-03-28 12:34:35 UTC
I *love* reading Java stacktraces. The real problem should be:
Caused by: org.hibernate.tool.schema.spi.SchemaManagementException: Schema-validation: missing table [cp_system_locks]

This is because you restored a backup taken with candlepin 4.0.9, but you have candlepin 4.0.15 installed.

This *usually* works, but is not guaranteed to, as there might be DB differences.

With https://projects.theforeman.org/issues/33281 foreman-maintain started to run the installer after restoring DBs, precisely to handle this case: the restored DB and the installed packages disagree on versions.

I have verified on your reproducer box that removing /var/lib/candlepin/.puppet-candlepin-rpm-version and running the installer is sufficient to make Candlepin happy again (as the installer did run the migrations for us).

I am not sure why I had to wipe /var/lib/candlepin/.puppet-candlepin-rpm-version -- need to dig a bit.

For the "run installer" part I am attaching the upstream Redmine issue that made f-maintain do exactly this.
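
The manual recovery described in this comment could be sketched roughly as follows. This is only an illustration of the two steps named above (remove the marker file, run the installer), followed by the `hammer ping` check from the reproducer. The helper names `run` and `candlepin_restore_workaround` and the `DRY_RUN` switch are invented here, and it is an assumption that `satellite-installer` needs no extra options on the affected system.

```shell
# Marker file named in the comment above; why it has to be removed
# was still unclear at the time of writing.
MARKER=/var/lib/candlepin/.puppet-candlepin-rpm-version

# Hypothetical helper: with DRY_RUN=1 only print what would be executed.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Hypothetical wrapper around the steps from the comment (run as root).
candlepin_restore_workaround() {
    run rm -f "$MARKER"       # wipe the version marker
    run satellite-installer   # installer runs the Candlepin DB migrations
    run hammer ping           # confirm candlepin reports "ok" again
}
```

Setting `DRY_RUN=1` before calling `candlepin_restore_workaround` prints the commands without touching the system.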

Comment 3 Evgeni Golov 2022-03-28 13:54:17 UTC
Also needs to go to 6.11 @Brad ;)

Comment 4 Bryan Kearney 2022-03-28 16:05:14 UTC
Upstream bug assigned to egolov

Comment 5 Bryan Kearney 2022-03-28 16:05:16 UTC
Moving this bug to POST for triage into Satellite since the upstream issue https://projects.theforeman.org/issues/33281 has been resolved.

Comment 6 Lukas Pramuk 2022-04-25 16:08:48 UTC
FailedQA.

The current fix does not mitigate the issue, as was shown during verification of the sat-6.10.z+ BZ clone.
See https://bugzilla.redhat.com/show_bug.cgi?id=2076294#c2

Comment 7 Evgeni Golov 2022-04-26 13:31:44 UTC
https://github.com/theforeman/foreman_maintain/pull/609 / https://projects.theforeman.org/issues/34821 is done and needs backporting to the right f-m branch now.

Comment 8 Lukas Pramuk 2022-05-26 13:09:19 UTC
VERIFIED.

@Satellite 6.11.0 Snap22
rubygem-foreman_maintain-1.0.10-1.el7sat.noarch
foreman-installer-3.1.2.6-1.el7sat.noarch

Verified using the original reproducer, adapted to the 6.11 context, as the same breaking candlepin change from 6.10.3 landed in 6.11.0 Snap8

 => After restore from 6.11.0 Snap7 (and older) backup to 6.11.0 Snap8 (and newer) candlepin is broken

1) Install 6.11.0 Snap7

2) Create a 6.11.0 Snap7 backup (have to use latest foreman_maintain 1.0.10 to get around BZ 2052493 fixed in Snap20)
# satellite-maintain backup offline /var/backup

3) Restore the backup on Satellite 6.11.0 Snap22 (have to restore on another machine as upgrade 7.0.0(=6.11.0) Snap7 -> 6.11.0 Snap22 is not possible)

# satellite-change-hostname $SOURCE -y -u admin -p changeme
# satellite-maintain restore /var/backup/satellite-backup-*
...
/ All services started                                                [OK]      
--------------------------------------------------------------------------------
Run daemon reload:                                                    [OK]
--------------------------------------------------------------------------------
Procedures::Installer::Upgrade:                                       [OK]
--------------------------------------------------------------------------------
Execute upgrade:run rake task:                                        [OK]
--------------------------------------------------------------------------------

4) Check candlepin status after restore from the backup
# hammer ping
database:         
    Status:          ok
    Server Response: Duration: 0ms
candlepin:        
    Status:          ok
    Server Response: Duration: 62ms
candlepin_auth:   
    Status:          ok
    Server Response: Duration: 38ms
candlepin_events: 
    Status:          ok
    message:         0 Processed, 0 Failed
    Server Response: Duration: 0ms
katello_events:   
    Status:          ok
    message:         0 Processed, 0 Failed
    Server Response: Duration: 1ms
pulp3:            
    Status:          ok
    Server Response: Duration: 186ms
pulp3_content:    
    Status:          ok
    Server Response: Duration: 243ms
foreman_tasks:    
    Status:          ok
    Server Response: Duration: 5ms

>>> candlepin status is green now as cpdb migration is being run upon restore
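
The status check in step 4 can be scripted for repeated verification runs. The sketch below assumes the `hammer ping` output layout shown above (a service name on an unindented line ending in a colon, followed by an indented `Status:` line); the function name `failed_services` is made up for illustration.

```shell
# List every service whose Status in `hammer ping` output is not "ok".
# Reads the hammer ping output from stdin.
# (Hypothetical helper; relies on the output layout shown in this report.)
failed_services() {
    awk '
        # Unindented line ending in a colon: remember the service name.
        /^[^ \t].*:[ \t]*$/ { svc = $1; sub(/:$/, "", svc) }
        # Indented Status line: report the service unless the status is ok.
        /^[ \t]+Status:/ && $2 != "ok" { print svc }
    '
}
```

Usage would be `hammer ping | failed_services`; empty output means every reported service is ok.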

Comment 11 errata-xmlrpc 2022-07-05 14:34:49 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: Satellite 6.11 Release), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:5498

