Created attachment 1006529 [details]
screenshot

Description of problem:
This is a RHEL 6.6 x86_64 system with the Satellite-6.1.0-RHEL-6-20150324.0 build. After importing a valid manifest, I started enabling the kickstart repos for the following products in this order (via the web UI, see the attached screenshot):

* Red Hat Enterprise Linux 5 Server Kickstart i386 5.9
* Red Hat Enterprise Linux 5 Server Kickstart x86_64 5.9
* Red Hat Enterprise Linux 6 Server Kickstart i386 6.6
* Red Hat Enterprise Linux 6 Server Kickstart x86_64 6.6

I enabled these in quick succession by clicking each checkbox, and that's when I saw the error. Interestingly, I did the exact same setup on a RHEL 7.1 system and did not see this.

Version-Release number of selected component (if applicable):
* Satellite-6.1.0-RHEL-6-20150324.0

How reproducible:

Steps to Reproduce:
1. Install build Satellite-6.1.0-RHEL-6-20150324.0 on a RHEL 6.6 system
2. Import a valid manifest
3. Enable all repos listed above in quick sequence using the web UI

Actual results:

Expected results:

Additional info:
Task 4340a6ba-a47d-4fea-81f9-60bac9103123: Katello::Errors::CandlepinError: Runtime Error could not execute statement at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse:2,094
Created attachment 1006533 [details] foreman-debug
Since this issue was entered in Red Hat Bugzilla, the release flag has been set to ? to ensure that it is properly evaluated for this release.
I tried the same scenario on my setup, which is on RHEL 6.6 and without a squid proxy. I quickly enabled a couple of kickstart repos and all of them were enabled successfully. Not sure if the proxy is causing this.
Created attachment 1006878 [details] selected ks repos
I got a slightly different error: "Exception: Katello::Errors::CandlepinError: Runtime Error null at org.candlepin.model.AbstractHibernateCurator.delete:326". I'd argue we can push this to GA, as the resolution is to just resume the task and it doesn't happen that often unless you click really fast. Ugly, but not fatal.
The problem here is that both actions try to enable the same repository concurrently. The solution would be to catch the error coming from candlepin and handle the case where some other action has already added the id (by simply making sure the id is already in the cp environment). While looking at this issue, I found another race-condition case that can cause issues. I will file another bugzilla for that one.
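To illustrate the "catch the error and confirm the id is already there" approach, here is a minimal Python sketch. It is not the Katello or Candlepin code: `FakeEnvironment`, `DuplicateContentError`, `ensure_content_enabled` and `enable_concurrently` are hypothetical names made up for this example, and the in-memory set only stands in for the unique constraint that rejects the second insert.

```python
import threading


class DuplicateContentError(Exception):
    """Hypothetical stand-in for the error raised on a duplicate insert."""


class FakeEnvironment:
    """Hypothetical in-memory stand-in for a cp environment."""

    def __init__(self):
        self._lock = threading.Lock()
        self._content_ids = set()

    def add_content(self, content_id):
        # Mimics a unique constraint: a second insert of the same id fails.
        with self._lock:
            if content_id in self._content_ids:
                raise DuplicateContentError(content_id)
            self._content_ids.add(content_id)

    def has_content(self, content_id):
        with self._lock:
            return content_id in self._content_ids


def ensure_content_enabled(env, content_id):
    """Add content_id to env, tolerating a concurrent action that added it first.

    Returns True if this call added the id, False if it was already present.
    """
    try:
        env.add_content(content_id)
        return True
    except DuplicateContentError:
        # Another action won the race. Treat that as success only if the id
        # really is present; otherwise re-raise the original error.
        if env.has_content(content_id):
            return False
        raise


def enable_concurrently(env, content_id, workers=8):
    """Run several enable actions for the same id at once and collect results."""
    results, errors = [], []

    def worker():
        try:
            results.append(ensure_content_enabled(env, content_id))
        except Exception as exc:  # recorded so the caller can inspect failures
            errors.append(exc)

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results, errors
```

With this shape, exactly one of the concurrent actions performs the insert and the others finish cleanly instead of leaving a paused task. The backend may still log the rejected insert, which this sketch does not try to suppress.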
I've filed another BZ with another race-condition, related to this https://bugzilla.redhat.com/show_bug.cgi?id=1207642
Created redmine issue http://projects.theforeman.org/issues/9978 from this bug
should be resolved when -> https://github.com/Katello/katello/pull/5155 gets merged
I still see an error when enabling and disabling repos. See the candlepin error log and foreman debug.
Created attachment 1009959 [details] candlepin error.log
Created attachment 1009961 [details] foreman debug
Talked with Suresh on IRC about this. He got initial Permission Denied errors from the CDN which I think put his set of tasks in a bad state. This may be a different situation than this bug as I was able to import a fresh manifest into an org on his sat and enable 20+ repos without error. Will investigate a bit more tomorrow with a dead manifest to see if we can reproduce his specific error condition and if so, file a new bug.
I tried with a new manifest in a new Org as Mike suggested, but still got the error in the candlepin logs. I will attach the specific logs here.
Created attachment 1009969 [details] candlepin error log latest
I was able to reproduce it today but the steps to do it are not really clear. I expanded different channels and then as one of the channels was still loading to expose its products, I enabled a product from the first channel I clicked. I see errors in both candlepin.log and catalina.out.
Created attachment 1010152 [details] candlepin logs Candlepin logs using Satellite-6.1.0-RHEL-6-20150331.1 build.
Created attachment 1010154 [details] Tomcat logs Tomcat logs using Satellite-6.1.0-RHEL-6-20150331.1 build.
The fix for this error can still produce some errors in catalina.out, because the fix is to handle the error state properly. The errors in catalina.out just say that the repository was already enabled, and should have no influence on the behaviour of the satellite itself. Putting back on QE to verify that the satellite works after enabling the repositories: no paused tasks with an error result.
Verified as per the steps mentioned in Comment 19. I don't see candlepin errors or task failures when I retested in Snap 6. Please note that there is a known race condition as mentioned in Comment 8 - https://bugzilla.redhat.com/show_bug.cgi?id=1207642. Version Tested: Satellite 6.1 GA Snap 6. Verification screenshot and verification logs attached for reference.
Created attachment 1033853 [details] verification screenshot
Created attachment 1033854 [details] verification logs
This bug is slated to be released with Satellite 6.1.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2015:1592