Red Hat Bugzilla – Bug 1012181
Dead passenger process leaves downloaded cartridge installation in broken state
Last modified: 2015-05-14 20:21:10 EDT
Description of problem:
If the passenger process/thread dies in the middle of installing a downloadable cartridge, the oo-admin-clear-pending-ops script appears to find that the application's mongo entry does not have the cartridge stored in it.
Version-Release number of selected component (if applicable):
How reproducible:
Does not happen normally, but it should be reproducible by introducing a 'pkill' system command in the code at the point where a new gear is being sought for the downloadable cartridge.
Steps to Reproduce:
1. Create a nodejs/php app on an origin dev environment.
2. Patch the origin code to kill the broker when a new gear is being searched for (find_capacity maybe?).
3. Add a downloadable cartridge to an existing app.
4. See that the broker is dead, and bring it back up. Run oo-admin-clear-pending-ops to clean the blocked pending_op queue for the application.
Actual results:
add-cartridge gets stuck.
Any further commands to the app (remove cart, add cart, stop, restart, etc.) do not work.
oo-admin-clear-pending-ops is not able to clean up the app.
The mongo entry of the app does not store the downloaded cartridge.
Expected results:
Even if the broker dies in the middle, oo-admin-clear-pending-ops should be able to recover from where the previous process left off.
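For illustration only, here is a minimal Python sketch of the recovery behavior expected above. It is not the Origin broker code: `App`, `BrokerDied`, `add_cartridge`, `run_pending_ops`, and `clear_pending_ops` are hypothetical names, and the in-memory lists stand in for the application's mongo entry. The assumption is that the cartridge is recorded in the pending op before any work starts and that every step is idempotent, so a cleanup pass can simply resume the queue.

```python
from dataclasses import dataclass, field


class BrokerDied(Exception):
    """Stands in for the broker process being killed mid-operation."""


@dataclass
class App:
    # Hypothetical stand-in for the application's stored document.
    cartridges: list = field(default_factory=list)
    gears: list = field(default_factory=list)
    pending_ops: list = field(default_factory=list)


def run_pending_ops(app):
    # Each step checks before acting, so re-running after a crash is safe.
    while app.pending_ops:
        cart = app.pending_ops[0]["cart"]
        if cart not in app.gears:
            app.gears.append(cart)        # "find a gear" step
        if cart not in app.cartridges:
            app.cartridges.append(cart)   # "store the cartridge" step
        app.pending_ops.pop(0)            # dequeue only after both steps succeed


def add_cartridge(app, cart, die_before_gear=False):
    # Persist the intent, including the cartridge itself, before doing any work.
    app.pending_ops.append({"op": "add_cartridge", "cart": cart})
    if die_before_gear:
        # Simulated fault injection at the gear-search step.
        raise BrokerDied("killed while searching for a gear")
    run_pending_ops(app)


def clear_pending_ops(app):
    # Cleanup resumes the queue from where the dead process stopped.
    run_pending_ops(app)
```

In this sketch, calling `add_cartridge(app, "example-cart", die_before_gear=True)` leaves one entry in `app.pending_ops`, and a later `clear_pending_ops(app)` completes it because the cartridge is carried in the queued op itself rather than only in the dead process's memory.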
Additional info:
The above case was deduced from a live application that got broken because of a broker thread dying. The downloaded cart was somehow missing from the mongo dump.
I have tried this several times over, but have not been able to reproduce the issue; that is, killing the broker process does not produce the case where the downloaded cart goes missing.
Two possibilities:
1. This can happen but I have not tried enough code paths.
2. The 'missing downloaded cart' and 'broker thread dying' were two separate incidents that the investigation seems to have clubbed together. If they are unrelated, then this bug is a no-op.
The underlying issue here was the same as bug 997008. The manifestation of the issue described in this bug has not been reproduced.
Marking this as a duplicate of bug 997008.
*** This bug has been marked as a duplicate of bug 997008 ***