RHN Satellite has an action queue, this action queue has also a scheduler to manage tasks which are often packages we want to administrate (Install (deploy), remove, etc.). What happened here is that the client, when checking this queue with the tool rhn_check, it failed silently, reporting a successful installation to the RHN Satellite server's task scheduler, making this one believe everything went fine, no failures whatsoever. Now, rhn_check is a python script that is very similar to up2date, the difference is that he checks the scheduler instead of fetching packages, the common points is that they share the same python foundations and therefore the same rpm code. rhn_check will check first with the scheduler to see if we have pending tasks, since the admin scheduled an install of apache, it will then download the package's header (containing this broken %pre script) and evaluate it, if the syntax is correct, it will then perform an rpm transaction against it and execute each bit that it has been asked on that header. If everything went fine, it will then install the package and report to the Satellite a successful result, unless a failure was encountered, in which case we will send an error code and message. The problem came from the transaction, in fact, rpm contains a section called PSM, which is the one that will execute each one of the scriptlets and parts of the header. When trying to install a package, python (rhn_check) will code it's C wrapper and try to run a batch of transactions with : ts.run() which is in the wrapper rpmts_Run() This function returns a list. What happened in our case ? When running this scriptlet batch, we did : if (!rc) rpmpsmStage(transaction we have, PSM_PRE); if (!rc) ....etc. rpmpsmStage(transaction, PSM_FINI); /* we clear and end */ So you can see that if one of these fail, we just clear/clean/return. Within this Stage function we fork a child and execute a script shell, if we failed or if someone (BAD PRACTICE as we say we failed) exits with an error code and NOT ZERO, we clean/end fail. RPM in command line will fail with main returning -1. However, the wrapper will do something different! It will just want to return a list, so how Paul decided to do this ? Simple : rc = rpmts_Run() ps = rpmts_Problems() ... if (rc < 0) { /* Our case... */ create_a_list return this_empty_list } And then in Python : rc = ts.Run() if (rc) thins happen... And here comes our problem! We never checked whether this list was empty or not during the batch transaction! therefore I added : + elif type(rc) == type([]) and not len(rc): + raise up2dateErrors.RpmError(_("Failed running rpm transaction")) Which means : Well, we have a return yes, also if we have this return *and* it is a list *and* it's empty (our case) Then raise a simple generic exception as we cannot fetch (unless I write more lines) the name of the package and anyway we did exit (1). Then I added this exception to the code so it can return : + except up2dateErrors.RpmError, e: + return (18, "Failed: packages requested raised "\ + "dependency or transaction problems", {}) Making satellite aware of the error and sending the host to the right queue within the scheduler (failed hosts). Let me know if you need more details, Jose
Created attachment 145924 [details] up2date-4.4.69-scriplet_exception.patch
Steps to reproduce : 1. Create a software customised channel. 2. Push this package in that channel. 3. Subscribe a host to that channel. 4. Perform a schedule install of that package to that host. 5. From the client host, run : rhn_check. Actual results: -- [root@dhcp-0-238 ~]# rpm -ivh apache-2.0.55-VIS27.i386.rpm Preparing... ########################################### [100%] User wwwrun is missing. Please create this user prior to installing this package. error: %pre(apache-2.0.55-VIS27.i386) scriptlet failed, exit status 1 error: install: %pre scriptlet failed (2), skipping apache-2.0.55-VIS27 -- Success from the scheduler host status. No host had any failure. Expected results: D: Running transaction (final step)... D: Sending back response (18, 'Failed: packages requested raised dependency or transaction problems', {}) D: do_call packages.checkNeedUpdate ('rhnsd=1',) D: local action status: (0, 'rpm database not modified since last update (or package list recently updated)', {}) We might need to add a print before the raise so the customer can see an error or message. And on the failed hosts: "Failed: packages requested raised dependency or transaction problems". Jose
Created attachment 146160 [details] up2date-4.4.69-scriplet_exception-error-msg.patch Prints error message on error instead of silently report the error and give a prompt.
This request was evaluated by Red Hat Product Management for inclusion in a Red Hat Enterprise Linux maintenance release. Product Management has requested further review of this request by Red Hat Engineering, for potential inclusion in a Red Hat Enterprise Linux Update release for currently deployed products. This request is not yet committed for inclusion in an Update release.
There is a patch available, but there is development work needed on the RHN side to complete this bug, thus it will not be ready for QA until it is development complete.
this should be fixed now, applied patch and tested. committed to revision 110084 in svn.
yes, thats what i would expect it to raise. We are raising the exception in the up2date actions code and not in rpm itself and the exception we return is: + except up2dateErrors.RpmError, e: + return (18, "Failed: packages requested raised "\ + "dependency or transaction problems", {}) which is an error code 18.
most part of the patch proposed was already existing code that was commented out for some unkonwn reason, we uncommented that so we actually catch the exception rather than ignore it and end up as a success as it was doing previously. There is nothing much to add to the patch except catch the exception. Thats my take on it. The only decision we were trying to make initially was whether to throw the exception at the rpm level itself or wait untill it enters the actions code and as per Paul this cannot happen untill the next major release if we want to do it in rpm, so we decided to do it at the actions level. As far as the print statement I feel "packages requested raised dependency or transaction problems", is appropriate(in this case a transaction problem) as this path is also follwed while doing the dep solving and the error code needs to be as stated. I think i dont understand the concern here. isnt the patch working for you?
yep I see what you mean, I did consider that, as i explained earlier, we need to throw same exception in multiple scenerios, so I felt what we had was appripriate as it covered any dependency case and any transaction scenerio(which is fits our case or any other transaction issue).
oh my apologies Jose, i totally missed your new patch. Did not realize that. Thanks for clearing that up cliff. I will include that immediately and update you all.
added the logging to replace the print --- rpmUtils.py (revision 110190) +++ rpmUtils.py (working copy) @@ -31,7 +31,7 @@ from rhpl.translate import _, N_ - +from up2date_client import up2dateLog # mainly here to make conflicts resolution cleaner def findDepLocal(ts, dep): @@ -519,5 +519,8 @@ raise up2dateErrors.TransactionError(_( "Failed running transaction of packages: %s") % errors, deps=rc) elif type(rc) == type([]) and not len(rc): + # let the user know whats wrong + log = up2dateLog.initLog() + log.log_me("Failed running rpm transaction - %pre %pro failure ?.") raise up2dateErrors.RpmError(_("Failed running rpm transaction"))
fyi, the above will log the messages to /var/log/up2date or what ever the config is pointing to.
yea I included this bug into the errata already, will update it with latest package and changelog information
*** Bug 221961 has been marked as a duplicate of this bug. ***
(In reply to comment #33) > An interesting finding on this issue. up2date-4.4.5-1.i386.rpm does not > care about obsoletes list as mentioned earlier in this ticket. So , if I > use above version of up2date and issue the command up2date apache , it just > goes ahead and installs the package. However , up2date package that is > attached to this ticket fails with error as apache package get's added to > obsoletes list due to httpd being in the base channel. So it looks like > up2date got this additonal capability somewhere between version 4.4.5-1 and > the current patched version attached to this ticket. > > Regards , > Shashin. > > Internal Status set to 'Waiting on SEG' > Status set to: Waiting on Tech > > This event sent from IssueTracker by jplans > issue 107834 > - Could you paste the exact error message its failing with? - Also did you try commenting out the patch in this bug and try the same ?
marking verified. this seems to have been fixed with up2date-gnome-4.5.5-1.el4 up2date-4.5.5-1.el4
Has this bug been fixed in RHEL 3? Is so then which release of up2date contains the fix?
(In reply to comment #42) > Has this bug been fixed in RHEL 3? Is so then which release of up2date contains > the fix? the fix for rhel-3 will be released as a part of rhel-3.9. This is currently "on_qa".
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHBA-2007-0250.html