Bug 598249
Summary: | [abrt] crash in system-config-printer-1.2.2-1.fc13: jobviewer.py:224:on_job_cancel_prompt_response:IndexError: list index out of range | ||||||||
---|---|---|---|---|---|---|---|---|---|
Product: | [Fedora] Fedora | Reporter: | Iván Jiménez <icj> | ||||||
Component: | system-config-printer | Assignee: | Tim Waugh <twaugh> | ||||||
Status: | CLOSED ERRATA | QA Contact: | Fedora Extras Quality Assurance <extras-qa> | ||||||
Severity: | medium | Docs Contact: | |||||||
Priority: | low | ||||||||
Version: | 13 | CC: | jpopelka, twaugh | ||||||
Target Milestone: | --- | ||||||||
Target Release: | --- | ||||||||
Hardware: | x86_64 | ||||||||
OS: | Linux | ||||||||
Whiteboard: | abrt_hash:4240b118 | ||||||||
Fixed In Version: | system-config-printer-1.2.3-3.fc13 | Doc Type: | Bug Fix | ||||||
Doc Text: | Story Points: | --- | |||||||
Clone Of: | Environment: | ||||||||
Last Closed: | 2010-07-01 18:38:23 UTC | Type: | --- | ||||||
Regression: | --- | Mount Type: | --- | ||||||
Documentation: | --- | CRM: | |||||||
Verified Versions: | Category: | --- | |||||||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||||
Cloudforms Team: | --- | Target Upstream Version: | |||||||
Embargoed: | |||||||||
Attachments: |
|
Description
Iván Jiménez
2010-05-31 21:43:28 UTC
Created attachment 418403 [details]
File: backtrace
I went to localhost:631 and purged all jobs, but it still appears. I'll attach error.log and messages contains: May 31 16:35:43 endeavour hp[986]: io/hpmud/musb.c 1403: unable to write data hp:/usb/Deskjet_D2300_series?serial=TH66G160B804KT: 45 second io timeout May 31 16:35:43 endeavour hp[986]: prnt/backend/hp.c 611: ERROR: 1009 media-empty-error; will retry in 30 seconds... May 31 16:36:13 endeavour hp[986]: prnt/backend/hp.c 611: ERROR: 1009 media-empty-error; will retry in 30 seconds... May 31 16:36:43 endeavour hp[986]: prnt/backend/hp.c 611: ERROR: 1009 media-empty-error; will retry in 30 seconds... May 31 16:37:08 endeavour python: abrt: detected unhandled Python exception in /usr/share/system-config-printer/system-config-printer.py May 31 16:37:08 endeavour abrtd: Directory 'pyhook-1275341828-1841' creation detected May 31 16:37:08 endeavour abrt[1883]: saved python crash dump of pid 1841 to /var/cache/abrt/pyhook-1275341828-1841.new May 31 16:37:09 endeavour abrtd: New crash /var/cache/abrt/pyhook-1275341828-1841, processing May 31 16:37:09 endeavour abrtd: Registered Action plugin 'RunApp' May 31 16:37:09 endeavour abrtd: RunApp('/var/cache/abrt/pyhook-1275341828-1841','test x"`cat component`" = x"xorg-x11-server-Xorg" && cp /var/log/Xorg.0.log .') May 31 16:37:13 endeavour hp[986]: prnt/backend/hp.c 611: ERROR: 1009 media-empty-error; will retry in 30 seconds... May 31 16:37:15 endeavour abrtd: Getting crash infos... May 31 16:37:39 endeavour ntpd[1092]: 0.0.0.0 c615 05 clock_sync May 31 16:37:43 endeavour hp[986]: prnt/backend/hp.c 611: ERROR: 1009 media-empty-error; will retry in 30 seconds... May 31 16:37:54 endeavour abrtd: Creating report... May 31 16:38:13 endeavour hp[986]: prnt/backend/hp.c 611: ERROR: 1009 media-empty-error; will retry in 30 seconds... May 31 16:38:43 endeavour hp[986]: prnt/backend/hp.c 611: ERROR: 1009 media-empty-error; will retry in 30 seconds... May 31 16:39:13 endeavour hp[986]: prnt/backend/hp.c 611: ERROR: 1009 media-empty-error; will retry in 30 seconds... May 31 16:39:43 endeavour hp[986]: prnt/backend/hp.c 611: ERROR: 1009 media-empty-error; will retry in 30 seconds... May 31 16:40:58 endeavour hp[986]: io/hpmud/musb.c 1403: unable to write data hp:/usb/Deskjet_D2300_series?serial=TH66G160B804KT: 45 second io timeout May 31 16:40:58 endeavour hp[986]: prnt/backend/hp.c 611: ERROR: 1009 media-empty-error; will retry in 30 seconds... May 31 16:43:29 endeavour abrtd: New bug id: 598249 May 31 16:49:31 endeavour abrtd: Getting crash infos... Created attachment 418405 [details]
error_log
I reproduced it; it happens when deleting a job second time. To be able to power off the printer, I restarted cups, in messages appeared: E [31/May/2010:17:07:31 -0500] [Job 369] Aborting job because it has no files. E [31/May/2010:17:07:31 -0500] [Job 370] Aborting job because it has no files. and job disappeared. As the led was still blinking, I unpluged and repluged the power cable. I was not able to reproduce it (deleting/canceling a job second time doesn't help in my case), but upstream commit (24f07c0) should avoid the crash. I wonder if the real problem is some sort of race condition between JobViewer.job_removed() and JobViewer.on_job_cancel_activate(). Here's the scenario I'm thinking of: user selects job Job is cancelled/completed separately cupsd sends D-Bus signal saying something changed monitor receives signal, schedules a call to get_notifications() User clicks Cancel Monitor.get_notifications() runs, calls JobViewer.job_removed() JobViewer.job_removed() removes it from self.jobids[] Now JobViewer.on_job_cancel_activate() is called but self.jobids is empty Does that sound possible? In other words, I think the len(self.jobids)==0 check should probably go into JobViewer.on_job_cancel_activate2. Some context for it: When I turned on the pc the printer's led was blinking. I tried to power it off but it didn't, so I went and canceled a pending job (error: out paper) from s-c-p but after 30 s nothing happened. So I tried delete and hit the crash. I'm trying to reproduce it but if I print without paper, the printer tries, goes to blinking and doesn't power off, but there aren't any pending jobs in s-c-p or cups; it appears as completed. So I guess it's very hw dependent. Tim, yes I had thought about some race condition. My scenario has been this: - User clicks Cancel - JobViewer.on_job_cancel_activate() calls on_job_cancel_activate2() - Dialog with "Do you really want to cancel/delete these job(s)?" is displayed. - Monitor.get_notifications() runs, calls JobViewer.job_removed() - JobViewer.job_removed() removes it from self.jobids[] - User responses "Yes" to that dialog and on_job_cancel_prompt_response() is called but self.jobids[] is empty In other words, I think that when I place (and I had thought about that) the condition in on_job_cancel_activate2() and the JobViewer.job_removed() is called when the user is looking at the "Really delete/cancel job(s)?" dialog, the crash occurs anyway. I think it's not much likely that the JobViewer.job_removed() is called between user's Cancel clicking and JobViewer.on_job_cancel_activate() calling. But my solution (better would be "if not self.jobids:") can catch this possibility either. Am I wrong ? There are two lists named 'jobids', one for each of the objects: * JobViewer.jobids, a list of job IDs for the rows that are selected * CancelJobsOperation.jobids, a list of job IDs to cancel When JobViewer.on_job_cancel_activate2() is called it creates a new instance of the CancelJobsOperation class, passing in the current JobViewer.jobids list as the list of jobs to cancel: def on_job_cancel_activate2(self, purge_job): op = CancelJobsOperation (self, self.jobids, purge_job) ... class CancelJobsOperation: def __init__ (self, jobviewer, jobids, purge_job): ... self.jobids = jobids I'm fairly sure that instantiating CancelJobsOperation this way creates a 'deep copy' of the list, i.e. we end up with two separate Python list objects. It is the CancelJobsOperation class which is responsible for presenting the confirmation dialog, and if the Monitor calls JobViewer.job_removed() at this point, the JobViewer.jobids list can become empty. But the CancelJobsOperation object has its own jobids list which is not touched by that. [Hmm, looking at CancelJobsOperation.cancelJob_error() it seems it could handle this situation (job canceled during confirmation dialog) better -- looks like it will just stop at that point, but it would be better if it carried on cancelling the other jobs in the list.] (In reply to comment #9) > I'm fairly sure that instantiating CancelJobsOperation this way creates a 'deep > copy' of the list, i.e. we end up with two separate Python list objects. Hmm, my python is not so sure about that. This snippet of code: class CancelJobsOperation: def __init__ (self, jobids): self.jobids = jobids def show(self): print self.jobids class Something: def __init__ (self): self.jobids = [1,2,3] def do_something(self): y = CancelJobsOperation(self.jobids) del self.jobids[0] y.show() s = Something() s.do_something() prints [2, 3] Can you look at this at clarify it? thanks > [Hmm, looking at CancelJobsOperation.cancelJob_error() it seems it could handle > this situation (job canceled during confirmation dialog) better -- looks like > it will just stop at that point, but it would be better if it carried on > cancelling the other jobs in the list.] Yes, in case of deep copy it would be good to cancel the remaining jobs. (In reply to comment #10) > (In reply to comment #9) > > I'm fairly sure that instantiating CancelJobsOperation this way creates a 'deep > > copy' of the list, i.e. we end up with two separate Python list objects. > Hmm, my python is not so sure about that. Aha, good catch! So I guess the fix here will be to change it to: class CancelJobsOperation: def __init__ (self, jobids): self.jobids = list (jobids) i.e. explicitly create a new list from the one we're passed. I wonder if there are any other places where I've got that sort of thing wrong... Thanks. Another try :-) is commited upstream. system-config-printer-1.2.3-3.fc13 has been submitted as an update for Fedora 13. http://admin.fedoraproject.org/updates/system-config-printer-1.2.3-3.fc13 system-config-printer-1.2.3-3.fc13 has been pushed to the Fedora 13 testing repository. If problems still persist, please make note of it in this bug report. If you want to test the update, you can install it with su -c 'yum --enablerepo=updates-testing update system-config-printer'. You can provide feedback for this update here: http://admin.fedoraproject.org/updates/system-config-printer-1.2.3-3.fc13 system-config-printer-1.2.3-3.fc13 has been pushed to the Fedora 13 stable repository. If problems still persist, please make note of it in this bug report. |