Bug 624441

Summary: CUPS spawns a lot of backend process when being shut down while printing
Product: Red Hat Enterprise Linux 6 Reporter: Olivier Fourdan <ofourdan>
Component: cupsAssignee: Tim Waugh <twaugh>
Status: CLOSED ERRATA QA Contact: Jan Ščotka <jscotka>
Severity: high Docs Contact:
Priority: high    
Version: 6.0CC: azelinka, jpopelka, kem, ltroan, twoerner, ykopkova
Target Milestone: rcKeywords: Patch
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: cups-1.4.2-37.el6 Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2011-05-19 14:09:18 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 627601    
Attachments:
Description Flags
Possible patch none

Description Olivier Fourdan 2010-08-16 13:35:12 UTC
Description of problem:

If cupsd is stopped while a job is being sent to a printer using a given backend, that backend is restarted multiple times (more than 2000 in my reproducer) before the CUPS scheduler actually terminates.

Version-Release number of selected component (if applicable):

cups-1.4.2-35.el6

Upstream svn as well

How reproducible:

Always

Steps to Reproduce:
1. Set cups LogLevel to debug2 in /etc/cups/cupsd.conf
1. Setup a socket printer
2. Print a huge page to that printer (so that the printing actually takes some times
3. stop cupsd

   # lp -d [PRINTER_NAME] [FILE_NAME]; /etc/init.d/cups stop
  
Actual results:

The backend is spawned multiple times, as seen from /var/log/cups/error_log

    # grep cupsdStartProcess /var/log/cups/error_log 

Expected results:

No additional backend process is started while the scheduler is terminating

Additional info:

Here follows the backtrace of the call stack that leads to start_job() when the server is being shut down:

(gdb) break cupsdStartProcess
Breakpoint 1 at 0x7fbf14ca4f00: file process.c, line 255.
(gdb) cont
Continuing.

Breakpoint 1, cupsdStartProcess (command=0x7fff2e111720 "/usr/lib/cups/backend/socket", argv=0x7fbf160364e0, envp=0x7fff2e10fda0, infd=-1, outfd=-1, errfd=17,
   backfd=19, sidefd=21, root=0, profile=0x0, job=0x7fbf16031b20, pid=0x7fbf16031c38) at process.c:255
255 {
(gdb) cont
Continuing.
Detaching after fork from child process 27588.

Program received signal SIGTERM, Terminated.
0x00007fbf11524ac3 in __epoll_wait_nocancel () at ../sysdeps/unix/syscall-template.S:82
82 T_PSEUDO (SYSCALL_SYMBOL, SYSCALL_NAME, SYSCALL_NARGS)
(gdb) cont
Continuing.

Breakpoint 1, cupsdStartProcess (command=0x7fff2e11af60 "/usr/lib/cups/backend/socket", argv=0x7fbf160362e0, envp=0x7fff2e1195e0, infd=-1, outfd=-1, errfd=7,
   backfd=9, sidefd=11, root=0, profile=0x0, job=0x7fbf16031b20, pid=0x7fbf16031c38) at process.c:255
255 {
(gdb) cont
Continuing.
Detaching after fork from child process 27601.

Breakpoint 1, cupsdStartProcess (command=0x7fff2e11af60 "/usr/lib/cups/backend/socket", argv=0x7fbf160388f0, envp=0x7fff2e1195e0, infd=-1, outfd=-1, errfd=7,
   backfd=9, sidefd=11, root=0, profile=0x0, job=0x7fbf16031b20, pid=0x7fbf16031c38) at process.c:255
255 {
(gdb) bt
#0  cupsdStartProcess (command=0x7fff2e11af60 "/usr/lib/cups/backend/socket", argv=0x7fbf160388f0, envp=0x7fff2e1195e0, infd=-1, outfd=-1, errfd=7, backfd=9,
   sidefd=11, root=0, profile=0x0, job=0x7fbf16031b20, pid=0x7fbf16031c38) at process.c:255
#1  0x00007fbf14c95e0e in cupsdContinueJob (job=<value optimized out>) at job.c:1220
#2  0x00007fbf14c96de0 in start_job () at job.c:4286
#3  cupsdCheckJobs () at job.c:429
#4  0x00007fbf14c9738f in finalize_job (job=0x7fbf16031b20, set_job_state=<value optimized out>) at job.c:3243
#5  0x00007fbf14c988d5 in cupsdSetJobState (job=0x7fbf16031b20, newstate=<value optimized out>, action=CUPSD_JOB_FORCE, message=<value optimized out>) at job.c:2692
#6  0x00007fbf14c98d04 in cupsdStopAllJobs (action=CUPSD_JOB_FORCE, kill_delay=0) at job.c:2723
#7  0x00007fbf14c98d93 in cupsdFreeAllJobs () at job.c:1451
#8  0x00007fbf14c7e6c1 in main (argc=<value optimized out>, argv=<value optimized out>) at main.c:1189
(gdb) cont
Continuing.
cupsdStartProcess (command=0x7fff2e11af60 <Address 0x7fff2e11af60 out of bounds>, argv=0x7fbf160388f0, envp=0x7fff2e1195e0, infd=-1, outfd=-1, errfd=7, backfd=Cannot access memory at address 0x7fff2e119550
)
   at process.c:255

This is also causing gdb to crash as well after a few iterations.

I suspect the scheduler is launching new backend processes because it loops on an array while at the same time the same job keeps being added at each iteration.

cupsdStopAllJobs() loops on the entries of "PrintingJobs":

2706 void
2707 cupsdStopAllJobs(
2708     cupsd_jobaction_t action,           /* I - Action */
2709     int               kill_delay)       /* I - Number of seconds before we kill */
2710 {
2711   cupsd_job_t   *job;                   /* Current job */
2712
2713
2714   DEBUG_puts("cupsdStopAllJobs()");
2715
2716   for (job = (cupsd_job_t *)cupsArrayFirst(PrintingJobs);
2717        job;
2718        job = (cupsd_job_t *)cupsArrayNext(PrintingJobs))
2719   {
2720     if (kill_delay)
2721       job->kill_time = time(NULL) + kill_delay;
2722
2723     cupsdSetJobState(job, IPP_JOB_PENDING, action, NULL);
2724   }
2725 }

cupsdSetJobState() called with action = CUPSD_JOB_FORCE will call finalize_job() which call cupsdCheckJobs() which in turn calls start_job() which calls cupsdSetJobState() again, which in turn does cupsArrayAdd() of the job to "PrintingJobs"...

So at each iteration, the same job keeps being added to the array "PrintingJobs" on which cupsdStopAllJobs() is precisely iterating on.

Comment 1 Olivier Fourdan 2010-08-16 13:45:11 UTC
Created attachment 438949 [details]
Possible patch

One possibility could be to stop the printer prior to stop the job, in the case where the action is forced (ie "action == CUPSD_JOB_FORCE").

As far as I can see, cupsdStopAllJobs() is called with "action == CUPSD_JOB_FORCE" only from cupsdFreeAllJobs() (there is one other call from main() but with action == CUPSD_JOB_DEFAULT)

That works and no additional backend is started when the scheduler is stopped while printing. Upon restart, the job get printed as well so that seems to work, however there might be some other side effects that I have not detected...

Comment 2 Jiri Popelka 2010-09-22 08:38:00 UTC
http://www.cups.org/strfiles/3679/str3679.patch

An alternate patch which tracks whether we are shutting down and doesn't start new jobs automatically if so.

Comment 9 errata-xmlrpc 2011-05-19 14:09:18 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on therefore solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2011-0715.html