Bug 5832 - lpd leave cluttered queue
Summary: lpd leave cluttered queue
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Linux
Classification: Retired
Component: lpr
Version: 6.1
Hardware: i386
OS: Linux
medium
medium
Target Milestone: ---
Assignee: Bill Nottingham
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 1999-10-11 14:21 UTC by Gene Czarcinski
Modified: 2014-03-17 02:10 UTC (History)
1 user (show)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 1999-10-18 03:21:40 UTC
Embargoed:


Attachments (Terms of Use)

Description Gene Czarcinski 1999-10-11 14:21:38 UTC
I have checked the current bugs listed in bugtraq and this
not exact match so I am submitting it as a new problem (it
may be related).  This bug occurs on a base 6.0 cartman
installation as well as a rawhide updated system (with all
new fixes including lpr-0.41-3).

I have a "JetDirect" postscript printer configured on
192.168.1.100.  This was configured using "printtool" since
the graphical install does not configure a printer.

The problem is that files in the print queue are not being
"cleaned up" after the job has printed -- but ONLY for even
numbered jobs! ... the odd numbered job cleans the queue for
both that job AND the previous clutter.  After printing a
job, I get the following in response to the "lpq" command:

chaos: Warning,  no daemon present
Rank   Owner        Job   Files                   Total Size
1st    root         0     (standard input)        0 bytes

JetDirect lpd: no entries

Looking into /var/log/messages, I found the following entry:

... lpd [791]: error unlinking dfA000chaos: 2
... lpd [791]: lp: job could not be sent to remote host
(cfA000cqyD21v)

All jobs printed.  I did see once where the queue seemed to
"jam up" but could not repeat this problem.

This problem looks similar to one I previously reported
where email was sent for during file delete.

Since this problem does NOT occur on my 6.0 system, I also
tried replacing lpr with the one from my 6.0 system
(lpr-0.39-2).  This made NO difference the the problem on
6.1 which occured the same way.

Gene

Comment 1 Bill Nottingham 1999-10-11 15:19:59 UTC
Can you try the lpr RPMs at

http://people.redhat.com/notting/lpr/
-or-
ftp://people.redhat.com/notting/lpr/

and let me know if they work for you?

Comment 2 Gene Czarcinski 1999-10-11 16:07:59 UTC
Tested lpr-042-1 on my "fixes from rawhide" system (not the plain
base) .. sorry, no change.  I still get the "cluttered" queue.

Gene

Comment 3 Bill Nottingham 1999-10-11 16:09:59 UTC
Hmm.... could you try restarting the new lpd on a *clean* queue
and seeing if it works from there? Make sure you kill all previous
lpd processes (killall -9 lpd should do it.)

What does your /etc/printcap look like?

Comment 4 Bill Nottingham 1999-10-11 16:15:59 UTC
Out of curiousity, what happens if you copy the 'master-filter'
file from an older rhs-printfilters release to
/var/spool/lpd/lp/filter?

Comment 5 Gene Czarcinski 1999-10-11 16:56:59 UTC
1. to clarify, the system I am testing on is 6.1 plus all updates from
rawhide-19991007 except kdmps which should not effect anything).
However, the problem appears identical on 6.1 install from the iso
image.

2. rebooted the after upgrading to lpd-042-1 to make sure everything
was clean ... no change.

3. /var/spool/lpd/lp/filter on both my "good" 6.0 system (lpr-0.39-2)
and 6.1 (lpr-0.42-1) are identical (I will forward a copy via email).

4. Since installing lpr-0.39-2 on the 6.1 system still has the
problem, it must be something that has been there all along and only
comes to light under 6.1.

If you have any other ideas or things you want me to test, just ask.

I plan to "instument" the lpd code from lpr-0.42-1 since this was the
only way I could find the last problem that I was getting.

Gene

Comment 6 Bill Nottingham 1999-10-11 17:08:59 UTC
What is the permmissions on /var/spool/lpd/lp, and what are
the permissions on lpd?

Comment 7 Bill Nottingham 1999-10-11 17:47:59 UTC
What are the permissions on the lpd binary?
Is /var/spool/lpd/ NFS mounted?

Comment 8 Gene Czarcinski 1999-10-11 18:23:59 UTC
I guess my email with the attachment was not clear --
on the 6.1 system /usr/sbin/lpd is
-rwxr--r--   1 root      root        49244 Oct 11 13:34 /usr/sbin/lpd

This is on a separate partition (from /) of an ide drive.  It is NOT
NFS mounted.

Some additional info ...

1. I am running on dual 400MHz PII processors with 256MB ram.  I
thought there might be a timing issue (see below) so I rebooted with
the UP kernel and mem=32M ... no change.

2. I ran tail -f /var/log/messages which printing files and noticed
the following:

- print a file (a2ps /etc/resolv.conf)
    ... file prints
    ... NO log messages
    ... Queue is "cluttered with the "control file" (CF___)
    ... lpd daemon still "running"

- print another file (a2ps /etc/hosts)
    ... eror messages in log (error unlinking, etc) for the
        control file from the FIRST job to be printed!
    ... Queue uncluttered
    ... additional messages in log:
          lpd shutdown succeeded
          lpd startup succeeded

It appears to me that the "first time through", lpd "hangs" and when
the "second time through" cleans up the queue, the original process
finally terminates.

This is going to be a real turkey to troubleshoot unless someone gets
a brainstorm of an idea.

I have not "instrumented" anything yet but it looks like the "sendit"
function in printjob.c is the focus.

Gene

Comment 9 Gene Czarcinski 1999-10-11 20:57:59 UTC
OK, I have been playing with lpd and have come up with the following
patch which gets rid of the problem (remember, I did NOT say it fixed
anything) but does introduce a lot of trash into syslog.

Without the update around "197", lpd is still broken.  With ONLY the
"197" update, lpd is still broken.  This is not a fix!  However, this
patch may imply that a time.

I also noticed the following:

lpd working (patch applied): each file printed had a different process
ID.

lpd NOT working (patch not applied or all of patch not applied): file
processing had a single process ID for a pair of files ... after
cleanup, a new process ID was in use.

The process IDs are the "sub" processes.

Gene


--- lpr-0.42.orig/lpd/printjob.c        Fri Oct  8 11:43:09 1999
+++ lpr-0.42/lpd/printjob.c     Mon Oct 11 15:52:15 1999
@@ -197,10 +197,12 @@
                i = strlen(line);
                if (write(lfd, line, i) != i)
                        syslog(LOG_ERR, "%s: %s: %m", printer, LO);
+               syslog(LOG_ERR,"Processing %s **BEGIN**",q->q_name);
                if (!remote)
                        i = printit(q->q_name);
                else
                        i = sendit(q->q_name);
+               syslog(LOG_ERR,"Processing %s **DONE,
rc=%d**",q->q_name,i);
                /*
                 * Check to see if we are supposed to stop printing or
                 * if we are to rebuild the queue.
@@ -760,6 +762,7 @@
        int  ii;


+       syslog(LOG_ERR, "Processing file=%s",file);
        /*
         * open control file
         */
@@ -812,7 +815,9 @@
         * pass 1
         */
        while (getline(cfp)) {
+       syslog(LOG_ERR,"new line=%s",line);
        again:
+                syslog(LOG_ERR,"--- line=%s",line);
                 if (line[0] == 'S') {
                        if (dup_cfp)
                          fprintf(dup_cfp,"%s\n",line);
@@ -904,7 +909,8 @@
                            fi = open(tmpfile, O_RDONLY);
                            ifi =
open(last+1,O_WRONLY|O_CREAT|O_TRUNC, 0664);
                            if ((ifchild = doforkuser(DORETURN,euser))
== 0) { /*child*/
-                                dup2(fi,  0); /* input file is stdin
*/
+                                syslog(LOG_ERR,"child for %s",IF);
+                                dup2(fi,  0); /* input file is stdin
*/
                                 dup2(ifi, 1); /* temp file is  stdout
*/
                                 for (ii = 3; ii < NOFILE; ii++)
                                    (void) close(ii);
@@ -916,7 +922,9 @@
                          close(ifi);
                           close(fi);
                          /* wait on it to finish */
+                         syslog(LOG_ERR,"child is %i and
waiting",ifchild);
                          waitpid(ifchild, NULL, 0);
+                         syslog(LOG_ERR,"child is %i and
done",ifchild);
                         }

                         sendresult = sendfile('\3', last+1);
@@ -967,6 +975,7 @@
                return(REPRINT);
        }
 pass2:
+       syslog(LOG_ERR,"pass2");
        /*
         * pass 2
         */

Comment 10 Gene Czarcinski 1999-10-11 21:14:59 UTC
Oops ... got interrupted ...

I should have said that a "timing problem" may be implied.

This is going to be a headache to debug.

Gene

Comment 11 Bill Nottingham 1999-10-13 18:04:59 UTC
What are the contents of one of the leftover control files?
Is there a datafile left that corresponds to it?

Comment 12 Gene Czarcinski 1999-10-18 02:01:59 UTC
Bug fixed by Bill in lpr-0.43-1.  Works fine now.

Gene

Comment 13 Bill Nottingham 1999-10-18 03:21:59 UTC
Fixed in the errata release.


Note You need to log in before you can comment on or make changes to this bug.