I have checked the current bugs listed in bugtraq and this not exact match so I am submitting it as a new problem (it may be related). This bug occurs on a base 6.0 cartman installation as well as a rawhide updated system (with all new fixes including lpr-0.41-3). I have a "JetDirect" postscript printer configured on 192.168.1.100. This was configured using "printtool" since the graphical install does not configure a printer. The problem is that files in the print queue are not being "cleaned up" after the job has printed -- but ONLY for even numbered jobs! ... the odd numbered job cleans the queue for both that job AND the previous clutter. After printing a job, I get the following in response to the "lpq" command: chaos: Warning, no daemon present Rank Owner Job Files Total Size 1st root 0 (standard input) 0 bytes JetDirect lpd: no entries Looking into /var/log/messages, I found the following entry: ... lpd [791]: error unlinking dfA000chaos: 2 ... lpd [791]: lp: job could not be sent to remote host (cfA000cqyD21v) All jobs printed. I did see once where the queue seemed to "jam up" but could not repeat this problem. This problem looks similar to one I previously reported where email was sent for during file delete. Since this problem does NOT occur on my 6.0 system, I also tried replacing lpr with the one from my 6.0 system (lpr-0.39-2). This made NO difference the the problem on 6.1 which occured the same way. Gene
Can you try the lpr RPMs at http://people.redhat.com/notting/lpr/ -or- ftp://people.redhat.com/notting/lpr/ and let me know if they work for you?
Tested lpr-042-1 on my "fixes from rawhide" system (not the plain base) .. sorry, no change. I still get the "cluttered" queue. Gene
Hmm.... could you try restarting the new lpd on a *clean* queue and seeing if it works from there? Make sure you kill all previous lpd processes (killall -9 lpd should do it.) What does your /etc/printcap look like?
Out of curiousity, what happens if you copy the 'master-filter' file from an older rhs-printfilters release to /var/spool/lpd/lp/filter?
1. to clarify, the system I am testing on is 6.1 plus all updates from rawhide-19991007 except kdmps which should not effect anything). However, the problem appears identical on 6.1 install from the iso image. 2. rebooted the after upgrading to lpd-042-1 to make sure everything was clean ... no change. 3. /var/spool/lpd/lp/filter on both my "good" 6.0 system (lpr-0.39-2) and 6.1 (lpr-0.42-1) are identical (I will forward a copy via email). 4. Since installing lpr-0.39-2 on the 6.1 system still has the problem, it must be something that has been there all along and only comes to light under 6.1. If you have any other ideas or things you want me to test, just ask. I plan to "instument" the lpd code from lpr-0.42-1 since this was the only way I could find the last problem that I was getting. Gene
What is the permmissions on /var/spool/lpd/lp, and what are the permissions on lpd?
What are the permissions on the lpd binary? Is /var/spool/lpd/ NFS mounted?
I guess my email with the attachment was not clear -- on the 6.1 system /usr/sbin/lpd is -rwxr--r-- 1 root root 49244 Oct 11 13:34 /usr/sbin/lpd This is on a separate partition (from /) of an ide drive. It is NOT NFS mounted. Some additional info ... 1. I am running on dual 400MHz PII processors with 256MB ram. I thought there might be a timing issue (see below) so I rebooted with the UP kernel and mem=32M ... no change. 2. I ran tail -f /var/log/messages which printing files and noticed the following: - print a file (a2ps /etc/resolv.conf) ... file prints ... NO log messages ... Queue is "cluttered with the "control file" (CF___) ... lpd daemon still "running" - print another file (a2ps /etc/hosts) ... eror messages in log (error unlinking, etc) for the control file from the FIRST job to be printed! ... Queue uncluttered ... additional messages in log: lpd shutdown succeeded lpd startup succeeded It appears to me that the "first time through", lpd "hangs" and when the "second time through" cleans up the queue, the original process finally terminates. This is going to be a real turkey to troubleshoot unless someone gets a brainstorm of an idea. I have not "instrumented" anything yet but it looks like the "sendit" function in printjob.c is the focus. Gene
OK, I have been playing with lpd and have come up with the following patch which gets rid of the problem (remember, I did NOT say it fixed anything) but does introduce a lot of trash into syslog. Without the update around "197", lpd is still broken. With ONLY the "197" update, lpd is still broken. This is not a fix! However, this patch may imply that a time. I also noticed the following: lpd working (patch applied): each file printed had a different process ID. lpd NOT working (patch not applied or all of patch not applied): file processing had a single process ID for a pair of files ... after cleanup, a new process ID was in use. The process IDs are the "sub" processes. Gene --- lpr-0.42.orig/lpd/printjob.c Fri Oct 8 11:43:09 1999 +++ lpr-0.42/lpd/printjob.c Mon Oct 11 15:52:15 1999 @@ -197,10 +197,12 @@ i = strlen(line); if (write(lfd, line, i) != i) syslog(LOG_ERR, "%s: %s: %m", printer, LO); + syslog(LOG_ERR,"Processing %s **BEGIN**",q->q_name); if (!remote) i = printit(q->q_name); else i = sendit(q->q_name); + syslog(LOG_ERR,"Processing %s **DONE, rc=%d**",q->q_name,i); /* * Check to see if we are supposed to stop printing or * if we are to rebuild the queue. @@ -760,6 +762,7 @@ int ii; + syslog(LOG_ERR, "Processing file=%s",file); /* * open control file */ @@ -812,7 +815,9 @@ * pass 1 */ while (getline(cfp)) { + syslog(LOG_ERR,"new line=%s",line); again: + syslog(LOG_ERR,"--- line=%s",line); if (line[0] == 'S') { if (dup_cfp) fprintf(dup_cfp,"%s\n",line); @@ -904,7 +909,8 @@ fi = open(tmpfile, O_RDONLY); ifi = open(last+1,O_WRONLY|O_CREAT|O_TRUNC, 0664); if ((ifchild = doforkuser(DORETURN,euser)) == 0) { /*child*/ - dup2(fi, 0); /* input file is stdin */ + syslog(LOG_ERR,"child for %s",IF); + dup2(fi, 0); /* input file is stdin */ dup2(ifi, 1); /* temp file is stdout */ for (ii = 3; ii < NOFILE; ii++) (void) close(ii); @@ -916,7 +922,9 @@ close(ifi); close(fi); /* wait on it to finish */ + syslog(LOG_ERR,"child is %i and waiting",ifchild); waitpid(ifchild, NULL, 0); + syslog(LOG_ERR,"child is %i and done",ifchild); } sendresult = sendfile('\3', last+1); @@ -967,6 +975,7 @@ return(REPRINT); } pass2: + syslog(LOG_ERR,"pass2"); /* * pass 2 */
Oops ... got interrupted ... I should have said that a "timing problem" may be implied. This is going to be a headache to debug. Gene
What are the contents of one of the leftover control files? Is there a datafile left that corresponds to it?
Bug fixed by Bill in lpr-0.43-1. Works fine now. Gene
Fixed in the errata release.