Bug 426656

Summary: gs burns CPU reading print jobs 1 byte at a time
Product: [Fedora] Fedora Reporter: Daniel Berrangé <berrange>
Component: ghostscript Assignee: Tim Waugh <twaugh>
Status: CLOSED DUPLICATE QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: low Docs Contact:
Priority: low    
Version: 8   
Target Milestone: ---   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2008-01-02 16:51:37 UTC Type: ---
Attachments:
Description Flags
Make gs read data in large chunks when STDIN is not a tty none

Description Daniel Berrangé 2007-12-23 22:44:48 UTC
Description of problem:
Printing large documents takes an unreasonably long time. E.g. an A4-sized
photo comes in as a 200 MB PostScript file, and gs takes approx. 15 minutes at
100% CPU before the printer even gets a single byte of data. Stracing the 'gs'
binary shows it reading the PostScript file 1 byte at a time from stdin. Clearly
this is sub-optimal.

Version-Release number of selected component (if applicable):
ghostscript-8.61-5.fc8

How reproducible:
Always

Steps to Reproduce:
1. Create a large PostScript file, e.g. demo.ps
2. Run

 time sh -c "cat demo.ps | /usr/bin/gs -r600 -g5100x6600 -q -dNOPROMPT -dSAFER \
-sDEVICE=ppmraw -sOutputFile=- - | cat > demo.ppm"

3. strace the 'gs' process
  
Actual results:
read(0, "2", 1)                         = 1
read(0, " ", 1)                         = 1
read(0, "2", 1)                         = 1
read(0, "7", 1)                         = 1
read(0, " ", 1)                         = 1
read(0, "r", 1)                         = 1
read(0, "G", 1)                         = 1
read(0, "\n", 1)                        = 1
read(0, "3", 1)                         = 1
read(0, "4", 1)                         = 1
read(0, "0", 1)                         = 1
read(0, "3", 1)                         = 1

It ran for 20 minutes before I gave up waiting for it to finish.


Expected results:
It should read in chunks of at least 1024 bytes:

read(0, "0 7666 5 2 rf\n8 3 1 rG\n5755 7666"..., 1024) = 1024
read(0, "6 4 2 rf\n16 6 3 rG\n5918 7666 5 2"..., 1024) = 1024
read(0, "\n24 9 4 rG\n6067 7666 3 2 rf\n16 6"..., 1024) = 1024
read(0, "16 6 3 rG\n79 7663 3 3 rf\nK\n82 76"..., 1024) = 1024
read(0, "223 7663 10 3 rf\n8 3 1 rG\n233 76"..., 1024) = 1024

The total runtime is < 5 minutes.
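
To make the difference between the two traces concrete, here is a minimal sketch (not Ghostscript code) that drains a file descriptor with a given request size and counts how many read() calls it takes. The helper name count_reads is made up for illustration. For a 200 MB input, a request size of 1 means roughly 200 million system calls, while 1024-byte requests need about 1/1024th of that.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper for illustration only.
 * Drain fd using read() requests of at most `chunk` bytes.
 * Returns the number of read() calls that returned data, or -1 on error.
 * The total number of bytes read is stored in *total. */
long count_reads(int fd, size_t chunk, size_t *total)
{
    assert(chunk > 0);

    char *buf = malloc(chunk);
    if (buf == NULL)
        return -1;

    long calls = 0;
    *total = 0;
    for (;;) {
        ssize_t n = read(fd, buf, chunk);
        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted by a signal: retry */
            free(buf);
            return -1;
        }
        if (n == 0)
            break;              /* end of input */
        calls++;
        *total += (size_t)n;
    }
    free(buf);
    return calls;
}
```

Calling count_reads(0, 1, &total) on a piped file mirrors the "Actual results" trace (one call per byte), and count_reads(0, 1024, &total) mirrors the "Expected results" trace.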

Additional info:

Comment 1 Daniel Berrangé 2007-12-23 22:52:00 UTC
Attaching GDB to the 'gs' process showed that the single-byte reads come from
the "s_stdin_read_process" function in src/ziodevsc.c

This bit of code:

    if (mem->gs_lib_ctx->stdin_fn)
        count = (*mem->gs_lib_ctx->stdin_fn)
            (mem->gs_lib_ctx->caller_handle, (char *)pw->ptr + 1,
             mem->gs_lib_ctx->stdin_is_interactive ? 1 : wcount);

Note that if stdin_is_interactive is true, it requests only a single byte per
call; otherwise it requests wcount bytes.

Searching for where this variable is set leads to the 'swproc' function in
src/imainarg.c

    switch (sw) {
	default:
	    return 1;
	case 0:		/* read stdin as a file char-by-char */
	    /* This is a ******HACK****** for Ghostview. */
	    minst->heap->gs_lib_ctx->stdin_is_interactive = true;


The 'sw' character being switched on here is derived from the input filename
argument, and this case sets the 'stdin_is_interactive' flag to true if the
input filename is '-', as some unspecified hack for Ghostview.

My printer driver calls 'gs' with a filename of '-' as part of its printing
pipeline, so the effect is that all my print jobs get processed 1 byte at a time
and take more than 20 minutes to print a single A4-size photograph.


Comment 2 Daniel Berrangé 2007-12-23 22:55:29 UTC
Created attachment 290312 [details]
Make gs read data in large chunks when STDIN is not a tty

It's unclear why the Ghostview hack is necessary, but the code looks as if it
is trying to use a filename of '-' to guess that it is being run interactively.
The more usual way to do this is to look at the STDIN file handle and see
whether it is connected to a TTY. If it is connected to a TTY, then it is
interactive; if not, it is a shell pipeline. POSIX provides a convenient
'isatty' API for figuring this out. The attached patch uses this function so
that gs only switches to '1 byte at a time' mode if it really is on a TTY. The
result is that my A4 photo print jobs now complete in < 5 minutes, instead of >
20.
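
For readers without access to the attachment, the idea can be sketched roughly as follows. This is an illustration of the isatty approach only, not the contents of the attached patch; the helper name stdin_request_size is made up, and wcount stands in for the buffer space available, as in the s_stdin_read_process snippet quoted in comment 1.

```c
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper, not taken from the Ghostscript sources or the patch.
 * Decide how many bytes to request from fd in one read:
 *   - fd is a terminal: 1 byte, so interactive input is seen immediately;
 *   - otherwise (pipe, file): the full wcount bytes of buffer space. */
size_t stdin_request_size(int fd, size_t wcount)
{
    assert(wcount > 0);
    return isatty(fd) ? 1 : wcount;
}
```

With a check like this, an invocation such as "cat demo.ps | gs ... -" would get large reads because STDIN is a pipe, while running gs directly at a terminal would keep the byte-at-a-time behaviour.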

Comment 3 Tim Waugh 2008-01-02 16:51:37 UTC

*** This bug has been marked as a duplicate of 416881 ***