Bug 7150 - Reporting missing glibc buffering in Redhat 6.0
Status: CLOSED NOTABUG
Product: Red Hat Linux
Classification: Retired
Component: glibc
Version: 6.0
Hardware: i386 Linux
Priority: medium
Severity: high
Assigned To: Cristian Gafton
URL: www.adobe.com
Reported: 1999-11-19 13:21 EST by jensen
Modified: 2008-05-01 11:37 EDT
Last Closed: 2000-01-04 21:42:35 EST

Description jensen 1999-11-19 13:21:29 EST
Dear Redhat support person,

We would like to report a problem regarding missing
glibc buffering in Redhat 6.0. The problem has been
analyzed by one of my colleagues here at Adobe. The
full report is attached here:




===> begin

I've attached both a performance analysis comparing
Linux 2.0.X with 2.2.X and with NT 4.0, all on the
same system, and the program used to collect the data.

The problem statement is:

Contrary to the C language specification, glibc does
not buffer file input when the input file is opened
with standard/default options. The standard says that
I/O is buffered by default, that the buffer size is
given by BUFSIZ and must be at least 256 bytes. File
buffer size may be altered by use of setbuf or setvbuf.
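
A program that does not want to depend on the library's default can request
buffering explicitly. The following is only a minimal sketch of that idea;
the helper name open_fully_buffered is hypothetical and is not part of the
attached test program.

```c
#include <stdio.h>

/* Hypothetical helper: open `path` for reading and explicitly request a
 * fully buffered stream with a library-allocated buffer of BUFSIZ bytes.
 * Returns NULL if the file cannot be opened or the request is refused. */
FILE *open_fully_buffered(const char *path)
{
    FILE *in = fopen(path, "r");

    if (in == NULL)
        return NULL;
    /* setvbuf must be called before any other operation on the stream. */
    if (setvbuf(in, NULL, _IOFBF, BUFSIZ) != 0) {
        fclose(in);
        return NULL;
    }
    return in;
}
```

Reads issued through the returned stream with fread or fgetc are then served
from the stdio buffer whenever possible.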

This results in performance that is slower for 2.2.X
than for 2.0.X. 2.0.X is faster than NT 4.0;
2.2.X is slower than NT 4.0.

Time in seconds:

2.2.5     NT 4.0  2.0.36  Test
========================  ==================================
  52           6       5  16Mb file, read size 1 byte
  63          <1      <1  24Kb file (NFS), read size 1 byte

The performance degradation is more significant for
small read operations and disastrous when the target
of the read is an NFS mounted file.


First, behaviour before the new libc. I used an old system, linked with the
static library so that the performance could be tested on the new kernel,
but more importantly, on the same hardware.

Test - using static library compile on 2.0.36 running on 2.2.5-15smp
       Target file is /disks/flex/apd2/dowling/test/buf1  (NFS)

Read Size  8192, Bytes read     24576, Time        0
Read Size  4096, Bytes read     24576, Time        0
Read Size  2048, Bytes read     24576, Time        0
Read Size  1024, Bytes read     24576, Time        0
Read Size   512, Bytes read     24576, Time        0
Read Size   256, Bytes read     24576, Time        0
Read Size   128, Bytes read     24576, Time        0
Read Size    64, Bytes read     24576, Time        1
Read Size    32, Bytes read     24576, Time        0
Read Size    16, Bytes read     24576, Time        0
Read Size     8, Bytes read     24576, Time        0
Read Size     4, Bytes read     24576, Time        0
Read Size     2, Bytes read     24576, Time        0
Read Size     1, Bytes read     24576, Time        0

=================
New libc - static to show that it isn't a static/shared problem

Test - using static library compile on 2.2.5-15smp running on 2.2.5-15smp
       Target file is /disks/flex/apd2/dowling/test/buf1  (NFS)

Read Size  8192, Bytes read     24576, Time        0
Read Size  4096, Bytes read     24576, Time        0
Read Size  2048, Bytes read     24576, Time        0
Read Size  1024, Bytes read     24576, Time        0
Read Size   512, Bytes read     24576, Time        0
Read Size   256, Bytes read     24576, Time        0
Read Size   128, Bytes read     24576, Time        0
Read Size    64, Bytes read     24576, Time        1
Read Size    32, Bytes read     24576, Time        2
Read Size    16, Bytes read     24576, Time        4
Read Size     8, Bytes read     24576, Time        7
Read Size     4, Bytes read     24576, Time       15
Read Size     2, Bytes read     24576, Time       32
Read Size     1, Bytes read     24576, Time       63

==================
New libc - shared - performance is essentially identical to
static case - the expected result.

Test - using shared library compile on 2.2.5-15smp running on 2.2.5-15smp
       Target file is /disks/flex/apd2/dowling/test/buf1  (NFS)

Read Size  8192, Bytes read     24576, Time        1
Read Size  4096, Bytes read     24576, Time        0
Read Size  2048, Bytes read     24576, Time        0
Read Size  1024, Bytes read     24576, Time        0
Read Size   512, Bytes read     24576, Time        0
Read Size   256, Bytes read     24576, Time        0
Read Size   128, Bytes read     24576, Time        1
Read Size    64, Bytes read     24576, Time        0
Read Size    32, Bytes read     24576, Time        2
Read Size    16, Bytes read     24576, Time        4
Read Size     8, Bytes read     24576, Time        7
Read Size     4, Bytes read     24576, Time       15
Read Size     2, Bytes read     24576, Time       31
Read Size     1, Bytes read     24576, Time       61

==========================================================
==========================================================
Using local (/tmp) file with much larger file.
This shows that the penalty imposed by the new libc
is significant (about 10X) even for local files.

=======
Old libc

Test - using static library compile on 2.0.36 running on 2.2.5-15smp
       Target file is /tmp/buf1
Read Size  8192, Bytes read  16777216, Time        1
Read Size  4096, Bytes read  16777216, Time        0
Read Size  2048, Bytes read  16777216, Time        1
Read Size  1024, Bytes read  16777216, Time        0
Read Size   512, Bytes read  16777216, Time        0
Read Size   256, Bytes read  16777216, Time        0
Read Size   128, Bytes read  16777216, Time        1
Read Size    64, Bytes read  16777216, Time        0
Read Size    32, Bytes read  16777216, Time        1
Read Size    16, Bytes read  16777216, Time        0
Read Size     8, Bytes read  16777216, Time        1
Read Size     4, Bytes read  16777216, Time        2
Read Size     2, Bytes read  16777216, Time        3
Read Size     1, Bytes read  16777216, Time        5


========
Test - using shared library compile on 2.2.5-15smp running on 2.2.5-15smp
       Target file is /tmp/buf1

Read Size  8192, Bytes read  16777216, Time        1
Read Size  4096, Bytes read  16777216, Time        0
Read Size  2048, Bytes read  16777216, Time        0
Read Size  1024, Bytes read  16777216, Time        0
Read Size   512, Bytes read  16777216, Time        1
Read Size   256, Bytes read  16777216, Time        0
Read Size   128, Bytes read  16777216, Time        1
Read Size    64, Bytes read  16777216, Time        1
Read Size    32, Bytes read  16777216, Time        2
Read Size    16, Bytes read  16777216, Time        4
Read Size     8, Bytes read  16777216, Time        7
Read Size     4, Bytes read  16777216, Time       13
Read Size     2, Bytes read  16777216, Time       26
Read Size     1, Bytes read  16777216, Time       52

=========================================================================
=========================================================================

Test run on NT 4.0 Server  - this is the same hardware as above
                             this machine is dual boot.

=========================================================================

Test - using VC 5.0   target is /disks/flex/apd2/dowling/test/buf1

Read Size   512, Bytes read     24576, Time        0
Read Size   256, Bytes read     24576, Time        0
Read Size   128, Bytes read     24576, Time        0
Read Size    64, Bytes read     24576, Time        0
Read Size    32, Bytes read     24576, Time        0
Read Size    16, Bytes read     24576, Time        0
Read Size     8, Bytes read     24576, Time        0
Read Size     4, Bytes read     24576, Time        0
Read Size     2, Bytes read     24576, Time        0
Read Size     1, Bytes read     24576, Time        0


============

Test - using VC 5.0  target is c:\buf1

Read Size   512, Bytes read  16777216, Time        2
Read Size   256, Bytes read  16777216, Time        1
Read Size   128, Bytes read  16777216, Time        1
Read Size    64, Bytes read  16777216, Time        1
Read Size    32, Bytes read  16777216, Time        1
Read Size    16, Bytes read  16777216, Time        2
Read Size     8, Bytes read  16777216, Time        1
Read Size     4, Bytes read  16777216, Time        3
Read Size     2, Bytes read  16777216, Time        3
Read Size     1, Bytes read  16777216, Time        6


Note that BUFSIZ for VC 5.0 is 512. This should actually be a
disadvantage for NT.

[ text/plain ] :

#include <stdio.h>
#include <stdlib.h>
#include <time.h>

char buf[BUFSIZ];

int main(int argc, char **argv)
{
    FILE *in;
    int i, k, l, n;
    time_t begin, end;
    double difference;

    n = BUFSIZ;

    if (argc != 2) {
        fprintf(stderr, "Usage: buff_test tempfilename\n");
        return 1;
    }
    /* Refuse to overwrite a file that already exists. */
    in = fopen(argv[1], "r");
    if (in != NULL) {
        fprintf(stderr, "tempfile: %s, already exists\n", argv[1]);
        fclose(in);
        return 1;
    }
    /* Create the test file: 48 blocks of BUFSIZ bytes each. */
    in = fopen(argv[1], "w");
    if (in == NULL) {
        fprintf(stderr, "Unable to open file: %s, for write\n", argv[1]);
        return 1;
    }
    for (i = 0; i < 48; i++) {
        k = fwrite(buf, n, 1, in);
        if (k != 1) {
            fprintf(stderr, "Error writing to: %s\n", argv[1]);
            return 1;
        }
    }
    fclose(in);
    /* Re-read the whole file with default stdio buffering,
       halving the fread size on each pass down to 1 byte. */
    for (;;) {
        in = fopen(argv[1], "r");
        if (in == NULL)
            return 1;
        l = 0;
        begin = time(NULL);
        for (;;) {
            k = fread(buf, n, 1, in);
            if (k != 1)
                break;
            l += n;
        }
        end = time(NULL);
        fclose(in);
        difference = difftime(end, begin);
        printf("Read Size %5d, Bytes read %9d, Time %8g\n",
            n, l, difference);
        if (n == 1)
            break;
        n /= 2;
    }
    remove(argv[1]);
    return 0;
}



--
Freddy Jensen, Sr. Computer Scientist, Adobe Systems Incorporated
345 Park Avenue, San Jose, CA 95110-2704, Phone 408 536-2869 / 536-6000
Email: jensen@adobe.com, URL: http://www.adobe.com
--
Comment 1 Chris Siebenmann 1999-11-20 05:43:59 EST
For additional supporting evidence one can strace the test
program. On a 6.1-based system here this shows that the program
is actually issuing tiny read() syscalls; on a 5.1-based system
the smallest the read() gets is 4096 bytes.
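
Where strace is not available, a similar check can be sketched inside a
program. The code below is an illustration only, not something from this
report: it assumes a C library that provides the GNU extension fopencookie,
and the names counted_src, counted_read and count_low_level_reads are
hypothetical. It counts how many low-level read requests stdio makes while
the caller reads one byte at a time.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE          /* fopencookie is a GNU extension */
#endif
#include <stdio.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical in-memory source that counts how often stdio asks for data. */
struct counted_src {
    size_t remaining;        /* bytes still to be delivered */
    long   calls;            /* low-level read requests seen so far */
};

/* Read callback: stands in for the read() syscall of a real file. */
static ssize_t counted_read(void *cookie, char *dst, size_t size)
{
    struct counted_src *src = cookie;
    size_t n = size < src->remaining ? size : src->remaining;

    src->calls++;
    memset(dst, 'x', n);
    src->remaining -= n;
    return (ssize_t)n;       /* 0 signals end of file */
}

/* Read `total` bytes one byte at a time through stdio and return the number
 * of low-level read requests this caused, or -1 on failure.
 * `mode` is _IOFBF (fully buffered) or _IONBF (unbuffered). */
long count_low_level_reads(size_t total, int mode)
{
    struct counted_src src = { total, 0 };
    cookie_io_functions_t io = { .read = counted_read };
    FILE *in = fopencookie(&src, "r", io);
    char byte;

    if (in == NULL)
        return -1;
    if (setvbuf(in, NULL, mode, BUFSIZ) != 0) {
        fclose(in);
        return -1;
    }
    while (fread(&byte, 1, 1, in) == 1)
        ;                    /* consume the stream byte by byte */
    fclose(in);
    return src.calls;
}
```

Comparing count_low_level_reads(4096, _IOFBF) with
count_low_level_reads(4096, _IONBF) should show far fewer requests in the
buffered case, which is the pattern the strace output distinguishes.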
Comment 2 Cristian Gafton 2000-01-04 21:42:59 EST
The test is completely bogus. libc5 (old libc) is using a buffer size of 1024,
whereas glibc is using a buffer size of 8192. There is no comparison between the
amount of work that the kernel needs to do to read 1024 versus 8192 bytes at a
time.

Change your test program to use 1024 instead of BUFSIZ for read sizes, and
immediately after the fopen() call do a
	setbuffer(in, buf, 1024)

Secondly, the fread in glibc is a thread-safe function, which means that it has
to go through a lot of stuff to ensure proper locking et al. Use
fread_unlocked if you don't need locking on the IO operations in this test.

Once you do that you will see performance approaching what you have obtained with
the old-libc statically compiled binary. It will still be slower (about 25-30%),
but that is a far cry from the tenfold slowdown you got as a result of your
test.

And considering the added functionality, thread safety and increased complexity
of handling a syscall in newer kernels the performance penalty is not bad at
all.
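
The two changes suggested in this comment could be sketched roughly as
follows. This is an assumption-laden illustration, not the maintainer's code:
the helper name read_with_small_buffer is hypothetical, setbuffer and
fread_unlocked are BSD/GNU extensions rather than ISO C, and the sketch uses
a separate array for the stdio buffer instead of reusing the array that
fread fills.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE      /* setbuffer and fread_unlocked are extensions */
#endif
#include <stdio.h>

static char stdio_buf[1024]; /* buffer handed to stdio via setbuffer */
static char data[1024];      /* separate destination for fread_unlocked */

/* Hypothetical helper: read `path` in `chunk`-byte pieces (1 <= chunk <= 1024)
 * using a 1024-byte stdio buffer and no per-call stream locking.
 * Returns the number of bytes read in whole chunks, or -1 on error.
 * Only appropriate when no other thread touches the stream. */
long read_with_small_buffer(const char *path, size_t chunk)
{
    FILE *in;
    long total = 0;

    if (chunk == 0 || chunk > sizeof data)
        return -1;
    in = fopen(path, "r");
    if (in == NULL)
        return -1;
    /* Must be done immediately after fopen, before any read. */
    setbuffer(in, stdio_buf, sizeof stdio_buf);
    while (fread_unlocked(data, chunk, 1, in) == 1)
        total += (long)chunk;
    fclose(in);
    return total;
}
```

Timing calls such as read_with_small_buffer(path, 1) against the unmodified
test program would show how much of the slowdown these two changes recover
on a given system.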
Comment 3 dowling 2000-01-05 12:32:59 EST
gafton@redhat.com completely misses the point. The current implementation
of glibc is in violation of the ANSI C standard since it does not
provide at least BUFSIZ buffering by default.

Note that a "workaround" that requires adding code is not acceptable.
The program that caused us to notice the problem was gcc!

The assertion that "thread-safe" has to be that much slower is just
"bogus" and lazy.
Comment 4 jensen 2000-01-05 13:38:59 EST
The point is that the ANSI C standard requires at least
BUFSIZ buffering by default.

A "workaround" does not resolve the issue at all. We discovered the
problem by observing the behavior of gcc and cp!

"Thread-safe" performance issues as an excuse is just that, an excuse.
Output is buffered by default and if the workaround is used (setvbuf)
performance is OK. Is gafton@redhat.com asserting:

    1) buffered output is not thread-safe?

    2) input with the use of setvbuf is not thread-safe?

There are really quite a few problems with the new glibc, not the
least of which is its unnecessary intrusions into variable name space,
again in violation of the ANSI C standard.
