From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.2.1) Gecko/20030225

Description of problem:
There's a bug in the select() call at line 1263. It causes an invalid read just beyond the fdset variable. The problem turns out to be the way fdset is sized. On BSD systems the code at line 1252 is correct. Linux systems define fd_set with a different size. The correct way to do this is to change line 1252 to:

fdsetsz = sizeof(fd_set);

Version-Release number of selected component (if applicable):
openssh-3.5p1

How reproducible:
Always

Steps to Reproduce:
1. valgrind --leak-check=yes --leak-resolution=high --num-callers=8 -v --logfile-fd=19 /usr/src/redhat/BUILD/openssh-3.5p1/sshd 19>out
2. ssh -l localhost root
3. /etc/rc.d/init.d/stop
4. vi out

Actual Results:
valgrind reports an invalid read in select @ 1263

Additional info:
Created attachment 91747
Select bug patch

The attached patch fixes the bug.
I added this statement to the sshd.c code at line 1263:

error("fdsetsz is: %d, fd_set is: %d", fdsetsz, sizeof(fd_set));

When I look in the logs, this is what I get:

sshd[10790]: error: fdsetsz is: 4, fd_set is: 128

This is a big difference. I also grepped for select throughout the package and the same pattern is everywhere. This could be causing a lot of intermittent problems that result in connections being terminated or never starting, since select is given bad descriptor sets to work with. I've been experiencing sshd connection problems which may be related to this. I've contacted the openssh team to get their feedback.

I forgot to paste a valgrind trace in the original report, so here it is:

==10790== Invalid read of size 4
==10790==    at 0x40170B7D: vgAllRoadsLeadToRome_select (vg_intercept.c:612)
==10790==    by 0x40170DF2: __select (vg_intercept.c:681)
==10790==    by 0x804E4EC: main (sshd.c:1264)
==10790==    by 0x403DC5CC: __libc_start_main (in /lib/libc-2.3.2.so)
==10790==    by 0x804C560: ??? (start.S:81)
==10790==    Address 0x41363BFC is 0 bytes after a block of size 4 alloc'd
==10790==    at 0x4015E40C: malloc (vg_clientfuncs.c:103)
==10790==    by 0x8075FAA: xmalloc (xmalloc.c:28)
==10790==    by 0x804E429: main (sshd.c:1253)
==10790==    by 0x403DC5CC: __libc_start_main (in /lib/libc-2.3.2.so)
==10790==    by 0x804C560: ??? (start.S:81)

I'm elevating this problem from normal to high since it is all over the package and the size difference is significant (4 vs 128). This also means my patch only fixes one case and there are certainly more. Hope this helps...
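The two sizes in that log line can be reproduced outside sshd. The sketch below assumes that fdsetsz is derived from the highest descriptor in use with the BSD howmany() idiom, which is an assumption about sshd.c and not something quoted in this report. The helper name fdset_bytes_for is hypothetical. Under that assumption, a 32-bit fd_mask and a highest descriptor below 32 give one word (4 bytes), while sizeof(fd_set) covers all 1024 descriptors (128 bytes).

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE          /* exposes fd_mask and NFDBITS on glibc */
#endif
#include <assert.h>
#include <stddef.h>
#include <sys/param.h>           /* howmany() on most systems */
#include <sys/select.h>

#ifndef howmany                  /* fallback if the platform lacks it */
#define howmany(x, y) (((x) + ((y) - 1)) / (y))
#endif

/*
 * Hypothetical helper: number of bytes needed to hold descriptor bits
 * 0..maxfd, rounded up to whole fd_mask words. This mirrors the sizing
 * idiom assumed above; it is not copied from sshd.c.
 */
size_t fdset_bytes_for(int maxfd)
{
    assert(maxfd >= 0 && maxfd < FD_SETSIZE);
    return (size_t)howmany(maxfd + 1, NFDBITS) * sizeof(fd_mask);
}
```

Calling fdset_bytes_for() with a small descriptor number returns a single word, and calling it with FD_SETSIZE - 1 returns the same value as sizeof(fd_set), so the two figures differ only in how many descriptors they are sized for.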
I discussed this finding with Damien Miller, and he explained what the OpenSSH code is doing. The select(2) man page on OpenBSD is a bit more informative than the select(2) man page on Linux. I cross-checked it by reading the Linux kernel source code and have concluded that the bug lies with Valgrind. I fixed Valgrind and submitted a patch to that project. This bug should be closed.
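A minimal sketch of why a descriptor set smaller than sizeof(fd_set) can still be valid input to select(): the call only examines descriptors 0 through nfds - 1, so a buffer sized for that range is sufficient. The function name select_on_sized_set and the pipe-based setup are hypothetical and exist only for illustration; this is not the sshd code.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE          /* exposes fd_mask and NFDBITS on glibc */
#endif
#include <assert.h>
#include <stdlib.h>
#include <sys/param.h>           /* howmany() on most systems */
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

#ifndef howmany                  /* fallback if the platform lacks it */
#define howmany(x, y) (((x) + ((y) - 1)) / (y))
#endif

/*
 * Hypothetical demo: make a pipe readable, then select() on a heap
 * buffer sized only for descriptors 0..maxfd.
 * Returns 1 if the read end is reported readable, 0 if not, -1 on error.
 */
int select_on_sized_set(void)
{
    int p[2];
    int result = -1;

    if (pipe(p) != 0)
        return -1;

    if (write(p[1], "x", 1) == 1) {
        int maxfd = p[0];
        assert(maxfd < FD_SETSIZE);

        /* Whole fd_mask words covering bits 0..maxfd, zero-filled. */
        size_t sz = (size_t)howmany(maxfd + 1, NFDBITS) * sizeof(fd_mask);
        fd_set *set = (fd_set *)calloc(1, sz);

        if (set != NULL) {
            struct timeval tv = { 1, 0 };   /* do not block forever */
            FD_SET(p[0], set);

            /* nfds = maxfd + 1 limits which bits select() looks at. */
            int n = select(maxfd + 1, set, NULL, NULL, &tv);
            if (n >= 0)
                result = (n == 1 && FD_ISSET(p[0], set)) ? 1 : 0;
            free(set);
        }
    }

    close(p[0]);
    close(p[1]);
    return result;
}
```

A memory checker that assumes select() always reads a full sizeof(fd_set) would flag the call above even though the buffer covers every descriptor the call is asked to examine, which is consistent with the conclusion reached in this comment.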