The Problem:

I have a good working expect script which runs fine on SunOS 4.1.3, Solaris 2.5.1, and Suse Linux 6.2. This script spawns a telnet to another machine, logs in, and executes some commands. When I run the exact same expect script on my brand new Dell PIII 450 PowerEdge Server "big ole honkin' PC" running Dell factory-installed RedHat 6.0, the script dumps core.

The Investigation:

1. Expect's internal debugging is not much help. It gives:

spawn telnet 145.1.1.12
parent: waiting for sync byte
parent: telling child to go ahead
parent: sync byte write: broken pipe

2. I run "gdb /usr/bin/expect core" on the core file and I get the following output:

*********************************************
Copyright 1998 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "i386-redhat-linux"...
(no debugging symbols found)...
warning: core file may not match specified executable file.
Core was generated by `expect -- ./sms_message.exp'.
Program terminated with signal 11, Segmentation fault.
Reading symbols from /usr/lib/libexpect5.28.so...
(no debugging symbols found)...done.
Reading symbols from /usr/lib/libtcl8.0.so...done.
Reading symbols from /lib/libdl.so.2...done.
Reading symbols from /lib/libm.so.6...done.
Reading symbols from /lib/libutil.so.1...done.
Reading symbols from /lib/libc.so.6...done.
Reading symbols from /lib/ld-linux.so.2...done.
#0 0x40106fea in _IO_vfprintf (s=0xbfff9f68,
   format=0x4003481d "ioctl(%s,I_PUSH,\"ptem\") = %s\n", ap=0xbfffc6e4)
   at vfprintf.c:1248
vfprintf.c:1248: No such file or directory.
(gdb)
*********************************************

3.
I am not familiar with the I_PUSH flag, so I search for it in /usr/src/linux-2.2.5:

find /usr/src/linux-2.2.5 -follow -type f -exec grep -l I_PUSH {} \;
/usr/src/linux-2.2.5/arch/sparc64/solaris/ioctl.c
/usr/src/linux-2.2.5/drivers/sgi/char/streamable.c
/usr/src/linux-2.2.5/ibcs/ChangeLog
/usr/src/linux-2.2.5/ibcs/devtrace/devtrace.c
/usr/src/linux-2.2.5/ibcs/iBCSemul/ioctl.c

The only references I find are for Sparc64 and SGI, but I'm running on Intel!! I'm ignoring the ibcs stuff, because I'm running expect scripts...

4. I run "strace -o strace.out ./sms_message.exp", and it says (down at the end of the output):

*********************************************
open("/dev/ptmx", O_RDWR) = 7
ioctl(7, TCGETS, {B38400 opost isig icanon echo ...}) = 0
ioctl(7, 0x80045430, 0xbffff258) = 0
stat("/dev/pts/3", {st_mode=S_IFREG|S_ISUID|03, st_size=0, ...}) = 0
ioctl(7, 0x40045431, 0xbffff2e4) = 0
ioctl(7, TCGETS, {B38400 opost isig icanon echo ...}) = 0
ioctl(7, 0x80045430, 0xbfffe1f0) = 0
stat("/dev/pts/3", {st_mode=S_IFREG|S_ISUID|03, st_size=0, ...}) = 0
statfs("/dev/pts/3", {f_type=0x1cd1, f_bsize=1024, f_blocks=0, f_bfree=0, f_files=0, f_ffree=0, f_namelen=255}}) = 0
fcntl(7, F_SETFD, FD_CLOEXEC) = 0
pipe([8, 9]) = 0
pipe([10, 11]) = 0
pipe([12, 13]) = 0
fork() = 2435
close(9) = 0
close(10) = 0
close(13) = 0
write(5, "parent: waiting for sync byte\r\n"..., 31) = 31
write(2, "parent: waiting for sync byte\r\n"..., 31) = 31
write(6, "parent: waiting for sync byte\r\n"..., 31) = 31
read(8, "", 1) = 0
--- SIGCHLD (Child exited) ---
write(5, "parent: telling child to go ahea"..., 35) = 35
write(2, "parent: telling child to go ahea"..., 35) = 35
write(6, "parent: telling child to go ahea"..., 35) = 35
write(11, " ", 1) = -1 EPIPE (Broken pipe)
--- SIGPIPE (Broken pipe) ---
write(2, "parent: sync byte write: broken "..., 38) = 38
write(5, "parent: sync byte write: broken "..., 38) = 38
write(6, "parent: sync byte write: broken "..., 38) = 38
_exit(-1) = ?
*********************************************

Looks like the forked child died quick??

At this point my best guess is: I have a binary compiled for either Sun's or SGI's particular way of handling ptys for telnet.

Even weirder: when I run expect from the command line and spawn a telnet, it works!!

So I'm out of my depth here. Can you help??? I'll be happy to provide more traces, captures, whatever.

By the way, I'm reporting this to RedHat because I thought you might be interested in how a brand new Dell-installed Linux system behaves, and my hope is that you back your entire product as well as your reputation would indicate. We're going to use this system to network-manage a whole bunch of Sun machines, routers, and Tandem machines - it's a fairly high-profile project and my little butt is somewhat on the line.

Help Help Help!!!

Thanks,
Bill Rasco
cwr001.com
847.538.4821

Here's some version info...

expect1.1> expect_version
5.28.1

uname -a
Linux sapphire.iden_lab.comm.mot.com 2.2.5-15smp #1 SMP Mon Apr 19 22:43:28 EDT 1999 i686 unknown

% info tclversion
8.0
%

------- Additional Comments From 10/12/99 14:38 -------

Update: I acquired tcl8.0 and expect5.31 from their respective sites and built them. Now my script works fine. I guess this bug report now turns into a "fix the distribution" report. Anyways, I'm happy. Just be aware that RedHat 6.0 as shipped from Dell factory on PowerEdge Server has a bug that dumps core when "expect" program attempts to spawn a telnet from a script.

Bill
Update: I got tcl8.0 and expect5.30 from their respective web sites and compiled them on the Dell machine. Now my script works fine. I guess this becomes a "fix the distribution that Dell is shipping" report. I'm happy. Just be aware that "expect" will dump core when trying to spawn a telnet from a script run on a Dell PowerEdge Server with factory-installed RedHat 6.0.

Bill
This problem appears to be resolved.
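
For anyone reading the strace output in the original report: the zero-byte read on fd 8 followed by EPIPE on the write to fd 11 is the pattern you would see when a forked child exits before a pipe-based sync handshake completes. The Python sketch below reproduces that pattern only. It is not Expect's actual code; the function name sync_with_child and the two-pipe layout are assumptions made for illustration (the trace shows three pipes), and it says nothing about why the child died.

```python
import os


def sync_with_child(child_exits_early):
    """Hypothetical helper: two-pipe sync handshake with a forked child.

    Returns (byte read from child, outcome of the parent's write).
    """
    c2p_r, c2p_w = os.pipe()  # child -> parent
    p2c_r, p2c_w = os.pipe()  # parent -> child

    pid = os.fork()
    if pid == 0:
        # Child: keep only its own ends of the two pipes.
        os.close(c2p_r)
        os.close(p2c_w)
        if child_exits_early:
            # Simulate a child that dies before the handshake.
            # Close the read end first so the parent's write fails
            # deterministically once it sees EOF.
            os.close(p2c_r)
            os._exit(1)
        os.write(c2p_w, b" ")  # send sync byte to parent
        os.read(p2c_r, 1)      # wait for "go ahead" from parent
        os._exit(0)

    # Parent: close the ends that belong to the child.
    os.close(c2p_w)
    os.close(p2c_r)

    # "waiting for sync byte": b"" here means EOF, i.e. the child is gone.
    sync = os.read(c2p_r, 1)

    # "telling child to go ahead": CPython ignores SIGPIPE by default,
    # so a write with no reader raises BrokenPipeError (EPIPE).
    try:
        os.write(p2c_w, b" ")
        outcome = "ok"
    except BrokenPipeError:
        outcome = "EPIPE"

    os.waitpid(pid, 0)
    os.close(c2p_r)
    os.close(p2c_w)
    return sync, outcome
```

Calling sync_with_child(True) returns an empty read and "EPIPE", matching the tail of the trace; sync_with_child(False) completes the handshake normally.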