From Bugzilla Helper:
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:0.9.4) Gecko/20011128 Netscape6/6.2.1

Description of problem:
LTC1453-CS/Linux fails to start on Red Hat Advanced Server 2.1

Version-Release number of selected component (if applicable):

How reproducible:
Always

Hardware Environment: IA-32 RH AS 2.1
Software Environment: RH AS 2.1

Steps to Reproduce:
1. Install Red Hat Advanced Server.
2. Install Open Streams LiS (Streams).
3. Install Communications Server for Linux V6.001.
4. Try to start Communications Server (sna start); it fails.

Actual Results:
Failure reported in error log. Traces show the sna device drivers are reporting "Device or resource busy".

Expected Results:
CS/Linux should start. It has successfully started on the following distributions and Linux kernels:
- RH7.2 (2.4.7)
- RH7.3 (2.4.18)
- RH8.0 (2.4.18)
- SuSE 8.0 (2.4.18)
- SuSE 8.1 (2.4.19)
- SuSE Linux Enterprise Server (2.4.19)

Additional info:

This problem is blocking a larger rollout of the Branch Server product for Linux project. CS/Linux is needed on the Red Hat Advanced Server 2.1 system with WAS, DB/2 and a suite of other applications.

Here are more details. The question at the end NEEDS TO BE ADDRESSED by Red Hat Development:

We have reproduced the problem here with the same symptoms as you have seen. However, the cause appears to be O/S related, and we would like you to use your contacts within Red Hat to see if you can get assistance before we investigate further.

The drivers are loaded correctly (with insmod). The first driver that we talk to is the non-streams trace driver /dev/sna_trace (snapixt). We issue an open followed by a number of ioctls from the snaerrlog daemon prior to starting up the Streams part of the product. On RH AS 2.1 we see the following behaviour:
- The open appears to be sent correctly from the daemon to our driver, and an OK return code of 0 causes a file handle to be assigned.
- The first ioctl that we issue is not received by our driver's ioctl code (as indicated by printks we have added). However, the O/S replies with an OK (0) return code.
- The second and subsequent ioctl calls arrive at our open routine in the driver rather than our ioctl routine. The open routine rejects them with EBUSY (since we only allow one open to this driver). This return code is reported back to the daemon.

This causes CS/Linux to fail to start.

You can reproduce the problem by using snaldmod, or by just loading snapix0 and snapixt (and doing a mknod of /dev/sna_trace), and then running snaerrlog directly as root. A failure is indicated by this program exiting. I have modified snaerrlog to ignore one bad return code and to introduce 1-second delays between the ioctl calls; I enclose the strace output.

Before I go to the next step of diagnostics, which would be to try to write a cut-down driver and daemon that show this problem, I wanted to find out whether this is a known issue with ioctls. Note that we are not using the recommended _IO* macros to define command codes but are using our own private values; my understanding is that this should be OK.

Question: Is there something different about the RH AS 2.1 kernel (2.4.9) that prevents ioctls from working in drivers?

Additional comments from the submitter:

We have found additional Linux kernel information regarding this problem. It appears that the 2.5.47 kernel has changed so that the sna drivers cannot initiate. We need to find out what changed so that we can isolate a fix around the new mechanism for starting drivers on the new kernel. Attached are 2 e-mails from developers and I am attaching the logs supplied:

I have been playing around with a 2.5.47 kernel (generic development kernel, not in any released distribution) to see how CS/Linux would do on it. The latest LiS does fine on this kernel, so after making some changes to snalinux.c and cc_snalinux I was able to generate an isolation module and load the CS/Linux drivers. Then I did 'sna start' and got the same "Device Busy" message that we are seeing on RHAS2.1. Can you send me the debug snaerrlog and instructions so I can confirm that it is the same cause? If it is the same cause, we would know that it is because of a kernel change made by RedHat for AS2.1 that is being rolled into the 2.5.* development kernels and which would eventually find its way into a 2.6 production kernel.

------- Additionally ----------

Looks the same to me: RedHat AS 2.1 started with the same base as RedHat 7.2, but obviously they changed it (otherwise why have an AS2.1 to begin with). Whether the changes flowed from AS2.1 into the 2.5 kernel or the other way around is irrelevant; I think examples of both could be found. The point is that the change which affects CS/Linux was a deliberate one, and one that was accepted by Linus & others for inclusion in 2.5 and later. Now if we can find what that change was, we would have a better chance of knowing how to work with it.

additional comment re attachment
Created attachment 87269 [details] Attachment (425.txt) is Message log for CS/Linux showing sna drivers not initiating
> The question at the end NEEDS TO BE ADDRESSED by Red Hat Development:

Do you think that's the right attitude to get a bug fixed? Anyway, your description is very inaccurate, so if you want your problem debugged I'd suggest you post some descriptive straces and a pointer to the sources of your module.
Created attachment 87484 [details] Traces and source code as per update
I am the originator of this problem. I originally submitted it to the IBM Linux Technology Center in October; sorry that they did not pass on the strace file that I gave them. I include a gzip containing:
- the strace showing the opens and ioctls that I described (the device is /dev/sna_trace), right at the end
- the messages file showing the printk messages that I added to prove that the ioctl commands ended up in our open function
- a part of the user space code that issues the open and ioctls (see line 236 of svmtdaem.c and following)
- a part of the driver that expects the open and ioctls (see lines 1188 and 818 of svmtrcdd.c)

It is not possible to send the full code; CS/Linux is a very large product comprising some 12500 files, with kernel drivers in excess of 2.5M bytes.

Richard Hilditch
SNAP-IX Group
Data Connection Ltd.
Tel: +44 20 8366 1177
Mail: richard
Fax: +44 20 8367 8501
Web: http://www.dataconnection.com
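For anyone wanting a cut-down user-space check of the sequence described in this report (one open followed by several ioctls on the same descriptor), a minimal sketch might look like the following. This is not the snaerrlog code: the function name probe_ioctls and any command values passed to it are hypothetical stand-ins for the product's private command codes, and the ioctl argument is simply NULL.

```c
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>
#include <sys/ioctl.h>
#include <unistd.h>

/*
 * Hypothetical helper, not part of CS/Linux.
 * Opens `path` once, issues each command in `cmds` with a NULL argument,
 * and stores the errno of each call (0 on success) in `errs`.
 * Returns the number of ioctls that failed, or -1 if the open itself fails.
 */
int probe_ioctls(const char *path, const unsigned long *cmds, size_t n, int *errs)
{
    int fd = open(path, O_RDWR);
    if (fd < 0) {
        fprintf(stderr, "open(%s) failed: %s\n", path, strerror(errno));
        return -1;
    }

    int failed = 0;
    for (size_t i = 0; i < n; i++) {
        errno = 0;
        int rc = ioctl(fd, cmds[i], (void *)0);
        errs[i] = (rc < 0) ? errno : 0;
        if (rc < 0)
            failed++;
        /* Log every call so the output can be lined up with strace and printk. */
        fprintf(stderr, "ioctl #%lu cmd=0x%lx rc=%d (%s)\n",
                (unsigned long)i, cmds[i], rc,
                rc < 0 ? strerror(errs[i]) : "ok");
    }

    close(fd);
    return failed;
}
```

Called as root with "/dev/sna_trace" and the driver's real command codes, the behaviour reported above would presumably show up as an errs entry of 0 for the first ioctl (even though the driver never saw it) and EBUSY for the later ones.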
I guess it would be enough if you sent a pointer to the location of that code (e.g. website, ftp site, sourceforge project page, cvs repository).
LiS is not supported
LiS is not being used when these calls fail. The kernel is not passing the CS/Linux ioctls to its own trace device driver correctly. The problem has nothing to do with LiS.

Jeff L Smith
Comm. Server Development
IBM
Can you reinvestigate this, since LiS is not being used at the time the calls fail? Reopening for your response. Thanks.
Raising priority as this is very important to the Comm Server development team.
Adding comments from Comm Serv development

------- Additional Comment #20 From Paul Landay(landay.com) 2003-01-13 07:24 -------
The problem occurs with the RHAS2.1 2.4.9-e.9 kernels: http://rhn.redhat.com/errata/RHSA-2002-227.html
We have not tried the 2.4.9-e.10 kernels yet.

------- Additional Comment #21 From Paul Landay(landay.com) 2003-01-13 09:34 -------
I've now tried the 2.4.9-e.10 kernels and it still happens with that kernel also.
Cancelling bug as problem is determined not to be in Linux code.