Bug 127487

Summary: ifdown doesn't disconnect ADSL connection properly
Product: [Fedora] Fedora
Reporter: Boris Glawe <public>
Component: initscripts
Assignee: Bill Nottingham <notting>
Status: CLOSED RAWHIDE
Severity: medium
Priority: medium
Version: 2
CC: robatino, rvokal
Hardware: All
OS: Linux
Doc Type: Bug Fix
Last Closed: 2004-07-27 16:25:11 UTC
Bug Blocks: 123268

Description Boris Glawe 2004-07-08 20:34:58 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7) Gecko/20040624

Description of problem:
The ADSL connection is stable and performant.

On a reboot there are two strange things:

i)
The shutdown scripts print something like
"shutting down ADSL-connection"          [success]
"shutting down eth0"                               [success]
"shutting down ADSL-connection"          [failed]

ii)
When the machine boots again, it's not possible to connect to the
provider for a few minutes.

/var/log/messages says something about a failed PAP authentication.

After a short timeout I can reconnect without problems and everything works fine.

I contacted the provider's newsgroup. According to them, they have not
configured a timeout on their server. They guess that the
disconnection does not happen properly, so the server keeps the
session up and (of course) refuses an additional connection until it
realizes that the old session has been terminated.

I have another FC2 machine connected to a different ADSL provider,
which has the same problem. The problem is always reproducible and
has existed since the upgrade to FC2.

I have not tried to use many different network setups. I've configured
my network settings to automatically bring up the ADSL connection at
boot time and to disconnect on a shutdown.

The problem does not occur if you disconnect and connect manually.


Version-Release number of selected component (if applicable):
initscripts-7.53-1

How reproducible:
Always

Steps to Reproduce:
1. Disconnect the ADSL connection by rebooting the computer (not manually).
2. See how it fails to reconnect if the machine reboots immediately
(within a period of 1-2 minutes).

Actual Results:  Connection fails until the timeout is over. Then
reconnection is possible. The established connection is stable and
reliable.

Additional info:

Comment 1 Nayef Abu-Ghazaleh 2004-07-16 15:48:29 UTC
I have the same problem. It looks to me like a race condition.
ppp0 (my ADSL device) is listed in both $xdslinterfaces and $interfaces.
 
When ifdown tries to shut down $xdslinterfaces, it calls
/etc/sysconfig/network-scripts/ifdown-ppp, which in turn calls
adsl-stop without waiting for ppp0 to actually be shut down. ifdown
then tries to shut down the interfaces in "$interfaces";
check_device_down returns false (adsl-stop has not finished shutting down
ppp0), which results in two attempts to shut down ppp0.
 
I fixed it by inserting the following (from if-down) in ifdown-ppp
after adsl-stop is executed:
 
    waited=0
    while ! check_device_down ${DEVICE} && [ "$waited" -lt 50 ] ; do
        usleep 10000
        waited=$(($waited+1))
    done
 
It looks to me like it works now (I didn't actually reboot to check). I
think there could be other places where scripts are exec'd with no
code afterwards to check that the interface is actually down.
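
For illustration, the bounded wait above can be packaged as a small function. This is only a sketch under stated assumptions: wait_for_device_down is a hypothetical name, check_device_down is stubbed here (the real helper comes from the initscripts network functions) to simulate a device that reports "down" on the third poll, and "sleep 0.01" (fractional seconds, as supported by GNU sleep) stands in for "usleep 10000".

```shell
#!/bin/bash
# Sketch only. check_device_down below is a stand-in for the real
# initscripts helper; it simulates a device that is reported as
# down on the third poll.
polls=0
check_device_down() {
    polls=$((polls + 1))
    [ "$polls" -ge 3 ]
}

# Hypothetical helper: poll until the device is down, giving up
# after max_tries failed polls.
# Returns 0 if the device went down, 1 on timeout.
wait_for_device_down() {
    local device=$1 max_tries=${2:-50} waited=0
    while ! check_device_down "$device"; do
        waited=$((waited + 1))
        [ "$waited" -ge "$max_tries" ] && return 1
        sleep 0.01   # about 10 ms, in place of "usleep 10000"
    done
    return 0
}

if wait_for_device_down ppp0 50; then
    echo "ppp0 is down after $polls polls"
else
    echo "timed out waiting for ppp0"
fi
```

With 50 tries of roughly 10 ms each, the worst-case delay is about half a second per interface, and the loop exits as soon as the device is reported down.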

Comment 2 Boris Glawe 2004-07-16 16:26:54 UTC
Thanks for your investigation!
Your solution is just a hack - I don't think it can be committed
to the FC3 release in that form. Instead, the race condition has to be
avoided. Waiting for the connection to disconnect consumes
time, and Linux is already slow at booting up and shutting down. I do
not have a better suggestion at the moment, though.

What are the steps to take if one wants to commit changes to the
distribution? I am afraid that this bug report is not being read by any
Red Hat developers (there are too many reports). Who decides
whether a fix is acceptable? How could you or I commit the changes?

Comment 3 Nayef Abu-Ghazaleh 2004-07-16 22:59:37 UTC
Shouldn't the shutdown scripts give the scripts that shut down
interfaces some time to bring the interfaces down cleanly?

Anyway, here's another fix. It builds a new variable,
"remaininginterfaces", containing only the interfaces that have not
already been added to one of the other variables, and uses it instead
of "interfaces" in the final shutdown loop.

The following patch is for /etc/rc.d/init.d/network:

--- network~	2004-06-07 15:16:49.000000000 -0400
+++ network	2004-07-16 18:32:23.186168024 -0400
@@ -201,6 +201,7 @@
 	cipeinterfaces=""
 	xdslinterfaces=""
 	bridgeinterfaces=""
+	remaininginterfaces=""
 
 	# get list of bonding, cipe, and xdsl interfaces
 	for i in $interfaces; do
@@ -232,6 +233,7 @@
 			continue
 		fi
 		unset DEVICE TYPE BRIDGE
+		remaininginterfaces="$remaininginterfaces $i"
 	done
 	
 	for i in $cipeinterfaces $xdslinterfaces $bridgeinterfaces $vlaninterfaces; do
@@ -244,7 +246,7 @@
 	done
 	
 	# shut down all interfaces (other than loopback)
-	for i in $interfaces ; do
+	for i in $remaininginterfaces ; do
 		eval $(fgrep "DEVICE=" ifcfg-$i)
 		if [ -z "$DEVICE" ] ; then DEVICE="$i"; fi
 
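
The idea behind the patch can be sketched in isolation as follows. This is not the actual init script: iface_type is a hypothetical stand-in for reading TYPE from ifcfg-$i, and the interface names and the name-based classification are made up for the example. The point is only that an interface placed in a special list is skipped by "continue" and so never reaches the "remaining" list, which means each interface is shut down exactly once.

```shell
#!/bin/bash
# Sketch only. iface_type is a hypothetical stand-in for reading
# TYPE from the interface's ifcfg file; names are examples.
iface_type() {
    case "$1" in
        ppp*)   echo xDSL ;;
        cipcb*) echo CIPE ;;
        *)      echo Ethernet ;;
    esac
}

interfaces="eth0 ppp0 eth1"
cipeinterfaces=""
xdslinterfaces=""
remaininginterfaces=""

# Partition the interfaces: special types go to their own lists,
# everything else is collected in remaininginterfaces.
for i in $interfaces; do
    case "$(iface_type "$i")" in
        xDSL) xdslinterfaces="$xdslinterfaces $i"; continue ;;
        CIPE) cipeinterfaces="$cipeinterfaces $i"; continue ;;
    esac
    remaininginterfaces="$remaininginterfaces $i"
done

# First pass: special interfaces. Second pass: only the rest.
for i in $cipeinterfaces $xdslinterfaces; do
    echo "first pass: shutting down $i"
done
for i in $remaininginterfaces; do
    echo "second pass: shutting down $i"
done
```

Because the two lists are disjoint, the second loop no longer depends on check_device_down to skip an interface that is still in the middle of shutting down.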

Comment 4 Andre Robatino 2004-07-27 07:45:27 UTC
  This race condition already existed in FC1 - see bug #114733.  I
see it too (on both an Athlon and an i686), but without any problems
reconnecting to my ISP.  I think the platform should be changed to
"All" since it's a shell script.  I wouldn't normally get to see all
the shutdown messages, including this one, except that I shut down by
first logging out and then selecting shutdown from the login screen
(otherwise only the last few show up).  This probably explains why
this hasn't been reported more often.  BTW, I am currently using the
updated initscripts-7.55.1-1.

Comment 5 Bill Nottingham 2004-07-27 15:59:03 UTC
Andre: does the patch posted immediately above work for you?


Comment 6 Bill Nottingham 2004-07-27 16:02:20 UTC
*** Bug 114733 has been marked as a duplicate of this bug. ***

Comment 7 Bill Nottingham 2004-07-27 16:25:11 UTC
Patch added in 7.60-1; certainly looks correct.

Comment 8 Andre Robatino 2004-07-28 06:57:49 UTC
  It seems to work, thanks.