Bug 58663

Summary: DHCP restarting SNMP causes loss of state info
Product: [Retired] eCos Reporter: Andrew Lunn <andrew.lunn>
Component: SNMPAssignee: Hugo Tyson <hmt>
Status: CLOSED WONTFIX QA Contact: eCos bugs internal list <es-ecos-bugs-int>
Severity: medium Docs Contact:
Priority: medium    
Version: 1.5.2   
Target Milestone: ---   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2003-06-20 16:11:56 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments:
Description Flags
Patch - you can ignore the testcase changes. none

Description Andrew Lunn 2002-01-22 15:28:39 UTC
Description of Problem:

The DHCP client code was modified so that when a new/renewed IP address was
received the Snmpd code was restarted. This restart causes any additional MIBs
that the application code registers with register_mib() to the lost. Also any
usm username, passwords etc will be lost. Plus any trap sink information
installed with create_trap_session(). 

A better solution than restarting the daemon is needed, or a callback function
is required so that when the daemon restarts it gives the application a chance
to re-registers all its handlers.


Version-Release number of selected component (if applicable): 1.5.2 and current


How Reproducible:
100%

Steps to Reproduce:
1. register a mib with register_mib()
2. Make the DHCP client call SnmpdShutDown()
3. Let the daemon complete its shutdown and restart
4. Query the mib.. Its gone.

Comment 1 Hugo Tyson 2002-01-30 17:19:37 UTC
I refer the Honourable Gentleman to the comments in dhcp.h in the network
stack; essentially, you don't need a hook; you should disable
	CYGOPT_NET_DHCP_DHCP_THREAD
and write your own lease management thread to handle this.  It's easy.

We can try bolting multiple little bodges, such as a callback hook to deal
with issues like this, onto the default dhcp lease hander provided in the
dhcp_support.c but I don't think it leads to an ultimately satisfactory
solution.

The purpose of that default being provided is twofold: to provide a default
that works well enough for automated testing; and to get a default build of
the eCos library up and running "out of the box", prior to more
sophisticated application configuration and development.

I would say that adding additional MIBs to the SNMP system is certainly
"more sophisticated" than the default startup.  To manage this cleanly -
and other startup stuff that the application might need to do if the
ethernets cycle up and down or change IP address - the application should
provide the DHCP management thread.  The APIs that the simple default uses
are all published with exactly this intent; the body of the routine is
purposely trivial to permit easy comprehension and modification:

    while ( 1 ) {
        while ( 1 ) {
            cyg_semaphore_wait( &dhcp_needs_attention );
            if ( ! dhcp_bind() )                 // a lease expired
                break;                           // If we need to re-bind
        }
        dhcp_halt();                             // tear everything down
        init_all_network_interfaces();           // re-initialize
        for ( j = 0; j < CYGPKG_NET_NLOOP; j++ )
            init_loopback_interface( j );        // re-init lo0...
#ifdef CYGPKG_SNMPAGENT
        SnmpdShutDown(0);                        // Cycle the snmpd state
	/******* INSERT YOUR EXTRA INIT CODE HERE *********/
#endif
	/******* OR HERE *********/
    }

OK, I could make /******* INSERT YOUR EXTRA INIT CODE HERE *********/ read
thus in the default, of you really want to:
	{
		extern void (*dhcp_reinit_hook)( void );
		if ( dhcp_reinit_hook )
			dhcp_reinit_hook();
	}
but what happens when you want multiple hooks, or some specific info about
what happened, or a chain of these routines....


BTW, before that change to cycle the SNMP machinery, losing and
re-acquiring a lease simply killed SNMP stone dead - worse than just losing
any extra MIBs you registered!


Comment 2 Andrew Lunn 2002-02-04 15:44:59 UTC
Sorry about the slow reponse time. I've overloaded at the moment.

I agree about writing our own lease management thread. Been there, done that, 12
months or more ago.....

The problem is the solution is not that simple. The restarting of the snmp
thread is asyncronous. The call in the DHCP thread to the snmp agent sets a
flag. Some time later the agent wakes up when it has the next request. It
services this request and the restarts itself. 

This async nature means that doing things in the DHCP thread is useless. The
action has to be taken as part of the startup of the SNMP thread. 

I can see two ways to go....
1) figure out why the SNMP thread dies on some DHCP actions. This is probably
something to do with the TCP/IP stack which is broken. Fix what is broken so its
not necassary to restart the daemon. This will probably benifit other network
applications as well.

2) Add a callback into the SNMP code so that when it starts the application has
an option to do what it needs to do. Maybe this can already be done in a similar
way to the DHCP code. ie there is an SNMP transaction engine which eCos provides
and a driver thread which the application provides. 



Comment 3 Hugo Tyson 2002-02-05 13:46:28 UTC
Of course you're right Andrew - sorry, I'd forgotton that the SNMP state
cycle was asynchronous.  But here's the suggestion (which isn't very elegant)
but is certainly sufficient; it's your suggestion (2), the stuff you need is
already available.  The ugly bit is that as things stand you must have
dhcpd() running whilst you then modify the state of things from
another thread.

Basically snmptask.c just starts a thread which goes
	static void
	snmpdloop( void )
	{
	    while ( 1 )
	        snmpd();
	} 
so if you write your own copy of that, it can go
	static void
	snmpdloop( void )
	{
	    while ( 1 ) {
	        cyg_semaphore_post( &snmp_reinit_sema );
	        snmpd();
	    }
	} 
and you can have a lower priority thread that goes
	static void
	snmp_reinit_loop( void )
	{
	    while ( 1 ) {
	        cyg_semaphore_wait( &snmp_reinit_sema );
	        // INSERT ALL YOUR OWN INITIALIZATION CODE HERE
	    }
	} 
so it runs each time the snmpd() restarts.

OTOH just adding a function pointer to the dhcpd() API so that you just have
to replace the default thread setup in snmp_task() would do it too, like this:

Index: net/snmp/agent/current/src//snmpd.c
===================================================================
RCS file: /home/cvs/ecc/ecc/net/snmp/agent/current/src/snmpd.c,v
retrieving revision 1.3
diff -u -5 -p -r1.3 snmpd.c
--- net/snmp/agent/current/src//snmpd.c 2001/11/30 15:28:12     1.3
+++ net/snmp/agent/current/src//snmpd.c 2002/02/05 13:41:34
@@ -614,11 +614,11 @@ main(int argc, char *argv[])
       }
     }
 #endif
 
 #else /* __ECOS environment: */
-void snmpd( void )    {
+void snmpd( void *initfunc( void ) ) {
     int             ret;
     u_short         dest_port = SNMP_PORT;
 #define stderr_log 1
 #endif
 
@@ -682,10 +682,18 @@ void snmpd( void )    {
 
        /* we're up, log our version number */
        snmp_log(LOG_INFO, "UCD-SNMP version %s\n", VersionInfo);
 
        memset(addrCache, 0, sizeof(addrCache));
+
+        /*
+         * Call initialization function if necessary
+         */
+       DEBUGMSGTL(("snmpd", "Calling initfunc().\n"));
+        if ( initfunc )
+            (initfunc)();
+
        /* 
         * Forever monitor the dest_port for incoming PDUs.
         */
        DEBUGMSGTL(("snmpd", "We're up.  Starting to process data.\n"));
        receive();

Do you want me to do something like that - maybe pass the callback routine
into cyg_net_snmp_init() so you don't have to substitute your own copy of
that thread setup?  I'll have a look into it, shouldn't be too much trouble.

	- Huge


Comment 4 Hugo Tyson 2002-02-05 15:51:36 UTC
Created attachment 44542 [details]
Patch - you can ignore the testcase changes.

Comment 5 Hugo Tyson 2002-02-05 15:53:18 UTC
Patch attached and mailed to ASCOM; testcase tidied up
as well as adding the callback required.

Comment 6 Alex Schuilenburg 2003-06-20 16:11:56 UTC
This bug has moved to http://bugs.ecos.sourceware.org/show_bug.cgi?id=58663