The default Red Hat kernels, which all use loadable modules, cannot be used on machines that run unattended. The problem occurred while setting up a new ISP's Linux servers in their co-location closet.
Symptom: after a short while, connections cannot be established to the servers.
Prerequisites: no traffic at all; this is a NEW network.
Cause: inactivity caused the kernel modules for the network drivers to be released, effectively removing all network support.
Workaround: run ping, sending one packet every few minutes, to prevent the modules from being released.
Interim solution: build kernels without module support.
Long-term solution: modules must be able to mark themselves unavailable for removal. Critical modules, such as network support, must make use of this feature.
Hardware in this case: Compaq Proliant (rackmount) servers. The problem occurred on both the built-in (Intel) and add-in (3c905) network cards.
Remove /etc/cron.d/kmod, and your problem should go away.
You're right, it's not a bug. It's a design oversight. Consider: the intention of loadable modules is to support infrequently used devices. While the Proliant does not have PCMCIA cards, I could easily purchase a controller to add them to it. In that case, I would find my kernel growing as devices are added, but the modules would never be freed, since you would have me remove the job which does that. The design oversight is that some modules (either always or, better still, under the control of the administrator building the kernel) must NEVER be removed. ISTM that when building the kernel, you should (I don't remember if you currently can) be able to select certain features which are NOT to be built as modules. In the default install case, however, using the Red Hat prebuilt kernels, everything is a module (and for good reason). What happens is that the system mysteriously ceases to function. If, for instance, there were an option (enabled by default) which told the module-removal job NOT to remove keyboard, display, floppy, CDROM, hard drive, and network controller modules, your system would appear more stable right out of the box. As things now stand, things are good for me and bad for my customer. Bad for my customer because they wasted several weeks trying to determine why their servers were not working reliably. Good for me because I get a nice, hefty consulting fee. So, if you really think Red Hat Linux should frustrate new users and increase revenues for consultants, go ahead and re-close this bug. But if, like me, you think Linux should be reliable and stable, even when used by neophytes, please leave it open until you actually RESOLVE it.
I agree that there must be some magic for devices whose "use" can be triggered externally (NICs, for example). But I think an ifconfig up'ed network device should always raise the use count of the driver module. Smells like a kernel/driver problem to me. Disabling kmod is not actually a reasonable resolution to the problem.
I agree with droesen's comment. I have commented out 'rmmod -as' in /etc/cron.d/kmod, but that did not resolve my problem. My Linux box still has one 'ping' process running.
Shouldn't the fact that the interfaces are 'up' keep the modules from being removed? Isn't this the real bug here? These modules are 'in use' not because there is traffic, but because the interfaces are 'up'. (The module use counters come to mind here...)
Only modules that are marked as 'autoclean' are removed by rmmod -as, and only modules that are autoloaded by the kernel module loader are marked as 'autoclean'. If you perform a modprobe directly, this doesn't happen. The sound initialization scripts do a modprobe to load the modules, and they are not autoclean. If an ifup-eth script were added to /etc/sysconfig/network-scripts that checked $(/sbin/lsmod) for the module listed in /etc/modules.conf, and did a modprobe if it wasn't in the kernel, that would solve the problem.
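To make the ifup-eth idea concrete, here is a rough sketch of the check in Python rather than in the shell such a script would really use. It only decides which modprobe command would be needed; it does not run anything. The function names, the sample modules.conf aliases (eepro100 for the Intel card is a guess), and the sample lsmod text are all assumptions for illustration, not taken from a real system or from the Red Hat scripts.

```python
import re

# Hypothetical sample data; the eth0 alias in particular is a guess.
SAMPLE_CONF = """\
alias eth0 eepro100
alias eth1 3c59x   # add-in 3c905 card
"""

SAMPLE_LSMOD = """\
Module                  Size  Used by
eepro100               16180   1  (autoclean)
"""


def aliased_modules(modules_conf_text):
    """Map interface names to driver modules from 'alias ethN module' lines."""
    aliases = {}
    for line in modules_conf_text.splitlines():
        # Drop trailing comments, then split into whitespace-separated words.
        parts = line.split("#", 1)[0].split()
        if (len(parts) == 3 and parts[0] == "alias"
                and re.fullmatch(r"eth\d+", parts[1])):
            aliases[parts[1]] = parts[2]
    return aliases


def loaded_modules(lsmod_text):
    """Return the module names in lsmod output (first column, header skipped)."""
    names = set()
    for line in lsmod_text.splitlines()[1:]:
        fields = line.split()
        if fields:
            names.add(fields[0])
    return names


def modprobe_commands(modules_conf_text, lsmod_text, interface):
    """Return the modprobe command needed before bringing up the interface.

    The list is empty when no alias exists or the module is already loaded.
    """
    module = aliased_modules(modules_conf_text).get(interface)
    if module is None or module in loaded_modules(lsmod_text):
        return []
    return [["/sbin/modprobe", module]]


if __name__ == "__main__":
    for iface in ("eth0", "eth1"):
        print(iface, modprobe_commands(SAMPLE_CONF, SAMPLE_LSMOD, iface))
```

Note that the check only helps if it runs before the kernel autoloads the driver: a module that is already loaded as autoclean (eepro100 in the sample) is left alone, so it stays eligible for rmmod -as.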
I have a very similar problem, also using Red Hat 7.0, with the 2.2.16-35 smp kernel and a 3c905 netcard. The network hangs after a varying time without error messages (also very similar to bug 22717). The way to get it working again is to stop the network, unload the netcard module, load it again, and restart the network. I agree that the time until the network hangs seems to depend on how much you use the network: usage seems to prolong the time, and vice versa. I don't think it has to do with auto-cleaned modules, though, but is rather a kernel/driver problem. I get the same problem when I load the modules by hand without the "auto clean" flag set. Removing the kmod script does not change the situation either. Also, when the network is hanging, lsmod still reports the netcard module as loaded and used. The problem still needs a solution, though, since the network always hangs after 1-3 days of uptime (depending on usage) on all my Red Hat 7.0 machines. I would be grateful for hints on a solution. The ping trick (every 5 minutes) doesn't work for me.
This seems like a different problem. If the 3c59x module shows the same, please open a separate bug.