Red Hat Bugzilla – Bug 60318
ether_input() does things in wrong order in eCos
Last modified: 2007-04-18 12:40:37 EDT
Description of problem:
We implement an Ethernet-like network interface in our software. When we receive data on that interface, we call ether_input() to pass it to the IP stack just like in a
standard BSD network driver. Depending on the priority of the thread that calls ether_input(), frames get stuck in ether_input() and are only passed on when the next
frame arrives (which in turn gets stuck).
This is because ether_input() first schedules processing of the frame by calling schednetisr(FOO) and only then inserts the frame into the queue. This works
in OpenBSD since the kernel cannot be preempted, so processing starts after ether_input() has inserted the frame into the queue and returned. It doesn't work in eCos
since the call to schednetisr(FOO) might cause a context switch immediately, before the frame is enqueued. The context switch only happens too early if you're calling
ether_input() from a normal thread (as opposed to a DSR) and that thread's priority is lower than the network thread's priority. Unfortunately, both conditions are true
for our software and are difficult to change.
The patch in the attachment fixes ether_input() so that it doesn't rely on being called from a DSR. It should not change ether_input()'s behaviour if it actually is called
from a DSR.
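The ordering problem described above can be illustrated with a small host-side simulation. This is only a sketch under stated assumptions: it is not the eCos or OpenBSD source and not the attached patch. All names (sim_schednetisr, sim_enqueue, input_schedule_first, input_enqueue_first and so on) are hypothetical, and the preemption by a higher-priority network thread is modelled as a direct function call.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; none of these are real eCos/BSD definitions. */
#define QUEUE_LEN 8

static int    inq[QUEUE_LEN];   /* models the protocol input queue        */
static size_t inq_count;        /* frames currently waiting in the queue  */
static size_t delivered_count;  /* frames the "network thread" has taken  */

/* Models the network thread: drain whatever is in the queue right now. */
static void network_thread_run(void)
{
    delivered_count += inq_count;
    inq_count = 0;
}

/*
 * Models schednetisr() when the caller is a normal thread with lower
 * priority than the network thread: the wakeup preempts the caller at
 * once, so the network thread runs before this function returns.
 */
static void sim_schednetisr(void)
{
    network_thread_run();
}

/* Models putting one frame on the input queue. */
static void sim_enqueue(int frame)
{
    assert(inq_count < QUEUE_LEN);
    inq[inq_count++] = frame;
}

/* Order described in the report: schedule first, enqueue afterwards. */
void input_schedule_first(int frame)
{
    sim_schednetisr();   /* network thread runs now and finds no new frame */
    sim_enqueue(frame);  /* frame waits until the next wakeup              */
}

/* Reordered variant: enqueue first, then schedule. */
void input_enqueue_first(int frame)
{
    sim_enqueue(frame);
    sim_schednetisr();   /* network thread runs now and sees the frame */
}

size_t sim_delivered(void) { return delivered_count; }
size_t sim_stuck(void)     { return inq_count; }
void   sim_reset(void)     { inq_count = 0; delivered_count = 0; }
```

With input_schedule_first(), one frame is always left in the queue until the next frame arrives, which matches the symptom described above. With input_enqueue_first(), each frame is picked up by the same wakeup that announces it. When called from a context that cannot be preempted (such as a DSR in the real system), both orders would behave the same, because the network thread only runs after the function has returned.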
Version-Release number of selected component (if applicable):
Steps to Reproduce:
Difficult without a driver that doesn't use DSRs, but I hope my description and the patch are convincing enough :-)
Created attachment 46601 [details]
Patch that fixes ether_input() bug
I understand the problem, but since this is only a problem in somewhat
special circumstances, would it not be better simply to lock the scheduler
around the call to ether_input() in your application? But that might spoil
latency for other operations...
Therefore I'll apply the patch you sent - thanks!
Right, ether_input() can potentially do a lot (especially when the Ethernet bridge is involved) and I wouldn't want to have the scheduler locked during
the whole time.
Also, I spent quite some time finding and understanding the problem. If somebody ever wants to implement another Ethernet-like driver running in a
normal thread in the future (which might not be that uncommon in an RTOS), they shouldn't have to do the same again.
This bug has moved to http://bugs.ecos.sourceware.org/show_bug.cgi?id=60318