From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.0.2) Gecko/20030708

Description of problem:
If you set a socket to a particular interface using the setsockopt IP_MULTICAST_IF option, bring down that interface with ifconfig, remove that interface's device driver with rmmod, re-install a driver (new or the same driver) for that interface with insmod, and the interface comes back up with the same IP address, writing to the socket will fail with ENODEV.

Version-Release number of selected component (if applicable):
kernel-2.4.21-4.0.1EL

How reproducible:
Always

Steps to Reproduce:
1. Run my code (be sure to edit the host IP address appropriately for a multicast interface on your machine).
2. Wait a little bit, then cat "output.txt" to see that the code is working fine.
3. Use ifconfig to bring down every network interface.
4. Use rmmod to remove all network device drivers.
5. Wait a little bit, then cat "output.txt" to see that write() fails with EINVAL.
6. Use insmod to add back all network device drivers (I assume this will always bring up all appropriate network interfaces with the same IP addresses as before).
7. Wait a little bit, then cat "output.txt" to see that write() fails with ENODEV.

Actual Results:
In output.txt:
rv = 1, errno = 0, str = Success
rv = 1, errno = 0, str = Success
rv = 1, errno = 0, str = Success
rv = -1, errno = 22, str = Invalid argument
rv = -1, errno = 22, str = Invalid argument
rv = -1, errno = 22, str = Invalid argument
rv = -1, errno = 19, str = No such device
rv = -1, errno = 19, str = No such device
rv = -1, errno = 19, str = No such device

Expected Results:
After the network device drivers and network interfaces are brought back up, writes through the socket should continue to work. In other words, we should see:
rv = 1, errno = 0, str = Success
in output.txt

Additional info:
Here is my console output (with duplicate output data removed for clarity):

tdev2|/tmp 3>gcc reproducer.c
tdev2|/tmp 4>./a.out &
[1] 16119
tdev2|/tmp 5>cat output.txt
rv = 1, errno = 0, str = Success
rv = 1, errno = 0, str = Success
tdev2|/tmp 6>ifconfig eth0 down; ifconfig eth1 down
tdev2|/tmp 7>rmmod e1000
tdev2|/tmp 9>cat output.txt
rv = 1, errno = 0, str = Success
rv = 1, errno = 0, str = Success
rv = -1, errno = 22, str = Invalid argument
rv = -1, errno = 22, str = Invalid argument
rv = -1, errno = 22, str = Invalid argument
tdev2|/tmp 10>insmod e1000
Using /lib/modules/2.4.21-ia64-test/kernel/drivers/net/e1000/e10Intel(R) PRO/1000 Network Driver - version 5.1.11-k1
C00.o
opyright (c) 1999-2003 Intel Corporation.
PCI: Found IRQ 51 for device 01:00.0
eth0: Intel(R) PRO/1000 Network Connection
PCI: Found IRQ 53 for device 06:01.0
eth1: Intel(R) PRO/1000 Network Connection
PCI: Found IRQ 54 for device 06:01.1
eth2: Intel(R) PRO/1000 Network Connection
tdev2|/tmp 11>e1000: eth0 NIC Link is Up 100 Mbps Full Duplex
e1000: eth1 NIC Link is Up 100 Mbps Full Duplex
arping(16251): unaligned access to 0x60000fffffffbf15, ip=0xe000000004754850
arping(16258): unaligned access to 0x60000fffffffbf15, ip=0xe000000004754850
tdev2|/tmp 11>arping(16312): unaligned access to 0x60000fffffffbf15, ip=0xe000000004754850
arping(16313): unaligned access to 0x60000fffffffbf15, ip=0xe000000004754850
tdev2|/tmp 11>cat output.txt
rv = 1, errno = 0, str = Success
rv = 1, errno = 0, str = Success
rv = -1, errno = 22, str = Invalid argument
rv = -1, errno = 22, str = Invalid argument
rv = -1, errno = 22, str = Invalid argument
rv = -1, errno = 19, str = No such device
rv = -1, errno = 19, str = No such device
rv = -1, errno = 19, str = No such device
#include <stdio.h>
#include <stdlib.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <netdb.h>
#include <unistd.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <string.h>
#include <errno.h>

#define PERROR(x) do { perror(x); exit(1); } while (0)

int main()
{
    int output;
    int sock;
    int len, rv;
    struct sockaddr_in src, dest;

    /* Reproducer brings down all interfaces, so I assume you can only
     * run this in the console. So I write data to a file rather
     * than stderr/stdout */
    output = open("output.txt", O_CREAT | O_WRONLY | O_APPEND,
                  S_IRUSR | S_IWUSR);
    if (output < 0)
        PERROR("open");

    sock = socket(AF_INET, SOCK_DGRAM, 0);
    if (sock < 0)
        PERROR("socket");

    /* Change this to a local multicast interface IP on your machine */
    rv = inet_pton(AF_INET, "192.168.20.3", &src.sin_addr);
    if (rv <= 0)
        PERROR("inet_pton");

    rv = setsockopt(sock, IPPROTO_IP, IP_MULTICAST_IF,
                    (const void *)&src.sin_addr, sizeof(struct in_addr));
    if (rv < 0)
        PERROR("setsockopt");

    /* Make up some multicast destination address */
    memset(&dest, '\0', sizeof(dest));
    dest.sin_family = AF_INET;
    dest.sin_port = htons(9000);
    rv = inet_pton(AF_INET, "239.2.11.72", &dest.sin_addr);
    if (rv <= 0)
        PERROR("inet_pton");

    rv = connect(sock, (struct sockaddr *)&dest, sizeof(dest));
    if (rv < 0)
        PERROR("connect");

    /* Loop forever, always trying to write to this socket */
    while (1) {
        char data = 0;
        char buffer[1000];

        errno = 0; /* just in case */
        rv = write(sock, &data, 1);
        len = sprintf(buffer, "rv = %d, errno = %d, str = %s\n",
                      rv, errno, strerror(errno));
        write(output, buffer, len); /* i assume this always works */
        sleep(3); /* sleep a bit */
    }
}
U1 should have a fix for this bug.
We just retested this with 2.4.21-5EL and it has the same problem.
I asked davem to comment on whether this is a reasonable request, and he told me to contact David Stevens at IBM. Here's what he had to say:

It is a judgement call, since I don't think there is any established practice: you can't remove an interface on 4.3BSD systems. But I would have to say, "no", the program is not reasonable. Here's why:

The bug poster seems to have the idea that multicast group membership is associated with an IP address and should therefore be there in the later instance of the interface. But, even on BSD systems, and certainly on Linux systems, if you join a group on an interface, then delete the IP address and re-add that IP address on a different interface, none of the group memberships move with the IP address. Group membership is associated with the logical device, and that's particularly clear when using ip_mreqn or IPv6, which specify an interface by index, not address. You don't even have to have an IP address to join a group on an interface.

There isn't any practical way to support what they're after, because you have cases like: two addresses, IP1 & IP2, on eth0, and you delete them and then add IP1 to eth1 and IP2 to eth2. I'm guessing he'd want the groups joined via IP1 to go to eth1 and the groups joined via IP2 to go to eth2, but there is no context like that saved, and where would groups joined by index go to?

I think the way to think of it is that removing the module logically removes the interface, which is what the groups are associated with. At that point, they are gone, and even if the same physical interface is re-added with the same addresses, it still has a different interface index and is logically not the same interface as the groups were joined on.

The reason it doesn't work in Linux is that each device has its list of group memberships, and that list is destroyed when the device is unregistered. The new device, when registered, will have no memberships until new group joins are done.

I don't see this as a bug, and I don't believe any other OS supports anything like it. If they're looking for a high-availability failover mechanism, I think they want group membership in a "parent" logical device with child physical devices that can come and go. I thought that's how Ethernet bonding works, though I really know nothing about it. With something like that, the physical devices can come and go, but as long as the parent device doesn't, the multicast group memberships aren't affected.
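To illustrate the point about ip_mreqn naming an interface by index rather than by address, here is a minimal sketch. The helper names fill_mreqn and join_by_name are hypothetical and not part of any library; the sketch assumes a Linux/glibc system where struct ip_mreqn is available from <netinet/in.h>.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE   /* lets glibc expose struct ip_mreqn under strict -std= modes */
#endif
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <net/if.h>

/* Hypothetical helper: build a join request that names the interface
 * by index. The local address field is left as INADDR_ANY, so no IP
 * address on the interface is involved in the request. */
int fill_mreqn(struct ip_mreqn *req, const char *group, unsigned int ifindex)
{
    assert(req != NULL && group != NULL);
    memset(req, 0, sizeof(*req));
    if (inet_pton(AF_INET, group, &req->imr_multiaddr) != 1)
        return -1;                      /* not a valid IPv4 string */
    req->imr_address.s_addr = htonl(INADDR_ANY);
    req->imr_ifindex = (int)ifindex;
    return 0;
}

/* Hypothetical helper: join `group` on the interface named `ifname`.
 * The index is looked up at call time, so a device that was removed
 * and re-registered is addressed by whatever index it has now. */
int join_by_name(int sock, const char *group, const char *ifname)
{
    struct ip_mreqn req;
    unsigned int idx = if_nametoindex(ifname);

    if (idx == 0)
        return -1;                      /* no such interface right now */
    if (fill_mreqn(&req, group, idx) < 0)
        return -1;
    return setsockopt(sock, IPPROTO_IP, IP_ADD_MEMBERSHIP,
                      &req, sizeof(req));
}
```

A caller would use something like join_by_name(sock, "239.2.11.72", "eth0") and would have to call it again after the driver is reloaded, since the old membership went away with the old device.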
Hello,

Based on your points, I agree that this may be correct behavior. However, are the errno values correct? When I first saw this problem, errno == EINVAL suggested to me that bringing the network device back up would "fix" the invalid argument. Based on the statements by Dave Stevens, it would seem that errno should not equal EINVAL at any point in time, and that ENODEV should be returned after you rmmod the NIC driver.

Al
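If the current behavior stands, an application could handle it in user space by re-applying IP_MULTICAST_IF when a write fails with one of the two errno values seen in this report. The sketch below is an untested assumption, not a verified workaround for kernel-2.4.21: the helper names iface_lost and write_with_rebind are hypothetical, and treating EINVAL as "interface lost" is a guess based only on the output above, since EINVAL can have other causes.

```c
#include <assert.h>
#include <errno.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>

/* Hypothetical policy: errno values that, in this report, showed up
 * while the outgoing multicast interface was gone (EINVAL) or had
 * been re-registered as a new device (ENODEV). */
int iface_lost(int err)
{
    return err == EINVAL || err == ENODEV;
}

/* Hypothetical helper: write one datagram. If the write fails with an
 * interface-lost error, re-apply IP_MULTICAST_IF by address and retry
 * once. Returns the result of the last write, or -1 with errno set. */
ssize_t write_with_rebind(int sock, const void *buf, size_t len,
                          const struct in_addr *ifaddr)
{
    ssize_t rv;

    assert(buf != NULL && ifaddr != NULL);
    rv = write(sock, buf, len);
    if (rv >= 0 || !iface_lost(errno))
        return rv;

    /* Assumption: setting the option again looks the address up afresh
     * and so picks up the newly registered device. */
    if (setsockopt(sock, IPPROTO_IP, IP_MULTICAST_IF,
                   ifaddr, sizeof(*ifaddr)) < 0)
        return -1;      /* interface or address is not back yet */

    return write(sock, buf, len);
}
```

In the reproducer, the call write(sock, &data, 1) inside the loop would become write_with_rebind(sock, &data, 1, &src.sin_addr).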
Created attachment 98755
A patch to change the returned error

This patch changes the returned EINVAL to ENODEV. davem, any comments on whether this is the right thing to do or not?
This code is checking to see if an IPv4 address is local to the system. ENODEV is quite an odd error code to return for that. EINVAL is a perfectly fine error return; I see no reason to change it.