Bug 839262

Summary: dhcp doesn't work on tagged vlan interfaces of atl1e adapter
Product: [Fedora] Fedora Reporter: Rudd-O DragonFear <rudd-o>
Component: kernel Assignee: Neil Horman <nhorman>
Status: CLOSED INSUFFICIENT_DATA QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: medium Docs Contact:
Priority: unspecified    
Version: 17 CC: gansalmon, itamar, jonathan, kernel-maint, madhu.chinakonda, nhorman
Target Milestone: ---   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2013-05-06 10:47:35 UTC Type: Bug

Description Rudd-O DragonFear 2012-07-11 12:00:18 UTC
Description of problem:

dhclient eth0.103 fails to get a response, even though tcpdump -e -i eth0 shows the request and response on the wire.

FYI tcpdump -e -i eth0.103 only shows the request.  So the vlan tagged reply packet is being swallowed somewhere by the kernel.

It used to work in kernel 2.6.35, but it ceased working afterwards.

Version-Release number of selected component (if applicable):

/home/rudd-o 

Comment 1 Rudd-O DragonFear 2012-07-11 12:01:40 UTC
Your bug tracker eats bug reports with unicode characters.  Not cool.  This is the 21st century.

Reposting:

Description of problem:

dhclient eth0.103 fails to get a response, even though tcpdump -e -i eth0 shows the request and response on the wire.

FYI tcpdump -e -i eth0.103 only shows the request.  So the vlan tagged reply packet is being swallowed somewhere by the kernel.

It used to work in kernel 2.6.35, but it ceased working afterwards.

Version-Release number of selected component (if applicable):

/home/rudd-o:
uname -r
3.3.7-1.fc17.x86_64

How reproducible:

Always

Steps to Reproduce:
1. create tagged vlan interface
2. connect to tagged vlan port
3. assign static ip address
4. attempt to ping other host in the same vlan
5. verify that ping works fine
6. now run dhclient on the tagged vlan interface, while running a dhcp server serving dhcp addresses on the same vlan
7. observe how dhclient never gets a reply
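
For reference, the steps above can be sketched as a list of commands. This is a dry-run sketch, not the reporter's exact invocation: the parent interface eth0, VLAN ID 103 and address 192.168.1.253/24 are taken from this report, while the peer address 192.168.1.1 and the helper name repro_commands are assumptions made for illustration. The script only prints the commands; actually running them requires root.

```python
import shlex


def repro_commands(parent="eth0", vlan_id=103,
                   static_ip="192.168.1.253/24", peer="192.168.1.1"):
    """Return the reproduction steps as argv lists (nothing is executed)."""
    vif = f"{parent}.{vlan_id}"
    return [
        # Step 1: create the tagged VLAN interface and bring it up.
        ["ip", "link", "add", "link", parent, "name", vif,
         "type", "vlan", "id", str(vlan_id)],
        ["ip", "link", "set", "dev", vif, "up"],
        # Step 3: assign a static address.
        ["ip", "addr", "add", static_ip, "dev", vif],
        # Steps 4-5: with static addressing, ping is expected to succeed.
        ["ping", "-c", "3", peer],
        # Step 6: request a lease on the VLAN interface (peer is hypothetical).
        ["dhclient", "-v", vif],
    ]


if __name__ == "__main__":
    for argv in repro_commands():
        print(shlex.join(argv))
```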
  
Actual results:

dhclient does not get an address

Expected results:

dhclient gets an address

Additional info:

/home/rudd-o:
ethtool -k eth0
Offload parameters for eth0:
rx-checksumming: off
tx-checksumming: on
scatter-gather: on
tcp-segmentation-offload: on
udp-fragmentation-offload: off
generic-segmentation-offload: on
generic-receive-offload: on
large-receive-offload: off
rx-vlan-offload: on
tx-vlan-offload: on
ntuple-filters: off
receive-hashing: off

What's worse: disabling rxvlan on the network card (a debugging step I took) reliably causes it to stop sending traffic altogether.
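
For anyone comparing offload settings across machines, the ethtool -k listing can be turned into a dictionary and checked for rx-vlan-offload, the feature that has the NIC strip VLAN tags on receive. This is only a rough sketch: parse_offloads is a hypothetical helper, and SAMPLE is an excerpt of the output pasted above rather than a live query.

```python
def parse_offloads(text):
    """Parse `ethtool -k` output into {feature: bool}."""
    features = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        value = value.strip()
        # Skip the "Offload parameters for eth0:" header (empty value).
        if not sep or not value:
            continue
        # Newer ethtool may append " [fixed]"; only the first word matters.
        features[name.strip()] = value.split()[0] == "on"
    return features


# Excerpt of the output shown in this report.
SAMPLE = """\
Offload parameters for eth0:
rx-checksumming: off
tx-checksumming: on
rx-vlan-offload: on
tx-vlan-offload: on
"""

if __name__ == "__main__":
    offloads = parse_offloads(SAMPLE)
    if offloads.get("rx-vlan-offload"):
        print("rx-vlan-offload is on: the NIC strips VLAN tags on receive")
```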

May be related to this bug: https://bugzilla.redhat.com/show_bug.cgi?id=699083

Comment 2 Rudd-O DragonFear 2012-07-11 12:02:38 UTC
tcpdumps proving the facts I mentioned above:

----------------------------

tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0.103, link-type EN10MB (Ethernet), capture size 65535 bytes
04:43:20.254394 90:e6:ba:c8:a5:a9 (oui Unknown) > Broadcast, ethertype IPv4 (0x0800), length 342: 0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP, Request from 90:e6:ba:c8:a5:a9 (oui Unknown), length 300

----------------------------

tcpdump 'port 67 or port 68' -e -i eth0
tcpdump: WARNING: eth0: no IPv4 address assigned
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 65535 bytes
04:36:58.131601 90:e6:ba:c8:a5:a9 (oui Unknown) > Broadcast, ethertype 802.1Q (0x8100), length 346: vlan 103, p 0, ethertype IPv4, 0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP, Request from 90:e6:ba:c8:a5:a9 (oui Unknown), length 300
04:36:58.149582 98:2c:be:4e:e5:c9 (oui Unknown) > Broadcast, ethertype 802.1Q (0x8100), length 346: vlan 103, p 0, ethertype IPv4, homeportal.bootps > 255.255.255.255.bootpc: BOOTP/DHCP, Reply, length 300
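
Note that the two captures differ by exactly 4 bytes (length 342 on eth0.103, 346 on eth0), which is the size of the 802.1Q tag. A minimal sketch of how that tag sits in the Ethernet header follows. The MAC address, VLAN ID and lengths come from the captures above; the payload is zero filler, not real BOOTP data, and the function names are hypothetical.

```python
import struct

ETH_P_8021Q = 0x8100  # TPID that marks an 802.1Q tag
ETH_P_IP = 0x0800


def parse_ethernet(frame):
    """Return (vlan_id or None, inner ethertype, header length in bytes)."""
    (ethertype,) = struct.unpack_from("!H", frame, 12)
    if ethertype != ETH_P_8021Q:
        return None, ethertype, 14
    # TCI: 3 bits priority, 1 bit DEI, 12 bits VLAN ID; then the real ethertype.
    tci, inner = struct.unpack_from("!HH", frame, 14)
    return tci & 0x0FFF, inner, 18


def add_vlan_tag(frame, vlan_id, priority=0):
    """Insert an 802.1Q tag right after the two MAC addresses."""
    tag = struct.pack("!HH", ETH_P_8021Q, (priority << 13) | vlan_id)
    return frame[:12] + tag + frame[12:]


if __name__ == "__main__":
    dst = bytes.fromhex("ffffffffffff")   # Broadcast
    src = bytes.fromhex("90e6bac8a5a9")   # client MAC from the capture
    payload = bytes(328)                  # filler: 20 IP + 8 UDP + 300 BOOTP
    untagged = dst + src + struct.pack("!H", ETH_P_IP) + payload
    tagged = add_vlan_tag(untagged, 103)
    print(len(untagged), parse_ethernet(untagged))
    print(len(tagged), parse_ethernet(tagged))
```

With rx-vlan-offload on, the tag is normally removed by the NIC and handed to the kernel out of band, so the VLAN device should see the 342-byte form of the reply.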

Comment 3 Rudd-O DragonFear 2012-07-11 12:04:49 UTC
More info:


/home/rudd-o:
ifconfig eth0.103
eth0.103: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 192.168.1.253  netmask 255.255.255.0  broadcast 192.168.1.255
        inet6 fe80::92e6:baff:fec8:a5a9  prefixlen 64  scopeid 0x20<link>
        ether 90:e6:ba:c8:a5:a9  txqueuelen 0  (Ethernet)
        RX packets 9658  bytes 5041744 (4.8 MiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 9683  bytes 3226463 (3.0 MiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0


0 <- ifconfig eth0.103
/home/rudd-o:
ifconfig eth0.102
eth0.102: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet6 fe80::92e6:baff:fec8:a5a9  prefixlen 64  scopeid 0x20<link>
        ether 90:e6:ba:c8:a5:a9  txqueuelen 0  (Ethernet)
        RX packets 11375  bytes 3249110 (3.0 MiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 10753  bytes 4479535 (4.2 MiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0


0 <- ifconfig eth0.102
/home/rudd-o:
brctl show br0
bridge name     bridge id               STP enabled     interfaces
br0             8000.90e6bac8a5a9       no              eth0.102
                                                        tun0

Comment 4 Rudd-O DragonFear 2012-07-11 12:05:24 UTC
May be related: http://kerneltrap.org/mailarchive/linux-netdev/2010/1/21/6267138

Comment 5 Josh Boyer 2012-07-11 15:07:16 UTC
You're honestly better off trying the 3.4.4-5 kernel, or 3.5-rc6 from rawhide, and if this still happens, reporting it directly to the netdev mailing list.  They will be in a much better position to help you.

Comment 6 Neil Horman 2013-05-01 17:40:28 UTC
Is this still happening for you?  If so, can you run dropwatch while you attempt to perform a dhcp request and send in the output?

Comment 7 Rudd-O DragonFear 2013-05-05 05:03:53 UTC
I ditched F17 vlan use.  Cannot repro anymore.