Bug 1072800
| Summary: | The 1st icmp_seq lost for the first ping with big size after boot up |
|---|---|
| Product: | Red Hat Enterprise Linux 7 |
| Component: | kernel |
| Version: | 7.0 |
| Status: | CLOSED NOTABUG |
| Severity: | low |
| Priority: | medium |
| Reporter: | langfang <flang> |
| Assignee: | Jiri Pirko <jpirko> |
| QA Contact: | Red Hat Kernel QE team <kernel-qe> |
| CC: | acathrow, flang, hhuang, jasowang, jpirko, juzhang, mst, qiguo, qzhang, rhod, rkhan, virt-maint, vyasevic, xfu |
| Target Milestone: | rc |
| Target Release: | --- |
| Hardware: | Unspecified |
| OS: | Unspecified |
| Doc Type: | Bug Fix |
| Type: | Bug |
| Last Closed: | 2014-06-10 07:43:11 UTC |
Description by langfang, 2014-03-05 09:06:06 UTC
A few questions:
1) Is the guest manually assigning IPv6 addresses or using autoconfig?
2) Do you see the same problem with IPv4?

My first suspect right now is multicast snooping on the bridge.
-vlad

Unfortunately, this didn't make it into 7.0.0.

QE, MacVTap should work as of today's kernel build, so it might be interesting to test it.

It is easily reproduced: just boot the guest and ping the host for the first time; there is 20% packet loss (the first icmp_seq is lost), so it is not related to set_link.

(In reply to Ronen Hod from comment #6)
> Unfortunately, didn't make it into 7.0.0.
>
> QE,
> MacVTap should work as of today's kernel build, so it might be interesting
> to test it.

Tested with macvtap and hit the same issue:

host# ip -d link show v1
84: v1@enp0s25: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UNKNOWN mode DEFAULT qlen 500
    link/ether 16:4a:71:96:37:d9 brd ff:ff:ff:ff:ff:ff promiscuity 0
    macvtap mode vepa

After guest boot-up, ping from the guest to an external host in the same subnet as the hypervisor:

# ping 10.66.4.218 -s 65507 -c 5
PING 10.66.4.218 (10.66.4.218) 65507(65535) bytes of data.
65515 bytes from 10.66.4.218: icmp_seq=2 ttl=64 time=1.81 ms
65515 bytes from 10.66.4.218: icmp_seq=3 ttl=64 time=1.69 ms
65515 bytes from 10.66.4.218: icmp_seq=4 ttl=64 time=1.68 ms
65515 bytes from 10.66.4.218: icmp_seq=5 ttl=64 time=1.69 ms

--- 10.66.4.218 ping statistics ---
5 packets transmitted, 4 received, 20% packet loss, time 4002ms
rtt min/avg/max/mdev = 1.687/1.720/1.810/0.072 ms

Tested this case on a bare-metal RHEL 7 system and hit the same issue, so this bug is not a qemu-kvm bug. Also found that only the 1st icmp_seq is lost; whatever count I set, only the first one is lost.

Components:
# uname -r
3.10.0-121.el7.x86_64

Steps:
1. Boot the host.
2. Ping any system with size 65507 and any count:
# ping 10.66.10.230 -s 65507 -c
PING 10.66.10.230 (10.66.10.230) 65507(65535) bytes of data.
65515 bytes from 10.66.10.230: icmp_seq=2 ttl=64 time=1.65 ms
65515 bytes from 10.66.10.230: icmp_seq=3 ttl=64 time=1.59 ms
65515 bytes from 10.66.10.230: icmp_seq=4 ttl=64 time=1.58 ms
65515 bytes from 10.66.10.230: icmp_seq=5 ttl=64 time=1.55 ms
65515 bytes from 10.66.10.230: icmp_seq=6 ttl=64 time=1.58 ms
65515 bytes from 10.66.10.230: icmp_seq=7 ttl=64 time=1.71 ms
65515 bytes from 10.66.10.230: icmp_seq=8 ttl=64 time=1.37 ms
65515 bytes from 10.66.10.230: icmp_seq=9 ttl=64 time=1.72 ms
65515 bytes from 10.66.10.230: icmp_seq=10 ttl=64 time=1.54 ms
65515 bytes from 10.66.10.230: icmp_seq=11 ttl=64 time=1.71 ms
65515 bytes from 10.66.10.230: icmp_seq=12 ttl=64 time=1.51 ms
65515 bytes from 10.66.10.230: icmp_seq=13 ttl=64 time=1.71 ms
65515 bytes from 10.66.10.230: icmp_seq=14 ttl=64 time=1.55 ms
65515 bytes from 10.66.10.230: icmp_seq=15 ttl=64 time=1.68 ms
65515 bytes from 10.66.10.230: icmp_seq=16 ttl=64 time=1.50 ms
65515 bytes from 10.66.10.230: icmp_seq=17 ttl=64 time=1.36 ms
65515 bytes from 10.66.10.230: icmp_seq=18 ttl=64 time=1.75 ms
65515 bytes from 10.66.10.230: icmp_seq=19 ttl=64 time=1.65 ms
65515 bytes from 10.66.10.230: icmp_seq=20 ttl=64 time=1.50 ms

--- 10.66.10.230 ping statistics ---
20 packets transmitted, 19 received, 5% packet loss, time 19032ms
rtt min/avg/max/mdev = 1.369/1.594/1.755/0.117 ms

So according to the above, this is not a qemu bug; I will change the component to kernel. Feel free to correct me if I got anything wrong. Thanks, langfang, Qian Guo.

Would you be able to record the wire using tcpdump? That, I believe, would show us what is going wrong here. Thanks.

Created attachment 895393 [details]
Capture of the ICMP traffic when pinging with size 65507
The same problem is present in RHEL 6. Upstream fixed this somewhere between v3.13 and v3.14. Continuing the investigation.

Created attachment 895480 [details]
tcpdump of working ping (<~40000 size)
Created attachment 895481 [details]
tcpdump of a non-working ping: the first request's fragments are cut off (>~40000 size)
As you can see, after ARP resolution is done, the first captured ICMP fragment has offset 25160. So it looks like the first couple of fragments were lost.

This is fixed somewhere in between v3.13 and v3.14.

Please ignore comment 17. I just reproduced this on the latest net-next kernel.

I dug into this and found the cause. Since the neighbour entry is unresolved at the beginning, the ping fragments are put into neigh->arp_queue. However, the length of this queue is limited by unres_qlen_bytes (/proc/sys/net/ipv4/neigh/*/unres_qlen_bytes). For more info see __neigh_event_send(). Since the fragment truesize with a 1500 MTU is 2304 bytes, only 28 fragments fit in and the rest are dropped. That is consistent with what we see in the tcpdump.

Long story short, this is not a bug. Feel free to adjust the unres_qlen_bytes value, which will allow even long pings to come through.
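To make the closing analysis concrete, here is a small shell sketch of the arithmetic. The per-fragment truesize of 2304 bytes and the 28-fragment fit come from the comment above; the 65536-byte default for unres_qlen_bytes is an assumption about kernels of this generation, chosen because it reproduces the observed numbers:

```shell
#!/bin/sh
# Arithmetic behind the first-ping loss, per the analysis in the bug.
icmp_payload=65507                          # ping -s 65507
ip_packet=$((icmp_payload + 8 + 20))        # + ICMP header + IP header = 65535
frag_payload=$((1500 - 20))                 # payload per fragment at MTU 1500
frags=$(( (ip_packet + frag_payload - 1) / frag_payload ))  # 45 on-wire fragments
truesize=2304                               # skb truesize per fragment (from the comment)
qlen_bytes=65536                            # assumed default unres_qlen_bytes
fit=$(( qlen_bytes / truesize ))            # 28 fragments fit in neigh->arp_queue
dropped=$(( frags - fit ))                  # 17 fragments dropped while ARP resolves
first_offset=$(( dropped * frag_payload ))  # 25160, matching the tcpdump attachment
echo "$frags fragments, $fit queued, first captured fragment offset $first_offset"
# The workaround from the closing comment is to raise the limit, e.g.:
#   sysctl -w net.ipv4.neigh.default.unres_qlen_bytes=212992
```

The computed first-fragment offset of 25160 matches exactly what the tcpdump attachment shows, which supports the queue-overflow explanation.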
This is fixed somewhere in between v3.13 and v3.14 Please ignore comment 17. I just reproduced this on latest net-next kernel. I digged into this. Found out the cause. Since neigh is unresolved at the beginning, ping fragments are put into neigh->arp_queue. However, the length of this queue is limited by arp_queue_len_bytes (/proc/sys/net/ipv4/neigh/*/unres_qlen_bytes). For more info see __neigh_event_send(). Since the fragment truesize with 1500 mtu is 2304 bytes, only 28 fragments fit in and the rest is dropped. That is consistent with what we see in tcpdump. Long story short, this is not a bug. Feel free to adjust unres_qlen_bytes value which would allow even long pings to come through. |