Bug 1832332
Summary: | "[sig-network] Services should be rejected when no endpoints exist" test fails frequently on RHEL7 nodes | |||
---|---|---|---|---|
Product: | Red Hat Enterprise Linux 7 | Reporter: | Vikas Laad <vlaad> | |
Component: | kernel | Assignee: | Paolo Abeni <pabeni> | |
kernel sub component: | arp/icmp | QA Contact: | Jianlin Shi <jishi> | |
Status: | CLOSED ERRATA | Docs Contact: | ||
Severity: | urgent | |||
Priority: | urgent | CC: | aconstan, atragler, bbennett, cglombek, danw, dcbw, dhoward, ecordell, gnault, jdesousa, jiji, jstancek, miabbott, mkumatag, nmurray, pabeni, periklis, ptalbert, ricarril, rteague, sdodson, skunkerk, sukulkar, vrutkovs, walters, weliang, wking, ykashtan, zzhao | |
Version: | 7.8 | Flags: | pabeni: needinfo- | |
Target Milestone: | rc | |||
Target Release: | --- | |||
Hardware: | Unspecified | |||
OS: | Unspecified | |||
Whiteboard: | OKDBlocker | |||
Fixed In Version: | kernel-3.10.0-1148.el7 | Doc Type: | If docs needed, set a value | |
Doc Text: | Story Points: | --- | ||
Clone Of: | 1781575 | |||
: | 1834184 (view as bug list) | Environment: | ||
Last Closed: | 2020-09-29 21:14:38 UTC | Type: | --- | |
Regression: | --- | Mount Type: | --- | |
Documentation: | --- | CRM: | ||
Verified Versions: | Category: | --- | ||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
Cloudforms Team: | --- | Target Upstream Version: | ||
Embargoed: | ||||
Bug Depends On: | 1781575 | |||
Bug Blocks: | 1779811, 1825255, 1831684, 1834184 |
Comment 1
Juan Luis de Sousa-Valadas
2020-05-06 16:58:02 UTC
I don't quite understand the question.

The system has many iptables rules with a REJECT "icmp-port-unreachable" action and the concern is that when traffic matches these rules the system does not always transmit the expected ICMP packet back to the sender unless the icmp_ratelimit is removed or the default ratemask adjusted to allow Type 3 Dest Unreachable?

Note that even if the ICMP messages are rate limited, the traffic which triggered them is still dropped.

Upstream and the RHEL kernel (as of RHEL 7) also include *global* limits controlled by icmp_msgs_per_sec and icmp_msgs_burst. The global limit is checked first, for ICMP types matching the icmp_ratemask. Then the rate limit controlled by icmp_ratelimit is checked; this is a per-peer limit. So if a given remote host sends a flood of traffic which is all REJECT'd, the kernel will send back *at most* 1 ICMP Dest Unreachable message per second (assuming icmp_ratelimit is the default 1000ms). But it will only do that if the global limit hasn't been met.

    void icmp_send(struct sk_buff *skb_in, int type, int code, __be32 info)
    {
        struct iphdr *iph;
        int room;
        struct icmp_bxm icmp_param;
        struct rtable *rt = skb_rtable(skb_in);
        struct ipcm_cookie ipc;
        struct flowi4 fl4;
        __be32 saddr;
        u8 tos;
        u32 mark;
        struct net *net;
        struct sock *sk;
    ......
        /* Check global sysctl_icmp_msgs_per_sec ratelimit, unless
         * incoming dev is loopback. If outgoing dev change to not be
         * loopback, then peer ratelimit still work (in icmpv4_xrlim_allow)
         */
        if (!(skb_in->dev && (skb_in->dev->flags&IFF_LOOPBACK)) &&
            !icmpv4_global_allow(net, type, code))
            goto out_bh_enable;
    ......
        /* peer icmp_ratelimit */
        if (!icmpv4_xrlim_allow(net, rt, &fl4, type, code))
            goto ende;
    ......
    ende:
        ip_rt_put(rt);
    out_unlock:
        icmp_xmit_unlock(sk);
    out_bh_enable:
        local_bh_enable();
    out:;
    }
    EXPORT_SYMBOL(icmp_send);

    static bool icmpv4_global_allow(struct net *net, int type, int code)
    {
        if (icmpv4_mask_allow(net, type, code))
            return true;

        if (icmp_global_allow())
            return true;

        return false;
    }

    static bool icmpv4_mask_allow(struct net *net, int type, int code)
    {
        if (type > NR_ICMP_TYPES)
            return true;

        /* Don't limit PMTU discovery. */
        if (type == ICMP_DEST_UNREACH && code == ICMP_FRAG_NEEDED)
            return true;

        /* Limit if icmp type is enabled in ratemask. */
        if (!((1 << type) & net->ipv4.sysctl_icmp_ratemask))
            return true;

        return false;
    }

    /**
     * icmp_global_allow - Are we allowed to send one more ICMP message ?
     *
     * Uses a token bucket to limit our ICMP messages to sysctl_icmp_msgs_per_sec.
     * Returns false if we reached the limit and can not send another packet.
     * Note: called with BH disabled
     */
    bool icmp_global_allow(void)
    {
        u32 credit, delta, incr = 0, now = (u32)jiffies;
        bool rc = false;

        /* Check if token bucket is empty and cannot be refilled
         * without taking the spinlock.
         */
        if (!icmp_global.credit) {
            delta = min_t(u32, now - icmp_global.stamp, HZ);
            if (delta < HZ / 50)
                return false;
        }

        spin_lock(&icmp_global.lock);
        delta = min_t(u32, now - icmp_global.stamp, HZ);
        if (delta >= HZ / 50) {
            incr = sysctl_icmp_msgs_per_sec * delta / HZ ;
            if (incr)
                icmp_global.stamp = now;
        }
        credit = min_t(u32, icmp_global.credit + incr, sysctl_icmp_msgs_burst);
        if (credit) {
            credit--;
            rc = true;
        }
        icmp_global.credit = credit;
        spin_unlock(&icmp_global.lock);
        return rc;
    }
    EXPORT_SYMBOL(icmp_global_allow);

    static bool icmpv4_xrlim_allow(struct net *net, struct rtable *rt,
                                   struct flowi4 *fl4, int type, int code)
    {
        struct dst_entry *dst = &rt->dst;
        struct inet_peer *peer;
        bool rc = true;

        if (icmpv4_mask_allow(net, type, code))
            goto out;

        /* No rate limit on loopback */
        if (dst->dev && (dst->dev->flags&IFF_LOOPBACK))
            goto out;

        peer = inet_getpeer_v4(net->ipv4.peers, fl4->daddr, 1);
        rc = inet_peer_xrlim_allow(peer, net->ipv4.sysctl_icmp_ratelimit);
        if (peer)
            inet_putpeer(peer);
    out:
        return rc;
    }

    /*
     * Check transmit rate limitation for given message.
     * The rate information is held in the inet_peer entries now.
     * This function is generic and could be used for other purposes
     * too. It uses a Token bucket filter as suggested by Alexey Kuznetsov.
     *
     * Note that the same inet_peer fields are modified by functions in
     * route.c too, but these work for packet destinations while xrlim_allow
     * works for icmp destinations. This means the rate limiting information
     * for one "ip object" is shared - and these ICMPs are twice limited:
     * by source and by destination.
     *
     * RFC 1812: 4.3.2.8 SHOULD be able to limit error message rate
     *                   SHOULD allow setting of rate limits
     *
     * Shared between ICMPv4 and ICMPv6.
     */
    #define XRLIM_BURST_FACTOR 6
    bool inet_peer_xrlim_allow(struct inet_peer *peer, int timeout)
    {
        unsigned long now, token;
        bool rc = false;

        if (!peer)
            return true;

        token = peer->rate_tokens;
        now = jiffies;
        token += now - peer->rate_last;
        peer->rate_last = now;
        if (token > XRLIM_BURST_FACTOR * timeout)
            token = XRLIM_BURST_FACTOR * timeout;
        if (token >= timeout) {
            token -= timeout;
            rc = true;
        }
        peer->rate_tokens = token;
        return rc;
    }
    EXPORT_SYMBOL(inet_peer_xrlim_allow);

The kernel does not log any details or statistics about this activity; you'd have to do something like perf or stap to track specific instances when icmp_send() was "blocked" by one of these limits.

(In reply to Patrick Talbert from comment #3)
> I don't quite understand the question.
>
> The system has many iptables rules with a REJECT "icmp-port-unreachable"
> action and the concern is that when traffic matches these rules the system
> does not always transmit the expected ICMP packet back to the sender unless
> the icmp_ratelimit is removed or the default ratemask adjusted to allow Type
> 3 Dest Unreachable?

It's not that it "does not always transmit the expected ICMP packet". It's that it almost 100% reliably fails to transmit the expected ICMP packet, even when the network is otherwise almost completely idle and so no rate limiting should be occurring.

The behavior also seems to vary a lot between releases. It never worked right in RHEL 7; it started working right in RHEL 8, but is now broken again in RHEL 8.2, which has been bisected to a particular commit: https://bugzilla.redhat.com/show_bug.cgi?id=1781575#c37.

(In reply to Dan Winship from comment #4)
> (In reply to Patrick Talbert from comment #3)
> > I don't quite understand the question.
> >
> > The system has many iptables rules with a REJECT "icmp-port-unreachable"
> > action and the concern is that when traffic matches these rules the system
> > does not always transmit the expected ICMP packet back to the sender unless
> > the icmp_ratelimit is removed or the default ratemask adjusted to allow Type
> > 3 Dest Unreachable?
>
> It's not that it "does not always transmit the expected ICMP packet". It's
> that it almost 100% reliably fails to transmit the expected ICMP packet,
> even when the network is otherwise almost completely idle and so no rate
> limiting should be occurring.

Do you have steps to reproduce this that do not involve an entire Openshift deployment?

>
> The behavior also seems to vary a lot between releases. It never worked
> right in RHEL 7; it started working right in RHEL 8, but is now broken again
> in RHEL 8.2, which has been bisected to a particular commit:
> https://bugzilla.redhat.com/show_bug.cgi?id=1781575#c37.

Ah that's great. Has anyone gone the next step to see which of the several commits from that BZ is causing the condition?

*** Bug 1829961 has been marked as a duplicate of this bug. ***

*** Bug 1831684 has been marked as a duplicate of this bug. ***

In staring at the commits from BZ1765639 I really do not immediately see how those would impact this issue. There is a lot going on here so it's possible some other change had an unexpected knock-on effect:

    $ git log --oneline kernel-4.18.0-151.el8..kernel-4.18.0-152.el8 net/ include/net/ | wc -l
    462

But definitely if there is an existing reproducer then this is ripe for a further bisect.

> Do you have steps to reproduce this that do not involve an entire Openshift deployment?

If it helps we can easily get anyone who needs it a cluster already spun up and a kubeconfig. Also I'm happy to join a Bluejeans and help in realtime.
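For readers who want to experiment with the two-stage check quoted from icmp.c earlier in this thread, here is a rough user-space model in Python. It is only a sketch under stated assumptions: `HZ = 1000`, the `0x1818` ratemask, `msgs_per_sec = 1000` and `msgs_burst = 50` are assumed defaults rather than values taken from this bug, the loopback exemptions and locking are left out, and the names `GlobalBucket`, `Peer` and `icmp_send_allowed` are invented for illustration.

```python
from dataclasses import dataclass

HZ = 1000                    # assumed tick rate (jiffies per second)
NR_ICMP_TYPES = 18
ICMP_DEST_UNREACH = 3
ICMP_FRAG_NEEDED = 4
DEFAULT_RATEMASK = 0x1818    # assumed default: types 3, 4, 11 and 12
XRLIM_BURST_FACTOR = 6


def mask_allow(icmp_type, code, ratemask=DEFAULT_RATEMASK):
    """Mirror of icmpv4_mask_allow(): True means 'exempt from limiting'."""
    if icmp_type > NR_ICMP_TYPES:
        return True
    if icmp_type == ICMP_DEST_UNREACH and code == ICMP_FRAG_NEEDED:
        return True          # never limit PMTU discovery
    return not ((1 << icmp_type) & ratemask)


@dataclass
class GlobalBucket:
    """Model of the global token bucket in icmp_global_allow()."""
    msgs_per_sec: int = 1000  # assumed default
    msgs_burst: int = 50      # assumed default
    credit: int = 0
    stamp: int = 0

    def allow(self, now):
        delta = min(now - self.stamp, HZ)
        incr = 0
        if delta >= HZ // 50:
            incr = self.msgs_per_sec * delta // HZ
            if incr:
                self.stamp = now
        credit = min(self.credit + incr, self.msgs_burst)
        allowed = credit > 0
        if allowed:
            credit -= 1
        self.credit = credit
        return allowed


@dataclass
class Peer:
    """The two inet_peer fields used by inet_peer_xrlim_allow()."""
    rate_tokens: int = 0
    rate_last: int = 0


def peer_allow(peer, now, timeout):
    """Model of inet_peer_xrlim_allow(): per-peer token bucket."""
    token = peer.rate_tokens + (now - peer.rate_last)
    peer.rate_last = now
    token = min(token, XRLIM_BURST_FACTOR * timeout)
    allowed = token >= timeout
    if allowed:
        token -= timeout
    peer.rate_tokens = token
    return allowed


def icmp_send_allowed(icmp_type, code, now, bucket, peer, ratelimit=HZ):
    """Global limit first, then the per-peer limit, as in icmp_send()."""
    if mask_allow(icmp_type, code):
        return True
    if not bucket.allow(now):
        return False
    return peer_allow(peer, now, ratelimit)
```

In this model a fresh peer that has been idle for a long time can receive a burst of up to `XRLIM_BURST_FACTOR` port-unreachable messages, after which one more is allowed per `ratelimit` ticks. ICMP redirects (type 5) are not in the assumed ratemask, so they bypass both checks here, which is consistent with the `type=5 ... return=0x1` lines in the SystemTap traces posted in this bug.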
FWIW I have a setup now w/Systemtap to help me debug this and am capturing live notes here: https://hackmd.io/B3IVIiQeTei6TFz0kXp6Ng

But I don't know this code (and only know rudimentary SystemTap).

I've updated the hackmd but just to post a checkpoint of my findings, using this systemtap script:

```
#! /usr/bin/env stap
probe begin {
  println("Watching ICMP, Ctrl-C to exit")
}
probe kernel.function("inet_peer_xrlim_allow") {
  if ($peer == NULL) {
    println("inet_peer_xrlim_allow(NULL peer)")
  }
}
probe kernel.function("inet_peer_xrlim_allow").return {
  printf("inet_peer_xrlim_allow last=%s tokens=%s ret=%s\n", $peer->rate_last$, $peer->rate_tokens$, $$return)
}
probe kernel.function("icmpv4_xrlim_allow").return {
  printf("icmpv4_xrlim_allow(type=%d code=%d) %s\n", $type, $code, $$return)
}
probe kernel.function("icmpv4_global_allow").return {
  if ($return == 0) {
    println("icmpv4_global_allow: denied")
  }
}
probe module("nf_reject_ipv4").function("nf_send_unreach") {
  printf("nf_send_unreach: %s\n", $skb_in->dev->name$)
}
```

Here's what I see from the RHEL 8.1 kernel and a successful test:

    # staprun inetpeer81.ko
    Watching ICMP, Ctrl-C to exit
    icmpv4_xrlim_allow(type=5 code=1) return=0x1
    nf_send_unreach: "tun0"
    inet_peer_xrlim_allow last=4361979389 tokens=1 ret=return=0x0
    icmpv4_xrlim_allow(type=3 code=3) return=0x0
    icmpv4_xrlim_allow(type=5 code=1) return=0x1
    nf_send_unreach: "tun0"
    inet_peer_xrlim_allow last=4361980416 tokens=6 ret=return=0x0
    icmpv4_xrlim_allow(type=3 code=3) return=0x0
    icmpv4_xrlim_allow(type=5 code=1) return=0x1
    nf_send_unreach: "tun0"
    inet_peer_xrlim_allow last=4361984622 tokens=7 ret=return=0x0
    icmpv4_xrlim_allow(type=3 code=3) return=0x0
    nf_send_unreach: "tun0"
    inet_peer_xrlim_allow last=4361984628 tokens=13 ret=return=0x1
    icmpv4_xrlim_allow(type=3 code=3) return=0x1

Using kernel-4.18.0-193.el8.x86_64 what I see is:

    icmpv4_xrlim_allow(type=5 code=1) return=0x1
    icmpv4_xrlim_allow(type=5 code=1) return=0x1
    icmpv4_xrlim_allow(type=5 code=1) return=0x1
    icmpv4_xrlim_allow(type=5 code=1) return=0x1
    icmpv4_xrlim_allow(type=5 code=1) return=0x1
    nf_send_unreach: "tun0"
    inet_peer_xrlim_allow last=4347872279 tokens=0 ret=return=0x0
    icmpv4_xrlim_allow(type=3 code=3) return=0x0
    icmpv4_xrlim_allow(type=5 code=1) return=0x1
    nf_send_unreach: "tun0"
    inet_peer_xrlim_allow last=4347873280 tokens=0 ret=return=0x0
    icmpv4_xrlim_allow(type=3 code=3) return=0x0
    icmpv4_xrlim_allow(type=5 code=1) return=0x1
    icmpv4_xrlim_allow(type=5 code=1) return=0x1
    icmpv4_xrlim_allow(type=5 code=1) return=0x1
    nf_send_unreach: "tun0"
    inet_peer_xrlim_allow last=4347877380 tokens=0 ret=return=0x0
    icmpv4_xrlim_allow(type=3 code=3) return=0x0
    nf_send_unreach: "tun0"
    inet_peer_xrlim_allow last=4347878400 tokens=0 ret=return=0x0
    icmpv4_xrlim_allow(type=3 code=3) return=0x0
    nf_send_unreach: "tun0"

Which, notice tokens is always zero.

Hum...so I finally read the comment

"Note that the same inet_peer fields are modified by functions in route.c too"

And looking there, I notice we lost an increment of rate_tokens:

    diff -u kernel-4.18.0-117.el8/linux-4.18.0-117.el8_1_0.x86_64/net/ipv4/route.c kernel-4.18.0-193.el8/linux-4.18.0-193.el8_2_0.x86_64/net/ipv4/route.c
    --- kernel-4.18.0-117.el8/linux-4.18.0-117.el8_1_0.x86_64/net/ipv4/route.c 2019-07-16 13:21:04.000000000 +0000
    +++ kernel-4.18.0-193.el8/linux-4.18.0-193.el8_2_0.x86_64/net/ipv4/route.c 2020-03-27 13:57:18.000000000 +0000
    @@ -908,16 +908,15 @@
         if (peer->rate_tokens == 0 ||
             time_after(jiffies,
                        (peer->rate_last +
    -                    (ip_rt_redirect_load << peer->rate_tokens)))) {
    +                    (ip_rt_redirect_load << peer->n_redirects)))) {
             __be32 gw = rt_nexthop(rt, ip_hdr(skb)->daddr);

             icmp_send(skb, ICMP_REDIRECT, ICMP_REDIR_HOST, gw);
             peer->rate_last = jiffies;
    -        ++peer->rate_tokens;
             ++peer->n_redirects;

which seems to have come from https://github.com/torvalds/linux/commit/b406472b5ad79ede8d10077f0c8f05505ace8b6d

Not certain this is it but certainly the token values in the previous kernel are small and possibly were just incremented there.

(In reply to Colin Walters from comment #11)
> Hum...so I finally read the comment
>
> "Note that the same inet_peer fields are modified by functions in route.c
> too"
>
> And looking there, I notice we lost an increment of rate_tokens:
>
> diff -u
> kernel-4.18.0-117.el8/linux-4.18.0-117.el8_1_0.x86_64/net/ipv4/route.c
> kernel-4.18.0-193.el8/linux-4.18.0-193.el8_2_0.x86_64/net/ipv4/route.c
> --- kernel-4.18.0-117.el8/linux-4.18.0-117.el8_1_0.x86_64/net/ipv4/route.c
> 2019-07-16 13:21:04.000000000 +0000
> +++ kernel-4.18.0-193.el8/linux-4.18.0-193.el8_2_0.x86_64/net/ipv4/route.c
> 2020-03-27 13:57:18.000000000 +0000
> @@ -908,16 +908,15 @@
> if (peer->rate_tokens == 0 ||
> time_after(jiffies,
> (peer->rate_last +
> - (ip_rt_redirect_load << peer->rate_tokens)))) {
> + (ip_rt_redirect_load << peer->n_redirects)))) {
> __be32 gw = rt_nexthop(rt, ip_hdr(skb)->daddr);
>
> icmp_send(skb, ICMP_REDIRECT, ICMP_REDIR_HOST, gw);
> peer->rate_last = jiffies;
> - ++peer->rate_tokens;
> ++peer->n_redirects;
>
> which seems to have come from
> https://github.com/torvalds/linux/commit/
> b406472b5ad79ede8d10077f0c8f05505ace8b6d
>
> Not certain this is it but certainly the token values in the previous kernel
> are small and possibly were just incremented there.

Nice STAP. Thank you for looking at this.

I saw that commit as well but it's only touching the ip_rt_send_redirect() function. Is this environment generating a lot of redirects? A simple netstat -s would tell you. And/or also stap ip_rt_send_redirect() to see if it is running.

I will make kernels with and without 5f1b3b571c08 and post links here when they are finished.
This is a test build of the RHEL 8.2 GA kernel-4.18.0-193.el8 with commit 5f1b3b571c08 reverted: https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=28447004

I think the root cause is

(In reply to Colin Walters from comment #11)
> kernel-4.18.0-117.el8/linux-4.18.0-117.el8_1_0.x86_64/net/ipv4/route.c
> kernel-4.18.0-193.el8/linux-4.18.0-193.el8_2_0.x86_64/net/ipv4/route.c
> --- kernel-4.18.0-117.el8/linux-4.18.0-117.el8_1_0.x86_64/net/ipv4/route.c
> 2019-07-16 13:21:04.000000000 +0000
> +++ kernel-4.18.0-193.el8/linux-4.18.0-193.el8_2_0.x86_64/net/ipv4/route.c
> 2020-03-27 13:57:18.000000000 +0000
> @@ -908,16 +908,15 @@
> if (peer->rate_tokens == 0 ||
> time_after(jiffies,
> (peer->rate_last +
> - (ip_rt_redirect_load << peer->rate_tokens)))) {
> + (ip_rt_redirect_load << peer->n_redirects)))) {
> __be32 gw = rt_nexthop(rt, ip_hdr(skb)->daddr);
>
> icmp_send(skb, ICMP_REDIRECT, ICMP_REDIR_HOST, gw);
> peer->rate_last = jiffies;
> - ++peer->rate_tokens;
> ++peer->n_redirects;
>
> which seems to have come from
> https://github.com/torvalds/linux/commit/
> b406472b5ad79ede8d10077f0c8f05505ace8b6d

I think the missing increment here is really the root cause. If the redirect rate is high enough (and looking at the STAP traces there are quite a few redirects), the above test will always succeed because 'rate_tokens' starts at 0: the redirects refresh the 'rate_last' value at quite a high frequency while keeping 'rate_tokens' at 0, so inet_peer_xrlim_allow() never gets a chance to succeed, as the delta between 'now' and 'rate_last' stays low.

Reverting the above commit will bring back bz#1753092 - which is likely less critical. I think:

    diff --git a/net/ipv4/route.c b/net/ipv4/route.c
    index c28ce1b84dd2..9fc9297d4080 100644
    --- a/net/ipv4/route.c
    +++ b/net/ipv4/route.c
    @@ -905,7 +905,7 @@ void ip_rt_send_redirect(struct sk_buff *skb)
         /* Check for load limit; set rate_last to the latest sent
          * redirect.
          */
    -    if (peer->rate_tokens == 0 ||
    +    if (peer->n_redirects == 0 ||
             time_after(jiffies,
                        (peer->rate_last +
                         (ip_rt_redirect_load << peer->n_redirects)))) {

will address the issue in a possibly safer way. A scratch build with the above change will be soon available at: https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=28448170

Could you please give the above a spin in your testbed?

(In reply to Dan Winship from comment #4)
> (In reply to Patrick Talbert from comment #3)
> > I don't quite understand the question.
> >
> > The system has many iptables rules with a REJECT "icmp-port-unreachable"
> > action and the concern is that when traffic matches these rules the system
> > does not always transmit the expected ICMP packet back to the sender unless
> > the icmp_ratelimit is removed or the default ratemask adjusted to allow Type
> > 3 Dest Unreachable?
>
> It's not that it "does not always transmit the expected ICMP packet". It's
> that it almost 100% reliably fails to transmit the expected ICMP packet,
> even when the network is otherwise almost completely idle and so no rate
> limiting should be occurring.
>

According to comment 10, that's not true. ICMP Redirects are sent and that likely is the problem.

> The behavior also seems to vary a lot between releases. It never worked
> right in RHEL 7; it started working right in RHEL 8, but is now broken again
> in RHEL 8.2,
>

As Paolo found out in comment 14, the problem likely comes from the fact that the router has to send ICMP Redirect messages. Those are rate limited, which is also the case for ICMP Destination Unreachable messages. The problem is that one can influence the other, and those interactions have changed over time. That's probably what made you think that "It never worked right in RHEL 7". Just start sending packets to the right gateway and you should start seeing the Destination Unreachable messages you're expecting.
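The interaction between the redirect limiter and the per-peer ICMP limiter can be made concrete with a small user-space simulation. This is a hedged sketch, not kernel code: it models only the load-limit test shown in the diffs in this bug plus inet_peer_xrlim_allow(), and it leaves out the ip_rt_redirect_silence reset, the ip_rt_redirect_number cap and the global ICMP limit. `HZ = 1000` and `REDIRECT_LOAD = HZ // 50` are assumed defaults, and `Peer`, `send_redirect`, `send_unreachable` and `simulate` are illustrative names. Each simulated packet first triggers a redirect attempt and then a REJECT, both against the same peer entry.

```python
from dataclasses import dataclass

HZ = 1000                  # assumed tick rate (jiffies per second)
REDIRECT_LOAD = HZ // 50   # assumed default of ip_rt_redirect_load
ICMP_RATELIMIT = HZ        # icmp_ratelimit default of 1000 ms, in ticks
XRLIM_BURST_FACTOR = 6


@dataclass
class Peer:
    """The inet_peer fields shared by both rate limiters."""
    rate_tokens: int = 0
    rate_last: int = 0
    n_redirects: int = 0


def send_redirect(peer, now, fixed):
    """Load-limit test from ip_rt_send_redirect().

    fixed=False models the check as shipped in kernel-4.18.0-193.el8
    (rate_tokens == 0); fixed=True models the proposed n_redirects == 0.
    """
    first = (peer.n_redirects == 0) if fixed else (peer.rate_tokens == 0)
    if first or now > peer.rate_last + (REDIRECT_LOAD << peer.n_redirects):
        peer.rate_last = now      # a sent redirect refreshes rate_last
        peer.n_redirects += 1
        return True
    return False


def send_unreachable(peer, now, timeout=ICMP_RATELIMIT):
    """Per-peer token bucket from inet_peer_xrlim_allow()."""
    token = peer.rate_tokens + (now - peer.rate_last)
    peer.rate_last = now
    token = min(token, XRLIM_BURST_FACTOR * timeout)
    allowed = token >= timeout
    if allowed:
        token -= timeout
    peer.rate_tokens = token
    return allowed


def simulate(fixed, packets=10, interval=HZ):
    """Send one misrouted, REJECTed packet every `interval` ticks."""
    peer = Peer()
    redirects = unreachables = 0
    for i in range(1, packets + 1):
        now = i * interval
        redirects += send_redirect(peer, now, fixed)
        unreachables += send_unreachable(peer, now)
    return redirects, unreachables
```

With one packet per second, `simulate(fixed=False)` sends a redirect for every packet and never lets a Destination Unreachable through, because `rate_last` is refreshed just before each per-peer check while `rate_tokens` stays at 0. `simulate(fixed=True)` still suppresses the first few Destination Unreachable messages, but once the exponential backoff on `n_redirects` exceeds the packet interval the redirects stop and the unreachables start flowing.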
> which has been bisected to a particular commit:
> https://bugzilla.redhat.com/show_bug.cgi?id=1781575#c37.
>

I'm sorry, but the message pointed to by this link is wrong. It points to a completely unrelated bz, just because that bz has "icmp" in one of its commit messages. At least it got me to look at this problem...

To make it short, ICMP Redirects might rate limit your ICMP Destination Unreachable messages. This has always been the case and is still the case even with the patch of comment 14 (just to a lesser extent). That patch is probably a step in the right direction, but be prepared for more evolutions in this area.

ICMP Redirect messages are a common symptom of a bad network design or bad configuration on a router or an end host. Someone should figure out why the peer doesn't use the right gateway. Assuming that's fixable, and that you can do without ICMP Redirects, then your test should work with any kernel version. That doesn't prevent improving the rate limiting algorithms in the kernel, but let's do that for good reasons.

patch from comment#14 posted upstream: https://lore.kernel.org/netdev/7f71c9a7ba0d514c9f2d006f4797b044c824ae84.1588954755.git.pabeni@redhat.com/T/#u

This bug seems again to have the "redhat" private flag - anyone mind if I de-restrict it?

> patch from comment#14 posted upstream:

Presuming that all goes through, I'm trying to think about next steps here. For OpenShift/RHEL CoreOS we have a rule that we can "cherry pick" things from RHEL but only after they've been attached to an errata: https://gitlab.cee.redhat.com/coreos/redhat-coreos/blob/master/README.md#overridingusing-specific-package-versions

I think then what we'd need to decide is whether to queue this for an 8.2.X update and cherry-pick it along with the rest of RHEL 8.2, or would it need to wait for 8.3?
The severity of this issue is a bit tricky; I personally wouldn't call it *critical* but we have the Kubernetes test here for a reason, and there are customer cases around this.

> To make it short, ICMP Redirects might rate limit your ICMP Destination Unreachable messages. This has always been the case and is still the case even with the patch of comment 14 (just to a lesser extend). That patch is probably a step in the right direction, but be prepared for more evolutions in this area.

Fair enough, it seems not unlikely to me that something needs to be fixed in the OpenShift SDN too. But I know very little about that and will let one of those engineers comment.

(In reply to Colin Walters from comment #18)
> I think then what we'd need to decide is whether to queue this for for an
> 8.2.X update and cherry-pick it along with the rest of RHEL 8.2, or would it
> need to wait for 8.3?

I think we can start filing the 8.3 clone for this bz, and I think this could deserve a z-stream backport, so I would ask for the z-stream flag. Then OpenShift may pick whatever course of action is more suitable, WDYT?

> I think we can start filing the 8.3 clone for this bz and I think this could deserve a z-stream backport, so I would ask the z-stream flag. Than OpenShift may pick whatever course of action is more suitable, WDYT?
Sounds good - thanks for taking care of this. I think we'll start a discussion on next steps probably on aos-devel@ after we have the official builds going and queued into Errata Tool.
(In reply to Colin Walters from comment #18)
> > To make it short, ICMP Redirects might rate limit your ICMP Destination Unreachable messages. This has always been the case and is still the case even with the patch of comment 14 (just to a lesser extend). That patch is probably a step in the right direction, but be prepared for more evolutions in this area.
>
> Fair enough, it seems not unlikely to me that something needs to be fixed in
> the OpenShift SDN too. But I know very little about that and will let one
> of those engineers comment.

Yes, it sounds like it's doing something wrong and we should figure out what. It's annoying that seemingly the only sign that it's doing something wrong is that something unrelated fails. :-/

Can QA please provide an ack here? Beyond the integration test in the OpenShift scenario, there is a limited-scope reproducer attached to the cloned RHEL 8 bz#1834184.

Hi, sorry, I'm OpenShift QE. I think weliang already covered the testing on the OpenShift side in comment 34. I am assigning this bug to kernel QE; please let me know if you disagree. Thanks.
Patch(es) committed on kernel-3.10.0-1148.el7

Verified on 3.10.0-1148:

    :: [ 20:23:47 ] :: [ BEGIN ] :: Running 'tcpdump -r ping_plus_redir.pcap'
    reading from file ping_plus_redir.pcap, link-type EN10MB (Ethernet)
    20:23:31.666882 IP 192.168.1.2 > 192.168.2.2: ICMP echo request, id 5621, seq 1, length 64
    20:23:31.666944 IP 192.168.1.101 > 192.168.1.2: ICMP redirect 192.168.2.2 to host 192.168.1.102, length 92
    20:23:31.666956 IP 192.168.1.2 > 192.168.2.2: ICMP echo request, id 5621, seq 1, length 64
    20:23:32.852687 IP 192.168.1.2 > 192.168.2.2: ICMP echo request, id 5621, seq 1, length 64
    20:23:32.852714 IP 192.168.1.101 > 192.168.1.2: ICMP redirect 192.168.2.2 to host 192.168.1.102, length 92
    20:23:32.852716 IP 192.168.1.2 > 192.168.2.2: ICMP echo request, id 5621, seq 1, length 64
    20:23:34.043654 IP 192.168.1.2 > 192.168.2.2: ICMP echo request, id 5621, seq 1, length 64
    20:23:34.043678 IP 192.168.1.101 > 192.168.1.2: ICMP redirect 192.168.2.2 to host 192.168.1.102, length 92
    20:23:34.043680 IP 192.168.1.2 > 192.168.2.2: ICMP echo request, id 5621, seq 1, length 64
    20:23:35.230761 IP 192.168.1.2 > 192.168.2.2: ICMP echo request, id 5621, seq 1, length 64
    20:23:35.230787 IP 192.168.1.101 > 192.168.1.2: ICMP redirect 192.168.2.2 to host 192.168.1.102, length 92
    20:23:35.230790 IP 192.168.1.2 > 192.168.2.2: ICMP echo request, id 5621, seq 1, length 64
    20:23:36.419676 IP 192.168.1.2 > 192.168.2.2: ICMP echo request, id 5621, seq 1, length 64
    20:23:36.419703 IP 192.168.1.101 > 192.168.1.2: ICMP redirect 192.168.2.2 to host 192.168.1.102, length 92
    20:23:36.419705 IP 192.168.1.2 > 192.168.2.2: ICMP echo request, id 5621, seq 1, length 64
    20:23:37.606692 IP 192.168.1.2 > 192.168.2.2: ICMP echo request, id 5621, seq 1, length 64
    20:23:37.606718 IP 192.168.1.101 > 192.168.1.2: ICMP redirect 192.168.2.2 to host 192.168.1.102, length 92
    20:23:37.606720 IP 192.168.1.2 > 192.168.2.2: ICMP echo request, id 5621, seq 1, length 64
    20:23:38.791720 IP 192.168.1.2 > 192.168.2.2: ICMP echo request, id 5621, seq 1, length 64
    20:23:38.791737 IP 192.168.1.2 > 192.168.2.2: ICMP echo request, id 5621, seq 1, length 64
    20:23:39.978665 IP 192.168.1.2 > 192.168.2.2: ICMP echo request, id 5621, seq 1, length 64
    20:23:39.978690 IP 192.168.1.101 > 192.168.1.2: ICMP redirect 192.168.2.2 to host 192.168.1.102, length 92
    20:23:39.978692 IP 192.168.1.2 > 192.168.2.2: ICMP echo request, id 5621, seq 1, length 64
    20:23:41.163679 IP 192.168.1.2 > 192.168.2.2: ICMP echo request, id 5621, seq 1, length 64
    20:23:41.163696 IP 192.168.1.2 > 192.168.2.2: ICMP echo request, id 5621, seq 1, length 64
    20:23:42.347693 IP 192.168.1.2 > 192.168.2.2: ICMP echo request, id 5621, seq 1, length 64
    20:23:42.347710 IP 192.168.1.2 > 192.168.2.2: ICMP echo request, id 5621, seq 1, length 64
    :: [ 20:23:47 ] :: [ PASS ] :: Command 'tcpdump -r ping_plus_redir.pcap' (Expected 0, got 0)

    [root@kvm-06-guest02 bz1834184_redirect_rate]# uname -a
    Linux kvm-06-guest02.hv2.lab.eng.bos.redhat.com 3.10.0-1148.el7.x86_64 #1 SMP Wed Jun 3 15:04:49 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux

Was there a reason this was not added to the upcoming 7.8.z errata? https://errata.devel.redhat.com/advisory/56015

(In reply to Russell Teague from comment #43)
> Was there a reason this was not added to the upcoming 7.8.z errata?
> https://errata.devel.redhat.com/advisory/56015

There doesn't appear to be any 7.8.z BZ created/approved for it. zstream+ and ZTR is RHEL8 way of requesting zstream. Adding 7.8.z? flag, with PMApproved or GSSApproved added PM tooling should create 7.8.z clone.

I thought this was the 7.8.z BZ based on the Version field, however this bug has the 7.9 errata attached.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Important: kernel security, bug fix, and enhancement update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2020:4060

The needinfo request[s] on this closed bug have been removed as they have been unresolved for 1000 days