Bug 2129713
Summary: | passt: can not find IPV6 gateway on vm | ||
---|---|---|---|
Product: | Red Hat Enterprise Linux 9 | Reporter: | Quan Wenli <wquan> |
Component: | passt | Assignee: | Stefano Brivio <sbrivio> |
Status: | CLOSED COMPLETED | QA Contact: | Lei Yang <leiyang> |
Severity: | high | Docs Contact: | |
Priority: | high | ||
Version: | 9.2 | CC: | chayang, coli, jinzhao, juzhang |
Target Milestone: | rc | Keywords: | Regression |
Target Release: | --- | ||
Hardware: | All | ||
OS: | All | ||
Whiteboard: | |||
Fixed In Version: | 0^20220929.g06aa26f-1 | Doc Type: | If docs needed, set a value |
Last Closed: | 2023-11-20 10:32:17 UTC | Type: | --- |
Description
Quan Wenli
2022-09-26 06:23:40 UTC
From a capture shared offline: router advertisements are sent with an incorrect checksum, whose value is the expected checksum plus 58 (decimal, the same as the ICMPv6 protocol number).

It's as if, in some builds, this snippet from ndp.c:

	ip6hr->hop_limit = IPPROTO_ICMPV6;
	ihr->icmp6_cksum = 0;
	ihr->icmp6_cksum = csum_unaligned(ip6hr, sizeof(*ip6hr) +
						 sizeof(*ihr) + len, 0);
	ip6hr->version = 6;
	ip6hr->nexthdr = IPPROTO_ICMPV6;
	ip6hr->hop_limit = 255;

where, for convenience, hop_limit is first set to IPPROTO_ICMPV6 to match the IPv6 pseudo-header for the ICMPv6 checksum, and later set to its intended value, happened to be equivalent to:

	ihr->icmp6_cksum = 0;
	ihr->icmp6_cksum = csum_unaligned(ip6hr, sizeof(*ip6hr) +
						 sizeof(*ihr) + len, 0);
	ip6hr->version = 6;
	ip6hr->nexthdr = IPPROTO_ICMPV6;
	ip6hr->hop_limit = 255;

At first glance, I don't see any reason why the compiler would be allowed to elide the initial assignment of hop_limit, though.

From a second (still quick) look:

- the gcc version used in passt-0.git.2022_06_08.d7d467f-0.el9.x86_64 is 11.2.1-9.4.el9 (https://download.copr.fedorainfracloud.org/results/sbrivio/passt/epel-9-x86_64/04776284-passt/builder-live.log.gz) -- this is the same as the one used for the most recent EPEL 9 build, 0^20220924.g8978f65-1.el9.x86_64

- comparing that part of the NDP implementation between the most recent EPEL 9 build and the most recent Fedora 36 build (using gcc 12.2.1-2.fc36.x86_64), it looks like there are some notable differences -- something makes me think that the hop_limit store is actually missing in the EPEL 9 build, but I couldn't make enough sense of the disassembly yet

- if that store is really missing, this would be similar in nature to the issue described at https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101354, which however applies only to functions marked as "naked", and that's not the case here

It might be worth checking for differences in intermediate files (passing -save-temps in CFLAGS) between the two gcc versions.
Weird:

$ gcc --version
gcc (GCC) 11.2.1 20220127 (Red Hat 11.2.1-9)
Copyright (C) 2021 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

$ (make clean; CFLAGS='-g' make) >/dev/null
$ objdump -DSdslrx passt|grep -A15 "ndp.c\:192" -m1
/root/passt_gcc_debug/passt/ndp.c:192
	ihr->icmp6_cksum = 0;
    b8bc:	31 c9                	xor    %ecx,%ecx
/root/passt_gcc_debug/passt/ndp.c:193
	ihr->icmp6_cksum = csum_unaligned(ip6hr, sizeof(*ip6hr) +
    b8be:	31 d2                	xor    %edx,%edx
/root/passt_gcc_debug/passt/ndp.c:191
	ip6hr->hop_limit = IPPROTO_ICMPV6;
    b8c0:	c6 44 24 35 3a       	movb   $0x3a,0x35(%rsp)
__bswap_16():
/usr/include/bits/byteswap.h:37
  return __builtin_bswap16 (__bsx);
    b8c5:	66 c1 c0 08          	rol    $0x8,%ax
ndp():
/root/passt_gcc_debug/passt/ndp.c:193
	ihr->icmp6_cksum = csum_unaligned(ip6hr, sizeof(*ip6hr) +

Here, 0x3a (IPPROTO_ICMPV6) is stored before the call to csum_unaligned(). But not if I build with -flto=auto (that's a default flag for at least EPEL 9 packages):

$ (make clean; CFLAGS='-g -flto=auto' make) >/dev/null
$ objdump -DSdslrx passt|grep -A15 "ndp.c\:192" -m1
/root/passt_gcc_debug/passt/ndp.c:192
	ihr->icmp6_cksum = 0;
    bb0c:	45 31 c0             	xor    %r8d,%r8d
__bswap_16():
/usr/include/bits/byteswap.h:37
  return __builtin_bswap16 (__bsx);
    bb0f:	66 c1 c0 08          	rol    $0x8,%ax
ndp():
/root/passt_gcc_debug/passt/ndp.c:192
    bb13:	66 44 89 44 24 58    	mov    %r8w,0x58(%rsp)
/root/passt_gcc_debug/passt/ndp.c:190
	ip6hr->payload_len = htons(sizeof(*ihr) + len);
    bb19:	66 89 44 24 32       	mov    %ax,0x32(%rsp)
/root/passt_gcc_debug/passt/ndp.c:193
	ihr->icmp6_cksum = csum_unaligned(ip6hr, sizeof(*ip6hr) +
    bb1e:	48 8d 46 28          	lea    0x28(%rsi),%rax

I also tested on a recent gcc 12.2.0, same behaviour.
I introduced a workaround, that is, declaring csum_unaligned() as "noipa" for the affected gcc versions, depending on CFLAGS:

https://passt.top/passt/commit/?id=06aa26fcf398f5d19ab46e42996190d7f95e837a

and it's now available in the 0^20220929.g06aa26f-1 EPEL 9 build (Copr repository only at the moment).

Verified with passt-0^20221104.ge308018-1.el9.x86_64: the test passes and the VM gets the IPv6 gateway.

On host:

[root@dell-per440-18 ~]# rpm -qa |grep passt
passt-0^20221104.ge308018-1.el9.x86_64
[root@dell-per440-18 ~]# gcc --version
gcc (GCC) 11.2.1 20220127 (Red Hat 11.2.1-9)

On VM:

[root@dell-per440-18 ~]# dhclient -6 eth0
grep: /etc/sysconfig/network-scripts/ifcfg-*: No such file or directory
grep: /etc/sysconfig/network-scripts/ifcfg-*: No such file or directory
grep: /etc/sysconfig/network-scripts/ifcfg-*: No such file or directory
grep: /etc/sysconfig/network-scripts/ifcfg-*: No such file or directory
grep: /etc/sysconfig/network-scripts/ifcfg-*: No such file or directory
grep: /etc/sysconfig/network-scripts/ifcfg-*: No such file or directory
[root@dell-per440-18 ~]# ip -j -6 ro sh|jq -rM '.[] | select(.dst == "default").gateway'
fe80::cee1:9402:8b35:be41

Setting it to Verified.