Bug 1883537
| Summary: | [leapp] Unable to upgrade to RHEL 8 behind some proxies (auth_method) | ||
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 7 | Reporter: | Christophe Besson <cbesson> |
| Component: | leapp-repository | Assignee: | Leapp Notifications Bot <leapp-notifications-bot> |
| Status: | ASSIGNED --- | QA Contact: | upgrades-and-conversions |
| Severity: | high | Docs Contact: | |
| Priority: | high | ||
| Version: | 7.8 | CC: | fkrska, paygupta, pstodulk |
| Target Milestone: | rc | Keywords: | Reopened, Upgrades |
| Target Release: | --- | ||
| Hardware: | All | ||
| OS: | Linux | ||
| Whiteboard: | |||
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2020-11-12 12:11:40 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
| Bug Depends On: | |||
| Bug Blocks: | 1818077, 1818088 | ||
Relevant networking syscalls from the strace, with some strings truncated (IPs, credentials):
123630 16:18:13.982040 sendto(4<TCP:[IPADDR:54158->IPADDR:8080]>, "CONNECT subscription.rhsm.redhat.com:443 HTTP/1.0\r\n", 51, 0, NULL, 0) = 51 <0.000037>
:
123630 16:18:13.982219 sendto(4<TCP:[IPADDR:54158->IPADDR:8080]>, "Host: subscription.rhsm.redhat.com:443\r\n", 40, 0, NULL, 0) = 40 <0.000024>
:
123630 16:18:13.982369 sendto(4<TCP:[IPADDR:54158->IPADDR:8080]>, "Proxy-Authorization: Basic XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX\r\n", 73, 0, NULL, 0) = 73 <0.000023>
:
123630 16:18:13.982510 sendto(4<TCP:[IPADDR:54158->IPADDR:8080]>, "User-Agent: RHSM/1.0 (cmd=subscription-manager)\r\n", 49, 0, NULL, 0) = 49 <0.000023>
:
:
:
124126 16:20:04.977792 connect(12<TCP:[993277]>, {sa_family=AF_INET, sin_port=htons(8080), sin_addr=inet_addr("IPADDR")}, 16) = -1 EINPROGRESS (Operation now in progress) <0.000038>
124126 16:20:04.980856 sendto(12<TCP:[IPADDR:52130->IPADDR:8080]>, "CONNECT cdn.redhat.com:443 HTTP/1.1\r\nHost: cdn.redhat.com:443\r\nUser-Agent: urlgrabber/3.10 yum/3.4.3\r\nProxy-Connection: Keep-Alive\r\n\r\n", 134, MSG_NOSIGNAL, NULL, 0) = 134 <0.000030>
:
124126 16:20:05.233820 sendto(12<TCP:[IPADDR:52130->IPADDR:8080]>, "CONNECT cdn.redhat.com:443 HTTP/1.1\r\nHost: cdn.redhat.com:443\r\nProxy-Authorization: NTLM XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX\r\nUser-Agent: urlgrabber/3.10 yum/3.4.3\r\nProxy-Connection: Keep-Alive\r\n\r\n", 358, MSG_NOSIGNAL, NULL, 0) = 358 <0.000036>
:
124126 16:20:05.257025 recvfrom(12<TCP:[IPADDR:52130->IPADDR:8080]>, "HTTP/1.1 200 Connection established\r\nDate: Tue, 29 Sep 2020 10:43:50 GMT\r\nContent-Type: text/html\r\nProxy-Connection: Keep-Alive\r\nVia: 1.1 COLOIG\r\n\r\n", 16384, 0, NULL, NULL) = 148 <0.000022>
:
:
:
128604 16:25:14.113940 connect(9<TCP:[1052438]>, {sa_family=AF_INET, sin_port=htons(8080), sin_addr=inet_addr("IPADDR")}, 16) = -1 EINPROGRESS (Operation now in progress) <0.000043>
128604 16:25:14.117018 sendto(9<TCP:[IPADDR:52162->IPADDR:8080]>, "CONNECT cdn.redhat.com:443 HTTP/1.1\r\nHost: cdn.redhat.com:443\r\nUser-Agent: libdnf\r\nProxy-Connection: Keep-Alive\r\nCache-Control: no-cache\r\nPragma: no-cache\r\n\r\n", 158, MSG_NOSIGNAL, NULL, 0) = 158 <0.000032>
128604 16:25:14.139161 sendto(9<TCP:[IPADDR:52162->IPADDR:8080]>, "CONNECT cdn.redhat.com:443 HTTP/1.1\r\nHost: cdn.redhat.com:443\r\nUser-Agent: libdnf\r\nProxy-Connection: Keep-Alive\r\nCache-Control: no-cache\r\nPragma: no-cache\r\n\r\n", 158, MSG_NOSIGNAL, NULL, 0 <unfinished ...>
:
128604 16:25:14.139093 <... recvfrom resumed>"HTTP/1.1 407 Proxy Authorization Required\r\nDate: Tue, 29 Sep 2020 10:48:59 GMT\r\nContent-Type: text/html\r\nProxy-Connection: keep-alive\r\nVia: 1.1 COLOIG\r\nCache-Control: no-store\r\nContent-Language: en\r\nProxy-Authenticate: Negotiate\r\nProxy-Authenticate: NTLM\r\nContent-Length: 666\r\n\r\n<HEAD><TITLE>Proxy Authorization Required</TITLE></HEAD>\n<BODY BGCOLOR=\"white\" FGCOLOR=\"black\"><H1>Proxy Authorization Required</H1><HR>\n<FONT FACE=\"Helvetica,Arial\"><B>\nDescription: Authorization is required for access to this proxy</B></FONT>\n<HR>\n<!-- default \"Proxy Authorization Required\" response (407) -->\n</BODY>\n \0", 16384, 0, NULL, NULL) = 945 <0.000023>
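For illustration, the authentication schemes offered by the proxy can be pulled out of a raw response like the one above. This is only a minimal Python sketch: the helper name `offered_proxy_auth_schemes` is made up for this example, and the sample response is abbreviated from the strace (some headers and the HTML body are omitted).

```python
from typing import List


def offered_proxy_auth_schemes(raw_response: str) -> List[str]:
    """Return the scheme named in each Proxy-Authenticate header (hypothetical helper)."""
    # Only the header block matters; drop everything after the first blank line.
    head, _, _ = raw_response.partition("\r\n\r\n")
    schemes = []
    for line in head.split("\r\n")[1:]:  # skip the status line
        name, sep, value = line.partition(":")
        if sep and name.strip().lower() == "proxy-authenticate":
            value = value.strip()
            if value:
                # The scheme is the first token, e.g. "NTLM" or "Negotiate".
                schemes.append(value.split()[0])
    return schemes


# Abbreviated copy of the 407 response captured in the strace.
RESPONSE = (
    "HTTP/1.1 407 Proxy Authorization Required\r\n"
    "Date: Tue, 29 Sep 2020 10:48:59 GMT\r\n"
    "Content-Type: text/html\r\n"
    "Proxy-Connection: keep-alive\r\n"
    "Proxy-Authenticate: Negotiate\r\n"
    "Proxy-Authenticate: NTLM\r\n"
    "Content-Length: 666\r\n"
    "\r\n"
)

print(offered_proxy_auth_schemes(RESPONSE))
```

For this particular response the proxy advertises Negotiate and NTLM only; Basic is not listed.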
Hi Chris, if I understand correctly, DNF is not working on the host system either, correct? If that is the case, your solution is not a workaround but the expected solution to this problem, because the system is misconfigured. DNF has to work on RHEL 7. Please ensure DNF was (is) working on the host machine (do not run leapp, just try to install something using dnf from repositories behind the NTLM proxy). If DNF on the host system is not working, it is either a DNF bug or a misconfigured system.

Hello Petr, yes, dnf has to work on the host, which is also why I asked to add this directive (hopefully it will work). From my understanding, forcing NTLM auth is needed with DNF whereas it is not needed with YUM. I can't tell whether it works on RHEL 8 without adding this directive to dnf.conf. I agree that if it works there directly, this is a DNF bug in RHEL 7. However, most users are not aware that they use NTLM auth, and they don't use DNF on such a system, so this could be checked prior to the upgrade. Indeed, it is impossible to identify the issue without strace or tcpdump, whose output is very heavy to parse (> 1 GB).

I see. We will discuss it. Maybe, for a start, we could document it in our upgrade documentation as a prerequisite step.

The case has become increasingly complex. The customer uses a hostname as the proxy in rhsm.conf. Four IPs are behind this hostname (a kind of DNS round-robin), and the proxies in question can work with various authentication methods. On RHEL 7.8, subscription-manager and yum always work: subscription-manager always uses the `basic` authentication method whereas yum always uses `ntlm`. By default, dnf 4.0 doesn't work; it comes with a new directive, `proxy_auth_method`, set to `any` by default. Forcing `basic` or `ntlm` in dnf.conf makes dnf work on RHEL 7. For an unknown reason, only forcing `basic` works on RHEL 8 (tested on another VM behind the same infrastructure). I will likely file a bug against dnf too.
To move forward, we added proxy_auth_method=basic on the machine to be upgraded, and the behaviour changed. The first download step worked (`dnf -y install dnf`), that is to say, the DNF from RHEL 7 (4.0) installs the DNF from RHEL 8 (4.2). The first 192 packages pulled in as dependencies are correctly downloaded and installed. In the 2nd step (`dnf rhel-upgrade download`...), dnf 4.2 uses the config file that was freshly installed in the target directory, but the directive we added to make it work is missing:
~~~
27973 16:57:54.774838 read(3</etc/dnf/dnf.conf>, "[main]\ngpgcheck=1\ninstallonly_limit=3\nclean_requirements_on_remove=True\nbest=True\nskip_if_unavailable=False\n", 8191) = 108 <0.000028>
~~~
So it fails to download the 2nd subset of packages (>300 in this case). AFAIK, there is no workaround for now.

Filed the following BZ against dnf: #1884642
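A quick way to compare the two configuration files is to read the option from the `[main]` section of each. A minimal sketch, assuming the files are simple INI-style text that Python's `configparser` can read (dnf uses its own parser, so this is only an approximation). `get_main_option` is a hypothetical helper, `TARGET_CONF` is the content seen in the read() above, and `HOST_CONF` is an assumed example of the host file with the directive added.

```python
import configparser


def get_main_option(conf_text, option):
    """Return the value of `option` in [main], or None if it is not set (hypothetical helper)."""
    # interpolation=None keeps '%' characters (e.g. in proxy passwords) literal.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(conf_text)
    if not parser.has_section("main"):
        return None
    return parser.get("main", option, fallback=None)


# Content of the freshly installed dnf.conf in the target directory (from the strace).
TARGET_CONF = (
    "[main]\n"
    "gpgcheck=1\n"
    "installonly_limit=3\n"
    "clean_requirements_on_remove=True\n"
    "best=True\n"
    "skip_if_unavailable=False\n"
)

# Assumed example: the same file on the host, with the directive added by hand.
HOST_CONF = TARGET_CONF + "proxy_auth_method=basic\n"

print("host:  ", get_main_option(HOST_CONF, "proxy_auth_method"))
print("target:", get_main_option(TARGET_CONF, "proxy_auth_method"))
```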
Suggested, as a workaround, bind mounting the dnf.conf containing the directive inside the container, as in the patch below.
Still awaiting feedback.
--- /usr/share/leapp-repository/repositories/system_upgrade/el7toel8/libraries/mounting.py.orig 2020-10-02 15:34:27.876288450 +0200
+++ /usr/share/leapp-repository/repositories/system_upgrade/el7toel8/libraries/mounting.py 2020-10-02 15:34:53.180713779 +0200
@@ -7,7 +7,7 @@
from leapp.libraries.common.config import get_all_envs
-ALWAYS_BIND = ['/etc/hosts:/etc/hosts']
+ALWAYS_BIND = ['/etc/hosts:/etc/hosts', '/etc/dnf/dnf.conf:/etc/dnf/dnf.conf']
ErrorData = namedtuple('ErrorData', ['summary', 'details'])
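To show what the extra `ALWAYS_BIND` entry amounts to, here is a rough sketch of turning such `source:target` entries into `--bind=` arguments for systemd-nspawn, the form visible in the failing command in the description. `nspawn_bind_args` is a hypothetical helper written for this example; it is not the actual code in mounting.py.

```python
# Patched list from the diff above.
ALWAYS_BIND = ['/etc/hosts:/etc/hosts', '/etc/dnf/dnf.conf:/etc/dnf/dnf.conf']


def nspawn_bind_args(binds):
    """Build systemd-nspawn --bind= arguments from 'source:target' strings (hypothetical helper)."""
    args = []
    for spec in binds:
        source, sep, target = spec.partition(':')
        # Both sides are expected to be absolute paths.
        if not sep or not source.startswith('/') or not target.startswith('/'):
            raise ValueError('invalid bind specification: {!r}'.format(spec))
        args.append('--bind={}:{}'.format(source, target))
    return args


print(nspawn_bind_args(ALWAYS_BIND))
```

With the patched list, the host's dnf.conf (including `proxy_auth_method`) is what dnf reads inside the container.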
Just to confirm that the patch above worked for the customer (binding `dnf.conf` within the container, with the statement `proxy_auth_method=basic` added).

We can't solve the problem itself. However, we can make it possible to include dnf.conf in the upgrade DNF configuration, so that DNF settings from the original RHEL 7 system are applied during the upgrade. That would make the upgrade possible in a supported way, without having to apply code changes.

We have another case with the same kind of symptoms, and only the inclusion of dnf.conf might fix it.
- the customer is using a proxy with basic authentication to reach the CDN repos.
- yum has been broken since the installation of leapp with its deps (especially dnf/libdnf from el7 extras).
- yum repolist shows dnf warnings:
  WARNING:dnf:Failed to synchronize cache for repo 'rhel-7-server-rpms', ignoring this repo.
- leapp fails for the very same reason (it can't reach the repositories).

After many checks, it appears that:
- no 3rd-party software is involved (rpmVa OK, no pip pkgs, no module tainting the kernel).
- yum works again after removing dnf (please note that on RHEL, "yum repolist" loads dnf python files when dnf is installed).
- an HTTP 407 (authorization required) is returned by the proxy but is not visible in the outputs (it can be seen in the strace, or in dnf.librepo.log in the container).
- the default proxy_auth_method for dnf is "any", and it tries to speak NTLM because the proxy speaks it.
- forcing proxy_auth_method=basic in dnf.conf and customizing the dict FILES_TO_COPY_IF_PRESENT to include it in `system_upgrade/common/actors/scanfilesfortargetuserspace/libraries/scanfilesfortargetuserspace.py` fixes the issue.
- the customer also has to use this directive on RHEL8/DNF4.2 once the system is fully upgraded.

Cannot really reproduce the issue (in my partial reproducer yum still fails if I remove dnf).
# uname -r
3.10.0-1160.80.1.el7.x86_64
# rpm -qa | grep leapp
leapp-deps-0.15.0-2.el7_9.noarch
python2-leapp-0.15.0-2.el7_9.noarch
leapp-upgrade-el7toel8-deps-0.17.0-1.el7_9.noarch
leapp-upgrade-el7toel8-0.17.0-1.el7_9.noarch
leapp-0.15.0-2.el7_9.noarch
# dnf clean all
0 files removed
# grep proxy /etc/dnf/dnf.conf
proxy=http://192.168.122.1:3128
proxy_username=foo
proxy_password=foobar
# dnf repolist
el7-base      0.0 B/s |   0  B     00:00
el7-extras    0.0 B/s |   0  B     00:00
Failed to synchronize cache for repo 'el7-base', ignoring this repo.
Failed to synchronize cache for repo 'el7-extras', ignoring this repo.

From /var/log/dnf.librepo.log:
2022-11-17T10:40:46Z DEBUG check_transfer_statuses: Error during transfer: Status code: 407 for http://rhsm-pulp.corp.redhat.com/content/dist/rhel/server/7/7Server/x86_64/os/repodata/repomd.xml

# dnf --setopt=proxy_auth_method=basic repolist
el7-base      688 kB/s | 112 MB     02:46
el7-extras    318 kB/s | 1.5 MB     00:04
Last metadata expiration check: 0:00:03 ago on Thu 17 Nov 2022 05:43:56 AM EST.
repo id        repo name      status
el7-base       el7-base       33,369
el7-extras     el7-extras      1,444

####### Broken Squid config ##############
# INSERT YOUR OWN RULE(S) HERE TO ALLOW ACCESS FROM YOUR CLIENTS
#
auth_param basic program /usr/lib64/squid/basic_ncsa_auth /etc/squid/passwords
auth_param basic realm proxy
auth_param ntlm program /usr/bin/ntlm_auth --helper-protocol=squid-2.5-ntlmssp
acl ntlm_users proxy_auth REQUIRED
http_access allow ntlm_users
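The 407 can be located in dnf.librepo.log without reading the whole file by hand. A minimal sketch, assuming log lines follow the `Status code: NNN for URL` shape shown above; `find_http_errors` is a hypothetical helper, not part of dnf or leapp.

```python
import re

# Matches the tail of a librepo transfer error line, e.g.
# "... Error during transfer: Status code: 407 for http://..."
STATUS_RE = re.compile(r"Status code: (\d{3}) for (\S+)")


def find_http_errors(log_text, status=407):
    """Return the URLs that failed with the given HTTP status (hypothetical helper)."""
    urls = []
    for line in log_text.splitlines():
        match = STATUS_RE.search(line)
        if match and int(match.group(1)) == status:
            urls.append(match.group(2))
    return urls


# Log line taken from the reproducer above.
SAMPLE_LOG = (
    "2022-11-17T10:40:46Z DEBUG check_transfer_statuses: Error during transfer: "
    "Status code: 407 for http://rhsm-pulp.corp.redhat.com/content/dist/rhel/"
    "server/7/7Server/x86_64/os/repodata/repomd.xml\n"
)

for url in find_http_errors(SAMPLE_LOG):
    print("proxy authentication required for", url)
```

To run it against a real file, read the log first, for example with `open("/var/log/dnf.librepo.log").read()`.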
Description of problem:
A customer is experiencing an issue with an NTLM proxy during the upgrade. As per the strace, the connections through the NTLM proxy with credentials work with subscription-manager/RHSM, and then with yum/urlgrabber, but NOT with libdnf, which is used in the 2nd step to get the RHEL 8 metadata. This time, we can see in a recvfrom() that the proxy returned the error below:
~~~
HTTP/1.1 407 Proxy Authorization Required
Date: Tue, 29 Sep 2020 10:48:59 GMT
Content-Type: text/html
Proxy-Connection: keep-alive
Via: 1.1 COLOIG
Cache-Control: no-store
Content-Language: en
Proxy-Authenticate: Negotiate
Proxy-Authenticate: NTLM
Content-Length: 666
<--cut-->
~~~

Version-Release number of selected component (if applicable):
leapp-repository-0.10.0-2.el7_8.noarch
libdnf-0.22.5-1.el7_6.x86_64

How reproducible:
100% for the customer

Actual results:
~~~
2020-09-25 12:56:49.864423 [ERROR] Actor: target_userspace_creator
Message: Unable to install RHEL 8 userspace packages.
Summary:
    Details: Command ['systemd-nspawn', '--register=no', '--quiet', '-D', '/var/lib/leapp/scratch/mounts/root_/system_overlay', '--bind=/etc/hosts:/etc/hosts', '--setenv=LEAPP_NO_RHSM=0', '--setenv=LEAPP_EXPERIMENTAL=0', '--setenv=LEAPP_COMMON_TOOLS=:/etc/leapp/repos.d/system_upgrade/el7toel8/tools', '--setenv=LEAPP_COMMON_FILES=:/etc/leapp/repos.d/system_upgrade/el7toel8/files', '--setenv=LEAPP_UNSUPPORTED=0', '--setenv=LEAPP_EXECUTION_ID=6f21afac-8567-4169-bbd9-f230750dcec3', '--setenv=LEAPP_HOSTNAME=XXXXXXXXXXXXXXXXXXX', 'dnf', 'install', '-y', '--nogpgcheck', '--setopt=module_platform_id=platform:el8', '--setopt=keepcache=1', '--releasever', u'8.2', '--installroot', '/el8target', '--disablerepo', '*', '--enablerepo', u'rhel-8-for-x86_64-baseos-rpms', '--enablerepo', u'rhel-8-for-x86_64-appstream-rpms', '--enablerepo', u'rhel-8-for-x86_64-baseos-rpms', '--enablerepo', u'rhel-8-for-x86_64-appstream-rpms', '--enablerepo', u'rhel-8-for-x86_64-appstream-rpms', '--enablerepo', u'rhel-8-for-x86_64-baseos-rpms', 'dnf'] failed with exit code 1.
    Stderr: Failed to create directory /var/lib/leapp/scratch/mounts/root_/system_overlay//sys/fs/selinux: Read-only file system
            Failed to create directory /var/lib/leapp/scratch/mounts/root_/system_overlay//sys/fs/selinux: Read-only file system
            Host and machine ids are equal (96b3ebe3b3db4df38171b3cc3b35a3f4): refusing to link journals
            Failed to synchronize cache for repo 'rhel-8-for-x86_64-appstream-rpms', ignoring this repo.
            Failed to synchronize cache for repo 'rhel-8-for-x86_64-baseos-rpms', ignoring this repo.
            Error: Unable to find a match: dnf
~~~

Additional info:
Sounds close to this kind of bug, fixed in Fedora 26:
https://bugzilla.redhat.com/show_bug.cgi?id=1387622

Suggested the following workaround (still waiting for feedback, as I can't reproduce it myself):
~~~
# echo proxy_auth_method=ntlm >> /etc/dnf/dnf.conf
~~~
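Note that the directive has to end up in the `[main]` section of dnf.conf, next to the existing options. A minimal Python sketch of adding it while keeping the other options, assuming a simple INI-style dnf.conf; `with_main_option` is a hypothetical helper, `ORIGINAL` is an assumed sample file, and rewriting through `configparser` drops any comments the file contains.

```python
import configparser
import io


def with_main_option(conf_text, option, value):
    """Return conf_text with `option=value` set in [main] (hypothetical helper)."""
    # interpolation=None keeps '%' characters in existing values literal.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(conf_text)
    if not parser.has_section("main"):
        parser.add_section("main")
    parser.set("main", option, value)
    out = io.StringIO()
    # Write "key=value" without spaces, matching the usual dnf.conf style.
    parser.write(out, space_around_delimiters=False)
    return out.getvalue()


# Assumed sample of an existing dnf.conf.
ORIGINAL = "[main]\ngpgcheck=1\nbest=True\n"
UPDATED = with_main_option(ORIGINAL, "proxy_auth_method", "ntlm")
print(UPDATED)
```

The existing options are preserved and the new directive is added to the same section.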