Bug 1724245 - leapp should not rely on example.com
Summary: leapp should not rely on example.com
Keywords:
Status: NEW
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: leapp-repository
Version: 7.6
Hardware: All
OS: Linux
Priority: medium
Severity: medium
Target Milestone: rc
: ---
Assignee: Leapp Notifications Bot
QA Contact: upgrades-and-conversions
URL:
Whiteboard:
Depends On:
Blocks: 1818088
 
Reported: 2019-06-26 14:42 UTC by Christophe Besson
Modified: 2023-07-31 07:12 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:
Target Upstream Version:
Embargoed:




Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Issue Tracker | OAMG-3039 | 0 | None | None | None | 2023-05-11 08:06:37 UTC

Description Christophe Besson 2019-06-26 14:42:36 UTC
Description of problem:
During the leapp upgrade process, any problem related to a badly configured proxy may lead to a connection attempt to "https://example.com" to check internet access.

Some customers can be very restrictive in their proxy rules, so leapp should not rely on this kind of external address.


Version-Release number of selected component (if applicable):
leapp-repository-0.7.0-5.el7_6

How reproducible:


Steps to Reproduce:
1. Configure a correct proxy in rhsm.conf and yum.conf (in my case, 192.168.122.1:3128)
2. Set a "bad" proxy env var to simulate a proxy error such as 407 Proxy Auth Err (in my case, a non-listening port: export https_proxy=http://192.168.122.1:8080)
3. Run leapp upgrade from an up-to-date RHEL7.6

Actual results:
============================================================
                        ERRORS
============================================================

2019-06-26 09:09:03.240470 [ERROR] Actor: prepare_upgrade_transaction Message:  A Leapp Command Error occurred.  . Possible spurious failure: There was probably a problem with internet conection (Failed to open url 'https://example.com' with error: <urlopen error [Errno 113] No route to host>). Check your connection and try again.

Expected results:
At a minimum, replacing it with "redhat.com" would be better.

Additional info:
# Things look good for RHSM and DNF downloads
-------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------
27416 09:00:52.794760 getsockopt(3<TCP:[192.168.122.27:58290->192.168.122.1:3128]>, SOL_SOCKET, SO_ERROR, [0], [4]) = 0 <0.000017>
27416 09:00:52.794855 poll([{fd=3<TCP:[192.168.122.27:58290->192.168.122.1:3128]>, events=POLLOUT}], 1, 180000) = 1 ([{fd=3, revents=POLLOUT}]) <0.000015>
27416 09:00:52.794920 sendto(3<TCP:[192.168.122.27:58290->192.168.122.1:3128]>, "CONNECT subscription.rhsm.redhat.com:443 HTTP/1.0\r\n", 51, 0, NULL, 0) = 51 <0.000036>
...
27539 09:01:23.521448 connect(20<TCP:[66554]>, {sa_family=AF_INET, sin_port=htons(3128), sin_addr=inet_addr("192.168.122.1")}, 16) = -1 EINPROGRESS (Operation now in progress) <0.000039>
27539 09:01:23.521653 poll([{fd=20<TCP:[192.168.122.27:58314->192.168.122.1:3128]>, events=POLLOUT|POLLWRNORM}], 1, 0) = 1 ([{fd=20, revents=POLLOUT|POLLWRNORM}]) <0.000013>
27539 09:01:23.521763 getsockopt(20<TCP:[192.168.122.27:58314->192.168.122.1:3128]>, SOL_SOCKET, SO_ERROR, [0], [4]) = 0 <0.000011>
27539 09:01:23.521806 getpeername(20<TCP:[192.168.122.27:58314->192.168.122.1:3128]>, {sa_family=AF_INET, sin_port=htons(3128), sin_addr=inet_addr("192.168.122.1")}, [16]) = 0 <0.000010>
27539 09:01:23.521846 getsockname(20<TCP:[192.168.122.27:58314->192.168.122.1:3128]>, {sa_family=AF_INET, sin_port=htons(58314), sin_addr=inet_addr("192.168.122.27")}, [16]) = 0 <0.000010>
27539 09:01:23.521894 sendto(20<TCP:[192.168.122.27:58314->192.168.122.1:3128]>, "CONNECT cdn.redhat.com:443 HTTP/1.1\r\nHost: cdn.redhat.com:443\r\nUser-Agent: libdnf\r\nProxy-Connection: Keep-Alive\r\n\r\n", 115, MSG_NOSIGNAL, NULL, 0) = 115 <0.000029>
-------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------

# A badly configured proxy leads here to a "No route to host"
-------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------
27331 09:09:03.145710 socket(AF_INET, SOCK_STREAM, IPPROTO_TCP) = 31<TCP:[74014]> <0.000856>
27331 09:09:03.146712 connect(31<TCP:[74014]>, {sa_family=AF_INET, sin_port=htons(8080), sin_addr=inet_addr("192.168.122.1")}, 16) = -1 EHOSTUNREACH (No route to host) <0.000363>
27331 09:09:03.147187 close(31<TCP:[74014]>) = 0 <0.000013>
-------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------

# A proxy auth error is difficult to diagnose (here a customer output):
-------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------
9778  08:32:19.843879 close(8</var/cache/dnf/rhel-8-for-x86_64-appstream-rpms-1899f526e47881cb/tmpdir.qZnK52/repodata/repomd.xml>) = 0 <0.000009>
9778  08:32:19.843926 write(4</var/log/dnf.librepo.log>, "2019-06-25T06:32:19Z DEBUG check_transfer_statuses: Error during transfer: Curl error (56): Failure when receiving data from the peer for https://**************/pulp/repos/*******/Library/content/dist/rhel8/8/x86_64/appstream/os/repodata/repomd.xml [Received HTTP code 407 from proxy after CONNECT]\n", 307) = 307 <0.000013>
-------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------
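Since the 407 only shows up in dnf.librepo.log (or in an strace), a small scan of that log can surface it. The following is a minimal sketch, not part of leapp: the function name find_proxy_errors is hypothetical, and the pattern is taken from the curl message visible in the excerpt above.

```python
import re

# Message text as it appears in the dnf.librepo.log excerpt above.
PROXY_ERROR = re.compile(r'Received HTTP code (\d{3}) from proxy after CONNECT')


def find_proxy_errors(lines):
    """Return (line_number, http_code) for every proxy CONNECT failure found."""
    hits = []
    for number, line in enumerate(lines, start=1):
        match = PROXY_ERROR.search(line)
        if match:
            hits.append((number, int(match.group(1))))
    return hits
```

It accepts any iterable of lines, so an open file object for /var/log/dnf.librepo.log can be passed directly.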

# Here is the faulty code from /usr/share/leapp-repository/repositories/system_upgrade/el7toel8/actors/prepareupgradetransaction/libraries/preparetransaction.py:
-------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------
def connection_guard(url='https://example.com'):
    def closure():
        try:
            urlopen(url)
            return None
        except URLError as e:
            cause = '''Failed to open url '{url}' with error: {error}'''.format(url=url, error=e)
            return ('There was probably a problem with internet conection ({cause}).'
                    ' Check your connection and try again.'.format(cause=cause))
    return closure
-------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------
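One possible direction, as a rough sketch only: probe the endpoints the upgrade really needs instead of a fixed external site. This assumes Python 3 (urllib.request) and that the caller can supply the relevant URLs (for example the RHSM or Satellite host); the urls and opener parameters are hypothetical and not part of the current leapp code. Any HTTP error response other than 407 is treated as proof that the network path works, which is a deliberately permissive reading.

```python
from urllib.error import HTTPError, URLError
from urllib.request import urlopen


def connection_guard(urls, opener=urlopen, timeout=10):
    """Return a closure probing the given URLs rather than a fixed external site."""
    def closure():
        failures = []
        for url in urls:
            try:
                opener(url, timeout=timeout)
            except HTTPError as e:
                # The server or proxy answered, so the network path works.
                # Only a proxy authentication failure is reported.
                if e.code == 407:
                    failures.append(
                        "Proxy authentication failed for '{}': {}".format(url, e))
            except URLError as e:
                failures.append(
                    "Failed to open url '{}' with error: {}".format(url, e))
        if not failures:
            return None
        return ('There was probably a problem with the network connection ({}).'
                ' Check your proxy and repository settings and try again.'
                .format('; '.join(failures)))
    return closure
```

The opener argument exists only so the guard can be exercised without network access; by default it uses urlopen, which honours the https_proxy environment variable.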

Comment 2 Petr Stodulka 2019-06-26 15:16:44 UTC
I see. The point of that check was to significantly reduce the number of bug reports people file about crashes whose real cause is the network connection. I guess we will have to come up with a different solution, or print that info without any additional check.

Comment 3 Christophe Besson 2019-07-04 14:29:45 UTC
In the Actor "prepareupgradetransaction", there are at least 3 steps which may call the "guards" (connection_guard, space_guard and, soon, permission_guard):
- get_rhsm_system_release()
- update_rhel_subscription()
- dnf_plugin_rpm_download()

Any non-zero exit code from the underlying commands triggers these "guard" checks. The checks are not unwelcome, but they do not help to find the root cause of a problem. Several commands may return a non-zero exit code (e.g. iptables-service isn't present, which is not a blocking error), so the exit code alone is not sufficient to identify why leapp stops with an undefined error.

-> A customer whose remote repositories are on a Satellite server can't always access an external site, so checking "example.com" isn't good.
-> This customer doesn't need a proxy to reach his repos, but he configured one anyway. He configured it incorrectly and didn't notice that this leads to a 407 Proxy Auth Error. Only an strace shows that.
-> Once the proxy issue was resolved, there was still a problem, with the same error message (can't access example.com, please check the internet connection). The 2nd problem was an incomplete repomd.xml: dependencies were missing because of a sync problem on his Satellite server.
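
To illustrate the point about root causes, here is a rough sketch of a wrapper that always reports the failed command's exit code and stderr, and adds the guard messages only as hints. The name run_with_guards and its signature are hypothetical; this is not how leapp currently runs its commands.

```python
import subprocess


def run_with_guards(cmd, guards=()):
    """Run cmd; on failure raise an error carrying the real stderr plus guard hints."""
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE,
                            universal_newlines=True)
    stdout, stderr = proc.communicate()
    if proc.returncode == 0:
        return stdout
    # Guards return None when they find nothing wrong; keep only real hints.
    hints = [hint for hint in (guard() for guard in guards) if hint]
    message = 'Command {!r} exited with code {}.\nstderr:\n{}'.format(
        cmd, proc.returncode, stderr.strip())
    if hints:
        message += '\nPossible causes:\n' + '\n'.join('- ' + h for h in hints)
    raise RuntimeError(message)
```

With this shape, an incomplete repomd.xml would still show the dnf error text even if the connection guard also fires.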

To help debugging, it would be good to copy the following logs to /var/log/leapp/dnf-debugdata:
/var/log/rhsm/rhsm.log
/var/log/dnf.log
/var/log/dnf.librepo.log
/var/log/dnf.rpm.log
/var/log/hawkey.log

Indeed, what happens isn't fully logged in a persistent manner, since these files are removed just before the overlay is unmounted.
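
A minimal sketch of that copy step, assuming it runs before the overlay is unmounted. The function name preserve_logs and the overlay_root parameter (the mount point under which the logs live) are hypothetical; the log list and the destination directory are the ones named above.

```python
import os
import shutil

LOGS_TO_KEEP = [
    '/var/log/rhsm/rhsm.log',
    '/var/log/dnf.log',
    '/var/log/dnf.librepo.log',
    '/var/log/dnf.rpm.log',
    '/var/log/hawkey.log',
]


def preserve_logs(overlay_root, dest_dir='/var/log/leapp/dnf-debugdata',
                  logs=LOGS_TO_KEEP):
    """Copy the listed logs from under overlay_root into dest_dir.

    Returns the list of destination paths that were written.
    """
    os.makedirs(dest_dir, exist_ok=True)
    copied = []
    for log in logs:
        src = os.path.join(overlay_root, log.lstrip('/'))
        if not os.path.isfile(src):
            continue  # a missing log is not an error
        dst = os.path.join(dest_dir, os.path.basename(log))
        shutil.copy2(src, dst)
        copied.append(dst)
    return copied
```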

