Bug 1724245

Summary: leapp should not rely on example.com
Product: Red Hat Enterprise Linux 7 Reporter: Christophe Besson <cbesson>
Component: leapp-repository Assignee: Leapp Notifications Bot <leapp-notifications-bot>
Status: CLOSED MIGRATED QA Contact: upgrades-and-conversions
Severity: medium Docs Contact:
Priority: medium    
Version: 7.6 CC: cbesson, cww, fkrska, mbocek, pstodulk
Target Milestone: rc Keywords: MigratedToJIRA, Upgrades
Target Release: --- Flags: pm-rhel: mirror+
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2023-09-12 11:02:33 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1818088    

Description Christophe Besson 2019-06-26 14:42:36 UTC
Description of problem:
During the leapp upgrade process, any problem caused by a badly configured proxy may lead leapp to connect to "https://example.com" to check for internet access.

Some customers have very restrictive proxy rules, so leapp should not rely on this kind of external address.


Version-Release number of selected component (if applicable):
leapp-repository-0.7.0-5.el7_6

How reproducible:


Steps to Reproduce:
1. Configure a correct proxy in rhsm.conf and yum.conf (in my case, 192.168.122.1:3128)
2. Set a "bad" proxy env var (to simulate a proxy error, e.g. 407 Proxy Auth Err; in my case a non-listening port: export https_proxy=http://192.168.122.1:8080)
3. Run leapp upgrade from an up-to-date RHEL7.6
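
Step 2 is enough to break the connectivity check because Python's urllib takes its proxy from environment variables such as https_proxy, and does not read rhsm.conf or yum.conf. A minimal sketch, assuming Python 3's standard library (the helper name effective_env_proxy is hypothetical and not part of leapp), that shows which proxy a plain urlopen() call would pick up:

```python
import os
from urllib.request import getproxies_environment


def effective_env_proxy(scheme="https"):
    """Return the proxy URL urllib would take from the environment, or None."""
    # getproxies_environment() only scans variables named <scheme>_proxy
    # (case-insensitive, lowercase wins); rhsm.conf and yum.conf are not consulted.
    return getproxies_environment().get(scheme)


if __name__ == "__main__":
    # Same kind of value as in step 2: a proxy address nothing listens on.
    os.environ["https_proxy"] = "http://192.168.122.1:8080"
    print(effective_env_proxy("https"))
```

Running this in the shell that launches leapp, before the upgrade, would show whether a stray proxy variable is set.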

Actual results:
============================================================
                        ERRORS
============================================================

2019-06-26 09:09:03.240470 [ERROR] Actor: prepare_upgrade_transaction Message:  A Leapp Command Error occurred.  . Possible spurious failure: There was probably a problem with internet conection (Failed to open url 'https://example.com' with error: <urlopen error [Errno 113] No route to host>). Check your connection and try again.

Expected results:
At a minimum, checking "redhat.com" instead of "example.com" would be better.

Additional info:
# Things look good for RHSM and DNF downloads
-------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------
27416 09:00:52.794760 getsockopt(3<TCP:[192.168.122.27:58290->192.168.122.1:3128]>, SOL_SOCKET, SO_ERROR, [0], [4]) = 0 <0.000017>
27416 09:00:52.794855 poll([{fd=3<TCP:[192.168.122.27:58290->192.168.122.1:3128]>, events=POLLOUT}], 1, 180000) = 1 ([{fd=3, revents=POLLOUT}]) <0.000015>
27416 09:00:52.794920 sendto(3<TCP:[192.168.122.27:58290->192.168.122.1:3128]>, "CONNECT subscription.rhsm.redhat.com:443 HTTP/1.0\r\n", 51, 0, NULL, 0) = 51 <0.000036>
...
27539 09:01:23.521448 connect(20<TCP:[66554]>, {sa_family=AF_INET, sin_port=htons(3128), sin_addr=inet_addr("192.168.122.1")}, 16) = -1 EINPROGRESS (Operation now in progress) <0.000039>
27539 09:01:23.521653 poll([{fd=20<TCP:[192.168.122.27:58314->192.168.122.1:3128]>, events=POLLOUT|POLLWRNORM}], 1, 0) = 1 ([{fd=20, revents=POLLOUT|POLLWRNORM}]) <0.000013>
27539 09:01:23.521763 getsockopt(20<TCP:[192.168.122.27:58314->192.168.122.1:3128]>, SOL_SOCKET, SO_ERROR, [0], [4]) = 0 <0.000011>
27539 09:01:23.521806 getpeername(20<TCP:[192.168.122.27:58314->192.168.122.1:3128]>, {sa_family=AF_INET, sin_port=htons(3128), sin_addr=inet_addr("192.168.122.1")}, [16]) = 0 <0.000010>
27539 09:01:23.521846 getsockname(20<TCP:[192.168.122.27:58314->192.168.122.1:3128]>, {sa_family=AF_INET, sin_port=htons(58314), sin_addr=inet_addr("192.168.122.27")}, [16]) = 0 <0.000010>
27539 09:01:23.521894 sendto(20<TCP:[192.168.122.27:58314->192.168.122.1:3128]>, "CONNECT cdn.redhat.com:443 HTTP/1.1\r\nHost: cdn.redhat.com:443\r\nUser-Agent: libdnf\r\nProxy-Connection: Keep-Alive\r\n\r\n", 115, MSG_NOSIGNAL, NULL, 0) = 115 <0.000029>
-------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------

# Here a badly configured proxy leads to a "No route to host"
-------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------
27331 09:09:03.145710 socket(AF_INET, SOCK_STREAM, IPPROTO_TCP) = 31<TCP:[74014]> <0.000856>
27331 09:09:03.146712 connect(31<TCP:[74014]>, {sa_family=AF_INET, sin_port=htons(8080), sin_addr=inet_addr("192.168.122.1")}, 16) = -1 EHOSTUNREACH (No route to host) <0.000363>
27331 09:09:03.147187 close(31<TCP:[74014]>) = 0 <0.000013>
-------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------

# A proxy auth error is difficult to diagnose (customer output shown here):
-------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------
9778  08:32:19.843879 close(8</var/cache/dnf/rhel-8-for-x86_64-appstream-rpms-1899f526e47881cb/tmpdir.qZnK52/repodata/repomd.xml>) = 0 <0.000009>
9778  08:32:19.843926 write(4</var/log/dnf.librepo.log>, "2019-06-25T06:32:19Z DEBUG check_transfer_statuses: Error during transfer: Curl error (56): Failure when receiving data from the peer for https://**************/pulp/repos/*******/Library/content/dist/rhel8/8/x86_64/appstream/os/repodata/repomd.xml [Received HTTP code 407 from proxy after CONNECT]\n", 307) = 307 <0.000013>
-------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------
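
Since the 407 only shows up in dnf.librepo.log, a small scan of that log can surface it without strace. The following is a rough sketch, not existing leapp code: the function name find_proxy_errors is hypothetical, and the pattern is taken only from the librepo message quoted above, so other wordings would be missed.

```python
import re

# Pattern copied from the librepo message in the customer output above.
PROXY_ERROR = re.compile(r"Received HTTP code (\d{3}) from proxy after CONNECT")


def find_proxy_errors(lines):
    """Yield (http_code, line) for each log line reporting a proxy CONNECT failure."""
    for line in lines:
        match = PROXY_ERROR.search(line)
        if match:
            yield int(match.group(1)), line.rstrip("\n")


if __name__ == "__main__":
    with open("/var/log/dnf.librepo.log") as log:
        for code, line in find_proxy_errors(log):
            print("proxy returned HTTP {0}: {1}".format(code, line))
```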

# Here is the faulty code from /usr/share/leapp-repository/repositories/system_upgrade/el7toel8/actors/prepareupgradetransaction/libraries/preparetransaction.py:
-------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------
def connection_guard(url='https://example.com'):
    def closure():
        try:
            urlopen(url)
            return None
        except URLError as e:
            cause = '''Failed to open url '{url}' with error: {error}'''.format(url=url, error=e)
            return ('There was probably a problem with internet conection ({cause}).'
                    ' Check your connection and try again.'.format(cause=cause))
    return closure
-------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------
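
One possible direction, given only as a sketch under assumptions: probe the URLs the upgrade actually needs instead of a hard-coded external site. The code below assumes Python 3 and is not the leapp implementation. The urls argument (for example the configured repository base URLs), the opener parameter, and the timeout are all hypothetical, and how leapp would collect the URLs is not shown.

```python
from urllib.request import urlopen


def connection_guard(urls, opener=urlopen, timeout=10):
    """Build a guard that probes caller-supplied URLs rather than example.com."""
    def closure():
        failures = []
        for url in urls:
            try:
                opener(url, timeout=timeout).close()
            except OSError as e:
                # URLError, HTTPError (e.g. a 407 from a proxy) and socket
                # timeouts are all OSError subclasses.
                failures.append("'{url}': {error}".format(url=url, error=e))
        if not failures:
            return None
        # Report every unreachable URL with its own error text.
        return 'Failed to open: ' + '; '.join(failures)
    return closure
```

Because the guard reports the failing URL together with the underlying error, a proxy error against the customer's own repository host is reported as such, not as a generic internet problem.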

Comment 2 Petr Stodulka 2019-06-26 15:16:44 UTC
I see. The point of that check was to significantly reduce the number of bug reports people send because of crashes whose real cause is the network connection. I guess we will have to come up with a different solution, or print that info without any additional check.

Comment 3 Christophe Besson 2019-07-04 14:29:45 UTC
In the Actor "prepareupgradetransaction", there are at least 3 steps which may call the "guards" (connection_guard, space_guard and, soon, permission_guard):
- get_rhsm_system_release()
- update_rhel_subscription()
- dnf_plugin_rpm_download()

Any non-zero exit code from the underlying commands triggers these "guard" checks. The checks are not unwelcome, but they do not help to find the root cause of a problem. Several commands may return a non-zero exit code (e.g. iptables-service isn't present, which is not a blocking error), so the exit code alone is not sufficient to identify why leapp stops with an undefined error.

-> A customer whose remote repositories are on a Satellite server can't always access an external site, so checking "example.com" isn't good.
-> This customer doesn't need a proxy to reach his repos, but he configured one anyway. He configured it incorrectly and didn't see that this led to a 407 Proxy Auth Error. Only an strace shows that.
-> Once the proxy issue was resolved, there was still a problem, with the same error message (can't access example.com, please check the internet connection). The second problem was an incomplete repomd.xml: dependencies were missing because of a sync problem on the customer's Satellite server.

To help with debugging, it could be a good thing to copy the following logs into /var/log/leapp/dnf-debugdata:
/var/log/rhsm/rhsm.log
/var/log/dnf.log
/var/log/dnf.librepo.log
/var/log/dnf.rpm.log
/var/log/hawkey.log

Indeed, what happens isn't fully logged in a persistent manner, since these files are removed just before unmounting the overlay.
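
A minimal sketch of that idea, assuming Python 3 and assuming the logs sit under the overlay's root at the paths listed above. The function name preserve_logs and the overlay_root argument are hypothetical, and where leapp would call this (before the cleanup and unmount) is not shown.

```python
import os
import shutil

# Log paths from the list above, relative to the root of the overlay.
LOGS = [
    "var/log/rhsm/rhsm.log",
    "var/log/dnf.log",
    "var/log/dnf.librepo.log",
    "var/log/dnf.rpm.log",
    "var/log/hawkey.log",
]


def preserve_logs(overlay_root, dest_dir="/var/log/leapp/dnf-debugdata", logs=LOGS):
    """Copy the logs that exist under overlay_root into dest_dir; return the copies."""
    os.makedirs(dest_dir, exist_ok=True)
    copied = []
    for rel in logs:
        src = os.path.join(overlay_root, rel)
        if os.path.isfile(src):
            # Flatten the relative path into the file name to avoid collisions.
            dst = os.path.join(dest_dir, rel.replace("/", "_"))
            shutil.copy2(src, dst)
            copied.append(dst)
    return copied
```

Missing files are skipped silently, so the copy step itself cannot become another source of non-zero exit codes.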

Comment 13 RHEL Program Management 2023-09-12 11:02:06 UTC
Issue migration from Bugzilla to Jira is in process at this time. This will be the last message in Jira copied from the Bugzilla bug.

Comment 14 RHEL Program Management 2023-09-12 11:02:33 UTC
This BZ has been automatically migrated to the issues.redhat.com Red Hat Issue Tracker. All future work related to this report will be managed there.
