Bug 2144465 - Early subscription using "rhsm" directive fails due to network not being operational
Keywords:
Status: NEW
Alias: None
Product: Red Hat Enterprise Linux 9
Classification: Red Hat
Component: anaconda
Version: 9.1
Hardware: All
OS: Linux
Priority: high
Severity: high
Target Milestone: rc
Target Release: ---
Assignee: Anaconda Maintenance Team
QA Contact: Release Test Team
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2022-11-21 11:03 UTC by Renaud Métrich
Modified: 2023-08-17 10:46 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:
Type: Bug
Target Upstream Version:
Embargoed:


Links
System: Red Hat Issue Tracker
ID: RHELPLAN-140012
Private: 0
Priority: None
Status: None
Summary: None
Last Updated: 2022-11-21 11:13:28 UTC

Description Renaud Métrich 2022-11-21 11:03:46 UTC
Description of problem:

We have a customer installing a physical system configured with bonding + LACP, using a kickstart that specifies the "rhsm" directive.

We can see that "rhsm" fails with "Name or service not known" because the network is not yet fully operational, even though the NetworkManager task has finished.

This appears to be caused by LACP negotiation taking time to complete, with packets being dropped until the bond is really ready.

We are sure the issue is network-related: using the following %pre script, we confirmed that name resolution was not working at that point (we could equally have tested with an IP address):
-------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------
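%pre
# Start a background unit that pings a host name once per second, so the
# journal shows when name resolution starts working.
systemd-run -u ping.service /bin/sh -c "while :; do ping -c 1 www.google.com; sleep 1; done"
%end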
-------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------

- ping service output:

    -------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------
    Nov 21 09:55:03 system.hostname systemd[1]: Started /bin/sh -c while :; do ping -c 1 www.google.com; sleep 1; done.
    Nov 21 09:55:03 system.hostname sh[2433]: ping: www.google.com: Name or service not known
    Nov 21 09:55:04 system.hostname sh[2441]: ping: www.google.com: Name or service not known
    Nov 21 09:55:10 system.hostname sh[2506]: ping: www.google.com: Name or service not known
    Nov 21 09:55:11 system.hostname sh[2867]: ping: www.google.com: Name or service not known
    Nov 21 09:55:12 system.hostname sh[2872]: ping: www.google.com: Name or service not known
    Nov 21 09:55:18 system.hostname sh[2877]: PING www.google.com (216.58.215.228) 56(84) bytes of data.
    Nov 21 09:55:18 system.hostname sh[2877]: --- www.google.com ping statistics ---
    Nov 21 09:55:19 system.hostname sh[2887]: PING www.google.com (172.217.168.4) 56(84) bytes of data.
    Nov 21 09:55:19 system.hostname sh[2887]: --- www.google.com ping statistics ---
    -------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------

- Network Manager configuring the bond:

    -------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------
    Nov 21 09:55:05 system.hostname NetworkManager[2143]: <debug> [1669024505.4533] device[bf104700e7268d2a] (bond0): slave ens3f1 state change 90 (secondaries) -> 100 (activated)
    [...]
    Nov 21 09:55:05 system.hostname nm-dispatcher[2444]: req:14 'up' [bond0]: start running ordered scripts...
    Nov 21 09:55:05 system.hostname nm-dispatcher[2444]: req:14 'up' [bond0], "/usr/lib/NetworkManager/dispatcher.d/04-iscsi": run script
    Nov 21 09:55:05 system.hostname nm-dispatcher[2444]: req:14 'up' [bond0], "/usr/lib/NetworkManager/dispatcher.d/04-iscsi": complete
    Nov 21 09:55:05 system.hostname nm-dispatcher[2444]: req:14 'up' [bond0], "/usr/lib/NetworkManager/dispatcher.d/20-chrony-dhcp": run script
    Nov 21 09:55:05 system.hostname nm-dispatcher[2444]: req:14 'up' [bond0], "/usr/lib/NetworkManager/dispatcher.d/20-chrony-dhcp": complete
    Nov 21 09:55:05 system.hostname nm-dispatcher[2444]: req:14 'up' [bond0], "/usr/lib/NetworkManager/dispatcher.d/20-chrony-onoffline": run script
    Nov 21 09:55:05 system.hostname nm-dispatcher[2444]: req:14 'up' [bond0], "/usr/lib/NetworkManager/dispatcher.d/20-chrony-onoffline": complete
    Nov 21 09:55:05 system.hostname nm-dispatcher[2444]: req:14 'up' [bond0]: completed (3 scripts)
    Nov 21 09:55:06 system.hostname anaconda[2234]: anaconda: network: Apply kickstart result: ['bond0']
    Nov 21 09:55:06 system.hostname org.fedoraproject.Anaconda.Modules.Network[2311]: DEBUG:anaconda.modules.network.network:/etc/NetworkManager/system-connections/bond0.nmconnection:
    Nov 21 09:55:06 system.hostname org.fedoraproject.Anaconda.Modules.Network[2311]: DEBUG:anaconda.modules.network.network:id=bond0
    Nov 21 09:55:06 system.hostname org.fedoraproject.Anaconda.Modules.Network[2311]: DEBUG:anaconda.modules.network.network:interface-name=bond0
    Nov 21 09:55:06 system.hostname org.fedoraproject.Anaconda.Modules.Network[2311]: DEBUG:anaconda.modules.network.network:{'connection': {'autoconnect-retries': <1>, 'id': <'bond0'>, 'interface-name': <'bond0'>, 'multi-connect': <1>, 'permissions': <@as []>, 'timestamp': <uint64 1669024478>, 'type': <'bond'>, 'uuid': <'0aed94b9-b30d-4366-b144-3d9b797fb2c3'>}, '802-3-ethernet': {'auto-negotiate': <false>, 'mac-address-blacklist': <@as []>, 'mtu': <uint32 1500>, 's390-options': <@a{ss} {}>}, 'bond': {'interface-name': <'bond0'>, 'options': <{'lacp_rate': '1', 'miimon': '100', 'mode': '802.3ad', 'xmit_hash_policy': 'layer2+3'}>}, 'ipv4': {'address-data': <[{'address': <'XXX'>, 'prefix': <uint32 xx>}]>, 'dns': <[uint32 xxx, xxx]>, 'dns-search': <@as []>, 'gateway': <'XXX'>, 'method': <'manual'>, 'route-data': <@aa{sv} []>}, 'ipv6': {'addr-gen-mode': <0>, 'address-data': <@aa{sv} []>, 'dns-search': <@as []>, 'method': <'ignore'>, 'route-data': <@aa{sv} []>}, 'proxy': {}, 'user': {'data': <{'org.freedesktop.NetworkManager.origin': 'nm-initrd-generator'}>}}
    -------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------

- RHSM executing:

    -------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------
    Nov 21 09:55:08 system.hostname org.fedoraproject.Anaconda.Modules.Subscription[2305]: DEBUG:dasbus.connection:Publishing an object at /org/fedoraproject/Anaconda/Modules/Subscription/Task/2.
    Nov 21 09:55:08 system.hostname org.fedoraproject.Anaconda.Modules.Subscription[2305]: INFO:anaconda.threading:Running Thread: AnaTaskThread-RegisterAndSubscribeTask-1 (140092470896192)
    Nov 21 09:55:08 system.hostname org.fedoraproject.Anaconda.Modules.Subscription[2305]: DEBUG:anaconda.modules.subscription.runtime:registration attempt: provisioning system for Satellite
    Nov 21 09:55:08 system.hostname org.fedoraproject.Anaconda.Modules.Subscription[2305]: DEBUG:anaconda.modules.subscription.runtime:registration attempt: downloading Satellite provisioning script
    Nov 21 09:55:08 system.hostname org.fedoraproject.Anaconda.Modules.Subscription[2305]: DEBUG:anaconda.modules.subscription.runtime:subscription: downloading Satellite provisioning script
    Nov 21 09:55:08 system.hostname org.fedoraproject.Anaconda.Modules.Subscription[2305]: DEBUG:anaconda.modules.subscription.satellite:subscription: fetching Satellite provisioning script from: http://satellite.server/pub/katello-rhsm-consumer
     :
    Nov 21 09:55:08 system.hostname org.fedoraproject.Anaconda.Modules.Subscription[2305]: DEBUG:anaconda.modules.subscription.satellite:subscription: can't download Satellite provisioning script from http://satellite.server/pub/katello-rhsm-consumer with proxy: {}. Error: HTTPConnectionPool(host='satellite.server', port=80): Max retries exceeded with url: /pub/katello-rhsm-consumer (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f69d1f9b550>: Failed to establish a new connection: [Errno -2] Name or service not known'))
    -------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------
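
To make the ordering across the three excerpts above explicit, the timestamps can be compared directly. This is only an illustrative sketch: the helper names `to_seconds` and `seconds_between` are made up for this report and are not part of anaconda, NetworkManager or any other tool mentioned here.

```shell
# Hypothetical helper: convert HH:MM:SS to seconds since midnight.
to_seconds() {
    # Split the argument on ':' (here-document keeps this POSIX sh compatible).
    IFS=: read -r h m s <<EOF
$1
EOF
    # Strip one leading zero so values such as "09" are not read as octal.
    echo $(( ${h#0} * 3600 + ${m#0} * 60 + ${s#0} ))
}

# Hypothetical helper: seconds elapsed from timestamp $1 to timestamp $2
# (both on the same day).
seconds_between() {
    echo $(( $(to_seconds "$2") - $(to_seconds "$1") ))
}

# bond0 slave reported as activated -> RHSM download attempt
seconds_between 09:55:05 09:55:08   # prints 3
# RHSM download attempt -> first successful lookup in the ping output
seconds_between 09:55:08 09:55:18   # prints 10
```

In other words, the registration attempt started about 3 seconds after the bond was reported as activated, and about 10 seconds before the first successful name lookup seen in the ping service output.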

Version-Release number of selected component (if applicable):

anaconda-34.25.0.29-1.el9_0

How reproducible:

Always on the customer's system. We don't have an LACP setup to verify this locally.

Steps to Reproduce:
1. Have a bond with LACP and "slow negotiation"
2. Specify "rhsm" directive in kickstart

Actual results:

rhsm fails

Expected results:

rhsm succeeds
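
Until the installer itself waits for a usable network, a possible stopgap might be a blocking %pre script that polls until the Satellite host name can be reached. The following is a minimal sketch under assumptions that have not been verified on the customer system: that %pre blocks the installer until it returns, and that a successful ping is a good enough readiness signal. The function name `wait_until` is hypothetical.

```shell
# Hypothetical helper: run COMMAND once per second until it succeeds or
# LIMIT seconds have elapsed.
# Usage: wait_until LIMIT COMMAND [ARGS...]
# Returns 0 as soon as COMMAND succeeds, 1 if the limit is reached first.
wait_until() {
    limit="$1"
    shift
    elapsed=0
    until "$@" >/dev/null 2>&1; do
        if [ "$elapsed" -ge "$limit" ]; then
            return 1
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    return 0
}

# Example call inside %pre (assumption: 60 seconds is enough for LACP
# negotiation to finish on the affected hardware):
#   wait_until 60 ping -c 1 -W 1 satellite.server
```

Taking the probe command as an argument keeps the loop independent of how readiness is checked, so the same helper could poll an IP address instead of a host name to separate DNS problems from link problems.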

Comment 4 Jiri Konecny 2022-11-25 10:10:06 UTC
Hi Radku, could you please take a look at this and check whether it is a NetworkManager issue or an RHSM one?

