Bug 1625914 - Installation on AWS m5/c5 instances fails when trying to insert 'xen_netfront'
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Installer
Version: 3.11.0
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: 3.11.z
Assignee: Scott Dodson
QA Contact: Johnny Liu
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2018-09-06 08:38 UTC by Chao Yang
Modified: 2019-05-14 01:58 UTC (History)

Fixed In Version:
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-09-25 15:20:14 UTC
Target Upstream Version:
Embargoed:



Description Chao Yang 2018-09-06 08:38:39 UTC
Description of problem:
Installation on AWS m5/c5 instances fails because systemd-modules-load tries to insert 'xen_netfront'.

Version-Release number of the following components:
openshift v3.11.0-0.28.0
openshift-ansible-3.11.0-0.28.0.git.0.730d4be.el7.noarch.rpm

How reproducible:
Always

Steps to Reproduce:
1. Set the following parameters:
vm_type: m5.large
image: qe-rhel-7-release

Actual results:
TASK [openshift_storage_glusterfs : load kernel modules] ***********************
Thursday 06 September 2018  11:11:10 +0800 (0:00:00.135)       0:02:29.580 **** 
fatal: [ec2-18-232-57-9.compute-1.amazonaws.com]: FAILED! => {"changed": false, "msg": "Unable to start service systemd-modules-load.service: Job for systemd-modules-load.service failed because the control process exited with error code. See \"systemctl status systemd-modules-load.service\" and \"journalctl -xe\" for details.\n"}
fatal: [ec2-52-90-226-187.compute-1.amazonaws.com]: FAILED! => {"changed": false, "msg": "Unable to start service systemd-modules-load.service: Job for systemd-modules-load.service failed because the control process exited with error code. See \"systemctl status systemd-modules-load.service\" and \"journalctl -xe\" for details.\n"}
fatal: [ec2-34-229-212-114.compute-1.amazonaws.com]: FAILED! => {"changed": false, "msg": "Unable to start service systemd-modules-load.service: Job for systemd-modules-load.service failed because the control process exited with error code. See \"systemctl status systemd-modules-load.service\" and \"journalctl -xe\" for details.\n"}


Expected results:
Installation should succeed.

Additional info:
[root@ip-172-18-5-48 ~]# systemctl status systemd-modules-load.service
● systemd-modules-load.service - Load Kernel Modules
   Loaded: loaded (/usr/lib/systemd/system/systemd-modules-load.service; static; vendor preset: disabled)
   Active: failed (Result: exit-code) since Wed 2018-09-05 23:11:10 EDT; 15min ago
     Docs: man:systemd-modules-load.service(8)
           man:modules-load.d(5)
  Process: 7270 ExecStart=/usr/lib/systemd/systemd-modules-load (code=exited, status=1/FAILURE)
 Main PID: 7270 (code=exited, status=1/FAILURE)

Sep 05 23:11:10 ip-172-18-5-48.ec2.internal systemd[1]: Starting Load Kernel Modules...
Sep 05 23:11:10 ip-172-18-5-48.ec2.internal systemd-modules-load[7270]: Inserted module 'dm_thin_pool'
Sep 05 23:11:10 ip-172-18-5-48.ec2.internal systemd-modules-load[7270]: Inserted module 'target_core_user'
Sep 05 23:11:10 ip-172-18-5-48.ec2.internal systemd-modules-load[7270]: Failed to insert 'xen_netfront': No such device
Sep 05 23:11:10 ip-172-18-5-48.ec2.internal systemd[1]: systemd-modules-load.service: main process exited, code=exited, status=1/FAILURE
Sep 05 23:11:10 ip-172-18-5-48.ec2.internal systemd[1]: Failed to start Load Kernel Modules.
Sep 05 23:11:10 ip-172-18-5-48.ec2.internal systemd[1]: Unit systemd-modules-load.service entered failed state.
Sep 05 23:11:10 ip-172-18-5-48.ec2.internal systemd[1]: systemd-modules-load.service failed.

Comment 1 Scott Dodson 2018-09-06 12:31:14 UTC
All openshift-ansible does is write /etc/modules-load.d/glusterfs.conf
with the following contents:

dm_thin_pool
dm_multipath
target_core_user
{% if (groups.glusterfs is defined and inventory_hostname in groups.glusterfs) or (groups.glusterfs_registry is defined and inventory_hostname in groups.glusterfs_registry) %}
dm_snapshot
dm_mirror
{% endif %}

Then we restart `systemd-modules-load.service` when that file changes.

Was anything else done on the host to load the xen_netfront module? Can you gather the contents of /etc/modules-load.d ?
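
One way to answer that question is to search the modules-load.d directories for the module name. The following is only a minimal sketch: find_module_refs is a hypothetical helper name, the default directories are the ones listed in modules-load.d(5), and it assumes the module is written on a line of its own exactly as xen_netfront (a xen-netfront spelling or trailing whitespace would not match).

```shell
# Hypothetical helper: print every modules-load.d file that lists a module.
# Usage: find_module_refs MODULE [DIR...]
find_module_refs() {
    module=$1
    shift
    # With no directory given, fall back to the paths from modules-load.d(5).
    [ "$#" -gt 0 ] || set -- /etc/modules-load.d /run/modules-load.d /usr/lib/modules-load.d
    for dir in "$@"; do
        [ -d "$dir" ] || continue
        # -l: print file names only; -x: whole-line match; -F: literal string.
        grep -lxF -- "$module" "$dir"/*.conf 2>/dev/null
    done
    return 0
}
```

Running `find_module_refs xen_netfront` on the failing host should print the file or files that request the module; no output means none of the searched directories lists it in that exact form.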

Comment 3 Scott Dodson 2018-09-06 13:21:00 UTC
What AMI are you using on this host type?

Comment 5 Lin Liu 2018-09-25 01:23:44 UTC
(In reply to Scott Dodson from comment #1)
> All openshift-ansible is doing is writing out
> /etc/modules-load.d/glusterfs.conf
> with the contents of the following
> 
> dm_thin_pool
> dm_multipath
> target_core_user
> {% if (groups.glusterfs is defined and inventory_hostname in
> groups.glusterfs) or (groups.glusterfs_registry is defined and
> inventory_hostname in groups.glusterfs_registry) %}
> dm_snapshot
> dm_mirror
> 
> Then we restart `systemd-modules-load.service` when that file changes.
> 
> Was anything else done on the host to load the xen_netfront module? Can you
> gather the contents of /etc/modules-load.d ?

Just FYI, m5/c5 instances are VMs based on the KVM hypervisor, so the xen_netfront module no longer needs to be loaded. Please refer to bug 1497392 for more information.

Also, m5/c5 instances require the ENA driver in the kernel; please make sure the driver is included and ENA support is enabled on the AMI or instance.

For more information about EC2 instances, please refer to the following page or contact my team: https://docs.engineering.redhat.com/pages/viewpage.action?spaceKey=XENQE&title=RHEL+on+AWS+EC2

Comment 6 Vitaly Kuznetsov 2018-09-25 07:01:17 UTC
Yes, c5/m5 and upcoming new instance types are not Xen-based; the two required modules are ena and nvme.
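
A quick way to see whether those two modules are present on a running instance is to look for their entries under /sys/module. This is a rough sketch only: missing_modules is a hypothetical helper name, and it assumes the drivers are built as loadable modules (a driver compiled into the kernel does not always get a /sys/module entry, so a "missing" result is a hint, not proof).

```shell
# Hypothetical helper: print each named module that has no directory
# under the given sysfs module directory.
# Usage: missing_modules SYSFS_MODULE_DIR MODULE...
missing_modules() {
    sysdir=$1
    shift
    for mod in "$@"; do
        [ -d "$sysdir/$mod" ] || printf '%s\n' "$mod"
    done
}
```

On an m5/c5 instance, `missing_modules /sys/module ena nvme` printing nothing would suggest both drivers are loaded; any name it prints is worth checking further.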

Comment 7 Scott Dodson 2018-09-25 14:14:21 UTC
Vitaly, Lin,

So, the solution here is "Use a more recent AMI" ?

Comment 8 Vitaly Kuznetsov 2018-09-25 14:49:59 UTC
(In reply to Scott Dodson from comment #7)
> Vitaly, Lin,
> 
> So, the solution here is "Use a more recent AMI" ?

Yes,

I just double-checked the RHEL 7.5 AMI ami-680d9010 (us-west-2), and there is no xen_netfront in /etc/modprobe.d (it was there before BZ#1497392).

Comment 9 Scott Dodson 2018-09-25 15:20:14 UTC
Vitaly,

Thanks!

CLOSED NOTABUG: the AMI in use is not compatible with the machine type specified. Newer AMI versions have been updated to work correctly on the new machine types; please update the AMI.

Comment 10 Dani Munne 2018-11-08 16:02:11 UTC
Another workaround is to prevent the module from being loaded, since it cannot be inserted with the RHEL 7.4 AMI on instances that are no longer Xen-based (as mentioned in previous comments).

Steps to solve the issue:

 - echo "blacklist xen_netfront" >> /etc/modprobe.d/local-blacklist.conf
 - echo "install xen_netfront /bin/false" >> /etc/modprobe.d/local-blacklist.conf

More info: https://access.redhat.com/solutions/41278
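
Because the two echo commands above append on every run, repeating them (for example from a provisioning script) duplicates the lines. A possible idempotent variant is sketched below; blacklist_module is a hypothetical helper name, and the default path is simply the file used in the steps above.

```shell
# Hypothetical helper: add the blacklist and install lines for a module
# to a modprobe.d file, skipping any line that is already there.
# Usage: blacklist_module MODULE [CONF_FILE]
blacklist_module() {
    module=$1
    conf=${2:-/etc/modprobe.d/local-blacklist.conf}
    for line in "blacklist $module" "install $module /bin/false"; do
        # Append only when the exact line is not already in the file.
        grep -qxF -- "$line" "$conf" 2>/dev/null || printf '%s\n' "$line" >> "$conf"
    done
}
```

As root, `blacklist_module xen_netfront` would write the same two lines as the manual steps. Restarting systemd-modules-load.service afterwards (or re-running the installer, which restarts it when glusterfs.conf changes) should show whether the failure is gone.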

