Bug 1866738
Field | Value
---|---
Summary | [OSP] OCP failed to install with rhcos-46.82.202008030340-0
Product | OpenShift Container Platform
Component | RHCOS
Status | CLOSED ERRATA
Severity | urgent
Priority | urgent
Version | 4.6
Target Milestone | ---
Target Release | 4.6.0
Hardware | Unspecified
OS | Unspecified
Whiteboard | Telco
Reporter | weiwei jiang <wjiang>
Assignee | Benjamin Gilbert <bgilbert>
QA Contact | Michael Nguyen <mnguyen>
Docs Contact |
CC | achernet, bbreard, bgilbert, dsanzmor, gfontana, imcleod, jialiu, jligon, miabbott, nstielau, rpecora, smaitra
Keywords | TestBlocker, TestBlockerForLayeredProduct
Fixed In Version |
Doc Type | If docs needed, set a value
Doc Text |
Story Points | ---
Clone Of |
Environment |
Last Closed | 2020-10-27 16:25:22 UTC
Type | Bug
Regression | ---
Mount Type | ---
Documentation | ---
CRM |
Verified Versions |
Category | ---
oVirt Team | ---
RHEL 7.3 requirements from Atomic Host |
Cloudforms Team | ---
Target Upstream Version |
Embargoed |
Attachments |
Description
weiwei jiang
2020-08-06 08:51:12 UTC
Created attachment 1710627 [details]
rhcos boot log
Also failing on baremetal. Last lines of console are (entire log attached):

[ 235.444464] EDAC MC0: Giving out device to module amd64_edac controller F17h_M30h: DEV 0000:00:18.3 )
[ 235.444476] EDAC PCI0: Giving out device to module amd64_edac controller EDAC PCI controller: DEV 00)
[ 235.444476] AMD64 EDAC driver v3.5.0
[ 235.445249] Console: switching to colour dummy device 80x25
[ 235.453002] ipmi_si IPI0001:00: The BMC does not support setting the recv irq bit, compensating, but.
[ 235.459752] [TTM] Zone kernel: Available graphics memory: 32768880 KiB
[ 235.484067] ipmi_si IPI0001:00: Using irq 10
[ 235.487179] [TTM] Zone dma32: Available graphics memory: 2097152 KiB
[ 235.487180] [TTM] Initializing pool allocator
[ 235.518043] ipmi_si IPI0001:00: Found new BMC (man_id: 0x0002a2, prod_id: 0x0100, dev_id: 0x20)
[ 235.520634] [TTM] Initializing DMA pool allocator
[ 235.595246] ipmi_si IPI0001:00: IPMI kcs interface initialized
[ 235.602624] fbcon: mgag200drmfb (fb0) is primary device
[ 235.602665] Console: switching to colour frame buffer device 128x48
[ 235.604194] ipmi_ssif: IPMI SSIF Interface driver
[ 236.003995] mgag200 0000:c2:00.0: fb0: mgag200drmfb frame buffer device
[ 236.012013] [drm] Initialized mgag200 1.0.0 20110418 for 0000:c2:00.0 on minor 0
[ 236.167040] i40e: Registered client i40iw
[ 236.296748] Rounding down aligned max_sectors from 4294967295 to 4294967288
[ 236.303896] db_root: cannot open: /etc/target
[ 236.324538] iscsi: registered transport (iser)
[ 236.398804] RPC: Registered named UNIX socket transport module.
[ 236.398806] RPC: Registered udp transport module.
[ 236.398807] RPC: Registered tcp transport module.
[ 236.398808] RPC: Registered tcp NFSv4.1 backchannel transport module.
[ 236.449055] RPC: Registered rdma transport module.
[ 236.449056] RPC: Registered rdma backchannel transport module.
Cannot open access to console, the root account is locked.
See sulogin(8) man page for more details.
Press Enter to continue.
[ 738.522839] kauditd_printk_skb: 235 callbacks suppressed
[ 738.522840] audit: type=1400 audit(1596712624.001:247): avc: denied { unlink } for pid=2044 comm=0
Cannot open access to console, the root account is locked.
See sulogin(8) man page for more details.
Press Enter to continue.

I cannot ssh to the server; the connection is refused.

Created attachment 1710642 [details]
rhcos boot log on baremetal
In the bare metal case, the bootstrap.log has its messages truncated, which makes debugging a bit difficult. However, the big thing standing out is the SELinux denials all over the place. There are some ACPI errors in there, but I am not sure if they are fatal.

@dsanzmor Is it possible to get the bootstrap.log without the lines truncated? The bare metal failure may be better tracked in a separate BZ.

For the OSP case, the attached log shows that the node appears to have booted successfully with an IP and a login prompt. The package diff between the two versions, 46.82.202007212240-0 and 46.82.202008030340-0, shows a number of changes, but without additional information it is hard to understand what is failing.

@wjiang can you provide any other information to help the investigation?

```
$ ./differ.py -fe api.ci -fr rhcos-4.6/46.82.202007212240-0 -se api.ci -sr rhcos-4.6/46.82.202008030340-0
{
  "sources": {
    "rhcos-4.6/46.82.202007212240-0": "https://releases-art-rhcos.svc.ci.openshift.org/art/storage/releases/rhcos-4.6/46.82.202007212240-0/x86_64/commitmeta.json",
    "rhcos-4.6/46.82.202008030340-0": "https://releases-art-rhcos.svc.ci.openshift.org/art/storage/releases/rhcos-4.6/46.82.202008030340-0/x86_64/commitmeta.json"
  },
  "diff": {
    "NetworkManager": {
      "rhcos-4.6/46.82.202007212240-0": "NetworkManager-1.22.8-5.el8_2.x86_64",
      "rhcos-4.6/46.82.202008030340-0": "NetworkManager-1.22.8-6.el8_2.x86_64"
    },
    "NetworkManager-libnm": {
      "rhcos-4.6/46.82.202007212240-0": "NetworkManager-libnm-1.22.8-5.el8_2.x86_64",
      "rhcos-4.6/46.82.202008030340-0": "NetworkManager-libnm-1.22.8-6.el8_2.x86_64"
    },
    "NetworkManager-ovs": {
      "rhcos-4.6/46.82.202007212240-0": "NetworkManager-ovs-1.22.8-5.el8_2.x86_64",
      "rhcos-4.6/46.82.202008030340-0": "NetworkManager-ovs-1.22.8-6.el8_2.x86_64"
    },
    "NetworkManager-team": {
      "rhcos-4.6/46.82.202007212240-0": "NetworkManager-team-1.22.8-5.el8_2.x86_64",
      "rhcos-4.6/46.82.202008030340-0": "NetworkManager-team-1.22.8-6.el8_2.x86_64"
    },
    "NetworkManager-tui": {
      "rhcos-4.6/46.82.202007212240-0": "NetworkManager-tui-1.22.8-5.el8_2.x86_64",
      "rhcos-4.6/46.82.202008030340-0": "NetworkManager-tui-1.22.8-6.el8_2.x86_64"
    },
    "conmon": {
      "rhcos-4.6/46.82.202007212240-0": "conmon-2.0.17-1.rhaos4.5.el8.x86_64",
      "rhcos-4.6/46.82.202008030340-0": "conmon-2.0.20-1.rhaos4.6.el8.x86_64"
    },
    "containers-common": {
      "rhcos-4.6/46.82.202007212240-0": "containers-common-1.0.0-1.module+el8.2.1+6676+604e1b26.x86_64",
      "rhcos-4.6/46.82.202008030340-0": "containers-common-1.1.1-2.rhaos4.6.el8.x86_64"
    },
    "coreos-installer": {
      "rhcos-4.6/46.82.202007212240-0": "coreos-installer-0.2.0-4.rhaos4.6.el8.x86_64",
      "rhcos-4.6/46.82.202008030340-0": "coreos-installer-0.5.0-1.rhaos4.6.el8.x86_64"
    },
    "coreos-installer-systemd": {
      "rhcos-4.6/46.82.202007212240-0": "coreos-installer-systemd-0.2.0-4.rhaos4.6.el8.x86_64",
      "rhcos-4.6/46.82.202008030340-0": "Not present"
    },
    "cri-o": {
      "rhcos-4.6/46.82.202007212240-0": "cri-o-1.19.0-41.rhaos4.6.git988f60e.el8.x86_64",
      "rhcos-4.6/46.82.202008030340-0": "cri-o-1.19.0-61.rhaos4.6.git79c1228.el8.x86_64"
    },
    "grub2-common": {
      "rhcos-4.6/46.82.202007212240-0": "grub2-common-2.02-82.el8_2.1.noarch",
      "rhcos-4.6/46.82.202008030340-0": "grub2-common-2.02-87.el8_2.noarch"
    },
    "grub2-efi-x64": {
      "rhcos-4.6/46.82.202007212240-0": "grub2-efi-x64-2.02-82.el8_2.1.x86_64",
      "rhcos-4.6/46.82.202008030340-0": "grub2-efi-x64-2.02-87.el8_2.x86_64"
    },
    "grub2-pc": {
      "rhcos-4.6/46.82.202007212240-0": "grub2-pc-2.02-82.el8_2.1.x86_64",
      "rhcos-4.6/46.82.202008030340-0": "grub2-pc-2.02-87.el8_2.x86_64"
    },
    "grub2-pc-modules": {
      "rhcos-4.6/46.82.202007212240-0": "grub2-pc-modules-2.02-82.el8_2.1.noarch",
      "rhcos-4.6/46.82.202008030340-0": "grub2-pc-modules-2.02-87.el8_2.noarch"
    },
    "grub2-tools": {
      "rhcos-4.6/46.82.202007212240-0": "grub2-tools-2.02-82.el8_2.1.x86_64",
      "rhcos-4.6/46.82.202008030340-0": "grub2-tools-2.02-87.el8_2.x86_64"
    },
    "grub2-tools-extra": {
      "rhcos-4.6/46.82.202007212240-0": "grub2-tools-extra-2.02-82.el8_2.1.x86_64",
      "rhcos-4.6/46.82.202008030340-0": "grub2-tools-extra-2.02-87.el8_2.x86_64"
    },
    "grub2-tools-minimal": {
      "rhcos-4.6/46.82.202007212240-0": "grub2-tools-minimal-2.02-82.el8_2.1.x86_64",
      "rhcos-4.6/46.82.202008030340-0": "grub2-tools-minimal-2.02-87.el8_2.x86_64"
    },
    "ignition": {
      "rhcos-4.6/46.82.202007212240-0": "ignition-2.3.0-1.rhaos4.6.gitee616d5.el8.x86_64",
      "rhcos-4.6/46.82.202008030340-0": "ignition-2.5.0-1.rhaos4.6.git0d6f3e5.el8.x86_64"
    },
    "openshift-clients": {
      "rhcos-4.6/46.82.202007212240-0": "openshift-clients-4.6.0-202007212120.p0.git.3658.e2f0cb0.el8.x86_64",
      "rhcos-4.6/46.82.202008030340-0": "openshift-clients-4.6.0-202008011451.p0.git.3685.3939f2f.el8.x86_64"
    },
    "openshift-hyperkube": {
      "rhcos-4.6/46.82.202007212240-0": "openshift-hyperkube-4.6.0-202007110420.p1.git.0.4de1d1d.el8.x86_64",
      "rhcos-4.6/46.82.202008030340-0": "openshift-hyperkube-4.6.0-202008011154.p0.git.93402.577b186.el8.x86_64"
    },
    "openvswitch2.13": {
      "rhcos-4.6/46.82.202007212240-0": "openvswitch2.13-2.13.0-39.el8fdp.x86_64",
      "rhcos-4.6/46.82.202008030340-0": "openvswitch2.13-2.13.0-49.el8fdp.x86_64"
    },
    "shim-x64": {
      "rhcos-4.6/46.82.202007212240-0": "shim-x64-15-11.x86_64",
      "rhcos-4.6/46.82.202008030340-0": "shim-x64-15-15.el8_2.x86_64"
    },
    "skopeo": {
      "rhcos-4.6/46.82.202007212240-0": "skopeo-1.0.0-1.module+el8.2.1+6676+604e1b26.x86_64",
      "rhcos-4.6/46.82.202008030340-0": "skopeo-1.1.1-2.rhaos4.6.el8.x86_64"
    },
    "toolbox": {
      "rhcos-4.6/46.82.202007212240-0": "toolbox-0.0.7-1.rhaos4.5.el8.noarch",
      "rhcos-4.6/46.82.202008030340-0": "toolbox-0.0.8-1.rhaos4.6.el8.noarch"
    },
    "coreos-installer-bootinfra": {
      "rhcos-4.6/46.82.202007212240-0": "Not present",
      "rhcos-4.6/46.82.202008030340-0": "coreos-installer-bootinfra-0.5.0-1.rhaos4.6.el8.x86_64"
    },
    "openssl-pkcs11": {
      "rhcos-4.6/46.82.202007212240-0": "Not present",
      "rhcos-4.6/46.82.202008030340-0": "openssl-pkcs11-0.4.10-2.el8.x86_64"
    }
  }
}
```

I tried again with the same result: even though the boot log shows a login prompt, I cannot ssh
into the server. Is there anything I can try in order to collect more details?

# openstack server list --name wj46
+--------------------------------------+-----------------------------+--------+--------------------------------------------------------+-------------------------+-----------+
| ID                                   | Name                        | Status | Networks                                               | Image                   | Flavor    |
+--------------------------------------+-----------------------------+--------+--------------------------------------------------------+-------------------------+-----------+
| 641fd4bd-a4e3-4497-a7f1-595a6d787390 | wj46ios807a-q8ht2-bootstrap | ACTIVE | wj46ios807a-q8ht2-openshift=192.168.0.211, 10.0.103.46 | wj46ios807a-q8ht2-rhcos | m1.xlarge |
| 453e0f5e-5f00-4f49-bfce-57bbf7c9f8f9 | wj46ios807a-q8ht2-master-2  | ACTIVE | wj46ios807a-q8ht2-openshift=192.168.1.2                | wj46ios807a-q8ht2-rhcos | m1.xlarge |
| e5fb6bfc-79cb-4e27-b90a-91a10d7fa002 | wj46ios807a-q8ht2-master-1  | ACTIVE | wj46ios807a-q8ht2-openshift=192.168.2.220              | wj46ios807a-q8ht2-rhcos | m1.xlarge |
| 6f4fdc91-71a7-4293-bc85-656891def5dd | wj46ios807a-q8ht2-master-0  | ACTIVE | wj46ios807a-q8ht2-openshift=192.168.3.150              | wj46ios807a-q8ht2-rhcos | m1.xlarge |
+--------------------------------------+-----------------------------+--------+--------------------------------------------------------+-------------------------+-----------+

$ ssh -i ~/.ssh/openshift-qe.pem core@10.0.103.46 -v
OpenSSH_8.1p1, OpenSSL 1.1.1g FIPS 21 Apr 2020
debug1: Reading configuration data /home/wjiang/.ssh/config
debug1: /home/wjiang/.ssh/config line 12: Applying options for 10.*
debug1: Reading configuration data /etc/ssh/ssh_config
debug1: Reading configuration data /etc/ssh/ssh_config.d/05-redhat.conf
debug1: Reading configuration data /etc/crypto-policies/back-ends/openssh.config
debug1: /etc/ssh/ssh_config.d/05-redhat.conf line 8: Applying options for *
debug1: Connecting to 10.0.103.46 [10.0.103.46] port 22.
debug1: fd 4 clearing O_NONBLOCK
debug1: Connection established.
debug1: identity file /home/wjiang/.ssh/openshift-qe.pem type -1
debug1: identity file /home/wjiang/.ssh/openshift-qe.pem-cert type -1
debug1: identity file /home/wjiang/.ssh/libra-new.pem type -1
debug1: identity file /home/wjiang/.ssh/libra-new.pem-cert type -1
debug1: Local version string SSH-2.0-OpenSSH_8.1
debug1: Remote protocol version 2.0, remote software version OpenSSH_8.0
debug1: match: OpenSSH_8.0 pat OpenSSH* compat 0x04000000
debug1: Authenticating to 10.0.103.46:22 as 'core'
debug1: SSH2_MSG_KEXINIT sent
debug1: SSH2_MSG_KEXINIT received
debug1: kex: algorithm: curve25519-sha256
debug1: kex: host key algorithm: ecdsa-sha2-nistp256
debug1: kex: server->client cipher: aes256-gcm MAC: <implicit> compression: none
debug1: kex: client->server cipher: aes256-gcm MAC: <implicit> compression: none
debug1: kex: curve25519-sha256 need=32 dh_need=32
debug1: kex: curve25519-sha256 need=32 dh_need=32
debug1: expecting SSH2_MSG_KEX_ECDH_REPLY
debug1: Server host key: ecdsa-sha2-nistp256 SHA256:RDYSsCy6DvPzso9rZrUgjjdB7RoCClCf/1CARjQXkS4
Warning: Permanently added '10.0.103.46' (ECDSA) to the list of known hosts.
debug1: rekey out after 4294967296 blocks
debug1: SSH2_MSG_NEWKEYS sent
debug1: expecting SSH2_MSG_NEWKEYS
debug1: SSH2_MSG_NEWKEYS received
debug1: rekey in after 4294967296 blocks
debug1: Will attempt key: /home/wjiang/.ssh/openshift-qe.pem explicit
debug1: Will attempt key: /home/wjiang/.ssh/libra-new.pem explicit
debug1: SSH2_MSG_EXT_INFO received
debug1: kex_input_ext_info: server-sig-algs=<ssh-ed25519,ssh-rsa,rsa-sha2-256,rsa-sha2-512,ssh-dss,ecdsa-sha2-nistp256,ecdsa-sha2-nistp384,ecdsa-sha2-nistp521>
debug1: SSH2_MSG_SERVICE_ACCEPT received
debug1: Authentications that can continue: publickey,gssapi-keyex,gssapi-with-mic
debug1: Next authentication method: publickey
debug1: Trying private key: /home/wjiang/.ssh/openshift-qe.pem
debug1: Authentications that can continue: publickey,gssapi-keyex,gssapi-with-mic
debug1: Trying private key: /home/wjiang/.ssh/libra-new.pem
debug1: Authentications that can continue: publickey,gssapi-keyex,gssapi-with-mic
debug1: No more authentication methods to try.
core@10.0.103.46: Permission denied (publickey,gssapi-keyex,gssapi-with-mic).

If the server booted correctly, how can I check whether the Ignition config was injected into the server? The 4.6 console output differs from 4.5:
[ 7.784431] ignition[719]: Ignition 2.5.0
[ 7.787822] ignition[719]: Stage: fetch-offline
[ 7.791479] ignition[719]: fetched base config from "system"
[ 7.795587] ignition[719]: reading system config file "/usr/lib/ignition/base.ign"
[ 7.801315] ignition[719]: no config URL provided
[ 7.804568] ignition[719]: reading system config file "/usr/lib/ignition/user.ign"
[ 7.810299] ignition[719]: no config at "/usr/lib/ignition/user.ign"
[ 7.814545] ignition[719]: failed to fetch config from metadata service: resource requires networking
[ * ] A start job is running for Ignition (fetch-offline) (8s / no limit)
(the same status line repeats roughly twice per second, counting up)
[ * ] A start job is running for Ignition (fetch-offline) (32s / no limit)
[ 37.786130] ignition[719]: neither config drive nor metadata service were available in time. Continuing without a config...
[ 37.795374] ignition[719]: not a config (empty): provider config was empty, continuing with empty cache config
[ 37.801472] ignition[719]: timed out while fetching config from config drive (CONFIG-2)
[ OK ] Started Ignition (fetch-offline).
[ 37.808937] systemd[1]: Started Ignition (fetch-offline).
[ 37.812515] ignition[719]: timed out while fetching config from config drive (config-2)
Starting Copy CoreOS Firstboot Networking Config...
[ 37.819200] systemd[1]: Starting Copy CoreOS Firstboot Networking Config...
Starting Check for FIPS mode...
[ 37.824334] ignition[719]: fetch-offline: fetch-offline passed

Does this mean the Ignition fetch failed?

I am running a baremetal install using the same boot image to launch VMs and hit the same issue. Bootstrap failed, and I cannot even ssh into the bootstrap VMs. This is also blocking QE's baremetal install.
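Console captures like the one quoted above are hard to read because the systemd status spinner and terminal color codes are interleaved with the kernel and Ignition messages. The following is a minimal Python sketch for filtering that noise out. The function name `clean_console_log` and both regular expressions are illustrative assumptions, not part of any RHCOS or Ignition tooling, and the patterns only cover the residue visible in this log.

```python
import re
from typing import List

# Color/erase sequences such as "[0;31m", "[0m" and "[K". The leading ESC
# byte is optional because some captures (like the one in this bug) lose it.
ANSI_RESIDUE = re.compile(r"\x1b?(?:\[[0-9;]+m|\[K)")

# systemd progress line, e.g.
# "[ *  ] A start job is running for Ignition (fetch-offline) (8s / no limit)"
STATUS_LINE = re.compile(
    r"\[[ *]*\]\s*A start job is running for .*?\([^)]*/ no limit\)"
)


def clean_console_log(text: str) -> List[str]:
    """Return the non-empty log lines with spinner and color residue removed."""
    text = ANSI_RESIDUE.sub("", text)
    text = STATUS_LINE.sub("", text)
    return [line.strip() for line in text.splitlines() if line.strip()]
```

Applied to a saved console log, this would leave the `ignition[...]` and `systemd[...]` lines in order, which makes the roughly 30-second gap between the first fetch failure and the timeout easier to spot.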
See https://bugzilla.redhat.com/show_bug.cgi?id=1867091 for the bare metal issue.

For the OSP case, it looks like the Ignition fetch failed:

```
[ 37.786130] ignition[719]: neither config drive nor metadata service were available in time. Continuing without a config...
[ 37.795374] ignition[719]: not a config (empty): provider config was empty, continuing with empty cache config
[ 37.801472] ignition[719]: timed out while fetching config from config drive (CONFIG-2)
[ OK ] Started Ignition (fetch-offline).
[ 37.808937] systemd[1]: Started Ignition (fetch-offline).
[ 37.812515] ignition[719]: timed out while fetching config from config drive (config-2)
```

Earlier it looks like networking isn't available:

`[ 7.814545] ignition[719]: failed to fetch config from metadata service: resource requires networking`

@slowrie can you investigate this further?

This is https://github.com/coreos/ignition/issues/1056, which is fixed upstream and needs a backport to RHCOS.

Repro: launch an RHCOS instance in an OpenStack cluster that exposes userdata via the OpenStack metadata service. If the Ignition config is not applied, and you see

failed to fetch config from metadata service: resource requires networking
...
neither config drive nor metadata service were available in time. Continuing without a config...

...you have the bug.

Additional details:
- the linked fix was included in upstream Ignition 2.6.0
- a downstream package of that version was made: `ignition-2.6.0-1.rhaos4.6.git947598e.el8`

Checked OCP on OSP, both IPI and UPI, with rhcos-46.82.202008080704-0 and ignition-2.6.0-1.rhaos4.6.git947598e.el8.

Also checked IPI on Baremetal with OpenStack; it works well now.

If everything is okay, please bump the RHCOS version in data/data/rhcos.json so that QE can verify this bug.
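As a rough aid for the repro check described above (both Ignition messages showing up in the console log), here is a minimal Python sketch. The function name `has_ignition_metadata_bug` is hypothetical; the two message strings are copied from the logs in this report, and the sketch assumes the console log has already been saved as text, for example with `openstack console log show <server>`.

```python
# Signature messages quoted in this bug report. The kernel timestamp and the
# "ignition[PID]:" prefix vary per boot, so only the message text is matched.
NEEDS_NETWORK = (
    "failed to fetch config from metadata service: resource requires networking"
)
NO_CONFIG = "neither config drive nor metadata service were available in time"


def has_ignition_metadata_bug(console_log: str) -> bool:
    """Return True if both signature messages appear, in that order."""
    first = console_log.find(NEEDS_NETWORK)
    if first == -1:
        return False
    # The timeout message must come after the "requires networking" failure.
    return console_log.find(NO_CONFIG, first) != -1
```

A match only indicates that the log has the same signature as this bug; confirming the cause still requires checking that the instance was given userdata through the metadata service and that the Ignition version predates 2.6.0.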
This should be the installer PR - https://github.com/openshift/installer/pull/4036

(In reply to Micah Abbott from comment #18)
> This should be the installer PR -
> https://github.com/openshift/installer/pull/4036

Hi, the RHCOS build in the PR does not contain the fix. Please help update it to an RHCOS version greater than rhcos-46.82.202008080704. Thanks

(In reply to weiwei jiang from comment #16)
> Checked OCP on OSP both IPI and UPI with rhcos-46.82.202008080704-0 and
> ignition-2.6.0-1.rhaos4.6.git947598e.el8
>
> Also checked with IPI on Baremetal with OpenStack, work well now.

Moving to verified according to comment 16; I will open another BZ for the installer RHCOS bindings.

The BZ for installer RHCOS binding - https://bugzilla.redhat.com/show_bug.cgi?id=1867853

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (OpenShift Container Platform 4.6 GA Images), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:4196

This bug seems to be closed, but I hit a baremetal PXE boot issue on IBM Power 9 (ppc64le architecture). After PXE boot started the RHCOS installation (OCP 4.6.13), the installation program got stuck at this message:

[ 15.573440] RPC: Registered rdma backchannel transport module.

Steps to reproduce:
1. The tftpboot package configured and reachable on Red Hat Enterprise Linux 8.
2. GRUB2 PXE menu and grub2.conf on the TFTP server.
3. A Live Image ISO of CoreOS (rhcos-live.ppc64le.iso).
4. An initrd and rootfs image available on the TFTP server.
5. An httpd server hosting the ISO, the images, and the Ignition config file.

Start the VM from the HMC (Hardware Management Console) to run the PXE boot menu. The machine reaches the TFTP server, obtains an IP address leased by DHCP, and starts the GRUB menu with options similar to those below:

###########
##grub2.cfg
default=0
fallback=1
timeout=1
menuentry "Bootstrap CoreOS (BIOS)" {
echo "Loading kernel Bootstrap"
linux "/rhcos-live-kernel-ppc64le" rd.neednet=1 ip=dhcp console=tty0 console=ttyS0 coreos.inst=yes coreos.inst.install_dev=sda coreos.inst.image_url=http://10.19.7.81/rhcos/rhcos-metal.ppc64le.raw.gz coreos.inst.ignition_url=http://10.19.7.81/bootstrap.ign
echo "Loading initrd"
initrd "/rhcos-4.3.18-ppc64le-installer-initramfs.ppc64le.img"
}
##########

The installation starts and then gets stuck at the message described above.

Expected Results: RHCOS installation succeeds
Actual Results: The installation freezes

Some installation error log:

[ 13.138381] systemd[1]: Starting Create Static Device Nodes in /dev...
[ 13.181268] synth uevent: /devices/vio: failed to send uevent
[ 13.181270] vio vio: uevent: failed to send synthetic uevent
[ 13.182784] synth uevent: /devices/vio/4000: failed to send uevent
[ 13.182785] vio 4000: uevent: failed to send synthetic uevent
[ 13.182836] synth uevent: /devices/vio/4001: failed to send uevent
[ 13.182837] vio 4001: uevent: failed to send synthetic uevent
[ 13.182887] synth uevent: /devices/vio/4002: failed to send uevent
[ 13.182888] vio 4002: uevent: failed to send synthetic uevent
[ 13.182938] synth uevent: /devices/vio/4004: failed to send uevent
[ 13.182939] vio 4004: uevent: failed to send synthetic uevent
[ 13.267097] systemd-journald[1224]: Received request to flush runtime journal from PID 1
[ 13.522803] pseries_rng: Registering IBM pSeries RNG driver
[ 15.458025] Rounding down aligned max_sectors from 4294967295 to 4294967168
[ 15.458229] db_root: cannot open: /etc/target
[ 15.476879] iscsi: registered transport (iser)
[ 15.553414] RPC: Registered named UNIX socket transport module.
[ 15.553462] RPC: Registered udp transport module.
[ 15.553487] RPC: Registered tcp transport module.
[ 15.553511] RPC: Registered tcp NFSv4.1 backchannel transport module.
[ 15.573410] RPC: Registered rdma transport module.
[ 15.573440] RPC: Registered rdma backchannel transport module.

That appears to be a different issue from the one reported here. Please open a new bug.