Bug 1819215
| Summary: | Cannot reboot into a tang encrypted disk | ||
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Michael Nguyen <mnguyen> |
| Component: | Documentation | Assignee: | Vikram Goyal <vigoyal> |
| Status: | CLOSED CURRENTRELEASE | QA Contact: | Xiaoli Tian <xtian> |
| Severity: | high | Docs Contact: | Vikram Goyal <vigoyal> |
| Priority: | medium | ||
| Version: | 4.4 | CC: | aos-bugs, bbreard, imcleod, jligon, jokerman, kalexand, miabbott, nstielau, smilner |
| Target Milestone: | --- | ||
| Target Release: | 4.5.0 | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2021-02-09 21:57:53 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
This is a known limitation, see step three in the docs: https://docs.openshift.com/container-platform/4.3/installing/install_config/installing-customizing.html#installation-special-config-encrypt-disk-tang_installing-customizing

Can you verify that you have the `ip=` line?

Sorry, I totally missed that part of the documentation. With `ip=dhcp` only, I don't get networking in the initramfs and coreos-luks-open.service fails. See the emergency shell output below. If I run dhclient inside the emergency shell and then restart coreos-luks-open.service, it boots into RHCOS. In terms of purely kargs, adding `rd.neednet=1` is the only thing that worked for me. I'm testing on libvirt, if that makes any difference.

    Entering emergency mode. Exit the shell to continue.
    Type "journalctl" to view system logs.
    You might want to save "/run/initramfs/rdsosreport.txt" to a USB stick or /boot
    after mounting them and attach it to a bug report.

    :/#
    :/# systemctl --failed
      UNIT                     LOAD   ACTIVE SUB    DESCRIPTION
    ● coreos-luks-open.service loaded failed failed CoreOS LUKS Opener

    LOAD   = Reflects whether the unit definition was properly loaded.
    ACTIVE = The high-level unit activation state, i.e. generalization of SUB.
    SUB    = The low-level unit activation state, values depend on unit type.

    1 loaded units listed. Pass --all to see loaded but inactive units, too.
    To show all installed unit files use 'systemctl list-unit-files'.
    :/# cat /proc/cmdline
    BOOT_IMAGE=(hd0,gpt1)/ostree/rhcos-25bd477a16f8c9f6ef9dbbac1e2ebf96254988bf7bbb8dcf29a393ca13d6523c/vmlinuz-4.18.0-147.5.1.el8_1.x86_64 ip=dhcp rhcos.root=crypt_rootfs console=tty0 console=ttyS0,115200n8 ignition.platform.id=qemu rd.luks.options=discard ostree=/ostree/boot.1/rhcos/25bd477a16f8c9f6ef9dbbac1e2ebf96254988bf7bbb8dcf29a393ca13d6523c/0
    :/# journalctl -b -u coreos-luks-open.service
    :/# journalctl -b -u coreos-luks-open.service --no-pager
    -- Logs begin at Wed 2020-04-01 13:15:22 UTC, end at Wed 2020-04-01 13:15:55 UTC. --
    Apr 01 13:15:24 localhost systemd[1]: Starting CoreOS LUKS Opener...
    Apr 01 13:15:24 localhost coreos-cryptfs[634]: base64: invalid input
    Apr 01 13:15:24 localhost coreos-cryptfs[634]: coreos-cryptfs: /dev/vda4 is configured for Clevis pin 'tang'
    Apr 01 13:15:24 localhost coreos-cryptfs[634]: coreos-cryptfs: Checking for default route.
    Apr 01 13:15:24 localhost coreos-cryptfs[634]: coreos-cryptfs: Waiting for DNS resolver to appear.
    Apr 01 13:15:55 localhost coreos-cryptfs[634]: coreos-cryptfs: failed to find /etc/resolv.conf
    Apr 01 13:15:55 localhost systemd[1]: coreos-luks-open.service: Main process exited, code=exited, status=1/FAILURE
    Apr 01 13:15:55 localhost systemd[1]: coreos-luks-open.service: Failed with result 'exit-code'.
    Apr 01 13:15:55 localhost systemd[1]: Failed to start CoreOS LUKS Opener.
    Apr 01 13:15:55 localhost systemd[1]: coreos-luks-open.service: Triggering OnFailure= dependencies.

(In reply to Michael Nguyen from comment #2)
> In terms of purely kargs, adding `rd.neednet=1` is the only thing that
> worked for me. I'm testing on libvirt if that makes any difference.

Sounds like this might just be a docs issue; we probably want to instruct customers to provide both `ip=` and `rd.neednet=1`?

Mike, could you test this configuration with a static IP via `ip=` and the inclusion of `rd.neednet=1`? I'm curious how they will play together.

For some background, `ip=...` is only activated when `rd.neednet=1` is used. See [1], which makes `ip=...` a no-op without `rd.neednet=1`. We need to get the docs updated.

[1] https://github.com/dracutdevs/dracut/blob/RHEL-8/modules.d/35network-legacy/net-genrules.sh#L3

Chris asked for confirmation in https://github.com/openshift/openshift-docs/pull/21177

This was a Docs fix (see attached PR), so moving to the Docs team to shepherd the BZ along.
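The conclusion of the comments above (both `ip=` and `rd.neednet=1` must be on the kernel command line for the initramfs to bring up networking) can be checked mechanically. The following is a minimal Python sketch, not part of RHCOS or dracut; the function names `parse_kargs` and `initramfs_network_problems` are hypothetical and only encode the rule as stated in this bug.

```python
def parse_kargs(cmdline: str) -> dict:
    """Split a kernel command line into {key: [values]}.

    Arguments without '=' are stored with a value of None.
    Quoted arguments containing spaces are not handled (assumption:
    the command lines of interest here do not use them).
    """
    kargs = {}
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        kargs.setdefault(key, []).append(value if sep else None)
    return kargs


def initramfs_network_problems(cmdline: str) -> list:
    """Return the kargs that this bug report says are required but missing."""
    kargs = parse_kargs(cmdline)
    problems = []
    if "ip" not in kargs:
        problems.append("missing ip=")
    if "1" not in kargs.get("rd.neednet", []):
        problems.append("missing rd.neednet=1")
    return problems


if __name__ == "__main__":
    # On a running system the command line is available in /proc/cmdline.
    with open("/proc/cmdline") as f:
        for problem in initramfs_network_problems(f.read()):
            print(problem)
```

Run against the `/proc/cmdline` shown in the emergency shell output, the check would flag only the missing `rd.neednet=1`, which matches what was observed in this bug.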
Description of problem:
I am able to enable Tang on first boot (and verify that the disk is encrypted), but decrypting the disk fails after rebooting. The first boot has `rd.neednet=1` on the kernel command line, but it is removed after the first boot. If I run `rpm-ostree kargs --append rd.neednet=1` before the reboot, it works. This is happening on rhcos-4.3 and rhcos-4.4.

Version-Release number of selected component (if applicable):
rhcos-43.81.202001142154.0
rhcos-44.81.202003230949-0

How reproducible:
Always

Steps to Reproduce:
1. Enable Tang using an Ignition config with the following snippet

Ignition Snippet
-------------------------
    "storage": {
      "files": [
        {
          "filesystem": "root",
          "path": "/etc/clevis.json",
          "contents": {
            "source": "data:text/plain;base64,<your base64 tang pin>"
          },
          "mode": 420
        }
      ]
    }

Sample base64 tang config
---------------------------
    cat << EOF | base64 -w0
    {
      "url": "http://10.0.2.2",
      "thp": "ABCDEFGHIJKLMNO"
    }
    EOF

2. Boot RHCOS with the Ignition file above
3. Verify that Tang encryption is working after the system is booted: `sudo cryptsetup luksDump /dev/disk/by-partlabel/luks_root`
4. Reboot
5. Verify that the system drops into the emergency shell and never completes booting into RHCOS.

Actual results:
The system never reboots into RHCOS and drops into the emergency shell.

Expected results:
The system reboots into RHCOS.

Additional info:
https://docs.openshift.com/container-platform/4.3/installing/install_config/installing-customizing.html
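For anyone reproducing step 1, the base64 payload and the surrounding Ignition fragment can also be generated together instead of by hand. This is a minimal Python sketch under the assumptions of the description above: the file path, `filesystem`, and `mode` values are copied from the reporter's snippet, the Tang URL and thumbprint are the placeholder values from the sample, and the function names are hypothetical.

```python
import base64
import json


def clevis_data_url(tang_url: str, thumbprint: str) -> str:
    """Encode a Tang pin config as the data URL used in the Ignition snippet."""
    pin = json.dumps({"url": tang_url, "thp": thumbprint})
    encoded = base64.b64encode(pin.encode("utf-8")).decode("ascii")
    return "data:text/plain;base64," + encoded


def ignition_storage_snippet(tang_url: str, thumbprint: str) -> dict:
    """Build the "storage" fragment from the bug description as a dict."""
    return {
        "storage": {
            "files": [
                {
                    "filesystem": "root",
                    "path": "/etc/clevis.json",
                    "contents": {
                        "source": clevis_data_url(tang_url, thumbprint)
                    },
                    # 420 is the decimal form of octal 0644.
                    "mode": 420,
                }
            ]
        }
    }


if __name__ == "__main__":
    # Placeholder values taken from the sample in the description.
    snippet = ignition_storage_snippet("http://10.0.2.2", "ABCDEFGHIJKLMNO")
    print(json.dumps(snippet, indent=2))
```

The printed fragment still has to be merged into a complete Ignition config. It only covers first-boot encryption and does not address the missing `rd.neednet=1` karg on reboot.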