Description of problem:
Booting with the 'image' configuration instead of 'initrd' results in an OCI runtime error:

  % podman run --security-opt label=disable --runtime /usr/bin/kata-runtime -it alpine sh
  Error: Failed to check if grpc server is working: context deadline exceeded: OCI runtime error

Version-Release number of selected component (if applicable):
kata-runtime-1.8.2-4.fc31.x86_64, kata-runtime-1.9.0-1.fc31.x86_64

How reproducible:
Always

Steps to Reproduce:
1. Edit /usr/share/kata-containers/defaults/configuration.toml to replace

  initrd = "/usr/share/kata-containers/kata-containers-initrd.img"
  #image = "/usr/share/kata-containers/kata-containers.img"

with

  #initrd = "/usr/share/kata-containers/kata-containers-initrd.img"
  image = "/usr/share/kata-containers/kata-containers.img"

2. Run podman, e.g.

  podman run --security-opt label=disable --runtime /usr/bin/kata-runtime -it alpine sh

Actual results:
Error: Failed to check if grpc server is working: context deadline exceeded: OCI runtime error

Expected results:
Image boots as with initrd

Additional info:
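For anyone reproducing this, step 1 above can be scripted. The following is only a sketch: it applies the swap to a scratch copy holding the two lines quoted in the report rather than to the packaged configuration.toml, and it assumes GNU sed (for `-i`) and that both lines start at column 0 exactly as shown.

```shell
#!/bin/sh
# Scratch copy with the two lines from the report; the real file is
# /usr/share/kata-containers/defaults/configuration.toml.
cfg=$(mktemp)
cat > "$cfg" <<'EOF'
initrd = "/usr/share/kata-containers/kata-containers-initrd.img"
#image = "/usr/share/kata-containers/kata-containers.img"
EOF

# Comment out the active initrd line and uncomment the image line.
sed -i -e 's/^initrd = /#initrd = /' -e 's/^#image = /image = /' "$cfg"

# Show the result so the swap can be checked by eye.
cat "$cfg"
```

Reversing the two substitutions switches the file back to the initrd setup.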
Hey Cole, have you tested this configuration?
Output of dmesg with `enable_debug=true` is not that informative:

  [2818435.190330] IPv6: ADDRCONF(NETDEV_CHANGE): vethf3835a1d: link becomes ready
  [2818435.190409] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
  [2818435.196248] cni-podman0: port 1(vethf3835a1d) entered blocking state
  [2818435.196250] cni-podman0: port 1(vethf3835a1d) entered disabled state
  [2818435.196322] device vethf3835a1d entered promiscuous mode
  [2818435.196396] cni-podman0: port 1(vethf3835a1d) entered blocking state
  [2818435.196397] cni-podman0: port 1(vethf3835a1d) entered forwarding state
  [2818435.438314] eth0: Caught tx_queue_len zero misconfig
  [2818438.450773] cni-podman0: port 1(vethf3835a1d) entered disabled state
  [2818438.456599] device vethf3835a1d left promiscuous mode
  [2818438.456608] cni-podman0: port 1(vethf3835a1d) entered disabled state
Yes, I tested this config. As mentioned before, this is what the kernel nvdimm request is about:

https://bugzilla.redhat.com/show_bug.cgi?id=1750581
https://bugzilla.redhat.com/show_bug.cgi?id=1696481

Without those modules built into the kernel, image= has no chance of working at the moment. That's likely the root issue.

Unfortunately, debugging these things with kata is a real pain. The way I did it was to extract a qemu command to run manually. That gave me easier options to capture appliance boot output, pass kernel arguments, and pass a custom initrd with the dracut-systemd and systemd-initrd modules plus the 'rescue' boot option, which gives a way to inspect the appliance state.
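A quick way to see whether the nvdimm support mentioned above is built in or only available as modules is to look at the kernel config file. This is a rough sketch, not a definitive check: the option names CONFIG_LIBNVDIMM and CONFIG_BLK_DEV_PMEM are my assumption about what is relevant (the linked bugs are the authority on what is actually needed), `kconfig_state` is a helper made up for illustration, and the default path assumes the config is installed as /boot/config-$(uname -r).

```shell
#!/bin/sh
# kconfig_state FILE OPTION
# Prints "builtin" for OPTION=y, "module" for OPTION=m, otherwise "unset".
kconfig_state() {
    case "$(grep -E "^$2=" "$1" | head -n 1)" in
        "$2=y") echo builtin ;;
        "$2=m") echo module ;;
        *)      echo unset ;;
    esac
}

# Assumed location of the running kernel's config; pass another path as $1.
cfg="${1:-/boot/config-$(uname -r)}"

# Assumed option names, see the note above.
for opt in CONFIG_LIBNVDIMM CONFIG_BLK_DEV_PMEM; do
    if [ -r "$cfg" ]; then
        echo "$opt: $(kconfig_state "$cfg" "$opt")"
    fi
done
```

Anything reported as "module" or "unset" would be consistent with the explanation above that image= cannot work until the support is built into the kernel.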
This bug appears to have been reported against 'rawhide' during the Fedora 32 development cycle. Changing version to 32.
On Fedora we've decided not to use the image, only the initrd method (see https://src.fedoraproject.org/rpms/kata-osbuilder/c/bd4294598d9dfb5807b8b2490c870dd2ebf9cf32?branch=master), so we can close this one as WONTFIX. If there's a need to revisit this in the future, feel free to re-open it.