Bug 1750581 - enable nvdimm drivers for kata-containers usage?
Summary: enable nvdimm drivers for kata-containers usage?
Keywords:
Status: CLOSED DEFERRED
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: 31
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Kernel Maintainer List
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2019-09-09 23:43 UTC by Cole Robinson
Modified: 2020-04-02 19:42 UTC (History)

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2020-04-02 19:42:57 UTC
Type: Bug
Embargoed:


Description Cole Robinson 2019-09-09 23:43:10 UTC
We are working on packaging Kata Containers for Fedora. Kata Containers is an OCI container runtime that inserts a qemu/kvm VM between your host and your docker/podman container.

kata wants the KVM VM appliance to be as small as possible, so it doesn't take up much memory and it starts up quickly. There are two methods of booting the VM appliance filesystem: stuff it in an initrd, or pass it to the VM through nvdimm. The latter won't work for Fedora, though, unless the necessary drivers are built into the kernel rather than built as modules.

Here's the minimal kernel config diff I could distill to make it work, against the f30 kernel. Is this an acceptable addition? FWIW I haven't really consulted with other virt folks yet to see if we even want to go this direction, but I just got it all working and figured I'd float the idea early. Thanks!


--- .config	2019-09-09 18:59:33.354576361 -0400
+++ minconfig	2019-09-09 19:01:02.004509091 -0400
@@ -531,7 +531,7 @@ CONFIG_ACPI_SBS=m
 CONFIG_ACPI_HED=y
 CONFIG_ACPI_CUSTOM_METHOD=m
 CONFIG_ACPI_BGRT=y
-CONFIG_ACPI_NFIT=m
+CONFIG_ACPI_NFIT=y
 # CONFIG_NFIT_SECURITY_DEBUG is not set
 CONFIG_ACPI_HMAT=y
 CONFIG_HAVE_ACPI_APEI=y
@@ -7917,20 +7917,20 @@ CONFIG_THUNDERBOLT=m
 # CONFIG_ANDROID is not set
 # end of Android
 
-CONFIG_LIBNVDIMM=m
-CONFIG_BLK_DEV_PMEM=m
-CONFIG_ND_BLK=m
+CONFIG_LIBNVDIMM=y
+CONFIG_BLK_DEV_PMEM=y
+CONFIG_ND_BLK=y
 CONFIG_ND_CLAIM=y
-CONFIG_ND_BTT=m
+CONFIG_ND_BTT=y
 CONFIG_BTT=y
-CONFIG_ND_PFN=m
+CONFIG_ND_PFN=y
 CONFIG_NVDIMM_PFN=y
 CONFIG_NVDIMM_DAX=y
 CONFIG_NVDIMM_KEYS=y
 CONFIG_DAX_DRIVER=y
 CONFIG_DAX=y
-CONFIG_DEV_DAX=m
-CONFIG_DEV_DAX_PMEM=m
+CONFIG_DEV_DAX=y
+CONFIG_DEV_DAX_PMEM=y
 CONFIG_DEV_DAX_KMEM=m
 # CONFIG_DEV_DAX_PMEM_COMPAT is not set
 CONFIG_NVMEM=y
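
One way to check whether a given kernel already matches the diff above is to parse its config file and list which of the changed options are not built in. This is only a rough sketch: the option list is copied from the diff, while the helper names `parse_kconfig` and `not_builtin` are made up for illustration and are not part of any kata or Fedora tooling.

```python
import re

# Options that the diff above switches from =m to =y.
NVDIMM_OPTIONS = [
    "CONFIG_ACPI_NFIT",
    "CONFIG_LIBNVDIMM",
    "CONFIG_BLK_DEV_PMEM",
    "CONFIG_ND_BLK",
    "CONFIG_ND_BTT",
    "CONFIG_ND_PFN",
    "CONFIG_DEV_DAX",
    "CONFIG_DEV_DAX_PMEM",
]


def parse_kconfig(text):
    """Map option name -> value from kernel .config text.

    '# CONFIG_FOO is not set' lines are recorded as 'n'.
    """
    values = {}
    for line in text.splitlines():
        line = line.strip()
        unset = re.match(r"^# (CONFIG_\w+) is not set$", line)
        if unset:
            values[unset.group(1)] = "n"
            continue
        assigned = re.match(r"^(CONFIG_\w+)=(.*)$", line)
        if assigned:
            values[assigned.group(1)] = assigned.group(2)
    return values


def not_builtin(text, options=NVDIMM_OPTIONS):
    """Return {option: value} for every listed option that is not '=y'.

    Options absent from the config are reported as 'missing'.
    """
    values = parse_kconfig(text)
    return {
        opt: values.get(opt, "missing")
        for opt in options
        if values.get(opt) != "y"
    }
```

On a Fedora host the installed kernel's config is normally available as /boot/config-$(uname -r), so passing that file's contents to `not_builtin` should return an empty dict once all of the options above are built in.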

Comment 1 Laura Abbott 2019-09-10 08:32:41 UTC
I'm not opposed to this change. I think this is a reasonable argument for why we want to have the drivers built in. We can leave this bug open until you make a final decision.

Comment 2 Cole Robinson 2019-09-10 14:12:54 UTC
Thanks Laura.

CCing some kata+virt folks. stefanha, dgilbert, any comments here? I'm still not that familiar with kata, so I'd appreciate a second set of eyes.

Comment 3 Stefan Hajnoczi 2019-09-18 13:41:12 UTC
I think supporting boot from NVDIMM in Fedora is reasonable for the Kata Containers use case.

Comment 4 Cole Robinson 2019-09-18 22:22:54 UTC
Thanks Stefan. Laura, if you could add this to f31+ that would be great. f30 as well, if it's not an issue.

Comment 5 Christophe de Dinechin 2019-09-26 13:40:26 UTC
(In reply to Cole Robinson from comment #0)
> kata wants the KVM VM appliance to be as small as possible, so it doesn't
> take up much memory and it starts up quickly. There's two methods of booting
> the VM appliance filesystem: stuff it in an initrd, or pass it to the VM
> through nvdimm. The latter piece won't work for Fedora though unless the

Looks reasonable to me. How would you configure that, though? Is that something
you plan to add to kata-osbuilder?

Comment 6 Dr. David Alan Gilbert 2019-09-26 13:44:32 UTC
It's worth checking if the vhost-user stuff plays well with nvdimm.

Comment 7 Cole Robinson 2019-09-26 14:08:06 UTC
(In reply to Christophe de Dinechin from comment #5)
> (In reply to Cole Robinson from comment #0)
> > kata wants the KVM VM appliance to be as small as possible, so it doesn't
> > take up much memory and it starts up quickly. There's two methods of booting
> > the VM appliance filesystem: stuff it in an initrd, or pass it to the VM
> > through nvdimm. The latter piece won't work for Fedora though unless the
> 
> Looks reasonable to me. How would you configure that, though? Is that
> something
> you plan to add to kata-osbuilder?

nvdimm is what kata uses for the 'image' boot option in configuration-qemu.toml. Current kata-osbuilder in Fedora 31+ will generate an image and an initrd.
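
To see which boot method a given configuration-qemu.toml selects, something like the following could be used. It is a rough sketch that assumes the `image` and `initrd` keys live under a `[hypervisor.qemu]` section; the function name `boot_method` is hypothetical, and the line-based parsing is a deliberate simplification rather than a full TOML parser (for instance, it would mishandle a path containing `#`).

```python
import re


def boot_method(toml_text, section="hypervisor.qemu"):
    """Return 'image', 'initrd', 'both', or 'none' for the given section.

    Only uncommented, non-empty image/initrd assignments are counted.
    """
    current = None
    found = set()
    for raw in toml_text.splitlines():
        # Drop trailing comments; fully commented-out lines become empty.
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        header = re.match(r"^\[([^\]]+)\]$", line)
        if header:
            current = header.group(1).strip()
            continue
        key = re.match(r'^(image|initrd)\s*=\s*"(.*)"$', line)
        if key and current == section and key.group(2):
            found.add(key.group(1))
    if found == {"image", "initrd"}:
        return "both"
    if found:
        return found.pop()
    return "none"
```

A result of 'image' would mean the guest rootfs is handed over via nvdimm and therefore depends on the built-in drivers discussed in this bug, while 'initrd' would not.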

(In reply to Dr. David Alan Gilbert from comment #6)
> It's worth checking if the vhost-user stuff plays well with nvdimm.

I haven't tried it. I figure enabling the kernel modules is good to do regardless, since it will give us the opportunity to actually test these things with the stock kernel. If nvdimm doesn't play well with vhost-user we can stick with initrd as the default appliance boot method (IMO we should do that anyways as a first packaging pass)

Comment 8 Cole Robinson 2019-10-18 14:06:03 UTC
Turns out CONFIG_NVDIMM=m was an explicit request from Intel (for non-kata reasons) back in April; see https://bugzilla.redhat.com/show_bug.cgi?id=1696481

I've asked for more info there

Comment 9 Cole Robinson 2020-04-02 19:42:57 UTC
We are choosing to ignore the image= option in kata for the foreseeable future, so we don't need this change anymore. We will reopen this bug if it becomes relevant again.

