Bug 2211777 - [OSP17.1] Integrated benchmark through introspection fails due to fio lacking libaio plugin
Keywords:
Status: ON_DEV
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-ironic-python-agent-builder
Version: 17.1 (Wallaby)
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: z2
Target Release: 17.1
Assignee: Julia Kreger
QA Contact: Joe H. Rahme
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2023-06-01 21:37 UTC by Julia Kreger
Modified: 2023-08-11 13:59 UTC (History)

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed:
Target Upstream Version:
Embargoed:
lsvaty: needinfo? (jhakimra)
jkreger: needinfo-




Links
System ID Private Priority Status Summary Last Updated
OpenStack gerrit 885100 0 None MERGED Add libaio engine for fio on Centos/Rhel9 2023-07-31 19:49:54 UTC
OpenStack gerrit 886671 0 None NEW Add libaio engine for fio on Centos/Rhel9 2023-07-31 19:49:58 UTC
Red Hat Issue Tracker OSP-25566 0 None None None 2023-06-01 21:39:37 UTC

Description Julia Kreger 2023-06-01 21:37:54 UTC
Description of problem:

IPA Images on EL9 lack the FIO plugin required to perform disk performance tests


Version-Release number of selected component (if applicable):


How reproducible:

Always

Steps to Reproduce in a full deployment

1) Install with the inspection_runbench option set to true in the undercloud.conf.
2) Attempt to introspect a node. It will time out, and errors will be visible on the console.

Steps to Reproduce:
1. Extract agent.ramdisk using zcat agent.ramdisk | cpio -i --make-directories && chroot .
2. execute: fio --ioengine=libaio --invalidate=1 --ramp_time=5 --iodepth=32 --runtime=10 --time_based --direct=1 --output-format=json --bs=1M --rw=read --name=MYJOB-sdb --filename=test.io
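The manual reproduction can be sketched as a small shell helper. This is only an illustration: the function names (`extract_ramdisk`, `fio_job_args`, `run_fio_job`) and the scratch directory are hypothetical, and the fio arguments are copied from step 2. Unlike step 1, it unpacks into a separate directory rather than the current one.

```shell
#!/bin/sh
# Sketch only: helper names are illustrative, not part of any IPA tooling.

# Unpack a gzip-compressed cpio ramdisk ($1) into the directory $2.
extract_ramdisk() {
    ramdisk=$(readlink -f "$1") || return 1
    mkdir -p "$2" || return 1
    (cd "$2" && zcat "$ramdisk" | cpio -i --make-directories)
}

# Print the fio arguments from step 2, one per line.
fio_job_args() {
    printf '%s\n' \
        --ioengine=libaio --invalidate=1 --ramp_time=5 --iodepth=32 \
        --runtime=10 --time_based --direct=1 --output-format=json \
        --bs=1M --rw=read --name=MYJOB-sdb --filename=test.io
}

# Run the job inside the extracted tree ($1). chroot needs root.
run_fio_job() {
    # Word splitting of the argument list is intended here.
    chroot "$1" fio $(fio_job_args)
}
```

Run as root, for example: `extract_ramdisk agent.ramdisk ipa-root && run_fio_job ipa-root`.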

Actual results:

Introspection times out, due to the fio error below:

Engine libaio not found; Either name is invalid, was not built, or fio-engine-libaio package is missing.
fio: engine libaio not loadable
fio: failed to load engine
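To tell this failure apart from other fio errors, the output can be matched against the message above. A minimal sketch; `fio_engine_missing` is a hypothetical helper name, and the pattern is taken from the error text quoted in this report.

```shell
# Sketch only: succeeds (exit 0) if the text on stdin contains fio's
# "engine <name> not loadable" error for the engine named in $1.
fio_engine_missing() {
    grep -q "fio: engine $1 not loadable"
}
```

For example, `fio --ioengine=libaio ... 2>&1 | fio_engine_missing libaio && echo "fio-engine-libaio is probably missing"`.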


Expected results:

Introspection completes, command exits without error.

Additional info:

Comment 2 Julia Kreger 2023-06-01 22:03:04 UTC
Change posted to upstream gerrit to install the missing fio-engine-libaio package. The engine was split out of the plain fio package that shipped in EL8, so on EL9 just installing "fio" is not enough, and we need to add the additional required RPM to the package list. The change can be found at: https://review.opendev.org/c/openstack/ironic-python-agent-builder/+/885100

I checked the other underlying benchmark code, and we don't use fio for other tasks, and just installing the fio-engine-libaio rpm resolves the underlying issue allowing the command to run. This issue can be routed around by patching the ramdisk with the additional RPM.

Comment 3 Harald Jensås 2023-06-12 20:00:30 UTC
Workaround: Set `inspection_runbench` to `false` in undercloud.conf

inspection_runbench
    Runs a set of benchmarks during node introspection. Set this parameter to true to enable the benchmarks.
    This option is necessary if you intend to perform benchmark analysis when inspecting the hardware of registered nodes.

In most cases benchmarking is disabled; _it is disabled when using the default configuration_.
Customers that run into this issue can disable benchmarking to enable introspection to succeed. (Obviously without collecting benchmark data in this case.)
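As a sketch, the workaround corresponds to the following undercloud.conf fragment. It uses the option name from the documentation excerpt above (`inspection_runbench`) and assumes the option sits in the `[DEFAULT]` section; the undercloud installation would most likely need to be re-run for the change to take effect.

```ini
[DEFAULT]
# Skip the benchmarks during node introspection (this is also the default).
inspection_runbench = false
```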

If benchmark data is essential for the customer, the workaround option is to modify (patch) the ramdisk, i.e. install the fio-engine-libaio RPM package in the image.
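One possible way to patch the ramdisk is sketched below. It is an assumption-laden outline, not a tested procedure: the helper names are hypothetical, the host is assumed to have dnf with repositories that provide an EL9 fio-engine-libaio package, and the ramdisk is assumed to be a gzip-compressed newc cpio archive that has already been extracted to an absolute path.

```shell
#!/bin/sh
# Sketch only: helper names are illustrative.
# Set DRY_RUN=1 to print the dnf command instead of running it.

run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# $1 = absolute path of the extracted ramdisk tree.
install_libaio_engine() {
    run dnf --installroot="$1" -y install fio-engine-libaio
}

# $1 = extracted ramdisk tree, $2 = output file (outside the tree).
repack_ramdisk() {
    (cd "$1" && find . | cpio -o -H newc | gzip -c) > "$2"
}
```

For example, as root: `install_libaio_engine /tmp/ipa-root && repack_ramdisk /tmp/ipa-root agent.ramdisk.new`.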

Comment 6 Julia Kreger 2023-06-21 17:04:28 UTC
I chatted with the consultant who was working with the customer last week, and they disabled the benchmark execution as they were not reviewing the data.

So they have worked around the issue, which was triggered by configuration they were carrying over from earlier releases, apparently from OSP10.

In any event, I've lowered the severity and priority and backported the change upstream. We can shoot for Z1 on this fix.

