Description of problem (please be as detailed as possible and provide log snippets):

Note: classified this bz as "ceph" -> RBD. I think it is just ceph related, please re-assign as necessary, I am not sure in which category to put this bz.

On nodes where OSDs are scheduled I very often see the error message below [1]. It seems some ceph command is being executed with wrong parameters and fails, reporting the closest matching commands instead.

[1]
Jul 04 11:09:21 f12-h06-000-1029u kubenswrapper[4818]: ceph daemon health check failed with the following output:
Jul 04 11:09:21 f12-h06-000-1029u kubenswrapper[4818]: > no valid command found; 10 closest matches:
Jul 04 11:09:21 f12-h06-000-1029u kubenswrapper[4818]: > 0
Jul 04 11:09:21 f12-h06-000-1029u kubenswrapper[4818]: > 1
Jul 04 11:09:21 f12-h06-000-1029u kubenswrapper[4818]: > 2
Jul 04 11:09:21 f12-h06-000-1029u kubenswrapper[4818]: > abort
Jul 04 11:09:21 f12-h06-000-1029u kubenswrapper[4818]: > assert
Jul 04 11:09:21 f12-h06-000-1029u kubenswrapper[4818]: > bluefs debug_inject_read_zeros
Jul 04 11:09:21 f12-h06-000-1029u kubenswrapper[4818]: > bluefs files list
Jul 04 11:09:21 f12-h06-000-1029u kubenswrapper[4818]: > bluefs stats
Jul 04 11:09:21 f12-h06-000-1029u kubenswrapper[4818]: > bluestore allocator dump block
Jul 04 11:09:21 f12-h06-000-1029u kubenswrapper[4818]: > bluestore allocator fragmentation block
Jul 04 11:09:21 f12-h06-000-1029u kubenswrapper[4818]: > admin_socket: invalid command
Jul 04 11:09:21 f12-h06-000-1029u kubenswrapper[4818]: >
Jul 04 11:09:22 f12-h06-000-1029u ovs-vswitchd[2164]: ovs|451613|connmgr|INFO|br-ex<->unix#2713321: 2 flow_mods in the last 0 s (2 adds)

Version of all relevant components (if applicable):
ODF v4.13
ceph version 17.2.6-70.el9cp (fe62dcdbb2c6e05782a3e2b67d025b84ff5047cc) quincy (stable)

Does this issue impact your ability to continue to work with the product (please explain in detail what is the user impact)?
No

Is there any workaround available to the best of your knowledge?
NA

Rate from 1 - 5 the complexity of the scenario you performed that caused this bug (1 - very simple, 5 - very complex)?
3

Can this issue be reproduced?
Yes, check logs on nodes where OSD pods are scheduled.

Can this issue be reproduced from the UI?
NA

If this is a regression, please provide more details to justify this:

Steps to Reproduce:
There are no actual steps to reproduce; I think we just need to check which command fails and issue it with the right parameters (a sketch for listing the valid admin-socket commands is included after this report).

Actual results:
Error message in the logs - the command fails because of wrong parameters.

Expected results:
The command succeeds, avoiding misleading error messages in the logs.

Additional info:
NA
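For reference, one hedged way to see which commands the OSD admin socket actually accepts and compare them against whatever the health check is issuing (the namespace, pod name placeholder, and OSD id below are assumptions based on a typical ODF deployment, not values taken from this report):

    # open a shell in the affected OSD pod
    oc -n openshift-storage rsh <rook-ceph-osd-pod>

    # list every command the OSD admin socket accepts
    ceph daemon osd.<id> help

    # example of a valid query against the admin socket
    ceph daemon osd.<id> status

Comparing that list with the command the health check sends should show which arguments are wrong.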
Thanks Parth for taking a further look.
Please reopen if there is more to investigate.