Bug 1779708
Summary: | no clear indication of profile activation error | ||
---|---|---|---|
Product: | Red Hat Enterprise Linux 8 | Reporter: | Jaroslav Škarvada <jskarvad> |
Component: | tuned | Assignee: | Jaroslav Škarvada <jskarvad> |
Status: | CLOSED WONTFIX | QA Contact: | qe-baseos-daemons |
Severity: | medium | Docs Contact: | |
Priority: | medium | ||
Version: | 8.2 | CC: | aaron_wilk, atheurer, jeder, jmencak, jskarvad, krister, lcapitulino, lmanasko, nilal, pezhang, riehecky, tcerna |
Target Milestone: | rc | Keywords: | Triaged |
Target Release: | 8.0 | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | If docs needed, set a value | |
Doc Text: | Story Points: | --- | |
Clone Of: | 1385838 | Environment: | |
Last Closed: | 2021-06-04 07:29:48 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | |||
Bug Depends On: | 1385838 | ||
Bug Blocks: | 1400961, 1472751, 1485946, 1932086 |
Comment 2
Jaroslav Škarvada
2021-02-15 10:51:25 UTC
(In reply to Jaroslav Škarvada from comment #2) > In the current code, there is limited support for this feature. The current <snip> > > I propose the following: > - Store the error status in the daemon (it's currently thrown away after the > D-Bus signal is sent). > - Extend the D-Bus API and add method which will report the error status > back anytime it's called (currently the status is received only once through > the signal). Probably the least intrusive way is to add 'active_profile2' > D-Bus method which will work the same way as the 'active_profile' method, > i.e. return the profile/s name/s and in addition also return the errors. If I am not mistaken here then we will have something like 'tuned-adm active-debug/status' in parallel to 'tuned-adm active'? If so, then IMHO it may cause some confusion as to which one a user should run, yes we will have --help to read about it but someone who was already using tuned-adm active may not care to explore this new option. Considering my understanding of your suggestion is correct? How difficult will it be to extend the existing tuned-adm active to report the errors with output such as "Current active profile (with errors): xyz"? > - Switch 'tuned-adm' tool to use the 'active_profile2' D-Bus method and > return the error information with the 'tuned-adm active'. This could either > show the status and few lines of the error (there could be some additional > argument, e.g. '--all', to display all errors or we could add some new > option e.g. 'profile_status'). I agree with having the 'profile_status' option to report the errors explicitly as long as we have some sort of indication in the tuned-adm active that the profile is applied with 'x' number of errors. > - Fix 'tuned-adm' to report the error on 'stderr' not on 'stdout'. > - Extend the 'log.error()' function which is used in the plugins > implementation to store the error string and if there is some error throw > the 'InvalidProfileException' after the profile is processed. > > The questions are: > - Should we extend the 'log.error()' which is used in TuneD plugins or > propose new call, e.g. 'log.fail()'? > I think the current logging API supports more than enough methods > ('log.error()', 'log.info()', 'log.warn()', 'log.debug()'), so personally I > wouldn't add new log method. Let's say the error is always fatal and if > there are some places in the plugins we do not want to cause switch to the > 'failing state', change them to the 'log.warn()'. Totally agree, instead of adding new options we should leverage the existing one that is log.error(). For the places where the error is not fatal, we should replace log.error() with log.warn() instead. > - Should we a) fail early when the first 'log.error()' is called or b) > process the whole profile log all errors and fail afterwards? > I would go with the b) approach, it's more backward compatible and it could > be useful to load the profile even if some errors occurred. Although, I am not strictly against this but for example in a Real-time environment we receive "-ENOSPC" from the kernel then we should definitely terminate and in no way give an impression that the profile is correctly applied. For situations where we would want to continue applying the profile even when we are getting errors from the plugins we should not have log.error() in the first place, we should replace those occurrences with log.warn(). Having said that, I agree that modifying these occurrences of non-fatal log.error() with log.warn() may be difficult because there are several profiles. Thanks (In reply to Nitesh Narayan Lal from comment #3) > (In reply to Jaroslav Škarvada from comment #2) > > In the current code, there is limited support for this feature. The current > > <snip> > > > > > I propose the following: > > - Store the error status in the daemon (it's currently thrown away after the > > D-Bus signal is sent). > > - Extend the D-Bus API and add method which will report the error status > > back anytime it's called (currently the status is received only once through > > the signal). Probably the least intrusive way is to add 'active_profile2' > > D-Bus method which will work the same way as the 'active_profile' method, > > i.e. return the profile/s name/s and in addition also return the errors. > > If I am not mistaken here then we will have something like 'tuned-adm > active-debug/status' in parallel to 'tuned-adm active'? > > If so, then IMHO it may cause some confusion as to which one a user should > run, yes we will have --help to read about it but someone who was already > using tuned-adm active may not care to explore this new option. > > Considering my understanding of your suggestion is correct? How difficult > will it be to extend the existing tuned-adm active to report the errors > with output such as "Current active profile (with errors): xyz"? > No regarding the 'tuned-adm' tool I just thought about 'tuned-adm active' to show something like: $ tuned-adm active Current active profile: balanced Error(s) when applying profile: ERROR_LINE1 ERROR_LINE2 ERROR_LINE3 .... To show all error lines: $ tuned-adm active --all I didn't think about the warn/debug, just about the errors. > > > - Switch 'tuned-adm' tool to use the 'active_profile2' D-Bus method and > > return the error information with the 'tuned-adm active'. This could either > > show the status and few lines of the error (there could be some additional > > argument, e.g. '--all', to display all errors or we could add some new > > option e.g. 'profile_status'). > > I agree with having the 'profile_status' option to report the errors > explicitly > as long as we have some sort of indication in the tuned-adm active that the > profile is applied with 'x' number of errors. > > > - Fix 'tuned-adm' to report the error on 'stderr' not on 'stdout'. > > - Extend the 'log.error()' function which is used in the plugins > > implementation to store the error string and if there is some error throw > > the 'InvalidProfileException' after the profile is processed. > > > > The questions are: > > - Should we extend the 'log.error()' which is used in TuneD plugins or > > propose new call, e.g. 'log.fail()'? > > I think the current logging API supports more than enough methods > > ('log.error()', 'log.info()', 'log.warn()', 'log.debug()'), so personally I > > wouldn't add new log method. Let's say the error is always fatal and if > > there are some places in the plugins we do not want to cause switch to the > > 'failing state', change them to the 'log.warn()'. > > Totally agree, instead of adding new options we should leverage the existing > one > that is log.error(). > For the places where the error is not fatal, we should replace log.error() > with > log.warn() instead. > > > - Should we a) fail early when the first 'log.error()' is called or b) > > process the whole profile log all errors and fail afterwards? > > I would go with the b) approach, it's more backward compatible and it could > > be useful to load the profile even if some errors occurred. > > Although, I am not strictly against this but for example in a Real-time > environment > we receive "-ENOSPC" from the kernel then we should definitely terminate and > in no > way give an impression that the profile is correctly applied. > > For situations where we would want to continue applying the profile even when > we are getting errors from the plugins we should not have log.error() in the > first > place, we should replace those occurrences with log.warn(). > > Having said that, I agree that modifying these occurrences of non-fatal > log.error() > with log.warn() may be difficult because there are several profiles. > If you want to terminate profile loading you can still throw exception 'InvalidProfileException' or we could add some wrapper for it. In the most cases (taking into account the current profiles) the errors are not fatal. E.g. the plugin is providing FEATURE1, FEATURE2, FEATURE3. When e.g. the FEATURE2 fails, the rest of features can be still usable. In such case we are using 'error' - some feature failed, we are sure there will be troubles. If all features are applied, but e.g. FEATURE2 was applied with some troubles, we use 'warn' (e.g. your bootloader doesn't seem to support multiple initrds and the initrd will be probably ignored) - there can be or needn't be troubles, we cannot judge. There are places in the code where the error/warn aren't used consistently, but we are trying to fix this. It seems OCP doesn't use D-Bus. TuneD uses D-Bus for CLI/GUI/frontend <-> daemon communication. It has fallback mode to support basic functionality without D-Bus. In this degraded mode just the configuration files are used (to e.g. change the profile). More complex fallback communication method hasn't exist yet. If this functionality from the D-Bus API has to be used without the D-Bus, we will have to implement some fallback communication method (e.g. by using sockets). (In reply to Jaroslav Škarvada from comment #4) > (In reply to Nitesh Narayan Lal from comment #3) > > (In reply to Jaroslav Škarvada from comment #2) <snip> > > > No regarding the 'tuned-adm' tool I just thought about 'tuned-adm active' to > show something > like: > > $ tuned-adm active > Current active profile: balanced > Error(s) when applying profile: > ERROR_LINE1 > ERROR_LINE2 > ERROR_LINE3 > .... > > To show all error lines: > $ tuned-adm active --all > > I didn't think about the warn/debug, just about the errors. > Ok, that makes sense to me. > > > > > - Switch 'tuned-adm' tool to use the 'active_profile2' D-Bus method and <snip> > > > > Having said that, I agree that modifying these occurrences of non-fatal > > log.error() > > with log.warn() may be difficult because there are several profiles. > > > If you want to terminate profile loading you can still throw exception > 'InvalidProfileException' or we could add some wrapper for it. That's true, this need to be applied around the plugin itself? So that if the plugin fails an exception is thrown. > In the most > cases (taking into account the current profiles) the errors are not fatal. > E.g. the plugin is providing FEATURE1, FEATURE2, FEATURE3. When e.g. the > FEATURE2 fails, the rest of features can be still usable. In such case we > are using 'error' - some feature failed, we are sure there will be troubles. > If all features are applied, but e.g. FEATURE2 was applied with some > troubles, we use 'warn' (e.g. your bootloader doesn't seem to support > multiple initrds and the initrd will be probably ignored) - there can be or > needn't be troubles, we cannot judge. There are places in the code where the > error/warn aren't used consistently, but we are trying to fix this. I agree where the errors are not-fatal we should be fine in continuing the profile application. But atleast for the RT profiles, there are certain errors that can be considered fatal and in those cases, it is better to rollback the changes and notify the user to fix those issues and re-try. I think once we fix all these places where log.error() should not be used it should be ok to make a decision to rollback and terminate based on it, am I wrong? After evaluating this issue, there are no plans to address it further or fix it in an upcoming release. Therefore, it is being closed. If plans change such that this issue will be fixed in an upcoming release, then the bug can be reopened. |