Bug 2232727

Summary: convert2rhel fails with a backtrace in _bad_kernel_package_signature() constantly
Product: Red Hat Enterprise Linux 7 Reporter: Renaud Métrich <rmetrich>
Component: convert2rhel Assignee: Michal Bocek <mbocek>
Status: CLOSED MIGRATED QA Contact: upgrades-and-conversions
Severity: medium Docs Contact: Miriam Portman <mportman>
Priority: medium    
Version: 7.9 CC: upgrades-and-conversions
Target Milestone: rc Keywords: MigratedToJIRA
Target Release: --- Flags: pm-rhel: mirror+
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2023-09-19 16:43:00 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Renaud Métrich 2023-08-18 08:19:52 UTC
Description of problem:

A customer consistently hits a failure when trying to convert their CentOS 7 system:
-------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------
Traceback: Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/convert2rhel/actions/__init__.py", line 393, in run
    action.run()
  File "/usr/lib/python2.7/site-packages/convert2rhel/actions/system_checks/rhel_compatible_kernel.py", line 49, in run
    _bad_kernel_package_signature(system_info.booted_kernel),
  File "/usr/lib/python2.7/site-packages/convert2rhel/actions/system_checks/rhel_compatible_kernel.py", line 119, in _bad_kernel_package_signature
    kernel_pkg_obj = get_installed_pkg_objects(name, version, release, arch)[0]
IndexError: list index out of range
-------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------

The failure occurs because get_installed_pkg_objects() returns an empty list: no installed kernel matches the name/version/release/arch passed to it.
These values are obtained by querying the package that owns vmlinuz_path with the following command:
-------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------
103     kernel_pkg, return_code = run_subprocess(
104         ["rpm", "-qf", "--qf", "%{VERSION}&%{RELEASE}&%{ARCH}&%{NAME}", vmlinuz_path], print_output=False
105     )
-------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------

The output is then split:
-------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------
116     version, release, arch, name = tuple(kernel_pkg.split("&"))
-------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------

Finally, "kernel_pkg_obj" is extracted:
-------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------
119     kernel_pkg_obj = get_installed_pkg_objects(name, version, release, arch)[0]
-------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------

If the rpm command on line 104 writes anything to stderr, that text ends up in the captured output and corrupts "version", which causes the issue.
For some reason, this always happens on the customer system, as shown by the instrumentation added on line 119 below:
-------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------
119     logger.warning("name='%s', version='%s', release='%s', arch='%s'" % (name, version, release, arch))
120     kernel_pkg_obj = get_installed_pkg_objects(name, version, release, arch)[0]
-------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------

Resulting output ("version" field corrupted by stderr):
-------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------
WARNING - name='kernel', version='BDB2053 Freeing read locks for locker 0x3f9: 129310/139621967034176
BDB2053 Freeing read locks for locker 0x3fb: 129310/139621967034176
3.10.0', release='1160.95.1.el7', arch='x86_64'
-------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------
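
The corruption can be reproduced without rpm by splitting a string shaped like the output above. This is only an illustrative sketch: "parse_kernel_fields" is a hypothetical helper, not part of convert2rhel, and it assumes the query-format result is always the last non-empty line of the merged output, as it was in the customer log.

```python
# Merged stdout+stderr, shaped like the customer output shown in the log above.
merged_output = (
    "BDB2053 Freeing read locks for locker 0x3f9: 129310/139621967034176\n"
    "BDB2053 Freeing read locks for locker 0x3fb: 129310/139621967034176\n"
    "3.10.0&1160.95.1.el7&x86_64&kernel"
)

# Same split as on line 116: the stderr text ends up inside "version".
version, release, arch, name = tuple(merged_output.split("&"))
print(repr(version))


def parse_kernel_fields(output):
    """Hypothetical helper: parse only the last non-empty line of the output.

    Assumes the --qf result is printed last and holds exactly four fields.
    """
    lines = [line for line in output.splitlines() if line.strip()]
    if not lines:
        raise ValueError("no output to parse")
    fields = lines[-1].split("&")
    if len(fields) != 4:
        raise ValueError("unexpected rpm output: %r" % lines[-1])
    return tuple(fields)


print(parse_kernel_fields(merged_output))
```

Parsing the last line only hides the symptom; it would still break if the stderr text were emitted after the query result.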

Version-Release number of selected component (if applicable):

convert2rhel-1.3.2-1.el7.noarch

How reproducible:

Always on the customer system; I have not managed to trigger the warning on another system so far

Additional info:

A quick and dirty solution is to discard stderr when executing the "rpm -qf" command on line 104, which, with the current run_subprocess(), requires wrapping it in a shell, as shown below:
-------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------
104         ["sh", "-c", "rpm -qf --qf \"%%{VERSION}&%%{RELEASE}&%%{ARCH}&%%{NAME}\" %s 2>/dev/null" % vmlinuz_path], print_output=False
-------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------
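
For comparison, stderr can also be discarded without a shell by passing a file object to subprocess.Popen directly, which avoids quoting problems if the path contains shell metacharacters. The following is a minimal sketch, not a patch against convert2rhel: "run_discarding_stderr" is a hypothetical name, and a Python one-liner stands in for rpm so that the example runs on any system.

```python
import os
import subprocess
import sys


def run_discarding_stderr(cmd):
    """Hypothetical helper: run cmd, return (stdout_text, return_code), drop stderr."""
    # os.devnull is opened explicitly because subprocess.DEVNULL does not exist on Python 2.7.
    with open(os.devnull, "wb") as devnull:
        process = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=devnull)
        stdout, _ = process.communicate()
    return stdout.decode("utf-8"), process.returncode


# Stand-in for "rpm -qf --qf ...": writes a warning to stderr and the fields to stdout.
fake_rpm = [
    sys.executable,
    "-c",
    "import sys; sys.stderr.write('BDB2053 Freeing read locks\\n'); "
    "sys.stdout.write('3.10.0&1160.95.1.el7&x86_64&kernel')",
]

output, return_code = run_discarding_stderr(fake_rpm)
print(output, return_code)
```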

Ideally, utils.run_subprocess() should be reimplemented so that it does not merge stdout and stderr by default, with merging available as an option.
Currently it always merges them:
-------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------
 345 def run_subprocess(cmd, print_cmd=True, print_output=True):
 :
 362     process = subprocess.Popen(
 363         cmd,
 364         stdout=subprocess.PIPE,
 365         stderr=subprocess.STDOUT,
 366         bufsize=1,
 367     )
-------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------

It is quite possible that many other calls to utils.run_subprocess() fail in a similar way.
A more robust implementation would probably add some function parameters:
- "expect_stderr": when "expect_stderr=False", run_subprocess() fails if anything is written to stderr
- "discard_stderr": when "discard_stderr=True", stderr is redirected to /dev/null
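
The two parameters could look roughly like the sketch below. Everything in it is an assumption made for illustration: "run_subprocess_sketch" is a hypothetical function, it keeps the current merging behavior as the default, it lets "discard_stderr" take precedence over "expect_stderr", and it raises RuntimeError where the real code would use its own error handling. A Python one-liner again stands in for rpm.

```python
import os
import subprocess
import sys


def run_subprocess_sketch(cmd, expect_stderr=True, discard_stderr=False):
    """Hypothetical variant of run_subprocess(); returns (output, return_code)."""
    if discard_stderr:
        stderr_target = open(os.devnull, "wb")  # drop stderr entirely
    elif expect_stderr:
        stderr_target = subprocess.STDOUT  # current behavior: merge into stdout
    else:
        stderr_target = subprocess.PIPE  # capture separately so it can be checked
    try:
        process = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=stderr_target)
        stdout, stderr = process.communicate()
    finally:
        if discard_stderr:
            stderr_target.close()
    # stderr is only non-None when it was captured through its own pipe.
    if not expect_stderr and not discard_stderr and stderr:
        raise RuntimeError("unexpected stderr from %r: %r" % (cmd, stderr))
    return stdout.decode("utf-8"), process.returncode


# Stand-in for "rpm -qf --qf ...": writes a warning to stderr and the fields to stdout.
fake_rpm = [
    sys.executable,
    "-c",
    "import sys; sys.stderr.write('BDB2053 Freeing read locks\\n'); "
    "sys.stdout.write('3.10.0&1160.95.1.el7&x86_64&kernel')",
]

print(run_subprocess_sketch(fake_rpm, discard_stderr=True))
try:
    run_subprocess_sketch(fake_rpm, expect_stderr=False)
except RuntimeError as err:
    print("failed as intended: %s" % err)
```

With "discard_stderr=True" the kernel lookup on line 104 would receive only the query result, while "expect_stderr=False" would turn silent corruption into an explicit failure at the call site.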

Comment 3 RHEL Program Management 2023-09-19 15:59:00 UTC
Issue migration from Bugzilla to Jira is in process at this time. This will be the last message in Jira copied from the Bugzilla bug.

Comment 4 RHEL Program Management 2023-09-19 16:43:00 UTC
This BZ has been automatically migrated to the issues.redhat.com Red Hat Issue Tracker. All future work related to this report will be managed there.

Due to differences in account names between systems, some fields were not replicated.  Be sure to add yourself to Jira issue's "Watchers" field to continue receiving updates and add others to the "Need Info From" field to continue requesting information.

To find the migrated issue, look in the "Links" section for a direct link to the new issue location. The issue key will have an icon of 2 footprints next to it, and begin with "RHEL-" followed by an integer.  You can also find this issue by visiting https://issues.redhat.com/issues/?jql= and searching the "Bugzilla Bug" field for this BZ's number, e.g. a search like:

"Bugzilla Bug" = 1234567

In the event you have trouble locating or viewing this issue, you can file an issue by sending mail to rh-issues. You can also visit https://access.redhat.com/articles/7032570 for general account information.