Bug 2212953

Summary: Smaps file parsing in DNF's needs-restarting cannot handle garbage UTF-8-ish characters in smaps lines
Product: Red Hat Enterprise Linux 8
Reporter: Andy Baugh <andy.baugh>
Component: dnf-plugins-core
Assignee: Jan Kolarik <jkolarik>
Status: CLOSED MIGRATED
QA Contact: swm-qe
Severity: low
Docs Contact:
Priority: unspecified
Version: 8.8
CC: carl, james.antill, jkolarik
Target Milestone: rc
Keywords: MigratedToJIRA, Triaged
Target Release: ---
Hardware: All
OS: Linux
Last Closed: 2023-09-21 17:23:34 UTC
Type: Bug

Description Andy Baugh 2023-06-06 17:15:47 UTC
Description of problem:
Occasionally, when needs-restarting is run, a component that requires a restart has an open fd corresponding to an smaps entry that contains garbage (non-ASCII) characters. When this occurs, needs-restarting exits nonzero with a stack trace pointing at the attempt to read non-ASCII lines from the smaps filehandle on line 77 of needs_restarting.py. This shouldn't be all that surprising, given that the smaps file is really just binary content rather than an encoded text file, though I'd suspect that in 99% or more of cases leaving this unhandled is still fine.
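
A minimal sketch of the failure mode, assuming the smaps file is decoded as strict ASCII (which the 'ascii' codec in the traceback suggests, and which is likely what Python 3.6 does when run under a C/POSIX locale). The smaps content, the path, and the function names are made up for illustration and are not taken from needs_restarting.py:

```python
import os
import tempfile

# Hypothetical smaps-like content: a mapping line whose path contains "é"
# (UTF-8 bytes 0xc3 0xa9), followed by an ordinary field line.
FAKE_SMAPS = (
    "7f0000000000-7f0000001000 r--p 00000000 fd:00 1234 /tmp/caf\u00e9.so\n"
    "Size:                  4 kB\n"
).encode("utf-8")


def read_lines_strict_ascii(path):
    """Read lines with strict ASCII decoding (assumed C-locale behaviour)."""
    with open(path, "r", encoding="ascii") as handle:
        return handle.readlines()


def demo():
    """Return the decode error message for the fake smaps file, or None."""
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(FAKE_SMAPS)
    try:
        read_lines_strict_ascii(tmp.name)
    except UnicodeDecodeError as exc:
        return str(exc)
    finally:
        os.unlink(tmp.name)
    return None
```

Calling `demo()` should return a message of the same shape as the last line of the traceback, since a single non-ASCII byte anywhere in the file is enough to abort `readlines()`.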

Version-Release number of selected component (if applicable):
4.0.21-19 (dnf-plugins-core-4.0.21-19.el8_8.noarch)

How reproducible:
My best guess at a reproducer is to make sure one of the services that needs restarting is holding open a file with UTF-8 (non-ASCII) characters in its filename, then run needs-restarting. This is... rather difficult to actually hit in the wild, and at cPanel we've only seen it through customers reporting it to us after getting failure emails about needs-restarting. As such, I went for a contrived approach within a unit test to reproduce the issue.

Steps to Reproduce:
See test in forthcoming pull request. I have no idea how people actually hit this in the wild, but we've got enough complaints about it that I can say it is a real problem.
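
For readers without the pull request at hand, a contrived test along these lines might look roughly like the sketch below. It is not the test from the pull request: `read_smaps_lines`, the test class, and the fake smaps content are all hypothetical, and `errors="replace"` is just one assumed way of tolerating undecodable bytes.

```python
import os
import tempfile
import unittest


def read_smaps_lines(path):
    """Hypothetical helper: read smaps lines, replacing undecodable bytes."""
    # encoding="ascii" mimics a C-locale process; errors="replace" swaps each
    # undecodable byte for U+FFFD instead of raising UnicodeDecodeError.
    with open(path, "r", encoding="ascii", errors="replace") as handle:
        return handle.readlines()


class GarbageSmapsTest(unittest.TestCase):
    def setUp(self):
        # Fake smaps content; the mapped path contains "é" (bytes 0xc3 0xa9).
        content = (
            "7f0000000000-7f0000001000 r--p 00000000 fd:00 1234 /tmp/caf\u00e9.so\n"
            "Size:                  4 kB\n"
        ).encode("utf-8")
        handle, self.path = tempfile.mkstemp()
        with os.fdopen(handle, "wb") as smaps_file:
            smaps_file.write(content)
        self.addCleanup(os.unlink, self.path)

    def test_strict_ascii_read_raises(self):
        # Reproduces the reported failure.
        with self.assertRaises(UnicodeDecodeError):
            with open(self.path, "r", encoding="ascii") as smaps_file:
                smaps_file.readlines()

    def test_tolerant_read_keeps_every_line(self):
        # The tolerant reader returns both lines, with the bad bytes replaced.
        lines = read_smaps_lines(self.path)
        self.assertEqual(len(lines), 2)
        self.assertIn("\ufffd", lines[0])
        self.assertTrue(lines[0].endswith(".so\n"))
```

Saved as, say, test_garbage_smaps.py, this could be run with `python3 -m unittest test_garbage_smaps`.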

Actual results:
  File "/usr/bin/needs-restarting", line 101, in <module>
    main.user_main(MAPPING[command] + args, exit_code=True)
  File "/usr/lib/python3.6/site-packages/dnf/cli/main.py", line 201, in user_main
    errcode = main(args)
  File "/usr/lib/python3.6/site-packages/dnf/cli/main.py", line 67, in main
    return _main(base, args, cli_class, option_parser_class)
  File "/usr/lib/python3.6/site-packages/dnf/cli/main.py", line 106, in _main
    return cli_run(cli, base)
  File "/usr/lib/python3.6/site-packages/dnf/cli/main.py", line 122, in cli_run
    cli.run()
  File "/usr/lib/python3.6/site-packages/dnf/cli/cli.py", line 1055, in run 
    return self.command.run()
  File "/usr/lib/python3.6/site-packages/dnf-plugins/needs_restarting.py", line 270, in run 
    for ofile in list_opened_files(uid):
  File "/usr/lib/python3.6/site-packages/dnf-plugins/needs_restarting.py", line 77, in list_opened_files
    lines = smaps_file.readlines()
  File "/usr/lib64/python3.6/encodings/ascii.py", line 26, in decode
    return codecs.ascii_decode(input, self.errors)[0]
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 1562: ordinal not in range(128)

Expected results:
No nonzero exit, no stacktrace.
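
One way the expected behaviour might be reached without decoding the whole file as text is to read smaps as bytes and decode only the path field. The following is a sketch of that alternative under stated assumptions, not the plugin's code: `mapped_paths` is a hypothetical helper, and its rule for recognising mapping lines is a simplification.

```python
import os


def mapped_paths(smaps_bytes):
    """Extract mapped file paths from raw smaps bytes without decode errors."""
    paths = []
    for raw_line in smaps_bytes.splitlines():
        # Mapping lines look like: address perms offset dev inode [path]
        fields = raw_line.split(None, 5)
        if len(fields) == 6 and b"-" in fields[0] and fields[5].startswith(b"/"):
            # os.fsdecode uses the filesystem encoding with an error handler
            # that tolerates arbitrary bytes, so it does not raise here.
            paths.append(os.fsdecode(fields[5]))
    return paths
```

With this approach the decoded path can be turned back into the original bytes with `os.fsencode`, which may matter if the path is later compared against files on disk.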

Additional info:
I will add a related pull request to this bug in a comment momentarily.

Comment 2 Jan Kolarik 2023-08-15 05:11:41 UTC
RHEL 9 clone: https://bugzilla.redhat.com/show_bug.cgi?id=2231923.

Comment 4 RHEL Program Management 2023-09-21 17:21:59 UTC
Issue migration from Bugzilla to Jira is in process at this time. This will be the last message in Jira copied from the Bugzilla bug.

Comment 5 RHEL Program Management 2023-09-21 17:23:34 UTC
This BZ has been automatically migrated to the issues.redhat.com Red Hat Issue Tracker. All future work related to this report will be managed there.

Due to differences in account names between systems, some fields were not replicated.  Be sure to add yourself to Jira issue's "Watchers" field to continue receiving updates and add others to the "Need Info From" field to continue requesting information.

To find the migrated issue, look in the "Links" section for a direct link to the new issue location. The issue key will have an icon of 2 footprints next to it, and begin with "RHEL-" followed by an integer.  You can also find this issue by visiting https://issues.redhat.com/issues/?jql= and searching the "Bugzilla Bug" field for this BZ's number, e.g. a search like:

"Bugzilla Bug" = 1234567

In the event you have trouble locating or viewing this issue, you can file an issue by sending mail to rh-issues. You can also visit https://access.redhat.com/articles/7032570 for general account information.