Bug 2212953

Summary: Smaps file parsing in DNF's needs-restarting cannot handle garbage UTF-8-ish characters in smaps lines
Product: Red Hat Enterprise Linux 8 Reporter: Andy Baugh <andy.baugh>
Component: dnf-plugins-core Assignee: Jan Kolarik <jkolarik>
Status: POST --- QA Contact: swm-qe
Severity: low Docs Contact:
Priority: unspecified    
Version: 8.8 CC: carl, james.antill, jkolarik
Target Milestone: rc Keywords: Triaged
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Andy Baugh 2023-06-06 17:15:47 UTC
Description of problem:
Occasionally, when needs-restarting is run, a component that requires a restart has an open fd corresponding to an smaps entry that contains garbage (non-ASCII) characters. When this occurs, needs-restarting exits nonzero with a stacktrace pointing at the attempt to read non-ASCII lines from the smaps filehandle on line 77 of needs_restarting.py. This shouldn't be all that surprising, given that the smaps file is in fact just binary content and not an encoded text file, though I suspect that in 99% or more of cases leaving this unhandled causes no trouble.

Version-Release number of selected component (if applicable):
4.0.21-19 (dnf-plugins-core-4.0.21-19.el8_8.noarch)

How reproducible:
My best guess for a reproducer is to ensure that one of the services which needs restarting is holding open a file with UTF-8 characters in its filename, then run needs-restarting. This is... rather difficult to actually hit out in the wild, and at cPanel we've only seen it from customers reporting it to us after getting failure emails about needs-restarting. As such, I went for a contrived approach within a unit test to reproduce the issue.

Steps to Reproduce:
See test in forthcoming pull request. I have no idea how people actually hit this in the wild, but we've got enough complaints about it that I can say it is a real problem.
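
Until the pull request is linked, here is a minimal standalone sketch of the kind of contrived reproduction described above. It is not the test from the pull request: the smaps-style line, the file path inside it, and the helper names (read_smaps_as_ascii, reproduce) are all hypothetical, and forcing encoding="ascii" is an assumption standing in for whatever locale the affected systems run under.

```python
import os
import tempfile

# Hypothetical smaps-style mapping line. The path ends in "café.so", so it
# contains the UTF-8 bytes 0xc3 0xa9, which are not valid ASCII.
SMAPS_LINE = b"7f0000000000-7f0000001000 r--p 00000000 fd:00 1234 /tmp/caf\xc3\xa9.so\n"


def read_smaps_as_ascii(path):
    """Read a file in text mode with a strict ASCII codec (assumed locale)."""
    with open(path, "r", encoding="ascii") as smaps_file:
        return smaps_file.readlines()


def reproduce():
    """Return the UnicodeDecodeError raised by the strict read, or None."""
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(SMAPS_LINE)
        try:
            read_smaps_as_ascii(path)
        except UnicodeDecodeError as exc:
            return exc
        return None
    finally:
        os.unlink(path)


if __name__ == "__main__":
    print(repr(reproduce()))
```

Run directly, this prints the captured exception, which should be the same class of error as in the traceback under "Actual results".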

Actual results:
  File "/usr/bin/needs-restarting", line 101, in <module>
    main.user_main(MAPPING[command] + args, exit_code=True)
  File "/usr/lib/python3.6/site-packages/dnf/cli/main.py", line 201, in user_main
    errcode = main(args)
  File "/usr/lib/python3.6/site-packages/dnf/cli/main.py", line 67, in main
    return _main(base, args, cli_class, option_parser_class)
  File "/usr/lib/python3.6/site-packages/dnf/cli/main.py", line 106, in _main
    return cli_run(cli, base)
  File "/usr/lib/python3.6/site-packages/dnf/cli/main.py", line 122, in cli_run
    cli.run()
  File "/usr/lib/python3.6/site-packages/dnf/cli/cli.py", line 1055, in run 
    return self.command.run()
  File "/usr/lib/python3.6/site-packages/dnf-plugins/needs_restarting.py", line 270, in run 
    for ofile in list_opened_files(uid):
  File "/usr/lib/python3.6/site-packages/dnf-plugins/needs_restarting.py", line 77, in list_opened_files
    lines = smaps_file.readlines()
  File "/usr/lib64/python3.6/encodings/ascii.py", line 26, in decode
    return codecs.ascii_decode(input, self.errors)[0]
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 1562: ordinal not in range(128)

Expected results:
No nonzero exit, no stacktrace.
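
One way to get that behaviour, sketched here under assumptions, is to make the read tolerant of undecodable bytes. This is only an illustration and not necessarily what the eventual fix does; read_smaps_lines and demo are hypothetical names, and encoding="ascii" is again an assumed stand-in for the failing locale. The errors="replace" argument to open() substitutes U+FFFD for each undecodable byte instead of raising.

```python
import os
import tempfile


def read_smaps_lines(path):
    """Read smaps-style lines, replacing undecodable bytes rather than raising."""
    # errors="replace" turns each byte the codec cannot decode into U+FFFD.
    with open(path, "r", encoding="ascii", errors="replace") as smaps_file:
        return smaps_file.readlines()


def demo():
    """Write a hypothetical smaps-style line with non-ASCII bytes and read it back."""
    data = b"7f0000000000-7f0000001000 r--p 00000000 fd:00 1234 /tmp/caf\xc3\xa9.so\n"
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(data)
        return read_smaps_lines(path)
    finally:
        os.unlink(path)


if __name__ == "__main__":
    for line in demo():
        print(line.rstrip("\n"))
```

The trade-off is that the affected filename is no longer byte-exact, which matters only if the path is later used to look up the file. Reading in binary mode, or using errors="surrogateescape", would preserve the original bytes.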

Additional info:
Will add a related pull request momentarily to this bug in a comment.

Comment 2 Jan Kolarik 2023-08-15 05:11:41 UTC
RHEL 9 clone: https://bugzilla.redhat.com/show_bug.cgi?id=2231923.