Bug 2231923

Summary: Smaps file parsing in DNF's needs-restarting cannot handle garbage UTF-8-ish characters in smaps lines
Product: Red Hat Enterprise Linux 9
Reporter: Jonathan Wright <jonathan>
Component: dnf-plugins-core
Assignee: Jan Kolarik <jkolarik>
Status: POST
QA Contact: swm-qe
Severity: unspecified
Priority: unspecified
Version: 9.2
CC: carl, james.antill, jkolarik
Target Milestone: rc
Keywords: Triaged
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Type: Bug

Description Jonathan Wright 2023-08-14 17:05:40 UTC
Description of problem:
Occasionally, when needs-restarting is run, a component that requires a restart has an open fd corresponding to an smaps entry that contains garbage (non-ASCII) characters. When this occurs, needs-restarting exits nonzero with a stack trace pointing at the attempt to read non-ASCII lines from the file handle opened around line 77 of needs_restarting.py. This shouldn't be all that surprising, given that the smaps file is effectively raw bytes rather than text in a known encoding, though I suspect that in 99% or more of cases the content is plain ASCII and no special handling is needed.

Version-Release number of selected component (if applicable):
4.0.21-19 (dnf-plugins-core-4.0.21-19.el8_8.noarch)

How reproducible:
My best guess at a reproducer is to have one of the services that needs restarting hold open a file with UTF-8 (non-ASCII) characters in its filename, then run needs-restarting. This is rather difficult to hit in the wild; at cPanel we have only seen it when customers report it to us after receiving failure emails about needs-restarting. I therefore took a contrived approach and reproduced the issue in a unit test.

Steps to Reproduce:
See test in forthcoming pull request. I have no idea how people actually hit this in the wild, but we've got enough complaints about it that I can say it is a real problem.
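
For illustration only, here is a minimal standalone sketch of a contrived reproduction. It is not the test from the pull request. It assumes the failing systems resolve the default text encoding to ASCII, as the traceback suggests, and forces that codec explicitly. The smaps-like content, the path /tmp/café.so, and the helper names are all made up.

```python
import os
import tempfile

# Hypothetical smaps-like content: a mapping line whose path contains the
# UTF-8 encoding of "é" (bytes 0xc3 0xa9). Addresses and path are invented.
FAKE_SMAPS = (
    b"7f0000000000-7f0000001000 r--p 00000000 fd:01 1234 "
    b"/tmp/caf\xc3\xa9.so\n"
    b"Size:                  4 kB\n"
)


def read_lines_as_ascii(path):
    """Read text lines with a strict ASCII codec (assumed to mirror the failing setup)."""
    with open(path, "r", encoding="ascii") as fh:
        return fh.readlines()


def reproduce():
    """Return the UnicodeDecodeError raised by the strict read, or None if it succeeds."""
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(FAKE_SMAPS)
    try:
        read_lines_as_ascii(tmp.name)
    except UnicodeDecodeError as exc:
        return exc
    finally:
        os.unlink(tmp.name)
    return None


if __name__ == "__main__":
    print(reproduce())
```

Running the script should print a UnicodeDecodeError for byte 0xc3, the same class of failure as in the traceback under "Actual results".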

Actual results:
  File "/usr/bin/needs-restarting", line 101, in <module>
    main.user_main(MAPPING[command] + args, exit_code=True)
  File "/usr/lib/python3.6/site-packages/dnf/cli/main.py", line 201, in user_main
    errcode = main(args)
  File "/usr/lib/python3.6/site-packages/dnf/cli/main.py", line 67, in main
    return _main(base, args, cli_class, option_parser_class)
  File "/usr/lib/python3.6/site-packages/dnf/cli/main.py", line 106, in _main
    return cli_run(cli, base)
  File "/usr/lib/python3.6/site-packages/dnf/cli/main.py", line 122, in cli_run
    cli.run()
  File "/usr/lib/python3.6/site-packages/dnf/cli/cli.py", line 1055, in run 
    return self.command.run()
  File "/usr/lib/python3.6/site-packages/dnf-plugins/needs_restarting.py", line 270, in run 
    for ofile in list_opened_files(uid):
  File "/usr/lib/python3.6/site-packages/dnf-plugins/needs_restarting.py", line 77, in list_opened_files
    lines = smaps_file.readlines()
  File "/usr/lib64/python3.6/encodings/ascii.py", line 26, in decode
    return codecs.ascii_decode(input, self.errors)[0]
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 1562: ordinal not in range(128)

Expected results:
No nonzero exit, no stacktrace.
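
One way this expectation could be met, sketched here as an assumption and not necessarily the approach taken in the linked pull request, is to open the file with a lenient error handler so that undecodable bytes are substituted instead of raising. The helper names and the sample content are hypothetical. ASCII is forced so the example behaves the same regardless of locale.

```python
import os
import tempfile


def read_smaps_lines(path):
    """Read lines, replacing every undecodable byte with U+FFFD instead of raising."""
    with open(path, "r", encoding="ascii", errors="replace") as fh:
        return fh.readlines()


def demo():
    """Write made-up smaps-like bytes to a temp file and read them back leniently."""
    data = (
        b"7f0000000000-7f0000001000 r--p 00000000 fd:01 1234 "
        b"/tmp/caf\xc3\xa9.so\n"
        b"Size:                  4 kB\n"
    )
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(data)
    try:
        return read_smaps_lines(tmp.name)
    finally:
        os.unlink(tmp.name)


if __name__ == "__main__":
    for line in demo():
        print(line, end="")
```

The trade-off is that replaced bytes no longer round-trip to the original path. That only matters if the mangled path is later used to look up a file. Reading in binary mode, or using the "surrogateescape" handler, would preserve the bytes.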

Additional info:
https://github.com/rpm-software-management/dnf-plugins-core/pull/494

Comment 1 Jan Kolarik 2023-08-15 05:10:58 UTC
Original RHEL 8 bug: https://bugzilla.redhat.com/show_bug.cgi?id=2212953.