Bug 1511487 - [abrt] [faf] sos: _write_checksum(): /usr/lib/python2.7/site-packages/sos/sosreport.py killed by TypeError
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: sos
Version: 7.5
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: rc
Target Release: ---
Assignee: Pavel Moravec
QA Contact: Miroslav Hradílek
URL: http://faf.lab.eng.brq.redhat.com/faf...
Whiteboard:
Depends On:
Blocks:
Reported: 2017-11-09 12:56 UTC by Vladimir Benes
Modified: 2018-10-30 10:33 UTC (History)

Fixed In Version: sos-3.6-1.el7
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-10-30 10:31:19 UTC
Target Upstream Version:
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
Github sosreport sos pull 1329 0 None closed [sosreport] Handle cases when creating archive fails 2020-12-22 14:16:30 UTC
Red Hat Product Errata RHEA-2018:3144 0 None None None 2018-10-30 10:33:21 UTC

Description Vladimir Benes 2017-11-09 12:56:34 UTC
This bug has been created based on an anonymous crash report requested by the package maintainer.

Report URL: http://faf.lab.eng.brq.redhat.com/faf/reports/bthash/baf7294f2b22418a8ccc03af719d971eed80ab0c/

Comment 1 Bryn M. Reeves 2017-11-09 13:11:05 UTC
Are the logs from the run available?

Comment 4 Pavel Moravec 2017-11-15 07:49:38 UTC
Simple reproducer - bit more generic:

mkdir /tmp/tmp; sosreport --batch --tmp-dir=/tmp/tmp

in another terminal, run after a random time:

rm -rf /tmp/tmp

Possible backtraces:

Traceback (most recent call last):
  File "/usr/sbin/sosreport", line 25, in <module>
    main(sys.argv[1:])
  File "/usr/lib/python2.7/site-packages/sos/sosreport.py", line 1637, in main
    sos.execute()
OSError: [Errno 2] No such file or directory: '/tmp/tmp/sos.X05jdV/sosreport-pmoravec-rhel74.gsslab.brq2.redhat.com-20171115084447'

> /usr/lib64/python2.7/shutil.py(237)rmtree()
-> names = os.listdir(path)

or the reported one:

1 	_write_checksum 	/usr/lib/python2.7/site-packages/sos/sosreport.py 	/usr/lib/python2.7/site-packages/sos/sosreport.py 	1471
2 	final_work 	/usr/lib/python2.7/site-packages/sos/sosreport.py 	/usr/lib/python2.7/site-packages/sos/sosreport.py 	1526
3 	execute 	/usr/lib/python2.7/site-packages/sos/sosreport.py 	/usr/lib/python2.7/site-packages/sos/sosreport.py 	1613
4 	main 	/usr/lib/python2.7/site-packages/sos/sosreport.py 	/usr/lib/python2.7/site-packages/sos/sosreport.py 	1634
5 	<module> 	/usr/sbin/sosreport 	/usr/sbin/sosreport 	25



Code fix is "obvious" (catch and react to exceptions when trying to delete the tmp dir or write the checksum), but the proper reaction might be tricky. I don't want to create a potential regression in 7.5, so deferring to 7.6.
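
For illustration only, a minimal Python sketch of the "catch and react" idea for the cleanup path. It assumes a hypothetical helper, remove_tmpdir(), which is not part of sos and is not the fix that was eventually merged: an already-missing directory is reported and treated as non-fatal, while any other OSError still propagates.

```python
import errno
import shutil
import sys


def remove_tmpdir(path):
    """Remove `path` recursively.

    Hypothetical helper (not part of sos). Returns True if the directory
    was removed, False if it had already vanished.
    """
    try:
        shutil.rmtree(path)
        return True
    except OSError as e:
        if e.errno == errno.ENOENT:
            # Someone else removed the directory first: report it, do not crash.
            sys.stderr.write(
                "temporary directory %s vanished before cleanup\n" % path)
            return False
        # Anything else (EACCES, EIO, ...) is still a real error.
        raise
```

Whether returning False is the right reaction is exactly the "tricky" part mentioned above; the sketch only shows where the decision would be made.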

Comment 5 Pavel Moravec 2018-03-03 16:52:30 UTC
*** Bug 1548199 has been marked as a duplicate of this bug. ***

Comment 6 Pavel Moravec 2018-04-23 14:26:07 UTC
Much more probable cause:

https://github.com/sosreport/sos/pull/1273

that is due to bz1548199 .

Comment 7 Pavel Moravec 2018-04-23 14:27:49 UTC
(In reply to Pavel Moravec from comment #6)
> Much more probable cause:
> 
> https://github.com/sosreport/sos/pull/1273
> 
> that is due to bz1548199 .

Please ignore, I am wrong (again).

Comment 8 Bryn M. Reeves 2018-04-23 15:08:21 UTC
The argument `archive` is passed None:

sosreport.py:1471:_write_checksum:TypeError: unsupported operand type(s) for +: 'NoneType' and 'str'

Traceback (most recent call last):
  File "/usr/sbin/sosreport", line 25, in <module>
    main(sys.argv[1:])
  File "/usr/lib/python2.7/site-packages/sos/sosreport.py", line 1634, in main
    sos.execute()
  File "/usr/lib/python2.7/site-packages/sos/sosreport.py", line 1613, in execute
    return self.final_work()
  File "/usr/lib/python2.7/site-packages/sos/sosreport.py", line 1526, in final_work
    self._write_checksum(archive, hash_name, checksum)
  File "/usr/lib/python2.7/site-packages/sos/sosreport.py", line 1471, in _write_checksum
    fp = open(archive + "." + hash_name, "w")
TypeError: unsupported operand type(s) for +: 'NoneType' and 'str'

Local variables in innermost frame:
checksum: False
self: <sos.sosreport.SoSReport object at 0x7f7bbcc97910>
hash_name: 'md5'
archive: None

Which is possible due to a buglet in _create_checksum():

1443     def _create_checksum(self, archive, hash_name):
1444         if not archive:
1445             return False <------
1446 
1447         archive_fp = open(archive, 'rb')
1448         digest = hashlib.new(hash_name)
1449         digest.update(archive_fp.read())
1450         archive_fp.close()
1451         return digest.hexdigest()

We should not even be attempting to call these functions if we have no valid archive path.

The real bug is here:

1475             # compression could fail for a number of reasons
1476             try:
1477                 archive = self.archive.finalize(
1478                     self.opts.compression_type)
1479             except (OSError, IOError) as e:
1480                 if e.errno in fatal_fs_errors:
1481                     print("")
1482                     print(_(" %s while finalizing archive" % e.strerror))
1483                     print("")
1484                     self._exit(1)
1485             except:
1486                 if self.opts.debug:
1487                     raise
1488                 else:
1489                     return False

We must have returned from self.archive.finalize() with archive==None: nothing ever checks this, _create_checksum() happily ignores it, and then _write_checksum() dies in the spaghetti...

There's a straightforward fix if this is needed promptly, but really, all of these functions would benefit from a thorough review and more robust error handling.
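
As a rough sketch of the kind of guard described above (an assumption for illustration, not the code from any sos pull request): the checksum helpers below refuse to run without a valid archive path, and the caller checks the finalize result before calling them. The names checksum_archive(), write_checksum() and finish() are hypothetical.

```python
import hashlib


def checksum_archive(archive, hash_name="sha256"):
    """Return the hex digest of `archive`, or None if there is no archive."""
    if not archive:
        return None
    digest = hashlib.new(hash_name)
    with open(archive, "rb") as archive_fp:
        # Read in chunks so large archives are not loaded into memory at once.
        for chunk in iter(lambda: archive_fp.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_checksum(archive, hash_name, checksum):
    """Write `checksum` to '<archive>.<hash_name>'; reject missing inputs."""
    if not archive or not checksum:
        raise ValueError("no archive or checksum to write")
    with open(archive + "." + hash_name, "w") as fp:
        fp.write(checksum + "\n")


def finish(archive, hash_name="sha256"):
    """Return True if a checksum file was written, False if no archive exists."""
    if archive is None:
        # finalize produced nothing: stop here instead of failing later
        # with "unsupported operand type(s) for +: 'NoneType' and 'str'".
        return False
    write_checksum(archive, hash_name, checksum_archive(archive, hash_name))
    return True
```

With this shape, finish(None) returns False rather than reaching the string concatenation that raised the TypeError in the traceback.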

Comment 9 Pavel Moravec 2018-05-30 10:48:30 UTC
Filip,
I know you collected various reproducers of "sosreport fails on disk full" scenarios. What is their status? Are any patches available?

Comment 10 Filip Krska 2018-06-01 13:46:27 UTC
Hi Pavel, you probably mean the mentioned pull/1273/ (related to bug 1548199 rather than to this "TypeError: unsupported operand" bug). And yes, the latest patch there attempts to address all the "disk full" scenarios I encountered when testing.

Comment 11 Pavel Moravec 2018-06-05 06:56:13 UTC
(In reply to Filip Krska from comment #10)
> Hi Pavel, you probably mean the mentioned pull/1273/ (related to bug
> 1548199 rather than to this "TypeError: unsupported operand" bug). And yes,
> the latest patch there attempts to address all the "disk full" scenarios I
> encountered when testing.

Three times is enough! :) I keep confusing this BZ with another, independent one.

https://github.com/sosreport/sos/pull/1329 has a fix for this one.

Comment 12 Pavel Moravec 2018-06-21 10:44:39 UTC
Reproducer steps (deterministic, but they require a code change in sos to get the timing right):

1) In /usr/lib/python2.7/site-packages/sos/sosreport.py, around line 1493 (depends on the sos version):

            try:
                archive = self.archive.finalize(
                    self.opts.compression_type)
            except (OSError, IOError) as e:

add a sleep before the finalize call:

            try:
                import time
                time.sleep(10)
                archive = self.archive.finalize(
                    self.opts.compression_type)
            except (OSError, IOError) as e:


2) Then run e.g.:

sosreport --batch

(you can limit it to a few plugins only, but don't use --build!)


3) Check tmp dir from:

An archive containing the collected information will be generated in
/var/tmp/sos.U68EOp and may be provided to a Red Hat support
representative.


4) Once you see:

Creating compressed archive...

delete the temp dir:

rm -rf /var/tmp/sos.U68EOp


5) Wait 10s for the sosreport run to complete.
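
The timing in these steps can also be modelled without touching sos. The following is a toy model only, under stated assumptions: run_toy_report() and simulate_race() are hypothetical names, the "archive" is a plain file, and threading events replace the time.sleep(10) window so that the removal always lands between directory creation and the archive write.

```python
import os
import shutil
import tempfile
import threading


def run_toy_report(dir_ready, dir_removed, result):
    """Toy stand-in for a report run: create a temp dir, then write into it."""
    tmpdir = tempfile.mkdtemp(prefix="toy-sos.")
    result["tmpdir"] = tmpdir
    dir_ready.set()                # roughly "Creating compressed archive..."
    dir_removed.wait(timeout=10)   # stands in for the time.sleep(10) window
    try:
        with open(os.path.join(tmpdir, "archive.tar"), "wb") as fp:
            fp.write(b"payload")
        result["error"] = None
    except OSError as e:
        # The directory was removed underneath us; record the error.
        result["error"] = e


def simulate_race():
    """Remove the worker's temp dir while it waits; return what it observed."""
    dir_ready = threading.Event()
    dir_removed = threading.Event()
    result = {}
    worker = threading.Thread(
        target=run_toy_report, args=(dir_ready, dir_removed, result))
    worker.start()
    dir_ready.wait(timeout=10)
    shutil.rmtree(result["tmpdir"])   # the external "rm -rf"
    dir_removed.set()
    worker.join()
    return result
```

Calling simulate_race() and inspecting result["error"] shows the OSError the worker hits once its directory is gone, which is the same class of failure as the first backtrace in comment 4.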

Comment 13 Bryn M. Reeves 2018-06-21 11:20:30 UTC
We don't support an external process removing the temporary directory out from under sos while it is running and never have. The correct procedure is:

1. Stop the sosreport process (*which will clean up its temporary directory anyway*)

2. Remove the containing directory if you wish to

We can harden the exception handling to print a better error in this case but just like every other time this has been discussed the best answer is "don't do it".

ABRT has had this problem for years: it sets up the directory, it starts the sosreport process (in correct order!), yet it cannot stop the process before it unlinks the temporary directory .... ?

Comment 17 errata-xmlrpc 2018-10-30 10:31:19 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2018:3144

