Bug 1432003

Summary: After logrotate , dynamic looping call gets 'ValueError: I/O operation on closed file' on self.stream.flush() in /usr/lib64/python2.7/logging/handlers.py
Product: Red Hat Enterprise Linux 7 Reporter: Masaki Furuta ( RH ) <mfuruta>
Component: pythonAssignee: Charalampos Stratakis <cstratak>
Status: CLOSED ERRATA QA Contact: Branislav Náter <bnater>
Severity: medium Docs Contact:
Priority: medium    
Version: 7.0CC: apevec, berrange, bnater, cstratak, dasmith, eglynn, jkejda, kchamart, lhh, mfuruta, pviktori, python-maint, rbryant, sbauza, sferdjao, sgordon, srevivo, torsava, vromanso, vstinner
Target Milestone: pre-dev-freezeKeywords: Triaged
Target Release: ---Flags: vstinner: needinfo+
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: python-2.7.5-58.el7 Doc Type: No Doc Update
Doc Text:
undefined
Story Points: ---
Clone Of: Environment:
Last Closed: 2017-08-01 20:25:03 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Comment 5 Victor Stinner 2017-03-31 14:33:37 UTC
The problem is unrelated to Nova, but Oslo Log. In RHOS 6 (Juno), Nova still used "Oslo Incubator": copy of Oslo Log in openstack/common/.

The problem is that an external program (logrotate) renames the log file while Nova has an open file descriptor. 

nova/openstack/common/log.py adds a logging.handlers.WatchedFileHandler() log handler:
https://github.com/openstack/nova/blob/juno-eol/nova/openstack/common/log.py#L490

This doesn't support when the open file is moved/renamed. The bug is already known:
https://bugs.launchpad.net/oslo.log/+bug/1437444

It was fixed upstream with the addition of a new log handler which detects and handles moved file: FastWatchedFileHandler.
https://review.openstack.org/#/c/208366/

Two similar bugzilla issues:

* https://bugzilla.redhat.com/show_bug.cgi?id=1432003 (reported at 2014-07-25, no component) -> this one was reported upstream, see the launchpad issue above
* https://bugzilla.redhat.com/show_bug.cgi?id=1123298 (reported at 2014-07-25, component: nova)

Comment 8 Victor Stinner 2017-04-03 14:03:50 UTC
Mike Burns confirmed that python-inotify is already in RHEL 7, so it is ok to use it in RHOS 6.

Comment 15 Charalampos Stratakis 2017-04-19 15:49:56 UTC
(In reply to Petr Viktorin from comment #12)
> Harris or Tomáš, can you please confirm that RHEL 7 uses 2.7.5 and that
> https://hg.python.org/cpython/rev/bb8b0c7fefd0 fix was not backported?

I can confirm that the python version in RHEL 7 (2.7.5), doesn't include this fix.

Victor, by backporting the patch in python, the issue will be resolved?

Comment 16 Victor Stinner 2017-04-19 16:03:50 UTC
> Victor, by backporting the patch in python, the issue will be resolved?

Sorry, that's hard to say at this point:

"How reproducible: Once on customer site only."

and then "Actual results: Can't get any log entry to pursue the cause in nova-compute.log after Feb 05, due to 'ValueError: I/O operation on closed file' on self.stream.flush() in /usr/lib64/python2.7/logging/handlers.py, after logrotate."

Funny or not, I need missing logs to prove that the logging bugfix will fix the logging issue...

At least, according to the customer traceback and to the CPython upstream bug report: they seem *very* similar.

The real mystery is why the nova failed to reopen the nova file while logrotate was rotating the log file? Likely a race condition, likely hard to reproduce.

Comment 20 Charalampos Stratakis 2017-04-19 16:45:14 UTC
Scratch builds with the fix on top of python-2.7.5-39

https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=13042090

Comment 34 errata-xmlrpc 2017-08-01 20:25:03 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2017:1868