Hide Forgot
Description of problem: In /usr/lib/python2.7/site-packages/pulp/plugins/util/publish_step.py: line 165: def process(self): .. try: self.total_units = self._get_total() self.report_progress() self.initialize() self.report_progress() .. finally: # Always call finalize to allow cleanup of file handles self.finalize() self.post_process() Assume an exception (example below) is raised such that the file expected to be created has not been created at all. Calling self.finalize() then fails with: Apr 5 14:50:36 Satellite pulp: celery.worker.job:ERROR: (1275-37664) Task pulp.server.managers.repo.publish.publish[d83cb083-161b-472d-ba43-05d121dde87f] raised unexpected: OSError(2, 'No such file or directory') Apr 5 14:50:36 Satellite pulp: celery.worker.job:ERROR: (1275-37664) Traceback (most recent call last): Apr 5 14:50:36 Satellite pulp: celery.worker.job:ERROR: (1275-37664) File "/usr/lib/python2.7/site-packages/celery/app/trace.py", line 240, in trace_task Apr 5 14:50:36 Satellite pulp: celery.worker.job:ERROR: (1275-37664) R = retval = fun(*args, **kwargs) Apr 5 14:50:36 Satellite pulp: celery.worker.job:ERROR: (1275-37664) File "/usr/lib/python2.7/site-packages/pulp/server/async/tasks.py", line 328, in __call__ Apr 5 14:50:36 Satellite pulp: celery.worker.job:ERROR: (1275-37664) return super(Task, self).__call__(*args, **kwargs) Apr 5 14:50:36 Satellite pulp: celery.worker.job:ERROR: (1275-37664) File "/usr/lib/python2.7/site-packages/celery/app/trace.py", line 437, in __protected_call__ Apr 5 14:50:36 Satellite pulp: celery.worker.job:ERROR: (1275-37664) return self.run(*args, **kwargs) Apr 5 14:50:36 Satellite pulp: celery.worker.job:ERROR: (1275-37664) File "/usr/lib/python2.7/site-packages/pulp/server/managers/repo/publish.py", line 99, in publish Apr 5 14:50:36 Satellite pulp: celery.worker.job:ERROR: (1275-37664) transfer_repo, conduit, call_config) Apr 5 14:50:36 Satellite pulp: celery.worker.job:ERROR: (1275-37664) File "/usr/lib/python2.7/site-packages/pulp/server/managers/repo/publish.py", line 127, in _do_publish Apr 5 14:50:36 Satellite pulp: celery.worker.job:ERROR: (1275-37664) publish_report = publish_repo(transfer_repo, conduit, call_config) Apr 5 14:50:36 Satellite pulp: celery.worker.job:ERROR: (1275-37664) File "/usr/lib/python2.7/site-packages/pulp/server/async/tasks.py", line 495, in wrap_f Apr 5 14:50:36 Satellite pulp: celery.worker.job:ERROR: (1275-37664) return f(*args, **kwargs) Apr 5 14:50:36 Satellite pulp: celery.worker.job:ERROR: (1275-37664) File "/usr/lib/python2.7/site-packages/pulp_rpm/plugins/distributors/yum/distributor.py", line 143, in publish_repo Apr 5 14:50:36 Satellite pulp: celery.worker.job:ERROR: (1275-37664) return self._publisher.publish() Apr 5 14:50:36 Satellite pulp: celery.worker.job:ERROR: (1275-37664) File "/usr/lib/python2.7/site-packages/pulp/plugins/util/publish_step.py", line 558, in publish Apr 5 14:50:36 Satellite pulp: celery.worker.job:ERROR: (1275-37664) return self.process_lifecycle() Apr 5 14:50:36 Satellite pulp: celery.worker.job:ERROR: (1275-37664) File "/usr/lib/python2.7/site-packages/pulp/plugins/util/publish_step.py", line 503, in process_lifecycle Apr 5 14:50:36 Satellite pulp: celery.worker.job:ERROR: (1275-37664) super(PluginStep, self).process_lifecycle() Apr 5 14:50:36 Satellite pulp: celery.worker.job:ERROR: (1275-37664) File "/usr/lib/python2.7/site-packages/pulp/plugins/util/publish_step.py", line 127, in process_lifecycle Apr 5 14:50:36 Satellite pulp: celery.worker.job:ERROR: (1275-37664) step.process() Apr 5 14:50:36 Satellite pulp: celery.worker.job:ERROR: (1275-37664) File "/usr/lib/python2.7/site-packages/pulp/plugins/util/publish_step.py", line 211, in process Apr 5 14:50:36 Satellite pulp: celery.worker.job:ERROR: (1275-37664) self.finalize() Apr 5 14:50:36 Satellite pulp: celery.worker.job:ERROR: (1275-37664) File "/usr/lib/python2.7/site-packages/pulp_rpm/plugins/distributors/yum/publish.py", line 419, in finalize Apr 5 14:50:36 Satellite pulp: celery.worker.job:ERROR: (1275-37664) self.other_context.checksum) Apr 5 14:50:36 Satellite pulp: celery.worker.job:ERROR: (1275-37664) File "/usr/lib/python2.7/site-packages/pulp_rpm/plugins/distributors/yum/metadata/repomd.py", line 82, in add_metadata_file_metadata Apr 5 14:50:36 Satellite pulp: celery.worker.job:ERROR: (1275-37664) timestamp_element.text = str(int(os.path.getmtime(file_path))) Apr 5 14:50:36 Satellite pulp: celery.worker.job:ERROR: (1275-37664) File "/usr/lib64/python2.7/genericpath.py", line 54, in getmtime Apr 5 14:50:36 Satellite pulp: celery.worker.job:ERROR: (1275-37664) return os.stat(filename).st_mtime Apr 5 14:50:36 Satellite pulp: celery.worker.job:ERROR: (1275-37664) OSError: [Errno 2] No such file or directory: '/var/lib/pulp/working/repos/Default_Organization-Red_Hat_Enterprise_Linux_Server-Red_Hat_Enterprise_Linux_7_Server_RPMs_x86_64_7Server/distributors/yum_distributor/repodata/other.xml.gz' This happens due to the fact that MetadataFileContext finalize method sees the file is not opened, so it does nothing, but PublishRpmStep finalize assumes there has been created file for other_context. (same applies when an exception is raised for file_lists or primary context). The exception above is practically impossible to troubleshoot. It is so due to the fact a swallowed exception is silently ignored. Simple printing warning with exception details would help a lot to understand what happened. One real example what exception could be raised & swallowed: Apr 14 18:02:36 Satellite pulp: pulp.plugins.util.publish_step:WARNING: (11964-64448) 01612026: caught exception: Traceback (most recent call last): Apr 14 18:02:36 Satellite pulp: pulp.plugins.util.publish_step:WARNING: (11964-64448) File "/usr/lib/python2.7/site-packages/pulp/plugins/util/publish_step.py", line 183, in process Apr 14 18:02:36 Satellite pulp: pulp.plugins.util.publish_step:WARNING: (11964-64448) self.initialize() Apr 14 18:02:36 Satellite pulp: pulp.plugins.util.publish_step:WARNING: (11964-64448) File "/usr/lib/python2.7/site-packages/pulp_rpm/plugins/distributors/yum/publish.py", line 403, in initialize Apr 14 18:02:36 Satellite pulp: pulp.plugins.util.publish_step:WARNING: (11964-64448) context.initialize() Apr 14 18:02:36 Satellite pulp: pulp.plugins.util.publish_step:WARNING: (11964-64448) File "/usr/lib/python2.7/site-packages/pulp/plugins/util/metadata_writer.py", line 74, in initialize Apr 14 18:02:36 Satellite pulp: pulp.plugins.util.publish_step:WARNING: (11964-64448) self._open_metadata_file_handle() Apr 14 18:02:36 Satellite pulp: pulp.plugins.util.publish_step:WARNING: (11964-64448) File "/usr/lib/python2.7/site-packages/pulp/plugins/util/metadata_writer.py", line 364, in _open_metadata_file_handle Apr 14 18:02:36 Satellite pulp: pulp.plugins.util.publish_step:WARNING: (11964-64448) content = gzip_handle.read(BUFFER_SIZE) Apr 14 18:02:36 Satellite pulp: pulp.plugins.util.publish_step:WARNING: (11964-64448) File "/usr/lib64/python2.7/gzip.py", line 261, in read Apr 14 18:02:36 Satellite pulp: pulp.plugins.util.publish_step:WARNING: (11964-64448) self._read(readsize) Apr 14 18:02:36 Satellite pulp: pulp.plugins.util.publish_step:WARNING: (11964-64448) File "/usr/lib64/python2.7/gzip.py", line 325, in _read Apr 14 18:02:36 Satellite pulp: pulp.plugins.util.publish_step:WARNING: (11964-64448) self._read_eof() Apr 14 18:02:36 Satellite pulp: pulp.plugins.util.publish_step:WARNING: (11964-64448) File "/usr/lib64/python2.7/gzip.py", line 347, in _read_eof Apr 14 18:02:36 Satellite pulp: pulp.plugins.util.publish_step:WARNING: (11964-64448) hex(self.crc))) Apr 14 18:02:36 Satellite pulp: pulp.plugins.util.publish_step:WARNING: (11964-64448) IOError: CRC check failed 0x7d4c4209 != 0x60a4e457L I.e. CRC check on the gzip file is wrong -> exception raised -> silently ignored -> no other (or filelist or primary) context file exists -> cryptic "No such file exists" raised. Another possibility might be (havent tested/seen) file permissions, too many open FDs, some I/O problem,.. many. Please add some warning what exception was raised, like: # Double check & return if we have been canceled if self.canceled: return except Exception as e: _LOG.warn('caught exception: %s' % traceback.format_exc()) finally: # Always call finalize to allow cleanup of file handles self.finalize() Version-Release number of selected component (if applicable): pulp-server-2.6.0.20-1.el7sat.noarch How reproducible: 100% (if reproducer known) Steps to Reproduce: 1. Somehow let damage primar / other / filelist gzip for some repo (not sure on this step) 2. Try to sync & publish the repo 3. Try to understand from logs what happened Actual results: 2. fails 3. no idea what happened, "No such file or directory: other.xml.gz" is use-less info hiding the real cause. Expected results: 2. fails 3. original exception raised that prevented opening the metadata / context file. Additional info:
Per 6.3 planning, moving out non acked bugs to the backlog
The Pulp upstream bug status is at NEW. Updating the external tracker on this bug.
The Pulp upstream bug priority is at Normal. Updating the external tracker on this bug.
I suggest re-triaging this for priority and severity. This bug is very difficult to reproduce. If pulp fails to open a new file for writing, the error handling tries to then close that file. This just causes the error to be reported incorrectly, obscuring the original (highly unlikely) problem. That is silly, but very unlikely to happen in the real world. It's worth fixing, but given that information, I suggest the severity should not be "high", and perhaps the priority should not even be "medium".
The Pulp upstream bug status is at ASSIGNED. Updating the external tracker on this bug.
The Pulp upstream bug status is at POST. Updating the external tracker on this bug.
The Pulp upstream bug status is at MODIFIED. Updating the external tracker on this bug.
All upstream Pulp bugs are at MODIFIED+. Moving this bug to POST.
The Pulp upstream bug status is at ON_QA. Updating the external tracker on this bug.
The Pulp upstream bug status is at CLOSED - CURRENTRELEASE. Updating the external tracker on this bug.
Verified on Sat 6.3 snap 33
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. > > > > For information on the advisory, and where to find the updated files, follow the link below. > > > > If the solution does not work for you, open a new bug report. > > > > https://access.redhat.com/errata/RHSA-2018:0336