Bug 1879920

Summary: Not closing the fd if blivet fails to read the device [rhel-7.9.z]
Product: Red Hat Enterprise Linux 7 Reporter: nijin ashok <nashok>
Component: python-blivet Assignee: Vojtech Trefny <vtrefny>
Status: CLOSED ERRATA QA Contact: Release Test Team <release-test-team-automation>
Severity: high Docs Contact:
Priority: unspecified    
Version: 7.8 CC: cswanson, hhaberma, jreznik, jstodola, vtrefny
Target Milestone: rc Keywords: ZStream
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: python-blivet-0.61.15.76-1.el7_9 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2020-11-10 13:10:54 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1875554    

Description nijin ashok 2020-09-17 11:11:21 UTC
Description of problem:

This bug is a follow-up to bug 1875554. The issue was observed in supervdsm, which uses python-blivet and holds the unmapped multipath device open indefinitely.

From the strace output, we can see that blivet opens the device, tries to read it, and fails with EIO. However, it never closes the file descriptor.

===
10078 10:38:25.869889 open("/dev/mapper/36001405097fdbe4d6e04e3b9bdc97014", O_RDONLY <unfinished ...>
10078 10:38:25.870015 <... open resumed>) = 25
10078 10:38:25.870415 lseek(25, 440, SEEK_SET <unfinished ...>
10078 10:38:25.870600 read(25,  <unfinished ...>
10078 10:38:25.870759 <... read resumed>0x7f865c0ec984, 4) = -1 EIO (Input/output error)

<-- then it moves on to the next device without closing fd 25 -->

10078 10:38:25.871192 open("/dev/sda", O_RDONLY <unfinished ...>
===


blivet does not close the device if an exception is raised while accessing it.

===
blivet/devicelibs/edd.py

def collect_mbrs(devices):
    """ Read MBR signatures from devices.

        Returns a dict mapping device names to their MBR signatures. It is not
        guaranteed this will succeed, with a new disk for instance.
    """
    mbr_dict = {}
    for dev in devices:
        try:
            fd = os.open(dev.path, os.O_RDONLY)
            # The signature is the unsigned integer at byte 440:
            os.lseek(fd, 440, 0)
            mbrsig = struct.unpack('I', os.read(fd, 4))
            os.close(fd)
        except OSError as e:
            log.warning("edd: error reading mbrsig from disk %s: %s",
                        dev.name, str(e))
            # ===> fd is not closed if the device access fails; the loop
            #      just continues with the next device
            continue
===


Version-Release number of selected component (if applicable):

python-blivet-0.61.15.75-1.el7.noarch

How reproducible:

100%

Steps to Reproduce:

The issue is reproducible in RHV with the steps below.

[1] Map a LUN to the server. 

[2] Create a partition on this LUN.

[3] Unmap the LUN from the storage.

[4] Flush the cache on the host.

# echo 3 > /proc/sys/vm/drop_caches

[5] Log in to RHV-M => click on hosts => Storage devices => sync


Actual results:

The fd is not closed if reading the MBR fails.
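
Besides strace, the leaked descriptor could probably also be confirmed by looking at /proc/<pid>/fd of the supervdsm process. The helper below is only a sketch for that check: open_fds_for is a hypothetical name, not part of blivet or vdsm, and it assumes a Linux host with /proc mounted and enough privileges to read the target process's fd directory.

```python
import os


def open_fds_for(pid, path):
    """Return the descriptor numbers of process `pid` that point at `path`.

    Hypothetical helper for illustration. /dev/mapper/* entries are usually
    symlinks to /dev/dm-N, so the path is resolved before comparing.
    """
    target = os.path.realpath(path)
    fd_dir = "/proc/%d/fd" % pid
    matches = []
    for name in os.listdir(fd_dir):
        try:
            link = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            # The descriptor was closed while we were scanning.
            continue
        if link == target:
            matches.append(int(name))
    return sorted(matches)
```

Calling open_fds_for with the supervdsm pid and the /dev/mapper path from the strace output before and after the sync should show whether a descriptor stays open on the unmapped device.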

Expected results:

It should close the fd correctly if it fails to read the device.
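
A minimal sketch of one way to get this behaviour, based on the collect_mbrs excerpt in the description: the close is moved into a finally clause so it also runs on the error path. This is an illustration only and not necessarily what the upstream fix does. The logger name and the way the signature is stored in mbr_dict are assumptions, since the excerpt does not show that part of the function.

```python
import logging
import os
import struct

# Assumption: stand-in for the module-level logger used in edd.py.
log = logging.getLogger("blivet")


def collect_mbrs(devices):
    """Read MBR signatures, closing each descriptor even if the read fails."""
    mbr_dict = {}
    for dev in devices:
        fd = None
        try:
            fd = os.open(dev.path, os.O_RDONLY)
            # The signature is the unsigned integer at byte 440.
            os.lseek(fd, 440, os.SEEK_SET)
            mbrsig = struct.unpack('I', os.read(fd, 4))[0]
        except OSError as e:
            log.warning("edd: error reading mbrsig from disk %s: %s",
                        dev.name, str(e))
            continue
        finally:
            # Runs on success and on failure, including before `continue`.
            if fd is not None:
                os.close(fd)
        # Assumption: how the result is stored is not shown in the excerpt.
        mbr_dict[dev.name] = mbrsig
    return mbr_dict
```

With this shape, an EIO from os.read still produces the warning and skips the device, but the descriptor opened for that device is released before the loop moves on.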

Additional info:

A fix for this, https://github.com/storaged-project/blivet/pull/899, is already merged upstream.

Comment 15 Jan Stodola 2020-10-07 14:20:10 UTC
Thanks for testing this, nijin.

I'm moving this bug to VERIFIED based on comment 14.

Comment 24 errata-xmlrpc 2020-11-10 13:10:54 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (python-blivet bug fix and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:5032

Comment 25 nijin ashok 2021-01-05 03:26:22 UTC
*** Bug 1875554 has been marked as a duplicate of this bug. ***