Bug 1701234
| Summary: | blk-availability.service doesn't respect unit order on shutdown | | |
| --- | --- | --- | --- |
| Product: | Red Hat Enterprise Linux 7 | Reporter: | Renaud Métrich <rmetrich> |
| Component: | lvm2 | Assignee: | Peter Rajnoha <prajnoha> |
| lvm2 sub component: | Default / Unclassified | QA Contact: | cluster-qe <cluster-qe> |
| Status: | CLOSED DEFERRED | Docs Contact: | |
| Severity: | high | | |
| Priority: | high | CC: | agk, cbesson, cmarthal, erlend, heinzm, jbrassow, jmagrini, loberman, mbliss, mrichter, msnitzer, paelzer, pdwyer, prajnoha, qguo, revers, rhandlin, zkabelac |
| Version: | 7.6 | Keywords: | Reopened, Triaged |
| Target Milestone: | rc | | |
| Target Release: | --- | | |
| Hardware: | All | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2020-11-18 07:15:31 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | | | |
| Bug Blocks: | 1784876 | | |
Description
Renaud Métrich
2019-04-18 12:56:02 UTC
Normally, I'd add Before=local-fs-pre.target to blk-availability.service so that on shutdown its ExecStop would execute after all local mount points are unmounted. The problem might be with all the dependencies like the iscsi, fcoe and rbdmap services, where we need to make sure that these are executed *after* blk-availability. So I need to find a proper target that we can hook onto so that it also fits all the dependencies. It's possible we need to create a completely new target so we can properly synchronize all the services on shutdown. I'll see what I can do...

Indeed, I wasn't able to find a proper target; none exists.

I believe blk-availability itself needs to be modified to only deactivate non-local disks (hopefully there is a way to distinguish).

Hi Peter,

Could you explain why blk-availability is needed when using multipath or iscsi? With systemd ordering dependencies in units, is that really needed?

(In reply to Renaud Métrich from comment #4)
> Hi Peter,
>
> Could you explain why blk-availability is needed when using multipath or iscsi?
> With systemd ordering dependencies in units, is that really needed?

It is still needed because otherwise there wouldn't be anything else to properly deactivate the stack. Even though blk-availability.service with its blkdeactivate call is still not perfect, it's still better than nothing and better than letting systemd shoot down the devices on its own within its "last-resort" device deactivation loop that happens in the shutdown initramfs (at that point, the iscsi/fcoe and all the other devices are already disconnected anyway, so anything else on top can't be properly deactivated).

We've just received a related report on GitHub too (https://github.com/lvmteam/lvm2/issues/18). I'm revisiting this problem now. The correct solution requires more patching - this part is very fragile at the moment (...easy to break other functionality).

(In reply to Renaud Métrich from comment #3)
> I believe blk-availability itself needs to be modified to only deactivate
> non-local disks (hopefully there is a way to distinguish).

It's possible that we need to split blk-availability (and blkdeactivate) in two because of this... There is a way to distinguish, I hope (definitely for iscsi/fcoe), but there currently isn't a central authority to decide on this, so it must be done manually (checking certain properties in sysfs "manually").

I must be missing something. This service is used to deactivate "remote" block devices requiring the network, such as iscsi or fcoe. Why aren't these services deactivating the block devices by themselves? That way systemd won't kill everything abruptly.

(In reply to Renaud Métrich from comment #7)
> I must be missing something. This service is used to deactivate "remote"
> block devices requiring the network, such as iscsi or fcoe.

Nope, ALL storage, remote as well as local, if possible. We need to look at the complete stack (e.g. device-mapper devices, which are layered on top of other layers, are set up locally).

> Why aren't these services deactivating the block devices by themselves?

Well, honestly, because nobody has ever solved that :) At the beginning it probably wasn't that necessary, and if you just shut your system down and left the devices as they were (unattached, not deactivated), it wasn't such a problem. But now, with various caching layers, thin pools... it's getting quite important to deactivate the stack properly so that any metadata and data are also properly flushed.

Of course, we still need to account for the situation where there's a power outage and the machine is not backed by any other power source, so the machine goes down immediately (for that there are various checking and repair mechanisms). But it's certainly better to avoid this situation, as you could still lose some data. Systemd's loop in the shutdown initramfs is really the last-resort thing to execute, but we can't rely on that (it's just a loop over the device list with a limited loop count; it doesn't look at the real nature of each layer in the stack).
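For readers wondering what "checking certain properties in sysfs manually" could look like in practice, below is a minimal sketch of the idea discussed above. It is only an illustration, not code from blkdeactivate or from any proposed patch; the transport names and the sysfs "session" heuristic are assumptions on my part.

```
#!/bin/sh
# Illustrative sketch only -- not part of blkdeactivate or lvm2.
# Classify top-level block devices as "remote" (network transports such as
# iscsi/fc/fcoe) or "local", using the transport reported by lsblk and,
# as a fallback, the iSCSI "session" component in the sysfs device path.

lsblk -d -n -o NAME,TRAN | while read -r name tran; do
    case "$tran" in
        iscsi|fc|fcoe)
            echo "$name: remote (transport: $tran)"
            ;;
        *)
            # iSCSI-backed disks typically resolve to a sysfs path that
            # contains a ".../sessionN/..." component.
            if readlink -f "/sys/block/$name" | grep -q '/session[0-9]'; then
                echo "$name: remote (iSCSI session in sysfs path)"
            else
                echo "$name: local (transport: ${tran:-none})"
            fi
            ;;
    esac
done
```

A real split of blkdeactivate would presumably need something more robust than this per-device heuristic; the later comments point to SID as the longer-term answer.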
OK, then we need a "blk-availability-local" service and a "blk-availability-remote" service, and maybe associated targets, similar to "local-fs.target" and "remote-fs.target". Probably this should be handled by the systemd package itself, typically by analyzing the device properties when a device shows up in udev.

Based on the report here, this affects only setups with custom services/systemd units. Also, blk-availability/blkdeactivate has been in RHEL 7 since 7.0 and this seems to be the only report we have received so far (therefore, I don't expect many users to be affected by this issue). I also think there is less risk in adding the extra dependency as already described in https://access.redhat.com/solutions/4154611 than in splitting blk-availability/blkdeactivate into (at least) two parts running at different times. If we did that, we'd need to introduce a new synchronization point (like a systemd target) that other services would need to depend on, and so it would require many more changes in various other components, which involves risks. In the future, we'll try to cover this shutdown scenario in a more proper way with the new Storage Instantiation Daemon (SID).

Red Hat Enterprise Linux 7 shipped its final minor release on September 29th, 2020. 7.9 was the last minor release scheduled for RHEL 7. From initial triage it does not appear that the remaining Bugzillas meet the inclusion criteria for Maintenance Phase 2, and they will now be closed. From the RHEL life cycle page: https://access.redhat.com/support/policy/updates/errata#Maintenance_Support_2_Phase

"During Maintenance Support 2 Phase for Red Hat Enterprise Linux version 7, Red Hat defined Critical and Important impact Security Advisories (RHSAs) and selected (at Red Hat discretion) Urgent Priority Bug Fix Advisories (RHBAs) may be released as they become available."

If this BZ was closed in error and meets the above criteria, please re-open it, flag it for 7.9.z, provide suitable business and technical justifications, and follow the process for Accelerated Fixes: https://source.redhat.com/groups/public/pnt-cxno/pnt_customer_experience_and_operations_wiki/support_delivery_accelerated_fix_release_handbook

Feature Requests can be re-opened and moved to RHEL 8 if the desired functionality is not already present in the product.

Please reach out to the applicable Product Experience Engineer[0] if you have any questions or concerns.

[0] https://bugzilla.redhat.com/page.cgi?id=agile_component_mapping.html&product=Red+Hat+Enterprise+Linux+7

Apologies for the inadvertent closure.
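For reference, the "extra dependency" workaround mentioned earlier in the thread (https://access.redhat.com/solutions/4154611) is, as this thread describes it, an ordering dependency so that custom units using LVM/iSCSI-backed storage stop before blk-availability.service runs blkdeactivate. A minimal sketch of such a drop-in is shown below; the unit name my-app.service is hypothetical, and the exact content recommended by the linked article may differ.

```
# Hypothetical example: make a custom unit ("my-app.service", made-up name)
# stop before blk-availability.service runs its ExecStop (blkdeactivate).
# systemd stops units in the reverse of their start-up ordering, so
# "After=blk-availability.service" at start-up means "stop first" at shutdown.

mkdir -p /etc/systemd/system/my-app.service.d

cat > /etc/systemd/system/my-app.service.d/order-after-blk-availability.conf <<'EOF'
[Unit]
After=blk-availability.service
EOF

systemctl daemon-reload
```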