Bug 1982053

Summary: Add a suggestion to split disk for /var/lib/cinder/conversion and how to achieve it
Product: Red Hat OpenStack Reporter: Keigo Noha <knoha>
Component: documentation Assignee: Ian Frangs <ifrangs>
Status: NEW --- QA Contact: RHOS Documentation Team <rhos-docs>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: 16.1 (Train) CC: brian.rosmaita, eharney, erpeters, ifrangs, jamsmith, jschluet, ltoscano, ndeevy, rhos-docs
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Keigo Noha 2021-07-14 05:26:06 UTC
Description of problem:
The current overcloud node uses a flat partition layout, which means the disk has only a / partition.
On Controller nodes, /var/lib/cinder/conversion is used for image-to-volume and volume-to-image conversion. If the image or volume is larger than the available filesystem space, the conversion operation fails, and filling the root filesystem can make the services running on the Controller node unstable.
In support cases, we observed that concurrent volume conversion operations cause high I/O on the root disk. This slows down the monitor operations for mariadb and rabbitmq until they fail, and unnecessary recovery operations are then executed.
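
As a rough illustration of the free-space condition described above, the following Python sketch checks whether a conversion of a given size would fit in the filesystem that holds the conversion directory. The function name conversion_fits and the reserve_bytes parameter are hypothetical and are not part of cinder. The sketch also assumes that the caller already knows the size the converted image will occupy, which for a raw conversion is the virtual size and not the compressed size.

```python
import shutil


def conversion_fits(conversion_dir: str, image_bytes: int, reserve_bytes: int = 0) -> bool:
    """Return True if image_bytes plus a safety reserve fits in the free space.

    conversion_dir is the directory used for conversion, for example
    /var/lib/cinder/conversion on a Controller node.
    reserve_bytes is a hypothetical safety margin kept free for other services.
    """
    if image_bytes < 0 or reserve_bytes < 0:
        raise ValueError("sizes must not be negative")
    # shutil.disk_usage reports total, used, and free bytes for the
    # filesystem that contains the given path.
    free_bytes = shutil.disk_usage(conversion_dir).free
    return image_bytes + reserve_bytes <= free_bytes


# Example call against the current directory so that the sketch runs anywhere.
# On a Controller node the path would be /var/lib/cinder/conversion.
print(conversion_fits(".", 10 * 1024**3, reserve_bytes=5 * 1024**3))
```

A check like this only covers the space problem. It does not address the I/O contention on the root disk, which is why the options below focus on moving the directory to separate storage or on limiting the copy bandwidth.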

To avoid the issue, customers should consider one of the following options.

1. Add additional local storage to the node and mount it on the directory.
2. Add additional external storage (FC, DAS) to the node and mount it on the directory.
3. Add additional external storage (iSCSI) to the node and mount it on the directory.
4. Use NFS for the directory.
5. Use the volume_copy_bps_limit option in cinder.conf to limit the I/O per backend.

Options 1 and 2 can be achieved with normal RHEL operations.
Option 3 requires a custom systemd service file to log in to the iSCSI storage during system boot, because iscsi.service is disabled by default. Controller nodes run the iscsid container. The custom systemd unit must run after the iscsid container and before pacemaker.service.
Option 4 needs some implementation on the T-H-T side, which is handled at https://bugzilla.redhat.com/show_bug.cgi?id=1886762
Option 5 requires the fix for https://bugzilla.redhat.com/show_bug.cgi?id=1967838
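
For option 5, the following Python sketch shows what the resulting cinder.conf fragment might look like when volume_copy_bps_limit is set in a backend section. The helper name render_backend_limit, the section name example_backend, and the 100 MiB/s value are assumptions chosen for illustration only. On a director-deployed overcloud, cinder.conf is generated by the deployment tooling, so this only shows the shape of the setting and not how to apply it.

```python
import configparser
import io


def render_backend_limit(backend_section: str, bytes_per_second: int) -> str:
    """Render an INI fragment that sets volume_copy_bps_limit for one backend.

    backend_section is the name of the backend section in cinder.conf
    (hypothetical here). bytes_per_second is the copy bandwidth limit.
    """
    if bytes_per_second < 0:
        raise ValueError("bytes_per_second must not be negative")
    parser = configparser.ConfigParser()
    parser[backend_section] = {"volume_copy_bps_limit": str(bytes_per_second)}
    buffer = io.StringIO()
    parser.write(buffer)
    return buffer.getvalue()


# Hypothetical backend name and limit (100 MiB per second).
fragment = render_backend_limit("example_backend", 100 * 1024 * 1024)
print(fragment)
```

Because the limit applies per backend, each backend section that performs conversions would need its own value.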

Comment 8 Ian Frangs 2023-03-08 12:16:55 UTC
Hi Luigi,

Please can you take a look at this issue to determine if this should be documented and if so where (which document)?
Also, if a change to the OpenStack documentation is needed, who in the cinder squad could assist in making this documentation change?

Thanks,
Ian.

Comment 9 Luigi Toscano 2023-03-10 10:01:25 UTC
(In reply to Ian Frangs from comment #8)
> Hi Luigi,
> 
> Please can you take a look at this issue to determine if this should be
> documented and if so where (which document)?

Part of this (option 4) was already added to the documentation probably while solving bug 1886762, and it's available as the "Configuring an external NFS share for conversion" section of the "Advanced Overcloud Customization" guide on 16.2: https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/16.2/html-single/advanced_overcloud_customization/index#proc_configuring-external-nfs-share-conversion_storage-configuration

I was going to say that it could be expanded and generalized to include the other use cases, but that guide disappeared in 17.0 and its content was moved around, so that content is now available as the "Configuring an external NFS share for conversion" section of the "Director Installation and Usage" guide. However, it is now part of a chapter called "Configuring NFS storage", which is really focused on NFS, unless that chapter could be generalized to talk about block storage options in general.

> Also, if a change to the OpenStack documentation is needed, who in the
> cinder squad could assist in making this documentation change?

I'm not sure, I guess we need to discuss it internally.