Bug 1873161
| Field | Value |
|---|---|
| Summary | [ROKS/Azure] ssd drives recognized as hdd in ceph osd tree |
| Product | [Red Hat Storage] Red Hat OpenShift Container Storage |
| Component | ocs-operator |
| Version | 4.5 |
| Status | CLOSED WONTFIX |
| Severity | high |
| Priority | high |
| Reporter | Elvir Kuric <ekuric> |
| Assignee | Jose A. Rivera <jarrpa> |
| QA Contact | Raz Tamir <ratamir> |
| CC | ebenahar, jarrpa, madam, mbukatov, muagarwa, nberry, ocs-bugs, owasserm, pkundra, rcyriac, sabose, shan, shberry, sostapov |
| Keywords | AutomationBackLog, Performance |
| Target Milestone | --- |
| Target Release | --- |
| Hardware | Unspecified |
| OS | Unspecified |
| Doc Type | If docs needed, set a value |
| Story Points | --- |
| Cloned As | 1928197 (view as bug list) |
| Last Closed | 2021-06-02 15:44:25 UTC |
| Type | Bug |
| Regression | --- |
| Mount Type | --- |
| Documentation | --- |
| Bug Blocks | 1848907, 1928197 |
Description
Elvir Kuric
2020-08-27 14:21:45 UTC
How are the devices presented by the kernel? Can you run `lsblk --output-all` on one of the VMs? If the disks are reported as rotational then this is expected.

```
sh-4.2# lsblk --all
NAME                                  MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINT
sda                                     8:0    0   1.8T  0 disk
|-sda1                                  8:1    0   256M  0 part  /boot
|-sda2                                  8:2    0     1G  0 part
`-sda3                                  8:3    0   1.8T  0 part  /
sdb                                     8:16   0 893.8G  0 disk
`-sdb1                                  8:17   0 893.8G  0 part
  `-docker_data                       253:1    0 893.8G  0 crypt /var/data
sdc                                     8:32   0   1.8T  0 disk
`-sdc1                                  8:33   0   1.8T  0 part
sdd                                     8:48   0    20G  0 disk
`-3600a0980383056666424506a33377335   253:0    0    20G  0 mpath /var/data/kubelet/pods/f57adacf-d0d2-4895-a0a5-5fb46d42260d/volumes/ibm~ibmc-block/pvc-9cd164aa-f919-4d44-bfce-a978e719d4be
sde                                     8:64   0    20G  0 disk
`-3600a0980383056666424506a33377335   253:0    0    20G  0 mpath /var/data/kubelet/pods/f57adacf-d0d2-4895-a0a5-5fb46d42260d/volumes/ibm~ibmc-block/pvc-9cd164aa-f919-4d44-bfce-a978e719d4be
rbd0                                  252:0    0   300G  0 disk  /var/data/kubelet/pods/7640f74a-f129-4e66-b973-c5f90e39ddcb/volumes/kubernetes.io~csi/pvc-630e25ad-6e3c-4893-a0f8-978d47e75e2f/mount
rbd1                                  252:16   0   300G  0 disk  /var/data/kubelet/pods/9dbbac7b-be03-4f6d-80ee-4fc4bcdcdd39/volumes/kubernetes.io~csi/pvc-8f15be58-5d9e-4410-9ebb-d7d715166cc5/mount
loop0                                   7:0    0   1.8T  0 loop
```

I've logged into an environment and this is not a Ceph nor a Rook bug: the kernel reports the device used for the OSD as an HDD. See:

```
[root@rook-ceph-osd-0-ddd75d95-vlj46 /]# ls -al /var/lib/ceph/osd/ceph-0/block
brw-rw-rw-. 1 ceph ceph 8, 48 Sep  3 12:38 /var/lib/ceph/osd/ceph-0/block
[root@rook-ceph-osd-0-ddd75d95-vlj46 /]# ls -al /dev/sdd
brw-rw-rw-. 1 root disk 8, 48 Sep  3 10:32 /dev/sdd
[root@rook-ceph-osd-0-ddd75d95-vlj46 /]# lsblk -o ROTA /dev/sdd
ROTA
   1
```

So it is expected. We could solve this by forcing the type of drive if we know it in advance.
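The rotational flag that `lsblk` shows comes straight from sysfs, so it can be read without any Ceph tooling. A minimal sketch; the `is_rotational` helper and the overridable sysfs-root parameter are illustrative, not part of any OCS or Rook tooling:

```shell
#!/bin/sh
# Print the kernel's rotational flag for a block device.
# "1" means the kernel treats the device as spinning (HDD), "0" as SSD.
# The optional second argument overrides the sysfs root (handy for testing
# against a fake tree).
is_rotational() {
    dev="$1"
    sysroot="${2:-/sys}"
    flag_file="$sysroot/block/$dev/queue/rotational"
    [ -r "$flag_file" ] && cat "$flag_file"
}

# On the node in this report, `is_rotational sdd` prints 1 (matching the
# ROTA column above), which is why ceph-volume classifies the OSD as hdd.
```

This is the same signal `lsblk -o ROTA` reports, so forcing a device class is the only way to override it from above the kernel.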
When the ocs-operator creates the volumeClaimTemplates, it can set an annotation like `crushDeviceClass: ssd`; see https://github.com/rook/rook/blob/master/cluster/examples/kubernetes/ceph/cluster-on-pvc.yaml#L117. Then Rook will invoke ceph-volume correctly. I'm moving this to ocs-operator for a fix.

Should we ask the Red Hat <-> Azure team for some clarifications here?

This bug affects bz# 1848907. See https://bugzilla.redhat.com/show_bug.cgi?id=1848907#c13 for a detailed problem description.

(In reply to Yaniv Kaul from comment #9)
> Should we ask the Red Hat <-> Azure team for some clarifications here?

FYI: this was reported upstream at https://github.com/MicrosoftDocs/azure-docs/issues/13958#issuecomment-430523880 and it seems it didn't get much attention.

Since the work in Rook-Ceph is simply to add an annotation, we can already do the same in ocs-operator. From a development standpoint I believe this can be considered already done; one would just need to specify an appropriately configured PVC template in the StorageCluster. Naturally, all devices in the same device set would have the same CRUSH device type. The catch, of course, is that you can't yet do this with a StorageCluster created from the UI. I'm not sure how/if we want to deal with that for OCS 4.6, and whether that should be a separate BZ.
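For reference, the annotation discussed at the start of this comment sits on the PVC template inside a CephCluster storageClassDeviceSet. A sketch modeled on the linked cluster-on-pvc.yaml example; the set name, size, and storage class name are illustrative, not taken from this cluster:

```yaml
# Fragment of a CephCluster CR (sketch, not a complete manifest).
storage:
  storageClassDeviceSets:
    - name: set1                  # illustrative name
      count: 3
      volumeClaimTemplates:
        - metadata:
            annotations:
              # Force the CRUSH device class instead of trusting the
              # kernel's rotational flag; Rook passes this to ceph-volume.
              crushDeviceClass: ssd
          spec:
            resources:
              requests:
                storage: 2Ti                  # illustrative size
            storageClassName: managed-premium # illustrative Azure SSD class
            volumeMode: Block
            accessModes:
              - ReadWriteOnce
```

Because the annotation lives on the template, every OSD created from this device set gets the same device class, which matches the "all devices in the same device set" point above.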
QE understanding of this issue:

- the kernel reports SSD devices as HDD ones on Azure and ROKS (comment 8)
- it's unclear how to push this upstream to Azure/kernel people (comment 11)
- we can override the detection on the ocs-operator's side, though (comment 12)
- the fix should include an automatic override of device type on affected platforms

Steps to verify the fix, for all affected platforms (Azure, ROKS):

- install OCS on the platform following the docs (standard installation from the OCP Console)
- there should be no need to perform any additional manual tweaks
- check the device type in `ceph osd tree` output

That said, this conflicts with one statement from comment 12:

> The catch, of course, is that you can't yet do this with a StorageCluster created from the UI. I'm not sure how/if we want to deal with that for OCS 4.6, and whether that should be a separate BZ.

I interpret this as a suggestion to have this fixed in 4.7 or later, while for 4.6 we can only come up with a workaround. But then it's not clear to me why we are targeting this to 4.6. QE can't consider this BZ fixed without an automatic override of device type on affected platforms, or an option to override available from the UI. If we want to come up with a workaround for 4.6, we should open a separate BZ for 4.6 where we explain the workaround in more detail, so that both QE and the doc team can understand it. Asking for clarification of the plan.

Based on this, we should retriage this BZ and create a BZ for a 4.6 workaround if needed. This BZ should be targeted to 4.6 only if we actually plan to have it fixed there.

Ah, I see... Yes, that makes perfect sense. While the backend work to allow for the annotation is already done, I don't think we can reliably automate this in the backend as well. We have no means of detecting the type of storage devices we'll be using before we use them. As such, the complete solution will require changes in the UI to allow the admin to specify this information. Given that, moving this to OCS 4.7.
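The `ceph osd tree` verification step above can be scripted. A sketch of a small helper that extracts the device class of every OSD from the JSON form of the output (assumes `jq` is available and that the JSON schema contains a `nodes` array with `type` and `device_class` fields, as recent Ceph releases emit; the helper name is illustrative):

```shell
#!/bin/sh
# Given `ceph osd tree --format json` on stdin, print the device class of
# every OSD node, one per line. After the override is in place on Azure/ROKS
# every line should read "ssd"; "hdd" means the kernel's rotational flag is
# still being trusted.
#
# Typical use against a live cluster (toolbox pod name is an assumption):
#   oc -n openshift-storage exec <rook-ceph-tools-pod> -- \
#       ceph osd tree --format json | osd_classes
osd_classes() {
    jq -r '.nodes[] | select(.type == "osd") | .device_class'
}
```

Piping the JSON rather than grepping the human-readable table avoids depending on column alignment, which varies between Ceph versions.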
Do we need to clone this bug for the UI implementation?

(In reply to Jose A. Rivera from comment #16)
> Ah, I see... Yes, that makes perfect sense. While the backend work to allow
> for the annotation is already done, I don't think we can reliably automate
> this in the backend as well. We have no means of detecting the type of
> storage devices we'll be using before we use them. As such, the complete
> solution will require changes in the UI to allow the admin to specify this
> information.

Even for the backend to work, we need https://github.com/rook/rook/pull/6303, which is not available in OCS 4.6? Sebastien, can you confirm?

> Given that, moving this to OCS 4.7. Do we need to clone this bug for the UI
> implementation?

(In reply to Sahina Bose from comment #19)
> Even for the backend to work, we need https://github.com/rook/rook/pull/6303
> which is not available in OCS 4.6?
> Sebastien, can you confirm?

I confirm, initially the target was 4.7. We will use bug 1903973 to set options related to SSD. Removing 4.6.z from this one.

Pulkit, do we now set deviceClass to SSD by default too?

This can be achieved via the CLI (please follow the yaml in https://bugzilla.redhat.com/show_bug.cgi?id=1873161#c25). If we need this from the UI, we need to raise a UI BZ.

Per the understanding stated in comment 15, the QE team can't consider this bug fixed if it doesn't work when OCS is installed via the OCP Console as usual.
Based on that, I'm challenging the state of the bug and moving it back to ASSIGNED. That said, I cloned this bug to OCP Console as BZ 1928197 to make sure the changes in the UI are properly tracked.

This needs more discussion on how to enable this option for the end user. As of now, enabling it via the UI is not an option because of the complexities involved. We might have to see if something can be done from the backend; moving it out of 4.7.

It is important to note that changing the device class could be disruptive to existing pools that are using the previous device class. So while Rook can update the device class, it is better to create the cluster in the first place with the correct device class, to ensure pools are also created with the correct device class. Acking for the Rook changes. If the deviceClass is being set by the OCS operator or UI during pool creation, they will need a separate BZ for updating their device class.

Wait, before acking I think this needs more discussion. The last comment would only be needed if we actually need to support changing the crushDeviceClass on the storageClassDeviceSet. We really should be setting the expected crushDeviceClass property correctly from the start instead of modifying it later. For example, if the OCS operator is setting tuneFastDeviceClass for Azure, it could also set the crushDeviceClass to ssd at the same time. Pulkit, in that case should this be moved back to the OCS operator? Supporting changing the crushDeviceClass is more challenging since it could affect existing pools.

To clarify the scenario:

- We are seeing hdds instead of ssds due to incorrect reporting from the kernel. Any fix we implement is a workaround.
- The deviceClass needs to be initialized correctly with a new cluster. Changing the deviceClass after the cluster is initialized is not a best practice, as it could affect existing pools.
- The device class can be overridden in the CephCluster CR with the `crushDeviceClass` annotation on the storageClassDeviceSet. It can already be overridden from the StorageCluster CR by setting the `deviceClass` property.
- The UI does not currently set the deviceClass as needed for Azure and IBM to override the incorrect value.

@Elvir, @Martin: What is the real severity of this issue? It seems very low. Are users seeing any impact from this change, or is it just an incorrect detail observed in the OSD properties? Users should never see this detail. Since we are setting tuneFastDeviceClass, we should be seeing the same behavior as with ssds; it is just showing up as the wrong type. If there is no other side effect, let's close it as won't fix.

If I've missed the impact and a fix is still needed, it seems the UI needs to set the deviceClass in the StorageCluster CR, or else the OCS operator needs to override the deviceClass when needed. Rook doesn't have the context to make this change. Moving to the OCS operator to consider whether there is anything that can be done at that level, or else it should be closed in favor of the UI bug: https://bugzilla.redhat.com/show_bug.cgi?id=1928197

I agree with Travis' assessment that, based on the discussions in this BZ, this is not worth our time to fix. Closing this as WONTFIX. If for some reason this causes actual problems in the future, we can re-open this issue.

The resolution of this bug is suboptimal, but at least performance is not impacted, since we now force ssd tuning on Azure and ROKS via BZ 1903973.

The needinfo request[s] on this closed bug have been removed as they have been unresolved for 500 days.
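For completeness, the StorageCluster-level override mentioned in the first bullet would look roughly like this. A sketch only: the exact placement of `deviceClass` under `storageDeviceSets`, plus the set name, size, and storage class name, are assumptions based on the ocs-operator schema rather than a configuration taken from this bug:

```yaml
# Fragment of a StorageCluster CR (sketch, not a complete manifest).
apiVersion: ocs.openshift.io/v1
kind: StorageCluster
metadata:
  name: ocs-storagecluster
  namespace: openshift-storage
spec:
  storageDeviceSets:
    - name: ocs-deviceset    # illustrative name
      count: 1
      replica: 3
      # Override the CRUSH device class so Azure/ROKS disks are treated
      # as ssd even though the kernel reports them as rotational.
      deviceClass: ssd
      dataPVCTemplate:
        spec:
          storageClassName: managed-premium   # illustrative
          accessModes:
            - ReadWriteOnce
          resources:
            requests:
              storage: 2Ti                    # illustrative
          volumeMode: Block
```

Ceph also allows reclassifying an already-deployed OSD by hand (`ceph osd crush rm-device-class osd.N` followed by `ceph osd crush set-device-class ssd osd.N`), but per the discussion above that path is discouraged once pools exist on the old class; setting the class at install time is the safe option.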