Bug 1764545
Summary: | Problem cloning a volume larger than 500GB | |
---|---|---|---
Product: | Red Hat OpenStack | Reporter: | Andre <afariasa>
Component: | openstack-cinder | Assignee: | Eric Harney <eharney>
Status: | CLOSED ERRATA | QA Contact: | Tzach Shefi <tshefi>
Severity: | high | Docs Contact: | Chuck Copello <ccopello>
Priority: | medium | |
Version: | 13.0 (Queens) | CC: | ealcaniz, eharney, mmethot, pmorey
Target Milestone: | z11 | Keywords: | Triaged, ZStream
Target Release: | 13.0 (Queens) | |
Hardware: | Unspecified | |
OS: | Unspecified | |
Whiteboard: | | |
Fixed In Version: | openstack-cinder-12.0.8-5.el7ost | Doc Type: | If docs needed, set a value
Doc Text: | | Story Points: | ---
Clone Of: | | Environment: |
Last Closed: | 2020-03-10 11:25:28 UTC | Type: | Bug
Regression: | --- | Mount Type: | ---
Documentation: | --- | CRM: |
Verified Versions: | | Category: | ---
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: |
Cloudforms Team: | --- | Target Upstream Version: |
Embargoed: | | |
Bug Depends On: | | |
Bug Blocks: | 1796694 | |
Description Andre 2019-10-23 09:52:00 UTC
Hi, I was checking the code for the customer's version [1], and it already has the workaround that increases cpu_time to 30 [2]. But the command that raises the issue has the CPU limit set to 8 [3]. Is this an issue? It seems it should reflect the cpu_time set in the code, but that's not the case.

[1]
~~~
$ grep -ir nova pollux-tds-controller-1/sos_commands/rpm/sh_-c_rpm_--nodigest_-qa_--qf_NAME_-_VERSION_-_RELEASE_._ARCH_INSTALLTIME_date_awk_-F_printf_-59s_s_n_1_2_sort_-V
openstack-nova-api-17.0.9-9.el7ost.noarch
openstack-nova-common-17.0.9-9.el7ost.noarch
openstack-nova-compute-17.0.9-9.el7ost.noarch
openstack-nova-conductor-17.0.9-9.el7ost.noarch
openstack-nova-console-17.0.9-9.el7ost.noarch
openstack-nova-migration-17.0.9-9.el7ost.noarch
openstack-nova-novncproxy-17.0.9-9.el7ost.noarch
openstack-nova-placement-api-17.0.9-9.el7ost.noarch
openstack-nova-scheduler-17.0.9-9.el7ost.noarch
~~~

[2]
~~~
QEMU_IMG_LIMITS = processutils.ProcessLimits(
    cpu_time=30,
    address_space=1 * units.Gi)
~~~

[3]
~~~
2019-10-21 14:25:28.790 76 ERROR cinder.volume.manager Command: /usr/bin/python2 -m oslo_concurrency.prlimit --as=1073741824 --cpu=8 -- env LC_ALL=C qemu-img info /var/lib/cinder/mnt/[OMITTED]
~~~

Are you looking at the code in nova or cinder? Both have QEMU_IMG_LIMITS that are applied for these calls.

I was checking this in the nova code; do we have it in both places? If so, where is the nova variable being used? I want to try increasing this value so the customer can test it; how should we proceed?

(In reply to Andre from comment #3)
> I was checking this in the nova code; do we have it in both places? If so,
> where is the nova variable being used?
> I want to try increasing this value so the customer can test it; how should
> we proceed?

The error shown in the description and comment #1 was from the cinder volume manager, so the nova code would not be relevant. The Nova code also already has higher limits than Cinder does. Changing it in Cinder and restarting cinder-volume should help. If this works, we can work toward getting this patch into the relevant branches: https://review.opendev.org/#/c/691901/

How should we proceed with testing? Since it requires a change in the code, I need an engineering patch, right?

Hi Eric, could you help us with this topic, please? The customer has a production environment, so should we provide a hotfix or test package to the customer instead of manual changes?

I am trying to get https://review.opendev.org/#/c/691901/ merged into upstream master, and will provide a hotfix package once we at least get it merged there.
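For reference, here is a minimal sketch of how oslo.concurrency applies these limits to a qemu-img call. This is not a copy of the Cinder source; the function name and exact call shape are illustrative. The limit values mirror the prlimit arguments in the failing command above (--as=1073741824 --cpu=8), and raising cpu_time in the Cinder-side constant is what the proposed patch does:

~~~
# Minimal sketch (not the exact Cinder source): how oslo.concurrency wraps a
# qemu-img call with CPU-time and address-space limits. The values below
# mirror the failing command in this bug (--as=1073741824 --cpu=8).
from oslo_concurrency import processutils
from oslo_utils import units

QEMU_IMG_LIMITS = processutils.ProcessLimits(
    cpu_time=8,                    # seconds of CPU time allowed for qemu-img
    address_space=1 * units.Gi)    # 1 GiB of address space


def qemu_img_info(path):
    # When a prlimit is supplied, processutils.execute re-executes the command
    # under "python -m oslo_concurrency.prlimit", which is exactly the wrapper
    # visible in the cinder.volume.manager error log above.
    out, _err = processutils.execute(
        'env', 'LC_ALL=C', 'qemu-img', 'info', path,
        prlimit=QEMU_IMG_LIMITS)
    return out
~~~

On a large source volume, "qemu-img info" can exceed the 8-second CPU-time limit, which is why the clone fails and why the fix raises cpu_time on the Cinder side.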
Waiting for a newer puddle. Puddle 2020-01-15.3 resulted in a pre-Fixed-In version: openstack-cinder-12.0.8-3.el7ost < openstack-cinder-12.0.8-5.el7ost.

Verified on: openstack-cinder-12.0.10-2.el7ost.noarch

Using a K2 iSCSI-backed 500G volume:

~~~
cinder create 500 --name K2_500G

(overcloud) [stack@undercloud-0 ~]$ cinder show 7bace336-d2cc-4530-864e-e4f455e73eb1
+--------------------------------+------------------------------------------+
| Property                       | Value                                    |
+--------------------------------+------------------------------------------+
| attached_servers               | ['2ff82e12-e95a-45e5-ad28-4bd6d840e9e7'] |
| attachment_ids                 | ['def3d853-c19f-4ec6-82b8-92e09131873f'] |
| availability_zone              | nova                                     |
| bootable                       | false                                    |
| consistencygroup_id            | None                                     |
| created_at                     | 2020-02-09T15:52:44.000000               |
| description                    | None                                     |
| encrypted                      | False                                    |
| id                             | 7bace336-d2cc-4530-864e-e4f455e73eb1     |
| metadata                       | attached_mode : rw                       |
| migration_status               | None                                     |
| multiattach                    | False                                    |
| name                           | K2_500G                                  |
| os-vol-host-attr:host          | controller-0@k2iscsi#k2iscsi             |
| os-vol-mig-status-attr:migstat | None                                     |
| os-vol-mig-status-attr:name_id | None                                     |
| os-vol-tenant-attr:tenant_id   | 2cd01b0fe6c644a48cbfa6da5a03d25b         |
| replication_status             | None                                     |
| size                           | 500                                      |
| snapshot_id                    | None                                     |
| source_volid                   | None                                     |
| status                         | in-use                                   |
| updated_at                     | 2020-02-09T15:53:31.000000               |
| user_id                        | b8c1ccd7e02c4f22ad8929f1cb5fcaba         |
| volume_type                    | tripleo                                  |
+--------------------------------+------------------------------------------+
~~~

Attached it to an instance and filled it with random data:

~~~
nova volume-attach 2ff82e12-e95a-45e5-ad28-4bd6d840e9e7 7bace336-d2cc-4530-864e-e4f455e73eb1

# lsblk
NAME     MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
vda      253:0    0     1G  0 disk
|-vda1   253:1    0  1015M  0 part /
`-vda15  253:15   0     8M  0 part
vdb      253:16   0   500G  0 disk /root/kuku

# df -h
Filesystem                Size      Used Available Use% Mounted on
/dev                    240.1M         0    240.1M   0% /dev
/dev/vda1               978.9M     23.9M    914.2M   3% /
tmpfs                   244.2M         0    244.2M   0% /dev/shm
tmpfs                   244.2M     88.0K    244.1M   0% /run
/dev/vdb                492.0G    466.4G    652.5M 100% /root/kuku
~~~

Now let's detach the volume and clone it:

~~~
nova volume-detach 2ff82e12-e95a-45e5-ad28-4bd6d840e9e7 7bace336-d2cc-4530-864e-e4f455e73eb1

cinder create 501 --source-volid 7bace336-d2cc-4530-864e-e4f455e73eb1 --name 501G_ClonedVolume
+--------------------------------+--------------------------------------+
| Property                       | Value                                |
+--------------------------------+--------------------------------------+
| attachments                    | []                                   |
| availability_zone              | nova                                 |
| bootable                       | false                                |
| consistencygroup_id            | None                                 |
| created_at                     | 2020-02-10T05:36:30.000000           |
| description                    | None                                 |
| encrypted                      | False                                |
| id                             | b23cd5d0-2579-4ac4-a1dc-a04863319497 |
| metadata                       | {}                                   |
| migration_status               | None                                 |
| multiattach                    | False                                |
| name                           | 501G_ClonedVolume                    |
| os-vol-host-attr:host          | controller-0@k2iscsi#k2iscsi         |
| os-vol-mig-status-attr:migstat | None                                 |
| os-vol-mig-status-attr:name_id | None                                 |
| os-vol-tenant-attr:tenant_id   | 2cd01b0fe6c644a48cbfa6da5a03d25b     |
| replication_status             | None                                 |
| size                           | 501                                  |
| snapshot_id                    | None                                 |
| source_volid                   | 7bace336-d2cc-4530-864e-e4f455e73eb1 |
| status                         | creating                             |
| updated_at                     | 2020-02-10T05:36:31.000000           |
| user_id                        | b8c1ccd7e02c4f22ad8929f1cb5fcaba     |
| volume_type                    | tripleo                              |
+--------------------------------+--------------------------------------+
~~~

Wait a while for the clone operation to finish:

~~~
(overcloud) [stack@undercloud-0 ~]$ cinder list
+--------------------------------------+-----------+-------------------+------+-------------+----------+--------------------------------------+
| ID                                   | Status    | Name              | Size | Volume Type | Bootable | Attached to                          |
+--------------------------------------+-----------+-------------------+------+-------------+----------+--------------------------------------+
| 7bace336-d2cc-4530-864e-e4f455e73eb1 | available | K2_500G           | 500  | tripleo     | false    |                                      |
| b23cd5d0-2579-4ac4-a1dc-a04863319497 | in-use    | 501G_ClonedVolume | 501  | tripleo     | false    | 2ff82e12-e95a-45e5-ad28-4bd6d840e9e7 | -> cloned volume
+--------------------------------------+-----------+-------------------+------+-------------+----------+--------------------------------------+
~~~

Attach the cloned volume to the instance and check that we have the same data on both:

~~~
# nova volume-attach 2ff82e12-e95a-45e5-ad28-4bd6d840e9e7 b23cd5d0-2579-4ac4-a1dc-a04863319497
~~~

Looking inside, all the data is there; a 500G volume was cloned successfully. Good to verify.
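As an extra sanity check, here is a minimal sketch, assuming it is run in an environment where the fixed openstack-cinder package is importable (for example on the cinder-volume host or inside its container) and that Cinder exposes the limit as image_utils.QEMU_IMG_LIMITS. It prints the qemu-img limits actually in effect so they can be compared with the --cpu=8 seen in the original failure:

~~~
# Sanity-check sketch (assumption: the fixed openstack-cinder package is
# importable in this environment). Prints the limits Cinder applies to
# qemu-img for comparison with the --cpu=8 in the original error log.
from cinder.image import image_utils

limits = image_utils.QEMU_IMG_LIMITS
print("qemu-img cpu_time limit: %s seconds" % limits.cpu_time)
print("qemu-img address_space limit: %s bytes" % limits.address_space)
~~~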
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:0764