Bug 1302032 - Large image is downloaded from glance during the volume creation and fails
Summary: Large image is downloaded from glance during the volume creation and fails
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-glance
Version: 5.0 (RHEL 7)
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: 5.0 (RHEL 7)
Assignee: Flavio Percoco
QA Contact: nlevinki
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2016-01-26 15:10 UTC by Jeremy
Modified: 2023-09-14 03:16 UTC
CC List: 5 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-02-22 06:11:03 UTC
Target Upstream Version:
Embargoed:



Description Jeremy 2016-01-26 15:10:41 UTC
Description of problem: Trying to build a cinder volume using an image. The image is about 17 GB.
The volume we are trying to create is 100 GB.

Intermittently, the request fails
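
The volume is presumably being created from the image with a command along the lines of the following (the IDs, name, and volume type here are placeholders, not taken from this environment):

cinder create --image-id <image-id> --volume-type <type> --display-name <volume-name> 100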


Version-Release number of selected component (if applicable):


How reproducible:
75%

Steps to Reproduce:
1.
2.
3.

Actual results:


Expected results:


Additional info:
There is no LB. Three controllers are managed by pacemaker.
Glance has a VIP configured, and I believe it is managed by HAProxy.

When a request fails, the glance logs show:
2016-01-25 11:33:06.863 22647 INFO glance.wsgi.server [d552da60-551f-47b8-9295-ddbba514a047 e998e2dabe8e4d1eb2a6b778b4422c01 24341f87ac894f78970abef5231f1a67 - - -] Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/eventlet/wsgi.py", line 406, in handle_one_response
    write(''.join(towrite))
  File "/usr/lib/python2.7/site-packages/eventlet/wsgi.py", line 354, in write
    _writelines(towrite)
  File "/usr/lib64/python2.7/socket.py", line 334, in writelines
    self.flush()
  File "/usr/lib64/python2.7/socket.py", line 303, in flush
    self._sock.sendall(view[write_offset:write_offset+buffer_size])
  File "/usr/lib/python2.7/site-packages/eventlet/greenio.py", line 309, in sendall
    tail = self.send(data, flags)
  File "/usr/lib/python2.7/site-packages/eventlet/greenio.py", line 295, in send
    total_sent += fd.send(data[total_sent:], flags)
error: [Errno 32] Broken pipe

The same request issued again will succeed.

Comment 2 Jeremy 2016-01-26 15:33:11 UTC
nova volume-list
+--------------------------------------+-----------+-----------------------+------+-------------+--------------------------------------+
| ID                                   | Status    | Display Name          | Size | Volume Type | Attached to                          |
+--------------------------------------+-----------+-----------------------+------+-------------+--------------------------------------+
| 664c62b8-ec7a-4431-9978-7eef71e31c14 | error     | ceph_win2012_vol      | 100  | ceph        |                                      |
| 82ea6e36-4139-4e5e-8daf-4b10ad5a942e | in-use    |                       | 100  | nfs         | 30db6f61-68fb-4221-ace7-0b1106c58456 |
| 00b543bf-d1d2-4ff2-ad38-db4a36a46e09 | available | vol-windows-2012-08   | 100  | nfs         |                                      |
| 2e3c5c10-2a0f-44d0-8b3b-571a84068c50 | available | vol-windows-2012-07   | 100  | nfs         |                                      |
| 6b631cf3-c113-4d17-b009-1e8b71569b4b | available | vol-windows-2012-06   | 100  | nfs         |                                      |
| eb48f28a-0051-488b-946a-30d0664b6f2e | error     | vol-windows-2012-05   | 100  | nfs         |                                      |
| 91d008cb-7475-4741-b3f1-9f3b0d8e336b | available | vol-windows-2012-04   | 100  | nfs         |                                      |
| 9666e2ad-6369-4ee2-87bc-7fa309f8dc15 | in-use    | bvpkura_win_02        | 100  | nfs         | f68562cd-69f8-4fa0-ba21-dc53b103b65a |
| 7c2f50d9-4b41-4ef5-aec4-9922459aa8f8 | available | test-nfs-win-vol02    | 100  | nfs         |                                      |
| 88366c98-0ffd-4190-87e5-b5c59eac3551 | available | test-ceph-linux-vol01 | 40   | ceph        |                                      |
| a2527631-1945-4c10-80e8-45bfc562d38f | in-use    | bvpkwin01_nfs         | 100  | nfs         | d9021577-4f10-451f-a72b-424187ad6d8c |
| 81c0094f-6f4e-43c1-8d08-956e893a8465 | in-use    | bvpklinux01_nfs       | 25   | nfs         | 6c2f1b1f-0b33-4389-8bd4-d4f82ddf473f |
+--------------------------------------+-----------+-----------------------+------+-------------+--------------------------------------+

The following is the failed one in the list:
eb48f28a-0051-488b-946a-30d0664b6f2e | error     | vol-windows-2012-05   | 100  | nfs         





[root@xlabostkctrl1 scripts(pkura_lab_hfd)]# glance image-list
+--------------------------------------+-------------------------+-------------+------------------+-------------+--------+
| ID                                   | Name                    | Disk Format | Container Format | Size        | Status |
+--------------------------------------+-------------------------+-------------+------------------+-------------+--------+
| c993bb22-57c3-46ba-884e-4cb8621f817b | WIN2012-OS-IMG          | qcow2       | bare             | 18008834048 | active |
+--------------------------------------+-------------------------+-------------+------------------+-------------+--------+

Comment 3 Flavio Percoco 2016-01-27 12:37:09 UTC
@Jeremy

The logs are not in the collab-shell. Could you upload them there?

Also, the traceback in the description is incomplete; could you paste the full traceback?

Comment 4 Flavio Percoco 2016-01-27 13:50:41 UTC
I found the logs in the case.

Broken pipes normally happen when the connection drops on the other end. Looking at the logs, it seems this environment has serious network issues, as there are *many* broken pipes in the nova/cinder logs. It also loses connections to MySQL.

I'd recommend debugging that first.
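
To get a rough sense of the scale, something like the following (assuming the default log locations on the controllers) counts the broken pipes and MySQL disconnects per log file:

grep -c "Broken pipe" /var/log/glance/*.log /var/log/nova/*.log /var/log/cinder/*.log
grep -ci "gone away\|Lost connection to MySQL" /var/log/glance/*.log /var/log/nova/*.log /var/log/cinder/*.log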

Comment 5 Jeremy 2016-01-27 21:07:51 UTC
I'm seeing this in /var/log/cinder/volume.log:

2016-01-27 14:52:35.853 16078 ERROR cinder.volume.flows.manager.create_volume [req-83e7003f-8d6c-4bb1-ab29-00b6407877a8 e998e2dabe8e4d1eb2a6b778b4422c01 24341f87ac894f78970abef5231f1a67 - - -] Failed to copy image f5618dd7-cdfb-4b04-80c2-26cc75033860 to volume: 23206ea2-615c-419b-af03-42a40e195d67, error: [Errno 32] Corrupted image. Checksum was f8840d9261d652b8366b62c8302231f4 expected 8cc30337d32a01957c48b9d91e419866
2016-01-27 14:52:35.856 16078 DEBUG cinder.volume.flows.common [req-83e7003f-8d6c-4bb1-ab29-00b6407877a8 e998e2dabe8e4d1eb2a6b778b4422c01 24341f87ac894f78970abef5231f1a67 - - -] Updating volume: 23206ea2-615c-419b-af03-42a40e195d67 with {'status': 'error'} due to: ??? error_out_volume /usr/lib/python2.7/site-packages/cinder/volume/flows/common.py:87
2016-01-27 14:52:35.893 16078 DEBUG cinder.openstack.common.periodic_task [-] Running periodic task VolumeManager._publish_service_capabilities run_periodic_tasks /usr/lib/python2.7/site-packages/cinder/openstack/common/periodic_task.py:178
2016-01-27 14:52:35.893 16078 DEBUG cinder.manager [-] Notifying Schedulers of capabilities ... _publish_service_capabilities /usr/lib/python2.7/site-packages/cinder/manager.py:128
2016-01-27 14:52:35.894 16078 WARNING cinder.openstack.common.loopingcall [-] task run outlasted interval by 32.45048 sec
2016-01-27 14:52:35.895 16078 ERROR cinder.volume.flows.manager.create_volume [req-83e7003f-8d6c-4bb1-ab29-00b6407877a8 e998e2dabe8e4d1eb2a6b778b4422c01 24341f87ac894f78970abef5231f1a67 - - -] Volume 23206ea2-615c-419b-af03-42a40e195d67: create failed

Comment 6 Sergey Gotliv 2016-01-28 08:55:16 UTC
The data integrity check most probably fails because the image is only partially downloaded, but just to be on the safe side, let's verify that the image in Glance is not corrupted. According to the logs they use the Glance file store, so it's relatively easy to confirm: just find the image on disk and run md5sum against it.
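
For example, assuming the default filesystem store path of /var/lib/glance/images (the path and <image-id> below are placeholders for this environment):

# checksum recorded by glance for the image
glance image-show <image-id> | grep checksum

# checksum of the copy on disk in the file store
md5sum /var/lib/glance/images/<image-id>

The two values should match if the stored image is intact.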

The logs support the partial-download theory; it looks like the client terminated the connection at some point, either due to a networking issue or maybe because of a timeout somewhere. Does the glance API work behind haproxy? If it does, can we get the haproxy logs?
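
If it does sit behind haproxy, the client and server timeouts on the glance-api listener are also worth checking, since streaming a ~17 GB image can easily outlast a short timeout and would produce exactly this kind of broken pipe. Something along these lines (the config path and listener name are assumptions based on a typical deployment, so adjust as needed):

# show the glance listener block and any timeout settings
grep -A 15 "listen glance" /etc/haproxy/haproxy.cfg
grep -E "timeout +(client|server)" /etc/haproxy/haproxy.cfg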

Comment 8 Red Hat Bugzilla 2023-09-14 03:16:47 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 1000 days

