Bug 1532537 - StreamingAPI - Upload/Download disk via UI/SDK does not work -Disk stuck at 'paused by system' state - Failed to add image ticket to ovirt-imageio-proxy
Summary: StreamingAPI - Upload/Download disk via UI/SDK does not work -Disk stuck at ...
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: ovirt-engine
Classification: oVirt
Component: BLL.Storage
Version: 4.2.1
Hardware: Unspecified
OS: Unspecified
urgent
urgent vote
Target Milestone: ovirt-4.2.1
: ---
Assignee: Daniel Erez
QA Contact: Avihai
URL:
Whiteboard:
Depends On:
Blocks: 1122970 1530730
TreeView+ depends on / blocked
 
Reported: 2018-01-09 09:34 UTC by Avihai
Modified: 2018-02-12 11:52 UTC (History)
7 users (show)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-02-12 11:52:14 UTC
oVirt Team: Storage
rule-engine: ovirt-4.2+
rule-engine: blocker+


Attachments (Terms of Use)
engine , vdsm ,image_proxy,image_demon logs (962.57 KB, application/x-gzip)
2018-01-09 09:34 UTC, Avihai
no flags Details

Description Avihai 2018-01-09 09:34:41 UTC
Created attachment 1378921 [details]
engine , vdsm ,image_proxy,image_demon logs

Description of problem:
Trying to upload/download disk image via UI/SDK does not work -Disk stuck at 'paused by system' state.

At engine it looks like its fails to add image ticket to ovirt-imageio-proxy.

No other indication at other VDSM/proxy logs.

Engine:

Download :
2018-01-09 11:03:59,259+02 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.PrepareImageVDSCommand] (default task-30) [dd51678c-fea8-4913-adb8-377e67c722ef] START, PrepareImageVDSCommand(HostName = host_mixed_3,
 PrepareImageVDSCommandParameters:{hostId='5046c759-c25e-4ad0-9cd3-1737979f3aad'}), log id: 14291a69
2018-01-09 11:04:01,230+02 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.PrepareImageVDSCommand] (default task-30) [dd51678c-fea8-4913-adb8-377e67c722ef] FINISH, PrepareImageVDSCommand, return: PrepareImageRe
turn:{status='Status [code=0, message=Done]'}, log id: 14291a69
2018-01-09 11:04:01,236+02 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.AddImageTicketVDSCommand] (default task-30) [dd51678c-fea8-4913-adb8-377e67c722ef] START, AddImageTicketVDSCommand(HostName = host_mixe
d_3, AddImageTicketVDSCommandParameters:{hostId='5046c759-c25e-4ad0-9cd3-1737979f3aad', ticketId='eb6a5bfb-92de-4a5d-9234-1884b40f4f0e', timeout='300', operations='[read]', size='1073741824', url='file:///rhev/d
ata-center/mnt/blockSD/5820c57b-a8fb-4169-9b08-bcf28753c4f6/images/fab59736-1764-45dd-8f89-cff62d767a1d/2df52b69-10d9-42bd-8492-b2b8ab973e46', filename='download_test.qcow2'}), log id: 7f85c47
2018-01-09 11:04:01,267+02 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.AddImageTicketVDSCommand] (default task-30) [dd51678c-fea8-4913-adb8-377e67c722ef] FINISH, AddImageTicketVDSCommand, return: StatusOnly
Return [status=Status [code=0, message=Done]], log id: 7f85c47
2018-01-09 11:04:01,267+02 INFO  [org.ovirt.engine.core.bll.storage.disk.image.TransferDiskImageCommand] (default task-30) [dd51678c-fea8-4913-adb8-377e67c722ef] Started transfer session with ticket id eb6a5bfb-
92de-4a5d-9234-1884b40f4f0e, timeout 300 seconds
2018-01-09 11:04:01,267+02 INFO  [org.ovirt.engine.core.bll.storage.disk.image.TransferDiskImageCommand] (default task-30) [dd51678c-fea8-4913-adb8-377e67c722ef] Adding image ticket to ovirt-imageio-proxy, id eb
6a5bfb-92de-4a5d-9234-1884b40f4f0e
2018-01-09 11:04:01,301+02 ERROR [org.ovirt.engine.core.bll.storage.disk.image.TransferDiskImageCommand] (default task-30) [dd51678c-fea8-4913-adb8-377e67c722ef] Failed to add image ticket to ovirt-imageio-proxy
2018-01-09 11:04:01,303+02 INFO  [org.ovirt.engine.core.bll.storage.disk.image.ImageTransferUpdater] (default task-30) [dd51678c-fea8-4913-adb8-377e67c722ef] Updating image transfer d4ac5a91-790a-4c12-8760-624d9
7b5e0c8 (image fab59736-1764-45dd-8f89-cff62d767a1d) phase to Paused by System


Upload:
2018-01-09 11:14:18,066+02 INFO  [org.ovirt.engine.core.bll.storage.disk.image.TransferImageStatusCommand] (default task-13) [f275d1ff-e0ac-4a47-9235-b6f39f6e72d3] Running command: TransferImageStatusCommand int
ernal: false. Entities affected :  ID: aaa00000-0000-0000-0000-123456789aaa Type: SystemAction group CREATE_DISK with role type USER
2018-01-09 11:14:18,424+02 INFO  [org.ovirt.engine.core.vdsbroker.irsbroker.SetVolumeLegalityVDSCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-94) [26244577-0939-4fcd-a50a-eb66f8b62a49] FINISH, SetVolu
meLegalityVDSCommand, log id: 3996d490
2018-01-09 11:14:18,427+02 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.AddImageTicketVDSCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-94) [26244577-0939-4fcd-a50a-eb66f8b62a49] START, AddImageTic
ketVDSCommand(HostName = host_mixed_1, AddImageTicketVDSCommandParameters:{hostId='ac4512aa-0551-4670-b3ae-8f69ad6d90b8', ticketId='d88ca88a-ffcb-46d0-9e77-19a711d4a81c', timeout='300', operations='[write]', siz
e='1073741824', url='file:///rhev/data-center/mnt/blockSD/5820c57b-a8fb-4169-9b08-bcf28753c4f6/images/d0230b0e-2c6a-4f22-a673-301931d4d7e7/315339c0-40d1-47d0-918a-48a24f56fb86', filename='null'}), log id: 5d21c6
bf
2018-01-09 11:14:18,446+02 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.AddImageTicketVDSCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-94) [26244577-0939-4fcd-a50a-eb66f8b62a49] FINISH, AddImageTi
cketVDSCommand, return: StatusOnlyReturn [status=Status [code=0, message=Done]], log id: 5d21c6bf
2018-01-09 11:14:18,446+02 INFO  [org.ovirt.engine.core.bll.storage.disk.image.TransferDiskImageCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-94) [26244577-0939-4fcd-a50a-eb66f8b62a49] Started transfe
r session with ticket id d88ca88a-ffcb-46d0-9e77-19a711d4a81c, timeout 300 seconds
2018-01-09 11:14:18,446+02 INFO  [org.ovirt.engine.core.bll.storage.disk.image.TransferDiskImageCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-94) [26244577-0939-4fcd-a50a-eb66f8b62a49] Adding image ti
cket to ovirt-imageio-proxy, id d88ca88a-ffcb-46d0-9e77-19a711d4a81c
2018-01-09 11:14:18,488+02 ERROR [org.ovirt.engine.core.bll.storage.disk.image.TransferDiskImageCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-94) [26244577-0939-4fcd-a50a-eb66f8b62a49] Failed to add image ticket to ovirt-imageio-proxy
2018-01-09 11:14:18,490+02 INFO  [org.

VDSM:
Download -  host_mixed_3

upload -  host_mixed_1:
2018-01-09 11:14:18,433+0200 INFO  (jsonrpc/3) [vdsm.api] START add_image_ticket(ticket={'url': 'file:///rhev/data-center/mnt/blockSD/5820c57b-a8fb-4169-9b08-bcf28753c4f6/images/d0230b0e-2c6a-4f22-a673-301931d4d
7e7/315339c0-40d1-47d0-918a-48a24f56fb86', 'size': 1073741824, 'uuid': 'd88ca88a-ffcb-46d0-9e77-19a711d4a81c', 'timeout': 300, 'ops': ['write']}) from=::ffff:10.35.161.118,33660, flow_id=26244577-0939-4fcd-a50a-
eb66f8b62a49, task_id=f369793f-02f1-44fa-8200-27f14ecb3fe9 (api:46)
2018-01-09 11:14:18,434+0200 DEBUG (jsonrpc/3) [storage.imagetickets] Sending request method='PUT', ticket='d88ca88a-ffcb-46d0-9e77-19a711d4a81c', body='{"url": "file:///rhev/data-center/mnt/blockSD/5820c57b-a8f
b-4169-9b08-bcf28753c4f6/images/d0230b0e-2c6a-4f22-a673-301931d4d7e7/315339c0-40d1-47d0-918a-48a24f56fb86", "size": 1073741824, "uuid": "d88ca88a-ffcb-46d0-9e77-19a711d4a81c", "timeout": 300, "ops": ["write"]}' 
(imagetickets:78)
2018-01-09 11:14:18,439+0200 INFO  (jsonrpc/3) [vdsm.api] FINISH add_image_ticket return=None from=::ffff:10.35.161.118,33660, flow_id=26244577-0939-4fcd-a50a-eb66f8b62a49, task_id=f369793f-02f1-44fa-8200-27f14e
cb3fe9 (api:52)
2018-01-09 11:14:18,440+0200 DEBUG (jsonrpc/3) [storage.TaskManager.Task] (Task='f369793f-02f1-44fa-8200-27f14ecb3fe9') finished: None (task:1201)
2018-01-09 11:14:18,441+0200 DEBUG (jsonrpc/3) [storage.TaskManager.Task] (Task='f369793f-02f1-44fa-8200-27f14ecb3fe9') moving from state preparing -> state finished (task:602)
2018-01-09 11:14:18,441+0200 DEBUG (jsonrpc/3) [storage.ResourceManager.Owner] Owner.releaseAll requests {} resources {} (resourceManager:910)
2018-01-09 11:14:18,442+0200 DEBUG (jsonrpc/3) [storage.ResourceManager.Owner] Owner.cancelAll requests {} (resourceManager:947)
2018-01-09 11:14:18,442+0200 DEBUG (jsonrpc/3) [storage.TaskManager.Task] (Task='f369793f-02f1-44fa-8200-27f14ecb3fe9') ref 0 aborting False (task:1002)
2018-01-09 11:14:18,444+0200 INFO  (jsonrpc/3) [jsonrpc.JsonRpcServer] RPC call Host.add_image_ticket succeeded in 0.01 seconds (__init__:573)
2018-01-09 11:14:20,702+0200 INFO  (jsonrpc/4) [api.host] START getAllVmStats() from=::ffff:10.35.161.118,33660 (api:46)


image-proxy.log:
Download host_mixed_3:
(Thread-9  ) WARNING 2018-01-09 11:04:01,296 web:89:web:(log_response) 10.35.161.118 - PUT /tickets/ 404 127 (0.00s)

Upload - host_mixed_1::
(Thread-10 ) WARNING 2018-01-09 11:14:18,482 web:89:web:(log_response) 10.35.161.118 - PUT /tickets/ 404 127 (0.00s)

deamon log:
Download - host_mixed_3::
2018-01-09 11:04:01,250 INFO    (ticket.server) [tickets] Adding ticket <Ticket active=False expires=4465083 filename=u'download_test.qcow2' ops=[u'read'] size=1073741824 transferred=0 url=u'file:///rhev/data-center/mnt/blockSD/5820c57b-a8fb-4169-9b08-bcf28753c4f6/images/fab59736-1764-45dd-8f89-cff62d767a1d/2df52b69-10d9-42bd-8492-b2b8ab973e46' uuid=u'eb6a5bfb-92de-4a5d-9234-1884b40f4f0e' at 0x28bdf50>
2018-01-09 11:04:01,253 INFO    (ticket.server) [web] / - PUT /eb6a5bfb-92de-4a5d-9234-1884b40f4f0e 200 0 (0.00s)


Upload- host_mixed_1:
2018-01-09 11:14:18,436 INFO    (ticket.server) [tickets] Adding ticket <Ticket active=False expires=4465111 filename=None ops=[u'write'] size=1073741824 transferred=0 url=u'file:///rhev/data-center/mnt/blockSD/5820c57b-a8fb-4169-9b08-bcf28753c4f6/images/d0230b0e-2c6a-4f22-a673-301931d4d7e7/315339c0-40d1-47d0-918a-48a24f56fb86' uuid=u'd88ca88a-ffcb-46d0-9e77-19a711d4a81c' at 0x7fb750158090>
2018-01-09 11:14:18,437 INFO    (ticket.server) [web] / - PUT /d88ca88a-ffcb-46d0-9e77-19a711d4a81c 200 0 (0.00s)


Version-Release number of selected component (if applicable):
Engine
VDSM:4.20.9.3-1

How reproducible:
100% , happens all the time on different engines .

Issue did not occur in previos build , thus regression.

Steps to Reproduce:
1.Via UI/SDK try to upload/download a disk

Actual results:
Upload/Download fails & disks are stuck in 'paused by system' state .

Download/upload can not be cancled & disk cann't be removed - will open a seperate bug for this.

Expected results:


Additional info:
All relevant logs attached.

Comment 1 Avihai 2018-01-09 09:47:31 UTC
Version-Release number of selected component (if applicable):

Engine : 4.2.1-0.2.el7
VDSM:4.20.9.3-1
ovirt-imageio-proxy: 1.2.0-0.el7ev.noarch

Comment 2 Daniel Erez 2018-01-09 13:28:18 UTC
(In reply to Avihai from comment #1)
> Version-Release number of selected component (if applicable):
> 
> Engine : 4.2.1-0.2.el7
> VDSM:4.20.9.3-1
> ovirt-imageio-proxy: 1.2.0-0.el7ev.noarch

Can you please try updating to latest proxy (1.2.1)

Comment 5 Avihai 2018-01-10 09:06:14 UTC
This issue also occurred also in RHEL7.4 (initially tested on RHEL7.5)

Comment 8 Avihai 2018-01-10 10:17:41 UTC
Thank you derez for helping me with the dependency issue, now that we have the new 1.2.1 ovirt-imageio-proxy indeed download & upload looks good .

Comment 9 Yaniv Kaul 2018-01-10 14:13:20 UTC
(In reply to Avihai from comment #8)
> Thank you derez for helping me with the dependency issue, now that we have
> the new 1.2.1 ovirt-imageio-proxy indeed download & upload looks good .

Thanks for testing!

Comment 10 Daniel Erez 2018-01-10 15:20:57 UTC
(In reply to Avihai from comment #8)
> Thank you derez for helping me with the dependency issue, now that we have
> the new 1.2.1 ovirt-imageio-proxy indeed download & upload looks good .

So the issue was a mismatch between the engine and imageio-proxy versions (i.e. engine 1.2.1 with proxy 1.2.0).

Comment 11 Allon Mureinik 2018-01-10 15:27:42 UTC
(In reply to Daniel Erez from comment #10)
> (In reply to Avihai from comment #8)
> > Thank you derez for helping me with the dependency issue, now that we have
> > the new 1.2.1 ovirt-imageio-proxy indeed download & upload looks good .
> 
> So the issue was a mismatch between the engine and imageio-proxy versions
> (i.e. engine 1.2.1 with proxy 1.2.0).
Daniel - can we have an RFE to send the version as part of the protocol and fail "cleanly" if the versions don't match?

Comment 12 Daniel Erez 2018-01-10 15:47:53 UTC
(In reply to Allon Mureinik from comment #11)
> (In reply to Daniel Erez from comment #10)
> > (In reply to Avihai from comment #8)
> > > Thank you derez for helping me with the dependency issue, now that we have
> > > the new 1.2.1 ovirt-imageio-proxy indeed download & upload looks good .
> > 
> > So the issue was a mismatch between the engine and imageio-proxy versions
> > (i.e. engine 1.2.1 with proxy 1.2.0).
> Daniel - can we have an RFE to send the version as part of the protocol and
> fail "cleanly" if the versions don't match?

I can open an RFE for that, but not sure it worth the effort or even makes sense for internal versions.

Comment 13 Avihai 2018-01-15 06:04:05 UTC
(In reply to Daniel Erez from comment #10)
> (In reply to Avihai from comment #8)
> > Thank you derez for helping me with the dependency issue, now that we have
> > the new 1.2.1 ovirt-imageio-proxy indeed download & upload looks good .
> 
> So the issue was a mismatch between the engine and imageio-proxy versions
> (i.e. engine 1.2.1 with proxy 1.2.0).

Daniel, 

Is this new 1.2.1 ovirt-imageio-proxy included in latest downstream build?

I took this build from the upstream path but we (QA) are testing downsteam builds...

Comment 14 Avihai 2018-01-15 06:33:55 UTC
I checked & 1.2.1 ovirt-imageio-proxy  is NOT included in latest downstream build ovirt-engine-4.2.1.1-0.1.el7.noarch .

Please add it as we(QA) are testing only downstream.

Comment 16 Avihai 2018-01-15 08:37:28 UTC
After discussion made between Raz T.& Yaniv K. , I'm reopening as the current latest downstream release(4.2.1-0.2) does not contain the fix.(comments #13 & #14)

Comment 17 Daniel Erez 2018-01-15 08:42:58 UTC
Actually, there was no real fix here. It was just a miss correlation between engine and proxy versions. Hasn't it been resolved for you according to https://bugzilla.redhat.com/show_bug.cgi?id=1532537#c8 ?

Comment 18 Avihai 2018-01-15 09:07:22 UTC
(In reply to Daniel Erez from comment #17)
> Actually, there was no real fix here. It was just a miss correlation between
> engine and proxy versions. Hasn't it been resolved for you according to
> https://bugzilla.redhat.com/show_bug.cgi?id=1532537#c8 ?

Indeed in comment #8 the issue was resolved for me using UPSTREAM rpm's containing new 1.2.1 ovirt-imageio.

I understand that the bug is caused by miss correlation between engine and proxy versions in DOWNSTREAM builds so the fix is to sync it (although it's not writing/correcting code its still fixing the issue).

Currently, engine and proxy versions is only synced to upstream & not in downstream(4.2.1-0.2) builds.

So for this bug to be fixed ,Yaniv & Raz suggested NOT to open a new bug but use this bug to solve the miss correlation issue.

Please add the relevant 1.2.1 ovirt-imageio-X rpm's to the downstream rpm's so we can close this bug.

Comment 20 Kevin Alon Goldblatt 2018-01-18 12:08:40 UTC
The same issue of "paused by system happens on:

rhel7.4 
ovirt-engine-4.2.1.1-0.1.el7.noarch
vdsm-4.20.13-1.el7ev.x86_64

Comment 21 RHV bug bot 2018-01-18 17:39:04 UTC
INFO: Bug status wasn't changed from MODIFIED to ON_QA due to the following reason:

[No external trackers attached]

For more info please contact: infra@ovirt.org

Comment 22 Daniel Erez 2018-01-24 14:57:58 UTC
No patch was required. The issue was a miss correlation between engine and proxy versions. Released imageio-1.2.1 to mitigate it.

Comment 23 Avihai 2018-01-25 12:25:16 UTC
Verified .

ovirt-engine-4.2.1.3-0.1.el7.noarch
ovirt-imageio-proxy-1.2.1-0.el7ev.noarch

Comment 24 Sandro Bonazzola 2018-02-12 11:52:14 UTC
This bugzilla is included in oVirt 4.2.1 release, published on Feb 12th 2018.

Since the problem described in this bug report should be
resolved in oVirt 4.2.1 release, it has been closed with a resolution of CURRENT RELEASE.

If the solution does not work for you, please open a new bug report.


Note You need to log in before you can comment on or make changes to this bug.