Bug 1858956 - Bad handling of imageio errors since imageio 1.5
Summary: Bad handling of imageio errors since imageio 1.5
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: vdsm
Classification: oVirt
Component: Core
Version: 4.40.22
Hardware: Unspecified
OS: Unspecified
unspecified
medium
Target Milestone: ovirt-4.4.5
: ---
Assignee: Roman Bednář
QA Contact: Amit Sharir
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2020-07-20 21:05 UTC by Nir Soffer
Modified: 2021-11-04 19:28 UTC (History)
4 users (show)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-03-22 10:22:43 UTC
oVirt Team: Storage
Embargoed:


Attachments (Terms of Use)


Links
System ID Private Priority Status Summary Last Updated
oVirt gerrit 113013 0 master ABANDONED tests: add coverage for imagetickets._read_content 2021-02-11 11:18:06 UTC
oVirt gerrit 113014 0 master ABANDONED imagetickets: add the ability to process text response from virtio 2021-02-11 11:18:06 UTC
oVirt gerrit 113015 0 master ABANDONED test: remove xfail marker for text response test 2021-02-11 11:18:07 UTC
oVirt gerrit 113036 0 master MERGED tests: cover text error responses from imageio 2021-02-11 15:45:46 UTC
oVirt gerrit 113037 0 master MERGED imagetickets: fix imageio text response parsing 2021-02-16 14:29:09 UTC
oVirt gerrit 113159 0 master MERGED imagetickets: refactor request() to private function 2021-02-16 14:29:06 UTC
oVirt gerrit 113393 0 master MERGED tests: remove duplicit imagetickets tests 2021-02-16 14:29:12 UTC

Description Nir Soffer 2020-07-20 21:05:08 UTC
Description of problem:

Vdsm does not parse imageio errors correctly, ignoring the content-type and 
trying to parse the error text as json. This worked in the past since imageio
was sending json errors, but since 4.3 imageio returns text errors.

Here is an example exception from vdsm log:

2020-07-20 23:37:27,479+0300 INFO  (jsonrpc/5) [vdsm.api] FINISH remove_image_ticket error=Image daemon request failed: "status=409, reason=Conflict, error={'explanation': '
Invalid JSON', 'detail': 'Expecting value: line 1 column 1 (char 0)'}" from=::ffff:192.168.122.12,33132, flow_id=5bfbb6ef-eeaa-499d-9548-6d348ad7cd7d, task_id=8a8ac51a-8eaf-
48f9-8391-e4ae7e632ec9 (api:52)
2020-07-20 23:37:27,479+0300 ERROR (jsonrpc/5) [storage.TaskManager.Task] (Task='8a8ac51a-8eaf-48f9-8391-e4ae7e632ec9') Unexpected error (task:880)
Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/vdsm/storage/imagetickets.py", line 132, in _read_content
    return json.loads(res_data.decode("utf8"))
  File "/usr/lib64/python3.6/json/__init__.py", line 354, in loads
    return _default_decoder.decode(s)
  File "/usr/lib64/python3.6/json/decoder.py", line 339, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/lib64/python3.6/json/decoder.py", line 357, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/vdsm/storage/task.py", line 887, in _run
    return fn(*args, **kargs)
  File "<decorator-gen-149>", line 2, in remove_image_ticket
  File "/usr/lib/python3.6/site-packages/vdsm/common/api.py", line 50, in method
    ret = func(*args, **kwargs)
  File "/usr/lib/python3.6/site-packages/vdsm/storage/hsm.py", line 3063, in remove_image_ticket
    imagetickets.remove_ticket(uuid)
  File "/usr/lib/python3.6/site-packages/vdsm/storage/imagetickets.py", line 66, in wrapper
    return func(*args, **kw)
  File "/usr/lib/python3.6/site-packages/vdsm/storage/imagetickets.py", line 90, in remove_ticket
    request("DELETE", uuid)
  File "/usr/lib/python3.6/site-packages/vdsm/storage/imagetickets.py", line 108, in request
    content = _read_content(res)
  File "/usr/lib/python3.6/site-packages/vdsm/storage/imagetickets.py", line 135, in _read_content
    raise se.ImageDaemonError(response.status, response.reason, error_info)
vdsm.storage.exception.ImageDaemonError: Image daemon request failed: "status=409, reason=Conflict, error={'explanation': 'Invalid JSON', 'detail': 'Expecting value: line 1 column 1 (char 0)'}"

There are

Version-Release number of selected component (if applicable):
vdsm-4.40.22-53.gitf9b7e2914.el8.x86_64

How reproducible:
Always

Steps to Reproduce:
1. Cause imageio to fail a tickets request

One way is to set a very short remove_timeout:
Edit /etc/ovirt-imageio/conf.d/99-test.conf:

[control]
remove_timeout = 10

2. Start a transfer using sdk image_transfer.py example:

$ ./image_transfer.py --engine-url https://engine3 --username admin@internal --password-file engine3-password --cafile engine3.pem upload eae3c26b-b9e7-4606-bf64-6422ad28b729
Connecting...
Looking up disk eae3c26b-b9e7-4606-bf64-6422ad28b729
Creating image transfer for upload
Transfer ID: 5fe6be40-4112-41e3-beed-ac4b90dc610b
Transfer host name: host4
Transfer URL: https://host4:54322/images/9d192d7c-f786-4d33-a82b-33da70f79ca3
Proxy URL: https://engine3:54323/images/9d192d7c-f786-4d33-a82b-33da70f79ca3
Conneted to imageio server
Transfer is alive

Wait until the "Transfer is alive" message

3. Cancel the transfer in engine UI, or send a cacnel request using the API
   for the transfer ID (in this case: 5fe6be40-4112-41e3-beed-ac4b90dc610b)

4. When engine send a remove_image_ticket request to vdsm, the request will fail
   with a timeout after 10 seconds.

Expected results:
Vdsm should log the real error:

    vdsm.storage.exception.ImageDaemonError: Image daemon request failed: "status=409, reason=Conflict, error='Timeout cancelling ticket d7caee92-d3f2-499c-8e4b-ca29ebdf929d'

More info:

Vdsm should use the response content-type:
- If imageio returns "content-type: text/plain; charset=UTF-8", vdsm should
  decode the error as utf-8 text.
- If imageio returns "content-type: application/json" vdsm should decode the
  body as json.

This issue makes debugging harder since you need to find the real error in 
imageio log, instead of engine event log.

This is the text from engine event log:

    VDSM host4 command RemoveImageTicketVDS failed: Image daemon request failed:
    "status=409, reason=Conflict, error={'explanation': 'Invalid JSON', 'detail':
    'Expecting value: line 1 column 1 (char 0)'}"

Comment 2 Roman Bednář 2021-02-22 11:34:50 UTC
Patches have been merged to master, should be available in next vdsm build: 4.40.50.7

Comment 4 Sandro Bonazzola 2021-03-22 12:57:25 UTC
This bugzilla is included in oVirt 4.4.5 release, published on March 18th 2021.

Since the problem described in this bug report should be resolved in oVirt 4.4.5 release, it has been closed with a resolution of CURRENT RELEASE.

If the solution does not work for you, please open a new bug report.


Note You need to log in before you can comment on or make changes to this bug.