Bug 1576500

Summary: StreamingAPI - Transfer disk paused by system as adding a new ticket failed due proxy error PUT /tickets/: [500] _strptime_time
Product: [oVirt] ovirt-imageio Reporter: Avihai <aefrat>
Component: ProxyAssignee: Daniel Erez <derez>
Status: CLOSED CURRENTRELEASE QA Contact: Yosi Ben Shimon <ybenshim>
Severity: high Docs Contact:
Priority: unspecified    
Version: 1.4.7CC: aefrat, audgiri, bugs, derez, ebenahar, jan.hladik, nsoffer, royoung, tnisan, ybenshim
Target Milestone: ovirt-4.3.2Keywords: Automation, AutomationBlocker
Target Release: 1.5.1Flags: rule-engine: ovirt-4.3+
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: ovirt-imageio-1.5.1 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2019-03-19 10:05:24 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: Storage RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1547768    
Attachments:
Description Flags
engine, proxy logs
none
issue_rep_logs_TC18258
none
TC18268 engine,VDSM,RESTAPI logs
none
image proxy log
none
rpm -qa content
none
logs none

Description Avihai 2018-05-09 15:33:14 UTC
Created attachment 1433933 [details]
engine, proxy logs

Description of problem:
Automation TC18829 which download 4 disks snapshots concurrently via multithreading.

Right at the beginning of the transfer 3 one of the disk add ticket operation was successful but the last disk ('disk_virtioraw_0514540171' (id '1f503faa-fb6a-4c9d-a160-c2c96187aabb') failed with the following errors in Engine and proxy log:

Proxy log:
(Thread-1  ) INFO 2018-05-05 14:56:18,276 web:95:web:(log_start) START [127.0.0.1] PUT /tickets/
(Thread-2  ) INFO 2018-05-05 14:56:18,280 web:95:web:(log_start) START [127.0.0.1] PUT /tickets/
(Thread-3  ) INFO 2018-05-05 14:56:18,311 web:95:web:(log_start) START [127.0.0.1] PUT /tickets/
(Thread-4  ) INFO 2018-05-05 14:56:18,319 web:95:web:(log_start) START [127.0.0.1] PUT /tickets/
(Thread-1  ) INFO 2018-05-05 14:56:18,324 auth:187:auth2:(add_signed_ticket) Adding new ticket: <Ticket id=u'55b5fb18-c74a-4fec-ab8b-fe0b8333510d', url=u'https://lynx17.lab.eng.tlv2.redhat.com:54322' timeout=359
98.67524600029 at 0x7f6f5a3cd6d0>
(Thread-1  ) INFO 2018-05-05 14:56:18,325 web:102:web:(log_finish) FINISH [127.0.0.1] PUT /tickets/: [200] 0 (0.04s)
(Thread-4  ) INFO 2018-05-05 14:56:18,327 auth:187:auth2:(add_signed_ticket) Adding new ticket: <Ticket id=u'13f5bcce-51b0-4099-81c6-5219902038b2', url=u'https://lynx14.lab.eng.tlv2.redhat.com:54322' timeout=359
98.672771930695 at 0x7f6f5a3e8dd0>
(Thread-3  ) INFO 2018-05-05 14:56:18,327 auth:187:auth2:(add_signed_ticket) Adding new ticket: <Ticket id=u'b2dae56a-5e46-46e5-8b61-cadab2fa6aa8', url=u'https://lynx14.lab.eng.tlv2.redhat.com:54322' timeout=359
98.671777009964 at 0x7f6f5a3cd710>
(Thread-4  ) INFO 2018-05-05 14:56:18,328 web:102:web:(log_finish) FINISH [127.0.0.1] PUT /tickets/: [200] 0 (0.01s)
(Thread-3  ) INFO 2018-05-05 14:56:18,328 web:102:web:(log_finish) FINISH [127.0.0.1] PUT /tickets/: [200] 0 (0.02s)
(Thread-2  ) ERROR 2018-05-05 14:56:18,307 web:112:web:(log_error) ERROR [127.0.0.1] PUT /tickets/: [500] _strptime_time (0.03s)
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/ovirt_imageio_common/web.py", line 64, in __call__
    resp = self.dispatch(request)
  File "/usr/lib/python2.7/site-packages/ovirt_imageio_common/web.py", line 91, in dispatch
    return method(*match.groups())
  File "/usr/lib/python2.7/site-packages/ovirt_imageio_proxy/tickets.py", line 31, in put
    auth.add_signed_ticket(self.request.body)
  File "/usr/lib/python2.7/site-packages/ovirt_imageio_proxy/auth.py", line 184, in add_signed_ticket
    payload = _decode_ovirt_ticket(data)
  File "/usr/lib/python2.7/site-packages/ovirt_imageio_proxy/auth.py", line 231, in _decode_ovirt_ticket
    return decoder.decode(data)
  File "/usr/lib/python2.7/site-packages/ovirt_imageio_proxy/ticket.py", line 94, in decode
    self._verifyCertificate(self._ca, x509)
  File "/usr/lib/python2.7/site-packages/ovirt_imageio_proxy/ticket.py", line 70, in _verifyCertificate
    x509.get_not_before().get_datetime().replace(tzinfo=None) <=
  File "/usr/lib64/python2.7/site-packages/M2Crypto/ASN1.py", line 181, in get_datetime
    tm = list(time.strptime(rest, "%d %H:%M:%S %Y"))[:6]
AttributeError: _strptime_time


Engine log :
2018-05-05 14:56:18,388+03 ERROR [org.ovirt.engine.core.bll.storage.disk.image.TransferDiskImageCommand] (default task-26) [877635a8-2629-460a-bf73-61735ff5ead0] Failed to add image ticket to ovirt-imageio-proxy
2018-05-05 14:56:18,388+03 ERROR [org.ovirt.engine.core.bll.storage.disk.image.TransferDiskImageCommand] (default task-26) [877635a8-2629-460a-bf73-61735ff5ead0] Failed to add image ticket to ovirt-imageio-proxy
2018-05-05 14:56:18,389+03 INFO  [org.ovirt.engine.core.bll.storage.disk.image.TransferDiskImageCommand] (default task-22) [b951ef04-bad5-46e3-b3da-014443a540b8] Returning from proceedCommandExecution after starting transfer session for image transfer command '73eff2f5-88d7-4fdd-af52-1f2429df4903'
2018-05-05 14:56:18,397+03 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (default task-26) [877635a8-2629-460a-bf73-61735ff5ead0] EVENT_ID: TRANSFER_IMAGE_PAUSED_BY_SYSTEM_FAILED_TO_ADD_TICKET_TO_PROXY(1,070), Transfer was paused by system. Reason: failed to add image ticket to ovirt-imageio-proxy.


2018-05-05 14:56:19,779+03 INFO  [org.ovirt.engine.core.bll.storage.disk.image.TransferDiskImageCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-50) [877635a8-2629-460a-bf73-61735ff5ead0] Transfer was paused by system. Download disk 'disk_virtioraw_0514540171' (id '1f503faa-fb6a-4c9d-a160-c2c96187aabb')



Version-Release number of selected component (if applicable):
ovirt-engine-4.2.3.4-0.1.el7.noarch
vdsm-4.20.27.1-1.el7ev.x86_64

How reproducible:
Happened only once so far.

Steps to Reproduce(from automation log):
14:52:43 2018-05-05 14:52:43,712 INFO       Test Setup   1: Creating VM vm_TestCase18829_0514524371
14:52:57 2018-05-05 14:52:57,332 INFO       Test Setup   2: Start VM vm_TestCase18829_0514524371
14:52:57 2018-05-05 14:52:57,335 INFO       Test Setup   3: [class] Start VM vm_TestCase18829_0514524371 with {'wait_for_ip': False, 'pause': False, 'use_cloud_init': False, 'timeout': 600, 'wait_for_status': 'up'}
14:53:38 2018-05-05 14:53:38,165 INFO       Test Setup   4: Fetch VM vm_TestCase18829_0514524371 executor
14:54:01 2018-05-05 14:54:01,418 INFO       Test Setup   5: Create disks with filesystem on VM vm_TestCase18829_0514524371
14:55:20 2018-05-05 14:55:19,982 INFO       Test Setup   6: Stop VM vm_TestCase18829_0514524371
14:55:20 2018-05-05 14:55:20,182 INFO       Test Setup   7: [class] Stop vm vm_TestCase18829_0514524371 with {'async': 'true'}
14:55:24 2018-05-05 14:55:24,682 INFO       Test Setup   8: Creating snapshot snap_TestCase18829_0514552468 of VM vm_TestCase18829_0514524371
14:55:24 2018-05-05 14:55:24,687 INFO       Test Setup   9: [class] Add snapshot to VM vm_TestCase18829_0514524371 with {'description': 'snap_TestCase18829_0514552468', 'wait': True}
14:55:45 2018-05-05 14:55:45,409 INFO 1796: storage/rhevmtests.storage.storage_streaming_api.test_download_image.TestCase18829.test_download_disk_on_running_vm[iscsi]
14:55:45 2018-05-05 14:55:45,410 INFO STORAGE: ISCSI
14:55:45 2018-05-05 14:55:45,410 INFO       Test Step  10: Check md5sums and sizes of disks ['disk_virtioraw_0514540171', 'disk_virtiocow_0514540171', 'disk_virtio_scsicow_0514540171', 'disk_virtio_scsiraw_0514540171'] pre download
14:56:16 2018-05-05 14:56:16,302 INFO       Test Step  11: Starting parallel download of snapshot disks of the followingdisks ['disk_virtioraw_0514540171', 'disk_virtiocow_0514540171', 'disk_virtio_scsicow_0514540171', 'disk_virtio_scsiraw_0514540171'] matching snapshot snap_TestCase18829_0514552468
14:56:16 2018-05-05 14:56:16,366 INFO       Test Step  12: Download image id c989ecd4-08b5-4568-9d76-40ad3bfbd61c
14:56:16 2018-05-05 14:56:16,428 INFO       Test Step  13: Download image id 3176e5e3-3f57-4ab0-a661-13e0847efc51
14:56:16 2018-05-05 14:56:16,488 INFO       Test Step  14: Download image id 35b962c7-17b8-4638-a8f6-c77e59d0a7f6
14:56:16 2018-05-05 14:56:16,549 INFO       Test Step  15: Download image id 1f503faa-fb6a-4c9d-a160-c2c96187aabb
14:58:10 2018-05-05 14:58:10,329 INFO       Test Step  16: Comparing md5sum before and after download
14:58:18 2018-05-05 14:58:18,425 ERROR Result: FAILED

Actual results:
All 4 snapshot disks should successfully download

Expected results:
Only 3/4 snapshot disks should be downloaded.
Disk 'disk_virtioraw_0514540171' (id '1f503faa-fb6a-4c9d-a160-c2c96187aabb') transfer was paused by system.

Additional info:

Comment 1 Avihai 2018-05-13 05:43:36 UTC
The same issue occurred again at Tier1 TC18258 again in ISCSI flavor, this time the disk that failed to add ticket was as follows :

newImageId='7cf6078e-53f1-4b53-bb96-94260dc3fe10', imageType='Sparse',
DiskAlias :"disk_virtiocow_1100034730","

Logs attached.

Details:

Proxy:
Thread-2  ) INFO 2018-05-11 00:05:39,259 web:95:web:(log_start) START [127.0.0.1] PUT /tickets/
(Thread-1  ) INFO 2018-05-11 00:05:39,262 web:95:web:(log_start) START [127.0.0.1] PUT /tickets/
(Thread-3  ) INFO 2018-05-11 00:05:39,279 web:95:web:(log_start) START [127.0.0.1] PUT /tickets/
(Thread-3  ) INFO 2018-05-11 00:05:39,286 auth:187:auth2:(add_signed_ticket) Adding new ticket: <Ticket id=u'998a9ce4-b1fd-4647-a689-4d20d514e09a', url=u'https://lynx01.lab.eng.tlv2.redhat.com:54322' timeout=359
98.71343302727 at 0x7f642ecf1850>
(Thread-3  ) INFO 2018-05-11 00:05:39,286 web:102:web:(log_finish) FINISH [127.0.0.1] PUT /tickets/: [200] 0 (0.01s)
(Thread-1  ) INFO 2018-05-11 00:05:39,287 auth:187:auth2:(add_signed_ticket) Adding new ticket: <Ticket id=u'd5c4c167-c064-4696-813f-ce5359e427a3', url=u'https://lynx01.lab.eng.tlv2.redhat.com:54322' timeout=359
98.712075948715 at 0x7f642ecf17d0>
(Thread-1  ) INFO 2018-05-11 00:05:39,288 web:102:web:(log_finish) FINISH [127.0.0.1] PUT /tickets/: [200] 0 (0.02s)
(Thread-2  ) ERROR 2018-05-11 00:05:39,275 web:112:web:(log_error) ERROR [127.0.0.1] PUT /tickets/: [500] _strptime_time (0.02s)
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/ovirt_imageio_common/web.py", line 64, in __call__
    resp = self.dispatch(request)
  File "/usr/lib/python2.7/site-packages/ovirt_imageio_common/web.py", line 91, in dispatch
    return method(*match.groups())
  File "/usr/lib/python2.7/site-packages/ovirt_imageio_proxy/tickets.py", line 31, in put
    auth.add_signed_ticket(self.request.body)
  File "/usr/lib/python2.7/site-packages/ovirt_imageio_proxy/auth.py", line 184, in add_signed_ticket
    payload = _decode_ovirt_ticket(data)
  File "/usr/lib/python2.7/site-packages/ovirt_imageio_proxy/auth.py", line 231, in _decode_ovirt_ticket
    return decoder.decode(data)
  File "/usr/lib/python2.7/site-packages/ovirt_imageio_proxy/ticket.py", line 94, in decode
    self._verifyCertificate(self._ca, x509)
  File "/usr/lib/python2.7/site-packages/ovirt_imageio_proxy/ticket.py", line 70, in _verifyCertificate
    x509.get_not_before().get_datetime().replace(tzinfo=None) <=
  File "/usr/lib64/python2.7/site-packages/M2Crypto/ASN1.py", line 181, in get_datetime
    tm = list(time.strptime(rest, "%d %H:%M:%S %Y"))[:6]
AttributeError: _strptime_time


Engine:
2018-05-11 00:05:38,334+03 INFO  [org.ovirt.engine.core.bll.storage.disk.image.TransferDiskImageCommand] (default task-3) [ebadad88-79da-4788-8fa9-8d9d38d6958c] Successfully added Download disk 'disk_virtio_scsi
raw_1100034730' (id '00000000-0000-0000-0000-000000000000') for image transfer command 'd1882194-a2b5-4e0f-8db7-8df592250dec'


2018-05-11 00:05:39,333+03 INFO  [org.ovirt.engine.core.bll.storage.disk.image.TransferDiskImageCommand] (default task-3) [ebadad88-79da-4788-8fa9-8d9d38d6958c] Returning from proceedCommandExecution after start
ing transfer session for image transfer command 'd1882194-a2b5-4e0f-8db7-8df592250dec'
2018-05-11 00:05:39,345+03 ERROR [org.ovirt.engine.core.bll.storage.disk.image.TransferDiskImageCommand] (default task-23) [cd0de2f6-bd8c-4777-924c-bd5acb8a6cb9] Failed to add image ticket to ovirt-imageio-proxy
2018-05-11 00:05:39,345+03 ERROR [org.ovirt.engine.core.bll.storage.disk.image.TransferDiskImageCommand] (default task-23) [cd0de2f6-bd8c-4777-924c-bd5acb8a6cb9] Failed to add image ticket to ovirt-imageio-proxy
2018-05-11 00:05:39,356+03 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (default task-23) [cd0de2f6-bd8c-4777-924c-bd5acb8a6cb9] EVENT_ID: TRANSFER_IMAGE_PAUSED_BY_SYSTEM_FAILED_TO_ADD_TICKET_TO_PROXY(1,070), Transfer was paused by system. Reason: failed to add image ticket to ovirt-imageio-proxy.

VDSM relevant log (last entrie on relevant image -7cf6078e-53f1-4b53-bb96-94260dc3fe10 operation add_image_ticket) :
2018-05-11 00:05:39,162+0300 INFO  (jsonrpc/2) [vdsm.api] START add_image_ticket(ticket={'url': 'file:///rhev/data-center/mnt/blockSD/c713576c-109b-4c69-a18d-acdd83aa9286/images/b08c7bfc-9ccc-464f-b668-eded65b99
22d/7cf6078e-53f1-4b53-bb96-94260dc3fe10', 'size': 1073741824, 'uuid': 'd7999015-c13a-47e6-b297-e9724eb00fdc', 'timeout': 300, 'ops': ['read']}) from=::ffff:10.46.16.196,32990, flow_id=cd0de2f6-bd8c-4777-924c-bd
5acb8a6cb9, task_id=88f1bd2a-8e6a-4296-af90-73acefde6b26 (api:46)
2018-05-11 00:05:39,162+0300 DEBUG (jsonrpc/2) [storage.imagetickets] Sending request method='PUT', ticket='d7999015-c13a-47e6-b297-e9724eb00fdc', body='{"url": "file:///rhev/data-center/mnt/blockSD/c713576c-109b-4c69-a18d-acdd83aa9286/images/b08c7bfc-9ccc-464f-b668-eded65b9922d/7cf6078e-53f1-4b53-bb96-94260dc3fe10", "size": 1073741824, "uuid": "d7999015-c13a-47e6-b297-e9724eb00fdc", "timeout": 300, "ops": ["read"]}' (imagetickets:78)

Comment 2 Avihai 2018-05-13 05:54:26 UTC
Created attachment 1435578 [details]
issue_rep_logs_TC18258

Comment 3 Avihai 2018-08-09 09:16:19 UTC
Seen once more in TestCase18268 on builds:
ovirt-engine-4.2.6-0.1.el7ev.noarch
ovirt-imageio-proxy-1.4.2-0.el7ev.noarch

Logs attached



From Proxy log:
(Thread-1  ) INFO 2018-08-08 10:38:41,250 web:97:web:(__call__) START: [10.46.16.250] PUT /tickets/
(Thread-2  ) INFO 2018-08-08 10:38:41,258 web:97:web:(__call__) START: [10.46.16.250] PUT /tickets/
(Thread-3  ) INFO 2018-08-08 10:38:41,264 web:97:web:(__call__) START: [10.46.16.250] PUT /tickets/
(Thread-4  ) INFO 2018-08-08 10:38:41,273 web:97:web:(__call__) START: [10.46.16.250] PUT /tickets/
(Thread-4  ) INFO 2018-08-08 10:38:41,312 auth:187:auth2:(add_signed_ticket) Adding new ticket: <Ticket id=u'8d795065-826a-4ae8-80f5-1c35fbb64090', url=u'https://lynx09.lab.eng.tlv2.redhat.com:54322' timeout=359
97.68738389015 at 0x7fa6c3cba210>
(Thread-4  ) INFO 2018-08-08 10:38:41,313 web:77:web:(close) FINISH [10.46.16.250] PUT /tickets/: [200] 0 [request=0.039486]
(Thread-2  ) ERROR 2018-08-08 10:38:41,304 web:136:web:(log_error) ERROR [10.46.16.250] PUT /tickets/: [500] _strptime_time [request=0.046504]
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/ovirt_imageio_common/web.py", line 99, in __call__
    resp = self.dispatch(request, clock)
  File "/usr/lib/python2.7/site-packages/ovirt_imageio_common/web.py", line 128, in dispatch
    return method(*match.groups())
  File "/usr/lib/python2.7/site-packages/ovirt_imageio_proxy/tickets.py", line 32, in put
    auth.add_signed_ticket(self.request.body)
  File "/usr/lib/python2.7/site-packages/ovirt_imageio_proxy/auth.py", line 184, in add_signed_ticket
    payload = _decode_ovirt_ticket(data)
  File "/usr/lib/python2.7/site-packages/ovirt_imageio_proxy/auth.py", line 231, in _decode_ovirt_ticket
    return decoder.decode(data)
  File "/usr/lib/python2.7/site-packages/ovirt_imageio_proxy/ticket.py", line 94, in decode
    self._verifyCertificate(self._ca, x509)
  File "/usr/lib/python2.7/site-packages/ovirt_imageio_proxy/ticket.py", line 70, in _verifyCertificate
    x509.get_not_before().get_datetime().replace(tzinfo=None) <=
  File "/usr/lib64/python2.7/site-packages/M2Crypto/ASN1.py", line 181, in get_datetime
    tm = list(time.strptime(rest, "%d %H:%M:%S %Y"))[:6]
AttributeError: _strptime_time
(Thread-3  ) ERROR 2018-08-08 10:38:41,307 web:136:web:(log_error) ERROR [10.46.16.250] PUT /tickets/: [500] _strptime_time [request=0.042288]
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/ovirt_imageio_common/web.py", line 99, in __call__
    resp = self.dispatch(request, clock)
  File "/usr/lib/python2.7/site-packages/ovirt_imageio_common/web.py", line 128, in dispatch
    return method(*match.groups())
  File "/usr/lib/python2.7/site-packages/ovirt_imageio_proxy/tickets.py", line 32, in put
    auth.add_signed_ticket(self.request.body)
  File "/usr/lib/python2.7/site-packages/ovirt_imageio_proxy/auth.py", line 184, in add_signed_ticket
    payload = _decode_ovirt_ticket(data)
  File "/usr/lib/python2.7/site-packages/ovirt_imageio_proxy/auth.py", line 231, in _decode_ovirt_ticket
    return decoder.decode(data)
  File "/usr/lib/python2.7/site-packages/ovirt_imageio_proxy/ticket.py", line 94, in decode
    self._verifyCertificate(self._ca, x509)
  File "/usr/lib/python2.7/site-packages/ovirt_imageio_proxy/ticket.py", line 70, in _verifyCertificate
    x509.get_not_before().get_datetime().replace(tzinfo=None) <=
  File "/usr/lib64/python2.7/site-packages/M2Crypto/ASN1.py", line 181, in get_datetime
    tm = list(time.strptime(rest, "%d %H:%M:%S %Y"))[:6]
AttributeError: _strptime_time
(Thread-1  ) ERROR 2018-08-08 10:38:41,307 web:136:web:(log_error) ERROR [10.46.16.250] PUT /tickets/: [500] _strptime_time [request=0.057461]
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/ovirt_imageio_common/web.py", line 99, in __call__
    resp = self.dispatch(request, clock)
  File "/usr/lib/python2.7/site-packages/ovirt_imageio_common/web.py", line 128, in dispatch
    return method(*match.groups())
  File "/usr/lib/python2.7/site-packages/ovirt_imageio_proxy/tickets.py", line 32, in put
    auth.add_signed_ticket(self.request.body)
  File "/usr/lib/python2.7/site-packages/ovirt_imageio_proxy/auth.py", line 184, in add_signed_ticket
    payload = _decode_ovirt_ticket(data)
  File "/usr/lib/python2.7/site-packages/ovirt_imageio_proxy/auth.py", line 231, in _decode_ovirt_ticket
    return decoder.decode(data)
  File "/usr/lib/python2.7/site-packages/ovirt_imageio_proxy/ticket.py", line 94, in decode
    self._verifyCertificate(self._ca, x509)
  File "/usr/lib/python2.7/site-packages/ovirt_imageio_proxy/ticket.py", line 70, in _verifyCertificate
    x509.get_not_before().get_datetime().replace(tzinfo=None) <=
  File "/usr/lib64/python2.7/site-packages/M2Crypto/ASN1.py", line 181, in get_datetime
    tm = list(time.strptime(rest, "%d %H:%M:%S %Y"))[:6]
AttributeError: _strptime_time

Engine log:

2018-08-08 10:38:41,405+03 ERROR [org.ovirt.engine.core.bll.storage.disk.image.TransferDiskImageCommand] (default task-2) [51504c41-1d66-4795-921f-d4b7988aee27] Failed to add image ticket to ovirt-imageio-proxy
2018-08-08 10:38:41,405+03 ERROR [org.ovirt.engine.core.bll.storage.disk.image.TransferDiskImageCommand] (default task-2) [51504c41-1d66-4795-921f-d4b7988aee27] Failed to add image ticket to ovirt-imageio-proxy
2018-08-08 10:38:41,405+03 ERROR [org.ovirt.engine.core.bll.storage.disk.image.TransferDiskImageCommand] (default task-8) [f0192c31-afd0-4d96-9b20-e6ade22420db] Failed to add image ticket to ovirt-imageio-proxy
2018-08-08 10:38:41,405+03 ERROR [org.ovirt.engine.core.bll.storage.disk.image.TransferDiskImageCommand] (default task-4) [474622a7-85a2-468b-b187-b23f0075b2cb] Failed to add image ticket to ovirt-imageio-proxy
2018-08-08 10:38:41,405+03 ERROR [org.ovirt.engine.core.bll.storage.disk.image.TransferDiskImageCommand] (default task-8) [f0192c31-afd0-4d96-9b20-e6ade22420db] Failed to add image ticket to ovirt-imageio-proxy
2018-08-08 10:38:41,405+03 ERROR

Comment 4 Avihai 2018-08-09 09:17:17 UTC
Created attachment 1474615 [details]
TC18268 engine,VDSM,RESTAPI logs

Comment 5 Avihai 2018-08-09 09:20:16 UTC
Created attachment 1474617 [details]
image proxy log

Comment 6 Nir Soffer 2018-08-15 00:25:41 UTC
Avihay, this looks like python error:

    tm = list(time.strptime(rest, "%d %H:%M:%S %Y"))[:6]
AttributeError: _strptime_time

Can you attach output of "rpm -qa" to the bug from the host running the proxy?

Comment 7 Avihai 2018-08-20 12:05:00 UTC
This tests failed on multiple hosts(Engines)running the proxy.

All of these hosts are reinstalled from scratch each time we/infra run them.

I added a current host rpm -qa in the attached file.

Comment 8 Avihai 2018-08-20 12:05:36 UTC
Created attachment 1477188 [details]
rpm -qa content

Comment 9 Yosi Ben Shimon 2018-12-13 13:26:35 UTC
Failed again in TestCase18275 (tier2) on:
ovirt-engine-4.3.0-0.6.alpha2.el7.noarch
vdsm-4.30.4-1.el7ev.x86_64
ovirt-imageio-proxy-1.4.6-1.el7ev.noarch

Three (nasty) exceptions found in the engine log:
1. Failed to add image ticket to ovirt-imageio-proxy: java.lang.RuntimeException: Request to imageio-proxy failed, response code: 500
2. Failed to stop image transfer session. Ticket does not exist for image '52e31846-2886-40d7-ad7a-361f4b5dbf23' => caused java.lang.reflect.InvocationTargetException
3. Exception: java.lang.RuntimeException: EngineException: Vds with id: 00000000-0000-0000-0000-000000000000 was not found (Failed with error RESOURCE_MANAGER_VDS_NOT_FOUND and code 5004)

From image-proxy log:
(Thread-2  ) ERROR 2018-12-12 09:19:18,331 web:136:web:(log_error) ERROR [10.46.16.252] PUT /tickets/ [500] _strptime_time [request=0.061959]
Traceback (most recent call last):
  File "/usr/lib64/python2.7/site-packages/ovirt_imageio_common/web.py", line 99, in __call__
    resp = self.dispatch(request, clock)
  File "/usr/lib64/python2.7/site-packages/ovirt_imageio_common/web.py", line 128, in dispatch
    return method(*match.groups())
  File "/usr/lib/python2.7/site-packages/ovirt_imageio_proxy/tickets.py", line 32, in put
    auth.add_signed_ticket(self.request.body)
  File "/usr/lib/python2.7/site-packages/ovirt_imageio_proxy/auth.py", line 194, in add_signed_ticket
    payload = _decode_ovirt_ticket(data)
  File "/usr/lib/python2.7/site-packages/ovirt_imageio_proxy/auth.py", line 241, in _decode_ovirt_ticket
    return decoder.decode(data)
  File "/usr/lib/python2.7/site-packages/ovirt_imageio_proxy/ticket.py", line 94, in decode
    self._verifyCertificate(self._ca, x509)
  File "/usr/lib/python2.7/site-packages/ovirt_imageio_proxy/ticket.py", line 70, in _verifyCertificate
    x509.get_not_before().get_datetime().replace(tzinfo=None) <=
  File "/usr/lib64/python2.7/site-packages/M2Crypto/ASN1.py", line 181, in get_datetime
    tm = list(time.strptime(rest, "%d %H:%M:%S %Y"))[:6]
AttributeError: _strptime_time


Attaching all relevant logs from this execution.

Comment 10 Yosi Ben Shimon 2018-12-13 13:30:52 UTC
Created attachment 1514056 [details]
logs

Comment 11 Nir Soffer 2018-12-13 16:44:37 UTC
This looks like a bug in M2Crypto - but maybe we are not using it correctly.

M2Crypto needs special initialization when using in multi-threaded program
and we don't have such code.

This is a fix we had in vdsm related to M2Crypto:

commit a028c6a3335cb58e1eec67d74fd361b78926547f
Author: Nir Soffer <nsoffer>
Date:   Tue Jun 24 00:37:18 2014 +0300

    sslutils: Document M2Crypto threading initialization
    
    As documented and painfully recognized long ago in, using M2Crypto in a
    multi-threaded server such as Vdsm requires explicit initialization.
    
    M2Crypto was initialized but the code was not docuemnted properly. This
    lead to the initialization being removed in this patch:
    http://gerrit.ovirt.org/#/c/28858/4/lib/vdsm/verifyingtransport.py
    
    Hopefully, improved documentation and moving it to the sslutils module
    will prevent such attempts in the future.
    
    Change-Id: I8e5f8dc4662e16ba7e39f60336819750e1366124
    Relates-to: http://www.heikkitoivonen.net/m2crypto/api/M2Crypto.threading-module.html
    Relates-to: https://bugzilla.redhat.com/482420
    Signed-off-by: Nir Soffer <nsoffer>
    Reviewed-on: http://gerrit.ovirt.org/29104
    Reviewed-by: Yaniv Bronhaim <ybronhei>
    Reviewed-by: Piotr Kliczewski <piotr.kliczewski>
    Reviewed-by: Dan Kenigsberg <danken>

Comment 12 Nir Soffer 2018-12-13 17:03:34 UTC
Avihay, if you know how to reproduce this issue, it would be nice to test if
the attached patch fixes this.

Comment 13 Avihai 2018-12-14 06:42:16 UTC
There is no clear scenario and this did not happen each time so no high confidence in a reproduction.

The common ground in all TC's(below) that this issue was found was multithreaded disk/disk snapshot downloads ( downloading 4 disks/snapdisks concurrently)

Tier2
TestCase18229
TestCase18268
TestCase18275

Tier1
TestCase18258

About taking the patch ,instead of rebuilding imageio-proxy, I see its a couple of line in image_proxy.py ,
can I add them manually at the engine machine and reload the image_proxy service, right ?

High confidence verification of this issue will take time as this did not reproduce each time.
Elad, please give this priority .

Comment 14 Elad 2018-12-16 08:31:09 UTC
(In reply to Avihai from comment #13)
> There is no clear scenario and this did not happen each time so no high
> confidence in a reproduction.
> 
> The common ground in all TC's(below) that this issue was found was
> multithreaded disk/disk snapshot downloads ( downloading 4 disks/snapdisks
> concurrently)
> 
> Tier2
> TestCase18229
> TestCase18268
> TestCase18275
> 
> Tier1
> TestCase18258
> 
> About taking the patch ,instead of rebuilding imageio-proxy, I see its a
> couple of line in image_proxy.py ,
> can I add them manually at the engine machine and reload the image_proxy
> service, right ?
If this is the scope needed for testing the unmerged patch, then this shouldn't be much effort, let's give it a try

Comment 15 Avihai 2018-12-16 08:34:14 UTC
(In reply to Nir Soffer from comment #12)
> Avihay, if you know how to reproduce this issue, it would be nice to test if
> the attached patch fixes this.


About taking the patch, instead of rebuilding image-proxy on the engine, I see it's a couple of line in image_proxy.py file.
can I add them manually at the engine machine and reload the image_proxy service ?

Comment 16 Avihai 2018-12-16 08:38:17 UTC
Also , the patch is not reviewed yet .

Please ping me when patch is done and I'll test it.

Comment 17 Sandro Bonazzola 2019-01-28 09:34:25 UTC
This bug has not been marked as blocker for oVirt 4.3.0.
Since we are releasing it tomorrow, January 29th, this bug has been re-targeted to 4.3.1.

Comment 18 Nir Soffer 2019-02-12 07:20:15 UTC
Daniel, the patch is waiting for review since Dec 13. Can you review it?

Comment 19 Daniel Erez 2019-02-19 07:10:29 UTC
(In reply to Nir Soffer from comment #18)
> Daniel, the patch is waiting for review since Dec 13. Can you review it?

Merged. Will be included in next ovirt-imageio build.

Comment 20 Yosi Ben Shimon 2019-03-17 13:11:21 UTC
Tested using:
ovirt-engine-4.3.2.1-0.1.el7.noarch
ovirt-imageio-proxy-1.5.1-0.el7ev.noarch

The latest tier 1 & 2 executions went find and none of the mentioned test cases didn't fail.
Also executed these test cases without any errors:

Tier2
TestCase18268
TestCase18275

Tier1
TestCase18258

Moving to VERIFIED

Comment 21 Sandro Bonazzola 2019-03-19 10:05:24 UTC
This bugzilla is included in oVirt 4.3.2 release, published on March 19th 2019.

Since the problem described in this bug report should be
resolved in oVirt 4.3.2 release, it has been closed with a resolution of CURRENT RELEASE.

If the solution does not work for you, please open a new bug report.

Comment 22 Tal Nisan 2019-10-29 16:44:57 UTC
*** Bug 1616019 has been marked as a duplicate of this bug. ***