Bug 1576500
Summary: | StreamingAPI - Transfer disk paused by system as adding a new ticket failed due proxy error PUT /tickets/: [500] _strptime_time | ||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Product: | [oVirt] ovirt-imageio | Reporter: | Avihai <aefrat> | ||||||||||||||
Component: | Proxy | Assignee: | Daniel Erez <derez> | ||||||||||||||
Status: | CLOSED CURRENTRELEASE | QA Contact: | Yosi Ben Shimon <ybenshim> | ||||||||||||||
Severity: | high | Docs Contact: | |||||||||||||||
Priority: | unspecified | ||||||||||||||||
Version: | 1.4.7 | CC: | aefrat, audgiri, bugs, derez, ebenahar, jan.hladik, nsoffer, royoung, tnisan, ybenshim | ||||||||||||||
Target Milestone: | ovirt-4.3.2 | Keywords: | Automation, AutomationBlocker | ||||||||||||||
Target Release: | 1.5.1 | Flags: | rule-engine:
ovirt-4.3+
|
||||||||||||||
Hardware: | Unspecified | ||||||||||||||||
OS: | Unspecified | ||||||||||||||||
Whiteboard: | |||||||||||||||||
Fixed In Version: | ovirt-imageio-1.5.1 | Doc Type: | If docs needed, set a value | ||||||||||||||
Doc Text: | Story Points: | --- | |||||||||||||||
Clone Of: | Environment: | ||||||||||||||||
Last Closed: | 2019-03-19 10:05:24 UTC | Type: | Bug | ||||||||||||||
Regression: | --- | Mount Type: | --- | ||||||||||||||
Documentation: | --- | CRM: | |||||||||||||||
Verified Versions: | Category: | --- | |||||||||||||||
oVirt Team: | Storage | RHEL 7.3 requirements from Atomic Host: | |||||||||||||||
Cloudforms Team: | --- | Target Upstream Version: | |||||||||||||||
Embargoed: | |||||||||||||||||
Bug Depends On: | |||||||||||||||||
Bug Blocks: | 1547768 | ||||||||||||||||
Attachments: |
|
Description
Avihai
2018-05-09 15:33:14 UTC
The same issue occurred again at Tier1 TC18258 again in ISCSI flavor, this time the disk that failed to add ticket was as follows : newImageId='7cf6078e-53f1-4b53-bb96-94260dc3fe10', imageType='Sparse', DiskAlias :"disk_virtiocow_1100034730"," Logs attached. Details: Proxy: Thread-2 ) INFO 2018-05-11 00:05:39,259 web:95:web:(log_start) START [127.0.0.1] PUT /tickets/ (Thread-1 ) INFO 2018-05-11 00:05:39,262 web:95:web:(log_start) START [127.0.0.1] PUT /tickets/ (Thread-3 ) INFO 2018-05-11 00:05:39,279 web:95:web:(log_start) START [127.0.0.1] PUT /tickets/ (Thread-3 ) INFO 2018-05-11 00:05:39,286 auth:187:auth2:(add_signed_ticket) Adding new ticket: <Ticket id=u'998a9ce4-b1fd-4647-a689-4d20d514e09a', url=u'https://lynx01.lab.eng.tlv2.redhat.com:54322' timeout=359 98.71343302727 at 0x7f642ecf1850> (Thread-3 ) INFO 2018-05-11 00:05:39,286 web:102:web:(log_finish) FINISH [127.0.0.1] PUT /tickets/: [200] 0 (0.01s) (Thread-1 ) INFO 2018-05-11 00:05:39,287 auth:187:auth2:(add_signed_ticket) Adding new ticket: <Ticket id=u'd5c4c167-c064-4696-813f-ce5359e427a3', url=u'https://lynx01.lab.eng.tlv2.redhat.com:54322' timeout=359 98.712075948715 at 0x7f642ecf17d0> (Thread-1 ) INFO 2018-05-11 00:05:39,288 web:102:web:(log_finish) FINISH [127.0.0.1] PUT /tickets/: [200] 0 (0.02s) (Thread-2 ) ERROR 2018-05-11 00:05:39,275 web:112:web:(log_error) ERROR [127.0.0.1] PUT /tickets/: [500] _strptime_time (0.02s) Traceback (most recent call last): File "/usr/lib/python2.7/site-packages/ovirt_imageio_common/web.py", line 64, in __call__ resp = self.dispatch(request) File "/usr/lib/python2.7/site-packages/ovirt_imageio_common/web.py", line 91, in dispatch return method(*match.groups()) File "/usr/lib/python2.7/site-packages/ovirt_imageio_proxy/tickets.py", line 31, in put auth.add_signed_ticket(self.request.body) File "/usr/lib/python2.7/site-packages/ovirt_imageio_proxy/auth.py", line 184, in add_signed_ticket payload = _decode_ovirt_ticket(data) File "/usr/lib/python2.7/site-packages/ovirt_imageio_proxy/auth.py", line 231, in _decode_ovirt_ticket return decoder.decode(data) File "/usr/lib/python2.7/site-packages/ovirt_imageio_proxy/ticket.py", line 94, in decode self._verifyCertificate(self._ca, x509) File "/usr/lib/python2.7/site-packages/ovirt_imageio_proxy/ticket.py", line 70, in _verifyCertificate x509.get_not_before().get_datetime().replace(tzinfo=None) <= File "/usr/lib64/python2.7/site-packages/M2Crypto/ASN1.py", line 181, in get_datetime tm = list(time.strptime(rest, "%d %H:%M:%S %Y"))[:6] AttributeError: _strptime_time Engine: 2018-05-11 00:05:38,334+03 INFO [org.ovirt.engine.core.bll.storage.disk.image.TransferDiskImageCommand] (default task-3) [ebadad88-79da-4788-8fa9-8d9d38d6958c] Successfully added Download disk 'disk_virtio_scsi raw_1100034730' (id '00000000-0000-0000-0000-000000000000') for image transfer command 'd1882194-a2b5-4e0f-8db7-8df592250dec' 2018-05-11 00:05:39,333+03 INFO [org.ovirt.engine.core.bll.storage.disk.image.TransferDiskImageCommand] (default task-3) [ebadad88-79da-4788-8fa9-8d9d38d6958c] Returning from proceedCommandExecution after start ing transfer session for image transfer command 'd1882194-a2b5-4e0f-8db7-8df592250dec' 2018-05-11 00:05:39,345+03 ERROR [org.ovirt.engine.core.bll.storage.disk.image.TransferDiskImageCommand] (default task-23) [cd0de2f6-bd8c-4777-924c-bd5acb8a6cb9] Failed to add image ticket to ovirt-imageio-proxy 2018-05-11 00:05:39,345+03 ERROR [org.ovirt.engine.core.bll.storage.disk.image.TransferDiskImageCommand] (default task-23) [cd0de2f6-bd8c-4777-924c-bd5acb8a6cb9] Failed to add image ticket to ovirt-imageio-proxy 2018-05-11 00:05:39,356+03 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (default task-23) [cd0de2f6-bd8c-4777-924c-bd5acb8a6cb9] EVENT_ID: TRANSFER_IMAGE_PAUSED_BY_SYSTEM_FAILED_TO_ADD_TICKET_TO_PROXY(1,070), Transfer was paused by system. Reason: failed to add image ticket to ovirt-imageio-proxy. VDSM relevant log (last entrie on relevant image -7cf6078e-53f1-4b53-bb96-94260dc3fe10 operation add_image_ticket) : 2018-05-11 00:05:39,162+0300 INFO (jsonrpc/2) [vdsm.api] START add_image_ticket(ticket={'url': 'file:///rhev/data-center/mnt/blockSD/c713576c-109b-4c69-a18d-acdd83aa9286/images/b08c7bfc-9ccc-464f-b668-eded65b99 22d/7cf6078e-53f1-4b53-bb96-94260dc3fe10', 'size': 1073741824, 'uuid': 'd7999015-c13a-47e6-b297-e9724eb00fdc', 'timeout': 300, 'ops': ['read']}) from=::ffff:10.46.16.196,32990, flow_id=cd0de2f6-bd8c-4777-924c-bd 5acb8a6cb9, task_id=88f1bd2a-8e6a-4296-af90-73acefde6b26 (api:46) 2018-05-11 00:05:39,162+0300 DEBUG (jsonrpc/2) [storage.imagetickets] Sending request method='PUT', ticket='d7999015-c13a-47e6-b297-e9724eb00fdc', body='{"url": "file:///rhev/data-center/mnt/blockSD/c713576c-109b-4c69-a18d-acdd83aa9286/images/b08c7bfc-9ccc-464f-b668-eded65b9922d/7cf6078e-53f1-4b53-bb96-94260dc3fe10", "size": 1073741824, "uuid": "d7999015-c13a-47e6-b297-e9724eb00fdc", "timeout": 300, "ops": ["read"]}' (imagetickets:78) Created attachment 1435578 [details]
issue_rep_logs_TC18258
Seen once more in TestCase18268 on builds: ovirt-engine-4.2.6-0.1.el7ev.noarch ovirt-imageio-proxy-1.4.2-0.el7ev.noarch Logs attached From Proxy log: (Thread-1 ) INFO 2018-08-08 10:38:41,250 web:97:web:(__call__) START: [10.46.16.250] PUT /tickets/ (Thread-2 ) INFO 2018-08-08 10:38:41,258 web:97:web:(__call__) START: [10.46.16.250] PUT /tickets/ (Thread-3 ) INFO 2018-08-08 10:38:41,264 web:97:web:(__call__) START: [10.46.16.250] PUT /tickets/ (Thread-4 ) INFO 2018-08-08 10:38:41,273 web:97:web:(__call__) START: [10.46.16.250] PUT /tickets/ (Thread-4 ) INFO 2018-08-08 10:38:41,312 auth:187:auth2:(add_signed_ticket) Adding new ticket: <Ticket id=u'8d795065-826a-4ae8-80f5-1c35fbb64090', url=u'https://lynx09.lab.eng.tlv2.redhat.com:54322' timeout=359 97.68738389015 at 0x7fa6c3cba210> (Thread-4 ) INFO 2018-08-08 10:38:41,313 web:77:web:(close) FINISH [10.46.16.250] PUT /tickets/: [200] 0 [request=0.039486] (Thread-2 ) ERROR 2018-08-08 10:38:41,304 web:136:web:(log_error) ERROR [10.46.16.250] PUT /tickets/: [500] _strptime_time [request=0.046504] Traceback (most recent call last): File "/usr/lib/python2.7/site-packages/ovirt_imageio_common/web.py", line 99, in __call__ resp = self.dispatch(request, clock) File "/usr/lib/python2.7/site-packages/ovirt_imageio_common/web.py", line 128, in dispatch return method(*match.groups()) File "/usr/lib/python2.7/site-packages/ovirt_imageio_proxy/tickets.py", line 32, in put auth.add_signed_ticket(self.request.body) File "/usr/lib/python2.7/site-packages/ovirt_imageio_proxy/auth.py", line 184, in add_signed_ticket payload = _decode_ovirt_ticket(data) File "/usr/lib/python2.7/site-packages/ovirt_imageio_proxy/auth.py", line 231, in _decode_ovirt_ticket return decoder.decode(data) File "/usr/lib/python2.7/site-packages/ovirt_imageio_proxy/ticket.py", line 94, in decode self._verifyCertificate(self._ca, x509) File "/usr/lib/python2.7/site-packages/ovirt_imageio_proxy/ticket.py", line 70, in _verifyCertificate x509.get_not_before().get_datetime().replace(tzinfo=None) <= File "/usr/lib64/python2.7/site-packages/M2Crypto/ASN1.py", line 181, in get_datetime tm = list(time.strptime(rest, "%d %H:%M:%S %Y"))[:6] AttributeError: _strptime_time (Thread-3 ) ERROR 2018-08-08 10:38:41,307 web:136:web:(log_error) ERROR [10.46.16.250] PUT /tickets/: [500] _strptime_time [request=0.042288] Traceback (most recent call last): File "/usr/lib/python2.7/site-packages/ovirt_imageio_common/web.py", line 99, in __call__ resp = self.dispatch(request, clock) File "/usr/lib/python2.7/site-packages/ovirt_imageio_common/web.py", line 128, in dispatch return method(*match.groups()) File "/usr/lib/python2.7/site-packages/ovirt_imageio_proxy/tickets.py", line 32, in put auth.add_signed_ticket(self.request.body) File "/usr/lib/python2.7/site-packages/ovirt_imageio_proxy/auth.py", line 184, in add_signed_ticket payload = _decode_ovirt_ticket(data) File "/usr/lib/python2.7/site-packages/ovirt_imageio_proxy/auth.py", line 231, in _decode_ovirt_ticket return decoder.decode(data) File "/usr/lib/python2.7/site-packages/ovirt_imageio_proxy/ticket.py", line 94, in decode self._verifyCertificate(self._ca, x509) File "/usr/lib/python2.7/site-packages/ovirt_imageio_proxy/ticket.py", line 70, in _verifyCertificate x509.get_not_before().get_datetime().replace(tzinfo=None) <= File "/usr/lib64/python2.7/site-packages/M2Crypto/ASN1.py", line 181, in get_datetime tm = list(time.strptime(rest, "%d %H:%M:%S %Y"))[:6] AttributeError: _strptime_time (Thread-1 ) ERROR 2018-08-08 10:38:41,307 web:136:web:(log_error) ERROR [10.46.16.250] PUT /tickets/: [500] _strptime_time [request=0.057461] Traceback (most recent call last): File "/usr/lib/python2.7/site-packages/ovirt_imageio_common/web.py", line 99, in __call__ resp = self.dispatch(request, clock) File "/usr/lib/python2.7/site-packages/ovirt_imageio_common/web.py", line 128, in dispatch return method(*match.groups()) File "/usr/lib/python2.7/site-packages/ovirt_imageio_proxy/tickets.py", line 32, in put auth.add_signed_ticket(self.request.body) File "/usr/lib/python2.7/site-packages/ovirt_imageio_proxy/auth.py", line 184, in add_signed_ticket payload = _decode_ovirt_ticket(data) File "/usr/lib/python2.7/site-packages/ovirt_imageio_proxy/auth.py", line 231, in _decode_ovirt_ticket return decoder.decode(data) File "/usr/lib/python2.7/site-packages/ovirt_imageio_proxy/ticket.py", line 94, in decode self._verifyCertificate(self._ca, x509) File "/usr/lib/python2.7/site-packages/ovirt_imageio_proxy/ticket.py", line 70, in _verifyCertificate x509.get_not_before().get_datetime().replace(tzinfo=None) <= File "/usr/lib64/python2.7/site-packages/M2Crypto/ASN1.py", line 181, in get_datetime tm = list(time.strptime(rest, "%d %H:%M:%S %Y"))[:6] AttributeError: _strptime_time Engine log: 2018-08-08 10:38:41,405+03 ERROR [org.ovirt.engine.core.bll.storage.disk.image.TransferDiskImageCommand] (default task-2) [51504c41-1d66-4795-921f-d4b7988aee27] Failed to add image ticket to ovirt-imageio-proxy 2018-08-08 10:38:41,405+03 ERROR [org.ovirt.engine.core.bll.storage.disk.image.TransferDiskImageCommand] (default task-2) [51504c41-1d66-4795-921f-d4b7988aee27] Failed to add image ticket to ovirt-imageio-proxy 2018-08-08 10:38:41,405+03 ERROR [org.ovirt.engine.core.bll.storage.disk.image.TransferDiskImageCommand] (default task-8) [f0192c31-afd0-4d96-9b20-e6ade22420db] Failed to add image ticket to ovirt-imageio-proxy 2018-08-08 10:38:41,405+03 ERROR [org.ovirt.engine.core.bll.storage.disk.image.TransferDiskImageCommand] (default task-4) [474622a7-85a2-468b-b187-b23f0075b2cb] Failed to add image ticket to ovirt-imageio-proxy 2018-08-08 10:38:41,405+03 ERROR [org.ovirt.engine.core.bll.storage.disk.image.TransferDiskImageCommand] (default task-8) [f0192c31-afd0-4d96-9b20-e6ade22420db] Failed to add image ticket to ovirt-imageio-proxy 2018-08-08 10:38:41,405+03 ERROR Created attachment 1474615 [details]
TC18268 engine,VDSM,RESTAPI logs
Created attachment 1474617 [details]
image proxy log
Avihay, this looks like python error: tm = list(time.strptime(rest, "%d %H:%M:%S %Y"))[:6] AttributeError: _strptime_time Can you attach output of "rpm -qa" to the bug from the host running the proxy? This tests failed on multiple hosts(Engines)running the proxy. All of these hosts are reinstalled from scratch each time we/infra run them. I added a current host rpm -qa in the attached file. Created attachment 1477188 [details]
rpm -qa content
Failed again in TestCase18275 (tier2) on: ovirt-engine-4.3.0-0.6.alpha2.el7.noarch vdsm-4.30.4-1.el7ev.x86_64 ovirt-imageio-proxy-1.4.6-1.el7ev.noarch Three (nasty) exceptions found in the engine log: 1. Failed to add image ticket to ovirt-imageio-proxy: java.lang.RuntimeException: Request to imageio-proxy failed, response code: 500 2. Failed to stop image transfer session. Ticket does not exist for image '52e31846-2886-40d7-ad7a-361f4b5dbf23' => caused java.lang.reflect.InvocationTargetException 3. Exception: java.lang.RuntimeException: EngineException: Vds with id: 00000000-0000-0000-0000-000000000000 was not found (Failed with error RESOURCE_MANAGER_VDS_NOT_FOUND and code 5004) From image-proxy log: (Thread-2 ) ERROR 2018-12-12 09:19:18,331 web:136:web:(log_error) ERROR [10.46.16.252] PUT /tickets/ [500] _strptime_time [request=0.061959] Traceback (most recent call last): File "/usr/lib64/python2.7/site-packages/ovirt_imageio_common/web.py", line 99, in __call__ resp = self.dispatch(request, clock) File "/usr/lib64/python2.7/site-packages/ovirt_imageio_common/web.py", line 128, in dispatch return method(*match.groups()) File "/usr/lib/python2.7/site-packages/ovirt_imageio_proxy/tickets.py", line 32, in put auth.add_signed_ticket(self.request.body) File "/usr/lib/python2.7/site-packages/ovirt_imageio_proxy/auth.py", line 194, in add_signed_ticket payload = _decode_ovirt_ticket(data) File "/usr/lib/python2.7/site-packages/ovirt_imageio_proxy/auth.py", line 241, in _decode_ovirt_ticket return decoder.decode(data) File "/usr/lib/python2.7/site-packages/ovirt_imageio_proxy/ticket.py", line 94, in decode self._verifyCertificate(self._ca, x509) File "/usr/lib/python2.7/site-packages/ovirt_imageio_proxy/ticket.py", line 70, in _verifyCertificate x509.get_not_before().get_datetime().replace(tzinfo=None) <= File "/usr/lib64/python2.7/site-packages/M2Crypto/ASN1.py", line 181, in get_datetime tm = list(time.strptime(rest, "%d %H:%M:%S %Y"))[:6] AttributeError: _strptime_time Attaching all relevant logs from this execution. Created attachment 1514056 [details]
logs
This looks like a bug in M2Crypto - but maybe we are not using it correctly. M2Crypto needs special initialization when using in multi-threaded program and we don't have such code. This is a fix we had in vdsm related to M2Crypto: commit a028c6a3335cb58e1eec67d74fd361b78926547f Author: Nir Soffer <nsoffer> Date: Tue Jun 24 00:37:18 2014 +0300 sslutils: Document M2Crypto threading initialization As documented and painfully recognized long ago in, using M2Crypto in a multi-threaded server such as Vdsm requires explicit initialization. M2Crypto was initialized but the code was not docuemnted properly. This lead to the initialization being removed in this patch: http://gerrit.ovirt.org/#/c/28858/4/lib/vdsm/verifyingtransport.py Hopefully, improved documentation and moving it to the sslutils module will prevent such attempts in the future. Change-Id: I8e5f8dc4662e16ba7e39f60336819750e1366124 Relates-to: http://www.heikkitoivonen.net/m2crypto/api/M2Crypto.threading-module.html Relates-to: https://bugzilla.redhat.com/482420 Signed-off-by: Nir Soffer <nsoffer> Reviewed-on: http://gerrit.ovirt.org/29104 Reviewed-by: Yaniv Bronhaim <ybronhei> Reviewed-by: Piotr Kliczewski <piotr.kliczewski> Reviewed-by: Dan Kenigsberg <danken> Avihay, if you know how to reproduce this issue, it would be nice to test if the attached patch fixes this. There is no clear scenario and this did not happen each time so no high confidence in a reproduction. The common ground in all TC's(below) that this issue was found was multithreaded disk/disk snapshot downloads ( downloading 4 disks/snapdisks concurrently) Tier2 TestCase18229 TestCase18268 TestCase18275 Tier1 TestCase18258 About taking the patch ,instead of rebuilding imageio-proxy, I see its a couple of line in image_proxy.py , can I add them manually at the engine machine and reload the image_proxy service, right ? High confidence verification of this issue will take time as this did not reproduce each time. Elad, please give this priority . (In reply to Avihai from comment #13) > There is no clear scenario and this did not happen each time so no high > confidence in a reproduction. > > The common ground in all TC's(below) that this issue was found was > multithreaded disk/disk snapshot downloads ( downloading 4 disks/snapdisks > concurrently) > > Tier2 > TestCase18229 > TestCase18268 > TestCase18275 > > Tier1 > TestCase18258 > > About taking the patch ,instead of rebuilding imageio-proxy, I see its a > couple of line in image_proxy.py , > can I add them manually at the engine machine and reload the image_proxy > service, right ? If this is the scope needed for testing the unmerged patch, then this shouldn't be much effort, let's give it a try (In reply to Nir Soffer from comment #12) > Avihay, if you know how to reproduce this issue, it would be nice to test if > the attached patch fixes this. About taking the patch, instead of rebuilding image-proxy on the engine, I see it's a couple of line in image_proxy.py file. can I add them manually at the engine machine and reload the image_proxy service ? Also , the patch is not reviewed yet . Please ping me when patch is done and I'll test it. This bug has not been marked as blocker for oVirt 4.3.0. Since we are releasing it tomorrow, January 29th, this bug has been re-targeted to 4.3.1. Daniel, the patch is waiting for review since Dec 13. Can you review it? (In reply to Nir Soffer from comment #18) > Daniel, the patch is waiting for review since Dec 13. Can you review it? Merged. Will be included in next ovirt-imageio build. Tested using: ovirt-engine-4.3.2.1-0.1.el7.noarch ovirt-imageio-proxy-1.5.1-0.el7ev.noarch The latest tier 1 & 2 executions went find and none of the mentioned test cases didn't fail. Also executed these test cases without any errors: Tier2 TestCase18268 TestCase18275 Tier1 TestCase18258 Moving to VERIFIED This bugzilla is included in oVirt 4.3.2 release, published on March 19th 2019. Since the problem described in this bug report should be resolved in oVirt 4.3.2 release, it has been closed with a resolution of CURRENT RELEASE. If the solution does not work for you, please open a new bug report. *** Bug 1616019 has been marked as a duplicate of this bug. *** |