Bug 2041902
| Summary: | 504 timeout error when trying to upload an image to NFS store. | ||||||
|---|---|---|---|---|---|---|---|
| Product: | Red Hat OpenStack | Reporter: | Tzach Shefi <tshefi> | ||||
| Component: | openstack-glance | Assignee: | Cyril Roelandt <cyril> | ||||
| Status: | CLOSED NOTABUG | QA Contact: | |||||
| Severity: | high | Docs Contact: | Andy Stillman <astillma> | ||||
| Priority: | unspecified | ||||||
| Version: | 16.2 (Train) | CC: | athomas, eglynn | ||||
| Target Milestone: | --- | ||||||
| Target Release: | --- | ||||||
| Hardware: | Unspecified | ||||||
| OS: | Unspecified | ||||||
| Whiteboard: | |||||||
| Fixed In Version: | Doc Type: | If docs needed, set a value | |||||
| Doc Text: | Story Points: | --- | |||||
| Clone Of: | Environment: | ||||||
| Last Closed: | 2022-01-27 06:20:20 UTC | Type: | Bug | ||||
| Regression: | --- | Mount Type: | --- | ||||
| Documentation: | --- | CRM: | |||||
| Verified Versions: | Category: | --- | |||||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||
| Cloudforms Team: | --- | Target Upstream Version: | |||||
| Embargoed: | |||||||
| Attachments: |
|
||||||
Created attachment 1851604 [details]
Glance api logs
The component probably should have been set to python-glance-store.
But I'll let you folks decide.
Attaching the api logs:
grep 5cc8e1f8-2b3b-45bd-b4b2-958847ec5b20 controller-2.api.log
2022-01-18 11:20:30.657 18 DEBUG glance.api.middleware.version_negotiation [-] Determining version of request: PUT /v2/images/5cc8e1f8-2b3b-45bd-b4b2-958847ec5b20/file Accept: */* process_request /usr/lib/python3.6/site-packages/glance/api/middleware/version_negotiation.py:45
2022-01-18 11:20:30.657 18 DEBUG glance.api.middleware.version_negotiation [-] new path /v2/images/5cc8e1f8-2b3b-45bd-b4b2-958847ec5b20/file process_request /usr/lib/python3.6/site-packages/glance/api/middleware/version_negotiation.py:70
2022-01-18 11:26:33.983 18 DEBUG glance_store._drivers.filesystem [req-34ce50ec-8205-42bf-a4af-6fc7db40222b be681907aee94c4ba19f0ce901304a5f 8cc5e1fbf0cb488bb80ac0e8f2a211ac - default default] Wrote 16338944 bytes to /var/lib/glance/images/5cc8e1f8-2b3b-45bd-b4b2-958847ec5b20 with checksum 1d3062cd89af34e419f7100277f38b2b and multihash 553d220ed58cfee7dafe003c446a9f197ab5edf8ffc09396c74187cf83873c877e7ae041cb80f3b91489acf687183adcd689b53b38e3ddd22e627e7f98a09c46 add /usr/lib/python3.6/site-packages/glance_store/_drivers/filesystem.py:777
2022-01-18 11:26:34.033 18 INFO eventlet.wsgi.server [req-34ce50ec-8205-42bf-a4af-6fc7db40222b be681907aee94c4ba19f0ce901304a5f 8cc5e1fbf0cb488bb80ac0e8f2a211ac - default default] 172.17.1.120 - - [18/Jan/2022 11:26:34] "PUT /v2/images/5cc8e1f8-2b3b-45bd-b4b2-958847ec5b20/file HTTP/1.1" 204 0 363.376649
2022-01-18 11:28:01.801 18 DEBUG glance.api.middleware.version_negotiation [-] Determining version of request: GET /v2/images/5cc8e1f8-2b3b-45bd-b4b2-958847ec5b20 Accept: */* process_request /usr/lib/python3.6/site-packages/glance/api/middleware/version_negotiation.py:45
2022-01-18 11:28:01.802 18 DEBUG glance.api.middleware.version_negotiation [-] new path /v2/images/5cc8e1f8-2b3b-45bd-b4b2-958847ec5b20 process_request /usr/lib/python3.6/site-packages/glance/api/middleware/version_negotiation.py:70
2022-01-18 11:28:02.792 18 INFO eventlet.wsgi.server [req-41c7f358-86e3-4a3e-9ea0-ee95d4832077 be681907aee94c4ba19f0ce901304a5f 8cc5e1fbf0cb488bb80ac0e8f2a211ac - default default] 172.17.1.120 - - [18/Jan/2022 11:28:02] "GET /v2/images/5cc8e1f8-2b3b-45bd-b4b2-958847ec5b20 HTTP/1.1" 200 1067 0.991540
And if you look at the log I guess this is the issue:
2022-01-18 11:26:33.983 18 DEBUG glance_store._drivers.filesystem [req-34ce50ec-8205-42bf-a4af-6fc7db40222b be681907aee94c4ba19f0ce901304a5f 8cc5e1fbf0cb488bb80ac0e8f2a211ac - default default] Wrote 16338944 bytes to /var/lib/glance/images/5cc8e1f8-2b3b-45bd-b4b2-958847ec5b20 with checksum 1d3062cd89af34e419f7100277f38b2b and multihash 553d220ed58cfee7dafe003c446a9f197ab5edf8ffc09396c74187cf83873c877e7ae041cb80f3b91489acf687183adcd689b53b38e3ddd22e627e7f98a09c46 add /usr/lib/python3.6/site-packages/glance_store/_drivers/filesystem.py:777
2022-01-18 11:26:34.030 18 DEBUG glance_store.multi_backend [req-34ce50ec-8205-42bf-a4af-6fc7db40222b be681907aee94c4ba19f0ce901304a5f 8cc5e1fbf0cb488bb80ac0e8f2a211ac - default default] Skipping store.set_acls... not implemented set_acls_for_multi_store /usr/lib/python3.6/site-packages/glance_store/multi_backend.py:549
2022-01-18 11:26:34.033 18 INFO eventlet.wsgi.server [req-34ce50ec-8205-42bf-a4af-6fc7db40222b be681907aee94c4ba19f0ce901304a5f 8cc5e1fbf0cb488bb80ac0e8f2a211ac - default default] Traceback (most recent call last):
File "/usr/lib/python3.6/site-packages/eventlet/wsgi.py", line 599, in handle_one_response
write(b'')
File "/usr/lib/python3.6/site-packages/eventlet/wsgi.py", line 538, in write
wfile.flush()
File "/usr/lib64/python3.6/socket.py", line 604, in write
return self._sock.send(b)
File "/usr/lib/python3.6/site-packages/eventlet/greenio/base.py", line 397, in send
return self._send_loop(self.fd.send, data, flags)
File "/usr/lib/python3.6/site-packages/eventlet/greenio/base.py", line 384, in _send_loop
return send_method(data, *args)
ConnectionResetError: [Errno 104] Connection reset by peer
2022-01-18 11:26:34.033 18 INFO eventlet.wsgi.server [req-34ce50ec-8205-42bf-a4af-6fc7db40222b be681907aee94c4ba19f0ce901304a5f 8cc5e1fbf0cb488bb80ac0e8f2a211ac - default default] 172.17.1.120 - - [18/Jan/2022 11:26:34] "PUT /v2/images/5cc8e1f8-2b3b-45bd-b4b2-958847ec5b20/file HTTP/1.1" 204 0 363.376649
2022-01-18 11:26:35.241 18 DEBUG eventlet.wsgi.server [-] (18) accepted ('172.17.1.104', 37736) serv
grep 5cc8e1f8-2b3b-45bd-b4b2-958847ec5b20 controller-0.api.log
2022-01-18 11:20:30.645 18 DEBUG glance.api.v2.images [req-eed18d90-9135-4ecb-b305-3e66f1bc2595 be681907aee94c4ba19f0ce901304a5f 8cc5e1fbf0cb488bb80ac0e8f2a211ac - default default] The 'locations' list of image 5cc8e1f8-2b3b-45bd-b4b2-958847ec5b20 is empty, not including 'direct_url' in response _format_image /usr/lib/python3.6/site-packages/glance/api/v2/images.py:1339
$ grep 5cc8e1f8-2b3b-45bd-b4b2-958847ec5b20 controller-1.api.log
Didn't report anything at all per this image.
We suspect there is an issue with either the netapp or networking, I'm still looking into this, it may reach a "closed as not a bug" state. I'll follow up and update this bug within a few days. Closed as not a bug, rot cause turned out to be a netapp issue. After rebooting both cluster nodes Glance uploads and CI jobs no seem to be running. |
Description of problem: On a system where Glance uses NFS, a Netapp share in my case, when I try to upload am image, cirros in this case, I got this response: HTTP 504 Gateway Time-out: The server didn't respond in time. Version-Release number of selected component (if applicable): python3-glance-19.0.5-2.20210528112824.b3de9da.el8ost.1.noarch python3-glance-store-1.0.2-2.20210528102039.79e043a.el8ost.noarch openstack-glance-19.0.5-2.20210528112824.b3de9da.el8ost.1.noarch python3-glanceclient-2.17.1-2.20210528071101.1aba8f2.el8ost.1.noarch How reproducible: Both Liron and I hit the same issue on this system. I think he mentioned he may have noticed this in the past before. Steps to Reproduce: 1. Deploy a system with Glance over NFS BTW on this system Cinder also uses NFS, cinder create vol worked fine. GlanceBackend: 'file' GlanceNfsEnabled: true GlanceNfsShare: '172.16.X.Y:/vol_glance' GlanceNfsOptions: '_netdev,bg,intr,context=system_u:object_r:svirt_sandbox_file_t:s0,nosharecache' 2. Try to upload a small cirros image (overcloud) [stack@undercloud-0 ~]$ glance image-create --name tshefi --file cirr.img --disk-format qcow2 --container-format bare +------------------+--------------------------------------+ | Property | Value | +------------------+--------------------------------------+ | checksum | None | | container_format | bare | | created_at | 2022-01-18T11:20:30Z | | disk_format | qcow2 | | id | 5cc8e1f8-2b3b-45bd-b4b2-958847ec5b20 | | min_disk | 0 | | min_ram | 0 | | name | tshefi | | os_hash_algo | None | | os_hash_value | None | | os_hidden | False | | owner | 8cc5e1fbf0cb488bb80ac0e8f2a211ac | | protected | False | | size | None | | status | queued | | tags | [] | | updated_at | 2022-01-18T11:20:30Z | | virtual_size | Not available | | visibility | shared | +------------------+--------------------------------------+ HTTP 504 Gateway Time-out: The server didn't respond in time. ^ this is the error, I shouldn't get. 3. Recheck image status a few seconds later, image is available (overcloud) [stack@undercloud-0 ~]$ glance image-show 5cc8e1f8-2b3b-45bd-b4b2-958847ec5b20 +------------------+----------------------------------------------------------------------------------+ | Property | Value | +------------------+----------------------------------------------------------------------------------+ | checksum | 1d3062cd89af34e419f7100277f38b2b | | container_format | bare | | created_at | 2022-01-18T11:20:30Z | | direct_url | file:///var/lib/glance/images/5cc8e1f8-2b3b-45bd-b4b2-958847ec5b20 | | disk_format | qcow2 | | id | 5cc8e1f8-2b3b-45bd-b4b2-958847ec5b20 | | min_disk | 0 | | min_ram | 0 | | name | tshefi | | os_hash_algo | sha512 | | os_hash_value | 553d220ed58cfee7dafe003c446a9f197ab5edf8ffc09396c74187cf83873c877e7ae041cb80f3b9 | | | 1489acf687183adcd689b53b38e3ddd22e627e7f98a09c46 | | os_hidden | False | | owner | 8cc5e1fbf0cb488bb80ac0e8f2a211ac | | protected | False | | size | 16338944 | | status | active | | stores | default_backend | | tags | [] | | updated_at | 2022-01-18T11:26:33Z | | virtual_size | Not available | | visibility | shared | +------------------+----------------------------------------------------------------------------------+ Also confirm file is indeed upload/found on NFS share: -rw-r-----. 1 42415 42415 16338944 Jan 18 13:26 5cc8e1f8-2b3b-45bd-b4b2-958847ec5b20 4. Liron also attempted to upload a cirros image via python tempest conf again hit same 504 error. | 4ff37631-a516-42d8-8835-f9a0f51323e1 | cirros-0.4.0-x86_64-disk.img | Actual results: Image upload reports: "HTTP 504 Gateway Time-out: The server didn't respond in time." I noticed this on the logs: [root@controller-0 ~]# grep -ir 5cc8e1f8-2b3b-45bd-b4b2-958847ec5b20 /var/log/containers/glance/api.log:2022-01-18 11:20:30.645 18 DEBUG glance.api.v2.images [req-eed18d90-9135-4ecb-b305-3e66f1bc2595 be681907aee94c4ba19f0ce901304a5f 8cc5e1fbf0cb488bb80ac0e8f2a211ac - default default] The 'locations' list of image 5cc8e1f8-2b3b-45bd-b4b2-958847ec5b20 is empty, not including 'direct_url' in response _format_image /usr/lib/python3.6/site-packages/glance/api/v2/images.py:1339 Expected results: Image should upload without the 504 error.