Bug 1263512
| Field | Value |
|---|---|
| Summary | virt-who can't detect the connect to vcenter was timeout |
| Product | Red Hat Enterprise Linux 7 |
| Reporter | Liushihui <shihliu> |
| Component | virt-who |
| Assignee | Radek Novacek <rnovacek> |
| Status | CLOSED ERRATA |
| QA Contact | Eko <hsun> |
| Severity | medium |
| Priority | unspecified |
| Version | 7.2 |
| CC | hsun, ldai, ovasik, owwang, sgao, shihliu, wpinheir |
| Target Milestone | rc |
| Hardware | x86_64 |
| OS | Linux |
| Fixed In Version | virt-who-0.17-3.el7 |
| Doc Type | No Doc Update |
| Last Closed | 2016-11-04 05:05:59 UTC |
| Type | Bug |
| Bug Depends On | 1291737 |

Description
Liushihui
2015-09-16 03:29:17 UTC
Created attachment 1073887: virt-who's log

I don't see any reasonable way to solve this. We would need some kind of heartbeat mechanism to poll whether the connection to ESX is alive, but that would defeat the purpose of listening to ESX events. virt-who now retries the connection to ESX after 30 minutes of inactivity. I think this interval is fine for this kind of unexpected event, which shouldn't occur very often. What do you think?

I think it's OK to retry the connection to ESX after 30 minutes. However, I also wonder whether we can show customers a reminder about what happened to the connection between virt-who and vCenter. It needn't run like a heartbeat mechanism; a one-time reminder is enough, e.g. "Failed to connect with vcenter, we'll try the connection again after 30min". Is that acceptable?

The problem is that virt-who can't detect that the connection is no longer working. From virt-who's perspective, the connection is still fine even without any communication from ESX. Without some check (a heartbeat), there is no way to distinguish between just waiting for an event and a broken connection; both mean no communication.

Radek, I understand that virt-who can't detect a broken connection when the refresh interval is too long, since virt-who has no heartbeat mechanism to detect it. However, I wonder why virt-who still can't detect that the connection is not working when the refresh interval is 5s. In my opinion, virt-who should get the host/guest mapping info from vCenter every 5s. If it gets no response from vCenter, it should detect that the connection is not working and show some error info. It needn't send the error info every 5s; a one-time reminder is enough.

I agree that virt-who's behavior is not optimal. I'll look into it for the next version.

This issue has been resolved upstream. It will be fixed by the virt-who rebase.

Fixed in virt-who-0.17-1.el7.

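The point made in the discussion above, that an idle server and a silently broken connection look the same to a waiting client, can be illustrated with a minimal Python sketch. This is not virt-who's code: the function name `poll_once` and the use of a local socket pair are assumptions made purely for illustration. The sketch shows that a read timeout only bounds the wait; it turns an endless block into a "silent" result that the caller can act on, for example by logging a warning and reconnecting.

```python
import socket


def poll_once(conn, read_timeout):
    """Wait up to read_timeout seconds for data on conn (hypothetical helper).

    Returns "event" if data arrived, "closed" if the peer closed the
    connection cleanly, and "silent" if nothing arrived in time.
    """
    conn.settimeout(read_timeout)
    try:
        data = conn.recv(4096)
    except socket.timeout:
        # Nothing arrived. An idle server and a firewall that silently
        # drops packets both end up here; the client cannot tell them apart.
        return "silent"
    # recv() returns an empty bytes object only on an orderly shutdown.
    return "event" if data else "closed"


if __name__ == "__main__":
    client, server = socket.socketpair()
    print(poll_once(client, 0.1))   # the server has sent nothing yet
    server.sendall(b"update")
    print(poll_once(client, 0.1))   # data is waiting to be read
    server.close()
    print(poll_once(client, 0.1))   # the peer has closed its end
    client.close()
```

Without the `settimeout` call, the first `recv` would block indefinitely, which matches the symptom reported here: no error and no log output for as long as the connection stays silent.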
Reopening on virt-who-0.17-2.el7.noarch, since virt-who doesn't generate any info if it fails to connect to vCenter during the monitoring process.

Checked version:
subscription-manager-1.17.6-1.el7.x86_64
python-rhsm-1.17.2-1.el7.x86_64

Checked process:
1. Register the system to Satellite 6.2.
2. Configure virt-who to run in esx mode with a refresh interval of 60s.
3. Restart the virt-who service; virt-who sends the host/guest mapping to the server every 60s.
4. In vCenter, set the firewall to block connections from the virt-who machine.
5. Check virt-who's status and log:
[root@localhost ~]# ps -ef|grep virt-who
root 11254 1 0 03:34 ? 00:00:00 /usr/bin/python2 /usr/bin/virt-who
root 11261 11254 0 03:34 ? 00:00:00 /usr/bin/python2 /usr/bin/virt-who
root 20567 10354 0 03:46 pts/0 00:00:00 grep --color=auto virt-who

Result:
After 1 hour, virt-who still hasn't generated any info. While the problem occurs, virt-who keeps running normally. However, after the connection between vCenter and virt-who is restored, virt-who still can't detect the connection (nothing appears in the log). When the system is unregistered, it shows the following error info:
2016-06-02 21:41:34,664 [virtwho.main WARNING] MainProcess(11254):MainThread @executor.py:reload:326 - virt-who reload
2016-06-02 21:41:34,696 [virtwho.test-esx1 INFO] Esx-1(11261):MainThread @esx.py:logout:346 - Can't log out from ESX: Server raised fault: 'The session is not authenticated.'

Fixed upstream: https://github.com/virt-who/virt-who/commit/300a7c7dd2aa638e154b896891cc9c093df810ed

Fixed in virt-who-0.17-3.el7.

Verified on virt-who-0.17-6.el7: virt-who can detect the disconnection from vCenter, and it can also recover the connection once vCenter is reachable again.

Verified version:
virt-who-0.17-6.el7.noarch
subscription-manager-1.17.9-1.el7.x86_64
python-rhsm-1.17.5-1.el7.x86_64

Verified process:
1. Register the system to Satellite 6.2.
2. Configure virt-who to run in esx mode with a refresh interval of 60s.
3. Restart the virt-who service; virt-who sends the host/guest mapping to the server every 60s.
4. In vCenter, set the firewall to block connections from the virt-who machine.
5. After 60s, check virt-who's log; virt-who reports an error in the log:
# tail -f /var/log/rhsm/rhsm.log
2016-07-27 01:52:58,899 [virtwho.env_cmdline ERROR] Esx-1(50919):MainThread @virt.py:run:377 - Virt backend 'env/cmdline' fails with exception:
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/virtwho/virt/virt.py", line 372, in run
    self._run()
  File "/usr/lib/python2.7/site-packages/virtwho/virt/esx/esx.py", line 175, in _run
    options=options)
  File "/usr/lib/python2.7/site-packages/suds/client.py", line 542, in __call__
    return client.invoke(args, kwargs)
  File "/usr/lib/python2.7/site-packages/suds/client.py", line 602, in invoke
    result = self.send(soapenv)
  File "/usr/lib/python2.7/site-packages/suds/client.py", line 641, in send
    reply = transport.send(request)
  File "/usr/lib/python2.7/site-packages/virtwho/virt/esx/esx.py", line 96, in send
    verify=False
  File "/usr/lib/python2.7/site-packages/requests/sessions.py", line 507, in post
    return self.request('POST', url, data=data, json=json, **kwargs)
  File "/usr/lib/python2.7/site-packages/requests/sessions.py", line 464, in request
    resp = self.send(prep, **send_kwargs)
  File "/usr/lib/python2.7/site-packages/requests/sessions.py", line 576, in send
    r = adapter.send(request, **kwargs)
  File "/usr/lib/python2.7/site-packages/requests/adapters.py", line 433, in send
    raise ReadTimeout(e, request=request)
ReadTimeout: HTTPSConnectionPool(host='10.73.2.95', port=443): Read timed out. (read timeout=64)
2016-07-27 01:52:58,900 [virtwho.env_cmdline INFO] Esx-1(50919):MainThread @virt.py:run:390 - Waiting 60 seconds before retrying backend 'env/cmdline'
6. In vCenter, remove the firewall rule so that virt-who can connect to vCenter normally.
7. After 60s, check virt-who's log; virt-who logs normal info again:
# tail -f /var/log/rhsm/rhsm.log
2016-07-27 01:53:58,963 [virtwho.env_cmdline DEBUG] Esx-1(50919):MainThread @esx.py:_prepare:128 - Log into ESX
2016-07-27 01:54:00,581 [virtwho.env_cmdline DEBUG] Esx-1(50919):MainThread @esx.py:_prepare:131 - Creating ESX event filter
2016-07-27 01:54:01,263 [virtwho.env_cmdline DEBUG] Esx-1(50919):MainThread @virt.py:enqueue:357 - Report for config "env/cmdline" gathered, putting to queue for sending
2016-07-27 01:54:01,264 [virtwho.main INFO] MainProcess(50912):MainThread @executor.py:run:250 - Report for config "env/cmdline" hasn't changed, not sending

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2016-2387.html
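
The recovery pattern visible in the verification log (the backend fails with an exception, virt-who waits 60 seconds, then logs into ESX again) can be sketched roughly as below. This is an illustrative sketch only, not the upstream fix: `run_with_retry`, `run_once`, `sleep` and `max_attempts` are hypothetical names, and the real virt-who loop may be structured differently.

```python
import logging

logger = logging.getLogger("retry_sketch")


def run_with_retry(run_once, interval, sleep, max_attempts):
    """Call run_once() until it succeeds (hypothetical helper).

    After each failure the exception is logged and the caller-supplied
    sleep(interval) is invoked before the next attempt. The last failure
    is re-raised once max_attempts is reached.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return run_once()
        except Exception:
            # Log the full traceback so the failure is visible to the user.
            logger.exception("Backend fails with exception:")
            if attempt == max_attempts:
                raise
            logger.info("Waiting %s seconds before retrying backend", interval)
            sleep(interval)
```

A caller would typically pass `time.sleep` as `sleep` and the configured refresh interval as `interval`. Passing the sleep function in as a parameter keeps the sketch easy to exercise without real delays.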