Bug 1099209
| Summary: | Invalid status on Data Center - Setting Data Center status to Non Responsive - Network error during communication with the Host | | |
|---|---|---|---|
| Product: | [Retired] oVirt | Reporter: | Sokratis <sokratis123k> |
| Component: | ovirt-engine-core | Assignee: | Liron Aravot <laravot> |
| Status: | CLOSED NOTABUG | QA Contact: | Pavel Stehlik <pstehlik> |
| Severity: | high | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | 3.4 | CC: | acathrow, amureini, bugs, gklein, iheim, laravot, sokratis123k, yeylon |
| Target Milestone: | --- | | |
| Target Release: | 3.4.2 | | |
| Hardware: | x86_64 | | |
| OS: | Linux | | |
| Whiteboard: | storage | | |
| Fixed In Version: | | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2014-06-01 19:28:43 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | Storage | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Attachments: | | | |

Liron, can you please take a look? Sokratis, can you also add VDSM's logs?

There are 2 hosts connected at the moment. Do you need vdsm.log from both?

Created attachment 897498: ovirt-node-01 vdsm.log

Created attachment 897500: ovirt-node-02 vdsm.log

I added vdsm.log from both hosts anyway. Let me know if you need anything else. Thanks!

After rebooting both oVirt nodes, the data center came up again and has been running normally for almost a day. I will keep monitoring this for a couple of days. In the meantime, please let me know if you find anything in the logs that indicates a problem. Thank you!

Liron, any insights?

Any updates regarding this?

Sokratis, the main issue here seems to be a storage access problem:
Thread-53::DEBUG::2014-05-19 17:14:38,989::task::1185::TaskManager.Task::(prepare) Task=`3e2832b1-451a-4f09-a33b-b5d1cd652491`::finished: {'4e6c401f-7ab7-4b16-99f5-e32cc564495e': {'code': 0, 'version': 0, 'acquired': True, 'delay': '0.000514588', 'lastCheck': '9.5', 'valid': True}, 'f742258e-ecf8-436f-9da9-b5cdb4c305ba': {'code': 0, 'version': 3, 'acquired': True, 'delay': '0.000332391', 'lastCheck': '223.1', 'valid': True}}
(note the high "lastCheck" value of 223.1 for domain f742258e-ecf8-436f-9da9-b5cdb4c305ba). Can you check your hosts' access to the storage?
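
The check suggested above can be sketched in a few lines of Python. This is only a minimal illustration, assuming that the stats dict follows the `finished: ` marker exactly as in the log line quoted above and that `lastCheck` is the number of seconds since the host last checked the domain. The helper name `stale_domains` and the 30-second threshold are hypothetical choices made for this example; they are not part of VDSM or taken from this bug.

```python
import ast

# Illustrative threshold in seconds; an assumption, not a value from this bug.
STALE_AFTER_SECONDS = 30.0


def stale_domains(log_line, threshold=STALE_AFTER_SECONDS):
    """Return {domain_uuid: lastCheck} for domains whose lastCheck exceeds threshold."""
    # The per-domain stats dict is everything after the "finished: " marker.
    _, _, payload = log_line.partition("finished: ")
    stats = ast.literal_eval(payload.strip())
    return {
        uuid: float(info["lastCheck"])
        for uuid, info in stats.items()
        if float(info["lastCheck"]) > threshold
    }


# The vdsm.log line quoted in this bug, split only for readability.
line = (
    "Thread-53::DEBUG::2014-05-19 17:14:38,989::task::1185::TaskManager.Task::(prepare) "
    "Task=`3e2832b1-451a-4f09-a33b-b5d1cd652491`::finished: "
    "{'4e6c401f-7ab7-4b16-99f5-e32cc564495e': {'code': 0, 'version': 0, 'acquired': True, "
    "'delay': '0.000514588', 'lastCheck': '9.5', 'valid': True}, "
    "'f742258e-ecf8-436f-9da9-b5cdb4c305ba': {'code': 0, 'version': 3, 'acquired': True, "
    "'delay': '0.000332391', 'lastCheck': '223.1', 'valid': True}}"
)

for uuid, last_check in stale_domains(line).items():
    print(f"{uuid}: lastCheck={last_check}")
```

Run against the quoted line, this flags only the second domain, which matches the observation above; the first domain's value of 9.5 stays under the assumed threshold.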

Actually, what I noticed was very high load on the oVirt nodes at that time (>300), and that's why I rebooted them. Since then (10 days ago) I haven't had any problems with the hosts' access to the storage. It could be a stuck process or just a random problem with storage. I don't think we can investigate this further. I suggest you close it and if the problem occurs again I can just open a new request. Thanks for your help!

Sokratis

(In reply to Sokratis from comment #11)
> I suggest you close it and if the
> problem occurs again I can just open a new request.

Closing based on this comment. Sokratis, please don't hesitate to reopen this issue if the problem reoccurs.

Created attachment 897272: ovirt engine log

Description of problem:
On 19/05/2014 at 17:14 the Data Center switched to Non-Responsive and every attempt to activate the Master Data Domain failed.

Version-Release number of selected component (if applicable):
oVirt 3.4.1

How reproducible:

Steps to Reproduce:
1. Select Master Data Domain and try to Activate

Actual results:
Activation failed with the following error: Network error during communication with the Host

engine.log is attached.

Thank you,
Sokratis