| Summary: | [VDSM] sometimes one host loosing IP address | ||||||||
|---|---|---|---|---|---|---|---|---|---|
| Product: | [oVirt] vdsm | Reporter: | Kobi Hakimi <khakimi> | ||||||
| Component: | General | Assignee: | Dan Kenigsberg <danken> | ||||||
| Status: | CLOSED NOTABUG | QA Contact: | Meni Yakove <myakove> | ||||||
| Severity: | urgent | Docs Contact: | |||||||
| Priority: | unspecified | ||||||||
| Version: | 4.17.26 | CC: | bugs, gklein, khakimi, ylavi | ||||||
| Target Milestone: | --- | Keywords: | AutomationBlocker | ||||||
| Target Release: | --- | Flags: | gklein: ovirt-3.6.z? gklein: ovirt-4.0.0? gklein: blocker? rule-engine: planning_ack? rule-engine: devel_ack? rule-engine: testing_ack? | ||||||
| Hardware: | Unspecified | ||||||||
| OS: | Unspecified | ||||||||
| Whiteboard: | |||||||||
| Fixed In Version: | Doc Type: | Bug Fix | |||||||
| Doc Text: | Story Points: | --- | |||||||
| Clone Of: | Environment: | ||||||||
| Last Closed: | 2016-05-05 08:30:37 UTC | Type: | Bug | ||||||
| Regression: | --- | Mount Type: | --- | ||||||
| Documentation: | --- | CRM: | |||||||
| Verified Versions: | Category: | --- | |||||||
| oVirt Team: | Network | RHEL 7.3 requirements from Atomic Host: | |||||||
| Cloudforms Team: | --- | Target Upstream Version: | |||||||
| Attachments: | |||||||||
Created attachment 1151137 [details]
messages log files
Critical /var/log/messages logs (from April 24 to 26) are missing. My guess is that at some point during this interval (Apr 25 ~4PM), dhclient died. 24 hours later, the host loses its lease and dies. Please add the logs from that time when the problem reproduces.

You are right, this time interval is missing, but that is weird, since I copied the entire msg folder. Next time I will try to add this log.

If this happens again and you find the log, please reopen.

When I tried to clean GE5:

Engine: jenkins-vm-13.scl.lab.tlv.redhat.com
Hosts:
- RHEL72: lynx23,24
- RHEVH72: lynx21,22

which is installed with Red Hat Enterprise Virtualization Manager Version: 3.6.6-0.1.el6, I got a connection timeout; see:
https://rhev-jenkins.rhev-ci-vms.eng.rdu2.redhat.com:8443/job/GE-cleaner/1530/consoleFull

When I looked at this error I saw that 2 hosts were not up:
- lynx24 was in maintenance mode
- lynx21 was in Non Responsive status

I tried to activate the first and reinstall the second, but both operations failed. I then tried to ping these machines, but there was no connection at all, so I connected to one of them with ipmitool:

ipmitool -I lanplus -H lynx24-mgmt.qa.lab.tlv.redhat.com -U root -P **** sol activate

and saw:
1. There is no IP address, as expected.
2. The command "pgrep dhcliet" returns nothing.

I left the machine stuck in this state so that it can be investigated, which is why I couldn't take the logs. Could you please investigate it?

Sorry Kobi, I do not understand. Can you please collect the historic /var/log/messages from the non-responsive host? If not, why?

In this case we found the reason: one power management test case stopped the network and failed to reactivate it. Sorry for interrupting you all!
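
The two console checks described in this thread (is there an IP address, is dhclient running) could be scripted so that they are captured next time before the host is touched. The following is only a minimal sketch, not part of VDSM: the function names are hypothetical, and it assumes that `pgrep` (procps) and `ip` (iproute2) are available on the host.

```python
import subprocess
from typing import Dict, List


def parse_ipv4_addrs(ip_output: str) -> Dict[str, List[str]]:
    """Parse `ip -4 -o addr show` output into {interface: [address/prefix, ...]}."""
    addrs: Dict[str, List[str]] = {}
    for line in ip_output.splitlines():
        fields = line.split()
        # One-line format: "<index>: <ifname> inet <addr>/<prefix> ..."
        if len(fields) >= 4 and fields[2] == "inet":
            addrs.setdefault(fields[1], []).append(fields[3])
    return addrs


def dhclient_running() -> bool:
    """Return True if a process named exactly 'dhclient' exists."""
    # pgrep exits with 0 when at least one process matches, 1 when none do.
    result = subprocess.run(
        ["pgrep", "-x", "dhclient"],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def host_ipv4_addrs() -> Dict[str, List[str]]:
    """Return the IPv4 addresses currently configured on this host."""
    out = subprocess.run(
        ["ip", "-4", "-o", "addr", "show"],
        capture_output=True,
        text=True,
        check=True,
    ).stdout
    return parse_ipv4_addrs(out)
```

On an affected host, `dhclient_running()` returning `False` together with `host_ipv4_addrs()` listing only `lo` would match what was seen over the serial console.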
Created attachment 1151136 [details]
folder of vdsm log

Description of problem:
[VDSM] sometimes one host loosing IP address

Version-Release number of selected component (if applicable):
Red Hat Enterprise Virtualization Manager Version: 3.6.5.3-0.1.el6

How reproducible:
sometimes

Steps to Reproduce:
There is no known scenario to reproduce it, but this is the second time in about a week that I have experienced this issue.
1. With oVirt 4.0:
- I looked at my GE4 engine a day after installing it
- host_mixed_1 was inactive
- When I tried to connect to it I realized that it had lost its IP address
2. With RHEVM 3.6.5:
- I looked at the GE3 engine after running some test[1]
- host_mixed_2 was inactive
- When I tried to connect to it I realized that it had lost its IP address

Actual results:
In the events log I saw the following event at about the time the host probably lost its IP address:
Apr 26, 2016 5:10:08 PM - VDSM host_mixed_2 command failed: Heartbeat exeeded

Expected results:
The host keeps working without losing its IP address.

Additional info:
danken started to investigate it but didn't find the cause. See the attached vdsm and messages logs.
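
As a rough sanity check of the timing hypothesis raised in the comments (dhclient dying around Apr 25 ~4PM and the lease being lost 24 hours later), the expected time of the address loss can be compared with the heartbeat event quoted above. This is only a sketch: the dhclient time is the approximate guess from the comment, the 24-hour interval is the figure quoted there rather than a measured lease duration, and the year 2016 is assumed from the event timestamp.

```python
from datetime import datetime, timedelta

# Approximate time at which dhclient is guessed to have died (from the comment).
dhclient_died = datetime(2016, 4, 25, 16, 0)
# "24 hours later" as quoted in the comment; not a measured lease time.
lease_interval = timedelta(hours=24)
# Timestamp of the "Heartbeat exeeded" event from the engine events log.
heartbeat_failed = datetime(2016, 4, 26, 17, 10, 8)

expected_loss = dhclient_died + lease_interval
gap = heartbeat_failed - expected_loss

print(expected_loss)  # 2016-04-26 16:00:00
print(gap)            # 1:10:08
```

Under these assumptions the event falls roughly an hour after the estimated loss of the lease, which is consistent with the guess given how approximate the "~4PM" figure is, but does not confirm it.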