Bug 1169831
| Summary: | [rhev-upgrade] migration fail with libvirt error after vdsm upgrade | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Virtualization Manager | Reporter: | Michael Burman <mburman> |
| Component: | ovirt-engine | Assignee: | Nobody <nobody> |
| Status: | CLOSED WORKSFORME | QA Contact: | |
| Severity: | urgent | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | 3.5.0 | CC: | danken, ecohen, gklein, iheim, lpeer, lsurette, lvernia, mavital, mburman, michal.skrivanek, ofrenkel, rbalakri, Rhev-m-bugs, yeylon |
| Target Milestone: | --- | | |
| Target Release: | 3.5.0 | | |
| Hardware: | x86_64 | | |
| OS: | Linux | | |
| Whiteboard: | virt | | |
| Fixed In Version: | | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2014-12-15 09:27:56 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Attachments: | | | |
well it just looks like your other host doesn't have enough memory to run the vm:

2014-12-02 15:41:03,273 INFO [org.ovirt.engine.core.bll.MigrateVmCommand] (org.ovirt.thread.pool-7-thread-46) [1ba74e91] Running command: MigrateVmCommand internal: false. Entities affected : ID: d4e7dcb2-1bc7-4df2-be72-9dbfbce24b0f Type: VMAction group MIGRATE_VM with role type USER, ID: d4e7dcb2-1bc7-4df2-be72-9dbfbce24b0f Type: VMAction group EDIT_VM_PROPERTIES with role type USER, ID: fc27297c-1b31-4a51-9f63-9b900346ef38 Type: VdsGroupsAction group CREATE_VM with role type USER

2014-12-02 15:41:03,280 INFO [org.ovirt.engine.core.bll.scheduling.SchedulingManager] (org.ovirt.thread.pool-7-thread-46) [1ba74e91] Candidate host orange-vdsc.qa.lab.tlv.redhat.com (ff17c2c4-fa88-4dee-8846-21c494a9a16b) was filtered out by VAR__FILTERTYPE__INTERNAL filter Memory (correlation id: 1ba74e91)

2014-12-02 15:41:03,288 WARN [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (org.ovirt.thread.pool-7-thread-46) [1ba74e91] Correlation ID: 1ba74e91, Job ID: 75d6cfcd-c0b3-41b6-8416-bb35b6e68b2b, Call Stack: null, Custom Event ID: -1, Message: Migration failed, No available host found (VM: Students_, Source: orange-vdsd.qa.lab.tlv.redhat.com).

are you sure orange-vdsc has enough memory to run the vm?

Yes indeed.
Max free Memory for scheduling new VMs: 2355 MB

Please connect to my setup 10.35.161.37, it's still happening.

(In reply to Michael Burman from comment #2)
> Yes indeed.
> Max free Memory for scheduling new VMs: 2355 MB
>
> Please connect to my setup 10.35.161.37, it's still happening.

sorry, but you didn't put any more access information, which cluster, which VM... that said it's most likely the memory. Since there are no other reports I'm removing the blocker request. please check the scheduling err about memory when it happens to you

Michal, please enter the setup, migration still fails. I'm not sure what the problem is; it has been in this state since last week and I didn't change anything.
I see a host installation is going on right now...so please try to find some stable time or gather relevant logs when it reproduces. Thanks

All relevant logs are attached. The setup is already upgraded to 3.5 and I can't reproduce this issue at the moment. I left the setup untouched for a whole week for investigation, until yesterday.

well, too bad the logs are wrong, the libvirt log is 10 days older than vdsm. so...it doesn't reproduce anymore? Do you have any better logs from yesterday's attempt?

Please connect to the setup and hosts, the logs should be there from yesterday's migration attempt.

Seems to be related to networking. Some err about can't set MTU on an interface on dst. Looks like screwed setup....

failing on destination on:
libvirtError: Cannot get interface MTU on 'qbr7a9364a8-67': No such device

I would suspect some network setup discrepancy or bug. Lior?

I've asked ibarkan to look into it. Not necessarily related to MTU, might be a misleading error message.

I'm having trouble correlating what you're describing to the right log entries. Please reproduce, and point us EXACTLY to the right log entries when the failure occurs.

Hi Lior,
Like I wrote in the description, I'm not sure this issue is reproducible. This issue occurred in a mixed upgrade setup, after updating vdsm on my 2 hosts from vdsm-4.16.7.4-1.el6ev to vdsm-4.16.7.5-1.el6ev. The setup stayed this way for almost a week before restarting the engine, updating it to the latest build and updating vdsm on the hosts to the latest version. When we take this setup back from a snapshot, I will try to reproduce.

(In reply to Michal Skrivanek from comment #14)
> failing on destination on:
> libvirtError: Cannot get interface MTU on 'qbr7a9364a8-67': No such device

qbr7a9364a8-67 stinks of openstack's neutron. Has this host been used by the openstack-net hook? By openstack directly?

Michal, where did you see that libvirt error? grepping the attached logs produced nothing.
(In reply to Dan Kenigsberg from comment #19)
> Michal, where did you see that libvirt error? grepping the attached logs
> produced nothing.

yes, "thanks" to the IOProcess verbose logging it's not so easy to dig it out from the logs:-) Better to see it live.

(In reply to Dan Kenigsberg from comment #18)
> (In reply to Michal Skrivanek from comment #14)
> > failing on destination on:
> > libvirtError: Cannot get interface MTU on 'qbr7a9364a8-67': No such device
>
> qbr7a9364a8-67 stinks of openstack's neutron. Has this host been used by the
> openstack-net hook? By openstack directly?

Hi Dan,
These hosts have openstack packages installed:
openstack-neutron-openvswitch-2014.1.3-12.el6ost.noarch
vdsm-hook-openstacknet-4.16.8.1-2.el6ev.noarch
openstack-utils-2014.1-3.2.el6ost.noarch
openstack-neutron-2014.1.3-12.el6ost.noarch

But at the stage when this migration issue happened, no Neutron appliance was running on these hosts; only the openstack packages were installed.

Michael, given the log rotation, we must have a reproduction in order to continue. Please make sure that http://gerrit.ovirt.org/35930 is applied to /etc/vdsm/logger.conf so logs are not rotated as vigorously.

Dan, when we take this mixed upgrade setup from snapshot, I will try to reproduce this one and apply this to /etc/vdsm/logger.conf.

Didn't manage to reproduce this migration issue.

So let's close this for now, please reopen if it happens again! Or just open a new bug with the accurate scenario.
Created attachment 963768 [details]
migration fail logs

Description of problem:
[rhev-upgrade] migration fail with libvirt error after vdsm upgrade.
In the upgrade environment I was performing migrations over the migration network. After the host was updated from vdsm-4.16.7.4-1.el6ev to vdsm-4.16.7.5-1.el6ev, migrations failed with the error:
'Migration failed, No available host found (VM: Students_, Source: orange-vdsd.qa.lab.tlv.redhat.com).'
There was connectivity between source and destination. The issue was worked around by shutting the VM down and running it again.

Version-Release number of selected component (if applicable):
3.5.0-0.22.el6ev
vdsm-4.16.7.5-1.el6ev
libvirt-0.10.2-46.el6_6.2

How reproducible:
Not sure if this is reproducible.

Additional info:
libvirt and vdsm logs from source and destination attached, also engine log.
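The engine log excerpt quoted in the comments shows the scheduler rejecting the destination host through its Memory filter, under the same correlation id as the "No available host found" message. When triaging a similar failure, a small script can pull those filter decisions out of engine.log. What follows is only a sketch: the regular expression is modeled on the single SchedulingManager line quoted in this bug, so the wording may differ in other engine versions, and the function name `filtered_hosts` is made up for illustration.

```python
import re

# Pattern modeled on the one SchedulingManager line quoted in this bug.
# Assumption: other engine versions may word this message differently.
FILTERED_RE = re.compile(
    r"Candidate host (?P<host>\S+) \((?P<host_id>[0-9a-f-]+)\) "
    r"was filtered out by \S+ filter (?P<filter>.+?) "
    r"\(correlation id: (?P<corr>\w+)\)"
)


def filtered_hosts(lines):
    """Return (correlation id, host, filter name) for each rejected candidate host."""
    results = []
    for line in lines:
        match = FILTERED_RE.search(line)
        if match:
            results.append(
                (match.group("corr"), match.group("host"), match.group("filter"))
            )
    return results


# Sample input: the scheduler line from this bug, copied verbatim.
sample = [
    "2014-12-02 15:41:03,280 INFO "
    "[org.ovirt.engine.core.bll.scheduling.SchedulingManager] "
    "(org.ovirt.thread.pool-7-thread-46) [1ba74e91] Candidate host "
    "orange-vdsc.qa.lab.tlv.redhat.com (ff17c2c4-fa88-4dee-8846-21c494a9a16b) "
    "was filtered out by VAR__FILTERTYPE__INTERNAL filter Memory "
    "(correlation id: 1ba74e91)",
]

for corr, host, filter_name in filtered_hosts(sample):
    print(f"[{corr}] {host} rejected by filter: {filter_name}")
```

Grouping the results by correlation id makes it easier to match each failed migration to the filter that rejected its destination host before digging into the vdsm and libvirt logs.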