| Summary: | RFE: Ability to live migrate instance after moving existing ceph-mon service on different nodes | | |
|---|---|---|---|
| Product: | Red Hat OpenStack | Reporter: | Marius Cornea <mcornea> |
| Component: | rhosp-director | Assignee: | Angus Thomas <athomas> |
| Status: | CLOSED NOTABUG | QA Contact: | Omri Hochman <ohochman> |
| Severity: | urgent | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | 10.0 (Newton) | CC: | berrange, dbecker, dgilbert, eglynn, gfidente, jslagle, kchamart, mburns, mcornea, morazi, rhel-osp-director-maint |
| Target Milestone: | ga | Keywords: | FutureFeature |
| Target Release: | 10.0 (Newton) | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2016-11-11 15:54:52 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Attachments: | Source & destination qemu logs (attachment 1218161) | | |
ceph.conf on the compute node:

```
[global]
osd_pool_default_min_size = 1
auth_service_required = cephx
mon_initial_members = overcloud-serviceapi-0,overcloud-serviceapi-1,overcloud-serviceapi-2
fsid = d825caf0-a446-11e6-91fe-525400a81fbf
cluster_network = 192.168.0.18/25
auth_supported = cephx
auth_cluster_required = cephx
mon_host = 10.0.0.154,10.0.0.153,10.0.0.157
auth_client_required = cephx
public_network = 10.0.0.144/25
```

I think we need to restart the QEMU process: by stopping/restarting the VM, QEMU should reinitialize its rbd connection with the new config settings.

Yep, it looks so; after doing nova stop/start I no longer got the old MONs timeout error message.

@mcornea: can you clarify the apparent contradiction between comment #0:

"I tried manually restarting libvirtd and openstack-nova-compute but I couldn't make it work."

and comment #3:

"after doing nova stop/start I no longer got the old MONs timeout error message"

i.e. what is the exact difference between manually restarting openstack-nova-compute and nova stop/start?

@mcornea: can you attach the full QEMU log file from both source and destination hosts?

(In reply to Eoghan Glynn from comment #4)
> @mcornea: can you clarify the apparent contradiction between comment #0:
>
> "I tried manually restarting libvirtd and openstack-nova-compute but I
> couldn't make it work."
>
> and comment #3:
>
> "after doing nova stop/start I no longer got the old MONs timeout error
> message"
>
> i.e. what is the exact difference between manually restarting
> openstack-nova-compute and nova stop/start?

nova stop/start refers to the instance (`nova stop $instance; nova start $instance`), while restarting openstack-nova-compute is `systemctl restart openstack-nova-compute`.

(In reply to Eoghan Glynn from comment #5)
> @mcornea: can you attach the full QEMU log file from both source and
> destination hosts?

The QEMU logs: http://paste.openstack.org/show/588289/

From the destination in that pastebin:

```
2016-11-07 16:31:05.351+0000: starting up libvirt version: 2.0.0, package: 10.el7 (Red Hat, Inc. <http://bugzilla.redhat.com/bugzilla>, 2016-09-21-10:15:26, x86-038.build.eng.bos.redhat.com), qemu version: 2.6.0 (qemu-kvm-rhev-2.6.0-27.el7), hostname: overcloud-compute-1.localdomain
LC_ALL=C PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin QEMU_AUDIO_DRV=none /usr/libexec/qemu-kvm -name guest=instance-00000004,debug-threads=on -S -object secret,id=masterKey0,format=raw,file=/var/lib/libvirt/qemu/domain-5-instance-00000004/master-key.aes -machine pc-i440fx-rhel7.3.0,accel=kvm,usb=off -cpu Broadwell,+vme,+ss,+vmx,+osxsave,+f16c,+rdrand,+hypervisor,+arat,+tsc_adjust,+xsaveopt,+pdpe1gb,+abm,+rtm,+hle -m 1024 -realtime mlock=off -smp 1,sockets=1,cores=1,threads=1 -uuid fe2952db-6dc8-44c2-b26a-0f0300065d21 -smbios 'type=1,manufacturer=Red Hat,product=OpenStack Compute,version=14.0.1-5.el7ost,serial=135cfcf4-8659-45b3-ab94-bb5027185027,uuid=fe2952db-6dc8-44c2-b26a-0f0300065d21,family=Virtual Machine' -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/domain-5-instance-00000004/monitor.sock,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc,driftfix=slew -global kvm-pit.lost_tick_policy=discard -no-hpet -no-shutdown -boot strict=on -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -object secret,id=virtio-disk0-secret0,data=2gomEsaWRmx2CEa5VR2wxUgUionIYSJ0h/mly+/xEfE=,keyid=masterKey0,iv=P24YU3se79jC+QMXAhHdig==,format=base64 -drive 'file=rbd:vms/fe2952db-6dc8-44c2-b26a-0f0300065d21_disk:id=openstack:auth_supported=cephx\;none:mon_host=10.0.0.138\:6789\;10.0.0.141\:6789\;10.0.0.149\:6789,file.password-secret=virtio-disk0-secret0,format=raw,if=none,id=drive-virtio-disk0,cache=writeback,discard=unmap' -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -object secret,id=virtio-disk1-secret0,data=iorN1kMCSA3jz/7wgQyPkKSxqEeNkNb/asu4rF96CyU=,keyid=masterKey0,iv=ETKHh1NgX/xU8uCKB2vzWQ==,format=base64 -drive 'file=rbd:volumes/volume-30f8f80c-8bb3-4d1a-ab7c-ec906aad8517:id=openstack:auth_supported=cephx\;none:mon_host=10.0.0.140\:6789\;10.0.0.142\:6789\;10.0.0.155\:6789,file.password-secret=virtio-disk1-secret0,format=raw,if=none,id=drive-virtio-disk1,serial=30f8f80c-8bb3-4d1a-ab7c-ec906aad8517,cache=none' -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x6,drive=drive-virtio-disk1,id=virtio-disk1 -netdev tap,fd=34,id=hostnet0,vhost=on,vhostfd=36 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=fa:16:3e:7f:cc:23,bus=pci.0,addr=0x3 -add-fd set=2,fd=38 -chardev file,id=charserial0,path=/dev/fdset/2,append=on -device isa-serial,chardev=charserial0,id=serial0 -chardev pty,id=charserial1 -device isa-serial,chardev=charserial1,id=serial1 -device usb-tablet,id=input0,bus=usb.0,port=1 -vnc 0.0.0.0:4 -k en-us -device cirrus-vga,id=video0,bus=pci.0,addr=0x2 -incoming defer -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x5 -msg timestamp=on
char device redirected to /dev/pts/4 (label charserial1)
2016-11-07T16:36:05.457888Z qemu-kvm: -drive file=rbd:vms/fe2952db-6dc8-44c2-b26a-0f0300065d21_disk:id=openstack:auth_supported=cephx\;none:mon_host=10.0.0.138\:6789\;10.0.0.141\:6789\;10.0.0.149\:6789,file.password-secret=virtio-disk0-secret0,format=raw,if=none,id=drive-virtio-disk0,cache=writeback,discard=unmap: error connecting: Connection timed out
2016-11-07 16:36:05.467+0000: shutting down
```

so I *think* that means it's the rbd connection timing out?
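A quick way to confirm that on the compute node is to compare the monitor list QEMU was launched with against the monitors in the current ceph.conf. A minimal sketch, assuming the default libvirt QEMU log location and the instance-00000004 domain name seen in the log above:

```
# Monitor addresses embedded in the QEMU command line when the guest was
# defined; each rbd -drive carries its own mon_host list, so multiple
# matches may appear.
grep -o 'mon_host=[^,]*' /var/log/libvirt/qemu/instance-00000004.log

# Monitor addresses in the updated client configuration.
grep '^mon_host' /etc/ceph/ceph.conf
```

If the two lists differ, the running QEMU process, and the one spawned on the migration target, will keep trying the stale addresses until the guest is fully stopped and started again.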
Created attachment 1218161 [details]
Source & destination qemu logs (from comment #7)

Attaching them as a plain text file to this bug, because paste-bins expire.

The rbd connection is timing out because these mons (10.0.0.138, 10.0.0.141, 10.0.0.149) are no longer part of the cluster; they were removed in step 4. Given that the current cluster status is ok, I'd expect the rbd connection to be using the new MONs and be able to reach the cluster.

The initial monitor hosts are queried by Nova when it first starts the guest. They are then put in the XML given to libvirt, which in turn passes them to QEMU. If you decommission those monitor hosts, it is inevitably going to break any existing QEMU guests, since their XML config will be pointing to hosts that no longer exist. This will certainly break live migration, since QEMU on the target host will be trying to connect to the same monitors it had on the source.

Dealing with decommissions of Ceph monitors is not something Nova has ever attempted to address, so this is an RFE really.

(In reply to Daniel Berrange from comment #12)
> Dealing with decommissions of Ceph monitors is not something Nova has ever
> attempted to address, so this is an RFE really.

Thanks, marking it as an RFE in this case.

This was further discussed on the compute DFG triage call today and the consensus was that moving the monitors in this way is operationally incorrect. If you want the flexibility to move services around like that, then that's what VIPs are for. If the monitor sat behind a virtual IP, then moving it would not require the highly awkward changes to static config as a knock-on impact; instead, everything should continue to work.
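As comment #12 explains, the monitor addresses chosen at first boot are recorded in the guest's libvirt domain XML, so the stale entries can be inspected directly on the compute node. A minimal sketch (the domain name is taken from the QEMU log above; the exact element layout may vary with the libvirt version):

```
# List the rbd disk sources of the running guest; the <host> elements under
# each <source protocol='rbd'> are the monitor addresses captured when the
# guest was first started.
virsh dumpxml instance-00000004 | grep -A 6 "protocol='rbd'"
```

Because these addresses travel with the domain definition, the destination host of a live migration inherits them unchanged, which is why the migration times out even though /etc/ceph/ceph.conf on the compute nodes already lists the new monitors.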
Description of problem:

After moving the ceph-mon services to different nodes, I cannot live migrate an instance. The compute node where the instance is running fails with an "error connecting: Connection timed out" error message, as it is trying to reach the old nodes where the ceph-mon service was initially running. Note that the Ceph cluster reports a HEALTH_OK state and new instances can be deployed. /etc/ceph/ceph.conf on the compute nodes also references only the new nodes running the ceph-mon service.

Version-Release number of selected component (if applicable):
openstack-nova-compute-14.0.1-5.el7ost.noarch

How reproducible:
1/1

Steps to Reproduce:

1. Deploy an overcloud with 3 x monolithic controllers with Ceph storage. Check the Ceph cluster health:

```
cluster d825caf0-a446-11e6-91fe-525400a81fbf
 health HEALTH_OK
 monmap e1: 3 mons at {overcloud-controller-0=10.0.0.146:6789/0,overcloud-controller-1=10.0.0.142:6789/0,overcloud-controller-2=10.0.0.139:6789/0}
        election epoch 6, quorum 0,1,2 overcloud-controller-2,overcloud-controller-1,overcloud-controller-0
 osdmap e29: 6 osds: 6 up, 6 in
        flags sortbitwise
  pgmap v68: 224 pgs, 6 pools, 218 MB data, 33 objects
        1510 MB used, 118 GB
```

2. Run an overcloud instance.

3. Deploy 2 additional nodes of a new role running the CephMON service.

4. Remove the initial Ceph mons running on the controllers:

```
sudo systemctl stop ceph-mon.target; sudo ceph mon remove controller-0
sudo systemctl stop ceph-mon.target; sudo ceph mon remove controller-1
sudo systemctl stop ceph-mon.target; sudo ceph mon remove controller-2
```

5. Make sure that the cluster health looks ok:

```
cluster d825caf0-a446-11e6-91fe-525400a81fbf
 health HEALTH_OK
 monmap e6: 2 mons at {overcloud-serviceapi-0=10.0.0.154:6789/0,overcloud-serviceapi-1=10.0.0.153:6789/0}
        election epoch 24, quorum 0,1 overcloud-serviceapi-1,overcloud-serviceapi-0
 osdmap e33: 6 osds: 6 up, 6 in
        flags sortbitwise
  pgmap v1426: 224 pgs, 6 pools, 3183 MB data, 4866 objects
        9564 MB used, 110 GB / 119 GB avail
        224 active+clean
```

6. Live migrate the instance started in step 2.

Actual results:

Live migration fails:

```
2016-11-07 06:29:16.831 6204 ERROR nova.virt.libvirt.driver [req-23f7dd7b-2087-4998-886b-2cd7da1f1bda edb5c79dd9fb4813991048b50cad4ae7 f9615fbeb4fe4bdb87b73d5d004ba876 - - -] [instance: 8c9915f4-84c3-44f1-b409-f593e385f1d2] Live Migration failure: internal error: qemu unexpectedly closed the monitor: 2016-11-07T06:29:16.611698Z qemu-kvm: -drive file=rbd:vms/8c9915f4-84c3-44f1-b409-f593e385f1d2_disk:id=openstack:auth_supported=cephx\;none:mon_host=10.0.0.139\:6789\;10.0.0.142\:6789\;10.0.0.146\:6789,file.password-secret=virtio-disk0-secret0,format=raw,if=none,id=drive-virtio-disk0,cache=writeback,discard=unmap: error connecting: Connection timed out
```

Expected results:

Live migration succeeds and QEMU uses the new Ceph MONs instead of the old ones.

Additional info:

I tried manually restarting libvirtd and openstack-nova-compute but I couldn't make it work. Please let me know if there's any other step that I missed. Thank you.
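For reference, a rough sketch of step 6 together with the stop/start workaround later confirmed in comment #3 (the instance name is a placeholder and the appropriate overcloud credentials are assumed to be sourced):

```
# Step 6: attempt the live migration; in this setup it fails with the
# rbd connection timeout shown above.
nova live-migration myinstance

# Workaround confirmed in comment #3: a full stop/start relaunches QEMU,
# which then reconnects using the current monitor addresses instead of
# the removed ones.
nova stop myinstance
nova start myinstance
```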