Bug 1425316

Summary: `nova rescue` of an instance with Ceph backend fails with corrupted XFS errors
Product: Red Hat OpenStack
Reporter: Anil Dhingra <adhingra>
Component: openstack-nova
Assignee: Lee Yarwood <lyarwood>
Status: CLOSED ERRATA
QA Contact: Gabriel Szasz <gszasz>
Severity: urgent
Docs Contact:
Priority: high
Version: 9.0 (Mitaka)
CC: adhingra, awaugama, berrange, dasmith, dmaley, eglynn, gszasz, kchamart, lyarwood, mschuppe, rjones, sbauza, sferdjao, sgordon, skinjo, sputhenp, srevivo, vromanso
Target Milestone: async
Keywords: Triaged, ZStream
Target Release: 9.0 (Mitaka)
Hardware: Unspecified
OS: Linux
Whiteboard:
Fixed In Version: openstack-nova-13.1.2-18.el7ost
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2017-06-19 18:30:40 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Anil Dhingra 2017-02-21 07:52:49 UTC
Description of problem:
One of the VM instances, running RHEL 7.2, has a corrupted XFS filesystem and fails to boot.
The backend storage is Ceph 1.3.

The instance console log shows:

[2168854.428627] XFS (vda1): xfs_log_force: error -5 returned.
[2168884.508500] XFS (vda1): xfs_log_force: error -5 returned.
[2168914.588387] XFS (vda1): xfs_log_force: error -5 returned.

Tried to rescue the VM using:

# nova rescue <vm>
# nova rescue --image <alternate rhel image> <vm>

But the instance does not move to the rescue state; it always boots the original disk and shows the same XFS errors.
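
For reference, whether the rescue actually took effect can be confirmed from the API side; a minimal check, with <vm> as a placeholder for the instance name or UUID:

$ nova show <vm> | grep -iE 'status|vm_state|task_state'

After a successful rescue the status/vm_state should read RESCUE/rescued; here they remain ACTIVE/active and the instance keeps booting its original disk.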


Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1.
2.
3.

Actual results:


Expected results:
The instance should boot from the alternate image, allowing xfs_repair to be run against the original disk to recover the filesystem.
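
A sketch of that recovery flow from inside a working rescue instance, assuming the original disk is attached as the secondary disk (/dev/vdb):

# xfs_repair /dev/vdb1

xfs_repair must be run against an unmounted filesystem; if it refuses to proceed because of a dirty or corrupt log, it will suggest re-running with -L to zero the log.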

Additional info:

Comment 3 Sadique Puthen 2017-02-21 10:32:44 UTC
The instance that needs to be rescued is:

$ nova show 286775c6-cb4c-4182-be98-7153e7fe2467
+----------------------------------------+------------------------------------------------------------------+
| Property                               | Value                                                            |
+----------------------------------------+------------------------------------------------------------------+
| OS-DCF:diskConfig                      | AUTO                                                             |
| OS-EXT-AZ:availability_zone            | nova                                                             |
| OS-EXT-SRV-ATTR:host                   | myr-eqx-sg-ocpn-06.localdomain                                   |
| OS-EXT-SRV-ATTR:hostname               | sg-eq-dpc-03                                                     |
| OS-EXT-SRV-ATTR:hypervisor_hostname    | myr-eqx-sg-ocpn-06.localdomain                                   |
| OS-EXT-SRV-ATTR:instance_name          | instance-0000006b                                                |
| OS-EXT-SRV-ATTR:kernel_id              |                                                                  |
| OS-EXT-SRV-ATTR:launch_index           | 0                                                                |
| OS-EXT-SRV-ATTR:ramdisk_id             |                                                                  |
| OS-EXT-SRV-ATTR:reservation_id         | r-mmx8iwwv                                                       |
| OS-EXT-SRV-ATTR:root_device_name       | /dev/vda                                                         |
| OS-EXT-SRV-ATTR:user_data              | -                                                                |
| OS-EXT-STS:power_state                 | 1                                                                |
| OS-EXT-STS:task_state                  | -                                                                |
| OS-EXT-STS:vm_state                    | active                                                           |
| OS-SRV-USG:launched_at                 | 2017-01-09T20:49:52.000000                                       |
| OS-SRV-USG:terminated_at               | -                                                                |
| accessIPv4                             |                                                                  |
| accessIPv6                             |                                                                  |
| config_drive                           |                                                                  |
| created                                | 2017-01-09T20:42:05Z                                             |
| description                            | SG-EQ-DPC-03                                                     |
| dns-internal-1241-provider-net network | 192.168.41.15                                                    |
| dns-mgmt-1240-provider-net network     | 192.168.40.15                                                    |
| flavor                                 | 4_vCPU_32GB_RAM_500GB_HDD (b22e7b69-e28d-4f10-af44-bd99ebb2b3af) |
| hostId                                 | 71a48e70b6f2f1002799e6e3825d999679bdb4ff80128ce80f7308f1         |
| host_status                            | UP                                                               |
| id                                     | 286775c6-cb4c-4182-be98-7153e7fe2467                             |
| image                                  | SG-EQ-DPC-03 (e89a8519-b968-41b1-9cbe-0a3d17cec2d1)              |
| key_name                               | sebastian-ssh                                                    |
| locked                                 | False                                                            |
| metadata                               | {}                                                               |
| name                                   | SG-EQ-DPC-03                                                     |
| os-extended-volumes:volumes_attached   | []                                                               |
| progress                               | 0                                                                |
| security_groups                        | whitelist-all                                                    |
| status                                 | ACTIVE                                                           |
| tenant_id                              | 97ecd21c11d14ccf857ec41ef0afa22d                                 |
| updated                                | 2017-02-17T08:29:34Z                                             |
| user_id                                | 3e289fae9b694c789b370df4b97d6e8e                                 |
+----------------------------------------+------------------------------------------------------------------+
[stack@myr-eqx-sg-odn-01 ~]$

The rescue was tried with both a RHEL image and a Debian image.

$ glance image-list
+--------------------------------------+--------------------+
| ID                                   | Name               |
+--------------------------------------+--------------------+
| c321e78a-b6bb-40b5-8f54-bd8f223a2a3a | Debian-8.6.3       |
| 16e41469-1e8d-4458-a7fa-3ededf2e80bf | RHEL-7.3           |

$ nova rescue --image 16e41469-1e8d-4458-a7fa-3ededf2e80bf 286775c6-cb4c-4182-be98-7153e7fe2467

This causes the instance to boot from its own disk instead of booting a new instance from the specified image and attaching the instance's disk as a secondary disk for repair.

qemu     37433     1  3 11:47 ?        00:00:22 /usr/libexec/qemu-kvm -name guest=instance-0000006b,debug-threads=on -S -object secret,id=masterKey0,format=raw,file=/var/lib/libvirt/qemu/domain-25-instance-0000006b/master-key.aes -machine pc-i440fx-rhel7.3.0,accel=kvm,usb=off -cpu Broadwell,+vme,+ds,+acpi,+ss,+ht,+tm,+pbe,+dtes64,+monitor,+ds_cpl,+vmx,+smx,+est,+tm2,+xtpr,+pdcm,+dca,+osxsave,+f16c,+rdrand,+arat,+tsc_adjust,+xsaveopt,+pdpe1gb,+abm,+rtm,+hle -m 32768 -realtime mlock=off -smp 4,sockets=4,cores=1,threads=1 -uuid 286775c6-cb4c-4182-be98-7153e7fe2467 -smbios type=1,manufacturer=Red Hat,product=OpenStack Nova,version=13.1.1-7.el7ost,serial=6d20bf45-0918-413b-b4bf-dcc702889837,uuid=286775c6-cb4c-4182-be98-7153e7fe2467,family=Virtual Machine -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/domain-25-instance-0000006b/monitor.sock,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc,driftfix=slew -global kvm-pit.lost_tick_policy=discard -no-hpet -no-shutdown -boot strict=on -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -object secret,id=virtio-disk0-secret0,data=hTuL6fGIvWXUkd8B+ouiVIsOuZ2/VZrKuQFMedM/dqo=,keyid=masterKey0,iv=lng1Sg7vZfifqkSXF6llKg==,format=base64 -drive file=rbd:vms/286775c6-cb4c-4182-be98-7153e7fe2467_disk:id=cinder:auth_supported=cephx\;none:mon_host=192.168.22.12\:6789\;192.168.22.13\:6789\;192.168.22.14\:6789,file.password-secret=virtio-disk0-secret0,format=raw,if=none,id=drive-virtio-disk0,cache=writeback,discard=unmap -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x5,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -netdev tap,fd=31,id=hostnet0,vhost=on,vhostfd=33 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=fa:16:3e:45:dd:5d,bus=pci.0,addr=0x3 -netdev tap,fd=34,id=hostnet1,vhost=on,vhostfd=35 -device virtio-net-pci,netdev=hostnet1,id=net1,mac=fa:16:3e:5f:dd:ae,bus=pci.0,addr=0x4 -add-fd set=4,fd=37 -chardev file,id=charserial0,path=/dev/fdset/4,append=on -device isa-serial,chardev=charserial0,id=serial0 -chardev pty,id=charserial1 -device isa-serial,chardev=charserial1,id=serial1 -device usb-tablet,id=input0,bus=usb.0,port=1 -vnc 0.0.0.0:0 -k en-us -device cirrus-vga,id=video0,bus=pci.0,addr=0x2 -incoming defer -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x6 -msg timestamp=on

The qemu-kvm process shows only one Ceph disk mapped to it. Does that mean the rescue just starts the VM without rescuing it? Is this expected to work with Ceph? We tested it with an instance using a local disk, which attaches both disks properly and allows the rescue to proceed.
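
For illustration, what the rescue domain actually has attached can be cross-checked on the compute node, using the instance and pool names from the output above:

$ virsh domblklist instance-0000006b
$ sudo rbd ls -l vms | grep 286775c6-cb4c-4182-be98-7153e7fe2467

A working rescue would be expected to show the rescue disk alongside the original <uuid>_disk; the qemu-kvm command line above maps only the single original Ceph disk.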

Comment 8 Richard W.M. Jones 2017-02-21 13:08:39 UTC
I'll note that errno 5 == EIO.  Bad hardware?

Comment 15 Sadique Puthen 2017-02-22 04:06:09 UTC
Attaching a screenshot from the customer environment. The screenshot shows the rescue is trying to mount /dev/vdb1 as the root disk instead of /dev/vda1. This looks like a bug.

We were able to reproduce it. We created an instance with a 20 GB disk; the disk is healthy. We then rescued it using "nova rescue <server>". The /dev/vdb1 (20 GB) disk is recognized as / and mounted, instead of /dev/vda1 being used as the / disk and /dev/vdb1 being left for repair. This leaves /dev/vda1 unmounted and leaves the repair to us.

I hope this is easy to reproduce in a normal environment. We were, however, not able to reproduce it when using a different image via --image. It looks like both vda1 and vdb1 have the same UUID, and the wrong / partition is picked and mounted.

How can we avoid it?

Comment 16 Sadique Puthen 2017-02-22 04:09:58 UTC
We can reliably reproduce it both with --image and without it. In most cases it detects /dev/vdb1 as the / disk and uses that. This may be because of the UUID; how can we work around it?
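
One way to test the duplicate-UUID theory from inside the rescue shell, as a sketch:

$ blkid /dev/vda1 /dev/vdb1

If both partitions report the same filesystem UUID, a root=UUID=... lookup at boot can resolve to either partition, which would explain /dev/vdb1 ending up mounted as /.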

Comment 17 Lee Yarwood 2017-02-22 04:15:15 UTC
(In reply to Sadique Puthen from comment #15)
> Attaching a screenshot from customer environment. The screenshot shows the
> rescue is trying to mount /dev/vdb1 as the root disk instead of /dev/vda1.
> This looks like a bug.
> 
> We were able to reproduce it. Just created a an instance with 20GB disk. The
> disk is healthy. We then rescued it using "nova rescue <server>". We can see
> that /dev/vdb1 (20 GB) disk is recognized as the / and mounted instead of
> using /dev/vda1 as / disk and leaving /dev/vdb1 for repair. It has left
> /dev/vda1 as unmounted and left us to repair.
> 
> I hope this would be easy to reproduce in a normal environment. We were
> however not able to reproduce when we use a different image using --image.
> For me it looks like both vda1 and vdb1 has same uuid and it picks the wrong
> / partition to mount it.
> 
> How can we avoid it?

As I said before, use the RHEL boot ISO as the rescue image; it should avoid this behaviour.
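
For reference, a sketch of that approach; the image name and file path below are placeholders:

$ glance image-create --name rhel-7.3-boot-iso --disk-format iso --container-format bare --file /tmp/boot.iso
$ nova rescue --image <boot-iso-image-id> 286775c6-cb4c-4182-be98-7153e7fe2467

The boot ISO does not carry the guest's root filesystem, which is the rationale for expecting it to avoid the UUID clash.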

Comment 18 Sadique Puthen 2017-02-22 09:35:26 UTC
Lee,

We used the boot.iso for RHEL-7.3, but the instance still detects /dev/vdb1 as its / disk and fails to boot due to the XFS corruption. We investigated further and found the following.

We did an unrescue and ran "sudo rbd ls -l vms". We could see disk.rescue still present in the vms pool, with a size of 300G. This means that when we unrescue, the rescue image is not deleted. When we rescue again using boot.iso, the VM boots from the old disk.rescue left over from the previous rescue. Since that leftover disk was created from the original image the VM was started from, both disks have the same UUID for /, so the instance detects vdb1 as the root disk every time.

The problem of the rescue disk not being deleted during unrescue is reported at https://bugs.launchpad.net/nova/+bug/1478199, which is the root cause of all the problems. Now we have two things to do:

1 - Urgent. Delete the rescue disk manually from the Ceph vms pool, rescue again with the correct image, and recover the VM. Any suggestions on how to delete it? (See the sketch after this list.)

2 - Fix the bug in nova that causes unrescue not to delete the rescue image.
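
A sketch of the manual cleanup for point 1, assuming nova's usual <uuid>_disk.rescue naming and that the instance has been unrescued first:

$ sudo rbd ls -l vms | grep 286775c6-cb4c-4182-be98-7153e7fe2467
$ sudo rbd rm vms/286775c6-cb4c-4182-be98-7153e7fe2467_disk.rescue

rbd rm will refuse to remove an image that still has watchers, so this should only be attempted while no qemu-kvm process is using the rescue disk.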

Comment 19 Sadique Puthen 2017-02-22 09:40:34 UTC
Another upstream report https://bugs.launchpad.net/nova/+bug/1511123

Comment 20 Martin Schuppert 2017-02-22 09:56:17 UTC
https://bugs.launchpad.net/nova/+bug/1478199 is a duplicate of https://bugs.launchpad.net/nova/+bug/1475652, with a fix that was merged upstream and backported to Newton in nova 14.0.2.
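
For reference, whether a given compute node already carries the downstream fix can be checked against the Fixed In Version noted above; a minimal check:

$ rpm -q openstack-nova-compute

Per this bug's Fixed In Version field, openstack-nova-13.1.2-18.el7ost and later on OSP 9 include the fix.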

Comment 21 Sadique Puthen 2017-02-22 10:09:20 UTC
We are using osp-9. Requesting a backport.

Comment 22 Lee Yarwood 2017-02-22 14:00:39 UTC
(In reply to Sadique Puthen from comment #21)
> We are using osp-9. Requesting a backport.

ACK, nice catch, I'll do this now.

Comment 34 errata-xmlrpc 2017-06-19 18:30:40 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2017:1508