Bug 1250042 - RBD crash when we Force_off the VM after writeback blocked by IO errors
Summary: RBD crash when we Force_off the VM after writeback blocked by IO errors
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Ceph Storage
Classification: Red Hat Storage
Component: RBD
Version: 1.3.0
Hardware: x86_64
OS: Linux
Priority: high
Severity: high
Target Milestone: rc
Target Release: 1.3.2
Assignee: Josh Durgin
QA Contact: ceph-qe-bugs
URL:
Whiteboard:
Depends On:
Blocks: 1253803 ceph131rn
 
Reported: 2015-08-04 12:08 UTC by Tanay Ganguly
Modified: 2017-07-30 15:30 UTC (History)
7 users

Fixed In Version: RHEL: ceph-0.94.5-2.el7cp Ubuntu: ceph_0.94.5-2redhat1
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-02-29 14:42:40 UTC
Embargoed:


Attachments (Terms of Use)
Rbd crash (4.29 KB, text/plain)
2015-08-04 12:08 UTC, Tanay Ganguly
no flags Details


Links
System ID Private Priority Status Summary Last Updated
Ceph Project Bug Tracker 12597 0 None None None Never
Red Hat Product Errata RHBA-2016:0313 0 normal SHIPPED_LIVE Red Hat Ceph Storage 1.3.2 bug fix and enhancement update 2016-02-29 19:37:43 UTC

Description Tanay Ganguly 2015-08-04 12:08:02 UTC
Created attachment 1059057 [details]
Rbd crash

Description of problem:
After hitting Bug 1249004, when I reboot the VM I see the RBD crash in vm.log.

Version-Release number of selected component (if applicable):
ceph version 0.94.1.4

How reproducible:
100 %

Steps to Reproduce:
1. Hit Bug 1249004 (the steps are described in that bug).
2. Force off the VM (see the sketch after this list).
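
For reference, one minimal way to force off a libvirt guest is virsh destroy, which terminates the domain immediately without a clean guest shutdown (a sketch; the domain name rhel7.0 is taken from the virsh list output later in this report):

  # immediately terminate the running domain, simulating a hard power-off
  virsh destroy rhel7.0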

Actual results:
Once the VM is powered off, the crash dump appears in vm.log.

inputs_connect: inputs channel client create
block I/O error in device 'drive-virtio-disk0': Invalid argument (22)
block I/O error in device 'drive-virtio-disk0': Invalid argument (22)
block I/O error in device 'drive-virtio-disk0': Invalid argument (22)
block I/O error in device 'drive-virtio-disk0': Invalid argument (22)
block I/O error in device 'drive-virtio-disk0': Invalid argument (22)
block I/O error in device 'drive-virtio-disk0': Invalid argument (22)
block I/O error in device 'drive-virtio-disk0': Invalid argument (22)
qemu: terminating on signal 15 from pid 21361
osdc/ObjectCacher.cc: In function 'ObjectCacher::~ObjectCacher()' thread 7f93b412ca40 time 2015-08-04 08:00:48.850180
osdc/ObjectCacher.cc: 551: FAILED assert(i->empty())
 ceph version 0.94.1 (e4bfad3a3c51054df7e537a724c8d0bf9be972ff)
 1: (()+0x160a15) [0x7f93af0dea15]
 2: (()+0x3a3dd1) [0x7f93af321dd1]
 3: (()+0x57346) [0x7f93aefd5346]
 4: (()+0x7d859) [0x7f93aeffb859]
 5: (rbd_close()+0x9) [0x7f93aefba719]
 6: (()+0xa709d) [0x7f93b420c09d]
 7: (()+0x8ace9) [0x7f93b41efce9]
 8: (()+0x8aba8) [0x7f93b41efba8]
 9: (()+0x8ad73) [0x7f93b41efd73]
 10: (()+0x8ae18) [0x7f93b41efe18]
 11: (()+0x75d0f) [0x7f93b41dad0f]
 12: (__libc_start_main()+0xf5) [0x7f93aa057af5]
 13: (()+0x7714d) [0x7f93b41dc14d]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
terminate called after throwing an instance of 'ceph::FailedAssertion'
2015-08-04 12:00:49.402+0000: shutting down


Expected results:
There should not be any crash.

Additional info:
Logs attached.

Comment 2 Tanay Ganguly 2015-08-20 15:58:15 UTC
Jason,

I am unable to reproduce this rbd crash with the latest build 0.94.1.7

Has the fix for the original bug (https://bugzilla.redhat.com/show_bug.cgi?id=1249004) fixed this issue as well?

If not, can you please let me know whether there are definite steps to reproduce this particular bug?

Comment 3 Jason Dillaman 2015-08-20 16:14:46 UTC
This crash will only be seen when the OSDs return an error.  Since the OSD error WRT updating the object map has been corrected, you will need a new mechanism to induce an OSD error to reproduce this issue.  One possibility might be to enable a quota on a pool and fill it up -- forcing the OSDs to return an ENOSPC error.
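
For example, a rough sketch of that approach with the standard ceph CLI (the pool name rbd and the quota values here are only placeholders):

  # cap the pool by object count (or by bytes) so that further writes are rejected
  ceph osd pool set-quota rbd max_objects 3
  ceph osd pool set-quota rbd max_bytes $((256 * 1024 * 1024))

Once the quota is exceeded, writes issued through librbd should fail with ENOSPC, which is the error condition needed to block writeback.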

Comment 4 Harish NV Rao 2015-08-20 16:32:55 UTC
Jason,

Can you please clarify:

>Has the fix for the original bug (https://bugzilla.redhat.com/show_bug.cgi?id=1249004) fixed this (1250042) issue as well?

Regards,
Harish

Comment 5 Jason Dillaman 2015-08-20 16:53:00 UTC
Negative -- restating comment 3: the bug still exists, but you need a new way to expose it, since the OSD no longer returns an error when attempting to update the object map during an IO write operation.

Comment 6 Federico Lucifredi 2015-08-21 19:27:51 UTC
Comment #3 has instructions on how to reproduce this, and Harish's e-mail says Tanay will test it. We should have an update on this by Monday?

Comment 7 Tanay Ganguly 2015-08-24 12:03:48 UTC
As per Jason's comment 3, I tried to induce ENOSPC by setting a quota limit on the pool.

The client received ENOSPC upon trying to exceed the quota.

Log snippet from /var/log/libvirt/qemu/rhel7.0.log
=================================================
char device redirected to /dev/pts/2 (label charserial0)
main_channel_link: add main channel client
main_channel_handle_parsed: net test: latency 574.700000 ms, bitrate 1310529 bps (1.249818 Mbps) LOW BANDWIDTH
red_dispatcher_set_cursor_peer: 
inputs_connect: inputs channel client create
block I/O error in device 'drive-virtio-disk0': No space left on device (28)
block I/O error in device 'drive-virtio-disk0': No space left on device (28)
block I/O error in device 'drive-virtio-disk0': No space left on device (28)
block I/O error in device 'drive-virtio-disk0': No space left on device (28)
block I/O error in device 'drive-virtio-disk0': No space left on device (28)
qemu: terminating on signal 15 from pid 12775
red_channel_client_disconnect_dummy: rcc=0x7f2c086c2d40 (channel=0x7f2c082a7c70 type=5 id=0)
snd_channel_put: SndChannel=0x7f2c0867a5e0 freed
red_channel_client_disconnect_dummy: rcc=0x7f2c086a4080 (channel=0x7f2c082a7f60 type=6 id=0)
snd_channel_put: SndChannel=0x7f2c086b2870 freed

===================================================

But the "Previous Crash didn't happen" on the client side. Surprisingly I couldn't find any log related to space issue on OSDs despite having log-level 20.

@Jason,

Could you please look into this or let me know any other ways of reproducing the issue?

Cluster details (ceph osd tree output):

-1 13.49991 root default                                        
-2  2.69998     host magna094                                   
 0  0.89999         osd.0          up  1.00000          1.00000 
 1  0.89999         osd.1          up  1.00000          1.00000 
 2  0.89999         osd.2          up  1.00000          1.00000 
-3  2.69998     host magna112                                   
 3  0.89999         osd.3          up  1.00000          1.00000 
 4  0.89999         osd.4          up  1.00000          1.00000 
 5  0.89999         osd.5          up  1.00000          1.00000 
-4  2.69998     host magna114                                   
 6  0.89999         osd.6          up  1.00000          1.00000 
 7  0.89999         osd.7          up  1.00000          1.00000 
 8  0.89999         osd.8          up  1.00000          1.00000 
-5  2.69998     host magna106                                   
 9  0.89999         osd.9          up  1.00000          1.00000 
10  0.89999         osd.10         up  1.00000          1.00000 
11  0.89999         osd.11         up  1.00000          1.00000 
-6  2.69998     host magna034                                   
12  0.89999         osd.12         up  1.00000          1.00000 
13  0.89999         osd.13         up  1.00000          1.00000 
14  0.89999         osd.14         up  1.00000          1.00000 

Client Details:
10.12.27.2
root/redhat

You can use the librbd_api.py script to induce the error quickly, or I already have a QEMU guest set up where the installation will fail.

virsh list
 Id    Name                           State
----------------------------------------------------
 8     rhel7.0                        running

Comment 8 Jason Dillaman 2015-08-24 19:43:30 UTC
I reproduced it on the first attempt on my local machine by creating an image, setting the pool max-objects quota to three, and running the following with "rbd cache writethrough until flush = false" in my config:

  import rbd
  import rados

  # connect to the cluster and open the 'rbd' pool
  cluster = rados.Rados(conffile='ceph.conf')
  cluster.connect()
  ioctx = cluster.open_ioctx('rbd')

  # open the test image and issue two small writes at offsets that land in
  # different 4 MB objects; with the pool quota exceeded, writeback of these
  # buffers fails
  image = rbd.Image(ioctx, 'foo')
  data = '1' * 1024
  image.write(data, 1 << 22)
  image.write(data, 2 << 22)
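
For completeness, a rough sketch of the preparation implied above (the image name foo, pool rbd, quota value, and ceph.conf section are assumptions based on the snippet):

  # create the test image and cap the pool at three objects
  rbd create foo --pool rbd --size 1024
  ceph osd pool set-quota rbd max_objects 3

  # client-side cache settings referenced above, in ceph.conf:
  #   [client]
  #   rbd cache = true
  #   rbd cache writethrough until flush = false

With writethrough-until-flush disabled, the writes stay dirty in the librbd ObjectCacher; the quota-induced ENOSPC then blocks writeback, and closing the image appears to be what trips the assert in ObjectCacher::~ObjectCacher().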

Comment 9 Tanay Ganguly 2015-08-25 11:52:27 UTC
Thanks Jason. After adding "rbd cache writethrough until flush = false" to the config file, I am able to reproduce the issue.

I have a general query after looking into the problem; please correct me if my understanding is wrong.

I still didn't see any ENOSPC error in the OSD logs, because the client-side librados cache is handling it and returning the error from there itself.

So the OSDs are not reached, and hence I am not seeing any ENOSPC error message in the OSD logs. I checked strace output and the OSD logs to confirm this.

If the above statement is true, could you please briefly explain the layers that are involved in the interaction?

Log snippet:
===============================================================================


osdc/ObjectCacher.cc: In function 'ObjectCacher::~ObjectCacher()' thread 7f76b139c740 time 2015-08-25 02:03:09.727456
osdc/ObjectCacher.cc: 551: FAILED assert(i->empty())
 ceph version 0.94.1 (e4bfad3a3c51054df7e537a724c8d0bf9be972ff)
 1: (()+0x160bf5) [0x7f7688b7cbf5]
 2: (()+0x3a3fe1) [0x7f7688dbffe1]
 3: (()+0x57496) [0x7f7688a73496]
 4: (()+0x7da29) [0x7f7688a99a29]
 5: (rbd_close()+0x9) [0x7f7688a587c9]
 6: (ffi_call_unix64()+0x4c) [0x7f76a9797dac]
 7: (ffi_call()+0x1f5) [0x7f76a97976d5]
 8: (_ctypes_callproc()+0x30b) [0x7f76a99aac8b]
 9: (()+0xaa85) [0x7f76a99a4a85]
 10: (PyObject_Call()+0x43) [0x7f76b0e0f073]
 11: (PyEval_EvalFrameEx()+0x1d4c) [0x7f76b0ea334c]
 12: (PyEval_EvalFrameEx()+0x4350) [0x7f76b0ea5950]
 13: (PyEval_EvalCodeEx()+0x7ed) [0x7f76b0ea71ad]
 14: (()+0x6f098) [0x7f76b0e34098]
 15: (PyObject_Call()+0x43) [0x7f76b0e0f073]
 16: (()+0x59085) [0x7f76b0e1e085]
 17: (PyObject_Call()+0x43) [0x7f76b0e0f073]
 18: (PyEval_CallObjectWithKeywords()+0x47) [0x7f76b0ea0ff7]
 19: (()+0xa106c) [0x7f76b0e6606c]
 20: (()+0x9c0aa) [0x7f76b0e610aa]
 21: (()+0x7e9ef) [0x7f76b0e439ef]
 22: (()+0x803b0) [0x7f76b0e453b0]
 23: (_PyModule_Clear()+0x16c) [0x7f76b0e49e8c]
 24: (PyImport_Cleanup()+0x45b) [0x7f76b0eb6a7b]
 25: (Py_Finalize()+0xfe) [0x7f76b0ec233e]
 26: (Py_Main()+0x6a5) [0x7f76b0ed3545]
 27: (__libc_start_main()+0xf5) [0x7f76b0100af5]
 28: python() [0x400721]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
terminate called after throwing an instance of 'ceph::FailedAssertion'
Aborted


=============================================================================

Comment 10 Jason Dillaman 2015-08-25 12:02:08 UTC
The OSDs report the quota error:

2015-08-24 15:41:26.202818 7f48f478b700  1 -- 172.16.231.161:6800/41202 --> 172.16.231.161:0/1042309 -- osd_op_reply(9 rbd_data.100674b0dc51.0000000000000002 [set-alloc-hint object_size 4194304 write_size 4194304,write 0~1024] v0'0 uv0 ondisk = -28 ((28) No space left on device)) v6 -- ?+0 0x4452000 con 0x4249b20

Comment 12 Ken Dreyer (Red Hat) 2015-12-10 20:55:12 UTC
Fixed in upstream's v0.94.4.

Comment 14 Tanay Ganguly 2016-02-06 17:22:35 UTC
Unable to reproduce the crash.
Marking it as Verified.

Use cases:
Set a quota on the pool and do lots of IO operations
Resize operations (increase/shrink in a loop)
Librbd VM operations (VM creation and creating a filesystem on top of RBD)
VM migration with IO (migration from one host to another)
Different RBD image sizes (100 G / 781 TB)

Comment 16 errata-xmlrpc 2016-02-29 14:42:40 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2016:0313

