Bug 1252819 - Seeing a Call Trace while Removing an OSD
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat Ceph Storage
Classification: Red Hat Storage
Component: RADOS
Version: 1.3.0
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: medium
Target Milestone: rc
Target Release: 1.3.2
Assignee: Samuel Just
QA Contact: ceph-qe-bugs
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2015-08-12 09:54 UTC by Tanay Ganguly
Modified: 2017-07-30 15:12 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2015-08-12 16:38:53 UTC
Embargoed:



Description Tanay Ganguly 2015-08-12 09:54:30 UTC
Description of problem:
While VMs were writing to RBD images, I tried to remove the OSD host and saw a kernel call trace.

Version-Release number of selected component (if applicable):
ceph version 0.94.1.4

How reproducible:
I have been unable to reproduce it.

Steps to Reproduce:
1. Create an RBD image and export it to a QEMU client.
2. Set up a VM on it and install an OS.
3. Run a batch of writes (using fio and dd).
4. In parallel, trigger cross-migration of many other VMs between two QEMU hosts.
5. Reboot the OSD host (it was running 3 OSD daemons).

Actual results:
A call trace is logged.

Expected results:
No call trace should be logged.

Logs:

[345190.173821] NTFS driver 2.1.30 [Flags: R/O MODULE].
[345190.244951] QNX4 filesystem 0.2.3 registered.
[536951.586698] Key type ceph registered
[536951.587277] libceph: loaded (mon/osd proto 15/24)
[536951.590877] rbd: loaded rbd (rados block device)
[536951.603804] libceph: client168153 fsid 4d262ff8-ccff-4d0f-895c-371fc21ad58f
[536951.604822] libceph: mon0 10.8.128.103:6789 session established
[536967.061542] libceph: client168157 fsid 4d262ff8-ccff-4d0f-895c-371fc21ad58f
[536967.062507] libceph: mon0 10.8.128.103:6789 session established
[678914.710580] libceph: client192510 fsid 4d262ff8-ccff-4d0f-895c-371fc21ad58f
[678914.712085] libceph: mon0 10.8.128.103:6789 session established
[678914.716341] ------------[ cut here ]------------
[678914.716355] WARNING: CPU: 8 PID: 24086 at /build/linux-kDCE9u/linux-3.13.0/net/ceph/osd_client.c:979 remove_osd+0xea/0x130 [libceph]()
[678914.716357] Modules linked in: rbd libceph ufs qnx4 hfsplus hfs minix ntfs msdos jfs xfs xt_multiport xt_tcpudp iptable_filter ip_tables x_tables ib_iser rdma_cm iw_cm ib_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp libiscsi_tcp libiscsi nfsd auth_rpcgss nfs_acl nfs x86_pkg_temp_thermal intel_powerclamp coretemp lockd sunrpc kvm_intel fscache kvm crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel scsi_transport_iscsi aes_x86_64 lrw gf128mul lp sb_edac gpio_ich mei_me ioatdma parport edac_core ipmi_si mei glue_helper joydev lpc_ich ablk_helper cryptd wmi mac_hid shpchp btrfs xor raid6_pq libcrc32c hid_generic usbhid hid igb i2c_algo_bit isci dca ahci libsas ptp libahci pps_core scsi_transport_sas
[678914.716404] CPU: 8 PID: 24086 Comm: rbd Not tainted 3.13.0-59-generic #98-Ubuntu
[678914.716406] Hardware name: Supermicro SYS-F627R3-RTB+/X9DRFR, BIOS 3.0b 04/24/2014
[678914.716407]  0000000000000009 ffff88084d1a7d28 ffffffff81723700 0000000000000000
[678914.716411]  ffff88084d1a7d60 ffffffff8106785d ffff88084f1d2800 ffff88084f1d2818
[678914.716414]  ffff880853b9c750 ffff88083334bb80 ffff88082a3d6ca0 ffff88084d1a7d70
[678914.716416] Call Trace:
[678914.716423]  [<ffffffff81723700>] dump_stack+0x45/0x56
[678914.716428]  [<ffffffff8106785d>] warn_slowpath_common+0x7d/0xa0
[678914.716431]  [<ffffffff8106793a>] warn_slowpath_null+0x1a/0x20
[678914.716438]  [<ffffffffa06bab5a>] remove_osd+0xea/0x130 [libceph]
[678914.716446]  [<ffffffffa06bed14>] ceph_osdc_stop+0x94/0x100 [libceph]
[678914.716451]  [<ffffffffa06b07dc>] ceph_destroy_client+0x2c/0xa0 [libceph]
[678914.716455]  [<ffffffffa053da48>] rbd_client_release+0x68/0xa0 [rbd]
[678914.716459]  [<ffffffffa053e7a5>] rbd_dev_destroy+0x65/0x70 [rbd]
[678914.716463]  [<ffffffffa05444f5>] rbd_add+0x975/0xb1b [rbd]
[678914.716467]  [<ffffffff81497457>] bus_attr_store+0x27/0x30
[678914.716471]  [<ffffffff81234de8>] sysfs_write_file+0x128/0x1c0
[678914.716475]  [<ffffffff811be334>] vfs_write+0xb4/0x1f0
[678914.716478]  [<ffffffff811bd6e8>] ? do_sys_open+0x1b8/0x280
[678914.716481]  [<ffffffff811bed69>] SyS_write+0x49/0xa0
[678914.716484]  [<ffffffff81734294>] system_call_fastpath+0x16/0x1b
[678914.716486] ---[ end trace 53c4171e662b9124 ]---
[678917.704377] libceph: client192511 fsid 4d262ff8-ccff-4d0f-895c-371fc21ad58f
[678917.705309] libceph: mon0 10.8.128.103:6789 session established
[678970.702421] libceph: client192520 fsid 4d262ff8-ccff-4d0f-895c-371fc21ad58f
[678970.703374] libceph: mon0 10.8.128.103:6789 session established
[679008.064836] libceph: client192529 fsid 4d262ff8-ccff-4d0f-895c-371fc21ad58f
[679008.065930] libceph: mon0 10.8.128.103:6789 session established
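
For triage, the warning location and the frames of the trace above can be pulled out of the log programmatically. The following is a minimal sketch, not part of the original report: the function name parse_call_trace, the regular expressions, and the shortened SAMPLE text are illustrative assumptions, and the patterns are only known to fit the 3.13-style frame format shown in this log.

```python
import re

# Matches a stack frame such as:
#   [678914.716438]  [<ffffffffa06bab5a>] remove_osd+0xea/0x130 [libceph]
# A leading "? " marks an address the kernel found on the stack but could
# not confirm as part of the call chain.
FRAME_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+\[<(?P<addr>[0-9a-f]+)>\]\s+"
    r"(?P<unreliable>\?\s+)?(?P<symbol>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
    r"(?:\s+\[(?P<module>\w+)\])?"
)

# Matches the WARNING header and captures the source file and line number.
WARNING_RE = re.compile(r"WARNING: .* at (?P<path>\S+):(?P<line>\d+) ")


def parse_call_trace(log_text):
    """Return (warning_location, frames) extracted from dmesg-style text.

    warning_location is (path, line) or None; frames is a list of
    (symbol, module_or_None, is_unreliable) tuples in log order.
    """
    location = None
    frames = []
    in_trace = False
    for line in log_text.splitlines():
        warning = WARNING_RE.search(line)
        if warning:
            location = (warning.group("path"), int(warning.group("line")))
        if "Call Trace:" in line:
            in_trace = True
            continue
        if "---[ end trace" in line:
            in_trace = False
            continue
        if in_trace:
            frame = FRAME_RE.search(line)
            if frame:
                frames.append(
                    (
                        frame.group("symbol"),
                        frame.group("module"),
                        bool(frame.group("unreliable")),
                    )
                )
    return location, frames


# Shortened excerpt of the log above, used only to exercise the parser.
SAMPLE = """\
[678914.716355] WARNING: CPU: 8 PID: 24086 at /build/linux-kDCE9u/linux-3.13.0/net/ceph/osd_client.c:979 remove_osd+0xea/0x130 [libceph]()
[678914.716416] Call Trace:
[678914.716431]  [<ffffffff8106793a>] warn_slowpath_null+0x1a/0x20
[678914.716438]  [<ffffffffa06bab5a>] remove_osd+0xea/0x130 [libceph]
[678914.716478]  [<ffffffff811bd6e8>] ? do_sys_open+0x1b8/0x280
[678914.716486] ---[ end trace 53c4171e662b9124 ]---
"""

location, frames = parse_call_trace(SAMPLE)
for symbol, module, unreliable in frames:
    print(symbol, module or "(core kernel)", "unreliable" if unreliable else "")
```

Filtering the frames by module (libceph, rbd) shows that the warning is raised in the kernel client code on the host that maps the image, not in the OSD daemons.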

Comment 2 Ken Dreyer (Red Hat) 2015-08-12 16:38:53 UTC
It looks like this is using kernel 3.13. We can't fix bugs in Ubuntu's kernels, so I'm not sure if there's anything to do here. (similar to bz 1250907)

