Bug 1559749

Summary: [CephFS]: IO is hanging while doing rsync
Product: [Red Hat Storage] Red Hat Ceph Storage
Reporter: Persona non grata <nobody+410372>
Component: CephFS
Assignee: Patrick Donnelly <pdonnell>
Status: CLOSED ERRATA
QA Contact: Persona non grata <nobody+410372>
Severity: high
Docs Contact: Aron Gunn <agunn>
Priority: high
Version: 3.0
CC: agunn, anharris, ceph-eng-bugs, ceph-qe-bugs, edonnell, hnallurv, jlayton, john.spray, kdreyer, nobody+410372, pdonnell, rperiyas, vshankar, zyan
Target Milestone: z3   
Target Release: 3.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: RHEL: ceph-12.2.4-10.el7cp Ubuntu: ceph_12.2.4-14redhat1xenial
Doc Type: Bug Fix
Doc Text:
.Reducing the number of active MDS daemons on CephFS no longer causes kernel client I/O to hang
Previously, reducing the number of active Metadata Server (MDS) daemons on a Ceph File System (CephFS) caused kernel client I/O to hang. When this happened, kernel clients were unable to connect to MDS ranks greater than or equal to `max_mds`. This issue has been fixed in this release.
Story Points: ---
Clone Of:
Environment:
Last Closed: 2018-05-15 18:20:31 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1557269    

Description Persona non grata 2018-03-23 08:20:03 UTC
Description of problem:
While running automated scripts that test the rsync module, I/O hung for many hours. Several I/O tools were in use (dd, fio, crefi, touch).

Version-Release number of selected component (if applicable):
ceph version 12.2.4-4.el7cp (bfc2b497ab362f2b3afa7bd1f9d0053f74b60d66) luminous (stable)

How reproducible:
Always

Steps to Reproduce:
1. Set up a Ceph cluster and mount CephFS on ceph-fuse and kernel clients at the same mount point.
2. Run I/O on the local machine and on the mount point, then use the rsync module to sync data from the local directory to the mount directory and from the mount directory back to the local directory.

Actual results:
IOs were hung

Expected results:

IOs should be successful and sync should happen

Additional info:
Logs from the clients and the 2 active MDSs are attached.

Comment 16 Yan, Zheng 2018-03-29 01:21:29 UTC
It looks like ceph_mdsc_open_export_target_session(mdsc, target) returned an error. The function only returns -ENOMEM (unlikely in this case) and -EINVAL. It returns -EINVAL when "target >= mdsmap->m_max_mds". Did you change max_mds from 2 to 1 during the test?
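
As a rough illustration of the check described in this comment, the sketch below models the rank test in plain C. It is not the kernel source: the struct `mdsmap_sketch` and the function `open_export_target_session_sketch` are hypothetical names invented for this example, and only the comparison "target >= m_max_mds" and the -EINVAL return are taken from the comment.

```c
#include <errno.h>

/* Hypothetical, simplified stand-in for the kernel client's MDS map. */
struct mdsmap_sketch {
    int m_max_mds; /* configured number of active MDS ranks (max_mds) */
};

/*
 * Illustrative model of the rank check only (an assumption, not the real
 * kernel function). Returns 0 when a session to rank `target` would be
 * allowed, and -EINVAL when target >= m_max_mds.
 */
static int open_export_target_session_sketch(const struct mdsmap_sketch *map,
                                             int target)
{
    if (target >= map->m_max_mds)
        return -EINVAL;
    return 0;
}
```

Under this model, rank 1 is accepted while max_mds is 2, but the same rank is rejected with -EINVAL once max_mds is reduced to 1, which matches the situation asked about here.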

Comment 17 Persona non grata 2018-03-29 03:58:49 UTC
(In reply to Yan, Zheng from comment #16)
> It looks like ceph_mdsc_open_export_target_session(mdsc, target) returned an
> error. The function only returns -ENOMEM (unlikely in this case) and
> -EINVAL. It returns -EINVAL when "target >= mdsmap->m_max_mds". Did you
> change max_mds from 2 to 1 during the test?

Yes, as cleanup for the previous test.

Comment 39 errata-xmlrpc 2018-05-15 18:20:31 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:1563

Comment 40 Patrick Donnelly 2018-06-29 23:58:06 UTC
*** Bug 1594760 has been marked as a duplicate of this bug. ***