Bug 1503411
| Summary: | [iSCSI]: Incorrect number of tcmu-runner daemons reported after GWs go down and come back up | | |
|---|---|---|---|
| Product: | [Red Hat Storage] Red Hat Ceph Storage | Reporter: | Tejas <tchandra> |
| Component: | iSCSI | Assignee: | Jason Dillaman <jdillama> |
| Status: | CLOSED WONTFIX | QA Contact: | Tejas <tchandra> |
| Severity: | medium | Docs Contact: | Erin Donnelly <edonnell> |
| Priority: | unspecified | | |
| Version: | 3.0 | CC: | ceph-eng-bugs, ceph-qe-bugs, edonnell, jdillama, tchandra |
| Target Milestone: | rc | | |
| Target Release: | 3.* | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | Known Issue |
| Doc Text: | Incorrect number of `tcmu-runner` daemons reported after iSCSI target LUNs fail and recover: after iSCSI target Logical Unit Numbers (LUNs) recover from a failure, the `ceph -s` command in certain cases outputs an incorrect number of `tcmu-runner` daemons. | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2019-02-26 16:14:54 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | | | |
| Bug Blocks: | 1494421 | | |
@Tejas: the service daemons have a 60 second grace period (by default). Did you check the daemon state after 60 seconds had passed?
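One way to rule out the grace period is to re-check the service map well after it has expired. The sketch below is illustrative only: the option name `mgr_service_beacon_grace` and the admin-socket invocation are assumptions about how this grace period is configured on this build, and `<mgr-id>` is a placeholder for the active mgr.

```
# Sketch: wait out the (assumed) 60 s service-beacon grace period, then
# compare the reported tcmu-runner count again.

# Show the configured grace period via the active mgr's admin socket
# (run on the mgr node; the option name mgr_service_beacon_grace is an assumption).
ceph daemon mgr.<mgr-id> config get mgr_service_beacon_grace

# Give the mgr comfortably more than the grace period, then re-check.
sleep 90
ceph -s | grep tcmu-runner

# Dump the raw service map to see exactly which daemons are still registered.
ceph service dump --format json-pretty
```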
Description of problem:

Version-Release number of selected component (if applicable):
ceph version 12.2.1-14.el7cp
libtcmu-1.3.0-0.4.el7cp.x86_64

In the `ceph -s` command output the number of tcmu-runner daemons is reported. I am disabling the network interface on the GW nodes, and after a while bringing it back up.

Commands used:
```
ifdown <eth>
ifup <eth>
```

Total LUNs: 122
Expected tcmu-runner daemons: 488

After 1 GW network down:

```
ceph -s
  cluster:
    id:     2057393b-ce5e-4821-9eb0-96519e801921
    health: HEALTH_OK

  services:
    mon:         3 daemons, quorum havoc,mustang,skytrain
    mgr:         mustang(active)
    osd:         20 osds: 20 up, 20 in
    rgw:         1 daemon active
    tcmu-runner: 257 daemons active   <----------------

  data:
    pools:   13 pools, 842 pgs
    objects: 1140k objects, 3320 GB
    usage:   9960 GB used, 12284 GB / 22245 GB avail
    pgs:     842 active+clean
```

After all 4 GWs have gone down and come back up:

```
~]# ceph -s
  cluster:
    id:     2057393b-ce5e-4821-9eb0-96519e801921
    health: HEALTH_OK

  services:
    mon:         3 daemons, quorum havoc,mustang,skytrain
    mgr:         mustang(active)
    osd:         20 osds: 20 up, 20 in
    rgw:         1 daemon active
    tcmu-runner: 31 daemons active   <---------------

  data:
    pools:   13 pools, 842 pgs
    objects: 1140k objects, 3320 GB
    usage:   9961 GB used, 12284 GB / 22245 GB avail
    pgs:     842 active+clean

  io:
    client:  10743 B/s rd, 111 MB/s wr, 10 op/s rd, 511 op/s wr
```
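For reference, a rough way to compare the expected daemon count (one tcmu-runner instance per LUN per gateway, 122 x 4 = 488 here) against what `ceph -s` reports; the awk parsing of the human-readable output is an illustrative assumption and may need adjusting to the exact formatting of this build:

```
# Sketch: compare expected vs reported tcmu-runner daemon counts.
LUNS=122
GWS=4
EXPECTED=$((LUNS * GWS))

# Pull the reported count from the "tcmu-runner: N daemons active" line
# (parsing of the human-readable output is an assumption, not a stable API).
REPORTED=$(ceph -s | awk '/tcmu-runner:/ {print $2}')

echo "expected=${EXPECTED} reported=${REPORTED}"
if [ "${REPORTED}" != "${EXPECTED}" ]; then
    echo "Mismatch: tcmu-runner daemon count has not converged"
fi
```

Running such a check before the failure, right after recovery, and again after the service-beacon grace period would show whether the count eventually converges back to 488 or stays wrong indefinitely.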