Bug 2181350

Summary: Ceph 100.000% pgs unknown when dual stack enabled
Product: [Red Hat Storage] Red Hat OpenShift Data Foundation
Reporter: Shay Rozen <srozen>
Component: rook
Assignee: Santosh Pillai <sapillai>
Status: NEW
QA Contact: Shay Rozen <srozen>
Severity: urgent
Priority: unspecified
Version: 4.13
CC: abose, brgardne, ebenahar, muagarwa, nberry, odf-bz-bot, sapillai
Keywords: TestBlocker
Target Milestone: ---
Target Release: ---
Hardware: Unspecified
OS: Unspecified

Description Shay Rozen 2023-03-23 19:19:31 UTC
Description of problem (please be as detailed as possible and provide log
snippets):
When ODF is installed with dual stack enabled, Ceph health is:
  cluster:
    id:     e31d0f7b-4329-4f8c-9375-33e8bdfc02bb
    health: HEALTH_WARN
            Reduced data availability: 61 pgs inactive
 
  services:
    mon: 3 daemons, quorum a,b,c (age 100m)
    mgr: a(active, since 99m)
    mds: 1/1 daemons up, 1 hot standby
    osd: 3 osds: 3 up (since 99m), 3 in (since 99m)
 
  data:
    volumes: 1/1 healthy
    pools:   12 pools, 61 pgs
    objects: 0 objects, 0 B
    usage:   0 B used, 0 B / 0 B avail
    pgs:     100.000% pgs unknown
             61 unknown




Version of all relevant components (if applicable):
ocp 4.12.7 stable
odf 4.13.0-107

Does this issue impact your ability to continue to work with the product
(please explain in detail what the user impact is)?
Can't work with the product.

Is there any workaround available to the best of your knowledge?
No

Rate from 1 - 5 the complexity of the scenario you performed that caused this
bug (1 - very simple, 5 - very complex)?
1

Is this issue reproducible?
Yes, reproduced in 2/2 attempts.

Can this issue be reproduced from the UI?
No

If this is a regression, please provide more details to justify this:


Steps to Reproduce:
1. Install ODF with dual stack enabled in the StorageCluster (a fuller CR sketch follows these steps):
spec:
  network:
    dualStack: true
2. Check ceph -s
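
For reference, a minimal sketch of where the dualStack flag sits in the StorageCluster CR. The resource name and namespace shown are the usual ODF defaults and are assumptions; only spec.network.dualStack comes from this report:

apiVersion: ocs.openshift.io/v1
kind: StorageCluster
metadata:
  name: ocs-storagecluster        # assumed default name
  namespace: openshift-storage    # assumed default namespace
spec:
  network:
    # Rook NetworkSpec fields, passed through to the generated CephCluster CR
    dualStack: true

The network block follows the Rook NetworkSpec schema, so the same ipFamily/dualStack fields end up on the CephCluster that Rook reconciles.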



Actual results:
Ceph is not healthy and all PGs are inactive:

  cluster:
    id:     e31d0f7b-4329-4f8c-9375-33e8bdfc02bb
    health: HEALTH_WARN
            Reduced data availability: 61 pgs inactive
 
  services:
    mon: 3 daemons, quorum a,b,c (age 100m)
    mgr: a(active, since 99m)
    mds: 1/1 daemons up, 1 hot standby
    osd: 3 osds: 3 up (since 99m), 3 in (since 99m)
 
  data:
    volumes: 1/1 healthy
    pools:   12 pools, 61 pgs
    objects: 0 objects, 0 B
    usage:   0 B used, 0 B / 0 B avail
    pgs:     100.000% pgs unknown
             61 unknown




Expected results:
Ceph health is OK and all PGs are active.


Additional info:

Comment 5 Santosh Pillai 2023-03-27 02:16:27 UTC
Can you please also add the behavior when IPv6 is enabled and dual stack is false?

Comment 6 Santosh Pillai 2023-03-28 13:31:28 UTC
Currently waiting for the dual stack cluster from OCP QE to debug this issue.

Comment 8 Shay Rozen 2023-03-29 14:01:40 UTC
When installing with IPv4 and dual stack false, the installation is successful. With IPv6 and dual stack false, the same symptoms appear.

Comment 9 Santosh Pillai 2023-04-10 12:57:37 UTC
Ceph does not support dual stack in downstream - https://bugzilla.redhat.com/show_bug.cgi?id=1804290

So `IPFamily:True` and `DualStack:True` won't work in a dual stack cluster. 
Also observed that `IPFamily:true` and `DualStack:False` hit the same issue.
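
For clarity, a sketch of the two Rook network settings being compared above. Field names follow the Rook CephCluster NetworkSpec, and the IPv6 value is an assumption based on comment 8:

# dual stack enabled -- PGs remain 100% unknown
network:
  ipFamily: "IPv6"   # assumed from comment 8
  dualStack: true

# single-stack IPv6, dual stack disabled -- same symptoms per comment 8
network:
  ipFamily: "IPv6"
  dualStack: false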

Comment 11 Mudit Agarwal 2023-04-24 08:44:34 UTC
Moving out of 4.13 as this is expected behaviour; we need to reassess our options here.

Comment 12 Blaine Gardner 2023-07-11 15:17:36 UTC
Moving out of 4.14 since it wasn't planned for the short dev cycle.

There is no known plan to support dual stack in downstream Ceph.