Back to bug 1978722
| Who | When | What | Removed | Added |
|---|---|---|---|---|
| Travis Nielsen | 2021-07-02 16:50:21 UTC | Assignee | tnielsen | shan |
| Sébastien Han | 2021-07-05 13:42:10 UTC | Flags | needinfo?(sagrawal) | |
| Sidhant Agrawal | 2021-07-05 14:09:03 UTC | Flags | needinfo?(sagrawal) | |
| Sidhant Agrawal | 2021-07-05 14:40:36 UTC | Keywords | TestBlocker | |
| Mudit Agarwal | 2021-07-05 14:56:02 UTC | CC | muagarwa, shan | |
| | | Flags | needinfo?(shan) | |
| Sébastien Han | 2021-07-05 15:34:28 UTC | Status | NEW | ASSIGNED |
| | | Flags | needinfo?(shan) | |
| Neha Berry | 2021-07-06 10:32:49 UTC | CC | nberry | |
| | | QA Contact | ebenahar | sagrawal |
| Raz Tamir | 2021-07-06 14:15:05 UTC | CC | ratamir | |
| RHEL Program Management | 2021-07-06 14:15:14 UTC | Target Release | --- | OCS 4.8.0 |
| Sébastien Han | 2021-07-06 21:29:44 UTC | Status | ASSIGNED | POST |
| | | Link ID | Github rook/rook/pull/8272 | |
| Red Hat Bugzilla | 2021-07-07 02:10:55 UTC | Doc Type | If docs needed, set a value | No Doc Update |
| Sébastien Han | 2021-07-07 10:19:29 UTC | Doc Text | When the CephCluster is configured with Multus and multiple networks are used to deploy Ceph, some commands fail when executed from the Operator. These commands, in particular the radosgw-admin ones, need access to the "ceph public network" to talk to OSDs. The Rook-Ceph Operator does not have the network annotations, so the networks are not available to it and it cannot reach the OSDs; the commands hang and eventually time out. Applying the annotations to the Operator pod is possible but would restart the operator, which should be avoided. Applying the annotations beforehand is not possible either, since the Multus declaration is in the CephCluster specification and is not known in advance. The current approach therefore runs a new sidecar container in the mgr pod to act as a proxy for some ceph commands: only the radosgw-admin ones used for multi-site setup. This is a small container with admin access that sits idle, waiting for commands to execute. It is similar to the toolbox, but we did not want to expose it openly, so running it as a sidecar works well. Proxying commands is not always recommended since it adds an extra hop in the network path: each request now goes from the operator pod to the API server to the remote pod to Ceph, whereas previously the command went directly from the operator to Ceph. External mode is not impacted since no rgw pod is configured. This approach scales well, since any CephCluster with Multus gets its own mgr sidecar deployed and can then talk to Ceph. |
| Mudit Agarwal | 2021-07-07 10:51:52 UTC | Doc Type | No Doc Update | Bug Fix |
| Mudit Agarwal | 2021-07-07 15:09:17 UTC | Doc Text | When the CephCluster is configured with Multus and multiple networks are used to deploy Ceph, some commands fail when executed from the Operator. These commands, in particular the radosgw-admin ones, need access to the "ceph public network" to talk to OSDs. The Rook-Ceph Operator does not have the network annotations, so the networks are not available to it and it cannot reach the OSDs; the commands hang and eventually time out. Applying the annotations to the Operator pod is possible but would restart the operator, which should be avoided. Applying the annotations beforehand is not possible either, since the Multus declaration is in the CephCluster specification and is not known in advance. The current approach therefore runs a new sidecar container in the mgr pod to act as a proxy for some ceph commands: only the radosgw-admin ones used for multi-site setup. This is a small container with admin access that sits idle, waiting for commands to execute. It is similar to the toolbox, but we did not want to expose it openly, so running it as a sidecar works well. Proxying commands is not always recommended since it adds an extra hop in the network path: each request now goes from the operator pod to the API server to the remote pod to Ceph, whereas previously the command went directly from the operator to Ceph. External mode is not impacted since no rgw pod is configured. This approach scales well, since any CephCluster with Multus gets its own mgr sidecar deployed and can then talk to Ceph. | |
| | | Doc Type | Bug Fix | No Doc Update |
| Red Hat Bugzilla | 2021-07-07 15:09:17 UTC | Doc Type | No Doc Update | No Doc Update |
| OpenShift BugZilla Robot | 2021-07-07 19:57:36 UTC | Status | POST | MODIFIED |
| Sébastien Han | 2021-07-07 19:58:03 UTC | Link ID | Github openshift/rook/pull/274 | |
| Orit Wasserman | 2021-07-08 14:14:00 UTC | CC | owasserm | |
| Sébastien Han | 2021-07-08 15:54:34 UTC | Summary | OCS deployment with multus unsuccessful with noobaa stuck in Configuring phase | CephObjectStore deployment fails with multus, leading Noobaa to never be ready |
| Mudit Agarwal | 2021-07-09 01:12:16 UTC | Status | MODIFIED | ON_QA |
| | | Fixed In Version | 4.8.0-450.ci | |
| Sidhant Agrawal | 2021-07-12 12:09:44 UTC | Status | ON_QA | ASSIGNED |
| Sébastien Han | 2021-07-12 12:37:45 UTC | Status | ASSIGNED | POST |
| | | Link ID | Github rook/rook/pull/8297 | |
| Sébastien Han | 2021-07-12 15:00:37 UTC | Link ID | Github openshift/rook/pull/276 | |
| OpenShift BugZilla Robot | 2021-07-12 16:13:56 UTC | Status | POST | MODIFIED |
| Mudit Agarwal | 2021-07-13 02:05:45 UTC | Status | MODIFIED | ON_QA |
| | | Fixed In Version | 4.8.0-450.ci | 4.8.0-452.ci |
| Jiffin | 2021-07-13 05:36:41 UTC | CC | jthottan | |
| Sidhant Agrawal | 2021-07-13 16:00:20 UTC | Status | ON_QA | VERIFIED |
| Elad | 2021-08-25 09:25:10 UTC | Keywords | AutomationBackLog | |
| Red Hat Bugzilla | 2022-01-10 10:25:14 UTC | CC | ratamir | |
| Ramakrishnan Periyasamy | 2022-08-17 10:01:39 UTC | CC | rperiyas | |
| Red Hat Bugzilla | 2022-12-31 19:21:10 UTC | QA Contact | sagrawal | nberry |
| Red Hat Bugzilla | 2022-12-31 19:35:02 UTC | CC | rperiyas | |
| Red Hat Bugzilla | 2022-12-31 19:54:34 UTC | CC | nberry | |
| | | QA Contact | nberry | |
| Red Hat Bugzilla | 2022-12-31 20:00:17 UTC | CC | jthottan | |
| Red Hat Bugzilla | 2022-12-31 22:33:42 UTC | CC | owasserm | |
| Alasdair Kergon | 2023-01-04 04:47:42 UTC | QA Contact | sagrawal | |
| Alasdair Kergon | 2023-01-04 05:01:03 UTC | CC | jthottan | |
| Alasdair Kergon | 2023-01-04 05:18:56 UTC | CC | nberry | |
| Alasdair Kergon | 2023-01-04 05:26:53 UTC | CC | owasserm | |
| Alasdair Kergon | 2023-01-04 05:37:14 UTC | CC | rperiyas | |
| Red Hat Bugzilla | 2023-01-31 23:38:20 UTC | CC | madam | |
| Red Hat Bugzilla | 2023-08-03 08:28:40 UTC | CC | ocs-bugs | |
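
The Doc Text above describes proxying radosgw-admin commands through a sidecar container in the mgr pod, with each request travelling from the operator pod to the API server to the remote pod to Ceph. The following is a minimal sketch of that path, not the Rook implementation: it only builds a `kubectl exec` command line, which takes the same route through the API server. The namespace `rook-ceph`, the pod name `rook-ceph-mgr-a-example`, the container name `cmd-proxy`, and the helper `build_proxied_command` are all hypothetical names chosen for illustration.

```python
import shlex
from typing import List


def build_proxied_command(
    namespace: str, mgr_pod: str, sidecar: str, admin_args: List[str]
) -> List[str]:
    """Build the argv that runs radosgw-admin inside a sidecar of the mgr pod.

    All names passed in are assumptions made for this sketch; the bug record
    does not state the real pod or container names.
    """
    if not admin_args:
        raise ValueError("admin_args must not be empty")
    return [
        "kubectl", "exec",
        "-n", namespace,   # namespace of the CephCluster (assumed)
        mgr_pod,           # mgr pod, which has the Multus networks attached
        "-c", sidecar,     # the idle sidecar acting as command proxy (assumed name)
        "--",              # everything after this runs inside the container
        "radosgw-admin", *admin_args,
    ]


if __name__ == "__main__":
    argv = build_proxied_command(
        "rook-ceph", "rook-ceph-mgr-a-example", "cmd-proxy", ["realm", "list"]
    )
    # Print the command instead of running it; no cluster is needed for the sketch.
    print(shlex.join(argv))
```

Because the command executes inside the mgr pod, it uses that pod's network attachments rather than the operator's, which is why the extra hop avoids the timeout described in the Doc Text.
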