RDO tickets are now tracked in Jira https://issues.redhat.com/projects/RDO/issues/
Bug 1524320 - Define troubleshooting procedures for critical RDO Infra services
Keywords:
Status: CLOSED EOL
Alias: None
Product: RDO
Classification: Community
Component: Infrastructure
Version: trunk
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Target Release: trunk
Assignee: Alan Pevec
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2017-12-11 09:10 UTC by Javier Peña
Modified: 2024-02-15 14:43 UTC

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2024-02-14 11:46:44 UTC
Embargoed:



Description Javier Peña 2017-12-11 09:10:44 UTC
Several services provided by the RDO Infrastructure are considered critical (see https://www.rdoproject.org/infra/service-continuity/). A failure or degradation of any of them can affect multiple projects.

The RDO Infra maintainers have different levels of knowledge about these services, so we need to document the most common troubleshooting procedures for the critical ones. This will help in the following areas:

- Allowing consistent troubleshooting when the most knowledgeable person is not around (e.g. on weekends or holidays).
- Preventing the "someone is hit by a bus" effect.

Comment 1 Javier Peña 2017-12-11 09:12:58 UTC
From the RDO Service Continuity page, we should document troubleshooting procedures for at least:

- review.rdoproject.org nodepool nodes (or nodepool in general)
- RDO Trunk repositories
- DLRN DB instance
- images.rdoproject.org
- trunk.registry.rdoproject.org
- www.rdoproject.org
- lists.rdoproject.org

We should rely on upstream published documentation as much as possible.

Comment 2 David Moreau Simard 2017-12-11 14:40:35 UTC
It's okay to link to a documentation site (in git or on readthedocs) from the service continuity page, but I'm not sure rdoproject.org is a good place for technical documentation like this.

Upstream uses system-config [1][2] for this purpose.
Our equivalent would be rdo-infra-playbooks, I guess?

Some projects already have their own built-in documentation (RDO registry, delorean, weirdo), so we could link to it as appropriate from the "main" documentation hub.

[1]: https://docs.openstack.org/infra/system-config/
[2]: https://github.com/openstack-infra/system-config

Comment 3 Javier Peña 2017-12-11 15:08:19 UTC
Maybe we could create a new rdo-docs repo for that, and publish it to readthedocs.org as mentioned on IRC?

Comment 4 Alan Pevec 2018-04-13 12:55:15 UTC
The RDO Registry documentation is published at http://rdo-container-registry.readthedocs.io/

