Bug 1973033 - [RHCS-baremetal] - Provided shell script in the chapter "Recovering the Ceph Monitor store when using BlueStore" is failing
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Ceph Storage
Classification: Red Hat Storage
Component: RADOS
Version: 4.2
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: 5.1
Assignee: Neha Ojha
QA Contact: Manohar Murthy
URL:
Whiteboard:
Depends On:
Blocks: 1969383
Reported: 2021-06-17 07:16 UTC by skanta
Modified: 2022-04-30 16:21 UTC

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-04-30 16:21:09 UTC
Embargoed:


Description skanta 2021-06-17 07:16:18 UTC
Description of problem:
  The shell script provided in the 4.x Troubleshooting Guide fails.

Version-Release number of selected component (if applicable):

Reference doc: https://access.redhat.com/documentation/en-us/red_hat_ceph_storage/4/html-single/troubleshooting_guide/index?lb_target=production#recovering-the-ceph-monitor-store-when-using-bluestore_diag

How reproducible: 


Steps to Reproduce:
1. Crash the MONs by running "rm -rf /var/lib/ceph/mon/ceph-{mon_id}/store.db".
2. Recover the MONs by following the documented procedure.
3. Run the script given in step 2-i of that procedure.



Actual results:

    

Expected results:


Additional info:

 I modified the script as shown below to test the recovery scenario.

Shell script
------------

#!/bin/bash -x

ms=/tmp/monstore/
rm -rf $ms
mkdir $ms

for host in ceph-bharath-1623727358372-node3-mon-osd ceph-bharath-1623727358372-node4-osd-client ceph-bharath-1623727358372-node5-osd ceph-bharath-1623727358372-node6-osd ceph-bharath-1623727358372-node7-osd;
do
echo "The Host name is :$host"

ssh -l root $host "rm -rf  $ms"
ssh -l root $host "mkdir $ms"

scp -r $ms"store.db" $host:$ms
rm -rf $ms
mkdir $ms

#ssh -l root $host "mkdir $ms"

ssh -t root@$host <<'EOT'
ms=/tmp/monstore
for osd in /var/lib/ceph/osd/ceph-*;
do
     IN=$osd
     arrIN=(${IN//-/ })
     systemctl stop ceph-osd@${arrIN[1]}.service
     sleep 5
     echo "ceph-objectstore-tool --type bluestore --data-path $osd --op update-mon-db --no-mon-config --mon-store-path $ms"
     ceph-objectstore-tool --type bluestore --data-path $osd --op update-mon-db --no-mon-config --mon-store-path $ms
     systemctl start ceph-osd@${arrIN[1]}.service
     sleep 5
     echo "Pulling data finished in OSD-${arrIN[1]}"
done
EOT

scp -r $host:$ms"*" $ms
echo "Finished Pulling data: $host"

done
---------------------------------------------------------------------

During testing, I hardcoded the host names and added sleep commands to the script.


If any steps or logic are missing, please add them and provide a refined script.
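
The script above derives the OSD ID by splitting the whole data path on every hyphen, so it only works when the path contains exactly one hyphen. A minimal sketch of an alternative, assuming the directory is named <cluster>-<id> as in /var/lib/ceph/osd/ceph-12 (the function name osd_id_from_path is hypothetical and was not part of the tested script):

```shell
# Print the OSD ID for an OSD data directory.
# Assumption: the directory name ends in "-<id>", e.g. /var/lib/ceph/osd/ceph-12.
osd_id_from_path() {
    local path="${1%/}"            # drop one trailing slash, if present
    printf '%s\n' "${path##*-}"    # keep only the text after the last hyphen
}
```

Inside the loop this could be used as: systemctl stop "ceph-osd@$(osd_id_from_path "$osd").service"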

Comment 1 skanta 2021-06-17 07:26:11 UTC
@vikhyat - Please let me know if any further information is required.
          Do I need to assign this ticket to a developer?

Comment 2 Vikhyat Umrao 2021-06-17 08:27:05 UTC
I think we do not need a RADOS developer for the script :). Let me ask one of our Ceph support team members to help you. @linuxkidd, can you please help Bharath here? This is for the Monitor DB restore script.

Comment 3 Michael J. Kidd 2021-06-17 15:25:18 UTC
I see three errors in the script from the documentation:

1: The 'ssh' command is missing an 's' - "sh -t $host"  should be "ssh -t $host"
2: Bash doesn't stop the 'ssh' input at EOF when the EOF doesn't start at the beginning of the line. Thus, the heavily indented ( for visual aesthetics ) EOF in the docs does not work.
3: The need for '--no-mon-config' on the ceph-objectstore-tool (COT) line. 
   - I'm not sure if this is strictly required in all cases, but it has been needed in the most recent times I've used COT.
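
Point 2 can be reproduced without a cluster. The sketch below runs two small scripts through bash, one with the terminator at the start of the line and one with it indented. The helper name run_snippet is made up for this illustration.

```shell
# Run the given lines as one bash script and capture its stdout.
# stderr is discarded because bash warns about the unterminated here-document.
run_snippet() {
    bash -c "$(printf '%s\n' "$@")" 2>/dev/null
}

# Terminator at the start of the line: the here-document ends there and
# 'echo after' runs as a command.
flush=$(run_snippet 'cat <<EOF' 'payload' 'EOF' 'echo after')

# Indented terminator: bash never sees a line that is exactly 'EOF', so
# everything up to the end of the script becomes here-document text.
indented=$(run_snippet 'cat <<EOF' 'payload' '    EOF' 'echo after')

printf 'flush terminator:\n%s\n\n' "$flush"
printf 'indented terminator:\n%s\n' "$indented"
```

In the indented case the line 'echo after' is printed as text by cat instead of being executed, which is how the remaining script lines ended up being sent to the remote shell. The '<<-EOF' form strips leading tabs only, not spaces, so it does not help with space-indented documentation.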

The addition of the stop/start commands is not expected to be needed since the cluster would be down.
  - In "Prerequisites" before the script, the Containerized deployments section states that all OSD containers should be stopped.
  - The docs should be updated so that the Bare-metal deployments section also states that all OSD services should be stopped.
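
A pre-flight check along these lines could guard the recovery loop on each OSD node. This is only a sketch: it assumes the OSD daemon's process name is ceph-osd, it reads /proc directly and is therefore Linux-only, and both function names are hypothetical.

```shell
# Return 0 if any process with the given command name exists, 1 otherwise.
is_process_running() {
    local wanted="$1" comm_file name
    for comm_file in /proc/[0-9]*/comm; do
        # A process may exit between the glob and the read; skip it.
        { read -r name < "$comm_file"; } 2>/dev/null || continue
        [ "$name" = "$wanted" ] && return 0
    done
    return 1
}

# Refuse to continue while an OSD daemon is still up on this node.
# Assumption: the daemon shows up with the command name "ceph-osd".
preflight_osds_stopped() {
    if is_process_running ceph-osd; then
        echo "ceph-osd is still running; stop all OSD services first" >&2
        return 1
    fi
}
```

Calling 'preflight_osds_stopped || exit 1' before the ceph-objectstore-tool loop would stop the script early instead of letting the tool fail on a mounted OSD.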

There are some other optimizations that can be done in the script... 
 - it's not necessary to 'rm -rf' the paths each loop iteration. Instead, the rsync command can be modified to ensure the contents remain pure without wasting 'rm' cycles.
 - modify the 'pull' rsync to remove the source files from the OSD nodes so space consumed during the process is freed.
 - modify the 'host' variable name to 'osd_node' to be more clear.

I do still recommend using rsync over scp due to the efficiencies of only transferring modified files and removing the source files without an additional separate ssh/rm command sequence.

Here's my final, recommended ( but untested ) script:

## --------------------------------------------------------------------------
## NOTE: The directory names specified by 'ms', 'db', and 'db_slow' must end
## with a trailing / otherwise rsync will not operate properly.
## --------------------------------------------------------------------------
ms=/tmp/monstore/
db=/root/db/
db_slow=/root/db.slow/

mkdir -p $ms $db $db_slow

## --------------------------------------------------------------------------
## NOTE: Replace the contents inside double quotes for 'osd_nodes' below with
## the list of OSD nodes in the environment.
## --------------------------------------------------------------------------
osd_nodes="osdnode1 osdnode2 osdnode3..."

for osd_node in $osd_nodes; do
echo "Operating on $osd_node"
rsync -avz --delete $ms $osd_node:$ms
rsync -avz --delete $db $osd_node:$db
rsync -avz --delete $db_slow $osd_node:$db_slow

ssh -t $osd_node <<EOF
for osd in /var/lib/ceph/osd/ceph-*; do
    ceph-objectstore-tool --type bluestore --data-path \$osd --op update-mon-db --no-mon-config --mon-store-path $ms
done
EOF

rsync -avz --delete --remove-source-files $osd_node:$ms $ms
rsync -avz --delete --remove-source-files $osd_node:$db $db
rsync -avz --delete --remove-source-files $osd_node:$db_slow $db_slow
done
## --------------------------------------------------------------------------
## End of script
## --------------------------------------------------------------------------

@skanta Please test the above modified script - once confirmed functional, the documentation should be updated as specified.
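
The trailing-slash note in the script matters because rsync treats 'dir/' as "the contents of dir" and 'dir' as "the directory itself". A small sketch that normalizes a path before it is used; the helper name with_trailing_slash is hypothetical and not part of the script above:

```shell
# Print the path with exactly one trailing slash.
with_trailing_slash() {
    local path="$1"
    # Strip every trailing slash, then add a single one back.
    while [ "${path%/}" != "$path" ]; do
        path=${path%/}
    done
    printf '%s/\n' "$path"
}

# Example: the result is the same whether or not the input ends in a slash.
ms=$(with_trailing_slash /tmp/monstore)
```

With this in place a value such as /tmp/monstore typed without the slash would still produce the path form that the rsync commands expect.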

Comment 4 skanta 2021-06-17 16:00:35 UTC
The script in the description has been modified and tested.

Script link - https://bugzilla.redhat.com/show_bug.cgi?id=1973033#c0


1.

Even though the MONs are down, the OSD services are still running in the cluster. Because of that, running the "ceph-objectstore-tool" command fails with the error below:

Error message: Mount failed with '(11) Resource temporarily unavailable'
As per the documentation, this occurs when ceph-objectstore-tool is executed on a running OSD.
Reference doc: https://docs.ceph.com/en/latest/man/8/ceph-objectstore-tool/

It is a valid point that we can avoid the above by adding a step to stop all OSDs. I will remove the service stop and start steps from the script.

2. rsync was removed; the script uses scp instead.

3. I am initializing the osd_nodes variable with all existing OSD nodes. This will be tedious for customers if the cluster has many OSD nodes.
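
One way to ease point 3 is to keep the node names in a file and build osd_nodes from it. The sketch below assumes a plain text file with host names separated by whitespace or newlines and '#' comments; the function name read_osd_nodes and the file itself are hypothetical.

```shell
# Print the host names listed in a file as one space-separated line.
# Blank lines and anything after '#' are ignored.
read_osd_nodes() (
    set -f   # no glob expansion while splitting lines into words
    nodes=""
    while IFS= read -r line || [ -n "$line" ]; do
        line=${line%%#*}                   # strip comments
        for word in $line; do              # split the rest on whitespace
            nodes="${nodes:+$nodes }$word"
        done
    done < "$1"
    printf '%s\n' "$nodes"
)
```

The recovery script could then start with something like: osd_nodes=$(read_osd_nodes /root/osd_nodes.txt), where the path is only an example.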

Comment 5 Michael J. Kidd 2021-06-25 13:13:36 UTC
Modified script to also populate a keyring file inside of $ms so it is copied and updated throughout the process.

## --------------------------------------------------------------------------
## NOTE: The directory names specified by 'ms', 'db', and 'db_slow' must end
## with a trailing / otherwise rsync will not operate properly.
## --------------------------------------------------------------------------
ms=/tmp/monstore/
db=/root/db/
db_slow=/root/db.slow/

mkdir -p $ms $db $db_slow

## --------------------------------------------------------------------------
## NOTE: Replace the contents inside double quotes for 'osd_nodes' below with
## the list of OSD nodes in the environment.
## --------------------------------------------------------------------------
osd_nodes="osdnode1 osdnode2 osdnode3..."

for osd_node in $osd_nodes; do
echo "Operating on $osd_node"
rsync -avz --delete $ms $osd_node:$ms
rsync -avz --delete $db $osd_node:$db
rsync -avz --delete $db_slow $osd_node:$db_slow

ssh -t $osd_node <<EOF
for osd in /var/lib/ceph/osd/ceph-*; do
    ceph-objectstore-tool --type bluestore --data-path \$osd --op update-mon-db --no-mon-config --mon-store-path $ms
    if [ -e \$osd/keyring ]; then
        cat \$osd/keyring >> $ms/keyring
        echo '    caps mgr = "allow profile osd"' >> $ms/keyring
        echo '    caps mon = "allow profile osd"' >> $ms/keyring
        echo '    caps osd = "allow *"' >> $ms/keyring
EOT
    else
        echo WARNING: \$osd on $osd_node does not have a local keyring.
    fi
done
EOF

rsync -avz --delete --remove-source-files $osd_node:$ms $ms
rsync -avz --delete --remove-source-files $osd_node:$db $db
rsync -avz --delete --remove-source-files $osd_node:$db_slow $db_slow
done
## --------------------------------------------------------------------------
## End of script
## --------------------------------------------------------------------------

Comment 6 Michael J. Kidd 2021-06-25 13:17:37 UTC
Fixing a typo ( left-over EOT from multiple experiments ).. apologies.

## --------------------------------------------------------------------------
## NOTE: The directory names specified by 'ms', 'db', and 'db_slow' must end
## with a trailing / otherwise rsync will not operate properly.
## --------------------------------------------------------------------------
ms=/tmp/monstore/
db=/root/db/
db_slow=/root/db.slow/

mkdir -p $ms $db $db_slow

## --------------------------------------------------------------------------
## NOTE: Replace the contents inside double quotes for 'osd_nodes' below with
## the list of OSD nodes in the environment.
## --------------------------------------------------------------------------
osd_nodes="osdnode1 osdnode2 osdnode3..."

for osd_node in $osd_nodes; do
echo "Operating on $osd_node"
rsync -avz --delete $ms $osd_node:$ms
rsync -avz --delete $db $osd_node:$db
rsync -avz --delete $db_slow $osd_node:$db_slow

ssh -t $osd_node <<EOF
for osd in /var/lib/ceph/osd/ceph-*; do
    ceph-objectstore-tool --type bluestore --data-path \$osd --op update-mon-db --no-mon-config --mon-store-path $ms
    if [ -e \$osd/keyring ]; then
        cat \$osd/keyring >> $ms/keyring
        echo '    caps mgr = "allow profile osd"' >> $ms/keyring
        echo '    caps mon = "allow profile osd"' >> $ms/keyring
        echo '    caps osd = "allow *"' >> $ms/keyring
    else
        echo WARNING: \$osd on $osd_node does not have a local keyring.
    fi
done
EOF

rsync -avz --delete --remove-source-files $osd_node:$ms $ms
rsync -avz --delete --remove-source-files $osd_node:$db $db
rsync -avz --delete --remove-source-files $osd_node:$db_slow $db_slow
done
## --------------------------------------------------------------------------
## End of script
## --------------------------------------------------------------------------
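
The keyring handling inside the loop above can be looked at in isolation. The sketch below is not part of the script; it wraps the same append-and-add-caps steps in a hypothetical function, append_osd_keyring, so they can be tried against a scratch file first.

```shell
# Append one OSD keyring to the collected keyring and add the capability
# lines that the script in this comment writes for every OSD.
append_osd_keyring() {
    local src="$1" dst="$2"
    if [ ! -e "$src" ]; then
        echo "WARNING: $src does not exist, skipping" >&2
        return 1
    fi
    {
        cat "$src"
        echo '    caps mgr = "allow profile osd"'
        echo '    caps mon = "allow profile osd"'
        echo '    caps osd = "allow *"'
    } >> "$dst"
}
```

Like the original lines, this assumes each OSD keyring ends with a newline, so that the first caps line starts on a line of its own.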

Comment 8 skanta 2021-08-19 11:04:25 UTC
Verified that the following script works as expected.

## --------------------------------------------------------------------------
## NOTE: The directory names specified by 'ms', 'db', and 'db_slow' must end
## with a trailing / otherwise rsync will not operate properly.
## --------------------------------------------------------------------------
ms=/tmp/monstore/
db=/root/db/
db_slow=/root/db.slow/

mkdir -p $ms $db $db_slow

## --------------------------------------------------------------------------
## NOTE: Replace the contents inside double quotes for 'osd_nodes' below with
## the list of OSD nodes in the environment.
## --------------------------------------------------------------------------
osd_nodes="osdnode1 osdnode2 osdnode3..."

for osd_node in $osd_nodes; do
echo "Operating on $osd_node"
rsync -avz --delete $ms $osd_node:$ms
rsync -avz --delete $db $osd_node:$db
rsync -avz --delete $db_slow $osd_node:$db_slow

ssh -t $osd_node <<EOF
for osd in /var/lib/ceph/osd/ceph-*; do
    ceph-objectstore-tool --type bluestore --data-path \$osd --op update-mon-db --no-mon-config --mon-store-path $ms
    if [ -e \$osd/keyring ]; then
        cat \$osd/keyring >> $ms/keyring
        echo '    caps mgr = "allow profile osd"' >> $ms/keyring
        echo '    caps mon = "allow profile osd"' >> $ms/keyring
        echo '    caps osd = "allow *"' >> $ms/keyring
    else
        echo WARNING: \$osd on $osd_node does not have a local keyring.
    fi
done
EOF

rsync -avz --delete --remove-source-files $osd_node:$ms $ms
rsync -avz --delete --remove-source-files $osd_node:$db $db
rsync -avz --delete --remove-source-files $osd_node:$db_slow $db_slow
done
## --------------------------------------------------------------------------
## End of script
## --------------------------------------------------------------------------

