Description of problem:
1) Deployment of a hybrid Ceph cluster fails: the OSD daemon on the x86_64 node crashes.
2) The monitor daemon of a homogeneous cluster built of s390x nodes crashes repeatedly under a specific workload on CephFS.
Ceph version: 15.2.3 ("Octopus")
Rate from 1 - 5 the complexity of the scenario you performed that caused this bug (1 - very simple, 5 - very complex)? 4
Reproducible: Yes
Steps to reproduce:
1) Build a "hybrid" cluster. Bootstrap a monitor node on s390 box. Add a number of OSD nodes, one of which is on x86_64. Start a ceph manager and a monitor daemon. Start an OSD daemon on the x86_64 node.
Actual results: the OSD daemon crashes.
Expected results: the OSD daemon runs without crashing.
2) Build a cluster (all nodes on s390x) with 1 monitor node, 3 OSDs, 1 MDS and 1 MGR. Add 2 more monitor nodes on hosts that are already OSDs; the resulting cluster has 3 monitor nodes, 2 of which are both OSDs and monitors. Create and mount a CephFS. Run the stress-ng tool (job file and command under "Additional info" below).
Actual results: the monitor daemon crashes repeatedly.
Expected results: the monitor daemon does not crash.
Additional info:
Topology:
root@m8330013:~# ceph node ls all
{
    "mon": {
        "m8330013": [
            "m8330013"
        ],
        "m8330014": [
            "m8330014"
        ],
        "m8330015": [
            "m8330015"
        ]
    },
    "osd": {
        "m8330014": [
            0
        ],
        "m8330015": [
            1
        ],
        "m8330016": [
            2
        ]
    },
    "mds": {
        "m8330013": [
            "m8330013"
        ]
    },
    "mgr": {
        "m8330013": [
            "m8330013"
        ],
        "m8330015": [
            "m8330015"
        ]
    }
}
Stress-ng job file:
run sequential
metrics
verbose
timeout 5m
times
timestamp
#0 means 1 stressor per CPU
access 0
bind-mount 0
chdir 0
chmod 0
chown 0
copy-file 0
dentry 0
dir 0
dirdeep 0
dnotify 0
dup 0
eventfd 0
fallocate 0
fanotify 0
fcntl 0
fiemap 0
file-ioctl 0
filename 0
flock 0
fstat 0
getdent 0
handle 0
inode-flags 0
inotify 0
io 0
iomix 0
ioprio 0
lease 0
link 0
locka 0
lockf 0
lockofd 0
mknod 0
open 0
procfs 0
rename 0
symlink 0
sync-file 0
utime 0
xattr 0
Command for Execution:
stress-ng --job <job_file> --temp-path <cephfs_mountpoint> --log-file <log_file>
The root cause of the problems was identified as a number of endianness bugs in Ceph message encoding/decoding that were causing various issues on IBM Z (a big-endian platform).
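Ceph's on-wire encoding is little-endian, so a big-endian host such as IBM Z has to byte-swap multi-byte values during encode and decode. The sketch below (hypothetical encode_le32/decode_le32 helpers, not Ceph's actual API) illustrates the invariant the fixes restore and the non-portable pattern that breaks it:

#include <cassert>
#include <cstdint>
#include <cstring>

// Encode a 32-bit value in little-endian byte order, regardless of host.
void encode_le32(uint32_t v, unsigned char *out) {
    out[0] = v & 0xff;
    out[1] = (v >> 8) & 0xff;
    out[2] = (v >> 16) & 0xff;
    out[3] = (v >> 24) & 0xff;
}

// Decode the same representation; correct on x86_64 and s390x alike.
uint32_t decode_le32(const unsigned char *in) {
    return (uint32_t)in[0] | ((uint32_t)in[1] << 8)
         | ((uint32_t)in[2] << 16) | ((uint32_t)in[3] << 24);
}

int main() {
    unsigned char buf[4];
    encode_le32(0x12345678u, buf);
    assert(buf[0] == 0x78 && buf[3] == 0x12);  // wire order is fixed
    assert(decode_le32(buf) == 0x12345678u);

    // The buggy pattern: copying the host representation. On s390x this
    // writes 0x12 first, so an x86_64 peer decodes a different number.
    uint32_t v = 0x12345678u;
    std::memcpy(buf, &v, sizeof(v));  // host-order bytes: not portable
    return 0;
}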
Proposed fixes were sent upstream. The related upstream pull requests are:
https://github.com/ceph/ceph/pull/35920 (msg/msg_types: entity_addrvec_t: fix decode on big-endian hosts)
This is probably not critical on its own, but it did cause Ceph unit test failures.
https://github.com/ceph/ceph/pull/36697 (messages,mds: Fix decoding of enum types on big-endian systems)
This showed up as MON daemons crashing whenever they received a HEALTH_WARN message (and probably caused other problems as well).
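One classic form of this bug, sketched below, is reading or writing an int-sized enum through a pointer to a narrower type: that aliases the low-order byte on little-endian x86_64 but the high-order byte on big-endian s390x. The health_status_t definition here is a simplified stand-in, not the actual Ceph code:

#include <cstdint>
#include <cstdio>

enum health_status_t { HEALTH_OK = 0, HEALTH_WARN = 1, HEALTH_ERR = 2 };

int main() {
    health_status_t s = HEALTH_WARN;

    // Buggy pattern: alias the first byte of the enum object. On x86_64
    // this is the low-order byte and "works"; on s390x it is the
    // high-order byte, so HEALTH_WARN (1) reads back as 0 and a write
    // through the same cast corrupts the value.
    uint8_t aliased = *reinterpret_cast<uint8_t *>(&s);

    // Portable pattern: convert through a fixed-width integer explicitly,
    // so the same byte goes on the wire on every architecture.
    uint8_t wire = static_cast<uint8_t>(s);
    health_status_t decoded = static_cast<health_status_t>(wire);

    std::printf("aliased=%u wire=%u decoded=%u\n",
                (unsigned)aliased, (unsigned)wire, (unsigned)decoded);
    return 0;
}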
https://github.com/ceph/ceph/pull/36992 (include/encoding: Fix encode/decode of float types on big-endian systems)
This causes immediate crashes of an IBM Z OSD attempting to join an x86-based Ceph cluster (and probably other problems).
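The float case follows the same pattern: the IEEE-754 bit pattern has to be serialized in a defined byte order, not copied in host order. A minimal sketch (hypothetical helpers, not the real Ceph encoder):

#include <cassert>
#include <cstdint>
#include <cstring>

// Reinterpret the float's bits as a uint32_t, then emit those bits
// little-endian, matching what an x86_64 peer expects on the wire.
void encode_float_le(float f, unsigned char *out) {
    uint32_t bits;
    std::memcpy(&bits, &f, sizeof(bits));  // IEEE-754 bit pattern
    out[0] = bits & 0xff;
    out[1] = (bits >> 8) & 0xff;
    out[2] = (bits >> 16) & 0xff;
    out[3] = (bits >> 24) & 0xff;
}

float decode_float_le(const unsigned char *in) {
    uint32_t bits = (uint32_t)in[0] | ((uint32_t)in[1] << 8)
                  | ((uint32_t)in[2] << 16) | ((uint32_t)in[3] << 24);
    float f;
    std::memcpy(&f, &bits, sizeof(f));
    return f;
}

int main() {
    unsigned char buf[4];
    encode_float_le(1.5f, buf);                // 1.5f == 0x3fc00000
    assert(buf[0] == 0x00 && buf[3] == 0x3f);  // same bytes on every host
    assert(decode_float_le(buf) == 1.5f);
    return 0;
}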
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.
For information on the advisory (Important: Red Hat Ceph Storage 4.2 Security and Bug Fix update), and where to find the updated files, follow the link below.
If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHSA-2021:0081