Bug 2292327 - [ceph-bluestore-tool] Addition of dedicated DB device with CBT bluefs-bdev-new-db returns mount failed error
Summary: [ceph-bluestore-tool] Addition of dedicated DB device with CBT bluefs-bdev-ne...
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Ceph Storage
Classification: Red Hat Storage
Component: RADOS
Version: 5.3
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: 5.3z7
Assignee: Adam Kupczyk
QA Contact: Harsh Kumar
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2024-06-14 03:58 UTC by Harsh Kumar
Modified: 2024-10-25 04:25 UTC (History)

Fixed In Version: ceph-16.2.10-266.el9cp
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2024-06-26 10:02:29 UTC
Embargoed:


Links
System                  ID              Private  Priority  Status  Summary  Last Updated
Red Hat Issue Tracker   RHCEPH-9169     0        None      None    None     2024-06-14 03:59:11 UTC
Red Hat Product Errata  RHSA-2024:4118  0        None      None    None     2024-06-26 10:02:33 UTC

Description Harsh Kumar 2024-06-14 03:58:32 UTC
Description of problem:
  Possibly caused by the same issue as BZ#2292323.
  The upstream documentation for ceph-bluestore-tool describes the command as follows:
    bluefs-bdev-new-db --path <osd path> --dev-target <new-device>

    Adds DB device to BlueFS, fails if DB device already exists.


    ceph-bluestore-tool bluefs-bdev-new-db --path <osd path> --dev-target <new-device>

    Refer: https://docs.ceph.com/en/latest/man/8/ceph-bluestore-tool/

    The bluefs-bdev-new-db command currently fails on the latest nightly build of RHCS 5.3 (16.2.10-264.el8cp).

Version-Release number of selected component (if applicable):
  ceph version 16.2.10-264.el8cp (8f5f7a32a6ad0fa100fdf9e8823564d26e554e9d) pacific (stable)


How reproducible:
3/3

Steps to Reproduce:
1. Deploy a RHCS 5.3 cluster
2. On an OSD node, stop any one OSD service
  # systemctl stop ceph-3d3ab846-2951-11ef-b4fa-fa163e72f4bd.service
3. Ensure that a spare disk or LVM logical volume is available to use as the dedicated DB device
4. From inside the OSD container, run the ceph-bluestore-tool bluefs-bdev-new-db command
  # cephadm shell --name osd.3  --env CEPH_ARGS='--bluestore_block_db_size=1341967564' -- ceph-bluestore-tool bluefs-bdev-new-db --dev-target /dev/data_vg4/lv4 --path /var/lib/ceph/osd/ceph-3
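
For anyone re-running steps 3 and 4, a small pre-flight guard may help rule out the documented "fails if DB device already exists" case before blaming the tool. This is only a sketch: the function names, the variables, and the dry-run echo are illustrative and not part of any Ceph tooling. The block.db check assumes it runs where the OSD directory is visible, for example inside the OSD container. Only the final command line is taken from this report.

```shell
#!/usr/bin/env bash
# Pre-flight sketch for steps 3-4. Helper names are hypothetical;
# the values below are the ones used in this report.
osd_id=3
osd_path="/var/lib/ceph/osd/ceph-${osd_id}"
dev_target="/dev/data_vg4/lv4"
db_size=1341967564

# True when the OSD directory already has a block.db entry, which is the
# case the upstream man page says bluefs-bdev-new-db refuses to handle.
has_dedicated_db() {
  [ -e "$1/block.db" ] || [ -L "$1/block.db" ]
}

# Prints the command line from step 4 for the given OSD id, DB size,
# target device and OSD path. It does not execute anything.
build_new_db_cmd() {
  printf "cephadm shell --name osd.%s --env CEPH_ARGS='--bluestore_block_db_size=%s' -- ceph-bluestore-tool bluefs-bdev-new-db --dev-target %s --path %s\n" \
    "$1" "$2" "$3" "$4"
}

if has_dedicated_db "$osd_path"; then
  echo "osd.${osd_id} already has a block.db entry, not adding another" >&2
else
  # Dry run: show the command instead of running it.
  build_new_db_cmd "$osd_id" "$db_size" "$dev_target" "$osd_path"
fi
```

On the affected build the guard passes (osd.3 has no block.db yet), so the failure reported below is not the "already exists" case.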

Actual results:
  From automation log (RHCS 5.3):
    cephadm shell --name osd.3  --env CEPH_ARGS='--bluestore_block_db_size=1341967564' -- ceph-bluestore-tool bluefs-bdev-new-db --dev-target /dev/data_vg4/lv4 --path /var/lib/ceph/osd/ceph-3 on 10.0.210.200 timeout 300
    2024-06-13 13:04:17,496 (cephci.test_bluestoretool_workflows) [ERROR] - cephci.forked-repo-tfa-q2.cephci.ceph.ceph.py:1604 - Error 134 during cmd, timeout 300
    2024-06-13 13:04:17,497 (cephci.test_bluestoretool_workflows) [ERROR] - cephci.forked-repo-tfa-q2.cephci.ceph.ceph.py:1605 - Inferring fsid 3d3ab846-2951-11ef-b4fa-fa163e72f4bd
    2024-06-13 13:04:17,497 (cephci.test_bluestoretool_workflows) [ERROR] - cephci.forked-repo-tfa-q2.cephci.ceph.rados.bluestoretool_workflows.py:72 - Exception hit while command execution. cephadm shell --name osd.3  --env CEPH_ARGS='--bluestore_block_db_size=1341967564' -- ceph-bluestore-tool bluefs-bdev-new-db --dev-target /dev/data_vg4/lv4 --path /var/lib/ceph/osd/ceph-3 Error:  Inferring fsid 3d3ab846-2951-11ef-b4fa-fa163e72f4bd
    Inferring config /var/lib/ceph/3d3ab846-2951-11ef-b4fa-fa163e72f4bd/osd.3/config
    Using recent ceph image registry-proxy.engineering.redhat.com/rh-osbs/rhceph@sha256:64dd6a2d61230837791b0bcf23b79b2b022f41cab332670593502ee9458fc4bc
    2024-06-13T07:34:16.631+0000 7f6fb0c66540 -1 bluefs _verify_alloc_granularity OP_FILE_UPDATE of 0:0x501000~100000 does not align to alloc_size 0x100000
    2024-06-13T07:34:16.631+0000 7f6fb0c66540 -1 bluefs mount failed to replay log: (14) Bad address
    2024-06-13T07:34:16.631+0000 7f6fb0c66540 -1 bluestore(/var/lib/ceph/osd/ceph-3) _open_bluefs failed bluefs mount: (14) Bad address
    2024-06-13T07:34:16.631+0000 7f6fb0c66540 -1 bluestore(/var/lib/ceph/osd/ceph-3) _open_db failed to prepare db environment: 
    /builddir/build/BUILD/ceph-16.2.10/src/os/bluestore/BlueStore.cc: In function 'int BlueStore::add_new_bluefs_device(int, const string&)' thread 7f6fb0c66540 time 2024-06-13T07:34:16.888185+0000
    /builddir/build/BUILD/ceph-16.2.10/src/os/bluestore/BlueStore.cc: 6628: FAILED ceph_assert(r == 0)
    2024-06-13T07:34:16.887+0000 7f6fb0c66540 -1 bluestore(/var/lib/ceph/osd/ceph-3) _setup_block_symlink_or_file failed to create block.db symlink to /dev/dm-7: (9) Bad file descriptor
     ceph version 16.2.10-264.el8cp (8f5f7a32a6ad0fa100fdf9e8823564d26e554e9d) pacific (stable)
     1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x158) [0x7f6fa6ce00c4]
     2: /usr/lib64/ceph/libceph-common.so.2(+0x28b2de) [0x7f6fa6ce02de]
     3: (BlueStore::add_new_bluefs_device(int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)+0x843) [0x561454e2bae3]
     4: main()
     5: __libc_start_main()
     6: _start()
    *** Caught signal (Aborted) **
     in thread 7f6fb0c66540 thread_name:ceph-bluestore-
     ceph version 16.2.10-264.el8cp (8f5f7a32a6ad0fa100fdf9e8823564d26e554e9d) pacific (stable)
     1: /lib64/libpthread.so.0(+0x12d20) [0x7f6fa6109d20]
     2: gsignal()
     3: abort()
     4: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x1a9) [0x7f6fa6ce0115]
     5: /usr/lib64/ceph/libceph-common.so.2(+0x28b2de) [0x7f6fa6ce02de]
     6: (BlueStore::add_new_bluefs_device(int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)+0x843) [0x561454e2bae3]
     7: main()
     8: __libc_start_main()
     9: _start()
    2024-06-13T07:34:16.887+0000 7f6fb0c66540 -1 /builddir/build/BUILD/ceph-16.2.10/src/os/bluestore/BlueStore.cc: In function 'int BlueStore::add_new_bluefs_device(int, const string&)' thread 7f6fb0c66540 time 2024-06-13T07:34:16.888185+0000
    /builddir/build/BUILD/ceph-16.2.10/src/os/bluestore/BlueStore.cc: 6628: FAILED ceph_assert(r == 0)

     ceph version 16.2.10-264.el8cp (8f5f7a32a6ad0fa100fdf9e8823564d26e554e9d) pacific (stable)
     1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x158) [0x7f6fa6ce00c4]
     2: /usr/lib64/ceph/libceph-common.so.2(+0x28b2de) [0x7f6fa6ce02de]
     3: (BlueStore::add_new_bluefs_device(int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)+0x843) [0x561454e2bae3]
     4: main()
     5: __libc_start_main()
     6: _start()

    2024-06-13T07:34:16.889+0000 7f6fb0c66540 -1 *** Caught signal (Aborted) **
     in thread 7f6fb0c66540 thread_name:ceph-bluestore-

     ceph version 16.2.10-264.el8cp (8f5f7a32a6ad0fa100fdf9e8823564d26e554e9d) pacific (stable)
     1: /lib64/libpthread.so.0(+0x12d20) [0x7f6fa6109d20]
     2: gsignal()
     3: abort()
     4: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x1a9) [0x7f6fa6ce0115]
     5: /usr/lib64/ceph/libceph-common.so.2(+0x28b2de) [0x7f6fa6ce02de]
     6: (BlueStore::add_new_bluefs_device(int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)+0x843) [0x561454e2bae3]
     7: main()
     8: __libc_start_main()
     9: _start()
     NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

       -54> 2024-06-13T07:34:16.631+0000 7f6fb0c66540 -1 bluefs _verify_alloc_granularity OP_FILE_UPDATE of 0:0x501000~100000 does not align to alloc_size 0x100000
       -53> 2024-06-13T07:34:16.631+0000 7f6fb0c66540 -1 bluefs mount failed to replay log: (14) Bad address
       -52> 2024-06-13T07:34:16.631+0000 7f6fb0c66540 -1 bluestore(/var/lib/ceph/osd/ceph-3) _open_bluefs failed bluefs mount: (14) Bad address
       -51> 2024-06-13T07:34:16.631+0000 7f6fb0c66540 -1 bluestore(/var/lib/ceph/osd/ceph-3) _open_db failed to prepare db environment: 
       -50> 2024-06-13T07:34:16.887+0000 7f6fb0c66540 -1 bluestore(/var/lib/ceph/osd/ceph-3) _setup_block_symlink_or_file failed to create block.db symlink to /dev/dm-7: (9) Bad file descriptor
       -49> 2024-06-13T07:34:16.887+0000 7f6fb0c66540 -1 /builddir/build/BUILD/ceph-16.2.10/src/os/bluestore/BlueStore.cc: In function 'int BlueStore::add_new_bluefs_device(int, const string&)' thread 7f6fb0c66540 time 2024-06-13T07:34:16.888185+0000
    /builddir/build/BUILD/ceph-16.2.10/src/os/bluestore/BlueStore.cc: 6628: FAILED ceph_assert(r == 0)

     ceph version 16.2.10-264.el8cp (8f5f7a32a6ad0fa100fdf9e8823564d26e554e9d) pacific (stable)
     1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x158) [0x7f6fa6ce00c4]
     2: /usr/lib64/ceph/libceph-common.so.2(+0x28b2de) [0x7f6fa6ce02de]
     3: (BlueStore::add_new_bluefs_device(int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)+0x843) [0x561454e2bae3]
     4: main()
     5: __libc_start_main()
     6: _start()

       -48> 2024-06-13T07:34:16.889+0000 7f6fb0c66540 -1 *** Caught signal (Aborted) **
     in thread 7f6fb0c66540 thread_name:ceph-bluestore-

     ceph version 16.2.10-264.el8cp (8f5f7a32a6ad0fa100fdf9e8823564d26e554e9d) pacific (stable)
     1: /lib64/libpthread.so.0(+0x12d20) [0x7f6fa6109d20]
     2: gsignal()
     3: abort()
     4: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x1a9) [0x7f6fa6ce0115]
     5: /usr/lib64/ceph/libceph-common.so.2(+0x28b2de) [0x7f6fa6ce02de]
     6: (BlueStore::add_new_bluefs_device(int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)+0x843) [0x561454e2bae3]
     7: main()
     8: __libc_start_main()
     9: _start()
     NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

       -10> 2024-06-13T07:34:16.631+0000 7f6fb0c66540 -1 bluefs _verify_alloc_granularity OP_FILE_UPDATE of 0:0x501000~100000 does not align to alloc_size 0x100000
        -9> 2024-06-13T07:34:16.631+0000 7f6fb0c66540 -1 bluefs mount failed to replay log: (14) Bad address
        -5> 2024-06-13T07:34:16.631+0000 7f6fb0c66540 -1 bluestore(/var/lib/ceph/osd/ceph-3) _open_bluefs failed bluefs mount: (14) Bad address
        -4> 2024-06-13T07:34:16.631+0000 7f6fb0c66540 -1 bluestore(/var/lib/ceph/osd/ceph-3) _open_db failed to prepare db environment: 
        -2> 2024-06-13T07:34:16.887+0000 7f6fb0c66540 -1 bluestore(/var/lib/ceph/osd/ceph-3) _setup_block_symlink_or_file failed to create block.db symlink to /dev/dm-7: (9) Bad file descriptor
        -1> 2024-06-13T07:34:16.887+0000 7f6fb0c66540 -1 /builddir/build/BUILD/ceph-16.2.10/src/os/bluestore/BlueStore.cc: In function 'int BlueStore::add_new_bluefs_device(int, const string&)' thread 7f6fb0c66540 time 2024-06-13T07:34:16.888185+0000
    /builddir/build/BUILD/ceph-16.2.10/src/os/bluestore/BlueStore.cc: 6628: FAILED ceph_assert(r == 0)

     ceph version 16.2.10-264.el8cp (8f5f7a32a6ad0fa100fdf9e8823564d26e554e9d) pacific (stable)
     1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x158) [0x7f6fa6ce00c4]
     2: /usr/lib64/ceph/libceph-common.so.2(+0x28b2de) [0x7f6fa6ce02de]
     3: (BlueStore::add_new_bluefs_device(int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)+0x843) [0x561454e2bae3]
     4: main()
     5: __libc_start_main()
     6: _start()

         0> 2024-06-13T07:34:16.889+0000 7f6fb0c66540 -1 *** Caught signal (Aborted) **
     in thread 7f6fb0c66540 thread_name:ceph-bluestore-

     ceph version 16.2.10-264.el8cp (8f5f7a32a6ad0fa100fdf9e8823564d26e554e9d) pacific (stable)
     1: /lib64/libpthread.so.0(+0x12d20) [0x7f6fa6109d20]
     2: gsignal()
     3: abort()
     4: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x1a9) [0x7f6fa6ce0115]
     5: /usr/lib64/ceph/libceph-common.so.2(+0x28b2de) [0x7f6fa6ce02de]
     6: (BlueStore::add_new_bluefs_device(int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)+0x843) [0x561454e2bae3]
     7: main()
     8: __libc_start_main()
     9: _start()
     NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
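
Two numbers in the log above can be checked by hand. The sketch below assumes the extent in the first error line reads as device:offset~length with offset and length in hex, which is my reading of the notation rather than something stated in this report. It also assumes the automation's "Error 134" is the usual shell convention of 128 plus the signal number.

```shell
#!/usr/bin/env bash
# Values copied from the log line:
#   OP_FILE_UPDATE of 0:0x501000~100000 does not align to alloc_size 0x100000
offset=$(( 0x501000 ))
length=$(( 0x100000 ))
alloc_size=$(( 0x100000 ))

# Remainder of a value modulo the allocation unit; 0 means aligned.
misalignment() {
  echo $(( $1 % $2 ))
}

offset_rem=$(misalignment "$offset" "$alloc_size")
length_rem=$(misalignment "$length" "$alloc_size")

printf 'offset remainder: 0x%x\n' "$offset_rem"   # prints 0x1000
printf 'length remainder: 0x%x\n' "$length_rem"   # prints 0x0

# "Error 134 during cmd": an exit status above 128 conventionally means
# the process died from signal (status - 128).
exit_status=134
signal_number=$(( exit_status - 128 ))
echo "signal number: ${signal_number} ($(kill -l "$signal_number"))"
```

Under those assumptions the length is a whole allocation unit, but the offset sits 0x1000 (4096) bytes past a 1 MiB boundary, which is what _verify_alloc_granularity rejects. Signal 6 is SIGABRT, matching the "Caught signal (Aborted)" lines that follow the failed ceph_assert.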


Expected results:
  Addition of a dedicated DB device should not fail.
  From automation log (RHCS 7.1 - 18.2.1-194.el9cp):
  cephadm shell --name osd.6  --env CEPH_ARGS='--bluestore_block_db_size=1341967564' -- ceph-bluestore-tool bluefs-bdev-new-db --dev-target /dev/data_vg1/lv4 --path /var/lib/ceph/osd/ceph-6 on 10.0.195.158 timeout 300

  inferring bluefs devices from bluestore path
  DB device added /dev/dm-7

  OSD metadata for osd.6: 
 {'id': 6, 'arch': 'x86_64', 'back_addr': '[v2:10.0.195.158:6818/3726679123,v1:10.0.195.158:6819/3726679123]', 'back_iface': '', 'bluefs': '1', 'bluefs_db_access_mode': 'blk', 'bluefs_db_block_size': '4096', 'bluefs_db_dev_node': '/dev/dm-7', 'bluefs_db_devices': 'vde', 'bluefs_db_driver': 'KernelDevice', 'bluefs_db_optimal_io_size': '0', 'bluefs_db_partition_path': '/dev/dm-7', 'bluefs_db_rotational': '1', 'bluefs_db_size': '2147483648', 'bluefs_db_support_discard': '0', 'bluefs_db_type': 'hdd', 'bluefs_dedicated_db': '1', 'bluefs_dedicated_wal': '1', 'bluefs_single_shared_device': '0', 'bluefs_wal_access_mode': 'blk', 'bluefs_wal_block_size': '4096', 'bluefs_wal_dev_node': '/dev/dm-8', 'bluefs_wal_devices': 'vde', 'bluefs_wal_driver': 'KernelDevice', 'bluefs_wal_optimal_io_size': '0', 'bluefs_wal_partition_path': '/dev/dm-8', 'bluefs_wal_rotational': '1', 'bluefs_wal_size': '2147483648', 'bluefs_wal_support_discard': '0', 'bluefs_wal_type': 'hdd', 'bluestore_bdev_access_mode': 'blk', 'bluestore_bdev_block_size': '4096', 'bluestore_bdev_dev_node': '/dev/dm-2', 'bluestore_bdev_devices': 'vdd', 'bluestore_bdev_driver': 'KernelDevice', 'bluestore_bdev_optimal_io_size': '0', 'bluestore_bdev_partition_path': '/dev/dm-2', 'bluestore_bdev_rotational': '1', 'bluestore_bdev_size': '26839351296', 'bluestore_bdev_support_discard': '0', 'bluestore_bdev_type': 'hdd', 'bluestore_min_alloc_size': '4096', 'ceph_release': 'reef', 'ceph_version': 'ceph version 18.2.1-194.el9cp (04a992766839cd3207877e518a1238cdbac3787e) reef (stable)', 'ceph_version_short': '18.2.1-194.el9cp', 'ceph_version_when_created': 'ceph version 18.2.1-194.el9cp (04a992766839cd3207877e518a1238cdbac3787e) reef (stable)', 'container_hostname': 'ceph-regression-nps4dm-4eym2d-node4', 'container_image': 'registry-proxy.engineering.redhat.com/rh-osbs/rhceph@sha256:3d75ca419b9ef00cf2c944680737e84e6e1059e0f33156bc21d4dbf76a7da5b1', 'cpu': 'AMD EPYC-Rome Processor', 'created_at': '2024-06-05T20:31:15.209473Z', 
'default_device_class': 'hdd', 'device_ids': 'vdd=adfdc659-4c37-49b8-a,vde=b62b6db8-0fc8-445f-a', 'device_paths': 'vdd=/dev/disk/by-path/pci-0000:00:09.0,vde=/dev/disk/by-path/pci-0000:00:0a.0', 'devices': 'vdd,vde', 'distro': 'rhel', 'distro_description': 'Red Hat Enterprise Linux 9.4 (Plow)', 'distro_version': '9.4', 'front_addr': '[v2:10.0.195.158:6816/3726679123,v1:10.0.195.158:6817/3726679123]', 'front_iface': '', 'hb_back_addr': '[v2:10.0.195.158:6822/3726679123,v1:10.0.195.158:6823/3726679123]', 'hb_front_addr': '[v2:10.0.195.158:6820/3726679123,v1:10.0.195.158:6821/3726679123]', 'hostname': 'ceph-regression-nps4dm-4eym2d-node4', 'journal_rotational': '1', 'kernel_description': '#1 SMP PREEMPT_DYNAMIC Thu May 23 16:37:13 EDT 2024', 'kernel_version': '5.14.0-427.20.1.el9_4.x86_64', 'mem_swap_kb': '0', 'mem_total_kb': '7869560', 'network_numa_unknown_ifaces': 'back_iface,front_iface', 'objectstore_numa_unknown_devices': 'vdd,vde', 'os': 'Linux', 'osd_data': '/var/lib/ceph/osd/ceph-6', 'osd_objectstore': 'bluestore', 'osdspec_affinity': 'osd_spec_collocated', 'rotational': '1'}
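
The fields that confirm the new DB device are buried in the metadata dump above. A rough way to pull them out of the automation log text is sketched below. The `field` helper is hypothetical, it relies on the single-quoted dict format shown in this report, and the `meta` string is a shortened excerpt of the osd.6 metadata rather than the full dump.

```shell
#!/usr/bin/env bash
# Excerpt of the osd.6 metadata quoted above (Python dict repr, single quotes).
meta="{'bluefs_db_dev_node': '/dev/dm-7', 'bluefs_db_size': '2147483648', 'bluefs_dedicated_db': '1', 'bluefs_dedicated_wal': '1'}"

# field NAME TEXT: print the value stored under 'NAME' in TEXT.
# Prints nothing when the key is absent.
field() {
  printf '%s\n' "$2" | sed -n "s/.*'$1': '\([^']*\)'.*/\1/p"
}

dedicated_db=$(field bluefs_dedicated_db "$meta")
db_node=$(field bluefs_db_dev_node "$meta")
db_size=$(field bluefs_db_size "$meta")

if [ "$dedicated_db" = "1" ]; then
  echo "dedicated DB on ${db_node}, $(( db_size / 1024 / 1024 )) MiB"
else
  echo "no dedicated DB device reported"
fi
```

For the RHCS 7.1 run this reports /dev/dm-7 with 2048 MiB, the same device node named in the "DB device added" line.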

Additional info:
  Addition of a dedicated WAL device is unaffected on both releases:
  RHCS 7.1 (18.2.1-194.el9cp)
  ===============================================
  cephadm shell --name osd.6 -- ceph-bluestore-tool bluefs-bdev-new-wal --dev-target /dev/data_vg1/lv5 --path /var/lib/ceph/osd/ceph-6 on 10.0.195.158 timeout 300
  inferring bluefs devices from bluestore path
  WAL device added /dev/dm-8

  RHCS 5.3
  ================================================
  cephadm shell --name osd.3 -- ceph-bluestore-tool bluefs-bdev-new-wal --dev-target /dev/data_vg4/lv5 --path /var/lib/ceph/osd/ceph-3 on 10.0.210.200 timeout 300
  inferring bluefs devices from bluestore path
  WAL device added /dev/dm-8

Comment 13 errata-xmlrpc 2024-06-26 10:02:29 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: Red Hat Ceph Storage 5.3 security, bug fix, and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2024:4118

Comment 14 Red Hat Bugzilla 2024-10-25 04:25:21 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 120 days

