Bug 2167343 - [luminous] FAILED assert(authenticate_err == 0)
Summary: [luminous] FAILED assert(authenticate_err == 0)
Keywords:
Status: ASSIGNED
Alias: None
Product: Red Hat Ceph Storage
Classification: Red Hat Storage
Component: RADOS
Version: 3.1
Hardware: x86_64
OS: Linux
Priority: low
Severity: low
Target Milestone: ---
Target Release: 6.2
Assignee: Brad Hubbard
QA Contact: Pawan
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2023-02-06 11:00 UTC by shiqi
Modified: 2023-07-12 01:10 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2023-02-15 07:07:03 UTC
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Issue Tracker | RHCEPH-6092 | 0 | None | None | None | 2023-02-06 11:00:43 UTC

Description shiqi 2023-02-06 11:00:26 UTC
Description of problem:
The nova-compute service shut down with the following log:
/builddir/build/BUILD/ceph-12.2.5/src/mon/MonClient.cc: In function 'int MonClient::authenticate(double)' thread 7f7bc63e9740 time 2023-02-06 13:55:52.904855
/builddir/build/BUILD/ceph-12.2.5/src/mon/MonClient.cc: 479: FAILED assert(authenticate_err == 0)
 ceph version 12.2.5-59.el7cp (d4b9f17b56b3348566926849313084dd6efc2ca2) luminous (stable)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x110) [0x7f7baab12ec0]
 2: (MonClient::authenticate(double)+0xa17) [0x7f7baab5d057]
 3: (librados::RadosClient::connect()+0x10ac) [0x7f7bb35d92bc]
 4: (rados_connect()+0x20) [0x7f7bb3583ff0]
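Since the fault is reported to recur only rarely, it may help to scan client logs for this assert signature automatically. The following is a minimal sketch, assuming the line layout seen in the trace above (other Ceph versions may format it differently). The function name parse_assert_report and the regular expressions are hypothetical helpers written for illustration, not part of Ceph.

```python
import re

# Header line: "<file>: In function '<func>' thread <tid> time <timestamp>"
HEADER_RE = re.compile(
    r"^(?P<file>\S+): In function '(?P<function>.+)' "
    r"thread (?P<thread>[0-9a-f]+) time (?P<time>.+)$"
)
# Failure line: "<file>: <line>: FAILED assert(<condition>)"
ASSERT_RE = re.compile(
    r"^(?P<file>\S+): (?P<line>\d+): FAILED assert\((?P<condition>.*)\)$"
)
# Frame line: " <n>: (<symbol>+0x<offset>) [0x<address>]"
FRAME_RE = re.compile(
    r"^\s*(?P<index>\d+): \((?P<symbol>.+)\+0x[0-9a-f]+\) \[0x[0-9a-f]+\]$"
)


def parse_assert_report(text):
    """Extract assert location, condition, and frame symbols from a log excerpt."""
    report = {
        "file": None, "line": None, "condition": None,
        "function": None, "thread": None, "time": None,
        "frames": [],
    }
    for raw in text.splitlines():
        line = raw.rstrip()
        header = HEADER_RE.match(line)
        if header:
            report["function"] = header.group("function")
            report["thread"] = header.group("thread")
            report["time"] = header.group("time")
            continue
        failed = ASSERT_RE.match(line)
        if failed:
            report["file"] = failed.group("file")
            report["line"] = int(failed.group("line"))
            report["condition"] = failed.group("condition")
            continue
        frame = FRAME_RE.match(line)
        if frame:
            # Frames are kept in the order they appear in the log.
            report["frames"].append(frame.group("symbol"))
    return report


# Sample copied from the trace in this report.
SAMPLE = """\
/builddir/build/BUILD/ceph-12.2.5/src/mon/MonClient.cc: In function 'int MonClient::authenticate(double)' thread 7f7bc63e9740 time 2023-02-06 13:55:52.904855
/builddir/build/BUILD/ceph-12.2.5/src/mon/MonClient.cc: 479: FAILED assert(authenticate_err == 0)
 ceph version 12.2.5-59.el7cp (d4b9f17b56b3348566926849313084dd6efc2ca2) luminous (stable)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x110) [0x7f7baab12ec0]
 2: (MonClient::authenticate(double)+0xa17) [0x7f7baab5d057]
 3: (librados::RadosClient::connect()+0x10ac) [0x7f7bb35d92bc]
 4: (rados_connect()+0x20) [0x7f7bb3583ff0]
"""

if __name__ == "__main__":
    report = parse_assert_report(SAMPLE)
    print(report["condition"], "at line", report["line"])
    for symbol in report["frames"]:
        print("  ", symbol)
```

Lines that match none of the three patterns, such as the version banner, are ignored, so the same function can be applied to a larger log excerpt.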

Version-Release number of selected component (if applicable):
 ceph version 12.2.5-59.el7cp (d4b9f17b56b3348566926849313084dd6efc2ca2) luminous (stable)

How reproducible:


Steps to Reproduce:
1.
2.
3.

Actual results:


Expected results:


Additional info:

Comment 17 shiqi 2023-02-17 08:42:17 UTC
/builddir/build/BUILD/ceph-12.2.5/src/mon/MonClient.cc: In function 'int MonClient::authenticate(double)' thread 7f7bc63e9740 time 2023-02-06 13:55:52.904855
/builddir/build/BUILD/ceph-12.2.5/src/mon/MonClient.cc: 479: FAILED assert(authenticate_err == 0)
 ceph version 12.2.5-59.el7cp (d4b9f17b56b3348566926849313084dd6efc2ca2) luminous (stable)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x110) [0x7f7baab12ec0]
 2: (MonClient::authenticate(double)+0xa17) [0x7f7baab5d057]
 3: (librados::RadosClient::connect()+0x10ac) [0x7f7bb35d92bc]
 4: (rados_connect()+0x20) [0x7f7bb3583ff0]

1. See the call stack above. I think this is the coredump info.
2. As far as I understand, __ceph_assert_fail would not trigger coredump generation.
3. Setting debug_auth=20, debug_monc=20, debug_ms=1 will produce a large volume of logs, which will fill the disks and reduce performance. According to the customer's description, the fault occurs about once every two months. Can the log level be lowered, for example to debug_auth=5, debug_monc=5, debug_ms=1? Do you think this log level would affect the production environment?
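One possible middle ground for the log levels discussed above, sketched under assumptions: Ceph debug options accept a log-level/memory-level pair, so the level written to disk can stay low while more detailed entries are kept only in memory and dumped when a fatal signal or failed assert occurs. The [client] section placement and the specific values below are illustrative assumptions, not settings recommended in this bug. Whether the nova-compute librados client reads this section, and whether the in-memory dump is emitted for a client on 12.2.5, would need to be confirmed.

```ini
[client]
# Assumed values for illustration only.
# First number: level written to the log file.
# Second number: level retained in memory and dumped on a crash or failed assert.
debug auth = 5/20
debug monc = 5/20
debug ms = 1/5
```

With a split like this, the steady-state disk usage should follow the first number, while the more verbose entries only reach the log if the assert fires again.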

Comment 25 shiqi 2023-04-07 03:41:40 UTC
@Brad Hubbard
Hello Brad. All the logs and the coredump are attached to the related case (03429811), including the coredump (abrt.tar(1).gz), the ceph client log (ceph.client.log.tar(1).gz), and the ceph call stack in the docker log (W-PC-SRH310-369--docker-log-all.txt). Please take a look at these attachments. Can you indicate which bug this is? Is it fixed, and is it intended to be fixed in this release?

