Bug 2311294 - Community reported crash with Prometheus metrics enabled
Summary: Community reported crash with Prometheus metrics enabled
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Ceph Storage
Classification: Red Hat Storage
Component: NFS-Ganesha
Version: 8.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: urgent
Target Milestone: ---
Target Release: 8.0
Assignee: Frank Filz
QA Contact: Manisha Saini
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2024-09-10 19:22 UTC by Frank Filz
Modified: 2024-11-25 09:09 UTC (History)

Fixed In Version: nfs-ganesha-6.0-4.el9cp; rhceph-container-8-73
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2024-11-25 09:09:38 UTC
Embargoed:


Links
System ID Private Priority Status Summary Last Updated
Red Hat Issue Tracker RHCEPH-9761 0 None None None 2024-09-10 19:22:32 UTC
Red Hat Product Errata RHBA-2024:10216 0 None None None 2024-11-25 09:09:41 UTC

Description Frank Filz 2024-09-10 19:22:10 UTC
Description of problem:

With basic configuration:

```
NFS_CORE_PARAM { Protocols = NFSv4; Enable_RQUOTA = false; Mount_Path_Pseudo = true; Enable_FULLV4_Stats = true; }
EXPORT_DEFAULTS { Access_Type = RW; Squash = root_squash; Attr_Expiration_Time = 60; CLIENT { Clients = *; } }
EXPORT { Export_id = 1; Path = "/storage/1"; Pseudo = "/1"; FSAL { Name = VFS; } }
```

Ganesha crashes on the first mount:

```
 0  0x0000fffff7ad4220 in __aarch64_ldadd8_acq_rel () from /lib64/libgmonitoring.so
 1  0x0000fffff7abfe20 in std::__atomic_base<long>::operator+= (this=0x10, __i=1) at /usr/include/c++/11.2.0/bits/atomic_base.h:394
 2  0x0000fffff7abc4b0 in prometheus::Gauge<long>::Increment (this=0x0, val=@0xfffff1e0d800: 1) at /nfs-ganesha/src/monitoring/prometheus-cpp-lite/core/include/prometheus/gauge.h:45
 3  0x0000fffff7ab5ae8 in ganesha_monitoring::monitoring__gauge_inc (handle=..., value=1) at /nfs-ganesha/src/monitoring/monitoring.cc:408
 4  0x0000fffff7ca97d0 in connection_manager_metrics__client_state_inc (state=CONNECTION_MANAGER__CLIENT_STATE__DRAINED) at /nfs-ganesha/src/RPCAL/connection_manager_metrics.c:158
 5  0x0000fffff7ca6bcc in connection_manager__client_init (client=0xffffc40097d8) at /nfs-ganesha/src/RPCAL/connection_manager.c:200
 6  0x0000fffff7c71578 in get_gsh_client (client_ipaddr=0xffffc4000c08, lookup_only=false) at /nfs-ganesha/src/support/client_mgr.c:198
 7  0x0000fffff7ca8560 in connection_manager__connection_started (xprt=0xffffc4000b10) at /nfs-ganesha/src/RPCAL/connection_manager.c:514
 8  0x0000fffff7c64db4 in nfs_rpc_dispatch_remote_addr_set_tcp (xprt=0xffffc4000b10) at /nfs-ganesha/src/MainNFSD/nfs_rpc_dispatcher_thread.c:426
 9  0x0000fffff7f90f28 in update_and_notify_remote_address_set (xprt=0xffffc4000b10) at /nfs-ganesha/src/libntirpc/src/svc_vc.c:752
 10 0x0000fffff7f92510 in svc_vc_recv (xprt=0xffffc4000b10) at /nfs-ganesha/src/libntirpc/src/svc_vc.c:1154
 11 0x0000fffff7f8cc00 in svc_rqst_xprt_task_recv (wpe=0xffffc4000de8) at /nfs-ganesha/src/libntirpc/src/svc_rqst.c:1210
 12 0x0000fffff7f8d914 in svc_rqst_epoll_loop (wpe=0xaaaaaea1d3d0) at /nfs-ganesha/src/libntirpc/src/svc_rqst.c:1585
 13 0x0000fffff7f9b10c in work_pool_thread (arg=0xffffcc0008e0) at /nfs-ganesha/src/libntirpc/src/work_pool.c:187
 14 0x0000fffff78de904 in start_thread (arg=0xfffff280dd97) at pthread_create.c:442
 15 0x0000fffff794329c in thread_start () at ../sysdeps/unix/sysv/linux/aarch64/clone.S:79
(gdb) frame 3
408       convert_from_handle<GaugeInt>(handle)->Increment(value);
(gdb) p metrics
$1 = {clients = {{metric = 0x0}, {metric = 0x0}, {metric = 0x0}, {metric = 0x0}}, connection_started_latencies = {{metric = 0x0}, {metric = 0x0}}, drain_local_client_latencies = {{metric = 0x0}, {metric = 0x0}, {metric = 0x0}, {metric = 0x0}}}
```

Version-Release number of selected component (if applicable):

Ganesha V6.0


How reproducible:

On any start up


Steps to Reproduce:
1. Start Ganesha with Prometheus metrics enabled, using the configuration above
2. Mount the export from an NFS client

Actual results:

Crash


Expected results:

No crash


Additional info:
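
In the backtrace, frame 2 shows `Gauge<long>::Increment` called with `this=0x0`, and `p metrics` shows every handle as `metric = 0x0`. That suggests the connection manager metrics had not been registered when the first client was initialized. The sketch below only illustrates that failure mode and one possible guard. `GaugeInt`, `gauge_handle_t` and `gauge_inc_checked` are hypothetical stand-ins, not the real nfs-ganesha or prometheus-cpp-lite definitions, and this is not necessarily how the shipped fix works.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the gauge type seen in frame 2.
struct GaugeInt {
    std::atomic<int64_t> value{0};
    void Increment(int64_t v) { value += v; }
};

// Hypothetical opaque handle; a null pointer models "metric = 0x0",
// i.e. a metric that was never registered.
struct gauge_handle_t {
    void *metric;
};

// Guarded increment: returns false instead of dereferencing a null handle.
// An unguarded version would call Increment() on a null pointer, which is
// the kind of access that faults in frames 0-2 of the backtrace.
bool gauge_inc_checked(gauge_handle_t handle, int64_t v) {
    if (handle.metric == nullptr)
        return false;  // metric not initialized; skip the update
    static_cast<GaugeInt *>(handle.metric)->Increment(v);
    return true;
}
```

With this guard, an increment on an unregistered handle is skipped rather than crashing. An alternative under the same assumptions would be to ensure metric registration runs before any client is initialized.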

Comment 7 errata-xmlrpc 2024-11-25 09:09:38 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Red Hat Ceph Storage 8.0 security, bug fix, and enhancement updates), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2024:10216

