Bug 2213196 - The 'foreman-proxy' service on Satellite fails with error "malloc_consolidate(): unaligned fastbin chunk detected" while provisioning a host
Summary: The 'foreman-proxy' service on Satellite fails with error "malloc_consolidate...
Keywords:
Status: ON_QA
Alias: None
Product: Red Hat Satellite
Classification: Red Hat
Component: Capsule
Version: 6.12.4
Hardware: Unspecified
OS: Unspecified
urgent
urgent with 1 vote
Target Milestone: 6.12.5
Assignee: satellite6-bugs
QA Contact: Lukas Pramuk
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2023-06-07 12:11 UTC by Jayant Bhatia
Modified: 2023-08-15 00:21 UTC (History)
11 users (show)

Fixed In Version: rubygem-xmlrpc-0.3.2-1
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)


Links
System ID Private Priority Status Summary Last Updated
Red Hat Issue Tracker SAT-18238 0 None None None 2023-06-08 14:21:22 UTC
Red Hat Knowledge Base (Solution) 7018038 0 None None None 2023-06-09 09:49:03 UTC

Description Jayant Bhatia 2023-06-07 12:11:28 UTC
Description of problem:

While provisioning or creating a new host on Satellite, the foreman-proxy service fails with below error.

~~~
May 24 11:01:10 satellite.example.com smart-proxy[1892]: 10.1.98.100 - - [24/May/2023:11:01:09 CEST] "POST /realm/satellite.example.com/ HTTP/1.1" 200 632
May 24 11:01:10 satellite.example.com smart-proxy[1892]: - -> /realm/satellite.example.com/
May 24 11:01:10 satellite.example.com smart-proxy[1892]: malloc_consolidate(): unaligned fastbin chunk detected
May 24 11:01:10 satellite.example.com systemd[1]: foreman-proxy.service: Main process exited, code=killed, status=6/ABRT
May 24 11:01:10 satellite.example.com systemd[1]: foreman-proxy.service: Failed with result 'signal'.
/ All services displayed                                              [FAIL]    
Some services are not running (foreman-proxy)
~~~~


Version-Release number of selected component (if applicable): Satellite-6.11 and Satellite-6.12 deployed on RHEL-8


Steps to Reproduce: Creating a new host on Satellite/Capsule-6.11 or 6.12 deployed on RHEL-8 leads to the failure of foreman-proxy service on Satellite/Capsule.


Additional info: The issue is observed on Satellite/Capsule-6.11 and 6.12 deployed on RHEL-8 and gets addressed if Satellite is upgraded to version 6.13.

Comment 7 Sayan Das 2023-06-09 09:00:59 UTC
Ok, For all the support cases, the realm usage in system deployment seems to be a key point.

Even the end-user who confirmed that upgrading to 6.13 no longer triggers the issue had realm enabled and used during deployment.

Comment 8 Sayan Das 2023-06-09 09:22:49 UTC
100% Reproducer steps:

* Install Satellite 6.12 on RHEL 8.8
* Set up it with some kickstart content and provisioning features.
* Enable Realm feature and select Realm Capsule in a host group ( https://access.redhat.com/documentation/en-us/red_hat_satellite/6.12/html/installing_satellite_server_in_a_connected_network_environment/configuring_external_authentication_satellite#External_Authentication_for_Provisioned_Hosts_satellite )
* Use that Hostgroup to deploy a system


Actual Results:

2023-06-09T14:48:12 [I|app|8cec5277] Started POST "/hosts" for 10.74.24.4 at 2023-06-09 14:48:12 +0530
2023-06-09T14:48:12 [I|app|8cec5277] Processing by HostsController#create as */*
2023-06-09T14:48:12 [I|app|8cec5277]   Parameters: {"utf8"=>"✓", "authenticity_token"=>"QjOBBGNhrwvSjm2UOA1wz4tmRbzvsnXgcRm0gyIyXD9QnwXsGCjxbPBrHCGFx75OClCjn5vOi7vo8HWdvoASbg==", "host"=>{"name"=>"celia-helgert", "organization_id"=>"1", "location_id"=>"2", "hostgroup_id"=>"1", "content_facet_attributes"=>{"lifecycle_environment_id"=>"1", "content_view_id"=>"6", "content_source_id"=>"1", "kickstart_repository_id"=>"70"}, "managed"=>"true", "progress_report_id"=>"[FILTERED]", "type"=>"Host::Managed", "interfaces_attributes"=>{"0"=>{"_destroy"=>"0", "type"=>"Nic::Managed", "identifier"=>"", "name"=>"celia-helgert", "domain_id"=>"2", "subnet_id"=>"1", "ip"=>"192.168.233.156", "ip6"=>"", "managed"=>"1", "primary"=>"1", "provision"=>"1", "execution"=>"1", "virtual"=>"0", "tag"=>"", "attached_to"=>"", "compute_attributes"=>{"type"=>"VirtualVmxnet3", "network"=>"dvportgroup-847"}}}, "compute_attributes"=>{"cpus"=>"1", "corespersocket"=>"1", "memory_mb"=>"4096", "firmware"=>"bios", "cluster"=>"SYSLAB", "resource_pool"=>"Resources", "path"=>"/Datacenters/PRIME/vm/SYSMGMT/saydas", "guest_id"=>"rhel8_64Guest", "hardware_version"=>"Default", "memoryHotAddEnabled"=>"1", "cpuHotAddEnabled"=>"1", "add_cdrom"=>"0", "boot_order"=>["network", "disk"], "start"=>"1", "annotation"=>"", "scsi_controllers"=>"{\"scsiControllers\":[{\"type\":\"VirtualLsiLogicController\",\"key\":1000}],\"volumes\":[{\"thin\":true,\"name\":\"Hard disk\",\"mode\":\"persistent\",\"controllerKey\":1000,\"sizeGb\":10,\"datastore\":\"Arcade\",\"storagePod\":null}]}"}, "architecture_id"=>"1", "operatingsystem_id"=>"2", "provision_method"=>"build", "build"=>"1", "medium_id"=>"", "ptable_id"=>"136", "pxe_loader"=>"PXELinux BIOS", "disk"=>"", "root_pass"=>"[FILTERED]", "is_owned_by"=>"4-Users", "enabled"=>"1", "comment"=>"", "overwrite"=>"false"}, "media_selector"=>"synced_content"}
2023-06-09T14:48:12 [I|app|8cec5277] Add realm entry for new host host.example.com

==> /var/log/foreman-proxy/proxy.log <==
2023-06-09T14:48:12 8cec5277 [I] Started POST /realm/EXAMPLE.COM/ 

==> /var/log/messages <==
Jun  9 14:48:12 satellite612 smart-proxy[24297]: /usr/share/foreman-proxy/modules/realm_freeipa/provider.rb:46: warning: already initialized constant XMLRPC::Config::ENABLE_NIL_PARSER
Jun  9 14:48:12 satellite612 smart-proxy[24297]: /usr/share/gems/gems/xmlrpc-0.3.0/lib/xmlrpc/config.rb:22: warning: previous definition of ENABLE_NIL_PARSER was here

==> /var/log/foreman-proxy/proxy.log <==
2023-06-09T14:48:12 8cec5277 [I] Finished POST /realm/EXAMPLE.COM/ with 200 (468.61 ms)

==> /var/log/foreman/production.log <==
2023-06-09T14:48:12 [I|app|8cec5277] Adding Compute instance for host.example.com

==> /var/log/messages <==
Jun  9 14:48:12 satellite612 smart-proxy[24297]: X.X.X.X - - [09/Jun/2023:14:48:12 IST] "POST /realm/EXAMPLE.COM/ HTTP/1.1" 200 633
Jun  9 14:48:12 satellite612 smart-proxy[24297]: - -> /realm/EXAMPLE.COM/

==> /var/log/foreman-proxy/proxy.log <==
2023-06-09T14:48:14 8cec5277 [I] Started GET /tftp/serverName 
2023-06-09T14:48:14 8cec5277 [I] Finished GET /tftp/serverName with 200 (0.93 ms)
2023-06-09T14:48:14 8cec5277 [I] Started GET /tftp/serverName 
2023-06-09T14:48:14 8cec5277 [I] Finished GET /tftp/serverName with 200 (0.81 ms)

==> /var/log/foreman/production.log <==
2023-06-09T14:48:14 [I|app|8cec5277] Create DHCP reservation host.example.com for host.example.com-00:50:56:b4:11:5d/192.168.233.156

==> /var/log/foreman-proxy/proxy.log <==
2023-06-09T14:48:14 8cec5277 [I] Started POST /dhcp/192.168.233.0 
2023-06-09T14:48:14 8cec5277 [I] Next-server option not IPv4, trying to resolve 'satellite612.example.com'
2023-06-09T14:48:14 8cec5277 [I] Finished POST /dhcp/192.168.233.0 with 200 (24.34 ms)

==> /var/log/foreman/production.log <==
2023-06-09T14:48:14 [I|app|8cec5277] Add DNS A record for host.example.com/192.168.233.156

==> /var/log/foreman-proxy/proxy.log <==
2023-06-09T14:48:14 8cec5277 [I] Started POST /dns/ 
2023-06-09T14:48:14 8cec5277 [I] Finished POST /dns/ with 200 (14.28 ms)

==> /var/log/foreman/production.log <==
2023-06-09T14:48:14 [I|app|8cec5277] Add DNS PTR record for 192.168.233.156/host.example.com

==> /var/log/foreman-proxy/proxy.log <==
2023-06-09T14:48:14 8cec5277 [I] Started POST /dns/ 

==> /var/log/messages <==
Jun  9 14:48:14 satellite612 smart-proxy[24297]: X.X.X.X - - [09/Jun/2023:14:48:14 IST] "GET /tftp/serverName HTTP/1.1" 200 53
Jun  9 14:48:14 satellite612 smart-proxy[24297]: - -> /tftp/serverName
Jun  9 14:48:14 satellite612 smart-proxy[24297]: X.X.X.X - - [09/Jun/2023:14:48:14 IST] "GET /tftp/serverName HTTP/1.1" 200 53
Jun  9 14:48:14 satellite612 smart-proxy[24297]: - -> /tftp/serverName
Jun  9 14:48:14 satellite612 smart-proxy[24297]: X.X.X.X - - [09/Jun/2023:14:48:14 IST] "POST /dhcp/192.168.233.0 HTTP/1.1" 200 0
Jun  9 14:48:14 satellite612 smart-proxy[24297]: - -> /dhcp/192.168.233.0
Jun  9 14:48:14 satellite612 named[1067]: client @0x7fce3c0a25b0 127.0.0.1#48349/key rndc-key: updating zone 'sat.EXAMPLE.COM/IN': adding an RR at 'host.example.com' A 192.168.233.156
Jun  9 14:48:14 satellite612 smart-proxy[24297]: X.X.X.X - - [09/Jun/2023:14:48:14 IST] "POST /dns/ HTTP/1.1" 200 0
Jun  9 14:48:14 satellite612 smart-proxy[24297]: - -> /dns/
Jun  9 14:48:14 satellite612 named[1067]: client @0x7fce3c0a25b0 127.0.0.1#55857/key rndc-key: updating zone '233.168.192.in-addr.arpa/IN': adding an RR at '156.233.168.192.in-addr.arpa' PTR host.example.com.

==> /var/log/foreman-proxy/proxy.log <==
2023-06-09T14:48:14 8cec5277 [I] Finished POST /dns/ with 200 (15.93 ms)

==> /var/log/foreman/production.log <==
2023-06-09T14:48:14 [I|app|3e5ca2c8] Started GET "/tasks/d111af69-aa42-46ea-8417-7a18c0293d46" for 10.74.24.4 at 2023-06-09 14:48:14 +0530
2023-06-09T14:48:14 [I|app|8cec5277] Deploying TFTP PXELinux configuration for host.example.com
2023-06-09T14:48:14 [I|app|3e5ca2c8] Processing by TasksController#show as */*
2023-06-09T14:48:14 [I|app|3e5ca2c8]   Parameters: {"id"=>"d111af69-aa42-46ea-8417-7a18c0293d46"}

==> /var/log/foreman-proxy/proxy.log <==
2023-06-09T14:48:14 8cec5277 [I] Started POST /tftp/PXELinux/00:50:56:b4:11:5d 
2023-06-09T14:48:14 8cec5277 [I] Finished POST /tftp/PXELinux/00:50:56:b4:11:5d with 200 (2.08 ms)

==> /var/log/foreman/production.log <==
2023-06-09T14:48:14 [I|app|3e5ca2c8]   Rendered tasks/_list.html.erb (Duration: 5.0ms | Allocations: 6717)
2023-06-09T14:48:14 [I|app|3e5ca2c8] Completed 200 OK in 13ms (Views: 5.4ms | ActiveRecord: 4.1ms | Allocations: 8504)
2023-06-09T14:48:14 [I|app|8cec5277] Deploying TFTP PXEGrub2 configuration for host.example.com

==> /var/log/foreman-proxy/proxy.log <==
2023-06-09T14:48:14 8cec5277 [I] Started POST /tftp/PXEGrub2/00:50:56:b4:11:5d 

==> /var/log/messages <==
Jun  9 14:48:14 satellite612 smart-proxy[24297]: X.X.X.X - - [09/Jun/2023:14:48:14 IST] "POST /dns/ HTTP/1.1" 200 0
Jun  9 14:48:14 satellite612 smart-proxy[24297]: - -> /dns/
Jun  9 14:48:14 satellite612 smart-proxy[24297]: X.X.X.X - - [09/Jun/2023:14:48:14 IST] "POST /tftp/PXELinux/00:50:56:b4:11:5d HTTP/1.1" 200 0
Jun  9 14:48:14 satellite612 smart-proxy[24297]: - -> /tftp/PXELinux/00:50:56:b4:11:5d
Jun  9 14:48:14 satellite612 smart-proxy[24297]: malloc_consolidate(): unaligned fastbin chunk detected
Jun  9 14:48:14 satellite612 systemd[1]: Created slice system-systemd\x2dcoredump.slice.
Jun  9 14:48:14 satellite612 systemd[1]: Started Process Core Dump (PID 24415/UID 0).


And then the host submission fails as foreman-proxy is killed.

Comment 11 Bart De Meester 2023-06-09 11:41:03 UTC
We got into this issue after an upgrade to Satellite 6.12 (See RH case 03533178) and RHEL8.8
Working situation (pre-upgrade) :
  - RHEL8.6
  - Satellite 6.11
Provisioning at that moment worked without issues using a REALM.

After upgrade we have following:
  - RHEL8.8
  - Satellite 6.12

The same provision now fails with the symptoms described in this Bugzilla

Please treat this with the highest priority as per company policy we prefer to stay one version behind the latest release for stability reasons.

Comment 17 Sayan Das 2023-06-15 09:20:36 UTC
With absolute certainty, I confirm that this fixes the issue 

# pwd
/usr/share/gems/gems/gssapi-1.2.0


# git diff
diff --git a/lib/gssapi/lib_gssapi.rb b/lib/gssapi/lib_gssapi.rb
index 0e61dad..e877801 100644
--- a/lib/gssapi/lib_gssapi.rb
+++ b/lib/gssapi/lib_gssapi.rb
@@ -179,11 +179,13 @@ module GSSAPI
     class GssCtxIdT < GssPointer
       def self.release_ptr(context_ptr)
         min_stat = FFI::MemoryPointer.new :OM_uint32
-        maj_stat = LibGSSAPI.gss_delete_sec_context(min_stat, context_ptr, LibGSSAPI::GSS_C_NO_BUFFER)
+        ptr_p = FFI::MemoryPointer.new :pointer
+        ctx_ptr = ptr_p.write_pointer(context_ptr)
+        maj_stat = LibGSSAPI.gss_delete_sec_context(min_stat, ctx_ptr, LibGSSAPI::GSS_C_NO_BUFFER)
       end
 
       def self.gss_c_no_context
-        self.new(GSSAPI::LibGSSAPI::GSS_C_NO_CONTEXT)
+        GssPointer.new(GSSAPI::LibGSSAPI::GSS_C_NO_CONTEXT)
       end
     end
 


i.e. the patch https://github.com/zenchild/gssapi/commit/6514d2474be31e1b97fe02f0a2c302463c8a0c88 that resides in rubygem-gssapi 1.3.1  shipped with Satellite 6.13

I applied the same patch on my 6.12 reproducer and tested three concurrent builds. So far , no error or no crash of services whatsoever.


Note You need to log in before you can comment on or make changes to this bug.