Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.
Red Hat Satellite engineering is moving the tracking of its product development work on Satellite to Red Hat Jira (issues.redhat.com). If you're a Red Hat customer, please continue to file support cases via the Red Hat customer portal. If you're not, please head to the "Satellite project" in Red Hat Jira and file new tickets here. Individual Bugzilla bugs will be migrated starting at the end of May. If you cannot log in to RH Jira, please consult article #7032570. That failing, please send an e-mail to the RH Jira admins at rh-issues@redhat.com to troubleshoot your issue as a user management inquiry. The email creates a ServiceNow ticket with Red Hat. Individual Bugzilla bugs that are migrated will be moved to status "CLOSED", resolution "MIGRATED", and set with "MigratedToJIRA" in "Keywords". The link to the successor Jira issue will be found under "Links", have a little "two-footprint" icon next to it, and direct you to the "Satellite project" in Red Hat Jira (issue links are of type "https://issues.redhat.com/browse/SAT-XXXX", where "X" is a digit). This same link will be available in a blue banner at the top of the page informing you that that bug has been migrated.

Bug 2213196

Summary: The 'foreman-proxy' service on Satellite fails with error "malloc_consolidate(): unaligned fastbin chunk detected" while provisioning a host
Product: Red Hat Satellite Reporter: Jayant Bhatia <jbhatia>
Component: Foreman ProxyAssignee: satellite6-bugs <satellite6-bugs>
Status: CLOSED ERRATA QA Contact: Lukas Pramuk <lpramuk>
Severity: urgent Docs Contact:
Priority: urgent    
Version: 6.12.4CC: ahumbe, alsouza, arahaman, aruzicka, bart.demeester, hyu, lpramuk, osousa, pmoravec, rlavi, saydas, sshtein
Target Milestone: 6.12.5Keywords: PrioBumpGSS, Triaged
Target Release: Unused   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: rubygem-gssapi-1.3.1-1 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2023-08-24 18:24:35 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Jayant Bhatia 2023-06-07 12:11:28 UTC
Description of problem:

While provisioning or creating a new host on Satellite, the foreman-proxy service fails with below error.

~~~
May 24 11:01:10 satellite.example.com smart-proxy[1892]: 10.1.98.100 - - [24/May/2023:11:01:09 CEST] "POST /realm/satellite.example.com/ HTTP/1.1" 200 632
May 24 11:01:10 satellite.example.com smart-proxy[1892]: - -> /realm/satellite.example.com/
May 24 11:01:10 satellite.example.com smart-proxy[1892]: malloc_consolidate(): unaligned fastbin chunk detected
May 24 11:01:10 satellite.example.com systemd[1]: foreman-proxy.service: Main process exited, code=killed, status=6/ABRT
May 24 11:01:10 satellite.example.com systemd[1]: foreman-proxy.service: Failed with result 'signal'.
/ All services displayed                                              [FAIL]    
Some services are not running (foreman-proxy)
~~~~


Version-Release number of selected component (if applicable): Satellite-6.11 and Satellite-6.12 deployed on RHEL-8


Steps to Reproduce: Creating a new host on Satellite/Capsule-6.11 or 6.12 deployed on RHEL-8 leads to the failure of foreman-proxy service on Satellite/Capsule.


Additional info: The issue is observed on Satellite/Capsule-6.11 and 6.12 deployed on RHEL-8 and gets addressed if Satellite is upgraded to version 6.13.

Comment 7 Sayan Das 2023-06-09 09:00:59 UTC
Ok, For all the support cases, the realm usage in system deployment seems to be a key point.

Even the end-user who confirmed that upgrading to 6.13 no longer triggers the issue had realm enabled and used during deployment.

Comment 8 Sayan Das 2023-06-09 09:22:49 UTC
100% Reproducer steps:

* Install Satellite 6.12 on RHEL 8.8
* Set up it with some kickstart content and provisioning features.
* Enable Realm feature and select Realm Capsule in a host group ( https://access.redhat.com/documentation/en-us/red_hat_satellite/6.12/html/installing_satellite_server_in_a_connected_network_environment/configuring_external_authentication_satellite#External_Authentication_for_Provisioned_Hosts_satellite )
* Use that Hostgroup to deploy a system


Actual Results:

2023-06-09T14:48:12 [I|app|8cec5277] Started POST "/hosts" for 10.74.24.4 at 2023-06-09 14:48:12 +0530
2023-06-09T14:48:12 [I|app|8cec5277] Processing by HostsController#create as */*
2023-06-09T14:48:12 [I|app|8cec5277]   Parameters: {"utf8"=>"✓", "authenticity_token"=>"QjOBBGNhrwvSjm2UOA1wz4tmRbzvsnXgcRm0gyIyXD9QnwXsGCjxbPBrHCGFx75OClCjn5vOi7vo8HWdvoASbg==", "host"=>{"name"=>"celia-helgert", "organization_id"=>"1", "location_id"=>"2", "hostgroup_id"=>"1", "content_facet_attributes"=>{"lifecycle_environment_id"=>"1", "content_view_id"=>"6", "content_source_id"=>"1", "kickstart_repository_id"=>"70"}, "managed"=>"true", "progress_report_id"=>"[FILTERED]", "type"=>"Host::Managed", "interfaces_attributes"=>{"0"=>{"_destroy"=>"0", "type"=>"Nic::Managed", "identifier"=>"", "name"=>"celia-helgert", "domain_id"=>"2", "subnet_id"=>"1", "ip"=>"192.168.233.156", "ip6"=>"", "managed"=>"1", "primary"=>"1", "provision"=>"1", "execution"=>"1", "virtual"=>"0", "tag"=>"", "attached_to"=>"", "compute_attributes"=>{"type"=>"VirtualVmxnet3", "network"=>"dvportgroup-847"}}}, "compute_attributes"=>{"cpus"=>"1", "corespersocket"=>"1", "memory_mb"=>"4096", "firmware"=>"bios", "cluster"=>"SYSLAB", "resource_pool"=>"Resources", "path"=>"/Datacenters/PRIME/vm/SYSMGMT/saydas", "guest_id"=>"rhel8_64Guest", "hardware_version"=>"Default", "memoryHotAddEnabled"=>"1", "cpuHotAddEnabled"=>"1", "add_cdrom"=>"0", "boot_order"=>["network", "disk"], "start"=>"1", "annotation"=>"", "scsi_controllers"=>"{\"scsiControllers\":[{\"type\":\"VirtualLsiLogicController\",\"key\":1000}],\"volumes\":[{\"thin\":true,\"name\":\"Hard disk\",\"mode\":\"persistent\",\"controllerKey\":1000,\"sizeGb\":10,\"datastore\":\"Arcade\",\"storagePod\":null}]}"}, "architecture_id"=>"1", "operatingsystem_id"=>"2", "provision_method"=>"build", "build"=>"1", "medium_id"=>"", "ptable_id"=>"136", "pxe_loader"=>"PXELinux BIOS", "disk"=>"", "root_pass"=>"[FILTERED]", "is_owned_by"=>"4-Users", "enabled"=>"1", "comment"=>"", "overwrite"=>"false"}, "media_selector"=>"synced_content"}
2023-06-09T14:48:12 [I|app|8cec5277] Add realm entry for new host host.example.com

==> /var/log/foreman-proxy/proxy.log <==
2023-06-09T14:48:12 8cec5277 [I] Started POST /realm/EXAMPLE.COM/ 

==> /var/log/messages <==
Jun  9 14:48:12 satellite612 smart-proxy[24297]: /usr/share/foreman-proxy/modules/realm_freeipa/provider.rb:46: warning: already initialized constant XMLRPC::Config::ENABLE_NIL_PARSER
Jun  9 14:48:12 satellite612 smart-proxy[24297]: /usr/share/gems/gems/xmlrpc-0.3.0/lib/xmlrpc/config.rb:22: warning: previous definition of ENABLE_NIL_PARSER was here

==> /var/log/foreman-proxy/proxy.log <==
2023-06-09T14:48:12 8cec5277 [I] Finished POST /realm/EXAMPLE.COM/ with 200 (468.61 ms)

==> /var/log/foreman/production.log <==
2023-06-09T14:48:12 [I|app|8cec5277] Adding Compute instance for host.example.com

==> /var/log/messages <==
Jun  9 14:48:12 satellite612 smart-proxy[24297]: X.X.X.X - - [09/Jun/2023:14:48:12 IST] "POST /realm/EXAMPLE.COM/ HTTP/1.1" 200 633
Jun  9 14:48:12 satellite612 smart-proxy[24297]: - -> /realm/EXAMPLE.COM/

==> /var/log/foreman-proxy/proxy.log <==
2023-06-09T14:48:14 8cec5277 [I] Started GET /tftp/serverName 
2023-06-09T14:48:14 8cec5277 [I] Finished GET /tftp/serverName with 200 (0.93 ms)
2023-06-09T14:48:14 8cec5277 [I] Started GET /tftp/serverName 
2023-06-09T14:48:14 8cec5277 [I] Finished GET /tftp/serverName with 200 (0.81 ms)

==> /var/log/foreman/production.log <==
2023-06-09T14:48:14 [I|app|8cec5277] Create DHCP reservation host.example.com for host.example.com-00:50:56:b4:11:5d/192.168.233.156

==> /var/log/foreman-proxy/proxy.log <==
2023-06-09T14:48:14 8cec5277 [I] Started POST /dhcp/192.168.233.0 
2023-06-09T14:48:14 8cec5277 [I] Next-server option not IPv4, trying to resolve 'satellite612.example.com'
2023-06-09T14:48:14 8cec5277 [I] Finished POST /dhcp/192.168.233.0 with 200 (24.34 ms)

==> /var/log/foreman/production.log <==
2023-06-09T14:48:14 [I|app|8cec5277] Add DNS A record for host.example.com/192.168.233.156

==> /var/log/foreman-proxy/proxy.log <==
2023-06-09T14:48:14 8cec5277 [I] Started POST /dns/ 
2023-06-09T14:48:14 8cec5277 [I] Finished POST /dns/ with 200 (14.28 ms)

==> /var/log/foreman/production.log <==
2023-06-09T14:48:14 [I|app|8cec5277] Add DNS PTR record for 192.168.233.156/host.example.com

==> /var/log/foreman-proxy/proxy.log <==
2023-06-09T14:48:14 8cec5277 [I] Started POST /dns/ 

==> /var/log/messages <==
Jun  9 14:48:14 satellite612 smart-proxy[24297]: X.X.X.X - - [09/Jun/2023:14:48:14 IST] "GET /tftp/serverName HTTP/1.1" 200 53
Jun  9 14:48:14 satellite612 smart-proxy[24297]: - -> /tftp/serverName
Jun  9 14:48:14 satellite612 smart-proxy[24297]: X.X.X.X - - [09/Jun/2023:14:48:14 IST] "GET /tftp/serverName HTTP/1.1" 200 53
Jun  9 14:48:14 satellite612 smart-proxy[24297]: - -> /tftp/serverName
Jun  9 14:48:14 satellite612 smart-proxy[24297]: X.X.X.X - - [09/Jun/2023:14:48:14 IST] "POST /dhcp/192.168.233.0 HTTP/1.1" 200 0
Jun  9 14:48:14 satellite612 smart-proxy[24297]: - -> /dhcp/192.168.233.0
Jun  9 14:48:14 satellite612 named[1067]: client @0x7fce3c0a25b0 127.0.0.1#48349/key rndc-key: updating zone 'sat.EXAMPLE.COM/IN': adding an RR at 'host.example.com' A 192.168.233.156
Jun  9 14:48:14 satellite612 smart-proxy[24297]: X.X.X.X - - [09/Jun/2023:14:48:14 IST] "POST /dns/ HTTP/1.1" 200 0
Jun  9 14:48:14 satellite612 smart-proxy[24297]: - -> /dns/
Jun  9 14:48:14 satellite612 named[1067]: client @0x7fce3c0a25b0 127.0.0.1#55857/key rndc-key: updating zone '233.168.192.in-addr.arpa/IN': adding an RR at '156.233.168.192.in-addr.arpa' PTR host.example.com.

==> /var/log/foreman-proxy/proxy.log <==
2023-06-09T14:48:14 8cec5277 [I] Finished POST /dns/ with 200 (15.93 ms)

==> /var/log/foreman/production.log <==
2023-06-09T14:48:14 [I|app|3e5ca2c8] Started GET "/tasks/d111af69-aa42-46ea-8417-7a18c0293d46" for 10.74.24.4 at 2023-06-09 14:48:14 +0530
2023-06-09T14:48:14 [I|app|8cec5277] Deploying TFTP PXELinux configuration for host.example.com
2023-06-09T14:48:14 [I|app|3e5ca2c8] Processing by TasksController#show as */*
2023-06-09T14:48:14 [I|app|3e5ca2c8]   Parameters: {"id"=>"d111af69-aa42-46ea-8417-7a18c0293d46"}

==> /var/log/foreman-proxy/proxy.log <==
2023-06-09T14:48:14 8cec5277 [I] Started POST /tftp/PXELinux/00:50:56:b4:11:5d 
2023-06-09T14:48:14 8cec5277 [I] Finished POST /tftp/PXELinux/00:50:56:b4:11:5d with 200 (2.08 ms)

==> /var/log/foreman/production.log <==
2023-06-09T14:48:14 [I|app|3e5ca2c8]   Rendered tasks/_list.html.erb (Duration: 5.0ms | Allocations: 6717)
2023-06-09T14:48:14 [I|app|3e5ca2c8] Completed 200 OK in 13ms (Views: 5.4ms | ActiveRecord: 4.1ms | Allocations: 8504)
2023-06-09T14:48:14 [I|app|8cec5277] Deploying TFTP PXEGrub2 configuration for host.example.com

==> /var/log/foreman-proxy/proxy.log <==
2023-06-09T14:48:14 8cec5277 [I] Started POST /tftp/PXEGrub2/00:50:56:b4:11:5d 

==> /var/log/messages <==
Jun  9 14:48:14 satellite612 smart-proxy[24297]: X.X.X.X - - [09/Jun/2023:14:48:14 IST] "POST /dns/ HTTP/1.1" 200 0
Jun  9 14:48:14 satellite612 smart-proxy[24297]: - -> /dns/
Jun  9 14:48:14 satellite612 smart-proxy[24297]: X.X.X.X - - [09/Jun/2023:14:48:14 IST] "POST /tftp/PXELinux/00:50:56:b4:11:5d HTTP/1.1" 200 0
Jun  9 14:48:14 satellite612 smart-proxy[24297]: - -> /tftp/PXELinux/00:50:56:b4:11:5d
Jun  9 14:48:14 satellite612 smart-proxy[24297]: malloc_consolidate(): unaligned fastbin chunk detected
Jun  9 14:48:14 satellite612 systemd[1]: Created slice system-systemd\x2dcoredump.slice.
Jun  9 14:48:14 satellite612 systemd[1]: Started Process Core Dump (PID 24415/UID 0).


And then the host submission fails as foreman-proxy is killed.

Comment 11 Bart De Meester 2023-06-09 11:41:03 UTC
We got into this issue after an upgrade to Satellite 6.12 (See RH case 03533178) and RHEL8.8
Working situation (pre-upgrade) :
  - RHEL8.6
  - Satellite 6.11
Provisioning at that moment worked without issues using a REALM.

After upgrade we have following:
  - RHEL8.8
  - Satellite 6.12

The same provision now fails with the symptoms described in this Bugzilla

Please treat this with the highest priority as per company policy we prefer to stay one version behind the latest release for stability reasons.

Comment 17 Sayan Das 2023-06-15 09:20:36 UTC
With absolute certainty, I confirm that this fixes the issue 

# pwd
/usr/share/gems/gems/gssapi-1.2.0


# git diff
diff --git a/lib/gssapi/lib_gssapi.rb b/lib/gssapi/lib_gssapi.rb
index 0e61dad..e877801 100644
--- a/lib/gssapi/lib_gssapi.rb
+++ b/lib/gssapi/lib_gssapi.rb
@@ -179,11 +179,13 @@ module GSSAPI
     class GssCtxIdT < GssPointer
       def self.release_ptr(context_ptr)
         min_stat = FFI::MemoryPointer.new :OM_uint32
-        maj_stat = LibGSSAPI.gss_delete_sec_context(min_stat, context_ptr, LibGSSAPI::GSS_C_NO_BUFFER)
+        ptr_p = FFI::MemoryPointer.new :pointer
+        ctx_ptr = ptr_p.write_pointer(context_ptr)
+        maj_stat = LibGSSAPI.gss_delete_sec_context(min_stat, ctx_ptr, LibGSSAPI::GSS_C_NO_BUFFER)
       end
 
       def self.gss_c_no_context
-        self.new(GSSAPI::LibGSSAPI::GSS_C_NO_CONTEXT)
+        GssPointer.new(GSSAPI::LibGSSAPI::GSS_C_NO_CONTEXT)
       end
     end
 


i.e. the patch https://github.com/zenchild/gssapi/commit/6514d2474be31e1b97fe02f0a2c302463c8a0c88 that resides in rubygem-gssapi 1.3.1  shipped with Satellite 6.13

I applied the same patch on my 6.12 reproducer and tested three concurrent builds. So far , no error or no crash of services whatsoever.

Comment 23 Lukas Pramuk 2023-08-22 21:45:30 UTC
VERIFIED.

@Satellite 6.12.5 Snap2
rubygem-xmlrpc-0.3.2-1.el8sat.noarch

by manual reproducer described in comment#8 and comment#10:

1) Set up Satellite with kickstart content and provisioning features

2) Enable Realm feature and select Realm Capsule in a Host Group
( https://access.redhat.com/documentation/en-us/red_hat_satellite/6.12/html/installing_satellite_server_in_a_connected_network_environment/configuring_external_authentication_satellite#External_Authentication_for_Provisioned_Hosts_satellite )

3) Use that Host Group to deploy a system and check the log

FIX:

# hammer host create --name client --hostgroup baremetal --location-id 2 --organization-id 1 --provision-method build --build true --compute-attributes="start=1" --mac 56:6f:6d:f5:00:0e
Host created.
# grep smart-proxy /var/log/messages
...
Aug 22 16:49:13 localhost smart-proxy[67392]: <SAT_IP> - - [22/Aug/2023:16:49:13 EDT] "POST /tftp/iPXE/56:6f:6d:f5:00:0e HTTP/1.1" 200 0
Aug 22 16:49:13 localhost smart-proxy[67392]: - -> /tftp/iPXE/56:6f:6d:f5:00:0e
Aug 22 16:49:13 localhost smart-proxy[67392]: <SAT_IP> - - [22/Aug/2023:16:49:13 EDT] "POST /tftp/fetch_boot_file HTTP/1.1" 200 0
Aug 22 16:49:13 localhost smart-proxy[67392]: - -> /tftp/fetch_boot_file

>>> foreman-proxy no longer core dumps

vs.

REPRO:

# hammer host create --name client --hostgroup baremetal --location-id 2 --organization-id 1 --provision-method build --build true --compute-attributes="start=1" --mac 56:6f:6d:f5:00:10 
Could not create the host:
  Deploy TFTP PXELinux config for client.example.com task failed with the following error: ERF12-5472 [ProxyAPI::ProxyException]: Unable to set TFTP boot entry for 56:6f:6d:f5:00:10 ([RestClient::ServerBrokeConnection]: Server broke connection) for Capsule https://<SAT_FQDN>:9090/tftp
  Failed to perform rollback on Create Reverse IPv4 DNS record for client.example.com - ERF12-1261 [ProxyAPI::ProxyException]: Unable to delete DNS entry ([Errno::ECONNREFUSED]: Failed to open TCP connection to <SAT_FQDN>:9090 (Connection refused - connect(2) for "<SAT_FQDN>" port 9090)) for Capsule https://<SAT_FQDN>:9090/dns
  Failed to perform rollback on Create IPv4 DNS record for client.example.com - ERF12-1261 [ProxyAPI::ProxyException]: Unable to delete DNS entry ([Errno::ECONNREFUSED]: Failed to open TCP connection to <SAT_FQDN>:9090 (Connection refused - connect(2) for "<SAT_FQDN>" port 9090)) for Capsule https://<SAT_FQDN>:9090/dns
  Failed to perform rollback on Create DHCP Settings for client.example.com - ERF12-0635 [ProxyAPI::ProxyException]: Unable to delete DHCP entry for 56:6f:6d:f5:00:10 ([Errno::ECONNREFUSED]: Failed to open TCP connection to <SAT_FQDN>:9090 (Connection refused - connect(2) for "<SAT_FQDN>" port 9090)) for Capsule https://<SAT_FQDN>:9090/dhcp
  Failed to perform rollback on DHCP conflicts removal for client.example.com - ERF12-4395 [ProxyAPI::ProxyException]: Unable to retrieve DHCP entry for 56:6f:6d:f5:00:10 ([Errno::ECONNREFUSED]: Failed to open TCP connection to <SAT_FQDN>:9090 (Connection refused - connect(2) for "<SAT_FQDN>" port 9090)) for Capsule https://<SAT_FQDN>:9090/dhcp

# grep smart-proxy /var/log/messages
Aug 22 16:48:24 localhost smart-proxy[66678]: /usr/share/foreman-proxy/modules/realm_freeipa/provider.rb:46: warning: already initialized constant XMLRPC::Config::ENABLE_NIL_PARSER
Aug 22 16:48:24 localhost smart-proxy[66678]: /usr/share/gems/gems/xmlrpc-0.3.0/lib/xmlrpc/config.rb:22: warning: previous definition of ENABLE_NIL_PARSER was here
Aug 22 16:48:24 localhost smart-proxy[66678]: <SAT_IP> - - [22/Aug/2023:16:48:24 EDT] "POST /realm/EXAMPLE.COM/ HTTP/1.1" 200 586
Aug 22 16:48:24 localhost smart-proxy[66678]: - -> /realm/EXAMPLE.COM/
...
Aug 22 16:48:25 localhost smart-proxy[66678]: <SAT_IP> - - [22/Aug/2023:16:48:25 EDT] "GET /unattended/templateServer HTTP/1.1" 200 67
Aug 22 16:48:25 localhost smart-proxy[66678]: - -> /unattended/templateServer
Aug 22 16:48:25 localhost smart-proxy[66678]: malloc(): unaligned fastbin chunk detected
Aug 22 16:48:25 localhost systemd-coredump[67435]: Resource limits disable core dumping for process 66678 (smart-proxy).
Aug 22 16:48:25 localhost systemd-coredump[67435]: Process 66678 (smart-proxy) of user 990 dumped core.

Comment 27 errata-xmlrpc 2023-08-24 18:24:35 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Satellite 6.12.5 Async Update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2023:4749

Comment 28 Pavel Moravec 2024-03-15 08:36:32 UTC
The fix is *NOT* in rubygem-xmlrpc-0.3.2-1 but in rubygem-gssapi-1.3.1-1

The bug is *NOT* fixed in 6.12.5 (which is shipped with rubygem-xmlrpc-0.3.2-1 but with rubygem-gssapi-1.2.0-8 and /usr/share/gems/gems/gssapi-1.2.0/lib/gssapi/lib_gssapi.rb does *miss* the patch.

The bug is really fixed in 6.13, I double-checked that.


So for potential Satellite 6.12 customers: please upgrade to 6.13 (or higher).