Bug 1543902 - Include /etc/hosts and /etc/nsswitch.conf into initramfs if fence_kdump is used otherwise crashkernel doesn't resolve the hostnames
Summary: Include /etc/hosts and /etc/nsswitch.conf into initramfs if fence_kdump is us...
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: kexec-tools
Version: 7.4
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: rc
Target Release: ---
Assignee: Pingfan Liu
QA Contact: Qiao Zhao
URL:
Whiteboard:
Depends On:
Blocks: 1548445 1549423
Reported: 2018-02-09 14:11 UTC by Josef Zimek
Modified: 2022-03-13 14:41 UTC (History)

Fixed In Version: kexec-tools-2.0.15-15.el7
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-10-30 11:29:30 UTC
Target Upstream Version:
Embargoed:



Links
System ID Private Priority Status Summary Last Updated
Red Hat Knowledge Base (Solution) 3348911 0 None None None 2018-05-22 19:28:25 UTC
Red Hat Knowledge Base (Solution) 3349221 0 None None None 2018-05-22 19:28:01 UTC
Red Hat Product Errata RHBA-2018:3240 0 None None None 2018-10-30 11:31:07 UTC

Description Josef Zimek 2018-02-09 14:11:23 UTC
Description of problem:

fence_kdump_nodes is required parameter in case fence_kdump is used. If user doesn't specify fence_kdump_nodes in /etc/kdump.conf the list of cluster nodes is obtained from cluster automatically and that value is passed to fence_kdump_nodes:


# retrieves fence_kdump nodes from Pacemaker cluster configuration
get_pcs_fence_kdump_nodes() {
    local nodes

    # get cluster nodes from cluster cib, get interface and ip address
    nodelist=`pcs cluster cib | xmllint --xpath "/cib/status/node_state/@uname" -`

    # nodelist is formed as 'uname="node1" uname="node2" ... uname="nodeX"'
    # we need to convert each to node1, node2 ... nodeX in each iteration
    for node in ${nodelist}; do
        # convert $node from 'uname="nodeX"' to 'nodeX'
        eval $node
        nodename=$uname
        # Skip its own node name
        if [ "$nodename" = `hostname` -o "$nodename" = `hostname -s` ]; then
            continue
        fi
        nodes="$nodes $nodename"
    done

    echo $nodes
}


The problem with the above approach is that cluster node names from Pacemaker don't necessarily have to be FQDNs; they can be user-defined aliases. So this way we feed fence_kdump_nodes with a list of aliases, which are typically mapped in /etc/hosts, but /etc/hosts is not included in the initramfs by dracut by default. There are multiple reasons why it is not included, and there is a low chance of getting it there by default. However, it makes sense to include /etc/hosts (and /etc/nsswitch.conf to make it work) when fence_kdump is used, because without them hostname aliases are not resolvable from the crash kernel, which leads to fence_kdump failure (fence_kdump_send fails to send notifications and fencing times out, so the vmcore doesn't get captured).

The same issue arises not only when we automatically feed fence_kdump_nodes with values obtained from Pacemaker but also when the user manually enters aliases into /etc/kdump.conf, because we allow that. Kdump will not complain, because it can resolve the alias while creating the initramfs, but the alias is not resolvable from the crash kernel. So currently it only works if fence_kdump_nodes contains a) IP addresses of cluster nodes or b) FQDNs of cluster nodes.
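
To make the distinction above concrete, here is a minimal sketch that splits a fence_kdump_nodes list into IP literals (which need no lookup) and names (which need a working resolver in the crash kernel). The helpers classify_node and names_needing_resolution are hypothetical and are not part of kexec-tools; the IPv4 check is deliberately loose. Names include FQDNs as well as aliases, so the output only lists candidates, not confirmed failures.

```shell
# Hypothetical helper (not from kexec-tools): classify one fence_kdump_nodes
# entry as "ip" (no lookup needed) or "name" (needs name resolution).
classify_node() {
    case $1 in
        *:*) echo ip ;;            # contains a colon: treat as IPv6 literal
        *[!0-9.]*|"") echo name ;; # any character other than digit or dot
        *.*.*.*) echo ip ;;        # only digits and dots: IPv4-like (loose check)
        *) echo name ;;
    esac
}

# Print, one per line, the entries that would need name resolution
# inside the crash kernel.
names_needing_resolution() {
    local entry
    for entry in "$@"; do
        if [ "$(classify_node "$entry")" = name ]; then
            echo "$entry"
        fi
    done
}

# Example: only node2 is printed.
names_needing_resolution 192.168.1.10 node2 fe80::1
```

Each printed name could then be checked on the running system with something like "getent hosts NAME"; an alias that resolves there only through /etc/hosts is the case this bug describes.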

There is a workaround: set dracut_args in /etc/kdump.conf:

dracut_args --include /etc/hosts /etc/nsswitch.conf

However it sounds reasonable to include it by default.



Version-Release number of selected component (if applicable):

Tested on kexec-tools-2.0.14 however it is very likely applicable to all RHEL 7 releases

How reproducible:
Always

Steps to Reproduce:
1. set cluster with hostnames which are not FQDNs but aliases
2. map the aliases to IPs in /etc/hosts file
3. set fence_kdump as primary fencing method
4. restart kdump
5. crash the node
6. fencing of crashed node will timeout because


Actual results:
Fencing fails due to unresolvable hostnames in fence_kdump_nodes (vmcore not captured)


Expected results:
Fencing succeeds (vmcore gets captured) because the aliases are resolved from within the crash kernel, since /etc/hosts and /etc/nsswitch.conf are included in the initramfs when fence_kdump is used


Additional info:
tested with: dracut_args --include /etc/hosts /etc/nsswitch.conf and it works as expected

Comment 3 Jesús M Fernández-Gallardo 2018-03-27 08:56:59 UTC
Hello,

The "dracut_args --include /etc/hosts /etc/nsswitch.conf" workaround you mentioned above does not work, mainly because that line has the side effect of installing the content of /etc/hosts as /etc/nsswitch.conf.

After inspecting how /usr/sbin/mkdumprd builds the dracut_args array variable that it passes as the first argument to dracut, I realized that you probably intended to use --install instead of --include.

The former allows you to install the content of several files into the target kdump ramdisk, while the latter can only be used once, assuming that what is stated in dracut man pages is correct.

Indeed, I checked that using several --include options (regardless of whether they appear on a single line or in several dracut_args entries in /etc/kdump.conf) results in dracut complaining about syntax issues and aborting.

On the other hand, using --install instead of --include produces a correct, consistent and complete kdump ramdisk image:

This is the line I used in /etc/kdump.conf:

dracut_args --install "/etc/nsswitch.conf" --install "/etc/hosts"
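
One way to confirm the result after rebuilding the kdump image is to look for both files in the image listing. The following is a minimal sketch, not something used in the test above: missing_resolver_files is a hypothetical helper that reads a listing in the long format printed by lsinitrd (path in the last column) and prints any of the two files that is absent.

```shell
# Hypothetical helper: read an lsinitrd-style listing on stdin and print each
# required file that is absent. Returns 0 only if nothing is missing.
missing_resolver_files() {
    local listing status f
    listing=$(cat)
    status=0
    for f in etc/hosts etc/nsswitch.conf; do
        # Compare the last column of every line with the wanted path,
        # tolerating an optional leading slash in the listing.
        if ! printf '%s\n' "$listing" |
            awk -v f="$f" '{ p = $NF; sub(/^\//, "", p); if (p == f) found = 1 }
                           END { exit !found }'; then
            echo "/$f"
            status=1
        fi
    done
    return $status
}
```

Assuming the default RHEL 7 image name, it could be used as "lsinitrd /boot/initramfs-$(uname -r)kdump.img | missing_resolver_files"; the image path is an assumption and may differ on your system. No output and exit status 0 mean both files are present.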


Jesús.

Comment 4 Pingfan Liu 2018-04-28 03:04:50 UTC
(In reply to Josef Zimek from comment #0)
> Description of problem:
> 
> fence_kdump_nodes is required parameter in case fence_kdump is used. If user
> doesn't specify fence_kdump_nodes in /etc/kdump.conf the list of cluster
> nodes is obtained from cluster automatically and that value is passed to
> fence_kdump_nodes:
> 
> 
> # retrieves fence_kdump nodes from Pacemaker cluster configuration
> get_pcs_fence_kdump_nodes() {
>     local nodes
> 
>     # get cluster nodes from cluster cib, get interface and ip address
>     nodelist=`pcs cluster cib | xmllint --xpath
> "/cib/status/node_state/@uname" -`
> 
>     # nodelist is formed as 'uname="node1" uname="node2" ... uname="nodeX"'
>     # we need to convert each to node1, node2 ... nodeX in each iteration
>     for node in ${nodelist}; do
>         # convert $node from 'uname="nodeX"' to 'nodeX'
>         eval $node
>         nodename=$uname
>         # Skip its own node name
>         if [ "$nodename" = `hostname` -o "$nodename" = `hostname -s` ]; then
>             continue
>         fi
>         nodes="$nodes $nodename"
>     done
> 
>     echo $nodes
> }
> 
> 
> Problem with above approach is that cluster node names from pacemaker don't
> necessarily have to be FQDN but can be user defined aliases. So this way we
> feed fence_kdump_nodes with list of aliases which are typically mapped in
> /etc/hosts but /etc/hosts is not included in initramfs by dracut by default.
> There are multiple reasons why it is not included and there is low chance
> to get it there by default. However it makes sense to include /etc/hosts (and
> /etc/nsswitch.conf to make it work) when fence_kdump is used because without
> that hostname aliases are not resolvable from crashkernel which leads to
> fence_kdump failure (fence_kdump_send fails to send notifications and
> fencing results in timeout = vmcore doesn't get captured).
> 
Hi, do you actually hit this issue? During mkdumprd, the IP address is stored instead of the hostname, so an alias should not cause the problem.

Thanks,
Pingfan

Comment 5 Pingfan Liu 2018-04-28 03:07:21 UTC
But it is harmless to include hosts and nsswitch.conf. It also helps when debugging in the kdump shell, for example if we try to scp something to an alias.

Comment 6 Pingfan Liu 2018-06-21 05:50:56 UTC
My test environment is broken. After adding rd.debug in cmdline, I found that fence_kdump_send cmd has no node as its input param. Hence during my test, I did not hit the condition (fence_kdump_send fails to send notifications and fencing results in timeout = vmcore doesn't get captured).

It turns out that we need this fix.

Thanks,
Pingfan

Comment 11 errata-xmlrpc 2018-10-30 11:29:30 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:3240

