Red Hat Bugzilla – Bug 153775
[RHEL3-U6][Diskdump] Backtrace of OS_INIT doesn't work
Last modified: 2010-10-21 22:52:13 EDT
Description of problem:
When crashdump is executed via OS_INIT on an IPF machine, the backtrace command (bt), which is a subcommand of crash, does not work correctly. Two problems were found in the OS_INIT code.

(1) The OS_INIT handler has two stages.
    stage1: handler written in assembler
    stage2: handler written in C
The former is ia64_monarch_init_handler and ia64_slave_init_handler. The latter is ia64_init_handler. ia64_init_handler is called only by ia64_monarch_init_handler. When the INIT interrupt is asserted, one cpu calls ia64_monarch_init_handler and the others call ia64_slave_init_handler. This means that ia64_init_handler is called by only one cpu. In that case, the backtrace command fails. To make backtrace succeed, all cpus need to call ia64_monarch_init_handler.

(2) The second problem is introduced by correcting the first one. When the OS_INIT handler is called, SAL hands the handler some information through registers. The handler saves this information in ia64_sal_to_os_handoff_state. (Please see the SAL_TO_OS_MCA_HANDOFF_STATE_SAVE macro in arch/ia64/kernel/mca_asm.S.) If all cpus call ia64_monarch_init_handler at the same time, they write their own information to ia64_sal_to_os_handoff_state simultaneously and corrupt its contents.

Version-Release number of selected component (if applicable):
kernel-2.4.21-31.EL

How reproducible:
always

Steps to Reproduce:
1. Enable Diskdump
2. Push the OS_INIT switch
3. Run bt with the crash command

Actual results:
The backtrace command does not work

Expected results:
The backtrace command works correctly

Additional info:
none
How to correct these problems:
(1) When the OS_INIT handler is registered with SAL, register ia64_monarch_init_handler as both the monarch handler and the slave handler.
(2) Prepare an ia64_sal_to_os_handoff_state for each cpu beforehand.
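
As a rough illustration of point (2), the following user-space sketch contrasts a single shared handoff record with one record per cpu. It is not the attached patch and is not kernel code: struct handoff_state, its fields, NR_CPUS, and every function name here are hypothetical stand-ins, and the cpus are simulated one after another rather than concurrently. On real hardware the writes overlap in time, so a shared record could also end up holding a mix of fields from different cpus.

```c
#include <stdio.h>

#define NR_CPUS 4 /* hypothetical cpu count for this sketch */

/* Hypothetical, simplified stand-in for the SAL-to-OS handoff record. */
struct handoff_state {
    int cpu;                 /* cpu that saved this record */
    unsigned long gp;        /* made-up example values "passed by SAL" */
    unsigned long min_state;
};

/* Problem (2): one record shared by every cpu. */
static struct handoff_state shared_state;

/* Proposed direction: one record per cpu, prepared beforehand. */
static struct handoff_state percpu_state[NR_CPUS];

/* Build the values a given cpu would receive (invented for the sketch). */
static struct handoff_state make_state(int cpu)
{
    struct handoff_state s;
    s.cpu = cpu;
    s.gp = 0x1000UL + (unsigned long)cpu;
    s.min_state = 0x2000UL + (unsigned long)cpu;
    return s;
}

/* Every cpu writes the same location, so earlier records are lost. */
void save_shared(int cpu)
{
    shared_state = make_state(cpu);
}

/* Each cpu writes only its own slot, indexed by cpu number. */
void save_percpu(int cpu)
{
    percpu_state[cpu] = make_state(cpu);
}

/* Simulate every cpu entering the handler and saving its state. */
void simulate_init(void)
{
    int cpu;
    for (cpu = 0; cpu < NR_CPUS; cpu++) {
        save_shared(cpu);
        save_percpu(cpu);
    }
}

/* Count cpus that can still find their own record in the shared copy. */
int intact_shared(void)
{
    int cpu, n = 0;
    for (cpu = 0; cpu < NR_CPUS; cpu++)
        if (shared_state.cpu == cpu)
            n++;
    return n;
}

/* Count cpus that can still find their own record in their own slot. */
int intact_percpu(void)
{
    int cpu, n = 0;
    for (cpu = 0; cpu < NR_CPUS; cpu++)
        if (percpu_state[cpu].cpu == cpu)
            n++;
    return n;
}

/* Print how many records survive under each layout. */
void report(void)
{
    printf("shared: %d of %d intact\n", intact_shared(), NR_CPUS);
    printf("percpu: %d of %d intact\n", intact_percpu(), NR_CPUS);
}
```

Calling simulate_init() and then report() shows that only the last writer's record survives in the shared copy, while every per-cpu slot still holds its owner's values. In the kernel the save happens in assembler, so the real change would have to compute the per-cpu slot address there; this sketch only shows the data layout idea.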
Created attachment 112728: osinit.patch
The patch was provided by Fujitsu on IT#69948.
The fix for the issue will be targeted for RHEL3-U6.
A fix for this problem has just been committed to the RHEL3 U6 patch pool this evening (in kernel version 2.4.21-32.5.EL).
To verify the fix, we've done testing with kernel-2.4.21-32.9.EL. It works correctly.

Regards,
Akira
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHSA-2005-663.html