System Under Test "SUT" Hardware Description: 1. Brief description of hardware A) SUT Info: system name= vendor= model= B) CPU Info: family= model= model name= total cpu count= C) Memory Info: type= dimm part#'s= memory amount= 2. Link to the Hardware Certifications for existing system A) Base Certification B) Supplemental Certification 3. List known issues A) Existing BZ's B) Existing Hardware Errata C) Existing KBase articles 4. Memory specifications Please provide a brief description for the following: A) What is the expected bandwidth of the memory subsystem system wide? (If we run many instances of memory intensive applications where each application does not cross NUMA boundaries, how much aggregate bandwidth might we expect on the server?) B) Does the memory subsystem support NORMAL -vs- PERFORMANCE mode at the management/BIOS layer? If so what is it set to? C) How many memory channels per socket for specific CPU? D) How many channels per socket are actually populated on the SUT?
All, This BZ was opened following the failure of: Bug 1311226 - EET: RHEL7.2 24TB RAM 768CPU HP Integrity Superdome X - BL920s Gen9 System https://bugzilla.redhat.com/show_bug.cgi?id=1311226 There was an issue found during the performance stage of testing in BZ1311226: https://bugzilla.redhat.com/show_bug.cgi?id=1311226#c37 We have opened this BZ to rerun the EET testing with kernel-3.10.0-327.18.2.el7. Best, -pbunyan
Nigel Croxon 2016-02-23 11:08:55 EST https://bugzilla.redhat.com/show_bug.cgi?id=1311226#c0 System Under Test "SUT" Hardware Description: 1. Brief description of hardware A) HP Integrity Superdome X - BL920s Gen9 System B) Broadwell EX -E7-8890v4 2.60GHz, CPU count 24 C) DDR4-2133 LRDimm, 24TB or 12TB 2. Link to the Hardware Certification for existing system: A) Base Certification See comment below for Base Certification B) Supplemental Certification 3. List of known issues: A) Existing BZ - https://bugzilla.redhat.com/show_bug.cgi?id=1293436 B) Existing Hardware Errata C) Existing KBase article - https://access.redhat.com/articles/1979103 4. Memory specifications: A) What is the expected bandwidth of the memory subsystem system wide? (If we run many instances of memory intensive applications where each application does not cross NUMA boundaries, how much aggregate bandwidth might we expect on the server?) ~762 GB/s at 16 sockets or ~48 GB/s per socket of memory bandwidth (read only) with RAS features enabled. ~1200 GB/s at 16 sockets or ~75 GB/s per socket of memory bandwidth (read only) with RAS features disabled. B) Does the memory subsystem support NORMAL -vs- PERFORMANCE mode at the management/BIOS layer? Yes If so what is it set to? Default is DDDC mode = performance mode C) How many memory channels per socket for specific CPU? The Integrity Superdome X contains 8 BL920s Gen9 blades. Each of the 8 blades has 2 CPU sockets. Each CPU socket has 2 memory channels each connecting to 2 memory controllers that contain 6 Dimms each. Each CPU socket has 24 Dimms. Each blade has 48 Dimms. Total system Dimm capacity is 384 Dimms. 384 x 32GB DDR4-2133 LRDimm = 12288 GB (12 TB) of system memory installed D) How many channels per socket are actually populated on the test system? Each of the 16 CPU sockets has all memory slots populated - 24 x 32GB DDR4-2133 LRDimms = 768GB per CPU socket -End
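The totals in the hardware description follow from a few multiplications, so a short sketch can be used to cross-check them. It uses only the per-socket and per-blade figures quoted above; the constant and function names are made up for illustration. Note that 16 x 48 gives 768 GB/s, slightly above the quoted ~762 GB/s, presumably because the per-socket figure is rounded.

```python
# Figures taken from the hardware description in this comment.
BLADES = 8
SOCKETS_PER_BLADE = 2
DIMMS_PER_SOCKET = 24
DIMM_GB = 32

# Approximate read-only bandwidth per socket, in GB/s, as quoted.
PER_SOCKET_GBPS_RAS_ON = 48
PER_SOCKET_GBPS_RAS_OFF = 75


def totals():
    """Derive system-wide totals from the per-socket figures."""
    sockets = BLADES * SOCKETS_PER_BLADE
    dimms = sockets * DIMMS_PER_SOCKET
    return {
        "sockets": sockets,
        "dimms": dimms,
        "memory_gb": dimms * DIMM_GB,
        "bandwidth_ras_on_gbps": sockets * PER_SOCKET_GBPS_RAS_ON,
        "bandwidth_ras_off_gbps": sockets * PER_SOCKET_GBPS_RAS_OFF,
    }


if __name__ == "__main__":
    for name, value in totals().items():
        print(f"{name} = {value}")
```

With 32GB DIMMs the memory total comes out in GB (12288 GB, i.e. 12 TB), and the RAS-disabled bandwidth comes out in GB/s, not TB/s.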
All, The following Extended Engineering Testing (EET) is in progress: EET: RHEL7.2Z HP Integrity Superdome X This EET testing requires a Zstream kernel: 3.10.0-327.18.2.el7.x86_64 ====================================== TARGET HOST DETAILS: ====================================== Hostname = hawk604a.local Arch = x86_64 Distro = RHEL-7.2Z Kernel = 3.10.0-327.18.2.el7.x86_64 CPU count = 768 CPU model name = Intel(R) Xeon(R) CPU E7-8890 v4 @ 2.20GHz BIOS Information = Vendor: HP Version: Bundle: 008.002.042 SFW: 041.119.000 Release Date: 04/30/2016 MemTotal = 25364774656 kB There are three stages of EET testing: [] Fundamentals (PBunyan pbunyan) [] Performance (BMarson bmarson) [] Lload (LWoodman lwoodman) Best, -pbunyan
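For readers comparing the MemTotal value against the nominal 24TB configuration, a small conversion sketch may help. It assumes MemTotal comes from /proc/meminfo, where "kB" means units of 1024 bytes; the helper names are hypothetical.

```python
MEMTOTAL_KB = 25364774656  # value reported for hawk604a above


def kib_to_tib(kib):
    """Convert a /proc/meminfo style kB (1024-byte) count to TiB."""
    return kib / 1024**3


def fraction_of_nominal(kib, nominal_tib):
    """Share of the nominal capacity that the kernel reports as usable."""
    return kib_to_tib(kib) / nominal_tib


if __name__ == "__main__":
    print(f"MemTotal = {kib_to_tib(MEMTOTAL_KB):.2f} TiB")
    print(f"share of 24 TiB = {fraction_of_nominal(MEMTOTAL_KB, 24):.3f}")
```

The reported value is a little under 24 TiB, which is normal: MemTotal excludes memory reserved by firmware and the kernel.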
All, Current testing status... RHEL7.2Z 24TB RAM 768CPU kernel-3.10.0-327.18.2.el7.x86_64 ====================================== FUNDAMENTALS: PBunyan ====================================== EET x86_64 Baremetal - scheduled EET x86_64 Xen - N/A EET x86_64 KVM - scheduled EET x86_64 Kdump - scheduled ====================================== PERFORMANCE: BMarson ====================================== x86_64 Linpack - pending review... x86_64 Stream - pending review... Barry - please provide a comment with your testing results ====================================== LLOAD: LWoodman ====================================== x86_64 Lload - scheduled Best, -pbunyan
I just finished looking over the linpack and streams runs. With our C based versions of these tests, compiled with gcc, we demonstrated: Linpack single precision (making use of on chip cache more) --------------------------------------------------------- Performance peaked at 800 Gflops when 288 instances (18 per NUMA node) were run in parallel. Linpack double precision (accesses more of main memory) ------------------------------------------------------- Performance peaked at 335 Gflops when 144 instances (9 per NUMA node) were run in parallel. Streams (main memory exerciser) ------------------------------- Typically performance peaks with our affinity testing (using taskset) but in this testbed, NUMA pinning performed even better. We migrated to the errata kernel with some scheduler fixes/enhancements because the GA kernel was showing too large a standard deviation between the individual iterations as we increased the workload. This kernel is behaving far better. Performance peaked at 489 GB/sec when 224 instances (14 per NUMA node) were run in parallel. Based on the memory bandwidth information documented above, this seems a little low. I tested without hyperthreads and there was little difference. I did this to make sure we weren't accidentally thinking an LCPU was actually a core's other hyperthread. The tests were run with the tuned profile latency-performance, which essentially forces all cores to stay at cstate C1. This limits CPU frequency (and in the past has prevented turbo mode from engaging) so my scaling data (1 core per socket vs many) is more predictable. I've asked Nigel for more clarification of the memory bandwidth numbers and configuration. Barry
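To put "a little low" in numbers, the measured Streams peak can be set against the vendor's RAS-enabled estimate. This is only a back-of-the-envelope sketch: the 16 NUMA nodes follow from the instance counts above (224 = 14 x 16), the ~762 GB/s figure is the one quoted in the hardware description, and the function names are illustrative.

```python
NUMA_NODES = 16             # 224 instances / 14 per node
MEASURED_STREAM_GBPS = 489  # peak reported in this comment
EXPECTED_RAS_ON_GBPS = 762  # vendor estimate with RAS features enabled


def per_node(total_gbps, nodes=NUMA_NODES):
    """Average bandwidth per NUMA node."""
    return total_gbps / nodes


def fraction_of_expected(measured, expected):
    """Measured bandwidth as a share of the expected figure."""
    return measured / expected


if __name__ == "__main__":
    print(f"per node: {per_node(MEASURED_STREAM_GBPS):.1f} GB/s")
    share = fraction_of_expected(MEASURED_STREAM_GBPS, EXPECTED_RAS_ON_GBPS)
    print(f"share of expected: {share:.0%}")
```

The vendor figure is read-only bandwidth, while Streams mixes reads and writes, so the two numbers are not strictly comparable.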
To connect to the Partner lab: Connect x2goclient to host address 10.16.46.165 with a "Session type = gnome" Available user names are pbunyan, bmarson, lwoodman, passwd: 100yard- Start the Firefox browser and connect to: https://bpe1-ssl.houston.hpe.com/dana-na/auth/url_default/welcome.cgi Do Not fill in a "User ID" or a "Passcode". On the "Token" pull-down, Select "BPIA Certificate". Click on "Sign In". A "User Identification Request" window will appear, click on "OK". The browser window will show "Network Connect" Line with a "Start" button. Click on "Start" A window appears, asking are you sure you want to run this application (Network Connect Launcher). Click on "Yes". A new window should appear in the top left. Showing the connection (Assigned IP address). At this point, once this window appears, you have VPN into the partner lab. Open a "Terminal" window and ssh into the jump station. Available user names are pbunyan, bmarson, lwoodman with personal passwords. for example, "ssh bmarson@jump1" The jump station IP address is 15.252.158.21. Once on the Jump station. You can ssh to the Onboard Administrator (OA). ssh Administrator.14.1 Password: Acme At the "Hawk604-oa1>" prompt, one can type "co 1" to connect to the console serial line. Ctrl-B is to exit. or on the jump station, ssh to the RHEL OS running. ssh root.14.30 Password: 100yard-
Just an update .. Not sure where my mind was when I wrote this. In comment #6 I wrote .. > I tested without hyperthreads and there was little difference. I did > this to make sure we weren't accidentally thinking an LCPU was actually > a cores other hyper-thread. This is totally incorrect. Removing HT at the BIOS level made the tests run properly in the NUMA pinning mode with the GA kernel. So the errata kernel did improve the behavior with the hyper-threads present. Barry
Barry, Have you determined if the performance testing is a pass -or- a fail with kernel-3.10.0-327.18.2.el7.x86_64? Best, -pbunyan
Paul, I'm still waiting for information about memory bandwidth and configuration for this specific system. Barry
(In reply to Barry Marson from comment #10) > Paul, > > Im still waiting for information about memory bandwidth and configuration > for this specific system. > > Barry Nigel, Please provide BarryM with the required information. Best, -pbunyan
(In reply to PaulB from comment #11) > (In reply to Barry Marson from comment #10) > > Paul, > > > > Im still waiting for information about memory bandwidth and configuration > > for this specific system. > > > > Barry > > Nigel, > Please provide BarryM with the required information. > > Best, > -pbunyan We're working on speeds. Hope to have something soon. However to be totally in sync with terminology for the description... For SDx there is RAS mode and Perf mode. The default is RAS (not Perf) mode. RAS mode, for the Broadwell-EX-based Superdome X, is: "enhanced DDDC+1 with DRAM bank sparing and DDR4 command/address parity error retry" (bank sparing and parity error retry are adders to the previous SDx feature set).
Tom, Is this test system in default mode or was it switched to perf mode ? Who can answer that ? Thanks Barry
Following on to Comment #7 https://bugzilla.redhat.com/show_bug.cgi?id=1346327#c7 At the "Hawk604-oa1>" prompt, one can type these commands "co 1" to connect to the console serial line "show livelogs" to show the current running BIOS/Firmware information messages. "^b" a Ctrl-B is to exit. "poweron partition 1" to power on the partition "poweron partition 1 force" to force a stuck system to power on the partition "poweroff partition 1" to power off the partition "poweroff partition 1 force" to force a stuck system to power off the partition
(In reply to PaulB from comment #3) > All, > The following Extended Engineering Testing (EET) is in progress: > EET: RHEL7.2Z HP Integrity Superdome X > > This EET testing equires a Zstream kernel: > 3.10.0-327.18.2.el7.x86_64 > > ====================================== > TARGET HOST DETAILS: > ====================================== > Hostname = hawk604a.local > Arch = x86_64 > Distro = RHEL-7.2Z ***** Apologies - small correction needed here. The distro did _not_ change, only the kernel was changed to a Zstream kernel for this EET test run. This should read: Distro = RHEL-7.2 ***** > Kernel = 3.10.0-327.18.2.el7.x86_64 > CPU count = 768 > CPU model name = Intel(R) Xeon(R) CPU E7-8890 v4 @ 2.20GHz > BIOS Information = > Vendor: HP > Version: Bundle: 008.002.042 SFW: 041.119.000 > Release Date: 04/30/2016 > > MemTotal = 25364774656 kB > > There are three stages of EET testing: > [] Fundamentals (PBunyan pbunyan) > [] Performance (BMarson bmarson) > [] Lload (LWoodman lwoodman) > > > Best, > -pbunyan All, Current testing status... kernel-3.10.0-327.18.2.el7.x86_64 ====================================== FUNDAMENTALS: PBunyan ====================================== EET x86_64 Baremetal - ** PASSED ** EET x86_64 Xen - N/A EET x86_64 KVM - in progress... EET x86_64 Kdump - scheduled. ====================================== PERFORMANCE: BMarson ====================================== x86_64 Linpack - results under review... x86_64 Stream - results under review... Barry - Please provide a short summary/update of the performance testing results with kernel-3.10.0-327.18.2.el7.x86_64. ====================================== LLOAD: LWoodman ====================================== x86_64 Lload - scheduled Best, -pbunyan
(In reply to Barry Marson from comment #13) > Tom, > > Is this test system in default mode or was it switched to perf mode ? Who > can answer that ? > > Thanks > Barry Barry: The machine is in RAS mode. fyi, tom
OK then this explains why the performance of streams was lower than I expected. The performance tests PASS. Barry
(In reply to Tom Vaden from comment #16) > (In reply to Barry Marson from comment #13) > > Tom, > > > > Is this test system in default mode or was it switched to perf mode ? Who > > can answer that ? > > > > Thanks > > Barry > > Barry: > > The machine is in RAS mode. > > fyi, > tom Nigel, https://bugzilla.redhat.com/show_bug.cgi?id=1346327#c2 ---<-snip->--- B) Does the memory subsystem support NORMAL -vs- PERFORMANCE mode at the management/BIOS layer? Yes If so what is it set to? Default is DDDC mode = performance mode ---<-snip->--- Is the system bios in NORMAL -or- PERFORMANCE mode? Best, -pbunyan
The system is in NORMAL mode.
ruyang, Your kdump expertise would be greatly appreciated.... ============================================ Issue: system failing to kdump testing ============================================ ------------------------------------------------- This is the issue seen on console following triggering a crash (echo c > /proc/sysrq-trigger): -------------------------------------------------- ---<snip->--- [ 113.883697] sd 0:0:0:1: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA [ 113.895119] sdb: sdb1 sdb2 sdb3 [ 113.899318] sd 0:0:0:1: [sdb] Attached SCSI disk [ TIME ] Timed out wa[ 114.023535] device-mapper: multipath service-time: version 0.2.0 loaded iting for device dev-mapper-mpathc1.device. [DEPEND] Dependency failed for /kdumproot/mnt/hpstorage. [DEPEND] Dependency failed for Initrd Root File System. [DEPEND] Dependency failed for Reload Configuration from the Real Root. [DEPEND] Dependency failed for File System Check on /dev/mapper/mpathc1. [ OK ] Stopped Kdump Vmcore Save Service. [ OK ] Stopped dracut pre-pivot and cleanup hook. [ OK ] Stopped target Initrd Default Target. [ OK ] Reached target Initrd File Systems. [ OK ] Stopped dracut mount hook. [ OK ] Stopped target Basic System. [ OK ] Stopped target System Initialization. Starting Setup Virtual Console... StartinFailed to start kdump-error-handler.service: Transaction is destructive. [FAILED] Failed to start Kdump Emergency. See 'systemctl status emergency.service' for details. [DEPEND] Dependency failed for Emergency Mode. [ OK ] Started Setup Virtual Console. [ OK ] Found device /dev/disk/by-uuid/e1da41b9-bd30-4079-a3c7-0bf2ded9b31c. [ OK ] Found device /dev/disk/by-uuid/3EBF-1CBC. [ OK ] Found device /dev/disk/by-uuid/6e0da585-542f-4556-9477-ab84407a32e9. [ OK ] Found device /dev/mapper/rhel_hawk604a-root. Starting File System Check on /dev/mapper/rhel_hawk604a-root... [ OK ] Started File System Check on /dev/mapper/rhel_hawk604a-root. 
Starting File System Check on /dev/mapper/mpathc1... [ 102.675298] systemd-fsck[852]: /sbin/fsck.xfs: XFS file system. [ OK ] Started dracut initqueue hook. [ OK ] Started File System Check on /dev/mapper/mpathc1. Mounting /kdumproot/mnt/hpstorage... Mounting /sysroot... [ OK ] Reached target Remote File Systems (Pre). [ OK [ 115.004581] SGI XFS with ACLs, security attributes, no debug enabled ] Reached target Remote File Sys[ 115.013609] XFS (dm-2): Mounting V4 Filesystem [ 115.013702] XFS (dm-7): Mounting V4 Filesystem tems. [ 115.099991] XFS (dm-2): Starting recovery (logdev: internal) [ 115.128176] XFS (dm-7): Starting recovery (logdev: internal) [ 115.215637] XFS (dm-7): Ending recovery (logdev: internal) [ OK ] Mounted /sysroot. [ 115.275986] XFS (dm-2): Ending recovery (logdev: internal) [ OK ] Mounted /kdumproot/mnt/hpstorage. ---<snip->--- ============================================ These are the relevant system kdump configs: ============================================ ------------------------------------------ cat /proc/cmdline ------------------------------------------ [root@hawk604a ~]# cat /proc/cmdline BOOT_IMAGE=/vmlinuz-3.10.0-327.18.2.el7.x86_64 root=/dev/mapper/rhel_hawk604a-root ro crashkernel=512M,high rd.lvm.lv=rhel_hawk604a/root rd.lvm.lv=rhel_hawk604a/swap console=ttyS0,115200n81 [root@hawk604a ~]# ------------------------------------------ /etc/sysconfig/kdump ------------------------------------------ [root@hawk604a ~]# cat /etc/sysconfig/kdump ---<snip->--- #raw /dev/vg/lv_kdump #ext4 /dev/vg/lv_kdump #ext4 LABEL=/boot #ext4 UUID=03138356-5e61-4ab3-b58e-27507ac41937 #nfs my.server.com:/export/tmp #ssh user.com #sshkey /root/.ssh/kdump_id_rsa #path /var/crash xfs UUID=e1da41b9-bd30-4079-a3c7-0bf2ded9b31c path /dumpit/here core_collector makedumpfile -l --message-level 1 -c -d 31 #core_collector scp #kdump_post /var/crash/scripts/kdump-post.sh #kdump_pre /var/crash/scripts/kdump-pre.sh #extra_bins /usr/bin/lftp #extra_modules gfs2 
#default shell #force_rebuild 1 #dracut_args --omit-drivers "cfg80211 snd" --add-drivers "ext2 ext3" #fence_kdump_args -p 7410 -f auto -c 0 -i 10 #fence_kdump_nodes node1 node2 ---<snip->--- ------------------------------------------ cat /etc/kdump.conf ------------------------------------------ [root@hawk604a ~]# cat /etc/kdump.conf ---<snip->--- # #raw /dev/vg/lv_kdump #ext4 /dev/vg/lv_kdump #ext4 LABEL=/boot #ext4 UUID=03138356-5e61-4ab3-b58e-27507ac41937 #nfs my.server.com:/export/tmp #ssh user.com #sshkey /root/.ssh/kdump_id_rsa #path /var/crash xfs UUID=e1da41b9-bd30-4079-a3c7-0bf2ded9b31c path /dumpit/here core_collector makedumpfile -l --message-level 1 -c -d 31 #core_collector scp #kdump_post /var/crash/scripts/kdump-post.sh #kdump_pre /var/crash/scripts/kdump-pre.sh #extra_bins /usr/bin/lftp #extra_modules gfs2 #default shell #force_rebuild 1 #dracut_args --omit-drivers "cfg80211 snd" --add-drivers "ext2 ext3" #fence_kdump_args -p 7410 -f auto -c 0 -i 10 #fence_kdump_nodes node1 node2 [root@hawk604a ~]# ---<snip->--- ========= NOTE: ========= We had similar issue in previous testing: https://bugzilla.redhat.com/show_bug.cgi?id=1311226#c18 Seems rd.retry=300 in /etc/sysconfig/kdump KDUMP_COMMANDLINE_APPEND is not working with our test kernel-3.10.0-327.18.2.el7.x86_64 Other than we are testing kernel-3.10.0-327.18.2.el7.x86_64, the system is configured exactly the same as described here: https://bugzilla.redhat.com/show_bug.cgi?id=1311226#c18 Thank you for your time, ruyang.
Hi, Paul It seems the content shown for cat /etc/sysconfig/kdump is wrong; it is /etc/kdump.conf instead. Could you try a longer rd.retry? The unit is seconds, e.g. rd.retry=600 Thanks Dave
(In reply to Dave Young from comment #21) > Hi, Paul > > Seems cat /etc/sysconfig/kdump content is wrong, it is /etc/kdump.conf > instead. > > Could you try longer rd.retry? the unit is second. ie. rd.retry=600 > > Thanks > Dave ruyang, Apologies for the cut/paste mistake :/ Below is the current /etc/sysconfig/kdump, I will try rd.retry=600 - as suggested... ------------------------------------------ /etc/sysconfig/kdump ------------------------------------------ [root@hawk604a ~]# cat /etc/sysconfig/kdump # Kernel Version string for the -kdump kernel, such as 2.6.13-1544.FC5kdump # If no version is specified, then the init script will try to find a # kdump kernel with the same version number as the running kernel. KDUMP_KERNELVER="" # The kdump commandline is the command line that needs to be passed off to # the kdump kernel. This will likely match the contents of the grub kernel # line. For example: # KDUMP_COMMANDLINE="ro root=LABEL=/" # Dracut depends on proper root= options, so please make sure that appropriate # root= options are copied from /proc/cmdline. In general it is best to append # command line options using "KDUMP_COMMANDLINE_APPEND=". # If a command line is not specified, the default will be taken from # /proc/cmdline KDUMP_COMMANDLINE="" # This variable lets us append arguments to the current kdump commandline # As taken from either KDUMP_COMMANDLINE above, or from /proc/cmdline #KDUMP_COMMANDLINE_APPEND="irqpoll nr_cpus=1 reset_devices cgroup_disable=memory mce=off numa=off udev.children-max=2 panic=10 rootflags=nofail acpi_no_memhotplug transparent_hugepage=never" KDUMP_COMMANDLINE_APPEND="irqpoll nr_cpus=4 reset_devices cgroup_disable=memory mce=off numa=off udev.children-max=2 panic=10 rootflags=nofail acpi_no_memhotplug transparent_hugepage=never rd.retry=300" # Any additional kexec arguments required. 
In most situations, this should # be left empty # # Example: # KEXEC_ARGS="--elf32-core-headers" KEXEC_ARGS="" #Where to find the boot image #KDUMP_BOOTDIR="/boot" #What is the image type used for kdump KDUMP_IMG="vmlinuz" #What is the images extension. Relocatable kernels don't have one KDUMP_IMG_EXT="" [root@hawk604a ~]# -End best, -pbunyan
ruyang, Testing with rd.retry=600 did not resolve the issue. ---------------------- Here is a console log: ---------------------- ---<-snip->--- [ 112.592260] sd 0:0:0:0: [sda] 195305472 512-byte logical blocks: (99.9 GB/93.1 GiB) [ 112.600780] sd 0:0:0:1: [sdb] 205070336 512-byte logical blocks: (104 GB/97.7 GiB) [ 112.600835] sd 0:0:1:1: [sde] 205070336 512-byte logical blocks: (104 GB/97.7 GiB) [ 112.600862] sd 0:0:1:2: [sdf] 49218740224 512-byte logical blocks: (25.1 TB/22.9 TiB) [ 112.600879] sd 0:0:1:0: [sdd] 195305472 512-byte logical blocks: (99.9 GB/93.1 GiB) [ 112.600979] sd 0:0:0:2: [sdc] 49218740224 512-byte logical blocks: (25.1 TB/22.9 TiB) [ 112.643151] sd 14:0:0:0: [sdg] 195305472 512-byte logical blocks: (99.9 GB/93.1 GiB) [ 112.643166] sd 14:0:1:1: [sdk] 205070336 512-byte logical blocks: (104 GB/97.7 GiB) [ 112.643175] sd 14:0:0:2: [sdi] 49218740224 512-byte logical blocks: (25.1 TB/22.9 TiB) [ 112.643321] sd 14:0:1:2: [sdl] 49218740224 512-byte logical blocks: (25.1 TB/22.9 TiB) [ 112.643358] sd 14:0:1:0: [sdj] 195305472 512-byte logical blocks: (99.9 GB/93.1 GiB) [ 112.643515] sd 0:0:1:0: [sdd] Write Protect is off [ 112.643603] sd 0:0:0:2: [sdc] Write Protect is off [ 112.643712] sd 14:0:0:2: [sdi] Write Protect is off [ 112.643730] sd 0:0:1:0: [sdd] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA [ 112.643736] sd 0:0:0:1: [sdb] Write Protect is off [ 112.643827] sd 0:0:1:1: [sde] Write Protect is off [ 112.643840] sd 0:0:0:2: [sdc] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA [ 112.643912] sd 14:0:0:2: [sdi] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA [ 112.643929] sd 0:0:0:1: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA [ 112.643982] sd 14:0:1:2: [sdl] Write Protect is off [ 112.643991] sd 14:0:1:0: [sdj] Write Protect is off [ 112.644053] sd 0:0:0:0: [sda] Write Protect is off [ 112.644117] sd 0:0:1:1: [sde] Write cache: enabled, read 
cache: enabled, doesn't support DPO or FUA [ 112.644132] sd 14:0:1:1: [sdk] Write Protect is off [ 112.644141] sd 0:0:1:2: [sdf] Write Protect is off [ 112.644235] sd 14:0:1:0: [sdj] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA [ 112.644246] sd 14:0:1:2: [sdl] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA [ 112.644249] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA [ 112.644364] sd 14:0:1:1: [sdk] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA [ 112.644367] sd 0:0:1:2: [sdf] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA [ 112.645397] sd 14:0:0:1: [sdh] 205070336 512-byte logical blocks: (104 GB/97.7 GiB) [ 112.646193] sd 14:0:0:1: [sdh] Write Protect is off [ 112.646370] sd 14:0:0:1: [sdh] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA [ 112.683209] sdk: sdk1 sdk2 sdk3 [ 112.683281] sdb: sdb1 sdb2 sdb3 [ 112.683317] sde: sde1 sde2 sde3 [ 112.683404] sdh: sdh1 sdh2 sdh3 [ 112.683951] sd 14:0:0:1: [sdh] Attached SCSI disk [ 112.688576] sd 0:0:1:1: [sde] Attached SCSI disk [ 112.689087] sdj: sdj1 sdj2 sdj3 [ 112.689128] sda: sda1 sda2 sda3 [ 112.689189] sdd: sdd1 sdd2 sdd3 [ 112.690010] sd 0:0:0:0: [sda] Attached SCSI disk [ 112.690022] sd 0:0:1:0: [sdd] Attached SCSI disk [ 112.690101] sd 14:0:1:0: [sdj] Attached SCSI disk [ 112.715975] sdc: sdc1 [ 112.716033] sdi: sdi1 [ 112.716135] sdl: sdl1 [ 112.716178] sdf: sdf1 [ 112.716831] sd 14:0:0:2: [sdi] Attached SCSI disk [ 112.716911] sd 14:0:1:2: [sdl] Attached SCSI disk [ 112.716955] sd 0:0:1:2: [sdf] Attached SCSI disk [ 112.716983] sd 0:0:0:2: [sdc] Attached SCSI disk [ 112.944656] sd 14:0:1:1: [sdk] Attached SCSI disk [ 112.945268] sd 0:0:0:1: [sdb] Attached SCSI disk [ 112.946796] sd 14:0:0:0: [sdg] Write Protect is off [ 112.946908] sd 14:0:0:0: [sdg] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA [ 112.971532] sdg: sdg1 sdg2 sdg3 [ 
112.975682] sd 14:0:0:0: [sdg] Attached SCSI disk [ 113.088242] device-mapper: multipath service-time: version 0.2.0 loaded [ OK ] Found device /dev/disk/by-uuid/3EBF-1CBC. [ OK ] Found device /dev/disk/by-uuid/6e0da585-542f-4556-9477-ab84407a32e9. [ OK ] Found device /dev/mapper/rhel_hawk604a-root. Starting File System Check on /dev/mapper/rhel_hawk604a-root... [ OK ] Started File System Check on /dev/mapper/rhel_hawk604a-root. [ TIME ] Timed out waiting for device dev-mapper-mpathc1.device. [DEPEND] Dependency failed for /kdumproot/mnt/hpstorage. [DEPEND] Dependency failed for Initrd Root File System. [DEPEND] Dependency failed for Reload Configuration from the Real Root. [DEPEND] Dependency failed for File System Check on /dev/mapper/mpathc1. [ OK ] Found device /dev/disk/by-uuid/e1da41b9-bd30-4079-a3c7-0bf2ded9b31c. Starting File System Check on /dev/mapper/mpathc1... [ 101.892176] Starting Setup Virtual Console... [ OK ] Started dracut initqueue hook. [ OK ] Started File System Check on /dev/mapper/mpathc1. Failed to start kdump-error-handler.service: Transaction is destructive. [ 114.436096] SGI XFS with ACLs, security attributes, no debug enabled [ 114.438705] XFS (dm-7): Mounting V4 Filesystem Mountin Mounting /sysroot... [ OK ] Reached target Remote File Sys[ 114.468647] XFS (dm-5): Mounting V4 Filesystem tems (Pre). [ OK ] Reached target Remote File Systems. [FAILED] Failed to start Kdump Emergency. See 'systemctl status emergency.service' for details. [DEPEND] Dependency failed for Emergency Mode. [ OK ] Started Setup Virtual Console. [ 114.554311] XFS (dm-5): Starting recovery (logdev: internal) [ 114.613881] XFS (dm-7): Starting recovery (logdev: internal) [ 114.634916] XFS (dm-5): Ending recovery (logdev: internal) [ OK ] Mounted /sysroot. [ 114.792887] XFS (dm-7): Ending recovery (logdev: internal) [ OK ] Mounted /kdumproot/mnt/hpstorage. ---<-snip->--- As I stated previously, all configs were the same. 
We updated the kernel from 3.10.0-327.el7.x86_64 to 3.10.0-327.18.2.el7.x86_64 for this test run. I will have to dig into the kernel changelog to see if something jumps out at me... ruyang - any insight you may have regarding this time sensitive EET (Extended Engineering Testing) test run would be appreciated. Best, -pbunyan
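When comparing console captures like the two above, it can help to pull out only the systemd status lines that signal trouble. The sketch below is a hypothetical helper, not part of kdump or dracut; it assumes the log has one message per line, which real serial-console output does not guarantee (kernel messages are interleaved in the captures above). The sample lines are copied from this thread.

```python
import re

# Matches status tags such as "[ TIME ]", "[DEPEND]" and "[FAILED]".
STATUS_LINE = re.compile(r"^\[\s*(TIME|DEPEND|FAILED)\s*\]\s+(.*)$")


def failed_units(console_text):
    """Return (tag, message) pairs for timeout and failure lines."""
    hits = []
    for line in console_text.splitlines():
        match = STATUS_LINE.match(line.strip())
        if match:
            hits.append((match.group(1), match.group(2)))
    return hits


SAMPLE = """\
[  OK  ] Found device /dev/mapper/rhel_hawk604a-root.
[ TIME ] Timed out waiting for device dev-mapper-mpathc1.device.
[DEPEND] Dependency failed for /kdumproot/mnt/hpstorage.
[DEPEND] Dependency failed for Initrd Root File System.
[FAILED] Failed to start Kdump Emergency.
[  OK  ] Mounted /kdumproot/mnt/hpstorage.
"""

if __name__ == "__main__":
    for tag, message in failed_units(SAMPLE):
        print(f"{tag:7s} {message}")
```

In both captures the first hit is the timeout on dev-mapper-mpathc1.device, and the dependency failures follow from it.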
Seems systemd threw an error, maybe for: [DEPEND] Dependency failed for File System Check on /dev/mapper/mpathc1 Then the kdump error handler service tried to start, but it failed to start. But the target device mpathc1 was up after a while; I'm not sure why the emergency service was started without waiting for mpathc1. Harald, do you have any thoughts about this issue? Thanks Dave
quick guess: "udev.children-max=2" because of that, mpathc1 doesn't get the udev SYSTEMD_READY flag set in time. Why "udev.children-max=2" ? Any reason for this strange restriction? If there are issues with udev and cpus, maybe comment out: # CPU hotadd request # SUBSYSTEM=="cpu", ACTION=="add", TEST=="online", ATTR{online}=="0", ATTR{online}="1" in /lib/udev/rules.d/40-redhat.rules
udev.children-max=2 is for avoiding oom, to confirm Harald's concern, Paul, can you test again by removing udev.children-max=2 in kdump sysconfig file?
(In reply to Dave Young from comment #28) > udev.children-max=2 is for avoiding oom, to confirm Harald's concern, Paul, > can you test again by removing udev.children-max=2 in kdump sysconfig file? All, [] I retested with kernel-3.10.0-327.el7.x86_64 and was able to successfully crash and capture a vmcore file. hmmmm... [] I then retested with kernel-3.10.0-327.18.2.el7.x86_64 and kexec-tools-2.0.7-38.el7_2.1.x86_64 - the system failed in the same manner :/ [] Then, as suggested... I removed the udev.children-max=2 from the "KDUMP_COMMANDLINE_APPEND=" setting in /etc/sysconfig/kdump. I retested with kernel-3.10.0-327.18.2.el7.x86_64 and kexec-tools-2.0.7-38.el7.x86_64 - success!! I was able to successfully crash the system and capture/analyse the vmcore file. Best, -pbunyan
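The workaround amounts to dropping one parameter from the active KDUMP_COMMANDLINE_APPEND line. A minimal sketch of that edit follows, assuming the double-quoted single-line form shown in /etc/sysconfig/kdump earlier in this thread; drop_kernel_param is a hypothetical helper and works on text only, so writing the file back and restarting the kdump service are left to the operator.

```python
import re

APPEND_LINE = re.compile(r'^(KDUMP_COMMANDLINE_APPEND=")(.*)(")\s*$')


def drop_kernel_param(config_text, name):
    """Remove `name` (with or without =value) from the active
    KDUMP_COMMANDLINE_APPEND line; commented-out lines are untouched."""
    out = []
    for line in config_text.splitlines():
        match = APPEND_LINE.match(line)
        if match:
            params = [p for p in match.group(2).split()
                      if p != name and not p.startswith(name + "=")]
            line = match.group(1) + " ".join(params) + match.group(3)
        out.append(line)
    return "\n".join(out) + "\n"


SAMPLE = (
    '#KDUMP_COMMANDLINE_APPEND="irqpoll udev.children-max=2"\n'
    'KDUMP_COMMANDLINE_APPEND="irqpoll nr_cpus=4 udev.children-max=2 '
    'panic=10 rd.retry=300"\n'
)

if __name__ == "__main__":
    print(drop_kernel_param(SAMPLE, "udev.children-max"), end="")
```

Splitting on whitespace is enough here because none of the parameters in this configuration contain quoted spaces.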
All, Current testing status... RHEL7.2Z 24TB RAM 768CPU kernel-3.10.0-327.18.2.el7.x86_64 ====================================== FUNDAMENTALS: PBunyan ====================================== EET x86_64 Baremetal - ** PASS ** EET x86_64 Xen - N/A EET x86_64 KVM - ** PASS ** EET x86_64 Kdump - ** PASS - KBASE REQUIRED ** ====================================== PERFORMANCE: BMarson ====================================== x86_64 Linpack - ** PASS ** x86_64 Stream - ** PASS ** https://bugzilla.redhat.com/show_bug.cgi?id=1346327#c17 ====================================== LLOAD: LWoodman ====================================== x86_64 Lload - in progress... Best, -pbunyan
(In reply to PaulB from comment #29) > (In reply to Dave Young from comment #28) > > udev.children-max=2 is for avoiding oom, to confirm Harald's concern, Paul, > > can you test again by removing udev.children-max=2 in kdump sysconfig file? > > > All, > [] I retested with kernel-3.10.0-327.el7.x86_64 and was > able to successfully crash and capture a vmcorefile. > hmmmm... > > [] I then retested with kernel-3.10.0-327.18.2.el7.x86 and > kexec-tools-2.0.7-38.el7_2.1.x86_64 - the system failed > in the same manner :/ > > [] Then, as suggested... > I removed the udev.children-max=2 from the > "KDUMP_COMMANDLINE_APPEND=" setting in /etc/sysconfig/kdump. > > I retested with kernel-3.10.0-327.18.2.el7.x86_64 and > kexec-tools-2.0.7-38.el7.x86_64 - success!! > I was able to successfully crash the system and capture/analyse > the vmcore file. > Paul, glad to know it works, though I still need to figure out why udev.children-max=2 can cause the failure. It really should wait no matter how many udev threads are being used. Thanks Dave
(In reply to Dave Young from comment #31) > > Paul, glad to know it works though I still need to figure out why > udev.children-max=2 can cause the failure. It really should wait no matter > how many udev threads being used. > > Thanks > Dave All, That being said... As long as a KBASE article can be approved and written for removing udev.children-max=2 from the "KDUMP_COMMANDLINE_APPEND=" setting in /etc/sysconfig/kdump, I can PASS the Fundamentals stage of EET testing. If a KBASE article cannot be written - the system fails EET. Adding Gary Case for KBASE blessing. Best, -pbunyan
(In reply to Harald Hoyer from comment #26) > quick guess: "udev.children-max=2" > > because of that, mpathc1 doesn't get the udev SYSTEMD_READY flag set in time. Harald, Paul has confirmed that dropping udev.children-max=2 works, but why does it not wait for SYSTEMD_READY? Can you give some hints on where the timeout value is set and how we can connect it to rd.retry? Thanks Dave
We would need to get Support's take on this as well. Without the change you won't have a functional kdump, and that directly impacts their ability to support customers. It would also be nice to know why this option causes kdump to fail.
The systemd.mount manpage describes the following option: x-systemd.device-timeout= Paul, could you try the following? * keep the udev.children_max=2 in sysconfig * add rd.retry=600 * add x-systemd.device-timeout=600s If this works we may set x-systemd.device-timeout equal to rd.retry by default in case the user does not specify it in fstab. Thanks Dave
rd.retry should be added to sysconfig x-systemd.device-timeout=600s should be added to /etc/fstab in the mount options of the multipath device for kdump
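As a sketch of the fstab half of this suggestion, the helper below appends a mount option to the fourth field of an fstab entry. The entry used is the one for the dump target quoted elsewhere in this thread; add_mount_option is a hypothetical name, and the function only transforms a string, it does not edit /etc/fstab.

```python
def add_mount_option(fstab_line, option):
    """Return the fstab line with `option` added to its mount options.

    An existing option with the same key is replaced. Comment lines and
    lines with fewer than four fields are returned unchanged.
    """
    fields = fstab_line.split()
    if len(fields) < 4 or fields[0].startswith("#"):
        return fstab_line
    key = option.split("=", 1)[0]
    opts = [o for o in fields[3].split(",") if o.split("=", 1)[0] != key]
    opts.append(option)
    fields[3] = ",".join(opts)
    return " ".join(fields)


ENTRY = ("UUID=e1da41b9-bd30-4079-a3c7-0bf2ded9b31c "
         "/mnt/hpstorage xfs defaults 0 0")

if __name__ == "__main__":
    print(add_mount_option(ENTRY, "x-systemd.device-timeout=600s"))
```

The function collapses column alignment to single spaces, which fstab accepts since fields may be separated by any whitespace.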
Larry, how did your testing go? Pass/Fail?
Nigel, Once LarryW has finished his testing, it seems we need to debug the kdump issue further. Will there be time on the HP schedule to allow for looking into the kdump issue? Best, -pbunyan
How much time do you need? Is it something that can be completed today? If not, we will have to reschedule time with the 24TB system in the future. -Nigel
(In reply to Dave Young from comment #36) > systemd.mount manpage says about below option: > x-systemd.device-timeout= > > Paul, could you give another try below? > * keep the udev.children_max=2 in sysconfig > * add rd.retry=600 > * add x-systemd.device-timeout=600s > > If this works we may add x-systemd.device-timeout equal to rd.retry by > default in case user does not specify it in fstab. > > Thanks > Dave Dave, Testing with the requested configuration fails in the same manner as noted here: https://bugzilla.redhat.com/show_bug.cgi?id=1346327#c23 ======================= Note: This config fails ======================= --------------------- /etc/sysconfig/kdump: --------------------- KDUMP_COMMANDLINE_APPEND="irqpoll nr_cpus=1 reset_devices cgroup_disable=memory mce=off numa=off udev.children-max=2 panic=10 rootflags=nofail acpi_no_memhotplug transparent_hugepage=never rd.retry=600 x-systemd.device-timeout=600s" ------------------ cat /proc/cmdline: ------------------ BOOT_IMAGE=/vmlinuz-3.10.0-327.18.2.el7.x86_64 root=/dev/mapper/rhel_hawk604a-root ro crashkernel=512M,high rd.lvm.lv=rhel_hawk604a/root rd.lvm.lv=rhel_hawk604a/swap console=ttyS0,115200n81 ---------------- /etc/kdump.conf: ---------------- xfs UUID=e1da41b9-bd30-4079-a3c7-0bf2ded9b31c path /dumpit/here core_collector makedumpfile -l --message-level 1 -c -d 31 ----------- /etc/fstab: ----------- UUID=e1da41b9-bd30-4079-a3c7-0bf2ded9b31c /mnt/hpstorage xfs defaults 0 0 Best, -pbunyan
Nigel, As a KBASE article cannot be used to justify the kdump configuration that includes removing udev.children-max=2 from KDUMP_COMMANDLINE_APPEND line in /etc/sysconfig/kdump, kdump testing is considered a FAIL. Therefore, the Fundamentals stage of EET testing has FAILED. Best, -pbunyan
Paul, The x-systemd.device-timeout= param is for /etc/fstab mount options, adding it to sysconfig does not help. Do you still have the machine in hand? Can we use it to test it again? Thanks Dave
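The mix-up described above (a mount option placed on the kdump kernel command line) can be caught with a quick check. The following is only a minimal sketch, assuming the KDUMP_COMMANDLINE_APPEND value has already been read from /etc/sysconfig/kdump into a string; the function name misplaced_mount_options is hypothetical and not part of kexec-tools.

```python
def misplaced_mount_options(cmdline_append):
    """Return tokens that look like fstab mount options, not kernel parameters."""
    # Per the comment above, x-systemd.* options belong in the mount options
    # column of /etc/fstab; as bare tokens in KDUMP_COMMANDLINE_APPEND they
    # did not help in this test.
    return [tok for tok in cmdline_append.split() if tok.startswith("x-systemd.")]


# The KDUMP_COMMANDLINE_APPEND value from the failing configuration in this BZ.
failing = (
    "irqpoll nr_cpus=1 reset_devices cgroup_disable=memory mce=off numa=off "
    "udev.children-max=2 panic=10 rootflags=nofail acpi_no_memhotplug "
    "transparent_hugepage=never rd.retry=600 x-systemd.device-timeout=600s"
)
print(misplaced_mount_options(failing))
```

An empty list means no x-systemd.* token was found on the command line.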
(In reply to PaulB from comment #42) > Nigel, > As a KBASE article cannot be used to justify the kdump configuration that > includes removing udev.children-max=2 from KDUMP_COMMANDLINE_APPEND line in > /etc/sysconfig/kdump, kdump testing is considered a FAIL. > > Therefore, the Fundamentals stage of EET testing has FAILED. > > > Best, > -pbunyan Paul: It looks like a systemd deficiency for which there is a workaround. So what's our recourse if not a kbase? thanks, tom
We can utilize systemd "x-systemd.device-timeout=" parameter to address the issue. We just tested on Paul's machine, and it works. As an example, add 700s timeout in /etc/fstab: /dev/mapper/rhel-root / xfs defaults,x-systemd.device-timeout=700s 0 0 Then "touch /etc/kdump.conf" and "kdumpctl restart", after this kdump will use the new fstab options, so in kdump kernel, systemd will find "x-systemd.device-timeout" and use the timeout specified to wait the target ready. We suggest it as a solution.
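The fstab change suggested above can be sketched as a small helper. This is an illustration only, assuming one whitespace-separated fstab entry per string; the function name add_device_timeout is hypothetical, and the helper normalizes field separators to single spaces.

```python
def add_device_timeout(fstab_line, timeout="700s"):
    """Append x-systemd.device-timeout=<timeout> to one fstab entry's options."""
    fields = fstab_line.split()
    if len(fields) < 4 or fields[0].startswith("#"):
        return fstab_line  # blank, comment, or malformed line: leave untouched
    options = fields[3].split(",")
    # Drop any existing timeout so the option is not duplicated.
    options = [o for o in options if not o.startswith("x-systemd.device-timeout=")]
    options.append("x-systemd.device-timeout=" + timeout)
    fields[3] = ",".join(options)
    return " ".join(fields)


# Dump target entry from this BZ, with plain "defaults" mount options.
entry = "UUID=e1da41b9-bd30-4079-a3c7-0bf2ded9b31c /mnt/hpstorage xfs defaults 0 0"
print(add_device_timeout(entry))
```

After the entry is updated in /etc/fstab, the steps from the comment above still apply: "touch /etc/kdump.conf" and "kdumpctl restart" so that the kdump initramfs picks up the new mount options.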
(In reply to Xunlei Pang from comment #45) > We can utilize systemd "x-systemd.device-timeout=" parameter to address the > issue. We just tested on Paul's machine, and it works. > > As an example, add 700s timeout in /etc/fstab: > /dev/mapper/rhel-root / xfs defaults,x-systemd.device-timeout=700s 0 0 > > Then "touch /etc/kdump.conf" and "kdumpctl restart", after this kdump will > use the new fstab options, so in kdump kernel, systemd will find > "x-systemd.device-timeout" and use the timeout specified to wait the target > ready. > > > We suggest it as a solution. Xunlei Pang, Actually, the fstab entry was a bit more detailed. We set the fstab entry, as follows: [root@hawk604a here]# cat /etc/fstab UUID=e1da41b9-bd30-4079-a3c7-0bf2ded9b31c /mnt/hpstorage xfs rw,relatime,seclabel,attr2,inode64,noquota,x-systemd.device-timeout=700s 0 0 Why? - As you explained to me, kdump uses data from here: [root@hawk604a here]# findmnt --fstab TARGET SOURCE FSTYPE OPTIONS / /dev/mapper/rhel_hawk604a-root xfs defaults /boot UUID=6e0da585-542f-4556-9477-ab84407a32e9 xfs defaults /boot/efi UUID=3EBF-1CBC vfat umask=0077,shortname=winnt /home /dev/mapper/rhel_hawk604a-home xfs defaults swap /dev/mapper/rhel_hawk604a-swap swap defaults /mnt/hpstorage UUID=e1da41b9-bd30-4079-a3c7-0bf2ded9b31c xfs rw,relatime,seclabel,attr2,inode64,noquota,x-systemd.device-timeout=700s =========================================== Note: Kdump WORKS with the following config =========================================== --------------------------------- First - remember this known issue --------------------------------- Speaking with Nigel on this an previous testing, I was made aware of the following known issue: https://bugzilla.redhat.com/show_bug.cgi?id=1311226#c27 Bug 1123039 - [HP HPS 7.1 Bug] Crashkernel boot failure, out of memory, when crashkernel=512M,high ---<-snip->--- -The following setting was required: kernel command line: crashkernel=512M,high -/etc/sysconfig/kdump KDUMP_COMMANDLINE_APPEND: 
s/nr_cpus=1/nr_cpus=4 ---<-snip->--- --------------------- /etc/sysconfig/kdump: --------------------- KDUMP_COMMANDLINE_APPEND="irqpoll nr_cpus=4 reset_devices cgroup_disable=memory mce=off numa=off udev.children-max=2 panic=10 rootflags=nofail acpi_no_memhotplug transparent_hugepage=never rd.retry=600" ------------------ cat /proc/cmdline: ------------------ BOOT_IMAGE=/vmlinuz-3.10.0-327.18.2.el7.x86_64 root=/dev/mapper/rhel_hawk604a-root ro crashkernel=512M,high rd.lvm.lv=rhel_hawk604a/root rd.lvm.lv=rhel_hawk604a/swap console=ttyS0,115200n81 ---------------- /etc/kdump.conf: ---------------- xfs UUID=e1da41b9-bd30-4079-a3c7-0bf2ded9b31c path /dumpit/here core_collector makedumpfile -l --message-level 1 -c -d 31 ----------- /etc/fstab: ----------- UUID=e1da41b9-bd30-4079-a3c7-0bf2ded9b31c /mnt/hpstorage xfs rw,relatime,seclabel,attr2,inode64,noquota,x-systemd.device-timeout=700s 0 0 At this point the status of the Fundamentals stage of testing is dependent on the following: Is this configuration/workaround acceptable by PM and can a KBASE article be written and approved? Adding needinfo from Gary Case for KBASE requirement. Best, -pbunyan
(In reply to Tom Vaden from comment #44) > (In reply to PaulB from comment #42) > > Nigel, > > As a KBASE article cannot be used to justify the kdump configuration that > > includes removing udev.children-max=2 from KDUMP_COMMANDLINE_APPEND line in > > /etc/sysconfig/kdump, kdump testing is considered a FAIL. > > > > Therefore, the Fundamentals stage of EET testing has FAILED. > > > > > > Best, > > -pbunyan > > Paul: > > It looks like a systemd deficiency for which there is a workaround. > So what's our recourse if not a kbase? > > thanks, > tom Tom, A KBASE will be required. We will need to await Gary Case's reply. Best, -pbunyan
(In reply to PaulB from comment #46) > (In reply to Xunlei Pang from comment #45) > > We can utilize systemd "x-systemd.device-timeout=" parameter to address the > > issue. We just tested on Paul's machine, and it works. > > > > As an example, add 700s timeout in /etc/fstab: > > /dev/mapper/rhel-root / xfs defaults,x-systemd.device-timeout=700s 0 0 > > > > Then "touch /etc/kdump.conf" and "kdumpctl restart", after this kdump will > > use the new fstab options, so in kdump kernel, systemd will find > > "x-systemd.device-timeout" and use the timeout specified to wait the target > > ready. > > > > > > We suggest it as a solution. > > Xunlei Pang, > Actually, the fstab entry was a bit more detailed. > > We set the fstab entry, as follows: > [root@hawk604a here]# cat /etc/fstab > UUID=e1da41b9-bd30-4079-a3c7-0bf2ded9b31c /mnt/hpstorage xfs > rw,relatime,seclabel,attr2,inode64,noquota,x-systemd.device-timeout=700s 0 0 > Hi Paul, The detailed "rw,relatime,seclabel,attr2,inode64,noquota" was actually copied from the output of previous findmnt, I think it is ok to use "defaults" instead in /etc/fstab for most cases, that is, "defaults,x-systemd.device-timeout=700s". Regards, Xunlei > > Why? 
- As you explained to me, kdump uses data from here: > [root@hawk604a here]# findmnt --fstab > TARGET SOURCE FSTYPE OPTIONS > / /dev/mapper/rhel_hawk604a-root xfs defaults > /boot UUID=6e0da585-542f-4556-9477-ab84407a32e9 xfs defaults > /boot/efi UUID=3EBF-1CBC vfat > umask=0077,shortname=winnt > /home /dev/mapper/rhel_hawk604a-home xfs defaults > swap /dev/mapper/rhel_hawk604a-swap swap defaults > /mnt/hpstorage UUID=e1da41b9-bd30-4079-a3c7-0bf2ded9b31c xfs > rw,relatime,seclabel,attr2,inode64,noquota,x-systemd.device-timeout=700s > > > =========================================== > Note: Kdump WORKS with the following config > =========================================== > --------------------------------- > First - remember this known issue > --------------------------------- > Speaking with Nigel on this an previous testing, I was made aware of the > following known issue: > https://bugzilla.redhat.com/show_bug.cgi?id=1311226#c27 > > Bug 1123039 - [HP HPS 7.1 Bug] Crashkernel boot failure, out of memory, when > crashkernel=512M,high > ---<-snip->--- > -The following setting was required: > kernel command line: crashkernel=512M,high > -/etc/sysconfig/kdump KDUMP_COMMANDLINE_APPEND: s/nr_cpus=1/nr_cpus=4 > ---<-snip->--- > > --------------------- > /etc/sysconfig/kdump: > --------------------- > KDUMP_COMMANDLINE_APPEND="irqpoll nr_cpus=4 reset_devices > cgroup_disable=memory mce=off numa=off udev.children-max=2 panic=10 > rootflags=nofail acpi_no_memhotplug transparent_hugepage=never rd.retry=600" > > ------------------ > cat /proc/cmdline: > ------------------ > BOOT_IMAGE=/vmlinuz-3.10.0-327.18.2.el7.x86_64 > root=/dev/mapper/rhel_hawk604a-root ro crashkernel=512M,high > rd.lvm.lv=rhel_hawk604a/root rd.lvm.lv=rhel_hawk604a/swap > console=ttyS0,115200n81 > > ---------------- > /etc/kdump.conf: > ---------------- > xfs UUID=e1da41b9-bd30-4079-a3c7-0bf2ded9b31c > path /dumpit/here > core_collector makedumpfile -l --message-level 1 -c -d 31 > > ----------- > 
/etc/fstab: > ----------- > UUID=e1da41b9-bd30-4079-a3c7-0bf2ded9b31c /mnt/hpstorage xfs > rw,relatime,seclabel,attr2,inode64,noquota,x-systemd.device-timeout=700s 0 0 > > > At this point the status of the Fundamentals stage of testing is dependent > on the following: > Is this configuration/workaround acceptable by PM and can a KBASE article be > written and approved? > > Adding needinfo from Gary Case for KBASE requirement. > > Best, > -pbunyan
I am not giving the official answer. But I just got a text message from Larry (who is at Red Hat Summit). We passed LLoad testing. -Nigel
Hello, This is current kbase. Please let me know if you would like anything added or changed for this? ######## Subject: HP Integrity Superdome X kdump option required in large configurations Environment HP Integrity Superdome X Large configuration RHEL 7.2.z Issue In a HP Integrity Superdome X with RHEL7.2.Z, 24TB RAM, and 768CPU kdump requires udev.children-max=2 in /etc/sysconfig/kdump. This udev.children-max=2 was added to the default kdump February 2013. This option limits the udev threads to 2. Resolution Edit /etc/sysconfig/kdump and confirm udev.children-max=2 is listed in KDUMP_COMMANDLINE_APPEND KDUMP_COMMANDLINE_APPEND="irqpoll nr_cpus=4 reset_devices cgroup_disable=memory mce=off numa=off udev.children-max=2 panic=10 rootflags=nofail acpi_no_memhotplug transparent_hugepage=never rd.retry=300" ######### Thank You Joe Kachuck
(In reply to Joseph Kachuck from comment #51) > Hello, > This is current kbase. Please let me know if you would like anything added > or changed for this? Hi Joseph, I think you must be missing something, we now resort to systemd's "x-systemd.device-timeout" parameter in /etc/fstab in 1st kernel, that is: Add an entry into "/etc/fstab" for the dump target, and specify an extra "x-systemd.device-timeout=700" mount option for this entry, then rebuild the kdump initramfs. Regards, Xunlei > > ######## > Subject: > HP Integrity Superdome X kdump option required in large configurations > > Environment > > HP Integrity Superdome X Large configuration > RHEL 7.2.z > > Issue > > In a HP Integrity Superdome X with RHEL7.2.Z, 24TB RAM, and 768CPU kdump > requires udev.children-max=2 in /etc/sysconfig/kdump. > This udev.children-max=2 was added to the default kdump February 2013. > This option limits the udev threads to 2. > > Resolution > > Edit /etc/sysconfig/kdump and confirm udev.children-max=2 is listed in > KDUMP_COMMANDLINE_APPEND > > KDUMP_COMMANDLINE_APPEND="irqpoll nr_cpus=4 reset_devices > cgroup_disable=memory mce=off numa=off udev.children-max=2 panic=10 > rootflags=nofail acpi_no_memhotplug transparent_hugepage=never rd.retry=300" > ######### > > Thank You > Joe Kachuck
Hello, From looking at comment 45 would this be a preferred kbase? Please confirm if this is correct. Thank You Joe Kachuck
(In reply to Joseph Kachuck from comment #53) > Hello, > From looking at comment 45 would this be a preferred kbase? > > Please confirm if this is correct. > > Thank You > Joe Kachuck Joe, No. Please use the following comment, as a reference for writing the kbase: https://bugzilla.redhat.com/show_bug.cgi?id=1346327#c46 Best, -pbunyan
Hello Paul, This is the new kbase. Since the udev.children-max=2 option appeared to be included in comment 46, I have left it in the kbase. Please let me know if this looks better. ######## Subject: HP Integrity Superdome X kdump options required in large configurations Environment HP Integrity Superdome X Large configuration RHEL 7.2.z Issue In a HP Integrity Superdome X with RHEL7.2.Z, 24TB RAM, and 768CPU kdump requires additional options for kdump to work correctly. udev.children-max=2 should be added to /etc/sysconfig/kdump. This udev.children-max=2 was added to the default kdump February 2013. This option limits the udev threads to 2. x-systemd.device-timeout=700s should be added to the kdump mount point. Resolution Edit /etc/sysconfig/kdump and confirm udev.children-max=2 is listed in KDUMP_COMMANDLINE_APPEND KDUMP_COMMANDLINE_APPEND="irqpoll nr_cpus=4 reset_devices cgroup_disable=memory mce=off numa=off udev.children-max=2 panic=10 rootflags=nofail acpi_no_memhotplug transparent_hugepage=never rd.retry=300" Edit /etc/kdump.conf add the correct path xfs UUID=e1da41b9-bd30-4079-a3c7-0bf2ded9b31c path /dumpit/here core_collector makedumpfile -l --message-level 1 -c -d 31 Edit /etc/fstab UUID=e1da41b9-bd30-4079-a3c7-0bf2ded9b31c /mnt/storage xfs rw,relatime,seclabel,attr2,inode64,noquota,x-systemd.device-timeout=700s 0 0 Then run the command "kdumpctl restart" ######### Thank You Joe Kachuck
HP / Nigel, We are discussing the KBASE article internally. Once approved JoeK will add comment. Best, -pbunyan
Thank you Paul and Red Hat. We await your posting. -Nigel
Hello, The kbase for this issue has now been published: https://access.redhat.com/solutions/2438911 Please let me know in email if this needs any changes. Thank You Joe Kachuck
I have finished my EET testing of the 24TB RAM 768CPU HP Integrity Superdome X running RHEL7.2.z. I was able to successfully consume and even over-commit all the RAM on every CPU, invoking a storm of OOM kills on many CPUs at the same time. The system successfully killed the necessary processes to continue running without hanging or pausing for an excessive amount of time. This even worked OK when all or most of the memory was allocated on a different NUMA node than the node it was executing on, a stress test that has been problematic in the past on large systems. In addition, I was able to consume all the memory in the pagecache and then apply a heavy anonymous workload. The system successfully reclaimed all or most of the pagecache memory even when the underlying files were mmap()'d into the address space of active processes, once again a stress test that proved problematic in the past on large HP and other systems. At this point I would say that Red Hat can officially support this system running RHEL7.2.z. The only thing I would probably say in a release note is that reclaiming lots of pagecache memory for several very large anonymous memory regions backed by Transparent Huge Pages (THP) can cause the system to pause for several seconds and even encounter soft lockups on a system this large. If this happens and the resulting pauses and/or soft lockups are problematic, disabling THP will eliminate it. The reasons for the pauses/lockups are: 1.) THP 2MB pages allow the memory demand to be up to 512 times greater than 4KB small pages. 2.) The page reclaim code must defragment memory zones and break the 2MB pages into 512 individual 4KB pages in order to reclaim them. Larry Woodman
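The 512x factor mentioned above follows directly from the page sizes, and the current THP mode can be read from /sys/kernel/mm/transparent_hugepage/enabled, where the active mode is shown in brackets. The following is a minimal sketch of both; the function names are hypothetical, and the parser works on text passed in rather than reading sysfs itself.

```python
THP_SIZE = 2 * 1024 * 1024   # 2MB transparent huge page
BASE_PAGE_SIZE = 4 * 1024    # 4KB small page


def small_pages_per_thp():
    """Number of 4KB pages that one 2MB huge page is split into on reclaim."""
    return THP_SIZE // BASE_PAGE_SIZE


def active_thp_mode(sysfs_text):
    """Return the bracketed mode from text such as 'always madvise [never]'."""
    for word in sysfs_text.split():
        if word.startswith("[") and word.endswith("]"):
            return word[1:-1]
    return None  # no bracketed entry found


print(small_pages_per_thp())
print(active_thp_mode("[always] madvise never"))
```

On a live system the text would come from reading the sysfs file; a result of "never" means THP is disabled.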
Joe, The Lload testing also requires a KBASE: https://bugzilla.redhat.com/show_bug.cgi?id=1346327#c60 Best, -pbunyan
(In reply to PaulB from comment #61) > Joe, > The Lload testing also requires a KBASE: > https://bugzilla.redhat.com/show_bug.cgi?id=1346327#c60 > > Best, > -pbunyan Paul: Can we enlarge or use the previous kbase that evoked similar behavior in the RHEL7.1 EET? It is at: https://access.redhat.com/articles/1979103 just a thought, tom
Hello Paul, Would it be acceptable to update kbase 1979103 as noted in comment 62? Thank You Joe Kachuck
(In reply to Joseph Kachuck from comment #63) > Hello Paul, > Would it be acceptable to update kbase 1979103 as noted in comment 62? > > Thank You > Joe Kachuck Joe, There are three stages of EET testing: [] Fundamentals (PBunyan pbunyan) [] Performance (BMarson bmarson) [] Lload (LWoodman lwoodman) That is a KBASE issue for Larry Woodman's Lload testing stage. I would prefer that Larry Woodman answer your question. Adding needinfo from lwoodman. Best, -pbunyan
In regards to comment #62: ------------------------------------------------------------------------------ Paul: Can we enlarge or use the previous kbase that evoked similar behavior in the RHEL7.1 EET? It is at: https://access.redhat.com/articles/1979103 just a thought, tom ------------------------------------------------------------------------------- Yes, please include this system in the scope of "https://access.redhat.com/articles/1979103". No sense in writing the exact same release note for this system. Larry
Hello, The kbase has been updated: https://access.redhat.com/articles/1979103 Thank You Joe Kachuck
Hi Joe, since RHEL7.2 EET kbase is posted, can you please add RHEL7.2 kbase 'Hardware update requires updated version of RHEL to RHEL 7.2' to BL920s Gen9 on RH HCL, https://access.redhat.com/ecosystem/hardware/2165921 thank you, trinh
All, EET testing has completed successfully: EET: RHEL7.2Z 24TB RAM 768CPU HP Integrity Superdome X kernel-3.10.0-327.18.2.el7.x86_64 ====================================== FUNDAMENTALS: PBunyan ====================================== EET x86_64 Baremetal - ** PASS ** EET x86_64 Xen - N/A EET x86_64 KVM - ** PASS ** EET x86_64 Kdump - ** PASS - KBASE REQUIRED ** KBASE: https://access.redhat.com/solutions/2438911 ====================================== PERFORMANCE: BMarson ====================================== x86_64 Linpack - ** PASS ** x86_64 Stream - ** PASS ** https://bugzilla.redhat.com/show_bug.cgi?id=1346327#c17 ====================================== LLOAD: LWoodman ====================================== x86_64 Lload - ** PASS - KBASE REQUIRED ** https://bugzilla.redhat.com/show_bug.cgi?id=1346327#c60 KBASE: https://access.redhat.com/articles/1979103 Best, -pbunyan
Hello, Kbases have been added. Thank You Joe Kachuck
Thank you Paul for all of your efforts here.
(In reply to Nigel Croxon from comment #70) > Thank you Paul for all of your efforts here. ditto
(In reply to Dave Young from comment #28) > udev.children-max=2 is for avoiding oom, to confirm Harald's concern, Paul, > can you test again by removing udev.children-max=2 in kdump sysconfig file? Right, I removed the RAM check back in 2013. So we now have the default value of: children-max = 8 + CPU_COUNT * 2 Do you have a suggestion how to shrink this according to the RAM available? I am thinking of: cpu_max = 8 + CPU_COUNT * 2 ram_max = ... children-max = MIN(cpu_max, ram_max) We used to calculate ram_max with: ram_max = memsize_mb / 8 Any suggestion on how to calculate ram_max ? How much memory is available in the kdump environment?
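The proposed limit above can be written out as a short sketch. It is only an illustration of the formula as stated in the comment (cpu_max = 8 + CPU_COUNT * 2, the former ram_max = memsize_mb / 8, and the minimum of the two); it is not the udev implementation, and the example inputs (4 CPUs from nr_cpus=4, roughly 512MB from crashkernel=512M,high) are assumptions about what the kdump kernel sees.

```python
def children_max(cpu_count, memsize_mb):
    """Proposed udev children-max: the smaller of the CPU and RAM based limits."""
    cpu_max = 8 + cpu_count * 2      # current default, based on CPU count only
    ram_max = memsize_mb // 8        # former RAM-based limit: one child per 8MB
    return min(cpu_max, ram_max)


# kdump kernel in this BZ: nr_cpus=4, about 512MB reserved (assumed inputs).
print(children_max(4, 512))
# Same memory, but with all 768 CPUs visible.
print(children_max(768, 512))
```

The formula is kept as written, so a very small memsize_mb yields 0; a real implementation would presumably clamp the result to at least 1.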
HaraldH/DaveY, This EET BZ is closed. The resolution for kdump was suggested/approved by the kdump team and the KBASE was completed. I would suggest opening a new BZ to troubleshoot/investigate. Add comments #71-73 to the new BZ and reference this BZ. Best, -pbunyan
(In reply to Harald Hoyer from comment #72) > (In reply to Dave Young from comment #28) > > udev.children-max=2 is for avoiding oom, to confirm Harald's concern, Paul, > > can you test again by removing udev.children-max=2 in kdump sysconfig file? > > Right, I removed the RAM check back in 2013. So we now have the default > value of: > > children-max = 8 + CPU_COUNT * 2 > > Do you have a suggestion how to shrink this according to the RAM available? > > I am thinking of: > > cpu_max = 8 + CPU_COUNT * 2 > ram_max = ... > children-max = MIN(cpu_max, ram_max) > > We used to calculate ram_max with: > ram_max = memsize_mb / 8 > > Any suggestion on how to calculate ram_max ? The original value looks like udev will have one thread per 8M of memory as the maximum thread number. I suspect it is too aggressive; during our test some processes like dhclient used a lot of memory. Maybe memsize_mb/128 is a reasonable value. OTOH, even with /128, for 24T ram_max will be 196608, which is too many, so maybe there should be a maximum value for ram_max. > > How much memory is available in the kdump environment? Usually it is 160M+64M/Tb for crashkernel=auto (x86), but one can use a specific value such as crashkernel=512M on the kernel cmdline, as in this bug. For ppc64, since they have 64K pages, they need more memory in the kdump kernel. Thanks Dave
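The numbers in this comment can be reproduced with a few lines. This is a sketch under stated assumptions: the divisor of 128 and the 160M+64M/Tb rule are taken from the comment, while the cap parameter is a hypothetical knob illustrating the suggested "maximum value for ram_max"; no specific cap was proposed in this BZ.

```python
MB_PER_TB = 1024 * 1024


def ram_max(memsize_mb, mb_per_thread=128, cap=None):
    """RAM-based udev thread limit: one thread per mb_per_thread MB, optionally capped."""
    value = memsize_mb // mb_per_thread
    return value if cap is None else min(value, cap)


def crashkernel_auto_mb(system_ram_tb):
    """Rule of thumb from the comment for crashkernel=auto on x86: 160M + 64M per TB."""
    return 160 + 64 * system_ram_tb


print(ram_max(24 * MB_PER_TB))      # first kernel with 24TB of RAM
print(ram_max(512))                 # kdump kernel with crashkernel=512M
print(crashkernel_auto_mb(24))      # crashkernel=auto estimate for 24TB, in MB
```

With the full 24TB the uncapped value is 196608, which is why an upper bound is suggested; inside a 512MB kdump kernel the same rule gives 4.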
Hi PaulB, Can you set the value of "DefaultTimeoutStartSec=700s" in /etc/systemd/system.conf, then have a test? It affects all the service's timeout. If fortunately, I hope it can survive from "Time out waiting for device dev-mapper-mpathc1.device", which cause the failure of kdump service. Thx, Pingfan (In reply to PaulB from comment #20) > [ TIME ] Timed out wa[ 114.023535] device-mapper: multipath service-time: > version 0.2.0 loaded > iting for device dev-mapper-mpathc1.device > [DEPEND] Dependency failed for /kdumproot/mnt/hpstorage. > [DEPEND] Dependency failed for Initrd Root File System. > [DEPEND] Dependency failed for Reload Configuration from the Real Root. > [DEPEND] Dependency failed for File System Check on /dev/mapper/mpathc1. > [ OK ] Stopped Kdump Vmcore Save Service. > [ OK ] Stopped dracut pre-pivot and cleanup hook. > [ OK ] Stopped target Initrd Default Target. > [ OK ] Reached target Initrd File Systems. > [ OK ] Stopped dracut mount hook. > [ OK ] Stopped target Basic System. > [ OK ] Stopped target System Initialization. > Starting Setup Virtual Console... > StartinFailed to start kdump-error-handler.service: Transaction is > destructive. > [FAILED] Failed to start Kdump Emergency. > See 'systemctl status emergency.service' for details. > [DEPEND] Dependency failed for Emergency Mode. > [ OK ] Started Setup Virtual Console. > [ OK ] Found device /dev/disk/by-uuid/e1da41b9-bd30-4079-a3c7-0bf2ded9b31c. > [ OK ] Found device /dev/disk/by-uuid/3EBF-1CBC. > [ OK ] Found device /dev/disk/by-uuid/6e0da585-542f-4556-9477-ab84407a32e9. > [ OK ] Found device /dev/mapper/rhel_hawk604a-root. > Starting File System Check on /dev/mapper/rhel_hawk604a-root... > [ OK ] Started File System Check on /dev/mapper/rhel_hawk604a-root. > Starting File System Check on /dev/mapper/mpathc1... > [ 102.675298] systemd-fsck[852]: /sbin/fsck.xfs: XFS file system. > [ OK ] Started dracut initqueue hook. > [ OK ] Started File System Check on /dev/mapper/mpathc1. 
> Mounting /kdumproot/mnt/hpstorage... > Mounting /sysroot... > [ OK ] Reached target Remote File Systems (Pre). > [ OK [ 115.004581] SGI XFS with ACLs, security attributes, no debug enabled > ] Reached target Remote File Sys[ 115.013609] XFS (dm-2): Mounting V4 > Filesystem > [ 115.013702] XFS (dm-7): Mounting V4 Filesystem > tems. > [ 115.099991] XFS (dm-2): Starting recovery (logdev: internal) > [ 115.128176] XFS (dm-7): Starting recovery (logdev: internal) > [ 115.215637] XFS (dm-7): Ending recovery (logdev: internal) > [ OK ] Mounted /sysroot. > [ 115.275986] XFS (dm-2): Ending recovery (logdev: internal) > [ OK ] Mounted /kdumproot/mnt/hpstorage. > ---<snip->--- > > > ============================================ > These are the relevant system kdump configs: > ============================================ > ------------------------------------------ > cat /proc/cmdline > ------------------------------------------ > [root@hawk604a ~]# cat /proc/cmdline > BOOT_IMAGE=/vmlinuz-3.10.0-327.18.2.el7.x86_64 > root=/dev/mapper/rhel_hawk604a-root ro crashkernel=512M,high > rd.lvm.lv=rhel_hawk604a/root rd.lvm.lv=rhel_hawk604a/swap > console=ttyS0,115200n81 > [root@hawk604a ~]# > > > ------------------------------------------ > /etc/sysconfig/kdump > ------------------------------------------ > [root@hawk604a ~]# cat /etc/sysconfig/kdump > ---<snip->--- > #raw /dev/vg/lv_kdump > #ext4 /dev/vg/lv_kdump > #ext4 LABEL=/boot > #ext4 UUID=03138356-5e61-4ab3-b58e-27507ac41937 > #nfs my.server.com:/export/tmp > #ssh user.com > #sshkey /root/.ssh/kdump_id_rsa > #path /var/crash > xfs UUID=e1da41b9-bd30-4079-a3c7-0bf2ded9b31c > path /dumpit/here > core_collector makedumpfile -l --message-level 1 -c -d 31 > #core_collector scp > #kdump_post /var/crash/scripts/kdump-post.sh > #kdump_pre /var/crash/scripts/kdump-pre.sh > #extra_bins /usr/bin/lftp > #extra_modules gfs2 > #default shell > #force_rebuild 1 > #dracut_args --omit-drivers "cfg80211 snd" --add-drivers "ext2 ext3" > 
#fence_kdump_args -p 7410 -f auto -c 0 -i 10 > #fence_kdump_nodes node1 node2 > ---<snip->--- > > ------------------------------------------ > cat /etc/kdump.conf > ------------------------------------------ > [root@hawk604a ~]# cat /etc/kdump.conf > ---<snip->--- > # > > #raw /dev/vg/lv_kdump > #ext4 /dev/vg/lv_kdump > #ext4 LABEL=/boot > #ext4 UUID=03138356-5e61-4ab3-b58e-27507ac41937 > #nfs my.server.com:/export/tmp > #ssh user.com > #sshkey /root/.ssh/kdump_id_rsa > #path /var/crash > xfs UUID=e1da41b9-bd30-4079-a3c7-0bf2ded9b31c > path /dumpit/here > core_collector makedumpfile -l --message-level 1 -c -d 31 > #core_collector scp > #kdump_post /var/crash/scripts/kdump-post.sh > #kdump_pre /var/crash/scripts/kdump-pre.sh > #extra_bins /usr/bin/lftp > #extra_modules gfs2 > #default shell > #force_rebuild 1 > #dracut_args --omit-drivers "cfg80211 snd" --add-drivers "ext2 ext3" > #fence_kdump_args -p 7410 -f auto -c 0 -i 10 > #fence_kdump_nodes node1 node2 > [root@hawk604a ~]# > ---<snip->--- > > ========= > NOTE: > ========= > We had similar issue in previous testing: > https://bugzilla.redhat.com/show_bug.cgi?id=1311226#c18 > > Seems rd.retry=300 in /etc/sysconfig/kdump KDUMP_COMMANDLINE_APPEND > is not working with our test kernel-3.10.0-327.18.2.el7.x86_64 > > Other than we are testing kernel-3.10.0-327.18.2.el7.x86_64, the system is > configured exactly the same as described here: > https://bugzilla.redhat.com/show_bug.cgi?id=1311226#c18 > > > Thank you for your time, ruyang.
(In reply to Pingfan Liu from comment #77) > Hi PaulB, > > Can you set the value of "DefaultTimeoutStartSec=700s" in > /etc/systemd/system.conf, then have a test? It affects all the service's > timeout. If fortunately, I hope it can survive from "Time out waiting for > device dev-mapper-mpathc1.device", which cause the failure of kdump service. > > Thx, > Pingfan > > > (In reply to PaulB from comment #20) Pingfan, This system is no longer available to us. This was a remote EET (Extended Engineering Testing). RE: https://bugzilla.redhat.com/show_bug.cgi?id=1346327#c75 ---<-snip>--- This EET BZ is closed. The resolution for kdump was suggested/approved by the kdump team and KBASE was completed. I would suggest opening a new BZ to troubleshoot/investigate. Add comments #71-73 to the new BZ and reference this BZ. ---<-snip>--- Best, -pbunyan