Bug 1658855

Summary: In VM: rngd.service: Failed with result 'core-dump'
Product: Red Hat Enterprise Linux 8
Reporter: yduan
Component: rng-tools
Assignee: Neil Horman <nhorman>
Status: CLOSED CURRENTRELEASE
QA Contact: Vilém Maršík <vmarsik>
Severity: high
Docs Contact:
Priority: high
Version: 8.0
CC: bhu, borntraeger, bugproxy, chayang, hannsj_uhl, jkachuck, juzhang, nhorman, vmarsik, xfu, yduan
Target Milestone: rc
Keywords: Regression
Target Release: 8.0
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2019-06-14 01:06:00 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1478674, 1564587    
Attachments:
Description      Flags
rngd_core_dump   none
core.rngd.1126   none
coredump         none

Description yduan 2018-12-13 02:51:55 UTC
Description of problem:
rngd.service: Failed with result 'core-dump'

Version-Release number of selected component (if applicable):
Guest:
4.18.0-51.el8.x86_64
rng-tools-6.6-1.el8.x86_64

How reproducible:
100%

Steps to Reproduce:
1. # systemctl status rngd
● rngd.service - Hardware RNG Entropy Gatherer Daemon
   Loaded: loaded (/usr/lib/systemd/system/rngd.service; enabled; vendor preset: enabled)
   Active: failed (Result: core-dump) since Thu 2018-12-13 10:12:08 CST; 2min 1s ago
 Main PID: 8018 (code=dumped, signal=FPE)

Dec 13 10:12:08 dhcp-8-182.nay.redhat.com systemd[1]: Started Hardware RNG Entropy Gatherer Daemon.
Dec 13 10:12:08 dhcp-8-182.nay.redhat.com rngd[8018]: Initalizing available sources
Dec 13 10:12:08 dhcp-8-182.nay.redhat.com rngd[8018]: Initalizing entropy source hwrng
Dec 13 10:12:08 dhcp-8-182.nay.redhat.com rngd[8018]: Failed to init entropy source rdrand
Dec 13 10:12:08 dhcp-8-182.nay.redhat.com systemd[1]: rngd.service: Main process exited, code=dumped, status=8/FPE
Dec 13 10:12:08 dhcp-8-182.nay.redhat.com systemd[1]: rngd.service: Failed with result 'core-dump'.

2. # dmesg | grep -i error
[  264.689959] traps: rngd[8018] trap divide error ip:558b7fc7e90f sp:7fff0afe9290 error:0 in rngd[558b7fc76000+d000]

Actual results:
rngd.service: Failed with result 'core-dump'

Expected results:
rngd.service should be started successfully.

Additional info:
Cannot reproduce with rng-tools-6.2-1.el8+9zz5.x86_64.

Comment 1 Neil Horman 2018-12-13 15:23:56 UTC
Can you attach the core dump, please?

Comment 2 Neil Horman 2018-12-13 16:11:00 UTC
I've got this installed on a beaker system with kernel 4.18.0-51 and rng-tools-6.6-1 and it's running fine, so a more detailed reproducer is also needed here.  Alternatively, if you can provide access to the system in question, that would help.

Comment 3 yduan 2018-12-14 00:38:06 UTC
(In reply to Neil Horman from comment #2)
> I've got this installed on a beaker system with kernel 4.18.0-51 and
> rng-tools-6.6-1 and it's running fine, so a more detailed reproducer is also
> needed here.  Alternatively, if you can provide access to the system in
> question, that would help

Hi Neil,

Sorry for the confusion. As mentioned in comment 0, the problem occurred in a VM.

BR,
yduan

Comment 4 Neil Horman 2018-12-14 14:29:36 UTC
Yes, I know. I installed a VM on the beaker system, and it's running fine:

[root@localhost ~]# virt-what
kvm
[root@localhost ~]# uname -a
Linux localhost.localdomain 4.18.0-51.el8.x86_64 #1 SMP Fri Dec 7 22:46:42 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
[root@localhost ~]# rpm -qli rng-tools
Name        : rng-tools
Version     : 6.6
Release     : 1.el8
Architecture: x86_64
Install Date: Fri 14 Dec 2018 09:09:00 AM EST
Group       : System Environment/Base
Size        : 111922
License     : GPLv2+
Signature   : (none)
Source RPM  : rng-tools-6.6-1.el8.src.rpm
Build Date  : Fri 14 Dec 2018 09:08:35 AM EST
Build Host  : localhost
Relocations : (not relocatable)
URL         : https://github.com/nhorman/rng-tools
Summary     : Random number generator related utilities
Description :
Hardware random number generation tools.
/sbin/rngd
/usr/bin/rngtest
/usr/lib/.build-id
/usr/lib/.build-id/c6
/usr/lib/.build-id/c6/c53968c9f3d8a0aa609631d860f48585875e76
/usr/lib/.build-id/eb
/usr/lib/.build-id/eb/5d3c4c7f129e8da9c596b26940c94e31b37393
/usr/lib/systemd/system/rngd.service
/usr/share/doc/rng-tools
/usr/share/doc/rng-tools/AUTHORS
/usr/share/doc/rng-tools/NEWS
/usr/share/doc/rng-tools/README
/usr/share/licenses/rng-tools
/usr/share/licenses/rng-tools/COPYING
/usr/share/man/man1/rngtest.1.gz
/usr/share/man/man8/rngd.8.gz
[root@localhost ~]# systemctl restart rngd
[root@localhost ~]# systemctl status rngd
● rngd.service - Hardware RNG Entropy Gatherer Daemon
   Loaded: loaded (/usr/lib/systemd/system/rngd.service; enabled; vendor preset: enabled)
   Active: active (running) since Fri 2018-12-14 09:28:34 EST; 5s ago
 Main PID: 1327 (rngd)
    Tasks: 3 (limit: 12553)
   Memory: 5.1M
   CGroup: /system.slice/rngd.service
           └─1327 /sbin/rngd -f

Dec 14 09:28:34 localhost.localdomain systemd[1]: Stopped Hardware RNG Entropy Gatherer Daemon.
Dec 14 09:28:34 localhost.localdomain systemd[1]: Started Hardware RNG Entropy Gatherer Daemon.
Dec 14 09:28:34 localhost.localdomain rngd[1327]: Initalizing available sources
Dec 14 09:28:34 localhost.localdomain rngd[1327]: Failed to init entropy source hwrng
Dec 14 09:28:34 localhost.localdomain rngd[1327]: Enabling RDSEED rng support
Dec 14 09:28:34 localhost.localdomain rngd[1327]: Initalizing entropy source rdrand
Dec 14 09:28:35 localhost.localdomain rngd[1327]: Enabling JITTER rng support
Dec 14 09:28:35 localhost.localdomain rngd[1327]: Initalizing entropy source jitter

So, asking again, do you have the core, or can you provide me access to the system you are testing this on so we can determine why I can't reproduce what you are seeing?

Comment 5 yduan 2018-12-17 03:18:54 UTC
Created attachment 1514957 [details]
rngd_core_dump

Comment 6 yduan 2018-12-17 03:23:30 UTC
Hi Neil,

  The core is attached.

Additional Info:
# systemctl status rngd
 rngd.service - Hardware RNG Entropy Gatherer Daemon
   Loaded: loaded (/usr/lib/systemd/system/rngd.service; enabled; vendor preset: enabled)
   Active: failed (Result: core-dump) since Mon 2018-12-17 18:16:27 CST; 7h left
  Process: 1144 ExecStart=/sbin/rngd -f (code=dumped, signal=FPE)
 Main PID: 1144 (code=dumped, signal=FPE)

Dec 17 18:16:23 localhost.localdomain systemd[1]: Started Hardware RNG Entropy Gatherer Daemon.
Dec 17 18:16:24 localhost.localdomain rngd[1144]: Initalizing available sources
Dec 17 18:16:24 localhost.localdomain rngd[1144]: Initalizing entropy source hwrng
Dec 17 18:16:24 localhost.localdomain rngd[1144]: Failed to init entropy source rdrand
Dec 17 18:16:27 localhost.localdomain systemd[1]: rngd.service: Main process exited, code=dumped, status=8/FPE
Dec 17 18:16:27 localhost.localdomain systemd[1]: rngd.service: Failed with result 'core-dump'.

Core was generated by `/sbin/rngd -f'.
Program terminated with signal SIGFPE, Arithmetic exception.
#0  0x0000561c2781c90f in xread_jitter (buf=buf@entry=0x7fff894ba640, size=size@entry=16, 
    ent_src=ent_src@entry=0x561c27a22740 <entropy_sources+480>) at rngd_jitter.c:199
199	rngd_jitter.c: No such file or directory.
Missing separate debuginfos, use: dnf debuginfo-install brotli-1.0.6-1.el8.x86_64 cyrus-sasl-lib-2.1.27-0.3rc7.el8.x86_64 glibc-2.28-38.el8.x86_64 keyutils-libs-1.5.10-6.el8.x86_64 krb5-libs-1.16.1-19.el8.x86_64 libcom_err-1.44.3-1.el8.x86_64 libcurl-7.61.1-7.el8.x86_64 libgcrypt-1.8.3-2.el8.x86_64 libgpg-error-1.31-1.el8.x86_64 libidn2-2.0.5-1.el8.x86_64 libnghttp2-1.33.0-1.el8.x86_64 libpsl-0.20.2-5.el8.x86_64 libselinux-2.8-6.el8.x86_64 libssh-0.8.5-1.el8.x86_64 libsysfs-2.1.0-23.el8.x86_64 libunistring-0.9.9-3.el8.x86_64 libxcrypt-4.1.1-4.el8.x86_64 libxml2-2.9.7-5.el8.x86_64 openldap-2.4.46-8.el8.x86_64 openssl-libs-1.1.1-7.el8.x86_64 pcre2-10.31-11.el8.x86_64 xz-libs-5.2.4-3.el8.x86_64 zlib-1.2.11-10.el8.x86_64
(gdb) bt
#0  0x0000561c2781c90f in xread_jitter (buf=buf@entry=0x7fff894ba640, size=size@entry=16, 
    ent_src=ent_src@entry=0x561c27a22740 <entropy_sources+480>) at rngd_jitter.c:199
#1  0x0000561c2781d299 in init_jitter_entropy_source (ent_src=0x561c27a22740 <entropy_sources+480>)
    at rngd_jitter.c:476
#2  0x0000561c2781738c in main (argc=<optimized out>, argv=<optimized out>) at rngd.c:690
(gdb) bt full
#0  0x0000561c2781c90f in xread_jitter (buf=buf@entry=0x7fff894ba640, size=size@entry=16, 
    ent_src=ent_src@entry=0x561c27a22740 <entropy_sources+480>) at rngd_jitter.c:199
        data = 0
        current = 0x561c280af180
        start = 0x561c280af180
        request = <optimized out>
        idx = 0
        need = 16
        bptr = 0x7fff894ba640 "\n"
        retry_count = 0
        sleep = {tv_sec = 114257, tv_nsec = 659}
#1  0x0000561c2781d299 in init_jitter_entropy_source (ent_src=0x561c27a22740 <entropy_sources+480>)
    at rngd_jitter.c:476
        cpus = 0x0
        cpusize = <optimized out>
        i = <optimized out>
        core_id = <optimized out>
        key = "\n\000\000\000\000\000\000\000\240'\242'\034V\000"
        ret = <optimized out>
#2  0x0000561c2781738c in main (argc=<optimized out>, argv=<optimized out>) at rngd.c:690
        i = <optimized out>
        ent_sources = 1
        pid_fd = -1
        test_time = <optimized out>
(gdb)

Comment 7 Neil Horman 2018-12-17 16:33:37 UTC
Hmm, it looks like the disable flag may be in the .bss section rather than .data, which could leave the flag in an unknown state and lead to a lack of initialization on startup.  Could you please test with this package:
https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=19542200

And confirm that the crash no longer happens

thanks
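
For reference when weighing the .bss hypothesis: C guarantees that objects with static storage duration start out zero-initialized when they have no initializer, regardless of which section the linker places them in. A minimal sketch, using hypothetical flag names that are not taken from rng-tools:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flags, named for illustration only. */
static bool disabled_implicit;         /* no initializer: typically placed in .bss */
static bool disabled_explicit = true;  /* nonzero initializer: typically placed in .data */

/* True when both flags hold the values guaranteed at program startup. */
bool flags_have_startup_values(void)
{
    return !disabled_implicit && disabled_explicit;
}

/* Aborts if a flag does not have its guaranteed startup value. */
void check_startup_flags(void)
{
    assert(flags_have_startup_values());
}
```

So a flag without an initializer reads as false at startup whether or not it lands in .bss; an unexpected value would have to come from a later write.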

Comment 8 yduan 2018-12-18 02:44:05 UTC
Created attachment 1515205 [details]
core.rngd.1126

Comment 9 yduan 2018-12-18 02:46:45 UTC
(In reply to Neil Horman from comment #7)
> hmm, looks like perhaps the disable flag is in the .bss section rather than
> .data, leading to an odd situation where that flag might be in an unknown
> state, leading to lack of initialization on startup.  Could you please test
> with this package:
> https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=19542200
> 
> And confirm that the crash no longer happens
> 

Hi Neil,

I tested with your package and still hit the crash.
The core file is attached.

Thanks,
yduan

# rpm -q rng-tools                                                              
rng-tools-6.6-2.el8.x86_64                                                      
# systemctl status rngd                                                         
● rngd.service - Hardware RNG Entropy Gatherer Daemon                           
   Loaded: loaded (/usr/lib/systemd/system/rngd.service; enabled; vendor preset: enabled)
   Active: failed (Result: core-dump) since Tue 2018-12-18 18:19:34 CST; 7h left
  Process: 1126 ExecStart=/sbin/rngd -f (code=dumped, signal=FPE)               
 Main PID: 1126 (code=dumped, signal=FPE)                                       
                                                                                
Dec 18 18:19:29 localhost.localdomain systemd[1]: Started Hardware RNG Entropy Gatherer Daemon.
Dec 18 18:19:30 localhost.localdomain rngd[1126]: Initalizing available sources 
Dec 18 18:19:30 localhost.localdomain rngd[1126]: Initalizing entropy source hwrng
Dec 18 18:19:30 localhost.localdomain rngd[1126]: Failed to init entropy source rdrand
Dec 18 18:19:34 localhost.localdomain systemd[1]: rngd.service: Main process exited, code=dumped, status=8/FPE
Dec 18 18:19:34 localhost.localdomain systemd[1]: rngd.service: Failed with result 'core-dump'.


Core was generated by `/sbin/rngd -f'.                                          
Program terminated with signal SIGFPE, Arithmetic exception.                    
#0  0x000055cb7fd3f90f in xread_jitter (buf=buf@entry=0x7fff7cf1e720, size=size@entry=16, 
    ent_src=ent_src@entry=0x55cb7ff45740 <entropy_sources+480>) at rngd_jitter.c:199
199 rngd_jitter.c: No such file or directory.                                   
Missing separate debuginfos, use: dnf debuginfo-install brotli-1.0.6-1.el8.x86_64 cyrus-sasl-lib-2.1.27-0.3rc7.el8.x86_64 glibc-2.28-38.el8.x86_64 keyutils-libs-1.5.10-6.el8.x86_64 krb5-libs-1.16.1-19.el8.x86_64 libcom_err-1.44.3-1.el8.x86_64 libcurl-7.61.1-7.el8.x86_64 libgcrypt-1.8.3-2.el8.x86_64 libgpg-error-1.31-1.el8.x86_64 libidn2-2.0.5-1.el8.x86_64 libnghttp2-1.33.0-1.el8.x86_64 libpsl-0.20.2-5.el8.x86_64 libselinux-2.8-6.el8.x86_64 libssh-0.8.5-1.el8.x86_64 libsysfs-2.1.0-23.el8.x86_64 libunistring-0.9.9-3.el8.x86_64 libxcrypt-4.1.1-4.el8.x86_64 libxml2-2.9.7-5.el8.x86_64 openldap-2.4.46-8.el8.x86_64 openssl-libs-1.1.1-7.el8.x86_64 pcre2-10.31-11.el8.x86_64 xz-libs-5.2.4-3.el8.x86_64 zlib-1.2.11-10.el8.x86_64
(gdb) bt 
#0  0x000055cb7fd3f90f in xread_jitter (buf=buf@entry=0x7fff7cf1e720, size=size@entry=16, 
    ent_src=ent_src@entry=0x55cb7ff45740 <entropy_sources+480>) at rngd_jitter.c:199
#1  0x000055cb7fd40299 in init_jitter_entropy_source (ent_src=0x55cb7ff45740 <entropy_sources+480>)
    at rngd_jitter.c:476                                                        
#2  0x000055cb7fd3a38c in main (argc=<optimized out>, argv=<optimized out>) at rngd.c:691
(gdb) bt full
#0  0x000055cb7fd3f90f in xread_jitter (buf=buf@entry=0x7fff7cf1e720, size=size@entry=16, 
    ent_src=ent_src@entry=0x55cb7ff45740 <entropy_sources+480>) at rngd_jitter.c:199
        data = 0                                                                
        current = 0x55cb80b27180                                                
        start = 0x55cb80b27180                                                  
        request = <optimized out>                                               
        idx = 0                                                                 
        need = 16                                                               
        bptr = 0x7fff7cf1e720 "\n"                                              
        retry_count = 0                                                         
        sleep = {tv_sec = 114257, tv_nsec = 659}                                
#1  0x000055cb7fd40299 in init_jitter_entropy_source (ent_src=0x55cb7ff45740 <entropy_sources+480>)
    at rngd_jitter.c:476                                                        
        cpus = 0x0                                                              
        cpusize = <optimized out>                                               
        i = <optimized out>                                                     
        core_id = <optimized out>                                               
        key = "\n\000\000\000\000\000\000\000\240W\364\177\313U\000"            
        ret = <optimized out>                                                   
#2  0x000055cb7fd3a38c in main (argc=<optimized out>, argv=<optimized out>) at rngd.c:691
        i = <optimized out>                                                     
        ent_sources = 1                                                         
        pid_fd = -1                                                             
        test_time = <optimized out>

Comment 10 Neil Horman 2018-12-18 12:07:43 UTC
No. At this point I need to get on the VM in question. Given that I have a VM that is working just fine here, there should be no reason for this crash.  Please provide access credentials and an address for the VM you are seeing this on.

Comment 12 Neil Horman 2018-12-19 15:15:14 UTC
Thank you. With that, I found the problem.  Apparently this VM is restricted in such a way that sched_getaffinity returns 0 CPUs in its mask, leading to a computation that causes 0 threads to get started. In that case we should just default to a single thread.  This build:
https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=19581888

Contains a patch for that.  Please test and confirm.  Note that I didn't bump the release number (sorry), so you will need to uninstall and reinstall the rng-tools package.
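
The failure mode and the fallback described above can be sketched as follows. This is an illustration only: the report does not show rngd_jitter.c line 199, so the round-robin modulo below is an assumption that merely matches the "trap divide error" and SIGFPE seen in the logs, and both function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: never start fewer than one jitter thread,
 * even when the affinity mask reports zero usable CPUs. */
int clamp_thread_count(int cpus_in_mask)
{
    return cpus_in_mask > 0 ? cpus_in_mask : 1;
}

/* Hypothetical round-robin selection of the next thread to read from.
 * An integer modulo by zero traps with SIGFPE on x86, so the caller
 * must guarantee num_threads > 0. */
size_t next_thread_index(size_t last, int num_threads)
{
    assert(num_threads > 0);
    return (last + 1) % (size_t)num_threads;
}
```

With the clamp in place, a mask that reports 0 CPUs yields one thread, and the index computation stays well defined.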

Comment 13 yduan 2018-12-20 10:25:29 UTC
(In reply to Neil Horman from comment #12)
> thank you, and with that, I found the problem.  Apparently this VM is
> restricted in such a way that sched_getaffinity returns 0 cpus in its mask,
> leading to a computation that causes 0 threads to get started. In that case
> we should just default to a single thread.  This build:
> https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=19581888
> 
> Contains a patch for that.  Please test and confirm.  Note that I didn't
> bump the release number (sorry), so you will need to uninstall/reinstall the
> rng-tools pacakge

With this build, rngd started successfully and no core dumped.

# systemctl status rngd                                                         
● rngd.service - Hardware RNG Entropy Gatherer Daemon                           
   Loaded: loaded (/usr/lib/systemd/system/rngd.service; enabled; vendor preset: enabled)
   Active: active (running) since Fri 2018-12-21 02:15:48 CST; 7h left          
 Main PID: 1148 (rngd)                                                          
    Tasks: 2 (limit: 23485)                                                     
   Memory: 1.7M                                                                 
   CGroup: /system.slice/rngd.service                                           
           └─1148 /sbin/rngd -f                                                 
                                                                                
Dec 21 02:15:48 localhost.localdomain systemd[1]: Started Hardware RNG Entropy Gatherer Daemon.
Dec 21 02:15:48 localhost.localdomain rngd[1148]: Initalizing available sources 
Dec 21 02:15:48 localhost.localdomain rngd[1148]: Initalizing entropy source hwrng
Dec 21 02:15:48 localhost.localdomain rngd[1148]: Failed to init entropy source rdrand
Dec 21 02:15:51 localhost.localdomain rngd[1148]: Enabling JITTER rng support   
Dec 21 02:15:51 localhost.localdomain rngd[1148]: Initalizing entropy source jitter
# dmesg | grep -i error                                      
# coredumpctl list
No coredumps found.

Comment 20 Neil Horman 2018-12-21 12:20:23 UTC
There has to be something.  From the sched_getaffinity man page:

EINVAL The affinity bit mask mask contains no processors that are currently physically on the system and permitted to the thread according to any restrictions that may be imposed by cpuset cgroups or the "cpuset" mechanism described in cpuset(7).

EINVAL (sched_getaffinity() and, in kernels before 2.6.9, sched_setaffinity()) cpusetsize is smaller than the size of the affinity mask used by the kernel.

We're seeing an EINVAL errno on the call to sched_getaffinity, so one would presume that either we are using too small a CPU set mask or something in the cgroup configuration is erroneously creating a zero affinity mask.  I can't believe the problem is a too-small cpusetsize, as the cpusetsize is computed by asking the kernel what the size should be via the sysconf call.

I tried to hop back on to the VM to provide more detail, but it seems the system is no longer set up in the same way.  Can you re-establish it please so that I can look more closely?
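
One way to tell the two EINVAL causes apart is to probe sched_getaffinity() with different mask sizes. A minimal sketch, assuming a hypothetical helper named affinity_errno() that is not part of rng-tools:

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <sched.h>
#include <stdlib.h>

/* Hypothetical probe: returns 0 if sched_getaffinity() accepts a mask of
 * mask_bytes bytes for the calling thread, otherwise the errno it set. */
int affinity_errno(size_t mask_bytes)
{
    assert(mask_bytes > 0);

    cpu_set_t *mask = calloc(1, mask_bytes);
    if (mask == NULL)
        return ENOMEM;

    int rc = sched_getaffinity(0, mask_bytes, mask);
    int err = (rc == 0) ? 0 : errno;

    free(mask);
    return err;
}
```

If the probe returns EINVAL for the size derived from the sysconf CPU count but 0 for sizeof(cpu_set_t), the mask size rather than the cgroup configuration would be the likelier cause.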

Comment 23 yduan 2018-12-22 09:03:53 UTC
(In reply to Neil Horman from comment #20)
> There has to be something.  From the sched_getaffinity man page:
> 
> EINVAL The affinity bit mask mask contains no processors that are currently
> physically on the system and permitted to the thread according to any
> restrictions that may be imposed by  cpuset  cgroups
>               or the "cpuset" mechanism described in cpuset(7).
> 
> EINVAL (sched_getaffinity() and, in kernels before 2.6.9,
> sched_setaffinity()) cpusetsize is smaller than the size of the affinity
> mask used by the kernel.
> 
> We're seeing an EINVAL errno on the call to sched_getaffinity, so one would
> presume that either we are using too small a cpu set mask or something in
> the cgroup configuration is erroneously creating a zero affinity mask.  I
> can't believe the problem is a too small cpusetsize, as the cpusetsize is
> computed by asking the kernel what the size should be via the sysconf call.
> 
> I tried to hop back on to the vm to provide more detail, but it seems the
> system is no longer set up in the same way.  Can you re-establish it please
> so that I can look more closely?

Sure, but I ran some time-consuming automation tests on that host.
I'll bring that VM back for your further investigation no later than next Monday.

Thanks,
yduan

Comment 24 yduan 2018-12-23 07:04:38 UTC
(In reply to Neil Horman from comment #20)
> There has to be something.  From the sched_getaffinity man page:
> 
> EINVAL The affinity bit mask mask contains no processors that are currently
> physically on the system and permitted to the thread according to any
> restrictions that may be imposed by  cpuset  cgroups
>               or the "cpuset" mechanism described in cpuset(7).
> 
> EINVAL (sched_getaffinity() and, in kernels before 2.6.9,
> sched_setaffinity()) cpusetsize is smaller than the size of the affinity
> mask used by the kernel.
> 
> We're seeing an EINVAL errno on the call to sched_getaffinity, so one would
> presume that either we are using too small a cpu set mask or something in
> the cgroup configuration is erroneously creating a zero affinity mask.  I
> can't believe the problem is a too small cpusetsize, as the cpusetsize is
> computed by asking the kernel what the size should be via the sysconf call.
> 
> I tried to hop back on to the vm to provide more detail, but it seems the
> system is no longer set up in the same way.  Can you re-establish it please
> so that I can look more closely?

Hi Neil,

  The system is available now.
  I have downgraded rng-tools to version 6.6-1 and your private build is in directory '/home'.

Thanks,
yduan

Comment 26 Neil Horman 2019-01-08 15:36:02 UTC
I can work on this again, please restart the vm

Comment 28 Neil Horman 2019-01-09 15:56:30 UTC
Thank you, but unfortunately the problem does not reproduce now, even without my patch applied, so there must be something different about this setup.

Comment 29 yduan 2019-01-10 10:44:19 UTC
(In reply to Neil Horman from comment #28)
> thank you, but unfortunately the problem does not reproduce now, even if my
> patch is not applied, so there must be something different about this setup

Hi Neil,

  I can still reproduce it in the VM.

BR,
yduan

Comment 30 Neil Horman 2019-01-10 14:45:52 UTC
There it is; not sure why it was working before.  How did you start this VM?  Are you just starting it directly with qemu?  I don't see libvirtd running on the host.

Comment 31 Neil Horman 2019-01-10 14:50:08 UTC
*** Bug 1665057 has been marked as a duplicate of this bug. ***

Comment 32 IBM Bug Proxy 2019-01-10 15:41:16 UTC
Created attachment 1519856 [details]
coredump

Comment 33 yduan 2019-01-11 01:50:39 UTC
(In reply to Neil Horman from comment #30)
> there it is, not sure why it was working before.  How did you start this vm?
> Are you just starting it straight with qemu?  I don't see libvirtd running
> on the host

Yes, you can find the qemu command line on the host with "ps aux | grep qemu".

BR,
yduan

Comment 34 IBM Bug Proxy 2019-01-11 08:00:55 UTC
------- Comment From cborntra.com 2019-01-11 02:52 EDT-------
For what it's worth, I saw this problem in an LPAR.

Comment 35 Neil Horman 2019-01-11 15:13:10 UTC
Thank you, I think that may be the root of the reproducer.  All the virt systems I've tested on have the number of cores equal to the number of CPUs on the bare-metal system, whereas this guest is started with -smp 8,maxcpus=240.  That's probably causing the sysconf call in rng-tools to return 8 (thereby allocating 8 CPUs in the affinity array), but getaffinity would require 240 entries to not fail.  That's how you reproduce the problem: by forcing there to be fewer CPUs configured than the maximum number.
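
To put numbers on this: on 64-bit glibc, CPU_ALLOC_SIZE(8) is 8 bytes, i.e. 64 bits, which is fewer than the 240 possible CPUs the kernel presumably sizes its mask for. One way to avoid depending on the sysconf value is to grow the mask until the kernel accepts it, the approach the sched_setaffinity(2) man page suggests for systems with large affinity masks. The sketch below is not the patch from this bug; count_usable_cpus() is a hypothetical name, and the 64 to 65536 bounds are arbitrary assumptions.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <sched.h>
#include <stddef.h>

/* Hypothetical helper: count the CPUs the calling thread may run on.
 * Doubles the mask size while sched_getaffinity() rejects it with EINVAL,
 * and falls back to 1 so callers can safely divide by the result. */
int count_usable_cpus(void)
{
    int result = 1;

    for (int ncpus = 64; ncpus <= 65536; ncpus *= 2) {
        cpu_set_t *set = CPU_ALLOC(ncpus);
        if (set == NULL)
            break;

        size_t setsize = CPU_ALLOC_SIZE(ncpus);
        CPU_ZERO_S(setsize, set);

        int rc = sched_getaffinity(0, setsize, set);
        int saved_errno = errno;
        int count = (rc == 0) ? CPU_COUNT_S(setsize, set) : 0;
        CPU_FREE(set);

        if (rc == 0) {
            if (count > 0)
                result = count;
            break;
        }
        if (saved_errno != EINVAL)
            break;  /* some other failure: keep the single-thread fallback */
    }

    assert(result >= 1);
    return result;
}
```

Starting at 64 bits and doubling keeps the common case to a single call, while a guest started with maxcpus=240 would succeed on the third attempt (256 bits).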

Comment 36 Christian Borntraeger 2019-01-14 09:29:30 UTC
FWIW, I have seen this problem in an s390x LPAR (not a KVM guest). (see  IBM bug 174583 - RH bug 1665057)

Comment 37 IBM Bug Proxy 2019-01-14 11:20:47 UTC
------- Comment From cborntra.com 2019-01-14 06:10 EDT-------
snapshot 1 was still ok (it has rng-tools-6.2-1.el8 instead of rng-tools-6.6-1.el8)

------- Comment From cborntra.com 2019-01-14 06:17 EDT-------
FWIW, the same would be true on LPAR. My LPAR has 32 processors, but the maximum value is higher.

Comment 41 IBM Bug Proxy 2019-01-22 20:40:39 UTC
------- Comment From Christian.Rund.com 2019-01-22 15:31 EDT-------
*** Bug 174884 has been marked as a duplicate of this bug. ***

Comment 45 Vilém Maršík 2019-01-25 15:10:51 UTC
Thanks.

I think that rng-tools-6.6-2.el8.x86_64.rpm from brew is correctly fixed:

[root@dhcp-8-136 ~]# rpm -q rng-tools
rng-tools-6.6-1.el8.x86_64
[root@dhcp-8-136 ~]# systemctl restart rngd
[root@dhcp-8-136 ~]# systemctl status rngd
● rngd.service - Hardware RNG Entropy Gatherer Daemon
   Loaded: loaded (/usr/lib/systemd/system/rngd.service; enabled; vendor preset: enabled)
   Active: failed (Result: core-dump) since Fri 2019-01-25 22:58:15 CST; 4s ago
  Process: 9837 ExecStart=/sbin/rngd -f (code=dumped, signal=FPE)
 Main PID: 9837 (code=dumped, signal=FPE)

Jan 25 22:58:15 dhcp-8-136.nay.redhat.com systemd[1]: Started Hardware RNG Entropy Gatherer Daemon.
Jan 25 22:58:15 dhcp-8-136.nay.redhat.com rngd[9837]: Initalizing available sources
Jan 25 22:58:15 dhcp-8-136.nay.redhat.com rngd[9837]: Initalizing entropy source hwrng
Jan 25 22:58:15 dhcp-8-136.nay.redhat.com rngd[9837]: Enabling RDRAND rng support
Jan 25 22:58:15 dhcp-8-136.nay.redhat.com rngd[9837]: Initalizing entropy source rdrand
Jan 25 22:58:15 dhcp-8-136.nay.redhat.com systemd[1]: rngd.service: Main process exited, code=dumped, status=8/FPE
Jan 25 22:58:15 dhcp-8-136.nay.redhat.com systemd[1]: rngd.service: Failed with result 'core-dump'.
[root@dhcp-8-136 ~]# rpm -U rng-tools-6.6-2.el8.x86_64.rpm
[root@dhcp-8-136 ~]# systemctl restart rngd
[root@dhcp-8-136 ~]# systemctl status rngd
● rngd.service - Hardware RNG Entropy Gatherer Daemon
   Loaded: loaded (/usr/lib/systemd/system/rngd.service; enabled; vendor preset: enabled)
   Active: active (running) since Fri 2019-01-25 23:05:41 CST; 8s ago
 Main PID: 30974 (rngd)
    Tasks: 2 (limit: 23485)
   Memory: 1.3M
   CGroup: /system.slice/rngd.service
           └─30974 /sbin/rngd -f

Jan 25 23:05:41 dhcp-8-136.nay.redhat.com systemd[1]: Started Hardware RNG Entropy Gatherer Daemon.
Jan 25 23:05:41 dhcp-8-136.nay.redhat.com rngd[30974]: Initalizing available sources
Jan 25 23:05:41 dhcp-8-136.nay.redhat.com rngd[30974]: Initalizing entropy source hwrng
Jan 25 23:05:41 dhcp-8-136.nay.redhat.com rngd[30974]: Enabling RDRAND rng support
Jan 25 23:05:41 dhcp-8-136.nay.redhat.com rngd[30974]: Initalizing entropy source rdrand
Jan 25 23:05:42 dhcp-8-136.nay.redhat.com rngd[30974]: Enabling JITTER rng support
Jan 25 23:05:42 dhcp-8-136.nay.redhat.com rngd[30974]: Initalizing entropy source jitter


Considering the bug reproduced & verified. Thanks for the VM, I should not need it anymore.

Comment 46 IBM Bug Proxy 2019-01-30 09:10:40 UTC
------- Comment From cborntra.com 2019-01-29 14:41 EDT-------
Verified on snapshot 4.