Bug 1652894
Summary: | when create vhba without indicating wwpn/wwnn, libvirt will generate duplicated ones | |||
---|---|---|---|---|
Product: | Red Hat Enterprise Linux 7 | Reporter: | yisun | |
Component: | libvirt | Assignee: | John Ferlan <jferlan> | |
Status: | CLOSED ERRATA | QA Contact: | yisun | |
Severity: | high | Docs Contact: | ||
Priority: | high | |||
Version: | 7.6 | CC: | hhan, jdenemar, jferlan, jsuchane, lmen, mtessun, xuzhang | |
Target Milestone: | rc | Keywords: | Automation, Regression, ZStream | |
Target Release: | --- | |||
Hardware: | x86_64 | |||
OS: | Linux | |||
Whiteboard: | ||||
Fixed In Version: | libvirt-4.5.0-11.el7 | Doc Type: | If docs needed, set a value | |
Doc Text: | Story Points: | --- | ||
Clone Of: | ||||
: | 1655586 1667329 (view as bug list) | Environment: | ||
Last Closed: | 2019-08-06 13:14:02 UTC | Type: | Bug | |
Regression: | --- | Mount Type: | --- | |
Documentation: | --- | CRM: | ||
Verified Versions: | Category: | --- | ||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
Cloudforms Team: | --- | Target Upstream Version: | ||
Embargoed: | ||||
Bug Depends On: | ||||
Bug Blocks: | 1651787, 1655586, 1667329 |
Description
yisun
2018-11-23 12:22:27 UTC
Generation of the wwnn/wwpn is handled via virRandomGenerateWWN which as the name suggests is supposed to be a random generation. I attempted your script on the host I have and 10 unique scsi_host/fc_host's were created with unique wwnn and wwpn values. Compared to the generated numbers you show using repeated patterns where the "random" number is essentially rotate/increment in a linear and non random pattern removing the "5001a4a" which is the QUMRANET_OUI value from your output (e.g. "000000006", "00000000b", "00000000d", "0000000e", etc. looping back to "00000000"). I'd say something is very wrong with your environment either gnutls_rnd is returning bad values or the /dev/urandom is bogus. (In reply to John Ferlan from comment #2) > Generation of the wwnn/wwpn is handled via virRandomGenerateWWN which as the > name suggests is supposed to be a random generation. > > I attempted your script on the host I have and 10 unique scsi_host/fc_host's > were created with unique wwnn and wwpn values. > > Compared to the generated numbers you show using repeated patterns where the > "random" number is essentially rotate/increment in a linear and non random > pattern removing the "5001a4a" which is the QUMRANET_OUI value from your > output (e.g. "000000006", "00000000b", "00000000d", "0000000e", etc. looping > back to "00000000"). > > I'd say something is very wrong with your environment either gnutls_rnd is > returning bad values or the /dev/urandom is bogus. We just have one machine with hba card to test npiv cases, and the OS of that machine will always be clearly installed by beaker before auto script executed. I'll trigger a npiv auto job to reserve that machine and try to see if anything wrong on it. btw, do you have any suggestions about how to check gnutls_rnd? I tried using "dd if=/dev/urandom of=/tmp/file bs=1K count=1024" and can get random output. thx Usage of gnutls_rnd is a build environment thing. IOW: whatever built the rpm for libvirt-4.5.0-10.virtcov.el7_6.3.x86_64 shown above. I'm sure there's some way in some log file to know whether gnutls is being used, but it's not in my short term memory. Either way, it doesn't make sense that the value is essentially increasing by 1 rotating back through 16 values unless someone installed some sort of library that returned "known" values in order to perform some other test. Our test environment allows returning known/specific values... Here's a "quick hack" for something to try on the system that I've cobbled together from extracting out the code libvirt uses in order to generate the random byte string. $ cat random.c #define _GNU_SOURCE #include <stdio.h> #include <unistd.h> #include <string.h> #include <stdlib.h> #include <limits.h> #include <gnutls/gnutls.h> #include <gnutls/crypto.h> static int virRandomBytes(unsigned char *buf, size_t buflen) { int rv; /* Generate the byte stream using gnutls_rnd() if possible */ if ((rv = gnutls_rnd(GNUTLS_RND_RANDOM, buf, buflen)) < 0) { printf("failed to generate byte stream: %s\n", gnutls_strerror(rv)); return -1; } return 0; } static unsigned long long virRandomBits(int nbits) { unsigned long long ret = 0; if (virRandomBytes((unsigned char *) &ret, sizeof(ret)) < 0) { /* You're already hosed, so this particular non-random value * isn't any worse. */ return 0; } if (nbits < 64) ret &= (1ULL << nbits) - 1; return ret; } #define QUMRANET_OUI "001a4a" static int virRandomGenerateWWN(char **wwn) { if (asprintf(wwn, "5" "%s%09llx", QUMRANET_OUI, (unsigned long long)virRandomBits(36)) < 0) return -1; return 0; } int main(int argc, int **argv) { char *wwnn = NULL; char *wwpn = NULL; if (virRandomGenerateWWN(&wwnn) < 0 || virRandomGenerateWWN(&wwpn) < 0) { printf("Failure to generate wwnn/wwpn\n"); exit(1); } printf("wwnn=0x%s wwpn=0x%s\n", wwnn, wwpn); } $ cc -o random random.c -lgnutls $ ./random wwnn=0x5001a4a6380cd412 wwpn=0x5001a4a9b1aa0277 $ That's output on my host. on the clean installed 7.6.z systems i mentioned in comment 5, with the gnutls-devel version mentioned in comment 7 and with the test code mentioned in comment 9... nothing wrong happens in my hosts: Host1: [root@ibm-x3850x5-05 ~]# rpm -qa | grep gnutls-devel gnutls-devel-3.3.29-8.el7.x86_64 [root@ibm-x3850x5-05 ~]# for i in {1..10}; do ./random; done wwnn=0x5001a4a3f9444d3b wwpn=0x5001a4aaf9fb744a wwnn=0x5001a4a4eddefd7f wwpn=0x5001a4ad38b0818c wwnn=0x5001a4a2f33f7dc2 wwpn=0x5001a4aebf743fe0 wwnn=0x5001a4a6b173d4f1 wwpn=0x5001a4aef1357189 wwnn=0x5001a4a6bb7601aa wwpn=0x5001a4adae0f530f wwnn=0x5001a4afb50d447f wwpn=0x5001a4a162df0ceb wwnn=0x5001a4a4d3f9329c wwpn=0x5001a4a242009d4a wwnn=0x5001a4a94bf299ef wwpn=0x5001a4adfe246ed4 wwnn=0x5001a4a56e66792d wwpn=0x5001a4a35bf61c48 wwnn=0x5001a4a9367cdd27 wwpn=0x5001a4a82d5b48c5 Host2: [root@hp-dl360eg8-06 ~]# rpm -qa | grep gnutls-devel gnutls-devel-3.3.29-8.el7.x86_64 [root@hp-dl360eg8-06 ~]# for i in {1..10}; do ./random; done wwnn=0x5001a4a3341815dd wwpn=0x5001a4a42d9a4341 wwnn=0x5001a4ae0375dc37 wwpn=0x5001a4ae15c94b91 wwnn=0x5001a4ad6dbc1b4d wwpn=0x5001a4a28288d56c wwnn=0x5001a4abd0b070b9 wwpn=0x5001a4a5c3b9a6ea wwnn=0x5001a4a6c22bdf98 wwpn=0x5001a4a8aa7fdd19 wwnn=0x5001a4a4fe083e3d wwpn=0x5001a4a80cc59754 wwnn=0x5001a4aa48f01fd3 wwpn=0x5001a4a00c68d95a wwnn=0x5001a4a5758f709b wwpn=0x5001a4a0a5e0af8f wwnn=0x5001a4a01bb5f7b8 wwpn=0x5001a4ad9fa8136c wwnn=0x5001a4a3d38d17c9 wwpn=0x5001a4a32591f7ce So do we need to run test code on the rpm builder host for a check?
> So do we need to run test code on the rpm builder host for a check?
I do not believe that matters. The failure is happening on the target host not the build system.
At this point I'm leaning towards environmental with respect to what gets loaded when libvirtd is loaded. IOW: Is there something installed on the target host(s) that is intercepting the libgnutls.so gnutls_rnd call and returning non random values.
Does the "failed to generate byte stream: %s" message show up in the libvirt logs (assuming you turn them on)?
Since you've proven you can build directly linking to the "host" gnutls library and not get certain results, but then run as part of libvirtd and get those bad results, then there's something in that "stack" that's causing the problem.
So a bit of debugging is needed. I think a 'starting point' would be to run 'lsof -p <pid> | grep gnutls' on the pid that's running libvirtd since in the long run that's what will make the call to gnutls_rnd. Let's make sure that the library is there and loaded.
Another data point could be to use gdb to attach to the libvirtd process (gdb <pathto>/libvirtd and attach <pid>). From the gdb> prompt using "info symbol gnutls_rnd" should tell you which library loaded the symbol. Staying in gdb> (after a continue) and running the client call again to create the vHBA may result in seeing the above mentioned "failed to generate" error (it may not, but it may).
The "good news" to a degree is that I don't believe creating vHBAs in a loop such as is being done in the problem statement is not a "typical" operation.
(In reply to John Ferlan from comment #11) > > So do we need to run test code on the rpm builder host for a check? > > I do not believe that matters. The failure is happening on the target host > not the build system. Hi John, I checked the latest rhel7.6 branch and found that there maybe a data type misusing. *** root@yisun /tmp/libvirt-rhel ## git show 60da4a11 ... @@ -108,26 +61,14 @@ VIR_ONCE_GLOBAL_INIT(virRandom) uint64_t virRandomBits(int nbits) { uint64_t ret = 0; ... + ret &= (1U << nbits) - 1; *** Here the 64 bits ret is &= with a 1U which is a 32 bits type, seems a lot of zeros will be generated... In comment 9, your test code using a 1ULL which is a 64 bits value so no error happened, but if changed the test code to 1U, same error will happened with the test code. just change the 1ULL to 1U of your test code, same error happens as follow: [root@hp-dl360eg8-06 tmp]# ./random wwnn=0x5001a4a000000000 wwpn=0x5001a4a00000000e [root@hp-dl360eg8-06 tmp]# ./random wwnn=0x5001a4a000000003 wwpn=0x5001a4a00000000c [root@hp-dl360eg8-06 tmp]# ./random wwnn=0x5001a4a00000000d wwpn=0x5001a4a00000000b [root@hp-dl360eg8-06 tmp]# ./random wwnn=0x5001a4a00000000b wwpn=0x5001a4a00000000c [root@hp-dl360eg8-06 tmp]# ./random wwnn=0x5001a4a000000002 wwpn=0x5001a4a000000002 [root@hp-dl360eg8-06 tmp]# ./random wwnn=0x5001a4a000000006 wwpn=0x5001a4a000000001 Set as a regression. Some other functions may be effected by the random number generator. A patch is already in upstream libvirt 4.7.0 via commit: commit 0a5a6f0d019996b015bb0acbe30efa8f2fbbb351 Author: Michal Privoznik <mprivozn> Date: Thu Aug 2 09:29:38 2018 +0200 virrandom: Avoid undefined behaviour in virRandomBits If nbits is 64 (or greater) then shifting 1ULL left is undefined. Signed-off-by: Michal Privoznik <mprivozn> Reviewed-by: Daniel P. Berrangé <berrange> diff --git a/src/util/virrandom.c b/src/util/virrandom.c index 3c011a8615..7915f6531e 100644 --- a/src/util/virrandom.c +++ b/src/util/virrandom.c @@ -68,7 +68,9 @@ uint64_t virRandomBits(int nbits) return 0; } - ret &= (1ULL << nbits) - 1; + if (nbits < 64) + ret &= (1ULL << nbits) - 1; + return ret; } Thus "process wise" for RHEL 7.7, this bz changes to POST for 7.7 and I've set the 7.6.z flag. Although the usage in this case is for vHBA WWNN/WWPN in a loop, the virRandomBits function is far more widely used for things like generation of a MAC or a UUID. Seems commit 78c47a92ecb450c9f8bcabd35da7006dc2547882 from libvirt.4.6.0 is also needed: Author: Michal Privoznik <mprivozn> Date: Wed Aug 1 13:26:46 2018 +0200 util: Don't overflow in virRandomBits The function is supposed to return up to 64bit long integer. In order to do that it calls virRandomBytes() to fill the integer with random bytes and then masks out everything but requested bits. However, when doing that it shifts 1U and not 1ULL. So effectively, requesting 32 random bis or more always return 0 which is not random enough. Signed-off-by: Michal Privoznik <mprivozn> Reviewed-by: Daniel P. Berrangé <berrange> Reviewed-by: Pino Toscano <ptoscano> diff --git a/src/util/virrandom.c b/src/util/virrandom.c index 01cc82a052..3c011a8615 100644 --- a/src/util/virrandom.c +++ b/src/util/virrandom.c @@ -68,7 +68,7 @@ uint64_t virRandomBits(int nbits) return 0; } - ret &= (1U << nbits) - 1; + ret &= (1ULL << nbits) - 1; return ret; } test result: PASSED tested with: libvirt-4.5.0-12.virtcov.el7.x86_64 covered by auto cases: npiv.nodedev_create_destroy.negative_testing.create_max_vhbas npiv.nodedev_create_destroy.negative_testing.frequently_create_destroy_vhbas.random_wwn Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2019:2294 |