Bug 1524833

Summary: pkispawn is unable to generate a CSR
Product: [Fedora] Fedora
Reporter: Standa Laznicka <slaznick>
Component: freeipa
Assignee: IPA Maintainers <ipa-maint>
Status: CLOSED UPSTREAM
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: high
Priority: unspecified
Version: 26
CC: abokovoy, alee, edewata, hkario, ipa-maint, jcholast, jhrozek, jpazdziora, kwright, mharmsen, pvoborni, rcritten, slaznick, ssorce, tdudlak, tmraz
Keywords: DevelBlocker, Regression
Target Milestone: ---
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Last Closed: 2018-01-15 07:25:43 UTC
Type: Bug
Attachments (description: flags):
pki-ca-spawn log: none
freeIPA server install log: none
openssl strace: none
openssl ltrace: none

Description Standa Laznicka 2017-12-12 08:38:49 UTC
Created attachment 1366462 [details]
pki-ca-spawn log

Description of problem:
When installing freeIPA in a container with external CA, the pkispawn command fails because it can't do `openssl rand -out somethingsomething`

Version-Release number of selected component (if applicable):
pki-base-10.5.1-1.fc27.noarch

How reproducible:
100%

Steps to Reproduce:
1. setsebool -P container_manage_cgroup 1 && setsebool -P domain_kernel_load_modules 1 && mkdir -p /opt/ipa-data-master && chcon -t svirt_sandbox_file_t /opt/ipa-data-master
2. domain=IPA.TEST; docker run -t --name freeipa-popelnice -h $HOSTNAME \
                         --tmpfs /run --tmpfs /tmp \
                         -v /dev/urandom:/dev/random:ro -v /opt/ipa-data-master:/data \
                         -v /sys/fs/cgroup:/sys/fs/cgroup:ro --cap-add=SYS_TIME \
                         -e DEBUG_TRACE=1 -e DEBUG_NO_EXIT=1 freeipa/freeipa-server:fedora-27 \
                        --hostname $HOSTNAME --domain $domain --realm ${domain^^} \
                        -p milan_je_buh123 -a milan_je_buh123 --setup-dns --auto-forwarders --no-reverse -U --external-ca

Actual results:
"""
...
Configuring certificate server (pki-tomcatd). Estimated time: 3 minutes
  [1/8]: configuring certificate server instance
ipaserver.install.dogtaginstance: CRITICAL Failed to configure CA instance: Command '/usr/sbin/pkispawn -s CA -f /tmp/tmpfk89bf3o' returned non-zero exit status 1.
"""


Expected results:
Certificate server gives us a CSR to sign.

Additional info:
The ipaserver-install.log also shows:
"""
Installation failed: Command '['openssl', 'rand', '-out', '/tmp/tmpJaE76a/noise.bin', '2048']' returned non-zero exit status 1
2017-12-12T07:42:53Z DEBUG stderr=unable to write 'random state'
"""
You'll find it enclosed.

Please note that `docker exec freeipa-popelnice openssl rand -out /tmp/something 2048` works fine.

Comment 1 Standa Laznicka 2017-12-12 08:42:32 UTC
Created attachment 1366463 [details]
freeIPA server install log

Comment 2 Standa Laznicka 2017-12-13 07:58:53 UTC
This can actually be observed in a Fedora 26 container, too, and breaks a valid freeIPA use case.

Comment 4 Endi Sukma Dewata 2018-01-02 12:27:17 UTC
Are you able to run this operation manually?

$ openssl rand -out /tmp/noise.bin 2048

If that doesn't work, there might be something specific about your system related to OpenSSL. The command works fine on my system.

Comment 5 Standa Laznicka 2018-01-02 12:31:16 UTC
As the last line of the bug description says:

"""
Please note that `docker exec freeipa-popelnice openssl rand -out /tmp/something 2048` works fine.
"""

Comment 6 Endi Sukma Dewata 2018-01-02 12:54:08 UTC
Are you running openssl with the same user used to run pkispawn in IPA?

Perhaps someone more familiar with OpenSSL should take a look at this, but it probably has something to do with ~/.rnd file:

https://stackoverflow.com/questions/94445/using-openssl-what-does-unable-to-write-random-state-mean

pkispawn is simply calling openssl to generate random noise.

Comment 7 Standa Laznicka 2018-01-03 07:32:06 UTC
Yes, it's run from freeIPA installer which is run as root in a container.

The same scenario seems to have been working in Fedora 25-based containers, not sure what changed.

Comment 8 Jan Pazdziora 2018-01-03 07:39:40 UTC
Well, the ipa-server-install is run as root but are we sure that that's the id the openssl command is run as, invoked somehow by that pkispawn execution? I don't see noise.bin anywhere in IPA's git repository, so that seems to come completely from the pki* land, doesn't it?

Comment 9 Standa Laznicka 2018-01-03 08:00:21 UTC
pkispawn is run as a mere subprocess of the freeIPA installer so I should hope the user stays the same (root) at least for that process. Whatever pkispawn does after that with the openssl command I can't be sure and that is probably a question for Endi.

I believe noise.bin is the file that's getting generated to be a random seed for a CSR creation and thus it won't be in any repository. I'll check the ownership of the /root/.rnd file though.

Comment 10 Standa Laznicka 2018-01-03 09:03:33 UTC
There's no such file as /root/.rnd in the container. Given the link to stackoverflow, that should probably be OK as pkispawn and its subprocesses should be able to access the /root directory (HOME=/root), right?

Comment 11 Endi Sukma Dewata 2018-01-03 09:31:40 UTC
As mentioned earlier, pkispawn is simply calling openssl to generate a random noise file (i.e. noise.bin) in a temporary folder since it's needed by certutil to generate a CSR to be signed by an external CA (installation step 1). I'm not aware of any user switching inside pkispawn. The openssl does create a /root/.rnd file on my system. It also always resets the file permission to 0600 regardless of the original permission.

Does the SELinux configuration prevent any of these operations? Have you tried running in permissive mode?

Comment 12 Standa Laznicka 2018-01-08 07:44:53 UTC
Endi,

Sorry for getting back to this so late. There don't seem to be any SELinux denials connected to this and I did try to run the container in permissive.

Comment 13 Standa Laznicka 2018-01-08 07:45:44 UTC
(In reply to Stanislav Laznicka from comment #12)
> Endi,
> 
> Sorry for getting back to this this late. There don't seem to be any SELinux
> denials connected to this and I did try to run the container in permissive.

To no avail, I forgot to add.

Comment 14 Standa Laznicka 2018-01-11 11:13:26 UTC
Some new findings -> if I inject pdb.set_trace() in freeIPA installer during docker build, and then run `pkispawn` manually by `docker exec -ti <contname> bash` at the point where it would normally be run by the installer, it succeeds. It also succeeds if I use the same trick and try to do this from python command line by importing the `run()` function from `ipapython.ipautil` and run pkispawn with it.

Weird.

Comment 15 Standa Laznicka 2018-01-11 13:10:38 UTC
Created attachment 1380037 [details]
openssl strace

Tried stracing the openssl process. There does not seem to be anything interesting in the trace that points to the error, except for the openssl process returning an error at the end. Attaching the strace just for reference.

The strace also proves that the noise.bin file is indeed being written to as the root user.

Comment 16 Standa Laznicka 2018-01-11 15:24:42 UTC
Created attachment 1380067 [details]
openssl ltrace

Adding an ltrace from a different run (ends in the same way).

Comment 17 Standa Laznicka 2018-01-11 15:28:30 UTC
Tomas, Hubert, we're dealing with this issue where openssl called from an installer subprocess ends with an error: "unable to write 'random state'"

I attached an ltrace and an strace to this BZ. Do you think you'd be able to read any information about why this failure happens?

Comment 18 Jan Pazdziora 2018-01-11 15:46:53 UTC
Well, app_RAND_write_file has code

    if (file == NULL)
        file = RAND_file_name(buffer, sizeof buffer);
    if (file == NULL || !RAND_write_file(file)) {
        BIO_printf(bio_err, "unable to write 'random state'\n");
        return 0;
    }

So that message (and return 0) would happen either if file == NULL (meaning RAND_file_name returned NULL) or the RAND_write_file failed.

Looking at RAND_write_file, it starts with

    i = stat(file, &sb);

and we do not see that stat, so I assume we never get into there, meaning the file is null.

The

[pid   572] getuid()                    = 0
[pid   572] geteuid()                   = 0
[pid   572] getgid()                    = 0
[pid   572] getegid()                   = 0
[pid   572] write(2, "unable to write 'random state'\n", 31) = 31

suggests the

int OPENSSL_issetugid(void)
{
    if (getuid() != geteuid())
        return 1;
    if (getgid() != getegid())
        return 1;
    return 0;
}

is called and since all those four calls return 0, the !='s are never true and so OPENSSL_issetugid returns 0. So we were in RAND_file_name and in

    if (OPENSSL_issetugid() != 0) {
        use_randfile = 0;
    } else {
        s = secure_getenv("RANDFILE");
        if (s == NULL || *s == '\0') {
            use_randfile = 0;
            s = secure_getenv("HOME");
        }
    }

the control would go through the else branch and if RANDFILE is not set, it would do

  use_randfile = 0;

and try to set path in HOME ... is HOME set in the container (in that script's execution)? Maybe openssl never tries to actually write the 'random state' and we need to figure out how to actually help it write one, help it to figure out the filename it should write.

These were done on openssl-1.1.0g-1.fc26.src.rpm.
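
To make the branch above easier to reason about, here is a rough Python model of it. This is only a sketch of the quoted logic, not the OpenSSL code: `rand_file_name` is a hypothetical helper, and the `.rnd` fallback under HOME is an assumption taken from the ~/.rnd file mentioned in comment 6.

```python
from typing import Mapping, Optional


def rand_file_name(env: Mapping[str, str], uid: int = 0, euid: int = 0,
                   gid: int = 0, egid: int = 0) -> Optional[str]:
    """Simplified model of the RAND_file_name branch quoted above.

    Returns the path that would be used for the random state, or None when
    no path can be worked out (the "unable to write 'random state'" case).
    """
    # Mirrors OPENSSL_issetugid(): the environment is ignored when the real
    # and effective ids differ, so no file name can be derived.
    if uid != euid or gid != egid:
        return None

    # A non-empty RANDFILE is used as the file name itself.
    randfile = env.get("RANDFILE")
    if randfile:
        return randfile

    # Otherwise fall back to HOME. Assumption: the file is HOME/.rnd.
    home = env.get("HOME")
    if home:
        return home + "/.rnd"

    # Neither variable is usable: there is nowhere to write.
    return None
```

With all four ids equal to 0, as in the strace, `rand_file_name({})` gives None, while `rand_file_name({"HOME": "/root"})` gives "/root/.rnd". That would be consistent with the failure only showing up when both variables are missing from the process environment.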

Comment 19 Jan Pazdziora 2018-01-11 15:50:28 UTC
Looking at the ltrace, both the RAND_file_name calls (not sure why there are two) end with 0, and there is never RAND_write_file called, which suggests that

    file == NULL

which suggests that RAND_file_name did not figure out what the filename should be.

Comment 20 Jan Pazdziora 2018-01-11 15:56:27 UTC
https://www.openssl.org/docs/faq.html#USER2:

A possible reason is that no default filename is known because neither RANDFILE nor HOME is set. (Versions up to 0.9.6 used file ".rnd" in the current directory in this case, but this has changed with 0.9.6a.)

Not sure why we did not see the error on older Fedoras that also had openssl newer than 0.9.6 ... but maybe something else changed which made either of those environment variables unset? In any case it'd be good to verify if they are or aren't set.

Comment 21 Standa Laznicka 2018-01-11 16:08:05 UTC
I believe the first of the two RAND_file_name calls tries to read the file and the second tries to write it (RAND_status() is called while trying to get a method to get some random data).

I tried a kind of stupid workaround where I took the file generated by `openssl rand 2048` on my fs, moved it to /opt/ipa-data-master, set proper context and added `-e RANDFILE="/data/.rnd"` to docker run arguments. It did not help, though.

Comment 22 Jan Pazdziora 2018-01-11 16:11:49 UTC
Note that we run the installer from systemd service, so your environment variable value probably was not propagated to the actual process. You can check /proc/${the-pid}/environ. While you are at it, could you please check if it has or doesn't have HOME set?
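
A minimal sketch of that check, assuming a Linux /proc filesystem where /proc/<pid>/environ holds NUL-separated KEY=VALUE records. The helper names (`parse_environ`, `read_process_environ`, `report`) are made up for illustration.

```python
from pathlib import Path
from typing import Dict


def parse_environ(raw: bytes) -> Dict[str, str]:
    """Parse the NUL-separated KEY=VALUE records of /proc/<pid>/environ."""
    env: Dict[str, str] = {}
    for record in raw.split(b"\0"):
        if not record:
            continue  # skip the empty record after the trailing NUL
        key, sep, value = record.partition(b"=")
        if sep:
            env[key.decode(errors="replace")] = value.decode(errors="replace")
    return env


def read_process_environ(pid: int) -> Dict[str, str]:
    """Read the environment a running process was started with (Linux only)."""
    return parse_environ(Path(f"/proc/{pid}/environ").read_bytes())


def report(env: Dict[str, str]) -> Dict[str, str]:
    """Show whether the two variables openssl looks at are set."""
    return {name: env.get(name, "<unset>") for name in ("HOME", "RANDFILE")}
```

Calling `report(read_process_environ(pid))` with the pid of the installer or of pkispawn should show directly whether HOME and RANDFILE reached that process.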

Comment 24 Jan Pazdziora 2018-01-11 16:30:32 UTC
Just set HOME (or RANDFILE) in ipa-server-configure-first, that should do the trick.

Comment 25 Tomas Mraz 2018-01-11 16:47:48 UTC
Just to clarify - actually ignoring the error message and exit value would also work because the random data is written to the file on the command-line, only the auxiliary random data storage in ~/.rnd is not updated.

However I will fix this to not fail this way in openssl upstream. I think the auxiliary .rnd write failure should not make the command report failure.
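
For illustration only, a caller that wanted to tolerate the exit value could judge success by the output file instead. This is a sketch under that assumption, not what pkispawn does; `run_accepting_written_output` is a hypothetical helper.

```python
import os
import subprocess
from typing import Sequence


def run_accepting_written_output(cmd: Sequence[str], out_path: str,
                                 nbytes: int) -> bool:
    """Run cmd and judge success by the output file, not the exit status."""
    # Remove any stale file so an old result cannot be mistaken for success.
    if os.path.exists(out_path):
        os.remove(out_path)

    # check=False: a non-zero exit status is deliberately not treated as fatal.
    subprocess.run(cmd, check=False)

    # Success means the requested number of bytes actually landed in the file.
    return os.path.isfile(out_path) and os.path.getsize(out_path) == nbytes
```

For the command from the install log this would be called as `run_accepting_written_output(["openssl", "rand", "-out", path, "2048"], path, 2048)`. The trade-off is that any other reason for the non-zero exit status is hidden as well, so fixing the environment is the cleaner option.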

Comment 26 Standa Laznicka 2018-01-11 16:57:05 UTC
Thank you Tomas for the clarification. I added a PR with an environment fix to freeipa-container repository: https://github.com/freeipa/freeipa-container/pull/184

Comment 27 Jan Pazdziora 2018-01-11 16:57:21 UTC
So if we do not care about the auxiliary data, would RANDFILE=/dev/null also work, without any ill effects?

Comment 28 Tomas Mraz 2018-01-11 17:11:24 UTC
Yes, that should work as well.

Comment 29 Jan Pazdziora 2018-01-12 08:17:28 UTC
(In reply to Tomas Mraz from comment #25)
> 
> However I will fix this to not fail this way in openssl upstream. I think
> the auxiliary .rnd write failure should not make the command to report
> failure.

Maybe amending the message would help some people: it could distinguish the case where openssl tried to write and failed from the case where it did not even try because it did not know where to write.

Comment 30 Rob Crittenden 2018-01-12 19:42:50 UTC
Add HOME env var to ipa-server-configure.service merged, https://github.com/freeipa/freeipa-container/commit/bd3a33adb878a823d2b20c8f1f4fcdeb063a9436

Comment 31 Standa Laznicka 2018-01-15 07:25:43 UTC
Thank you, Rob.

Tomas, I read you're performing the changes upstream and thus I presume you're tracking them as such. I am going to close this BZ as fixed but feel free to reopen pointing to openssl component if you'd like to track your progress here.

Comment 32 Jan Pazdziora 2018-01-15 12:53:43 UTC
(In reply to Rob Crittenden from comment #30)
> Add HOME env var to ipa-server-configure.service merged,

Actually, in the end we went with RANDFILE=/dev/null.
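
For anyone launching openssl or pkispawn from their own Python code, the same idea can be sketched as below. This is not the change made in freeipa-container, only an illustration of passing RANDFILE to a child process; `env_with_randfile` and `run_with_randfile` are hypothetical helpers.

```python
import os
import subprocess
from typing import Dict, Mapping, Optional, Sequence


def env_with_randfile(base: Optional[Mapping[str, str]] = None,
                      randfile: str = "/dev/null") -> Dict[str, str]:
    """Return a copy of the environment with RANDFILE set."""
    # Copy so the caller's mapping (or os.environ) is never modified.
    env = dict(os.environ if base is None else base)
    env["RANDFILE"] = randfile
    return env


def run_with_randfile(cmd: Sequence[str]) -> subprocess.CompletedProcess:
    """Run cmd with RANDFILE=/dev/null added to the current environment."""
    return subprocess.run(cmd, env=env_with_randfile(), check=True)
```

Because the variable is set on the child's environment explicitly, it does not depend on what the parent service happened to inherit, which was the problem with passing `-e RANDFILE=...` to docker run in comment 21.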