Description of problem: When I start a new message (or do anything else to open the mail composer), Evolution stops responding and one CPU core sits near 100% usage. Version-Release number of selected component (if applicable): $ dnf list installed evolution* gnupg* Installed Packages evolution.x86_64 3.50.1-1.fc39 @updates evolution-data-server.x86_64 3.50.1-1.fc39 @updates evolution-data-server-langpacks.noarch 3.50.1-1.fc39 @updates evolution-langpacks.noarch 3.50.1-1.fc39 @updates gnupg2.x86_64 2.4.3-3.fc39 @updates gnupg2-smime.x86_64 2.4.3-3.fc39 @updates How reproducible: Always Steps to Reproduce: 1. Open Evolution. It's responsive; you can open and click on stuff. 2. Start a new mail message by clicking the New Mail button in the headerbar. (Or open the mail composer another way; I've tried: click reply with a message selected in the main window; click reply in a message window; right-click on a message list item and pick Edit as new message; double-click a message in the Drafts folder; run xdg-email) Actual results: The composer window does not open. Evolution stops responding and eventually Gnome Shell pops up the offer to kill it or wait. One CPU core sits near 100% for the 'evolution' process. Expected results: A composer window should open and let you use it. Additional info: strace shows the 'evolution' process is doing a whole lot of polling: poll([{fd=97, events=POLLIN}, {fd=99, events=POLLIN}, {fd=101, events=POLLIN}, {fd=94, events=POLLOUT}, {fd=-1}, {fd=-1}], 6, 30000) = 1 ([{fd=94, revents=POLLOUT}]) When trying to open the composer, Evolution does create a couple of additional bwrap'ed Webkit processes for the composer, but it also runs `gpg2`. Killing the gpg2 process fixes the hang; Evolution opens the composer window and continues -- as far as I can tell so far, nothing else goes wrong.
Oh, I removed my ~/.gnupg folder and everything works normally again. My current .gnupg (which got recreated shortly after I removed the problematic one) contains 3 files (common.conf, pubring.kbx, trustdb.gpg). The problematic one contains common.conf and trustdb.gpg, but also: - a whole bunch of files with names like .#lk0x000055b42110eb50.[hostname].4693 (with my hostname there) - gnupg_spawn_keyboxd_sentinel.lock - a subfolder public-keys.d which in turn contains: - pubring.db - pubring.db.lock - many more files of the form .#lk0x0000555d50829e70.[hostname].5078 (again with my hostname there, but some say "fedora" there, which was the default hostname when I installed Fedora 39 but I changed before running Evolution) I will try to make time later today to reproduce it in a VM fresh installation.
Thanks for a bug report. I'm wondering what would the composer call the gpg for. It does that on send, when PGP signing or encryption is enabled, but otherwise it should have much need to call it, maybe apart of trying to get your public key, to add it into the Autocrypt header. Still, it's a single call, thus it might be gnupg itself going odd. Could you install debuginfo packages for glib2, evolution-data-server and evolution and when the composer will be in this stuck state, grab a backtrace of the process, to see what it is doing, please? You can install the packages with: sudo debuginfo-install glib2 evolution-data-server evolution You can get the backtrace with a command like this: $ gdb --batch --ex "t a a bt" --pid=`pidof evolution` &>bt.txt Please check the bt.txt for any private information, like passwords, email addresses, server addresses,... I usually search for "pass" at least (quotes for clarity only), before sharing it anywhere. Could you check whether there's any gpg process running, please? There probably isn't, but just in case. Thanks in advance.
Good news! I can reliably reproduce the bug in a fresh VM. Steps to reproduce: 1. Install Fedora Workstation 39 from the live image. 2. sudo dnf install evolution seahorse 3. Search in the overview. 4. Open Gnome Settings' Sharing pane and change the hostname. 5. Reboot. 6. Search in the overview. 7. Open Evolution. 8. In the Evolution wizard, create a new account (I used a dummy address, local Maildir, and Sendmail). 9. Click the new message button in the headerbar. 10. Evolution is frozen. In the "search in the overview" steps you can search for anything, including just using search to do the next step of opening Settings (step 3+4) or Evolution (step 6+7). I'm pretty sure the search step is needed only to make Seahorse's search provider run, which uses GPG. Yes, there is a GPG process still running starting after step 6. Looking at ps -ef | grep gpg, after step 6 there is this: gpg --batch --status-fd 16 --logger-fd 20 --no-tty --charset=utf8 --enable-progress-filter --exit-on-status-write-error --display=:0 --with-colons --list-secret-keys -- More of them seem to accumulate every time you use the overview search. So there's probably a bug in Seahorse or GPG. Then after step 9, there is additionally this running: gpg2 --verbose --no-secmem-warning --no-greeting --no-tty --batch --yes --status-fd=88 --export --export-options export-minimal no-export-attributes <me> (with the dummy email address me I made up in step 8). I am about to attach the backtrace you requested. (Since I'd made it with a dummy email account I left it verbatim in the backtrace :)
Created attachment 1999192 [details] Backtrace while hung gdb --batch --ex "t a a bt" --pid=`pidof evolution` &>bt.txt
Thanks for the update and all the testing. I see Evolution is waiting for a response from the gpg process, it's asking for a public key for your test account, to add it into the message for the Autocrypt feature. Those `--list-secret-key` processes are suspicious, they should end quickly as well. Maybe there's a problem with gpg itself. I'll check it here and let you know when I know more. Thank you again for your help on this.
I've installed a Fedora 39 machine, which does not exhibit the problem, but when I install a new virtual machine from Fedora-Workstation-Live-x86_64-39-1.5.iso and update it to the latest bits (which includes gnupg2-2.4.3-4.fc39.x86_64), then even gpg --list-keys is stuck and waiting for nothing. Running `gdb gpg --ex "r --list-keys"` shows the below backtrace when it's waiting. (gdb) bt #0 0x00007ffff7b897dd in recvmsg () from /lib64/libc.so.6 #1 0x00007ffff7c8ab57 in __assuan_recvmsg (ctx=ctx@entry=0x55555567c280, fd=fd@entry=4, msg=msg@entry=0x7fffffffcda0, flags=flags@entry=0) at /usr/src/debug/libassuan-2.5.6-2.fc39.x86_64/src/system-posix.c:133 #2 0x000055555556f6e1 in _assuan_npth_recvmsg (ctx=0x55555567c280, fd=4, msg=0x7fffffffcda0, flags=0) at /usr/src/debug/gnupg2-2.4.3-4.fc39.x86_64/g10/gpg.c:1085 #3 0x00007ffff7c88c39 in _assuan_recvmsg (flags=0, msg=0x7fffffffcda0, fd=<optimized out>, ctx=0x55555567c280) at /usr/src/debug/libassuan-2.5.6-2.fc39.x86_64/src/system.c:258 #4 uds_reader (ctx=0x55555567c280, buf=<optimized out>, buflen=<optimized out>) at /usr/src/debug/libassuan-2.5.6-2.fc39.x86_64/src/assuan-uds.c:113 #5 0x00007ffff7c88ea0 in readline (ctx=ctx@entry=0x55555567c280, buf=buf@entry=0x55555567c3d0 "", buflen=buflen@entry=1002, r_nread=r_nread@entry=0x7fffffffce9c, r_eof=r_eof@entry=0x55555567c3cc) at /usr/src/debug/libassuan-2.5.6-2.fc39.x86_64/src/assuan-buffer.c:79 #6 0x00007ffff7c89c3b in _assuan_read_line (ctx=ctx@entry=0x55555567c280) at /usr/src/debug/libassuan-2.5.6-2.fc39.x86_64/src/assuan-buffer.c:151 #7 0x00007ffff7c8c568 in assuan_client_read_response (ctx=ctx@entry=0x55555567c280, line_r=line_r@entry=0x7fffffffcfb0, linelen_r=linelen_r@entry=0x7fffffffcfac) at /usr/src/debug/libassuan-2.5.6-2.fc39.x86_64/src/client.c:87 #8 0x00007ffff7c8c921 in _assuan_read_from_server (ctx=ctx@entry=0x55555567c280, response=response@entry=0x7fffffffd004, off=off@entry=0x7fffffffd000, convey_comments=convey_comments@entry=0) at /usr/src/debug/libassuan-2.5.6-2.fc39.x86_64/src/client.c:209 #9 0x00007ffff7c8caec in _assuan_connect_finalize (ctx=ctx@entry=0x55555567c280, fd=fd@entry=4, flags=flags@entry=1) at /usr/src/debug/libassuan-2.5.6-2.fc39.x86_64/src/assuan-socket-connect.c:125 #10 0x00007ffff7c91824 in assuan_socket_connect (ctx=0x55555567c280, name=<optimized out>, name@entry=0x55555567d150 "/run/user/1000/gnupg/S.keyboxd", server_pid=server_pid@entry=0, flags=flags@entry=1) at /usr/src/debug/libassuan-2.5.6-2.fc39.x86_64/src/assuan-socket-connect.c:343 #11 0x0000555555621859 in start_new_service.constprop.0 (r_ctx=r_ctx@entry=0x7fffffffd290, module_name_id=module_name_id@entry=13, program_name=0x0, opt_lc_ctype=opt_lc_ctype@entry=0x0, opt_lc_messages=opt_lc_messages@entry=0x0, session_env=session_env@entry=0x0, autostart=1, verbose=0, debug=0, status_cb_arg=0x55555567b3c0, status_cb=0x0, errsource=GPG_ERR_SOURCE_GPG) at ../common/asshelp.c:447 #12 0x000055555558330c in start_new_keyboxd (errsource=GPG_ERR_SOURCE_GPG, status_cb=0x0, status_cb_arg=0x55555567b3c0, debug=<optimized out>, verbose=<optimized out>, autostart=<optimized out>, keyboxd_program=<optimized out>, r_ctx=0x7fffffffd290) at ../common/asshelp.c:629 #13 create_new_context (r_ctx=0x55555567d998, ctrl=0x55555567b3c0) at /usr/src/debug/gnupg2-2.4.3-4.fc39.x86_64/g10/call-keyboxd.c:143 #14 open_context (r_kbl=0x55555567d530, ctrl=0x55555567b3c0) at /usr/src/debug/gnupg2-2.4.3-4.fc39.x86_64/g10/call-keyboxd.c:219 #15 keydb_new (ctrl=0x55555567b3c0) at /usr/src/debug/gnupg2-2.4.3-4.fc39.x86_64/g10/call-keyboxd.c:277 #16 0x00005555555b5170 in list_all (ctrl=ctrl@entry=0x55555567b3c0, secret=secret@entry=0, mark_secret=0) at /usr/src/debug/gnupg2-2.4.3-4.fc39.x86_64/g10/keylist.c:753 #17 0x00005555555b5851 in public_key_list (ctrl=0x55555567b3c0, list=0x0, locate_mode=0, no_local=0) at /usr/src/debug/gnupg2-2.4.3-4.fc39.x86_64/g10/keylist.c:201 #18 0x000055555556b9f3 in main (argc=<optimized out>, argv=<optimized out>) at /usr/src/debug/gnupg2-2.4.3-4.fc39.x86_64/g10/gpg.c:4733 When I move away ~/.gnupg/ directory and run `gpg --list-keys`, I get this: $ gpg --list-keys gpg: directory '/home/user/.gnupg' created gpg: /home/user/.gnupg/trustdb.gpg: trustdb created gpg: DBG: kbx_client_data_cmd: finished command with error: Not initialized gpg: keydb_search_first failed: Not initialized and it finishes immediately. Re-running the command shows: $ gpg --list-keys gpg: DBG: kbx_client_data_cmd: finished command with error: Not initialized gpg: keydb_search_first failed: Not initialized from which I understand gnupg is not able to initialize its own directory from scratch for some reason. Interestingly, removing the directory on the machine gpg worked makes the directory recration work fine. Both are up-to-date Fedora 39 workstations using the same gnupg2 packages. I also tried to move ~/.gnupg from the good machine to the bad machine and it worked fine (by an accident both machines had been renamed to `f39`, if that matters - I noticed the hostname is stored in some gpg files). Since then, even when I return back the "bad" directory, the `gpg --list-keys` is happy, even when I remove the directory and let gpg recreate it, then it begins to work. This is truly weird. In any case, as you mentioned it at the beginning, there's a problem with gnupg, thus I move this there for further investigation. I do not think my above test with the removal of the .gnupg/ directory made during install/user-creation is related here, it's only a side problem.
I just installed fresh Fedora 39 VM (with updated packages including gnupg2-2.4.3-4.fc39.x86_64) and running just gpg --list-keys work ok [root@fedora39 ~]# gpg --list-keys gpg: directory '/root/.gnupg' created gpg: /root/.gnupg/trustdb.gpg: trustdb created Downgrading or upgrading gnupg (only versions -2 and -4 are in repositories now) looks working ok. Could you make sure that on fresh installation/after reboot with the -4 version it still hangs for you? The version -3 was bad as I was too fast fixing things, but if you just dnf update, there might still be some daemon services running that gnupg wants to talk to and fails.
> if you just dnf update, there might still be some daemon services running that gnupg wants to talk to and fails. Do you mean after `dnf udpate` the things should just work or you mean to restart the machine after the `dnf update`, to make things work properly?
Likely something starts the gnupg tools/daemons in the background so dnf update might not restart them so rebooting should be safest option.
Okay, I gave this another go. Steps I've done, which are close to Peter's steps from the comment #3: a) install Fedora-Workstation-Live-x86_64-39-1.5.iso into a new virtual machine b) boot the installed machine - it has gnupg2-2.4.3-2.fc39.x86_64 - there is running: gpg-agent --homedir /var/lib/fwupd/gnupg --use-standard-socket --daemon - I do not know what started it, this is the first run after finishing initial setup wizard c) search in the Activities for "terminal" and run it d) execute in the terminal: sudo dnf install evolution seahorse e) close the terminal f) open Settings->Sharing and change the host name from "fedora" to, say, "f39" g) restart the machine, disable install of pending updates h) login and search for "evolution" in Activities and run it i) create a test account, then press Ctrl+N (in the Mail window) j) the evolution is frozen; once asked by GNOME Shell "Force Quit" it k) search for terminal and open it l) run: sudo dnf update gnupg2 ; you get gnupg2-2.4.3-4.fc39.x86_64 m) reboot the machine n) login, then search for terminal and run it o) execute: gpg --list-keys * it's still frozen, the restart did not help p) break it (Ctrl+C), then run: sudo dnf update ; many packages are updated q) once the update is finished run `reboot` r) login, search for terminal, then run `gpg --list-keys` * it's frozen; verified the installed version is gnupg2-2.4.3-4.fc39.x86_64 also with gnupg2-smime of the same version s) reboot the machine, boot it, in gdb press Ctrl+Alt+F4 and login to the text console, not GUI t) run: gpg --list-keys * there's a message: "gpg: removing stale lock file (created by 2902) * the gpg is stuck * Ctrl+C it u) run: sudo hostnamectl set-hostname fedora * it's the original name, as set by the Live image setup * run: reboot v) login to the GUI, search for "terminal" and run there: gpg --list-keys * there are repeating messages: gpg: waiting for lock (held by 1786) ... * `ps ax | grep 1786` doesn't find a hit, there is no such process running Thus, it looks like this is about some stall lock file, from previously named machine, which blocks the process when the machine is renamed. Why install of seahorse and evolution is needed here I've no idea, they just help to reproduce it.
Evolution is not required to trigger the GPG bug, but it makes it obviously visible. Seahorse is only needed because it uses GPGME in its search provider, which is why triggering search in the Overview is required in the Comment #3 steps. You can verify this by disabling its search provider (without triggering a search first!) and see that Evolution doesn't freeze. Distilling Milan's work in comment 10, here are minimal steps to reproduce the GPG hang without other software: 1. Install Fedora Workstation 39 from the live image and open a terminal (I don't think it matters how). 2. (Optionally, sudo dnf upgrade gnupg2) 3. gpg --list-keys This reports that it created ~/.gnupg and returns immediately. 4. Change the hostname (either from Gnome Settings or with sudo hostnamectl set-hostname whatever) 5. reboot 6. gpg --list-keys After the reboot, all attempts to run gpg --list-keys will hang. This happens both with the original -2 version and with upgrading to -4 in step 2.
Not sure if this is the same issue, but two reports on Ask Fedora look similar where gpg hangs: https://discussion.fedoraproject.org/t/gpg-blocking-forever-cant-git-commit-as-a-result/96605/7 https://discussion.fedoraproject.org/t/weird-libreoffice-freeze-due-to-gnupg/96360 There, the issue was the new "keyboxd" bit that gpg uses by default in v2.4.1+, and the "fix" was to disable it in ~/.gnupg/common.conf.
Thank you for the investigation and adding another data points. So this is the issue with the new 2.4.1+ series of GnuPG. Interesting. So the trigger is the hostname change? Following the comment #11, without the reboot, I do not see the same issue yet, regardless of killing the keyboxd.
It turned out to be very tricky to reproduce. For example I tried to install only seahorse, but it was not enough. Installing both seahorse and evolution did trigger the problem (using the fresh machine installed from the live DVD).
(In reply to Jakub Jelen from comment #13) > Interesting. So the trigger is the hostname change? Following the comment > #11, without the reboot, I do not see the same issue yet, regardless of > killing the keyboxd. I was the one posting on Ask Fedora about the Libreoffice GnuPG related freeze. Just wanted to mention that there was actually a hostname change in my case too - I changed the machine name which changed the hostname.
Finally got to reproduce this on Fedora 39 and debug this a bit. Debug logs from gpg do not show much information, just that it starts the keyboxd (and after it is killed we continue): tester@whatever:~$ gpg --list-keys --debug-all 2>&1 | tee /tmp/gpg-debug gpg: reading options from '[cmdline]' gpg: reading options from '/home/tester/.gnupg/common.conf' gpg: enabled debug flags: packet mpi crypto filter iobuf memory cache memstat trust hashing ipc clock lookup extprog gpg: enabled compatibility flags: gpg: DBG: [no clock] start gpg: using pgp trust model gpg: DBG: [no clock] keydb_new gpg: no running keyboxd - starting '/usr/libexec/keyboxd' gpg: removing stale lockfile (created by 3458) gpg: waiting for the keyboxd to come up ... (5s) gpg: DBG: chan_4 <- [error: Connection reset by peer] In the meantime, I see the keyboxd process running: [tester@whatever ~]$ ps auxf | grep keybox tester 2989 0.0 0.0 225884 1548 ? Ss 16:37 0:00 \_ keyboxd --homedir /home/tester/.gnupg --daemon After killing it and trying to start it manually, I can see that what the keyboxd is waiting for (the last message repeats indefinitely): [tester@whatever ~]$ /usr/libexec/keyboxd --debug-all --homedir /home/tester/.gnupg --daemon keyboxd[3548]: reading options from '[cmdline]' keyboxd[3548]: reading options from '/home/tester/.gnupg/common.conf' keyboxd[3548]: enabled debug flags: mpi crypto memory cache memstat hashing ipc clock lookup keyboxd[3548]: listening on socket '/run/user/1000/gnupg/S.keyboxd' keyboxd[3549]: waiting for lock (held by 2886) ... Oh ... there is a lock file hold by a process that no longer exists! This one even with the old hostname in the filename, from yesterday: tester@whatever:~1$ ls -al .gnupg/public-keys.d/ | grep 2886 -rw-r--r--. 2 tester tester 18 Nov 23 17:40 .#lk0x000056307f85fe60.fedora.2886 In the first output, we see that the gpg removes some stale locks, but probably ignores the ones from different hostnames. This is probably reasonable countermeasure when the home is shared among multiple machines though. But given that this is causing a dead lock and the locks are stale anyway, I think there needs to be some reasonable way out. I am not sure if I have a knowledge to debug this further so I will open an usptream issue
https://dev.gnupg.org/T6838
+1 this just affected me too; I've found this issue via search, seems crazy (to me) that a hostname change would break gnupg; blogged about it onhttps://github.com/vorburger/vorburger.ch-Notes/blob/develop/security/gpg-hangs-on-hostname-change.md.
> blogged about it on https://github.com/vorburger/vorburger.ch-Notes/blob/develop/security/gpg-hangs-on-hostname-change.md.
*** Bug 2265550 has been marked as a duplicate of this bug. ***
(In reply to Jakub Jelen from comment #17) > https://dev.gnupg.org/T6838 The upstream bug is closed now, though it does not suggest any automatic cleanup of the stale locks (which makes sense as long as there is no way to differentiate stale lock from a used/valid lock). Supposing the locks do save current host name in them, would it be possible (and correct) to skip/auto-cleanup locks from host name other than the current host name is? The time when the host name had been just changed, without doing re-login/restart, can cause some trouble, maybe?
I don't know. I updated Fedora 40 to gnupg2-2.4.4, which should have the fix from the above upstream issue (regardless how partial it is). I can probably update also 39, but I do not know what other issues it might cause. The automatic cleanup is problematic for the homes that can be shared among different hosts, usually over network so the safest option is likely to keep the decision on the user to figure out what is stale and what not based on the use case.
Sure, if you think it's not safe for the f39, then it's okay to not add it there. I also agree with the automatic cleanup, the app cannot know what is really stale and what not (it's sometimes impossible to delete opened files, but I do not know how that precisely works, especially with network file systems, which you mentioned). If I understand it correctly, the upstream comment [1] summarizes what a user or either gpg version can do to unlock the db/remove stale locks, right? I may update evolution-data-server (libcamel) to check for those "waiting for lock release" notices from the gnupg, to stop waiting and claim an error to the user, because right now Evolution client can just freeze with no response, which is pretty bad user experience, to say the least. In any case, if the [1] is the cook book for the affected users, then you can close this bug, from my point of view. [1] https://dev.gnupg.org/T6838#181242
The situation in 2.4.3 is unfortunate as it can easily hang forever. Reading through the changelog, I do not see many huge changes so I will likely rebase also in F39, which should improve the state for Evolution.
FEDORA-2024-a9db24201d (gnupg2-2.4.4-1.fc39) has been submitted as an update to Fedora 39. https://bodhi.fedoraproject.org/updates/FEDORA-2024-a9db24201d
FEDORA-2024-a9db24201d has been pushed to the Fedora 39 testing repository. Soon you'll be able to install the update with the following command: `sudo dnf upgrade --enablerepo=updates-testing --refresh --advisory=FEDORA-2024-a9db24201d` You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2024-a9db24201d See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.
FEDORA-2024-a9db24201d (gnupg2-2.4.4-1.fc39) has been pushed to the Fedora 39 stable repository. If problem still persists, please make note of it in this bug report.
Just as a confirmation, I re-tested this with gnupg2-2.4.4-1.fc39.x86_64 and: a) `gpg --list-keys` prints several lines containing "gpg: Note: database_open 134217901 waiting for lock (held by 3645) ..."; b) gpg does not wait indefinitely, it takes only several seconds and then it gives up with: "gpg: keydb_search_first failed: Connection timed out", which is a great thing. Evolution can show the lines from the gnupg in the GUI, for example when sending a gpg-signed message, thus the users can be informed about the problem. It does not contain what to do next, but it's the Internet for. The fix from the upstream is a good one, from my point of view.
Thanks for verifying that and glad it is doing the job.