Bug 1904415 - bind-9.11.25-2.fc34 broke app system starts
Summary: bind-9.11.25-2.fc34 broke app system starts
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Fedora
Classification: Fedora
Component: bind
Version: rawhide
Hardware: aarch64
OS: Linux
Priority: unspecified
Severity: urgent
Target Milestone: ---
Assignee: Petr Menšík
QA Contact: Fedora Extras Quality Assurance
Reported: 2020-12-04 12:00 UTC by customercare
Modified: 2020-12-08 16:03 UTC

Last Closed: 2020-12-08 16:03:32 UTC
Type: Bug

Description customercare 2020-12-04 12:00:37 UTC
Description of problem:

Tested on aarch64 

As soon as bind-libs from bind-9.11.25-2.fc34 are installed on the system,
apps like firefox, chatty etc. do not start. 

Debugging this with strace shows a timeout caused by a crash in the resolver function, which ultimately prevents the app from starting.

Version-Release number of selected component (if applicable):

bind-9.11.25-2.fc34

Example crash:

Dez 02 23:29:58 fedorapine systemd-coredump[1750]: Process 1682 (chatty) of user 1000 dumped core.
                                                   
                                                   Stack trace of thread 1682:
                                                   #0  0x0000007fa82cd630 raise (libc.so.6 + 0x38630)
                                                   #1  0x0000007fa82b981c abort (libc.so.6 + 0x2481c)
                                                   #2  0x0000007f7e4c51fc log_assert_failed_realm.constprop.0 (libnss_resolve.so.2 + 0xf1fc)
                                                   #3  0x0000007f7e4b9044 strv_find (libnss_resolve.so.2 + 0x3044)
                                                   #4  0x0000007f7e4c2734 _nss_resolve_gethostbyname4_r (libnss_resolve.so.2 + 0xc734)
                                                   #5  0x0000007fa835b208 gaih_inet.constprop.0 (libc.so.6 + 0xc6208)
                                                   #6  0x0000007fa835bd44 getaddrinfo (libc.so.6 + 0xc6d44)
                                                   #7  0x0000007fa6e93a3c get_fqhostname (libsasl2.so.3 + 0xfa3c)
                                                   #8  0x0000007fa6e94088 sasl_client_new (libsasl2.so.3 + 0x10088)
                                                   #9  0x0000007fa8abdd58 jabber_auth_start_cyrus (libjabber.so.0 + 0x4fd58)


The solution to get the apps starting again was downgrading to 9.11.25-1:

2020-12-02T23:38:03+0100 SUBDEBUG Downgrade: bind-license-32:9.11.25-1.fc34.noarch
2020-12-02T23:38:03+0100 SUBDEBUG Downgrade: bind-libs-lite-32:9.11.25-1.fc34.aarch64
2020-12-02T23:38:03+0100 SUBDEBUG Downgrade: bind-libs-32:9.11.25-1.fc34.aarch64
2020-12-02T23:38:03+0100 SUBDEBUG Downgrade: bind-utils-32:9.11.25-1.fc34.aarch64
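To see which packages changed around the time the crashes started or stopped, log entries like the ones above can be filtered by package name. The following is a minimal sketch, assuming every entry has the same "timestamp SUBDEBUG action: package" layout as the lines quoted in this report. The helper names `parse_transactions` and `changes_for` are hypothetical, and the sample list simply reuses lines from this report.

```python
import re
from datetime import datetime

# Matches lines such as:
# 2020-12-06T13:04:14+0100 SUBDEBUG Upgraded: glibc-2.32.9000-17.fc34.aarch64
LINE_RE = re.compile(r"^(\S+) SUBDEBUG (\w+): (\S+)$")


def parse_transactions(lines):
    """Yield (timestamp, action, package) for each package change line."""
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # not a package change entry, skip it
        stamp, action, package = match.groups()
        yield datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%S%z"), action, package


def changes_for(lines, prefix):
    """Return the changes whose package name starts with the given prefix."""
    return [t for t in parse_transactions(lines) if t[2].startswith(prefix)]


# Sample input copied from the log excerpts in this report.
sample = [
    "2020-12-02T23:38:03+0100 SUBDEBUG Downgrade: bind-license-32:9.11.25-1.fc34.noarch",
    "2020-12-02T23:38:03+0100 SUBDEBUG Downgrade: bind-libs-lite-32:9.11.25-1.fc34.aarch64",
    "2020-12-02T23:38:03+0100 SUBDEBUG Downgrade: bind-libs-32:9.11.25-1.fc34.aarch64",
    "2020-12-02T23:38:03+0100 SUBDEBUG Downgrade: bind-utils-32:9.11.25-1.fc34.aarch64",
    "2020-12-06T13:04:14+0100 SUBDEBUG Upgraded: glibc-2.32.9000-17.fc34.aarch64",
]

for when, action, package in changes_for(sample, "bind"):
    print(when.isoformat(), action, package)
```

Passing a different prefix, for example "glibc" or "systemd", lists the changes for that package family instead, which helps when checking whether another update happened in the same time window.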

Comment 1 Petr Menšík 2020-12-08 15:50:05 UTC
I doubt this is related. bind-9.11.25-2 only changed how the documentation is rebuilt; no code changed in the bind source between those packages.

Also, the bind component affects dig, host and nslookup in the bind-utils package, but it does not change *getaddrinfo* calls or *libnss_resolve* in any way. That would be a failure of systemd-resolved.

Can you recheck that downgrading the bind package actually helps? Is bind running on the local machine? It seems to me this should be moved to the systemd component; I think they did a rebase not long ago. Would systemctl stop

Can you check that the command 'getent hosts <anyname>' does not crash and delivers the expected results?

Can you reproduce the crashes with 'resolvectl query <anyname>'? Replace <anyname> with the names that chatty or the crashing firefox is using.
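A check along the same lines can also be scripted. The following is a minimal sketch, not taken from this report: it calls getaddrinfo in a child Python process, so that an abort inside libnss_resolve would kill only the child, and then reports whether the child resolved the name, failed cleanly, or died from a signal. The helper names `describe_exit` and `probe` are hypothetical.

```python
import signal
import subprocess
import sys

# Child program: resolve one name through the libc getaddrinfo() path.
CHILD = "import socket, sys; socket.getaddrinfo(sys.argv[1], None)"


def describe_exit(returncode):
    """Turn a subprocess return code into a short human-readable verdict."""
    if returncode < 0:
        # On POSIX, a negative return code means the child died from a signal.
        try:
            name = signal.Signals(-returncode).name
        except ValueError:
            name = f"signal {-returncode}"
        return f"crashed ({name})"
    if returncode == 0:
        return "resolved"
    # For example an uncaught socket.gaierror in the child (exit code 1).
    return "failed without crashing"


def probe(hostname, timeout=30):
    """Resolve hostname in a child process and classify how the child ended."""
    try:
        result = subprocess.run(
            [sys.executable, "-c", CHILD, hostname],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return "timed out"
    return describe_exit(result.returncode)


if __name__ == "__main__":
    # Usage: pass the host names to test as command-line arguments.
    for host in sys.argv[1:]:
        print(host, "->", probe(host))
```

A "crashed (SIGABRT)" verdict would match the abort seen in the stack trace of the description, while "failed without crashing" only means the name could not be resolved.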

Can you install debuginfo and try to get a more detailed stack trace? ::

dnf debuginfo-install systemd
dnf install gdb
coredumpctl gdb

Then run these commands in the debugger:
 (gdb) bt
 (gdb) quit

Can you also attach the 'named-checkconf -p' output? Is nameserver 127.0.0.1 in /etc/resolv.conf?

Comment 2 customercare 2020-12-08 16:03:32 UTC
1) The downgrade helped. Immediately after the downgrade the apps started again. No relogin or reboot was needed.

2) The system has changed in the meantime. We had several updates, including a glibc update. I will update bind and recheck it for you.

Result after update to latest bind: Firefox works again. Chatty works too. 

I assume this update fixed it:

2020-12-06T13:03:27+0100 SUBDEBUG Upgraded: glibc-devel-2.32.9000-17.fc34.aarch64
2020-12-06T13:04:12+0100 SUBDEBUG Upgraded: glibc-langpack-de-2.32.9000-17.fc34.aarch64
2020-12-06T13:04:13+0100 SUBDEBUG Upgraded: glibc-common-2.32.9000-17.fc34.aarch64
2020-12-06T13:04:13+0100 SUBDEBUG Upgraded: glibc-langpack-en-2.32.9000-17.fc34.aarch64
2020-12-06T13:04:14+0100 SUBDEBUG Upgraded: glibc-2.32.9000-17.fc34.aarch64


It looks like a case of inconsistent package states.

3) It is and was a Wi-Fi connection with DHCP, so the correct nameserver was set in /etc/resolv.conf:

cat /etc/resolv.conf 
# Generated by NetworkManager
search fritz.box
nameserver 192.168.0.254
nameserver fd00::e228:6dff:fe26:b918
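The question of whether a local nameserver is configured can also be answered mechanically. The following is a minimal sketch, assuming the usual resolv.conf layout of one "nameserver ADDRESS" entry per line. The helper names `nameservers` and `loopback_servers` are hypothetical, and the sample text is the file content shown above.

```python
import ipaddress


def nameservers(resolv_conf_text):
    """Return the nameserver addresses listed in resolv.conf-style text."""
    servers = []
    for line in resolv_conf_text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith(("#", ";")):
            continue  # blank line or comment
        fields = stripped.split()
        if len(fields) >= 2 and fields[0] == "nameserver":
            servers.append(fields[1])
    return servers


def loopback_servers(servers):
    """Return the entries that point at the local machine (127.0.0.0/8, ::1)."""
    local = []
    for server in servers:
        address = server.split("%", 1)[0]  # drop an IPv6 zone id, if any
        try:
            if ipaddress.ip_address(address).is_loopback:
                local.append(server)
        except ValueError:
            continue  # not a valid IP address, ignore it
    return local


# Sample input: the resolv.conf content quoted in this comment.
sample = """\
# Generated by NetworkManager
search fritz.box
nameserver 192.168.0.254
nameserver fd00::e228:6dff:fe26:b918
"""

servers = nameservers(sample)
print("nameservers:", servers)
print("local nameservers:", loopback_servers(servers))
```

For the file above the list of local nameservers is empty, so neither a local bind instance nor a local stub resolver is addressed directly through /etc/resolv.conf.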

