Bug 1228445
Summary: | unbound-libs takes 50 minutes to install | ||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Product: | [Fedora] Fedora | Reporter: | York Possemiers <ypossem> | ||||||||||
Component: | unbound | Assignee: | Tomáš Hozza <thozza> | ||||||||||
Status: | CLOSED CURRENTRELEASE | QA Contact: | Fedora Extras Quality Assurance <extras-qa> | ||||||||||
Severity: | low | Docs Contact: | |||||||||||
Priority: | unspecified | ||||||||||||
Version: | 23 | CC: | gsgatlin, magnus, mmedvede, pj.pandit, psimerda, pwouters, thozza, ypossem | ||||||||||
Target Milestone: | --- | ||||||||||||
Target Release: | --- | ||||||||||||
Hardware: | x86_64 | ||||||||||||
OS: | Linux | ||||||||||||
Whiteboard: | |||||||||||||
Fixed In Version: | Doc Type: | Bug Fix | |||||||||||
Doc Text: | Story Points: | --- | |||||||||||
Clone Of: | Environment: |
Anaconda netinst
Fedora Workstation configuration
x86_64
|
|||||||||||
Last Closed: | 2016-01-27 23:20:49 UTC | Type: | Bug | ||||||||||
Regression: | --- | Mount Type: | --- | ||||||||||
Documentation: | --- | CRM: | |||||||||||
Verified Versions: | Category: | --- | |||||||||||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||||||||
Cloudforms Team: | --- | Target Upstream Version: | |||||||||||
Embargoed: | |||||||||||||
Attachments: |
|
Description
York Possemiers
2015-06-05 01:25:51 UTC
Hi. Thank you for the report. The change made in the %post script was only to start the unbound-anchor.timer unit. I can try to reproduce your issue. Can you please attach installation log from the installation? Thanks! Created attachment 1036108 [details]
Installation log
Sorry I am not sure which constitutes the "install log" so I will provide a couple of the files in /var/log/anaconda. I'd prefer to be able to attach multiple files to one response, but that doesn't seem to be possible.
Created attachment 1036109 [details]
dnf log
Created attachment 1036110 [details]
dnf rpm log
(In reply to York Possemiers from comment #2) > Created attachment 1036108 [details] > Installation log > > Sorry I am not sure which constitutes the "install log" so I will provide a > couple of the files in /var/log/anaconda. I'd prefer to be able to attach > multiple files to one response, but that doesn't seem to be possible. That's exactly what I meant. I'll have a look to see if there is any clue. Thank you! This issue persists with the upgrade (with dnf update) to unbound-libs.x86_64 1.5.4-1.fc23 This bug appears to have been reported against 'rawhide' during the Fedora 23 development cycle. Changing version to '23'. (As we did not run this process for some time, it could affect also pre-Fedora 23 development cycle bugs. We are very sorry. It will help us with cleanup during Fedora 23 End Of Life. Thank you.) More information and reason for this action is here: https://fedoraproject.org/wiki/BugZappers/HouseKeeping/Fedora23 I am observing similar behavior during unbound-libs-1.5.3-4.fc21.ppc64.rpm install on Fedora 21. I did find that the process causing the delay is showing up as -unbound-anchor,3656,unbound -a /var/lib/unbound/root.anchor -c /etc/unbound/icannbundle.pem When listed with 'pstree -panuh' Killing the process allowed the installation to proceed. Thanks for the clue. This is really helpful. I'll try to look at the possible cause and will update the bug once I have some new information. So the %post section for unbound-libs is executing unbound-anchor on installation and upgrade. I can see that the /var/lib/unbound/root.anchor is one mistake, since it should be /var/lib/unbound/root.key. The purpose of this is to update the root key. I think it makes sense to run it only on the package installation, or not at all, since there is a systemd timer unit that does this every 24h. Never the less, the utility should not hang for such long. Could you please attach to the process using gdb and get the traceback? I guess there is some issue with the network communication. Is the host on which you installed the package connected to the Internet? Is there any network connectivity at all? Can you please run the command by hand and see if it hangs, too? unbound-anchor -a /var/lib/unbound/root.anchor -c /etc/unbound/icannbundle.pem I'm not able to reproduce it. I'll most probably remove the call from %post section completely, but would like to check if there is any real issue in the tool. Thanks in advance! actually the root.key and root.anchor had a different format in which they stored the root key. To prevent apps from breaking on the format change, I used a different file name. Maybe we should move the anchor call from %post to a onetime service call on startup? That at least will not cause an install to hang (In reply to Paul Wouters from comment #12) > Maybe we should move the anchor call from %post to a onetime service call on > startup? That at least will not cause an install to hang That's what the timer unit does... runs the oneshot service. Therefore I don't think it is necessary to call it during the installation at all. It is done regularly, so there is no need for explicit call. Tried starting a clean f21 VM and running 'yum install -y unbound-libs'. The long wait is reproducible every time. After killing the offending process and letting the installation to complete, I manually ran: unbound-anchor -a /var/lib/unbound/root.anchor -c /etc/unbound/icannbundle.pem Which also hanged. I got this bt each time with gdb: #0 0x00003fff8e0a4958 in __epoll_wait_nocancel () from /lib64/libc.so.6 #1 0x00003fff8e76a824 in 00000043.plt_call.bufferevent_disable () from /lib64/libevent-2.0.so.5 #2 0x00003fff8e74b61c in .event_base_loop () from /lib64/libevent-2.0.so.5 #3 0x00003fff8e74cec4 in .event_base_dispatch () from /lib64/libevent-2.0.so.5 #4 0x00003fff8e8eaa6c in 0000106d.plt_call.PyTuple_New () from /lib64/libunbound.so.2 #5 0x00003fff8e965fd4 in .ub_resolve () from /lib64/libunbound.so.2 #6 0x0000000045d746b0 in 000002e1.plt_call.ub_ctx_create () #7 0x00003fff8dfa438c in .generic_start_main.isra () from /lib64/libc.so.6 #8 0x00003fff8dfa45b4 in .__libc_start_main () from /lib64/libc.so.6 #9 0x0000000000000000 in ?? () The VM I ran the test on has network connectivity. Just in case, I have also ran the same test on baremetal F21 host, with the same result of unbound-anchor hanging, with exactly the same gdb backtrace. Thanks for the traceback. What is strange is the call to PyTuple_New(), since this is part of the Python bindings and unbound-anchor is a C binary. Reported to upstream: https://www.nlnetlabs.nl/bugs-script/show_bug.cgi?id=689 Maybe they will have some ideas... Would you be able to capture the packet communication (using e.g. tcpdump) from the VM while running unbound-anchor (with the reproduced hang)? Thanks in advance. Created attachment 1054022 [details]
unbound-anchor strace hang
I have attached an strace captured with like this: while [ -z "$(ps ax |grep unboun[d]-anch | grep -v runuser)" ]; do sleep 0.1; done; export PID="$(ps ax |grep unboun[d]-anch |grep -v runuser |awk '{print $1}')";echo $PID; strace -p $PID -f -e trace=network -s 10000 2> strace_unbound_anchor I am seeing problems in fedora 22, MATE spin installing from DVD. Ran "dnf update" as first thing after install completes. it hung on unbound-libs in two separate installs. I will leave my "dnf update" command running overnight to see if it ever finishes. after leaving my pc overnight it did work. I did a dnf downgrade unbound-libs and then did a "time dnf upgrade unbound-libs" and it took 38m26.542s to run the dnf transaction. Pretty long time in scriptlet. I had the exact same problem, and found that the problem was that my firewall (Cisco ASA) was dropping a DNS UDP reply. I have a DNS inspection rule and that rule drops packages lager than 512 bytes. If you have a Cisco ASA, you could either disable the DNS inspection, or edit the rule: https://supportforums.cisco.com/discussion/11325716/asa-udp-dns-reply-dropped We do not control the networking here at NCSU. There is another bureaucracy that deals with that. We cannot log into the devices or change anything about them. I am unsure if they would be willing to change the cisco switches because of one package in fedora. But I will submit a support ticket and ask them. That cisco default value has been a bane to DNSSEC. If that is enabled, your DNS is basically getting mangled, Cisco even tells you themselves: http://www.cisco.com/web/about/security/intelligence/dnssec.html#15 You can confirm this is your problem at: https://www.dns-oarc.net/oarc/services/replysizetest However, if this is really the cause of the unbound-libs stall, we should look into fixing that and as Tomas said, we should remove it from the installer and let a one-shot service deal with this. Here is what the networking group said. ------------------------------------------------------------------------------- Hi Gary, Are you talking about outgoing or incoming UDP DNS packets? What is the business justification for enabling this change? I see 913 bytes UDP DNS packets coming from the outside to your machine, but they are from international IPs. (Amsterdam, Tokyo, etc.) Regards, Early Yu Network Operations Communication Technologies ------------------------------------------------------------------------------- Is the problem with outgoing or incoming UDP DNS packets? It's about incoming to you, eg from DNSSEC servers on the outside world to your machine that is installing unbound-libs business justification: Do not break DNS - The cisco option is wrong and obsoleted. See URL I provided in comment 23. Paul, thank you for your help. They changed DNS Inspection and now the package updates in like 7 seconds. Haven't seen this in a while, I'm presuming it fixed |