Created attachment 369726 [details]
/tmp/syslog

Description of problem:
* Fails to activate networking

Version-Release number of selected component (if applicable):
* anaconda version 13.8

How reproducible:

Steps to Reproduce:
1. Initiate a manual install on virt or bare metal
2. Provide a location for a remote install source

Actual results:

  ┌────────────┤ Error ├────────────┐
  │                                 │
  │ Unable to retrieve              │
  │ http://download.fedoraproject.o │
  │ rg/pub/fedora/linux/development │
  │ /x86_64/os//images/install.img. │
  │                                 │
  │             ┌────┐              │
  │             │ OK │              │
  │             └────┘              │
  │error reading header: cpio: read failed -│Success
  │                                 │
  └─────────────────────────────────┘

Expected results:
Downloading images/install.img correctly.

Additional info:
* See attached files (/tmp/syslog and anaconda.log).
* From the failing system, I am able to ping other hosts, but only by IP. It seems DNS might not be set up correctly?
Created attachment 369727 [details] /tmp/anaconda.log
According to your syslog, you've got nameserver information, but this looks likely to be the problem:

<185>Nov 16 15:03:02 NET: dhclient: failed to create default route: 10.10.11.254 dev eth0
I wonder why it's even trying to do that; when dhclient is run by NetworkManager, dhclient-script doesn't get run, and NetworkManager handles the default route instead. But that message appears to come from dhclient-script's add_default_gateway() function, which I don't understand...

David, any idea here? NM runs dhclient with a command line like:

/sbin/dhclient -d -sf /usr/libexec/nm-dhcp-client.action -pf /var/run/dhclient-usb0.pid -lf /var/lib/dhclient/dhclient-f4419c0a-1740-4b2b-b61e-91935bdae692-usb0.lease -cf /var/run/nm-dhclient-usb0.conf usb0

using the custom script file of course... the script handling seems a bit convoluted in dhclient, but I can't offhand see where it would ever be calling dhclient-script.
I'm definitely able to reproduce this locally. NetworkManager and loader are working fine. At least for me, I get a DHCP lease, NM does what it does, I have an IP, routing table configured, and /etc/resolv.conf written. I hit OK at the error message for Unable to Download and change the hostname of my install server to the IP address. It works after that.

It looks like our problem is with libcurl and DNS resolution. I added this to loader/urls.c:

diff --git a/loader/urls.c b/loader/urls.c
index 495516a..24ceb33 100644
--- a/loader/urls.c
+++ b/loader/urls.c
@@ -104,6 +104,8 @@ int urlinstTransfer(struct loaderData_s *loaderData, struct
                     char **extraHeaders, char *dest) {
     struct progressCBdata *cb_data;
     CURLcode status;
+    CURLSHcode sh;
+    CURLSH *sharedns = NULL;
     struct curl_slist *headers = NULL;
     char *version;
     FILE *f = NULL;
@@ -126,6 +128,34 @@ int urlinstTransfer(struct loaderData_s *loaderData, struct
     curl_easy_setopt(loaderData->curl, CURLOPT_URL, ui->url);
     curl_easy_setopt(loaderData->curl, CURLOPT_WRITEDATA, f);
 
+    if ((sharedns = curl_share_init()) != NULL) {
+        sh = curl_share_setopt(sharedns, CURLSHOPT_SHARE, CURL_LOCK_DATA_DNS);
+
+        if (sh != CURLSHE_OK) {
+            logMessage(ERROR, "%s: %d: %s", __func__, __LINE__,
+                       curl_easy_strerror(sh));
+            sh = curl_share_cleanup(sharedns);
+
+            if (sh != CURLSHE_OK) {
+                logMessage(ERROR, "%s: %d: %s", __func__, __LINE__,
+                           curl_easy_strerror(sh));
+            }
+
+            sharedns = NULL;
+        } else {
+            status = curl_easy_setopt(loaderData->curl, CURLOPT_SHARE,
+                                      sharedns);
+            if (status != CURLE_OK) {
+                logMessage(ERROR, "%s: %d: %s", __func__, __LINE__,
+                           curl_easy_strerror(status));
+            }
+        }
+    } else {
+        logMessage(ERROR, "%s: %d: curl_share_init() returned NULL",
+                   __func__, __LINE__);
+        sharedns = NULL;
+    }
+
     /* If a proxy was provided, add the options for that now. */
     if (loaderData->proxy && strcmp(loaderData->proxy, "")) {
         curl_easy_setopt(loaderData->curl, CURLOPT_PROXY, loaderData->proxy);
@@ -183,6 +213,12 @@ int urlinstTransfer(struct loaderData_s *loaderData, struct
     fclose(f);
     free(version);
 
+    sh = curl_share_cleanup(sharedns);
+    if (sh != CURLSHE_OK) {
+        logMessage(ERROR, "%s: %d: %s", __func__, __LINE__,
+                   curl_easy_strerror(sh));
+    }
+
     return status;
 }

*BUT*, that didn't work. I need to read up on libcurl a bit more and figure out what's happening. libcurl has CURLOPT_DNS_USE_GLOBAL_CACHE, which is marked as deprecated. You are supposed to create a shared variable and enable the DNS settings there. That's what I tried, but my first attempt didn't work.

Just wanted to let people know where I am. It's definitely not a NetworkManager problem.
That's odd. Why does curl care at all? Doesn't it just do gethostbyname() or whatever and libc takes care of the DNS resolution? Maybe the res_init() call in get_connection() somehow isn't doing what we want? Can you try calling res_init() right before you start the libcurl requests?

Looks like the only place the loader calls res_init() is in get_connection():

    if (state == NM_STATE_CONNECTED) {
        logMessage(INFO, "%s (%d): NetworkManager connected",
                   __func__, __LINE__);
        res_init();
        g_object_unref(client);
        return 0;
    }

which should be fine, but let's try tossing res_init() in a few other places to work around glibc stupidity...
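For anyone who wants to try the res_init() suggestion in isolation, here is a minimal sketch of "re-read the resolver config right before a lookup". The helper name resolve_fresh() and its return codes are made up for illustration and are not part of loader; it only assumes glibc's res_init() and getaddrinfo().

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/nameser.h>
#include <netdb.h>
#include <resolv.h>
#include <string.h>

/*
 * Hypothetical helper (not from loader): force libc to re-read
 * /etc/resolv.conf, then resolve `host`.
 * Returns 0 on success, 1 if res_init() failed, 2 if the lookup failed.
 */
int resolve_fresh(const char *host)
{
    struct addrinfo hints;
    struct addrinfo *res = NULL;

    /* Re-initialize resolver state so nameservers written after
     * process start (e.g. by DHCP) are picked up. */
    if (res_init() != 0)
        return 1;

    memset(&hints, 0, sizeof(hints));
    hints.ai_family = AF_UNSPEC;      /* IPv4 or IPv6 */
    hints.ai_socktype = SOCK_STREAM;

    if (getaddrinfo(host, NULL, &hints, &res) != 0)
        return 2;

    freeaddrinfo(res);
    return 0;
}
```

Calling resolve_fresh() with the install server's hostname just before the transfer would show whether libc itself can resolve the name once the resolver state is refreshed.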
libcurl maintains some sort of state when you initialize it. From their curl_easy_setopt() documentation:

"NOTE: the name resolve functions of various libc implementations don't re-read name server information unless explicitly told so (for example, by calling res_init(3)). This may cause libcurl to keep using the older server even if DHCP has updated the server info, and this may look like a DNS cache issue to the casual libcurl-app user."

What we were doing in loader was calling curl_global_init() once and then using that curl object throughout. I moved the init calls for curl to our urlinstTransfer() function and added cleanup calls, so each time we need to use curl, we set things up, call curl, and clean up. That fixes the problem we're seeing.
Fixed in commit 46312dc05b61d7fd18fe9710461eb9b0a9118607.