Error handling in the output of the *dhcp-script* has been improved
Previously, any error in the output of the *dhcp-script* was ignored. With this update, the output of the script is logged on the `add`, `old`, `del`, `arp-add`, `arp-del`, and `tftp` actions. As a result, errors are displayed while *dnsmasq* is running.
Note that the lease-init action happens only when *dnsmasq* starts. With this update, only a summary of its output is logged; the standard error output is not logged and passes to the *systemd* service for logging.
Description of problem:
dnsmasq does not log anything useful when it fails to parse the output of the --dhcp-script script. For example, given this command:
#dnsmasq --strict-order --bind-interfaces --conf-file= --pid-file=/var/lib/nova/networks/nova-br100.pid --listen-address=192.168.12.1 --except-interface=lo --dhcp-range=set:novanetwork,192.168.12.2,static,255.255.252.0,120s --dhcp-lease-max=1024 --dhcp-hostsfile=/var/lib/nova/networks/nova-br100.conf --dhcp-script=/usr/bin/nova-dhcpbridge --leasefile-ro --domain=novalocal
and a script that produces this output:
2015-01-31 12:59:03.499 12398 DEBUG nova.openstack.common.lockutils [req-8cb351cc-a2fe-4bf6-8801-acaa0a6b424c None None] Got semaphore / lock \"__get_backend\" inner /usr/lib/python2.6/site-packages/nova/openstack/common/lockutils.py
dnsmasq crashes with a broken pipe:
write(1, "2015-01-31 12:59:03.499 12398 DEBUG nova.openstack.common.lockutils [req-8cb351cc-a2fe-4bf6-8801-acaa0a6b424c None None] Got semaphore / lock \"__get_backend\" inner /usr/lib/python2.6/site-packages/nova/openstack/common/lockutils.py:245\n", 236) = -1 EPIPE (Broken pipe)
12398 --- SIGPIPE (Broken pipe) @ 0 (0) ---
nova-dhcpbridge returned that output because it was in debug mode, but that is a matter for another Bugzilla.
However, dnsmasq logs only the following entries:
Jan 30 20:36:38 dddm0996a dnsmasq[15337]: lease-init script returned exit code 1
Jan 30 20:36:38 dddm0996a dnsmasq[15337]: FAILED to start up
Version-Release number of selected component (if applicable):
dnsmasq-2.48-14.el6.x86_64
How reproducible:
not sure
Steps to Reproduce:
1. Create a bash script (for example /tmp/myscript.sh) that prints the following output:
2015-01-31 12:59:03.499 12398 DEBUG nova.openstack.common.lockutils [req-8cb351cc-a2fe-4bf6-8801-acaa0a6b424c None None] Got semaphore / lock \"__get_backend\" inner /usr/lib/python2.6/site-packages/nova/openstack/common/lockutils.py
2. Execute dnsmasq with --dhcp-script=/tmp/myscript.sh:
#dnsmasq --strict-order --bind-interfaces --conf-file= --pid-file=/var/lib/nova/networks/nova-br100.pid --listen-address=192.168.12.1 --except-interface=lo --dhcp-range=set:novanetwork,192.168.12.2,static,255.255.252.0,120s --dhcp-lease-max=1024 --dhcp-hostsfile=/var/lib/nova/networks/nova-br100.conf --dhcp-script=/tmp/myscript.sh --leasefile-ro --domain=novalocal
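Step 1 asks for a bash script. For illustration only, an equivalent stand-in written in Python might look like the sketch below. The function names `script_output` and `main` are hypothetical, and the sketch assumes that the backslashes before the quotes in the quoted output come from strace escaping and are not part of the real output.

```python
#!/usr/bin/env python3
"""Hypothetical stand-in for /tmp/myscript.sh: prints one stray debug line."""
import sys

# The line from the report. Assumption: the quotes are plain in the real
# output and only appear escaped because they were copied from strace.
DEBUG_LINE = (
    "2015-01-31 12:59:03.499 12398 DEBUG nova.openstack.common.lockutils "
    "[req-8cb351cc-a2fe-4bf6-8801-acaa0a6b424c None None] "
    'Got semaphore / lock "__get_backend" inner '
    "/usr/lib/python2.6/site-packages/nova/openstack/common/lockutils.py"
)


def script_output(action):
    """Return what the script writes to stdout for the given action.

    The same line is returned for every action, including init, where
    dnsmasq expects lease entries rather than log text.
    """
    return DEBUG_LINE + "\n"


def main(argv):
    # dnsmasq passes the action as the first argument; default to init here.
    action = argv[1] if len(argv) > 1 else "init"
    try:
        sys.stdout.write(script_output(action))
        sys.stdout.flush()
    except BrokenPipeError:
        # The reader closed the pipe early, as seen in the strace output.
        pass


if __name__ == "__main__":
    main(sys.argv)
```

Saved as an executable file, the sketch can be passed to --dhcp-script in place of /tmp/myscript.sh.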
Actual results:
A broken pipe is reported when dnsmasq is run under strace:
write(1, "2015-01-31 12:59:03.499 12398 DEBUG nova.openstack.common.lockutils [req-8cb351cc-a2fe-4bf6-8801-acaa0a6b424c None None] Got semaphore / lock \"__get_backend\" inner /usr/lib/python2.6/site-packages/nova/openstack/common/lockutils.py:245\n", 236) = -1 EPIPE (Broken pipe)
12398 --- SIGPIPE (Broken pipe) @ 0 (0) ---
and these log entries appear in /var/log/messages:
Jan 30 20:36:38 dddm0996a dnsmasq[15337]: lease-init script returned exit code 1
Jan 30 20:36:38 dddm0996a dnsmasq[15337]: FAILED to start up
Expected results:
We could have a log entry like this:
Jan 30 20:36:38 dddm0996a dnsmasq[15337]: error parsing lease-init output: <some piece of the information parsed>
This small improvement would help us determine the real problem faster.
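As a rough illustration of the requested message, the sketch below checks whether a line looks like a lease entry and, if not, builds a log text in the proposed form. It is not dnsmasq code. The function name `describe_lease_error`, the 40-character snippet limit, and the simplified lease layout (expiry time, MAC address, IPv4 address, hostname, client ID, separated by whitespace) are all assumptions made for the example.

```python
import ipaddress


def describe_lease_error(line, max_snippet=40):
    """Return None if the line looks like a lease entry, else a log message.

    Assumed, simplified lease layout (IPv4 only):
        <expiry> <mac> <ipv4-address> <hostname> <client-id>
    """
    fields = line.split()
    looks_valid = len(fields) >= 5 and fields[0].isdigit()
    if looks_valid:
        try:
            ipaddress.IPv4Address(fields[2])
        except ValueError:
            looks_valid = False
    if looks_valid:
        return None
    # Quote only the start of the offending line to keep the log short.
    snippet = line.strip()[:max_snippet]
    return "error parsing lease-init output: " + snippet


if __name__ == "__main__":
    samples = [
        "1422709143 52:54:00:12:34:56 192.168.12.5 host1 *",
        "2015-01-31 12:59:03.499 12398 DEBUG nova.openstack.common.lockutils",
    ]
    for sample in samples:
        message = describe_lease_error(sample)
        if message is not None:
            print(message)
```

Quoting only a short piece of the unparsed line keeps the log entry compact while still showing what the script actually wrote.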
Additional info:
Reporter: In the future, please file bugs for the platform product. There is no such thing as a separate dnsmasq component in RHOS. Every update to dnsmasq has to go through the regular RHEL update process.
Created attachment 1266224
proposed fix to dhcp-script error logging
I have prepared a patch to log the output of dhcp-script. It logs all actions but init, that would end only with SIGPIPE before.
The patch does not handle possible errors from the script during the init action. That action is started differently and should not prevent the script from continuing. dnsmasq already reads the script's stdout there, but does not touch stderr. Handling it would require changing the parsing of leases from a pull approach to an asynchronous push approach. It seems to me that this part should be easy to debug in dnsmasq debug mode (-d). All other actions are handled well by the implemented pipe.
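The general idea of reading a helper's output through a pipe and turning it into log entries can be sketched as follows. This is only an illustration in Python, not the attached patch, which changes dnsmasq itself. The function name `run_and_collect` and the log prefix are hypothetical, and merging stderr into the same pipe is a choice made for this sketch.

```python
import subprocess
import sys

# Hypothetical prefix for the example; not dnsmasq's actual log wording.
LOG_PREFIX = "dhcp-script output: "


def run_and_collect(argv):
    """Run a helper and return (exit_code, log_messages).

    Both stdout and stderr of the child are read through one pipe, so
    anything the helper prints can be logged instead of being lost.
    """
    proc = subprocess.run(
        argv,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    messages = [
        LOG_PREFIX + line
        for line in proc.stdout.splitlines()
        if line.strip()
    ]
    return proc.returncode, messages


if __name__ == "__main__":
    # A child process that prints an error and exits with a failure status.
    child = [
        sys.executable,
        "-c",
        "import sys; sys.stderr.write('boom\\n'); sys.exit(2)",
    ]
    code, messages = run_and_collect(child)
    for message in messages:
        print(message)
    print("exit status:", code)
```

The sketch waits for the helper to finish before logging, which is simpler than the asynchronous handling a long-running daemon would need.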
Upstream merged logging of unusual output from dhcp-script for all actions but init. For the lease init action, it handles only the stdout of the script. It can detect that something is wrong and print a few words of what appeared.
It does one new thing, however. If there is a failure in the dhcp-script lease init, dnsmasq will die. It will not start with empty leases, as it did before. Please let me know if such behaviour is not appreciated.
Is it ok if the lease-init action now prevents dnsmasq from starting in cases other than a non-zero exit status? If the format produced by the script is not recognized or is no longer supported, manual action from the administrator would be required before dnsmasq can start again. Before, it silently stopped reading leases at the first format error. Are you ok with that?
That will be true only for dhcp-script; leases from a plain file will be silently rewritten. That is considered a required feature for plain files. It is required for skipping unsupported IPv6 leases if IPv6 support was disabled, but I doubt we will ever need it.
Created attachment 1277639
talking-script for garbage handling tests
I used this script when testing input into dnsmasq. I took the libvirt dnsmasq setup from systemctl show libvirtd, but changed dhcp-script to point to this script. Dnsmasq has to be killed manually before it is started again. I then commented and uncommented some lines in talking-script to produce different output.
Hello Ioanna,
the package name is dnsmasq. "Masq" comes from the word masquerade. I admit the name is strange, but it is the name. Because dnsmasq provides DNS and DHCP services in a single process, *dhcp* cannot be running on its own. I replaced it with dhcp-script, because that is the script run on every DHCP request. I kept the dash, because that is the name of the configuration option. A space can be used, but then it has to be *dhcp script*, not *dhcp*. I also removed the mention of format errors. It logs any error the script produces, whether a format error or a different kind (resource limit failure, error in the script itself). Otherwise I am ok with that.
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.
For information on the advisory, and where to find the updated
files, follow the link below.
If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHBA-2018:0733