Bug 441570 (dnx) - Review Request: dnx - Distributed Nagios Executor
Summary: Review Request: dnx - Distributed Nagios Executor
Keywords:
Status: CLOSED CANTFIX
Alias: dnx
Product: Fedora
Classification: Fedora
Component: Package Review
Version: rawhide
Hardware: All
OS: Linux
low
medium
Target Milestone: ---
Assignee: Kevin Fenzi
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2008-04-08 19:52 UTC by Jeff Goldschrafe
Modified: 2009-09-24 18:06 UTC (History)
4 users (show)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2009-09-24 18:06:38 UTC
Type: ---
Embargoed:


Attachments (Terms of Use)

Description Jeff Goldschrafe 2008-04-08 19:52:55 UTC
Spec URL: http://neckbeard.info/packages/fedora/dnx.spec
SRPM URL: http://neckbeard.info/packages/fedora/dnx-0.18-6.src.rpm
Description: DNX is an event broker module and set of worker daemons for Nagios that transparently distributes service checks among a set of designated worker nodes. The spec contains a main package, "dnx", which contains the worker daemon, and a subpackage, "nagios-dnx", which contains the event broker module.

This package is based on the openSUSE package by John Calcote. I have included a patch to disable the unit test that runs the daemon, which fails inside mock. This problem exists in openSUSE's build system as well, and is documented here:
http://lists.opensuse.org/opensuse-buildservice/2008-02/msg00198.html

As this is my first package submission, I will need a sponsor.

Comment 1 Xavier Bachelot 2008-04-14 15:21:36 UTC
License: does not match the guidelines.
http://fedoraproject.org/wiki/Packaging/LicensingGuidelines

Source: should be a full url. There are specific rules in the packaging
guidelines  for Sourceforge URLs. http://fedoraproject.org/wiki/Packaging/SourceURL

BuildRoot: doesn't match the guidelines.
http://fedoraproject.org/wiki/Packaging/Guidelines#head-b4fdd45fa76cbf54c885ef0836361319ab962473


There are specific rules on how to create an user.
http://fedoraproject.org/wiki/Packaging/UsersAndGroups

initscript is not handled properly :  
http://fedoraproject.org/wiki/Packaging/SysVInitScript#head-91577460937bcc1e94127ca00afd8e69274823b0

Some files path should be replaced by macros :
/var/log %{_localstatedir}/log
/etc/nagios %{sysconfdir}/nagios
%{_prefix}/bin %{_bindir}

Stopping the nagios server on %%preun looks a bit suspicious to me.


Comment 2 Jeff Goldschrafe 2008-04-14 15:57:59 UTC
Okay, I think I've gotten all of these fixed, except the BuildRoot, which I
don't believe was really in error -- it was copied and pasted directly from the
packaging guidelines.

Anyway, here's the fixed URLs:

Spec URL: http://neckbeard.info/packages/fedora/dnx.spec
SRPM URL: http://neckbeard.info/packages/fedora/dnx-0.18-8.src.rpm

Thanks for the feedback.

Comment 3 Xavier Bachelot 2008-04-14 17:50:12 UTC
(In reply to comment #2)
> Okay, I think I've gotten all of these fixed, except the BuildRoot, which I
> don't believe was really in error -- it was copied and pasted directly from the
> packaging guidelines.
> 
Ok, my bad for the BuildRoot.


> Anyway, here's the fixed URLs:
> 
> Spec URL: http://neckbeard.info/packages/fedora/dnx.spec
> SRPM URL: http://neckbeard.info/packages/fedora/dnx-0.18-8.src.rpm
> 
> Thanks for the feedback.

The package doesn't build in mock. Changing :
%{__install} -d -m 0755 %{dnxrundir}
to
%{__install} -d -m 0755 %{buildroot}%{dnxrundir}
fixes it.

Once built, rpmlint output is not clean, mostly because of the strange %%defattr
of the %%files sections.
Also, please take care of the changelog. The release bumps from -1 to -7 and I
don't know what happened in between and the package actual release is -8 anyway.

The group creation is still not clean :
getent group %{dnxgroup} >/dev/null || groupadd -r GROUPNAME

Missing %post section :
%post
/sbin/chkconfig --add %{dnxsvc}



Comment 4 Xavier Bachelot 2008-04-14 18:15:24 UTC
The README file will need a little surgery:
sed -i -e 's/\r//' README

The tarball included in the doc should be handled differently.

The %{_sbindir}/rcdnxcld symlink should be killed and the initscript needs some
work too.

nagios-dnx probably needs to require nagios.

I've not read enough about dnx, but the package names are maybe not optimal.
Something like dnx and dnx-client might be better.

Comment 5 Jeff Goldschrafe 2008-04-14 18:23:56 UTC
The openSUSE packages were originally just named dnx-server and dnx-client; it
seemed silly to me when I first looked at it because the dnx-client is a daemon
so I changed it based on the recommendations of a few people in #fedora-devel,
but looking at it more it probably makes sense to keep it the same as it was in
openSUSE.

Nice catch again on the nagios require. I don't know how I managed to miss
something that obvious.

Comment 6 Xavier Bachelot 2008-04-14 18:40:59 UTC
(In reply to comment #5)
> The openSUSE packages were originally just named dnx-server and dnx-client; it
> seemed silly to me when I first looked at it because the dnx-client is a daemon
> so I changed it based on the recommendations of a few people in #fedora-devel,
> but looking at it more it probably makes sense to keep it the same as it was in
> openSUSE.
>
You probably know better than I ;-) I had the same trouble with nsca.
rpmlint will complain if the initscript is not named after the package name
anyway...
 
> Nice catch again on the nagios require. I don't know how I managed to miss
> something that obvious.

The other one will need to own %{_sysconfdir}/nagios, as they are both dropping
conf file in /etc/nagios.


Comment 7 Xavier Bachelot 2008-05-21 10:49:08 UTC
ping ?

Comment 8 Jeff Goldschrafe 2008-05-22 14:04:34 UTC
I've reworked a bunch of things in the package and got it almost perfect, but
I'm confused about how I should be handling the doxygen documentation included
in the install -- it's in there with a regular %doc right now, but this behavior
seems wrong to me. What's the right way to include this?

Comment 9 Xavier Bachelot 2008-05-22 14:12:48 UTC
Could you please post a link to the updated spec and srpm ?
I'll try and see what to do with the doxygen docs.

Comment 10 Jeff Goldschrafe 2008-05-22 15:02:11 UTC
Note: I've kicked the release number back to 1, since it was never actually a
real release, and was just a stupid thing I was doing to get my test system to
update from my yum repo. (I've switched to a 0 release with point release bumps
for development of new RPMs internally.)

Here are the new files:
- http://neckbeard.info/packages/fedora/dnx.spec
- http://neckbeard.info/packages/fedora/dnx-0.18-1.src.rpm

Comment 11 Xavier Bachelot 2008-05-24 10:30:11 UTC
About the doxygen docs, you should not package the tarball, instead you should
either include the untar'ed file in the regulat %%doc, or if it's too big, you
might want to build a dnx-docs subpackage.

Not sure the nagios 3 support is acceptable as is, because nagios 3 is currently
 not available in Fedora, even in Rawhide. You'll be able however to add it when
nagios gets updated in Rawhide, but then in the devel branch. No need to
complicate things in the other branches.

On the same register, not sure all the %define at the beginning of the spec are
useful. nagios user/group/etc... are nagios in Fedora, not need to complicate
things.

That said not sure, I've already told it, I'm unfortunately not a sponsor so I
won't be able to help much further now the package is in a reasonably good shape.

Comment 12 Jeff Goldschrafe 2008-06-11 16:20:53 UTC
Updated to 0.18-2, with the following changes:

- Removed most of the defines except for %dnxrundir and %dnxlogdir. Moved users
to nagios.
- No longer dump things in nagios directories. /etc/dnx, /var/lib/dnx, etc. are
now used.
- Corrected a typo in specfile that would have added wrong group name.
- Change to init script to keep it from spitting out startup info.

I don't think the Nagios 3.0 support poses a real problem -- it's not a
substantial amount of cruft (five lines in total) which doesn't pollute the
script in a way such that it prevents it from being readable in any way. In
addition, it documents exactly what changes need to be made when compiling with
Nagios 3.0 support, if and when the project makes that change. Finally, it
allows the same Fedora specfile to be used by people who are already running
Nagios 3.0 in their environments -- they just need to pass a -D'nagios_version
3' when rebuilding the SRPM.

New files:
- http://neckbeard.info/packages/fedora/dnx.spec
- http://neckbeard.info/packages/fedora/dnx-0.18-2.src.rpm

Comment 13 Kevin Fenzi 2008-08-14 02:41:01 UTC
Hey Jeff. 

Are you still looking for a sponsor? 

Do you have other packages you would like to submit? 
Have you done any pre-reviews of other pending review submissions?

Take a look at: 

http://fedoraproject.org/wiki/PackageMaintainers/HowToGetSponsored

I am happy to look at this package and see about sponsoring you, but it would be good to have some more packages or see some pre-reviews you have done to make sure you understand the guidelines. 

Look for a full review of this package at some point here...

Comment 14 Kevin Fenzi 2008-08-14 03:05:08 UTC
This package doesn't seem to build for me in mock for f10. I get at the end of the build.log: 

/bin/sh ../libtool --tag=CC   --mode=compile gcc -DHAVE_CONFIG_H -I. -I..     -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m32 -march=i386 -mtune=generic -fasynchronous-unwind-tables -MT dnxTransport.lo -MD -MP -MF .deps/dnxTransport.Tpo -c -o dnxTransport.lo dnxTransport.c
mv -f .deps/dnxSleep.Tpo .deps/dnxSleep.Plo
dnxTcp.c: In function 'dnxTcpNew':
dnxTcp.c:369: error: 'HOST_NAME_MAX' undeclared (first use in this function)
dnxTcp.c:369: error: (Each undeclared identifier is reported only once
dnxTcp.c:369: error: for each function it appears in.)

Comment 15 Jeff Goldschrafe 2008-08-14 05:38:08 UTC
I'm absolutely still looking for a sponsor -- this is a really useful piece of software and I hope someone else finds the package handy.

I've added a patch that adds a conditional to common/dnxTransport.h defining HOST_NAME_MAX on platforms where it's not defined (e.g. some glibc). The new files are here:

http://neckbeard.info/packages/fedora/dnx.spec
http://neckbeard.info/packages/fedora/dnx-0.18-3.spec

I can't find it in Bugzilla at the moment, but I commented a bit on Xavier Bachelot's pnp4nagios package. I haven't really written much else, but I think I've got a pretty good understanding of the packaging guidelines at this point. I've been spending the last couple of months tweaking packages from EPEL for my sysadmin environment, and I've got a very good handle on how the project does things. (I'm completely anal retentive about standards-compliance.) However, if you need more to indicate that I know what I'm doing, I'd be perfectly willing to do something with a few more packages.

Comment 16 Jeff Goldschrafe 2008-08-14 05:39:59 UTC
Sorry; I typo'd the URL on that last SRPM. It should be as follows:

http://neckbeard.info/packages/fedora/dnx-0.18-3.src.rpm

Comment 17 Mamoru TASAKA 2008-08-14 06:24:06 UTC
(In reply to comment #15)
> I've added a patch that adds a conditional to common/dnxTransport.h defining
> HOST_NAME_MAX on platforms where it's not defined (e.g. some glibc). 

I guess now we shoud use sysconf(_SC_HOST_NAME_MAX) to get HOST_NAME_MAX
(please see sysconf(3) man page and
 https://www.redhat.com/archives/fedora-maintainers/2007-June/msg00737.html )

Comment 18 Jeff Goldschrafe 2008-08-14 14:32:35 UTC
It's tricky, because the standard isn't as transparent as that -- sysconf(_SC_HOST_NAME_MAX) can return -1, indicating that there _is_ no maximum host name size, and the application should handle that value accordingly. This one doesn't, and until this is changed upstream, I think it's a better idea to mangle the code as little as possible and just provide a sensible default instead.

Comment 19 Jeff Goldschrafe 2008-08-14 16:05:48 UTC
Submitted a better patch upstream to use sysconf(_SC_HOST_NAME_MAX) as recommended. Included patch in new 0.18-4:

http://neckbeard.info/packages/fedora/dnx.spec
http://neckbeard.info/packages/fedora/dnx-0.18-4.src.rpm

Comment 20 Kevin Fenzi 2008-08-23 23:15:35 UTC
Hey Jeff. Sorry for the long delay here... lets get this moving again with a formal review: 

OK - Package meets naming and packaging guidelines
OK - Spec file matches base package name.
OK - Spec has consistant macro usage.
OK - Meets Packaging Guidelines.
OK - License (GPLv2)
OK - License field in spec matches
OK - License file included in package
OK - Spec in American English
OK - Spec is legible.
OK - Sources match upstream md5sum:
b8e262dd148f22d000015e638ab75b7e  dnx-0.18.tar.gz
b8e262dd148f22d000015e638ab75b7e  dnx-0.18.tar.gz.orig
OK - BuildRequires correct
OK - Package has %defattr and permissions on files is good.
OK - Package has a correct %clean section.
OK - Package has correct buildroot
      %{_tmppath}/%{name}-%{version}-%{release}-root-%(%{__id_u} -n)
OK - Package is code or permissible content.
OK - Doc subpackage needed/used.
OK - Packages %doc files don't affect runtime.
see below - Package has rm -rf RPM_BUILD_ROOT at top of %install

OK - Package compiles and builds on at least one arch.
See below - Package has no duplicate files in %files.
See below - Package doesn't own any directories other packages own.
See below - Package owns all the directories it creates.
See below - No rpmlint output.
See below - final provides and requires are sane:

SHOULD Items:

OK - Should build in mock.
OK - Should build on all supported archs
OK - Should have dist tag
OK - Should package latest version

Issues:

1. There shouldn't be any need for
rm -rf %{buildroot} >/dev/null 2>&1
at the top of %install... should just be
rm -rf %{buildroot}
(there shouldn't be any output from that ever.

2. rpmlint says:

dnx.src:142: E: files-attr-not-set
dnx.src:143: E: files-attr-not-set
I think this is due to the doc subpackage missing a:
%defattr(-,root,root,-)
line

dnx.x86_64: W: non-standard-uid /var/run/dnx nagios
dnx.x86_64: W: non-standard-gid /var/run/dnx nagios
dnx.x86_64: W: non-standard-uid /var/log/dnx nagios
dnx.x86_64: W: non-standard-gid /var/log/dnx nagios
dnx-server.x86_64: W: non-standard-uid /var/log/dnx nagios
dnx-server.x86_64: W: non-standard-gid /var/log/dnx nagios

Ignore.

dnx.x86_64: W: log-files-without-logrotate /var/log/dnx
dnx-server.x86_64: W: log-files-without-logrotate /var/log/dnx

Need to make a logrotate file and add a Requires: logrotate?

dnx.x86_64: W: no-reload-entry /etc/rc.d/init.d/dnx

Would be nice to have if it makes sense. Not sure if this can be
reloaded though. Can it?

dnx.x86_64: W: wrong-file-end-of-line-encoding /usr/share/doc/dnx-0.18/README
dnx-server.x86_64: W: wrong-file-end-of-line-encoding /usr/share/doc/dnx-server-0.18/README

Would be nice to fix with a sed. Check the wiki for recipe.

dnx-doc.x86_64: W: file-not-utf8 /usr/share/doc/dnx-doc-0.18/dnx-doxy-0.18.tar.gz

This could be unpacked and shipped unpacked?

3. Some of the directories here are owned in the base package and the subpackages
as well. Could you describe why or rethink this? What are the use cases?

doc - used by itself, shouldn't need any others?
server - could be used by itself? or does it need the base package?
base - by itself or not?

This ties in with the users setup... the nagios package (currently required by
this server subpackage) already makes a nagios user, why do that here?
I also see things like the base package having files owned by nagios, but it
has no dependency there or user creation.

It's best to try and make only one package own dirs/files and use dependencies
for subpackages that need those directories or files. Ie, if the server needs
the base package we can remove some of the dual listed dirs in the server
subpackage. Does that make sense?

Comment 21 Jeff Goldschrafe 2008-09-15 16:36:24 UTC
Hey there,

Sorry for the delays. I just got back from vacation and have been doing a lot of testing with DNX in my environment.

I think I need to clarify the topology of DNX a little bit:

"dnx-server" is the server component, which runs as a Nagios module. The base "dnx" package is a client package, which does not require Nagios, and should be running on systems that are _not_ running Nagios. Neither of them depends on the other, which prevents any kind of useful inheritance mechanism from being put in place. Since the configure script expects the client and server portions of DNX to be using the same config/log directories, I decided to respect this, though they could be separated out if it was really necessary.

Since the DNX server component is an event broker module that runs inside the Nagios process, it runs with the credentials of the Nagios user. This is why the user "nagios" is specified as the owner of its files in the configure script, which affects both the client and server portions of DNX. Though the client doesn't have to use the same user, it seems cleaner to do so rather than manually overriding the user on a ton of files in the spec. (The client generally doesn't run on systems with Nagios installed. In fact, in my experience, it seems to fail outright connecting to a Nagios instance on the same machine.)

Logrotate configs have been added.

DNX doesn't support reload, unless it's relying on some custom signals somewhere. It dies on a HUP signal.

Finally, it looks like the doxygen docs are already extracted redundantly into doc/html, so I've just added those instead of the tarball.

New spec: http://neckbeard.info/packages/fedora/dnx.spec
New SRPM: http://neckbeard.info/packages/fedora/dnx-0.18-5.src.rpm

Comment 22 Kevin Fenzi 2008-09-21 20:52:09 UTC
Sorry now for my delay here. ;) 

items 1 and 2 all look ok now. 

On 3 your description makes sense...I maintain the 'munin' package which has similar weird interrelations between the server and client packages. 
Really it would be nice if the two parts were released as seperate packages. ;( 
I think as you currently have it both packages should handle being installed on their own and/or with the other one present, so I think we are ok. 

1. You are still missing a 'Requires: logrotate' in the both base and server packages. 

Thats a pretty minor change, and I don't see any other issues now, so 
I will go ahead and APPROVE this package provided you add the logrotate requires before import.

I will also go ahead and sponsor you... continue the process from: 
http://fedoraproject.org/wiki/PackageMaintainers/Join#Get_a_Fedora_Account

Feel free to email me any questions or catch me on irc.freenode.net in #fedora-devel (my nick is nirik).

Comment 23 Kevin Fenzi 2009-01-14 05:14:36 UTC
Jeff? Any movement here? 

Sorry this bug dropped off my radar, but I was waiting for you to move forward on it.

Comment 24 Jeff Goldschrafe 2009-01-20 15:29:37 UTC
Sorry, fell off my radar as well -- was waiting for Fedora 10 to have a final release so I could test appropriately, and didn't have time to pick it up afterwards. You'll see some submissions by the end of the week.

Comment 25 Jeff Goldschrafe 2009-03-06 18:57:05 UTC
I just bumped this to 0.19a before doing all the Fedora voodoo to get it in the build system, and had a question before I start moving to get this package in:

I had removed the parts of the specfile that create a Nagios user if it doesn't exist, since it was deemed redundant. However, this breaks the build in mock, since the nagios user doesn't exist to be the owner of the installed files. Should I be adding back the part that creates the user (what I think I should be doing) or adding the full Nagios package as a BuildRequires instead?

Comment 26 Kevin Fenzi 2009-03-06 21:42:56 UTC
Well, I think you could do either... Possibly BuildRequires on nagios would be better as then you don't have to carrry the not needed userradd code in your package for real installs?

Comment 27 Jeff Goldschrafe 2009-03-10 16:23:54 UTC
I'm finding the latest version, 0.19a, to cause near-immediate deadlocks in Nagios, often before it even runs any checks at all. Is there anyone running Nagios 2.0 or 3.0 who would be willing to help me test on i386 and/or x86_64?

(I can supply the Nagios 3.0 RPMs I'm using, if necessary.)

Comment 28 Kevin Fenzi 2009-05-11 06:32:47 UTC
Any word here Jeff?

Comment 29 Kevin Fenzi 2009-06-06 05:55:16 UTC
Jeff, are you still there? ;) 

I guess I will close this review out soon if I don't hear from you.

Comment 30 Kevin Fenzi 2009-09-24 18:06:38 UTC
No news. :( 

Feel free to let me know when/if you are around and interested in this again. 
Closing now.


Note You need to log in before you can comment on or make changes to this bug.