Bug 676335 - Review Request: dmtcp - Checkpoint/Restart functionality for Linux processes
Status: CLOSED DUPLICATE of bug 750394
Product: Fedora
Classification: Fedora
Component: Package Review
Version: rawhide
Hardware: Unspecified
OS: Linux
Assignee: Timothy St. Clair
QA Contact: Fedora Extras Quality Assurance
Reported: 2011-02-09 15:04 UTC by Kapil Arya
Modified: 2013-10-19 14:42 UTC

Last Closed: 2011-10-31 23:30:01 UTC

Description Kapil Arya 2011-02-09 15:04:25 UTC
Spec URL: www.ccs.neu.edu/home/kapil/fedora_rpms/dmtcp.spec
SRPM URL: www.ccs.neu.edu/home/kapil/fedora_rpms/dmtcp-1.2.0+svn886-1.src.rpm
Description:
DMTCP (Distributed MultiThreaded Checkpointing) is a tool to transparently
checkpoint the state of an arbitrary group of programs, including
multi-threaded and distributed computations.  It operates directly on the user
binary executable, with no Linux kernel modules or other kernel modifications.

Among the applications supported by DMTCP are OpenMPI, MATLAB, Python, Perl,
and many programming languages and shell scripting languages.  DMTCP also
supports GNU screen sessions, including vim/cscope and emacs. With the use of
TightVNC, it can also checkpoint and restart X-Windows applications, as long as
they do not use extensions (e.g.: no OpenGL, no video).

NOTE: This is my first package and I need a sponsor.

Comment 1 Neal Becker 2011-02-10 00:53:28 UTC
1. Needs patch.

%build
sed -i -e 's/enable_option_checking=fatal/enable_option_checking=no/' configure.ac
autoreconf --force
%configure --disable-option-checking

2. A number of 'file listed twice' warnings.
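
The effect of the sed line in point 1 can be checked on a scratch file. This is only a minimal sketch, assuming GNU sed and a stand-in file (hypothetical, not the real dmtcp configure.ac) that contains nothing but the assignment being rewritten:

```shell
#!/bin/sh
# Stand-in for configure.ac: only the one assignment the patch targets.
# (Hypothetical scratch file; the real configure.ac contains much more.)
workdir=$(mktemp -d)
printf 'enable_option_checking=fatal\n' > "$workdir/configure.ac"

# Same substitution as in the %build snippet: turn "fatal" into "no".
sed -i -e 's/enable_option_checking=fatal/enable_option_checking=no/' "$workdir/configure.ac"

# Keep the result in a variable so the scratch directory can be removed.
patched=$(cat "$workdir/configure.ac")
echo "$patched"
rm -rf "$workdir"
```

If the pattern does not occur in the file, sed leaves it unchanged and still exits successfully, so it may be worth grepping for the new value after patching.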

Comment 2 Neal Becker 2011-02-10 00:58:43 UTC
%{_libdir}/libdmtcpaware.so.*
%{_libdir}/%{name}/libdmtcpaware.so.*

That seems odd.  Install same lib in both places?

Comment 3 Kapil Arya 2011-02-10 01:08:18 UTC
Thanks for the review, Neal!

I have updated the %build section in dmtcp.spec as you suggested.

About the 'file listed twice' warnings, I am not sure how to eliminate them. My guess is that this is due to the fact that we have all the binaries and helper files in %{_libdir}/dmtcp/* and the binaries %{_bindir}/dmtcp_* are actually symlinks to the real binaries in %{_libdir}/dmtcp/dmtcp_*.

Is there a better way to do that?

Similarly %{_libdir}/libdmtcpaware.so.* are symlinks to %{_libdir}/dmtcp/libdmtcpaware.so.*

Should I keep them as is, or move these .so files from %{_libdir}/dmtcp/ to %{_libdir}/?

Thanks,
-Kapil

Comment 4 Neal Becker 2011-02-10 11:59:07 UTC
I don't know what is best for %{_libdir}/dmtcp/xxx.  I don't know enough about dmtcp yet, for one thing.  It's fairly unusual to put package-specific stuff into a subdir of libdir.  What are these things?  Are they all arch-specific? 

If things like /usr/lib64/dmtcp/dmtcp_nocheckpoint
are binary files, I think /usr/libexec might be the place.

Comment 5 Neal Becker 2011-02-10 12:53:53 UTC
I've been reading this thread:

http://www.redhat.com/archives/rhl-devel-list/2005-May/msg00264.html

From that, it sounds like /usr/lib64/dmtcp may be the correct place.

Comment 6 Kapil Arya 2011-02-10 21:09:40 UTC
In /usr/lib64/dmtcp, we have the dmtcp_* binaries and some other helper files that are needed by DMTCP : dmtcphijack.so, mtcp_restart, libmtcp.so etc. And since we were already putting these extra files in /usr/lib64/dmtcp, we placed the binaries and the .so file in there as well so that they can easily find these helper files.

Comment 7 Kapil Arya 2011-03-17 00:24:45 UTC
Hello,

We have released a new version of DMTCP upstream. It's 1.2.1. Can someone tell me how to update the source RPM and the spec file here? Should I just post the URL to the new package, or is there some other way?

Thanks,
-Kapil

Comment 8 Jerry James 2011-03-17 02:59:45 UTC
Yes, just post the new URLs.  You say "we".  Are you one of the developers?

Comment 9 Kapil Arya 2011-04-01 21:05:02 UTC
I am sorry for the delay, but I am back now with the new URLs. Here they are:

Spec URL: http://www.ccs.neu.edu/home/kapil/fedora_rpms/dmtcp.spec
SRPM URL: http://www.ccs.neu.edu/home/kapil/fedora_rpms/dmtcp-1.2.1-1.1.src.rpm

Please let me know if there are any concerns/comments with respect to the package.

> Yes, just post the new URLs.  You say "we".  Are you one of the developers?
Yes, I am one of the developers of DMTCP.

Thanks,
-Kapil

Comment 10 Matthew Farrellee 2011-05-16 15:55:08 UTC
I understand there is a dependency on libc.a that is being eliminated, please update when that change is available.

Comment 11 Kapil Arya 2011-05-28 20:41:28 UTC
I am again sorry for the delay in responding to this thread. Here are the links to the new SRPM and Spec file:
Spec URL: http://www.ccs.neu.edu/home/kapil/fedora_rpms/dmtcp.spec
SRPM URL: http://www.ccs.neu.edu/home/kapil/fedora_rpms/dmtcp-1.2.1+svn1031-5.1.src.rpm

This package eliminates the dependency on libc.a.

We are also planning a new release in the next week or two. I will post the new links when it's done.

Thanks,
-Kapil

Comment 12 Jerry James 2011-07-20 21:42:07 UTC
Kapil, do you plan to update your SRPM to 1.2.2?  Also, do you still need a sponsor?  If so, can you point to any of the activities covered here:

https://fedoraproject.org/wiki/How_to_get_sponsored_into_the_packager_group

besides this package, of course? :-)

Comment 13 Kapil Arya 2011-07-20 21:55:09 UTC
Hi Jerry,

Thanks for the pointer. I do need a sponsor :-) for this package and another package that will be rolled out in the next month or two :).

At this point we are planning on releasing 1.2.3 in the next day or two and I will update the SRPM as soon as 1.2.3 is out.

Comment 14 Kapil Arya 2011-07-22 20:12:17 UTC
Hi All,

Here are the URLs for the 1.2.3 release:
Spec URL: http://www.ccs.neu.edu/home/kapil/fedora_rpms/dmtcp.spec
SRPM URL:
http://www.ccs.neu.edu/home/kapil/fedora_rpms/dmtcp-1.2.3-2.1.src.rpm

Thanks,
Kapil

Comment 15 Neal Becker 2011-07-22 21:26:30 UTC
Doesn't build on my f15 x86_64 platform:

configure: error: unrecognized options: --disable-dependency-tracking

Comment 16 Kapil Arya 2011-07-22 21:56:47 UTC
I used OpenBuild services by OpenSUSE and the package built fine. Here is the link:
https://build.opensuse.org/package/show?package=dmtcp&project=home%3Akarya0

Is there a way to fix this bug?

Thanks,
-Kapil

Comment 17 Neal Becker 2011-07-22 22:20:49 UTC
I'm totally lost.  I tried adding
autoreconf --force --install

before 
%configure

which usually fixes this sort of thing.  Didn't help.

The top-level configure file seems to have been rebuilt
(it's got the current date/time), but doesn't seem to grok
--disable-dependency-tracking.

OTOH, dmtcp/configure explicitly features it (it's even part of the --help documentation).

I don't know enough about autoconf etc.

Comment 18 Neal Becker 2011-07-25 01:03:50 UTC
I believe the problem is in configure.ac


dnl Autoconf manual says option checking is set to warn ("yes") by
dnl  by default.  But it's actually set to "no".
dnl So, we enforce our own choice ("fatal") if autoconf won't cooperate.
enable_option_checking=fatal


Please remove this, and I think it should work.
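
To illustrate why that assignment matters, here is a simplified model of how a generated configure script treats an option it does not know. It is a sketch only, not the actual autoconf code: the function name check_options and the single "known" option --prefix are assumptions made for the example.

```shell
#!/bin/sh
# Simplified model (NOT real autoconf output) of unrecognized-option handling.
# $1 is the value of enable_option_checking; the rest are configure arguments.
check_options() {
    mode=$1; shift
    unrecognized=""
    for opt in "$@"; do
        case $opt in
            --disable-option-checking) mode=no ;;
            --prefix=*) ;;   # the only option this pretend script knows
            --enable-*|--disable-*|--with-*|--without-*)
                unrecognized="$unrecognized $opt" ;;
        esac
    done
    if [ -z "$unrecognized" ]; then
        return 0
    fi
    case $mode in
        no)    return 0 ;;
        fatal) echo "configure: error: unrecognized options:$unrecognized" >&2
               return 1 ;;
        *)     echo "configure: WARNING: unrecognized options:$unrecognized" >&2
               return 0 ;;
    esac
}

# With "fatal", the flag from comment 15 stops the run; with "no" it is ignored.
if check_options fatal --prefix=/usr --disable-dependency-tracking 2>/dev/null; then
    fatal_result=accepted
else
    fatal_result=rejected
fi
if check_options no --prefix=/usr --disable-dependency-tracking 2>/dev/null; then
    no_result=accepted
else
    no_result=rejected
fi
echo "fatal: $fatal_result, no: $no_result"
```

In this model the flag is harmless unless checking is forced to "fatal", which matches the error reported in comment 15.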

Comment 19 Kapil Arya 2011-07-25 16:10:45 UTC
Thanks for the info Neal.

This problem has been fixed upstream by adding the dependency-tracking option to the top-level configure.ac. I will post the updated links to the SRPM and spec file in a few hours.

Comment 20 Kapil Arya 2011-07-26 03:27:37 UTC
Hi All,

Here are the new URLs fixing the configure issue:
Spec URL: http://www.ccs.neu.edu/home/kapil/fedora_rpms/dmtcp.spec
SRPM URL:
http://www.ccs.neu.edu/home/kapil/fedora_rpms/dmtcp-1.2.3+svn1214-1.1.src.rpm

Please let me know if there are some other issues.

Thanks!

Comment 21 Neal Becker 2011-07-26 11:09:03 UTC
  /usr/lib64/dmtcp:
  total used in directory 2356 available 38220644
  drwxr-xr-x    3 root root   4096 Jul 26 07:02 .
  dr-xr-xr-x. 180 root root 135168 Jul 26 07:02 ..
  -rwxr-xr-x    1 root root 169928 Jul 26 06:58 dmtcp_checkpoint
  -rwxr-xr-x    1 root root 149600 Jul 26 06:58 dmtcp_command
  -rwxr-xr-x    1 root root 196200 Jul 26 06:58 dmtcp_coordinator
  -rwxr-xr-x    1 root root 713464 Jul 26 06:58 dmtcphijack.so
  -rwxr-xr-x    1 root root 430024 Jul 26 06:58 dmtcp_inspector
  -rwxr-xr-x    1 root root   4696 Jul 26 06:58 dmtcp_nocheckpoint
  -rwxr-xr-x    1 root root 498112 Jul 26 06:58 dmtcp_restart
  drwxr-xr-x    2 root root   4096 Jul 26 07:02 examples
  -rwxr-xr-x    1 root root  87696 Jul 26 06:58 libmtcp.so

These things don't all belong here.  Certainly not examples (should be in /usr/share/dmtcpxxx/doc).

Comment 22 Kapil Arya 2011-08-19 00:45:06 UTC
Hi All,

Here are the links to the updated SRPM and spec file, which address the concerns raised by Neal and others. Please let us know your feedback.

Spec URL: http://www.ccs.neu.edu/home/kapil/fedora_rpms/dmtcp.spec
SRPM URL:
http://www.ccs.neu.edu/home/kapil/fedora_rpms/dmtcp-1.2.3-1.svn1264.fc15.src.rpm

Thanks,
-Kapil

Comment 23 Neal Becker 2011-08-19 11:20:27 UTC
1. should own its doc dir:
rpm -qf /usr/share/doc/dmtcp-1.2.3/
file /usr/share/doc/dmtcp-1.2.3 is not owned by any package

Comment 24 Neal Becker 2011-08-19 11:59:12 UTC
2. Why is the lib package called  libdmtcpaware1, rather than just  libdmtcpaware?

Comment 25 Neal Becker 2011-09-08 11:57:45 UTC
I fixed comment 23, and one rpmlint error (extraneous devel dependency).

There are 2 other issues:

1. I still need an answer to comment 24.  Is the '1' a version number?  It doesn't belong there.

2. dmtcp.x86_64: E: statically-linked-binary /usr/bin/mtcp_restart

Does mtcp_restart have to be statically linked?

Comment 26 Kapil Arya 2011-10-03 18:56:50 UTC
Hello Neal,

1. The '1' in libdmtcpaware1 corresponds to the library's major version number. We were following a convention similar to the one glibc uses. Is there a problem with that?

2. Unfortunately, mtcp_restart has to be statically linked for DMTCP to work properly with various things like ASLR and vdso.

Thanks,
-Kapil

Comment 27 Neal Becker 2011-10-03 22:30:41 UTC
1. Here is an example:

ls -l /usr/lib64/libunuran*
lrwxrwxrwx. 1 root root     19 May 13 13:53 /usr/lib64/libunuran.so -> libunuran.so.15.0.0
lrwxrwxrwx. 1 root root     19 May 13 13:53 /usr/lib64/libunuran.so.15 -> libunuran.so.15.0.0
-rwxr-xr-x. 1 root root 732880 Feb  7  2011 /usr/lib64/libunuran.so.15.0.0

rpm -qf /usr/lib64/libunuran*
unuran-devel-1.8.0-2.fc15.x86_64
unuran-1.8.0-2.fc15.x86_64
unuran-1.8.0-2.fc15.x86_64

As for glibc:
rpm -q glibc
glibc-2.14-5.x86_64
glibc-2.14-5.i686

rpm -qf /lib64/libc-2.14.so 
glibc-2.14-5.x86_64

Here, the name is glibc, and the version is 2.14.  The rpmspec revision is 5.  So no, the '1' is not part of the name.  And the soname is handled by the dynamic linker, it's not part of the file name.

At least, that's my understanding.

2. If it has to be statically linked, I think that is OK, we just need to ask for some exception from the usual rules (statically linked binaries are discouraged).
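
The name/version/release split described in point 1 can be sketched with plain shell parameter expansion. This is an illustration only, not how rpm itself parses package names; the function name split_nvra is made up for the example, and it relies on the fact that RPM versions and releases cannot contain hyphens.

```shell
#!/bin/sh
# Split an "rpm -q" style string into name, version, release and arch
# using only POSIX parameter expansion.
split_nvra() {
    nvra=$1
    arch=${nvra##*.}      # text after the last dot     -> x86_64
    nvr=${nvra%.*}        # drop ".arch"                -> glibc-2.14-5
    release=${nvr##*-}    # text after the last hyphen  -> 5
    nv=${nvr%-*}          # drop "-release"             -> glibc-2.14
    version=${nv##*-}     # text after the last hyphen  -> 2.14
    name=${nv%-*}         # everything before that      -> glibc
}

split_nvra glibc-2.14-5.x86_64
glibc_fields="name=$name version=$version release=$release arch=$arch"
echo "$glibc_fields"

split_nvra unuran-devel-1.8.0-2.fc15.x86_64
unuran_fields="name=$name version=$version release=$release arch=$arch"
echo "$unuran_fields"
```

Neither name contains the library's soname number (15 for libunuran), which is the point being made about libdmtcpaware1.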

Comment 28 Kapil Arya 2011-10-03 22:44:58 UTC
Aha. Thanks for the clarification.

Since we created the Debian package earlier, we used the Debian naming convention (Debian uses names like libc6, libc6-dbg, libc6-dev etc.). 

Now that it's clear, we should revert the package name to libdmtcpaware without the '1'.

Since you mentioned earlier that you made some fixes to the spec file, would it be possible to provide us that spec file so that I can make the changes or if you prefer, you can make this change. Either is fine with me.

Thanks !!

Comment 29 Thomas Spura 2011-10-04 20:27:10 UTC
(In reply to comment #16)
> I used OpenBuild services by OpenSUSE and the package built fine. Here is the
> link:
> https://build.opensuse.org/package/show?package=dmtcp&project=home%3Akarya0
> 
> Is there an way to fix this bug?

There are many differences between openSUSE and Fedora, so I'm afraid it won't be easily possible to:
  * use the OpenBuild service for testing Fedora spec files
  * use the same spec in both distributions

Some issues:
- --disable-dependency-tracking results in failure of configure:
  setting enable_option_checking=no in configure.ac helps here, see comment #1
- Where do you have the source from?
  https://fedoraproject.org/wiki/Packaging:SourceURL#Referencing_Source
- %make_install is quite different from the usual "make install DESTDIR=%{buildroot}"
  and should only be used as a last resort. Please use the "make install ..." 
  command
- the "static" packages are named e.g. libdmtcpaware-static (without the -devel 
  in between)
- "# disable the test for now as bash is failing with 32-bit when built on 64-bit
  machine.":
- cp QUICK-START COPYING %{buildroot}/%{_defaultdocdir}/%{name}-%{version}/
  fails because the directory doesn't exist yet.
  %doc them in the %files section would be best here (apparently you don't 
  install docs within %make_install).
- Requiring should be done with %{_isa} (also the devel packages etc):
  https://fedoraproject.org/wiki/PackagingGuidelines#Requiring_Base_Package
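
The cp failure in the list above can be reproduced outside of rpmbuild. A minimal sketch, assuming scratch directories created with mktemp stand in for %{buildroot} and the source tree (all paths here are hypothetical):

```shell
#!/bin/sh
# Scratch stand-ins for %{buildroot} and the unpacked source tree.
buildroot=$(mktemp -d)
srcdir=$(mktemp -d)
docdir="$buildroot/usr/share/doc/dmtcp-1.2.3"
touch "$srcdir/QUICK-START" "$srcdir/COPYING"

# Copying into a directory that was never created fails ...
if cp "$srcdir/QUICK-START" "$srcdir/COPYING" "$docdir/" 2>/dev/null; then
    first_try=ok
else
    first_try=failed
fi

# ... while creating the directory first lets the same copy succeed.
mkdir -p "$docdir"
cp "$srcdir/QUICK-START" "$srcdir/COPYING" "$docdir/"
if [ -f "$docdir/QUICK-START" ] && [ -f "$docdir/COPYING" ]; then
    second_try=ok
else
    second_try=failed
fi

echo "without mkdir: $first_try; with mkdir -p: $second_try"
rm -rf "$buildroot" "$srcdir"
```

This only shows why the copy fails; listing the files with %doc in the %files section, as suggested above, avoids the manual copy altogether.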

Comment 30 Kapil Arya 2011-10-04 20:53:23 UTC
Thanks Thomas, for pointing out the differences.

We do realize now that we would need separate spec files for the two distros and have started working in that direction. 

(In reply to comment #29)
> (In reply to comment #16)
> > I used OpenBuild services by OpenSUSE and the package built fine. Here is the
> > link:
> > https://build.opensuse.org/package/show?package=dmtcp&project=home%3Akarya0
> > 
> > Is there an way to fix this bug?
> 
> There are many differences between opensuse and fedora, so I'm afraid, that it
> won't be easy possible to:
>   * use the OpenBuild service for testing fedora spec files
>   * using the same spec in both distributions
> 
> Some issues:
> - disable-option-tracking results in failure of configure:
>   setting enable_option_checking=no in configure.ac helps here, see comment #1
> - Where do you have the source from?
>   https://fedoraproject.org/wiki/Packaging:SourceURL#Referencing_Source

The latest sources (comment #22) were generated on a fedora 15 virtual machine. (The earlier ones were generated by OpenBuild service, but now we are generating in the fedora15 VM).

> - %make_install is way different from the usual "make install
> DESTDIR=%{buildroot}
>   and only should be used as last resort. Please use the "make install ..." 
>   command

Ok.

> - the "static" packages are named e.g. libdmtcpaware-static (without the -devel 
>   in between)

Will take care of this.

> - "# disable the test for now as bash is failing with 32-bit when built on
> 64-bit
>   machine.":

I will go back and recheck this. This used to be true for OpenBuild services, but since we are not using them any more, I need to confirm this.

> - cp QUICK-START COPYING %{buildroot}/%{_defaultdocdir}/%{name}-%{version}/
>   fails because the directory doesn't exist yet.
>   %doc them in the %files section would be best here (apparently you don't 
>   install docs within %make_install).

Will fix it as well.

> - Requireing should be done with %{_isa} (also the devel packages etc):
>   https://fedoraproject.org/wiki/PackagingGuidelines#Requiring_Base_Package

Thanks for the pointer. I will make these changes and those suggested by Neal and put a pointer to the updated RPM and SPEC file here.

Thanks,
-Kapil

Comment 31 Kapil Arya 2011-10-25 23:09:53 UTC
Hi All,

I have updated the spec file as suggested by Neal and Thomas. Here are the URLs:

Spec URL: http://www.ccs.neu.edu/home/kapil/fedora_rpms/dmtcp.spec
SRPM URL:
http://www.ccs.neu.edu/home/kapil/fedora_rpms/dmtcp-1.2.3-2.svn1321.fc15.src.rpm

Please let me know if I missed something.

Thanks,
-Kapil

Comment 32 Thomas Spura 2011-10-31 17:47:47 UTC
* Assigned To: Timothy St. Clair
* APPROVED by Neal Becker without a comment
* Don't know who is the sponsor for Kapil Arya here
  (or if sponsored in another bug)

What's going on here?

Comment 33 Neal Becker 2011-10-31 18:32:36 UTC
1. I was attempting to accept ownership of this.  Sorry if I did not proceed correctly.  What do I need to do?

2. It was not my attempt to APPROVE this.  I thought setting the flag was to request review.

3. Please see my latest here:
http://nbecker.fedorapeople.org/dmtcp.spec
http://nbecker.fedorapeople.org/dmtcp-1.2.3-3.svn1321.fc15.src.rpm

Comment 34 Thomas Spura 2011-10-31 19:10:36 UTC
(In reply to comment #33)
> 1. I was attempting to accept ownership of this.  Sorry if I did not proceed
> correctly.  What do I need to do?

Ah, I understand.

Please open your own review request (so that bug opener = later package owner) and close this as a duplicate of your new one.
(Don't know if the NEEDSPONSOR flag needs to get cleared...)

Comment 35 Neal Becker 2011-10-31 23:30:01 UTC
I am taking ownership of this (at the request of upstream), so am closing

*** This bug has been marked as a duplicate of bug 750394 ***

