Bug 1326497 - libdevmapper: dm_task_get_info() backwards compatibility
Summary: libdevmapper: dm_task_get_info() backwards compatibility
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: lvm2
Version: 7.2
Hardware: x86_64
OS: Linux
unspecified
unspecified
Target Milestone: rc
: ---
Assignee: LVM and device-mapper development team
QA Contact: cluster-qe@redhat.com
URL:
Whiteboard:
Depends On:
Blocks: 1356957
TreeView+ depends on / blocked
 
Reported: 2016-04-12 19:56 UTC by Steve Schremmer
Modified: 2021-09-03 12:35 UTC (History)
10 users (show)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
: 1356957 (view as bug list)
Environment:
Last Closed: 2016-04-13 08:10:04 UTC
Target Upstream Version:


Attachments (Terms of Use)

Description Steve Schremmer 2016-04-12 19:56:47 UTC
Description of problem:
The NetApp sanlun command-line utility crashes when using libdevmapper under RHEL7.2.
  *** Error in `sanlun': free(): invalid next size (fast): 0x00007f06040009c0 ***

sanlun dynamically loads libdevmapper.so to get access to a few calls, including dm_task_get_info().

I believe I've tracked the issue back to the addition of 2 new fields to struct dm_info. 'int deferred_remove' and 'int internal_suspend' were added to the struct, causing the struct to grow, but the sanlun binary doesn't know this and only allocates enough memory for the older struct dm_info. When dm_task_get_info() memset's dm_info to zero, it uses the new larger size and steps on memory past the allocated region. 

libdevmapper is intended to be backwards compatible using versioned dm_task_get_info() functions, but it may not be working correctly. Does this version of libdevmapper contain the fix discussed here: https://www.redhat.com/archives/lvm-devel/2015-September/msg00000.html

readelf show the following symbols in libdevmapper:
# readelf -Ws /lib64/libdevmapper.so | grep dm_task_get_info | awk '{print $8}'
dm_task_get_info@Base
dm_task_get_info@@DM_1_02_97
dm_task_get_info_with_deferred_remove@@Base

Version-Release number of selected component (if applicable):
Name        : device-mapper-libs
Arch        : x86_64
Epoch       : 7
Version     : 1.02.107
Release     : 5.el7
 and also tested with
Version     : 1.02.107
Release     : 5.el7_2.1

Steps to Reproduce:
run 'sanlun lun show -pv' with NetApp storage attached.

Additional info:
deferred_removed added here: https://git.fedorahosted.org/cgit/lvm2.git/commit/?h=v2_02_130&id=42e07d2bceaad21f197de5dde9ead69742425159
internal_suspend added here: https://git.fedorahosted.org/cgit/lvm2.git/commit/?h=v2_02_149&id=797c18d543947f4c2777b4dcf3ceff57cb55352b
some discussion of fixing the symbol export control: https://git.fedorahosted.org/cgit/lvm2.git/commit/?id=82a27a85b54b68277e79de2fa9b4fadb01330497

Comment 1 Zdenek Kabelac 2016-04-12 20:15:37 UTC
Hi

Is your applicatoin properly requesting fully qualified 'dm_task_get_info@Base' (if you want the old one) ?

Please show us sample application code and which symbol you are requesting ?

If you are writing dynamically loadable plugin - you are unfortunately 'on your own boat' - since you can't use luxury provided by 'gcc' when linking versioned symbols automatically right.

The further question is - why you are is actually even trying to dynamically load
libdevmapper ?

If you have your own 'code' using libdevmapper - you should link this plugin with libdevmapper linked (and then you do not need to resolve magic yourself).

Otherwise you should just 'link' the library to have it portable - there is not much point to open 'library' dynamically and then use it as you will not save memory since unused code is not paged-in and will just complicate your code in major way.

Comment 3 Steve Schremmer 2016-04-12 20:57:35 UTC
I'm not sure the existing application code *knew* to ask for the old function.

It opens the library like this:
  if ((libdm_handle = dlopen(LIBDEVMAPPER_PATH, RTLD_NOW | RTLD_GLOBAL)) == NULL){ /* snipped out error path */ }

And requests the symbol like this:
  if ((lib_dm_task_get_info = (int (*)(struct dm_task *, struct dm_info *)) dlsym(libdm_handle, "dm_task_get_info")) == NULL){ /* error path */

I can't answer the "why" questions, as I wasn't involved in the original development of the application, and this issue has only shown itself in RHEL7.

Which of these is the "correct" symbol?
dm_task_get_info@Base
dm_task_get_info@@Base

Comment 4 Alasdair Kergon 2016-04-12 23:15:30 UTC
dlvsym() lets you specify the version

Comment 5 Alasdair Kergon 2016-04-13 01:08:40 UTC
If I understand this correctly:

After the code was compiled, we added 2 fields to the end of struct dm_info.

The code has not been changed since that happened.

The code expects to use the original version of the function viz. dm_task_get_info@Base.

dlsym() is actually picking up the new (now default) version, dm_task_get_info@@DM_1_02_97.

Our symbols have always been versioned, but it's only recently that we introduced more than one version of the same symbol.  Until dm version 1.02.97 everything had the version string "Base" and code got away with not specifying which version it needed to use because there was only one.  But strictly speaking, all code should have been using dlvsym(..., "Base") instead of dlsym() to make it future proof.

Symbol versioning is very poorly documented.  (And we have some code of our own that uses dlsym() so also needs making future proof.)

Comment 6 Alasdair Kergon 2016-04-13 01:17:14 UTC
(In reply to Steve Schremmer from comment #3)

> Which of these is the "correct" symbol?
> dm_task_get_info@Base
> dm_task_get_info@@Base

@@ simply indicates which version is the default.  Newly-compiled code will pick up that one which will be consistent with the current version of libdevmapper.h.

Comment 7 Zdenek Kabelac 2016-04-13 08:09:28 UTC
I'd suggest to redesign your code to 'dlopen' only just 'your' plugins.

Those plugins should be then correctly linked objects with full .so lib deps.
As example - see our 'dmeventd' our plugins - we have full control over them.
(yes we could switch dlvsym())

Those plugins are using i.e. device mapper library:

$ ldd ./libdevmapper-event-lvm2mirror.so 
	linux-vdso.so.1 (0x00007ffcac94e000)
	librt.so.1 => /lib64/librt.so.1 (0x00007fe8d14db000)
	libdevmapper-event-lvm2.so.2.02 => /var/tmp/lvm/lib/libdevmapper-event-lvm2.so.2.02 (0x00007fe8d12d7000)
	libdevmapper.so.1.02 => /var/tmp/lvm/lib/libdevmapper.so.1.02 (0x00007fe8d1068000)
	libc.so.6 => /lib64/libc.so.6 (0x00007fe8d0ca1000)
	libpthread.so.0 => /lib64/libpthread.so.0 (0x00007fe8d0a83000)
	liblvm2cmd.so.2.02 => /var/tmp/lvm/lib/liblvm2cmd.so.2.02 (0x00007fe8d06a6000)
	libselinux.so.1 => /lib64/libselinux.so.1 (0x00007fe8d047f000)
	libsepol.so.1 => /lib64/libsepol.so.1 (0x00007fe8d01e6000)
	libudev.so.1 => /lib64/libudev.so.1 (0x00007fe8d01c5000)
	libm.so.6 => /lib64/libm.so.6 (0x00007fe8cfebb000)
	/lib64/ld-linux-x86-64.so.2 (0x000055ac34e7c000)
	libdl.so.2 => /lib64/libdl.so.2 (0x00007fe8cfcb6000)
	libblkid.so.1 => /lib64/libblkid.so.1 (0x00007fe8cfa73000)
	libdevmapper-event.so.1.02 => /var/tmp/lvm/lib/libdevmapper-event.so.1.02 (0x00007fe8cf86b000)
	libpcre.so.1 => /lib64/libpcre.so.1 (0x00007fe8cf5f9000)
	libcap.so.2 => /lib64/libcap.so.2 (0x00007fe8cf3f4000)
	libresolv.so.2 => /lib64/libresolv.so.2 (0x00007fe8cf1d9000)
	libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007fe8cefc1000)
	libuuid.so.1 => /lib64/libuuid.so.1 (0x00007fe8cedbc000)
	libattr.so.1 => /lib64/libattr.so.1 (0x00007fe8cebb6000)


Using libdevmapper with dlopen is more complicated and user has to always use full function name (with version).

Suggested links:

https://github.com/apitrace/apitrace/issues/258
https://www.akkadia.org/drepper/symbol-versioning
http://www.akkadia.org/drepper/dsohowto.pdf


Note You need to log in before you can comment on or make changes to this bug.