736486 – Add support for encoding of udev-blacklisted characters in dm device names

Summary: Add support for encoding of udev-blacklisted characters in dm device names

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	Red Hat Enterprise Linux 6
Classification:	Red Hat
Component:	lvm2
Sub Component:
Version:	6.1
Hardware:	All
OS:	Linux
Priority:	high
Severity:	high
Target Milestone:	rc
Target Release:	---
Assignee:	Peter Rajnoha
QA Contact:	Corey Marthaler
Docs Contact:
URL:
Whiteboard:
Depends On:	633222
Blocks:	736493 740575 756082

Reported:	2011-09-07 19:57 UTC by Mike Burns
Modified:	2012-06-20 14:59 UTC
CC List:	21 users
Fixed In Version:	lvm2-2.02.95-1.el6
Doc Type:	Bug Fix
Doc Text:	Device-mapper allows any character except '/' in a device-mapper name. This conflicts with udev, whose character whitelist is restricted to 0-9, A-Z, a-z and #+-.:=@_. Using any blacklisted character in a device-mapper name causes udev to create incorrect /dev entries. To solve this, the libdevmapper library and the dmsetup command now support encoding udev-blacklisted characters in the \xNN format, where NN is the hex value of the character. This format is supported by udev. libdevmapper can operate in three 'mangling' modes: 'none' (no mangling), 'hex' (always mangle any blacklisted character) and 'auto' (detect whether the name is already mangled and mangle only if it is not). The default mode is 'auto', and every libdevmapper user is affected unless the mode is changed by the respective libdevmapper call. To support this feature, the dmsetup command has a new '--manglename <mangling_mode>' option that defines the name mangling mode used while processing device-mapper names. 'dmsetup info -c -o' has two new fields: 'mangled_name' and 'unmangled_name'. There is also a new 'dmsetup mangle' command that automatically renames existing device-mapper devices to their correctly mangled form. It is strongly advised to run this command after an update to correct any existing device-mapper names.
Clone Of:	633222
Clones:	736493
Environment:
Last Closed:	2012-06-20 14:59:49 UTC
Target Upstream Version:
Embargoed:

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Red Hat Product Errata	RHBA-2012:0962	0	normal	SHIPPED_LIVE	lvm2 bug fix and enhancement update	2012-06-19 21:12:11 UTC

Description Mike Burns 2011-09-07 19:57:43 UTC

+++ This bug was initially created as a clone of Bug #633222 +++

Description of problem:

Issue found using RHEV-H.  In RHEV-H we run with device-mapper-multipath for all devices that are supported.  This includes single path devices like local ATA disks.  This issue is easiest to reproduce with a VM using scsi emulation.  


In some cases, on a virtual machine using scsi disk emulation, partitioning for rhevh could fail if get_dm_device picks the incorrect entry under /dev/mapper.  

When using scsi emulation, you get two devices under /dev/mapper, one is a block device and the other is a symbolic link to /dev/dm-X.  If the symbolic link is chosen, during partitioning, when looking for /dev/mapper/<wwid>p1, it will fail.  

A duplicate entry is created when the WWID contains spaces.


To work around this in RHEVH, we changed the udev rules:

--- a/recipe/common-el6.ks
+++ b/recipe/common-el6.ks
@@ -116,6 +116,6 @@ python -m compileall /usr/share/rhn/up2date_client
 # fixes libvirtd startup on firstboot but migration won't work
 sed -i -e '/by vdsm$/d' /etc/sysconfig/libvirtd
 
-# udev: replace spaces in device names
-sed -i -e 's/DM_NAME}"$/DM_NAME}", OPTIONS+="string_escape=replace"/' /lib/udev/rules.d/10-dm.rules
+# udev: do not create symlinks under /dev/mapper/ rhbz#633222
+sed -i -e '/^ENV{DM_UDEV_DISABLE_DM_RULES_FLAG}/d' /lib/udev/rules.d/10-dm.rules


Example output of local disk owned by dmm after this change was made. 

[root@amd-1216-8-5 ~]# ll /dev/mapper/*
brw-rw----. 1 root disk 253,  0 Jan 31 06:49 /dev/mapper/1ATA     WDC WD2502ABYS-18B7A0                        WD-WCAT19563677
brw-rw----. 1 root disk 253,  8 Jan 31 06:49 /dev/mapper/1ATA     WDC WD2502ABYS-18B7A0                        WD-WCAT19563677p1
brw-rw----. 1 root disk 253,  9 Jan 31 06:49 /dev/mapper/1ATA     WDC WD2502ABYS-18B7A0                        WD-WCAT19563677p2
brw-rw----. 1 root disk 253, 10 Jan 31 06:49 /dev/mapper/1ATA     WDC WD2502ABYS-18B7A0                        WD-WCAT19563677p3

Comment 2 Mike Burns 2011-09-07 20:07:45 UTC

On RHEL 6 host with device-mapper-multipath-0.4.9-41.el6.x86_64

# ls -l /dev/mapper/0QEMU*
lrwxrwxrwx. 1 root root       7 Sep  7 16:05 0QEMU -> ../dm-2
brw-rw----. 1 root disk 253,  2 Sep  7 16:05 0QEMU    QEMU HARDDISK   drive-scsi0-0-0


First attempt to fix in RHEV-H was to make this change:

--- 10-dm.rules.orig	2011-09-07 16:04:04.681215892 -0400
+++ 10-dm.rules	2011-09-07 16:04:28.369204965 -0400
@@ -105,7 +105,7 @@
 # possible future changes.
 ENV{DM_UDEV_RULES_VSN}="1"
 
-ENV{DM_UDEV_DISABLE_DM_RULES_FLAG}!="1", ENV{DM_NAME}=="?*", SYMLINK+="mapper/$env{DM_NAME}"
+ENV{DM_UDEV_DISABLE_DM_RULES_FLAG}!="1", ENV{DM_NAME}=="?*", SYMLINK+="mapper/$env{DM_NAME}", OPTIONS+="string_escape=replace"
 
 # We have to ignore further rule application for inappropriate events
 # and devices. But still send the notification if cookie exists.

which resulted in this:

brw-rw----. 1 root disk 253, 2 Sep  7 15:57 0QEMU    QEMU HARDDISK   drive-scsi0-0-0
lrwxrwxrwx. 1 root root      7 Sep  7 15:57 0QEMU____QEMU_HARDDISK___drive-scsi0-0-0 -> ../dm-2

Comment 3 Alasdair Kergon 2011-09-07 20:08:36 UTC

I think we need a solution to this for 6.2, or else we need to re-enable verify_udev_operations.

Comment 4 Peter Rajnoha 2011-09-08 08:45:36 UTC

This is completely under udev control, I think. What we can do is use 'OPTIONS+="string_escape=replace"', as the RHEV team has already done, but this replaces all 'unsafe characters' with an underscore, which does not seem acceptable.

CC'ing Harald - is there any way we can tell udev to accept all characters (like spaces) in devname/symlink names?

If not, I'm afraid the RHEV team has to deal with all the escape logic that udev does. In my opinion, putting back the udev fallback and creating the nodes/symlinks directly is not the way to go at all...

Comment 5 Alan Pevec 2011-09-08 09:35:23 UTC

(In reply to comment #4)
> If not, I'm afraid that RHEV team has to deal with all the escape logic that
> udev does. In my opinion, putting back the udev fallback and creating the
> nodes/symlinks directly is not a way to go at all...

But is such a change in default behaviour safe for RHEL 6.2, a minor release?

Comment 6 Harald Hoyer 2011-09-08 20:36:38 UTC

(In reply to comment #4)
> This is completely under udev control, I think. What we can do is to use the
> 'OPTIONS+="string_escape=replace" as already done by RHEV team, but this will
> replace all 'unsafe characters' with an underscore which does not seem to be
> acceptable.
> 
> CC'ing Harald - is there any way, we can tell udev to use all characters (like
> spaces) for devname/symlink names?

We could fix udev to accept symlinks with spaces, but I would recommend using the escape logic, because of all unknown side effects to other scripts and tools.

Comment 7 Peter Rajnoha 2011-09-09 07:31:41 UTC

(In reply to comment #6)
> We could fix udev to accept symlinks with spaces, but I would recommend using
> the escape logic, because of all unknown side effects to other scripts and
> tools.

Note that the escape logic is not complete: the escape character itself is not escaped, so we could end up with an ambiguous symlink. For example, "a b" and "a_b" both result in "a_b", and the former symlink gets overwritten. That should be fixed.
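The ambiguity can be illustrated with a minimal Python sketch. It assumes the character whitelist quoted in the Doc Text of this bug (0-9, A-Z, a-z and #+-.:=@_) and a hypothetical helper, replace_unsafe, that mimics the effect of string_escape=replace; it is not udev's actual code.

```python
import string

# Whitelist as quoted in this bug: 0-9, A-Z, a-z and #+-.:=@_
SAFE_CHARS = set(string.ascii_letters + string.digits + "#+-.:=@_")


def replace_unsafe(name: str) -> str:
    """Hypothetical stand-in for string_escape=replace: unsafe char -> '_'."""
    return "".join(ch if ch in SAFE_CHARS else "_" for ch in name)


# Two different device-mapper names collapse to the same symlink name.
print(replace_unsafe("a b"))
print(replace_unsafe("a_b"))

# The QEMU disk name from comment 2, built explicitly to keep the spacing.
qemu = "0QEMU" + " " * 4 + "QEMU HARDDISK" + " " * 3 + "drive-scsi0-0-0"
print(replace_unsafe(qemu))
```

Because '_' is itself a whitelisted character, the mapping cannot be reversed: nothing in "a_b" says whether the original name contained a space or an underscore.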

Comment 8 Peter Rajnoha 2011-09-09 07:36:52 UTC

But I'd lean toward a solution where such special characters are allowed. You can use them in directory and file names as well, so scripts and tools face the same situation there.

Comment 9 Peter Rajnoha 2011-09-13 09:00:49 UTC

Harald, Kay, would it be possible to allow these special characters in udev? Or is this a no-go for you?

Comment 10 Alan Pevec 2011-09-27 15:29:52 UTC

Or could we turn this around: let device-mapper compress and replace spaces with underscore, so that d-m names match udev?

Here's another duplicate:
# ls /dev/mapper/
brw-rw----. 1 root disk 253,  0 2011-09-26 08:04 1ATA     HDS728080PLA380      39M3701 26K5308IBM       PFDB32E7RRDX6M
brw-rw----. 1 root disk 253,  2 2011-09-26 08:06 1ATA_HDS728080PLA380_39M3701_26K5308IBM_PFDB32E7RRDX6M

# dmsetup info -j 253 -m 0
Name:              1ATA     HDS728080PLA380      39M3701 26K5308IBM       PFDB32E7RRDX6M
State:             ACTIVE
Read Ahead:        256
Tables present:    LIVE
Open count:        2
Event number:      0
Major, minor:      253, 0
Number of targets: 1
UUID: mpath-1ATA     HDS728080PLA380      39M3701 26K5308IBM       PFDB32E7RRDX6M

# dmsetup info -j 253 -m 2
Name:              1ATA_HDS728080PLA380_39M3701_26K5308IBM_PFDB32E7RRDX6M
State:             ACTIVE
Read Ahead:        256
Tables present:    LIVE
Open count:        3
Event number:      0
Major, minor:      253, 2
Number of targets: 1
UUID: mpath-1ATA_HDS728080PLA380_39M3701_26K5308IBM_PFDB32E7RRDX6M

Two d-m devices for the same underlying devices:
# dmsetup deps -j 253 -m 2
1 dependencies	: (8, 0)
# dmsetup deps -j 253 -m 0
1 dependencies	: (8, 0)

Would it be enough to modify scsi_id to filter wwids with s/ +/_/ ?
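For illustration, the s/ +/_/ filter suggested above can be tried in a few lines of Python. The helper name squeeze_spaces is hypothetical, and the sketch only shows the string transformation, not how scsi_id would apply it.

```python
import re


def squeeze_spaces(wwid: str) -> str:
    """Collapse every run of spaces into a single underscore (s/ +/_/g)."""
    return re.sub(r" +", "_", wwid)


# The WWID-derived name from the dmsetup output above; the runs of spaces
# are built explicitly so their length is visible.
wwid = ("1ATA" + " " * 5 + "HDS728080PLA380" + " " * 6
        + "39M3701 26K5308IBM" + " " * 7 + "PFDB32E7RRDX6M")

print(squeeze_spaces(wwid))
```

The result matches the name of the second device shown above. The transformation is lossy, though: "a b", "a  b" and "a_b" all map to "a_b", which is the ambiguity raised in comment #7.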

Comment 11 Peter Rajnoha 2011-09-28 02:24:30 UTC

(In reply to comment #10)
> Or could we turn this around: let device-mapper compress and replace spaces
> with underscore, so that d-m names match udev?

But that escape logic could be ambiguous (see comment #7). Udev would need to be fixed, otherwise, sooner or later, we'll hit the other problem.

Comment 12 Kay Sievers 2011-09-28 09:35:02 UTC

(In reply to comment #9)
> Harald, Kay, would it be possible to allow these special characters in udev? Or
> is this a no-go for you?

Sorry, I don't really know how to support device names with spaces in
udev. For historic reasons, the names are used internally and exported
in lists where the space char is the separator. I fear we cannot just
change things here without breaking other stuff that works that way
since the beginning.

(In reply to comment #11)
> But that escape logic could be ambiguous (see comment #7). Udev would need to
> be fixed, otherwise, sooner or later, we'll hit the other problem.

Which is the case for all udev device names and symlinks since day one. And
hardware INQUIRY data might not be unique by itself. So I do not think that
is a real-world problem, it is more just ugly.

Sure I agree that all this isn't nice, and it would surely be done differently today, but it's what we have.

Deriving device names from un-mangled INQUIRY data of disks is not really an
option for udev, we need to be more careful here. Proper escaping, which is
reversible, might be what we want and we could try to make that work. We do
that already for filesystem labels. Just copying the strings 1:1 from the
hardware and putting them into /dev is not really an option. We cannot trust
hardware that way, and /dev content must follow some rules of sanity.

Comment 16 Alasdair Kergon 2011-09-30 22:04:25 UTC

Is it only a space that causes a problem as it's a separator, or are there more characters that can no longer be used in /dev?  Is it better to provide a list of *accepted* characters, and then any other character can be escaped (in a reversible way) with a simple tool provided if the user wants to do the escape/unescape.  (We already have an outstanding request to support UTF-8, for example, and we could kill 2 birds with 1 stone.)

Comment 17 Alasdair Kergon 2011-09-30 22:05:17 UTC

- If there is such a list of accepted characters, please would someone paste that list into this bz.

Comment 18 Kay Sievers 2011-10-01 17:05:11 UTC

These characters are allowed in device node / symlink names:
  0-9
  A-Z
  a-z
  #+-.:=@_
  any valid utf8 character sequence (they are validated)

(This is all for historic reasons, from when Linux hotplug was still a bunch of
shell scripts, and udev was just a binary called in that context. Udev
reads untrusted data from hardware that is plugged into the system
and composes symlink names out of it. The exported strings from udev
have been consumed by (some horribly written) shell scripts, and we
needed to make sure that didn't do anything bad, so we allowed only
'shell safe' characters.)

Comment 19 Alasdair Kergon 2011-10-04 19:09:51 UTC

So of the 1-byte characters, say we pick one as an escape character.  I can imagine all of them appearing in device names.

First suggestion:
# followed by hex representation terminated by another #

So # itself would appear as:
#23#

Space would appear as
#20#

and utf8 might have 4 or more hex chars between the #s
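A rough Python sketch of this first suggestion follows. It assumes the whitelist from comment 18 minus '#', since '#' becomes the escape character and must itself be encoded; hash_encode and hash_decode are hypothetical names, and this scheme is only the proposal from this comment, not something udev or libdevmapper is known to implement.

```python
import re
import string

# Whitelisted 1-byte characters, with '#' removed because it is the escape.
SAFE_CHARS = set(string.ascii_letters + string.digits + "+-.:=@_")


def hash_encode(name: str) -> str:
    """Encode each non-whitelisted character as #<hex of its UTF-8 bytes>#."""
    out = []
    for ch in name:
        if ch in SAFE_CHARS:
            out.append(ch)
        else:
            out.append("#" + ch.encode("utf-8").hex() + "#")
    return "".join(out)


def hash_decode(encoded: str) -> str:
    """Reverse hash_encode by expanding every #hex# group."""
    return re.sub(
        r"#([0-9a-f]+)#",
        lambda m: bytes.fromhex(m.group(1)).decode("utf-8"),
        encoded,
    )


print(hash_encode("#"))       # the escape character itself
print(hash_encode("a b"))     # a space
print(hash_encode("\u011b"))  # a two-byte UTF-8 character: 4 hex chars
print(hash_decode(hash_encode("a #b \u011b")) == "a #b \u011b")
```

Because every literal '#' is encoded, each '#' in the output belongs to exactly one escape group, which is what makes the scheme reversible.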


At which layer should we apply the encoding?
- we could leave dm exactly as-is, and do the encoding when creating the names for udev to use.  That's the most general solution.

Or
- we could do the encoding at the libdevmapper side, so the in-kernel names would match.  However we are currently limited to 127 characters in names - would that be enough?

Comment 20 Alasdair Kergon 2011-10-04 20:14:05 UTC

For passing utf8 through dm, 0x2f ('/') would need escaping if it can occur inside a multi-byte character - anything else?

Then perhaps it's only the remaining single byte characters that would need # encoding.

Comment 21 Kay Sievers 2011-10-05 10:57:11 UTC

Most udev tools encode non-whitelisted characters with
-backslash-hex-encoding which shells can directly reverse with eval:
  ID_MODEL_ENC=EHCI\x20Host\x20Controller
  ID_MODEL_ENC=Logitech\x20USB\x20Speaker
  ID_VENDOR_ENC=Chicony\x20Electronics\x20Co.\x2c\x20Ltd.

(Validated utf8 is just passed; all byte values in a multi-byte utf8 sequence
are 0x80 or above, so there can't be any meaningful ASCII char in it.)

If that fits, you could copy the code from: libudev-util.c
  https://github.com/kaysievers/udev/blob/master/libudev/libudev-util.c#L427

or if wanted, libudev could export this function, which is currently
internal only.
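As an illustration only, the backslash-hex encoding shown in these examples could be sketched in Python like this. It is not a port of the libudev function; encode_string is a hypothetical name, the whitelist is the one from comment 18, and non-ASCII characters are passed through on the assumption that a Python str is already valid Unicode.

```python
import string

# Whitelisted 1-byte characters from comment 18.
SAFE_CHARS = set(string.ascii_letters + string.digits + "#+-.:=@_")


def encode_string(text: str) -> str:
    """Replace non-whitelisted ASCII characters with \\xNN; pass the rest."""
    out = []
    for ch in text:
        if ch in SAFE_CHARS or ord(ch) >= 0x80:
            out.append(ch)  # whitelisted ASCII, or non-ASCII passed as-is
        else:
            out.append("\\x%02x" % ord(ch))
    return "".join(out)


print(encode_string("EHCI Host Controller"))
print(encode_string("Chicony Electronics Co., Ltd."))

# Every byte of a multi-byte UTF-8 sequence is 0x80 or above, so it can
# never be mistaken for '/' (0x2f) or any other ASCII character.
print(all(b >= 0x80 for b in "\u011b\u0161".encode("utf-8")))
```

In this sketch the backslash is not whitelisted, so it is encoded as \x5c like any other unsafe character.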

Comment 22 Alasdair Kergon 2011-10-05 14:13:20 UTC

Can the backslash appear in a filename inside /dev or does it need further encoding into one of the allowed characters first?

Comment 23 Kay Sievers 2011-10-05 14:19:13 UTC

A '\' followed by 'x' and two hex digits is not replaced.

Comment 24 Zdenek Kabelac 2011-10-07 08:04:59 UTC

(In reply to comment #21)
> Most udev tools encode non-whitelisted characters with
> -backslash-hex-encoding which shells can directly reverse with eval:
>   ID_MODEL_ENC=EHCI\x20Host\x20Controller
>   ID_MODEL_ENC=Logitech\x20USB\x20Speaker
>   ID_VENDOR_ENC=Chicony\x20Electronics\x20Co.\x2c\x20Ltd.
> 
> (Validated utf8 is just passed, all byte values in a utf sequence are bigger
> than 0x80, so there can't be any meaningful char in it.)
> 
> If that fits, you could copy the code from: libudev-util.c
>   https://github.com/kaysievers/udev/blob/master/libudev/libudev-util.c#L427
> 
> or if wanted, libudev could export this function, which is currently
> internal only.

Please take this as a naive question: how is the internal DB encoding used in udev related to the name that is used in the /dev filesystem?

I do not know the internal udev details, but it seems to me you could keep internal DB names in whatever encoding you want. Whenever you call a system function (e.g. link(2), rename(2)...), you would take the encoded name and decode it into the filesystem's native format; so if devfs supported utf8, you would decode \x20 and the other encoded sequences to the native format. And when you read a filename back from the filesystem (readdir(3)...), you would encode it back into your internal representation. That way I don't see why you would need many changes in the internal udev code - just some wrappers around udev's system calls?

It looks to me that the native representation in utf-8 in places like:

/sys/dev/block/253:0/dm/name 

is what most users would expect. Currently I would not expect to find 'mangled' \x20 sequences there - wouldn't basically all tools need a major update (a library?) to parse such names?

And I'd expect the same name in other places in the /dev dir.

Comment 25 Zdenek Kabelac 2011-10-07 09:26:31 UTC

Looking more into this issue, I've played a bit with the behavior on my rawhide system, and it seems some tools already support this "evolutionary" way of creating encoding standards.

And here are my current results:

LANG=cz_CZ.UTF-8

# mkfs.ext4 -L "MY TEST ěšěeě" /dev/loop0


# blkid  /dev/loop0 
/dev/loop0: LABEL="MY TEST M-DM-^[M-EM-!M-DM-^[eM-D" UUID="76055538-f2ea-4552-a707-9ffb3bdf9cf7" TYPE="ext4" 


# blkid -o udev /dev/loop0 
ID_FS_LABEL=MY_TEST_ěšěe_
ID_FS_LABEL_ENC=MY\x20TEST\x20ěšěe\xc4
ID_FS_UUID=76055538-f2ea-4552-a707-9ffb3bdf9cf7
ID_FS_UUID_ENC=76055538-f2ea-4552-a707-9ffb3bdf9cf7
ID_FS_TYPE=ext4

/run/udev/links/disk\\x2fby-label\\x2fMY\\x5cx20TEST\\x5cx20ěšěe\\x5cxc4

/dev/disk/by-label/MY\x20TEST\x20ěšěe\xc4

This encoding currently seems to have a minor issue on rawhide: mkfs.ext4 has only 16 bytes for the label and takes just the first byte of the encoded last 'ě', producing the funny \xc4 character. Otherwise it seems to give a result that is understood by a few tools: udev/blkid/palimpsest.
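The stray \xc4 can be reproduced with a little byte arithmetic. The following Python sketch assumes only what is stated above, namely a 16-byte label field and a UTF-8 encoded label; the non-ASCII characters are written as escapes so the byte values are unambiguous.

```python
# "MY TEST " followed by e-caron, s-caron, e-caron, 'e', e-caron.
label = "MY TEST \u011b\u0161\u011be\u011b"

raw = label.encode("utf-8")  # each caron letter takes two bytes
truncated = raw[:16]         # the 16-byte label field


def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


print(len(raw))                  # 8 ASCII bytes + 4 * 2 + 1
print(hex(truncated[-1]))        # lead byte of the cut-off last character
print(is_valid_utf8(truncated))  # the dangling lead byte is not valid UTF-8
```

The full label is 17 bytes long, so the cut lands between the two bytes (0xc4 0x9b) of the final character. The leftover 0xc4 is not a valid UTF-8 sequence on its own, which is consistent with it showing up escaped as \xc4 in ID_FS_LABEL_ENC.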

Looking at your upstream commit, where you made the function public just a few days ago, I can probably assume that everyone currently has their own undocumented way of transforming characters.

That probably means that, for our dmsetup tool, when a user wants to create "a b"

dmsetup needs to create a table with the name 'a\x20b'.

I think that needs heavy documentation - and it's probably a major API change.

On my system it seems to be working already when I use such a sequence
(i.e. I encode the space as \x20 - which means users could already use this on their own).

Now, where is this standard documented and described?

Comment 26 Peter Rajnoha 2011-10-07 09:41:08 UTC

(In reply to comment #23)
> A '\' followed by 'x' and two hex digits is not replaced.

Yes, this must be documented! I mean, users should know what to
expect, probably a mention in udev man page would be fine if we're going to
support such encoding (so we can make a reference to udev documentation for
users if needed).

There are two ways to support that - either we directly encode the name and
use the *encoded* name as the device-mapper name OR we keep the name as it is
and encode it only in udev rules, by calling an extra binary to do the
encoding - then we can pass the encoded string to the SYMLINK udev directive.

For me, the better approach seems to be the first one - to encode the name
directly and use it as the dm name (though it would be a bit misleading for users
to see "a\x20b" as the device name instead of "a b", if documented properly it
should be fine).

This would be default behaviour. And if someone wants to do it the old way for
any reason (e.g. by bypassing udev), there should be an option like "dmsetup
--nonameencode".

Now, a more practical question - the encode function has been exported in
libudev just recently. So the question is whether we should just copy the tiny
code part and use it without the need to call a libudev fn. If not, we need a
backport of that patch that exports the function for 6.2 version of libudev so
we can use it right away (and then make a proper Requires in the spec file).

Comment 27 Kay Sievers 2011-10-07 09:56:22 UTC

(In reply to comment #25)
> which is understand between few tools: udev/blkid/palimpsest somehow.
> 
> Looking into your upstream commit where you've just made the function public
> just few days ago I could probably assume everyone has currently it's own
> undocumented way for character transformation.

Udisks just uses the escaped udev strings and does not need to encode.
Libblkid just has an exact copy of the encoding code from libudev.

(In reply to comment #26)
> Now, a more practical question - the encode function has been exported in
> libudev just recently.

Yesterday. :)

> So the question is whether we should just copy the tiny
> code part and use it without the need to call a libudev fn. If not,
> we need a backport of that patch that exports the function for
> 6.2 version of libudev so we can use it right away

Both work. Maybe blkid_encode_string() is exported, and available in 6.2
already?

Comment 28 Peter Rajnoha 2011-10-07 10:01:31 UTC

(In reply to comment #27)
> > So the question is whether we should just copy the tiny
> > code part and use it without the need to call a libudev fn. If not,
> > we need a backport of that patch that exports the function for
> > 6.2 version of libudev so we can use it right away
> 
> Both works. Maybe blkid_encode_string() is exported, and available in 6.2
> already?

So I can assume that whitelist will never ever change, I hope, right?

Comment 29 Kay Sievers 2011-10-07 12:29:32 UTC

There are no plans to change it, it has been like this for years.

But there can surely be no guarantee about 'never'. :)

Comment 30 Peter Rajnoha 2011-10-07 13:09:31 UTC

(In reply to comment #29)
> There are no plans to change it, it's like this for years.
> 
> But there can surely no guarantee about 'never'. :)

If you do, I'll reassign all future bugs related to such a change to udev then :)

Comment 31 Zdenek Kabelac 2011-10-07 14:07:13 UTC

(In reply to comment #29)
> There are no plans to change it, it's like this for years.
> 
> But there can surely no guarantee about 'never'. :)

Well - so maybe you have some fix in mind for dmsetup.

Problem - a user may want to create the device 'a\b'.

With the current dmsetup create 'a\b' I get '/dev/mapper/a_b',

which is interesting, but somewhat unpredictable.

And of course this leads to a self-recursion problem: when the user supplies
\x20, is it already a mangled sequence for udev, or should we mangle it again?

It seems like the only way is to limit formally supported chars which has been supported before.

Comment 32 Karel Zak 2011-10-10 07:25:51 UTC

(In reply to comment #31)
> And of course this lead to selfrecursion problem - when user supply 
> \x20 - was this mangle sequence for udev - or should we mangle it again ?

The '\' char is "unsafe", so it's encoded to \x5c.

The safe chars are:

 ((c >= '0' && c <= '9') || (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') ||
  strchr("#+-.:=@_", c) != NULL)

and valid utf8 sequences. Anyway, it will be better to reuse/copy the existing functions from libblkid/libudev.

Comment 33 Zdenek Kabelac 2011-10-11 16:18:54 UTC

(In reply to comment #32)
> (In reply to comment #31)
> > And of course this lead to selfrecursion problem - when user supply 
> > \x20 - was this mangle sequence for udev - or should we mangle it again ?
> 
> The '\' char is "unsafe", so it's encoded to \x5c.
> 
> The safe chars are:
> 
>  (c >= '0' && c <= '9') || (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') ||
>   strchr("#+-.:=@_", c) != NULL)
> 
> and valid utf8 sequences. Anyway, it's will be better to reuse/copy existing
> functions from libblkid/libudev.

The problem here is that, until some point in Unix history, users were able to use a certain set of allowed characters in the filesystem to create a device node.

Now, for no big reason, certain characters (not really documented until this BZ) are transcoded, creating a completely different device node. (Sure, we know how this routine spreads across /dev applications.)

So now when the user creates a node with some name 'a' and wants to run mkfs.ext3, he needs to pass a differently named node 'b' (or, as a second option, all tools accepting a device node as a parameter will need to check for both names).

Of course we may overcome this, either by prohibiting such characters or by introducing numerous options that give the user the encoded name for whatever he passed in. But quite frankly, the best and most logical option is to fix udev here (which doesn't look like a big task compared with fixing all the other tools to handle our encoded names reasonably), as I do not see a single reason to bother users with transcoded device names.

Of course you need to fix various shell scripts to handle spaces, but this problem has been there forever, and as we all know it's not a big problem to write a shell script that handles filenames with spaces.

Comment 35 Peter Rajnoha 2011-10-18 09:00:35 UTC

The patchset that deals with reversible mangling of device names (comment #21) is on the lvm-devel list awaiting review. However, considering the nature of the change and how late we are in the 6.2 development cycle (also given when this problem was officially reported and the time taken to decide whether it should be resolved by the udev team or the lvm team), this change poses a risk of regressions that we can't neglect.

The patches mangle the name on input and use the mangled name as the proper dm name. Care has been taken so that the name in /dev/mapper is always in sync with the name in /sys/block/dm-X/dm/name. Therefore, we always do the mangling before the name reaches the ioctl request for device-mapper in the kernel.

We could try to put this change in the next snapshot, but it would be risky. Also, any application expecting characters such as the space character would simply not find them there, since udev does not support them. Any application using such device names would need to understand the mangling used (e.g. \x20 instead of the space character itself: not "/dev/mapper/a b", but "/dev/mapper/a\x20b"). This requires changes on the application side, so the RHEV team would need to expect such encoding to appear in /dev.

As discussed with udev team in this BZ, this is the only way we can support such characters, at least for now, unless udev changes that sometime in the future (comment #12).

General facts and consequences of using the fallback mode:

- fallback mode is used automatically for each libdevmapper user unless DM_UDEV_DISABLE_LIBRARY_FALLBACK flag is set for the DM task (IOW, by default, the fallback is still used on libdevmapper side unless disabled explicitly)

- dmsetup disables the fallback by default (you can reenable it by using the "--verifyudev" switch)

- LVM2 disables the fallback by default (you can reenable it by using the "activation/verify_udev_operations=1" setting in the lvm.conf)

- when using LVM2 with udev fallback mode (where this mode is now considered as debug-only option for exceptional situations), it's recommended to disable the "activation/obtain_device_list_from_udev". That's because LVM2 *will not* see block devices missing in udev db (and that's what happens with devices not created by udevd, but by calling mknod directly, see also related bug #740575).
So care has to be taken when using this fallback mode!

If possible, I'd recommend using the original workaround mentioned in comment #1 - OPTIONS+="string_escape=replace", which replaces all spaces with the "_" character. Is this acceptable for 6.2? (6.3 will surely include all of the "\xNN" mangling support.) The RHEV team would need to deal with the escape logic anyway, whether it's the simple "_" or "\xNN". So there's little difference from this point of view (the only one being that the "_" escape logic is not complete and could lead to ambiguities, as stated in comment #7).

Comment 36 Peter Rajnoha 2011-10-18 11:15:48 UTC

I strongly recommend using the original workaround (OPTIONS+="string_escape=replace") and moving this request for name mangling to 6.3.

Comment 37 Alan Pevec 2011-10-18 22:50:56 UTC

(In reply to comment #36)
> I strongly recommend using the original workaround
> (OPTIONS+="string_escape=replace) and move this request for name mangling to
> 6.3.

Fine with moving to 6.3. RHEV-H implemented the following workaround:
- 10-dm.rules is left unmodified
- changed defaults in multipath.conf in RHEV-H image:
  getuid_callout "/lib/udev/scsi_id --replace-whitespace --whitelisted --device=/dev/%n"

This avoids spaces in names and also duplicates under /dev/mapper/

Ben, shouldn't "--replace-whitespace" be in hard-coded d-m-m defaults, just to avoid any trouble?

Comment 38 Peter Rajnoha 2011-10-19 16:10:29 UTC

(In reply to comment #37)
> Fine with moving to 6.3, RHEV-H implemented following workaround:

Moving to 6.3 then.

Comment 40 Peter Rajnoha 2011-12-16 14:00:39 UTC

Patches already posted to lvm-devel for review (still pending acceptance, partly due to the reasons described briefly in comment #33). Setting "cond nak upstream" for now.

Although there's a known workaround as described in comment #37, we have to fix this properly either on libdevmapper side (but with inevitable oddities in naming where a user must be aware of the escaping used) or provide a fix on udev side.

Comment 41 Peter Rajnoha 2012-02-15 13:19:49 UTC

The patchset has been committed upstream (lvm2 v2.02.92/libdevmapper v1.02.71).

The character whitelist used (which is taken from udev): 0-9, A-Z, a-z, #+-.:=@_
Encoding: \xNN where NN is the hex value of the character

The dmsetup command has a new '--manglename <mangling_mode>' option and a new 'mangle' command. It also has new fields 'mangled_name' and 'unmangled_name' in the dmsetup info -c -o output. The default mangling mode is set at build time with the configure option '--with-default-name-mangling'.

The mangling mode is one of: none (no mangling), hex (always mangle), auto (mangle only if not already mangled, otherwise keep it without any change; error on mixed mangled/unmangled input string).
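The three modes can be sketched in Python as follows. This is an illustration based only on the description in this comment, not the libdevmapper C implementation; the function names (hex_mangle, auto_mangle, mangle) are hypothetical, and the exact detection rules used by the library may differ.

```python
import re
import string

# Whitelist taken from udev, as listed above.
SAFE_CHARS = set(string.ascii_letters + string.digits + "#+-.:=@_")
MANGLED_SEQ = re.compile(r"\\x[0-9a-fA-F]{2}")


def hex_mangle(name: str) -> str:
    """'hex' mode: always encode every non-whitelisted byte as \\xNN."""
    out = []
    for byte in name.encode("utf-8"):
        ch = chr(byte)
        out.append(ch if ch in SAFE_CHARS else "\\x%02x" % byte)
    return "".join(out)


def auto_mangle(name: str) -> str:
    """'auto' mode: mangle only if the name is not mangled yet."""
    has_mangled = MANGLED_SEQ.search(name) is not None
    # Whatever remains after removing \xNN sequences must be whitelisted.
    remainder = MANGLED_SEQ.sub("", name)
    has_unsafe = any(ch not in SAFE_CHARS for ch in remainder)
    if has_mangled and has_unsafe:
        raise ValueError("mixed mangled/unmangled name: %r" % name)
    return name if has_mangled else hex_mangle(name)


def mangle(name: str, mode: str = "auto") -> str:
    """Dispatch on the mangling mode: 'none', 'hex' or 'auto'."""
    if mode == "none":
        return name
    if mode == "hex":
        return hex_mangle(name)
    if mode == "auto":
        return auto_mangle(name)
    raise ValueError("unknown mangling mode: %r" % mode)


print(mangle("a b", "none"))
print(mangle("a b", "hex"))
print(mangle(r"a\x20b", "hex"))   # the backslash itself gets encoded
print(mangle(r"a\x20b", "auto"))  # already mangled, kept unchanged
```

The difference between 'hex' and 'auto' shows on already mangled input: 'hex' encodes the backslash again, while 'auto' recognizes the \xNN sequence and leaves the name alone, which is how the self-recursion question from comment #31 is avoided.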

To convert (rename) any existing device-mapper device names into properly mangled names, please run "dmsetup mangle" (this will probably be called directly during the package update in rpm update scriptlet so users won't need to do anything in particular).

Please, keep in mind that the exact kernel name will be mangled!

Mangling/unmangling happens transparently and directly in libdevmapper, so existing libdevmapper users need no extra calls: dm_task_set_name automatically mangles the name, and dm_task_get_name/names return the unmangled form of the name (so from this point of view, the functionality stays the same as before). The new libdevmapper functions 'dm_task_get_name_mangled' and 'dm_task_get_name_unmangled' return the mangled and unmangled forms of the device-mapper name separately.

dmsetup will still show the unmangled form, to see mangled/unmangled form directly, please use new dmsetup info -c -o mangled_name/unmangled_name field.
While in AUTO mode, you can use mangled as well unmangled form of the name on input, e.g.

  dmsetup info 'a b' is the same as dmsetup info 'a\x20b'

If you want to see what the kernel name looks like exactly, you can use:

  dmsetup info --manglename none

Of course, the /dev/mapper content will be in mangled form, so any blacklisted character is encoded in the \xNN format. ANY TOOL LOOKING AT /dev/mapper NEEDS TO ACCOUNT FOR THIS ENCODED FORM OF THE DEVICE-MAPPER NAME!
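
As an illustration of what this means for a tool that reads /dev/mapper, the sketch below maps directory entries to the decoded device-mapper names. The helper `device_names` is hypothetical, and skipping the `control` node is an assumption about what such a tool would want.

```python
import os
import re

# A \xNN sequence, matched on the raw bytes of the entry name
ESCAPE = re.compile(rb"\\x([0-9a-fA-F]{2})")


def device_names(entries):
    """Map each /dev/mapper entry name to its decoded device-mapper name."""
    names = {}
    for entry in entries:
        if entry == "control":  # the control node is not a mapped device
            continue
        raw = ESCAPE.sub(
            lambda m: bytes([int(m.group(1), 16)]),
            os.fsencode(entry),
        )
        names[entry] = os.fsdecode(raw)
    return names


# Sample entries; a real tool would pass os.listdir("/dev/mapper") instead.
for entry, name in device_names(["a\\x20b", "c\\x20d", "control", "x"]).items():
    print(entry, "->", repr(name))
```

Keeping both forms lets the tool open the node by its on-disk entry name while showing the user the decoded name.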

(In case of emergency, if this does not work for whatever reason, you can use DM_DEFAULT_NAME_MANGLING_MODE=none environment variable to switch mangling off)

Comment 43 Peter Rajnoha 2012-02-15 15:43:03 UTC

[0] nostromo/~ # dmsetup create "a b" --table "0 32768 linear /dev/sda 0"

[0] nostromo/~ # dmsetup info -c "a b"
Name             Maj Min Stat Open Targ Event  UUID                            
a b              253   2 L--w    0    1      0                                 

[0] nostromo/~ # dmsetup info -c "a\x20b"
Name             Maj Min Stat Open Targ Event  UUID                            
a b              253   2 L--w    0    1      0                 

[0] nostromo/~ # dmsetup info -c -o name,mangled_name,unmangled_name "a b"
Name             MangledName      UnmangledName   
a b              a\x20b           a b             

[0] nostromo/~ # ls /dev/mapper/
a\x20b  control

[0] nostromo/~ # dmsetup create x --table '0 8 linear /dev/mapper/a\x20b 0'
Name             Maj Min Stat Open Targ Event  UUID
x                253   3 L--w    0    1      0
a b              253   2 L--w    1    1      0



[0] nostromo/~ # dmsetup create "c d" --manglename none --table "0 32768 linear /dev/sda 0"

[0] nostromo/~ # dmsetup info -c --manglename none 
Name             Maj Min Stat Open Targ Event  UUID
c d              253   4 L--w    0    1      0
x                253   3 L--w    0    1      0
a\x20b           253   2 L--w    1    1      0

[0] nostromo/~ # dmsetup info --manglename none -c -o name,mangled_name,unmangled_name
Name             MangledName      UnmangledName   
c d              c\x20d           c d             
x                x                x               
a\x20b           a\x20b           a b  

[0] nostromo/~ # dmsetup mangle
c d: renaming to c\x20d
x: name already in correct form
a\x20b: name already in correct form

[0] nostromo/~ # dmsetup info -c
Name             Maj Min Stat Open Targ Event  UUID
x                253   3 L--w    0    1      0
a b              253   2 L--w    1    1      0                        
c d              253   4 L--w    0    1      1

[0] nostromo/~ # dmsetup info -c --manglename none
Name             Maj Min Stat Open Targ Event  UUID
x                253   3 L--w    0    1      0                   
a\x20b           253   2 L--w    1    1      0                        
c\x20d           253   4 L--w    0    1      1

[0] nostromo/~ # ls /dev/mapper/
a\x20b  c\x20d  control  x

Comment 44 Peter Rajnoha 2012-02-15 15:54:53 UTC

[0] nostromo/~ # dmsetup create "a b" --table "0 32768 linear /dev/sda 0"

(adding two partitions on /dev/mapper/a\x20b)

[0] nostromo/~ # kpartx -a '/dev/mapper/a\x20b'

[0] nostromo/~ # dmsetup info -c
Name             Maj Min Stat Open Targ Event  UUID
a b2             253   4 L--w    0    1      0 part2-a\x20b
a b1             253   3 L--w    0    1      0 part1-a\x20b   
a b              253   2 L--w    2    1      0

[0] f16/~ # ls /dev/mapper/
a\x20b  a\x20b1  a\x20b2  control

Comment 47 Guohua Ouyang 2012-04-18 08:40:22 UTC

Verified on 6.3-20120411.1:
1. dmsetup create "a b" --table "0 32768 linear /dev/sdb 0"
2. Check that the space is displayed as \xNN; in this case it's "\x20".
[root@localhost admin]# ll /dev/mapper/a*
lrwxrwxrwx. 1 root root 8 2012-04-18 08:34 /dev/mapper/a\x20b -> ../dm-10

3. Also, there are no duplicate entries under /dev/mapper:
# ll /dev/mapper/
total 0
lrwxrwxrwx. 1 root root      7 2012-04-17 12:05 1ATA_WDC_WD3200AAKS-75L9A0_WD-WMAV26627303 -> ../dm-0
lrwxrwxrwx. 1 root root      7 2012-04-17 12:05 1ATA_WDC_WD3200AAKS-75L9A0_WD-WMAV26627303p1 -> ../dm-1
lrwxrwxrwx. 1 root root      7 2012-04-17 12:05 1ATA_WDC_WD3200AAKS-75L9A0_WD-WMAV26627303p2 -> ../dm-2
lrwxrwxrwx. 1 root root      7 2012-04-17 12:05 1ATA_WDC_WD3200AAKS-75L9A0_WD-WMAV26627303p3 -> ../dm-3
lrwxrwxrwx. 1 root root      7 2012-04-17 12:05 1ATA_WDC_WD3200AAKS-75L9A0_WD-WMAV26627303p4 -> ../dm-4
lrwxrwxrwx. 1 root root      8 2012-04-18 08:34 a\x20b -> ../dm-10
crw-rw----. 1 root root 10, 58 2012-04-17 12:04 control
lrwxrwxrwx. 1 root root      7 2012-04-17 12:04 HostVG-Config -> ../dm-7
lrwxrwxrwx. 1 root root      7 2012-04-17 12:04 HostVG-Data -> ../dm-9
lrwxrwxrwx. 1 root root      7 2012-04-17 12:04 HostVG-Logging -> ../dm-8
lrwxrwxrwx. 1 root root      7 2012-04-17 12:04 HostVG-Swap -> ../dm-6
lrwxrwxrwx. 1 root root      7 2012-04-17 12:04 live-rw -> ../dm-5

Comment 48 Peter Rajnoha 2012-04-24 13:45:06 UTC

Technical note added. If any revisions are required, please edit the "Technical Notes" field
accordingly. All revisions will be proofread by the Engineering Content Services team.

New Contents:
Device-mapper allows any character except '/' to be used in a device-mapper name. However, this conflicts with udev, whose character whitelist is restricted to 0-9, A-Z, a-z and #+-.:=@_. Using any blacklisted character in a device-mapper name results in udev creating incorrect /dev entries.

To solve this issue, the libdevmapper library together with the dmsetup command now supports encoding of udev-blacklisted characters by using the \xNN format where NN is the hex value of the character. This format is supported by udev.

There are three 'mangling' modes in which libdevmapper can operate: 'none' (no mangling), 'hex' (always mangle any blacklisted character) and 'auto' (use detection and mangle only if not mangled yet). The default mode is 'auto', and every libdevmapper user is affected unless this setting is changed by the respective libdevmapper call.

To support this feature, the dmsetup command has a new '--manglename <mangling_mode>' option to define the name mangling mode used while processing device-mapper names. The 'dmsetup info -c -o' has new fields to display: 'mangled_name' and 'unmangled_name'.

There's also a new dmsetup 'mangle' command that automatically renames any existing device-mapper names to their correct form. It is strongly advised to run this command after an update to correct any existing device-mapper names.

Comment 50 errata-xmlrpc 2012-06-20 14:59:49 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2012-0962.html
