Bug 1313473

Summary: dashes are unicode \x2d in systemd warning message in journal
Product: Red Hat Enterprise Linux 7
Reporter: mulhern <amulhern>
Component: systemd
Assignee: systemd-maint
Status: CLOSED NOTABUG
QA Contact: qe-baseos-daemons
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: 7.2
CC: systemd-maint-list
Target Milestone: rc
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2016-04-15 07:11:55 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:

Description mulhern 2016-03-01 16:28:20 UTC
Description of problem:

Some hyphens end up as a Unicode minus sign (\x2d) in a systemd journal message:

Device dev-disk-by\x2did-wwn\x2d0xceb4b5944ee25001.device appeared twice with different sysfs paths...

Note that the hyphen appears as a Unicode minus in only one place: between the 'by' and the next word, such as "id", "path", or "uuid".

Version-Release number of selected component (if applicable):

Installed Packages
Name        : systemd
Arch        : x86_64
Version     : 219
Release     : 19.el7
Size        : 21 M
Repo        : installed
From repo   : RHEL-7-RTT

How reproducible:

Consistently. The same output is also visible, though not remarked on, in https://bugzilla.redhat.com/show_bug.cgi?id=1296249.

Steps to Reproduce:
1. Trigger the warning message, probably by changing the multipath configuration.

Actual results:

A Unicode character is substituted for the ASCII dash in the warning message.

Expected results:

I would expect all characters to be ASCII.

Additional info:

I think this was introduced recently. I've seen these messages fairly frequently, and I don't remember seeing the Unicode character previously.

Comment 2 Lukáš Nykrýn 2016-04-15 07:11:55 UTC
This is the name of the device unit. It is not supposed to look any different.

Comment 3 mulhern 2016-04-15 13:46:46 UTC
There is a function, unit_name_from_path(), which mangles paths, substituting '-' for each '/' and the escape sequence \x2d for each '-'.

Sometimes a path is just a path, sometimes it's considered a name. When it is considered a name it is escaped in this fashion.
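For illustration, the escaping can be sketched in a few lines of Python. This is a rough approximation based on the rules documented in systemd.unit(5), not the actual C code; the helper name escape_path is hypothetical, and the exact behaviour of systemd 219 may differ in corner cases.

```python
# Sketch of systemd-style path escaping (assumption: follows the rules
# described in systemd.unit(5); escape_path is a hypothetical helper name).
import string

# Characters that are kept as-is in a unit name.
VALID = set(string.ascii_letters + string.digits + ":_.")


def escape_path(path: str) -> str:
    # Drop leading, trailing and duplicate slashes.
    parts = [p for p in path.split("/") if p]
    if not parts:
        return "-"  # the root directory is represented as a single dash
    trimmed = "/".join(parts)

    out = []
    for i, byte in enumerate(trimmed.encode("utf-8")):
        ch = chr(byte)
        if ch == "/":
            out.append("-")  # path separators become dashes
        elif ch in VALID and not (i == 0 and ch == "."):
            out.append(ch)
        else:
            # Everything else, including a literal '-', becomes a C-style
            # \xNN escape, so it cannot be confused with a separator.
            out.append("\\x%02x" % byte)
    return "".join(out)


print(escape_path("/dev/disk/by-id/wwn-0xceb4b5944ee25001") + ".device")
```

Under these assumptions the output matches the unit name in the warning message. Note also that byte 0x2d is the ordinary ASCII hyphen, so the escape is only a textual encoding of '-'.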

So these by-* paths may be escaped differently, left unescaped, or converted to names, depending on the context, e.g.:
MESSAGE=found 'b65:96' claiming '/run/udev/links/\x2fdisk\x2fby-id\x2fwwn-0xcc9d60494ee25001', where the '/'s in the link name are escaped as \x2f but the '-'s are left alone
MESSAGE=LINK 'disk/by-id/ata-WDC_WD10EFRX-68PJCN0_WD-WCC4JLVFF3U9' /usr/lib/udev/rules.d/60-persistent-storage.rules:42, no escaping
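When comparing such strings across messages, it can help to decode the \xNN escapes back to plain characters. A minimal sketch follows; decode_hex_escapes is a hypothetical helper, not a systemd or udev function, and it simply assumes that every \xNN sequence stands for one byte.

```python
# Sketch: decode C-style \xNN escapes as they appear in journal messages.
# decode_hex_escapes is a hypothetical helper name used for illustration.
import re

_ESCAPE = re.compile(rb"\\x([0-9a-fA-F]{2})")


def decode_hex_escapes(text: str) -> str:
    # Work on bytes so that escaped multi-byte UTF-8 sequences are
    # reassembled correctly before decoding back to text.
    raw = _ESCAPE.sub(lambda m: bytes([int(m.group(1), 16)]),
                      text.encode("utf-8"))
    return raw.decode("utf-8", errors="replace")


print(decode_hex_escapes(r"\x2fdisk\x2fby-id\x2fwwn-0xcc9d60494ee25001"))
print(decode_hex_escapes(r"by\x2did"))
```

This only reverses the hex escapes. It does not turn the '-' separators of a unit name back into '/', since that step depends on knowing the string is a unit name.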