Bug 1487615

Summary:

bash fails to execute commands containing multibyte characters

Product:

Red Hat Enterprise Linux 7

Reporter:

Renaud Métrich <rmetrich>

Component:

bash

Assignee:

Siteshwar Vashisht <svashisht>

Status:

CLOSED ERRATA

QA Contact:

Martin Kyral <mkyral>

Severity:

low

Docs Contact:

Priority:

low

Version:

7.4

CC:

jorge_martinez, kdudka, mkyral, mzalewsk, svashisht

Target Milestone:

Target Release:

---

Hardware:

x86_64

OS:

Linux

Whiteboard:

Fixed In Version:

bash-4.2.46-30.el7

Doc Type:

If docs needed, set a value

Doc Text:

Story Points:

---

Clone Of:

Environment:

Last Closed:

2018-04-10 13:51:19 UTC

Type:

Bug

Regression:

---

Mount Type:

---

Documentation:

---

CRM:

Verified Versions:

Category:

---

oVirt Team:

---

RHEL 7.3 requirements from Atomic Host:

Cloudforms Team:

---

Target Upstream Version:

Embargoed:

Attachments:

Description	Flags
Fix quoting for wide characters	none
Fix quoting for wide characters	kdudka: review+
Test results for bash 4.2 before and after the patch, plus bash 4.4	none

Description Renaud Métrich 2017-09-01 12:05:14 UTC

Description of problem:

Seen on RHEL7.4 installed from scratch only.
When an executable name contains a non-ASCII character (e.g. "é") and the executable is not found, bash prints the name in a non-human-readable form:

# é
bash: $'\303\251': command not found

Maybe this issue is not within bash, but I have not found the root cause yet.

Unexpectedly, when upgrading a RHEL7.3 system with bash from RHEL7.4 (or completely upgrading the system), the issue cannot be reproduced on the newly upgraded system.

Version-Release number of selected component (if applicable):

bash-4.2.46-28.el7.x86_64


How reproducible:

Always


Steps to Reproduce:
1. Install a RHEL7.4 VM from scratch
2. Run "é"

Actual results:

# é
bash: $'\303\251': command not found

Expected results:

# é
bash: é: command not found


Additional info:

I'm not sure at all if it is due to Bash itself for the following reasons:
- trying with zsh on RHEL7.4 doesn't reproduce the issue
- upgrading a RHEL7.3 system to RHEL7.4 doesn't reproduce the issue
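
For reference, the `$'\303\251'` in the actual results is the ANSI-C quoted form of the two UTF-8 bytes of "é" (0xC3 0xA9, i.e. octal 303 and 251). The sketch below shows that byte-by-byte escaping in isolation. It is only an illustration: `octal_quote_bytes` is a hypothetical helper, not bash's `ansic_quote`, and it ignores the locale entirely.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not bash's ansic_quote): writes src into dst as $'...',
 * replacing every byte that is not plain printable ASCII with a \ooo escape.
 * dst must hold at least 4 * strlen(src) + 4 bytes. */
void octal_quote_bytes(const char *src, char *dst, size_t dstsize)
{
    assert(src != NULL && dst != NULL);
    assert(dstsize >= 4 * strlen(src) + 4);

    char *out = dst;
    *out++ = '$';
    *out++ = '\'';
    for (const unsigned char *p = (const unsigned char *)src; *p; p++) {
        if (*p >= 0x20 && *p < 0x7f && *p != '\'' && *p != '\\')
            *out++ = (char)*p;                              /* keep as-is */
        else
            out += sprintf(out, "\\%03o", (unsigned int)*p); /* octal escape */
    }
    *out++ = '\'';
    *out = '\0';
}
```

Calling `octal_quote_bytes("\303\251", buf, sizeof buf)` with a sufficiently large `buf` produces the same `$'\303\251'` string that appears in the error message.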

Comment 2 Siteshwar Vashisht 2017-09-05 17:17:53 UTC

Created attachment 1322344 [details]
Fix quoting for wide characters

Backported patch from http://git.savannah.gnu.org/cgit/bash.git/diff/lib/sh/strtrans.c?h=devel&id=7610e0c52

Comment 4 Kamil Dudka 2017-09-08 15:14:14 UTC

Comment on attachment 1322344 [details]
Fix quoting for wide characters

> diff --git a/lib/sh/strtrans.c b/lib/sh/strtrans.c
> --- a/lib/sh/strtrans.c
> +++ b/lib/sh/strtrans.c
> @@ -208,6 +211,8 @@ ansic_quote (str, flags, rlen)
>    char *r, *ret, *s;
>    int l, rsize;
>    unsigned char c;
> +  size_t slen;
> +  DECLARE_MBSTATE;
>  
>    if (str == 0 || *str == 0)
>      return ((char *)0);
> @@ -219,9 +224,11 @@ ansic_quote (str, flags, rlen)
>    *r++ = '$';
>    *r++ = '\'';
>  
> -  for (s = str, l = 0; *s; s++)
> +  s = str;
> +  slen = strlen (str);
> +
> +  for (s = str; c = *s; s++)
>      {
> -      c = *s;
>        l = 1;		/* 1 == add backslash; 0 == no backslash */
>        switch (c)
>  	{

I know it comes from the upstream commit ... but why is the slen variable
assigned strlen(str) if the value is not used anywhere?

Also the 's' variable is assigned twice in a row for no apparent reason.

Couldn't that suggest that the upstream commit was incomplete?

Comment 5 Siteshwar Vashisht 2017-09-09 16:06:11 UTC

Created attachment 1324104 [details]
Fix quoting for wide characters

Besides the upstream patch from comment 2, I have also backported the patches below:

http://git.savannah.gnu.org/cgit/bash.git/diff/lib/sh/strtrans.c?id=ec860d767

http://git.savannah.gnu.org/cgit/bash.git/diff/lib/sh/strtrans.c?id=c920c360d

http://git.savannah.gnu.org/cgit/bash.git/diff/lib/sh/strtrans.c?id=be06f7783

http://git.savannah.gnu.org/cgit/bash.git/diff/lib/sh/strtrans.c?id=595e3e692
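
The full patch is in the attachment and is not reproduced here. As a rough sketch of the general idea behind locale-aware quoting (an assumption about the approach, not the backported code), a string only needs `$'...'` quoting if it contains something that is not a printable character in the current locale. The function name `needs_ansic_quoting` is hypothetical.

```c
#include <assert.h>
#include <string.h>
#include <wchar.h>
#include <wctype.h>

/* Hypothetical sketch: returns 1 if s contains an invalid multibyte sequence
 * or a character that is not printable in the current LC_CTYPE locale,
 * and 0 if every character is printable. */
int needs_ansic_quoting(const char *s)
{
    assert(s != NULL);

    mbstate_t state;
    memset(&state, 0, sizeof state);
    size_t remaining = strlen(s);

    while (remaining > 0) {
        wchar_t wc;
        size_t n = mbrtowc(&wc, s, remaining, &state);

        if (n == (size_t)-1 || n == (size_t)-2)
            return 1;               /* invalid or truncated sequence */
        if (n == 0)
            break;                  /* embedded NUL; not reachable here */
        if (!iswprint((wint_t)wc))
            return 1;               /* valid but not printable */

        s += n;
        remaining -= n;
    }
    return 0;
}
```

The result depends on the locale selected with `setlocale`. In a UTF-8 locale where `iswprint` accepts "é", this function would be expected to return 0 for it, so the name could be printed unquoted.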

Comment 6 Kamil Dudka 2017-09-11 15:48:12 UTC

Comment on attachment 1324104 [details]
Fix quoting for wide characters

Looks good.

Comment 8 Mirek Długosz 2017-10-03 14:03:19 UTC

As part of the testing process, I decided to try to reproduce this bug with every single character up to the end of Unicode's Supplementary Multilingual Plane (sorry CJK!). Detailed results and the reproducer script are attached to this report as `unicode-results.tar.gz`. As a summary, below is the number of characters that trigger the bug:

$ grep -c '\$' *.txt
bash-4.2.46-28.el7.txt:129407
bash-4.2.46-30.el7.txt:64051
bash-4.4.12-7.fc26.txt:48655

My understanding of these results:
1. The patch greatly improves the situation: after applying it, more than half of the previously affected characters no longer trigger the issue.
2. The patch doesn't bring bash 4.2 to the bash 4.4 level: there are over 15000 characters that trigger the bug in the former but not in the latter.
3. bash 4.4 is far from ideal: almost 50000 characters still trigger the bug (more than 1/3 of all tested characters).
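
The arithmetic behind points 1 and 2 can be checked with a small sketch. The counts are copied from the grep output above; the helper names are made up for illustration.

```c
#include <assert.h>

/* Counts copied from the grep output above. */
enum {
    HITS_BASH_42_UNPATCHED = 129407,  /* bash-4.2.46-28.el7 */
    HITS_BASH_42_PATCHED   = 64051,   /* bash-4.2.46-30.el7 */
    HITS_BASH_44           = 48655    /* bash-4.4.12-7.fc26 */
};

/* Number of characters that stop triggering the bug between two builds. */
long hits_removed(long before, long after)
{
    assert(before >= after);
    return before - after;
}

/* part as a percentage of whole. */
double percent_of(long part, long whole)
{
    assert(whole > 0);
    return 100.0 * (double)part / (double)whole;
}
```

With these numbers, the patch removes 129407 - 64051 = 65356 hits, which is about 50.5% of the original count, and the remaining gap to bash 4.4 is 64051 - 48655 = 15396 characters.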

I am kind of lost here.

On the one hand, I want to mark this issue as verified: the character from the original report no longer triggers it, the patch brings vast improvements, most of the affected characters are affected upstream as well, and the chances of running into this are minuscule anyway (who puts emoji in the names of their command-line tools?). On the other hand, the issue is clearly not fixed.

Thoughts?

Comment 9 Mirek Długosz 2017-10-03 14:03:30 UTC

Created attachment 1333720 [details]
Test results for bash 4.2 before and after the patch, plus bash 4.4
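
The reproducer script itself is only available in the attachment. As a rough illustration of what walking over code points involves, the sketch below encodes a single code point as UTF-8, which is the byte sequence a shell in a UTF-8 locale would receive as the command name. The function `utf8_encode` is a hypothetical helper and is not taken from the attached script.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: encodes code point cp as UTF-8 into out and returns
 * the number of bytes written (1 to 4), or 0 if cp is a surrogate or lies
 * beyond U+10FFFF. */
size_t utf8_encode(unsigned long cp, unsigned char out[4])
{
    assert(out != NULL);

    if (cp < 0x80) {
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp >= 0xD800 && cp <= 0xDFFF)
        return 0;                   /* surrogates are not characters */
    if (cp < 0x10000) {
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF) {
        out[0] = (unsigned char)(0xF0 | (cp >> 18));
        out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;
}
```

For U+00E9 ("é") this yields the two bytes 0xC3 0xA9, matching the `\303\251` seen in the original report.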

Comment 10 Siteshwar Vashisht 2017-10-04 15:28:50 UTC

Mirosław,

Thanks for the thorough tests. It looks like most of the characters for which bash prints the error string "$<unicode character>: command not found" do not have a corresponding Unicode character in the UTF-8 table. Also, in some cases bash prints a separate character '𘊴' (instead of $<unicode character>) in the error message for some of these Unicode characters.

Comment 17 errata-xmlrpc 2018-04-10 13:51:19 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:0800