Bug 1487615 - bash fails to execute commands containing multibyte characters
Summary: bash fails to execute commands containing multibyte characters
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: bash
Version: 7.4
Hardware: x86_64
OS: Linux
Priority: low
Severity: low
Target Milestone: rc
Target Release: ---
Assignee: Siteshwar Vashisht
QA Contact: Martin Kyral
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2017-09-01 12:05 UTC by Renaud Métrich
Modified: 2020-12-14 09:49 UTC (History)
CC List: 5 users

Fixed In Version: bash-4.2.46-30.el7
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-04-10 13:51:19 UTC
Target Upstream Version:
Embargoed:


Attachments
Fix quoting for wide characters (1.99 KB, patch)
2017-09-05 17:17 UTC, Siteshwar Vashisht
no flags
Fix quoting for wide characters (3.10 KB, patch)
2017-09-09 16:06 UTC, Siteshwar Vashisht
kdudka: review+
Test results for bash 4.2 before and after the patch, plus bash 4.4 (1.05 MB, application/x-gzip)
2017-10-03 14:03 UTC, Mirek Długosz
no flags


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Product Errata | RHBA-2018:0800 | 0 | None | None | None | 2018-04-10 13:51:33 UTC

Description Renaud Métrich 2017-09-01 12:05:14 UTC
Description of problem:

Seen only on RHEL7.4 systems installed from scratch.
When a command name contains a non-ASCII character (e.g. "é") and the command is not found, bash prints the name in a form that is not human-readable:

# é
bash: $'\303\251': command not found

Maybe this issue is not within bash, but I have not found the root cause yet.

Unexpectedly, when a RHEL7.3 system is upgraded with bash from RHEL7.4 (or the whole system is upgraded), the issue cannot be reproduced on the upgraded system.

Version-Release number of selected component (if applicable):

bash-4.2.46-28.el7.x86_64


How reproducible:

Always


Steps to Reproduce:
1. Install a RHEL7.4 VM from scratch
2. Run "é"

Actual results:

# é
bash: $'\303\251': command not found

Expected results:

# é
bash: é: command not found


Additional info:

I'm not at all sure that Bash itself is at fault, for the following reasons:
- zsh on RHEL7.4 does not reproduce the issue
- a RHEL7.3 system upgraded to RHEL7.4 does not reproduce the issue
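
For reference, the escaped form in the message is simply the UTF-8 encoding of "é" written out as octal bytes. A minimal Python sketch of that correspondence follows; Python is used only for illustration, and `octal_escape` and `octal_unescape` are hypothetical helper names, not bash functions.

```python
def octal_escape(text: str) -> str:
    """Render each UTF-8 byte of `text` as a backslash-octal escape."""
    return "".join(f"\\{byte:03o}" for byte in text.encode("utf-8"))


def octal_unescape(escaped: str) -> str:
    """Reverse octal_escape: turn backslash-octal escapes back into text."""
    octal_groups = [group for group in escaped.split("\\") if group]
    return bytes(int(group, 8) for group in octal_groups).decode("utf-8")


if __name__ == "__main__":
    print(octal_escape("é"))            # the two UTF-8 bytes 0xC3 0xA9 in octal
    print(octal_unescape(r"\303\251"))  # back to the original character
```

So `$'\303\251'` and `é` name the same command; only the presentation in the error message differs.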

Comment 2 Siteshwar Vashisht 2017-09-05 17:17:53 UTC
Created attachment 1322344 [details]
Fix quoting for wide characters

Backported patch from http://git.savannah.gnu.org/cgit/bash.git/diff/lib/sh/strtrans.c?h=devel&id=7610e0c52

Comment 4 Kamil Dudka 2017-09-08 15:14:14 UTC
Comment on attachment 1322344 [details]
Fix quoting for wide characters

> diff --git a/lib/sh/strtrans.c b/lib/sh/strtrans.c
> --- a/lib/sh/strtrans.c
> +++ b/lib/sh/strtrans.c
> @@ -208,6 +211,8 @@ ansic_quote (str, flags, rlen)
>    char *r, *ret, *s;
>    int l, rsize;
>    unsigned char c;
> +  size_t slen;
> +  DECLARE_MBSTATE;
>  
>    if (str == 0 || *str == 0)
>      return ((char *)0);
> @@ -219,9 +224,11 @@ ansic_quote (str, flags, rlen)
>    *r++ = '$';
>    *r++ = '\'';
>  
> -  for (s = str, l = 0; *s; s++)
> +  s = str;
> +  slen = strlen (str);
> +
> +  for (s = str; c = *s; s++)
>      {
> -      c = *s;
>        l = 1;		/* 1 == add backslash; 0 == no backslash */
>        switch (c)
>  	{

I know it comes from the upstream commit ... but why is the slen variable
assigned strlen(str) if the value is not used anywhere?

Also the 's' variable is assigned twice in a row for no apparent reason.

Could this suggest that the upstream commit was incomplete?
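
To make the question concrete: a multibyte-aware quoting loop generally has to decode a whole character, using the bytes that remain in the string, before it can decide whether to escape it, and that is presumably what a length such as `slen` would be for. The Python sketch below illustrates only that general idea. It is not the bash implementation, `ansic_quote_sketch` and `_decode_one` are hypothetical names, and Python's `str.isprintable()` stands in for the C library's `iswprint()`.

```python
def _decode_one(raw: bytes, start: int):
    """Try to decode a single UTF-8 character beginning at `start`.

    Returns (character, byte_width), or (None, 1) if no valid character
    starts at that position.
    """
    for width in range(1, 5):  # a UTF-8 character is 1 to 4 bytes long
        chunk = raw[start:start + width]
        if len(chunk) < width:
            break  # ran out of input before completing a character
        try:
            return chunk.decode("utf-8"), width
        except UnicodeDecodeError:
            continue
    return None, 1


def ansic_quote_sketch(raw: bytes) -> str:
    """Quote `raw` in $'...' form, escaping only what cannot be shown as-is."""
    named = {0x07: "\\a", 0x08: "\\b", 0x09: "\\t", 0x0A: "\\n", 0x0B: "\\v",
             0x0C: "\\f", 0x0D: "\\r", 0x1B: "\\E", 0x5C: "\\\\", 0x27: "\\'"}
    out = ["$'"]
    i = 0
    while i < len(raw):
        byte = raw[i]
        if byte in named:
            out.append(named[byte])
            i += 1
            continue
        char, width = _decode_one(raw, i)
        if char is not None and char.isprintable():
            out.append(char)  # printable character: copy it through unchanged
            i += width
        else:
            out.append(f"\\{byte:03o}")  # otherwise escape this byte in octal
            i += 1
    out.append("'")
    return "".join(out)
```

With a byte-at-a-time loop, every byte of "é" falls into the escape branch. Decoding the character first is what lets it pass through unchanged.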

Comment 6 Kamil Dudka 2017-09-11 15:48:12 UTC
Comment on attachment 1324104 [details]
Fix quoting for wide characters

Looks good.

Comment 8 Mirek Długosz 2017-10-03 14:03:19 UTC
As part of the testing process, I tried to reproduce this bug with every single character up to the end of Unicode's Supplementary Multilingual Plane (sorry, CJK!). Detailed results and the reproducer script are attached to this report as `unicode-results.tar.gz`. In summary, the number of characters that trigger the bug is:

$ grep -c '\$' *.txt
bash-4.2.46-28.el7.txt:129407
bash-4.2.46-30.el7.txt:64051
bash-4.4.12-7.fc26.txt:48655

My understanding of these results:
1. The patch greatly improves the situation: after applying it, more than half of the previously affected characters no longer trigger the issue.
2. The patch doesn't bring bash 4.2 to the bash 4.4 level: over 15000 characters trigger the bug in the former but not in the latter.
3. bash 4.4 is far from ideal: almost 50000 characters still trigger the bug (that is more than 1/3 of all tested characters).
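
These three points can be checked directly against the counts above. The Python sketch below is for illustration only. Its one assumption beyond the reported numbers is that at most 0x20000 code points (U+0000 through U+1FFFF) were tested, which gives a lower bound for the share in point 3. `summarize` is a hypothetical helper name.

```python
# Counts reported by `grep -c` for each tested bash build.
COUNTS = {
    "bash-4.2.46-28.el7": 129407,  # before the patch
    "bash-4.2.46-30.el7": 64051,   # after the patch
    "bash-4.4.12-7.fc26": 48655,   # bash 4.4 from Fedora 26
}

# Assumption: no more than every code point in U+0000..U+1FFFF was tested.
MAX_TESTED = 0x1FFFF + 1


def summarize(counts: dict) -> dict:
    """Derive the figures behind the three observations."""
    before = counts["bash-4.2.46-28.el7"]
    after = counts["bash-4.2.46-30.el7"]
    upstream = counts["bash-4.4.12-7.fc26"]
    return {
        # Point 1: characters that no longer trigger the issue, and their share.
        "fixed_by_patch": before - after,
        "fixed_share": (before - after) / before,
        # Point 2: net gap; a lower bound on characters failing only in 4.2.
        "patched_vs_upstream_gap": after - upstream,
        # Point 3: lower bound on the share of tested characters still failing.
        "upstream_min_share": upstream / MAX_TESTED,
    }


if __name__ == "__main__":
    for key, value in summarize(COUNTS).items():
        print(f"{key}: {value}")
```

The patch removes 65356 characters from the failing set (about 50.5%), the net gap to bash 4.4 is 15396 characters, and bash 4.4 still fails on at least 37% of the tested range.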

I am kind of lost here.

On the one hand, I want to mark this issue as verified. The character from the original report no longer triggers it, the patch brings vast improvements, most of the affected characters are affected upstream as well, and the chances of running into this are minuscule anyway (who puts emoji in the names of their command-line tools?). On the other hand, the issue is clearly not fixed.

Thoughts?

Comment 9 Mirek Długosz 2017-10-03 14:03:30 UTC
Created attachment 1333720 [details]
Test results for bash 4.2 before and after the patch, plus bash 4.4

Comment 10 Siteshwar Vashisht 2017-10-04 15:28:50 UTC
Mirosław,

Thanks for the thorough tests. It looks like most of the characters for which bash prints the error string "$<unicode character>: command not found" do not have a corresponding Unicode character in the UTF-8 table. Also, for some of these Unicode characters, bash prints a separate character '𘊴' (instead of $<unicode character>) in the error message.
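
One way to check this observation for a given character is to look up its Unicode general category, since unassigned code points have the category `Cn`. The Python sketch below is illustrative only and `is_unassigned` is a hypothetical helper name. The answer depends on the Unicode version bundled with the Python interpreter, which will differ from the character tables in the RHEL 7 C library.

```python
import unicodedata


def is_unassigned(char: str) -> bool:
    """Return True if `char` has no assigned character in Python's Unicode data."""
    return unicodedata.category(char) == "Cn"


def describe(char: str) -> str:
    """Summarize a character as code point, general category and name."""
    name = unicodedata.name(char, "<no name>")
    return f"U+{ord(char):04X} {unicodedata.category(char)} {name}"


if __name__ == "__main__":
    for sample in ("é", chr(0x0378)):
        print(describe(sample), "unassigned" if is_unassigned(sample) else "assigned")
```

Running such a check over the characters listed in the attached results would show how many of the remaining failures are code points with no assigned character.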

Comment 17 errata-xmlrpc 2018-04-10 13:51:19 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:0800

