Bug 1487615 - bash fails to execute commands containing multibyte characters
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: bash
Version: 7.4
Hardware: x86_64 Linux
Priority: low Severity: low
Target Milestone: rc
Target Release: ---
Assigned To: Siteshwar Vashisht
QA Contact: Martin Kyral
Depends On:
Blocks:
Reported: 2017-09-01 08:05 EDT by Renaud Métrich
Modified: 2018-04-10 09:51 EDT (History)
See Also:
Fixed In Version: bash-4.2.46-30.el7
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2018-04-10 09:51:19 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments
Fix quoting for wide characters (1.99 KB, patch)
2017-09-05 13:17 EDT, Siteshwar Vashisht
no flags
Fix quoting for wide characters (3.10 KB, patch)
2017-09-09 12:06 EDT, Siteshwar Vashisht
kdudka: review+
Test results for bash 4.2 before and after the patch, plus bash 4.4 (1.05 MB, application/x-gzip)
2017-10-03 10:03 EDT, Mirosław Zalewski
no flags


External Trackers
Tracker                 ID              Priority  Status  Summary  Last Updated
Red Hat Product Errata  RHBA-2018:0800  None      None    None     2018-04-10 09:51 EDT

Description Renaud Métrich 2017-09-01 08:05:14 EDT
Description of problem:

Seen on RHEL7.4 installed from scratch only.
When an executable name contains a non-ASCII character (e.g. "é") and the executable is not found, bash prints the name in a form that is not human-readable:

# é
bash: $'\303\251': command not found

Maybe this issue is not within bash, but I have not found the root cause yet.

Unexpectedly, when a RHEL7.3 system is upgraded with bash from RHEL7.4 (or the whole system is upgraded), the issue cannot be reproduced on the upgraded system.

Version-Release number of selected component (if applicable):

bash-4.2.46-28.el7.x86_64


How reproducible:

Always


Steps to Reproduce:
1. Install a RHEL7.4 VM from scratch
2. Run "é"

Actual results:

# é
bash: $'\303\251': command not found

Expected results:

# é
bash: é: command not found


Additional info:

I'm not at all sure that this is caused by bash itself, for the following reasons:
- trying with zsh on RHEL7.4 doesn't reproduce the issue
- upgrading a RHEL7.3 system to RHEL7.4 doesn't reproduce the issue
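
The escaped form in the message above is bash's ANSI-C quoting (`$'...'`) applied to the two UTF-8 bytes that encode "é", written as octal escapes. A minimal sketch to check that reading, assuming UTF-8 input and the standard `od`, `tr` and `sed` utilities; the helper name `utf8_octal` is made up for illustration and is not part of bash:

```shell
# Print the bytes of a string as backslash-octal escapes, the notation
# bash used in the error message. utf8_octal is a hypothetical helper.
utf8_octal() {
    # od prints each byte as three octal digits; drop the separators.
    octal=$(printf '%s' "$1" | od -An -v -to1 | tr -d ' \t\n')
    # Put a backslash in front of every group of three octal digits.
    printf '%s\n' "$octal" | sed 's/[0-7][0-7][0-7]/\\&/g'
}

utf8_octal 'é'   # prints \303\251 when this file is UTF-8 encoded
```

So the message names the right command; only its presentation is escaped.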
Comment 2 Siteshwar Vashisht 2017-09-05 13:17 EDT
Created attachment 1322344
Fix quoting for wide characters

Backported patch from http://git.savannah.gnu.org/cgit/bash.git/diff/lib/sh/strtrans.c?h=devel&id=7610e0c52
Comment 4 Kamil Dudka 2017-09-08 11:14:14 EDT
Comment on attachment 1322344
Fix quoting for wide characters

> diff --git a/lib/sh/strtrans.c b/lib/sh/strtrans.c
> --- a/lib/sh/strtrans.c
> +++ b/lib/sh/strtrans.c
> @@ -208,6 +211,8 @@ ansic_quote (str, flags, rlen)
>    char *r, *ret, *s;
>    int l, rsize;
>    unsigned char c;
> +  size_t slen;
> +  DECLARE_MBSTATE;
>  
>    if (str == 0 || *str == 0)
>      return ((char *)0);
> @@ -219,9 +224,11 @@ ansic_quote (str, flags, rlen)
>    *r++ = '$';
>    *r++ = '\'';
>  
> -  for (s = str, l = 0; *s; s++)
> +  s = str;
> +  slen = strlen (str);
> +
> +  for (s = str; c = *s; s++)
>      {
> -      c = *s;
>        l = 1;		/* 1 == add backslash; 0 == no backslash */
>        switch (c)
>  	{

I know it comes from the upstream commit ... but why is the slen variable
assigned strlen(str) if the value is not used anywhere?

Also the 's' variable is assigned twice in a row for no apparent reason.

Could this suggest that the upstream commit was incomplete?
Comment 6 Kamil Dudka 2017-09-11 11:48:12 EDT
Comment on attachment 1324104
Fix quoting for wide characters

Looks good.
Comment 8 Mirosław Zalewski 2017-10-03 10:03:19 EDT
As part of the testing process, I decided to try to reproduce this bug on every single character up to the end of Unicode's Supplementary Multilingual Plane (sorry CJK!). Detailed results and the reproducer script are attached to this report as `unicode-results.tar.gz`. As a summary, below is the number of characters that trigger the bug in each tested version:

$ grep -c '\$' *.txt
bash-4.2.46-28.el7.txt:129407
bash-4.2.46-30.el7.txt:64051
bash-4.4.12-7.fc26.txt:48655
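
For readers unfamiliar with the command: `grep -c '\$'` prints the number of lines that contain a literal `$`, which here marks output where bash fell back to the `$'...'` escaped form. A small sketch of that counting, assuming (this is not confirmed by the report) that each result file holds one line of bash output per tested character; `count_quoted` is a hypothetical wrapper:

```shell
# Count lines containing a literal dollar sign, as the grep above does.
# count_quoted is a hypothetical name used only for this illustration.
count_quoted() {
    grep -c '\$'
}

# Two sample lines: one escaped, one readable. Only the first is counted.
count_quoted <<'EOF'
bash: $'\303\251': command not found
bash: é: command not found
EOF
# prints 1
```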

My understanding of these results:
1. The patch greatly improves the situation: after applying it, slightly more than half of the previously affected characters no longer trigger the issue.
2. The patch doesn't bring bash 4.2 to the level of bash 4.4: there are over 15000 characters that trigger the bug in the former but not in the latter.
3. bash 4.4 is far from ideal: almost 50000 characters still trigger the bug (more than 1/3 of all tested characters).
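
The arithmetic behind the three points above can be rechecked from the grep counts. This is a sketch using integer shell arithmetic; the total of tested characters is not stated in this report, so `total` below is an assumption (every code point from U+0000 to U+1FFFF), and `gap` is only the difference between the two counts:

```shell
before=129407     # bash-4.2.46-28.el7
after=64051       # bash-4.2.46-30.el7
upstream=48655    # bash-4.4.12-7.fc26
total=$(( 0x20000 ))   # ASSUMPTION: U+0000..U+1FFFF, i.e. 131072 code points

fixed=$(( before - after ))     # characters no longer affected after the patch
gap=$(( after - upstream ))     # difference between patched 4.2 and 4.4

echo "fixed by the patch: $fixed ($(( fixed * 100 / before ))% of the original count)"
echo "patched 4.2 minus 4.4: $gap"
echo "still affected in 4.4: $(( upstream * 100 / total ))% of the assumed total"
```

Under that assumption the figures come out at 65356 (50%), 15396 and 37%, consistent with the summary.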

I am kind of lost here.

On the one hand, I want to mark this issue as verified. The character from the original report no longer triggers it, the patch brings vast improvements, most of the affected characters are affected upstream as well, and the chances of running into this are minuscule anyway (who puts emoji in the names of their command-line tools?). On the other hand, the issue is clearly not fixed.

Thoughts?
Comment 9 Mirosław Zalewski 2017-10-03 10:03 EDT
Created attachment 1333720
Test results for bash 4.2 before and after the patch, plus bash 4.4
Comment 10 Siteshwar Vashisht 2017-10-04 11:28:50 EDT
Mirosław,

Thanks for the thorough tests. It looks like most of the characters for which bash prints the error string "$<unicode character>: command not found" do not have a corresponding Unicode character in the UTF-8 table. Also, for some of these characters bash prints a different character, '𘊴' (instead of $<unicode character>), in the error message.
Comment 17 errata-xmlrpc 2018-04-10 09:51:19 EDT
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:0800
