Bug 1487615
| Summary: | bash fails to execute commands containing multibyte characters | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 7 | Reporter: | Renaud Métrich <rmetrich> |
| Component: | bash | Assignee: | Siteshwar Vashisht <svashisht> |
| Status: | CLOSED ERRATA | QA Contact: | Martin Kyral <mkyral> |
| Severity: | low | Docs Contact: | |
| Priority: | low | | |
| Version: | 7.4 | CC: | jorge_martinez, kdudka, mkyral, mzalewsk, svashisht |
| Target Milestone: | rc | | |
| Target Release: | --- | | |
| Hardware: | x86_64 | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | bash-4.2.46-30.el7 | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2018-04-10 13:51:19 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Attachments: | | | |

Description
Renaud Métrich
2017-09-01 12:05:14 UTC
Created attachment 1322344 [details]
Fix quoting for wide characters

Backported patch from http://git.savannah.gnu.org/cgit/bash.git/diff/lib/sh/strtrans.c?h=devel&id=7610e0c52

Comment on attachment 1322344 [details]
Fix quoting for wide characters

> diff --git a/lib/sh/strtrans.c b/lib/sh/strtrans.c
> --- a/lib/sh/strtrans.c
> +++ b/lib/sh/strtrans.c
> @@ -208,6 +211,8 @@ ansic_quote (str, flags, rlen)
>    char *r, *ret, *s;
>    int l, rsize;
>    unsigned char c;
> +  size_t slen;
> +  DECLARE_MBSTATE;
>
>    if (str == 0 || *str == 0)
>      return ((char *)0);
> @@ -219,9 +224,11 @@ ansic_quote (str, flags, rlen)
>    *r++ = '$';
>    *r++ = '\'';
>
> -  for (s = str, l = 0; *s; s++)
> +  s = str;
> +  slen = strlen (str);
> +
> +  for (s = str; c = *s; s++)
>      {
> -      c = *s;
>        l = 1; /* 1 == add backslash; 0 == no backslash */
>        switch (c)
>          {

I know it comes from the upstream commit, but why is the slen variable assigned strlen(str) if the value is not used anywhere? Also, the 's' variable is assigned twice in a row for no apparent reason. Couldn't this suggest that the upstream commit was incomplete?

Created attachment 1324104 [details]
Fix quoting for wide characters

Besides the upstream patch from comment 2, I have also backported the patches below:
http://git.savannah.gnu.org/cgit/bash.git/diff/lib/sh/strtrans.c?id=ec860d767
http://git.savannah.gnu.org/cgit/bash.git/diff/lib/sh/strtrans.c?id=c920c360d
http://git.savannah.gnu.org/cgit/bash.git/diff/lib/sh/strtrans.c?id=be06f7783
http://git.savannah.gnu.org/cgit/bash.git/diff/lib/sh/strtrans.c?id=595e3e692

Comment on attachment 1324104 [details]
Fix quoting for wide characters
Looks good.
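
For context on the question about the unused slen: in a multibyte-aware loop, a precomputed length usually serves as the bound on how many bytes a decoder may consume for one character. The following is a minimal sketch of that general pattern, not the bash code and not the content of any of the linked commits. The helper names utf8_seq_len and count_chars are hypothetical, and a hand-rolled UTF-8 length check stands in for the locale-dependent library calls a real implementation would more likely use.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: length in bytes of the UTF-8 sequence starting at s,
   never looking past `remaining` bytes.  Returns 1 for ASCII and also for
   invalid or truncated sequences, so the caller always makes progress.
   (Overlong encodings and surrogates are not rejected in this sketch.) */
static size_t
utf8_seq_len (const unsigned char *s, size_t remaining)
{
  size_t need, i;

  assert (s != NULL && remaining > 0);

  if (s[0] < 0x80)
    return 1;                   /* plain ASCII byte */
  else if ((s[0] & 0xE0) == 0xC0)
    need = 2;
  else if ((s[0] & 0xF0) == 0xE0)
    need = 3;
  else if ((s[0] & 0xF8) == 0xF0)
    need = 4;
  else
    return 1;                   /* stray continuation or invalid lead byte */

  if (need > remaining)
    return 1;                   /* truncated sequence at end of string */

  for (i = 1; i < need; i++)
    if ((s[i] & 0xC0) != 0x80)
      return 1;                 /* expected a continuation byte */

  return need;
}

/* Count characters (not bytes) by stepping over whole sequences.
   slen plays the role of the decoder bound discussed above. */
static size_t
count_chars (const char *str)
{
  const unsigned char *s = (const unsigned char *) str;
  size_t slen = strlen (str);
  size_t n = 0;

  while (slen > 0)
    {
      size_t len = utf8_seq_len (s, slen);

      s += len;
      slen -= len;
      n++;
    }
  return n;
}
```

A byte-at-a-time loop like the one in the quoted hunk would visit a four-byte emoji four times, whereas count_chars steps over it once. That difference is what a length bound and conversion state are normally declared for.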
As part of the testing process I decided to try to reproduce this bug on every single character up to the end of Unicode's Supplementary Multilingual Plane (sorry CJK!). Detailed results and the reproducer script are attached to this report as `unicode-results.tar.gz`.

As a summary, below is the number of characters that trigger the bug:

$ grep -c '\$' *.txt
bash-4.2.46-28.el7.txt:129407
bash-4.2.46-30.el7.txt:64051
bash-4.4.12-7.fc26.txt:48655

My understanding of these results:

1. The patch greatly improves the situation - after applying it, more than half of the characters no longer trigger the issue.
2. The patch doesn't bring bash 4.2 to the level of bash 4.4 - there are over 15000 characters that trigger the bug in the former, but not in the latter.
3. bash 4.4 is far from ideal - there are almost 50000 characters that still trigger the bug (that is more than 1/3 of all tested characters).

I am kind of lost here. On the one hand, I want to mark this issue as verified. The character from the original report no longer triggers it, the patch brings vast improvements, most of the affected characters are affected upstream as well, and the chances of running into this are minuscule anyway (who puts emoji in the names of their command-line tools?). On the other hand, the issue is clearly not fixed.

Thoughts?

Created attachment 1333720 [details]
Test results for bash 4.2 before and after the patch, plus bash 4.4
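
The three conclusions in the test comment follow from simple arithmetic on the grep counts. The total of 131072 tested characters is an assumption: it is the number of code points from U+0000 through U+1FFFF (the end of the Supplementary Multilingual Plane), and the report does not state the exact total.

```latex
\begin{align*}
\text{fixed by the patch} &= 129407 - 64051 = 65356,
  & \frac{65356}{129407} &\approx 50.5\% \\
\text{still failing in 4.2 but not in 4.4 (at least)} &= 64051 - 48655 = 15396 \\
\text{failing in 4.4} &= 48655,
  & \frac{48655}{131072} &\approx 37.1\% > \tfrac{1}{3}
\end{align*}
```

The second line is a lower bound, because the counts alone do not show whether every character failing in 4.4 also fails in the patched 4.2.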
Mirosław,

Thanks for the thorough tests. It looks like most of the characters for which bash prints the error string "$<unicode character>: command not found" do not have a corresponding Unicode character in the UTF-8 table. Also, in some cases bash prints a different character, '𘊴', instead of $<unicode character> in the error message for some of these Unicode characters.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:0800