Bug 2112870

Summary: unexpand cannot process form feed
Product: [Fedora] Fedora Reporter: Jonathan Wakely <jwakely>
Component: coreutilsAssignee: Kamil Dudka <kdudka>
Status: CLOSED ERRATA QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: 36CC: admiller, dmueller, jamartis, jarodwilson, kdudka, kzak, mail, mliska, ooprala, ovasik, p, sebastian.kisela, svashisht
Target Milestone: ---Keywords: Reopened
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: coreutils-9.1-5.fc37 coreutils-9.0-8.fc36 coreutils-8.32-36.fc35 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
: 2112873 (view as bug list) Environment:
Last Closed: 2022-08-10 01:16:37 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 2112985    

Description Jonathan Wakely 2022-08-01 11:13:23 UTC
Description of problem:

$ printf '\f'  | unexpand 
unexpand: input line is too long


Version-Release number of selected component (if applicable):




How reproducible:

Always


Steps to Reproduce:
1. printf '\f'  | unexpand 

Actual results:

unexpand: input line is too long

Expected results:

No error.

Additional info:

This seems to be a Fedora/RHEL bug, because it works fine on Ubuntu and Debian, and works fine on  BSD, Solaris and AIX.

Working versions

coreutils-8.22-24.el7_9.2.x86_64 (CentOS 7)
8.26-3 (Debian 9.13 stretch)
8.28-1ubuntu1 (Ubuntu 18.04)
8.32-4.1ubuntu1 (Ubuntu 22.04)
8.32-4.1 (Debian unstable sid)
9.1.43-ab976f (self-built from coreutils Git repo)

Not working versions:

coreutils-8.30-12.el8.x86_64 (CentOS 8)
coreutils-9.0-5.fc36.x86_64 (Fedora 36)

Comment 1 Jonathan Wakely 2022-08-01 11:25:50 UTC
Also broken:

coreutils-9.1-3.fc37.x86_64
coreutils-8.32-32.fc34.x86_64
coreutils-8.32-33.fc35.x86_64
coreutils-8.32-32.el9.x86_64


Using LANG=C doesn't change anything.

Comment 2 Kamil Dudka 2022-08-01 14:58:28 UTC
Thank you for reporting it!  It seems to be caused by coreutils-i18n.patch.

Comment 4 Martin Liška 2022-08-01 19:03:23 UTC
(In reply to Kamil Dudka from comment #3)
> Fedora commit:
> https://src.fedoraproject.org/rpms/coreutils/c/
> e0f6afe5e62f80aa5d2007bd725b76158f76222e?branch=rawhide

What about simplifying that to:

+            +                  /* mb_width() returns 0 for control characters */            
+            +                  column += MIN(1, mb_width (c));

?

Comment 5 Kamil Dudka 2022-08-02 08:27:23 UTC
That might result in two calls to mb_width() in the most common case.

Comment 6 Martin Liška 2022-08-02 08:31:21 UTC
Then what about:

const int width = mb_width (c);            
column += MIN(1, width);

?

Comment 7 Kamil Dudka 2022-08-02 09:07:57 UTC
That works for me.  I found the current version more debugging-friendly as one does not need to step through a macro expansion.  But I have no strong opinion on that.  Martin, thank you for reviewing the change in the first place!

Comment 10 Martin Liška 2022-08-02 10:15:22 UTC
(In reply to Kamil Dudka from comment #7)
> That works for me.  I found the current version more debugging-friendly as
> one does not need to step through a macro expansion.  But I have no strong
> opinion on that.  Martin, thank you for reviewing the change in the first
> place!

Thanks for taking my suggestion.

Comment 11 Fedora Update System 2022-08-02 10:42:02 UTC
FEDORA-2022-536c8337c8 has been submitted as an update to Fedora 35. https://bodhi.fedoraproject.org/updates/FEDORA-2022-536c8337c8

Comment 12 Fedora Update System 2022-08-02 10:42:05 UTC
FEDORA-2022-9fc64fd114 has been submitted as an update to Fedora 36. https://bodhi.fedoraproject.org/updates/FEDORA-2022-9fc64fd114

Comment 13 Dirk Mueller 2022-08-02 11:40:07 UTC
the MIN should have been a MAX(), now the reproducer is again triggering.

Comment 14 Martin Liška 2022-08-02 11:51:27 UTC
Whoops, yeah..

Comment 15 Kamil Dudka 2022-08-02 12:17:33 UTC
Indeed.  That is embarrassing.  Thank you for letting us know, Dirk!

Comment 17 Fedora Update System 2022-08-03 01:16:43 UTC
FEDORA-2022-9fc64fd114 has been pushed to the Fedora 36 testing repository.
Soon you'll be able to install the update with the following command:
`sudo dnf upgrade --enablerepo=updates-testing --refresh --advisory=FEDORA-2022-9fc64fd114`
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2022-9fc64fd114

See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.

Comment 18 Fedora Update System 2022-08-03 02:56:49 UTC
FEDORA-2022-536c8337c8 has been pushed to the Fedora 35 testing repository.
Soon you'll be able to install the update with the following command:
`sudo dnf upgrade --enablerepo=updates-testing --refresh --advisory=FEDORA-2022-536c8337c8`
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2022-536c8337c8

See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.

Comment 19 Fedora Update System 2022-08-04 01:33:48 UTC
FEDORA-2022-9fc64fd114 has been pushed to the Fedora 36 stable repository.
If problem still persists, please make note of it in this bug report.

Comment 20 Dirk Mueller 2022-08-05 06:18:06 UTC
I got additional feedback on this change. namely expand and unexpand are treated differently in the coreutils-i18n modification. suggest this modification (its a diff of a diff, sorry) on the unexpand case, to match the (simpler) expand implementation:

-@@ -218,9 +277,11 @@ unexpand (void)
+@@ -218,9 +277,9 @@ unexpand (void)
                    next_tab_column = column;
                    tab_index -= !!tab_index;
                  }
-              else
-+              else if (!mb_iseq (c, '\n'))
++              else if (!mb_iscntrl (c))
                  {
 -                  column++;
-+                  /* mb_width() returns 0 for control characters */
-+                  const int width = mb_width (c);
-+                  column += (width) ? width : 1;
++                  column += mb_width (c);
                    if (!column)
                      die (EXIT_FAILURE, 0, _("input line is too long"));
                  }

now the implementation is symmetric to the one in expand().

Comment 21 Kamil Dudka 2022-08-05 08:50:27 UTC
Thanks for feedback!  I think it makes sense.  iswcntrl('\n') returns true, so we do not need to check for '\n' specifically.

While we are at it, isn't the "input line is too long" check lame in case mb_width() returns value greater than one?

Comment 22 Dirk Mueller 2022-08-05 09:10:06 UTC
I think this is a check for arithmetic "size_t" overflow, which is defined unlike the signed integer overflow. 
It can not practically happen I'd say, but it is more correct with the check in place.

Comment 23 Kamil Dudka 2022-08-05 10:39:48 UTC
The type of `column` is in fact uintmax_t.  The check is correct in upstream coreutils because `column` is always incremented by one.  If you take the highest uintmax_t number and increment it by one, the result is zero and the check triggers.  If you take the highest uintmax_t number and add more than one to it, the result is not zero.  I think the check in the i18n patch should actually look like this:

              else if (!mb_iscntrl (c))
                {
                  const uintmax_t orig_column = column;                                       
                  column += mb_width (c);
                  if (column < orig_column)
                    die (EXIT_FAILURE, 0, _("input line is too long"));                       
                }

Comment 24 Bernhard Voelker 2022-08-07 18:24:51 UTC
This avoids the overflow check completely for non-control characters.
This was also the case for older versions of the I18 patches.

What about the following which is closer to the non-I18N code,
i.e., just does the check in a condition-less 'else'?

              else
                {
                  const uintmax_t orig_column = column;
                  /* mb_width returns 0 for control chars which have width 1.  */
                  column += mb_iscntrl(c) ? 1 : mb_width(c);
                  /* Test overflow.  */
                  if (column < orig_column)
                    die (EXIT_FAILURE, 0, _("input line is too long"));
                }

Comment 25 Bernhard Voelker 2022-08-07 19:15:57 UTC
ah, no, my previous suggestion was wrong.

@Kamil: your suggestion in #c23 looks correct.
One suggestion: as mb_width returns 0 for control characters, we don't need to
single them out either, and can simplify to a plain 'else':

              else
                {
                  const uintmax_t orig_column = column;
                  column += mb_width(c);
                  if (column < orig_column)
                    die (EXIT_FAILURE, 0, _("input line is too long"));
                }

Comment 26 Kamil Dudka 2022-08-08 11:33:37 UTC
Thanks for review, Berny!  I will pick the version from comment #25.

Fedora commit: https://src.fedoraproject.org/rpms/coreutils/c/f3c6ff7e2cdc49208be8add4e51eec3c45767196?branch=rawhide

Comment 27 Fedora Update System 2022-08-08 15:12:41 UTC
FEDORA-2022-d2973d1e3b has been submitted as an update to Fedora 36. https://bodhi.fedoraproject.org/updates/FEDORA-2022-d2973d1e3b

Comment 28 Fedora Update System 2022-08-08 15:12:43 UTC
FEDORA-2022-50c9e3fecf has been submitted as an update to Fedora 35. https://bodhi.fedoraproject.org/updates/FEDORA-2022-50c9e3fecf

Comment 29 Fedora Update System 2022-08-09 01:33:27 UTC
FEDORA-2022-50c9e3fecf has been pushed to the Fedora 35 testing repository.
Soon you'll be able to install the update with the following command:
`sudo dnf upgrade --enablerepo=updates-testing --refresh --advisory=FEDORA-2022-50c9e3fecf`
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2022-50c9e3fecf

See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.

Comment 30 Fedora Update System 2022-08-09 02:52:31 UTC
FEDORA-2022-d2973d1e3b has been pushed to the Fedora 36 testing repository.
Soon you'll be able to install the update with the following command:
`sudo dnf upgrade --enablerepo=updates-testing --refresh --advisory=FEDORA-2022-d2973d1e3b`
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2022-d2973d1e3b

See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.

Comment 31 Fedora Update System 2022-08-10 01:16:37 UTC
FEDORA-2022-d2973d1e3b has been pushed to the Fedora 36 stable repository.
If problem still persists, please make note of it in this bug report.

Comment 32 Fedora Update System 2022-08-25 07:47:34 UTC
FEDORA-2022-50c9e3fecf has been pushed to the Fedora 35 stable repository.
If problem still persists, please make note of it in this bug report.