Bug 150569

Summary: mc freezes while coping or deleting certain files
Product: [Fedora] Fedora Reporter: Radek Vokál <rvokal>
Component: mcAssignee: Jindrich Novy <jnovy>
Status: CLOSED RAWHIDE QA Contact:
Severity: medium Docs Contact:
Priority: medium    
Version: rawhideCC: leonard-rh-bugzilla, pknirsch
Target Milestone: ---Keywords: EasyFix
Target Release: ---   
Hardware: i386   
OS: Linux   
Whiteboard:
Fixed In Version: 4.6.1a-0.5 Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2005-03-17 14:06:42 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Radek Vokál 2005-03-08 14:28:55 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.6)
Gecko/20050302 Firefox/1.0.1 Fedora/1.0.1-1.3.2

Description of problem:
mc freezes in deadloop when deleting certain files. The filenames are
in UFT-8 and has some white spaces and special characters. 100%
reproducible on my system.  

Version-Release number of selected component (if applicable):
mc-4.6.1a-0.4

How reproducible:
Always

Steps to Reproduce:
1. scp files from remote machine (dir names like rock&roll/Guano Apes
  2002/01 - Open Your Eyes.ogg
2. try to copy the file locally with mc
3. observe mc taking 99.9% of your CPU
    

Additional info:

Comment 1 Leonard den Ottolander 2005-03-08 23:03:31 UTC
I'm not sure the forward slash in the file name is valid. The shell
doesn't handle it correctly.

$ touch foo\/bar gives an error.

Seems more like an extfs bug. Don't think it should accept files with
slashes in the name.


Comment 2 Leonard den Ottolander 2005-03-08 23:15:23 UTC
Ok. Assuming the second slash is a path separator...

I have no problem whatsoever moving or copying either the file, or one
of the parent dirs around. I'm running a 4.6.1-pre build, but that
shouldn't matter for this issue.

Please explain how step 1) is involved.


Comment 3 Roland Illig 2005-03-08 23:19:22 UTC
Dear Radek,

1. Please verify that the loop is really endless.
2. What happens if you leave out step 1?
3. Which process is taking the CPU? (run top(1)). Is it really mc?
   Is it in userspace or in system space?

Roland

Comment 4 Radek Vokál 2005-03-09 07:42:03 UTC
Roland, step 3 .. mc takes 99.9% of CPU .. this is what top says and
what I already said! 

Leonard, about step 1. Sorry for fwd slashes, the files were get from
Windows filesystem, downloaded to a local machine and then I've played
around with them. It really seems like there's sth broken in utf8
patch, attaching gdb to a process and doing backtrace says, that it
freezes on mbstrlen function. Here's the patch for mc-utf8.patch which
jnovy proposed to me and it kills the deadloop.. So with this patch
the issue is fixed.


@@ -2427,6 +2427,9 @@
 +            size_t len;
 +
 +            len = mbrtowc (&c, str, MB_CUR_MAX, NULL);
++	    
++	    if (len == (size_t)(-1)) break;
++	    
 +            if (len > 0) {
 +                int wcsize = wcwidth(c);
 +                width += wcsize > 0 ? wcsize : 0;


Comment 5 Jindrich Novy 2005-03-09 09:04:21 UTC
Radek,

the patch I sent you was a quick hack to figure out whether the problem is in
the fact that mbrtowc() founds characters that it considers as a multi-byte
sequence that it's unable to parse.

The code of the mbstrlen() is:

size_t
mbstrlen (const char *str)
{
#ifdef UTF8
    if (SLsmg_Is_Unicode) {
        size_t width = 0;

        for (; *str; str++) {
            wchar_t c;
            size_t len;

            len = mbrtowc (&c, str, MB_CUR_MAX, NULL);

            if (len > 0) {
                int wcsize = wcwidth(c);
                width += wcsize > 0 ? wcsize : 0;
                str += len-1;
            }
        }

        return width;
    } else
#endif
       return strlen (str);
}

Please note that if mbrtowc() returns (size_t)(-1) in case of invalid multibyte
sequence, what actually means that len=0xffffffff because size_t is an unsigned
int. The condition len > 0 is always true except of len==0. Then str+=len-1 will
cause decrement of two of the str pointer, what is incremented of one by str++
at the end of the for loop. So actually we moved one character back. If the call
of mbrtowc() for the previous character succeeds and len=1, we'll be parsing
again the invalid byte sequence -> here we go with the infinite loop.

Leonard, Roland, this won't affect the upstream mc as it's a utf8 patch that
needs to be fixed. I found at least one place in the upstream mc code, where the
glibc calls are not properly checked whether they end up successfully and the
problem is very similar to this one (may cause something different than infinite
loops, but for sure something odd). Unfortunately it remained unnoticed up to now.

Please see:
http://mail.gnome.org/archives/mc-devel/2005-February/msg00041.html


Comment 6 Jindrich Novy 2005-03-09 17:35:46 UTC
Fix for this issue is now applied in CVS. I also ported the UTF8 patch to the
today's mc CVS checkout, so the fix will come also with the recent mc.

I couldn't build it today, because of the BuildRoot pollution on ppc64,
hopefully it'll be fixed tomorrow.


Comment 7 Jindrich Novy 2005-03-17 14:06:42 UTC
The problem was finally with the fact that mc configure script on ppc64 defined
umode_t macro even if it was defined by sys/types.h already.