Bug 546480 - ls -l symlink characters are dangerous during copy-paste
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Fedora
Classification: Fedora
Component: coreutils
Version: rawhide
Hardware: All
OS: Linux
Priority: low
Severity: low
Target Milestone: ---
Assignee: Ondrej Vasik
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2009-12-10 23:50 UTC by Greg Swift
Modified: 2009-12-11 15:35 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2009-12-11 07:26:29 UTC
Type: ---
Embargoed:


Description Greg Swift 2009-12-10 23:50:18 UTC
To preface: I realize what I am suggesting here.  I realize that most people will look at this and laugh it off, but it's a valid concern, so please take a minute to consider it carefully.  The first time it happened to one of the admins I support I didn't think anything of it.  The second time it happened, it made me think there was something to the problem.  There is already some precedent that user actions that are accidentally dangerous are worth addressing (CTRL+ALT+BACKSPACE), and this one has a higher potential for damage than that.  To be quite honest, I can't say that I've never accidentally done this myself, because there have been times over the years when I've turned up odd zeroed-out files and couldn't explain them.  And if you've never accidentally pasted your clipboard into a shell, then you are a better user than most people I know.  Thanks :)


Description of problem:
If you accidentally copy and paste 'ls -l' output that shows symlinks into a Linux terminal (right click in PuTTY, middle click in a Linux terminal), you will zero out the target of every symlink in the pasted text, because the shell treats the '>' in ' -> ' as output redirection.

The PuTTY case is the most likely to happen by accident. We've had multiple sysadmins blow away cooked database files this way.

Version-Release number of selected component (if applicable):
specifically the output of ls, on pretty much all versions.

How reproducible:
very

Steps to Reproduce:
1: Create a file you don't care about (count= keeps dd from filling the disk)
  dd if=/dev/zero of=/path/to/file bs=1k count=100
2: Create a symlink to that file in the current dir
  ln -s /path/to/file
3: Run 'ls -l /path/to/file' to see the file size
4: Run 'ls -l'
5: Select the symlink line in the output of 'ls -l' and paste it into the same terminal.
6: Run 'ls -l /path/to/file' to see the file size again

Actual results:
The file now has a size of 0.

Expected results:
Nothing changes.
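
The steps to reproduce can also be tried safely in a throwaway directory. The following is only a sketch: the scratch directory and the file name "oops" are placeholders, and eval stands in for the accidental paste followed by Enter.

```shell
#!/bin/bash
# Sketch: reproduce the truncation in a scratch directory.
# All paths are hypothetical; eval simulates pasting the line at the prompt.
scratch=$(mktemp -d)
mkdir "$scratch/data" "$scratch/links"

# 100 KiB test file; count= stops dd after 100 blocks
dd if=/dev/zero of="$scratch/data/oops" bs=1024 count=100 2>/dev/null

cd "$scratch/links" || exit 1
ln -s "$scratch/data/oops" oops
size_before=$(wc -c < "$scratch/data/oops")

# Take the symlink line exactly as ls -l prints it
line=$(ls -l | grep -- ' -> ')

# "Paste" it: the command name is not found, but the redirection still runs
eval "$line" 2>/dev/null || true

size_after=$(wc -c < "$scratch/data/oops")
echo "before: $size_before bytes, after: $size_after bytes"
```

The second number should be 0 even though the pasted line is not a valid command.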

Additional info:
Here is an example.  I was careful to paste only part of one line (the 'oops' symlink) so that the real files were not touched.

[user@box db]$ ls -l
total 0
lrwxrwxrwx 1 user user 31 Oct  2 16:43 indexdb1 -> /data/chunks1/db/indexdb1
lrwxrwxrwx 1 user user 30 Oct  2 16:37 livedb1 -> /data/chunks1/db/livedb1
lrwxrwxrwx 1 user user 30 Oct  2 16:35 livedb2 -> /data/chunks1/db/livedb2
lrwxrwxrwx 1 user user 30 Oct  2 16:28 llogdb1 -> /data/chunks1/db/llogdb1
lrwxrwxrwx 1 user user 30 Oct  2 16:28 llogdb2 -> /data/chunks1/db/llogdb2
lrwxrwxrwx 1 user user 27 Nov  3 16:42 oops -> /data/chunks1/db/oops
lrwxrwxrwx 1 user user 33 Oct  2 16:28 physlogdb1 -> /data/chunks1/db/physlogdb1
lrwxrwxrwx 1 root     root     30 Oct  2 14:38 rootdb1 -> /data/chunks1/db/rootdb1
lrwxrwxrwx 1 user user 33 Oct  2 17:00 sysadmindb -> /data/chunks1/db/sysadmindb
lrwxrwxrwx 1 user user 30 Oct  2 16:28 tempdb1 -> /data/chunks1/db/tempdb1
lrwxrwxrwx 1 user user 30 Oct  2 16:28 tempdb2 -> /data/chunks1/db/tempdb2
lrwxrwxrwx 1 user user 29 Oct  2 16:27 testdb -> /data/chunks1/db/testdb
[user@puffin db]$ ll /data/chunks1/db/
total 18618344
-rw-rw---- 1 user user 2048000000 Oct  2 16:44 indexdb1
-rw-rw---- 1 user user 2048000000 Oct  2 16:38 livedb1
-rw-rw---- 1 user user 2048000000 Oct  2 16:38 livedb2
-rw-rw---- 1 user user 2048000000 Nov  3 16:02 llogdb1
-rw-rw---- 1 user user 2048000000 Oct  2 16:34 llogdb2
-rw-rw-r-- 1 user user     102400 Nov  3 16:42 oops
-rw-rw---- 1 user user 1024000000 Oct  2 16:29 physlogdb1
-rw-rw---- 1 user user  614400000 Nov  3 15:45 rootdb1
-rw-rw---- 1 user user 1024000000 Nov  3 15:45 sysadmindb
-rw-rw---- 1 user user 2048000000 Oct  2 16:50 tempdb1
-rw-rw---- 1 user user 2048000000 Oct  2 16:50 tempdb2
-rw-rw---- 1 user user 2048000000 Oct  2 16:45 testdb
[user@puffin db]$ rmix user 27 Nov  3 16:42 oops -> /data/chunks1/db/oops
-bash: rmix: command not found
[user@puffin db]$ ll /data/chunks1/db/
total 18618240
-rw-rw---- 1 user user 2048000000 Oct  2 16:44 indexdb1
-rw-rw---- 1 user user 2048000000 Oct  2 16:38 livedb1
-rw-rw---- 1 user user 2048000000 Oct  2 16:38 livedb2
-rw-rw---- 1 user user 2048000000 Nov  3 16:02 llogdb1
-rw-rw---- 1 user user 2048000000 Oct  2 16:34 llogdb2
-rw-rw-r-- 1 user user          0 Nov  3 16:42 oops
-rw-rw---- 1 user user 1024000000 Oct  2 16:29 physlogdb1
-rw-rw---- 1 user user  614400000 Nov  3 15:45 rootdb1
-rw-rw---- 1 user user 1024000000 Nov  3 15:45 sysadmindb
-rw-rw---- 1 user user 2048000000 Oct  2 16:50 tempdb1
-rw-rw---- 1 user user 2048000000 Oct  2 16:50 tempdb2
-rw-rw---- 1 user user 2048000000 Oct  2 16:45 testdb
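
The transcript above shows 'oops' dropping to 0 bytes even though "rmix" is not a command. That is because the shell sets up redirections before it looks up the command name. A minimal sketch of just that mechanism, where the command name and the temp file are made-up stand-ins:

```shell
#!/bin/bash
# Sketch: the redirection target is truncated before command lookup fails.
# no_such_command_xyz and the temp file are hypothetical.
target=$(mktemp)
echo "important data" > "$target"

# The shell reads "-" as an ordinary argument and "> $target" as a
# redirection, truncates the file, then reports "command not found".
no_such_command_xyz 27 Nov 3 16:42 oops -> "$target" 2>/dev/null || true

remaining=$(wc -c < "$target")
echo "bytes left: $remaining"
```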



Discussion from RH Issue Tracker:

RH - 1. After running several tests we have come to the conclusion that the redirect character '>' would cause this to happen with any shell. Therefore this cannot be considered a bug.
2. To help ensure that US Courts does not damage any files if a user happens to paste the contents of a directory listing into the terminal, you could try the following suggestion:

The tcsh, bash, and apparently even ksh shells all have a "noclobber" feature, specifically designed to help protect users from inadvertently erasing/truncating existing files with an accidental use of some types of output redirection.

If US Courts would like to have some kind of protection in place, we could add "set -o noclobber" in /etc/bashrc (assuming the users use bash).  If the users prefer tcsh for a shell, similar protection can be added to /etc/csh.cshrc (see the tcsh man page for details if needed).
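
A rough sketch of what noclobber does in bash, using a temp file as a stand-in for a real data file: a plain ">" onto an existing file is refused, while ">|" still allows a deliberate overwrite.

```shell
#!/bin/bash
# Sketch: noclobber refuses ">" on an existing file; ">|" overrides it.
# The temp file is a hypothetical stand-in for a real data file.
target=$(mktemp)
echo "important data" > "$target"

set -o noclobber

# This redirection fails, so the file keeps its contents.
if (echo "overwritten" > "$target") 2>/dev/null; then
    clobbered=yes
else
    clobbered=no
fi
after_plain=$(cat "$target")

# A deliberate overwrite is still possible with >|
echo "replaced on purpose" >| "$target"
after_forced=$(cat "$target")

set +o noclobber
echo "clobbered=$clobbered"
echo "after plain redirect:  $after_plain"
echo "after forced redirect: $after_forced"
```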


Me - It does not surprise me that this would be a problem across all shells, as neither the `ls -l` output nor the > is really shell dependent.  I see how noclobber would provide this protection, but to the detriment of the many scripts commonly used for administering a system.  Regardless, my point was that the output of 'ls' is buggy, not the behavior of the shells.
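
On the concern about scripts: as far as I know, an option set with "set -o" applies to the shell that sets it and its subshells, and is not passed to a script started as a separate bash process unless SHELLOPTS has been exported. A small sketch to check this on a given system (it assumes bash is on the PATH and SHELLOPTS is not exported):

```shell
#!/bin/bash
# Sketch: compare noclobber in the current shell and in a separate bash process.
# Assumes SHELLOPTS is not exported in the environment.
set -o noclobber
parent_state=$(set -o | grep noclobber)
child_state=$(bash -c 'set -o | grep noclobber')
set +o noclobber

echo "parent: $parent_state"
echo "child:  $child_state"
```

Scripts that are sourced into the interactive shell, rather than executed, would still see the option.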

RH - Please expand on the erroneous behavior found within the ls command. 

Me - The erroneous behavior of the ls command is that it uses a potentially destructive character (>).  A different character that gives the same visual representation but is non-destructive when copy-pasted would be an ideal solution.  Discussion with our TAM mentioned a UTF-8 character that looks like a >.  One problem I could see with this (and I don't know how often it would come up) is that people who write scripts that parse 'ls -l' would have a harder time parsing a non-keyboard character.  They could do it, I just imagine it would be more difficult.  I don't know how much of an impact that would have.

RH - We have discussed this issue internally with engineering.  It is unlikely this would be accepted as a feature request by Product Management.
  * Copying and pasting characters into the terminal and then pressing enter would likely be viewed as user error by the upstream community.
  * Options for protection were suggested (using noclobber).
  * During a phone conversation the user suggested opening a bugzilla on the customer side to present this upstream.  This would be entered in the Fedora section of bugzilla.

https://bugzilla.redhat.com/enter_bug.cgi?classification=Fedora

At this time the feature request has been nacked.

Normally I would suggest a Knowledge Base article. Red Hat can post an article to help users avoid file damage by using noclobber protection etc.

If the bugzilla in Fedora is accepted, we can then push for a Red Hat Enterprise Linux bugzilla to link to this ticket.

Comment 1 Ondrej Vasik 2009-12-11 07:26:29 UTC
Thanks for the report and the detailed description of the situation; I see your point. I would say NOTABUG as well, as this format is required by POSIX...

relevant section of ls specification about -l option:
"If the file is a symbolic link, this information shall be about the link itself and the <pathname> field shall be of the form:

"%s -> %s", <pathname of link>, <contents of link>"

See http://www.opengroup.org/onlinepubs/000095399/utilities/ls.html for the full POSIX specification. I don't know if there is much to do. A Knowledge Base article is maybe the easiest solution... Proposing the change upstream would collide with POSIX. So a request to adjust the format in the next POSIX version could be another solution, the harder one, as this format is well established.

Closing this bugzilla NOTABUG, but adding the upstream maintainer to CC to let him know about the issue.

Comment 2 Jim Meyering 2009-12-11 07:44:25 UTC
Thanks for the heads up.
You're right.  The only hope for change is to go through POSIX,
and that would be a very long shot, since there are sure to be
scripts that parse ls -l output and that hence expect the " -> ".
POSIX would never invalidate all of those scripts.
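
For illustration, this is roughly the kind of script that depends on the separator, next to a variant that asks for the link contents directly. It is only a sketch: the file names are made up, and readlink is assumed to be available (it ships with GNU coreutils).

```shell
#!/bin/bash
# Sketch: a script that relies on " -> " in ls -l output, and an alternative.
# File names are hypothetical.
scratch=$(mktemp -d)
touch "$scratch/real"
ln -s "$scratch/real" "$scratch/link"

# Fragile: keep whatever follows the last " -> " on the ls -l line.
sep=' -> '
line=$(LC_ALL=C ls -l "$scratch/link")
parsed_target=${line##*"$sep"}

# Sturdier: read the link contents without parsing ls output.
direct_target=$(readlink "$scratch/link")

echo "parsed: $parsed_target"
echo "direct: $direct_target"
```

Scripts written the second way would not care which separator ls prints.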

You might get POSIX to bless a change whereby in any non-C locale,
another byte sequence may be used in place of " -> ".  That would
protect the common case (users at the command line usually have
LANG != C) yet allow scripted use (LC_ALL=C) to continue to have the
required behavior.
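
Until something like that exists in ls itself, the idea can be approximated with a wrapper. The function below is a hypothetical sketch, not an ls feature: the name safe_ll, the choice of "→" as the replacement, and the simplified locale check (LC_ALL, then LANG) are all assumptions.

```shell
#!/bin/bash
# Hypothetical wrapper, not part of ls: print " → " instead of " -> "
# unless the locale is C or POSIX. Simplification: only LC_ALL and LANG
# are consulted, and a UTF-8 capable terminal is assumed.
safe_ll() {
    local loc=${LC_ALL:-${LANG:-C}}
    case $loc in
        C|POSIX)
            # Scripted use keeps the POSIX-mandated format.
            ls -l "$@"
            ;;
        *)
            # Interactive use: the first " -> " on each line is replaced,
            # so the pasted text no longer contains a redirection.
            ls -l "$@" | sed 's/ -> / → /'
            ;;
    esac
}
```

Usage would be "safe_ll" in place of "ls -l" at the prompt, while scripts keep calling "LC_ALL=C ls -l". A file name that itself contains " -> " would be altered by this sketch.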

Comment 3 Greg Swift 2009-12-11 15:03:18 UTC
Thanks for the feedback.  How would one go about proposing a change to POSIX? It's a long shot, but it doesn't hurt to ask.

Comment 4 Jim Meyering 2009-12-11 15:35:10 UTC
First, subscribe to one or more of the groups here: 

http://www.opengroup.org/austin/lists.html

austin-group-l
austin-group-futures-l

You might want to lurk for a while, or read archives.
Then ask if such a change would be possible.

