Description of problem:
Command-line tool, dfsadmin -report, returns no data. Report data is available from NameNode web interface.

Version-Release number of selected component (if applicable):
# rpm -qa gluster\* org.apache.hadoop\*
glusterfs-3.4.0qa6-1.el6rhs.x86_64
glusterfs-fuse-3.4.0qa6-1.el6rhs.x86_64
org.apache.hadoop.fs.glusterfs-glusterfs-0.20.2_0.2-1.noarch
glusterfs-server-3.4.0qa6-1.el6rhs.x86_64
http://www.eng.lsu.edu/mirrors/apache/hadoop/common/stable/hadoop-1.0.4-bin.tar.gz

How reproducible:
100%

Steps to Reproduce:
1. Install and setup via https://access.redhat.com/knowledge/articles/264053

Actual results:
[root@head hadoop-1.0.4]# ./bin/hadoop dfsadmin -report
Initializing GlusterFS

Expected results:
Either access denied, e.g.

report: org.apache.hadoop.security.AccessControlException: Access denied for user test. Superuser privilege is required

or a report, e.g.

Configured Capacity: 211378749436 (196.86 GB)
Present Capacity: 184489046016 (171.82 GB)
DFS Remaining: 171123924992 (159.37 GB)
DFS Used: 13365121024 (12.45 GB)
DFS Used%: 7.24%
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0

-------------------------------------------------
Datanodes available: 4 (4 total, 0 dead)
...
Per Feb-13 bug triage meeting, reassigning to swatt.
Assigning to Jay
Are we sure this is a bug in gluster-fs hadoop plugin?

GlusterFS Hadoop plugin *does not* replace the hadoop org.apache.hadoop.DistributedFileSystem, but rather, implements a FileSystem class of its own (GlusterFileSystem).

Meanwhile - the DFS admin is supposed to be specific to DistributedFileSystem instances - this is enforced statically in the code, i.e.:

https://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.20/src/hdfs/org/apache/hadoop/hdfs/tools/DFSAdmin.java

    /**
     * An abstract class for the execution of a file system command
     */
    abstract private static class DFSAdminCommand extends Command {
      final DistributedFileSystem dfs;

      /** Constructor */
      public DFSAdminCommand(FileSystem fs) {
        super(fs.getConf());
        if (!(fs instanceof DistributedFileSystem)) {
          throw new IllegalArgumentException("FileSystem " + fs.getUri() +
              " is not a distributed file system");
        }
        this.dfs = (DistributedFileSystem)fs;
      }
    }
Actually, on closer inspection, the report() function does nothing when run against a non-"DistributedFileSystem", where "DistributedFileSystem" means a file system that is an instance of class org.apache.hadoop.hdfs.DistributedFileSystem.

SUGGESTION: Should this be filed as a JIRA with the Apache Hadoop folks (i.e. to remove the hardcoded dependency on org.apache.hadoop.hdfs.DistributedFileSystem)? Currently, the DFSAdmin class does not seem to be pluggable.
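To illustrate the behaviour described above, here is a minimal, self-contained sketch. It does not use the real Hadoop classes: StubFileSystem, StubDistributedFileSystem, StubGlusterFileSystem and ReportDemo are hypothetical stand-ins, and the URIs are made up. It only assumes that report() is guarded by an instanceof check, as the comment above suggests, and contrasts that silent no-op with failing explicitly the way the DFSAdminCommand constructor does.

```java
import java.net.URI;

/** Demo driver (hypothetical, not Hadoop code): silent no-op vs. explicit failure. */
class ReportDemo {

    /** Assumed shape of the current behaviour: no output at all for a non-DFS file system. */
    static String silentReport(StubFileSystem fs) {
        StringBuilder out = new StringBuilder();
        if (fs instanceof StubDistributedFileSystem) {
            StubDistributedFileSystem dfs = (StubDistributedFileSystem) fs;
            out.append("Configured Capacity: ").append(dfs.capacityBytes());
        }
        return out.toString();
    }

    /** Alternative: fail loudly, reusing the message from the DFSAdminCommand constructor. */
    static String strictReport(StubFileSystem fs) {
        if (!(fs instanceof StubDistributedFileSystem)) {
            throw new IllegalArgumentException(
                "FileSystem " + fs.getUri() + " is not a distributed file system");
        }
        return silentReport(fs);
    }

    public static void main(String[] args) {
        StubFileSystem gluster = new StubGlusterFileSystem();
        System.out.println("silent: [" + silentReport(gluster) + "]");
        try {
            strictReport(gluster);
        } catch (IllegalArgumentException e) {
            System.out.println("strict: " + e.getMessage());
        }
    }
}

/** Stand-in for org.apache.hadoop.fs.FileSystem (only the part needed here). */
abstract class StubFileSystem {
    abstract URI getUri();
}

/** Stand-in for org.apache.hadoop.hdfs.DistributedFileSystem. */
class StubDistributedFileSystem extends StubFileSystem {
    URI getUri() { return URI.create("hdfs://namenode:8020"); } // made-up address
    long capacityBytes() { return 211378749436L; }              // sample value from the report above
}

/** Stand-in for the plugin's GlusterFileSystem. */
class StubGlusterFileSystem extends StubFileSystem {
    URI getUri() { return URI.create("glusterfs://head:9000"); } // made-up address
}
```

With the stub Gluster file system, silentReport returns an empty string (matching the "no data" symptom), while strictReport raises an IllegalArgumentException that names the file system. Either the access-denied style error or a real report would at least tell the user what happened.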
Marking as low since this may not be a "real" bug. We have to determine the correct behaviour for hadoop dfs calls when running on gluster.
Related - https://issues.apache.org/jira/browse/HDFS-4837 A service could exist for HCFS Gluster that implements some NameNode APIs used by various tools, such as dfsadmin or even the web ui.
Per 11/13 bug triage meeting, re-assigning to bchilds.
Because of the large number of bugs filed against it, the "mainline" version is ambiguous and about to be removed as a choice. If you believe this is still a bug, please change the status back to NEW and choose the appropriate, applicable version for it.