Description of problem:
Hadoop S3A 2.7.3 distcp of a file bigger than ~5G: after the bz#1470301 patch the upload goes into a loop and never finishes, and RGW has "ERROR: client_io->complete_request() returned -5".

Version-Release number of selected component (if applicable):
Red Hat Ceph Storage 2.3
Test package from bz#1470301: 10.2.7-28.0.TEST.bz1470301.el7cp

# hadoop version
Hadoop 2.7.3
Subversion https://git-wip-us.apache.org/repos/asf/hadoop.git -r baa91f7c6bc9cb92be5982de4719c1c8af91ccff
Compiled by root on 2016-08-18T01:41Z
Compiled with protoc 2.5.0
From source with checksum 2e4ce5f957ea4db193bce3734ff29ff4
This command was run using /root/hadoop-2.7.3/share/hadoop/common/hadoop-common-2.7.3.jar

How reproducible:
Always

Steps to Reproduce:
1. Configure Hadoop S3A with
2. Use hadoop distcp to upload a file above 6GB; I tried 7G and 8G and both failed.
3. Sizes below 5G work fine.
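Step 1 of the reproduction is cut off in the report, so the exact S3A settings the reporter used are not known. For context only, a minimal S3A-against-RGW client configuration typically looks like the sketch below; the endpoint host/port and the credential values are placeholders, not the reporter's actual settings.

```shell
# Hypothetical core-site.xml pointing S3A at a Ceph RGW endpoint.
# rgw.example.com:7480, ACCESS_KEY and SECRET_KEY are placeholders.
cat > core-site.xml <<'EOF'
<configuration>
  <property>
    <name>fs.s3a.endpoint</name>
    <value>http://rgw.example.com:7480</value>
  </property>
  <property>
    <name>fs.s3a.access.key</name>
    <value>ACCESS_KEY</value>
  </property>
  <property>
    <name>fs.s3a.secret.key</name>
    <value>SECRET_KEY</value>
  </property>
  <property>
    <name>fs.s3a.connection.ssl.enabled</name>
    <value>false</value>
  </property>
</configuration>
EOF
```

With a file like this on the Hadoop classpath, the `hadoop distcp /root/testhadoop.7G s3a://test-bucket/` commands in the test results below run against the RGW endpoint instead of Amazon S3.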
1. Files larger than 5G are not getting uploaded from Hadoop. It started with **Unable to copy part: Status Code: 404** on version 2.3 - 10.2.7-27.el7cp.

---
17/07/13 18:22:07 ERROR util.RetriableCommand: Failure in Retriable command: Copying file:/root/testhadoop.7G to s3a://test-bucket/testhadoop.7G com.amazonaws.AmazonClientException: Unable to copy part: Status Code: 404, AWS Service: Amazon S3, AWS Request ID: tx000000000000000000106-0059676cf1-6dbb4-default, AWS Error Code: NoSuchKey, AWS Error Message: null
---

[1] 4G and 5G succeed, 7G fails.

Test Results:
==================================================

1. 7G file:
=====================

# hadoop distcp /root/testhadoop.7G s3a://test-bucket/
17/07/13 18:14:59 INFO tools.DistCp: Input Options: DistCpOptions{atomicCommit=false, syncFolder=false, deleteMissing=false, ignoreFailures=false, maxMaps=20, sslConfigurationFile='null', copyStrategy='uniformsize', sourceFileListing=null, sourcePaths=[/root/testhadoop.7G], targetPath=s3a://test-bucket/, targetPathExists=true, preserveRawXattrs=false} 17/07/13 18:14:59 INFO Configuration.deprecation: session.id is deprecated. Instead, use dfs.metrics.session-id 17/07/13 18:14:59 INFO jvm.JvmMetrics: Initializing JVM Metrics with processName=JobTracker, sessionId= 17/07/13 18:15:00 INFO Configuration.deprecation: io.sort.mb is deprecated. Instead, use mapreduce.task.io.sort.mb 17/07/13 18:15:00 INFO Configuration.deprecation: io.sort.factor is deprecated.
Instead, use mapreduce.task.io.sort.factor 17/07/13 18:15:00 INFO jvm.JvmMetrics: Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized 17/07/13 18:15:00 INFO mapreduce.JobSubmitter: number of splits:1 17/07/13 18:15:00 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_local375793309_0001 17/07/13 18:15:00 INFO mapreduce.Job: The url to track the job: http://localhost:8080/ 17/07/13 18:15:00 INFO tools.DistCp: DistCp job-id: job_local375793309_0001 17/07/13 18:15:00 INFO mapreduce.Job: Running job: job_local375793309_0001 17/07/13 18:15:00 INFO mapred.LocalJobRunner: OutputCommitter set in config null 17/07/13 18:15:00 INFO output.FileOutputCommitter: File Output Committer Algorithm version is 1 17/07/13 18:15:00 INFO mapred.LocalJobRunner: OutputCommitter is org.apache.hadoop.tools.mapred.CopyCommitter 17/07/13 18:15:00 INFO mapred.LocalJobRunner: Waiting for map tasks 17/07/13 18:15:00 INFO mapred.LocalJobRunner: Starting task: attempt_local375793309_0001_m_000000_0 17/07/13 18:15:00 INFO output.FileOutputCommitter: File Output Committer Algorithm version is 1 17/07/13 18:15:00 INFO mapred.Task: Using ResourceCalculatorProcessTree : [ ] 17/07/13 18:15:00 INFO mapred.MapTask: Processing split: file:/tmp/hadoop-root/mapred/staging/root680833703/.staging/_distcp-2118402004/fileList.seq:0+199 17/07/13 18:15:00 INFO output.FileOutputCommitter: File Output Committer Algorithm version is 1 17/07/13 18:15:00 INFO mapred.CopyMapper: Copying file:/root/testhadoop.7G to s3a://test-bucket/testhadoop.7G 17/07/13 18:15:00 INFO mapred.RetriableFileCopyCommand: Creating temp file: s3a://test-bucket/.distcp.tmp.attempt_local375793309_0001_m_000000_0 17/07/13 18:15:01 INFO mapreduce.Job: Job job_local375793309_0001 running in uber mode : false 17/07/13 18:15:01 INFO mapreduce.Job: map 0% reduce 0% 17/07/13 18:15:06 INFO mapred.LocalJobRunner: 5.5% Copying file:/root/testhadoop.7G to s3a://test-bucket/testhadoop.7G [395.7M/7.0G] > map 
17/07/13 18:15:07 INFO mapreduce.Job: map 100% reduce 0% 17/07/13 18:15:09 INFO mapred.LocalJobRunner: 7.3% Copying file:/root/testhadoop.7G to s3a://test-bucket/testhadoop.7G [519.9M/7.0G] > map 17/07/13 18:15:12 INFO mapred.LocalJobRunner: 7.8% Copying file:/root/testhadoop.7G to s3a://test-bucket/testhadoop.7G [560.4M/7.0G] > map 17/07/13 18:15:15 INFO mapred.LocalJobRunner: 11.7% Copying file:/root/testhadoop.7G to s3a://test-bucket/testhadoop.7G [836.4M/7.0G] > map 17/07/13 18:15:18 INFO mapred.LocalJobRunner: 13.6% Copying file:/root/testhadoop.7G to s3a://test-bucket/testhadoop.7G [971.9M/7.0G] > map 17/07/13 18:15:21 INFO mapred.LocalJobRunner: 15.8% Copying file:/root/testhadoop.7G to s3a://test-bucket/testhadoop.7G [1.1G/7.0G] > map 17/07/13 18:15:24 INFO mapred.LocalJobRunner: 16.8% Copying file:/root/testhadoop.7G to s3a://test-bucket/testhadoop.7G [1.2G/7.0G] > map 17/07/13 18:15:27 INFO mapred.LocalJobRunner: 18.8% Copying file:/root/testhadoop.7G to s3a://test-bucket/testhadoop.7G [1.3G/7.0G] > map [.....] 
17/07/13 18:22:03 INFO mapred.LocalJobRunner: 100.0% Copying file:/root/testhadoop.7G to s3a://test-bucket/testhadoop.7G [7.0G/7.0G] > map 17/07/13 18:22:06 INFO mapred.LocalJobRunner: 100.0% Copying file:/root/testhadoop.7G to s3a://test-bucket/testhadoop.7G [7.0G/7.0G] > map 17/07/13 18:22:07 ERROR util.RetriableCommand: Failure in Retriable command: Copying file:/root/testhadoop.7G to s3a://test-bucket/testhadoop.7G com.amazonaws.AmazonClientException: Unable to copy part: Status Code: 404, AWS Service: Amazon S3, AWS Request ID: tx000000000000000000106-0059676cf1-6dbb4-default, AWS Error Code: NoSuchKey, AWS Error Message: null at com.amazonaws.services.s3.transfer.internal.CopyMonitor.collectPartETags(CopyMonitor.java:252) at com.amazonaws.services.s3.transfer.internal.CopyMonitor.completeMultipartUpload(CopyMonitor.java:227) at com.amazonaws.services.s3.transfer.internal.CopyMonitor.poll(CopyMonitor.java:171) at com.amazonaws.services.s3.transfer.internal.CopyMonitor.call(CopyMonitor.java:136) at com.amazonaws.services.s3.transfer.internal.CopyMonitor.call(CopyMonitor.java:46) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:748) Caused by: com.amazonaws.services.s3.model.AmazonS3Exception: Status Code: 404, AWS Service: Amazon S3, AWS Request ID: tx000000000000000000106-0059676cf1-6dbb4-default, AWS Error Code: NoSuchKey, AWS Error Message: null, S3 Extended Request ID: 6dbb4-default-default at com.amazonaws.http.AmazonHttpClient.handleErrorResponse(AmazonHttpClient.java:798) at com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:421) at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:232) at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:3528) at 
com.amazonaws.services.s3.AmazonS3Client.copyPart(AmazonS3Client.java:1638) at com.amazonaws.services.s3.transfer.internal.CopyPartCallable.call(CopyPartCallable.java:41) at com.amazonaws.services.s3.transfer.internal.CopyPartCallable.call(CopyPartCallable.java:28) ... 4 more 17/07/13 18:22:09 INFO mapred.RetriableFileCopyCommand: Creating temp file: s3a://test-bucket/.distcp.tmp.attempt_local375793309_0001_m_000000_0 17/07/13 18:22:12 INFO mapred.LocalJobRunner: 3.6% Copying file:/root/testhadoop.7G to s3a://test-bucket/testhadoop.7G [256.2M/7.0G] > map 17/07/13 18:22:15 INFO mapred.LocalJobRunner: 6.3% Copying file:/root/testhadoop.7G to s3a://test-bucket/testhadoop.7G [452.6M/7.0G] > map 17/07/13 18:22:18 INFO mapred.LocalJobRunner: 7.4% Copying file:/root/testhadoop.7G to s3a://test-bucket/testhadoop.7G [527.9M/7.0G] > map

2. 4G file:
=====================

# bin/hadoop distcp /root/testhadoop.4g s3a://test-bucket/
17/07/13 07:40:30 INFO tools.DistCp: Input Options: DistCpOptions{atomicCommit=false, syncFolder=false, deleteMissing=false, ignoreFailures=false, maxMaps=20, sslConfigurationFile='null', copyStrategy='uniformsize', sourceFileListing=null, sourcePaths=[/root/testhadoop.4g], targetPath=s3a://test-bucket/, targetPathExists=true, preserveRawXattrs=false} 17/07/13 07:40:30 INFO Configuration.deprecation: session.id is deprecated. Instead, use dfs.metrics.session-id 17/07/13 07:40:30 INFO jvm.JvmMetrics: Initializing JVM Metrics with processName=JobTracker, sessionId= 17/07/13 07:40:30 INFO Configuration.deprecation: io.sort.mb is deprecated. Instead, use mapreduce.task.io.sort.mb 17/07/13 07:40:30 INFO Configuration.deprecation: io.sort.factor is deprecated.
Instead, use mapreduce.task.io.sort.factor 17/07/13 07:40:30 INFO jvm.JvmMetrics: Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized 17/07/13 07:40:30 INFO mapreduce.JobSubmitter: number of splits:1 17/07/13 07:40:30 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_local1614843136_0001 17/07/13 07:40:30 INFO mapreduce.Job: The url to track the job: http://localhost:8080/ 17/07/13 07:40:30 INFO tools.DistCp: DistCp job-id: job_local1614843136_0001 17/07/13 07:40:30 INFO mapreduce.Job: Running job: job_local1614843136_0001 17/07/13 07:40:30 INFO mapred.LocalJobRunner: OutputCommitter set in config null 17/07/13 07:40:30 INFO output.FileOutputCommitter: File Output Committer Algorithm version is 1 17/07/13 07:40:30 INFO mapred.LocalJobRunner: OutputCommitter is org.apache.hadoop.tools.mapred.CopyCommitter 17/07/13 07:40:30 INFO mapred.LocalJobRunner: Waiting for map tasks 17/07/13 07:40:30 INFO mapred.LocalJobRunner: Starting task: attempt_local1614843136_0001_m_000000_0 17/07/13 07:40:30 INFO output.FileOutputCommitter: File Output Committer Algorithm version is 1 17/07/13 07:40:30 INFO mapred.Task: Using ResourceCalculatorProcessTree : [ ] 17/07/13 07:40:30 INFO mapred.MapTask: Processing split: file:/tmp/hadoop-root/mapred/staging/root1675888413/.staging/_distcp1070895476/fileList.seq:0+199 17/07/13 07:40:30 INFO output.FileOutputCommitter: File Output Committer Algorithm version is 1 17/07/13 07:40:30 INFO mapred.CopyMapper: Copying file:/root/testhadoop.4g to s3a://test-bucket/testhadoop.4g 17/07/13 07:40:31 INFO mapred.RetriableFileCopyCommand: Creating temp file: s3a://test-bucket/.distcp.tmp.attempt_local1614843136_0001_m_000000_0 17/07/13 07:40:31 INFO mapreduce.Job: Job job_local1614843136_0001 running in uber mode : false 17/07/13 07:40:31 INFO mapreduce.Job: map 0% reduce 0% 17/07/13 07:40:36 INFO mapred.LocalJobRunner: 10.0% Copying file:/root/testhadoop.4g to s3a://test-bucket/testhadoop.4g 
[411.4M/4.0G] > map 17/07/13 07:40:37 INFO mapreduce.Job: map 100% reduce 0% 17/07/13 07:40:39 INFO mapred.LocalJobRunner: 13.0% Copying file:/root/testhadoop.4g to s3a://test-bucket/testhadoop.4g [533.9M/4.0G] > map 17/07/13 07:40:42 INFO mapred.LocalJobRunner: 16.1% Copying file:/root/testhadoop.4g to s3a://test-bucket/testhadoop.4g [659.4M/4.0G] > map 17/07/13 07:40:45 INFO mapred.LocalJobRunner: 18.4% Copying file:/root/testhadoop.4g to s3a://test-bucket/testhadoop.4g [753.7M/4.0G] > map 17/07/13 07:40:48 INFO mapred.LocalJobRunner: 20.0% Copying file:/root/testhadoop.4g to s3a://test-bucket/testhadoop.4g [820.1M/4.0G] > map [...........] 17/07/13 07:41:54 INFO mapred.LocalJobRunner: 99.8% Copying file:/root/testhadoop.4g to s3a://test-bucket/testhadoop.4g [4.0G/4.0G] > map 17/07/13 07:41:57 INFO mapred.LocalJobRunner: 100.0% Copying file:/root/testhadoop.4g to s3a://test-bucket/testhadoop.4g [4.0G/4.0G] > map [........] 17/07/13 07:44:22 INFO mapred.LocalJobRunner: 100.0% Copying file:/root/testhadoop.4g to s3a://test-bucket/testhadoop.4g [4.0G/4.0G] > map 17/07/13 07:44:23 INFO mapred.LocalJobRunner: 100.0% Copying file:/root/testhadoop.4g to s3a://test-bucket/testhadoop.4g [4.0G/4.0G] > map 17/07/13 07:44:23 INFO mapred.Task: Task:attempt_local1614843136_0001_m_000000_0 is done. 
And is in the process of committing 17/07/13 07:44:23 INFO mapred.LocalJobRunner: 100.0% Copying file:/root/testhadoop.4g to s3a://test-bucket/testhadoop.4g [4.0G/4.0G] > map 17/07/13 07:44:23 INFO mapred.Task: Task attempt_local1614843136_0001_m_000000_0 is allowed to commit now 17/07/13 07:44:23 INFO output.FileOutputCommitter: Saved output of task 'attempt_local1614843136_0001_m_000000_0' to file:/tmp/hadoop-root/mapred/staging/root1675888413/.staging/_distcp1070895476/_logs/_temporary/0/task_local1614843136_0001_m_000000 17/07/13 07:44:23 INFO mapred.LocalJobRunner: 100.0% Copying file:/root/testhadoop.4g to s3a://test-bucket/testhadoop.4g [4.0G/4.0G] 17/07/13 07:44:23 INFO mapred.Task: Task 'attempt_local1614843136_0001_m_000000_0' done. 17/07/13 07:44:23 INFO mapred.LocalJobRunner: Finishing task: attempt_local1614843136_0001_m_000000_0 17/07/13 07:44:23 INFO mapred.LocalJobRunner: map task executor complete. 17/07/13 07:44:23 WARN mapred.CopyCommitter: Unable to cleanup temp files java.lang.NullPointerException at org.apache.hadoop.fs.Path.<init>(Path.java:104) at org.apache.hadoop.fs.Path.<init>(Path.java:93) at org.apache.hadoop.tools.mapred.CopyCommitter.deleteAttemptTempFiles(CopyCommitter.java:136) at org.apache.hadoop.tools.mapred.CopyCommitter.cleanupTempFiles(CopyCommitter.java:126) at org.apache.hadoop.tools.mapred.CopyCommitter.commitJob(CopyCommitter.java:83) at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:537) 17/07/13 07:44:23 INFO mapred.CopyCommitter: Cleaning up temporary work folder: file:/tmp/hadoop-root/mapred/staging/root1675888413/.staging/_distcp1070895476 17/07/13 07:44:23 INFO mapreduce.Job: Job job_local1614843136_0001 completed successfully 17/07/13 07:44:23 INFO mapreduce.Job: Counters: 23 File System Counters FILE: Number of bytes read=4295096718 FILE: Number of bytes written=418550 FILE: Number of read operations=0 FILE: Number of large read operations=0 FILE: Number of write operations=0 S3A: Number of 
bytes read=0 S3A: Number of bytes written=4294967296 S3A: Number of read operations=14 S3A: Number of large read operations=0 S3A: Number of write operations=44 Map-Reduce Framework Map input records=1 Map output records=0 Input split bytes=156 Spilled Records=0 Failed Shuffles=0 Merged Map outputs=0 GC time elapsed (ms)=18 Total committed heap usage (bytes)=249561088 File Input Format Counters Bytes Read=231 File Output Format Counters Bytes Written=8 org.apache.hadoop.tools.mapred.CopyMapper$Counter BYTESCOPIED=4294967296 BYTESEXPECTED=4294967296 COPY=1

3. 5G file:
=====================

# bin/hadoop distcp /root/testhadoop.5g s3a://test-bucket/
17/07/13 07:47:52 INFO tools.DistCp: Input Options: DistCpOptions{atomicCommit=false, syncFolder=false, deleteMissing=false, ignoreFailures=false, maxMaps=20, sslConfigurationFile='null', copyStrategy='uniformsize', sourceFileListing=null, sourcePaths=[/root/testhadoop.5g], targetPath=s3a://test-bucket/, targetPathExists=true, preserveRawXattrs=false} 17/07/13 07:47:52 INFO Configuration.deprecation: session.id is deprecated. Instead, use dfs.metrics.session-id 17/07/13 07:47:52 INFO jvm.JvmMetrics: Initializing JVM Metrics with processName=JobTracker, sessionId= 17/07/13 07:47:52 INFO Configuration.deprecation: io.sort.mb is deprecated. Instead, use mapreduce.task.io.sort.mb 17/07/13 07:47:52 INFO Configuration.deprecation: io.sort.factor is deprecated.
Instead, use mapreduce.task.io.sort.factor 17/07/13 07:47:52 INFO jvm.JvmMetrics: Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized 17/07/13 07:47:52 INFO mapreduce.JobSubmitter: number of splits:1 17/07/13 07:47:52 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_local389646759_0001 17/07/13 07:47:52 INFO mapreduce.Job: The url to track the job: http://localhost:8080/ 17/07/13 07:47:52 INFO tools.DistCp: DistCp job-id: job_local389646759_0001 17/07/13 07:47:52 INFO mapreduce.Job: Running job: job_local389646759_0001 17/07/13 07:47:52 INFO mapred.LocalJobRunner: OutputCommitter set in config null 17/07/13 07:47:52 INFO output.FileOutputCommitter: File Output Committer Algorithm version is 1 17/07/13 07:47:52 INFO mapred.LocalJobRunner: OutputCommitter is org.apache.hadoop.tools.mapred.CopyCommitter 17/07/13 07:47:52 INFO mapred.LocalJobRunner: Waiting for map tasks 17/07/13 07:47:52 INFO mapred.LocalJobRunner: Starting task: attempt_local389646759_0001_m_000000_0 17/07/13 07:47:53 INFO output.FileOutputCommitter: File Output Committer Algorithm version is 1 17/07/13 07:47:53 INFO mapred.Task: Using ResourceCalculatorProcessTree : [ ] 17/07/13 07:47:53 INFO mapred.MapTask: Processing split: file:/tmp/hadoop-root/mapred/staging/root1638366799/.staging/_distcp-524910745/fileList.seq:0+199 17/07/13 07:47:53 INFO output.FileOutputCommitter: File Output Committer Algorithm version is 1 17/07/13 07:47:53 INFO mapred.CopyMapper: Copying file:/root/testhadoop.5g to s3a://test-bucket/testhadoop.5g 17/07/13 07:47:53 INFO mapred.RetriableFileCopyCommand: Creating temp file: s3a://test-bucket/.distcp.tmp.attempt_local389646759_0001_m_000000_0 17/07/13 07:47:53 INFO mapreduce.Job: Job job_local389646759_0001 running in uber mode : false 17/07/13 07:47:53 INFO mapreduce.Job: map 0% reduce 0% 17/07/13 07:47:59 INFO mapred.LocalJobRunner: 8.9% Copying file:/root/testhadoop.5g to s3a://test-bucket/testhadoop.5g [454.1M/5.0G] > map 
17/07/13 07:47:59 INFO mapreduce.Job: map 100% reduce 0% 17/07/13 07:48:02 INFO mapred.LocalJobRunner: 10.6% Copying file:/root/testhadoop.5g to s3a://test-bucket/testhadoop.5g [543.6M/5.0G] > map 17/07/13 07:48:05 INFO mapred.LocalJobRunner: 12.6% Copying file:/root/testhadoop.5g to s3a://test-bucket/testhadoop.5g [644.8M/5.0G] > map 17/07/13 07:48:08 INFO mapred.LocalJobRunner: 17.7% Copying file:/root/testhadoop.5g to s3a://test-bucket/testhadoop.5g [907.8M/5.0G] > map [........] 17/07/13 07:49:29 INFO mapred.LocalJobRunner: 92.6% Copying file:/root/testhadoop.5g to s3a://test-bucket/testhadoop.5g [4.6G/5.0G] > map 17/07/13 07:49:32 INFO mapred.LocalJobRunner: 95.3% Copying file:/root/testhadoop.5g to s3a://test-bucket/testhadoop.5g [4.8G/5.0G] > map 17/07/13 07:49:35 INFO mapred.LocalJobRunner: 97.9% Copying file:/root/testhadoop.5g to s3a://test-bucket/testhadoop.5g [4.9G/5.0G] > map 17/07/13 07:49:38 INFO mapred.LocalJobRunner: 100.0% Copying file:/root/testhadoop.5g to s3a://test-bucket/testhadoop.5g [5.0G/5.0G] > map 17/07/13 07:49:41 INFO mapred.LocalJobRunner: 100.0% Copying file:/root/testhadoop.5g to s3a://test-bucket/testhadoop.5g [5.0G/5.0G] > map 17/07/13 07:49:44 INFO mapred.LocalJobRunner: 100.0% Copying file:/root/testhadoop.5g to s3a://test-bucket/testhadoop.5g [5.0G/5.0G] > map [.........] 17/07/13 07:52:38 INFO mapred.LocalJobRunner: 100.0% Copying file:/root/testhadoop.5g to s3a://test-bucket/testhadoop.5g [5.0G/5.0G] > map 17/07/13 07:52:40 INFO mapred.LocalJobRunner: 100.0% Copying file:/root/testhadoop.5g to s3a://test-bucket/testhadoop.5g [5.0G/5.0G] > map 17/07/13 07:52:40 INFO mapred.Task: Task:attempt_local389646759_0001_m_000000_0 is done. 
And is in the process of committing 17/07/13 07:52:40 INFO mapred.LocalJobRunner: 100.0% Copying file:/root/testhadoop.5g to s3a://test-bucket/testhadoop.5g [5.0G/5.0G] > map 17/07/13 07:52:40 INFO mapred.Task: Task attempt_local389646759_0001_m_000000_0 is allowed to commit now 17/07/13 07:52:40 INFO output.FileOutputCommitter: Saved output of task 'attempt_local389646759_0001_m_000000_0' to file:/tmp/hadoop-root/mapred/staging/root1638366799/.staging/_distcp-524910745/_logs/_temporary/0/task_local389646759_0001_m_000000 17/07/13 07:52:40 INFO mapred.LocalJobRunner: 100.0% Copying file:/root/testhadoop.5g to s3a://test-bucket/testhadoop.5g [5.0G/5.0G] 17/07/13 07:52:40 INFO mapred.Task: Task 'attempt_local389646759_0001_m_000000_0' done. 17/07/13 07:52:40 INFO mapred.LocalJobRunner: Finishing task: attempt_local389646759_0001_m_000000_0 17/07/13 07:52:40 INFO mapred.LocalJobRunner: map task executor complete. 17/07/13 07:52:40 WARN mapred.CopyCommitter: Unable to cleanup temp files java.lang.NullPointerException at org.apache.hadoop.fs.Path.<init>(Path.java:104) at org.apache.hadoop.fs.Path.<init>(Path.java:93) at org.apache.hadoop.tools.mapred.CopyCommitter.deleteAttemptTempFiles(CopyCommitter.java:136) at org.apache.hadoop.tools.mapred.CopyCommitter.cleanupTempFiles(CopyCommitter.java:126) at org.apache.hadoop.tools.mapred.CopyCommitter.commitJob(CopyCommitter.java:83) at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:537) 17/07/13 07:52:40 INFO mapred.CopyCommitter: Cleaning up temporary work folder: file:/tmp/hadoop-root/mapred/staging/root1638366799/.staging/_distcp-524910745 17/07/13 07:52:41 INFO mapreduce.Job: Job job_local389646759_0001 completed successfully 17/07/13 07:52:41 INFO mapreduce.Job: Counters: 23 File System Counters FILE: Number of bytes read=5368838542 FILE: Number of bytes written=416986 FILE: Number of read operations=0 FILE: Number of large read operations=0 FILE: Number of write operations=0 S3A: Number of bytes 
read=0 S3A: Number of bytes written=5368709120 S3A: Number of read operations=14 S3A: Number of large read operations=0 S3A: Number of write operations=55 Map-Reduce Framework Map input records=1 Map output records=0 Input split bytes=156 Spilled Records=0 Failed Shuffles=0 Merged Map outputs=0 GC time elapsed (ms)=27 Total committed heap usage (bytes)=247463936 File Input Format Counters Bytes Read=231 File Output Format Counters Bytes Written=8 org.apache.hadoop.tools.mapred.CopyMapper$Counter BYTESCOPIED=5368709120 BYTESEXPECTED=5368709120 COPY=1

$ s3cmd ls s3://test-bucket
[....]
2017-07-13 01:59        6198   s3://test-bucket/rgw.bucket.list
2017-07-13 02:14  4294967296   s3://test-bucket/testhadoop.4g
2017-07-13 02:22  5368709120   s3://test-bucket/testhadoop.5g
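The 5G boundary in these results lines up with Amazon S3's documented 5 GiB ceiling for a single-request copy: for larger objects the AWS SDK's TransferManager switches to a multipart copy, which matches the copyPart/CopyMonitor frames in the stack trace above, and that is the path which 404s against RGW here. A small sketch of that threshold decision (the helper name is ours; the SDK's real logic is in Java and the exact cutover size depends on its configuration):

```shell
# copy_path <bytes>: hypothetical helper mirroring the SDK's choice between
# a single-request copy and a multipart (copy-part) copy at the 5 GiB limit.
copy_path() {
  five_gib=$((5 * 1024 * 1024 * 1024))
  if [ "$1" -gt "$five_gib" ]; then
    echo multipart
  else
    echo single
  fi
}
copy_path $((4 * 1024 * 1024 * 1024))   # 4G file -> single    (succeeded)
copy_path $((5 * 1024 * 1024 * 1024))   # 5G file -> single    (succeeded)
copy_path $((7 * 1024 * 1024 * 1024))   # 7G file -> multipart (failed)
```

This is consistent with the sizes above: testhadoop.5g is exactly 5368709120 bytes (5 GiB) and still copies in one request, while 7G crosses the limit and exercises the multipart copy path.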
> Default RGW logs for the 7G test with version 2.3 - 10.2.7-27.el7cp:

2017-07-13 18:29:07.573165 7fcdcf7e6700 1 ====== req done req=0x7fcdcf7e0710 op status=0 http_status=200 ====== 2017-07-13 18:29:07.573217 7fcdcf7e6700 1 civetweb: 0x7fce54000bb0: 10.74.128.32 - - [13/Jul/2017:18:29:07 +0530] "POST /test-bucket/.distcp.tmp.attempt_local375793309_0001_m_000000_0 HTTP/1.1" 200 0 - aws-sdk-java/1.7.4 Linux/3.10.0-510.el7.x86_64 OpenJDK_64-Bit_Server_VM/25.131-b12/1.8.0_131 2017-07-13 18:29:08.291593 7fcdccfe1700 1 ====== starting new request req=0x7fcdccfdb710 ===== 2017-07-13 18:29:08.292490 7fcdccfe1700 1 ====== req done req=0x7fcdccfdb710 op status=0 http_status=404 ====== 2017-07-13 18:29:08.292533 7fcdccfe1700 1 civetweb: 0x7fce74003a80: 10.74.128.32 - - [13/Jul/2017:18:29:08 +0530] "HEAD /test-bucket/testhadoop.7G/ HTTP/1.1" 404 0 - aws-sdk-java/1.7.4 Linux/3.10.0-510.el7.x86_64 OpenJDK_64-Bit_Server_VM/25.131-b12/1.8.0_131 2017-07-13 18:29:08.300035 7fcdcd7e2700 1 ====== starting new request req=0x7fcdcd7dc710 ===== 2017-07-13 18:29:08.301197 7fcdcd7e2700 1 ====== req done req=0x7fcdcd7dc710 op status=0 http_status=200 ====== 2017-07-13 18:29:08.301232 7fcdcd7e2700 1 civetweb: 0x7fce6c003b80: 10.74.128.32 - - [13/Jul/2017:18:29:08 +0530] "GET /test-bucket/ HTTP/1.1" 200 0 - aws-sdk-java/1.7.4 Linux/3.10.0-510.el7.x86_64 OpenJDK_64-Bit_Server_VM/25.131-b12/1.8.0_131 2017-07-13 18:29:08.308735 7fcdcb7de700 1 ====== starting new request req=0x7fcdcb7d8710 ===== 2017-07-13 18:29:08.309503 7fcdcb7de700 1 ====== req done req=0x7fcdcb7d8710 op status=0 http_status=404 ====== 2017-07-13 18:29:08.309539 7fcdcb7de700 1 civetweb: 0x7fce880074c0: 10.74.128.32 - - [13/Jul/2017:18:29:08 +0530] "HEAD /test-bucket/testhadoop.7G HTTP/1.1" 404 0 - aws-sdk-java/1.7.4 Linux/3.10.0-510.el7.x86_64 OpenJDK_64-Bit_Server_VM/25.131-b12/1.8.0_131 2017-07-13 18:29:08.315145 7fcdc97da700 1 ====== starting new request req=0x7fcdc97d4710 ===== 2017-07-13 18:29:08.316262 7fcdc97da700 1 ====== 
req done req=0x7fcdc97d4710 op status=0 http_status=200 ====== 2017-07-13 18:29:08.316298 7fcdc97da700 1 civetweb: 0x7fce980074c0: 10.74.128.32 - - [13/Jul/2017:18:29:08 +0530] "GET /test-bucket/ HTTP/1.1" 200 0 - aws-sdk-java/1.7.4 Linux/3.10.0-510.el7.x86_64 OpenJDK_64-Bit_Server_VM/25.131-b12/1.8.0_131 2017-07-13 18:29:08.323168 7fcdcbfdf700 1 ====== starting new request req=0x7fcdcbfd9710 ===== 2017-07-13 18:29:08.324512 7fcdcbfdf700 1 ====== req done req=0x7fcdcbfd9710 op status=0 http_status=200 ====== 2017-07-13 18:29:08.324551 7fcdcbfdf700 1 civetweb: 0x7fce7c003300: 10.74.128.32 - - [13/Jul/2017:18:29:08 +0530] "HEAD /test-bucket/.distcp.tmp.attempt_local375793309_0001_m_000000_0 HTTP/1.1" 200 0 - aws-sdk-java/1.7.4 Linux/3.10.0-510.el7.x86_64 OpenJDK_64-Bit_Server_VM/25.131-b12/1.8.0_131 2017-07-13 18:29:08.355642 7fcdcafdd700 1 ====== starting new request req=0x7fcdcafd7710 ===== 2017-07-13 18:29:08.355672 7fcdc9fdb700 1 ====== starting new request req=0x7fcdc9fd5710 ===== 2017-07-13 18:29:08.355687 7fcdc4fd1700 1 ====== starting new request req=0x7fcdc4fcb710 ===== 2017-07-13 18:29:08.355700 7fcdc47d0700 1 ====== starting new request req=0x7fcdc47ca710 ===== 2017-07-13 18:29:08.355709 7fcdc57d2700 1 ====== starting new request req=0x7fcdc57cc710 ===== 2017-07-13 18:29:08.355897 7fcdc8fd9700 1 ====== starting new request req=0x7fcdc8fd3710 ===== 2017-07-13 18:29:08.355949 7fcdc87d8700 1 ====== starting new request req=0x7fcdc87d2710 ===== 2017-07-13 18:29:08.356689 7fcdcafdd700 1 ====== req done req=0x7fcdcafd7710 op status=-2 http_status=404 ====== 2017-07-13 18:29:08.356727 7fcdcafdd700 1 civetweb: 0x7fce84007530: 10.74.128.32 - - [13/Jul/2017:18:29:08 +0530] "PUT /test-bucket/testhadoop.7G HTTP/1.1" 404 0 - aws-sdk-java/1.7.4 Linux/3.10.0-510.el7.x86_64 OpenJDK_64-Bit_Server_VM/25.131-b12/1.8.0_131 2017-07-13 18:29:08.356787 7fcdc57d2700 1 ====== req done req=0x7fcdc57cc710 op status=-2 http_status=404 ====== 2017-07-13 18:29:08.356826 7fcdc57d2700 1 
civetweb: 0x7fcde0000ea0: 10.74.128.32 - - [13/Jul/2017:18:29:08 +0530] "PUT /test-bucket/testhadoop.7G HTTP/1.1" 404 0 - aws-sdk-java/1.7.4 Linux/3.10.0-510.el7.x86_64 OpenJDK_64-Bit_Server_VM/25.131-b12/1.8.0_131 2017-07-13 18:29:08.357093 7fcdc47d0700 1 ====== req done req=0x7fcdc47ca710 op status=-2 http_status=404 ====== 2017-07-13 18:29:08.357169 7fcdc47d0700 1 civetweb: 0x7fcde80050f0: 10.74.128.32 - - [13/Jul/2017:18:29:08 +0530] "PUT /test-bucket/testhadoop.7G HTTP/1.1" 404 0 - aws-sdk-java/1.7.4 Linux/3.10.0-510.el7.x86_64 OpenJDK_64-Bit_Server_VM/25.131-b12/1.8.0_131 2017-07-13 18:29:08.357180 7fcdc9fdb700 1 ====== req done req=0x7fcdc9fd5710 op status=-2 http_status=404 ====== 2017-07-13 18:29:08.357211 7fcdc9fdb700 1 civetweb: 0x7fce8c007c10: 10.74.128.32 - - [13/Jul/2017:18:29:08 +0530] "PUT /test-bucket/testhadoop.7G HTTP/1.1" 404 0 - aws-sdk-java/1.7.4 Linux/3.10.0-510.el7.x86_64 OpenJDK_64-Bit_Server_VM/25.131-b12/1.8.0_131 2017-07-13 18:29:08.357239 7fcdc4fd1700 1 ====== req done req=0x7fcdc4fcb710 op status=-2 http_status=404 ====== 2017-07-13 18:29:08.357277 7fcdc4fd1700 1 civetweb: 0x7fcddc000ea0: 10.74.128.32 - - [13/Jul/2017:18:29:08 +0530] "PUT /test-bucket/testhadoop.7G HTTP/1.1" 404 0 - aws-sdk-java/1.7.4 Linux/3.10.0-510.el7.x86_64 OpenJDK_64-Bit_Server_VM/25.131-b12/1.8.0_131 2017-07-13 18:29:08.357404 7fcdc87d8700 1 ====== req done req=0x7fcdc87d2710 op status=-2 http_status=404 ====== 2017-07-13 18:29:08.357440 7fcdc87d8700 1 civetweb: 0x7fcea0005910: 10.74.128.32 - - [13/Jul/2017:18:29:08 +0530] "PUT /test-bucket/testhadoop.7G HTTP/1.1" 404 0 - aws-sdk-java/1.7.4 Linux/3.10.0-510.el7.x86_64 OpenJDK_64-Bit_Server_VM/25.131-b12/1.8.0_131 2017-07-13 18:29:08.357531 7fcdc8fd9700 1 ====== req done req=0x7fcdc8fd3710 op status=-2 http_status=404 ====== 2017-07-13 18:29:08.357563 7fcdc8fd9700 1 civetweb: 0x7fce940009b0: 10.74.128.32 - - [13/Jul/2017:18:29:08 +0530] "PUT /test-bucket/testhadoop.7G HTTP/1.1" 404 0 - aws-sdk-java/1.7.4 
Linux/3.10.0-510.el7.x86_64 OpenJDK_64-Bit_Server_VM/25.131-b12/1.8.0_131 2017-07-13 18:29:08.362042 7fcdc77d6700 1 ====== starting new request req=0x7fcdc77d0710 ===== 2017-07-13 18:29:08.362877 7fcdc77d6700 1 ====== req done req=0x7fcdc77d0710 op status=-2 http_status=404 ====== 2017-07-13 18:29:08.362913 7fcdc77d6700 1 civetweb: 0x7fcea40064a0: 10.74.128.32 - - [13/Jul/2017:18:29:08 +0530] "PUT /test-bucket/testhadoop.7G HTTP/1.1" 404 0 - aws-sdk-java/1.7.4 Linux/3.10.0-510.el7.x86_64 OpenJDK_64-Bit_Server_VM/25.131-b12/1.8.0_131 2017-07-13 18:29:08.365419 7fcdc7fd7700 1 ====== starting new request req=0x7fcdc7fd1710 ===== 2017-07-13 18:29:08.366235 7fcdc7fd7700 1 ====== req done req=0x7fcdc7fd1710 op status=-2 http_status=404 ====== 2017-07-13 18:29:08.366270 7fcdc7fd7700 1 civetweb: 0x7fce9c01eb70: 10.74.128.32 - - [13/Jul/2017:18:29:08 +0530] "PUT /test-bucket/testhadoop.7G HTTP/1.1" 404 0 - aws-sdk-java/1.7.4 Linux/3.10.0-510.el7.x86_64 OpenJDK_64-Bit_Server_VM/25.131-b12/1.8.0_131 2017-07-13 18:29:08.367322 7fcdbffc7700 1 ====== starting new request req=0x7fcdbffc1710 ===== 2017-07-13 18:29:08.368166 7fcdbffc7700 1 ====== req done req=0x7fcdbffc1710 op status=-2 http_status=404 ====== 2017-07-13 18:29:08.368202 7fcdbffc7700 1 civetweb: 0x7fce04003f70: 10.74.128.32 - - [13/Jul/2017:18:29:08 +0530] "PUT /test-bucket/testhadoop.7G HTTP/1.1" 404 0 - aws-sdk-java/1.7.4 Linux/3.10.0-510.el7.x86_64 OpenJDK_64-Bit_Server_VM/25.131-b12/1.8.0_131 2017-07-13 18:29:08.368463 7fcdc6fd5700 1 ====== starting new request req=0x7fcdc6fcf710 ===== 2017-07-13 18:29:08.369240 7fcdc6fd5700 1 ====== req done req=0x7fcdc6fcf710 op status=-2 http_status=404 ====== 2017-07-13 18:29:08.369276 7fcdc6fd5700 1 civetweb: 0x7fceac0009b0: 10.74.128.32 - - [13/Jul/2017:18:29:08 +0530] "PUT /test-bucket/testhadoop.7G HTTP/1.1" 404 0 - aws-sdk-java/1.7.4 Linux/3.10.0-510.el7.x86_64 OpenJDK_64-Bit_Server_VM/25.131-b12/1.8.0_131 2017-07-13 18:29:08.371327 7fcdc67d4700 1 ====== starting new 
request req=0x7fcdc67ce710 ===== 2017-07-13 18:29:08.371462 7fcdc3fcf700 1 ====== starting new request req=0x7fcdc3fc9710 ===== 2017-07-13 18:29:08.372086 7fcdc67d4700 1 ====== req done req=0x7fcdc67ce710 op status=-2 http_status=404 ====== 2017-07-13 18:29:08.372131 7fcdc67d4700 1 civetweb: 0x7fceb4000bd0: 10.74.128.32 - - [13/Jul/2017:18:29:08 +0530] "PUT /test-bucket/testhadoop.7G HTTP/1.1" 404 0 - aws-sdk-java/1.7.4 Linux/3.10.0-510.el7.x86_64 OpenJDK_64-Bit_Server_VM/25.131-b12/1.8.0_131 2017-07-13 18:29:08.372307 7fcdc3fcf700 1 ====== req done req=0x7fcdc3fc9710 op status=-2 http_status=404 ====== 2017-07-13 18:29:08.372345 7fcdc3fcf700 1 civetweb: 0x7fcde40050f0: 10.74.128.32 - - [13/Jul/2017:18:29:08 +0530] "PUT /test-bucket/testhadoop.7G HTTP/1.1" 404 0 - aws-sdk-java/1.7.4 Linux/3.10.0-510.el7.x86_64 OpenJDK_64-Bit_Server_VM/25.131-b12/1.8.0_131 2017-07-13 18:29:08.372444 7fcdc5fd3700 1 ====== starting new request req=0x7fcdc5fcd710 ===== 2017-07-13 18:29:08.373164 7fcdc5fd3700 1 ====== req done req=0x7fcdc5fcd710 op status=-2 http_status=404 ====== 2017-07-13 18:29:08.373200 7fcdc5fd3700 1 civetweb: 0x7fced3ee4860: 10.74.128.32 - - [13/Jul/2017:18:29:08 +0530] "PUT /test-bucket/testhadoop.7G HTTP/1.1" 404 0 - aws-sdk-java/1.7.4 Linux/3.10.0-510.el7.x86_64 OpenJDK_64-Bit_Server_VM/25.131-b12/1.8.0_131 2017-07-13 18:29:08.378685 7fcdc17ca700 1 ====== starting new request req=0x7fcdc17c4710 ===== 2017-07-13 18:29:08.379524 7fcdc17ca700 1 ====== req done req=0x7fcdc17c4710 op status=-2 http_status=404 ====== 2017-07-13 18:29:08.379568 7fcdc17ca700 1 civetweb: 0x7fce000050f0: 10.74.128.32 - - [13/Jul/2017:18:29:08 +0530] "PUT /test-bucket/testhadoop.7G HTTP/1.1" 404 0 - aws-sdk-java/1.7.4 Linux/3.10.0-510.el7.x86_64 OpenJDK_64-Bit_Server_VM/25.131-b12/1.8.0_131 2017-07-13 18:29:08.380330 7fcdc07c8700 1 ====== starting new request req=0x7fcdc07c2710 ===== 2017-07-13 18:29:08.381155 7fcdc07c8700 1 ====== req done req=0x7fcdc07c2710 op status=-2 http_status=404 
====== 2017-07-13 18:29:08.381193 7fcdc07c8700 1 civetweb: 0x7fce08000bb0: 10.74.128.32 - - [13/Jul/2017:18:29:08 +0530] "PUT /test-bucket/testhadoop.7G HTTP/1.1" 404 0 - aws-sdk-java/1.7.4 Linux/3.10.0-510.el7.x86_64 OpenJDK_64-Bit_Server_VM/25.131-b12/1.8.0_131 2017-07-13 18:29:08.381462 7fcdc2fcd700 1 ====== starting new request req=0x7fcdc2fc7710 ===== 2017-07-13 18:29:08.381911 7fcdc1fcb700 1 ====== starting new request req=0x7fcdc1fc5710 ===== 2017-07-13 18:29:08.382264 7fcdc2fcd700 1 ====== req done req=0x7fcdc2fc7710 op status=-2 http_status=404 ====== 2017-07-13 18:29:08.382307 7fcdc2fcd700 1 civetweb: 0x7fcdec0050f0: 10.74.128.32 - - [13/Jul/2017:18:29:08 +0530] "PUT /test-bucket/testhadoop.7G HTTP/1.1" 404 0 - aws-sdk-java/1.7.4 Linux/3.10.0-510.el7.x86_64 OpenJDK_64-Bit_Server_VM/25.131-b12/1.8.0_131 2017-07-13 18:29:08.382575 7fcdc1fcb700 1 ====== req done req=0x7fcdc1fc5710 op status=-2 http_status=404 ====== 2017-07-13 18:29:08.382611 7fcdc1fcb700 1 civetweb: 0x7fcdf40050f0: 10.74.128.32 - - [13/Jul/2017:18:29:08 +0530] "PUT /test-bucket/testhadoop.7G HTTP/1.1" 404 0 - aws-sdk-java/1.7.4 Linux/3.10.0-510.el7.x86_64 OpenJDK_64-Bit_Server_VM/25.131-b12/1.8.0_131 2017-07-13 18:29:08.389254 7fcdbf7c6700 1 ====== starting new request req=0x7fcdbf7c0710 ===== 2017-07-13 18:29:08.390090 7fcdbf7c6700 1 ====== req done req=0x7fcdbf7c0710 op status=-2 http_status=404 ====== 2017-07-13 18:29:08.390140 7fcdbf7c6700 1 civetweb: 0x7fce100050f0: 10.74.128.32 - - [13/Jul/2017:18:29:08 +0530] "PUT /test-bucket/testhadoop.7G HTTP/1.1" 404 0 - aws-sdk-java/1.7.4 Linux/3.10.0-510.el7.x86_64 OpenJDK_64-Bit_Server_VM/25.131-b12/1.8.0_131 2017-07-13 18:29:08.391610 7fcdc0fc9700 1 ====== starting new request req=0x7fcdc0fc3710 ===== 2017-07-13 18:29:08.392560 7fcdc0fc9700 1 ====== req done req=0x7fcdc0fc3710 op status=-2 http_status=404 ====== 2017-07-13 18:29:08.392603 7fcdc0fc9700 1 civetweb: 0x7fcdfc004170: 10.74.128.32 - - [13/Jul/2017:18:29:08 +0530] "PUT 
/test-bucket/testhadoop.7G HTTP/1.1" 404 0 - aws-sdk-java/1.7.4 Linux/3.10.0-510.el7.x86_64 OpenJDK_64-Bit_Server_VM/25.131-b12/1.8.0_131 2017-07-13 18:29:08.393855 7fcdc27cc700 1 ====== starting new request req=0x7fcdc27c6710 ===== 2017-07-13 18:29:08.394647 7fcdc27cc700 1 ====== req done req=0x7fcdc27c6710 op status=-2 http_status=404 ====== 2017-07-13 18:29:08.394687 7fcdc27cc700 1 civetweb: 0x7fcdf80050f0: 10.74.128.32 - - [13/Jul/2017:18:29:08 +0530] "PUT /test-bucket/testhadoop.7G HTTP/1.1" 404 0 - aws-sdk-java/1.7.4 Linux/3.10.0-510.el7.x86_64 OpenJDK_64-Bit_Server_VM/25.131-b12/1.8.0_131 2017-07-13 18:29:08.394921 7fcdc37ce700 1 ====== starting new request req=0x7fcdc37c8710 ===== 2017-07-13 18:29:08.395643 7fcdc37ce700 1 ====== req done req=0x7fcdc37c8710 op status=-2 http_status=404 ====== 2017-07-13 18:29:08.395679 7fcdc37ce700 1 civetweb: 0x7fcdf00050f0: 10.74.128.32 - - [13/Jul/2017:18:29:08 +0530] "PUT /test-bucket/testhadoop.7G HTTP/1.1" 404 0 - aws-sdk-java/1.7.4 Linux/3.10.0-510.el7.x86_64 OpenJDK_64-Bit_Server_VM/25.131-b12/1.8.0_131 2017-07-13 18:29:08.399502 7fcdbdfc3700 1 ====== starting new request req=0x7fcdbdfbd710 ===== 2017-07-13 18:29:08.400386 7fcdbdfc3700 1 ====== req done req=0x7fcdbdfbd710 op status=-2 http_status=404 ====== 2017-07-13 18:29:08.400391 7fcdbefc5700 1 ====== starting new request req=0x7fcdbefbf710 ===== 2017-07-13 18:29:08.400428 7fcdbdfc3700 1 civetweb: 0x7fce140050f0: 10.74.128.32 - - [13/Jul/2017:18:29:08 +0530] "PUT /test-bucket/testhadoop.7G HTTP/1.1" 404 0 - aws-sdk-java/1.7.4 Linux/3.10.0-510.el7.x86_64 OpenJDK_64-Bit_Server_VM/25.131-b12/1.8.0_131 2017-07-13 18:29:08.401189 7fcdbefc5700 1 ====== req done req=0x7fcdbefbf710 op status=-2 http_status=404 ====== 2017-07-13 18:29:08.401227 7fcdbefc5700 1 civetweb: 0x7fce0c0050f0: 10.74.128.32 - - [13/Jul/2017:18:29:08 +0530] "PUT /test-bucket/testhadoop.7G HTTP/1.1" 404 0 - aws-sdk-java/1.7.4 Linux/3.10.0-510.el7.x86_64 OpenJDK_64-Bit_Server_VM/25.131-b12/1.8.0_131 
2017-07-13 18:29:08.410795 7fcdbe7c4700 1 ====== starting new request req=0x7fcdbe7be710 ===== 2017-07-13 18:29:08.410805 7fcdbd7c2700 1 ====== starting new request req=0x7fcdbd7bc710 ===== 2017-07-13 18:29:08.410819 7fcdbcfc1700 1 ====== starting new request req=0x7fcdbcfbb710 ===== 2017-07-13 18:29:08.411692 7fcdbcfc1700 1 ====== req done req=0x7fcdbcfbb710 op status=-2 http_status=404 ====== 2017-07-13 18:29:08.411728 7fcdbcfc1700 1 civetweb: 0x7fce24004e20: 10.74.128.32 - - [13/Jul/2017:18:29:08 +0530] "PUT /test-bucket/testhadoop.7G HTTP/1.1" 404 0 - aws-sdk-java/1.7.4 Linux/3.10.0-510.el7.x86_64 OpenJDK_64-Bit_Server_VM/25.131-b12/1.8.0_131 2017-07-13 18:29:08.411854 7fcdbe7c4700 1 ====== req done req=0x7fcdbe7be710 op status=-2 http_status=404 ====== 2017-07-13 18:29:08.411905 7fcdbe7c4700 1 civetweb: 0x7fce180050f0: 10.74.128.32 - - [13/Jul/2017:18:29:08 +0530] "PUT /test-bucket/testhadoop.7G HTTP/1.1" 404 0 - aws-sdk-java/1.7.4 Linux/3.10.0-510.el7.x86_64 OpenJDK_64-Bit_Server_VM/25.131-b12/1.8.0_131 2017-07-13 18:29:08.411970 7fcdbd7c2700 1 ====== req done req=0x7fcdbd7bc710 op status=-2 http_status=404 ====== 2017-07-13 18:29:08.412000 7fcdbd7c2700 1 civetweb: 0x7fce1c000cf0: 10.74.128.32 - - [13/Jul/2017:18:29:08 +0530] "PUT /test-bucket/testhadoop.7G HTTP/1.1" 404 0 - aws-sdk-java/1.7.4 Linux/3.10.0-510.el7.x86_64 OpenJDK_64-Bit_Server_VM/25.131-b12/1.8.0_131 2017-07-13 18:29:08.414004 7fcdbc7c0700 1 ====== starting new request req=0x7fcdbc7ba710 ===== 2017-07-13 18:29:08.414009 7fcdbbfbf700 1 ====== starting new request req=0x7fcdbbfb9710 ===== 2017-07-13 18:29:08.414757 7fcdbbfbf700 1 ====== req done req=0x7fcdbbfb9710 op status=-2 http_status=404 ====== 2017-07-13 18:29:08.414793 7fcdbbfbf700 1 civetweb: 0x7fce2c0050f0: 10.74.128.32 - - [13/Jul/2017:18:29:08 +0530] "PUT /test-bucket/testhadoop.7G HTTP/1.1" 404 0 - aws-sdk-java/1.7.4 Linux/3.10.0-510.el7.x86_64 OpenJDK_64-Bit_Server_VM/25.131-b12/1.8.0_131 2017-07-13 18:29:08.414943 7fcdbc7c0700 1 
====== req done req=0x7fcdbc7ba710 op status=-2 http_status=404 ====== 2017-07-13 18:29:08.414981 7fcdbc7c0700 1 civetweb: 0x7fce200050f0: 10.74.128.32 - - [13/Jul/2017:18:29:08 +0530] "PUT /test-bucket/testhadoop.7G HTTP/1.1" 404 0 - aws-sdk-java/1.7.4 Linux/3.10.0-510.el7.x86_64 OpenJDK_64-Bit_Server_VM/25.131-b12/1.8.0_131 2017-07-13 18:29:08.423902 7fcdbafbd700 1 ====== starting new request req=0x7fcdbafb7710 ===== 2017-07-13 18:29:08.423911 7fcdb7fb7700 1 ====== starting new request req=0x7fcdb7fb1710 ===== 2017-07-13 18:29:08.423927 7fcdbb7be700 1 ====== starting new request req=0x7fcdbb7b8710 ===== 2017-07-13 18:29:08.423940 7fcdb9fbb700 1 ====== starting new request req=0x7fcdb9fb5710 ===== 2017-07-13 18:29:08.423943 7fcdba7bc700 1 ====== starting new request req=0x7fcdba7b6710 ===== 2017-07-13 18:29:08.424875 7fcdb7fb7700 1 ====== req done req=0x7fcdb7fb1710 op status=-2 http_status=404 ====== 2017-07-13 18:29:08.424914 7fcdb7fb7700 1 civetweb: 0x7fce440050f0: 10.74.128.32 - - [13/Jul/2017:18:29:08 +0530] "PUT /test-bucket/testhadoop.7G HTTP/1.1" 404 0 - aws-sdk-java/1.7.4 Linux/3.10.0-510.el7.x86_64 OpenJDK_64-Bit_Server_VM/25.131-b12/1.8.0_131 2017-07-13 18:29:08.425033 7fcdbafbd700 1 ====== req done req=0x7fcdbafb7710 op status=-2 http_status=404 ====== 2017-07-13 18:29:08.425071 7fcdbafbd700 1 civetweb: 0x7fce340050f0: 10.74.128.32 - - [13/Jul/2017:18:29:08 +0530] "PUT /test-bucket/testhadoop.7G HTTP/1.1" 404 0 - aws-sdk-java/1.7.4 Linux/3.10.0-510.el7.x86_64 OpenJDK_64-Bit_Server_VM/25.131-b12/1.8.0_131 2017-07-13 18:29:08.425200 7fcdbb7be700 1 ====== req done req=0x7fcdbb7b8710 op status=-2 http_status=404 ====== 2017-07-13 18:29:08.425235 7fcdbb7be700 1 civetweb: 0x7fce28002090: 10.74.128.32 - - [13/Jul/2017:18:29:08 +0530] "PUT /test-bucket/testhadoop.7G HTTP/1.1" 404 0 - aws-sdk-java/1.7.4 Linux/3.10.0-510.el7.x86_64 OpenJDK_64-Bit_Server_VM/25.131-b12/1.8.0_131 2017-07-13 18:29:08.425415 7fcdb9fbb700 1 ====== req done req=0x7fcdb9fb5710 op 
status=-2 http_status=404 ====== 2017-07-13 18:29:08.425474 7fcdb9fbb700 1 civetweb: 0x7fce380050f0: 10.74.128.32 - - [13/Jul/2017:18:29:08 +0530] "PUT /test-bucket/testhadoop.7G HTTP/1.1" 404 0 - aws-sdk-java/1.7.4 Linux/3.10.0-510.el7.x86_64 OpenJDK_64-Bit_Server_VM/25.131-b12/1.8.0_131 2017-07-13 18:29:08.425485 7fcdba7bc700 1 ====== req done req=0x7fcdba7b6710 op status=-2 http_status=404 ====== 2017-07-13 18:29:08.425529 7fcdba7bc700 1 civetweb: 0x7fce30001980: 10.74.128.32 - - [13/Jul/2017:18:29:08 +0530] "PUT /test-bucket/testhadoop.7G HTTP/1.1" 404 0 - aws-sdk-java/1.7.4 Linux/3.10.0-510.el7.x86_64 OpenJDK_64-Bit_Server_VM/25.131-b12/1.8.0_131 2017-07-13 18:29:08.432289 7fcdb87b8700 1 ====== starting new request req=0x7fcdb87b2710 ===== 2017-07-13 18:29:08.432651 7fcdb8fb9700 1 ====== starting new request req=0x7fcdb8fb3710 ===== 2017-07-13 18:29:08.433166 7fcdb87b8700 1 ====== req done req=0x7fcdb87b2710 op status=-2 http_status=404 ====== 2017-07-13 18:29:08.433205 7fcdb87b8700 1 civetweb: 0x7fce480050f0: 10.74.128.32 - - [13/Jul/2017:18:29:08 +0530] "PUT /test-bucket/testhadoop.7G HTTP/1.1" 404 0 - aws-sdk-java/1.7.4 Linux/3.10.0-510.el7.x86_64 OpenJDK_64-Bit_Server_VM/25.131-b12/1.8.0_131 2017-07-13 18:29:08.433426 7fcdb8fb9700 1 ====== req done req=0x7fcdb8fb3710 op status=-2 http_status=404 ====== 2017-07-13 18:29:08.433466 7fcdb8fb9700 1 civetweb: 0x7fce3c0050f0: 10.74.128.32 - - [13/Jul/2017:18:29:08 +0530] "PUT /test-bucket/testhadoop.7G HTTP/1.1" 404 0 - aws-sdk-java/1.7.4 Linux/3.10.0-510.el7.x86_64 OpenJDK_64-Bit_Server_VM/25.131-b12/1.8.0_131 2017-07-13 18:29:13.352297 7fcdb97ba700 1 ====== starting new request req=0x7fcdb97b4710 ===== 2017-07-13 18:29:13.353607 7fcdb97ba700 1 ====== req done req=0x7fcdb97b4710 op status=0 http_status=200 ====== 2017-07-13 18:29:13.353650 7fcdb97ba700 1 civetweb: 0x7fce40004e20: 10.74.128.32 - - [13/Jul/2017:18:29:13 +0530] "HEAD /test-bucket/.distcp.tmp.attempt_local375793309_0001_m_000000_0 HTTP/1.1" 200 0 - 
aws-sdk-java/1.7.4 Linux/3.10.0-510.el7.x86_64 OpenJDK_64-Bit_Server_VM/25.131-b12/1.8.0_131 2017-07-13 18:29:13.360350 7fcdb77b6700 1 ====== starting new request req=0x7fcdb77b0710 ===== 2017-07-13 18:29:13.370771 7fcdb77b6700 1 ====== req done req=0x7fcdb77b0710 op status=0 http_status=204 ====== 2017-07-13 18:29:13.370823 7fcdb77b6700 1 civetweb: 0x7fce500050f0: 10.74.128.32 - - [13/Jul/2017:18:29:13 +0530] "DELETE /test-bucket/.distcp.tmp.attempt_local375793309_0001_m_000000_0 HTTP/1.1" 204 0 - aws-sdk-java/1.7.4 Linux/3.10.0-510.el7.x86_64 OpenJDK_64-Bit_Server_VM/25.131-b12/1.8.0_131 ^C
Then I upgraded my RGW daemons, which sit behind HAProxy in round-robin, to the bz#1470301 test packages and enabled debug_rgw=20 and debug_ms=1. Now I am getting this error:

2017-07-14 00:50:46.452588 7f5fb7faf700 0 ERROR: client_io->complete_request() returned -5

I then upgraded the complete cluster to the test packages and the error is still the same. I am attaching the debug logs from after the RGW upgrade only.
> This goes in loop I had to close with ctrl+c # hadoop distcp /root/testhadoop.8G s3a://test-bucket/ 17/07/13 23:56:45 INFO tools.DistCp: Input Options: DistCpOptions{atomicCommit=false, syncFolder=false, deleteMissing=false, ignoreFailures=false, maxMaps=20, sslConfigurationFile='null', copyStrategy='uniformsize', sourceFileListing=null, sourcePaths=[/root/testhadoop.8G], targetPath=s3a://test-bucket/, targetPathExists=true, preserveRawXattrs=false} 17/07/13 23:56:45 INFO Configuration.deprecation: session.id is deprecated. Instead, use dfs.metrics.session-id 17/07/13 23:56:45 INFO jvm.JvmMetrics: Initializing JVM Metrics with processName=JobTracker, sessionId= 17/07/13 23:56:45 INFO Configuration.deprecation: io.sort.mb is deprecated. Instead, use mapreduce.task.io.sort.mb 17/07/13 23:56:45 INFO Configuration.deprecation: io.sort.factor is deprecated. Instead, use mapreduce.task.io.sort.factor 17/07/13 23:56:45 INFO jvm.JvmMetrics: Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized 17/07/13 23:56:45 INFO mapreduce.JobSubmitter: number of splits:1 17/07/13 23:56:46 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_local1571695601_0001 17/07/13 23:56:46 INFO mapreduce.Job: The url to track the job: http://localhost:8080/ 17/07/13 23:56:46 INFO tools.DistCp: DistCp job-id: job_local1571695601_0001 17/07/13 23:56:46 INFO mapreduce.Job: Running job: job_local1571695601_0001 17/07/13 23:56:46 INFO mapred.LocalJobRunner: OutputCommitter set in config null 17/07/13 23:56:46 INFO output.FileOutputCommitter: File Output Committer Algorithm version is 1 17/07/13 23:56:46 INFO mapred.LocalJobRunner: OutputCommitter is org.apache.hadoop.tools.mapred.CopyCommitter 17/07/13 23:56:46 INFO mapred.LocalJobRunner: Waiting for map tasks 17/07/13 23:56:46 INFO mapred.LocalJobRunner: Starting task: attempt_local1571695601_0001_m_000000_0 17/07/13 23:56:46 INFO output.FileOutputCommitter: File Output Committer Algorithm version is 1 
17/07/13 23:56:46 INFO mapred.Task: Using ResourceCalculatorProcessTree : [ ] 17/07/13 23:56:46 INFO mapred.MapTask: Processing split: file:/tmp/hadoop-root/mapred/staging/root1227293053/.staging/_distcp2027239354/fileList.seq:0+199 17/07/13 23:56:46 INFO output.FileOutputCommitter: File Output Committer Algorithm version is 1 17/07/13 23:56:46 INFO mapred.CopyMapper: Copying file:/root/testhadoop.8G to s3a://test-bucket/testhadoop.8G 17/07/13 23:56:46 INFO mapred.RetriableFileCopyCommand: Creating temp file: s3a://test-bucket/.distcp.tmp.attempt_local1571695601_0001_m_000000_0 17/07/13 23:56:47 INFO mapreduce.Job: Job job_local1571695601_0001 running in uber mode : false 17/07/13 23:56:47 INFO mapreduce.Job: map 0% reduce 0% 17/07/13 23:56:52 INFO mapred.LocalJobRunner: 4.8% Copying file:/root/testhadoop.8G to s3a://test-bucket/testhadoop.8G [396.7M/8.0G] > map 17/07/13 23:56:53 INFO mapreduce.Job: map 100% reduce 0% 17/07/13 23:56:55 INFO mapred.LocalJobRunner: 6.4% Copying file:/root/testhadoop.8G to s3a://test-bucket/testhadoop.8G [527.9M/8.0G] > map 17/07/13 23:56:58 INFO mapred.LocalJobRunner: 8.2% Copying file:/root/testhadoop.8G to s3a://test-bucket/testhadoop.8G [675.4M/8.0G] > map 17/07/13 23:57:01 INFO mapred.LocalJobRunner: 9.1% Copying file:/root/testhadoop.8G to s3a://test-bucket/testhadoop.8G [748.2M/8.0G] > map 17/07/13 23:57:04 INFO mapred.LocalJobRunner: 10.4% Copying file:/root/testhadoop.8G to s3a://test-bucket/testhadoop.8G [852.9M/8.0G] > map 17/07/13 23:57:07 INFO mapred.LocalJobRunner: 13.9% Copying file:/root/testhadoop.8G to s3a://test-bucket/testhadoop.8G [1.1G/8.0G] > map 17/07/13 23:57:10 INFO mapred.LocalJobRunner: 15.4% Copying file:/root/testhadoop.8G to s3a://test-bucket/testhadoop.8G [1.2G/8.0G] > map 17/07/13 23:59:19 INFO mapred.LocalJobRunner: 94.1% Copying file:/root/testhadoop.8G to s3a://test-bucket/testhadoop.8G [7.5G/8.0G] > map 17/07/13 23:59:22 INFO mapred.LocalJobRunner: 94.9% Copying file:/root/testhadoop.8G to 
s3a://test-bucket/testhadoop.8G [7.6G/8.0G] > map 17/07/13 23:59:25 INFO mapred.LocalJobRunner: 98.4% Copying file:/root/testhadoop.8G to s3a://test-bucket/testhadoop.8G [7.9G/8.0G] > map 17/07/13 23:59:28 INFO mapred.LocalJobRunner: 100.0% Copying file:/root/testhadoop.8G to s3a://test-bucket/testhadoop.8G [8.0G/8.0G] > map 17/07/13 23:59:31 INFO mapred.LocalJobRunner: 100.0% Copying file:/root/testhadoop.8G to s3a://test-bucket/testhadoop.8G [8.0G/8.0G] > map 17/07/13 23:59:34 INFO mapred.LocalJobRunner: 100.0% Copying file:/root/testhadoop.8G to s3a://test-bucket/testhadoop.8G [8.0G/8.0G] > map 17/07/13 23:59:37 INFO mapred.LocalJobRunner: 100.0% Copying file:/root/testhadoop.8G to s3a://test-bucket/testhadoop.8G [8.0G/8.0G] > map 17/07/14 00:02:37 INFO mapred.LocalJobRunner: 100.0% Copying file:/root/testhadoop.8G to s3a://test-bucket/testhadoop.8G [8.0G/8.0G] > map 17/07/14 00:02:40 INFO mapred.LocalJobRunner: 100.0% Copying file:/root/testhadoop.8G to s3a://test-bucket/testhadoop.8G [8.0G/8.0G] > map 17/07/14 00:02:43 INFO mapred.LocalJobRunner: 100.0% Copying file:/root/testhadoop.8G to s3a://test-bucket/testhadoop.8G [8.0G/8.0G] > map 17/07/14 00:02:46 INFO mapred.LocalJobRunner: 100.0% Copying file:/root/testhadoop.8G to s3a://test-bucket/testhadoop.8G [8.0G/8.0G] > map 17/07/14 00:02:49 INFO mapred.LocalJobRunner: 100.0% Copying file:/root/testhadoop.8G to s3a://test-bucket/testhadoop.8G [8.0G/8.0G] > map 17/07/14 00:02:52 INFO mapred.LocalJobRunner: 100.0% Copying file:/root/testhadoop.8G to s3a://test-bucket/testhadoop.8G [8.0G/8.0G] > map 17/07/14 00:04:25 INFO mapred.LocalJobRunner: 100.0% Copying file:/root/testhadoop.8G to s3a://test-bucket/testhadoop.8G [8.0G/8.0G] > map 17/07/14 00:04:28 INFO mapred.LocalJobRunner: 100.0% Copying file:/root/testhadoop.8G to s3a://test-bucket/testhadoop.8G [8.0G/8.0G] > map ^C[root@dell-per630-10 ~]#
Created attachment 1297830 [details] RGW node1 - dell-per630-12.log with debug_rgw=20 and debug_ms=1, ceph version 10.2.7-28.0.TEST.bz1470301.el7cp
Created attachment 1297831 [details] RGW node2 - dell-per630-13.log with debug_rgw=20 and debug_ms=1, ceph version 10.2.7-28.0.TEST.bz1470301.el7cp
I looked into node1- dell-per630-12 logs for one of the request: Request ID: 0x7efd76fb7710 Thread ID: 7efd76fbd700 2017-07-14 00:06:05.134757 7efd76fbd700 1 ====== starting new request req=0x7efd76fb7710 ===== 2017-07-14 00:06:05.134771 7efd76fbd700 2 req 75:0.000013::PUT /test-bucket/testhadoop.8G::initializing for trans_id = tx00000000000000000004b-005967bd95-79ab2-default 2017-07-14 00:06:05.134778 7efd76fbd700 10 rgw api priority: s3=5 s3website=4 [....] 2017-07-14 00:06:05.134965 7efd76fbd700 2 req 75:0.000208:s3:PUT /test-bucket/testhadoop.8G:put_obj:verifying op params 2017-07-14 00:06:05.134968 7efd76fbd700 2 req 75:0.000210:s3:PUT /test-bucket/testhadoop.8G:put_obj:pre-executing 2017-07-14 00:06:05.134970 7efd76fbd700 2 req 75:0.000212:s3:PUT /test-bucket/testhadoop.8G:put_obj:executing 2017-07-14 00:06:05.135023 7efd76fbd700 20 get_obj_state: rctx=0x7efd76fb6e50 obj=test-bucket:.distcp.tmp.attempt_local1571695601_0001_m_000000_0 state=0x7efe14011a78 s->prefetch_data=0 2017-07-14 00:06:05.135062 7efd76fbd700 1 -- 10.74.128.34:0/2387008942 --> 10.74.128.33:6800/414106 -- osd_op(client.498354.0:17031 12.646fbbc5 53c7fd09-5c03-4602-af99-466b06a3fc32.214420.1_.distcp.tmp.attempt_local1571695601_0001_m_000000_0 [getxattrs,stat] snapc 0=[] ack+read+known_if_redirected e113) v7 -- ?+0 0x7efe14012aa0 con 0x7efe5c015cc0 2017-07-14 00:06:05.136058 7efd76fbd700 10 manifest: total_size = 8589934592 2017-07-14 00:06:05.136066 7efd76fbd700 20 get_obj_state: setting s->obj_tag to 53c7fd09-5c03-4602-af99-466b06a3fc32.498340.48^@ 2017-07-14 00:06:05.136081 7efd76fbd700 20 get_obj_state: rctx=0x7efd76fb6e50 obj=test-bucket:.distcp.tmp.attempt_local1571695601_0001_m_000000_0 state=0x7efe14011a78 s->prefetch_data=0 2017-07-14 00:06:05.136096 7efd76fbd700 20 rados->get_obj_iterate_cb oid=53c7fd09-5c03-4602-af99-466b06a3fc32.214420.1__multipart_.distcp.tmp.attempt_local1571695601_0001_m_000000_0.2~u1u8vomDJTUgI1IddJplEeGS4NsW7-h.9 obj-ofs=838860800 read_ofs=0 len=524288 
2017-07-14 00:06:05.136119 7efd76fbd700 1 -- 10.74.128.34:0/2387008942 --> 10.74.128.34:6804/165197 -- osd_op(client.498354.0:17033 12.8cadf48c 53c7fd09-5c03-4602-af99-466b06a3fc32.214420.1__multipart_.distcp.tmp.attempt_local1571695601_0001_m_000000_0.2~u1u8vomDJTUgI1IddJplEeGS4NsW7-h.9 [read 0~524288] snapc 0=[] ack+read+known_if_redirected e113) v7 -- ?+0 0x7efe14014640 con 0x7efe5c01c7c0 2017-07-14 00:06:05.136147 7efd76fbd700 20 RGWObjManifest::operator++(): rule->part_size=104857600 rules.size()=2 2017-07-14 00:06:05.136149 7efd76fbd700 20 RGWObjManifest::operator++(): stripe_ofs=843055104 part_ofs=838860800 rule->part_size=104857600 2017-07-14 00:06:05.136152 7efd76fbd700 20 RGWObjManifest::operator++(): result: ofs=843055104 stripe_ofs=843055104 part_ofs=838860800 rule->part_size=104857600 > We have -EEXIST, need to restart write 2017-07-14 00:06:07.506048 7efd76fbd700 1 -- 10.74.128.34:0/2387008942 --> 10.74.128.34:6804/165197 -- osd_op(client.498354.0:17149 12.ca166856 53c7fd09-5c03-4602-af99-466b06a3fc32.214420.1__multipart_testhadoop.8G.2~mwaZ7gOoGqmu1SbeI3Nz0DQl7X_5qPL.9 [create 0~0 [excl],writefull 0~524288] snapc 0=[] ack+ondisk+write+known_if_redirected e113) v7 -- ?+0 0x7efe14014310 con 0x7efe5c01c7c0 2017-07-14 00:06:07.858384 7efd76fbd700 5 NOTICE: processor->throttle_data() returned -EEXIST, need to restart write [....] 
2017-07-14 00:07:07.653848 7efd76fbd700 20 bucket index object: .dir.53c7fd09-5c03-4602-af99-466b06a3fc32.214420.1 2017-07-14 00:07:07.653907 7efd76fbd700 1 -- 10.74.128.34:0/2387008942 --> 10.74.128.33:6804/704143 -- osd_op(client.498354.0:20514 10.de6dd1f .dir.53c7fd09-5c03-4602-af99-466b06a3fc32.214420.1 [call rgw.bucket_prepare_op] snapc 0=[] ondisk+write+known_if_redirected e113) v7 -- ?+0 0x7efe14061620 con 0x7efdbc00c3a0 2017-07-14 00:07:07.655997 7efd76fbd700 1 -- 10.74.128.34:0/2387008942 --> 10.74.128.33:6800/414106 -- osd_op(client.498354.0:20517 12.823e0ca8 53c7fd09-5c03-4602-af99-466b06a3fc32.214420.1__multipart_testhadoop.8G.zSChIKp3PSgQwf2rKGsnZiQa7tzOpxg.9 [setxattr user.rgw.acl (137),setxattr user.rgw.content_type (49),setxattr user.rgw.etag (33),setxattr user.rgw.x-amz-copy-source (65),setxattr user.rgw.x-amz-copy-source-range (26),call rgw.obj_store_pg_ver,setxattr user.rgw.source_zone (4)] snapc 0=[] ondisk+write+known_if_redirected e113) v7 -- ?+0 0x7efe14052c90 con 0x7efe5c015cc0 2017-07-14 00:07:08.363957 7efd76fbd700 1 -- 10.74.128.34:0/2387008942 --> 10.74.128.33:6804/704143 -- osd_op(client.498354.0:20530 10.de6dd1f .dir.53c7fd09-5c03-4602-af99-466b06a3fc32.214420.1 [call rgw.bucket_complete_op] snapc 0=[] ack+ondisk+write+known_if_redirected e113) v7 -- ?+0 0x7efe1407f770 con 0x7efdbc00c3a0 2017-07-14 00:07:08.364031 7efd76fbd700 15 omap_set bucket=test-bucket(@{i=default.rgw.buckets.index,e=default.rgw.buckets.non-ec}default.rgw.buckets.data[53c7fd09-5c03-4602-af99-466b06a3fc32.214420.1]) oid=53c7fd09-5c03-4602-af99-466b06a3fc32.214420.1__multipart_testhadoop.8G.2~mwaZ7gOoGqmu1SbeI3Nz0DQl7X_5qPL.meta key=part.00000009 2017-07-14 00:07:08.364058 7efd76fbd700 1 -- 10.74.128.34:0/2387008942 --> 10.74.128.35:6804/27200 -- osd_op(client.498354.0:20531 11.540307cb 53c7fd09-5c03-4602-af99-466b06a3fc32.214420.1__multipart_testhadoop.8G.2~mwaZ7gOoGqmu1SbeI3Nz0DQl7X_5qPL.meta [omap-set-vals 0~695] snapc 0=[] ondisk+write+known_if_redirected e113) 
v7 -- ?+0 0x7efe14061620 con 0x7efe5c00e320 2017-07-14 00:07:09.704648 7efd76fbd700 2 req 75:64.569890:s3:PUT /test-bucket/testhadoop.8G:put_obj:completing > We have ERROR: client_io->complete_request() returned -5 2017-07-14 00:07:09.704715 7efd76fbd700 0 ERROR: client_io->complete_request() returned -5 2017-07-14 00:07:09.704720 7efd76fbd700 2 req 75:64.569963:s3:PUT /test-bucket/testhadoop.8G:put_obj:op status=0 2017-07-14 00:07:09.704723 7efd76fbd700 2 req 75:64.569965:s3:PUT /test-bucket/testhadoop.8G:put_obj:http status=200 2017-07-14 00:07:09.704733 7efd76fbd700 1 ====== req done req=0x7efd76fb7710 op status=0 http_status=200 ====== 2017-07-14 00:07:09.704797 7efd76fbd700 1 civetweb: 0x7efe140050f0: 10.74.128.32 - - [14/Jul/2017:00:06:05 +0530] "PUT /test-bucket/testhadoop.8G HTTP/1.1" 200 0 - aws-sdk-java/1.7.4 Linux/3.10.0-510.el7.x86_64 OpenJDK_64-Bit_Server_VM/25.131-b12/1.8.0_131
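For context: a single-part S3 CopyObject is capped at 5 GB, so for larger objects the S3A client "renames" the distcp temp file with a server-side multipart copy (CreateMultipartUpload plus one UploadPartCopy per range; the 100 MB stride matches rule->part_size=104857600 in the log above). That is why only files above ~5 GB hit this code path. A rough boto3 sketch of the sequence — endpoint, bucket, and key names are placeholders, and this is not Hadoop's actual implementation:

```python
def copy_ranges(total_size, part_size):
    """Split an object into the inclusive byte ranges UploadPartCopy uses."""
    ranges = []
    ofs = 0
    while ofs < total_size:
        end = min(ofs + part_size, total_size) - 1
        ranges.append((ofs, end))
        ofs = end + 1
    return ranges


def multipart_copy(endpoint, bucket, src_key, dst_key, total_size,
                   part_size=100 * 1024 * 1024):  # 100 MB, as in the log
    import boto3  # only needed for the network calls, not the range math
    s3 = boto3.client("s3", endpoint_url=endpoint)
    # A single-part CopyObject is limited to 5 GB, so larger objects must
    # be copied server-side with one UploadPartCopy per byte range.
    mpu = s3.create_multipart_upload(Bucket=bucket, Key=dst_key)
    parts = []
    for n, (start, end) in enumerate(copy_ranges(total_size, part_size), 1):
        resp = s3.upload_part_copy(
            Bucket=bucket, Key=dst_key, UploadId=mpu["UploadId"],
            PartNumber=n,
            CopySource={"Bucket": bucket, "Key": src_key},
            CopySourceRange="bytes=%d-%d" % (start, end))
        parts.append({"PartNumber": n,
                      "ETag": resp["CopyPartResult"]["ETag"]})
    s3.complete_multipart_upload(
        Bucket=bucket, Key=dst_key, UploadId=mpu["UploadId"],
        MultipartUpload={"Parts": parts})
```

Note that in the trace above RGW returns http_status=200 for this PUT even though complete_request() failed with -5 (EIO), so the client never sees a usable response and retries the whole copy, which matches the loop observed from Hadoop.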
I also removed HAProxy and pointed Hadoop directly at node1 (dell-per630-12); the results are the same.
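The same proxy-bypass check can be scripted; a minimal probe sketch (the host name is from this setup, and port 7480 is the civetweb default — an assumption here, so use whatever rgw_frontends is actually configured with):

```python
def rgw_url(host, port=7480, scheme="http"):
    """Endpoint URL for a single RGW node, bypassing the load balancer."""
    return "%s://%s:%d" % (scheme, host, port)


def probe(host, bucket):
    import boto3
    # Point the client at one gateway so a failure can be reproduced
    # with HAProxy out of the path.
    s3 = boto3.client("s3", endpoint_url=rgw_url(host))
    s3.head_bucket(Bucket=bucket)  # raises ClientError if unreachable or denied
```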
> Tested 4G file with Test packages got uploaded successfully. # hadoop distcp /root/Hadoop.4G.TEST s3a://test-bucket/ 17/07/14 02:00:31 INFO tools.DistCp: Input Options: DistCpOptions{atomicCommit=false, syncFolder=false, deleteMissing=false, ignoreFailures=false, maxMaps=20, sslConfigurationFile='null', copyStrategy='uniformsize', sourceFileListing=null, sourcePaths=[/root/Hadoop.4G.TEST], targetPath=s3a://test-bucket/, targetPathExists=true, preserveRawXattrs=false} 17/07/14 02:00:31 INFO Configuration.deprecation: session.id is deprecated. Instead, use dfs.metrics.session-id 17/07/14 02:00:31 INFO jvm.JvmMetrics: Initializing JVM Metrics with processName=JobTracker, sessionId= 17/07/14 02:00:31 INFO Configuration.deprecation: io.sort.mb is deprecated. Instead, use mapreduce.task.io.sort.mb 17/07/14 02:00:31 INFO Configuration.deprecation: io.sort.factor is deprecated. Instead, use mapreduce.task.io.sort.factor 17/07/14 02:00:31 INFO jvm.JvmMetrics: Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized 17/07/14 02:00:31 INFO mapreduce.JobSubmitter: number of splits:1 17/07/14 02:00:31 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_local977192475_0001 17/07/14 02:00:32 INFO mapreduce.Job: The url to track the job: http://localhost:8080/ 17/07/14 02:00:32 INFO tools.DistCp: DistCp job-id: job_local977192475_0001 17/07/14 02:00:32 INFO mapreduce.Job: Running job: job_local977192475_0001 17/07/14 02:00:32 INFO mapred.LocalJobRunner: OutputCommitter set in config null 17/07/14 02:00:32 INFO output.FileOutputCommitter: File Output Committer Algorithm version is 1 17/07/14 02:00:32 INFO mapred.LocalJobRunner: OutputCommitter is org.apache.hadoop.tools.mapred.CopyCommitter 17/07/14 02:00:32 INFO mapred.LocalJobRunner: Waiting for map tasks 17/07/14 02:00:32 INFO mapred.LocalJobRunner: Starting task: attempt_local977192475_0001_m_000000_0 17/07/14 02:00:32 INFO output.FileOutputCommitter: File Output Committer Algorithm 
version is 1 17/07/14 02:00:33 INFO mapreduce.Job: Job job_local977192475_0001 running in uber mode : false 17/07/14 02:00:33 INFO mapreduce.Job: map 0% reduce 0% 17/07/14 02:00:33 INFO mapred.Task: Using ResourceCalculatorProcessTree : [ ] 17/07/14 02:00:33 INFO mapred.MapTask: Processing split: file:/tmp/hadoop-root/mapred/staging/root1503752063/.staging/_distcp488381173/fileList.seq:0+201 17/07/14 02:00:33 INFO output.FileOutputCommitter: File Output Committer Algorithm version is 1 17/07/14 02:00:33 INFO mapred.CopyMapper: Copying file:/root/Hadoop.4G.TEST to s3a://test-bucket/Hadoop.4G.TEST 17/07/14 02:00:33 INFO mapred.RetriableFileCopyCommand: Creating temp file: s3a://test-bucket/.distcp.tmp.attempt_local977192475_0001_m_000000_0 17/07/14 02:00:38 INFO mapred.LocalJobRunner: 7.6% Copying file:/root/Hadoop.4G.TEST to s3a://test-bucket/Hadoop.4G.TEST [311.7M/4.0G] > map 17/07/14 02:00:39 INFO mapreduce.Job: map 100% reduce 0% 17/07/14 02:00:41 INFO mapred.LocalJobRunner: 11.2% Copying file:/root/Hadoop.4G.TEST to s3a://test-bucket/Hadoop.4G.TEST [457.7M/4.0G] > map 17/07/14 02:00:44 INFO mapred.LocalJobRunner: 14.4% Copying file:/root/Hadoop.4G.TEST to s3a://test-bucket/Hadoop.4G.TEST [587.9M/4.0G] > map 17/07/14 02:04:35 INFO mapred.LocalJobRunner: 100.0% Copying file:/root/Hadoop.4G.TEST to s3a://test-bucket/Hadoop.4G.TEST [4.0G/4.0G] > map 17/07/14 02:04:35 INFO mapred.Task: Task:attempt_local977192475_0001_m_000000_0 is done. 
And is in the process of committing 17/07/14 02:04:35 INFO mapred.LocalJobRunner: 100.0% Copying file:/root/Hadoop.4G.TEST to s3a://test-bucket/Hadoop.4G.TEST [4.0G/4.0G] > map 17/07/14 02:04:35 INFO mapred.Task: Task attempt_local977192475_0001_m_000000_0 is allowed to commit now 17/07/14 02:04:35 INFO output.FileOutputCommitter: Saved output of task 'attempt_local977192475_0001_m_000000_0' to file:/tmp/hadoop-root/mapred/staging/root1503752063/.staging/_distcp488381173/_logs/_temporary/0/task_local977192475_0001_m_000000 17/07/14 02:04:35 INFO mapred.LocalJobRunner: 100.0% Copying file:/root/Hadoop.4G.TEST to s3a://test-bucket/Hadoop.4G.TEST [4.0G/4.0G] 17/07/14 02:04:35 INFO mapred.Task: Task 'attempt_local977192475_0001_m_000000_0' done. 17/07/14 02:04:35 INFO mapred.LocalJobRunner: Finishing task: attempt_local977192475_0001_m_000000_0 17/07/14 02:04:35 INFO mapred.LocalJobRunner: map task executor complete. 17/07/14 02:04:36 WARN mapred.CopyCommitter: Unable to cleanup temp files java.lang.NullPointerException at org.apache.hadoop.fs.Path.<init>(Path.java:104) at org.apache.hadoop.fs.Path.<init>(Path.java:93) at org.apache.hadoop.tools.mapred.CopyCommitter.deleteAttemptTempFiles(CopyCommitter.java:136) at org.apache.hadoop.tools.mapred.CopyCommitter.cleanupTempFiles(CopyCommitter.java:126) at org.apache.hadoop.tools.mapred.CopyCommitter.commitJob(CopyCommitter.java:83) at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:537) 17/07/14 02:04:36 INFO mapred.CopyCommitter: Cleaning up temporary work folder: file:/tmp/hadoop-root/mapred/staging/root1503752063/.staging/_distcp488381173 17/07/14 02:04:36 INFO mapreduce.Job: Job job_local977192475_0001 completed successfully 17/07/14 02:04:36 INFO mapreduce.Job: Counters: 23 File System Counters FILE: Number of bytes read=4295096729 FILE: Number of bytes written=416959 FILE: Number of read operations=0 FILE: Number of large read operations=0 FILE: Number of write operations=0 S3A: Number of bytes 
read=0 S3A: Number of bytes written=4294967296 S3A: Number of read operations=14 S3A: Number of large read operations=0 S3A: Number of write operations=44 Map-Reduce Framework Map input records=1 Map output records=0 Input split bytes=155 Spilled Records=0 Failed Shuffles=0 Merged Map outputs=0 GC time elapsed (ms)=19 Total committed heap usage (bytes)=249561088 File Input Format Counters Bytes Read=233 File Output Format Counters Bytes Written=8 org.apache.hadoop.tools.mapred.CopyMapper$Counter BYTESCOPIED=4294967296 BYTESEXPECTED=4294967296 COPY=1 2017-07-13 20:34 4294967296 s3://test-bucket/Hadoop.4G.TEST 2017-07-13 01:59 6198 s3://test-bucket/rgw.bucket.list 2017-07-13 02:44 1073741824 s3://test-bucket/testhadoop.1g 2017-07-13 02:41 2147483648 s3://test-bucket/testhadoop.2g 2017-07-13 02:14 4294967296 s3://test-bucket/testhadoop.4g 2017-07-13 02:51 536870912 s3://test-bucket/testhadoop.512M 2017-07-13 02:22 5368709120 s3://test-bucket/testhadoop.5g
> I tried uploading that 8G file 5 times; it never completed at the Hadoop layer, so each time I killed it with CTRL+C.
> But when I check with s3cmd ls, the bucket contains 5 full-size (8 GB) uploads under the temporary filenames. Something weird is going on.

$ s3cmd ls s3://test-bucket
2017-07-13 16:06 8589934592 s3://test-bucket/.distcp.tmp.attempt_local1437356062_0001_m_000000_0
2017-07-13 18:34 8589934592 s3://test-bucket/.distcp.tmp.attempt_local1571695601_0001_m_000000_0
2017-07-13 20:09 8589934592 s3://test-bucket/.distcp.tmp.attempt_local374556484_0001_m_000000_0
2017-07-13 20:26 8589934592 s3://test-bucket/.distcp.tmp.attempt_local789721629_0001_m_000000_0
2017-07-13 19:16 8589934592 s3://test-bucket/.distcp.tmp.attempt_local804247269_0001_m_000000_0
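Those abandoned temp objects (and any multipart uploads the killed attempts never completed) keep consuming space in the RGW data pool. A hedged boto3 cleanup sketch — endpoint and bucket are placeholders, and the prefix should be double-checked before deleting anything:

```python
def is_distcp_temp(key):
    """Match the temporary object names distcp leaves behind."""
    return key.startswith(".distcp.tmp.")


def cleanup(endpoint, bucket):
    import boto3
    s3 = boto3.client("s3", endpoint_url=endpoint)
    # Delete the abandoned temp objects left by the killed attempts...
    for obj in s3.list_objects_v2(Bucket=bucket).get("Contents", []):
        if is_distcp_temp(obj["Key"]):
            s3.delete_object(Bucket=bucket, Key=obj["Key"])
    # ...and abort any still-pending multipart uploads, whose parts are
    # otherwise retained invisibly in the data pool.
    for up in s3.list_multipart_uploads(Bucket=bucket).get("Uploads", []):
        s3.abort_multipart_upload(Bucket=bucket, Key=up["Key"],
                                  UploadId=up["UploadId"])
```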
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2017:2903