Hadoop Hands-On Project, Part 1: The WordCount Program

Time: 2023-09-14 09:13:17


1. How It Works

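The life cycle of a MapReduce job (in the classic JobTracker/TaskTracker architecture) is:

  • (1) The job starts on the client, which submits the MapReduce job to the JobTracker.
  • (2) The JobTracker splits the job into a number of Tasks, then schedules and monitors them to make sure they run to completion.
  • (3) Each TaskTracker launches the Tasks sent to it by the JobTracker and reports back the status of those Tasks and the resource usage on its own node.
  • (4) Tasks come in two kinds: Map Tasks and Reduce Tasks.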
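To make the two kinds of Task more concrete, here is a minimal plain-Java sketch of what a Map Task and a Reduce Task do for word counting, with the grouping step in between. It uses no Hadoop classes at all; the class name `WordCountSimulation` and its methods are made up for illustration, and everything runs in one JVM rather than being scheduled across nodes.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java sketch of the map -> shuffle -> reduce flow; no Hadoop classes are used.
class WordCountSimulation {

    // "Map task": emit one (word, 1) pair per whitespace-separated token in a line.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String word : line.trim().split("\\s+")) {
            if (!word.isEmpty()) {
                pairs.add(new AbstractMap.SimpleEntry<>(word, 1));
            }
        }
        return pairs;
    }

    // "Shuffle": group the emitted values by key, with keys kept in sorted order.
    static Map<String, List<Integer>> shuffle(List<Map.Entry<String, Integer>> pairs) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>()).add(pair.getValue());
        }
        return grouped;
    }

    // "Reduce task": sum the values collected for one key.
    static int reduce(List<Integer> values) {
        int count = 0;
        for (int v : values) {
            count += v;
        }
        return count;
    }

    // Run all three phases over the input lines.
    static Map<String, Integer> run(List<String> lines) {
        List<Map.Entry<String, Integer>> emitted = new ArrayList<>();
        for (String line : lines) {
            emitted.addAll(map(line));
        }
        Map<String, Integer> result = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> e : shuffle(emitted).entrySet()) {
            result.put(e.getKey(), reduce(e.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        // prints {hadoop=1, hello=2, world=1}
        System.out.println(run(Arrays.asList("hello hadoop", "hello world")));
    }
}
```

In a real job the framework performs the grouping step itself and runs many map and reduce tasks in parallel; the sketch only shows the data flow.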

2. Example

I use WordCount, the first introductory program in Hadoop, as the example and walk through one Map-Reduce process in detail.

  • MyMapper: extends the Mapper class.
package MapReduce.MyTry;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

import java.io.IOException;

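// WordCount in MapReduce
public class MyMapper extends Mapper<LongWritable,Text,Text,IntWritable>{
    // KEYIN: the offset at which each line starts in the text file
    // VALUEIN: the text of that line
    // KEYOUT: a single word from the line [becomes the reducer's input key]
    // VALUEOUT: the count emitted for that word, always 1 here [becomes the reducer's input value]
    // map() must be overridden with parameter types matching the generics declared on Mapper<>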
    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        String line = value.toString(); // convert the line's text to a String
        String [] word = line.split(" "); // words in each line are separated by a space here [adjust for your file]

        for(int i = 0;i < word.length;i++){
            // context.write(word[i], 1) does not compile (parameter mismatch): Writable types are required
            context.write(new Text(word[i]),new IntWritable(1)); // emit (word, 1)
        }
        }
    }
}
  • MyReducer: extends the Reducer class.
package MapReduce.MyTry;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

import java.io.IOException;

public class MyReducer extends Reducer<Text,IntWritable,Text,IntWritable> {
    @Override
    protected void reduce(Text key, Iterable<IntWritable> values, Context context)
            throws IOException, InterruptedException {
        int count = 0;
        for(IntWritable val : values){
            /* values holds IntWritable objects, not ints, so "count += val"
               does not compile; call get() to unwrap the int value. */
            count +=  val.get(); // accumulate the total for this word
        }
        context.write(key,new IntWritable(count));
    }
}
  • The MyJob class
package MapReduce.MyTry;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

import java.io.IOException;

// Note: if your own class has the same name as one of Apache's classes,
// refer to one of them by its fully qualified name.
public class MyJob{
    public static void main(String[] args) {
        Configuration conf = new Configuration(); // careful: this is org.apache.hadoop.conf.Configuration
        try {
            Job job = Job.getInstance(conf); // get an instance of Job
            job.setJarByClass(MyJob.class);
            job.setJobName("My wordCount");

            // Not job.setMapOutputKeyClass(MyMapper.class): that method expects the key type.
            // The mapper itself is registered with setMapperClass, as the name implies.
            job.setMapperClass(MyMapper.class);
            job.setReducerClass(MyReducer.class);

            // Set the map output key and value classes.
            // If you skip this, the job fails with a "Type mismatch" error.
            job.setMapOutputKeyClass(Text.class);
            job.setMapOutputValueClass(IntWritable.class);

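            // The correct way to set the paths: the static helpers on
            // FileInputFormat and FileOutputFormat, not methods on Job.
            // Note: the output path is created as a directory, even though it is named result.txt here.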
            FileInputFormat.setInputPaths(job,new Path(
                    "hdfs://192.168.211.3:9000/input/data.txt"));
            FileOutputFormat.setOutputPath(job,new Path(
                    "hdfs://192.168.211.3:9000/result.txt"));
            try {
                job.waitForCompletion(true);
            } catch (InterruptedException e) {
                e.printStackTrace();
            } catch (ClassNotFoundException e) {
                e.printStackTrace();
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
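
One detail of MyMapper worth a closer look is the `line.split(" ")` call. With a single-space delimiter, two consecutive spaces produce an empty string that gets counted as a word, and tabs are not treated as separators. A minimal sketch of a more forgiving tokenizer, assuming whitespace-separated input (the `Tokenizer` class and its `tokens` method are hypothetical names, not part of the code above):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Sketch: split a line on runs of whitespace and drop empty tokens.
class Tokenizer {

    static List<String> tokens(String line) {
        List<String> out = new ArrayList<>();
        // "\\s+" matches one or more spaces, tabs, or other whitespace characters.
        for (String t : line.split("\\s+")) {
            if (!t.isEmpty()) { // a leading separator yields one empty first element
                out.add(t);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        String line = "hello  hadoop\tworld";
        // Single-space split: 3 elements, one empty, and "hadoop\tworld" stays glued together.
        System.out.println(Arrays.asList(line.split(" ")).size());
        // Whitespace split: prints [hello, hadoop, world]
        System.out.println(tokens(line));
    }
}
```

Inside `map()` the same loop could be applied to `value.toString()`, wrapping each token in a `Text` before calling `context.write`.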

Watch out for the following issues:

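  • In the MyJob class, if you try to use code like this
job.setInputPaths(new path());
job.setOutputPaths();

it will not compile, because Job has no such methods. Use the static calls FileInputFormat.setInputPaths(...) and FileOutputFormat.setOutputPath(...) instead.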

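  • The map output key and value types must be set on the Job, otherwise it fails. When they are not set, the map output types fall back to the job's output types, which default to LongWritable for the key and Text for the value, so the Text key written by MyMapper is rejected.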
 job.setMapOutputKeyClass(Text.class);
 job.setMapOutputValueClass(IntWritable.class);

The error looks like this:

10:06:40 INFO mapred.LocalJobRunner: map task executor complete.
10:06:40 WARN mapred.LocalJobRunner: job_local731065594_0001
java.lang.Exception: java.io.IOException: Type mismatch in key from map: expected org.apache.hadoop.io.LongWritable, received org.apache.hadoop.io.Text
	at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
	at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:522)
Caused by: java.io.IOException: Type mismatch in key from map: expected org.apache.hadoop.io.LongWritable, received org.apache.hadoop.io.Text
	at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:1069)
	at org.apache.hadoop.mapred.MapTask$NewOutputCollector.write(MapTask.java:712)
	at org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:89)
	at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.write(WrappedMapper.java:112)
	at MapReduce.MyTry.MyMapper.map(MyMapper.java:25)
	at MapReduce.MyTry.MyMapper.map(MyMapper.java:11)
	at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:784)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
	at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:745)
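
The "expected org.apache.hadoop.io.LongWritable, received org.apache.hadoop.io.Text" message in the trace above comes from a fallback: an unset map output key class falls back to the job output key class, which in turn defaults to LongWritable. Below is a simplified plain-Java model of that fallback. It is not Hadoop's own code; the class name and the two property keys are illustrative stand-ins.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified model of how the map output key class is resolved; not Hadoop's actual code.
class OutputTypeDefaults {
    // Illustrative keys for this sketch only.
    static final String MAP_OUTPUT_KEY = "map.output.key.class";
    static final String JOB_OUTPUT_KEY = "job.output.key.class";

    static String resolveMapOutputKeyClass(Map<String, String> conf) {
        // The job output key class defaults to LongWritable when nothing is configured.
        String jobOutputKey = conf.getOrDefault(JOB_OUTPUT_KEY, "org.apache.hadoop.io.LongWritable");
        // The map output key class falls back to the job output key class.
        return conf.getOrDefault(MAP_OUTPUT_KEY, jobOutputKey);
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        // Nothing set: prints org.apache.hadoop.io.LongWritable
        System.out.println(resolveMapOutputKeyClass(conf));

        // The equivalent of job.setMapOutputKeyClass(Text.class)
        conf.put(MAP_OUTPUT_KEY, "org.apache.hadoop.io.Text");
        // prints org.apache.hadoop.io.Text
        System.out.println(resolveMapOutputKeyClass(conf));
    }
}
```

The value class resolves the same way, with Text as the final default, which is why both setMapOutputKeyClass and setMapOutputValueClass are called in MyJob.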
========== update on 2018/12/22============

3. Run Results

After running the program above, you may run into several kinds of errors:

Exception in thread "main" org.apache.hadoop.mapred.FileAlreadyExistsException: Output directory hdfs://192.168.211.4:9000/output already exists
	at org.apache.hadoop.mapreduce.lib.output.FileOutputFormat.checkOutputSpecs(FileOutputFormat.java:146)
	at org.apache.hadoop.mapreduce.JobSubmitter.checkSpecs(JobSubmitter.java:267)
	at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:140)
	at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1297)
	at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1294)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1656)
	at org.apache.hadoop.mapreduce.Job.submit(Job.java:1294)
	at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1315)
	at mapReduce.FromFileToFile.JobSubmit.main(JobSubmit.java:52)

Clearly, the error occurs because the /output path already exists. Listing the HDFS root shows:

[root@server4 temp]# hdfs dfs -ls /
Found 3 items
drwxr-xr-x   - root supergroup          0 2018-12-18 11:25 /hbase
drwxr-xr-x   - root supergroup          0 2018-12-22 08:32 /input
drwxr-xr-x   - root supergroup          0 2018-12-22 08:29 /output

So we can handle it as follows. Change the code
FileOutputFormat.setOutputPath(wordCountJob,new Path("hdfs://192.168.211.4:9000/output"));
to
FileOutputFormat.setOutputPath(wordCountJob,new Path("hdfs://192.168.211.4:9000/output/wordCount"));
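
Pointing at a new subdirectory works once, but running the job a second time raises the same FileAlreadyExistsException, because the output directory must not exist when the job is submitted. One option is to generate a fresh output path for every run. A minimal sketch, where the `OutputPaths` class, the `freshOutputPath` method, and the timestamp naming scheme are all my own assumptions:

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

// Sketch: build a per-run output path so reruns never collide with an existing directory.
class OutputPaths {
    private static final DateTimeFormatter STAMP = DateTimeFormatter.ofPattern("yyyyMMdd-HHmmss");

    // baseDir is the parent directory; jobName and the timestamp form the new subdirectory name.
    static String freshOutputPath(String baseDir, String jobName, LocalDateTime now) {
        String base = baseDir.endsWith("/") ? baseDir.substring(0, baseDir.length() - 1) : baseDir;
        return base + "/" + jobName + "-" + STAMP.format(now);
    }

    public static void main(String[] args) {
        // Prints something like hdfs://192.168.211.4:9000/output/wordCount-<date>-<time>
        System.out.println(freshOutputPath(
                "hdfs://192.168.211.4:9000/output", "wordCount", LocalDateTime.now()));
    }
}
```

The resulting string would be wrapped in `new Path(...)` and passed to `FileOutputFormat.setOutputPath`. The other common approach is to delete the old directory before rerunning, for example with `hdfs dfs -rm -r`.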
Running the program again, however, produces the following error:

Exception in thread "main" java.lang.UnsatisfiedLinkError: org.apache.hadoop.io.nativeio.NativeIO$Windows.access0(Ljava/lang/String;I)Z
	at org.apache.hadoop.io.nativeio.NativeIO$Windows.access0(Native Method)
	at org.apache.hadoop.io.nativeio.NativeIO$Windows.access(NativeIO.java:557)
	at org.apache.hadoop.fs.FileUtil.canRead(FileUtil.java:977)
	at org.apache.hadoop.util.DiskChecker.checkAccessByFileMethods(DiskChecker.java:187)
	at org.apache.hadoop.util.DiskChecker.checkDirAccess(DiskChecker.java:174)
	at org.apache.hadoop.util.DiskChecker.checkDir(DiskChecker.java:108)
	at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.confChanged(LocalDirAllocator.java:285)
	at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:344)
	at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:150)
	at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:131)
	at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:115)
	at org.apache.hadoop.mapred.LocalDistributedCacheManager.setup(LocalDistributedCacheManager.java:125)
	at org.apache.hadoop.mapred.LocalJobRunner$Job.<init>(LocalJobRunner.java:163)
	at org.apache.hadoop.mapred.LocalJobRunner.submitJob(LocalJobRunner.java:731)
	at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:241)
	at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1297)
	at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1294)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1656)
	at org.apache.hadoop.mapreduce.Job.submit(Job.java:1294)
	at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1315)
	at mapReduce.FromFileToFile.JobSubmit.main(JobSubmit.java:52)

Or an error like this one:

Exception in thread "main" java.lang.UnsatisfiedLinkError: org.apache.hadoop.util.NativeCrc32.nativeComputeChunkedSumsByteArray(II[BI[BIILjava/lang/String;JZ)V
···
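
Both traces stop inside native methods, so before rerunning on Windows it can help to check whether the native helper files are where Hadoop will look for them. The sketch below assumes the files are expected in the bin folder under the directory named by the HADOOP_HOME environment variable. That layout, and the `NativeFileCheck` class with its method, are assumptions for illustration. It only checks that the files exist, not that their version matches.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Sketch: report which Windows native helper files are missing under <hadoopHome>/bin.
class NativeFileCheck {

    static List<String> missingNativeFiles(Path hadoopHome) {
        List<String> missing = new ArrayList<>();
        for (String name : new String[] {"winutils.exe", "hadoop.dll"}) {
            // Assumption: both files are expected in the bin subdirectory.
            if (!Files.isRegularFile(hadoopHome.resolve("bin").resolve(name))) {
                missing.add(name);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        String home = System.getenv("HADOOP_HOME");
        if (home == null) {
            System.out.println("HADOOP_HOME is not set");
            return;
        }
        System.out.println("Missing under " + Paths.get(home, "bin") + ": "
                + missingNativeFiles(Paths.get(home)));
    }
}
```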

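Both errors occur because hadoop.dll is either missing from the Hadoop installation directory or is the wrong version. Download the matching winutils.exe and hadoop.dll into the Hadoop installation directory on Windows, then run the program again. The output is as follows: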

"D:\Program Files\Java\jdk1.8.0_77\bin\java" "-javaagent:D:\Program Files\JetBrains\IntelliJ IDEA 2017.2.3\lib\idea_rt.jar=2905:D:\Program Files\JetBrains\IntelliJ IDEA 2017.2.3\bin" -Dfile.encoding=UTF-8 -classpath "D:\Program Files\Java\jdk1.8.0_77\jre\lib\charsets.jar;D:\Program Files\Java\jdk1.8.0_77\jre\lib\deploy.jar;D:\Program Files\Java\jdk1.8.0_77\jre\lib\ext\access-bridge-64.jar;D:\Program Files\Java\jdk1.8.0_77\jre\lib\ext\cldrdata.jar;D:\Program Files\Java\jdk1.8.0_77\jre\lib\ext\dnsns.jar;D:\Program Files\Java\jdk1.8.0_77\jre\lib\ext\jaccess.jar;D:\Program Files\Java\jdk1.8.0_77\jre\lib\ext\jfxrt.jar;D:\Program Files\Java\jdk1.8.0_77\jre\lib\ext\localedata.jar;D:\Program Files\Java\jdk1.8.0_77\jre\lib\ext\nashorn.jar;D:\Program Files\Java\jdk1.8.0_77\jre\lib\ext\sunec.jar;D:\Program Files\Java\jdk1.8.0_77\jre\lib\ext\sunjce_provider.jar;D:\Program Files\Java\jdk1.8.0_77\jre\lib\ext\sunmscapi.jar;D:\Program Files\Java\jdk1.8.0_77\jre\lib\ext\sunpkcs11.jar;D:\Program Files\Java\jdk1.8.0_77\jre\lib\ext\zipfs.jar;D:\Program Files\Java\jdk1.8.0_77\jre\lib\javaws.jar;D:\Program Files\Java\jdk1.8.0_77\jre\lib\jce.jar;D:\Program Files\Java\jdk1.8.0_77\jre\lib\jfr.jar;D:\Program Files\Java\jdk1.8.0_77\jre\lib\jfxswt.jar;D:\Program Files\Java\jdk1.8.0_77\jre\lib\jsse.jar;D:\Program Files\Java\jdk1.8.0_77\jre\lib\management-agent.jar;D:\Program Files\Java\jdk1.8.0_77\jre\lib\plugin.jar;D:\Program Files\Java\jdk1.8.0_77\jre\lib\resources.jar;D:\Program 
Files\Java\jdk1.8.0_77\jre\lib\rt.jar;E:\intellij_Project\AllDemo\hadoopDemp\target\classes;E:\.m2\repository\org\apache\spark\spark-core_2.11\2.2.0\spark-core_2.11-2.2.0.jar;E:\.m2\repository\org\apache\avro\avro\1.7.7\avro-1.7.7.jar;E:\.m2\repository\com\thoughtworks\paranamer\paranamer\2.3\paranamer-2.3.jar;E:\.m2\repository\org\apache\avro\avro-mapred\1.7.7\avro-mapred-1.7.7-hadoop2.jar;E:\.m2\repository\org\apache\avro\avro-ipc\1.7.7\avro-ipc-1.7.7.jar;E:\.m2\repository\org\apache\avro\avro-ipc\1.7.7\avro-ipc-1.7.7-tests.jar;E:\.m2\repository\com\twitter\chill_2.11\0.8.0\chill_2.11-0.8.0.jar;E:\.m2\repository\com\esotericsoftware\kryo-shaded\3.0.3\kryo-shaded-3.0.3.jar;E:\.m2\repository\com\esotericsoftware\minlog\1.3.0\minlog-1.3.0.jar;E:\.m2\repository\org\objenesis\objenesis\2.1\objenesis-2.1.jar;E:\.m2\repository\com\twitter\chill-java\0.8.0\chill-java-0.8.0.jar;E:\.m2\repository\org\apache\xbean\xbean-asm5-shaded\4.4\xbean-asm5-shaded-4.4.jar;E:\.m2\repository\org\apache\hadoop\hadoop-client\2.6.5\hadoop-client-2.6.5.jar;E:\.m2\repository\org\apache\hadoop\hadoop-mapreduce-client-app\2.6.5\hadoop-mapreduce-client-app-2.6.5.jar;E:\.m2\repository\org\apache\hadoop\hadoop-mapreduce-client-common\2.6.5\hadoop-mapreduce-client-common-2.6.5.jar;E:\.m2\repository\org\apache\hadoop\hadoop-yarn-client\2.6.5\hadoop-yarn-client-2.6.5.jar;E:\.m2\repository\org\apache\hadoop\hadoop-yarn-server-common\2.6.5\hadoop-yarn-server-common-2.6.5.jar;E:\.m2\repository\org\apache\hadoop\hadoop-mapreduce-client-shuffle\2.6.5\hadoop-mapreduce-client-shuffle-2.6.5.jar;E:\.m2\repository\org\apache\hadoop\hadoop-yarn-api\2.6.5\hadoop-yarn-api-2.6.5.jar;E:\.m2\repository\org\apache\hadoop\hadoop-mapreduce-client-jobclient\2.6.5\hadoop-mapreduce-client-jobclient-2.6.5.jar;E:\.m2\repository\org\apache\spark\spark-launcher_2.11\2.2.0\spark-launcher_2.11-2.2.0.jar;E:\.m2\repository\org\apache\spark\spark-network-common_2.11\2.2.0\spark-network-common_2.11-2.2.0.jar;E:\.m2\repository\org\f
usesource\leveldbjni\leveldbjni-all\1.8\leveldbjni-all-1.8.jar;E:\.m2\repository\org\apache\spark\spark-network-shuffle_2.11\2.2.0\spark-network-shuffle_2.11-2.2.0.jar;E:\.m2\repository\org\apache\spark\spark-unsafe_2.11\2.2.0\spark-unsafe_2.11-2.2.0.jar;E:\.m2\repository\net\java\dev\jets3t\jets3t\0.9.3\jets3t-0.9.3.jar;E:\.m2\repository\javax\activation\activation\1.1.1\activation-1.1.1.jar;E:\.m2\repository\mx4j\mx4j\3.0.2\mx4j-3.0.2.jar;E:\.m2\repository\javax\mail\mail\1.4.7\mail-1.4.7.jar;E:\.m2\repository\org\bouncycastle\bcprov-jdk15on\1.51\bcprov-jdk15on-1.51.jar;E:\.m2\repository\com\jamesmurty\utils\java-xmlbuilder\1.0\java-xmlbuilder-1.0.jar;E:\.m2\repository\net\iharder\base64\2.3.8\base64-2.3.8.jar;E:\.m2\repository\org\apache\curator\curator-recipes\2.6.0\curator-recipes-2.6.0.jar;E:\.m2\repository\org\apache\curator\curator-framework\2.6.0\curator-framework-2.6.0.jar;E:\.m2\repository\javax\servlet\javax.servlet-api\3.1.0\javax.servlet-api-3.1.0.jar;E:\.m2\repository\org\apache\commons\commons-lang3\3.5\commons-lang3-3.5.jar;E:\.m2\repository\org\apache\commons\commons-math3\3.4.1\commons-math3-3.4.1.jar;E:\.m2\repository\com\google\code\findbugs\jsr305\1.3.9\jsr305-1.3.9.jar;E:\.m2\repository\org\slf4j\slf4j-api\1.7.16\slf4j-api-1.7.16.jar;E:\.m2\repository\org\slf4j\jul-to-slf4j\1.7.16\jul-to-slf4j-1.7.16.jar;E:\.m2\repository\org\slf4j\jcl-over-slf4j\1.7.16\jcl-over-slf4j-1.7.16.jar;E:\.m2\repository\log4j\log4j\1.2.17\log4j-1.2.17.jar;E:\.m2\repository\com\ning\compress-lzf\1.0.3\compress-lzf-1.0.3.jar;E:\.m2\repository\org\xerial\snappy\snappy-java\1.1.2.6\snappy-java-1.1.2.6.jar;E:\.m2\repository\net\jpountz\lz4\lz4\1.3.0\lz4-1.3.0.jar;E:\.m2\repository\org\roaringbitmap\RoaringBitmap\0.5.11\RoaringBitmap-0.5.11.jar;E:\.m2\repository\commons-net\commons-net\2.2\commons-net-2.2.jar;E:\.m2\repository\org\json4s\json4s-jackson_2.11\3.2.11\json4s-jackson_2.11-3.2.11.jar;E:\.m2\repository\org\json4s\json4s-core_2.11\3.2.11\json4s-core_2.11-3.2.11.ja
r;E:\.m2\repository\org\json4s\json4s-ast_2.11\3.2.11\json4s-ast_2.11-3.2.11.jar;E:\.m2\repository\org\scala-lang\scalap\2.11.0\scalap-2.11.0.jar;E:\.m2\repository\org\scala-lang\scala-compiler\2.11.0\scala-compiler-2.11.0.jar;E:\.m2\repository\org\glassfish\jersey\core\jersey-client\2.22.2\jersey-client-2.22.2.jar;E:\.m2\repository\javax\ws\rs\javax.ws.rs-api\2.0.1\javax.ws.rs-api-2.0.1.jar;E:\.m2\repository\org\glassfish\hk2\hk2-api\2.4.0-b34\hk2-api-2.4.0-b34.jar;E:\.m2\repository\org\glassfish\hk2\hk2-utils\2.4.0-b34\hk2-utils-2.4.0-b34.jar;E:\.m2\repository\org\glassfish\hk2\external\aopalliance-repackaged\2.4.0-b34\aopalliance-repackaged-2.4.0-b34.jar;E:\.m2\repository\org\glassfish\hk2\external\javax.inject\2.4.0-b34\javax.inject-2.4.0-b34.jar;E:\.m2\repository\org\glassfish\hk2\hk2-locator\2.4.0-b34\hk2-locator-2.4.0-b34.jar;E:\.m2\repository\org\javassist\javassist\3.18.1-GA\javassist-3.18.1-GA.jar;E:\.m2\repository\org\glassfish\jersey\core\jersey-common\2.22.2\jersey-common-2.22.2.jar;E:\.m2\repository\javax\annotation\javax.annotation-api\1.2\javax.annotation-api-1.2.jar;E:\.m2\repository\org\glassfish\jersey\bundles\repackaged\jersey-guava\2.22.2\jersey-guava-2.22.2.jar;E:\.m2\repository\org\glassfish\hk2\osgi-resource-locator\1.0.1\osgi-resource-locator-1.0.1.jar;E:\.m2\repository\org\glassfish\jersey\core\jersey-server\2.22.2\jersey-server-2.22.2.jar;E:\.m2\repository\org\glassfish\jersey\media\jersey-media-jaxb\2.22.2\jersey-media-jaxb-2.22.2.jar;E:\.m2\repository\javax\validation\validation-api\1.1.0.Final\validation-api-1.1.0.Final.jar;E:\.m2\repository\org\glassfish\jersey\containers\jersey-container-servlet\2.22.2\jersey-container-servlet-2.22.2.jar;E:\.m2\repository\org\glassfish\jersey\containers\jersey-container-servlet-core\2.22.2\jersey-container-servlet-core-2.22.2.jar;E:\.m2\repository\io\netty\netty-all\4.0.43.Final\netty-all-4.0.43.Final.jar;E:\.m2\repository\io\netty\netty\3.9.9.Final\netty-3.9.9.Final.jar;E:\.m2\repository\com\clearspr
ing\analytics\stream\2.7.0\stream-2.7.0.jar;E:\.m2\repository\io\dropwizard\metrics\metrics-core\3.1.2\metrics-core-3.1.2.jar;E:\.m2\repository\io\dropwizard\metrics\metrics-jvm\3.1.2\metrics-jvm-3.1.2.jar;E:\.m2\repository\io\dropwizard\metrics\metrics-json\3.1.2\metrics-json-3.1.2.jar;E:\.m2\repository\io\dropwizard\metrics\metrics-graphite\3.1.2\metrics-graphite-3.1.2.jar;E:\.m2\repository\com\fasterxml\jackson\core\jackson-databind\2.6.5\jackson-databind-2.6.5.jar;E:\.m2\repository\com\fasterxml\jackson\module\jackson-module-scala_2.11\2.6.5\jackson-module-scala_2.11-2.6.5.jar;E:\.m2\repository\org\scala-lang\scala-reflect\2.11.7\scala-reflect-2.11.7.jar;E:\.m2\repository\com\fasterxml\jackson\module\jackson-module-paranamer\2.6.5\jackson-module-paranamer-2.6.5.jar;E:\.m2\repository\org\apache\ivy\ivy\2.4.0\ivy-2.4.0.jar;E:\.m2\repository\oro\oro\2.0.8\oro-2.0.8.jar;E:\.m2\repository\net\razorvine\pyrolite\4.13\pyrolite-4.13.jar;E:\.m2\repository\net\sf\py4j\py4j\0.10.4\py4j-0.10.4.jar;E:\.m2\repository\org\apache\spark\spark-tags_2.11\2.2.0\spark-tags_2.11-2.2.0.jar;E:\.m2\repository\org\apache\commons\commons-crypto\1.0.0\commons-crypto-1.0.0.jar;E:\.m2\repository\org\spark-project\spark\unused\1.0.0\unused-1.0.0.jar;E:\.m2\repository\org\apache\spark\spark-streaming_2.11\2.2.0\spark-streaming_2.11-2.2.0.jar;E:\.m2\repository\org\apache\spark\spark-streaming-kafka_2.11\1.6.3\spark-streaming-kafka_2.11-1.6.3.jar;E:\.m2\repository\org\apache\kafka\kafka_2.11\0.8.2.1\kafka_2.11-0.8.2.1.jar;E:\.m2\repository\org\scala-lang\modules\scala-xml_2.11\1.0.2\scala-xml_2.11-1.0.2.jar;E:\.m2\repository\org\scala-lang\modules\scala-parser-combinators_2.11\1.0.2\scala-parser-combinators_2.11-1.0.2.jar;E:\.m2\repository\com\101tec\zkclient\0.3\zkclient-0.3.jar;E:\.m2\repository\org\scala-lang\scala-library\2.11.0\scala-library-2.11.0.jar;E:\.m2\repository\org\apache\spark\spark-sql_2.11\2.2.0\spark-sql_2.11-2.2.0.jar;E:\.m2\repository\com\univocity\univocity-parsers\2.2.1\uni
vocity-parsers-2.2.1.jar;E:\.m2\repository\org\apache\spark\spark-sketch_2.11\2.2.0\spark-sketch_2.11-2.2.0.jar;E:\.m2\repository\org\apache\spark\spark-catalyst_2.11\2.2.0\spark-catalyst_2.11-2.2.0.jar;E:\.m2\repository\org\codehaus\janino\janino\3.0.0\janino-3.0.0.jar;E:\.m2\repository\org\codehaus\janino\commons-compiler\3.0.0\commons-compiler-3.0.0.jar;E:\.m2\repository\org\antlr\antlr4-runtime\4.5.3\antlr4-runtime-4.5.3.jar;E:\.m2\repository\org\apache\parquet\parquet-column\1.8.2\parquet-column-1.8.2.jar;E:\.m2\repository\org\apache\parquet\parquet-common\1.8.2\parquet-common-1.8.2.jar;E:\.m2\repository\org\apache\parquet\parquet-encoding\1.8.2\parquet-encoding-1.8.2.jar;E:\.m2\repository\org\apache\parquet\parquet-hadoop\1.8.2\parquet-hadoop-1.8.2.jar;E:\.m2\repository\org\apache\parquet\parquet-format\2.3.1\parquet-format-2.3.1.jar;E:\.m2\repository\org\apache\parquet\parquet-jackson\1.8.2\parquet-jackson-1.8.2.jar;E:\.m2\repository\org\apache\kafka\kafka-clients\1.0.0\kafka-clients-1.0.0.jar;E:\.m2\repository\org\lz4\lz4-java\1.4\lz4-java-1.4.jar;E:\.m2\repository\org\apache\hadoop\hadoop-common\2.6.4\hadoop-common-2.6.4.jar;E:\.m2\repository\org\apache\hadoop\hadoop-annotations\2.6.4\hadoop-annotations-2.6.4.jar;D:\Program 
Files\Java\jdk1.8.0_77\lib\tools.jar;E:\.m2\repository\com\google\guava\guava\11.0.2\guava-11.0.2.jar;E:\.m2\repository\commons-cli\commons-cli\1.2\commons-cli-1.2.jar;E:\.m2\repository\xmlenc\xmlenc\0.52\xmlenc-0.52.jar;E:\.m2\repository\commons-httpclient\commons-httpclient\3.1\commons-httpclient-3.1.jar;E:\.m2\repository\commons-codec\commons-codec\1.4\commons-codec-1.4.jar;E:\.m2\repository\commons-io\commons-io\2.4\commons-io-2.4.jar;E:\.m2\repository\commons-collections\commons-collections\3.2.2\commons-collections-3.2.2.jar;E:\.m2\repository\javax\servlet\servlet-api\2.5\servlet-api-2.5.jar;E:\.m2\repository\org\mortbay\jetty\jetty\6.1.26\jetty-6.1.26.jar;E:\.m2\repository\org\mortbay\jetty\jetty-util\6.1.26\jetty-util-6.1.26.jar;E:\.m2\repository\com\sun\jersey\jersey-core\1.9\jersey-core-1.9.jar;E:\.m2\repository\com\sun\jersey\jersey-json\1.9\jersey-json-1.9.jar;E:\.m2\repository\org\codehaus\jettison\jettison\1.1\jettison-1.1.jar;E:\.m2\repository\com\sun\xml\bind\jaxb-impl\2.2.3-1\jaxb-impl-2.2.3-1.jar;E:\.m2\repository\org\codehaus\jackson\jackson-xc\1.8.3\jackson-xc-1.8.3.jar;E:\.m2\repository\com\sun\jersey\jersey-server\1.9\jersey-server-1.9.jar;E:\.m2\repository\asm\asm\3.1\asm-3.1.jar;E:\.m2\repository\tomcat\jasper-compiler\5.5.23\jasper-compiler-5.5.23.jar;E:\.m2\repository\tomcat\jasper-runtime\5.5.23\jasper-runtime-5.5.23.jar;E:\.m2\repository\javax\servlet\jsp\jsp-api\2.1\jsp-api-2.1.jar;E:\.m2\repository\commons-el\commons-el\1.0\commons-el-1.0.jar;E:\.m2\repository\commons-logging\commons-logging\1.1.3\commons-logging-1.1.3.jar;E:\.m2\repository\commons-lang\commons-lang\2.6\commons-lang-2.6.jar;E:\.m2\repository\commons-configuration\commons-configuration\1.6\commons-configuration-1.6.jar;E:\.m2\repository\commons-digester\commons-digester\1.8\commons-digester-1.8.jar;E:\.m2\repository\commons-beanutils\commons-beanutils-core\1.8.0\commons-beanutils-core-1.8.0.jar;E:\.m2\repository\org\codehaus\jackson\jackson-core-asl\1.9.13\jackson-core-a
sl-1.9.13.jar;E:\.m2\repository\org\codehaus\jackson\jackson-mapper-asl\1.9.13\jackson-mapper-asl-1.9.13.jar;E:\.m2\repository\com\google\protobuf\protobuf-java\2.5.0\protobuf-java-2.5.0.jar;E:\.m2\repository\org\apache\hadoop\hadoop-auth\2.6.4\hadoop-auth-2.6.4.jar;E:\.m2\repository\org\apache\directory\server\apacheds-kerberos-codec\2.0.0-M15\apacheds-kerberos-codec-2.0.0-M15.jar;E:\.m2\repository\org\apache\directory\server\apacheds-i18n\2.0.0-M15\apacheds-i18n-2.0.0-M15.jar;E:\.m2\repository\org\apache\directory\api\api-asn1-api\1.0.0-M20\api-asn1-api-1.0.0-M20.jar;E:\.m2\repository\org\apache\directory\api\api-util\1.0.0-M20\api-util-1.0.0-M20.jar;E:\.m2\repository\com\jcraft\jsch\0.1.42\jsch-0.1.42.jar;E:\.m2\repository\org\apache\curator\curator-client\2.6.0\curator-client-2.6.0.jar;E:\.m2\repository\org\htrace\htrace-core\3.0.4\htrace-core-3.0.4.jar;E:\.m2\repository\org\apache\zookeeper\zookeeper\3.4.6\zookeeper-3.4.6.jar;E:\.m2\repository\org\apache\commons\commons-compress\1.4.1\commons-compress-1.4.1.jar;E:\.m2\repository\org\tukaani\xz\1.0\xz-1.0.jar;E:\.m2\repository\org\apache\hadoop\hadoop-hdfs\2.6.4\hadoop-hdfs-2.6.4.jar;E:\.m2\repository\commons-daemon\commons-daemon\1.0.13\commons-daemon-1.0.13.jar;E:\.m2\repository\xerces\xercesImpl\2.9.1\xercesImpl-2.9.1.jar;E:\.m2\repository\xml-apis\xml-apis\1.3.04\xml-apis-1.3.04.jar;E:\.m2\repository\org\apache\hadoop\hadoop-mapreduce-client-core\2.6.4\hadoop-mapreduce-client-core-2.6.4.jar;E:\.m2\repository\org\apache\hadoop\hadoop-yarn-common\2.6.4\hadoop-yarn-common-2.6.4.jar;E:\.m2\repository\javax\xml\bind\jaxb-api\2.2.2\jaxb-api-2.2.2.jar;E:\.m2\repository\javax\xml\stream\stax-api\1.0-2\stax-api-1.0-2.jar;E:\.m2\repository\com\sun\jersey\jersey-client\1.9\jersey-client-1.9.jar;E:\.m2\repository\com\google\inject\guice\3.0\guice-3.0.jar;E:\.m2\repository\javax\inject\javax.inject\1\javax.inject-1.jar;E:\.m2\repository\aopalliance\aopalliance\1.0\aopalliance-1.0.jar;E:\.m2\repository\com\sun\jersey\cont
ribs\jersey-guice\1.9\jersey-guice-1.9.jar;E:\.m2\repository\com\google\inject\extensions\guice-servlet\3.0\guice-servlet-3.0.jar;E:\.m2\repository\com\github\stephenc\findbugs\findbugs-annotations\1.3.9-1\findbugs-annotations-1.3.9-1.jar;E:\.m2\repository\junit\junit\4.12\junit-4.12.jar;E:\.m2\repository\org\hamcrest\hamcrest-core\1.3\hamcrest-core-1.3.jar;E:\.m2\repository\org\apache\hbase\hbase-client\1.4.0\hbase-client-1.4.0.jar;E:\.m2\repository\org\apache\hbase\hbase-annotations\1.4.0\hbase-annotations-1.4.0.jar;E:\.m2\repository\org\apache\hbase\hbase-hadoop2-compat\1.4.0\hbase-hadoop2-compat-1.4.0.jar;E:\.m2\repository\org\apache\htrace\htrace-core\3.1.0-incubating\htrace-core-3.1.0-incubating.jar;E:\.m2\repository\org\jruby\jcodings\jcodings\1.0.8\jcodings-1.0.8.jar;E:\.m2\repository\org\jruby\joni\joni\2.1.2\joni-2.1.2.jar;E:\.m2\repository\com\yammer\metrics\metrics-core\2.2.0\metrics-core-2.2.0.jar;E:\.m2\repository\org\apache\hbase\hbase-server\1.4.0\hbase-server-1.4.0.jar;E:\.m2\repository\org\apache\hbase\hbase-procedure\1.4.0\hbase-procedure-1.4.0.jar;E:\.m2\repository\org\apache\hbase\hbase-common\1.4.0\hbase-common-1.4.0-tests.jar;E:\.m2\repository\org\apache\hbase\hbase-prefix-tree\1.4.0\hbase-prefix-tree-1.4.0.jar;E:\.m2\repository\org\apache\hbase\hbase-metrics-api\1.4.0\hbase-metrics-api-1.4.0.jar;E:\.m2\repository\org\apache\hbase\hbase-metrics\1.4.0\hbase-metrics-1.4.0.jar;E:\.m2\repository\org\apache\commons\commons-math\2.2\commons-math-2.2.jar;E:\.m2\repository\org\mortbay\jetty\jetty-sslengine\6.1.26\jetty-sslengine-6.1.26.jar;E:\.m2\repository\org\mortbay\jetty\jsp-2.1\6.1.14\jsp-2.1-6.1.14.jar;E:\.m2\repository\org\mortbay\jetty\jsp-api-2.1\6.1.14\jsp-api-2.1-6.1.14.jar;E:\.m2\repository\org\mortbay\jetty\servlet-api-2.5\6.1.14\servlet-api-2.5-6.1.14.jar;E:\.m2\repository\org\codehaus\jackson\jackson-jaxrs\1.9.13\jackson-jaxrs-1.9.13.jar;E:\.m2\repository\org\jamon\jamon-runtime\2.4.1\jamon-runtime-2.4.1.jar;E:\.m2\repository\com\lmax\d
isruptor\3.3.0\disruptor-3.3.0.jar;E:\.m2\repository\org\apache\httpcomponents\httpclient\4.5.2\httpclient-4.5.2.jar;E:\.m2\repository\org\apache\httpcomponents\httpcore\4.4.4\httpcore-4.4.4.jar;E:\.m2\repository\org\apache\hbase\hbase-common\1.4.0\hbase-common-1.4.0.jar;E:\.m2\repository\org\apache\hbase\hbase-protocol\1.4.0\hbase-protocol-1.4.0.jar;E:\.m2\repository\org\apache\hbase\hbase-hadoop-compat\1.4.0\hbase-hadoop-compat-1.4.0.jar;E:\.m2\repository\jline\jline\0.9.94\jline-0.9.94.jar;E:\.m2\repository\org\apache\yetus\audience-annotations\0.5.0\audience-annotations-0.5.0.jar;E:\.m2\repository\mysql\mysql-connector-java\5.1.30\mysql-connector-java-5.1.30.jar;E:\.m2\repository\commons-beanutils\commons-beanutils\1.8.0\commons-beanutils-1.8.0.jar;E:\.m2\repository\net\sf\ezmorph\ezmorph\1.0.6\ezmorph-1.0.6.jar;E:\.m2\repository\org\jsoup\jsoup\1.11.3\jsoup-1.11.3.jar;E:\.m2\repository\net\opentsdb\opentsdb\2.3.0\opentsdb-2.3.0.jar;E:\.m2\repository\com\fasterxml\jackson\core\jackson-annotations\2.4.3\jackson-annotations-2.4.3.jar;E:\.m2\repository\com\fasterxml\jackson\core\jackson-core\2.4.3\jackson-core-2.4.3.jar;E:\.m2\repository\com\stumbleupon\async\1.4.0\async-1.4.0.jar;E:\.m2\repository\org\apache\commons\commons-jexl\2.1.1\commons-jexl-2.1.1.jar;E:\.m2\repository\org\jgrapht\jgrapht-core\0.9.1\jgrapht-core-0.9.1.jar;E:\.m2\repository\org\slf4j\log4j-over-slf4j\1.7.7\log4j-over-slf4j-1.7.7.jar;E:\.m2\repository\ch\qos\logback\logback-core\1.0.13\logback-core-1.0.13.jar;E:\.m2\repository\ch\qos\logback\logback-classic\1.0.13\logback-classic-1.0.13.jar;E:\.m2\repository\com\google\gwt\gwt-user\2.6.0\gwt-user-2.6.0.jar;E:\.m2\repository\javax\validation\validation-api\1.0.0.GA\validation-api-1.0.0.GA-sources.jar;E:\.m2\repository\org\json\json\20090211\json-20090211.jar;E:\.m2\repository\net\opentsdb\opentsdb_gwt_theme\1.0.0\opentsdb_gwt_theme-1.0.0.jar;E:\.m2\repository\org\hbase\asynchbase\1.7.2\asynchbase-1.7.2.jar;E:\.m2\repository\com\google\code\gson
\gson\2.7\gson-2.7.jar;E:\.m2\repository\com\alibaba\fastjson\1.2.10\fastjson-1.2.10.jar;E:\.m2\repository\org\slf4j\slf4j-log4j12\1.7.6\slf4j-log4j12-1.7.6.jar" mapReduce.FromFileToFile.JobSubmit
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/E:/.m2/repository/ch/qos/logback/logback-classic/1.0.13/logback-classic-1.0.13.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/E:/.m2/repository/org/slf4j/slf4j-log4j12/1.7.6/slf4j-log4j12-1.7.6.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [ch.qos.logback.classic.util.ContextSelectorStaticBinder]
09:26:29.897 [main] DEBUG o.a.h.m.lib.MutableMetricsFactory - field org.apache.hadoop.metrics2.lib.MutableRate org.apache.hadoop.security.UserGroupInformation$UgiMetrics.loginSuccess with annotation @org.apache.hadoop.metrics2.annotation.Metric(about=, sampleName=Ops, always=false, type=DEFAULT, value=[Rate of successful kerberos logins and latency (milliseconds)], valueName=Time)
09:26:29.910 [main] DEBUG o.a.h.m.lib.MutableMetricsFactory - field org.apache.hadoop.metrics2.lib.MutableRate org.apache.hadoop.security.UserGroupInformation$UgiMetrics.loginFailure with annotation @org.apache.hadoop.metrics2.annotation.Metric(about=, sampleName=Ops, always=false, type=DEFAULT, value=[Rate of failed kerberos logins and latency (milliseconds)], valueName=Time)
09:26:29.910 [main] DEBUG o.a.h.m.lib.MutableMetricsFactory - field org.apache.hadoop.metrics2.lib.MutableRate org.apache.hadoop.security.UserGroupInformation$UgiMetrics.getGroups with annotation @org.apache.hadoop.metrics2.annotation.Metric(about=, sampleName=Ops, always=false, type=DEFAULT, value=[GetGroups], valueName=Time)
09:26:29.911 [main] DEBUG o.a.h.m.impl.MetricsSystemImpl - UgiMetrics, User and group related metrics
09:26:30.006 [main] DEBUG o.a.h.s.a.util.KerberosName - Kerberos krb5 configuration not found, setting default realm to empty
09:26:30.012 [main] DEBUG org.apache.hadoop.security.Groups -  Creating new Groups object
09:26:30.014 [main] DEBUG o.a.hadoop.util.NativeCodeLoader - Trying to load the custom-built native-hadoop library...
09:26:30.018 [main] DEBUG o.a.hadoop.util.NativeCodeLoader - Loaded the native-hadoop library
09:26:30.019 [main] DEBUG o.a.h.s.JniBasedUnixGroupsMapping - Using JniBasedUnixGroupsMapping for Group resolution
09:26:30.019 [main] DEBUG o.a.h.s.JniBasedUnixGroupsMappingWithFallback - Group mapping impl=org.apache.hadoop.security.JniBasedUnixGroupsMapping
09:26:30.051 [main] DEBUG org.apache.hadoop.security.Groups - Group mapping impl=org.apache.hadoop.security.JniBasedUnixGroupsMappingWithFallback; cacheTimeout=300000; warningDeltaMs=5000
09:26:30.058 [main] DEBUG o.a.h.security.UserGroupInformation - hadoop login
09:26:30.059 [main] DEBUG o.a.h.security.UserGroupInformation - hadoop login commit
09:26:30.065 [main] DEBUG o.a.h.security.UserGroupInformation - using local user:NTUserPrincipal: Administrator
09:26:30.065 [main] DEBUG o.a.h.security.UserGroupInformation - Using user: "NTUserPrincipal: Administrator" with name Administrator
09:26:30.065 [main] DEBUG o.a.h.security.UserGroupInformation - User entry: "Administrator"
09:26:30.066 [main] DEBUG o.a.h.security.UserGroupInformation - UGI loginUser:Administrator (auth:SIMPLE)
09:26:30.181 [main] DEBUG o.a.hadoop.hdfs.BlockReaderLocal - dfs.client.use.legacy.blockreader.local = false
09:26:30.181 [main] DEBUG o.a.hadoop.hdfs.BlockReaderLocal - dfs.client.read.shortcircuit = false
09:26:30.181 [main] DEBUG o.a.hadoop.hdfs.BlockReaderLocal - dfs.client.domain.socket.data.traffic = false
09:26:30.181 [main] DEBUG o.a.hadoop.hdfs.BlockReaderLocal - dfs.domain.socket.path = 
09:26:30.195 [main] DEBUG org.apache.hadoop.hdfs.DFSClient - No KeyProvider found.
09:26:30.212 [main] DEBUG o.apache.hadoop.io.retry.RetryUtils - multipleLinearRandomRetry = null
09:26:30.228 [main] DEBUG org.apache.hadoop.ipc.Server - rpcKind=RPC_PROTOCOL_BUFFER, rpcRequestWrapperClass=class org.apache.hadoop.ipc.ProtobufRpcEngine$RpcRequestWrapper, rpcInvoker=org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker@7586beff
09:26:30.389 [main] DEBUG org.apache.hadoop.ipc.Client - getting client out of cache: org.apache.hadoop.ipc.Client@15de0b3c
09:26:30.628 [main] DEBUG o.a.hadoop.util.PerformanceAdvisory - Both short-circuit local reads and UNIX domain socket are disabled.
09:26:30.632 [main] DEBUG o.a.h.h.p.d.s.DataTransferSaslUtil - DataTransferProtocol not using SaslPropertiesResolver, no QOP found in configuration for dfs.data.transfer.protection
09:26:30.642 [main] DEBUG o.a.h.security.UserGroupInformation - PrivilegedAction as:Administrator (auth:SIMPLE) from:org.apache.hadoop.mapreduce.Job.connect(Job.java:1262)
09:26:30.645 [main] DEBUG org.apache.hadoop.mapreduce.Cluster - Trying ClientProtocolProvider : org.apache.hadoop.mapred.LocalClientProtocolProvider
09:26:30.649 [main] INFO  o.a.h.conf.Configuration.deprecation - session.id is deprecated. Instead, use dfs.metrics.session-id
09:26:30.650 [main] INFO  o.a.hadoop.metrics.jvm.JvmMetrics - Initializing JVM Metrics with processName=JobTracker, sessionId=
09:26:30.655 [main] DEBUG org.apache.hadoop.mapreduce.Cluster - Picked org.apache.hadoop.mapred.LocalClientProtocolProvider as the ClientProtocolProvider
09:26:30.655 [main] DEBUG o.a.h.security.UserGroupInformation - PrivilegedAction as:Administrator (auth:SIMPLE) from:org.apache.hadoop.mapreduce.Cluster.getFileSystem(Cluster.java:161)
09:26:30.658 [main] DEBUG o.a.h.security.UserGroupInformation - PrivilegedAction as:Administrator (auth:SIMPLE) from:org.apache.hadoop.mapreduce.Job.submit(Job.java:1294)
09:26:30.673 [main] DEBUG org.apache.hadoop.ipc.Client - The ping interval is 60000 ms.
09:26:30.674 [main] DEBUG org.apache.hadoop.ipc.Client - Connecting to /192.168.211.4:9000
09:26:30.724 [IPC Client (1164440413) connection to /192.168.211.4:9000 from Administrator] DEBUG org.apache.hadoop.ipc.Client - IPC Client (1164440413) connection to /192.168.211.4:9000 from Administrator: starting, having connections 1
09:26:30.726 [IPC Parameter Sending Thread #0] DEBUG org.apache.hadoop.ipc.Client - IPC Client (1164440413) connection to /192.168.211.4:9000 from Administrator sending #0
09:26:30.733 [IPC Client (1164440413) connection to /192.168.211.4:9000 from Administrator] DEBUG org.apache.hadoop.ipc.Client - IPC Client (1164440413) connection to /192.168.211.4:9000 from Administrator got value #0
09:26:30.734 [main] DEBUG o.a.hadoop.ipc.ProtobufRpcEngine - Call: getFileInfo took 73ms
09:26:30.776 [main] DEBUG o.apache.hadoop.io.nativeio.NativeIO - Initialized cache for IDs to User/Group mapping with a  cache timeout of 14400 seconds.
09:26:30.778 [main] DEBUG o.a.hadoop.mapreduce.JobSubmitter - Configuring job job_local1297123336_0001 with file:/tmp/hadoop-Administrator/mapred/staging/Administrator1297123336/.staging/job_local1297123336_0001 as the submit dir
09:26:30.779 [main] DEBUG o.a.hadoop.mapreduce.JobSubmitter - adding the following namenodes' delegation tokens:[file:///]
09:26:30.873 [main] WARN  o.a.h.mapreduce.JobResourceUploader - Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this.
09:26:30.873 [main] DEBUG o.a.h.mapreduce.JobResourceUploader - default FileSystem: file:///
09:26:30.877 [main] WARN  o.a.h.mapreduce.JobResourceUploader - No job jar file set.  User classes may not be found. See Job or Job#setJar(String).
09:26:30.879 [main] DEBUG o.a.hadoop.mapreduce.JobSubmitter - Creating splits at file:/tmp/hadoop-Administrator/mapred/staging/Administrator1297123336/.staging/job_local1297123336_0001
09:26:30.887 [IPC Parameter Sending Thread #0] DEBUG org.apache.hadoop.ipc.Client - IPC Client (1164440413) connection to /192.168.211.4:9000 from Administrator sending #1
09:26:30.991 [IPC Client (1164440413) connection to /192.168.211.4:9000 from Administrator] DEBUG org.apache.hadoop.ipc.Client - IPC Client (1164440413) connection to /192.168.211.4:9000 from Administrator got value #1
09:26:30.991 [main] DEBUG o.a.hadoop.ipc.ProtobufRpcEngine - Call: getFileInfo took 104ms
09:26:31.007 [main] DEBUG o.a.h.m.lib.input.FileInputFormat - Time taken to get FileStatuses: 124
09:26:31.007 [main] INFO  o.a.h.m.lib.input.FileInputFormat - Total input paths to process : 1
09:26:31.008 [IPC Parameter Sending Thread #0] DEBUG org.apache.hadoop.ipc.Client - IPC Client (1164440413) connection to /192.168.211.4:9000 from Administrator sending #2
09:26:31.011 [IPC Client (1164440413) connection to /192.168.211.4:9000 from Administrator] DEBUG org.apache.hadoop.ipc.Client - IPC Client (1164440413) connection to /192.168.211.4:9000 from Administrator got value #2
09:26:31.011 [main] DEBUG o.a.hadoop.ipc.ProtobufRpcEngine - Call: getBlockLocations took 3ms
09:26:31.038 [main] DEBUG o.a.h.m.lib.input.FileInputFormat - Total # of splits generated by getSplits: 1, TimeTaken: 157
09:26:31.077 [main] INFO  o.a.hadoop.mapreduce.JobSubmitter - number of splits:1
09:26:31.109 [main] DEBUG org.apache.hadoop.conf.Configuration - Handling deprecation for all properties in config...
09:26:31.109 [main] DEBUG org.apache.hadoop.conf.Configuration - Handling deprecation for mapreduce.jobtracker.address
09:26:31.109 [main] DEBUG org.apache.hadoop.conf.Configuration - Handling deprecation for yarn.resourcemanager.scheduler.monitor.policies
09:26:31.109 [main] DEBUG org.apache.hadoop.conf.Configuration - Handling deprecation for dfs.namenode.resource.check.interval
09:26:31.109 [main] DEBUG org.apache.hadoop.conf.Configuration - Handling deprecation for mapreduce.jobhistory.client.thread-count
09:26:31.109 [main] DEBUG org.apache.hadoop.conf.Configuration - Handling deprecation for mapred.child.java.opts
09:26:31.109 [main] DEBUG org.apache.hadoop.conf.Configuration - Handling deprecation for mapreduce.jobtracker.retiredjobs.cache.size
09:26:31.109 [main] DEBUG org.apache.hadoop.conf.Configuration - Handling deprecation for dfs.client.https.need-auth
···
09:26:31.476 [main] DEBUG org.apache.hadoop.conf.Configuration - Handling deprecation for yarn.resourcemanager.webapp.https.address
09:26:31.476 [main] DEBUG org.apache.hadoop.conf.Configuration - Handling deprecation for yarn.nodemanager.vmem-pmem-ratio
09:26:31.476 [main] DEBUG org.apache.hadoop.conf.Configuration - Handling deprecation for dfs.namenode.checkpoint.period
09:26:31.476 [main] DEBUG org.apache.hadoop.conf.Configuration - Handling deprecation for dfs.ha.automatic-failover.enabled
09:26:31.508 [main] INFO  org.apache.hadoop.mapreduce.Job - The url to track the job: http://localhost:8080/
09:26:31.509 [main] INFO  org.apache.hadoop.mapreduce.Job - Running job: job_local1297123336_0001
09:26:31.510 [Thread-4] INFO  o.a.hadoop.mapred.LocalJobRunner - OutputCommitter set in config null
09:26:31.510 [main] DEBUG o.a.h.security.UserGroupInformation - PrivilegedAction as:Administrator (auth:SIMPLE) from:org.apache.hadoop.mapreduce.Job.updateStatus(Job.java:323)
09:26:31.510 [main] DEBUG o.a.h.security.UserGroupInformation - PrivilegedAction as:Administrator (auth:SIMPLE) from:org.apache.hadoop.mapreduce.Job.updateStatus(Job.java:323)
09:26:31.542 [Thread-4] INFO  o.a.hadoop.mapred.LocalJobRunner - OutputCommitter is org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter
09:26:31.550 [Thread-4] DEBUG org.apache.hadoop.hdfs.DFSClient - /output/wordCount.txt/_temporary/0: masked=rwxr-xr-x
09:26:31.576 [IPC Parameter Sending Thread #0] DEBUG org.apache.hadoop.ipc.Client - IPC Client (1164440413) connection to /192.168.211.4:9000 from Administrator sending #3
09:26:31.614 [IPC Client (1164440413) connection to /192.168.211.4:9000 from Administrator] DEBUG org.apache.hadoop.ipc.Client - IPC Client (1164440413) connection to /192.168.211.4:9000 from Administrator got value #3
09:26:31.614 [Thread-4] DEBUG o.a.hadoop.ipc.ProtobufRpcEngine - Call: mkdirs took 39ms
09:26:31.673 [Thread-4] DEBUG o.a.hadoop.mapred.LocalJobRunner - Starting mapper thread pool executor.
09:26:31.673 [Thread-4] DEBUG o.a.hadoop.mapred.LocalJobRunner - Max local threads: 1
09:26:31.673 [Thread-4] DEBUG o.a.hadoop.mapred.LocalJobRunner - Map tasks to process: 1
09:26:31.674 [Thread-4] INFO  o.a.hadoop.mapred.LocalJobRunner - Waiting for map tasks
09:26:31.675 [LocalJobRunner Map Task Executor #0] INFO  o.a.hadoop.mapred.LocalJobRunner - Starting task: attempt_local1297123336_0001_m_000000_0
09:26:31.688 [LocalJobRunner Map Task Executor #0] DEBUG o.apache.hadoop.mapred.SortedRanges - currentIndex 0   0:0
09:26:31.764 [LocalJobRunner Map Task Executor #0] DEBUG o.a.hadoop.mapred.LocalJobRunner - mapreduce.cluster.local.dir for child : /tmp/hadoop-Administrator/mapred/local/localRunner//Administrator/jobcache/job_local1297123336_0001/attempt_local1297123336_0001_m_000000_0
09:26:31.765 [LocalJobRunner Map Task Executor #0] DEBUG org.apache.hadoop.mapred.Task - using new api for output committer
09:26:31.769 [LocalJobRunner Map Task Executor #0] INFO  o.a.h.y.util.ProcfsBasedProcessTree - ProcfsBasedProcessTree currently is supported only on Linux.
09:26:32.227 [LocalJobRunner Map Task Executor #0] INFO  org.apache.hadoop.mapred.Task -  Using ResourceCalculatorProcessTree : org.apache.hadoop.yarn.util.WindowsBasedProcessTree@4fa9f525
09:26:32.229 [LocalJobRunner Map Task Executor #0] INFO  org.apache.hadoop.mapred.MapTask - Processing split: hdfs://192.168.211.4:9000/input/data.txt:0+108
09:26:32.239 [LocalJobRunner Map Task Executor #0] DEBUG org.apache.hadoop.mapred.MapTask - Trying map output collector class: org.apache.hadoop.mapred.MapTask$MapOutputBuffer
09:26:32.280 [LocalJobRunner Map Task Executor #0] INFO  org.apache.hadoop.mapred.MapTask - (EQUATOR) 0 kvi 26214396(104857584)
09:26:32.280 [LocalJobRunner Map Task Executor #0] INFO  org.apache.hadoop.mapred.MapTask - mapreduce.task.io.sort.mb: 100
09:26:32.280 [LocalJobRunner Map Task Executor #0] INFO  org.apache.hadoop.mapred.MapTask - soft limit at 83886080
09:26:32.280 [LocalJobRunner Map Task Executor #0] INFO  org.apache.hadoop.mapred.MapTask - bufstart = 0; bufvoid = 104857600
09:26:32.280 [LocalJobRunner Map Task Executor #0] INFO  org.apache.hadoop.mapred.MapTask - kvstart = 26214396; length = 6553600
09:26:32.283 [LocalJobRunner Map Task Executor #0] INFO  org.apache.hadoop.mapred.MapTask - Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
09:26:32.288 [IPC Parameter Sending Thread #0] DEBUG org.apache.hadoop.ipc.Client - IPC Client (1164440413) connection to /192.168.211.4:9000 from Administrator sending #4
09:26:32.290 [IPC Client (1164440413) connection to /192.168.211.4:9000 from Administrator] DEBUG org.apache.hadoop.ipc.Client - IPC Client (1164440413) connection to /192.168.211.4:9000 from Administrator got value #4
09:26:32.290 [LocalJobRunner Map Task Executor #0] DEBUG o.a.hadoop.ipc.ProtobufRpcEngine - Call: getBlockLocations took 2ms
09:26:32.290 [LocalJobRunner Map Task Executor #0] DEBUG org.apache.hadoop.hdfs.DFSClient - newInfo = LocatedBlocks{
  fileLength=108
  underConstruction=false
  blocks=[LocatedBlock{BP-635201075-192.168.211.4-1531450855001:blk_1073749866_9076; getBlockSize()=108; corrupt=false; offset=0; locs=[192.168.211.5:50010, 192.168.211.4:50010, 192.168.211.6:50010]; storageIDs=[DS-0fd87bad-0e06-46d2-8b77-2d51cff18d61, DS-06e73156-72ab-4aac-99b2-9a878cb0855e, DS-384769a1-2033-4b92-83d8-1b5aa4d57564]; storageTypes=[DISK, DISK, DISK]}]
  lastLocatedBlock=LocatedBlock{BP-635201075-192.168.211.4-1531450855001:blk_1073749866_9076; getBlockSize()=108; corrupt=false; offset=0; locs=[192.168.211.6:50010, 192.168.211.4:50010, 192.168.211.5:50010]; storageIDs=[DS-0fd87bad-0e06-46d2-8b77-2d51cff18d61, DS-06e73156-72ab-4aac-99b2-9a878cb0855e, DS-384769a1-2033-4b92-83d8-1b5aa4d57564]; storageTypes=[DISK, DISK, DISK]}
  isLastBlockComplete=true}
09:26:32.292 [LocalJobRunner Map Task Executor #0] DEBUG org.apache.hadoop.hdfs.DFSClient - Connecting to datanode 192.168.211.5:50010
09:26:32.317 [IPC Parameter Sending Thread #0] DEBUG org.apache.hadoop.ipc.Client - IPC Client (1164440413) connection to /192.168.211.4:9000 from Administrator sending #5
09:26:32.324 [IPC Client (1164440413) connection to /192.168.211.4:9000 from Administrator] DEBUG org.apache.hadoop.ipc.Client - IPC Client (1164440413) connection to /192.168.211.4:9000 from Administrator got value #5
09:26:32.324 [LocalJobRunner Map Task Executor #0] DEBUG o.a.hadoop.ipc.ProtobufRpcEngine - Call: getServerDefaults took 7ms
09:26:32.333 [LocalJobRunner Map Task Executor #0] DEBUG o.a.h.h.p.d.s.SaslDataTransferClient - SASL client skipping handshake in unsecured configuration for addr = /192.168.211.5, datanodeId = 192.168.211.5:50010
09:26:32.511 [main] DEBUG o.a.h.security.UserGroupInformation - PrivilegedAction as:Administrator (auth:SIMPLE) from:org.apache.hadoop.mapreduce.Job.updateStatus(Job.java:323)
09:26:32.511 [main] INFO  org.apache.hadoop.mapreduce.Job - Job job_local1297123336_0001 running in uber mode : false
09:26:32.678 [main] INFO  org.apache.hadoop.mapreduce.Job -  map 0% reduce 0%
09:26:32.678 [main] DEBUG o.a.h.security.UserGroupInformation - PrivilegedAction as:Administrator (auth:SIMPLE) from:org.apache.hadoop.mapreduce.Job.getTaskCompletionEvents(Job.java:677)
09:26:32.679 [main] DEBUG o.a.h.security.UserGroupInformation - PrivilegedAction as:Administrator (auth:SIMPLE) from:org.apache.hadoop.mapreduce.Job.updateStatus(Job.java:323)
09:26:32.679 [main] DEBUG o.a.h.security.UserGroupInformation - PrivilegedAction as:Administrator (auth:SIMPLE) from:org.apache.hadoop.mapreduce.Job.updateStatus(Job.java:323)
09:26:32.684 [LocalJobRunner Map Task Executor #0] INFO  o.a.hadoop.mapred.LocalJobRunner - 
09:26:32.697 [LocalJobRunner Map Task Executor #0] INFO  org.apache.hadoop.mapred.MapTask - Starting flush of map output
09:26:32.697 [LocalJobRunner Map Task Executor #0] INFO  org.apache.hadoop.mapred.MapTask - Spilling map output
09:26:32.697 [LocalJobRunner Map Task Executor #0] INFO  org.apache.hadoop.mapred.MapTask - bufstart = 0; bufend = 200; bufvoid = 104857600
09:26:32.697 [LocalJobRunner Map Task Executor #0] INFO  org.apache.hadoop.mapred.MapTask - kvstart = 26214396(104857584); kvend = 26214308(104857232); length = 89/6553600
09:26:32.711 [LocalJobRunner Map Task Executor #0] INFO  org.apache.hadoop.mapred.MapTask - Finished spill 0
09:26:32.718 [LocalJobRunner Map Task Executor #0] INFO  org.apache.hadoop.mapred.Task - Task:attempt_local1297123336_0001_m_000000_0 is done. And is in the process of committing
09:26:32.726 [IPC Parameter Sending Thread #0] DEBUG org.apache.hadoop.ipc.Client - IPC Client (1164440413) connection to /192.168.211.4:9000 from Administrator sending #6
09:26:32.727 [IPC Client (1164440413) connection to /192.168.211.4:9000 from Administrator] DEBUG org.apache.hadoop.ipc.Client - IPC Client (1164440413) connection to /192.168.211.4:9000 from Administrator got value #6
09:26:32.727 [LocalJobRunner Map Task Executor #0] DEBUG o.a.hadoop.ipc.ProtobufRpcEngine - Call: getFileInfo took 1ms
09:26:32.728 [LocalJobRunner Map Task Executor #0] INFO  o.a.hadoop.mapred.LocalJobRunner - map
09:26:32.729 [LocalJobRunner Map Task Executor #0] INFO  org.apache.hadoop.mapred.Task - Task 'attempt_local1297123336_0001_m_000000_0' done.
09:26:32.729 [LocalJobRunner Map Task Executor #0] INFO  o.a.hadoop.mapred.LocalJobRunner - Finishing task: attempt_local1297123336_0001_m_000000_0
09:26:32.729 [Thread-4] INFO  o.a.hadoop.mapred.LocalJobRunner - map task executor complete.
09:26:32.730 [Thread-4] DEBUG o.a.hadoop.mapred.LocalJobRunner - Starting reduce thread pool executor.
09:26:32.730 [Thread-4] DEBUG o.a.hadoop.mapred.LocalJobRunner - Max local threads: 1
09:26:32.730 [Thread-4] DEBUG o.a.hadoop.mapred.LocalJobRunner - Reduce tasks to process: 1
09:26:32.731 [Thread-4] INFO  o.a.hadoop.mapred.LocalJobRunner - Waiting for reduce tasks
09:26:32.731 [pool-6-thread-1] INFO  o.a.hadoop.mapred.LocalJobRunner - Starting task: attempt_local1297123336_0001_r_000000_0
09:26:32.738 [pool-6-thread-1] DEBUG o.apache.hadoop.mapred.SortedRanges - currentIndex 0   0:0
09:26:32.741 [pool-6-thread-1] DEBUG o.a.hadoop.mapred.LocalJobRunner - mapreduce.cluster.local.dir for child : /tmp/hadoop-Administrator/mapred/local/localRunner//Administrator/jobcache/job_local1297123336_0001/attempt_local1297123336_0001_r_000000_0
09:26:32.742 [pool-6-thread-1] DEBUG org.apache.hadoop.mapred.Task - using new api for output committer
09:26:32.742 [pool-6-thread-1] INFO  o.a.h.y.util.ProcfsBasedProcessTree - ProcfsBasedProcessTree currently is supported only on Linux.
09:26:32.799 [pool-6-thread-1] INFO  org.apache.hadoop.mapred.Task -  Using ResourceCalculatorProcessTree : org.apache.hadoop.yarn.util.WindowsBasedProcessTree@253beed0
09:26:32.803 [pool-6-thread-1] INFO  org.apache.hadoop.mapred.ReduceTask - Using ShuffleConsumerPlugin: org.apache.hadoop.mapreduce.task.reduce.Shuffle@2ffb27b6
09:26:32.827 [pool-6-thread-1] INFO  o.a.h.m.task.reduce.MergeManagerImpl - MergerManager: memoryLimit=2654155520, maxSingleShuffleLimit=663538880, mergeThreshold=1751742720, ioSortFactor=10, memToMemMergeOutputsThreshold=10
09:26:32.829 [EventFetcher for fetching Map Completion Events] INFO  o.a.h.m.task.reduce.EventFetcher - attempt_local1297123336_0001_r_000000_0 Thread started: EventFetcher for fetching Map Completion Events
09:26:32.830 [EventFetcher for fetching Map Completion Events] DEBUG o.a.h.m.task.reduce.EventFetcher - Got 0 map completion events from 0
09:26:32.830 [EventFetcher for fetching Map Completion Events] DEBUG o.a.h.m.task.reduce.EventFetcher - GetMapEventsThread about to sleep for 1000
09:26:32.834 [localfetcher#1] DEBUG o.a.h.m.task.reduce.LocalFetcher - LocalFetcher 1 going to fetch: attempt_local1297123336_0001_m_000000_0
09:26:32.850 [localfetcher#1] DEBUG o.a.h.m.task.reduce.MergeManagerImpl - attempt_local1297123336_0001_m_000000_0: Proceeding with shuffle since usedMemory (0) is lesser than memoryLimit (2654155520).CommitMemory is (0)
09:26:32.850 [localfetcher#1] INFO  o.a.h.m.task.reduce.LocalFetcher - localfetcher#1 about to shuffle output of map attempt_local1297123336_0001_m_000000_0 decomp: 248 len: 252 to MEMORY
09:26:32.854 [localfetcher#1] INFO  o.a.h.m.t.reduce.InMemoryMapOutput - Read 248 bytes from map-output for attempt_local1297123336_0001_m_000000_0
09:26:32.855 [localfetcher#1] INFO  o.a.h.m.task.reduce.MergeManagerImpl - closeInMemoryFile -> map-output of size: 248, inMemoryMapOutputs.size() -> 1, commitMemory -> 0, usedMemory ->248
09:26:32.856 [localfetcher#1] DEBUG o.a.h.m.t.r.ShuffleSchedulerImpl - map attempt_local1297123336_0001_m_000000_0 done 1 / 1 copied.
09:26:32.856 [EventFetcher for fetching Map Completion Events] INFO  o.a.h.m.task.reduce.EventFetcher - EventFetcher is interrupted.. Returning
09:26:32.856 [pool-6-thread-1] INFO  o.a.hadoop.mapred.LocalJobRunner - 1 / 1 copied.
09:26:32.857 [pool-6-thread-1] INFO  o.a.h.m.task.reduce.MergeManagerImpl - finalMerge called with 1 in-memory map-outputs and 0 on-disk map-outputs
09:26:32.864 [pool-6-thread-1] INFO  org.apache.hadoop.mapred.Merger - Merging 1 sorted segments
09:26:32.864 [pool-6-thread-1] INFO  org.apache.hadoop.mapred.Merger - Down to the last merge-pass, with 1 segments left of total size: 241 bytes
09:26:32.866 [pool-6-thread-1] INFO  o.a.h.m.task.reduce.MergeManagerImpl - Merged 1 segments, 248 bytes to disk to satisfy reduce memory limit
09:26:32.867 [pool-6-thread-1] DEBUG o.a.h.m.task.reduce.MergeManagerImpl - Disk file: /tmp/hadoop-Administrator/mapred/local/localRunner/Administrator/jobcache/job_local1297123336_0001/attempt_local1297123336_0001_r_000000_0/output/map_0.out.merged Length is 252
09:26:32.867 [pool-6-thread-1] INFO  o.a.h.m.task.reduce.MergeManagerImpl - Merging 1 files, 252 bytes from disk
09:26:32.867 [pool-6-thread-1] INFO  o.a.h.m.task.reduce.MergeManagerImpl - Merging 0 segments, 0 bytes from memory into reduce
09:26:32.867 [pool-6-thread-1] INFO  org.apache.hadoop.mapred.Merger - Merging 1 sorted segments
09:26:32.868 [pool-6-thread-1] INFO  org.apache.hadoop.mapred.Merger - Down to the last merge-pass, with 1 segments left of total size: 241 bytes
09:26:32.868 [pool-6-thread-1] INFO  o.a.hadoop.mapred.LocalJobRunner - 1 / 1 copied.
09:26:32.870 [pool-6-thread-1] DEBUG org.apache.hadoop.hdfs.DFSClient - /output/wordCount.txt/_temporary/0/_temporary/attempt_local1297123336_0001_r_000000_0/part-r-00000: masked=rw-r--r--
09:26:32.906 [IPC Parameter Sending Thread #0] DEBUG org.apache.hadoop.ipc.Client - IPC Client (1164440413) connection to /192.168.211.4:9000 from Administrator sending #7
09:26:32.932 [IPC Client (1164440413) connection to /192.168.211.4:9000 from Administrator] DEBUG org.apache.hadoop.ipc.Client - IPC Client (1164440413) connection to /192.168.211.4:9000 from Administrator got value #7
09:26:32.932 [pool-6-thread-1] DEBUG o.a.hadoop.ipc.ProtobufRpcEngine - Call: create took 26ms
09:26:32.934 [pool-6-thread-1] DEBUG org.apache.hadoop.hdfs.DFSClient - computePacketChunkSize: src=/output/wordCount.txt/_temporary/0/_temporary/attempt_local1297123336_0001_r_000000_0/part-r-00000, chunkSize=516, chunksPerPacket=127, packetSize=65532
09:26:32.940 [pool-6-thread-1] INFO  o.a.h.conf.Configuration.deprecation - mapred.skip.on is deprecated. Instead, use mapreduce.job.skiprecords
09:26:32.940 [LeaseRenewer:Administrator@192.168.211.4:9000] DEBUG org.apache.hadoop.hdfs.LeaseRenewer - Lease renewer daemon for [DFSClient_NONMAPREDUCE_2038479896_1] with renew id 1 started
09:26:32.948 [pool-6-thread-1] DEBUG org.apache.hadoop.hdfs.DFSClient - DFSClient writeChunk allocating new packet seqno=0, src=/output/wordCount.txt/_temporary/0/_temporary/attempt_local1297123336_0001_r_000000_0/part-r-00000, packetSize=65532, chunksPerPacket=127, bytesCurBlock=0
09:26:32.948 [pool-6-thread-1] DEBUG org.apache.hadoop.hdfs.DFSClient - Queued packet 0
09:26:32.948 [pool-6-thread-1] DEBUG org.apache.hadoop.hdfs.DFSClient - Queued packet 1
09:26:32.948 [pool-6-thread-1] DEBUG org.apache.hadoop.hdfs.DFSClient - Waiting for ack for: 1
09:26:32.948 [Thread-14] DEBUG org.apache.hadoop.hdfs.DFSClient - Allocating new block
09:26:32.992 [IPC Parameter Sending Thread #0] DEBUG org.apache.hadoop.ipc.Client - IPC Client (1164440413) connection to /192.168.211.4:9000 from Administrator sending #8
09:26:33.002 [IPC Client (1164440413) connection to /192.168.211.4:9000 from Administrator] DEBUG org.apache.hadoop.ipc.Client - IPC Client (1164440413) connection to /192.168.211.4:9000 from Administrator got value #8
09:26:33.002 [Thread-14] DEBUG o.a.hadoop.ipc.ProtobufRpcEngine - Call: addBlock took 10ms
09:26:33.005 [Thread-14] DEBUG org.apache.hadoop.hdfs.DFSClient - pipeline = 192.168.211.5:50010
09:26:33.005 [Thread-14] DEBUG org.apache.hadoop.hdfs.DFSClient - pipeline = 192.168.211.6:50010
09:26:33.005 [Thread-14] DEBUG org.apache.hadoop.hdfs.DFSClient - pipeline = 192.168.211.4:50010
09:26:33.005 [Thread-14] DEBUG org.apache.hadoop.hdfs.DFSClient - Connecting to datanode 192.168.211.5:50010
09:26:33.005 [Thread-14] DEBUG org.apache.hadoop.hdfs.DFSClient - Send buf size 131072
09:26:33.005 [Thread-14] DEBUG o.a.h.h.p.d.s.SaslDataTransferClient - SASL client skipping handshake in unsecured configuration for addr = /192.168.211.5, datanodeId = 192.168.211.5:50010
09:26:33.218 [DataStreamer for file /output/wordCount.txt/_temporary/0/_temporary/attempt_local1297123336_0001_r_000000_0/part-r-00000 block BP-635201075-192.168.211.4-1531450855001:blk_1073749869_9079] DEBUG org.apache.hadoop.hdfs.DFSClient - DataStreamer block BP-635201075-192.168.211.4-1531450855001:blk_1073749869_9079 sending packet packet seqno:0 offsetInBlock:0 lastPacketInBlock:false lastByteOffsetInBlock: 122
09:26:33.364 [ResponseProcessor for block BP-635201075-192.168.211.4-1531450855001:blk_1073749869_9079] DEBUG org.apache.hadoop.hdfs.DFSClient - DFSClient seqno: 0 status: SUCCESS status: SUCCESS status: SUCCESS downstreamAckTimeNanos: 64714425
09:26:33.364 [DataStreamer for file /output/wordCount.txt/_temporary/0/_temporary/attempt_local1297123336_0001_r_000000_0/part-r-00000 block BP-635201075-192.168.211.4-1531450855001:blk_1073749869_9079] DEBUG org.apache.hadoop.hdfs.DFSClient - DataStreamer block BP-635201075-192.168.211.4-1531450855001:blk_1073749869_9079 sending packet packet seqno:1 offsetInBlock:122 lastPacketInBlock:true lastByteOffsetInBlock: 122
09:26:33.402 [ResponseProcessor for block BP-635201075-192.168.211.4-1531450855001:blk_1073749869_9079] DEBUG org.apache.hadoop.hdfs.DFSClient - DFSClient seqno: 1 status: SUCCESS status: SUCCESS status: SUCCESS downstreamAckTimeNanos: 26266910
09:26:33.402 [DataStreamer for file /output/wordCount.txt/_temporary/0/_temporary/attempt_local1297123336_0001_r_000000_0/part-r-00000 block BP-635201075-192.168.211.4-1531450855001:blk_1073749869_9079] DEBUG org.apache.hadoop.hdfs.DFSClient - Closing old block BP-635201075-192.168.211.4-1531450855001:blk_1073749869_9079
09:26:33.405 [IPC Parameter Sending Thread #0] DEBUG org.apache.hadoop.ipc.Client - IPC Client (1164440413) connection to /192.168.211.4:9000 from Administrator sending #9
09:26:33.412 [IPC Client (1164440413) connection to /192.168.211.4:9000 from Administrator] DEBUG org.apache.hadoop.ipc.Client - IPC Client (1164440413) connection to /192.168.211.4:9000 from Administrator got value #9
09:26:33.412 [pool-6-thread-1] DEBUG o.a.hadoop.ipc.ProtobufRpcEngine - Call: complete took 8ms
09:26:33.414 [pool-6-thread-1] INFO  org.apache.hadoop.mapred.Task - Task:attempt_local1297123336_0001_r_000000_0 is done. And is in the process of committing
09:26:33.415 [IPC Parameter Sending Thread #0] DEBUG org.apache.hadoop.ipc.Client - IPC Client (1164440413) connection to /192.168.211.4:9000 from Administrator sending #10
09:26:33.416 [IPC Client (1164440413) connection to /192.168.211.4:9000 from Administrator] DEBUG org.apache.hadoop.ipc.Client - IPC Client (1164440413) connection to /192.168.211.4:9000 from Administrator got value #10
09:26:33.416 [pool-6-thread-1] DEBUG o.a.hadoop.ipc.ProtobufRpcEngine - Call: getFileInfo took 2ms
09:26:33.416 [pool-6-thread-1] INFO  o.a.hadoop.mapred.LocalJobRunner - 1 / 1 copied.
09:26:33.416 [pool-6-thread-1] INFO  org.apache.hadoop.mapred.Task - Task attempt_local1297123336_0001_r_000000_0 is allowed to commit now
09:26:33.417 [IPC Parameter Sending Thread #0] DEBUG org.apache.hadoop.ipc.Client - IPC Client (1164440413) connection to /192.168.211.4:9000 from Administrator sending #11
09:26:33.418 [IPC Client (1164440413) connection to /192.168.211.4:9000 from Administrator] DEBUG org.apache.hadoop.ipc.Client - IPC Client (1164440413) connection to /192.168.211.4:9000 from Administrator got value #11
09:26:33.418 [pool-6-thread-1] DEBUG o.a.hadoop.ipc.ProtobufRpcEngine - Call: getFileInfo took 2ms
09:26:33.418 [IPC Parameter Sending Thread #0] DEBUG org.apache.hadoop.ipc.Client - IPC Client (1164440413) connection to /192.168.211.4:9000 from Administrator sending #12
09:26:33.419 [IPC Client (1164440413) connection to /192.168.211.4:9000 from Administrator] DEBUG org.apache.hadoop.ipc.Client - IPC Client (1164440413) connection to /192.168.211.4:9000 from Administrator got value #12
09:26:33.419 [pool-6-thread-1] DEBUG o.a.hadoop.ipc.ProtobufRpcEngine - Call: getFileInfo took 1ms
09:26:33.438 [IPC Parameter Sending Thread #0] DEBUG org.apache.hadoop.ipc.Client - IPC Client (1164440413) connection to /192.168.211.4:9000 from Administrator sending #13
09:26:33.444 [IPC Client (1164440413) connection to /192.168.211.4:9000 from Administrator] DEBUG org.apache.hadoop.ipc.Client - IPC Client (1164440413) connection to /192.168.211.4:9000 from Administrator got value #13
09:26:33.444 [pool-6-thread-1] DEBUG o.a.hadoop.ipc.ProtobufRpcEngine - Call: rename took 6ms
09:26:33.445 [pool-6-thread-1] INFO  o.a.h.m.l.output.FileOutputCommitter - Saved output of task 'attempt_local1297123336_0001_r_000000_0' to hdfs://192.168.211.4:9000/output/wordCount.txt/_temporary/0/task_local1297123336_0001_r_000000
09:26:33.446 [pool-6-thread-1] INFO  o.a.hadoop.mapred.LocalJobRunner - reduce > reduce
09:26:33.446 [pool-6-thread-1] INFO  org.apache.hadoop.mapred.Task - Task 'attempt_local1297123336_0001_r_000000_0' done.
09:26:33.446 [pool-6-thread-1] INFO  o.a.hadoop.mapred.LocalJobRunner - Finishing task: attempt_local1297123336_0001_r_000000_0
09:26:33.446 [Thread-4] INFO  o.a.hadoop.mapred.LocalJobRunner - reduce task executor complete.
09:26:33.452 [IPC Parameter Sending Thread #0] DEBUG org.apache.hadoop.ipc.Client - IPC Client (1164440413) connection to /192.168.211.4:9000 from Administrator sending #14
09:26:33.470 [IPC Client (1164440413) connection to /192.168.211.4:9000 from Administrator] DEBUG org.apache.hadoop.ipc.Client - IPC Client (1164440413) connection to /192.168.211.4:9000 from Administrator got value #14
09:26:33.470 [Thread-4] DEBUG o.a.hadoop.ipc.ProtobufRpcEngine - Call: getListing took 18ms
09:26:33.496 [Thread-4] DEBUG o.a.h.m.l.output.FileOutputCommitter - Merging data from FileStatus{path=hdfs://192.168.211.4:9000/output/wordCount.txt/_temporary/0/task_local1297123336_0001_r_000000; isDirectory=true; modification_time=1545441993666; access_time=0; owner=Administrator; group=supergroup; permission=rwxr-xr-x; isSymlink=false} to hdfs://192.168.211.4:9000/output/wordCount.txt
09:26:33.496 [IPC Parameter Sending Thread #0] DEBUG org.apache.hadoop.ipc.Client - IPC Client (1164440413) connection to /192.168.211.4:9000 from Administrator sending #15
09:26:33.497 [IPC Client (1164440413) connection to /192.168.211.4:9000 from Administrator] DEBUG org.apache.hadoop.ipc.Client - IPC Client (1164440413) connection to /192.168.211.4:9000 from Administrator got value #15
09:26:33.498 [Thread-4] DEBUG o.a.hadoop.ipc.ProtobufRpcEngine - Call: getFileInfo took 2ms
09:26:33.498 [IPC Parameter Sending Thread #0] DEBUG org.apache.hadoop.ipc.Client - IPC Client (1164440413) connection to /192.168.211.4:9000 from Administrator sending #16
09:26:33.499 [IPC Client (1164440413) connection to /192.168.211.4:9000 from Administrator] DEBUG org.apache.hadoop.ipc.Client - IPC Client (1164440413) connection to /192.168.211.4:9000 from Administrator got value #16
09:26:33.499 [Thread-4] DEBUG o.a.hadoop.ipc.ProtobufRpcEngine - Call: getFileInfo took 1ms
09:26:33.499 [IPC Parameter Sending Thread #0] DEBUG org.apache.hadoop.ipc.Client - IPC Client (1164440413) connection to /192.168.211.4:9000 from Administrator sending #17
09:26:33.500 [IPC Client (1164440413) connection to /192.168.211.4:9000 from Administrator] DEBUG org.apache.hadoop.ipc.Client - IPC Client (1164440413) connection to /192.168.211.4:9000 from Administrator got value #17
09:26:33.500 [Thread-4] DEBUG o.a.hadoop.ipc.ProtobufRpcEngine - Call: getListing took 1ms
09:26:33.501 [Thread-4] DEBUG o.a.h.m.l.output.FileOutputCommitter - Merging data from FileStatus{path=hdfs://192.168.211.4:9000/output/wordCount.txt/_temporary/0/task_local1297123336_0001_r_000000/part-r-00000; isDirectory=false; length=122; replication=3; blocksize=134217728; modification_time=1545441994157; access_time=1545441993666; owner=Administrator; group=supergroup; permission=rw-r--r--; isSymlink=false} to hdfs://192.168.211.4:9000/output/wordCount.txt/part-r-00000
09:26:33.501 [IPC Parameter Sending Thread #0] DEBUG org.apache.hadoop.ipc.Client - IPC Client (1164440413) connection to /192.168.211.4:9000 from Administrator sending #18
09:26:33.502 [IPC Client (1164440413) connection to /192.168.211.4:9000 from Administrator] DEBUG org.apache.hadoop.ipc.Client - IPC Client (1164440413) connection to /192.168.211.4:9000 from Administrator got value #18
09:26:33.502 [Thread-4] DEBUG o.a.hadoop.ipc.ProtobufRpcEngine - Call: getFileInfo took 1ms
09:26:33.502 [IPC Parameter Sending Thread #0] DEBUG org.apache.hadoop.ipc.Client - IPC Client (1164440413) connection to /192.168.211.4:9000 from Administrator sending #19
09:26:33.505 [IPC Client (1164440413) connection to /192.168.211.4:9000 from Administrator] DEBUG org.apache.hadoop.ipc.Client - IPC Client (1164440413) connection to /192.168.211.4:9000 from Administrator got value #19
09:26:33.505 [Thread-4] DEBUG o.a.hadoop.ipc.ProtobufRpcEngine - Call: rename took 3ms
09:26:33.507 [IPC Parameter Sending Thread #0] DEBUG org.apache.hadoop.ipc.Client - IPC Client (1164440413) connection to /192.168.211.4:9000 from Administrator sending #20
09:26:33.512 [IPC Client (1164440413) connection to /192.168.211.4:9000 from Administrator] DEBUG org.apache.hadoop.ipc.Client - IPC Client (1164440413) connection to /192.168.211.4:9000 from Administrator got value #20
09:26:33.512 [Thread-4] DEBUG o.a.hadoop.ipc.ProtobufRpcEngine - Call: delete took 5ms
09:26:33.513 [Thread-4] DEBUG org.apache.hadoop.hdfs.DFSClient - /output/wordCount.txt/_SUCCESS: masked=rw-r--r--
09:26:33.514 [IPC Parameter Sending Thread #0] DEBUG org.apache.hadoop.ipc.Client - IPC Client (1164440413) connection to /192.168.211.4:9000 from Administrator sending #21
09:26:33.516 [IPC Client (1164440413) connection to /192.168.211.4:9000 from Administrator] DEBUG org.apache.hadoop.ipc.Client - IPC Client (1164440413) connection to /192.168.211.4:9000 from Administrator got value #21
09:26:33.516 [Thread-4] DEBUG o.a.hadoop.ipc.ProtobufRpcEngine - Call: create took 2ms
09:26:33.516 [Thread-4] DEBUG org.apache.hadoop.hdfs.DFSClient - computePacketChunkSize: src=/output/wordCount.txt/_SUCCESS, chunkSize=516, chunksPerPacket=127, packetSize=65532
09:26:33.516 [Thread-4] DEBUG org.apache.hadoop.hdfs.DFSClient - Waiting for ack for: -1
09:26:33.517 [IPC Parameter Sending Thread #0] DEBUG org.apache.hadoop.ipc.Client - IPC Client (1164440413) connection to /192.168.211.4:9000 from Administrator sending #22
09:26:33.520 [IPC Client (1164440413) connection to /192.168.211.4:9000 from Administrator] DEBUG org.apache.hadoop.ipc.Client - IPC Client (1164440413) connection to /192.168.211.4:9000 from Administrator got value #22
09:26:33.520 [Thread-4] DEBUG o.a.hadoop.ipc.ProtobufRpcEngine - Call: complete took 3ms
09:26:33.536 [Thread-4] DEBUG o.a.h.security.UserGroupInformation - PrivilegedAction as:Administrator (auth:SIMPLE) from:org.apache.hadoop.fs.FileContext.getAbstractFileSystem(FileContext.java:331)
09:26:33.680 [main] INFO  org.apache.hadoop.mapreduce.Job -  map 100% reduce 100%
09:26:33.680 [main] DEBUG o.a.h.security.UserGroupInformation - PrivilegedAction as:Administrator (auth:SIMPLE) from:org.apache.hadoop.mapreduce.Job.getTaskCompletionEvents(Job.java:677)
09:26:33.680 [main] DEBUG o.a.h.security.UserGroupInformation - PrivilegedAction as:Administrator (auth:SIMPLE) from:org.apache.hadoop.mapreduce.Job.updateStatus(Job.java:323)
09:26:33.680 [main] DEBUG o.a.h.security.UserGroupInformation - PrivilegedAction as:Administrator (auth:SIMPLE) from:org.apache.hadoop.mapreduce.Job.updateStatus(Job.java:323)
09:26:33.680 [main] DEBUG o.a.h.security.UserGroupInformation - PrivilegedAction as:Administrator (auth:SIMPLE) from:org.apache.hadoop.mapreduce.Job.getTaskCompletionEvents(Job.java:677)
09:26:33.680 [main] DEBUG o.a.h.security.UserGroupInformation - PrivilegedAction as:Administrator (auth:SIMPLE) from:org.apache.hadoop.mapreduce.Job.updateStatus(Job.java:323)
09:26:33.680 [main] DEBUG o.a.h.security.UserGroupInformation - PrivilegedAction as:Administrator (auth:SIMPLE) from:org.apache.hadoop.mapreduce.Job.updateStatus(Job.java:323)
09:26:33.680 [main] INFO  org.apache.hadoop.mapreduce.Job - Job job_local1297123336_0001 completed successfully
09:26:33.681 [main] DEBUG o.a.h.security.UserGroupInformation - PrivilegedAction as:Administrator (auth:SIMPLE) from:org.apache.hadoop.mapreduce.Job.getCounters(Job.java:765)
09:26:33.687 [main] INFO  org.apache.hadoop.mapreduce.Job - Counters: 38
	File System Counters
		FILE: Number of bytes read=882
		FILE: Number of bytes written=532914
		FILE: Number of read operations=0
		FILE: Number of large read operations=0
		FILE: Number of write operations=0
		HDFS: Number of bytes read=216
		HDFS: Number of bytes written=122
		HDFS: Number of read operations=13
		HDFS: Number of large read operations=0
		HDFS: Number of write operations=4
	Map-Reduce Framework
		Map input records=2
		Map output records=23
		Map output bytes=200
		Map output materialized bytes=252
		Input split bytes=105
		Combine input records=0
		Combine output records=0
		Reduce input groups=17
		Reduce shuffle bytes=252
		Reduce input records=23
		Reduce output records=17
		Spilled Records=46
		Shuffled Maps =1
		Failed Shuffles=0
		Merged Map outputs=1
		GC time elapsed (ms)=0
		CPU time spent (ms)=0
		Physical memory (bytes) snapshot=0
		Virtual memory (bytes) snapshot=0
		Total committed heap usage (bytes)=600834048
	Shuffle Errors
		BAD_ID=0
		CONNECTION=0
		IO_ERROR=0
		WRONG_LENGTH=0
		WRONG_MAP=0
		WRONG_REDUCE=0
	File Input Format Counters 
		Bytes Read=108
	File Output Format Counters 
		Bytes Written=122
09:26:33.687 [main] DEBUG o.a.h.security.UserGroupInformation - PrivilegedAction as:Administrator (auth:SIMPLE) from:org.apache.hadoop.mapreduce.Job.updateStatus(Job.java:323)
09:26:33.689 [Thread-2] DEBUG org.apache.hadoop.ipc.Client - stopping client from cache: org.apache.hadoop.ipc.Client@15de0b3c
09:26:33.689 [Thread-2] DEBUG org.apache.hadoop.ipc.Client - removing client from cache: org.apache.hadoop.ipc.Client@15de0b3c
09:26:33.689 [Thread-2] DEBUG org.apache.hadoop.ipc.Client - stopping actual client because no more references remain: org.apache.hadoop.ipc.Client@15de0b3c
09:26:33.689 [Thread-2] DEBUG org.apache.hadoop.ipc.Client - Stopping client
09:26:33.689 [IPC Client (1164440413) connection to /192.168.211.4:9000 from Administrator] DEBUG org.apache.hadoop.ipc.Client - IPC Client (1164440413) connection to /192.168.211.4:9000 from Administrator: closed
09:26:33.689 [IPC Client (1164440413) connection to /192.168.211.4:9000 from Administrator] DEBUG org.apache.hadoop.ipc.Client - IPC Client (1164440413) connection to /192.168.211.4:9000 from Administrator: stopped, remaining connections 0

Process finished with exit code 0
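
A few of the counters in the log above can be sanity-checked by hand. The sketch below is not part of the original project: the class `CounterCheck` and its method names are hypothetical, and it uses only plain Java, no Hadoop classes. It assumes the input is the two lines of /input/data.txt shown later in this post, each terminated by a single `\n`, and that a `Text` key is serialized as a one-byte length prefix plus its UTF-8 bytes (which holds for words of up to 127 bytes) while an `IntWritable` value takes 4 bytes.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper: recomputes some job counters from the raw input lines.
public class CounterCheck {
    // Assumed content of /input/data.txt (copied from the listing in this post).
    static final String[] LINES = {
        "hadoop is a good platform to analyze big data in disk.",
        "spark is a perfet tool to deal with big data in RAM."
    };

    // "Map output records": one (word, 1) pair per space-separated token.
    static int mapOutputRecords(String[] lines) {
        int records = 0;
        for (String line : lines) {
            records += line.split(" ").length;
        }
        return records;
    }

    // "Reduce input groups": the number of distinct keys seen by the reducer.
    static int reduceInputGroups(String[] lines) {
        Set<String> distinct = new HashSet<>();
        for (String line : lines) {
            for (String word : line.split(" ")) {
                distinct.add(word);
            }
        }
        return distinct.size();
    }

    // "Map output bytes": per record, 1 length byte + UTF-8 word bytes (Text)
    // plus 4 bytes (IntWritable). Assumes every word is at most 127 bytes.
    static int mapOutputBytes(String[] lines) {
        int bytes = 0;
        for (String line : lines) {
            for (String word : line.split(" ")) {
                bytes += 1 + word.getBytes(StandardCharsets.UTF_8).length + 4;
            }
        }
        return bytes;
    }

    // "Bytes Read": every line plus an assumed single '\n' terminator.
    static int inputBytes(String[] lines) {
        int bytes = 0;
        for (String line : lines) {
            bytes += line.getBytes(StandardCharsets.UTF_8).length + 1;
        }
        return bytes;
    }

    public static void main(String[] args) {
        System.out.println("Map input records   = " + LINES.length);
        System.out.println("Map output records  = " + mapOutputRecords(LINES));
        System.out.println("Map output bytes    = " + mapOutputBytes(LINES));
        System.out.println("Reduce input groups = " + reduceInputGroups(LINES));
        System.out.println("Bytes Read          = " + inputBytes(LINES));
    }
}
```

Under these assumptions the values come out as 2, 23, 200, 17 and 108, which agree with `Map input records`, `Map output records`, `Map output bytes`, `Reduce input groups` and `Bytes Read` in the counter dump.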

I will analyze this log output in detail in my next blog post.
Once the program has run successfully, check on Hadoop whether the output files were generated:

[root@server4 ~]# hdfs dfs -ls /output/wordCount
Found 2 items
-rw-r--r--   3 Administrator supergroup          0 2018-12-22 09:26 /output/wordCount/_SUCCESS
-rw-r--r--   3 Administrator supergroup        122 2018-12-22 09:26 /output/wordCount/part-r-00000

Next, view the contents of the file named part-r-00000:

[root@server4 ~]# hdfs dfs -cat /output/wordCount/part-r-00000
RAM.	1
a	2
analyze	1
big	2
data	2
deal	1
disk.	1
good	1
hadoop	1
in	2
is	2
perfet	1
platform	1
spark	1
to	2
tool	1
with	1

For comparison, let's also look at the input file /input/data.txt:

[root@server4 ~]# hdfs dfs -cat /input/data.txt
hadoop is a good platform to analyze big data in disk.
spark is a perfet tool to deal with big data in RAM.
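
The result in part-r-00000 can be reproduced locally from these two lines. The following is a minimal sketch in plain Java rather than the Hadoop job itself: the class `LocalWordCount` is a hypothetical name, and a `TreeMap` stands in for the sort that MapReduce applies to keys before the reduce phase. It splits each line on a single space, as `MyMapper` does with `line.split(" ")`.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical local stand-in for the MyMapper + MyReducer pipeline.
public class LocalWordCount {
    // Assumed content of /input/data.txt, copied from the listing above.
    static final String[] LINES = {
        "hadoop is a good platform to analyze big data in disk.",
        "spark is a perfet tool to deal with big data in RAM."
    };

    // Splits every line on a single space and sums one count per token.
    // TreeMap keeps the keys sorted, which for ASCII words gives the same
    // order as the byte-wise key sort seen in the job output.
    static Map<String, Integer> count(String[] lines) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String line : lines) {
            for (String word : line.split(" ")) {
                counts.merge(word, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Prints "word<TAB>count" per line, the same layout as part-r-00000.
        count(LINES).forEach((word, n) -> System.out.println(word + "\t" + n));
    }
}
```

Two details of the output follow from this. `RAM.` is listed first because uppercase letters sort before lowercase ones, and `disk.` and `RAM.` keep their trailing periods because splitting on a space alone does not strip punctuation.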

With that, the Hadoop version of WordCount is complete.

4. Summary

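The steps for implementing a MapReduce job are as follows:
[Two images of handwritten summary notes]
[Forgive me for preferring pen and paper for summaries :)]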