MapReduce Demo Implementation Explained (Big Data)

Time: 2023-06-13 09:20:27
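1. Driver class

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;

public class BreadPointDriver {

    public static void main(String[] args) throws IOException, ClassNotFoundException, InterruptedException {
        // Assumption: the input and output paths are passed as command-line arguments
        String inputPath = args[0];
        String outputPath = args[1];

        // Configuration
        Configuration conf = new Configuration();
        // Define the job: the first argument is the configuration, the second is the job name
        Job job = Job.getInstance(conf, "bscreenUserCount");
        // Input format class (FileInputFormat itself is abstract, so a concrete subclass is used)
        job.setInputFormatClass(TextInputFormat.class);
        // Output format class (FileOutputFormat itself is abstract, so a concrete subclass is used)
        job.setOutputFormatClass(TextOutputFormat.class);
        // Class used to locate the job jar
        job.setJarByClass(BreadPointDriver.class);
        // Mapper
        job.setMapperClass(BreadPointMapper.class);
        // Reducer
        job.setReducerClass(BreadPointReducer.class);
        // Number of reduce tasks. The default is 1; 0 skips the reduce phase entirely (map-only job);
        // with more than 1, keys are spread by the default HashPartitioner unless a custom Partitioner is set
        job.setNumReduceTasks(1);
        job.setMapOutputKeyClass(Text.class);       // map output key type
        job.setMapOutputValueClass(Text.class);     // map output value type
        job.setOutputKeyClass(Text.class);          // reducer output key type
        job.setOutputValueClass(NullWritable.class); // reducer output value type
        FileInputFormat.addInputPath(job, new Path(inputPath));    // input path
        FileOutputFormat.setOutputPath(job, new Path(outputPath)); // output path
        // Run the job and wait for it to finish
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}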
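The reducer-count setting in the driver decides how many reduce tasks the map output keys are spread across. The following is a minimal plain-Java sketch of the hash-based rule that Hadoop's default HashPartitioner applies (hash code with the sign bit masked off, modulo the number of reduce tasks). It uses no Hadoop classes; the class name `PartitionSketch`, the method `partitionFor`, and the sample keys are hypothetical and only serve the illustration.

```java
// Hypothetical helper: a plain-Java illustration, not a Hadoop Partitioner.
final class PartitionSketch {

    // Masks the sign bit so the value is never negative, then takes the
    // remainder to pick a reduce task index in [0, numReduceTasks).
    static int partitionFor(String key, int numReduceTasks) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }

    public static void main(String[] args) {
        int numReduceTasks = 3; // assumed value for the demo
        String[] keys = { "screenA", "screenB", "screenC", "screenA" };
        for (String key : keys) {
            // The same key always lands on the same reduce task.
            System.out.println(key + " -> reduce task " + partitionFor(key, numReduceTasks));
        }
    }
}
```

Because the rule depends only on the key, all values for one key reach the same reduce task, which is why a custom partitioner is only needed when a different distribution is wanted.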

2. Mapper class

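import java.io.IOException;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class BreadPointMapper extends Mapper<LongWritable, Text, Text, Text> {

    /**
     * @param key     position of the line in the input (with TextInputFormat, its byte offset)
     * @param value   content of the line
     * @param context job context used to emit output
     * @throws IOException
     * @throws InterruptedException
     */
    @Override
    protected void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException {
        // Derive the output key and value from the line here
        String outKey = "";
        String outValue = "";
        context.write(new Text(outKey), new Text(outValue));
    }
}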
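To make the mapper's `key` and `value` parameters concrete: with line-oriented text input (as with Hadoop's TextInputFormat), each call to `map` receives one line, keyed by the byte offset at which that line starts. The sketch below reproduces that pairing in plain Java without any Hadoop classes. The class name `LineOffsetSketch`, the method `records`, and the sample "user,screen" lines are assumptions made up for the illustration, and it assumes UTF-8 text with `\n` line endings.

```java
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: shows which (offset, line) pairs a mapper would see.
final class LineOffsetSketch {

    // Returns every line of the text keyed by the byte offset where it starts.
    static Map<Long, String> records(String fileContent) {
        Map<Long, String> out = new LinkedHashMap<>();
        long offset = 0;
        for (String line : fileContent.split("\n")) {
            out.put(offset, line);
            // +1 accounts for the '\n' terminator that split() removed
            offset += line.getBytes(StandardCharsets.UTF_8).length + 1;
        }
        return out;
    }

    public static void main(String[] args) {
        String content = "u1,screenA\nu2,screenB\nu3,screenA";
        for (Map.Entry<Long, String> e : records(content).entrySet()) {
            System.out.println("key=" + e.getKey() + " value=" + e.getValue());
        }
    }
}
```

Most mappers ignore the offset and work only on the line content, which is why the demo mapper builds its output key from the value rather than from `key`.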

3. Reducer class



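import java.io.IOException;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

public class BreadPointReducer extends Reducer<Text, Text, Text, NullWritable> {

    /**
     * @param key     key emitted by the mapper; reduce is called once per distinct key
     * @param values  all values emitted for that key
     * @param context job context used to emit output
     * @throws IOException
     * @throws InterruptedException
     */
    @Override
    protected void reduce(Text key, Iterable<Text> values, Context context) throws IOException, InterruptedException {
        // Process the key and its values
        // ...
        String outKey = "";
        // The output value type is NullWritable, so only the key carries data
        context.write(new Text(outKey), NullWritable.get());
    }
}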
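The three pieces above fit together as map, group by key, then reduce once per key. The following is a rough plain-Java simulation of that flow, with no Hadoop classes involved. Everything specific in it is an assumption for illustration: the class name `MapReduceFlowSketch`, the "user,screen" line format, and the choice to count values per key in the reduce step are not taken from the original job.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical in-memory simulation of map -> group by key -> reduce.
final class MapReduceFlowSketch {

    // "Map" step: turn one input line into a (key, value) pair.
    // Assumes the made-up line format "user,screen".
    static String[] map(String line) {
        String[] fields = line.split(",");
        return new String[] { fields[1], fields[0] }; // key = screen, value = user
    }

    // "Shuffle" step: collect all values under their key, keys kept sorted.
    static Map<String, List<String>> shuffle(List<String> lines) {
        Map<String, List<String>> grouped = new TreeMap<>();
        for (String line : lines) {
            String[] kv = map(line);
            grouped.computeIfAbsent(kv[0], k -> new ArrayList<>()).add(kv[1]);
        }
        return grouped;
    }

    // "Reduce" step: called once per key with all of that key's values.
    static String reduce(String key, Iterable<String> values) {
        int count = 0;
        for (String ignored : values) {
            count++;
        }
        return key + "\t" + count;
    }

    // Runs the whole flow and returns one output line per distinct key.
    static List<String> run(List<String> lines) {
        List<String> output = new ArrayList<>();
        for (Map.Entry<String, List<String>> e : shuffle(lines).entrySet()) {
            output.add(reduce(e.getKey(), e.getValue()));
        }
        return output;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("u1,screenA", "u2,screenB", "u3,screenA");
        for (String result : run(lines)) {
            System.out.println(result);
        }
    }
}
```

In a real job the framework performs the grouping across machines; the sketch only mirrors the contract that each distinct key reaches `reduce` exactly once together with all of its values.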

Original article by ItWorker. If you repost it, please cite the source: https://blog.ytso.com/9637.html
