FileInputFormat: NLineInputFormat

梁云亮, published 2019-12-06 13:24:34

Prerequisite: setting up the HDFS development environment

Data
Call me by your name and I'll call you by mine.
请以你的名字呼唤我,我亦将如此。
In spite of you and me and the whole silly world going to pieces around us, I love you.
我爱你,直到世界终结。
Then she's horrible alcohol, tobacco, swearing, everything is not bad.
后来她勇敢的可怕烟酒脏话样样不差
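Code: Mapper
import java.io.IOException;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class NLineMapper extends Mapper<LongWritable, Text, Text, LongWritable> {
    @Override
    protected void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException {
        final String line = value.toString();
        // Split the line on runs of whitespace
        final String[] split = line.split("\\s+");
        for (String word : split) {
            // Inside the loop, each occurrence of word counts as the constant 1
            context.write(new Text(word), new LongWritable(1));
        }
    }
}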
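The tokenization step in the mapper can be tried without Hadoop. The following is a minimal plain-Java sketch, not the author's code: the class name TokenizeSketch and both method names are hypothetical. It shows what `split("\\s+")` returns for a line, including one caveat: a line with leading whitespace yields an empty first token, which the mapper above would emit as a word.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java sketch of the tokenization used in NLineMapper.map.
// TokenizeSketch, rawTokens and cleanTokens are hypothetical names for illustration.
final class TokenizeSketch {

    // Mirrors the mapper: every element of split("\\s+") becomes one emitted word.
    static List<String> rawTokens(String line) {
        List<String> out = new ArrayList<>();
        for (String word : line.split("\\s+")) {
            out.add(word);
        }
        return out;
    }

    // Stricter variant: trim the line first and skip empty tokens.
    static List<String> cleanTokens(String line) {
        List<String> out = new ArrayList<>();
        for (String word : line.trim().split("\\s+")) {
            if (!word.isEmpty()) {
                out.add(word);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(rawTokens("I love you."));      // [I, love, you.]
        System.out.println(rawTokens("  I love you."));    // [, I, love, you.]
        System.out.println(cleanTokens("  I love you."));  // [I, love, you.]
    }
}
```

Whether to filter empty tokens depends on the data. The sample lines above have no leading whitespace, so both variants give the same words for them. A line with no whitespace at all, such as the Chinese sentences, comes back as a single token.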
Test code (driver):
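import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.NLineInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class NLineDriver {
    public static void main(String[] args) throws Exception {
        // Input and output paths (hard-coded here for a local test run)
        args = new String[2];
        args[0] = "src/main/resources/nlinei/";
        args[1] = "src/main/resources/nlineo";

        // Run in local mode against the local file system
        Configuration cfg = new Configuration();
        cfg.set("mapreduce.framework.name", "local");
        cfg.set("fs.defaultFS", "file:///");
        // Number of input lines handed to each map task; the default is 1.
        // NLineInputFormat.setNumLinesPerSplit(job, 2) sets the same property once the Job exists.
        cfg.setInt("mapreduce.input.lineinputformat.linespermap", 2);

        // Delete the output directory if it already exists, otherwise the job fails
        final FileSystem filesystem = FileSystem.get(cfg);
        if (filesystem.exists(new Path(args[1]))) {
            filesystem.delete(new Path(args[1]), true);
        }

        // Define the job
        final Job job = Job.getInstance(cfg);  // create a new job
        job.setJarByClass(NLineDriver.class);

        // Configure the mapper
        job.setMapperClass(NLineMapper.class);
        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(LongWritable.class);

        // Build each input split from N lines instead of from block size
        job.setInputFormatClass(NLineInputFormat.class);

        // Map-only job: mapper output is written straight to the output directory
        job.setNumReduceTasks(0);

        FileInputFormat.setInputPaths(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));

        job.waitForCompletion(true);
    }
}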
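
To see what a linespermap value of 2 means for the six-line sample data, the grouping can be modelled in plain Java. This is only a sketch of the idea, under the assumption that all six lines sit in one input file. The real NLineInputFormat works per file and records byte offsets and lengths instead of copying lines. The names LineGroupingSketch and group are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Models how N-line splitting groups the lines of ONE file.
// LineGroupingSketch and group are hypothetical names; this is not Hadoop code.
final class LineGroupingSketch {

    // Groups lines into consecutive chunks of at most linesPerSplit lines.
    static List<List<String>> group(List<String> lines, int linesPerSplit) {
        if (linesPerSplit <= 0) {
            throw new IllegalArgumentException("linesPerSplit must be positive");
        }
        List<List<String>> splits = new ArrayList<>();
        for (int start = 0; start < lines.size(); start += linesPerSplit) {
            int end = Math.min(start + linesPerSplit, lines.size());
            splits.add(new ArrayList<>(lines.subList(start, end)));
        }
        return splits;
    }

    public static void main(String[] args) {
        // Placeholders standing in for the six lines of the sample data
        List<String> lines = Arrays.asList(
                "line 1", "line 2", "line 3", "line 4", "line 5", "line 6");
        List<List<String>> splits = group(lines, 2);
        System.out.println(splits.size() + " splits");  // 3 splits
        for (int i = 0; i < splits.size(); i++) {
            System.out.println("split " + i + ": " + splits.get(i));
        }
    }
}
```

Each group corresponds to one map task. If the line count is not a multiple of N, the last group is simply shorter, for example five lines with N = 2 give groups of 2, 2 and 1.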

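Result: the original screenshot is not available. With the six-line input above in a single file and linespermap set to 2, NLineInputFormat should produce three splits, so three map tasks run, and because the job has no reducers the output directory should contain three part-m-* files.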
