Hadoop/Spark Installation in Practice (Series Part 4): A First Taste of Hadoop MapReduce Word Count
Running the MapReduce example job that ships with Hadoop
1 Upload a file to the root directory of Hadoop's HDFS
[root@localhost hadoop-1.2.1]# hadoop fs -put README.txt /
Verify the upload:
[root@localhost hadoop-1.2.1]# hadoop fs -ls /
Found 2 items
-rw-r--r-- 1 root supergroup 1366 2015-09-12 06:45 /README.txt
drwxr-xr-x - root supergroup 0 2015-09-12 06:34 /home
[root@localhost hadoop-1.2.1]#
2 Run the MapReduce word count
[root@localhost hadoop-1.2.1]# hadoop jar hadoop-examples-1.2.1.jar wordcount /README.txt /wordcountoutput
Warning: $HADOOP_HOME is deprecated.
15/09/12 06:48:00 INFO input.FileInputFormat: Total input paths to process : 1
15/09/12 06:48:00 INFO util.NativeCodeLoader: Loaded the native-hadoop library
15/09/12 06:48:00 WARN snappy.LoadSnappy: Snappy native library not loaded
15/09/12 06:48:03 INFO mapred.JobClient: Running job: job_201509120634_0001
15/09/12 06:48:04 INFO mapred.JobClient: map 0% reduce 0%
15/09/12 06:48:27 INFO mapred.JobClient: map 100% reduce 0%
15/09/12 06:48:34 INFO mapred.JobClient: map 100% reduce 33%
15/09/12 06:48:36 INFO mapred.JobClient: map 100% reduce 100%
15/09/12 06:48:36 INFO mapred.JobClient: Job complete: job_201509120634_0001
15/09/12 06:48:36 INFO mapred.JobClient: Counters: 29
15/09/12 06:48:36 INFO mapred.JobClient: Job Counters
15/09/12 06:48:36 INFO mapred.JobClient: Launched reduce tasks=1
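To make clearer what the wordcount job computes, here is a minimal local sketch in Python. It is not the Hadoop implementation and does not use any Hadoop API; the function names (map_phase, shuffle_phase, reduce_phase, word_count) are hypothetical and chosen only for illustration. It assumes tokens are split on whitespace alone, so punctuation stays attached to words, and that results are ordered by key.

```python
from collections import defaultdict


def map_phase(lines):
    """Emit a (word, 1) pair for every whitespace-separated token."""
    for line in lines:
        for token in line.split():
            yield token, 1


def shuffle_phase(pairs):
    """Group the emitted values by key, as happens between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Sum the values of each key and return (word, count) pairs sorted by key."""
    return [(key, sum(values)) for key, values in sorted(groups.items())]


def word_count(lines):
    """Run the three phases in sequence on an iterable of text lines."""
    return reduce_phase(shuffle_phase(map_phase(lines)))


if __name__ == "__main__":
    # Hypothetical sample input, not the real contents of README.txt.
    sample = ["Export Administration (BIS), Export", "Apache Core"]
    for word, count in word_count(sample):
        print(f"{word}\t{count}")
```

Because the sketch splits only on whitespace, a token such as "(BIS)," keeps its parentheses and comma and is counted as its own word.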
3 View the results
[root@localhost hadoop-1.2.1]# hadoop fs -ls /wordcountoutput
Warning: $HADOOP_HOME is deprecated.
Found 3 items
-rw-r--r-- 1 root supergroup 0 2015-09-12 06:48 /wordcountoutput/_SUCCESS
drwxr-xr-x - root supergroup 0 2015-09-12 06:48 /wordcountoutput/_logs
-rw-r--r-- 1 root supergroup 1306 2015-09-12 06:48 /wordcountoutput/part-r-00000
[root@localhost hadoop-1.2.1]# hadoop fs -cat /wordcountoutput/part-r-00000
Warning: $HADOOP_HOME is deprecated.
(BIS), 1
(ECCN) 1
(TSU) 1
(see 1
5D002.C.1, 1
740.13) 1
1
Administration 1
Apache 1
BEFORE 1
BIS 1
Bureau 1
Commerce, 1
Commodity 1
Control 1
Core 1
Department 1
ENC 1
Exception 1
Export 2
For 1
...
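Once the output has been copied out of HDFS, the counts can be inspected further. The following is a rough sketch in Python, assuming each line of part-r-00000 has the form word, a tab character, then the count. The helper names parse_counts and top_words and the sample string are hypothetical and are not part of Hadoop.

```python
def parse_counts(text):
    """Parse 'word<TAB>count' lines into a dict, skipping lines that do not match."""
    counts = {}
    for line in text.splitlines():
        # Split on the last tab so the word itself may contain any other character.
        word, sep, number = line.rpartition("\t")
        if sep and word and number.strip().isdigit():
            counts[word] = int(number)
    return counts


def top_words(counts, n=3):
    """Return the n most frequent words; ties are broken alphabetically."""
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))[:n]


if __name__ == "__main__":
    # Hypothetical excerpt in the assumed tab-separated format.
    sample = "Apache\t1\nExport\t2\nFor\t1\n"
    print(top_words(parse_counts(sample), n=2))
```

Lines that do not match the assumed format are silently skipped, so a stray warning line mixed into the saved text does not break parsing.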
That completes a first taste of Hadoop: OK.