
Programming in C on Hadoop

Today I tried writing a word-count program for Hadoop in C. The steps are as follows:

I. Write the map and reduce programs

mapper.c

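#include <stdio.h>
#include <string.h>

#define BUF_SIZE 2048
#define DELIM '\n'

/* Mapper: reads lines from stdin and emits "word<TAB>1" for every word. */
int main(void)
{
    char buffer[BUF_SIZE];

    /* fgets leaves room for the terminating NUL itself. */
    while (fgets(buffer, sizeof buffer, stdin)) {
        size_t len = strlen(buffer);
        if (len > 0 && buffer[len - 1] == DELIM) /* strip the trailing newline */
            buffer[len - 1] = '\0';

        /* Split on spaces and tabs so a tab can never end up inside a key. */
        char *query = strtok(buffer, " \t");
        while (query) {
            printf("%s\t1\n", query);
            query = strtok(NULL, " \t");
        }
    }
    return 0;
}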

reducer.c

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define BUFFER_SIZE 1024
#define DELIM "\t"

/* Reducer: reads sorted "key<TAB>count" lines and prints one total per key. */
int main(void)
{
    char str_last_key[BUFFER_SIZE] = "";
    char str_line[BUFFER_SIZE];
    int count = 0;

    while (fgets(str_line, sizeof str_line, stdin)) {
        char *str_cur_key = strtok(str_line, DELIM);
        char *str_cur_num = strtok(NULL, DELIM);
        if (!str_cur_key || !str_cur_num) /* skip lines that are not "key<TAB>value" */
            continue;

        if (str_last_key[0] != '\0' && strcmp(str_cur_key, str_last_key) != 0) {
            /* key changed: print the finished key, then start a new total */
            printf("%s\t%d\n", str_last_key, count);
            count = 0;
        }
        count += atoi(str_cur_num); /* same key (or first key): add its value */
        strcpy(str_last_key, str_cur_key);
    }
    if (str_last_key[0] != '\0') /* print the last key, unless there was no input */
        printf("%s\t%d\n", str_last_key, count);
    return 0;
}

II. Compile

gcc mapper.c -o mapper

gcc reducer.c -o reducer

III. Run

(1) After starting Hadoop, put the file whose words are to be counted into the input directory: bin/hadoop fs -put <input-file> input

(2) Use the jar under contrib/streaming/ to run the mapper and reducer built above:

bin/hadoop jar /home/huangkq/Desktop/hadoop/contrib/streaming/hadoop-streaming-0.20.203.0.jar -mapper /home/huangkq/Desktop/hadoop2/mapper -reducer /home/huangkq/Desktop/hadoop2/reducer -input input -output c_output -jobconf mapred.reduce.tasks=2

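Note: hadoop-streaming-0.20.203.0.jar is the Hadoop Streaming utility. It runs the mapper and reducer as external programs and passes records to them through standard input and standard output.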

(3) View the results: bin/hadoop fs -cat c_output/*

