MapReduce Apache Hadoop Technology

General Tech Technology & Software 2 years ago

0 1 0 0 0 tuteeHUB earn credit +10 pts

5 Star Rating 1 Rating

Posted on 16 Aug 2022, this text provides information on Technology & Software related to General Tech. Please note that while accuracy is prioritized, the data presented might not be entirely correct or up-to-date. This information is offered for general knowledge and informational purposes only, and should not be considered as a substitute for professional advice.

Take Quiz To Earn Credits!

Turn Your Knowledge into Earnings.

tuteehub_quiz

Answers (1)

Post Answer
profilepic.png
manpreet Tuteehub forum best answer Best Answer 2 years ago


I am trying to wordcount program using the MapReduce Hadoop technology. What I need to do is develop an Indexed Word Count application that will count the number of occurences of each word in each file in a given input file set. This file set is present in the Amazon S3 bucket. It will also count the total occurences of each word. I have attached the code that counts the occurences of the words in the given file set. After this I need to print that which word is occuring in which file with the number of occurrences of the word in that particular file.

I know its a bit complex but any would be appreciated.

Map.java

import java.io.IOException;
import java.util.*;

import org.apache.hadoop.io.*;
import org.apache.hadoop.mapreduce.*;
import org.apache.hadoop.mapreduce.lib.input.FileSplit;

public class Map extends Mapper<LongWritable, Text, Text, IntWritable> {
    private final static IntWritable one = new IntWritable(1);
    private Text word = new Text();
    private String pattern= "^[a-z][a-z0-9]*$";

    public void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException {
        String line = value.toString();
        StringTokenizer tokenizer = new StringTokenizer(line);
        InputSplit inputSplit = context.getInputSplit();
        String fileName = ((FileSplit) inputSplit).getPath().getName();
        while (tokenizer.hasMoreTokens()) {
            word.set(tokenizer.nextToken());
            String stringWord = word.toString().toLowerCase();
            if (stringWord.matches(pattern)){
                context.write(new Text(stringWord), one);
            }

        }
    }
}

Reduce.java

import java.io.IOException;

import org.apache.hadoop.io.*;
import org.apache.hadoop.mapreduce.*;

public class Reduce extends Reducer<Text, IntWritable, Text, IntWritable> {

    public 
                                                
                                                
0 views
0 shares

No matter what stage you're at in your education or career, TuteeHub will help you reach the next level that you're aiming for. Simply,Choose a subject/topic and get started in self-paced practice sessions to improve your knowledge and scores.