February 07, 2014

Hadoop Mapper-only MapReduce jobs

It's funny! You can have mapper-only Hadoop MapReduce jobs. This is useful when you need to change the structure of your data, or when you want to filter records out. I don't think you will need it very often, though.

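The first file is a simple mapper that really does nothing; you can change it as you want. In the main method you can see the line job.setNumReduceTasks(0); which sets the number of reduce tasks to 0. With no reducers, Hadoop skips the shuffle and sort phase and writes each mapper's output straight to the output directory. You can find the Maven project on GitHub.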
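For readers without the gist at hand, here is a minimal sketch of what such a mapper-only job could look like. It is not the original code: the class names MapOnlyJob and PassThroughMapper, the empty-line filter, and the use of the Hadoop 2.x org.apache.hadoop.mapreduce API are all assumptions made for illustration.

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

// Hypothetical driver class; the name is an assumption, not taken from the original project.
public class MapOnlyJob {

    // Hypothetical mapper: copies each input line to the output unchanged.
    // The empty-line check only illustrates where filtering logic would go.
    public static class PassThroughMapper
            extends Mapper<LongWritable, Text, NullWritable, Text> {

        @Override
        protected void map(LongWritable offset, Text line, Context context)
                throws IOException, InterruptedException {
            if (line.getLength() == 0) {
                return; // filter: drop empty lines
            }
            // A NullWritable key makes the default text output contain only the value.
            context.write(NullWritable.get(), line);
        }
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 2) {
            System.err.println("Usage: MapOnlyJob <input path> <output path>");
            System.exit(2);
        }

        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf, "map-only example"); // Hadoop 2.x factory method
        job.setJarByClass(MapOnlyJob.class);
        job.setMapperClass(PassThroughMapper.class);

        // Zero reduce tasks: no shuffle, no sort, mapper output is the job output.
        job.setNumReduceTasks(0);

        job.setOutputKeyClass(NullWritable.class);
        job.setOutputValueClass(Text.class);

        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));

        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```

Because there are no reducers, the output directory should contain one part-m-NNNNN file per map task instead of the usual part-r-NNNNN files. To restructure records rather than just filter them, you would build a new Text value inside map before calling context.write.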

Hadoop with Maven

For the last couple of days I have been playing with Hadoop, which is why I couldn't blog very often.
I wanted to automate packaging with Maven. The gist below shows a sample Maven pom.xml for Hadoop.
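In case the gist is not visible, a rough sketch of such a pom.xml might look like the following. The project coordinates (com.example, hadoop-sample), the hadoop-client artifact at version 2.2.0, and the Java level are assumptions for illustration; use whatever matches your cluster and JDK.

```xml
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
  <modelVersion>4.0.0</modelVersion>

  <!-- Hypothetical coordinates; replace with your own -->
  <groupId>com.example</groupId>
  <artifactId>hadoop-sample</artifactId>
  <version>1.0-SNAPSHOT</version>
  <packaging>jar</packaging>

  <properties>
    <!-- Assumed Java level; adjust to the JDK you build with -->
    <maven.compiler.source>1.7</maven.compiler.source>
    <maven.compiler.target>1.7</maven.compiler.target>
    <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
  </properties>

  <dependencies>
    <!-- Assumed Hadoop version; "provided" because the cluster supplies
         the Hadoop jars when the job is started with "hadoop jar" -->
    <dependency>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-client</artifactId>
      <version>2.2.0</version>
      <scope>provided</scope>
    </dependency>
  </dependencies>
</project>
```

With these hypothetical coordinates, running mvn package should leave the jar at target/hadoop-sample-1.0-SNAPSHOT.jar.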

This resolves the Hadoop dependency and packages the project as a jar file. Hope this helps!