Configure Eclipse for Hadoop - "Map and Reduce"
- Download Eclipse from https://eclipse.org/downloads/
- Choose the Linux 64-bit package and select "Eclipse IDE for Java Developers".
- Download the Hadoop Eclipse plugin from https://github.com/winghc/hadoop2x-eclipse-plugin.
- Extract the archive and copy hadoop-eclipse-plugin.jar into the eclipse/plugins folder.
- Open Eclipse.
- Choose the Map/Reduce perspective from the top-right corner of the Eclipse IDE.
- Go to File --> New --> Map/Reduce Project.
- Give it a name, select the option "Specify Hadoop library location", and browse to the folder where Hadoop is installed.
- Click Next and then Finish. Your Hadoop project is now created.
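The plugin copy step above can also be done from a terminal. The following is a minimal sketch, not part of the original instructions: the function name install_hadoop_plugin and both example paths are hypothetical, so adjust them to wherever you extracted the plugin and installed Eclipse.

```shell
#!/bin/sh
# install_hadoop_plugin JAR ECLIPSE_HOME
# Copies the Hadoop plugin jar into Eclipse's plugins folder.
# Both arguments are placeholders for your own paths (assumption).
install_hadoop_plugin() {
    jar="$1"
    eclipse_home="$2"

    # Stop early if the jar is not where we expect it.
    if [ ! -f "$jar" ]; then
        echo "plugin jar not found: $jar" >&2
        return 1
    fi
    # Stop early if this does not look like an Eclipse installation.
    if [ ! -d "$eclipse_home/plugins" ]; then
        echo "no plugins folder under: $eclipse_home" >&2
        return 1
    fi

    cp "$jar" "$eclipse_home/plugins/"
}

# Example call with hypothetical paths:
# install_hadoop_plugin "$HOME/Downloads/hadoop-eclipse-plugin.jar" "$HOME/eclipse"
```

If Eclipse was already running when the jar was copied, restart it so that the plugin is loaded.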
Setting DFS Location
- Select the "Map/Reduce Locations" tab at the bottom of the screen.
- Right-click on the blank space and select "New Hadoop Location".
- Give a location name, e.g. "master".
- Give a host name, e.g. "localhost".
- Set the port numbers: Map/Reduce master = 9001 and DFS master = 9000.
- Open a terminal and go to the directory /usr/local/hadoop/bin.
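- Run the command "./hadoop dfs -mkdir /input" or, if hadoop is on your PATH, "hadoop fs -mkdir /input". This creates a folder named "input" in HDFS. On Hadoop 2.x the "dfs" form prints a deprecation warning, so the "fs" form is the safer choice.
- Now reconnect your DFS location and explore it. You should find the input folder at the root of the file system, because the command uses the absolute path /input.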
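The terminal steps above can be collected into one short script. This is a sketch under assumptions: it expects the hadoop command to be on your PATH (otherwise run ./hadoop from /usr/local/hadoop/bin), it expects HDFS to be running with the DFS master on localhost:9000 as configured above, and the variable names are only illustrative.

```shell
#!/bin/sh
# Assumed values, taken from the location settings above.
HDFS_URI="hdfs://localhost:9000"
INPUT_DIR="/input"

if command -v hadoop >/dev/null 2>&1; then
    # -p makes the command succeed even if the folder already exists.
    hadoop fs -mkdir -p "$HDFS_URI$INPUT_DIR"
    # List the root of HDFS to confirm that the folder is there.
    hadoop fs -ls "$HDFS_URI/"
else
    echo "hadoop is not on PATH; run ./hadoop from /usr/local/hadoop/bin instead" >&2
fi
```

If the listing shows /input, the same folder should appear in the DFS location explorer after a reconnect.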
Writing "WordCount" Program
- A WordCount Java class is available at http://cs.smith.edu/dftwiki/index.php/Hadoop_WordCount.java
- Create a class and call it "WordCount".
- Paste the Java program into it.
- Create a text file somewhere in your file system, write "Good Better Super Best" in it, and copy and paste the same line around 100 times.
- Now right-click on the input folder, choose "Upload file to DFS", and select your text file.
- Refresh the Hadoop folder in the DFS location explorer.
- Set the arguments for input and output as follows:
- Right-click anywhere in the editor, go to Run As, and select Run Configurations.
- The first time, you can simply choose "Run on Hadoop".
- Select the WordCount application, open the Arguments tab, and write:
- hdfs://localhost:9000/input hdfs://localhost:9000/user/anshul/output
- The first part is the input folder path and the second is the output folder path, as seen in the DFS location explorer. Here "anshul" is the user name in this example.
- Click Apply and then Run.
- An output folder is created, and inside it is an output file containing the count of each word in the input file.
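To prepare the test file without copying and pasting by hand, and to know roughly what counts to expect, something like the following can be used. It is a local sketch with standard Unix tools, not the Hadoop job itself, and the file name input.txt is an assumption.

```shell
#!/bin/sh
# Write the line "Good Better Super Best" exactly 100 times.
# The file name input.txt is an assumption; any name works.
i=0
while [ "$i" -lt 100 ]; do
    echo "Good Better Super Best"
    i=$((i + 1))
done > input.txt

# Count the words locally: put one word per line, then sort
# and count duplicates. Each of the four words occurs 100 times.
tr -s ' ' '\n' < input.txt | sort | uniq -c
```

After uploading input.txt to the input folder and running the job, the Hadoop output file should report the same four counts.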
Thank You :)
Anshul Shrivastava