Import data from HDFS to HBase


There are two ways to import data directly from HDFS into HBase:

1. By running a MapReduce program in Eclipse

1) Create a new Java project with the following classpath:

<?xml version="1.0" encoding="UTF-8"?>
<classpath>
  <classpathentry kind="src" path="src"/>
  <classpathentry kind="con" path="org.eclipse.jdt.launching.JRE_CONTAINER"/>
  <classpathentry kind="lib" path="/home/hadoop/hbase-0.94.5/hbase-0.94.5.jar"/>
  <classpathentry kind="lib" path="/home/hadoop/hbase-0.94.5/lib/commons-cli-1.2.jar"/>
  <classpathentry kind="lib" path="/home/hadoop/hbase-0.94.5/lib/commons-logging-1.1.1.jar"/>
  <classpathentry kind="lib" path="/home/hadoop/hbase-0.94.5/lib/commons-configuration-1.6.jar"/>
  <classpathentry kind="lib" path="/home/hadoop/hbase-0.94.5/lib/log4j-1.2.16.jar"/>
  <classpathentry kind="lib" path="/home/hadoop/hbase-0.94.5/lib/zookeeper-3.4.5.jar"/>
  <classpathentry kind="lib" path="/home/hadoop/hadoop-1.0.4/hadoop-core-1.0.4.jar"/>
  <classpathentry kind="lib" path="/home/hadoop/hadoop-1.0.4/lib/commons-lang-2.4.jar"/>
  <classpathentry kind="lib" path="/home/hadoop/hbase-0.94.5/lib/slf4j-log4j12-1.4.3.jar"/>
  <classpathentry kind="lib" path="/home/hadoop/hbase-0.94.5/lib/slf4j-api-1.4.3.jar"/>
  <classpathentry kind="lib" path="/home/hadoop/hbase-0.94.5/lib/protobuf-java-2.4.0a.jar"/>
  <classpathentry kind="lib" path="/home/hadoop/hbase-0.94.5/lib/jackson-mapper-asl-1.8.8.jar"/>
  <classpathentry kind="lib" path="/home/hadoop/hbase-0.94.5/lib/jackson-core-asl-1.8.8.jar"/>
  <classpathentry kind="lib" path="/home/hadoop/hbase-0.94.5/lib/commons-httpclient-3.1.jar"/>
  <classpathentry kind="output" path="bin"/>
</classpath>
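
A mistyped jar path in this file only shows up later as a ClassNotFoundException, so it can help to check the entries up front. The following is a minimal sketch, not part of the original project: the class name ClasspathCheck is hypothetical, and it assumes the file is the Eclipse .classpath in the project root unless another path is passed as an argument.

```java
import java.io.ByteArrayInputStream;
import java.io.File;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper for illustration; not part of the original project.
class ClasspathCheck {

    // Returns the path attribute of every <classpathentry kind="lib"> element.
    static List<String> libPaths(String classpathXml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(
                            classpathXml.getBytes(StandardCharsets.UTF_8)));
            NodeList entries = doc.getElementsByTagName("classpathentry");
            List<String> paths = new ArrayList<>();
            for (int i = 0; i < entries.getLength(); i++) {
                Element entry = (Element) entries.item(i);
                if ("lib".equals(entry.getAttribute("kind"))) {
                    paths.add(entry.getAttribute("path"));
                }
            }
            return paths;
        } catch (Exception e) {
            throw new IllegalArgumentException("not a readable .classpath file", e);
        }
    }

    public static void main(String[] args) throws Exception {
        // Assumption: run from the Eclipse project root, where .classpath lives.
        Path file = Paths.get(args.length > 0 ? args[0] : ".classpath");
        if (!Files.isRegularFile(file)) {
            System.out.println("no such file: " + file);
            return;
        }
        String xml = new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
        for (String path : libPaths(xml)) {
            // Report every library jar that does not exist on this machine.
            if (!new File(path).isFile()) {
                System.out.println("missing: " + path);
            }
        }
    }
}
```

If it prints nothing, every lib entry points at an existing file.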

2) Set the program argument to the location (URI) of the input file on HDFS, for example one of:

hdfs://master:54310/home/input/weatherData
hdfs://cssec164.nda.ac.jp:54310/home/input/weatherData
             
3) Run the program. The designated files will be loaded into a table in the HBase cluster.
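
The program argument has to be a full HDFS URI: scheme, NameNode host, port, and path. As a rough sketch of how a driver could reject a malformed argument before submitting the job, something like the following might work. The class and method names are hypothetical and not taken from the original program; only java.net.URI from the standard library is used.

```java
import java.net.URI;

// Hypothetical helper for illustration; not part of the original program.
class InputUriCheck {

    // Parses the argument and checks that it has the form hdfs://host:port/path.
    static URI parseHdfsUri(String arg) {
        URI uri = URI.create(arg);
        if (!"hdfs".equals(uri.getScheme())) {
            throw new IllegalArgumentException("expected an hdfs:// URI, got: " + arg);
        }
        if (uri.getHost() == null || uri.getPort() == -1) {
            throw new IllegalArgumentException("missing NameNode host or port: " + arg);
        }
        if (uri.getPath() == null || uri.getPath().isEmpty()) {
            throw new IllegalArgumentException("missing input path: " + arg);
        }
        return uri;
    }

    public static void main(String[] args) {
        // Falls back to the example URI from this post when no argument is given.
        String arg = args.length > 0 ? args[0] : "hdfs://master:54310/home/input/weatherData";
        URI uri = parseHdfsUri(arg);
        System.out.println("NameNode: " + uri.getHost() + ":" + uri.getPort());
        System.out.println("Input path: " + uri.getPath());
    }
}
```

The host and port should match the NameNode address the cluster is configured with (fs.default.name in Hadoop 1.x).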

2. By running a MapReduce program from the command line

1) Once the program runs correctly in Eclipse, copy the class files from the ProjectName/bin directory to a location on the Linux machine.

2) Make a jar file from the class files. Before doing so, create a text file to serve as the manifest of the jar. It names the program's main class and lists the classpath, as follows (hbaseClassPath.txt):

Manifest-Version: 1.0
Main-Class: temperatureData.HBaseTemperatureImporter
Class-Path: /home/hadoop/hbase-0.94.5/hbase-0.94.5.jar /home/hadoop/hb
 ase-0.94.5/lib/commons-cli-1.2.jar /home/hadoop/hbase-0.94.5/lib/comm
 ons-logging-1.1.1.jar /home/hadoop/hbase-0.94.5/lib/commons-configura
 tion-1.6.jar /home/hadoop/hbase-0.94.5/lib/log4j-1.2.16.jar /home/had
 oop/hbase-0.94.5/lib/zookeeper-3.4.5.jar /home/hadoop/hadoop-1.0.4/ha
 doop-core-1.0.4.jar /home/hadoop/hadoop-1.0.4/lib/commons-lang-2.4.ja
 r /home/hadoop/hbase-0.94.5/lib/slf4j-log4j12-1.4.3.jar /home/hadoop/
 hbase-0.94.5/lib/slf4j-api-1.4.3.jar /home/hadoop/hbase-0.94.5/lib/pr
 otobuf-java-2.4.0a.jar /home/hadoop/hbase-0.94.5/lib/jackson-mapper-a
 sl-1.8.8.jar /home/hadoop/hbase-0.94.5/lib/jackson-core-asl-1.8.8.jar
  /home/hadoop/hbase-0.94.5/lib/commons-httpclient-3.1.jar

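No line of the manifest, including the Class-Path entry, may be longer than 72 bytes (the example above wraps at 70 characters). Every continuation line must begin with a single space, and the file must end with a newline so that its last line is parsed.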
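
Wrapping the Class-Path by hand is error-prone. One alternative is to let java.util.jar.Manifest do the wrapping, as in the sketch below. The class name ManifestSketch and its two methods are hypothetical, and the jar list in main is only an excerpt of the full classpath.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;
import java.util.jar.Attributes;
import java.util.jar.Manifest;

// Hypothetical helper for illustration; not part of the original project.
class ManifestSketch {

    // Builds the manifest text; Manifest.write() wraps long values itself.
    static String buildManifest(String mainClass, List<String> jars) {
        Manifest manifest = new Manifest();
        Attributes attrs = manifest.getMainAttributes();
        // The Manifest.write() Javadoc requires Manifest-Version to be set first.
        attrs.put(Attributes.Name.MANIFEST_VERSION, "1.0");
        attrs.put(Attributes.Name.MAIN_CLASS, mainClass);
        // Class-Path entries are separated by single spaces.
        attrs.put(Attributes.Name.CLASS_PATH, String.join(" ", jars));
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try {
            manifest.write(out);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    // Parses manifest text back and returns the unwrapped Class-Path value.
    static String readClassPath(String manifestText) {
        try {
            Manifest parsed = new Manifest(new ByteArrayInputStream(
                    manifestText.getBytes(StandardCharsets.UTF_8)));
            return parsed.getMainAttributes().getValue(Attributes.Name.CLASS_PATH);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Only the first few jars from the classpath above, for brevity.
        List<String> jars = Arrays.asList(
                "/home/hadoop/hbase-0.94.5/hbase-0.94.5.jar",
                "/home/hadoop/hbase-0.94.5/lib/commons-cli-1.2.jar",
                "/home/hadoop/hbase-0.94.5/lib/zookeeper-3.4.5.jar",
                "/home/hadoop/hadoop-1.0.4/hadoop-core-1.0.4.jar");
        System.out.print(buildManifest("temperatureData.HBaseTemperatureImporter", jars));
    }
}
```

Redirecting the output into hbaseClassPath.txt gives a file with the same role as the hand-written one. The exact column at which lines wrap, and the order of the attributes after Manifest-Version, can differ between JDK versions.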

Then build the jar from the class files in the bin directory, using hbaseClassPath.txt as its manifest:

jar -cvfm HBaseTemperatureImporter.jar hbaseClassPath.txt -C bin/ .

3) Execute the jar file to import the data from HDFS into HBase:

java -jar HBaseTemperatureImporter.jar hdfs://master:54310/home/input/weatherData
