Import data from HDFS to HBase
There are two ways to import data directly from HDFS into HBase:
1. By running a MapReduce program in Eclipse.
1) Create a new Java project whose classpath is as follows:
<?xml version="1.0" encoding="UTF-8"?>
<classpath>
<classpathentry kind="src" path="src"/>
<classpathentry kind="con" path="org.eclipse.jdt.launching.JRE_CONTAINER"/>
<classpathentry kind="lib" path="/home/hadoop/hbase-0.94.5/hbase-0.94.5.jar"/>
<classpathentry kind="lib" path="/home/hadoop/hbase-0.94.5/lib/commons-cli-1.2.jar"/>
<classpathentry kind="lib" path="/home/hadoop/hbase-0.94.5/lib/commons-logging-1.1.1.jar"/>
<classpathentry kind="lib" path="/home/hadoop/hbase-0.94.5/lib/commons-configuration-1.6.jar"/>
<classpathentry kind="lib" path="/home/hadoop/hbase-0.94.5/lib/log4j-1.2.16.jar"/>
<classpathentry kind="lib" path="/home/hadoop/hbase-0.94.5/lib/zookeeper-3.4.5.jar"/>
<classpathentry kind="lib" path="/home/hadoop/hadoop-1.0.4/hadoop-core-1.0.4.jar"/>
<classpathentry kind="lib" path="/home/hadoop/hadoop-1.0.4/lib/commons-lang-2.4.jar"/>
<classpathentry kind="lib" path="/home/hadoop/hbase-0.94.5/lib/slf4j-log4j12-1.4.3.jar"/>
<classpathentry kind="lib" path="/home/hadoop/hbase-0.94.5/lib/slf4j-api-1.4.3.jar"/>
<classpathentry kind="lib" path="/home/hadoop/hbase-0.94.5/lib/protobuf-java-2.4.0a.jar"/>
<classpathentry kind="lib" path="/home/hadoop/hbase-0.94.5/lib/jackson-mapper-asl-1.8.8.jar"/>
<classpathentry kind="lib" path="/home/hadoop/hbase-0.94.5/lib/jackson-core-asl-1.8.8.jar"/>
<classpathentry kind="lib" path="/home/hadoop/hbase-0.94.5/lib/commons-httpclient-3.1.jar"/>
<classpathentry kind="output" path="bin"/>
</classpath>
2) Set the program argument to the location (URI) of the input file on HDFS, for example:
hdfs://master:54310/home/input/weatherData
hdfs://cssec164.nda.ac.jp:54310/home/input/weatherData
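A malformed argument is easier to catch before the job starts. The following is a minimal sketch of such a check using only the JDK; the class name `HdfsInputArgument` is hypothetical and is not part of the original importer, and the rule that the argument must carry the `hdfs` scheme, a host, and a path is an assumption based on the example URIs above.

```java
import java.net.URI;
import java.net.URISyntaxException;

/** Hypothetical helper (not from the original project): validates the input argument. */
final class HdfsInputArgument {
    private HdfsInputArgument() {}

    /** Parses the argument and rejects anything that is not hdfs://host[:port]/path. */
    static URI parse(String arg) {
        final URI uri;
        try {
            uri = new URI(arg);
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException("Not a valid URI: " + arg, e);
        }
        // Assumption: the importer only reads from HDFS, as in the examples above.
        if (!"hdfs".equals(uri.getScheme())) {
            throw new IllegalArgumentException("Expected the hdfs scheme: " + arg);
        }
        if (uri.getHost() == null) {
            throw new IllegalArgumentException("Expected a NameNode host: " + arg);
        }
        if (uri.getPath() == null || uri.getPath().isEmpty()) {
            throw new IllegalArgumentException("Expected an input path: " + arg);
        }
        return uri;
    }

    public static void main(String[] args) {
        String arg = args.length > 0 ? args[0] : "hdfs://master:54310/home/input/weatherData";
        URI uri = parse(arg);
        System.out.println("NameNode host: " + uri.getHost());
        System.out.println("NameNode port: " + uri.getPort());
        System.out.println("Input path:    " + uri.getPath());
    }
}
```

A call to `HdfsInputArgument.parse(args[0])` at the top of the importer's `main` method would fail fast on a typo in the URI.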
3) Run the program. As a result, the designated data files will be loaded into a table in the HBase cluster.
2. By running the MapReduce program from the command line.
1) After making sure that the program runs correctly in Eclipse, copy the class files in the ProjectName/bin directory to a location on Linux.
2) Make a jar file from the class files. Before doing so, create a text file that will be used as the MANIFEST of the jar file. This text file contains the program's main class name and the classpath, as follows (hbaseClassPath.txt):
Manifest-Version: 1.0
Main-Class: temperatureData.HBaseTemperatureImporter
Class-Path: /home/hadoop/hbase-0.94.5/hbase-0.94.5.jar /home/hadoop/hb
 ase-0.94.5/lib/commons-cli-1.2.jar /home/hadoop/hbase-0.94.5/lib/comm
 ons-logging-1.1.1.jar /home/hadoop/hbase-0.94.5/lib/commons-configura
 tion-1.6.jar /home/hadoop/hbase-0.94.5/lib/log4j-1.2.16.jar /home/had
 oop/hbase-0.94.5/lib/zookeeper-3.4.5.jar /home/hadoop/hadoop-1.0.4/ha
 doop-core-1.0.4.jar /home/hadoop/hadoop-1.0.4/lib/commons-lang-2.4.ja
 r /home/hadoop/hbase-0.94.5/lib/slf4j-log4j12-1.4.3.jar /home/hadoop/
 hbase-0.94.5/lib/slf4j-api-1.4.3.jar /home/hadoop/hbase-0.94.5/lib/pr
 otobuf-java-2.4.0a.jar /home/hadoop/hbase-0.94.5/lib/jackson-mapper-a
 sl-1.8.8.jar /home/hadoop/hbase-0.94.5/lib/jackson-core-asl-1.8.8.jar
  /home/hadoop/hbase-0.94.5/lib/commons-httpclient-3.1.jar
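The Class-Path value is wrapped at 70 characters per line (except the last line), and every continuation line (every line except the first) must start with a single space. The JAR manifest format allows at most 72 bytes per line, so 70 is within that limit.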
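Wrapping the Class-Path by hand is error-prone, so here is a hedged sketch of letting the JDK do it through `java.util.jar.Manifest`. The class `ManifestWrapDemo` and its methods are hypothetical illustrations, and the shortened classpath in `main` is only a sample. The exact column at which the JDK breaks a line can differ between Java versions, so the sketch makes no claim about the precise width, only that the output stays within the manifest line limit and reads back unchanged.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.jar.Attributes;
import java.util.jar.Manifest;

/** Hypothetical demo class; not part of the original project. */
final class ManifestWrapDemo {
    private ManifestWrapDemo() {}

    /** Renders a manifest as text, letting the JDK wrap long attribute values. */
    static String render(String mainClass, String classPath) {
        Manifest manifest = new Manifest();
        Attributes attrs = manifest.getMainAttributes();
        // Manifest-Version is required; without it the JDK writes an empty main section.
        attrs.put(Attributes.Name.MANIFEST_VERSION, "1.0");
        attrs.put(Attributes.Name.MAIN_CLASS, mainClass);
        attrs.put(Attributes.Name.CLASS_PATH, classPath);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try {
            manifest.write(out);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    /** Parses manifest text and returns the unwrapped Class-Path value. */
    static String readClassPath(String manifestText) {
        try {
            Manifest parsed = new Manifest(
                    new ByteArrayInputStream(manifestText.getBytes(StandardCharsets.UTF_8)));
            return parsed.getMainAttributes().getValue(Attributes.Name.CLASS_PATH);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Sample value: the first three entries of the classpath used in this article.
        String classPath = "/home/hadoop/hbase-0.94.5/hbase-0.94.5.jar"
                + " /home/hadoop/hbase-0.94.5/lib/commons-cli-1.2.jar"
                + " /home/hadoop/hbase-0.94.5/lib/commons-logging-1.1.1.jar";
        String text = render("temperatureData.HBaseTemperatureImporter", classPath);
        System.out.print(text);
        System.out.println("Round trip OK: " + classPath.equals(readClassPath(text)));
    }
}
```

The printed text can be saved as the manifest file. Because continuation lines are joined back without the leading space, a path split in the middle of a word is restored intact when the manifest is read.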
Then make the jar file from the class files in the bin directory, using hbaseClassPath.txt as the MANIFEST of the jar file:
jar -cvfm HBaseTemperatureImporter.jar hbaseClassPath.txt -C bin/ .
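A wrong path in the manifest typically only shows up later as a NoClassDefFoundError, so it may be worth inspecting the finished jar first. The sketch below reads the manifest back with `java.util.jar.JarFile` and lists the Class-Path entries that do not exist on disk. The class `JarManifestCheck` is hypothetical, and treating each entry as a plain file path is an assumption that fits the absolute paths used in this article; in general, Class-Path entries are URLs resolved relative to the jar.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.jar.Attributes;
import java.util.jar.JarFile;
import java.util.jar.Manifest;

/** Hypothetical checker; not part of the original project. */
final class JarManifestCheck {
    private JarManifestCheck() {}

    /** Returns the main attributes of the jar's manifest. */
    private static Attributes mainAttributes(File jar) {
        try (JarFile jarFile = new JarFile(jar)) {
            Manifest manifest = jarFile.getManifest();
            if (manifest == null) {
                throw new IllegalStateException("No manifest in " + jar);
            }
            return manifest.getMainAttributes();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Returns the Main-Class value, or null if the manifest does not set one. */
    static String mainClass(File jar) {
        return mainAttributes(jar).getValue(Attributes.Name.MAIN_CLASS);
    }

    /** Returns the Class-Path entries that do not exist as files on this machine. */
    static List<String> missingClassPathEntries(File jar) {
        List<String> missing = new ArrayList<>();
        String classPath = mainAttributes(jar).getValue(Attributes.Name.CLASS_PATH);
        if (classPath == null) {
            return missing;
        }
        // Entries are separated by spaces; each one is treated as a file path (assumption).
        for (String entry : classPath.trim().split("\\s+")) {
            if (!entry.isEmpty() && !new File(entry).exists()) {
                missing.add(entry);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        File jar = new File(args.length > 0 ? args[0] : "HBaseTemperatureImporter.jar");
        System.out.println("Main-Class: " + mainClass(jar));
        for (String entry : missingClassPathEntries(jar)) {
            System.out.println("Missing on this machine: " + entry);
        }
    }
}
```

Running it on the machine that will execute the job shows whether every library listed in hbaseClassPath.txt is present at the expected location.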
3) Execute the jar file to import the data files from HDFS into HBase:
java -jar HBaseTemperatureImporter.jar hdfs://master:54310/home/input/weatherData