USECase 1:
Transferring data from RDBMS to HBase:
Case: Transactional Data is in RDBMS
Input in RDBMS- THis data is in multiple RDBMS tables and it has to be converged to HDFS,
Step 1:
101 3000
102 2000
103 1000
101 5000
102 1000
Output in Hbase
101 3000,5000
102 2000,1000
103 1000
Input in RDBMS- THis data is in multiple RDBMS tables and it has to be converged to HDFS,
Step 1:
1. Open MySql and create tables,
mysql -u root -p
(By default the password is empty and I have changed it to cloudera)
create table trans1(accno int, amount int);
insert into table trans1 values(101, 3000);
insert into trans1 values (102, 2000);
create table trans2(accno int, amount int) values(101, 5000);
insert into trans2 values(101, 5000);
insert into trans2 values(102, 1000);
create table trans3(accno int, amount int);
insert into trans3 values(103, 1000);
2. Importing data via Sqoop
////////This will import the table data to Transaction_db2 by creating directory Transaction_db2 //////
sqoop import --connect jdbc:mysql://localhost/Leela_db --username root --password cloudera \
--table trans2 --m 1 \
--target-dir 'hdfs://quickstart.cloudera:8020/user/Leela/Hive/Transaction_db2'
//////////NOw we need to append the table data trans3 to the existing file, so use --append keyword
///////THis will create 2 more files in the same directory.
////THis means each table data will be copied as seperate files////
sqoop import --connect jdbc:mysql://localhost/Leela_db --username root --password cloudera \
--table trans3 --m 1 \
--target-dir 'hdfs://quickstart.cloudera:8020/user/Leela/Hive/Transaction_db2' --append
sqoop import --connect jdbc:mysql://localhost/Leela_db --username root --password cloudera \
--table trans3 --m 1 \
--target-dir 'hdfs://quickstart.cloudera:8020/user/Leela/Hive/Transaction_db2' \
--append --query="select accno,amount from trans2 and \$CONDITIONS"
\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\ TO append the new table values to tthe existing file
///////Need to use --check-column <columnname> --incremental append
sqoop import --connect jdbc:mysql://localhost:3306/Leela_db --username=root \
--password=cloudera --query="select * from trans3 where accno > 100 AND \$CONDITIONS" \
--append --target-dir 'hdfs://quickstart.cloudera:8020/user/Leela/Hive/Transaction_db2' \
--num-mappers 1 --check-column "accno" --incremental append --last-value 102
AccNo amt
101 3000
102 2000
103 1000
101 5000
102 1000
Output in Hbase
101 3000,5000
102 2000,1000
103 1000
No comments:
Post a Comment