The Process of Importing Data from MySQL into HDFS with Sqoop2
This article walks through the process of importing data from MySQL into HDFS with Sqoop2. The steps are kept simple and clear; follow along to study how each part works.
I. Environment Setup
1. Hadoop
http://my.oschina.net/u/204498/blog/519789
2. Sqoop2.x
http://my.oschina.net/u/204498/blog/518941
3. MySQL
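Before moving on, it can help to confirm the prerequisites are in place. The checks below are a minimal sketch under two assumptions not spelled out in the original article: the Hadoop and Sqoop2 installs from the links above are already working, and the MySQL JDBC driver jar (named here as mysql-connector-java-5.1.x.jar, a placeholder) has been copied somewhere the Sqoop2 server can load it so the generic-jdbc-connector can reach MySQL; the exact lib directory may differ between Sqoop2 distributions.

$ hadoop version
$ mysql --version
# placeholder jar name and target path; adjust to your distribution
$ cp mysql-connector-java-5.1.x.jar /home/hadoop/sqoop-1.99.6-bin-hadoop200/server/lib/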
II. Importing from MySQL into HDFS
1. Create the MySQL database, table, and test data
xxxxxxxx$ mysql -uroot -p
Enter password:

mysql> show databases;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| mysql              |
| performance_schema |
| test               |
+--------------------+
4 rows in set (0.00 sec)

test => the newly created database

mysql> use test;
mysql> show tables;
+----------------+
| Tables_in_test |
+----------------+
| test           |
+----------------+
1 row in set (0.00 sec)

test => the newly created table

mysql> desc test;
+-------+-------------+------+-----+---------+----------------+
| Field | Type        | Null | Key | Default | Extra          |
+-------+-------------+------+-----+---------+----------------+
| id    | int(11)     | NO   | PRI | NULL    | auto_increment |
| name  | varchar(45) | YES  |     | NULL    |                |
| age   | int(11)     | YES  |     | NULL    |                |
+-------+-------------+------+-----+---------+----------------+
3 rows in set (0.00 sec)

mysql> select * from test;
+----+------+------+
| id | name | age  |
+----+------+------+
|  7 | a    |    1 |
|  8 | b    |    2 |
|  9 | c    |    3 |
+----+------+------+
3 rows in set (0.00 sec)
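The session above only shows the finished database and table. For reference, here is a minimal sketch of SQL that would produce the same schema and rows; the original statements are not shown in the article, so this is an assumption reconstructed from the desc test output (the ids 7, 8, 9 above simply suggest some earlier rows were inserted and deleted).

mysql> create database test;
mysql> use test;
mysql> create table test (
    ->   id   int(11)     not null auto_increment,
    ->   name varchar(45) default null,
    ->   age  int(11)     default null,
    ->   primary key (id)
    -> );
mysql> insert into test (name, age) values ('a', 1), ('b', 2), ('c', 3);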
2. Grant privileges to the relevant users
Note: after Sqoop submits a job, every node accesses the database during the map phase, so the privileges must be granted in advance.
mysql> grant [all | select | ...] on {db}.{table} to {user}@{host} identified by {passwd};
mysql> flush privileges;

# Here I grant access to a specific hostname: username root, password root,
# access to any table in db test, with all privileges
mysql> grant all on test.* to 'root'@{host} identified by 'root';
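As a concrete illustration, assuming the worker nodes share the hftclclw* hostname prefix used later in this article (a placeholder; substitute your own hosts, or use '%' to allow connections from any host), the grants might look like this:

mysql> grant all on test.* to 'root'@'hftclclw0001' identified by 'root';
mysql> grant all on test.* to 'root'@'%' identified by 'root';
mysql> flush privileges;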
3. Start the sqoop2-server
[hadoop@hftclclw0001 sqoop-1.99.6-bin-hadoop200]$ pwd
/home/hadoop/sqoop-1.99.6-bin-hadoop200

[hadoop@hftclclw0001 sqoop-1.99.6-bin-hadoop200]$ ./bin/sqoop2-server start
...
...
You can verify the server through the web UI, or check the logs.
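One quick way to verify the server is up, assuming the default Sqoop2 REST port of 12000, is to hit the version endpoint with curl; the server log location is configured in conf/sqoop.properties.

$ curl http://localhost:12000/sqoop/version
# a JSON response with the build/version info means the server is answering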
4. Start the sqoop2-shell
[hadoop@hftclclw0001 sqoop-1.99.6-bin-hadoop200]$ pwd
/home/hadoop/sqoop-1.99.6-bin-hadoop200

[hadoop@hftclclw0001 sqoop-1.99.6-bin-hadoop200]$ ./bin/sqoop2-shell
...
...
sqoop:000> show version
...
...
sqoop:000> show connector
+----+------------------------+---------+------------------------------------------------------+----------------------+
| Id |          Name          | Version |                        Class                         | Supported Directions |
+----+------------------------+---------+------------------------------------------------------+----------------------+
| 1  | generic-jdbc-connector | 1.99.6  | org.apache.sqoop.connector.jdbc.GenericJdbcConnector | FROM/TO              |
| 2  | kite-connector         | 1.99.6  | org.apache.sqoop.connector.kite.KiteConnector        | FROM/TO              |
| 3  | hdfs-connector         | 1.99.6  | org.apache.sqoop.connector.hdfs.HdfsConnector        | FROM/TO              |
| 4  | kafka-connector        | 1.99.6  | org.apache.sqoop.connector.kafka.KafkaConnector      | TO                   |
+----+------------------------+---------+------------------------------------------------------+----------------------+

Create links for the connectors you need:

sqoop:000> create link -c 1    => create the JDBC link first; you will be prompted for name, jdbc-driver, url, username, password, etc.
sqoop:000> create link -c 3    => create the HDFS link; you will be prompted for name, HDFS URL, etc.

sqoop:000> show link
+----+-------------+--------------+------------------------+---------+
| Id |    Name     | Connector Id |     Connector Name     | Enabled |
+----+-------------+--------------+------------------------+---------+
| 3  | 10-21_jdbc1 | 1            | generic-jdbc-connector | true    |
| 4  | 10-21_hdfs1 | 3            | hdfs-connector         | true    |
+----+-------------+--------------+------------------------+---------+

Create a job: -f => from, -t => to, i.e. where to read from and where to write to.

sqoop:000> create job -f 3 -t 4
You will be prompted for the corresponding table information and the HDFS settings.

sqoop:000> show job
+----+---------------+----------------+--------------+---------+
| Id |     Name      | From Connector | To Connector | Enabled |
+----+---------------+----------------+--------------+---------+
| 1  | 10-20_sqoopy2 | 1              | 3            | true    |
+----+---------------+----------------+--------------+---------+

# Start the job
sqoop:000> start job -j 2
...
...
You can watch the progress in the web UI, or check it with:

sqoop:000> status job -j 2
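For readers new to the 1.99.x shell, the sketch below shows roughly what the create link / create job prompts look like with example answers filled in. The exact prompt wording varies between 1.99.x releases, every value in braces is a placeholder for your own hosts and paths, and this is an illustration rather than the exact session from this article.

sqoop:000> create link -c 1
Creating link for connector with id 1
Please fill following values to create new link object
Name: 10-21_jdbc1

Link configuration
JDBC Driver Class: com.mysql.jdbc.Driver
JDBC Connection String: jdbc:mysql://{mysql-host}:3306/test
Username: root
Password: ****

sqoop:000> create link -c 3
Name: 10-21_hdfs1
HDFS URI: hdfs://{namenode-host}:9000

sqoop:000> create job -f 3 -t 4
Name: {job-name}

From database configuration
Schema name: test
Table name: test
Partition column name: id

To HDFS configuration
Output format: TEXT_FILE
Output directory: /user/hadoop/{output-dir}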
Sqoop user guide:
http://sqoop.apache.org/
5. Troubleshooting
Read the logs and work through the errors step by step.
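A minimal starting point, assuming a YARN-based Hadoop as in the environment above: check the job status from the shell, then the Sqoop2 server log (path configured in conf/sqoop.properties), then the MapReduce task logs via YARN; the application id below is a placeholder.

sqoop:000> status job -j 2
# server-side errors end up in the Sqoop2 server log
# map-side errors (e.g. connection refused to MySQL) are in the task logs:
$ yarn logs -applicationId {application_id}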