HBase Cluster Replication Notes
Prerequisites: two fully configured HBase clusters with network connectivity between them. There are many reasons to set up replication; they are not repeated here. These notes focus on the configuration process, the problems that can come up along the way, and how to synchronize pre-existing data. In our case the master cluster had been running for over two years before the slave was added, the master kept serving writes after the slave was set up, and no snapshot of the master had been restored on the slave beforehand.
Environment: hbase 0.98.12.1-hadoop2
Also, leave the whole system plenty of memory; once memory gets tight, processes fail easily. A typical question: why does my MapReduce job always fail at some stage with the following error, when the same job used to run fine?
INFO mapred.JobClient: Task Id : attempt_201701061331_0002_m_000027_0, Status : FAILED
java.lang.OutOfMemoryError: Java heap space
at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.<init>(MapTask.java:498)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
at org.apache.hadoop.mapred.Child.main(Child.java:158)
This is an Out Of Memory (OOM) error.
Such errors are often not a logic bug (though inefficient code with unreasonable memory consumption can certainly cause them): the same job simply uses different amounts of memory as the data volume and the data itself change. In Hadoop's MapReduce model, after the JobTracker receives a job submission from the client it distributes tasks across the TaskTrackers in the cluster; each task runs in its own Java process on a TaskTracker, with dedicated ports and networking for the map-to-reduce data transfer, so these OOM errors are raised inside those per-task Java processes. Once the cause is clear the fix is simple: MapReduce tasks read their JVM settings from the job configuration (mapred-site.xml), so raising the max heap size (-Xmx) for each task JVM in that file solves it:
<property>
<name>mapred.child.java.opts</name>
<value>-Xmx1024m</value>
</property>
This option defaults to 200 MB.
In newer versions the change apparently goes in conf/hadoop-env.sh (default 2000 MB); if that does not take effect, set it explicitly in the job's own jobConf.
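When editing cluster config files is impractical, the same property can usually be passed per job on the command line; a sketch, assuming the MR tool parses generic options (the bundled Export/Import tools do) and noting that on newer Hadoop the map-task property is mapreduce.map.java.opts:

```shell
# Per-job heap override, passed as a generic option before the tool's
# own arguments (table and HDFS path here are illustrative).
HEAP_OPT="-Dmapred.child.java.opts=-Xmx1024m"
CMD="bin/hbase org.apache.hadoop.hbase.mapreduce.Export $HEAP_OPT con hdfs://xds:9000/import/con"
echo "$CMD"
```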
--------------- Master ---------------
Hadoop configuration
Start not only HDFS but also the MapReduce framework, YARN.
Add to the HBase configuration file (hbase-site.xml):
<property>
<name>hbase.replication</name>
<value>true</value>
</property>
Once Hadoop and HBase are configured, stop the application's writes to HBase; after writes have fully stopped, take a snapshot.
Snapshot statement: snapshot '<table>','<snapshot name>'
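Snapshots can also be taken non-interactively, which helps when many tables are involved; a sketch that feeds a script file to the hbase shell (the snapshot name is hypothetical):

```shell
# Build a shell script that snapshots the table and lists snapshots.
cat > /tmp/snap.rb <<'EOF'
snapshot 'freeoa_hbt', 'snapshot_freeoa_hbt_20170713'
list_snapshots
exit
EOF
# Would be run against the cluster as: bin/hbase shell /tmp/snap.rb
echo "bin/hbase shell /tmp/snap.rb"
```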
In the hbase shell, add the slave cluster as a peer under an id of your choosing (the peer spec lists the slave's ZooKeeper quorum hosts, client port, and znode):
add_peer '12','xds:2181:/hbase'
add_peer '10','alonemaster,alonemaster2,alonemaster3:2181:/hbase'
Check that it worked: list_peers
Now alter the table schema to enable replication on its column families:
describe 'freeoa_hbt'
disable 'freeoa_hbt'
alter 'freeoa_hbt', {NAME => '<column family>', REPLICATION_SCOPE => '1'}
alter 'freeoa_hbt', {NAME => 'content',REPLICATION_SCOPE => '1'}
Here REPLICATION_SCOPE could not be specified at table-creation time and had to be set by altering the table afterwards; this statement turns the attribute back off:
alter 'freeoa_hbt', {NAME => 'cf',REPLICATION_SCOPE => '0'}
enable 'freeoa_hbt'
describe 'freeoa_hbt'
The snapshot on the master can now be exported to the slave HBase:
bin/hbase org.apache.hadoop.hbase.snapshot.ExportSnapshot -overwrite -snapshot snapshot_wz_20170713 -copy-to hdfs://xds:9000/hbase
--------------- Slave ---------------
Create the matching table on the slave first; otherwise even edits made on the master will not be replicated over.
Restore from the snapshot:
disable 'wz'
restore_snapshot 'snapshot_wz_20170713'
enable 'wz'
No problem there, except for one catch: records the master had updated after the snapshot was taken were already visible on the slave before the restore (only the updated columns; other, untouched columns were absent), and the restore then overwrote them.
For that case, export the records modified on the master in the window after the snapshot and bring them over to the slave.
On the master:
bin/hbase org.apache.hadoop.hbase.mapreduce.Export con hdfs://xds:9000/import/con 12 1500022950830 1500023160200
Argument breakdown:
bin/hbase org.apache.hadoop.hbase.mapreduce.Export <table> hdfs://<slave_hdfs>:9000/import/<table> <versions> <start ms timestamp> <end ms timestamp>
(The third argument is the number of versions to export, not a peer id.)
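The start/end arguments are epoch milliseconds; an easy way to produce them from a wall-clock time (assumes GNU date; the -d syntax is not portable to BSD date):

```shell
# Epoch seconds for a given wall-clock instant, scaled to milliseconds.
# TZ is pinned here so the result is reproducible; use your local zone in practice.
start_ms=$(( $(TZ=UTC date -d '2017-07-14 08:22:30' +%s) * 1000 ))
echo "$start_ms"    # 1500020550000
```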
On the slave (import from the staging directory into the live table):
bin/hbase org.apache.hadoop.hbase.mapreduce.Import con hdfs://xds:9000/import/con
2017-07-14 17:21:27,321 INFO [main] mapreduce.Job: The url to track the job: http://hbak:8088/proxy/application_1500019039263_0001/
2017-07-14 17:21:27,321 INFO [main] mapreduce.Job: Running job: job_1500019039263_0001
2017-07-14 17:21:35,405 INFO [main] mapreduce.Job: Job job_1500019039263_0001 running in uber mode : false
2017-07-14 17:21:35,406 INFO [main] mapreduce.Job: map 0% reduce 0%
2017-07-14 17:21:40,458 INFO [main] mapreduce.Job: map 100% reduce 0%
2017-07-14 17:21:40,465 INFO [main] mapreduce.Job: Job job_1500019039263_0001 completed successfully
Exception in thread "main" java.lang.IllegalArgumentException: No enum constant org.apache.hadoop.mapreduce.JobCounter.MB_MILLIS_MAPS
at java.lang.Enum.valueOf(Enum.java:236)
at org.apache.hadoop.mapreduce.counters.FrameworkCounterGroup.valueOf(FrameworkCounterGroup.java:148)
at org.apache.hadoop.mapreduce.counters.FrameworkCounterGroup.findCounter(FrameworkCounterGroup.java:182)
at org.apache.hadoop.mapreduce.counters.AbstractCounters.findCounter(AbstractCounters.java:154)
at org.apache.hadoop.mapreduce.TypeConverter.fromYarn(TypeConverter.java:240)
at org.apache.hadoop.mapred.ClientServiceDelegate.getJobCounters(ClientServiceDelegate.java:370)
at org.apache.hadoop.mapred.YARNRunner.getJobCounters(YARNRunner.java:511)
at org.apache.hadoop.mapreduce.Job$7.run(Job.java:756)
at org.apache.hadoop.mapreduce.Job$7.run(Job.java:753)
at java.security.AccessController.doPrivileged(Native Method)
An error, and no telling whether it succeeded. A check showed it had: the new data had arrived. A fix for this exception comes later.
An example of writing on the master and reading on the slave
scan 'con', {LIMIT =>10,COLUMNS => ['cf:id', 'cf:issue_dt'], CACHE_BLOCKS => false,REVERSED => true,STARTROW => '20110902042646'}
scan 'wz', {LIMIT =>10,REVERSED => true,STARTROW => '20150902042646'}
get 'con','2015082615145741838'
get 'con','2015082615145741838','cf:hits'
get 'con','2015082615145741838',{LIMIT =>10,COLUMNS => ['cf:hits', 'cf:title']}
put 'con','2015082615145741838','cf:hits','700'
put 'con','2011082908441625317','cf:title','EnterpriseDB base on Postgresql'
get 'con','2011082908441625317',{COLUMNS => ['cf:hits', 'cf:title']}
Scan on the slave:
---
hbase(main):009:0> scan 'con'
ROW COLUMN+CELL
2011082908441625317 column=cf:title, timestamp=1500023160180, value=EnterpriseDB base on Postgresql
2015082615145741838 column=cf:hits, timestamp=1500022950832, value=700
2015082615145741838 column=cf:title, timestamp=1500023123231, value=EnterpriseDB base on Postgresql
---
scan 'con', {LIMIT =>2,REVERSED => true,COLUMNS => ['cf:id','cf:issue_dt','cf:title','cf:hits']}
Even if the matching table has not been created on the slave cluster yet, edits made before it is created are still synced over once the table exists.
put 'con','2017051420434877614','cf:hits','700'
After the table was created on the slave:
put 'con','2017051420434877614','cf:title','postgres的逻辑备份还原:pg_dump和pg_restore的使用'
I expected the earlier put that set hits to 700 not to be synced, but it came across anyway.
put 'con','2017052118082332250','cf:title','PostgreSQL与MySQL两大开源数据库比较'
More data was written, but with a wrong rowkey, so the table had to be restored from an earlier snapshot:
disable 'con'
restore_snapshot 'snapshot_con_20170726'
enable 'con'
The slave, however, did not replay this operation: a restore_snapshot on the master is not propagated to the slave. What about exporting the master's snapshot to the slave, then:
bin/hbase org.apache.hadoop.hbase.snapshot.ExportSnapshot -overwrite -snapshot snapshot_con_20170726 -copy-to hdfs://xds:9000/hbase -mappers 2
It failed for lack of system resources. Next, trying an export to a local path:
bin/hbase org.apache.hadoop.hbase.snapshot.ExportSnapshot -snapshot snapshot_con_20170726 -copy-to /tmp/cons
Also a failure. Whether due to the version or something else, neither the mapper-count (-mappers) nor the bandwidth-limit (-bandwidth) option appeared to have any effect. (Note that a local destination likely needs a scheme, e.g. file:///tmp/cons; a bare path is resolved against the default filesystem.)
##############################
Once master/slave replication is in place, there are two ways to bring the pre-existing data over to the slave cluster:
1. Stop writes and export snapshots to the slave. But production holds 56 TB of data across 77 tables; even with multiple bonded gigabit NICs the transfer takes far too long to be acceptable to the application.
2. Export each table in time slices (up to the snapshot point) and import each slice on the slave, repeating until caught up. This does not block application writes, but takes a lot of time and effort.
##############################
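Approach 2 is easy to script: one Export job per table over the chosen window, each writing to its own staging directory on the slave HDFS. A sketch (table names and window are hypothetical; drop the echo to actually run the jobs):

```shell
# Echoes one Export command per table.
TABLES="con wz wnm"
START=1495700438000   # window start, ms
STOP=1501059257000    # window end, ms
for t in $TABLES; do
  echo bin/hbase org.apache.hadoop.hbase.mapreduce.Export \
    "$t" "hdfs://xds:9000/import/$t" 12 "$START" "$STOP"
done
```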
[Data verification]
On the master:
bin/hbase org.apache.hadoop.hbase.mapreduce.replication.VerifyReplication --starttime=1501059600000 --stoptime=1501060200000 12 con -mappers 2
2017-07-26 17:12:26,218 ERROR [main] replication.ReplicationPeersZKImpl: Could not get configuration for peer because it doesn't exist. peerId=-mappers
2017-07-26 17:12:26,224 INFO [main-EventThread] zookeeper.ClientCnxn: EventThread shut down
2017-07-26 17:12:26,224 INFO [main] zookeeper.ZooKeeper: Session: 0x5d7e01208a0005 closed
Exception in thread "main" java.io.IOException: Couldn't get peer conf!
With the mappers and bandwidth arguments removed, the parameter error went away, but then came: No enum constant org.apache.hadoop.mapreduce.JobCounter.MB_MILLIS_MAPS
A fix for that was found, and the verification finally passed:
org.apache.hadoop.hbase.mapreduce.replication.VerifyReplication$Verifier$Counters
GOODROWS=17
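Besides GOODROWS, VerifyReplication also reports a BADROWS counter for mismatched rows; grepping both out of the captured job output makes a quick pass/fail check (the file and its contents below are illustrative; in practice redirect the job's output there):

```shell
# Simulated tail of a VerifyReplication run.
cat > /tmp/verify_out.txt <<'EOF'
GOODROWS=17
BADROWS=0
EOF
bad=$(grep -o 'BADROWS=[0-9]*' /tmp/verify_out.txt | cut -d= -f2)
[ "$bad" = "0" ] && echo "replication verified"
```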
Now to sync the data that predates replication. First, the timestamp of the oldest record on the master:
scan 'con', {LIMIT =>2,REVERSED => false}
rowkey:ust
2009030815340389100:1495700438602
2009032915265550408:1495700439848
And the timestamp of the oldest record on the slave:
rowkey:ust
2017051420434877614:1501059257359
2017052118082332250:1501059611854
So the data between the master's oldest timestamp and the slave's oldest timestamp needs to be pulled over:
bin/hbase org.apache.hadoop.hbase.mapreduce.Export con hdfs://xds:9000/import/con 12 1495700438000 1501059257000 -mappers 2 -bandwidth 10
The job succeeded, but no data was synced. Does the time embedded in the records have to be used instead?
20090308153403=1236497643
now=1501062386
bin/hbase org.apache.hadoop.hbase.mapreduce.Export con hdfs://xds:9000/import/con 12 1236497643000 1501062386000 -mappers 2 -bandwidth 10
It finished quickly:
Exception in thread "main" org.apache.hadoop.mapred.FileAlreadyExistsException: Output directory hdfs://xds:9000/import/con already exists
So the earlier run had in fact exported the data; it still had to be imported on the slave:
bin/hbase org.apache.hadoop.hbase.mapreduce.Import con hdfs://xds:9000/import/con
2017-07-26 17:51:12,714 INFO [main] mapreduce.Job: map 0% reduce 0%
2017-07-26 17:51:18,782 INFO [main] mapreduce.Job: map 100% reduce 0%
2017-07-26 17:51:18,794 INFO [main] mapreduce.Job: Job job_1501058627844_0001 completed successfully
Exception in thread "main" java.lang.IllegalArgumentException: No enum constant org.apache.hadoop.mapreduce.JobCounter.MB_MILLIS_MAPS
After fixing that exception the job did succeed, yet the slave's data volume still did not change. The problem was in the method: the reference point should be the cell timestamp of the earliest record, not a datetime parsed out of the record content (and what would we do if records carried no datetime field at all?).
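That takeaway can be scripted: pull the earliest cell timestamp straight out of a scan with LIMIT => 1 rather than deriving it from the rowkey. A sketch over a captured scan line (the file stands in for real hbase shell output):

```shell
# Simulated first line of `scan 'con', {LIMIT => 1}` output.
cat > /tmp/scan_first.txt <<'EOF'
 2009030815340389100  column=cf:title, timestamp=1495700438602, value=...
EOF
# Extract the cell timestamp; this is the value Export's window must cover.
ts=$(grep -o 'timestamp=[0-9]*' /tmp/scan_first.txt | head -1 | cut -d= -f2)
echo "$ts"    # 1495700438602
```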
Master-to-slave sync of a specified time range:
1. On the master cluster:
bin/hbase org.apache.hadoop.hbase.mapreduce.Export con hdfs://xds:9000/import/con 12 1495700438000 1501059257000
The versions argument (12) matches production.
The start and end timestamps are 1495700438000 and 1501059257000.
Because an earlier run already produced the output directory, it reports:
Exception in thread "main" org.apache.hadoop.mapred.FileAlreadyExistsException: Output directory hdfs://xds:9000/import/con already exists
Manually delete the table's directory under /import on the slave:
hadoop/bin/hdfs dfs -rm -r /import/con/
Pull again:
2017-07-26 18:05:25,184 INFO [main] util.RegionSizeCalculator: Calculating region sizes for table "con".
2017-07-26 18:05:25,462 DEBUG [main] util.RegionSizeCalculator: Region con,,1495700417514.b3c7ec6db740711373ccedc0105d57ae. has size 26214400
2017-07-26 18:05:25,468 DEBUG [main] util.RegionSizeCalculator: Region sizes calculated
2017-07-26 18:05:25,518 DEBUG [main] mapreduce.TableInputFormatBase: getSplits: split -> 0 -> HBase table split(table name: con, scan: , start row: , end row: , region location: xd0)
2017-07-26 18:05:25,730 INFO [main] mapreduce.JobSubmitter: number of splits:1
2017-07-26 18:05:25,771 INFO [main] Configuration.deprecation: dfs.socket.timeout is deprecated. Instead, use dfs.client.socket-timeout
2017-07-26 18:05:25,771 INFO [main] Configuration.deprecation: io.bytes.per.checksum is deprecated. Instead, use dfs.bytes-per-checksum
2017-07-26 18:05:25,941 INFO [main] mapreduce.JobSubmitter: Submitting tokens for job: job_1501057582035_0004
2017-07-26 18:05:26,160 INFO [main] impl.YarnClientImpl: Submitted application application_1501057582035_0004
2017-07-26 18:05:26,195 INFO [main] mapreduce.Job: The url to track the job: http://htcom:8088/proxy/application_1501057582035_0004/
2017-07-26 18:05:26,196 INFO [main] mapreduce.Job: Running job: job_1501057582035_0004
2017-07-26 18:06:03,469 INFO [main] mapreduce.Job: Job job_1501057582035_0004 running in uber mode : false
2017-07-26 18:06:03,472 INFO [main] mapreduce.Job: map 0% reduce 0%
2017-07-26 18:06:43,813 INFO [main] mapreduce.Job: map 100% reduce 0%
2017-07-26 18:06:43,829 INFO [main] mapreduce.Job: Job job_1501057582035_0004 completed successfully
2017-07-26 18:06:43,941 INFO [main] mapreduce.Job: Counters: 40
File System Counters
FILE: Number of bytes read=0
FILE: Number of bytes written=136134
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
HDFS: Number of bytes read=62
HDFS: Number of bytes written=26163577
HDFS: Number of read operations=4
HDFS: Number of large read operations=0
HDFS: Number of write operations=2
Job Counters
Launched map tasks=1
Data-local map tasks=1
Total time spent by all maps in occupied slots (ms)=37800
Total time spent by all reduces in occupied slots (ms)=0
Total time spent by all map tasks (ms)=37800
Total vcore-milliseconds taken by all map tasks=37800
Total megabyte-milliseconds taken by all map tasks=38707200
Map-Reduce Framework
Map input records=3117
Map output records=3117
Input split bytes=62
Spilled Records=0
Failed Shuffles=0
Merged Map outputs=0
GC time elapsed (ms)=60
CPU time spent (ms)=2870
Physical memory (bytes) snapshot=217321472
Virtual memory (bytes) snapshot=1660932096
Total committed heap usage (bytes)=183500800
HBase Counters
BYTES_IN_REMOTE_RESULTS=0
BYTES_IN_RESULTS=26040693
MILLIS_BETWEEN_NEXTS=2247
NOT_SERVING_REGION_EXCEPTION=0
NUM_SCANNER_RESTARTS=0
REGIONS_SCANNED=1
REMOTE_RPC_CALLS=0
REMOTE_RPC_RETRIES=0
RPC_CALLS=34
RPC_RETRIES=0
File Input Format Counters
Bytes Read=0
File Output Format Counters
Bytes Written=26163577
The counts look right. Check the size on the slave:
hadoop/bin/hdfs dfs -du -s -h /import/con
25.0 M /import/con
2. On the slave cluster:
bin/hbase org.apache.hadoop.hbase.mapreduce.Import con hdfs://xds:9000/import/con
2017-07-26 18:09:23,851 INFO [main] mapreduce.TableOutputFormat: Created table instance for con
2017-07-26 18:09:25,279 INFO [main] input.FileInputFormat: Total input paths to process : 1
2017-07-26 18:09:25,374 INFO [main] mapreduce.JobSubmitter: number of splits:1
2017-07-26 18:09:25,607 INFO [main] mapreduce.JobSubmitter: Submitting tokens for job: job_1501058627844_0003
2017-07-26 18:09:25,989 INFO [main] impl.YarnClientImpl: Submitted application application_1501058627844_0003
2017-07-26 18:09:26,040 INFO [main] mapreduce.Job: The url to track the job: http://xds:8088/proxy/application_1501058627844_0003/
2017-07-26 18:09:26,041 INFO [main] mapreduce.Job: Running job: job_1501058627844_0003
2017-07-26 18:09:33,253 INFO [main] mapreduce.Job: Job job_1501058627844_0003 running in uber mode : false
2017-07-26 18:09:33,255 INFO [main] mapreduce.Job: map 0% reduce 0%
2017-07-26 18:09:43,458 INFO [main] mapreduce.Job: map 100% reduce 0%
2017-07-26 18:09:43,468 INFO [main] mapreduce.Job: Job job_1501058627844_0003 completed successfully
2017-07-26 18:09:43,591 INFO [main] mapreduce.Job: Counters: 30
File System Counters
FILE: Number of bytes read=0
FILE: Number of bytes written=135678
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
HDFS: Number of bytes read=26163681
HDFS: Number of bytes written=0
HDFS: Number of read operations=3
HDFS: Number of large read operations=0
HDFS: Number of write operations=0
Job Counters
Launched map tasks=1
Data-local map tasks=1
Total time spent by all maps in occupied slots (ms)=7354
Total time spent by all reduces in occupied slots (ms)=0
Total time spent by all map tasks (ms)=7354
Total vcore-milliseconds taken by all map tasks=7354
Total megabyte-milliseconds taken by all map tasks=7530496
Map-Reduce Framework
Map input records=3117
Map output records=3117
Input split bytes=104
Spilled Records=0
Failed Shuffles=0
Merged Map outputs=0
GC time elapsed (ms)=109
CPU time spent (ms)=3430
Physical memory (bytes) snapshot=235057152
Virtual memory (bytes) snapshot=1779216384
Total committed heap usage (bytes)=273678336
File Input Format Counters
Bytes Read=26163577
File Output Format Counters
Bytes Written=0
Comparing row counts, master and slave now match. To check whether the data is fully identical, run the verification from above, with stoptime set to the current timestamp:
bin/hbase org.apache.hadoop.hbase.mapreduce.replication.VerifyReplication --starttime=1495700438000 --stoptime=1501064274000 12 con
org.apache.hadoop.hbase.mapreduce.replication.VerifyReplication$Verifier$Counters
GOODROWS=3132
Next, migrate a table of about 2.7 GB; first get its oldest and newest record timestamps:
scan 'wz', {LIMIT =>1,REVERSED => false}
2015060723264517045:1489744357327
scan 'wz', {LIMIT =>1,REVERSED => true}
2015070309135227806:1489745908349
bin/hbase org.apache.hadoop.hbase.mapreduce.Export wz hdfs://xds:9000/import/wz 12 1489744357000 1489745909000
[Slave]
bin/hbase org.apache.hadoop.hbase.mapreduce.Import wz hdfs://xds:9000/import/wz
The slave's load climbs noticeably while the import runs.
wz count: 3959850 row(s) in 418.4610 seconds
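For multi-million-row tables the shell's count (about 7 minutes above) is slow; the bundled RowCounter MapReduce job is the usual way to compare row counts across clusters. A sketch (echoed rather than executed):

```shell
# Run on both master and slave, then compare the ROWS counter in the
# job output. In the shell, `count 'wz', CACHE => 10000` also speeds
# counting by enlarging the scanner cache.
CMD="bin/hbase org.apache.hadoop.hbase.mapreduce.RowCounter wz"
echo "$CMD"
```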
bin/hbase org.apache.hadoop.hbase.mapreduce.Export wz hdfs://xds:9000/import/wz 12 1489744357000 1501064274000 -mappers 10 -bandwidth 80
mapreduce.Job: Task Id : attempt_1500974684883_0003_m_000000_0, Status : FAILED Error: Java heap space
Regions in Transition
Region State RIT time (ms)
1588230740 hbase:meta,,1.1588230740 state=PENDING_OPEN, ts=Wed Jul 26 14:57:12 CST 2017 (3s ago), server=xd1,60020,1501052156299 3062
Total number of Regions in Transition for more than 60000 milliseconds 0
Total number of Regions in Transition 1
Run: bin/hbase hbck
...
Summary:
Table hbase:meta is okay.
Number of regions: 1
Deployed on: xd1,60020,1501052156299
Table wz is okay.
Number of regions: 34
Deployed on: xd0,60020,1501052155967 xd1,60020,1501052156299 xd3,60020,1501052156041
Table con is okay.
Number of regions: 1
Deployed on: xd0,60020,1501052155967
Table wnm is okay.
Number of regions: 271
Deployed on: xd0,60020,1501052155967 xd1,60020,1501052156299 xd3,60020,1501052156041
Table hbase:namespace is okay.
Number of regions: 1
Deployed on: xd0,60020,1501052155967
Table wbanalysis_contentid is okay.
Number of regions: 1
Deployed on: xd1,60020,1501052156299
0 inconsistencies detected.
Status: OK
2017-07-26 15:04:58,679 INFO [main] client.HConnectionManager$HConnectionImplementation: Closing master protocol: MasterService
2017-07-26 15:04:58,679 INFO [main] client.HConnectionManager$HConnectionImplementation: Closing zookeeper sessionid=0x25d7dadfa580004
2017-07-26 15:04:58,686 INFO [main] zookeeper.ZooKeeper: Session: 0x25d7dadfa580004 closed
2017-07-26 15:04:58,686 INFO [main-EventThread] zookeeper.ClientCnxn: EventThread shut down