HBase Beginner Notes I

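These notes cover the definition of HBase, its data model (Namespace, Region, RowKey and related terms), its basic architecture (Master and RegionServer), and how to set up and configure it. They walk through the key steps from theory to practice and are aimed at readers who are new to HBase.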


Acknowledgments: Atguigu (尚硅谷) https://www.bilibili.com/video/BV1Y4411B7jy?from=search&seid=18054153211371292589

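1.1 HBase Definition

HBase is a distributed, scalable (nodes can be brought online and taken offline dynamically) NoSQL (key-value) database that supports massive volumes of data.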

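1.2 Data Model

Logically, the data model resembles that of a relational database: data lives in a table. Physically, the data is stored as key-value pairs.

Differences from MySQL: 1. columns are grouped into column families (one row contains many column families), which splits a wide table vertically; 2. rows are split into Regions, which splits a tall table horizontally.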
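The two kinds of splitting can be illustrated with a small Python sketch. It is only a toy model under stated assumptions: the table contents, the split key `row5`, and the function name `split_table` are all made up for illustration and are not HBase APIs.

```python
from bisect import bisect_right

# Hypothetical logical table: row key -> {"family:qualifier": value}
table = {
    "row1": {"info:name": "a", "info:age": "20", "addr:city": "x"},
    "row2": {"info:name": "b", "addr:city": "y"},
    "row5": {"info:name": "c", "info:age": "31"},
    "row8": {"addr:city": "z"},
}

def split_table(table, split_keys):
    """Group cells by (region index, column family).

    split_keys are the sorted row keys at which a new region starts,
    so rows are cut horizontally and columns are cut by family.
    """
    stores = {}
    for row_key in sorted(table):
        region = bisect_right(split_keys, row_key)  # horizontal split by row key
        for column, value in table[row_key].items():
            family = column.split(":", 1)[0]        # vertical split by family
            stores.setdefault((region, family), {})[(row_key, column)] = value
    return stores

stores = split_table(table, split_keys=["row5"])
for key in sorted(stores):
    print(key, sorted(stores[key]))
```

Each `(region, family)` pair ends up with its own group of cells, which mirrors the idea that a row range and a column family together determine where a cell is stored.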

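Logical structure:

Physical storage:

row key, column family, column qualifier, timestamp, type, value

When all entries for a column are PUTs, the one with the largest timestamp is shown. A deletion is stored as an entry whose type is delete; at query time it is compared with the timestamps to decide whether the data has been deleted.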
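The read rule for versions and delete markers can be sketched as follows. This is a deliberately simplified model, not HBase's implementation: the type labels `"Put"` and `"Delete"`, the sample cells, and the function `read_latest` are assumptions chosen for illustration, and only the newest entry per column is considered.

```python
from typing import List, Optional, Tuple

# (row key, column family, column qualifier, timestamp, type, value)
Cell = Tuple[str, str, str, int, str, bytes]

def read_latest(cells: List[Cell], row: str, family: str, qualifier: str) -> Optional[bytes]:
    """Return the visible value for one column, or None if absent or deleted."""
    matching = [c for c in cells if c[:3] == (row, family, qualifier)]
    if not matching:
        return None
    # Newest timestamp first; on a timestamp tie the delete marker wins.
    matching.sort(key=lambda c: (c[3], c[4] == "Delete"), reverse=True)
    newest = matching[0]
    return None if newest[4] == "Delete" else newest[5]

# Hypothetical stored entries: nothing is overwritten, entries only accumulate.
cells = [
    ("row1", "info", "name", 1, "Put", b"alice"),
    ("row1", "info", "name", 3, "Put", b"alicia"),
    ("row1", "info", "city", 2, "Put", b"paris"),
    ("row1", "info", "city", 4, "Delete", b""),
]

print(read_latest(cells, "row1", "info", "name"))  # newest Put wins
print(read_latest(cells, "row1", "info", "city"))  # hidden by the newer delete marker
```

The point of the sketch is that both updates and deletes are appended as new entries, and the timestamp decides what a reader sees.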

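1.2.2 Data Model

1) Namespace: similar to a database in a relational system.

2) Region: a slice of a table. The table itself is similar to a MySQL table, except that when defining an HBase table you only declare the column families, not the individual columns; columns are added dynamically.

3) Row: each row consists of one RowKey and multiple Columns. Rows are stored in lexicographic order of RowKey, and data is looked up by RowKey.

4) Column: a column is identified by a column family and a column qualifier.

5) Timestamp: identifies the different versions of the data.

6) Cell: a cell is uniquely identified by the fields above (row key, column family, column qualifier, timestamp, type). The data in a cell has no type; it is stored as raw bytes.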
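As a rough mental model, the terms above can be pictured as nested maps. The sketch below is only an illustration in Python: the namespace `default`, the table name `student`, the sample values, and the helper `cell_value` are hypothetical and do not correspond to any HBase client API.

```python
# namespace -> table -> row key -> (family, qualifier) -> timestamp -> value bytes
namespaces = {
    "default": {
        "student": {
            "row10": {
                ("info", "name"): {1: b"alice", 3: b"alicia"},
                ("info", "age"): {2: b"30"},
            },
            "row2": {("info", "name"): {1: b"bob"}},
            "row1": {("info", "name"): {1: b"carol"}},
        }
    }
}

def cell_value(table, row_key, family, qualifier, timestamp=None):
    """Return one cell's bytes; the newest version if no timestamp is given."""
    versions = table[row_key][(family, qualifier)]
    ts = max(versions) if timestamp is None else timestamp
    return versions[ts]

table = namespaces["default"]["student"]

# Row keys are compared byte by byte, so "row10" sorts before "row2".
sorted_keys = sorted(key.encode() for key in table)
print(sorted_keys)

print(cell_value(table, "row10", "info", "name"))               # newest version
print(cell_value(table, "row10", "info", "name", timestamp=1))  # older version
```

Note that the values are bytes with no type attached; interpreting `b"30"` as a number is left to the application.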

 

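1.3 HBase Basic Architecture

Regions are hosted on RegionServers. There are multiple RegionServers, so storage is distributed.

Master: handles table-level operations (create, delete, alter), assigns regions to each RegionServer, and monitors the status of every RegionServer. Reads and writes of the data itself are served by the RegionServers.

A standby Master provides high availability.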
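The division of labor described in this section can be sketched in a few lines of Python. This is a toy model under stated assumptions, not how HBase locates or balances regions: the start keys, the initial assignment, and the functions `locate` and `reassign` are invented for illustration; only the host names are taken from the cluster used in these notes.

```python
from bisect import bisect_right
from typing import List

# Hypothetical sorted region start keys; the first region starts at the empty key.
region_start_keys = ["", "row3", "row6"]
# Hypothetical assignment: which RegionServer hosts each region.
assignment = ["slave1", "slave2", "master"]

def locate(row_key: str, assignment: List[str]) -> str:
    """Find the RegionServer whose region covers row_key."""
    index = bisect_right(region_start_keys, row_key) - 1
    return assignment[index]

def reassign(assignment: List[str], failed: str, live: List[str]) -> List[str]:
    """Mimic the Master moving regions off a failed RegionServer."""
    return [live[i % len(live)] if server == failed else server
            for i, server in enumerate(assignment)]

print([locate(k, assignment) for k in ("row1", "row3", "row9")])

# If slave2 goes down, its regions are handed to the remaining servers.
after_failure = reassign(assignment, failed="slave2", live=["master", "slave1"])
print(locate("row3", after_failure))
```

The Master only maintains the assignment; a request for a given row key then goes to whichever RegionServer currently hosts the covering region.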

 

1.4 HBase Setup and Configuration

Download the HBase release that matches your Hadoop version!

https://www.apache.org/dyn/closer.lua/hbase/1.4.13/hbase-1.4.13-bin.tar.gz

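Edit the configuration files

hbase-site.xml

Note: the port in hbase.rootdir must match the port of fs.defaultFS in the Hadoop configuration (normally set in core-site.xml).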

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
/**
 *
 * Licensed to the Apache Software Foundation (ASF) under one
 * or more contributor license agreements.  See the NOTICE file
 * distributed with this work for additional information
 * regarding copyright ownership.  The ASF licenses this file
 * to you under the Apache License, Version 2.0 (the
 * "License"); you may not use this file except in compliance
 * with the License.  You may obtain a copy of the License at
 *
 *     http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */
-->
<configuration>

  <property>
    <name>hbase.rootdir</name>
    <value>hdfs://master:8020/hbase</value>
  </property>

  <property>
    <name>hbase.cluster.distributed</name>
    <value>true</value>
  </property>

  <property>
    <name>hbase.master.port</name>
    <value>16000</value>
  </property>

  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>master,slave1,slave2</value>
    <description>Comma separated list of servers in the ZooKeeper Quorum. For example, "host1.mydomain.com,host2.mydomain.com,host3.mydomain.com". By default this is set to localhost for local and pseudo-distributed modes of operation. For a fully-distributed setup, this should be set to a full list of ZooKeeper quorum servers. If HBASE_MANAGES_ZK is set in hbase-env.sh this is the list of servers which we will start/stop ZooKeeper on. </description>
  </property>

  <property>
    <name>hbase.zookeeper.property.dataDir</name>
    <value>/opt/apache-zookeeper-3.5.9-bin/zkData</value>
    <description>Property from ZooKeeper's config zoo.cfg. The directory where the snapshot is stored. </description>
  </property>

</configuration>
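The port note that precedes this listing can be checked mechanically. The following Python sketch is one possible way to do it, using only the standard library; the `core_site` snippet is an assumed example (the notes do not show the Hadoop configuration), and `get_property` and `same_namenode` are hypothetical helper names.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

def get_property(xml_text: str, name: str):
    """Return the <value> of the <property> with the given <name>, or None."""
    root = ET.fromstring(xml_text)
    for prop in root.iter("property"):
        if prop.findtext("name") == name:
            return prop.findtext("value")
    return None

# Trimmed copy of the hbase.rootdir setting from hbase-site.xml.
hbase_site = """<configuration>
  <property><name>hbase.rootdir</name><value>hdfs://master:8020/hbase</value></property>
</configuration>"""

# Assumed Hadoop setting; the real value must be read from your own cluster.
core_site = """<configuration>
  <property><name>fs.defaultFS</name><value>hdfs://master:8020</value></property>
</configuration>"""

def same_namenode(hbase_xml: str, core_xml: str) -> bool:
    """True when hbase.rootdir and fs.defaultFS point at the same host and port."""
    rootdir = urlparse(get_property(hbase_xml, "hbase.rootdir"))
    default_fs = urlparse(get_property(core_xml, "fs.defaultFS"))
    return (rootdir.hostname, rootdir.port) == (default_fs.hostname, default_fs.port)

print(same_namenode(hbase_site, core_site))
```

In practice the two strings would be read from the actual files; a mismatch here is a common reason for the Master failing to start.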

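hbase-env.sh (the listing below uses Windows batch syntax, `@rem` and `set`, so it appears to be the contents of hbase-env.cmd; hbase-env.sh sets the same variables with `export`)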

@rem/**
@rem * Licensed to the Apache Software Foundation (ASF) under one
@rem * or more contributor license agreements.  See the NOTICE file
@rem * distributed with this work for additional information
@rem * regarding copyright ownership.  The ASF licenses this file
@rem * to you under the Apache License, Version 2.0 (the
@rem * "License"); you may not use this file except in compliance
@rem * with the License.  You may obtain a copy of the License at
@rem *
@rem *     http://www.apache.org/licenses/LICENSE-2.0
@rem *
@rem * Unless required by applicable law or agreed to in writing, software
@rem * distributed under the License is distributed on an "AS IS" BASIS,
@rem * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
@rem * See the License for the specific language governing permissions and
@rem * limitations under the License.
@rem */

@rem Set environment variables here.

@rem The java implementation to use.  Java 1.7+ required.
@rem set JAVA_HOME=c:\apps\java

@rem Extra Java CLASSPATH elements.  Optional.
@rem set HBASE_CLASSPATH=

@rem The maximum amount of heap to use. Default is left to JVM default.
@rem set HBASE_HEAPSIZE=1000

@rem Uncomment below if you intend to use off heap cache. For example, to allocate 8G of 
@rem offheap, set the value to "8G".
@rem set HBASE_OFFHEAPSIZE=1000

@rem For example, to allocate 8G of offheap, set it to 8G:
@rem set HBASE_OFFHEAPSIZE=8G

@rem Extra Java runtime options.
@rem Below are what we set by default.  May only work with SUN JVM.
@rem For more on why as well as other possible settings,
@rem see http://wiki.apache.org/hadoop/PerformanceTuning
@rem JDK6 on Windows has a known bug for IPv6, use preferIPv4Stack unless JDK7.
@rem See TestIPv6NIOServerSocketChannel.
set HBASE_OPTS="-XX:+UseConcMarkSweepGC" "-Djava.net.preferIPv4Stack=true"

@rem Configure PermSize. Only needed in JDK7. You can safely remove it for JDK8+
set HBASE_MASTER_OPTS=%HBASE_MASTER_OPTS% "-XX:PermSize=128m" "-XX:MaxPermSize=128m" "-XX:ReservedCodeCacheSize=256m"
set HBASE_REGIONSERVER_OPTS=%HBASE_REGIONSERVER_OPTS% "-XX:PermSize=128m" "-XX:MaxPermSize=128m" "-XX:ReservedCodeCacheSize=256m"

@rem Uncomment below to enable java garbage collection logging for the server-side processes
@rem this enables basic gc logging for the server processes to the .out file
@rem set SERVER_GC_OPTS="-verbose:gc" "-XX:+PrintGCDetails" "-XX:+PrintGCDateStamps" %HBASE_GC_OPTS%

@rem this enables gc logging using automatic GC log rolling. Only applies to jdk 1.6.0_34+ and 1.7.0_2+. Either use this set of options or the one above
@rem set SERVER_GC_OPTS="-verbose:gc" "-XX:+PrintGCDetails" "-XX:+PrintGCDateStamps" "-XX:+UseGCLogFileRotation" "-XX:NumberOfGCLogFiles=1" "-XX:GCLogFileSize=512M" %HBASE_GC_OPTS%

@rem Uncomment below to enable java garbage collection logging for the client processes in the .out file.
@rem set CLIENT_GC_OPTS="-verbose:gc" "-XX:+PrintGCDetails" "-XX:+PrintGCDateStamps" %HBASE_GC_OPTS%

@rem Uncomment below (along with above GC logging) to put GC information in its own logfile (will set HBASE_GC_OPTS)
@rem set HBASE_USE_GC_LOGFILE=true

@rem Uncomment and adjust to enable JMX exporting
@rem See jmxremote.password and jmxremote.access in $JRE_HOME/lib/management to configure remote password access.
@rem More details at: http://java.sun.com/javase/6/docs/technotes/guides/management/agent.html
@rem
@rem set HBASE_JMX_BASE="-Dcom.sun.management.jmxremote.ssl=false" "-Dcom.sun.management.jmxremote.authenticate=false"
@rem set HBASE_MASTER_OPTS=%HBASE_JMX_BASE% "-Dcom.sun.management.jmxremote.port=10101"
@rem set HBASE_REGIONSERVER_OPTS=%HBASE_JMX_BASE% "-Dcom.sun.management.jmxremote.port=10102"
@rem set HBASE_THRIFT_OPTS=%HBASE_JMX_BASE% "-Dcom.sun.management.jmxremote.port=10103"
@rem set HBASE_ZOOKEEPER_OPTS=%HBASE_JMX_BASE% "-Dcom.sun.management.jmxremote.port=10104"

@rem File naming hosts on which HRegionServers will run.  $HBASE_HOME/conf/regionservers by default.
@rem set HBASE_REGIONSERVERS=%HBASE_HOME%\conf\regionservers

@rem Where log files are stored.  $HBASE_HOME/logs by default.
@rem set HBASE_LOG_DIR=%HBASE_HOME%\logs

@rem A string representing this instance of hbase. $USER by default.
@rem set HBASE_IDENT_STRING=%USERNAME%

@rem Seconds to sleep between slave commands.  Unset by default.  This
@rem can be useful in large clusters, where, e.g., slave rsyncs can
@rem otherwise arrive faster than the master can service them.
@rem set HBASE_SLAVE_SLEEP=0.1

@rem Tell HBase whether it should manage its own instance of Zookeeper or not.
@rem set HBASE_MANAGES_ZK=true

regionservers

jamjar@master:/opt/hbase-1.3.2/conf$ cat regionservers 
master
slave1
slave2
 

Use ln -s to create symbolic links to Hadoop's core-site.xml and hdfs-site.xml in the HBase conf directory.
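The linking step can be sketched as below. It is written in Python rather than shell only to stay consistent with the other examples here; the directory layout is a throwaway stand-in created under a temporary directory, and `link_hadoop_conf` is a hypothetical helper name. On a real cluster the two paths would be the Hadoop and HBase configuration directories.

```python
import tempfile
from pathlib import Path

def link_hadoop_conf(hadoop_conf: Path, hbase_conf: Path) -> list:
    """Create symlinks in hbase_conf pointing at Hadoop's two site files."""
    links = []
    for name in ("core-site.xml", "hdfs-site.xml"):
        link = hbase_conf / name
        if not link.exists() and not link.is_symlink():
            # Same effect as: ln -s <hadoop_conf>/<name> <hbase_conf>/<name>
            link.symlink_to(hadoop_conf / name)
        links.append(link)
    return links

# Demo in a temporary directory so nothing on the real system is touched.
with tempfile.TemporaryDirectory() as tmp:
    hadoop_conf = Path(tmp) / "hadoop" / "etc" / "hadoop"
    hbase_conf = Path(tmp) / "hbase" / "conf"
    hadoop_conf.mkdir(parents=True)
    hbase_conf.mkdir(parents=True)
    for name in ("core-site.xml", "hdfs-site.xml"):
        (hadoop_conf / name).write_text("<configuration/>")
    links = link_hadoop_conf(hadoop_conf, hbase_conf)
    results = [(l.name, l.is_symlink(), l.read_text()) for l in links]
    print(results)
```

Linking rather than copying means later changes to the Hadoop files are picked up by HBase without a second edit.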

1.5 Starting HBase

Common commands
Start the Master:
# sh hbase-1.4.13/bin/hbase-daemon.sh start master

Stop the Master:
# sh hbase-1.4.13/bin/hbase-daemon.sh stop master

Start a RegionServer:
# sh hbase-1.4.13/bin/hbase-daemon.sh start regionserver

Stop a RegionServer:
# sh hbase-1.4.13/bin/hbase-daemon.sh stop regionserver

Start the whole cluster:
# sh hbase-1.4.13/bin/start-hbase.sh

Stop the whole cluster:
# sh hbase-1.4.13/bin/stop-hbase.sh
Acknowledgments: original article: https://blog.csdn.net/lyq19870515/article/details/103398180
