That is, a mutation to a single row (whether it touches one column, multiple columns, or multiple column families) either succeeds completely or fails completely; no intermediate state is ever visible.
Example:
Client A writes to the row with rowkey=10: dim1:a = 1, dim2:b = 1
Client B writes to the same row: dim1:a = 2, dim2:b = 2
Here dim1 and dim2 are column families, and a and b are columns.
If clients A and B issue their requests concurrently, the row may end up as dim1:a = 1, dim2:b = 1, or as dim1:a = 2, dim2:b = 2,
but it can never end up in the mixed state dim1:a = 1, dim2:b = 2.
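This guarantee can be illustrated with a toy simulation (not HBase code; the class and table names here are invented for illustration): every column edit of one mutation is applied while holding a single per-row lock, so two concurrent writers can interleave across rows but never within one row.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Toy model of single-row atomicity: all columns of one mutation are
// applied under the row's monitor, so a competing writer never leaves
// the row in a half-applied, mixed state.
class AtomicRowDemo {
    static final Map<String, Map<String, Integer>> table = new ConcurrentHashMap<>();

    static void put(String rowKey, Map<String, Integer> columns) {
        Map<String, Integer> row = table.computeIfAbsent(rowKey, k -> new ConcurrentHashMap<>());
        synchronized (row) {     // per-row lock, analogous to HBase's rowlock
            row.putAll(columns); // all column-family/column edits applied together
        }
    }

    public static void main(String[] args) throws InterruptedException {
        Thread a = new Thread(() -> put("10", Map.of("dim1:a", 1, "dim2:b", 1)));
        Thread b = new Thread(() -> put("10", Map.of("dim1:a", 2, "dim2:b", 2)));
        a.start(); b.start(); a.join(); b.join();
        Map<String, Integer> row = table.get("10");
        // The result is either all-1 or all-2, never mixed.
        if (!row.get("dim1:a").equals(row.get("dim2:b"))) {
            throw new AssertionError("atomicity violated: " + row);
        }
        System.out.println("final row: " + row);
    }
}
```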
HBase relies on a row lock to guarantee the atomicity of single-row operations. Consider the put method of HRegion (based on HBase 0.94.20):
org.apache.hadoop.hbase.regionserver.HRegion:
  /**
   * @param put
   * @param lockid
   * @param writeToWAL
   * @throws IOException
   * @deprecated row locks (lockId) held outside the extent of the operation are deprecated.
   */
  public void put(Put put, Integer lockid, boolean writeToWAL)
      throws IOException {
    checkReadOnly();

    // Do a rough check that we have resources to accept a write. The check is
    // 'rough' in that between the resource check and the call to obtain a
    // read lock, resources may run out. For now, the thought is that this
    // will be extremely rare; we'll deal with it when it happens.
    checkResources();
    startRegionOperation();
    this.writeRequestsCount.increment();
    this.opMetrics.setWriteRequestCountMetrics(this.writeRequestsCount.get());
    try {
      // We obtain a per-row lock, so other clients will block while one client
      // performs an update. The read lock is released by the client calling
      // #commit or #abort or if the HRegionServer lease on the lock expires.
      // See HRegionServer#RegionListener for how the expire on HRegionServer
      // invokes a HRegion#abort.
      byte [] row = put.getRow();
      // If we did not pass an existing row lock, obtain a new one
      Integer lid = getLock(lockid, row, true);

      try {
        // All edits for the given row (across all column families) must happen atomically.
        internalPut(put, put.getClusterId(), writeToWAL);
      } finally {
        if (lockid == null) releaseRowLock(lid);
      }
    } finally {
      closeRegionOperation();
    }
  }
  private Integer internalObtainRowLock(final HashedBytes rowKey, boolean waitForLock)
      throws IOException {
    checkRow(rowKey.getBytes(), "row lock");
    startRegionOperation();
    try {
      CountDownLatch rowLatch = new CountDownLatch(1);

      // loop until we acquire the row lock (unless !waitForLock)
      while (true) {
        CountDownLatch existingLatch = lockedRows.putIfAbsent(rowKey, rowLatch);
        if (existingLatch == null) {
          break;
        } else {
          // row already locked
          if (!waitForLock) {
            return null;
          }
          try {
            if (!existingLatch.await(this.rowLockWaitDuration,
                TimeUnit.MILLISECONDS)) {
              throw new IOException("Timed out on getting lock for row=" + rowKey);
            }
          } catch (InterruptedException ie) {
            // Empty
          }
        }
      }

      // loop until we generate an unused lock id
      while (true) {
        Integer lockId = lockIdGenerator.incrementAndGet();
        HashedBytes existingRowKey = lockIds.putIfAbsent(lockId, rowKey);
        if (existingRowKey == null) {
          return lockId;
        } else {
          // lockId already in use, jump generator to a new spot
          lockIdGenerator.set(rand.nextInt());
        }
      }
    } finally {
      closeRegionOperation();
    }
  }
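The putIfAbsent-plus-latch pattern in internalObtainRowLock can be reduced to a small standalone sketch (simplified: no lock ids, no lease expiry, not reentrant; the class name is invented):

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Simplified version of HRegion's row-lock loop: the first thread to
// publish a latch for the row owns the lock; later threads wait on the
// existing latch, then retry the putIfAbsent.
class SimpleRowLock {
    private final ConcurrentMap<String, CountDownLatch> lockedRows = new ConcurrentHashMap<>();

    // Returns true once the row lock is held, false on timeout
    // (where HBase throws "Timed out on getting lock for row=...").
    boolean lock(String rowKey, long timeoutMs) throws InterruptedException {
        CountDownLatch myLatch = new CountDownLatch(1);
        while (true) {
            CountDownLatch existing = lockedRows.putIfAbsent(rowKey, myLatch);
            if (existing == null) {
                return true;                  // we now own the row
            }
            if (!existing.await(timeoutMs, TimeUnit.MILLISECONDS)) {
                return false;                 // timed out waiting for the holder
            }
            // holder released; loop and try putIfAbsent again
        }
    }

    void unlock(String rowKey) {
        CountDownLatch latch = lockedRows.remove(rowKey);
        if (latch != null) {
            latch.countDown();                // wake up any waiters
        }
    }
}
```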
HBase also exposes an API (lockRow/unlockRow) for taking row locks explicitly, but its use is discouraged. The reason is that two clients can easily each hold a lock the other needs while requesting the lock the other already holds, producing a deadlock; until the locks time out, both blocked clients each tie up a server-side handler thread, and handler threads are a very scarce resource.
HBase also provides several dedicated atomic-operation interfaces:
checkAndPut, checkAndDelete, increment, and append. These are very useful, and internally they are likewise implemented on top of the row lock.
A fragment of the code path shared by checkAndPut/checkAndDelete:
    // Lock row
    Integer lid = getLock(lockId, get.getRow(), true);
    ......
    // get and compare
    try {
      result = get(get, false);
      ......
      // If matches put the new put or delete the new delete
      if (matches) {
        if (isPut) {
          internalPut(((Put) w), HConstants.DEFAULT_CLUSTER_ID, writeToWAL);
        } else {
          Delete d = (Delete) w;
          prepareDelete(d);
          internalDelete(d, HConstants.DEFAULT_CLUSTER_ID, writeToWAL);
        }
        return true;
      }
      return false;
    } finally {
      // release lock
      if (lockId == null) releaseRowLock(lid);
    }
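The semantics of this snippet — get, compare, then put or delete, all under the same row lock — amount to a row-scoped compare-and-set. A toy equivalent (class and column names invented for illustration; `expected == null` means "only if the cell is absent", matching checkAndPut's null-check semantics):

```java
import java.util.Map;
import java.util.Objects;
import java.util.concurrent.ConcurrentHashMap;

// Toy checkAndPut: atomically write newValue only if the cell currently
// holds expected (null expected = "only if the cell does not exist yet").
class CheckAndPutDemo {
    static final Map<String, String> row = new ConcurrentHashMap<>();

    static synchronized boolean checkAndPut(String column, String expected, String newValue) {
        if (!Objects.equals(row.get(column), expected)) {
            return false;          // check failed, nothing is written
        }
        row.put(column, newValue); // check passed, apply the put
        return true;
    }
}
```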
checkAndPut is very valuable in practice. In our production project that generates Dpids, multiple clients may generate a DPID in parallel; once one client has generated a DPID for a given mac, the others must not generate a new one and can only read back the existing DPID.
Code fragment:
    ret = hbaseUse.checkAndPut("bi.dpdim_mac_dpid_mapping", mac, "dim",
        "dpid", null, dpid);
    if (!ret) {
      String retDpid = hbaseUse.query("bi.dpdim_mac_dpid_mapping", mac, "dim", "dpid");
      if (!retDpid.equals(ABNORMAL)) {
        return retDpid;
      }
    } else {
      columnList.add("mac");
      valueList.add(mac);
    }
To keep data consistent under concurrent operations without sacrificing performance, HBase uses a variety of efficient reentrant locks: the row-level rowlock and MVCC, region-level read-write locks, store-level read-write locks, memstore-level read-write locks, and so on.
1. Row-level lock: RowLock
To keep a single row consistent under concurrent mutation, HBase uses the rowlock mechanism, which guarantees that only one thread at a time operates on a given row. Row locks also carry a lease: once the lease expires, the row lock is released automatically.
2. MVCC
For concurrency performance, the rowlock is taken only on the write path. For concurrent reads and writes, HBase instead uses MVCC (multi-version concurrency control).
The basic idea: a write goes through the WAL, the MemStore, and so on. Right after the rowlock is acquired, the write is assigned a write number, and every cell the write places in the store is tagged with that number; when the write finishes (just before the lock is released), its write number is marked complete. Each read operation, at its start, picks a read point: the largest write number up to which all writes have completed, i.e. the newest completed write number (the memstore may still contain cells from in-flight writes, which must not be read; this is also why a memstore flush must wait until every cell in the memstore belongs to a completed write). For a detailed MVCC analysis see the earlier post: http://blog.csdn.net/yangbutao/article/details/8998800
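That write-number / read-point protocol can be modeled in a few lines (greatly simplified from HBase's MultiVersionConsistencyControl; the class and method names here are invented):

```java
import java.util.LinkedList;

// Minimal MVCC read-point tracking: writers receive increasing write
// numbers, and the read point only advances past a write number once
// that write AND every earlier write have completed.
class MiniMvcc {
    private long nextWriteNumber = 1;
    private long readPoint = 0;
    // Each entry is {writeNumber, doneFlag}; kept in begin order.
    private final LinkedList<long[]> pending = new LinkedList<>();

    synchronized long beginWrite() {
        long wn = nextWriteNumber++;
        pending.add(new long[]{wn, 0});
        return wn;
    }

    synchronized void completeWrite(long wn) {
        for (long[] e : pending) {
            if (e[0] == wn) { e[1] = 1; break; }
        }
        // Advance the read point over the completed prefix only, so a
        // later write finishing early cannot expose an earlier in-flight one.
        while (!pending.isEmpty() && pending.peekFirst()[1] == 1) {
            readPoint = pending.removeFirst()[0];
        }
    }

    // Readers only see cells whose write number is <= this value.
    synchronized long getReadPoint() { return readPoint; }
}
```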
3. Region-level locks
Before an update, the region checks whether resources permit it; if a threshold has been reached, it requests a flush and waits until resources are available again (see checkResources in HRegion).
The region's update lock (updatesLock): internalFlushCache takes it in write mode, which blocks put, delete, and increment operations for the duration (those operations take it in read mode).
The region's close lock (lock): region close and split take it in write mode, blocking all other operations on the region (which take it in read mode), such as compact, flush, scan, and other writes.
4. Store-level locks
A flush consists of:
a. prepare (take a snapshot of the memstore)
b. flushcache (write the snapshot out to a temporary file)
c. commit (confirm the flush: rename the temporary file to its final name and clear the memstore snapshot)
The commit phase of a flush, the completeCompaction phase of a compaction (renaming the temporary compaction file and cleaning up old files), closing the store, and bulkLoadHFile all block writes to the store.
5. MemStore-level locks
Writes to a store call into the memstore. Taking a memstore snapshot and clearing that snapshot block the memstore's other operations (such as add, delete, and getNextRow).
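A sketch of that snapshot pattern using a ReentrantReadWriteLock (heavily simplified; the real MemStore also tracks sizes, timestamps, and KeyValue ordering, and the class name here is invented):

```java
import java.util.concurrent.ConcurrentSkipListMap;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Simplified memstore: normal adds take the read lock (so they may run
// concurrently with each other), while snapshot/clearSnapshot take the
// write lock and block every other memstore operation during the swap.
class MiniMemStore {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    volatile ConcurrentSkipListMap<String, String> active = new ConcurrentSkipListMap<>();
    volatile ConcurrentSkipListMap<String, String> snapshot = new ConcurrentSkipListMap<>();

    void add(String key, String value) {
        lock.readLock().lock();          // many adds may proceed concurrently
        try {
            active.put(key, value);
        } finally {
            lock.readLock().unlock();
        }
    }

    void snapshot() {
        lock.writeLock().lock();         // blocks add() while swapping maps
        try {
            if (snapshot.isEmpty()) {    // don't clobber an unflushed snapshot
                snapshot = active;
                active = new ConcurrentSkipListMap<>();
            }
        } finally {
            lock.writeLock().unlock();
        }
    }

    void clearSnapshot() {
        lock.writeLock().lock();         // the flush-commit step
        try {
            snapshot = new ConcurrentSkipListMap<>();
        } finally {
            lock.writeLock().unlock();
        }
    }
}
```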