Error message
org.apache.iceberg.hive.HiveTableOperations$WaitingForLockException: Waiting for lock.
at org.apache.iceberg.hive.HiveTableOperations.lambda$acquireLock$9(HiveTableOperations.java:444) ~[dw-0.1.jar:?]
at org.apache.iceberg.util.Tasks$Builder.runTaskWithRetry(Tasks.java:405) ~[dw-0.1.jar:?]
at org.apache.iceberg.util.Tasks$Builder.runSingleThreaded(Tasks.java:214) ~[dw-0.1.jar:?]
at org.apache.iceberg.util.Tasks$Builder.run(Tasks.java:198) ~[dw-0.1.jar:?]
at org.apache.iceberg.hive.HiveTableOperations.acquireLock(HiveTableOperations.java:438) ~[dw-0.1.jar:?]
at org.apache.iceberg.hive.HiveTableOperations.doCommit(HiveTableOperations.java:217) ~[dw-0.1.jar:?]
at org.apache.iceberg.BaseMetastoreTableOperations.commit(BaseMetastoreTableOperations.java:126) ~[dw-0.1.jar:?]
at org.apache.iceberg.SnapshotProducer.lambda$commit$2(SnapshotProducer.java:300) ~[dw-0.1.jar:?]
at org.apache.iceberg.util.Tasks$Builder.runTaskWithRetry(Tasks.java:405) ~[dw-0.1.jar:?]
at org.apache.iceberg.util.Tasks$Builder.runSingleThreaded(Tasks.java:214) ~[dw-0.1.jar:?]
at org.apache.iceberg.util.Tasks$Builder.run(Tasks.java:198) ~[dw-0.1.jar:?]
at org.apache.iceberg.util.Tasks$Builder.run(Tasks.java:190) ~[dw-0.1.jar:?]
at org.apache.iceberg.SnapshotProducer.commit(SnapshotProducer.java:282) ~[dw-0.1.jar:?]
at org.apache.iceberg.flink.sink.IcebergFilesCommitter.commitOperation(IcebergFilesCommitter.java:308) ~[dw-0.1.jar:?]
at org.apache.iceberg.flink.sink.IcebergFilesCommitter.commitDeltaTxn(IcebergFilesCommitter.java:277) ~[dw-0.1.jar:?]
at org.apache.iceberg.flink.sink.IcebergFilesCommitter.commitUpToCheckpoint(IcebergFilesCommitter.java:219) ~[dw-0.1.jar:?]
at org.apache.iceberg.flink.sink.IcebergFilesCommitter.notifyCheckpointComplete(IcebergFilesCommitter.java:189) ~[dw-0.1.jar:?]
at org.apache.flink.streaming.runtime.tasks.StreamOperatorWrapper.notifyCheckpointComplete(StreamOperatorWrapper.java:99) ~[flink-dist_2.12-1.12.5.jar:1.12.5]
at org.apache.flink.streaming.runtime.tasks.SubtaskCheckpointCoordinatorImpl.notifyCheckpointComplete(SubtaskCheckpointCoordinatorImpl.java:330) ~[flink-dist_2.12-1.12.5.jar:1.12.5]
at org.apache.flink.streaming.runtime.tasks.StreamTask.notifyCheckpointComplete(StreamTask.java:1092) ~[flink-dist_2.12-1.12.5.jar:1.12.5]
at org.apache.flink.streaming.runtime.tasks.StreamTask.lambda$notifyCheckpointCompleteAsync$11(StreamTask.java:1057) ~[flink-dist_2.12-1.12.5.jar:1.12.5]
at org.apache.flink.streaming.runtime.tasks.StreamTask.lambda$notifyCheckpointOperation$13(StreamTask.java:1080) ~[flink-dist_2.12-1.12.5.jar:1.12.5]
at org.apache.flink.streaming.runtime.tasks.StreamTaskActionExecutor$1.runThrowing(StreamTaskActionExecutor.java:50) [flink-dist_2.12-1.12.5.jar:1.12.5]
at org.apache.flink.streaming.runtime.tasks.mailbox.Mail.run(Mail.java:90) [flink-dist_2.12-1.12.5.jar:1.12.5]
at org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.processMail(MailboxProcessor.java:317) [flink-dist_2.12-1.12.5.jar:1.12.5]
at org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:189) [flink-dist_2.12-1.12.5.jar:1.12.5]
at org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:619) [flink-dist_2.12-1.12.5.jar:1.12.5]
at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:583) [flink-dist_2.12-1.12.5.jar:1.12.5]
at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:758) [flink-dist_2.12-1.12.5.jar:1.12.5]
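- Cause:
When a client requests a lock, the Hive Metastore first creates an exclusive lock on the Hive table in the WAITING state. If no other lock exists on the table, the Metastore changes the state to ACQUIRED. Otherwise, it sets hl_blockedby_ext_id to the most recent lockId. Whether or not the lock is acquired, the lock record is stored in HIVE_LOCKS.
HiveMetaStoreClient.lock() adds a new lock request to the HMS HIVE_LOCKS table. If an exception is thrown at this point because of a timeout, the lock request stays stuck there in the WAITING state (and may block later requests) unless unlock() is called.
Once the lock timeout is reached, the cleanup process deletes these locks, but until then they can block other lock requests. So it is better to clean up these locks whenever possible.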
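The behaviour described above can be illustrated with a small in-memory model. This is only a sketch: `ToyLockTable`, `LockRow`, and the table name `db.t` are hypothetical, and the logic is a simplification of the description above, not the actual Metastore implementation.

```python
from dataclasses import dataclass
from typing import List, Optional

WAITING, ACQUIRED = "WAITING", "ACQUIRED"


@dataclass
class LockRow:
    ext_id: int                       # plays the role of hl_lock_ext_id
    table: str                        # plays the role of hl_table
    state: str                        # plays the role of hl_lock_state
    blocked_by: Optional[int] = None  # plays the role of hl_blockedby_ext_id


class ToyLockTable:
    """In-memory stand-in for HIVE_LOCKS (hypothetical, simplified)."""

    def __init__(self) -> None:
        self.rows: List[LockRow] = []
        self.next_id = 1

    def lock(self, table: str) -> LockRow:
        # Every request is stored, whether or not it gets the lock.
        existing = [r for r in self.rows if r.table == table]
        row = LockRow(self.next_id, table, WAITING)
        self.next_id += 1
        if existing:
            # Another lock row exists: stay WAITING, record the latest one.
            row.blocked_by = existing[-1].ext_id
        else:
            row.state = ACQUIRED
        self.rows.append(row)
        return row

    def unlock(self, ext_id: int) -> None:
        # Removing the row is what releases (or abandons) the lock.
        self.rows = [r for r in self.rows if r.ext_id != ext_id]


locks = ToyLockTable()
a = locks.lock("db.t")      # first writer acquires the lock
b = locks.lock("db.t")      # second writer waits behind a
locks.unlock(a.ext_id)      # a finishes its commit
# b's client times out and throws without calling unlock(),
# so b's row is left behind in the WAITING state.
c = locks.lock("db.t")      # a later writer is blocked by the stranded row
print(c.state, c.blocked_by)

locks.unlock(b.ext_id)      # cleaning up the stranded rows...
locks.unlock(c.ext_id)
d = locks.lock("db.t")      # ...lets the next request through
print(d.state)
```

In this model the third request stays in WAITING only because of the leftover row from the request that timed out, which matches the symptom in the stack trace above.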
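We therefore need to find the HIVE_LOCKS table in the Hive Metastore database and delete the lock records that belong to the table reporting the error.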
select hl_lock_ext_id,hl_table,hl_lock_state,hl_lock_type,hl_last_heartbeat,hl_blockedby_ext_id from HIVE_LOCKS;
-- then
delete from HIVE_LOCKS where hl_lock_ext_id=605 or hl_lock_ext_id=622;
After that, the problem was resolved.
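The lookup and delete above can also be scoped to a single table, so that locks on other tables are left alone. The sketch below uses an in-memory SQLite database as a stand-in for the Metastore backing database. The reduced schema, the table names `orders` and `users`, the state codes, and the helper functions are all assumptions made for illustration; only the lock IDs 605 and 622 come from the case above.

```python
import sqlite3

# Stand-in for the Metastore backing database (assumption: the real one is
# typically MySQL or PostgreSQL and has more columns than shown here).
conn = sqlite3.connect(":memory:")
conn.execute(
    """create table HIVE_LOCKS (
           hl_lock_ext_id integer,
           hl_table text,
           hl_lock_state text,
           hl_blockedby_ext_id integer)"""
)
conn.executemany(
    "insert into HIVE_LOCKS values (?, ?, ?, ?)",
    [
        (605, "orders", "w", None),  # hypothetical stranded lock
        (622, "orders", "w", 605),   # hypothetical lock waiting behind it
        (630, "users", "a", None),   # unrelated lock that must survive
    ],
)


def lock_ids_for_table(conn, table):
    """Return the lock IDs recorded for one table, oldest first."""
    rows = conn.execute(
        "select hl_lock_ext_id from HIVE_LOCKS "
        "where hl_table = ? order by hl_lock_ext_id",
        (table,),
    ).fetchall()
    return [row[0] for row in rows]


def delete_locks(conn, ids):
    """Delete the given lock rows in one transaction."""
    with conn:  # commits on success, rolls back if an error is raised
        conn.executemany(
            "delete from HIVE_LOCKS where hl_lock_ext_id = ?",
            [(i,) for i in ids],
        )


ids = lock_ids_for_table(conn, "orders")
delete_locks(conn, ids)
remaining = conn.execute("select hl_lock_ext_id from HIVE_LOCKS").fetchall()
print(ids, remaining)
```

Filtering by hl_table before deleting keeps the change limited to the table named in the error, and reviewing the returned IDs first gives a chance to confirm that no running job still holds one of them.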