Resolve “org.apache.hadoop.hive.ql.lockmgr.LockException(No record of transaction could be found)”
Symptom:
A Hive/Tez job fails with the following error message:
Error while compiling statement: FAILED: Hive Internal Error: org.apache.hadoop.hive.ql.lockmgr.LockException(No record of transaction txnid:131342 could be found, may have timed out)
Root Cause:
– Every transaction in Hive is recorded in the DB, and heartbeats are sent regularly to the Hive Metastore server (HMS) while the transaction is open. This prevents stale transactions.
– A transaction is aborted if the Hive Metastore server does not receive a heartbeat for it within the timeout.
Lock states -> Acquired, Waiting, Aborted
Lock types -> Exclusive, Shared_read, Shared_write
– In this scenario, the Metastore log shows that the transaction timed out and was aborted during the execution stage:
INFO org.apache.hadoop.hive.metastore.txn.TxnHandler: [Metastore Scheduled Worker 6]: Aborted the following transactions due to timeout: [131342]
INFO org.apache.hadoop.hive.metastore.txn.TxnHandler: [Metastore Scheduled Worker 6]: Aborted 1 transactions due to timeout
INFO org.apache.hadoop.hive.metastore.txn.TxnHandler: [Metastore Scheduled Worker 6]: Deleted 0 obsolete rows from WRTIE_SET
– We can see that the transaction is aborted after 5 minutes:
INFO org.apache.hadoop.hive.metastore.txn.TxnHandler: [pool-6-thread-7451]: Added entries to MIN_HISTORY_LEVEL with a single query for current txn: [131342]
INFO org.apache.hadoop.hive.metastore.txn.TxnHandler: [Metastore Scheduled Worker 6]: Aborted the following transactions due to timeout: [131342]
– This is because the default transaction timeout (hive.txn.timeout) is 5 minutes (300 seconds).
Resolution:
To resolve the issue, increase the transaction timeout. Set the property below in hive-site.xml to a higher value, for example 1200 seconds (20 minutes):
hive.txn.timeout => 1200
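In a hand-edited hive-site.xml, the setting above might look like the fragment below. This is a minimal sketch: 1200 is only the example value used in this post, and it assumes a bare number is read as seconds, the same unit as the 300-second default.

```xml
<!-- hive-site.xml used by the Hive Metastore server -->
<property>
  <name>hive.txn.timeout</name>
  <!-- Timeout in seconds; the default is 300 (5 minutes) -->
  <value>1200</value>
</property>
```

The property element goes inside the top-level configuration element of the file, next to the other Hive properties.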
For Cloudera distribution:
Go to Cloudera Manager -> Hive service -> Configuration -> Hive Metastore Server Advanced Configuration Snippet (Safety Valve) for hive-site.xml -> Add
Name: hive.txn.timeout Value: 1200
This change requires a Hive service restart to take effect.