Resolve “org.apache.hadoop.hive.ql.lockmgr.LockException(No record of transaction could be found)”

Symptom:

A Hive/Tez job fails with the following error message:

Error while compiling statement: FAILED: Hive Internal Error: org.apache.hadoop.hive.ql.lockmgr.LockException(No record of transaction txnid:131342 could be found, may have timed out) 

Root Cause:

– Every Hive transaction is recorded in the metastore database, and heartbeats for it are sent regularly to the Hive Metastore server (HMS). This prevents stale transactions from lingering.

– If HMS does not receive a heartbeat for a transaction within the timeout, the transaction is aborted.

Lock states -> Acquired, Waiting, Aborted

Lock types -> Exclusive, Shared_read, Shared_write

In this scenario, the log shows that the transaction timed out and was aborted during the execution stage:

INFO org.apache.hadoop.hive.metastore.txn.TxnHandler: [Metastore Scheduled Worker 6]: Aborted the following transactions due to timeout: [131342]

INFO org.apache.hadoop.hive.metastore.txn.TxnHandler: [Metastore Scheduled Worker 6]: Aborted 1 transactions due to timeout

INFO org.apache.hadoop.hive.metastore.txn.TxnHandler: [Metastore Scheduled Worker 6]: Deleted 0 obsolete rows from WRTIE_SET

– The transaction is aborted after 5 minutes:

INFO org.apache.hadoop.hive.metastore.txn.TxnHandler: [pool-6-thread-7451]: Added entries to MIN_HISTORY_LEVEL with a single query for current txn: [131342]

INFO org.apache.hadoop.hive.metastore.txn.TxnHandler: [Metastore Scheduled Worker 6]: Aborted the following transactions due to timeout: [131342]

– This is because the default transaction timeout (hive.txn.timeout) is 5 minutes (300 seconds).

Resolution:

To resolve the issue, increase the transaction timeout.

Set the following property in hive-site.xml to a higher value (in seconds), for example:

hive.txn.timeout => 1200
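
As a minimal sketch, the setting above could be written in hive-site.xml as the property entry below. The value 1200 is only the example value used in this post, not a recommendation for every workload, and it is read as seconds when no unit is given.

```xml
<!-- Example only: raises the transaction timeout from the default 300 seconds to 1200 seconds -->
<property>
  <name>hive.txn.timeout</name>
  <value>1200</value>
</property>
```

Choose a value that comfortably exceeds the longest gap you expect between heartbeats for long-running queries.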

For Cloudera distribution:

Go to Cloudera Manager -> Hive service -> Configuration -> Hive Metastore Server Advanced Configuration Snippet (Safety Valve) for hive-site.xml -> Add

Name: hive.txn.timeout
Value: 1200

This change requires a Hive service restart to take effect.
