Resolve “org.apache.hadoop.hive.serde2.SerDeException: Unexpected tag” in Spark and Hive

We usually see the error “org.apache.hadoop.hive.serde2.SerDeException: Unexpected tag” in Spark when reading a Hive table through the HWC (Hive Warehouse Connector).
This error indicates a problem with the MapJoin (hash table) while joining multiple tables.
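To confirm that the failing query is actually being executed as a map join, you can inspect the query plan first. A minimal sketch, using hypothetical table and column names:

```sql
-- Table and column names are hypothetical; substitute your own query.
EXPLAIN
SELECT o.order_id, c.customer_name
FROM orders o
JOIN customers c ON o.customer_id = c.customer_id;
-- Look for "Map Join Operator" in the plan output: if it appears,
-- the smaller table is being loaded into an in-memory hash table,
-- which is where this error originates.
```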
ERROR:
[ERROR] [TezChild] |tez.TezProcessor|: org.apache.hadoop.hive.ql.metadata.HiveException: org.apache.hadoop.hive.serde2.SerDeException: Unexpected tag: 45 reserialized to 28
java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: org.apache.hadoop.hive.serde2.SerDeException: Unexpected tag: 45 reserialized to 28
at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:298)
at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:252)
at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:374)
at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:75)
...
Symptom:
The affected Spark or Hive query fails with the error above.
Resolution:
To resolve the issue, either increase the container memory or change the join logic so that it uses less memory.
1) Increase Container memory
We can increase the memory so that the container can hold the hash tables built while joining the tables:
set hive.tez.java.opts=-Xmx8192m;
set hive.tez.container.size=12384;
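The two settings work together: the JVM heap (-Xmx) must fit inside the Tez container, and keeping the heap at roughly 80% of the container size is a common rule of thumb. A session sketch, with illustrative sizes and hypothetical table names:

```sql
-- Illustrative sizes; tune for your cluster.
set hive.tez.container.size=12384;    -- Tez container size, in MB
set hive.tez.java.opts=-Xmx8192m;     -- JVM heap inside that container

-- Re-run the failing join (table names hypothetical):
SELECT o.order_id, c.customer_name
FROM orders o
JOIN customers c ON o.customer_id = c.customer_id;
```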
2) Change the Join Logic
set hive.auto.convert.join.noconditionaltask = true;
set hive.auto.convert.join.noconditionaltask.size = 10000000;
This reverts "hive.auto.convert.join.noconditionaltask.size" to its default of 10 MB (10000000 bytes).
This property controls what size of table can fit in memory: it is the maximum combined size of the tables that may be converted into in-memory hashmaps for a map join. Reducing the value prevents larger joins from being converted to map joins.
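If lowering the threshold is not enough, map-join conversion can also be switched off entirely. This is an assumption based on the standard Hive setting hive.auto.convert.join, not part of the original resolution:

```sql
-- Option A (from the resolution above): revert the threshold to its
-- 10 MB default so only small tables become in-memory hash tables.
set hive.auto.convert.join.noconditionaltask = true;
set hive.auto.convert.join.noconditionaltask.size = 10000000;

-- Option B (assumption, standard Hive setting): disable automatic
-- map-join conversion entirely and fall back to a shuffle join.
set hive.auto.convert.join = false;
```

Disabling map joins trades memory pressure for extra shuffle I/O, so it is usually a last resort when tuning the threshold does not help.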
NOTE: This applies to both the Spark and Hive execution engines (Tez and MR).
Good luck with your learning!
Related Topics:
Resolve the “java.lang.OutOfMemoryError: Java heap space” issue in Spark and Hive (Tez and MR)