Resolve the “java.lang.OutOfMemoryError: Java heap space” issue in Spark and Hive (Tez and MR)

OutOfMemoryError is no surprise in Spark, since it is a memory-centric framework. To deal with memory issues, we first need to understand the memory requirements of the application.
java.lang.OutOfMemoryError: Java heap space
In Spark
First, we need to understand where the OutOfMemoryError is occurring: in the Driver process or in an Executor process.
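A quick way to tell them apart on YARN is to check where the stack trace appears: a driver-side OOM shows up in the driver / ApplicationMaster log (as in the example below), while an executor-side OOM appears in the individual executor container logs and often surfaces as lost executors or failed tasks.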
OOM in the Driver process
– The Driver is the Java process where the main method runs. Its main job is to create the Spark session/context and coordinate the tasks running across the executors.
– Ideally, we should not push any heavy-lifting operations onto the driver process, but there are unavoidable situations where some computation has to happen on the driver side.
Example:
If you call the collect() method, every task sends its partition of the data back to the driver. Holding the entire result requires a large amount of memory on the driver side, otherwise you will see an OutOfMemoryError.
val collectdata = df.collect()
[Driver] ERROR org.apache.spark.deploy.yarn.ApplicationMaster - User class threw exception: java.lang.OutOfMemoryError: Java heap space
To resolve the issue
We can directly increase the driver memory / memory overhead to accommodate the application's request, but the recommended approach is to avoid sending large amounts of data to the driver in the first place.
--driver-memory 10G --conf "spark.driver.memoryOverhead=2G"
spark-submit --class org.apache.spark.examples.SparkPi --master yarn --deploy-mode client --driver-memory 10G --conf "spark.driver.memoryOverhead=2G" /opt/cloudera/parcels/CDH/jars/spark-examples*.jar 10
It is generally recommended to set this property to a value that is 10-20% of the total memory assigned to the driver process. For example, if you have set spark.driver.memory to 10 GB, you might set spark.driver.memoryOverhead to 2 GB (20% of 10 GB).
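One caveat: in client mode the driver JVM is already running by the time your application code executes, so spark.driver.memory and spark.driver.memoryOverhead have to be supplied through spark-submit (or spark-defaults.conf) rather than set programmatically inside the application.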
Note: If you are still getting the same OOM exception even after increasing the memory, try to optimize the code so that less data is sent to the driver.
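A minimal sketch of that idea in Scala, assuming the same df as above and a hypothetical HDFS output path: pull back only a small, bounded sample, or keep the result distributed by writing it out to storage instead of routing it through the driver.
// Bring back only a bounded number of rows instead of the full dataset
val sample = df.take(100)                         // at most 100 rows held on the driver
// Or keep the work distributed and write the result to storage
df.write.mode("overwrite").parquet("hdfs:///tmp/collect_output")  // hypothetical output path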
OOM in the Executor process
Executors are the worker Java processes where the actual tasks run. Since Spark is a memory-intensive framework, it is essential to provide the memory needed for processing, otherwise you will see an OOM.
To resolve the issue
We can tune the memory given to each executor and the number of executors so that the load is distributed.
--executor-memory 10g --num-executors 10 --conf "spark.yarn.executor.memoryOverhead=2G"
spark-submit --class org.apache.spark.examples.SparkPi --master yarn --deploy-mode client --num-executors 10 --executor-memory 10g --conf "spark.yarn.executor.memoryOverhead=2G" /opt/cloudera/parcels/CDH/jars/spark-examples*.jar 10
As with the driver process, it is generally recommended to set this property to a value that is 10-20% of the total memory assigned to the executor process.
For example: executor memory 10 GB => 20% => 2 GB overhead
Will request 2 executor container(s), each with 1 core(s) and 10000 MB memory (including 1024 MB of overhead)
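As a rough sketch, the same executor settings can also be supplied when the SparkSession is built inside the application. The application name below is a placeholder, these values only take effect if they are applied before the SparkContext is created, and on older Spark/CDH releases the overhead property is spark.yarn.executor.memoryOverhead.
import org.apache.spark.sql.SparkSession

// Takes effect only if no SparkContext exists yet when getOrCreate() runs
val spark = SparkSession.builder()
  .appName("executor-memory-demo")                // placeholder application name
  .config("spark.executor.instances", "10")
  .config("spark.executor.memory", "10g")
  .config("spark.executor.memoryOverhead", "2g")  // spark.yarn.executor.memoryOverhead on older versions
  .getOrCreate()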
In Hive (Tez or MR)
In Hive, the processing runs on either the Tez or the MR execution engine; both follow the map-and-reduce model.
If you are seeing an OOM in Hive, on either Tez or MR (check hive.execution.engine to confirm which one is in use), tune/increase the properties below until you arrive at a configuration that works for your workload.
Example ERROR message
Error while processing statement: FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.tez.TezTask. Vertex failed, vertexName=Map 1, vertexId=vertex_1449486079177_5239_1_01, diagnostics=[Task failed, taskId=task_1449486079177_5239_1_01_000018, diagnostics=[TaskAttempt 0 failed, info=[Error: Failure while running task: attempt_1449486079177_5239_1_01_000018_0:java.lang.RuntimeException: java.lang.OutOfMemoryError: Java heap space at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:157) at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:137)
For Tez:
set tez.am.resource.memory.mb=12288;
set tez.task.resource.memory.mb=10240;
set tez.am.java.opts=-Xmx9830m;
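A rule of thumb behind these numbers: keep the -Xmx value at roughly 80% of the corresponding container size so there is headroom for non-heap memory; in the example above, 9830 MB is about 80% of 12288 MB.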
For MR:
set mapreduce.map.memory.mb=2048;
set mapreduce.map.java.opts="-Xmx1536m";
set mapreduce.reduce.memory.mb=2048;
set mapreduce.reduce.java.opts="-Xmx1536m";
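The same ratio applies on the MR side: the -Xmx in mapreduce.map.java.opts and mapreduce.reduce.java.opts should stay below the container size, commonly around 75-80% of it (here 1536 MB is 75% of 2048 MB).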
Conclusion
It is important to identify exactly which process needs more memory for the computation; increasing memory blindly can cause a resource crunch across the cluster.
Expect some trial and error before you arrive at an optimal memory value.
Also, consider tuning the code and data size for better performance.
Check here to learn more about how to edit memory settings in Spark.
Good luck with your learning!