Resolve the "java.lang.OutOfMemoryError: Java heap space" issue in Spark and Hive (Tez and MR)

An OutOfMemoryError is not a surprise in Spark, since it is a memory-centric framework. To deal with memory issues, we first need to understand the memory requirements of the application

java.lang.OutOfMemoryError: Java heap space

In Spark

We first need to identify where the OutOfMemoryError is thrown: in the Driver process or in an Executor process

OOM in the Driver process

– The Driver is a Java process where the main method runs. Its main goal is to create the Spark session or context and coordinate the tasks running on the executors

– Ideally, we should not push any heavy lifting onto the driver process, but there are unavoidable situations where some computation has to happen on the driver side

Example:

If you call the collect() method, every task sends its data back to the driver. This needs enough memory on the driver side; otherwise you will see an OutOfMemoryError

val collectdata = df.collect()   // pulls every row of df into the driver's heap
[Driver] ERROR org.apache.spark.deploy.yarn.ApplicationMaster - User class threw exception: java.lang.OutOfMemoryError: Java heap space

To resolve the issue

We can directly increase the driver memory/memory overhead to accommodate the application's request, but the ideal approach is to avoid sending large amounts of data to the driver in the first place (recommended)

--driver-memory 10G --conf "spark.driver.memoryOverhead=2G"
spark-submit --class org.apache.spark.examples.SparkPi --master yarn --deploy-mode client --driver-memory 10G --conf "spark.driver.memoryOverhead=2G" /opt/cloudera/parcels/CDH/jars/spark-examples*.jar 10
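
To confirm which memory settings the application actually picked up, you can read them back from the SparkConf at runtime. A minimal sketch, assuming a SparkSession named spark is already available (for example in spark-shell):

// Print the driver memory settings the running application is using.
// If memoryOverhead is not set, Spark derives it as max(10% of driver memory, 384 MB).
val conf = spark.sparkContext.getConf
println("spark.driver.memory         = " + conf.get("spark.driver.memory", "1g (default)"))
println("spark.driver.memoryOverhead = " + conf.get("spark.driver.memoryOverhead", "derived at runtime"))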

It is generally recommended to set this property to a value that is 10-20% of the total memory assigned to the driver process. For example, if you have set spark.driver.memory to 10 GB, you might set spark.driver.memoryOverhead to 2 GB (20% of 10 GB).

Note: If you still get the same OOM exception even after increasing the memory, optimize the code to reduce the amount of data that is sent back to the driver
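
To make that concrete, here is a minimal sketch of common alternatives to collect(), assuming a DataFrame named df; the row limit and output path are illustrative values only:

// 1. Bring back only a small sample for inspection
val sample = df.take(100)                       // at most 100 rows reach the driver

// 2. Keep aggregations distributed and collect only the (small) result
val rowCount = df.count()                       // a single number comes back

// 3. Write large results to storage instead of to the driver
df.write.mode("overwrite").parquet("/tmp/output_path")   // hypothetical path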

OOM in the Executor process

Executors are the worker Java processes where the actual tasks run. Since Spark is a memory-intensive framework, it is essential to provide the memory needed for processing; otherwise you will see an OOM

To resolve the issue

We can tune the memory provided to the Executor and the number of executors to distribute the load

--executor-memory 10g --num-executors 10 --conf "spark.yarn.executor.memoryOverhead=2G"
spark-submit --class org.apache.spark.examples.SparkPi --master yarn --deploy-mode client --num-executors 10 --executor-memory 10g --conf "spark.yarn.executor.memoryOverhead=2G" /opt/cloudera/parcels/CDH/jars/spark-examples*.jar 10

As with the driver process, it is generally recommended to set this property to 10-20% of the total memory assigned to the executor process; with 10 GB of executor memory, the 2 GB overhead used above corresponds to 20%. YARN then requests containers sized at executor memory plus overhead, for example:

Will request 2 executor container(s), each with 1 core(s) and 10000 MB memory (including 1024 MB of overhead)
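
Beyond the spark-submit flags, the same settings can be supplied when the session is built, and repartitioning the data spreads the work across more, smaller tasks. A minimal sketch, assuming you control session creation; the app name, input path, and partition count are illustrative values only (on Spark 2.3+ the overhead property is spark.executor.memoryOverhead, older versions use spark.yarn.executor.memoryOverhead as above):

import org.apache.spark.sql.SparkSession

// These settings take effect because they are applied before the
// SparkContext (and hence the YARN executor containers) is created.
val spark = SparkSession.builder()
  .appName("oom-tuning-example")                   // hypothetical app name
  .config("spark.executor.memory", "10g")
  .config("spark.executor.memoryOverhead", "2g")
  .config("spark.executor.instances", "10")
  .getOrCreate()

// Spreading the data over more partitions lowers the memory footprint of each task.
val df = spark.read.parquet("/tmp/input_path")     // hypothetical input path
val repartitioned = df.repartition(200)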

In HIVE (Tez or MR)

In Hive, the query is executed with either the Tez or the MR execution engine; both are based on the map/reduce processing model

If you are seeing OOM in Hive on either Tez or MR, tune/increase the properties below until you reach a configuration that works for your workload. As a rule of thumb, keep the Java heap (-Xmx in the *.java.opts settings) at roughly 75-80% of the corresponding container size (*.memory.mb), as in the examples below

Example ERROR message

Error while processing statement: FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.tez.TezTask. Vertex failed, vertexName=Map 1, vertexId=vertex_1449486079177_5239_1_01, diagnostics=[Task failed, taskId=task_1449486079177_5239_1_01_000018, diagnostics=[TaskAttempt 0 failed, info=[Error: Failure while running task: attempt_1449486079177_5239_1_01_000018_0:java.lang.RuntimeException: java.lang.OutOfMemoryError: Java heap space at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:157) at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:137) 

For Tez

set tez.am.resource.memory.mb= 12288;
set tez.task.resource.memory.mb=10240;
set tez.am.java.opts=-Xmx9830m;  

For MR:

set mapreduce.map.memory.mb=2048;
set mapreduce.map.java.opts="-Xmx1536m";
set mapreduce.reduce.memory.mb=2048;
set mapreduce.reduce.java.opts="-Xmx1536m";

Conclusion

It is important to find out exactly which process needs more memory for the computation; increasing memory blindly can cause a resource crunch across the cluster

Reaching an optimal memory value usually takes a few rounds of trial and error

Also, consider tuning the code and data size for better performance.

Check here to learn more about how to edit memory settings in Spark

Good luck with your Learning !!
