Total size of serialized results of tasks (1024.5 MB) is bigger than spark.driver.maxResultSize

The error “Total size of serialized results of x tasks (1024.5 MB) is bigger than spark.driver.maxResultSize (1024.0 MB)” usually occurs when the executors send more data to the Spark driver than the configured limit (spark.driver.maxResultSize) allows.

By default, spark.driver.maxResultSize is set to 1 GB.

NOTE: “spark.driver.maxResultSize” should be kept well below the Spark driver memory; a common guideline is around 80% of it.

What are Serialized results in Spark?

Serialization is the process of converting the intermediate data generated during the execution of Spark operations into a format that can be easily stored and transmitted across the network. Results from executors are typically collections of data items organized as partitions. To make the data available to other nodes, or to store it in HDFS, it has to be serialized.

There are several serialization libraries available in Spark, including Java serialization, Kryo serialization, and the newer Arrow serialization. The choice of serialization library can have a significant impact on performance, as well as on the ability to share data between different nodes in a cluster.
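For instance, Kryo can be enabled through configuration when the session is created. The following is a minimal sketch; the MyRecord case class is purely illustrative, and registering classes is optional but avoids Kryo writing full class names with every object.

import org.apache.spark.SparkConf
import org.apache.spark.sql.SparkSession

// hypothetical record type, used only to illustrate class registration
case class MyRecord(id: Long, name: String)

// switch Spark's data serializer from the default Java serializer to Kryo
val conf = new SparkConf()
  .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
  .registerKryoClasses(Array(classOf[MyRecord]))

val spark = SparkSession.builder().config(conf).getOrCreate()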

Impact of this issue

In this case, the serialized results from the executors fail to be transferred to the driver process, so the task fails. Once enough tasks have failed, the entire job is marked as failed.

Symptoms:

You will see the below error in the Spark driver logs:

failure: Total size of serialized results of x tasks (1024.5 MB) is bigger than spark.driver.maxResultSize (1024.0 MB)

Troubleshooting:

– Sending huge amounts of data to the driver is generally not recommended, as it significantly affects Spark job performance.

– When you see the above error message in the Spark driver logs, it means the executors are sending more data to the driver than it is configured to accept.

– The most common cause is calling the collect() method, which gathers the output from all executors and sends it to the driver.

– Check the code for any operation that sends data to the driver; otherwise, increase the limit to accommodate the data (alternatives to collect() are sketched after this list).
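A minimal sketch of such alternatives, assuming the same testing.tbl table used in the replication steps below and a hypothetical output path:

// pull back only a small sample instead of every row
val sample = spark.sql("select * from testing.tbl").take(100)

// or keep the result distributed and write it to storage instead of collecting it
spark.sql("select * from testing.tbl")
  .write
  .mode("overwrite")
  .parquet("/tmp/tbl_output")   // hypothetical output path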

To Replicate the issue

We can easily replicate this issue in a local setup; below is a step-by-step guide.

– Create a partitioned table with at least 10,000 records (refer to this article for creating a table with more records).
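A minimal sketch for generating such a table, assuming the testing.tbl database/table name used later in this walkthrough and an illustrative column layout:

// create the target database if it does not exist yet
spark.sql("create database if not exists testing")

// generate 20,000 rows and write them out as a partitioned table
spark.range(0, 20000)
  .selectExpr("id", "id % 200 as part_col")
  .write
  .mode("overwrite")
  .partitionBy("part_col")
  .saveAsTable("testing.tbl")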

– Start a spark-shell session with “--conf spark.driver.maxResultSize=10m”. We are reducing maxResultSize to 10 MB to replicate the issue:

spark-shell --conf spark.driver.maxResultSize=10m

spark-shell runs in client mode by default, which means the driver runs on the host where the shell was launched.

– Try to collect results to the driver as below:

spark.sql("select * from testing.tbl").collect()

We can see the ERROR messages in the driver logs:

scala> spark.sql("select * from test.t1p").collect()

[Stage 0:========================>                          (9744 + 23) / 20000]23/02/10 14:02:34 ERROR scheduler.TaskSetManager: Total size of serialized results of 9760 tasks (10.0 MB) is bigger than spark.driver.maxResultSize (10.0 MB)

23/02/10 14:02:34 ERROR scheduler.TaskSetManager: Total size of serialized results of 9761 tasks (10.0 MB) is bigger than spark.driver.maxResultSize (10.0 MB)

23/02/10 14:02:34 ERROR scheduler.TaskSetManager: Total size of serialized results of 9762 tasks (10.0 MB) is bigger than spark.driver.maxResultSize (10.0 MB)

23/02/10 14:02:34 ERROR scheduler.TaskSetManager: Total size of serialized results of 9763 tasks (10.0 MB) is bigger than spark.driver.maxResultSize (10.0 MB)

23/02/10 14:02:34 ERROR scheduler.TaskSetManager: Total size of serialized results of 9764 tasks (10.0 MB) is bigger than spark.driver.maxResultSize (10.0 MB)

From the above example, we can see that when the data sent from the executors to the driver exceeds maxResultSize, the job fails with this error.
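Before deciding how much to raise the limit, you can confirm the value currently in effect from the shell. A minimal sketch, falling back to the documented 1g default when the property has not been set explicitly:

// read the configured limit, falling back to the 1 GB default
val limit = spark.sparkContext.getConf.get("spark.driver.maxResultSize", "1g")
println(s"spark.driver.maxResultSize = $limit")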

Resolution

Increase the below value based on your needs; a bit of trial and error may be required to find the optimum value:

--conf spark.driver.maxResultSize

In this case, the data sent to the driver is more than 1 GB (the default value), so we have to increase the limit to 2 GB or more. Make sure the driver memory is larger than “maxResultSize”; if the driver memory is less than or equal to “maxResultSize”, it can result in GC and OOM issues.

--conf spark.driver.memory=8g
--conf spark.driver.maxResultSize=2g

Example:

spark-submit --class org.apache.spark.examples.SparkPi --master yarn --deploy-mode client --conf spark.driver.memory=8g --conf spark.driver.maxResultSize=2g /opt/cloudera/parcels/CDH/jars/spark-examples*.jar 1 1 
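Alternatively, if you build the SparkSession in application code, spark.driver.maxResultSize can also be supplied there. A minimal sketch with a hypothetical application name; note that spark.driver.memory itself must still be set at submit time, before the driver JVM starts:

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("maxResultSize-demo")   // hypothetical application name
  // raise the cap on the total size of serialized task results sent to the driver
  .config("spark.driver.maxResultSize", "2g")
  .getOrCreate()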

Conclusion:

It is not recommended to send large volumes of data to the driver, as it will impact application performance.

If the results do need to be accommodated on the driver, tune the property “--conf spark.driver.maxResultSize” as shown above.

Good luck with your learning!
