
I am running a Spark job and got a "Not enough space to cache rdd_128_17000 in memory" warning. However, the attached screenshot clearly shows that only 90.8 GB out of 719.3 GB is used. Why is that? Thanks!


15/10/16 02:19:41 WARN storage.MemoryStore: Not enough space to cache rdd_128_17000 in memory! (computed 21.4 GB so far)
15/10/16 02:19:41 INFO storage.MemoryStore: Memory use = 4.1 GB (blocks) + 21.2 GB (scratch space shared across 1 thread(s)) = 25.2 GB. Storage limit = 36.0 GB.
15/10/16 02:19:44 WARN storage.MemoryStore: Not enough space to cache rdd_129_17000 in memory! (computed 9.4 GB so far)
15/10/16 02:19:44 INFO storage.MemoryStore: Memory use = 4.1 GB (blocks) + 30.6 GB (scratch space shared across 1 thread(s)) = 34.6 GB. Storage limit = 36.0 GB.
15/10/16 02:25:37 INFO metrics.MetricsSaver: 1001 MetricsLockFreeSaver 339 comitted 11 matured S3WriteBytes values
15/10/16 02:29:00 INFO s3n.MultipartUploadOutputStream: uploadPart /mnt1/var/lib/hadoop/s3/959a772f-d03a-41fd-bc9d-6d5c5b9812a1-0000 134217728 bytes md5: qkQ8nlvC8COVftXkknPE3A== md5hex: aa443c9e5bc2f023957ed5e49273c4dc
15/10/16 02:38:15 INFO s3n.MultipartUploadOutputStream: uploadPart /mnt/var/lib/hadoop/s3/959a772f-d03a-41fd-bc9d-6d5c5b9812a1-0001 134217728 bytes md5: RgoGg/yJpqzjIvD5DqjCig== md5hex: 460a0683fc89a6ace322f0f90ea8c28a
15/10/16 02:42:20 INFO metrics.MetricsSaver: 2001 MetricsLockFreeSaver 339 comitted 10 matured S3WriteBytes values

[Spark UI screenshot showing 90.8 GB used out of 719.3 GB total]

  • Total used / Total does not matter for caching blocks; they are atomic in the memory sense. Can you try to increase the number of partitions for that specific RDD (see the sketch after these comments)? BTW, you have a nifty cluster. Commented Oct 16, 2015 at 5:10
  • So what would be the difference between a cached block and the (Total used / Total) shown in the UI? Thanks!
    – Edamame
    Commented Oct 16, 2015 at 16:24
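
A minimal sketch of the repartitioning suggested in the first comment, assuming the cached dataset is a PySpark RDD named rdd; the factor of 4 is purely illustrative. More partitions mean smaller blocks, so each one is more likely to fit into the storage region when cached:

    # hypothetical: split the data into more, smaller partitions before caching
    rdd = rdd.repartition(rdd.getNumPartitions() * 4)
    rdd.cache()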

4 Answers


This is likely caused by spark.storage.memoryFraction being set too low. Spark will only use this fraction of the allocated memory to cache RDDs.

Try one of the following:

  • increase the storage fraction
  • rdd.persist(StorageLevel.MEMORY_ONLY_SER) to reduce memory usage by storing the RDD data in serialized form
  • rdd.persist(StorageLevel.MEMORY_AND_DISK) to spill partitions to disk when the memory limit is reached (see the sketch below)
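
For illustration, a minimal PySpark sketch of the first and third suggestions; the app name, fraction value, and input path are placeholders. Note that spark.storage.memoryFraction only applies to the legacy memory manager (the default up to Spark 1.5); from Spark 1.6 onward the unified manager is tuned via spark.memory.fraction instead:

    from pyspark import SparkConf, SparkContext, StorageLevel

    # raise the share of the heap reserved for cached blocks (default is 0.6)
    conf = (SparkConf()
            .setAppName("cache-tuning-sketch")
            .set("spark.storage.memoryFraction", "0.7"))
    sc = SparkContext(conf=conf)

    rdd = sc.textFile("s3://your-bucket/your-input")  # placeholder input path
    # spill partitions that do not fit in memory to local disk instead of dropping them
    rdd.persist(StorageLevel.MEMORY_AND_DISK)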

This could be due to the following issue if you're loading lots of Avro files:

https://mail-archives.apache.org/mod_mbox/spark-user/201510.mbox/%3CCANx3uAiJqO4qcTXePrUofKhO3N9UbQDJgNQXPYGZ14PWgfG5Aw@mail.gmail.com%3E

With a PR in progress at:

https://github.com/databricks/spark-avro/pull/95


I have a Spark-based batch application (a JAR with a main() method, not written by me; I'm not a Spark expert) that I run in local mode without spark-submit, spark-shell, or spark-defaults.conf. When I tried to use the IBM JRE (as one of my customers does) instead of the Oracle JRE, on the same machine and with the same data, I started getting those warnings.

Since the memory store is a fraction of the heap (see the page that Jacob suggested in his comment), I checked the heap size: the IBM JRE uses a different strategy to decide the default heap size, and it was too small. So I simply added appropriate -Xms and -Xmx parameters, and the problem disappeared: the batch now works fine with both the IBM and Oracle JREs.
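
For illustration only, the kind of launch command this fix amounts to might look like the following; the jar name and heap sizes are made-up placeholders, not values from the answer:

    java -Xms4g -Xmx4g -jar my-spark-batch.jar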

My usage scenario is not typical, I know, but I hope this can help someone.


You can fix this problem by increasing the memory allocation. In PySpark, for example, you can raise spark.driver.memory and spark.executor.memory to 4g with this configuration:

from pyspark.sql import SparkSession

# build (or reuse) a session with larger driver and executor heaps
spark = (SparkSession.builder
         .appName("Pandas_on_spark")
         .config("spark.driver.memory", "4g")
         .config("spark.executor.memory", "4g")
         .getOrCreate())
