Concept
Stores
- Memory, Disk, and Off-Heap
Levels
- persist(level) only marks the RDD with a storage level; cache() is persist(StorageLevel.MEMORY_ONLY)
class StorageLevel private(...)
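As a rough illustration of the two notes above, the sketch below models a storage level as a set of flags and shows that persisting only records the level on the RDD. `Level` and `ToyRdd` are hypothetical stand-ins, not Spark classes; the field names are assumptions modeled on the flags a storage level combines (disk, memory, off-heap, deserialized, replication).

```scala
// Hypothetical stand-in for Spark's StorageLevel, whose real constructor is private.
final case class Level(
    useDisk: Boolean,
    useMemory: Boolean,
    useOffHeap: Boolean,
    deserialized: Boolean,
    replication: Int = 1)

object Level {
  val NONE            = Level(useDisk = false, useMemory = false, useOffHeap = false, deserialized = false)
  val MEMORY_ONLY     = Level(useDisk = false, useMemory = true, useOffHeap = false, deserialized = true)
  val MEMORY_AND_DISK = Level(useDisk = true, useMemory = true, useOffHeap = false, deserialized = true)
  val DISK_ONLY       = Level(useDisk = true, useMemory = false, useOffHeap = false, deserialized = false)
}

// Toy RDD handle: persist only records (marks) the requested level; nothing is stored yet.
final class ToyRdd {
  private var level: Level = Level.NONE

  def persist(newLevel: Level): this.type = { level = newLevel; this }

  // cache() is shorthand for persisting at the memory-only level.
  def cache(): this.type = persist(Level.MEMORY_ONLY)

  def getStorageLevel: Level = level
}
```

In this toy model, `new ToyRdd().cache().getStorageLevel` equals `Level.MEMORY_ONLY`; no block is written until something actually computes the data.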
Master/Slave
- Executor(Driver): SparkContext -> SparkEnv -> BlockTransferService(NettyBlockTransferService), BlockManagerMaster(BlockManagerMasterEndpoint), BlockManager
RPC
Update(Read/Write)
RDD -> BlockManager -> Remote: BlockTransferService(fetch/upload), Local: MemoryStore/DiskStore(get/put)
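The read and write path above can be sketched as a lookup that tries the local stores first and falls back to a remote fetch. `ToyBlockManager`, its in-memory maps, and the `fetchRemote` function are all hypothetical simplifications; the real BlockManager also handles locking, serialization, eviction, and replication.

```scala
import scala.collection.mutable

// Hypothetical, simplified block manager: names and types are illustrative only.
final class ToyBlockManager(fetchRemote: String => Option[Array[Byte]]) {
  // Stand-ins for MemoryStore and DiskStore.
  private val memoryStore = mutable.Map.empty[String, Array[Byte]]
  private val diskStore   = mutable.Map.empty[String, Array[Byte]]

  // Write path: put the block into the memory store or the disk store.
  def put(blockId: String, bytes: Array[Byte], useMemory: Boolean): Unit = {
    if (useMemory) memoryStore.update(blockId, bytes)
    else diskStore.update(blockId, bytes)
  }

  // Read path: local memory, then local disk, then a remote fetch
  // (the role played by BlockTransferService).
  def get(blockId: String): Option[Array[Byte]] =
    memoryStore.get(blockId)
      .orElse(diskStore.get(blockId))
      .orElse(fetchRemote(blockId))
}
```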
MemoryManager
acquire/release, get/set
MemoryMode(ON_HEAP, OFF_HEAP)
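A minimal sketch of the acquire/release bookkeeping, assuming a single fixed-size pool. `ToyMemoryPool` is a hypothetical class; Spark keeps separate pools per MemoryMode, tracks usage per task, and can evict or wait, none of which is modeled here.

```scala
// Hypothetical fixed-size pool that only counts bytes.
final class ToyMemoryPool(val poolSize: Long) {
  private var used: Long = 0L

  def memoryUsed: Long = synchronized { used }
  def memoryFree: Long = synchronized { poolSize - used }

  // Grants at most what is free; the caller must check the returned amount.
  def acquire(numBytes: Long): Long = synchronized {
    require(numBytes >= 0, "numBytes must be non-negative")
    val granted = math.min(numBytes, poolSize - used)
    used += granted
    granted
  }

  // Returns memory to the pool, never releasing more than is in use.
  def release(numBytes: Long): Unit = synchronized {
    used -= math.min(numBytes, used)
  }
}
```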
StaticMemoryManager
MaxStorageMemory = systemMaxMemory * spark.storage.memoryFraction * spark.storage.safetyFraction
MaxExecutionMemory = systemMaxMemory * spark.shuffle.memoryFraction * spark.shuffle.safetyFraction
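To make the two static limits concrete, here is a small calculation sketch. The fraction values are assumptions taken to be the defaults of the legacy settings (0.6 and 0.9 for storage, 0.2 and 0.8 for shuffle); check them against your Spark version. `StaticMemorySketch` is a hypothetical helper, not a Spark class.

```scala
object StaticMemorySketch {
  // Assumed defaults of the legacy settings.
  val storageMemoryFraction = 0.6 // spark.storage.memoryFraction
  val storageSafetyFraction = 0.9 // spark.storage.safetyFraction
  val shuffleMemoryFraction = 0.2 // spark.shuffle.memoryFraction
  val shuffleSafetyFraction = 0.8 // spark.shuffle.safetyFraction

  // Product of the heap size and both storage fractions, truncated to whole bytes.
  def maxStorageMemory(systemMaxMemory: Long): Long =
    (systemMaxMemory * storageMemoryFraction * storageSafetyFraction).toLong

  // Product of the heap size and both shuffle fractions, truncated to whole bytes.
  def maxExecutionMemory(systemMaxMemory: Long): Long =
    (systemMaxMemory * shuffleMemoryFraction * shuffleSafetyFraction).toLong
}
```

Under these assumed defaults, a heap of 1000 units yields about 540 units for storage and about 160 for execution; the two regions are fixed and cannot borrow from each other.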
UnifiedMemoryManager
MemoryPool(StorageMemoryPool, ExecutionMemoryPool)
acquireExecutionMemory, acquireStorageMemory, acquireUnrollMemory
getMaxMemory = (systemMemory - reservedMemory) * spark.memory.fraction
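The getMaxMemory formula can be worked through the same way. The sketch assumes a reserve of 300 MiB and a spark.memory.fraction of 0.6; both are assumptions that vary by Spark version and configuration. `UnifiedMemorySketch` is a hypothetical helper, not a Spark class.

```scala
object UnifiedMemorySketch {
  val reservedMemory: Long = 300L * 1024 * 1024 // assumed 300 MiB reserve
  val memoryFraction: Double = 0.6              // assumed value of spark.memory.fraction

  // (systemMemory - reservedMemory) * spark.memory.fraction, truncated to whole bytes.
  def getMaxMemory(systemMemory: Long): Long =
    ((systemMemory - reservedMemory) * memoryFraction).toLong
}
```

With a 1 GiB heap under these assumptions, roughly 434 MiB is shared between the storage and execution pools.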
Links
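- Apache Spark Memory Management in Detail
- Spark Storage ① - Overall Architecture of the Spark Storage Module
- Spark Storage ② - Creation and Registration of BlockManager
- Spark Storage ③ - Message Passing Between Master and Slave and Its Timing
- Spark Storage ④ - Introduction to the Storage Execution Classes (DiskBlockManager, DiskStore, MemoryStore)
- The Past and Present of Spark Memory Management (Part 1)
- The Past and Present of Spark Memory Management (Part 2)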
- Spark Documentation
- Author: HyperJ
- Source: HyperJ’s Blog
- Link: Spark Storage