MemVerge Introduces Open Source Solution to Improve Spark Shuffle Processes
MemVerge Splash Solves Performance and Elasticity Issues for Apache Spark Users
SAN JOSE, Calif. — Nov. 12, 2019 — MemVerge, the inventor of Memory-Converged Infrastructure (MCI), today announced MemVerge Splash, a first-of-its-kind, high-performance open source solution that allows shuffle data to be stored in an external storage system. MemVerge Splash is designed for Apache Spark users looking to improve the performance, flexibility and resiliency of Spark's shuffle manager.
Traditionally, storing shuffle data remotely can introduce network and storage bottlenecks that degrade performance and stability. MemVerge Splash, working together with MemVerge's distributed system software, Distributed Memory Objects (DMO), solves these issues through a high-performance in-memory storage and networking stack that keeps Spark highly performant.
MemVerge Splash stores shuffle data reliably in a dedicated storage cluster via pluggable storage and network backends. Splash also improves elasticity by allowing users to resize the compute cluster without interrupting shuffle computation. This is particularly important when Kubernetes is used to schedule Spark tasks.
“We engaged with the Spark community to identify their pain points and built MemVerge Splash with these in mind,” said Charles Fan, founder and CEO of MemVerge. “There is no other solution currently on the market that can provide a complete solution to tackle the shuffle elasticity and performance problems like Splash. We welcome all users and developers to try and contribute to this new open source solution.”
With MemVerge Splash, users can:
- Use any external storage system as a remote shuffle service
- Extract the storage and network implementations from the shuffle procedure to allow users to apply different plugins for different storage and networks
- Separate storage and compute
- Tolerate node failure
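As an illustration of how a pluggable shuffle manager is wired into a Spark job, the sketch below shows a `spark-submit` invocation that swaps in an external shuffle manager class. The class and property names here (`SplashShuffleManager`, `spark.shuffle.splash.storageFactory`, `SharedFSFactory`) are assumptions for illustration; the authoritative values are in the project README at https://github.com/MemVerge/splash.

```shell
# Hypothetical configuration sketch -- confirm the exact class and
# property names against the Splash README before use.
spark-submit \
  --conf spark.shuffle.manager=org.apache.spark.shuffle.SplashShuffleManager \
  --conf spark.shuffle.splash.storageFactory=com.memverge.splash.shared.SharedFSFactory \
  --class com.example.MySparkJob \
  my-spark-job.jar
```

Because `spark.shuffle.manager` is a standard Spark configuration key, no application code changes are needed: the same job jar runs with either the default sort-based shuffle or an external plugin.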
“We chose to work with MemVerge because of the company’s deep understanding of big data applications and their ability to extract the most performance from the data,” said Zhen Fan, senior technologist at JD.com. “Splash is an optimized shuffle manager for large-scale Spark clusters. This solution improves shuffle performance and enables better tolerance of Spark node failures. With Splash, users can direct shuffle data to higher-performance external storage to avoid data loss when Spark nodes fail. This is especially useful for users who manage Spark clusters of thousands of nodes, such as JD.com.”
MemVerge Splash not only works with MemVerge’s DMO; it is also compatible with any third-party distributed storage system (e.g., HDFS, CephFS) and network stack. Additionally, Splash works with both on-prem and cloud deployments.
MemVerge’s proprietary DMO technology provides a logical memory-storage convergence layer that leverages Intel’s latest persistent memory technology to allow data-intensive workloads to run seamlessly at memory speed, and can analyze and process large volumes of data in real time with ease.
MemVerge Splash is now available and can be accessed at https://github.com/MemVerge/splash. Additionally, MemVerge’s software is currently available via its beta program; all interested users are welcome to visit www.memverge.com for more information on trials and proofs of concept.
MemVerge, the inventor of Memory-Converged Infrastructure (MCI), eliminates all boundaries between memory and storage to power the world’s most demanding data-centric workloads. Built on the latest Storage Class Memory (SCM) technology, MemVerge MCI provides a software virtualization layer that seamlessly integrates with existing applications and offers 10X the memory size and 10X the data I/O speed compared to current state-of-the-art computing and storage solutions. Its unique Distributed Memory Objects (DMO) technology empowers data-intensive and high-performance computing (HPC) workloads such as AI, big data analytics, IoT and data warehousing to run flawlessly at memory speed with guaranteed data consistency across multiple systems. MemVerge is the only solution offering a petabyte-scale data infrastructure with nanosecond response times, enabling businesses to process and derive insights from enormous amounts of data in real time, while handling small and large files with equal ease. In this era of machine-generated data, enterprises using MemVerge no longer contend with failed or painfully slow jobs due to performance bottlenecks, system crashes or worn-out flash drives. They can now train AI models faster, analyze bigger data sets, complete more queries in less time and run complex workloads more predictably with fewer resources. Based in San Jose, MemVerge is used by leading financial services, retail, web and cloud companies as well as other leading innovators globally, including LinkedIn, Tencent Cloud and JD.com.