SPAdes with SpotSurfer
SpotSurfer enables long-running SPAdes jobs to run on low-cost Spot instances for a leading plant genomics company
This SPAdes user is a global market leader in seed potato trading, innovative breeding, and concept development. The company needs to work with its commercial growers, genetic partners, and government organizations to identify and develop potato variants with the desired disease and pest resistance, water consumption, crop yields, soil compatibility, visual characteristics, and flavors. Developing each variant requires many genomic simulations, and each job requires significant investments by researchers and data engineers to develop variant models. Each variant model requires job runs that consume expensive cloud resources and can take days to execute.
The lead data engineer and his colleagues faced multiple challenges when running SPAdes genomics jobs on Azure. These workloads are massively parallel and involve multiple application instances that process data sets to identify genomic variations and sequence DNA. These data sets can be terabytes in size and consume a volume of CPU and memory resources that is beyond their cloud budget.
- VM type: M32-16ms (16 CPU, 875GB Mem, 1TB storage, Memory optimized instance)
- Run time: 216 hours (9 days)
- MMCE Cost ($0.61/hr): $131.76
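The cost figure above follows directly from the hourly rate and the run time. A minimal sketch of that arithmetic, using only the numbers listed above (the constant and function names are illustrative, not part of any product):

```python
from decimal import Decimal

# Figures taken from the list above; the names are illustrative only.
HOURLY_RATE = Decimal("0.61")   # dollars per hour
RUN_TIME_HOURS = 216            # 9 days x 24 hours


def job_cost(hourly_rate: Decimal, hours: int) -> Decimal:
    """Return the total cost of a run billed at a flat hourly rate."""
    return hourly_rate * hours


total = job_cost(HOURLY_RATE, RUN_TIME_HOURS)
print(f"Total cost: ${total}")  # prints "Total cost: $131.76"
```

Decimal is used instead of float so that the currency value is computed exactly.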
Their Primary Challenges Included:
100% First-Pass Execution
With genomics jobs requiring days to run, HZPC needs jobs to run with 100% effectiveness the first time without failures due to resource limitations (CPU/Memory) caused by spikes in demand related to data copies or other resource-hungry applications.
Excessive Cloud Costs
One of the biggest challenges is over-allocating resources to accommodate demand spikes during application runs. This creates excess costs and reduces the number of models that can be run.
Manage Cloud Spot Instances
The team needs the ability to start and stop job runs and to match job resources to available Spot instances without risking first-pass execution.
Mapping CPU vs. Memory Resources
Depending on the genomics work stage, different resources are under stress. Developing a new variant sequence is more CPU-intensive, whereas comparing already assembled variants requires more memory than CPU resources.
Prioritizing Concurrent Job Runs
The team at HZPC needs to manage and reduce the size of jobs
The data engineering team sought a new way to dynamically manage SPAdes jobs to improve first-time job runs, prioritize job run order, optimize cloud costs, and leverage Spot instances on the Azure cloud. They chose Memory Machine Cloud Edition for 3 reasons:
The AppCapsule feature enabled the data engineering team to pause a running application and take a snapshot of the entire application state (data in memory, storage, cache, registers, etc.). The ability to create AppCapsules opened a new world of SPAdes job mobility and predictability.
The end goal was to run their SPAdes jobs on lower-cost Spot instances. SpotSurfer combined the use of AppCapsules and cloud service orchestration to ensure their jobs would recover gracefully from Spot terminations.
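The general idea of recovering from a Spot termination by resuming from a snapshot can be illustrated with a small simulation. This is only a conceptual sketch and not the MemVerge implementation or API: the function name, its parameters, and the in-memory "snapshot" are hypothetical, and a Spot termination is simulated by a list of step numbers at which the job is interrupted.

```python
import copy


def run_with_checkpoints(total_steps, checkpoint_every, preempt_at):
    """Simulate a long job that resumes from its last snapshot after each interruption.

    Hypothetical illustration only: `preempt_at` lists the step counts at
    which a simulated Spot termination hits the running job.
    """
    state = {"step": 0, "result": 0}
    snapshot = copy.deepcopy(state)      # last saved application state
    pending = sorted(preempt_at)
    restarts = 0
    steps_executed = 0

    while state["step"] < total_steps:
        if pending and state["step"] == pending[0]:
            # Simulated termination: progress since the last snapshot is
            # lost, and the job resumes from the saved state.
            pending.pop(0)
            state = copy.deepcopy(snapshot)
            restarts += 1
            continue

        # One unit of work.
        state["step"] += 1
        state["result"] += state["step"]
        steps_executed += 1

        if state["step"] % checkpoint_every == 0:
            snapshot = copy.deepcopy(state)

    return state["result"], restarts, steps_executed


result, restarts, executed = run_with_checkpoints(100, 10, [25, 57])
print(result, restarts, executed)
```

In this toy run the job still produces the full result after two interruptions, and only the steps since the most recent snapshot are repeated rather than the whole job.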
The Bottom Line
Compute costs reduced by 90% by moving the workload from On-Demand to Spot instances
Request a Demo or Free Trial
We are happy to provide you with a full demo, a free trial, or access to a Sandbox set-up in AWS so that you can see the capabilities of the MemVerge solution. Get started for free!