Memory Machine™
Checkpoint Engine

Increases availability of AI workloads

Transparent Checkpointing & Restore

Memory Machine is a suite of container orchestration services for running data-intensive pipelines, such as bioinformatics, and interactive computing applications, such as EDA.

Memory Machine Checkpoint Engine Graphic

Memory Machine Checkpoint Engine is a checkpoint and restore engine designed for easy integration with Kubernetes, popular job schedulers, and AWS Batch environments. Once integrated, it can checkpoint workloads running on CPU and GPU resources, which allows a pipeline or interactive app to be hot-restarted from a specific point in time.

The software is included with Memory Machine Cloud and Memory Machine AI and is available as a stand-alone application.

Integration with Cadence

SpotSurfer for AWS Batch

Delivering High QoS for Unreliable Spot Instances

Memory Machine Checkpoint Engine captures the entire running state of an AWS Batch Job into a consistent image and restores the Job on a new Compute Instance without losing any work progress. This maintains a high quality of service at the Batch level while using low-cost but unreliable Spot-based Compute Instances.
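The capture-and-restore cycle described above can be illustrated with a small, self-contained simulation. This is only a conceptual sketch: it checkpoints a plain Python dictionary at the application level, whereas the Checkpoint Engine is described as capturing the running state of the Job transparently. All function names, the checkpoint interval, and the simulated reclaim points are hypothetical and are not part of any MemVerge or AWS API.

```python
import pickle


def new_job_state(total_steps):
    """Create the full state of a toy job (hypothetical structure)."""
    return {"total_steps": total_steps, "step": 0, "accumulator": 0}


def run_one_step(state):
    """Do one unit of work: add the current step number to the total."""
    state["accumulator"] += state["step"]
    state["step"] += 1


def checkpoint(state):
    """Serialize the job state into a self-contained image (bytes)."""
    return pickle.dumps(state)


def restore(image):
    """Rebuild the job state from a previously captured image."""
    return pickle.loads(image)


def run_with_interruptions(total_steps, checkpoint_every, reclaim_at):
    """Run the job while simulating Spot reclaims at given step numbers.

    On each simulated reclaim, the in-memory state is discarded and the
    job resumes from the most recent image, as if on a new instance.
    Returns the final result and the number of restores performed.
    """
    state = new_job_state(total_steps)
    image = checkpoint(state)          # initial image at step 0
    pending = set(reclaim_at)          # reclaims that have not fired yet
    restores = 0
    while state["step"] < state["total_steps"]:
        if state["step"] in pending:
            pending.discard(state["step"])
            state = restore(image)     # resume from the last image
            restores += 1
            continue
        run_one_step(state)
        if state["step"] % checkpoint_every == 0:
            image = checkpoint(state)  # periodic checkpoint
    return state["accumulator"], restores


if __name__ == "__main__":
    result, restores = run_with_interruptions(10, 2, [3, 7])
    print(result, restores)
```

In this sketch the job reaches the same final result whether or not it is interrupted. Only the work done since the last image is repeated, which is the property that makes periodic checkpoints a fit for reclaimable instances.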

The MMC Batch Engine’s key features include:
Full integration into the customer Batch environment
Automated checkpoint and restore
No change to the customer workflow
No change to the Job applications and Workflow Manager scripts
Scalable across thousands of Batch Jobs and Compute Instances
Secure data processing within the customer VPC

Integrating Memory Machine Checkpoint Engine

Request Integration Guide

Integrating Memory Machine Checkpoint Engine with a job scheduler in your environment provides:

Automated checkpoint and restore
Customer Platform/DevOps configuration at the job scheduler level, without requiring individual end-users to change their job scripts
Scalability in a production environment

Integration deliverables include:

Design doc describing architecture, components, and logic
MMC Engine installer
Sample scripts

