MemVerge in the News
MemVerge on Supercharging AI Infrastructure
The MemVerge story is one of simplifying smarter scheduling of valuable GPU resources: memory partitioning for fractional GPUs with graceful preemption, weighed against other GPU virtualization alternatives.
More Than Just Hardware: The Art of GPU Optimization
In their presentation, MemVerge shared observations on GPU utilization. The surveyed organizations have hundreds to thousands of GPUs, and one main reason for low utilization is that the GPUs are owned by different groups and end up siloed.
MemVerge’s Memory Machine AI May Help Avert an Acute Crisis in Tech
In a very shop-your-own-closet style, MemVerge has developed a new offering that finds and harvests idle resources trapped within existing GPU stacks, relieving companies of the need to constantly pour money into infrastructure expansion. MemVerge named it Memory Machine AI, or MMAI: a cloud-agnostic software solution that allows AI workloads to share GPU resources among themselves.
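The fractional-GPU idea above, with the graceful preemption mentioned earlier, can be sketched as a toy scheduler. This is an illustrative model only, not MemVerge's implementation: the `FractionalGPU` and `Workload` names, the memory sizes, and the lowest-priority-first eviction rule are all assumptions made for the example.

```python
from dataclasses import dataclass, field

@dataclass
class Workload:
    name: str
    mem_gb: int    # slice of GPU memory requested
    priority: int  # higher wins on contention

@dataclass
class FractionalGPU:
    """Toy model of one GPU whose memory is partitioned among workloads."""
    total_mem_gb: int
    running: list = field(default_factory=list)

    def used_mem(self) -> int:
        return sum(w.mem_gb for w in self.running)

    def schedule(self, w: Workload) -> list:
        """Admit w, gracefully preempting lower-priority workloads if needed.
        Returns the workloads preempted (to be checkpointed and requeued)."""
        preempted = []
        # Evict the lowest-priority victims first until w fits.
        for victim in sorted(self.running, key=lambda r: r.priority):
            if self.used_mem() + w.mem_gb <= self.total_mem_gb:
                break
            if victim.priority < w.priority:
                self.running.remove(victim)
                preempted.append(victim)
        if self.used_mem() + w.mem_gb <= self.total_mem_gb:
            self.running.append(w)
        else:
            # Even preemption could not make room: undo and reject w.
            self.running.extend(preempted)
            preempted = []
        return preempted
```

For example, on an 80 GB card running two low-priority 40 GB and 30 GB jobs, admitting a high-priority 40 GB training job would evict only the first low-priority job, leaving the rest sharing the card.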
MemVerge: Machining AI Memory Needs To Size
One of our presenters, MemVerge, offered some intriguing insights into what’s really happening in the AI arena. From their viewpoint, systems continue to shift away from x86-centric computing toward GPU-centric processing for AI workloads.
Cadence Collaborates with MemVerge to Increase Resiliency and Cost-Optimization of Long-Running High-Memory EDA Jobs on AWS Spot Instances
In a move that promises significant cost savings and enhanced efficiency for design engineers, the Cadence and MemVerge collaboration solves this challenge by implementing a transparent, low-overhead incremental checkpoint/restore solution that makes these EDA jobs resilient (hot restart) to Spot preemptions, without needing to change the underlying EDA application.
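The incremental checkpoint/restore idea can be illustrated with a small sketch: persist only the blocks of job state that changed since the last checkpoint, then rebuild the full state after a Spot preemption. This is a simplification for illustration; the real solution described above works transparently beneath the application, and the class name, block granularity, and hashing scheme here are assumptions, not MemVerge's design.

```python
import hashlib

class IncrementalCheckpointer:
    """Toy incremental checkpointer: writes only the blocks of job state
    that changed since the last checkpoint, keeping overhead low."""

    def __init__(self):
        self.store = {}   # block_id -> bytes (durable copy)
        self.hashes = {}  # block_id -> digest of last saved version

    def checkpoint(self, state: dict) -> list:
        """Persist changed blocks; return the ids actually written."""
        written = []
        for block_id, data in state.items():
            digest = hashlib.sha256(data).hexdigest()
            if self.hashes.get(block_id) != digest:
                self.store[block_id] = data
                self.hashes[block_id] = digest
                written.append(block_id)
        return written

    def restore(self) -> dict:
        """Rebuild the full job state after a Spot preemption."""
        return dict(self.store)
```

The first checkpoint writes every block; subsequent checkpoints touch only dirty blocks, which is what keeps long-running, high-memory jobs cheap to protect.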
Memcon 2024 Key Takeaways
Memory tiering is hard. Like, really hard. It’s one of the biggest gaps in the market in terms of solutions, but MemVerge and Kove are both taking different approaches to solving it.
What role does CXL play in AI? Depends on who you ask
At GTC, MemVerge, Micron and Supermicro demonstrated how CXL can increase large language model GPU utilization without the addition of more processing units. CXL does this by expanding the GPU memory pool to increase high-bandwidth memory usage at a lower cost than scaling out the infrastructure with more GPUs or more HBM. The tradeoff, however, is performance.
Startup claims to boost LLM performance using standard memory instead of GPU HBM — but experts remain unconvinced by the numbers despite promising CXL technology
Between FlexGen’s memory offloading capabilities and Memory Machine X’s memory tiering capabilities, the solution manages the entire memory hierarchy, spanning GPU, CPU and CXL memory modules.
MemVerge uses CXL to drive Nvidia GPU utilization higher
CXL v3.0 can pool external memory and share it with processors via CXL switches and coherent caches. MemVerge’s software virtually combines a CPU’s memory with a CXL-accessed external memory tier and is being demonstrated at Nvidia’s GTC event. Micron and MemVerge report they are “boosting the performance of large language models (LLMs) by offloading from GPU HBM to CXL memory.”
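The tiering idea running through the pieces above — keep the hottest working set in GPU HBM and spill colder data to CPU DRAM and then CXL-attached memory — can be sketched as a simple placement policy. The tier names, capacities, and hotness-ordered greedy placement here are illustrative assumptions, not the actual Memory Machine X policy.

```python
# Tiers ordered fastest -> slowest; capacities in GB are made up for the example.
TIERS = [("HBM", 80), ("DRAM", 512), ("CXL", 2048)]

def place_by_hotness(tensors):
    """Assign each tensor to the fastest tier with room, hottest first.

    `tensors` is a list of (name, size_gb, accesses_per_sec) tuples.
    Returns {name: tier}. Hot data lands in GPU HBM; colder data
    spills to CPU DRAM, then to CXL-attached memory."""
    free = dict(TIERS)
    placement = {}
    for name, size, _hotness in sorted(tensors, key=lambda t: -t[2]):
        for tier, _cap in TIERS:
            if free[tier] >= size:
                free[tier] -= size
                placement[name] = tier
                break
        else:
            raise MemoryError(f"no tier can hold {name}")
    return placement
```

The tradeoff the articles note falls out of the policy: anything pushed below HBM is cheaper per gigabyte but slower to access, so the scheduler is trading performance for capacity and cost.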
The promise of CXL still hangs in the balance
“… servers operate as one of three pillars — compute — with the other two being networking and storage. AI is changing this,” Fan said. “While the three [pillars] will continue to exist … a new center of gravity has emerged where AI workloads take place.”