Unleashing the Future of Memory Management: Exploring CXL Dynamic Capacity Devices with Docker and QEMU

Introduction

Developers and application owners are always looking for tools and methods that improve performance and scalability. One promising development is Compute Express Link (CXL) technology, and in particular Dynamic Capacity Devices (DCD). CXL is an open standard for high-speed CPU-to-device and CPU-to-memory interconnects, aimed at data center and cloud environments.

Dynamic Capacity Devices offer a practical solution for flexible and scalable memory allocation. DCDs were introduced in the CXL 3.0 specification, section 9.13.3. As the name suggests, a DCD is a CXL memory device whose memory capacity can change dynamically without the need to reset the device or host, reconfigure HDM decoders, or reconfigure software DAX regions. This matters in environments where memory demand can spike unpredictably, such as data analytics and machine learning workloads. One of the most significant use cases for Dynamic Capacity is to let hosts in a data center share memory dynamically without increasing the memory attached to each host.

How Dynamic Capacity Devices Work

The general flow for adding or removing memory is to have an orchestrator coordinate the use of the memory. Generally, there are five actors in such a system: the Orchestrator, the Fabric Manager (FM), the Device the host sees, the Host Kernel, and a Host User. The figure below describes the sequences used for the different operations.

[Figure: sequence diagram of the messages exchanged between the Orchestrator, FM, Device, Host Kernel, and Host User. Operations shown: Create Region, Add Capacity, Add Extent, Accept Extent, Create DAX dev, Use memory, Release DAX dev, Signal done, Remove Capacity, Release Extent, Destroy Region.]

At the time of writing this article, the DCD feature is not yet upstreamed, but is available through patches in the following GitHub repositories:

MemVerge has created a Dockerized QEMU image that provides a simulated CXL environment with DCD features. This setup offers a sandbox for testing and development, and helps you understand and use CXL technology without requiring immediate access to CXL-compatible hardware. This image has been tested on a host running Ubuntu 22.04, although it should work on other distributions.

We will guide you through setting up this environment, managing dynamic memory, and utilizing these powerful features to optimize your applications.

Command Execution Conventions

In the upcoming sections of this guide, we will be navigating between commands that need to be executed on the host system and those that should be run inside the QEMU virtual machine instance. To make it easier to distinguish between these environments, we will use specific prompts before each command:

  • host>: This prompt indicates that the command following it should be executed on the host system. These commands manage the ‘Orchestrator’, ‘Fabric Manager’, and ‘DCD Device’.
  • guest>: This prompt denotes commands to run inside the QEMU instance. These commands generally pertain to the configurations and operations directly involving the virtualized environment for the ‘Host Kernel’ and ‘Host User’.

Prerequisites

Hardware Recommendations

We recommend the following minimum host system configuration:

  • 4 vCPUs
  • 8GB memory
  • 80GB storage
  • Access to the Internet

Required Software Packages

Ensure your host system is equipped with the necessary software packages outlined below:

  1. Docker: Docker is crucial for running the containerized version of the QEMU instance. Ensure that Docker is installed and running on your host machine. For instructions, refer to the official Install Docker Engine documentation.
  2. SSH Client: You will need to connect to the QEMU instance using SSH. Most Unix-like operating systems come with an SSH client by default. For Windows, you may need to install an SSH client like PuTTY or use the built-in SSH client in Windows 10 and later.
  3. socat: This utility connects two endpoints, such as Unix sockets or TCP ports. We will use it to send DCD commands to the QEMU Machine Protocol (QMP) socket and act as the Fabric Management interface. Install socat using your package manager (e.g., sudo apt-get install socat).
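
Before continuing, it can help to confirm that the tools listed above are actually on your PATH. The snippet below is a minimal sketch of such a check; the function name missing_tools is hypothetical and chosen for illustration, and the check only tests that each command can be found, not that the Docker daemon is running.

```shell
# Print the name of every tool from the argument list that is not on PATH.
# Prints nothing when all of the tools are found.
missing_tools() {
    local tool
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            printf '%s\n' "$tool"
        fi
    done
}

# Example (host side): list any prerequisite from this guide that is absent.
#   missing_tools docker ssh socat
```

If the example call prints nothing, all three commands were found. Any name it prints needs to be installed before you continue.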

Setting Up the Docker Environment

To begin leveraging the capabilities of CXL and DCD in a simulated environment, the first step is setting up the necessary Docker container with the QEMU image. Docker provides an isolated and consistent platform for testing, making it an excellent choice for experimenting with new technologies like CXL without impacting your primary system setup.

Initial Setup: Pulling the Docker Image

To pull the MemVerge DCD Docker image, execute the following command in your terminal. This command downloads the latest version of our qemu-dcd-example Docker image, which includes all necessary dependencies and configurations for simulating a CXL environment:

host>$ sudo docker pull mvpool/qemu-dcd-example:latest

Running the Docker Image and Connecting to the QEMU Instance

Once the Docker image is successfully downloaded, you can run it using the following command. This step initializes the QEMU virtual machine within the Docker container, setting up the necessary ports and volume mappings:

host>$ sudo docker run -v /tmp:/tmp -p 2222:2222 mvpool/qemu-dcd-example:latest
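
The container can take a moment to start QEMU. The helper below is a small sketch for waiting on a Unix socket from a second terminal on the host. It assumes that the QMP socket used later in this guide (/tmp/qmp-sock, visible on the host through the /tmp volume mapping) appears once QEMU has started; the function name wait_for_socket and the 60-second default are illustrative choices, not part of the image.

```shell
# Wait until a Unix socket exists at the given path, polling once per second.
# Arguments: <path> [timeout-seconds, default 60]
# Returns 0 when the socket is present, 1 if the timeout expires first.
wait_for_socket() {
    local path="$1" timeout="${2:-60}" waited=0
    while [ ! -S "$path" ]; do
        if [ "$waited" -ge "$timeout" ]; then
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}

# Example (host side, second terminal):
#   wait_for_socket /tmp/qmp-sock 120 && echo "QMP socket is present"
```

A socket file left in /tmp by an earlier run would also satisfy this check, so treat a positive result as a hint rather than proof that the guest has finished booting.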

Once the Docker container is up and running, connect to the QEMU instance from your host machine using SSH. The default root password is “memverge”:

host>$ ssh -p 2222 root@localhost

Working with the QEMU and DCD Features

Within the Dockerized QEMU environment, you can begin exploring CXL DCD’s capabilities. The first step involves creating a new memory region, which is critical for managing memory dynamically.

Creating New Regions and Devices

Execute the following command inside the QEMU guest to create a new region. This command specifies the region type, the root decoder, and the memory device to be used. Note that this QEMU image includes a custom build of ndctl in the root user’s home directory, which avoids conflicts with any other installation of ndctl.

guest># ./ndctl/build/cxl/cxl \
        create-region \
        -t dc0 \
        -d decoder0.0 \
        -m mem0

Managing Memory Dynamically with DCD Features Using QMP Commands

From a terminal session on the main host, outside the Dockerized QEMU instance, you can send QEMU QMP commands to dynamically add capacity. This method of communication mimics out-of-band management systems like BMCs in real server environments:

host>$ sudo socat UNIX-CONNECT:/tmp/qmp-sock STDIO
{ "execute": "qmp_capabilities" }
{ "execute": "cxl-add-dynamic-capacity",
  "arguments": {
      "path": "/machine/peripheral/cxl-mem0",
      "region-id": 0,
      "extents": [
      {
          "offset": 0,
          "len": 1073741824
      }
      ]
   }
}
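
Typing the JSON by hand is error-prone, especially the byte counts. The following is a minimal sketch of two helpers that generate the same message as above. The function names are hypothetical, the device path and the "region-id" argument are copied from the command shown in this guide, and only a single extent is supported. The usage comment mentions cxl-release-dynamic-capacity, the name QEMU uses upstream for the release counterpart; whether the patched QEMU in this image accepts it with the same argument shape is an assumption.

```shell
# Convert a size in GiB to bytes (1 GiB = 1024 * 1024 * 1024 bytes).
gib_to_bytes() {
    echo $(( $1 * 1024 * 1024 * 1024 ))
}

# Build a single-extent dynamic capacity QMP message on one line.
# Arguments: <command> <region-id> <offset-bytes> <length-bytes>
# The device path is the one used in this guide.
build_dc_command() {
    printf '{ "execute": "%s", "arguments": { "path": "/machine/peripheral/cxl-mem0", "region-id": %d, "extents": [ { "offset": %d, "len": %d } ] } }\n' \
        "$1" "$2" "$3" "$4"
}

# Example (host side): negotiate capabilities, then add a 1 GiB extent.
# The trailing sleep keeps the pipe open long enough to see QEMU's replies.
#   { echo '{ "execute": "qmp_capabilities" }'
#     build_dc_command cxl-add-dynamic-capacity 0 0 "$(gib_to_bytes 1)"
#     sleep 1; } | sudo socat - UNIX-CONNECT:/tmp/qmp-sock
#
# Assumption: releasing the same extent would use the same helper with
# cxl-release-dynamic-capacity as the command name.
```

Generating the message keeps the offset and length consistent between an add and a later release, since both come from the same arguments.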

Steps to Add and Reconfigure DCD Capacities

After successfully adding new memory capacity via the QMP command, update the region capacity and create a DAX device using the daxctl tool:

guest># ./daxctl create-device region0
[
  {
    "chardev":"dax0.1",
    "size":1073741824,
    "target_node":1,
    "align":2097152,
    "mode":"devdax"
  }
]
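
The "size" and "align" fields in this output are raw byte counts. As a reading aid, the sketch below converts a byte count to the largest binary unit that divides it evenly; the function name human_bytes is hypothetical and not part of daxctl.

```shell
# Print a byte count using the largest binary unit that divides it evenly.
# Values that are not a multiple of 1024 are left in the current unit.
human_bytes() {
    local value="$1" unit
    for unit in B KiB MiB GiB TiB; do
        if [ "$value" -lt 1024 ] || [ $((value % 1024)) -ne 0 ] || [ "$unit" = TiB ]; then
            break
        fi
        value=$((value / 1024))
    done
    printf '%s %s\n' "$value" "$unit"
}

# Example with the values from the output above:
#   human_bytes 1073741824   # the "size" field
#   human_bytes 2097152      # the "align" field
```

With the values shown above, the device is 1 GiB in size with a 2 MiB alignment, which matches the 1073741824-byte extent added through QMP.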

Convert the DAX device to system-ram mode. This onlines its capacity as memory on a NUMA node (target_node 1 in the output above), so the kernel can treat it like regular system memory and applications can use it without being DAX-aware:

guest># ./daxctl reconfigure-device -m system-ram dax0.1
[
  {
    "chardev":"dax0.1",
    "size":1073741824,
    "target_node":1,
    "align":2097152,
    "mode":"system-ram",
    "online_memblocks":8,
    "total_memblocks":8,
    "movable":true
  }
]
reconfigured 1 device
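
The "online_memblocks" and "total_memblocks" fields count kernel memory blocks, so dividing the device size by the block count gives the size of one block. The sketch below shows that arithmetic and, as a cross-check, reads the block size the kernel reports in sysfs, where the value is stored as hexadecimal without a "0x" prefix. The function names are hypothetical, and the block size can differ between systems.

```shell
# Size in bytes of one memory block, given the device size and block count.
# Arguments: <size-bytes> <total_memblocks>
memblock_bytes() {
    echo $(( $1 / $2 ))
}

# Read the kernel's memory block size in bytes.
# Argument: optional path to the sysfs file (defaults to the standard one).
kernel_memblock_bytes() {
    local hex
    hex=$(cat "${1:-/sys/devices/system/memory/block_size_bytes}")
    echo $(( 0x$hex ))
}

# Example (guest side), using the values from the output above:
#   memblock_bytes 1073741824 8
#   kernel_memblock_bytes
```

For the output above, 1073741824 / 8 is 134217728 bytes (128 MiB) per block. If the two functions agree and online_memblocks equals total_memblocks, all of the added capacity is online.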

Summary

In this guide, we’ve explored the advanced capabilities of Compute Express Link (CXL) with Dynamic Capacity Devices (DCD) through a practical, hands-on approach using Docker and QEMU. By setting up a virtualized environment, application owners and developers can simulate and interact with CXL’s cutting-edge memory expansion and management features without needing immediate access to CXL-compatible hardware.

This guide not only serves as an introduction to leveraging CXL DCD in virtualized settings but also empowers developers and tech enthusiasts to explore further and innovate within their respective fields, paving the way for more efficient, scalable, and high-performance computing solutions. Whether you want to enhance your understanding of modern hardware interconnects or seek practical experience with emerging memory technologies, the insights provided here will be invaluable as you advance in your technology journey.

What’s Next

As we look toward the horizon, the arrival of hardware that supports Compute Express Link (CXL) Dynamic Capacity Devices (DCD) holds promising advancements for the technology industry. This upcoming phase will unlock the full potential of DCD features, transcending the current simulation-based scenarios and venturing into real-world applications that can dramatically enhance computing infrastructures.

Real-Time Dynamic Memory Allocation

With hardware implementations, DCD will allow for real-time dynamic memory allocation, which means systems can adapt their memory resources on the fly according to workload demands. This flexibility can significantly improve performance and efficiency in data centers, particularly those involved in big data analytics, artificial intelligence, and high-performance computing.

Enhanced Data Center Efficiency

The deployment of CXL-enabled hardware will lead to more efficient data center operations. By minimizing the physical constraints associated with memory expansion and upgrading, data centers can optimize their hardware usage, reduce power consumption, and lower the overall environmental impact.

New Avenues for Innovation

The arrival of hardware will also spur innovation across various sectors by enabling the development of new applications and services that leverage the capability to adjust memory resources dynamically. Developers will have new tools to experiment with, potentially leading to software design and functionality breakthroughs.

Better Integration with Emerging Technologies

As technologies like AI and machine learning continue to evolve, they require more sophisticated memory management solutions. Hardware-based DCDs integrated with CXL technology will provide the backbone needed to support these advancements, ensuring that memory-intensive applications perform optimally.

Preparing for Transition

For developers and IT professionals, now is the time to prepare for the transition to CXL-compatible hardware. This includes staying updated with the latest developments, understanding the integration challenges, and beginning to think about how existing systems can be upgraded or adapted to take full advantage of DCD features.

Community and Collaboration

Finally, the community around CXL and DCD is likely to grow, fostering collaboration and knowledge-sharing. Engaging with this community through forums, conferences, and collaborative projects will be vital in staying at the forefront of this technology wave.

In conclusion, the future of Dynamic Capacity Devices in hardware presents a thrilling prospect for the technology industry. As we anticipate these developments, embracing the changes and preparing for the next generation of memory management will ensure that businesses and developers are ready to harness the full power of CXL technology.