DragonHPC
Scale up your HPC & AI workflows.
From laptop to exaflop.

Leverage Python multiprocessing on small and large distributed systems
Dragon is a composable and distributed runtime that enables users to create scalable, complex, and resilient HPC and AI applications, workflows, and services through standard Python interfaces. Dragon provides capabilities to address many of the challenges around programmability, memory management, transparency, and efficiency on distributed computing systems.
Features that address these challenges include portable and transparent programmability based on standard Python APIs, fine-grained process management, high-speed communication, telemetry tools, integration with Jupyter notebooks, compatibility with a range of existing Python apps and tools, and a high-performance distributed dictionary.
Develop scalable workflows with minimal code modification
Dragon implements nearly all of the multiprocessing API, allowing users to manage processes, high-level communication constructs (e.g., Queue, Connection, Semaphore, Barrier), and shared data (e.g., Value, Array, Dict) at supercomputing scales. The current Dragon v0.10 release scales to roughly 1,000 supercomputing nodes for several use cases.
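As a rough illustration of the kind of code this targets, here is a minimal sketch that uses only the standard library multiprocessing module (Process and Queue). The names square_worker and run_squares are hypothetical and chosen for this example. It assumes, as noted in the comments, that switching to Dragon amounts to importing dragon and selecting the "dragon" start method; check the Dragon documentation for the exact steps in your release.

```python
import multiprocessing as mp

# Assumption (not exercised here): with Dragon installed, the same code is
# expected to run on Dragon by adding `import dragon` before this import and
# passing start_method="dragon" below, then launching the script with Dragon.


def square_worker(task_queue, result_queue):
    """Pull numbers from task_queue until a None sentinel arrives."""
    while True:
        item = task_queue.get()
        if item is None:
            break
        result_queue.put((item, item * item))


def run_squares(values, num_workers=2, start_method=None):
    """Square each value in a pool of worker processes, preserving input order.

    start_method=None uses the platform default start method.
    """
    ctx = mp.get_context(start_method)
    task_queue = ctx.Queue()
    result_queue = ctx.Queue()

    workers = [
        ctx.Process(target=square_worker, args=(task_queue, result_queue))
        for _ in range(num_workers)
    ]
    for worker in workers:
        worker.start()

    for value in values:
        task_queue.put(value)
    # One sentinel per worker so every worker exits its loop.
    for _ in workers:
        task_queue.put(None)

    # Drain results before joining so workers never block on a full queue.
    results = dict(result_queue.get() for _ in values)
    for worker in workers:
        worker.join()

    return [results[value] for value in values]
```

Called from under an `if __name__ == "__main__":` guard, `run_squares([1, 2, 3, 4])` returns `[1, 4, 9, 16]`. The workers communicate only through queues, so the same structure should carry over whether they run on one laptop or across many nodes.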
Dragon is a distributed environment for developing high-performance tools, libraries, and applications at scale.
Dragon also extends the multiprocessing API to support sophisticated HPC/AI workflows. For example, users can orchestrate ensembles of MPI and model training processes using multiprocessing and Dragon-native interfaces.
Some highlights that make DragonHPC a valuable distributed runtime
Visualize what your apps are doing in real time.
Documentation and cookbook examples to get you up and running.
Deploy it as part of your current stack for added performance and functionality.
Dragon is optimized for the HPE Cray Slingshot network, but also supports other networks, including InfiniBand and standard Ethernet.
In-memory distributed dictionary for performance and versatility in managing data.
Deploy on a range of systems and with a variety of other workflow tools.
Our next release, v0.11, is planned for early 2025.