Charmworks debuts CharmMPI, offering unprecedented performance for MPI codes
Interested? Charmworks has prepared an onboarding tutorial which walks MPI programmers through the process of using CharmMPI and attaining the benefits it provides. Try it today!
Charmworks has released CharmMPI, a dynamic runtime environment that allows developers to dramatically improve the performance of their MPI-based applications. With this release, CharmMPI joins a select group of Message Passing Interface (MPI) implementations such as MPICH, MVAPICH, and Open MPI. Users of existing MPI implementations can immediately add the benefit of CharmMPI’s adaptive computing capabilities, such as dynamic load balancing and automatic communication optimizations.
Expanding its popular and powerful Charm++ parallel programming system
CharmMPI constantly and automatically inspects and optimizes your simulation runs, providing runtime adaptivity without user intervention. It works with existing MPI applications written in C, C++, and Fortran. It operates on a variety of platforms, clusters, and interconnects, as well as cloud-based high-performance computing infrastructure.
The result is faster, higher-resolution insights. Even for applications that do not require runtime adaptivity, CharmMPI delivers less than half the latency and almost four times the bandwidth for communication within a shared-memory node.
Each process in a traditional MPI code, called a rank, is statically assigned to a compute node or to a core within a node. That assignment cannot change during the run, and messages for a given rank are addressed explicitly to that node or core.
“Many parallel codes lose more than 50 percent of their performance due to load balancing issues alone. And that hardwired setup in traditional MPI-based codes prevents automatic load balancing and other runtime optimizations almost entirely,” said Sanjay Kale, CEO of Charmworks. Adding load balancing to a traditional MPI-based code often requires a complete restructuring of the code and significant programmer time.
“CharmMPI allows users to virtualize the ranks in their codes, decoupling the ranks from particular physical nodes or cores. Over-decomposition along with such virtual ranks lets CharmMPI automatically migrate ranks to resources that are idle. Such load balancing greatly improves the codes’ performance and neatly separates the rebalancing logic from the application code.”
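The effect of over-decomposition and rank migration can be illustrated with a toy model. The sketch below is not CharmMPI code and does not use its API. It assumes a hypothetical workload of 16 virtual ranks on 4 cores, and it stands in a simple greedy heuristic (heaviest rank first, onto the least-loaded core) for whatever strategy the runtime actually applies. All names and numbers are illustrative assumptions.

```python
"""Toy model of over-decomposition with migratable virtual ranks (not CharmMPI code)."""
import heapq


def makespan(assignment, loads, num_cores):
    """Return the finish time of the busiest core for a rank -> core assignment."""
    per_core = [0.0] * num_cores
    for rank, core in enumerate(assignment):
        per_core[core] += loads[rank]
    return max(per_core)


def block_assignment(num_ranks, num_cores):
    """Static mapping: contiguous blocks of ranks pinned to each core."""
    per_core = max(1, num_ranks // num_cores)
    return [min(rank // per_core, num_cores - 1) for rank in range(num_ranks)]


def greedy_rebalance(loads, num_cores):
    """Place the heaviest ranks first, each on the currently least-loaded core."""
    heap = [(0.0, core) for core in range(num_cores)]
    heapq.heapify(heap)
    assignment = [0] * len(loads)
    by_weight = sorted(range(len(loads)), key=lambda r: loads[r], reverse=True)
    for rank in by_weight:
        core_load, core = heapq.heappop(heap)
        assignment[rank] = core
        heapq.heappush(heap, (core_load + loads[rank], core))
    return assignment


# Hypothetical workload: 16 virtual ranks on 4 cores, with most of the
# work concentrated in the first four ranks (assumed values, not measurements).
NUM_CORES = 4
LOADS = [8.0, 7.0, 6.0, 5.0] + [1.0] * 12

static_time = makespan(block_assignment(len(LOADS), NUM_CORES), LOADS, NUM_CORES)
balanced_time = makespan(greedy_rebalance(LOADS, NUM_CORES), LOADS, NUM_CORES)
ideal_time = sum(LOADS) / NUM_CORES

print(f"static mapping:   {static_time:.1f} time units")
print(f"after rebalance:  {balanced_time:.1f} time units")
print(f"perfect balance:  {ideal_time:.1f} time units")
```

In this model the four heavy ranks start on the same core, so three cores sit mostly idle while one does most of the work. Because there are more virtual ranks than cores, the balancer has enough small pieces to fill the gaps, and the application logic itself never changes.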
CharmMPI uses MPI’s familiar API, making it easy to learn. It can help developers:
Alleviate scalability bottlenecks that are difficult to address directly in applications.
Tackle both static and dynamic load imbalances.
Tolerate communication latencies that can otherwise be hard to compensate for.
Run through node failures.
Dynamically shrink and expand computing runs based on available resource allocation, providing resource elasticity that is particularly valuable in today’s cloud environments.
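The shrink-and-expand capability follows from the same virtualization idea: the number of virtual ranks stays fixed while the number of cores underneath them changes. The sketch below is a toy model under stated assumptions, not CharmMPI code. The rank count, the sequence of core allocations, and the round-robin placement are all hypothetical choices made for illustration.

```python
"""Toy model of shrinking and expanding a run (illustrative only, not CharmMPI code)."""


def remap(num_ranks, num_cores):
    """Spread a fixed set of virtual ranks round-robin over the available cores."""
    if num_cores < 1:
        raise ValueError("need at least one core")
    return {rank: rank % num_cores for rank in range(num_ranks)}


def migrations(old, new):
    """Count virtual ranks whose host core differs between two mappings."""
    return sum(1 for rank in old if old[rank] != new[rank])


NUM_RANKS = 12            # hypothetical: fixed for the whole run
ALLOCATIONS = [4, 6, 3]   # hypothetical core counts granted over time

mapping = remap(NUM_RANKS, ALLOCATIONS[0])
history = []
for cores in ALLOCATIONS[1:]:
    new_mapping = remap(NUM_RANKS, cores)
    history.append((cores, migrations(mapping, new_mapping)))
    mapping = new_mapping

for cores, moved in history:
    print(f"resized to {cores} cores: {moved} of {NUM_RANKS} virtual ranks migrated")
```

The application in this model always sees 12 ranks. Only the rank-to-core mapping changes when the allocation grows or shrinks, which is what lets a run adapt to the resources a cloud scheduler makes available.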
CharmMPI was developed over the last four years, supported by $1.6 million from the Department of Energy’s Small Business Innovation Research program. The SBIR project led to significant performance improvements in CharmMPI and made the implementation much more robust. CharmMPI now allows users to easily transform their applications into thread-safe codes, ensuring that computations behave and interact properly as they are automatically spread across a computing resource.
“A major focus of the innovative work during the SBIR project was to automate those code transformations or to make them easier to do manually,” Kale said. “These transformations produce a code that is still a regular MPI code. In fact, it becomes cleaner and thread-safe, thus automating code modernization.”
CharmMPI is based on AMPI (Adaptive MPI), which was developed over several years at the University of Illinois Urbana-Champaign. CharmMPI will continue to be available for free to non-profit organizations such as universities and Department of Energy national labs.
Charmworks provides scalable solutions that improve productivity in parallel programming. Charm++, the company’s primary product and core technology, combines an adaptive runtime with supporting tools that allow developers to easily incorporate automatic load balancing, fault tolerance, and energy-saving features into their codes. Charmworks also offers a suite of software built on top of Charm++, including an MPI implementation called CharmMPI and a discrete event simulator called CharmDES.