Charmworks will team with Argonne National Laboratory to expand and enhance the Argobots threading framework with support from the Department of Energy’s Small Business Innovation Research program.
Argobots Pro lets developers leverage flexible scheduling of computations within a node to implement high-performance multitasking strategies, with a special focus on lightweight user-level threads. The runtime system’s development was led by Argonne’s Programming Models & Runtime Systems group, with participation from the University of Illinois Urbana-Champaign. It was part of DOE’s Argo project, which aimed to build operating systems and runtimes for exascale supercomputers. Argobots’ design was inspired by Converse, a scheduling system that underpins the Charm++ parallel programming system. Argobots was a finalist for an R&D 100 award from R&D World magazine in 2020.
Task-based runtimes like those supported by Argobots Pro are very powerful, but they’re also very difficult to design and develop in a way that can be reused across many applications and libraries.
“With the expansion of multiprocessor chips and GPGPUs within computing nodes, there’s a lot of asynchronous data movement and much less memory and network bandwidth. That makes for extremely complex, dynamic, and irregular work,” said Sanjay Kale, Charmworks’ CEO.
“Adequate software infrastructure to do the work just doesn’t exist. With Argobots Pro — especially with the enhancements we’ll be adding to Argobots Pro — developers can combine a variety of scheduling strategies and schedulers that control different aspects of the hardware and applications. Whether they’re building tasking models, runtime systems, backends of programming languages, or building applications, they can experiment and tune within a single platform to get the highest possible impact.”
Enhancements that Charmworks and ANL plan to implement include:
Registration-based queues to allow developers to flexibly control sequencing across multiple queues and asynchronous events.
Node parallel programming to improve nested parallel performance when using systems like OpenMP or BOLT.
Mechanisms for communication thread management to circumvent bottlenecks.
A dependence tracking module to determine data readiness and coordinate data movement.
Coordination of GPGPU and accelerator kernels, monitoring completion, tracking communication, and initiating data transfers.
Debugging and performance profiling tools.
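To make the dependence tracking item above more concrete, here is a minimal sketch in Python of the general idea: a task becomes runnable only once every piece of data it needs has been marked ready. This is an illustration under stated assumptions, not the planned Argobots Pro module or any Argobots API; the class `DependenceTracker`, its methods, and the task and data names are all hypothetical.

```python
from collections import defaultdict, deque


class DependenceTracker:
    """Toy tracker (hypothetical): a task is runnable once all its inputs are ready."""

    def __init__(self):
        self.waiting_on = {}                # task name -> set of inputs not yet ready
        self.consumers = defaultdict(list)  # data name -> tasks waiting for it
        self.ready_data = set()             # data items already marked ready
        self.ready_tasks = deque()          # FIFO of runnable task names

    def add_task(self, task, inputs):
        # Only inputs that are not yet ready block the task.
        missing = set(inputs) - self.ready_data
        if not missing:
            self.ready_tasks.append(task)
            return
        self.waiting_on[task] = missing
        for name in missing:
            self.consumers[name].append(task)

    def mark_ready(self, data):
        # Ignore repeated notifications for the same data item.
        if data in self.ready_data:
            return
        self.ready_data.add(data)
        # Release every waiting task whose last missing input just arrived.
        for task in self.consumers.pop(data, []):
            missing = self.waiting_on[task]
            missing.discard(data)
            if not missing:
                del self.waiting_on[task]
                self.ready_tasks.append(task)


tracker = DependenceTracker()
tracker.add_task("halo_exchange", ["grid"])
tracker.add_task("stencil", ["grid", "halo"])
tracker.mark_ready("grid")  # halo_exchange becomes runnable; stencil still waits on "halo"
tracker.mark_ready("halo")  # stencil becomes runnable
order = list(tracker.ready_tasks)
```

In a real runtime, the ready queue would feed a scheduler, and marking data ready would typically be triggered by a completed transfer or kernel rather than by an explicit call.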
Using Argobots Pro, developers can better address problems that frequently occur during large-scale simulations, such as hardware failure, load imbalance, and excessive energy consumption. The runtime is highly interoperable with MPI libraries like MPICH, Open MPI, and their derivatives. It is used by many commercial codes as well as open source software projects, including Intel DAOS, Margo/Mercury, and HDF5, to achieve unprecedented performance.
Intel plans to deploy Argobots on the Aurora exascale supercomputer. Argobots is already being used on Fugaku, the fastest supercomputer in the world, and on various supercomputers at China’s National Supercomputing Centers in Guangzhou and Shenzhen.
About Charmworks
Charmworks provides scalable solutions that improve productivity in parallel programming. Charm++, the company’s primary product and core technology, is an adaptive runtime with supporting tools that allows developers to easily incorporate automatic load balancing, fault tolerance, and energy-saving features into their codes. Charmworks also offers a suite of software built on top of Charm++, including an MPI implementation called CharmMPI and a discrete event simulator called CharmDES.
About Argonne National Laboratory
Argonne National Laboratory seeks solutions to pressing national problems in science and technology. The nation’s first national laboratory, Argonne conducts leading-edge basic and applied scientific research in virtually every scientific discipline. Argonne researchers work closely with researchers from hundreds of companies, universities, and federal, state and municipal agencies to help them solve their specific problems, advance America’s scientific leadership and prepare the nation for a better future. With employees from more than 60 nations, Argonne is managed by UChicago Argonne, LLC for the U.S. Department of Energy’s Office of Science.