Exascale: SUPERLU/STRUMPACK Solvers Get Frontier Upgrade

“Before (the Exascale Computing Project), both packages had very little support for GPUs. We could get some benefit from running on a single GPU, but without updating the code, even 10 GPUs wouldn’t make it run much faster. We had to redesign a lot of algorithms ….”

ISC 2024’s Wednesday Keynote: Two Trends Transforming HPC

Overarching both themes is the slowing of Moore’s Law, the deceleration of improvement in HPC performance combined with increasing costs for new generations of systems. In addition, implementing and realizing the benefits of new systems is slowing ….

Lenovo Maximizes HPC Resources via Partnership with SchedMD and Slurm Workload Manager

[SPONSORED GUEST ARTICLE] In HPC, leveraging compute resources to the maximum is a constant goal and a constant source of pressure. The higher the usage rate, the more jobs get done, the fewer resources sit idle, and the greater the return on the HPC investment. At Lenovo, with its….

NVIDIA at SIGGRAPH: DGX Integration with Hugging Face for LLM Training; Announcement of AI Workbench

At the SIGGRAPH conference this morning in Los Angeles, NVIDIA made several generative AI-related announcements, including a partnership with Hugging Face intended to broaden access to generative AI supercomputing (NVIDIA’s DGX cloud hardware) for developers building large language models (LLMs) and other AI applications on the Hugging Face platform. The companies said the combination […]

ExaWorks: Tested Component for HPC Workflows

ExaWorks is an Exascale Computing Project (ECP)–funded project that provides access to hardened and tested workflow components through a software development kit (SDK). Developers use this SDK and associated APIs to build and deploy production-grade, exascale-capable workflows on US Department of Energy (DOE) and other computers. The prestigious Gordon Bell Prize competition highlighted the success of the ExaWorks SDK when the winner and two of three finalists in the 2020 Association for Computing Machinery (ACM) Gordon Bell Special Prize for High Performance Computing–Based COVID-19 Research competition leveraged ExaWorks technologies.

@HPCpodcast: HPC Software Rock Star Sunita Chandrasekaran on Exascale Programming and the Emergence of the RSE

In this episode of the @HPCpodcast, sponsored by Lenovo, Shahin and Doug talk with University of Delaware’s Sunita Chandrasekaran, a rock star in the world of supercomputing software. Chandrasekaran is an associate professor in the Department of Computer Information Systems and co-directs the AI Center of Excellence at the university.

Sylabs Unveils ‘Singularity Containers 101’ Curriculum for Colleges

Reno, NV – June 14, 2022 – Sylabs, provider of tools and services for performance-intensive container technology, today announces the “Singularity Containers 101” curriculum. Using the open-source SingularityCE platform, this curriculum is designed for college programs, offering instruction in container technology. This program is built to prepare students to navigate and lead in the next […]

Exxact Offers Run:ai for GPU Clusters in AI Workloads

FREMONT, CA — Nov. 30, 2022 — Exxact Corporation, a provider of high-performance computing (HPC), artificial intelligence (AI), and data center solutions, now offers Run:ai in its solutions. This Kubernetes-based orchestration tool incorporates an AI-dedicated, high-performance super-scheduler tailored for managing GPU resources in AI clusters. Run:ai dynamically optimizes hardware utilization for AI workloads, enabling clusters […]

Relief for the Solution Architect: Pushing Back on HPC Cluster Complexity with Warewulf and Apptainer

[SPONSORED CONTENT] How did you, at heart and by training a research scientist, financial analyst or product design engineer doing multi-physics CAE, how did you end up as a… systems administrator? You set out to be one thing and became something else entirely. You finished school and began working with some hefty HPC-class clusters. One […]

Rocky Enterprise Software Foundation Approves Bylaws and Charter for Open Community Control of Rocky Linux and Projects

RENO, Nev.—November 10, 2022—The Rocky Enterprise Software Foundation (RESF) today published its charter and bylaws, documenting the organization’s governing structure and rules for hosting open source projects, including its namesake project, Rocky Linux. The charter and bylaws also describe the RESF vision to create and nurture a community of individuals and organizations that are committed […]