NRP & the Path forward
Frank Würthwein
Director, San Diego Supercomputer Center
February 9th 2023
Democratize Access
Long Term Vision
• Create an Open National Cyberinfrastructure that allows the federation of CI at all ~4,000 accredited, degree-granting higher education institutions, non-profit research institutions, and national laboratories.
§ Open Science
§ Open Data
§ Open Source
§ Open Infrastructure
Openness for an Open Society
Open Compute
Open Storage & CDN
Open devices/instruments/IoT, …?
Community vs Funded Projects
Community with
Shared Vision
Lots of funded projects that
contribute to this shared
vision in different ways.
We want you to …
… grow NRP.
… build on NRP.
NRP is “owned” and “built” by the community for the community
How is NRP different from OSG?
We’ve been doing federated cyberinfrastructure since 2005.
Why is NRP even needed given that OSG exists?
Cyberinfrastructure Stack
IPMI, Firmware, BIOS
NRP operates at all layers of the stack, from IPMI up
• IPMI reduces TCO and lowers threshold to entry
• Kubernetes allows service deployments
• Also the natural layer for application container deployment
• Admiralty allows K8S federation with folks who want control
• Including cloud integration to access TPUs & other cloud only architectures
• HTCondor allows NRP to show up as a “site” in OSG
The layer you integrate at depends on
• Control you want
• Effort you can afford
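To make the Kubernetes layer concrete, here is a minimal sketch of an application container deployment that requests a GPU. It only builds a Pod manifest as JSON and does not contact any cluster. The pod name, namespace, and image are hypothetical placeholders, and the use of the `nvidia.com/gpu` resource name (the name exposed by NVIDIA's Kubernetes device plugin) on any particular NRP cluster is an assumption.

```python
import json


def gpu_pod_manifest(name, namespace, image, gpus=1):
    """Build a Kubernetes Pod manifest (as a dict) that requests GPUs.

    All argument values are supplied by the caller; nothing here is
    specific to NRP.
    """
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "restartPolicy": "Never",
            "containers": [
                {
                    "name": "main",
                    "image": image,
                    # Assumption: GPUs are exposed via the NVIDIA device plugin.
                    "resources": {"limits": {"nvidia.com/gpu": gpus}},
                }
            ],
        },
    }


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    manifest = gpu_pod_manifest("gpu-test", "my-namespace", "example.org/my-image:latest")
    print(json.dumps(manifest, indent=2))
```

The printed JSON can be piped to `kubectl apply -f -`, which accepts JSON as well as YAML.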
Complementarity in Implementation of
“Bring Your Own Resource” model
OSG/PATh focused on campus cluster integration.
NRP focused on individual node integration instead of clusters.
• Under-resourced institutions
• Network providers and their POPs
• CS & ECE faculty specialized on:
• AI/ML => gaming GPUs
• systems R&D
All of these find it difficult to
justify staff to support all layers
Hardware on NRP is Global
NRP integrates hardware in USA, EU, and Asia
• Open Science Data Federation
• Origins & Caches in US, EU, Asia
• Protein Data Bank
• (Future) Replicas in EU & Asia
NRP is unique in its support of
global service deployments
OSDF ops by PATh
PDB ops by PDB ???
Supporting Nautilus
for the next decade
Nautilus = K8S infrastructure of PRP for the last 5+ years
Nautilus = K8S of NRP for the next 10 years
The NSF Cat-II Program
• Through the Cat-II program, NSF supports novel systems ideas.
- 3 year “testbed” phase
§ The PI owns the resource, and has (some) freedom regarding who uses it.
◦ No requirements for making it available via any specific allocation mechanism.
§ It is expected that not all features work on day 1.
◦ 3 years of experimentation & development of features
- 2 year “allocation” phase
§ The resource is made available via an NSF supported allocation mechanism.
- The solicitation mentions the possibility of an additional 5 year
renewal without re-competition if system is successful.
• We decided that this is an ideal program to try and secure
NRP core operations funding for the next 10 years
- And thus provide the stability necessary for growth of NRP.
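The 10-year horizon follows from adding up the phases listed above. A small sketch of that arithmetic, treating the renewal as optional because the solicitation only mentions it as a possibility:

```python
# (phase name, years, guaranteed by the initial award?)
CAT_II_PHASES = [
    ("testbed", 3, True),
    ("allocation", 2, True),
    ("renewal", 5, False),  # only "possible", and only if the system is successful
]


def total_years(phases, include_optional):
    """Sum phase durations, optionally counting the non-guaranteed renewal."""
    return sum(
        years
        for _name, years, guaranteed in phases
        if guaranteed or include_optional
    )


if __name__ == "__main__":
    print("initial award:", total_years(CAT_II_PHASES, include_optional=False), "years")
    print("with renewal: ", total_years(CAT_II_PHASES, include_optional=True), "years")
```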
Cat-II: Prototype National
Research Platform (PNRP)
Funded as NSF 2112167
80GB A100
5 year project with $5M hardware & $6.45M people
Supports Nautilus, and thus the core NRP infrastructure
Promises to build on “PRP” functionality, and go beyond
NSF Acceptance Review of System scheduled for March 8 & 9th 2023
PI = Wuerthwein; Co-PIs: DeFanti, Rosing, Tatineni, Weitzel
• I1: Innovative network fabric that allows “rack” of hardware to
behave like a single “node” connected via PCIe.
• I2: Innovative application libraries to expose FPGA hardware
to science apps via language constructs scientists understand
(C, C++ rather than firmware)
• I3: A “Bring Your Own Resource” model that allows campuses
nationwide to join their resources to the system.
• I4: Innovative scheduling to support urgent computing, including
interactive via Jupyter.
• I5: Innovative Data Infrastructure, including national scale
Content Delivery System like YouTube for science.
I3, I4 & I5 turn “PRP” into “NRP” and sustain it into the future.
I1 & I2 are totally new.
Data Infrastructure Model of NRP
• Support regional Ceph storage systems across the USA.
- Campuses can join individual storage hosts to the Ceph system in their region.
- All regional storage systems are Origins in OSG Data Federation (OSDF)
- Deploy replication system such that researchers can decide what part of their
namespace should be in which regional storage.
• Deploy caches in Internet2 backbone such that no campus nationwide is
more than 500 miles from a cache.
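The 500-mile coverage goal can be checked with a great-circle distance calculation. The sketch below uses the haversine formula. The cache and campus coordinates are illustrative city locations chosen for the example, not actual cache sites, and straight-line distance is only a stand-in for network path length.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def nearest_cache(campus, caches):
    """Return (cache_name, miles) for the cache closest to a campus location."""
    return min(
        ((name, haversine_miles(*campus, *loc)) for name, loc in caches.items()),
        key=lambda pair: pair[1],
    )


def uncovered(campuses, caches, limit_miles=500):
    """List campuses whose nearest cache is farther away than limit_miles."""
    return [
        name for name, loc in campuses.items()
        if nearest_cache(loc, caches)[1] > limit_miles
    ]


# Hypothetical locations, for illustration only.
CACHES = {"chicago": (41.88, -87.63), "los_angeles": (34.05, -118.24)}
CAMPUSES = {"san_diego": (32.72, -117.16), "denver": (39.74, -104.99)}

if __name__ == "__main__":
    print(uncovered(CAMPUSES, CACHES))
```

With these two example caches, the Denver location is not covered, which is the kind of gap that an additional backbone cache would close.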
NRP data infrastructure model combines best of PRP & OSG
From PRP we take the regional Ceph storage concept
From OSG/PATh we take the data origin & caching concepts
And then we add as a totally new feature:
User controlled replication of partial namespaces across regions.
(We will develop this during 3 year “testbed” phase)
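Since this feature is still to be developed, the following is only one possible shape for it: a policy that maps namespace prefixes to regions, resolved by longest-prefix match. The function name, the policy format, and the region names are all hypothetical and are not taken from NRP or OSDF.

```python
def regions_for(path, policy):
    """Return the regions for the longest policy prefix that contains path.

    policy maps a namespace prefix (e.g. "/myproject/hot") to a list of
    region names. Returns an empty list when no prefix matches.
    """
    best_prefix = None
    best_len = -1
    for prefix in policy:
        p = prefix.rstrip("/")
        # Match only on path-component boundaries: "/a" must not match "/ab".
        if path == p or path.startswith(p + "/"):
            if len(p) > best_len:
                best_prefix, best_len = prefix, len(p)
    return policy[best_prefix] if best_prefix is not None else []


# Hypothetical policy: the whole project lives in one region,
# while a frequently used subtree is replicated to three.
POLICY = {
    "/myproject": ["west"],
    "/myproject/hot": ["west", "central", "east"],
}

if __name__ == "__main__":
    for path in ("/myproject/hot/run1.dat", "/myproject/cold/run0.dat", "/other/x"):
        print(path, "->", regions_for(path, POLICY))
```

Longest-prefix matching lets a researcher set a default for a whole namespace and override it for a subtree without listing individual files.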
Want Others to build higher level data services on top
Matrix of Science x Innovations
NRP. The networks of researchers we create will use the NRP systems innovations to achieve broader impact, with the help from our collaborators, users, staff, Co-PIs, and PI.
Table 3.1 Representative Science and Engineering Use Cases
Application domain | Lead researcher & Science Driver | Themes | NRP Innovations
 | Peter Couvares, LIGO Lab; Erik Katsavounidis, MIT | BGS, UC, AI | I2, I3, I4, I5
IceCube | Benedikt Riedel, UW Madison | BGS, UC, AI | I3, I4
Astronomy (DKIST & Sky Surveys) | Curt Dodds, U. Hawai’i | BGS, AI | I3, I5
Campus Scale Instrument Facilities | Mark Ellisman, NCMIR; Samara Reck-Peterson, Nikon Imaging Center; Johannes Schoeneberg, Adaptive Optics Lightsheet Microscopy; Kristen Jepsen, Institute for Genomic Medicine; Tami Brown-Brandl, Precision Animal Management | SD, UC, H | I1, I2, I3, I4, I5
Molecular Dynamics | Rommie Amaro, UCSD; Andreas Goetz, SDSC; Jonathan Allen, LLNL | MD, AI, H | I1, I2, I3
Human microbiome | Rob Knight, UCSD | G, AI, H | I1, I2, I3
Genomics & Bioinfor- | Alex Feltus, Clemson | G, AI, H | I3, I4, I5
Fluid Dynamics | Rose Yu, UCSD | AI | I1, I2, I3
Experimental Particle Physics, IAIFI | Phil Harris, MIT | AI, BGS, SD | I1, I2
Computer Vision | Nuno Vasconcelos, UCSD | AI, CV | I3
Computer Graphics | Robert Twomey, UNL | CV, AI | I3
Programmable Storage | Carlos Malzahn, UCSC | SD | I1, I2, I5
AI systems software stack for FPGAs | Hadi Esmaeilzadeh, UCSD | SD | I1, I2
WildFire Analysis & | Ilkay Altintas, UCSD | UC, AI, CV | I3, I4
Key: The NRP innovations column lists those innovations among I1 through I5 listed in Section 2.1 that a given science driver most benefits from.
Incl. 4 campus scale
instrument facilities
Incl. a very diverse
set of sciences and
Lots of AI …
but so much more …
FKW’s Wishlist for the Future
• Growth of NRP infrastructure
- 1,000++ GPUs end of 2022
- 50 PB storage end of 2024
- Growth in diversity of community
§ # and types of campuses and their researchers
• Introduce new capabilities to NRP
- Machine learning at 100TB scale
- Support Domain Specific Architecture R&D on NRP
- Expand NRP into Wireless, Edge, IoT
- Towards “FAIR” on OSDF
• New Directions initiated by the Community
“Domain Specific Architectures”
• I1: Innovative network fabric allowing “composable hardware”.
• I2: Innovative application libraries allowing “domain optimized
architectures” on FPGAs
“end of Moore’s law” motivates new architectures
Mark Papermaster, CTO of AMD
PRISM, a Jump 2.0 project
funded by SRC
is early user of FPGAs@PNRP
John Shalf (2020)
PI, Tajana Rosing
New Data Origins
• The NSF CC* 2022 program awarded 9
campuses with $500k storage awards each.
- We guess this pays for 5PB of storage each.
• Some of these campuses may decide to
integrate their CC* storage into the OSDF.
• Some of these campuses have storage from
other projects that they may integrate with the
OSDF in addition.
• NSF 23-523 includes $500k storage
solicitation again, Spring & Fall 2023.
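The figures above imply a rough price point and an upper bound on new OSDF capacity. The sketch below works through that arithmetic. The 5 PB per award is the slide's own guess, and the fraction of campuses that join the OSDF is unknown, so it is left as a parameter.

```python
def cost_per_tb(award_usd, capacity_pb):
    """Implied cost in USD per TB, using decimal units (1 PB = 1000 TB)."""
    return award_usd / (capacity_pb * 1000)


def aggregate_pb(n_campuses, capacity_pb_each, fraction_joining=1.0):
    """Total capacity in PB if a given fraction of campuses integrate their storage."""
    return n_campuses * capacity_pb_each * fraction_joining


if __name__ == "__main__":
    # $500k buying an estimated 5 PB implies about $100 per TB.
    print("USD per TB:", cost_per_tb(500_000, 5))
    # Upper bound: all 9 campuses integrate all of their CC* storage.
    print("max new capacity (PB):", aggregate_pb(9, 5))
```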
Summary & Conclusions
• PRP ended, and was replaced by NRP
- Significant new capabilities via Cat-II system “PNRP”
§ PNRP provides ops effort for Nautilus for the future
- # of GPUs available doubles in 2022.
§ new GPUs (A10, 3080, 3090, A100) much more powerful than older GPUs
- # of FPGAs increases from a few to a few dozen in 2022.
- # of caches grows by 50% in 22/23
=> more consistent coverage across USA
- Data volume served expected to grow substantially in 23/24/25.
§ How much? As yet too hard to predict.
• Hoping to recruit new partners to build FAIR capabilities on
top of OSDF within the next 5 years.
• Hoping to expand NRP into sensor networks using 5G &
6G in the next 10 years.
• This work was partially supported by the NSF grants OAC-1541349, OAC-1826967, OAC-2030508, OAC-1841530, OAC-2005369, OAC-21121167, CISE-1713149, CISE-2100237, CISE-2120019, OAC-2112167

TrustArc Webinar - 2024 Data Privacy Trends: A Mid-Year Check-In
Scaling Connections in PostgreSQL Postgres Bangalore(PGBLR) Meetup-2 - Mydbops
Scaling Connections in PostgreSQL Postgres Bangalore(PGBLR) Meetup-2 - MydbopsScaling Connections in PostgreSQL Postgres Bangalore(PGBLR) Meetup-2 - Mydbops
Scaling Connections in PostgreSQL Postgres Bangalore(PGBLR) Meetup-2 - Mydbops
Best Programming Language for Civil Engineers
Best Programming Language for Civil EngineersBest Programming Language for Civil Engineers
Best Programming Language for Civil Engineers
Transcript: Details of description part II: Describing images in practice - T...
Transcript: Details of description part II: Describing images in practice - T...Transcript: Details of description part II: Describing images in practice - T...
Transcript: Details of description part II: Describing images in practice - T...
Measuring the Impact of Network Latency at Twitter
Measuring the Impact of Network Latency at TwitterMeasuring the Impact of Network Latency at Twitter
Measuring the Impact of Network Latency at Twitter
20240702 QFM021 Machine Intelligence Reading List June 2024
20240702 QFM021 Machine Intelligence Reading List June 202420240702 QFM021 Machine Intelligence Reading List June 2024
20240702 QFM021 Machine Intelligence Reading List June 2024
20240704 QFM023 Engineering Leadership Reading List June 2024
20240704 QFM023 Engineering Leadership Reading List June 202420240704 QFM023 Engineering Leadership Reading List June 2024
20240704 QFM023 Engineering Leadership Reading List June 2024
Details of description part II: Describing images in practice - Tech Forum 2024
Details of description part II: Describing images in practice - Tech Forum 2024Details of description part II: Describing images in practice - Tech Forum 2024
Details of description part II: Describing images in practice - Tech Forum 2024
Observability For You and Me with OpenTelemetry
Observability For You and Me with OpenTelemetryObservability For You and Me with OpenTelemetry
Observability For You and Me with OpenTelemetry
Understanding Insider Security Threats: Types, Examples, Effects, and Mitigat...
Understanding Insider Security Threats: Types, Examples, Effects, and Mitigat...Understanding Insider Security Threats: Types, Examples, Effects, and Mitigat...
Understanding Insider Security Threats: Types, Examples, Effects, and Mitigat...
WhatsApp Image 2024-03-27 at 08.19.52_bfd93109.pdf
WhatsApp Image 2024-03-27 at 08.19.52_bfd93109.pdfWhatsApp Image 2024-03-27 at 08.19.52_bfd93109.pdf
WhatsApp Image 2024-03-27 at 08.19.52_bfd93109.pdf
Pigging Solutions Sustainability brochure.pdf
Pigging Solutions Sustainability brochure.pdfPigging Solutions Sustainability brochure.pdf
Pigging Solutions Sustainability brochure.pdf
How Social Media Hackers Help You to See Your Wife's Message.pdf
How Social Media Hackers Help You to See Your Wife's Message.pdfHow Social Media Hackers Help You to See Your Wife's Message.pdf
How Social Media Hackers Help You to See Your Wife's Message.pdf
DealBook of Ukraine: 2024 edition
DealBook of Ukraine: 2024 editionDealBook of Ukraine: 2024 edition
DealBook of Ukraine: 2024 edition

Frank Würthwein - NRP and the Path forward

  • 1. NRP & the Path forward Frank Würthwein Director, San Diego Supercomputer Center February 9th 2023
  • 4. Long Term Vision • Create an Open National Cyberinfrastructure that allows the federation of CI at all ~4,000 accredited, degree-granting higher education institutions, non-profit research institutions, and national laboratories. § Open Science § Open Data § Open Source § Open Infrastructure. Openness for an Open Society: Open Compute; Open Storage & CDN; Open devices/instruments/IoT, …?
  • 5. Community vs Funded Projects: Community with Shared Vision. Lots of funded projects that contribute to this shared vision in different ways. We want you to … … grow NRP. … build on NRP. NRP is “owned” and “built” by the community for the community.
  • 6. How is NRP different from OSG? We’ve been doing federated cyberinfrastructure since 2005. Why is NRP even needed given that OSG exists?
  • 7. Cyberinfrastructure Stack (diagram labels: Hardware; IPMI, Firmware, BIOS; Kubernetes; Admiralty; SLURM; HTCondor/OSG). NRP operates at all layers of the stack, from IPMI up • IPMI reduces TCO and lowers the threshold to entry • Kubernetes allows service deployments • Also the natural layer for application container deployment • Admiralty allows K8S federation with folks who want control • Including cloud integration to access TPUs & other cloud-only architectures • HTCondor allows NRP to show up as a “site” in OSG. The layer you integrate at depends on • Control you want • Effort you can afford
  • 8. Complementarity in Implementation of “Bring Your Own Resource” model OSG/PATh focused on campus cluster integration. NRP focused on individual node integration instead of clusters.
  • 9. Cyberinfrastructure Stack (diagram labels: Hardware; IPMI, Firmware, BIOS; Kubernetes; Admiralty; SLURM; HTCondor/OSG). NRP operates at all layers of the stack, from IPMI up • IPMI reduces TCO and lowers the threshold to entry • Kubernetes allows service deployments • Also the natural layer for application container deployment • Admiralty allows K8S federation with folks who want control • Including cloud integration to access TPUs & other cloud-only architectures • HTCondor allows NRP to show up as a site in OSG. Who benefits: • Under-resourced institutions • Network providers and their POPs • CS & ECE faculty specializing in: • AI/ML => gaming GPUs • systems R&D. All of these find it difficult to justify staff to support all layers.
  • 10. Hardware on NRP is Global: NRP integrates hardware in USA, EU, and Asia
  • 11. Cyberinfrastructure Stack (diagram labels: Hardware; IPMI, Firmware, BIOS; Kubernetes). NRP operates at all layers of the stack, from IPMI up • IPMI reduces TCO and lowers the threshold to entry • Kubernetes allows service deployments • Also the natural layer for application container deployment • Admiralty allows K8S federation with folks who want control • Including cloud integration to access TPUs & other cloud-only architectures • HTCondor allows NRP to show up as a site in OSG. Global services: • Open Science Data Federation • Origins & Caches in US, EU, Asia • Protein Data Bank • (Future) Replicas in EU & Asia. NRP is unique in its support of global service deployments. OSDF ops by PATh; PDB ops by PDB ???
  • 12. Supporting Nautilus for the next decade Nautilus = K8S infrastructure of PRP for the last 5+ years Nautilus = K8S of NRP for the next 10 years
  • 13. The NSF Cat-II Program • NSF supports novel systems ideas via the Cat-II program. - 3 year “testbed” phase § The PI owns the resource, and has (some) freedom regarding who uses it. · No requirements for making it available via any specific allocation mechanism. § It is expected that not all features work on day 1. · 3 years of experimentation & development of features - 2 year “allocation” phase § The resource is made available via an NSF supported allocation mechanism. - The solicitation mentions the possibility of an additional 5 year renewal without re-competition if the system is successful. • We decided that this is an ideal program to try and secure NRP core operations funding for the next 10 years - And thus provide the stability necessary for growth of NRP.
  • 14. Cat-II: Prototype National Research Platform (PNRP). Funded as NSF 2112167. [Figure labels: 80GB A100, A10; 64, 288.] 5 year project with $5M hardware & $6.45M people. Supports Nautilus, and thus the core NRP infrastructure. Promises to build on “PRP” functionality, and go beyond. NSF Acceptance Review of System scheduled for March 8 & 9, 2023. PI = Wuerthwein; Co-PIs: DeFanti, Rosing, Tatineni, Weitzel
  • 15. Innovations • I1: Innovative network fabric that allows a “rack” of hardware to behave like a single “node” connected via PCIe. • I2: Innovative application libraries to expose FPGA hardware to science apps via language constructs scientists understand (C, C++ rather than firmware). • I3: A “Bring Your Own Resource” model that allows campuses nationwide to join their resources to the system. • I4: Innovative scheduling to support urgent computing, including interactive use via Jupyter. • I5: Innovative Data Infrastructure, including a national scale Content Delivery System like YouTube for science. I3 & I4 & I5 turn “PRP” into “NRP” and sustain it into the future. I1 & I2 are totally new.
  • 16. Data Infrastructure Model of NRP • Support regional Ceph storage systems across the USA. - Campuses can join individual storage hosts to the Ceph system in their region. - All regional storage systems are Origins in the OSG Data Federation (OSDF). - Deploy a replication system such that researchers can decide what part of their namespace should be in which regional storage. • Deploy caches in the Internet2 backbone such that no campus nationwide is more than 500 miles from a cache. NRP data infrastructure model combines the best of PRP & OSG: from PRP we take the regional Ceph storage concept; from OSG/PATh we take the data origin & caching concepts. And then we add as a totally new feature: user controlled replication of partial namespaces across regions. (We will develop this during the 3 year “testbed” phase.) We want others to build higher level data services on top.
  • 17. Matrix of Science x Innovations. NRP. The networks of researchers we create will use the NRP systems innovations to achieve broader impact, with the help from our collaborators, users, staff, Co-PIs, and PI.
    Table 3.1 Representative Science and Engineering Use Cases
    Application domain | Lead researcher & Institution | Science Driver Themes | NRP Innovations
    LIGO | Peter Couvares, LIGO Lab; Erik Katsavounidis, MIT | BGS, UC, AI | I2, I3, I4, I5
    IceCube | Benedikt Riedel, UW Madison | BGS, UC, AI | I3, I4
    Astronomy (DKIST & Sky Surveys) | Curt Dodds, U. Hawai’i | BGS, AI | I3, I5
    Campus Scale Instrument Facilities | Mark Ellisman, NCMIR; Samara Reck-Peterson, Nikon Imaging Center; Johannes Schoeneberg, Adaptive Optics Lightsheet Microscopy; Kristen Jepsen, Institute for Genomic Medicine; Tami Brown-Brandl, Precision Animal Management | SD, UC, H | I1, I2, I3, I4, I5
    Molecular Dynamics | Rommie Amaro, UCSD; Andreas Goetz, SDSC; Jonathan Allen, LLNL | MD, AI, H | I1, I2, I3
    Human microbiome | Rob Knight, UCSD | G, AI, H | I1, I2, I3
    Genomics & Bioinformatics | Alex Feltus, Clemson | G, AI, H | I3, I4, I5
    Fluid Dynamics | Rose Yu, UCSD | AI | I1, I2, I3
    Experimental Particle Physics, IAIFI | Phil Harris, MIT | AI, BGS, SD | I1, I2
    Computer Vision | Nuno Vasconcelos, UCSD | AI, CV | I3
    Computer Graphics | Robert Twomey, UNL | CV, AI | I3
    Programmable Storage | Carlos Maltzahn, UCSC | SD | I1, I2, I5
    AI systems software stack for FPGAs | Hadi Esmaeilzadeh, UCSD | SD | I1, I2
    WildFire Analysis & Prediction | Ilkay Altintas, UCSD | UC, AI, CV | I3, I4
    Key: The NRP innovations column lists those innovations among I1 through I5 listed in Section 2.1 that a given science driver most benefits from.
    Slide annotations: NSF MREFCs; incl. 4 campus scale instrument facilities; incl. a very diverse set of sciences and engineering; lots of AI … but so much more …
  • 18. FKW’s Wishlist for the Future • Growth of NRP infrastructure - 1,000++ GPUs end of 2022 - 50 PB storage end of 2024 - Growth in diversity of community § # and types of campuses and their researchers • Introduce new capabilities to NRP - Machine learning at 100TB scale - Support Domain Specific Architecture R&D on NRP - Expand NRP into Wireless, Edge, IoT - Towards “FAIR” on OSDF • New Directions initiated by the Community
  • 19. “Domain Specific Architectures” • I1: Innovative network fabric allowing “composable hardware”. • I2: Innovative application libraries allowing “domain optimized architectures” on FPGAs. The “end of Moore’s law” motivates new architectures (figure credits: Mark Papermaster, CTO of AMD; John Shalf (2020)). PRISM, a JUMP 2.0 project funded by SRC (PI: Tajana Rosing), is an early user of FPGAs@PNRP.
  • 20. New Data Origins • The NSF CC* 2022 program awarded 9 campuses with $500k storage awards each. - We guess this pays for 5PB of storage each. • Some of these campuses may decide to integrate their CC* storage into the OSDF. • Some of these campuses have storage from other projects that they may integrate with the OSDF in addition. • NSF 23-523 includes a $500k storage solicitation again, Spring & Fall 2023.
  • 21. Summary & Conclusions • PRP ended, and was replaced by NRP - Significant new capabilities via Cat-II system “PNRP” § PNRP provides ops effort for Nautilus for the future - # of GPUs available doubles in 2022. § New GPUs (A10, 3080, 3090, A100) are much more powerful than older GPUs - # of FPGAs increases from a few to a few dozen in 2022. - # of caches grows by 50% in 22/23 => more consistent coverage across USA - Data volume served expected to grow substantially in 23/24/25. § How much? As yet too hard to predict. • Hoping to recruit new partners to build FAIR capabilities on top of OSDF within the next 5 years. • Hoping to expand NRP into sensor networks using 5G & 6G in the next 10 years.
  • 22. Acknowledgements • This work was partially supported by the NSF grants OAC-1541349, OAC-1826967, OAC-2030508, OAC-1841530, OAC-2005369, OAC-21121167, CISE-1713149, CISE-2100237, CISE-2120019, OAC-2112167
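
The cache placement goal on slide 16 (no campus more than 500 miles from a cache) can be checked with a great-circle distance calculation. The following is a minimal sketch in Python, not part of the talk: the function names, the site names, and the coordinates are all hypothetical and only illustrate the idea.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def uncovered_campuses(campuses, caches, max_miles=500.0):
    """Return names of campuses whose nearest cache is farther than max_miles."""
    missing = []
    for name, (lat, lon) in campuses.items():
        nearest = min(
            haversine_miles(lat, lon, clat, clon)
            for clat, clon in caches.values()
        )
        if nearest > max_miles:
            missing.append(name)
    return missing


# Hypothetical example data (approximate city coordinates, not actual NRP sites).
caches = {
    "cache-san-diego": (32.72, -117.16),
    "cache-chicago": (41.88, -87.63),
}
campuses = {
    "campus-los-angeles": (34.05, -118.24),
    "campus-madison": (43.07, -89.40),
    "campus-honolulu": (21.31, -157.86),
}

print(uncovered_campuses(campuses, caches))
```

With this made-up data, only the Honolulu entry falls outside the 500 mile radius, which is the kind of gap that would call for an additional cache.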