Running a GPU burst for Multi-Messenger Astrophysics
with IceCube across all available GPUs in the Cloud
Frank Würthwein
OSG Executive Director
UCSD/SDSC
Jensen Huang keynote @ SC19
2
The Largest Cloud Simulation in History
50k NVIDIA GPUs in the Cloud
350 Petaflops for 2 hours
Distributed across US, Europe & Asia
On the Saturday morning before SC19 we bought all the GPU capacity that was for sale in
Amazon Web Services, Microsoft Azure, and Google Cloud Platform worldwide.
How did we get here?
Annual IceCube GPU use via OSG
4
Peaked at ~3000 GPUs for a day.
Last 12 months
OSG supports global operations of IceCube.
IceCube made a long-term investment into
dHTC as their computing paradigm.
We produced ~3% of the annual photon
propagation simulations in a ~2h cloud burst.
A long-term partnership between
IceCube, OSG, HTCondor, … led to this cloud burst.

The Science Case
IceCube
6
A cubic kilometer of ice at the
South Pole is instrumented
with 5160 optical sensors.
Astrophysics:
• Discovery of astrophysical neutrinos
• First evidence of neutrino point source (TXS)
• Cosmic rays with surface detector
Particle Physics:
• Atmospheric neutrino oscillation
• Neutrino cross sections at TeV scale
• New physics searches at highest energies
Earth Science:
• Glaciology
• Earth tomography
A facility with very
diverse science goals
Restrict this talk to
high energy Astrophysics
High Energy Astrophysics Science
case for IceCube
7
The universe is opaque to light
at the highest energies and
distances.
Only gravitational waves
and neutrinos can pinpoint
the most violent events in the
universe.
Fortunately, the highest energy
neutrinos are of cosmic origin.
They are effectively “background free” as long
as the energy is measured correctly.
High energy neutrinos from
outside the solar system
8
First 28 very high energy neutrinos from outside the solar system
Red curve is the photon flux
spectrum measured with the
Fermi satellite.
Black points show the
corresponding high energy
neutrino flux spectrum
measured by IceCube.
This demonstrates both the opaqueness of the universe to high energy
photons, and the ability of IceCube to detect neutrinos above the maximum
energy at which we can see light, due to this opaqueness.
Science 342 (2013). DOI:
10.1126/science.1242856

Understanding the Origin
9
We now know high energy events happen in the universe. What are they?
p + γ → Δ⁺ → p + π⁰ → p + γγ
p + γ → Δ⁺ → n + π⁺ → n + μ⁺ + νμ
Aya Ishihara
The hypothesis:
The same cosmic events produce
neutrinos and photons
We detect the electrons or muons from neutrinos that interact in the ice.
Neutrinos interact very weakly => we need a very large instrumented volume of ice
to maximize the chance that a cosmic neutrino interacts inside the detector.
We need pointing accuracy to point back to the origin of the neutrino.
Telescopes the world over then try to identify the source in the direction
IceCube is pointing for that neutrino: Multi-messenger Astrophysics.
The ν detection challenge
10
[Figure: optical properties of the ice (Aya Ishihara)]
Ice properties change with
depth and wavelength
Observed pointing resolution at high
energies is systematics limited.
Central value moves
for different ice models
Improved e and τ reconstruction
=> increased neutrino flux
detection
=> more observations
Photon propagation through
ice runs efficiently on
single-precision GPUs.
Detailed simulation campaigns
to improve pointing resolution
by improving ice model.
Improvement in reconstruction with
better ice model near the detectors
First evidence of an origin
11
First location of a source of very high energy neutrinos.
A neutrino produced a high energy muon
near IceCube. The muon produced light as it
traversed the IceCube volume. The light was
detected by IceCube’s array of phototubes.
IceCube alerted the astronomy community of the
observation of a single high energy neutrino on
September 22, 2017.
A blazar designated by astronomers as TXS
0506+056 was subsequently identified as the most likely
source in the direction IceCube was pointing. Multiple
telescopes saw light from TXS at the same time
IceCube saw the neutrino.
Science 361, 147-151
(2018). DOI:10.1126/science.aat2890
IceCube’s Future Plans
12
[Timeline figure from “IceCube Upgrade and Gen2”, Summer Blot, TeVPA 2018: the IceCube-Gen2 Facility, covering MeV- to EeV-scale physics with a surface array, high energy array, radio array, PINGU, and IC86; preliminary timeline running from 2016 through ~2032 with R&D, design & approval, construction, and IceCube Upgrade deployment phases.]
Near term:
add more phototubes to the deep core to increase the granularity of measurements.
Longer term:
• Extend the instrumented volume at smaller granularity.
• Extend the even-smaller-granularity deep core volume.
• Add surface array.
Improve detector for low & high energy neutrinos

Details on the Cloud Burst
The Idea
• Integrate all GPUs available for sale
worldwide into a single HTCondor pool.
- use 28 regions across AWS, Azure, and Google
Cloud for a burst of a couple of hours, or so.
• IceCube submits their photon propagation
workflow to this HTCondor pool.
- we handle the input, the jobs on the GPUs, and
the output as a single globally distributed system.
14
Run a GPU burst relevant in scale
for future Exascale HPC systems.
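To make the submit side concrete, here is a minimal sketch, using the HTCondor Python bindings, of what a single-GPU photon propagation job submitted to such a pool could look like. This is an illustration, not the actual IceCube submit description; the executable and file names are placeholders.

import htcondor  # HTCondor Python bindings, shipped with HTCondor

# Submit description for one GPU job: request exactly one GPU, a little CPU
# and memory, and let HTCondor transfer the files.
job = htcondor.Submit({
    "executable": "propagate_photons.sh",   # placeholder wrapper around the GPU simulation
    "arguments": "input_00042.i3.zst",      # in the real workflow, one input file per job
    "request_gpus": "1",
    "request_cpus": "1",
    "request_memory": "4GB",
    "should_transfer_files": "YES",
    "output": "job_$(Cluster)_$(Process).out",
    "error": "job_$(Cluster)_$(Process).err",
    "log": "burst.log",
})

print(job)  # prints the equivalent plain-text submit description

# Submitting a batch to one of the schedds would look roughly like this
# (the exact call signature varies with the HTCondor version):
#   schedd = htcondor.Schedd()
#   schedd.submit(job, count=1000)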
A global HTCondor pool
• IceCube, like all OSG user communities, relies on
HTCondor for resource orchestration
- This demo used the standard tools
• Dedicated HW setup
- Avoid disruption of OSG production system
- Optimize HTCondor setup for the spiky nature of the demo
- multiple schedds for IceCube to submit to
- collecting resources in each cloud region, then collecting from all
regions into global pool
15
HTCondor Distributed CI
16
[Diagram: per-region Collectors report into a global Collector and Negotiator;
IceCube submits through multiple Schedulers (10 schedd’s) to VMs provisioned
in the cloud regions.]
One global resource pool
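As a rough sketch of how the health of such a tiered pool could be watched from the outside, the snippet below queries each regional collector for the GPU slots it advertises and sums them into a global count. The collector hostnames are placeholders, and the TotalGPUs attribute assumes GPU detection is enabled on the execute nodes.

import htcondor

# Placeholder hostnames, one collector per cloud region.
REGIONAL_COLLECTORS = [
    "collector-aws-us-east.example.org",
    "collector-azure-eu-west.example.org",
    "collector-gcp-asia-east.example.org",
]

total_gpus = 0
for host in REGIONAL_COLLECTORS:
    coll = htcondor.Collector(host)
    # Query the machine (startd) ads and read how many GPUs each slot advertises.
    ads = coll.query(htcondor.AdTypes.Startd, projection=["Name", "TotalGPUs"])
    gpus = sum(int(ad.get("TotalGPUs", 0)) for ad in ads)
    print(f"{host}: {gpus} GPUs")
    total_gpus += gpus

print(f"Global pool: {total_gpus} GPUs")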

Using native Cloud storage
• Input data pre-staged into native Cloud storage
- Each file in one-to-few Cloud regions
- some replication to deal with limited predictability of resources per region
- Local to compute for large regions for maximum throughput
- Reading from “close” region for smaller ones to minimize ops
• Output staged back to region-local Cloud storage
• Deployed simple wrappers around Cloud native file
transfer tools (sketched below)
- IceCube jobs do not need to customize for different Clouds
- They just need to know where input data is available
(pretty standard OSG operation mode)
17
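As an illustration of what such a thin wrapper can look like (not the actual tooling used in the demo), the sketch below selects the vendor-native CLI from the URL scheme, so a job only needs to know where its input lives. The bucket names, and the use of azcopy for Azure, are assumptions.

import subprocess
from urllib.parse import urlparse

def fetch(url: str, dest: str) -> None:
    """Download one object from cloud storage using the provider's own CLI."""
    scheme = urlparse(url).scheme
    if scheme == "s3":            # AWS S3
        cmd = ["aws", "s3", "cp", url, dest]
    elif scheme == "gs":          # Google Cloud Storage
        cmd = ["gsutil", "cp", url, dest]
    elif scheme == "https":       # assume an Azure blob endpoint, copied via azcopy
        cmd = ["azcopy", "copy", url, dest]
    else:
        raise ValueError(f"unsupported storage scheme: {scheme}")
    subprocess.run(cmd, check=True)

# Example (placeholder bucket and object names):
# fetch("s3://icecube-demo-us-east-1/input/file_00042.i3.zst", "input.i3.zst")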
The Testing Ahead of Time
18
~250,000 single-threaded jobs
run across 28 cloud regions
during 80 minutes.
Peaked at 90,000
jobs running.
Up to 60k jobs started in ~10 min.
Regions across US, EU, and
Asia were used in this test.
Demonstrated burst capability
of our infrastructure on CPUs.
Want scale of GPU burst to be limited
only by # of GPUs available for sale.
Science with 51,000 GPUs
achieved as peak performance
19
[Plot: GPUs in use vs. time in minutes; each color is a different
cloud region in US, EU, or Asia. Total of 28 regions in use.]
Peaked at 51,500 GPUs
~380 Petaflops of fp32
8 generations of NVIDIA GPUs used.
Summary of stats at peak
A Heterogeneous Resource Pool
20
28 cloud Regions across 4 world regions
providing us with 8 GPU generations.
No one region or GPU type dominates!

Science Produced
21
The Distributed High-Throughput
Computing (dHTC) paradigm
implemented via HTCondor provides
global resource aggregation.
The largest cloud region provided 10.8% of the total.
The dHTC paradigm can aggregate
on-prem resources anywhere,
HPC at any scale,
and multiple clouds.
Performance and Cost Analysis
Performance vs GPU type
23
42% of the science was done on V100 in 19% of the wall time.
IceCube Performance/$$$
24
GPU                           V100     P100     T4       M60      RTX 2080 Ti  GTX 1080 Ti
Relative TFLOP32              100%     67%      57%      34%*     82%          62%
Relative IceCube Performance  100%     56%      48%      30%*     70%          48%
Relative Science/$**          1.1-1.4  1.1-1.3  1.7-2.1  1.0-1.3  N/A          N/A

*per CUDA device   **spot market prices; range indicates cost differences between cloud vendors
IceCube performance scales better than TFLOP32 for high-end GPUs.
Science/$$$ is ~x1.5 better for T4 than for V100.
Price differential between vendors: ~10-30%.
Price differential on-demand vs spot: ~x3.
Aside: Science/$$$ for an on-prem 2080 Ti is ~x3 better than for V100.
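For intuition, the arithmetic behind a “Relative Science/$” row is sketched below: relative IceCube throughput divided by an hourly spot price, normalized to the V100. The prices used here are illustrative placeholders, not the actual 2019 spot prices, so the printed ratios will not reproduce the table.

# Relative IceCube performance from the table above (V100 = 100%).
relative_icecube_perf = {"V100": 1.00, "P100": 0.56, "T4": 0.48}

# Placeholder hourly spot prices in $/hour; real prices differ by vendor and date.
spot_price_per_hour = {"V100": 0.90, "P100": 0.45, "T4": 0.25}

science_per_dollar = {
    gpu: relative_icecube_perf[gpu] / spot_price_per_hour[gpu]
    for gpu in relative_icecube_perf
}

baseline = science_per_dollar["V100"]
for gpu, value in science_per_dollar.items():
    print(f"{gpu}: {value / baseline:.2f}x science per dollar relative to V100")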

IceCube and dHTC
dHTC = distributed High Throughput Computing
IceCube Input Segmentable
26
IceCube prepared two types of input files that differed
by x10 in the number of input events per file.
Small files were processed by K80 and K520 GPUs, large files by all other GPU types.
[Histograms: per-job runtime distributions, in seconds, for the two file types.]
A total of 10.2 billion events were processed across ~175,000 GPU jobs.
Each job fetched a file from cloud storage to local storage, processed that file, and wrote
the output back to cloud storage. For ¼ of the regions, cloud storage was not local to the
region. => We could probably have avoided data replication across regions, given the
excellent networking between regions for each provider.
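A minimal sketch of that per-job workflow, with placeholder commands standing in for the real IceCube executable and for the cloud-transfer wrappers from the storage slide:

import subprocess
import sys
import time

def run_job(input_url: str, output_url: str) -> None:
    """Fetch one input file, run the GPU photon propagation on it, upload the result."""
    t0 = time.time()
    subprocess.run(["fetch", input_url, "input.i3.zst"], check=True)        # cloud -> local disk
    subprocess.run(["run_photon_propagation",                               # GPU work (placeholder)
                    "input.i3.zst", "output.i3.zst"], check=True)
    subprocess.run(["upload", "output.i3.zst", output_url], check=True)     # local disk -> cloud
    print(f"job finished in {time.time() - t0:.0f} s", file=sys.stderr)

if __name__ == "__main__":
    run_job(sys.argv[1], sys.argv[2])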
Applicability beyond IceCube
• All the large instruments we know of
- LHC, LIGO, DUNE, LSST, …
• Any midscale instrument we can think of
- XENON, GlueX, Clas12, Nova, DES, Cryo-EM, …
• A large fraction of Deep Learning
- But not all of it …
• Basically, anything that has bundles of
independently schedulable jobs that can be
partitioned to adjust workloads to have 0.5 to
a few hour runtimes on modern GPUs (see the sketch below).
27
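As a toy illustration of that partitioning, the sketch below picks how many events to pack into one input file so that a job lands near a one-hour runtime on a given GPU. The per-event times are made-up placeholders, not measured IceCube numbers.

TARGET_RUNTIME_S = 3600  # aim for roughly one hour of GPU time per job

# Assumed seconds of GPU time per simulated event (placeholder values).
SECONDS_PER_EVENT = {
    "V100": 0.02,
    "T4": 0.04,
    "K80": 0.12,
}

def events_per_file(gpu_type: str) -> int:
    """How many events to bundle into one input file for this GPU type."""
    return int(TARGET_RUNTIME_S / SECONDS_PER_EVENT[gpu_type])

for gpu in SECONDS_PER_EVENT:
    print(f"{gpu}: ~{events_per_file(gpu):,} events per file")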
Cost to support cloud as a “24x7”
capability
• Today, roughly $15k per 300 PFLOP32 hour
• This burst was executed by 2 people
- Igor Sfiligoi (SDSC) to support the infrastructure.
- David Schultz (UW Madison) to create and submit the
IceCube workflows.
- A “David”-type person is needed also for on-prem science workflows.
• To make this a routine operations capability for any
open science that is dHTC capable would require
another 50% FTE “Cloud Budget Manager”.
- There is substantial effort involved in just dealing with cost &
budgets for a large community of scientists.
28

IceCube is ready for Exascale
• Humanity has built extraordinary instruments by pooling
human and financial resources globally.
• The computing for these large collaborations fits perfectly into
the cloud, or into scheduling holes in Exascale HPC systems, due
to its “ingeniously parallel” nature. => dHTC
• The dHTC computing paradigm applies to a wide range of
problems across all of open science.
- We are happy to repeat this with anybody willing to spend $50k in the
clouds.
29
Contact us at: help@opensciencegrid.org
Or me personally at: fkw@ucsd.edu
Demonstrated elastic burst at 51,500 GPUs
IceCube is ready for Exascale
Acknowledgements
• This work was partially supported by the
NSF grants OAC-1941481, MPS-1148698,
OAC-1841530, and OAC-1826967
30

More Related Content

What's hot

Using A100 MIG to Scale Astronomy Scientific Output
Using A100 MIG to Scale Astronomy Scientific OutputUsing A100 MIG to Scale Astronomy Scientific Output
Using A100 MIG to Scale Astronomy Scientific Output
Igor Sfiligoi
 
Using commercial Clouds to process IceCube jobs
Using commercial Clouds to process IceCube jobsUsing commercial Clouds to process IceCube jobs
Using commercial Clouds to process IceCube jobs
Igor Sfiligoi
 
Managing Cloud networking costs for data-intensive applications by provisioni...
Managing Cloud networking costs for data-intensive applications by provisioni...Managing Cloud networking costs for data-intensive applications by provisioni...
Managing Cloud networking costs for data-intensive applications by provisioni...
Igor Sfiligoi
 
GeoSpatially enabling your Spark and Accumulo clusters with LocationTech
GeoSpatially enabling your Spark and Accumulo clusters with LocationTechGeoSpatially enabling your Spark and Accumulo clusters with LocationTech
GeoSpatially enabling your Spark and Accumulo clusters with LocationTech
Rob Emanuele
 
inGeneoS: Intercontinental Genetic sequencing over trans-Pacific networks and...
inGeneoS: Intercontinental Genetic sequencing over trans-Pacific networks and...inGeneoS: Intercontinental Genetic sequencing over trans-Pacific networks and...
inGeneoS: Intercontinental Genetic sequencing over trans-Pacific networks and...
Andrew Howard
 
Federated HPC Clouds applied to Radiation Therapy
Federated HPC Clouds applied to Radiation TherapyFederated HPC Clouds applied to Radiation Therapy
Federated HPC Clouds applied to Radiation Therapy
CESGA Centro de Supercomputación de Galicia
 
The OpenStack Cloud at CERN - OpenStack Nordic
The OpenStack Cloud at CERN - OpenStack NordicThe OpenStack Cloud at CERN - OpenStack Nordic
The OpenStack Cloud at CERN - OpenStack Nordic
Tim Bell
 
OpenStack at CERN : A 5 year perspective
OpenStack at CERN : A 5 year perspectiveOpenStack at CERN : A 5 year perspective
OpenStack at CERN : A 5 year perspective
Tim Bell
 
OpenStack @ CERN, by Tim Bell
OpenStack @ CERN, by Tim BellOpenStack @ CERN, by Tim Bell
OpenStack @ CERN, by Tim Bell
Amrita Prasad
 
20170926 cern cloud v4
20170926 cern cloud v420170926 cern cloud v4
20170926 cern cloud v4
Tim Bell
 
XeMPUPiL: Towards Performance-aware Power Capping Orchestrator for the Xen Hy...
XeMPUPiL: Towards Performance-aware Power Capping Orchestrator for the Xen Hy...XeMPUPiL: Towards Performance-aware Power Capping Orchestrator for the Xen Hy...
XeMPUPiL: Towards Performance-aware Power Capping Orchestrator for the Xen Hy...
NECST Lab @ Politecnico di Milano
 
Cycle Computing Record-breaking Petascale HPC Run
Cycle Computing Record-breaking Petascale HPC RunCycle Computing Record-breaking Petascale HPC Run
Cycle Computing Record-breaking Petascale HPC Run
inside-BigData.com
 
20150924 rda federation_v1
20150924 rda federation_v120150924 rda federation_v1
20150924 rda federation_v1
Tim Bell
 
20181219 ucc open stack 5 years v3
20181219 ucc open stack 5 years v320181219 ucc open stack 5 years v3
20181219 ucc open stack 5 years v3
Tim Bell
 
Differential data processing for energy efficiency of wireless sensor networks
Differential data processing for energy efficiency of wireless sensor networksDifferential data processing for energy efficiency of wireless sensor networks
Differential data processing for energy efficiency of wireless sensor networks
Daniel Lim
 
Toward a National Research Platform
Toward a National Research PlatformToward a National Research Platform
Toward a National Research Platform
Larry Smarr
 
Empowering Congress with Data-Driven Analytics (BDT304) | AWS re:Invent 2013
Empowering Congress with Data-Driven Analytics (BDT304) | AWS re:Invent 2013Empowering Congress with Data-Driven Analytics (BDT304) | AWS re:Invent 2013
Empowering Congress with Data-Driven Analytics (BDT304) | AWS re:Invent 2013
Amazon Web Services
 
Towards Exascale Simulations of Stellar Explosions with FLASH
Towards Exascale  Simulations of Stellar  Explosions with FLASHTowards Exascale  Simulations of Stellar  Explosions with FLASH
Towards Exascale Simulations of Stellar Explosions with FLASH
Ganesan Narayanasamy
 
How HPC and large-scale data analytics are transforming experimental science
How HPC and large-scale data analytics are transforming experimental scienceHow HPC and large-scale data analytics are transforming experimental science
How HPC and large-scale data analytics are transforming experimental science
inside-BigData.com
 
20161025 OpenStack at CERN Barcelona
20161025 OpenStack at CERN Barcelona20161025 OpenStack at CERN Barcelona
20161025 OpenStack at CERN Barcelona
Tim Bell
 

What's hot (20)

Using A100 MIG to Scale Astronomy Scientific Output
Using A100 MIG to Scale Astronomy Scientific OutputUsing A100 MIG to Scale Astronomy Scientific Output
Using A100 MIG to Scale Astronomy Scientific Output
 
Using commercial Clouds to process IceCube jobs
Using commercial Clouds to process IceCube jobsUsing commercial Clouds to process IceCube jobs
Using commercial Clouds to process IceCube jobs
 
Managing Cloud networking costs for data-intensive applications by provisioni...
Managing Cloud networking costs for data-intensive applications by provisioni...Managing Cloud networking costs for data-intensive applications by provisioni...
Managing Cloud networking costs for data-intensive applications by provisioni...
 
GeoSpatially enabling your Spark and Accumulo clusters with LocationTech
GeoSpatially enabling your Spark and Accumulo clusters with LocationTechGeoSpatially enabling your Spark and Accumulo clusters with LocationTech
GeoSpatially enabling your Spark and Accumulo clusters with LocationTech
 
inGeneoS: Intercontinental Genetic sequencing over trans-Pacific networks and...
inGeneoS: Intercontinental Genetic sequencing over trans-Pacific networks and...inGeneoS: Intercontinental Genetic sequencing over trans-Pacific networks and...
inGeneoS: Intercontinental Genetic sequencing over trans-Pacific networks and...
 
Federated HPC Clouds applied to Radiation Therapy
Federated HPC Clouds applied to Radiation TherapyFederated HPC Clouds applied to Radiation Therapy
Federated HPC Clouds applied to Radiation Therapy
 
The OpenStack Cloud at CERN - OpenStack Nordic
The OpenStack Cloud at CERN - OpenStack NordicThe OpenStack Cloud at CERN - OpenStack Nordic
The OpenStack Cloud at CERN - OpenStack Nordic
 
OpenStack at CERN : A 5 year perspective
OpenStack at CERN : A 5 year perspectiveOpenStack at CERN : A 5 year perspective
OpenStack at CERN : A 5 year perspective
 
OpenStack @ CERN, by Tim Bell
OpenStack @ CERN, by Tim BellOpenStack @ CERN, by Tim Bell
OpenStack @ CERN, by Tim Bell
 
20170926 cern cloud v4
20170926 cern cloud v420170926 cern cloud v4
20170926 cern cloud v4
 
XeMPUPiL: Towards Performance-aware Power Capping Orchestrator for the Xen Hy...
XeMPUPiL: Towards Performance-aware Power Capping Orchestrator for the Xen Hy...XeMPUPiL: Towards Performance-aware Power Capping Orchestrator for the Xen Hy...
XeMPUPiL: Towards Performance-aware Power Capping Orchestrator for the Xen Hy...
 
Cycle Computing Record-breaking Petascale HPC Run
Cycle Computing Record-breaking Petascale HPC RunCycle Computing Record-breaking Petascale HPC Run
Cycle Computing Record-breaking Petascale HPC Run
 
20150924 rda federation_v1
20150924 rda federation_v120150924 rda federation_v1
20150924 rda federation_v1
 
20181219 ucc open stack 5 years v3
20181219 ucc open stack 5 years v320181219 ucc open stack 5 years v3
20181219 ucc open stack 5 years v3
 
Differential data processing for energy efficiency of wireless sensor networks
Differential data processing for energy efficiency of wireless sensor networksDifferential data processing for energy efficiency of wireless sensor networks
Differential data processing for energy efficiency of wireless sensor networks
 
Toward a National Research Platform
Toward a National Research PlatformToward a National Research Platform
Toward a National Research Platform
 
Empowering Congress with Data-Driven Analytics (BDT304) | AWS re:Invent 2013
Empowering Congress with Data-Driven Analytics (BDT304) | AWS re:Invent 2013Empowering Congress with Data-Driven Analytics (BDT304) | AWS re:Invent 2013
Empowering Congress with Data-Driven Analytics (BDT304) | AWS re:Invent 2013
 
Towards Exascale Simulations of Stellar Explosions with FLASH
Towards Exascale  Simulations of Stellar  Explosions with FLASHTowards Exascale  Simulations of Stellar  Explosions with FLASH
Towards Exascale Simulations of Stellar Explosions with FLASH
 
How HPC and large-scale data analytics are transforming experimental science
How HPC and large-scale data analytics are transforming experimental scienceHow HPC and large-scale data analytics are transforming experimental science
How HPC and large-scale data analytics are transforming experimental science
 
20161025 OpenStack at CERN Barcelona
20161025 OpenStack at CERN Barcelona20161025 OpenStack at CERN Barcelona
20161025 OpenStack at CERN Barcelona
 

Similar to "Building and running the cloud GPU vacuum cleaner"

A Campus-Scale High Performance Cyberinfrastructure is Required for Data-Int...
A Campus-Scale High Performance Cyberinfrastructure is Required for Data-Int...A Campus-Scale High Performance Cyberinfrastructure is Required for Data-Int...
A Campus-Scale High Performance Cyberinfrastructure is Required for Data-Int...
Larry Smarr
 
Science and Cyberinfrastructure in the Data-Dominated Era
Science and Cyberinfrastructure in the Data-Dominated EraScience and Cyberinfrastructure in the Data-Dominated Era
Science and Cyberinfrastructure in the Data-Dominated Era
Larry Smarr
 
Toward a Global Interactive Earth Observing Cyberinfrastructure
Toward a Global Interactive Earth Observing CyberinfrastructureToward a Global Interactive Earth Observing Cyberinfrastructure
Toward a Global Interactive Earth Observing Cyberinfrastructure
Larry Smarr
 
The National Research Platform Enables a Growing Diversity of Users and Appl...
The National Research Platform Enables a Growing Diversity of Users and Appl...The National Research Platform Enables a Growing Diversity of Users and Appl...
The National Research Platform Enables a Growing Diversity of Users and Appl...
Larry Smarr
 
Global Research Platforms: Past, Present, Future
Global Research Platforms: Past, Present, FutureGlobal Research Platforms: Past, Present, Future
Global Research Platforms: Past, Present, Future
Larry Smarr
 
Q4 2016 GeoTrellis Presentation
Q4 2016 GeoTrellis PresentationQ4 2016 GeoTrellis Presentation
Q4 2016 GeoTrellis Presentation
Rob Emanuele
 
The OptIPuter as a Prototype for CalREN-XD
The OptIPuter as a Prototype for CalREN-XDThe OptIPuter as a Prototype for CalREN-XD
The OptIPuter as a Prototype for CalREN-XD
Larry Smarr
 
High Performance Cyberinfrastructure is Needed to Enable Data-Intensive Scien...
High Performance Cyberinfrastructure is Needed to Enable Data-Intensive Scien...High Performance Cyberinfrastructure is Needed to Enable Data-Intensive Scien...
High Performance Cyberinfrastructure is Needed to Enable Data-Intensive Scien...
Larry Smarr
 
Thoughts on Cybersecurity
Thoughts on CybersecurityThoughts on Cybersecurity
Thoughts on Cybersecurity
Frank Wuerthwein
 
How to Terminate the GLIF by Building a Campus Big Data Freeway System
How to Terminate the GLIF by Building a Campus Big Data Freeway SystemHow to Terminate the GLIF by Building a Campus Big Data Freeway System
How to Terminate the GLIF by Building a Campus Big Data Freeway System
Larry Smarr
 
High Performance Cyberinfrastructure Enabling Data-Driven Science in the Biom...
High Performance Cyberinfrastructure Enabling Data-Driven Science in the Biom...High Performance Cyberinfrastructure Enabling Data-Driven Science in the Biom...
High Performance Cyberinfrastructure Enabling Data-Driven Science in the Biom...
Larry Smarr
 
Project StarGate An End-to-End 10Gbps HPC to User Cyberinfrastructure ANL * C...
Project StarGate An End-to-End 10Gbps HPC to User Cyberinfrastructure ANL * C...Project StarGate An End-to-End 10Gbps HPC to User Cyberinfrastructure ANL * C...
Project StarGate An End-to-End 10Gbps HPC to User Cyberinfrastructure ANL * C...
Larry Smarr
 
Applying Photonics to User Needs: The Application Challenge
Applying Photonics to User Needs: The Application ChallengeApplying Photonics to User Needs: The Application Challenge
Applying Photonics to User Needs: The Application Challenge
Larry Smarr
 
OptIPuter Overview
OptIPuter OverviewOptIPuter Overview
OptIPuter Overview
Larry Smarr
 
PRP, NRP, GRP & the Path Forward
PRP, NRP, GRP & the Path ForwardPRP, NRP, GRP & the Path Forward
PRP, NRP, GRP & the Path Forward
Larry Smarr
 
LambdaGrids--Earth and Planetary Sciences Driving High Performance Networks a...
LambdaGrids--Earth and Planetary Sciences Driving High Performance Networks a...LambdaGrids--Earth and Planetary Sciences Driving High Performance Networks a...
LambdaGrids--Earth and Planetary Sciences Driving High Performance Networks a...
Larry Smarr
 
Looking Back, Looking Forward NSF CI Funding 1985-2025
Looking Back, Looking Forward NSF CI Funding 1985-2025Looking Back, Looking Forward NSF CI Funding 1985-2025
Looking Back, Looking Forward NSF CI Funding 1985-2025
Larry Smarr
 
Berkeley cloud computing meetup may 2020
Berkeley cloud computing meetup may 2020Berkeley cloud computing meetup may 2020
Berkeley cloud computing meetup may 2020
Larry Smarr
 
The Academic and R&D Sectors' Current and Future Broadband and Fiber Access N...
The Academic and R&D Sectors' Current and Future Broadband and Fiber Access N...The Academic and R&D Sectors' Current and Future Broadband and Fiber Access N...
The Academic and R&D Sectors' Current and Future Broadband and Fiber Access N...
Larry Smarr
 
The Pacific Research Platform Connects to CSU San Bernardino
The Pacific Research Platform Connects to CSU San BernardinoThe Pacific Research Platform Connects to CSU San Bernardino
The Pacific Research Platform Connects to CSU San Bernardino
Larry Smarr
 

Similar to "Building and running the cloud GPU vacuum cleaner" (20)

A Campus-Scale High Performance Cyberinfrastructure is Required for Data-Int...
A Campus-Scale High Performance Cyberinfrastructure is Required for Data-Int...A Campus-Scale High Performance Cyberinfrastructure is Required for Data-Int...
A Campus-Scale High Performance Cyberinfrastructure is Required for Data-Int...
 
Science and Cyberinfrastructure in the Data-Dominated Era
Science and Cyberinfrastructure in the Data-Dominated EraScience and Cyberinfrastructure in the Data-Dominated Era
Science and Cyberinfrastructure in the Data-Dominated Era
 
Toward a Global Interactive Earth Observing Cyberinfrastructure
Toward a Global Interactive Earth Observing CyberinfrastructureToward a Global Interactive Earth Observing Cyberinfrastructure
Toward a Global Interactive Earth Observing Cyberinfrastructure
 
The National Research Platform Enables a Growing Diversity of Users and Appl...
The National Research Platform Enables a Growing Diversity of Users and Appl...The National Research Platform Enables a Growing Diversity of Users and Appl...
The National Research Platform Enables a Growing Diversity of Users and Appl...
 
Global Research Platforms: Past, Present, Future
Global Research Platforms: Past, Present, FutureGlobal Research Platforms: Past, Present, Future
Global Research Platforms: Past, Present, Future
 
Q4 2016 GeoTrellis Presentation
Q4 2016 GeoTrellis PresentationQ4 2016 GeoTrellis Presentation
Q4 2016 GeoTrellis Presentation
 
The OptIPuter as a Prototype for CalREN-XD
The OptIPuter as a Prototype for CalREN-XDThe OptIPuter as a Prototype for CalREN-XD
The OptIPuter as a Prototype for CalREN-XD
 
High Performance Cyberinfrastructure is Needed to Enable Data-Intensive Scien...
High Performance Cyberinfrastructure is Needed to Enable Data-Intensive Scien...High Performance Cyberinfrastructure is Needed to Enable Data-Intensive Scien...
High Performance Cyberinfrastructure is Needed to Enable Data-Intensive Scien...
 
Thoughts on Cybersecurity
Thoughts on CybersecurityThoughts on Cybersecurity
Thoughts on Cybersecurity
 
How to Terminate the GLIF by Building a Campus Big Data Freeway System
How to Terminate the GLIF by Building a Campus Big Data Freeway SystemHow to Terminate the GLIF by Building a Campus Big Data Freeway System
How to Terminate the GLIF by Building a Campus Big Data Freeway System
 
High Performance Cyberinfrastructure Enabling Data-Driven Science in the Biom...
High Performance Cyberinfrastructure Enabling Data-Driven Science in the Biom...High Performance Cyberinfrastructure Enabling Data-Driven Science in the Biom...
High Performance Cyberinfrastructure Enabling Data-Driven Science in the Biom...
 
Project StarGate An End-to-End 10Gbps HPC to User Cyberinfrastructure ANL * C...
Project StarGate An End-to-End 10Gbps HPC to User Cyberinfrastructure ANL * C...Project StarGate An End-to-End 10Gbps HPC to User Cyberinfrastructure ANL * C...
Project StarGate An End-to-End 10Gbps HPC to User Cyberinfrastructure ANL * C...
 
Applying Photonics to User Needs: The Application Challenge
Applying Photonics to User Needs: The Application ChallengeApplying Photonics to User Needs: The Application Challenge
Applying Photonics to User Needs: The Application Challenge
 
OptIPuter Overview
OptIPuter OverviewOptIPuter Overview
OptIPuter Overview
 
PRP, NRP, GRP & the Path Forward
PRP, NRP, GRP & the Path ForwardPRP, NRP, GRP & the Path Forward
PRP, NRP, GRP & the Path Forward
 
LambdaGrids--Earth and Planetary Sciences Driving High Performance Networks a...
LambdaGrids--Earth and Planetary Sciences Driving High Performance Networks a...LambdaGrids--Earth and Planetary Sciences Driving High Performance Networks a...
LambdaGrids--Earth and Planetary Sciences Driving High Performance Networks a...
 
Looking Back, Looking Forward NSF CI Funding 1985-2025
Looking Back, Looking Forward NSF CI Funding 1985-2025Looking Back, Looking Forward NSF CI Funding 1985-2025
Looking Back, Looking Forward NSF CI Funding 1985-2025
 
Berkeley cloud computing meetup may 2020
Berkeley cloud computing meetup may 2020Berkeley cloud computing meetup may 2020
Berkeley cloud computing meetup may 2020
 
The Academic and R&D Sectors' Current and Future Broadband and Fiber Access N...
The Academic and R&D Sectors' Current and Future Broadband and Fiber Access N...The Academic and R&D Sectors' Current and Future Broadband and Fiber Access N...
The Academic and R&D Sectors' Current and Future Broadband and Fiber Access N...
 
The Pacific Research Platform Connects to CSU San Bernardino
The Pacific Research Platform Connects to CSU San BernardinoThe Pacific Research Platform Connects to CSU San Bernardino
The Pacific Research Platform Connects to CSU San Bernardino
 

Recently uploaded

Science-Technology Quiz (School Quiz 2024)
Science-Technology Quiz (School Quiz 2024)Science-Technology Quiz (School Quiz 2024)
Science-Technology Quiz (School Quiz 2024)
Kashyap J
 
TOPIC: INTRODUCTION TO FORENSIC SCIENCE.pptx
TOPIC: INTRODUCTION TO FORENSIC SCIENCE.pptxTOPIC: INTRODUCTION TO FORENSIC SCIENCE.pptx
TOPIC: INTRODUCTION TO FORENSIC SCIENCE.pptx
imansiipandeyy
 
lipids_233455668899076544553879848657.pptx
lipids_233455668899076544553879848657.pptxlipids_233455668899076544553879848657.pptx
lipids_233455668899076544553879848657.pptx
muralinath2
 
Probing the northern Kaapvaal craton root with mantle-derived xenocrysts from...
Probing the northern Kaapvaal craton root with mantle-derived xenocrysts from...Probing the northern Kaapvaal craton root with mantle-derived xenocrysts from...
Probing the northern Kaapvaal craton root with mantle-derived xenocrysts from...
James AH Campbell
 
Computer aided biopharmaceutical characterization
Computer aided biopharmaceutical characterizationComputer aided biopharmaceutical characterization
Computer aided biopharmaceutical characterization
souravpaul769171
 
gastrointestinal hormonese I 45678633134668097636903278.pptx
gastrointestinal hormonese I 45678633134668097636903278.pptxgastrointestinal hormonese I 45678633134668097636903278.pptx
gastrointestinal hormonese I 45678633134668097636903278.pptx
muralinath2
 
A slightly oblate dark matter halo revealed by a retrograde precessing Galact...
A slightly oblate dark matter halo revealed by a retrograde precessing Galact...A slightly oblate dark matter halo revealed by a retrograde precessing Galact...
A slightly oblate dark matter halo revealed by a retrograde precessing Galact...
Sérgio Sacani
 
ScieNCE grade 08 Lesson 1 and 2 NLC.pptx
ScieNCE grade 08 Lesson 1 and 2 NLC.pptxScieNCE grade 08 Lesson 1 and 2 NLC.pptx
ScieNCE grade 08 Lesson 1 and 2 NLC.pptx
JoanaBanasen1
 
MACRAMÉ-ChiPs: Patchwork Project Family & Sibling Projects (24th Meeting of t...
MACRAMÉ-ChiPs: Patchwork Project Family & Sibling Projects (24th Meeting of t...MACRAMÉ-ChiPs: Patchwork Project Family & Sibling Projects (24th Meeting of t...
MACRAMÉ-ChiPs: Patchwork Project Family & Sibling Projects (24th Meeting of t...
Steffi Friedrichs
 
Electrostatic force class 8 physics .pdf
Electrostatic force class 8 physics .pdfElectrostatic force class 8 physics .pdf
Electrostatic force class 8 physics .pdf
yokeswarikannan123
 
Summer program introduction in Yunnan university
Summer program introduction in Yunnan universitySummer program introduction in Yunnan university
Summer program introduction in Yunnan university
Hayato Shimabukuro
 
Transmission Spectroscopy of the Habitable Zone Exoplanet LHS 1140 b with JWS...
Transmission Spectroscopy of the Habitable Zone Exoplanet LHS 1140 b with JWS...Transmission Spectroscopy of the Habitable Zone Exoplanet LHS 1140 b with JWS...
Transmission Spectroscopy of the Habitable Zone Exoplanet LHS 1140 b with JWS...
Sérgio Sacani
 
ANTIGENS_.pptx ( Ranjitha SL) PRESENTATION SLIDE
ANTIGENS_.pptx ( Ranjitha SL) PRESENTATION SLIDEANTIGENS_.pptx ( Ranjitha SL) PRESENTATION SLIDE
ANTIGENS_.pptx ( Ranjitha SL) PRESENTATION SLIDE
RanjithaSL
 
GIT hormones- II_12345677809876543235780963.pptx
GIT hormones- II_12345677809876543235780963.pptxGIT hormones- II_12345677809876543235780963.pptx
GIT hormones- II_12345677809876543235780963.pptx
muralinath2
 
SCIENCEgfvhvhvkjkbbjjbbjvhvhvhvjkvjvjvjj.pptx
SCIENCEgfvhvhvkjkbbjjbbjvhvhvhvjkvjvjvjj.pptxSCIENCEgfvhvhvkjkbbjjbbjvhvhvhvjkvjvjvjj.pptx
SCIENCEgfvhvhvkjkbbjjbbjvhvhvhvjkvjvjvjj.pptx
WALTONMARBRUCAL
 
SlideEgg-703870-Bird Migration PPT Presentation.pptx
SlideEgg-703870-Bird Migration PPT Presentation.pptxSlideEgg-703870-Bird Migration PPT Presentation.pptx
SlideEgg-703870-Bird Migration PPT Presentation.pptx
randaalmabrouk
 
Liver & Gall Bladder 23098463278654387654328765439875.pptx
Liver & Gall Bladder 23098463278654387654328765439875.pptxLiver & Gall Bladder 23098463278654387654328765439875.pptx
Liver & Gall Bladder 23098463278654387654328765439875.pptx
muralinath2
 
Science grade 09 Lesson1-2 NLC-pptx.pptx
Science grade 09 Lesson1-2 NLC-pptx.pptxScience grade 09 Lesson1-2 NLC-pptx.pptx
Science grade 09 Lesson1-2 NLC-pptx.pptx
JoanaBanasen1
 
Search for Dark Matter Ionization on the Night Side of Jupiter with Cassini
Search for Dark Matter Ionization on the Night Side of Jupiter with CassiniSearch for Dark Matter Ionization on the Night Side of Jupiter with Cassini
Search for Dark Matter Ionization on the Night Side of Jupiter with Cassini
Sérgio Sacani
 
Anatomy, and reproduction of Gnetum.pptx
Anatomy, and reproduction of Gnetum.pptxAnatomy, and reproduction of Gnetum.pptx
Anatomy, and reproduction of Gnetum.pptx
karthiksaran8
 

Recently uploaded (20)

Science-Technology Quiz (School Quiz 2024)
Science-Technology Quiz (School Quiz 2024)Science-Technology Quiz (School Quiz 2024)
Science-Technology Quiz (School Quiz 2024)
 
TOPIC: INTRODUCTION TO FORENSIC SCIENCE.pptx
TOPIC: INTRODUCTION TO FORENSIC SCIENCE.pptxTOPIC: INTRODUCTION TO FORENSIC SCIENCE.pptx
TOPIC: INTRODUCTION TO FORENSIC SCIENCE.pptx
 
lipids_233455668899076544553879848657.pptx
lipids_233455668899076544553879848657.pptxlipids_233455668899076544553879848657.pptx
lipids_233455668899076544553879848657.pptx
 
Probing the northern Kaapvaal craton root with mantle-derived xenocrysts from...
Probing the northern Kaapvaal craton root with mantle-derived xenocrysts from...Probing the northern Kaapvaal craton root with mantle-derived xenocrysts from...
Probing the northern Kaapvaal craton root with mantle-derived xenocrysts from...
 
Computer aided biopharmaceutical characterization
Computer aided biopharmaceutical characterizationComputer aided biopharmaceutical characterization
Computer aided biopharmaceutical characterization
 
gastrointestinal hormonese I 45678633134668097636903278.pptx
gastrointestinal hormonese I 45678633134668097636903278.pptxgastrointestinal hormonese I 45678633134668097636903278.pptx
gastrointestinal hormonese I 45678633134668097636903278.pptx
 
A slightly oblate dark matter halo revealed by a retrograde precessing Galact...
A slightly oblate dark matter halo revealed by a retrograde precessing Galact...A slightly oblate dark matter halo revealed by a retrograde precessing Galact...
A slightly oblate dark matter halo revealed by a retrograde precessing Galact...
 
ScieNCE grade 08 Lesson 1 and 2 NLC.pptx
ScieNCE grade 08 Lesson 1 and 2 NLC.pptxScieNCE grade 08 Lesson 1 and 2 NLC.pptx
ScieNCE grade 08 Lesson 1 and 2 NLC.pptx
 
MACRAMÉ-ChiPs: Patchwork Project Family & Sibling Projects (24th Meeting of t...
MACRAMÉ-ChiPs: Patchwork Project Family & Sibling Projects (24th Meeting of t...MACRAMÉ-ChiPs: Patchwork Project Family & Sibling Projects (24th Meeting of t...
MACRAMÉ-ChiPs: Patchwork Project Family & Sibling Projects (24th Meeting of t...
 
Electrostatic force class 8 physics .pdf
Electrostatic force class 8 physics .pdfElectrostatic force class 8 physics .pdf
Electrostatic force class 8 physics .pdf
 
Summer program introduction in Yunnan university
Summer program introduction in Yunnan universitySummer program introduction in Yunnan university
Summer program introduction in Yunnan university
 
Transmission Spectroscopy of the Habitable Zone Exoplanet LHS 1140 b with JWS...
Transmission Spectroscopy of the Habitable Zone Exoplanet LHS 1140 b with JWS...Transmission Spectroscopy of the Habitable Zone Exoplanet LHS 1140 b with JWS...
Transmission Spectroscopy of the Habitable Zone Exoplanet LHS 1140 b with JWS...
 
ANTIGENS_.pptx ( Ranjitha SL) PRESENTATION SLIDE
ANTIGENS_.pptx ( Ranjitha SL) PRESENTATION SLIDEANTIGENS_.pptx ( Ranjitha SL) PRESENTATION SLIDE
ANTIGENS_.pptx ( Ranjitha SL) PRESENTATION SLIDE
 
GIT hormones- II_12345677809876543235780963.pptx
GIT hormones- II_12345677809876543235780963.pptxGIT hormones- II_12345677809876543235780963.pptx
GIT hormones- II_12345677809876543235780963.pptx
 
SCIENCEgfvhvhvkjkbbjjbbjvhvhvhvjkvjvjvjj.pptx
SCIENCEgfvhvhvkjkbbjjbbjvhvhvhvjkvjvjvjj.pptxSCIENCEgfvhvhvkjkbbjjbbjvhvhvhvjkvjvjvjj.pptx
SCIENCEgfvhvhvkjkbbjjbbjvhvhvhvjkvjvjvjj.pptx
 
SlideEgg-703870-Bird Migration PPT Presentation.pptx
SlideEgg-703870-Bird Migration PPT Presentation.pptxSlideEgg-703870-Bird Migration PPT Presentation.pptx
SlideEgg-703870-Bird Migration PPT Presentation.pptx
 
Liver & Gall Bladder 23098463278654387654328765439875.pptx
Liver & Gall Bladder 23098463278654387654328765439875.pptxLiver & Gall Bladder 23098463278654387654328765439875.pptx
Liver & Gall Bladder 23098463278654387654328765439875.pptx
 
Science grade 09 Lesson1-2 NLC-pptx.pptx
Science grade 09 Lesson1-2 NLC-pptx.pptxScience grade 09 Lesson1-2 NLC-pptx.pptx
Science grade 09 Lesson1-2 NLC-pptx.pptx
 
Search for Dark Matter Ionization on the Night Side of Jupiter with Cassini
Search for Dark Matter Ionization on the Night Side of Jupiter with CassiniSearch for Dark Matter Ionization on the Night Side of Jupiter with Cassini
Search for Dark Matter Ionization on the Night Side of Jupiter with Cassini
 
Anatomy, and reproduction of Gnetum.pptx
Anatomy, and reproduction of Gnetum.pptxAnatomy, and reproduction of Gnetum.pptx
Anatomy, and reproduction of Gnetum.pptx
 

"Building and running the cloud GPU vacuum cleaner"

  • 1. Running a GPU burst for Multi- Messenger Astrophysics with IceCube across all available GPUs in the Cloud Frank Würthwein OSG Executive Director UCSD/SDSC
  • 2. Jensen Huang keynote @ SC19 2 The Largest Cloud Simulation in History 50k NVIDIA GPUs in the Cloud 350 Petaflops for 2 hours Distributed across US, Europe & Asia Saturday morning before SC19 we bought all GPU capacity that was for sale in Amazon Web Services, Microsoft Azure, and Google Cloud Platform worldwide
  • 3. How did we get here?
  • 4. Annual IceCube GPU use via OSG 4 Peaked at ~3000 GPUs for a day. Last 12 months OSG supports global operations of IceCube. IceCube made long term investment into dHTC as their computing paradigm. We produced ~3% of the annual photon propagation simulations in a ~2h cloud burst. Longterm Partnership between IceCube, OSG, HTCondor, … lead to this cloud burst.
  • 6. IceCube 6 A cubic kilometer of ice at the south pole is instrumented with 5160 optical sensors. Astrophysics: • Discovery of astrophysical neutrinos • First evidence of neutrino point source (TXS) • Cosmic rays with surface detector Particle Physics: • Atmospheric neutrino oscillation • Neutrino cross sections at TeV scale • New physics searches at highest energies Earth Science: • Glaciology • Earth tomography A facility with very diverse science goals Restrict this talk to high energy Astrophysics
  • 7. High Energy Astrophysics Science case for IceCube 7 Universe is opaque to light at highest energies and distances. Only gravitational waves and neutrinos can pinpoint most violent events in universe. Fortunately, highest energy neutrinos are of cosmic origin. Effectively “background free” as long as energy is measured correctly.
  • 8. High energy neutrinos from outside the solar system 8 First 28 very high energy neutrinos from outside the solar system Red curve is the photon flux spectrum measured with the Fermi satellite. Black points show the corresponding high energy neutrino flux spectrum measured by IceCube. This demonstrates both the opaqueness of the universe to high energy photons, and the ability of IceCube to detect neutrinos above the maximum energy we can see light due to this opaqueness. Science 342 (2013). DOI: 10.1126/science.1242856
  • 9. Understanding the Origin 9 We now know high energy events happen in the universe. What are they? The photoproduction chains (credit: Aya Ishihara): p + γ → Δ⁺ → p + π⁰, with π⁰ → γγ, and p + γ → Δ⁺ → n + π⁺, with π⁺ → μ⁺ + ν_μ. The hypothesis: the same cosmic events produce neutrinos and photons. We detect the electrons or muons from neutrinos that interact in the ice. Neutrinos interact very weakly => need a very large volume of instrumented ice to maximize the chance that a cosmic neutrino interacts inside the detector. Need pointing accuracy to point back to the origin of the neutrino. Telescopes the world over then try to identify the source in the direction IceCube is pointing to for that neutrino. Multi-messenger Astrophysics
  • 10. The ν detection challenge 10 [Figure: optical detection properties, credit Aya Ishihara.] Ice properties change with depth and wavelength. Observed pointing resolution at high energies is systematics limited. The central value moves for different ice models. Improved e and τ reconstruction => increased neutrino flux detection => more observations. Photon propagation through ice runs efficiently on single-precision GPUs. Detailed simulation campaigns to improve pointing resolution by improving the ice model. Improvement in reconstruction with a better ice model near the detectors.
  • 11. First evidence of an origin 11 First location of a source of very high energy neutrinos. A neutrino produced a high energy muon near IceCube. The muon produced light as it traversed the IceCube volume. The light was detected by IceCube's array of phototubes. IceCube alerted the astronomy community of the observation of a single high energy neutrino on September 22, 2017. A blazar designated by astronomers as TXS 0506+056 was subsequently identified as the most likely source in the direction IceCube was pointing. Multiple telescopes saw light from TXS at the same time IceCube saw the neutrino. Science 361, 147-151 (2018). DOI: 10.1126/science.aat2890
  • 12. IceCube's Future Plans 12 [Timeline figure from "IceCube Upgrade and Gen2", Summer Blot, TeVPA 2018: the IceCube-Gen2 facility covers MeV- to EeV-scale physics with a surface air shower array, high energy array, radio array, and PINGU on top of IC86; R&D, design & approval, construction, and IceCube Upgrade deployment span roughly 2016 through 2032.] Near term: add more phototubes to deep core to increase the granularity of measurements. Longer term: • Extend the instrumented volume at smaller granularity. • Extend the even smaller granularity deep core volume. • Add a surface array. Improve the detector for low & high energy neutrinos.
  • 13. Details on the Cloud Burst
  • 14. The Idea • Integrate all GPUs available for sale worldwide into a single HTCondor pool. => use 28 regions across AWS, Azure, and Google Cloud for a burst of a couple of hours or so. • IceCube submits its photon propagation workflow to this HTCondor pool. => we handle the input, the jobs on the GPUs, and the output as a single globally distributed system. 14 Run a GPU burst relevant in scale for future Exascale HPC systems.
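To make the submission side concrete, here is a minimal sketch using the HTCondor Python bindings; the executable name, resource requests, and job count are placeholders rather than IceCube's production values, and the bindings' API details vary between HTCondor releases.

```python
# Hedged sketch: queueing single-GPU jobs (e.g. photon propagation) to an
# HTCondor schedd via the Python bindings. All names and values are placeholders.
import htcondor

job = htcondor.Submit({
    "executable": "run_photon_prop.sh",   # placeholder wrapper script
    "arguments": "$(Process)",            # real jobs map this to an input file
    "request_gpus": "1",                  # one GPU per job
    "request_cpus": "1",
    "request_memory": "4GB",
    "should_transfer_files": "YES",
    "when_to_transfer_output": "ON_EXIT",
    "output": "job_$(Cluster)_$(Process).out",
    "error": "job_$(Cluster)_$(Process).err",
    "log": "photon_prop.log",
})

schedd = htcondor.Schedd()                # one of the dedicated schedds
result = schedd.submit(job, count=1000)   # queue 1000 jobs in one cluster
print("Submitted cluster", result.cluster())
```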
  • 15. A global HTCondor pool • IceCube, like all OSG user communities, relies on HTCondor for resource orchestration – this demo used the standard tools. • Dedicated HW setup – avoid disruption of the OSG production system – optimize the HTCondor setup for the spiky nature of the demo: multiple schedds for IceCube to submit to; collectors in each cloud region, then aggregation from all regions into the global pool. 15
  • 16. HTCondor Distributed CI 16 [Architecture diagram: per-region collectors feed a global collector and negotiator; IceCube submits through multiple schedulers to VMs in the clouds.] 10 schedds, one global resource pool.
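As a rough illustration of what "one global resource pool" means operationally, a monitoring script could query the top-level collector for advertised GPU slots; the pool address and the CloudRegion attribute below are assumptions for the sketch, not the actual names used in the demo.

```python
# Hedged sketch: counting advertised GPUs in the global pool, grouped by region.
# "gpu-pool.example.org" and the "CloudRegion" attribute are placeholders.
from collections import Counter
import htcondor

collector = htcondor.Collector("gpu-pool.example.org")
slots = collector.query(
    htcondor.AdTypes.Startd,
    constraint="TotalGPUs > 0",
    projection=["Machine", "TotalGPUs", "CloudRegion"],
)

by_region = Counter()
for ad in slots:
    by_region[ad.get("CloudRegion", "unknown")] += int(ad.get("TotalGPUs", 0))

for region, gpus in by_region.most_common():
    print(f"{region:20s} {gpus:6d} GPUs")
```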
  • 17. Using native Cloud storage • Input data pre-staged into native Cloud storage – each file in one-to-few Cloud regions – some replication to deal with the limited predictability of resources per region – local to compute for large regions for maximum throughput – reading from a "close" region for smaller ones to minimize ops. • Output staged back to region-local Cloud storage. • Deployed simple wrappers around Cloud native file transfer tools – IceCube jobs do not need to customize for different Clouds – they just need to know where input data is available (pretty standard OSG operation mode). 17
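The slide mentions "simple wrappers around Cloud native file transfer tools" without showing them; a minimal sketch of the idea, assuming the standard aws/gsutil/azcopy CLIs are installed, might look like this (the dispatch rules and file names are illustrative, not the actual wrappers).

```python
# Hedged sketch of a cloud-agnostic copy wrapper: pick the vendor-native CLI
# based on the URL scheme, so jobs never need cloud-specific logic.
import subprocess
from urllib.parse import urlparse

def cloud_copy(src: str, dst: str) -> None:
    """Copy a file to/from cloud storage using the matching native tool."""
    scheme = urlparse(src if "://" in src else dst).scheme
    if scheme == "s3":                      # AWS S3
        cmd = ["aws", "s3", "cp", src, dst]
    elif scheme == "gs":                    # Google Cloud Storage
        cmd = ["gsutil", "cp", src, dst]
    elif scheme == "https":                 # assumed to be an Azure blob URL
        cmd = ["azcopy", "copy", src, dst]
    else:
        raise ValueError(f"unsupported scheme: {scheme}")
    subprocess.run(cmd, check=True)

# Example usage (placeholder bucket and file names):
# cloud_copy("s3://bucket/inputs/file_00042.dat", "./input.dat")
# cloud_copy("./output.dat", "s3://bucket/outputs/file_00042.dat")
```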
  • 18. The Testing Ahead of Time 18 ~250,000 single-threaded jobs were run across 28 cloud regions over 80 minutes. Peak of 90,000 jobs running; up to 60k jobs started in ~10 min. Regions across the US, EU, and Asia were used in this test. Demonstrated the burst capability of our infrastructure on CPUs. We want the scale of the GPU burst to be limited only by the number of GPUs available for sale.
  • 19. Science with 51,000 GPUs achieved as peak performance 19 [Plot of GPUs in use vs. time in minutes; each color is a different cloud region in the US, EU, or Asia.] Total of 28 regions in use. Peaked at 51,500 GPUs, ~380 petaflops of fp32. 8 generations of NVIDIA GPUs used. Summary of stats at peak.
  • 20. A Heterogeneous Resource Pool 20 28 cloud regions across 4 world regions provided us with 8 GPU generations. No one region or GPU type dominates!
  • 21. Science Produced 21 The distributed High-Throughput Computing (dHTC) paradigm, implemented via HTCondor, provides global resource aggregation. The largest cloud region provided 10.8% of the total. The dHTC paradigm can aggregate on-prem resources anywhere, HPC at any scale, and multiple clouds.
  • 23. Performance vs GPU type 23 42% of the science was done on V100 in 19% of the wall time.
  • 24. IceCube Performance/$$$ 24
  GPU                           V100     P100     T4       M60      RTX 2080 Ti  GTX 1080 Ti
  Relative TFLOP32              100%     67%      57%      34%*     82%          62%
  Relative IceCube Performance  100%     56%      48%      30%*     70%          48%
  Relative Science/$**          1.1-1.4  1.1-1.3  1.7-2.1  1.0-1.3  N/A          N/A
  *per CUDA device  **spot market prices; range indicates cost differences between cloud vendors
  IceCube performance scales better than TFLOP32 for high-end GPUs. Science/$$$ is ~x1.5 better for T4 than V100. Price differential between vendors: ~10-30%. Price differential on-demand vs spot: ~x3. Aside: Science/$$$ for an on-prem 2080 Ti is ~x3 better than V100.
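For intuition only, the Science/$ row amounts to relative IceCube throughput divided by the hourly price; the snippet below shows that arithmetic with placeholder prices, not the actual spot prices behind the published range.

```python
# Hedged sketch: how a science-per-dollar figure of merit can be derived.
# Relative performance values come from the table above; the hourly prices
# are PLACEHOLDERS for illustration, not the spot prices used in the demo.
relative_icecube_perf = {"V100": 1.00, "P100": 0.56, "T4": 0.48, "M60": 0.30}
placeholder_spot_price = {"V100": 0.90, "P100": 0.50, "T4": 0.28, "M60": 0.30}  # $/hr

baseline = relative_icecube_perf["M60"] / placeholder_spot_price["M60"]
for gpu in relative_icecube_perf:
    science_per_dollar = relative_icecube_perf[gpu] / placeholder_spot_price[gpu]
    print(f"{gpu:5s} relative science/$ ~ {science_per_dollar / baseline:.2f}")
```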
  • 25. IceCube and dHTC dHTC = distributed High Throughput Computing
  • 26. IceCube Input Segmentable 26 IceCube prepared two types of input files that differed by x10 in the number of input events per file. Small files were processed by K80 and K520, large files by all other GPU types. [Runtime histograms, in seconds, for the two file types.] A total of 10.2 billion events were processed across ~175,000 GPU jobs. Each job fetched a file from cloud storage to local storage, processed that file, and wrote the output to cloud storage. For ¼ of the regions, cloud storage was not local to the region => we could probably have avoided data replication across regions, given the excellent networking between regions for each provider.
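Schematically, each of those ~175,000 jobs followed a fetch → process → upload pattern; the sketch below assumes the cloud_copy() helper from the earlier storage sketch is saved as cloud_copy.py, and uses a placeholder executable name rather than IceCube's actual photon-propagation binary.

```python
# Hedged sketch of the per-job flow: stage in one input file, run the
# photon-propagation executable on the local GPU, stage out the result.
# "photon_prop" is a placeholder name; cloud_copy comes from the earlier sketch.
import subprocess
import sys

from cloud_copy import cloud_copy  # helper from the storage wrapper sketch

def run_one_job(input_url: str, output_url: str) -> None:
    cloud_copy(input_url, "input.dat")       # fetch from cloud storage
    subprocess.run(["./photon_prop", "input.dat", "output.dat"], check=True)
    cloud_copy("output.dat", output_url)     # write output back to cloud storage

if __name__ == "__main__":
    run_one_job(sys.argv[1], sys.argv[2])
```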
  • 27. Applicability beyond IceCube • All the large instruments we know of – LHC, LIGO, DUNE, LSST, … • Any midscale instrument we can think of – XENON, GlueX, Clas12, Nova, DES, Cryo-EM, … • A large fraction of Deep Learning – but not all of it … • Basically, anything that has bundles of independently schedulable jobs that can be partitioned to adjust workloads to have 0.5- to few-hour runtimes on modern GPUs. 27
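The last bullet is the key requirement; a back-of-the-envelope helper shows how events-per-file can be chosen to hit a target runtime (the per-event rates below are placeholders, not measured IceCube numbers).

```python
# Hedged sketch: choose events-per-file so one job runs for roughly a target
# wall time on a given GPU class. Rates are placeholders for illustration.
TARGET_RUNTIME_S = 3600            # aim for ~1 hour per job

# Placeholder processing rates (events/second) per GPU class.
events_per_second = {"K80": 1.0, "M60": 3.0, "V100": 10.0}

def events_per_file(gpu: str) -> int:
    """Events to pack into one input file for ~TARGET_RUNTIME_S on this GPU."""
    return int(events_per_second[gpu] * TARGET_RUNTIME_S)

for gpu in events_per_second:
    print(f"{gpu:5s} -> {events_per_file(gpu):6d} events per file")
```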
  • 28. Cost to support cloud as a "24x7" capability • Today, roughly $15k per 300 PFLOP32 hour. • This burst was executed by 2 people – Igor Sfiligoi (SDSC) to support the infrastructure – David Schultz (UW Madison) to create and submit the IceCube workflows. A "David"-type person is also needed for on-prem science workflows. • To make this a routine operations capability for any open science that is dHTC capable would require another 50% FTE "Cloud Budget Manager" – there is substantial effort involved in just dealing with cost & budgets for a large community of scientists. 28
  • 29. IceCube is ready for Exascale • Humanity has built extraordinary instruments by pooling human and financial resources globally. • The computing for these large collaborations fits perfectly into the cloud, or into scheduling holes in Exascale HPC systems, due to its "ingeniously parallel" nature. => dHTC • The dHTC computing paradigm applies to a wide range of problems across all of open science – we are happy to repeat this with anybody willing to spend $50k in the clouds. 29 Contact us at: help@opensciencegrid.org Or me personally at: fkw@ucsd.edu Demonstrated elastic burst at 51,500 GPUs. IceCube is ready for Exascale.
  • 30. Acknowledgements • This work was partially supported by the NSF grants OAC-1941481, MPS-1148698, OAC-1841530, and OAC-1826967 30