The Pacific Research Platform (PRP) is a multi-institutional partnership that establishes a high-capacity "big data freeway system" spanning the University of California campuses and other research universities in California to facilitate rapid data access and sharing between researchers and institutions. Fifteen multi-campus application teams in fields like particle physics, astronomy, earth sciences, biomedicine, and visualization drive the technical design of the PRP over five years. The goal of the PRP is to extend campus "Science DMZ" networks to allow high-speed data movement between research labs, supercomputer centers, and data repositories across campus, regional, national, and international networks without performance degradation.
This document discusses several projects related to connecting research institutions through high-speed networks:
1) The Pacific Research Platform connects campuses in California through a "big data superhighway" funded by NSF from 2015-2020.
2) CHASE-CI adds machine learning capabilities for researchers across 10 campuses in California using NSF-funded GPU resources.
3) A pilot project is using CENIC and Internet2 to connect regional research networks on a national scale, funded by NSF from 2018-2019.
Peering The Pacific Research Platform With The Great Plains Network (Larry Smarr)
The Pacific Research Platform (PRP) connects research institutions across the western United States with high-speed networks to enable data-intensive science collaborations. Key points:
- The PRP connects 15 campuses across California and links to the Great Plains Network, allowing researchers to access remote supercomputers, share large datasets, and collaborate on projects like analyzing data from the Large Hadron Collider.
- The PRP utilizes Science DMZ architectures with dedicated data transfer nodes called FIONAs to achieve high-speed transfer of large files. Kubernetes is used to manage distributed storage and computing resources.
- Early applications include distributed climate modeling, wildfire science, plankton imaging, and cancer genomics.
The Pacific Research Platform Enables Distributed Big-Data Machine-Learning (Larry Smarr)
The Pacific Research Platform enables distributed big data machine learning by connecting scientific instruments, sensors, and supercomputers across California and the United States with high-speed optical networks. Key components include FIONA data transfer nodes that allow fast disk-to-disk transfers near the theoretical maximum, Kubernetes to orchestrate distributed computing resources, and the Nautilus hypercluster which aggregates thousands of CPU cores and GPUs into a unified platform. This infrastructure has accelerated many scientific workflows and supported cutting-edge research in fields such as astronomy, oceanography, climate science, and particle physics.
National Federated Compute Platforms: The Pacific Research Platform (Larry Smarr)
The Pacific Research Platform (PRP) is a multi-institution hypercluster that connects science DMZs across 25 partner campuses using FIONA data transfer nodes and 10-100 Gbps networks. PRP adopted Kubernetes and Rook to orchestrate petabytes of distributed storage and GPUs for data science applications. A CHASE-CI grant added machine learning capabilities. PRP is working to federate with the Open Science Grid and become a prototype for a future National Research Platform connecting regional networks.
Internet & Climate Change: Cyberinfrastructure for a Carbon-Constrained World (Larry Smarr)
- Information and communication technologies (ICT), including the Internet, can play a key role in addressing climate change by enabling efficiency gains across multiple sectors that could cut greenhouse gas emissions by as much as five times ICT's own carbon footprint.
- University campuses can serve as living laboratories for testing green ICT solutions and infrastructure to reduce emissions from buildings, transportation, electricity generation and distribution.
- Advances in machine learning and brain-inspired computing will be necessary to develop low-power exascale supercomputers needed to fully model and simulate climate systems.
The Pacific Research Platform: Building a Distributed Big Data Machine Learning... (Larry Smarr)
This document summarizes Dr. Larry Smarr's invited talk about the Pacific Research Platform (PRP) given at the San Diego Supercomputer Center in April 2019. The PRP is building a distributed big data machine learning supercomputer by connecting high-performance computing and data resources across multiple universities in California and beyond using high-speed networks. It provides researchers with petascale computing power, distributed storage, and tools like Kubernetes to enable collaborative data-intensive science across institutions.
An Integrated Science Cyberinfrastructure for Data-Intensive Research (Larry Smarr)
This document summarizes Dr. Larry Smarr's vision for an integrated science cyberinfrastructure to support data-intensive research. It discusses the exponential growth of digital data and need for dedicated high-bandwidth networks and data repositories. Specific examples are provided of initiatives at UCSD, regional optical networks connecting research institutions, and national projects like the Open Science Grid and Cancer Genomics Hub that are creating cyberinfrastructure to enable data-intensive scientific discovery.
The Pacific Research Platform: a Science-Driven Big-Data Freeway System (Larry Smarr)
The Pacific Research Platform will create a regional "Big Data Freeway System" along the West Coast to support science. It will connect major research institutions with high-speed optical networks, allowing them to share vast amounts of data and computational resources. This will enable new forms of collaborative, data-intensive research for fields like particle physics, astronomy, biomedicine, and earth sciences. The first phase aims to establish a basic networked infrastructure, with later phases advancing capabilities to 100Gbps and beyond with security and distributed technologies.
The Pacific Research Platform (PRP) is a multi-institutional cyberinfrastructure project that connects researchers across California and beyond to share large datasets. It spans the 10 University of California campuses, major private research universities, supercomputer centers, and some out-of-state universities. Fifteen multi-campus research teams in fields like physics, astronomy, earth sciences, biomedicine, and multimedia will drive the technical needs of the PRP over five years. The goal is to create a "big data freeway" to allow high-speed sharing of data between research labs, supercomputers, and repositories across multiple networks without performance loss over long distances.
Toward a Global Interactive Earth Observing Cyberinfrastructure (Larry Smarr)
The document discusses the need for a new generation of cyberinfrastructure to support interactive global earth observation. It outlines several prototyping projects that are building examples of systems enabling real-time control of remote instruments, remote data access and analysis. These projects are driving the development of an emerging cyber-architecture using web and grid services to link distributed data repositories and simulations.
Opening Keynote Lecture
15th Annual ON*VECTOR International Photonics Workshop
Calit2’s Qualcomm Institute
University of California, San Diego
February 29, 2016
Calit2 as a Model for Collaborative Innovation (Larry Smarr)
- Calit2 was established in 2000 as a collaborative research institute between UCSD and UCI to bring together faculty from different disciplines to work on emerging technologies through multidisciplinary teams.
- It has over 1000 researchers working across both campuses in fields like nanotechnology, biomedicine, digital arts and more.
- Calit2 has established numerous partnerships internationally and in industry, and has facilities like clean rooms, virtual reality labs and more that enable cutting edge research.
- One example is how Calit2 worked with NASA to reduce the time to receive satellite images during wildfires, and has since used VR to help plan fire response.
The document summarizes the creation and evolution of Calit2, the California Institute for Telecommunications and Information Technology, a partnership between UC San Diego and UC Irvine. It describes how Calit2 was established in 2000 with a mission to explore how emerging technologies could transform applications through interdisciplinary research. With support from the state and industry partners, Calit2 has grown facilities and research projects in areas like networking, virtual reality, biomedicine, and more recently brain-inspired computing and machine learning.
CENIC: Pacific Wave and PRP Update Big News for Big Data (Larry Smarr)
The document discusses the Pacific Wave exchange and Pacific Research Platform (PRP). It provides an overview of Pacific Wave, including its history and connectivity across the Pacific and western US. It then discusses how the PRP will build on infrastructure projects to create a high-speed "big data freeway" for science across California universities. This will allow researchers to more easily share and analyze large datasets for projects in areas like climate modeling, cancer genomics, astronomy and particle physics. Details are provided on specific science applications and datasets that will benefit from the enhanced connectivity of the PRP.
OptIPuter-A High Performance SOA LambdaGrid Enabling Scientific Applications (Larry Smarr)
07.03.21
IEEE Computer Society Tsutomu Kanai Award Keynote
At the Joint Meeting of the: 8th International Symposium on Autonomous Decentralized Systems
2nd International Workshop on Ad Hoc, Sensor and P2P Networks
11th IEEE International Workshop on Future Trends of Distributed Computing Systems
Title: OptIPuter-A High Performance SOA LambdaGrid Enabling Scientific Applications
Sedona, AZ
High Performance Cyberinfrastructure for Data-Intensive Research (Larry Smarr)
This document summarizes a lecture given by Dr. Larry Smarr on high performance cyberinfrastructure for data-intensive research. The summary discusses:
1) The need for dedicated high-bandwidth networks separate from the shared internet to enable big data research due to the increasing volume of digital scientific data.
2) Extensions being made to networks like CENIC in California to provide campus "Big Data Freeways" connecting instruments, computing resources, and remote facilities.
3) The use of networks like HPWREN to provide high-performance wireless access for data-intensive applications in rural areas like astronomy, wildfire detection, and more.
Positioning University of California Information Technology for the Future: State, National, and International IT Infrastructure Trends and Directions (Larry Smarr)
05.02.15
Invited Talk
The Vice Chancellor of Research and Chief Information Officer Summit
“Information Technology Enabling Research at the University of California”
Title: Positioning University of California Information Technology for the Future: State, National, and International IT Infrastructure Trends and Directions
Oakland, CA
- The Pacific Research Platform (PRP) interconnects campus DMZs across multiple institutions to provide high-speed connectivity for data-intensive research.
- The PRP utilizes specialized data transfer nodes called FIONAs that provide disk-to-disk transfer speeds of 10-100 Gbps.
- Early applications of the PRP include distributing telescope data between UC campuses, connecting particle physics experiments to computing resources, and enabling real-time wildfire sensor data analysis.
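To make the value of those transfer rates concrete, here is a back-of-the-envelope sketch in Python (the dataset size and link efficiency are illustrative assumptions, not PRP measurements) of how long a 1 TB dataset takes to move at a shared 1 Gbps campus rate versus FIONA-class rates:

```python
# Illustrative arithmetic: time to move a large dataset at various line rates.
# Dataset size and efficiency are example numbers, not PRP measurements.

def transfer_hours(dataset_bytes: float, rate_gbps: float, efficiency: float = 0.8) -> float:
    """Hours to move dataset_bytes at rate_gbps, assuming a fraction
    `efficiency` of the nominal line rate is achieved end-to-end."""
    bits = dataset_bytes * 8
    seconds = bits / (rate_gbps * 1e9 * efficiency)
    return seconds / 3600

one_tb = 1e12  # 1 terabyte in bytes
for rate in (1, 10, 100):  # Gbps
    print(f"1 TB at {rate:>3} Gbps: {transfer_hours(one_tb, rate):6.2f} hours")
```

At these assumed numbers, the same dataset drops from hours on a shared 1 Gbps path to minutes at FIONA-class rates, which is what makes interactive, multi-campus workflows practical.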
Coupling Australia’s Researchers to the Global Innovation Economy (Larry Smarr)
08.10.10
Fifth Lecture in the
Australian American Leadership Dialogue Scholar Tour
University of Queensland
Title: Coupling Australia’s Researchers to the Global Innovation Economy
Brisbane, Australia
Science and Cyberinfrastructure in the Data-Dominated Era (Larry Smarr)
10.02.22
Invited talk
Symposium #1610, How Computational Science Is Tackling the Grand Challenges Facing Science and Society
Title: Science and Cyberinfrastructure in the Data-Dominated Era
San Diego, CA
Coupling Australia’s Researchers to the Global Innovation Economy (Larry Smarr)
The document summarizes Dr. Larry Smarr's lecture on connecting Australian researchers to the global innovation economy through high-performance networks. It discusses projects that established dedicated 1 Gbps and 10 Gbps connections between Australian universities and research centers and international partners. This infrastructure will allow Australian researchers to collaborate globally on issues like climate change, health care, and more. The goal is for Australia to have connectivity on par with the best in the world to attract top researchers and partners.
Coupling Australia’s Researchers to the Global Innovation Economy (Larry Smarr)
08.10.15
Eighth Lecture in the
Australian American Leadership Dialogue Scholar Tour
Australian National University
Title: Coupling Australia’s Researchers to the Global Innovation Economy
Canberra, Australia
Calit2: a View Into the Future of the Wired and Unwired Internet (Larry Smarr)
06.01.23
Invited Talk to the National Research Council's Computer Science and Telecommunications Board
Title: Calit2: a View Into the Future of the Wired and Unwired Internet
La Jolla, CA
The document discusses Internet2, an advanced networking consortium that operates a 15,000 mile fiber optic network for research and education. It provides very high speed connectivity and collaboration technologies to facilitate large data sharing and frictionless research. Examples are given of life sciences projects utilizing Internet2's high-speed network for genomic research and agricultural applications involving terabytes of satellite and sensor data. The network is expanding to include cloud computing resources and supercomputing centers to enable global-scale distributed scientific computing and collaboration.
The Rise of Supernetwork Data Intensive Computing (Larry Smarr)
Invited Remote Lecture to SC21
The International Conference for High Performance Computing, Networking, Storage, and Analysis
St. Louis, Missouri
November 18, 2021
My Remembrances of Mike Norman Over The Last 45 Years (Larry Smarr)
Mike Norman has been a leader in computational astrophysics for over 45 years. Some of his influential work includes:
- Cosmic jet simulations in the early 1980s which helped explain phenomena from galactic centers.
- Pioneering the use of adaptive mesh refinement in the 1990s to achieve dynamic load balancing on supercomputers.
- Massive cosmology simulations in the late 2000s with over 100 trillion particles using thousands of processors across multiple supercomputing sites, producing petabytes of data.
- Developing end-to-end workflows in the 2000s to couple supercomputers, high-speed networks, and large visualization systems to enable real-time analysis of extremely large astrophysics simulations.
Metagenics: How Do I Quantify My Body and Try to Improve its Health? June 18, 2019 (Larry Smarr)
Larry Smarr discusses quantifying his body and health over time through extensive self-tracking. He measures various biomarkers through regular blood tests and analyzes his gut microbiome by sequencing stool samples. This revealed issues like chronic inflammation and an unhealthy microbiome. Smarr then took steps like a restricted eating window and increasing plant diversity in his diet, which reversed metabolic syndrome issues and correlated with shifts in his microbiome ecology. His goal is to continue precisely measuring factors like toxins, hormones, gut permeability and food/supplement impacts to further optimize his health.
Panel: Reaching More Minority Serving Institutions (Larry Smarr)
This document discusses engaging more minority serving institutions (MSIs) in cyberinfrastructure development through regional networks. It provides data showing the importance of MSIs like historically black colleges and universities (HBCUs) in educating underrepresented minority students in STEM fields. Regional networks can help equalize opportunities by assisting MSIs in overcoming barriers to resources through training, networking infrastructure support, and helping institutions obtain necessary staffing and funding. Strategies mentioned include collaborating with MSIs on grants and addressing issues identified in surveys like lack of vision for data use beyond compliance. The goal is to broaden participation in STEAM fields by leveraging the success MSIs have shown in supporting underrepresented students.
Global Network Advancement Group - Next Generation Network-Integrated Systems (Larry Smarr)
This document summarizes a presentation on global petascale to exascale workflows for data intensive sciences. It discusses a partnership convened by the GNA-G Data Intensive Sciences Working Group with the mission of meeting challenges faced by data-intensive science programs. Cornerstone concepts that will be demonstrated include integrated network and site resource management, model-driven frameworks for resource orchestration, end-to-end monitoring with machine learning-optimized data transfers, and integrating Qualcomm's GradientGraph with network services to optimize applications and science workflows.
Wireless FasterData and Distributed Open Compute Opportunities and (some) Us... (Larry Smarr)
This document discusses opportunities for ESnet to support wireless edge computing through developing a strategy around self-guided field laboratories (SGFL). It outlines several potential science use cases that could benefit from wireless and distributed computing capabilities, both in the short term through technologies like 5G, LoRa and Starlink, and longer term through the vision of automated SGFL. The document proposes some initial ideas for deploying and testing wireless edge computing technologies through existing projects to help enable the SGFL vision and further scientific opportunities. It emphasizes that exploring these emerging areas could help drive new science possibilities if done at a reasonable scale.
The Pacific Research Platform: a Science-Driven Big-Data Freeway System
1. “The Pacific Research Platform:
a Science-Driven Big-Data Freeway System.”
NCSA Colloquium
National Center for Supercomputing Applications
University of Illinois, Urbana-Champaign
September 18, 2015
Dr. Larry Smarr
Director, California Institute for Telecommunications and Information Technology
Harry E. Gruber Professor,
Dept. of Computer Science and Engineering
Jacobs School of Engineering, UCSD
http://lsmarr.calit2.net
2. Abstract
Research in data-intensive fields is increasingly multi-investigator and multi-institutional, depending on ever more rapid
access to ultra-large heterogeneous and widely distributed datasets. The Pacific Research Platform (PRP) is a multi-
institutional extensible deployment that establishes a science-driven high-capacity data-centric “freeway system.” The
PRP spans all 10 campuses of the University of California, as well as the major California private research universities,
four supercomputer centers, and several universities outside California. Fifteen multi-campus data-intensive application
teams act as drivers of the PRP, providing feedback over the five years to the technical design staff. These application
areas include particle physics, astronomy/astrophysics, earth sciences, biomedicine, and scalable multimedia, providing
models for many other applications. The PRP partnership extends the NSF-funded campus Science DMZs to a regional
model that allows high-speed data-intensive networking, facilitating researchers moving data between their labs and
their collaborators’ sites, supercomputer centers or data repositories, and enabling that data to traverse multiple
heterogeneous networks without performance degradation over campus, regional, national, and international distances.
3. Vision: Creating a West Coast “Big Data Freeway”
Connected by CENIC/Pacific Wave to Internet2 & GLIF
Use Lightpaths to Connect
All Data Generators and Consumers,
Creating a “Big Data” Plane
Integrated With High Performance Global Networks
“The Bisection Bandwidth of a Cluster Interconnect,
but Deployed on a 10-Campus Scale.”
This Vision Has Been Building for Over Two Decades
4. NCSA Telnet--“Hide the Cray”
Paradigm That We Still Use Today
• NCSA Telnet -- Interactive Access
– From Macintosh or PC Computer
– To Telnet Hosts on TCP/IP Networks
• Allows for Simultaneous
Connections
– To Numerous Computers on The Net
– Standard File Transfer Server (FTP)
– Lets You Transfer Files to and from
Remote Machines and Other Users
John Kogut Simulating
Quantum Chromodynamics
He Uses a Mac—The Mac Uses the Cray
Source: Larry Smarr 1985
Data
Generator
Data
Portal
Data
Transmission
5. Interactive Supercomputing Collaboratory Prototype:
Using Analog Communications to Prototype the Fiber Optic Future
“We’re using satellite technology…
to demo what it might be like to have
high-speed fiber-optic links between
advanced computers
in two different geographic locations.”
― Al Gore, Senator
Chair, US Senate Subcommittee on Science, Technology and Space
Illinois
Boston
SIGGRAPH 1989
“What we really have to do is eliminate distance between
individuals who want to interact with other people and
with other computers.”
― Larry Smarr, Director, NCSA
6. I-WAY: Information Wide Area Year
Supercomputing ‘95
• The First National 155 Mbps Research Network
– 65 Science Projects
– Into the San Diego Convention Center
• I-Way Featured:
– Networked Visualization Application Demonstrations
– Large-Scale Immersive Displays
– I-Soft Programming Environment
UIC
http://archive.ncsa.uiuc.edu/General/Training/SC95/GII.HPCC.html
CitySpace
Cellular Semiotics
7. PACI is Prototyping America’s 21st Century
Information Infrastructure
The PACI Grid Testbed
National Computational Science
1997
8. Chesapeake Bay Simulation Collaboratory:
vBNS Linked CAVE, ImmersaDesk, Power Wall, and Workstation
Alliance Project: Collaborative Video Production
via Tele-Immersion and Virtual Director
UIC
Donna Cox, Robert Patterson, Stuart Levy, NCSA Virtual Director Team
Glenn Wheless, Old Dominion Univ.
Alliance Application Technologies
Environmental Hydrology Team
4 MPixel PowerWall
Alliance 1997
9. UIC
ANL
NCSA/UIUC
UC
NU
MREN
IIT
True Grid Project
Started March 1999
State Commits
$7.5M over 4 years
Illinois is Positioned to Seize National Optical Networking Leadership
with I-WIRE Infrastructure Investment
• State-Funded Infrastructure
–Application Driven
–High Definition Streaming Media
–Telepresence and Media
–Computational Grids
–Cloud Computing
–Data Grids
–Search & Information Analysis
–Emerging Tech Proving Ground
–Optical Switching
–Dense Wave Division Multiplexing
–Advanced Middleware Infrastructure
–Wireless Extensions
Source: Charlie Catlett, ANL; 1999 LS Slide
10. Two New Calit2 Buildings Provide
New Laboratories for “Living in the Future”
• “Convergence” Laboratory Facilities
– Nanotech, BioMEMS, Chips, Radio, Photonics
– Virtual Reality, Digital Cinema, HDTV, Gaming
• Over 1000 Researchers in Two Buildings
– Linked via Dedicated Optical Networks
UC Irvine
www.calit2.net
Preparing for a World in Which
Distance is Eliminated…
11. Linking the Calit2 Auditoriums at UCSD and UCI
With HD Streams
September 8, 2009
Photo by Erik Jepsen, UC San Diego
Sept. 8, 2009
12. NSF’s OptIPuter Project: Using Supernetworks
to Meet the Needs of Data-Intensive Researchers
OptIPortal–
Termination
Device
for the
OptIPuter
Global
Backplane
Calit2 (UCSD, UCI), SDSC, and UIC Leads—Larry Smarr PI
Univ. Partners: NCSA, USC, SDSU, NW, TA&M, UvA, SARA, KISTI, AIST
Industry: IBM, Sun, Telcordia, Chiaro, Calient, Glimmerglass, Lucent
2003-2009
$13,500,000
In August 2003,
Jason Leigh and his
students used
RBUDP to blast data
from NCSA to SDSC
over the
TeraGrid DTFnet,
achieving 18Gbps file
transfer out of the
available 20Gbps
13. High Resolution Uncompressed HD Streams
Require Multi-Gigabit/s Lambdas
U. Washington
JGN II Workshop
Osaka, Japan
Jan 2005
Prof. Osaka, Prof. Aoyama
Prof. Smarr
Source: U Washington Research Channel
Telepresence Using Uncompressed 1.5 Gbps
HDTV Streaming Over IP on Fiber Optics--
75x Home Cable “HDTV” Bandwidth!
“I can see every hair on your head!”—Prof. Aoyama
14. First Trans-Pacific Super High Definition Telepresence Meeting
Using Digital Cinema 4k Streams
Keio University
President Anzai
UCSD
Chancellor Fox
Lays
Technical
Basis for
Global
Digital
Cinema
Sony
NTT
SGI
Streaming 4k
with JPEG 2000
Compression
½ gigabit/sec
100 Times
the Resolution
of YouTube!
4k = 4000x2000 Pixels = 4xHD
21 Countries Driving 50 Demonstrations
1 or 10Gbps to Calit2@UCSD Building, September 26-30, 2005
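The “4x HD” and “100x the Resolution of YouTube” figures on this slide are pixel-count ratios. A quick check of that arithmetic (the 4000×2000 definition of 4K is from the slide; the ~320×240 frame size for 2005-era YouTube is my assumption):

```python
pixels_4k = 4000 * 2000  # the slide's definition of 4K
pixels_hd = 1920 * 1080  # 1080-line HD
pixels_yt = 320 * 240    # assumed 2005-era YouTube frame

print(pixels_4k / pixels_hd)  # roughly 3.9, the "4x HD" claim
print(pixels_4k / pixels_yt)  # roughly 104, the "100x YouTube" claim
```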
15. Globally 10Gbps Optically Connected
Digital Cinema Collaboratory
Streaming 4K Live From NCSA Servers to Calit2@UCSD Auditorium
Content: Donna Cox, Robert Patterson, NCSA
16. Project StarGate Goals:
Combining Supercomputers and Supernetworks
• Create an “End-to-End”
10Gbps Workflow
• Explore Use of OptIPortals as
Petascale Supercomputer
“Scalable Workstations”
• Exploit Dynamic 10Gbps
Circuits on ESnet
• Connect Hardware Resources
at ORNL, ANL, SDSC
• Show that Data Need Not be
Trapped by the Network
“Event Horizon”
OptIPortal@SDSC
Rick Wagner Mike Norman
• ANL * Calit2 * LBNL * NICS * ORNL * SDSC
Source: Michael Norman, SDSC, UCSD
17. NICS
ORNL
NSF TeraGrid Kraken
Cray XT5
8,256 Compute Nodes
99,072 Compute Cores
129 TB RAM
simulation
Argonne NL
DOE Eureka
100 Dual Quad Core Xeon Servers
200 NVIDIA Quadro FX GPUs in 50
Quadro Plex S4 1U enclosures
3.2 TB RAM rendering
SDSC
Calit2/SDSC OptIPortal1
20 30” (2560 x 1600 pixel) LCD panels
10 NVIDIA Quadro FX 4600 graphics
cards > 80 megapixels
10 Gb/s network throughout
visualization
ESnet
10 Gb/s fiber optic network
*ANL * Calit2 * LBNL * NICS * ORNL * SDSC
www.calit2.net/newsroom/release.php?id=1624
Using Supernetworks to Couple End User
to Remote Supercomputers and Visualization Servers
Source: Mike Norman,
Rick Wagner, SDSC
Real-Time Interactive
Volume Rendering Streamed
from ANL to SDSC
Demoed
SC09
18. Integrated “OptIPlatform” Cyberinfrastructure System:
A 10Gbps Lightpath Cloud
National LambdaRail
Campus
Optical
Switch
Data Repositories & Clusters
HPC
HD/4k Video Images
HD/4k Video Cams
End User
OptIPortal
10G
Lightpath
HD/4k Telepresence
Instruments
LS 2009
Slide
19. So Why Don’t We Have a National
Big Data Cyberinfrastructure?
20. How Do You Get From Your Lab
to the Regional Optical Networks?
www.ctwatch.org
“Research is being stalled by ‘information overload,’ Mr. Bement said, because
data from digital instruments are piling up far faster than researchers can study.
In particular, he said, campus networks need to be improved. High-speed data
lines crossing the nation are the equivalent of six-lane superhighways, he said.
But networks at colleges and universities are not so capable. “Those massive
conduits are reduced to two-lane roads at most college and university
campuses,” he said. Improving cyberinfrastructure, he said, “will transform the
capabilities of campus-based scientists.”
-- Arden Bement, the director of the National Science Foundation
May 2005
21. DOE ESnet’s Science DMZ: A Scalable Network
Design Model for Optimizing Science Data Transfers
• A Science DMZ integrates 4 key concepts into a unified whole:
– A network architecture designed for high-performance applications,
with the science network distinct from the general-purpose network
– The use of dedicated systems for data transfer
– Performance measurement and network testing systems that are
regularly used to characterize and troubleshoot the network
– Security policies and enforcement mechanisms that are tailored for
high performance science environments
http://fasterdata.es.net/science-dmz/
Science DMZ
Coined 2010
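The high-performance transfers a Science DMZ is built for are mostly long-distance TCP flows, where throughput is limited by the bandwidth-delay product. A back-of-envelope sketch of the buffer sizing involved (my arithmetic, not from the talk):

```python
def bdp_bytes(bandwidth_gbps, rtt_ms):
    """Bandwidth-delay product: bytes that must be in flight
    (and buffered) to keep a long fat pipe full."""
    return bandwidth_gbps * 1e9 * (rtt_ms / 1e3) / 8

# A 10 Gbps coast-to-coast path with ~80 ms round-trip time
# needs on the order of 100 MB of TCP buffer to run at line rate.
print(bdp_bytes(10, 80) / 1e6)
```

Default host TCP buffers are far smaller than this, which is one reason the Science DMZ model calls for dedicated, tuned data transfer nodes rather than general-purpose machines.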
22. The National Science Foundation
Has Funded Over 100 Campuses to Build Local Data Freeways
134 awards, 128 projects
- All but 4 states
- 120+ institutions
23. Creating a “Big Data” Plane on Campus:
NSF CC-NIE Funded Prism@UCSD and CHeruB
Prism@UCSD, Phil Papadopoulos, SDSC, Calit2, PI
CHERuB, Mike Norman, SDSC PI
CHERuB
24. Science DMZ Data Transfer Nodes - Optical Network Termination Devices:
Inexpensive PCs Optimized for Big Data
• FIONA – Flash I/O Network Appliance
– Combination of Desktop and Server Building Blocks
– US$5K - US$7K
– Desktop Flash up to 16TB
– RAID Drives up to 48TB
– 10GbE/40GbE Adapter
– Tested speed 40Gbps
– Developed Under
UCSD CC-NIE Prism Award
by UCSD’s
– Phil Papadopoulos
– Tom DeFanti
– Joe Keefe
FIONA Data Appliance:
– 9 × 256GB flash (510MB/sec): 2 TB cache
– 8 × 3TB RAID drives (125MB/sec): 24TB disk
– 2 × 40GbE
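At the quoted 40Gbps test speed, draining an entire FIONA is a matter of an hour or so. A rough sketch of that arithmetic (my numbers-check, decimal units, not from the slide):

```python
def transfer_hours(terabytes, gbps):
    """Hours to move a dataset at a sustained line rate (decimal TB and Gb)."""
    return terabytes * 1e12 * 8 / (gbps * 1e9) / 3600

# Moving the full 24TB disk tier at the tested 40Gbps:
print(round(transfer_hours(24, 40), 2))  # -> 1.33 (hours)
```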
25. Integrated Digital Cyberinfrastructure
Supporting Knight Lab
FIONA
12 Cores/GPU
128 GB RAM
3.5 TB SSD
48TB Disk
10Gbps NIC
Knight Lab
10Gbps
Gordon
Prism@UCSD
Data Oasis
7.5PB,
100GB/s
Knight 1024 Cluster
In SDSC Co-Lo
CHERuB
100Gbps
Emperor & Other Vis Tools
64Mpixel Data Analysis Wall
120Gbps
40Gbps
27. Why Now?
Federating the Six UC CC-NIE Grants
• 2011 ACCI Strategic Recommendation to the NSF #3:
– “NSF should create a new program funding high-speed (currently
10 Gbps) connections from campuses to the nearest landing point
for a national network backbone. The design of these connections
must include support for dynamic network provisioning services
and must be engineered to support rapid movement of large
scientific data sets.”
– - pg. 6, NSF Advisory Committee for Cyberinfrastructure Task
Force on Campus Bridging, Final Report, March 2011
– www.nsf.gov/od/oci/taskforces/TaskForceReport_CampusBridging.pdf
– Led to Office of Cyberinfrastructure RFP March 1, 2012
• NSF’s Campus Cyberinfrastructure –
Network Infrastructure & Engineering (CC-NIE) Program
– 85 Grants Awarded So Far (NSF Summit Last Week)
– 6 Are in UC
UC Must Move Rapidly or Lose a Ten-Year Advantage!
UC IT Leadership Council
Oakland, CA
May 19, 2014
28. CENIC is Rapidly Moving to Connect
at 100 Gbps Across the State and Nation
DOE
Internet2
29. The Pacific Wave Platform
Creates an End-to-End Regional Science Big Data Freeway
Source:
John Hess, CENIC
30. Ten Week Sprint to Demonstrate the West Coast
Big Data Freeway System: PRPv0
Presented at CENIC 2015
March 9, 2015
FIONA DTNs Now Deployed to All UC Campuses
And Most PRP Sites
31. Pacific Research Platform
Multi-Campus Science Driver Teams
• Particle Physics
• Astronomy and Astrophysics
– Telescope Surveys
– Galaxy Evolution
– Gravitational Wave Astronomy
• Biomedical
– Cancer Genomics Hub/Browser
– Microbiome and Integrative ‘Omics
– Integrative Structural Biology
• Earth Sciences
– Data Analysis and Simulation for Earthquakes and Natural Disasters
– Climate Modeling: NCAR/UCAR
– California/Nevada Regional Climate Data Analysis
– CO2 Subsurface Modeling
• Scalable Visualization, Virtual Reality, and Ultra-Resolution Video
32. Particle Physics: Creating a 10-100 Gbps LambdaGrid
to Support LHC Researchers
ATLAS & CMS
U.S. Institutions
Participating in LHC
LHC Data
Generated by
CMS & ATLAS
Detectors
Analyzed
on OSG
Maps from www.uslhc.us
33. Two Automated Telescope Surveys
Creating Huge Datasets Will Drive PRP
– Survey 1: 300 images per night, 100MB per raw image: 30GB per night raw, 120GB per night processed
– Survey 2: 250 images per night, 530MB per raw image: 150 GB per night raw, 800GB per night processed
– When processed at NERSC, data volumes increase by ~4x
Source: Peter Nugent, Division Deputy for Scientific Engagement, LBL
Professor of Astronomy, UC Berkeley
Precursors to
LSST
And
NCSA
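The per-night volumes above are just image count × image size, scaled by the processing factor. A sketch of that arithmetic (numbers from the slide; the helper name is my own):

```python
def nightly_gb(images_per_night, mb_per_image, processing_factor=1):
    """Nightly survey data volume in GB (decimal units)."""
    return images_per_night * mb_per_image * processing_factor / 1e3

print(nightly_gb(300, 100))     # -> 30.0 GB/night raw
print(nightly_gb(300, 100, 4))  # -> 120.0 GB/night after 4x processing at NERSC
print(nightly_gb(250, 530))     # -> 132.5 GB/night raw (quoted as ~150 GB on the slide)
```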
34. Cancer Genomics Hub (UCSC) is Housed in SDSC CoLo:
Large Data Flows to End Users
[Chart: Cumulative TBs of CGH Files Downloaded (30 PB total; 1G, 8G, and 15G network stages)]
Data Source: David Haussler, Brad Smith, UCSC
35. To Map Out the Dynamics of Autoimmune Microbiome Ecology
Couples Next Generation Genome Sequencers to Big Data Supercomputers
Source: Weizhong Li, UCSD
Our Team Used 25 CPU-years
to Compute
Comparative Gut Microbiomes
Starting From
2.7 Trillion DNA Bases
of My Samples
and Healthy and IBD Controls
SDSC Gordon Data Supercomputer
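For scale, 25 CPU-years spent on 2.7 trillion DNA bases works out to a few thousand bases analyzed per CPU-second. A rough sketch (my arithmetic, assuming 365-day years):

```python
SECONDS_PER_CPU_YEAR = 365 * 24 * 3600  # 31,536,000

bases = 2.7e12                           # DNA bases in the study
cpu_seconds = 25 * SECONDS_PER_CPU_YEAR  # total compute spent
print(round(bases / cpu_seconds))        # -> 3425 bases per CPU-second
```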
36. Computing on Data: Complex Software Pipelines -
From Sequence to Taxonomy and Function
PI: (Weizhong Li, CRBS, UCSD):
NIH R01HG005978 (2010-2013, $1.1M)
37. Dan Cayan
USGS Water Resources Discipline
Scripps Institution of Oceanography, UC San Diego
much support from Mary Tyree, Mike Dettinger, Guido Franco and
other colleagues
Sponsors:
California Energy Commission
NOAA RISA program
California DWR, DOE, NSF
Planning for climate change in California
substantial shifts on top of already high climate variability
SIO Campus Climate Researchers Need to Download
Results from Remote Supercomputer Simulations
to Make Regional Climate Change Forecasts
38. Earth Sciences: Pacific Earthquake
Engineering Research Center
Enabling
Real-Time Coupling
Between
Shake Tables
and
Supercomputer
Simulations
39. Collaboration Between EVL’s CAVE2
and Calit2’s VROOM Over 10Gb Wavelength
EVL
Calit2
Source: NTT Sponsored ON*VECTOR Workshop at Calit2 March 6, 2013
41. Next Step: Use AARnet/PRP to Set Up
Planetary-Scale Shared Virtual Worlds
Digital Arena, UTS Sydney
CAVE2, Monash U, Melbourne
CAVE2, EVL, Chicago
42. PRP Timeline
• PRPv1
– A Layer 2 and Layer 3 System
– Completed In 2 Years
– Tested, Measured, Optimized, With Multi-domain Science Data
– Bring Many Of Our Science Teams Up
– Each Community Thus Will Have Its Own Certificate-Based Access
To its Specific Federated Data Infrastructure.
• PRPv2
– Advanced IPv6-Only Version with Robust Security Features
– e.g., Trusted Platform Module Hardware and SDN/SDX Software
– Support Rates up to 100Gb/s in Bursts And Streams
– Develop Means to Operate a Shared Federation of Caches
43. The Pacific Wave Platform
Creates an End-to-End Regional Science Big Data Freeway
Source:
John Hess, CENIC
Opportunity:
Connect NCSA
to End Users
on PRP Campuses
@10Gbps