10.02.22
Invited talk
Symposium #1610, How Computational Science Is Tackling the Grand Challenges Facing Science and Society
Title: Science and Cyberinfrastructure in the Data-Dominated Era
San Diego, CA
Science and Cyberinfrastructure in the Data-Dominated Era
1. Science and Cyberinfrastructure in the Data-Dominated Era Symposium #1610, How Computational Science Is Tackling the Grand Challenges Facing Science and Society San Diego, CA February 22, 2010 Dr. Larry Smarr Director, California Institute for Telecommunications and Information Technology Harry E. Gruber Professor, Dept. of Computer Science and Engineering Jacobs School of Engineering, UCSD
2. Abstract The NSF Supercomputer Centers program not only directly stimulated a hundred-fold increase in the number of U.S. university computational scientists and engineers, but it also facilitated the emergence of the Internet, Web, scientific visualization, and synchronous collaboration. I will show how two NSF-funded grand challenges, one in basic scientific research (cosmological evolution) and one in computer science (super high bandwidth optical networks), are interweaving to enable new modes of discovery. Today we are living in a data-dominated world where supercomputers and increasingly distributed scientific instruments generate terabytes to petabytes of data. It was in response to this challenge that the NSF funded the OptIPuter project to research how user-controlled 10 Gbps dedicated lightpaths (or “lambdas”) could provide direct access to global data repositories, scientific instruments, and computational resources from “OptIPortals,” PC clusters which provide scalable visualization, computing, and storage in the user's campus laboratory. The use of dedicated lightpaths over fiber optic cables enables individual researchers to experience “clear channel” 10,000 megabits/sec, 100-1000 times faster than over today’s shared Internet, which is a critical capability for data-intensive science. The seven-year OptIPuter computer science research project is now over, but it stimulated a national and global build-out of dedicated fiber optic networks. U.S. universities now have access to high bandwidth lambdas through the National LambdaRail, Internet2's Dynamic Circuit Services, and the Global Lambda Integrated Facility. A few pioneering campuses are now building on-campus lightpaths to connect the data-intensive researchers, data generators, and vast storage systems to each other on campus, as well as to the national network campus gateways. I will show how this next generation cyberinfrastructure is being used to support cosmological simulations containing 64 billion zones on remote NSF-funded TeraGrid facilities coupled to the end user's laboratory by national fiber networks. I will review how increasingly powerful NSF supercomputers have allowed for more and more realistic cosmological models over the last two decades. The 25 years of innovation in information infrastructure and scientific simulation that NSF has funded have steadily pushed out the frontier of knowledge while transforming our society and economy.
3. NCSA Telnet--“Hide the Cray” Paradigm That We Still Use Today NCSA Telnet -- Interactive Access From Macintosh or PC Computer To Telnet Hosts on TCP/IP Networks Allows for Simultaneous Connections To Numerous Computers on The Net Standard File Transfer Server (FTP) Lets You Transfer Files to and from Remote Machines and Other Users John Kogut Simulating Quantum Chromodynamics He Uses a Mac—The Mac Uses the Cray Source: Larry Smarr 1985 Data Generator Data Portal Data Transmission
4. Launching the Nation’s Information Infrastructure: NSFnet Supernetwork and the Six NSF Supercomputers NCSA NSFNET 56 Kb/s Backbone (1986-8) PSC NCAR CTC JVNC SDSC Supernetwork Backbone: 56kbps is 50 Times Faster than 1200 bps PC Modem!
5. Why Teraflop Supercomputers Matter For Accurate Science & Engineering Simulations FLOating Point OperationS per Spatial Point Ten Variables Hundred Operations Per Updated Variable One Thousand FLOPS per Updated Spatial Point One Dimensional Dynamics For 1000 Spatial Points Need MEGAFLOP Two Dimensions For 1000x1000 Spatial Points Need GIGAFLOP Three Dimensions For 1000x1000x1000 Spatial Points Need TERAFLOP Three Dimensions + Adaptive Mesh Refinement Need PETAFLOP
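The scaling argument on this slide can be reproduced with a few lines of arithmetic. The sketch below is illustrative only: the function name `operations_per_update` is hypothetical, and reading the resulting operation counts as a FLOPS rate assumes roughly one full-grid update per second, which the slide does not state. The adaptive mesh refinement step is left out because the slide gives no numbers for it.

```python
VARIABLES_PER_POINT = 10   # "Ten Variables" (from the slide)
OPS_PER_VARIABLE = 100     # "Hundred Operations Per Updated Variable"
FLOPS_PER_POINT = VARIABLES_PER_POINT * OPS_PER_VARIABLE  # 1,000 per point

PREFIXES = {10**6: "mega", 10**9: "giga", 10**12: "tera", 10**15: "peta"}


def operations_per_update(points_per_side: int, dimensions: int) -> int:
    """Floating point operations needed to update every grid point once."""
    return FLOPS_PER_POINT * points_per_side ** dimensions


for dims in (1, 2, 3):
    ops = operations_per_update(1000, dims)
    print(f"{dims}D, 1000 points per side: {ops:.0e} operations "
          f"per update ({PREFIXES[ops]}flop scale)")
```

Each added dimension multiplies the point count by 1000, which is why the requirement climbs from megaflop to gigaflop to teraflop scale.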
6. Today Dedicated 10,000Mbps Supernetworks Tie Together State and Regional Fiber Infrastructure NLR 40 x 10Gb Wavelengths Expanding with Darkstrand to 80 Interconnects Two Dozen State and Regional Optical Networks Internet2 Dynamic Circuit Network Is Now Available
7. NSF’s OptIPuter Project: Using Supernetworks to Meet the Needs of Data-Intensive Researchers OptIPortal– Termination Device for the OptIPuter Global Backplane Calit2 (UCSD, UCI), SDSC, and UIC Leads—Larry Smarr PI Univ. Partners: NCSA, USC, SDSU, NW, TA&M, UvA, SARA, KISTI, AIST Industry: IBM, Sun, Telcordia, Chiaro, Calient, Glimmerglass, Lucent
8. Short History of Cosmological Supercomputing: Early Days -1993 Convex C3880 (8-way SMP) GigaFLOPs Simulation of X-ray clusters in a 3D cube 85 Mpc/h on a side and Cartesian grid of size 270^3 Bryan, Cen, Norman, Ostriker, Stone (1994), ApJ Source: Michael Norman, SDSC, UCSD
9. Great Leap Forward-1994 Thinking Machines CM5 (512-cpu MPP) Simulation of X-ray clusters in a 3D cube 170 Mpc/h on a side and Cartesian grid of size 512^3 Bryan & Norman (1998), ApJ Source: Michael Norman, SDSC, UCSD
10. The Power of Adaptive Mesh Refinement-2006 IBM Power4 cluster (64 node, 8-way SMP) Simulation of X-ray clusters in a 3D cube 512 Mpc/h on a side with 7-level AMR for an effective resolution of 65,536^3 Norman et al. (2007) Source: Michael Norman, SDSC, UCSD
11. Adaptive Grids Resolve Individual Galaxy Collisions as Clusters Form in 15 Million Light Year Volume Source: Simulation: Mike Norman and Brian O’Shea; Animation: Donna Cox, Robert Patterson, Matthew Hall, Stuart Levy, Jeff Carpenter, Lorne Leonard-NCSA SGI Altix DSM cluster (512 cpu)
13. Enormous Detail in Simulation: Full Simulation with Blowup of a 1/512 Subcube
14. Project StarGate Goals: Combining Supercomputers and Supernetworks Create an “End-to-End” 10Gbps Workflow Explore Use of OptIPortals as Petascale Supercomputer “Scalable Workstations” Exploit Dynamic 10Gbps Circuits on ESnet Connect Hardware Resources at ORNL, ANL, SDSC Show that Data Need Not be Trapped by the Network “Event Horizon” Rick Wagner Mike Norman ANL * Calit2 * LBNL * NICS * ORNL * SDSC Source: Michael Norman, SDSC, UCSD
15. Using Supernetworks to Couple End User’s OptIPortal to Remote Supercomputers and Visualization Servers *ANL * Calit2 * LBNL * NICS * ORNL * SDSC Source: Mike Norman, SDSC From 1985 to Project StarGate NICS ORNL NSF TeraGrid Kraken Cray XT5 8,256 Compute Nodes 99,072 Compute Cores 129 TB RAM simulation Argonne NL DOE Eureka 100 Dual Quad Core Xeon Servers 200 NVIDIA Quadro FX GPUs in 50 Quadro Plex S4 1U enclosures 3.2 TB RAM rendering SDSC Calit2/SDSC OptIPortal1 20 30” (2560 x 1600 pixel) LCD panels 10 NVIDIA Quadro FX 4600 graphics cards > 80 megapixels 10 Gb/s network throughout visualization ESnet 10 Gb/s fiber optic network
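The hardware figures on this slide can be cross-checked with simple arithmetic. The sketch below is a sanity check rather than anything shown in the talk: the helper names are hypothetical, and the per-node memory figure assumes decimal units (1 TB = 1000 GB).

```python
# Figures copied from the slide.
PANELS = 20
PANEL_WIDTH_PX, PANEL_HEIGHT_PX = 2560, 1600
KRAKEN_NODES = 8_256
KRAKEN_CORES = 99_072
KRAKEN_RAM_TB = 129


def wall_megapixels(panels: int, width_px: int, height_px: int) -> float:
    """Total pixel count of a tiled display wall, in megapixels."""
    return panels * width_px * height_px / 1e6


def cores_per_node(cores: int, nodes: int) -> float:
    """Average number of compute cores per node."""
    return cores / nodes


def ram_per_node_gb(ram_tb: float, nodes: int) -> float:
    """Average memory per node in GB, assuming 1 TB = 1000 GB."""
    return ram_tb * 1000 / nodes


print(f"OptIPortal wall: "
      f"{wall_megapixels(PANELS, PANEL_WIDTH_PX, PANEL_HEIGHT_PX):.2f} megapixels")
print(f"Kraken: {cores_per_node(KRAKEN_CORES, KRAKEN_NODES):.0f} cores per node")
print(f"Kraken: {ram_per_node_gb(KRAKEN_RAM_TB, KRAKEN_NODES):.1f} GB RAM per node")
```

The wall total comes to just under 82 megapixels, consistent with the slide's "> 80 megapixels".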
16. Project StarGate Credits Lawrence Berkeley National Laboratory (ESnet) Eli Dart San Diego Supercomputer Center Science application Michael Norman Rick Wagner (coordinator) Network Tom Hutton Oak Ridge National Laboratory Susan Hicks National Institute for Computational Sciences Nathaniel Mendoza Argonne National Laboratory Network/Systems Linda Winkler Loren Jan Wilson Visualization Joseph Insley Eric Olsen Mark Hereld Michael Papka Larry Smarr (Overall Concept) Brian Dunne (Networking) Joe Keefe (OptIPortal) Kai Doerr, Falko Kuester (CGLX) ANL * Calit2 * LBNL * NICS * ORNL * SDSC
17. Blue Waters is a Sustained PetaFLOPs Supercomputer One Million Times the Convex 3880 of 1993! Planned for 2011-2012 Science Self-consistent simulation of the formation of the first galaxies and cosmic ionization Scale of Simulations AMR: 1536^3 base grid, 10 levels of refinement Cartesian: 6400^3 with radiation transport Source: Michael Norman, SDSC, UCSD
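The scale of the planned Blue Waters runs can be made concrete with a short calculation. This is a sketch under one stated assumption: each AMR level refines the grid by a factor of 2 per dimension, which is common in AMR codes but is not stated on the slide. The function name `effective_cells_per_side` is hypothetical.

```python
def effective_cells_per_side(base_cells: int, levels: int,
                             refinement_factor: int = 2) -> int:
    """Cells per side a uniform grid would need to match the finest AMR level.

    refinement_factor=2 is an assumption; the slide does not give it.
    """
    return base_cells * refinement_factor ** levels


# AMR run from the slide: 1536 cells per side, 10 levels of refinement.
amr_effective = effective_cells_per_side(1536, 10)
print(f"AMR effective resolution: {amr_effective:,} cells per side")

# Cartesian run from the slide: uniform grid with 6400 cells per side.
cartesian_cells = 6400 ** 3
print(f"Cartesian grid: {cartesian_cells:,} cells in total")

# "One Million Times the Convex": sustained petaflops versus gigaflops.
speedup = 10**15 // 10**9
print(f"Petaflop over gigaflop: {speedup:,}x")
```

With factor-2 refinement, the AMR run would reach an effective resolution of about 1.57 million cells per side, far beyond what a uniform grid of that size could store.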
18. Academic Research “OptIPlatform” Cyberinfrastructure: A 10Gbps “End-to-End” Lightpath Cloud National LambdaRail Campus Optical Switch Data Repositories & Clusters HPC HD/4k Video Images HD/4k Video Cams End User OptIPortal 10G Lightpath HD/4k Telepresence Instruments
19. High Definition Video Connected OptIPortals: Virtual Working Spaces for Data Intensive Research Source: Falko Kuester, Kai Doerr Calit2; Michael Sims, NASA NASA Ames Lunar Science Institute Mountain View, CA NASA Interest in Supporting Virtual Institutes LifeSize HD