IDEAS for thought
SHPC lunch and learn
JULY 25, 2013
John D. Almon
• Full stack software engineer
• Implemented RTM on GPU using MPI
• Implemented cloud-based WEM using SOA
• Terabyte scale database design and data warehousing
• Architected hybrid web interpretation and processing system
• C++, Java, MPI, C, Oracle PL/SQL, HTML, web-based systems, XML
• Managed software team
• Currently serves as CEO of Advanced Seismic Technologies
Hardware
Small HPC setup - Guess what company
• Fiber optic to every desktop using HPC grid
• 400 terabytes of storage
• 300 x 10 GbE ports
• 1500 x 1 GbE ports
• Desktop workstations automatically added to HPC grid after hours
• 5,000 AMD processors + 3,000 desktop processors at night

Monsters University
• 100 Million CPU hours
• 5.5 million individual hairs
• 127 simulated garments
• Global illumination ray tracing
Key point #1
Perhaps we can learn new techniques from
other industries that operate at scale

Software
Bimodal Distribution of Developers
This shapes Architecture and Design Innovation
Loosely coupled code
Fast hardware
Open source
Closely coupled code
Slow hardware
More optimization
Geoscience Gap
Massive hardware changes
Better compilers and cheaper hardware have
changed everything about software development
• No more Fortran (sort of)
• Object oriented approach
• Teenage internet billionaires
Software access patterns affect memory
speed ( affected by data and users )
Word Size Affects
Memory Bandwidth
Temporal Locality &
Spatial Locality
Can affect bandwidth
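As a minimal sketch of the spatial locality point, the two functions below sum the same row-major matrix in two different orders. The function names and the flat array layout are illustrative assumptions, not code from the talk. Both return the same total, but the column-order walk jumps `cols` elements between accesses, so on large matrices it typically makes poorer use of each cache line.

```c
#include <assert.h>
#include <stddef.h>

/* Sum a row-major matrix row by row: consecutive accesses touch
 * adjacent addresses, so each cache line fetched is fully used. */
double sum_row_major(const double *m, size_t rows, size_t cols)
{
    double total = 0.0;
    assert(m != NULL);
    for (size_t r = 0; r < rows; r++)
        for (size_t c = 0; c < cols; c++)
            total += m[r * cols + c];
    return total;
}

/* Sum the same row-major matrix column by column: consecutive accesses
 * are "cols" elements apart, so spatial locality is poor. */
double sum_col_major(const double *m, size_t rows, size_t cols)
{
    double total = 0.0;
    assert(m != NULL);
    for (size_t c = 0; c < cols; c++)
        for (size_t r = 0; r < rows; r++)
            total += m[r * cols + c];
    return total;
}
```

Timing both functions on a matrix much larger than the last-level cache is one simple way to see the effect on a given machine.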

Memory Mountain software code
/* Iterate over the first "elems" elements of array "data" with a stride
 * of "stride". "data" is assumed to be a global array of doubles defined
 * elsewhere (it is not shown on the slide). */
void test(int elems, int stride)
{
    int i;
    double result = 0.0;
    volatile double sink;

    for (i = 0; i < elems; i += stride)
        result += data[i];
    sink = result; /* So compiler doesn't optimize away the loop */
}
Everything is a cache (memory hierarchy)
• Register, ~2ns
• Primary cache, ~4-5ns
• Secondary cache, ~30ns
• Main memory, ~22ns
• Magnetic Disk, ~3ms
• SSD, ~100µs
• File server on Gigabit ethernet
• Cloud
Bottleneck is the
memory bus
Bottleneck is the
network
New Paradigm for Optimization of Compute
at Cluster / Cloud level
• Pre sorting / caching of data for maximum
throughput
• Heuristic analysis at the application level
• Optimization of hardware resources determined by
the application
• Hardware switching based on access patterns of
application and user
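One way to picture the pre-sorting bullet: before issuing a batch of reads, the application can order them by file offset so the storage layer sees a mostly sequential access pattern. The sketch below assumes a hypothetical `read_request` record and uses the standard `qsort`; it illustrates the idea only and is not the system described in the talk.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical description of one pending read. */
typedef struct {
    long offset;      /* byte offset of the block in the file */
    int  request_id;  /* lets the caller match results to requests */
} read_request;

/* qsort comparator: ascending by offset, written to avoid overflow. */
static int by_offset(const void *a, const void *b)
{
    const read_request *ra = a;
    const read_request *rb = b;
    return (ra->offset > rb->offset) - (ra->offset < rb->offset);
}

/* Reorder pending reads so they can be issued in ascending offset order. */
void presort_requests(read_request *reqs, size_t n)
{
    assert(reqs != NULL);
    qsort(reqs, n, sizeof reqs[0], by_offset);
}
```

The same pattern applies one level up: an application that knows its own access pattern can order work for the cluster or the network, which is the kind of application-driven optimization the slide argues for.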
All developers are:
(artists | engineers | brilliant | clueless )
• There is no one right way to build a piece of software
• Heterogeneous development staff builds heterogeneous
solutions
• What about UI / UX (User Interface / User Experience)?
• Business workflows should drive UI / UX
• Steve Jobs was tyrannical about every detail fitting into his one
overarching product vision

Who are we ?
No sacred cows
• temp
Key point #2
Software developers shape the choice of
architecture and available tools

2 Companies with really “Big Data”
• $50 Billion in revenue
• 30,000 + employees
• Optimization throughout entire stack
• Google File System, operating system, Chrome
• 2,000,000 servers
• Free food to keep their developers working long
hours
Google
• Pluto switch

Google tools
• Google Hangouts - collaboration
• Google Maps
• Google Compute Engine
• Google BigQuery
Hpc lunch and learn
• $1 Billion data center in Iowa
• 450,000 servers
• API first development strategy
• Supports multiple interface connectivity using
“restful” applications
• Compete with UI / UX
• Creates user lock in through iterative conditioning
Iterative conditioning
• Workflows are hard to learn
• You should need software training to learn how to use software
• Software fatigue
• Switching cost
• Adoption rates
• Advanced features
• Tracking all of this and dynamic menus and configuration

Facebook tools and contributions
• Apache Cassandra ( Big data database, linear
scalability )
• Apache Thrift ( cross language services )
Architecture choices provide insight … still have to
implement for specifics of Oil and Gas
Open Source Licensing
• MIT X11 License – ANY use permissible
• BSD – Identical to MIT X11
• GPL – no linking
• LGPL – linking allowed
• Appliances – ethical versus legal
Must read the fine print before using, but can save a very large amount
of time by using these frameworks and implementations where
possible
Key point #3
Internet companies have innovation at scale
Using REST architecture to go FAST

Representational State Transfer
• 6 constraints
• Client Server – clients are not concerned with data storage
• Stateless – server does not store client context
• Cacheable – client stores responses
• Layered system – client does not know if it is at end server or intermediary
• Optional code on demand – client downloads code and runs
• Uniform interface – decouples interface and allows each part to evolve
independently
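To make the stateless and cacheable constraints concrete, here is a minimal sketch of a request handler written as a pure function: everything it needs arrives in the request, and it keeps no per-client state between calls. The `request` and `response` structs and the `/volumes/<id>/slices/<n>` path scheme are hypothetical, chosen only for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical request: the client sends all context with every call. */
typedef struct {
    const char *method;  /* e.g. "GET" */
    const char *path;    /* e.g. "/volumes/42/slices/7" */
} request;

/* Hypothetical response. */
typedef struct {
    int  status;     /* HTTP-style status code */
    int  cacheable;  /* 1 if the client may store and reuse this response */
    char body[64];
} response;

/* Stateless handler: no globals, no session table. The same request
 * always produces the same response, which is what makes it cacheable. */
response handle(const request *req)
{
    response res = {404, 0, ""};
    int volume, slice;

    assert(req != NULL && req->method != NULL && req->path != NULL);
    if (strcmp(req->method, "GET") != 0) {
        res.status = 405;  /* only reads are supported in this sketch */
        return res;
    }
    if (sscanf(req->path, "/volumes/%d/slices/%d", &volume, &slice) == 2) {
        res.status = 200;
        res.cacheable = 1;
        snprintf(res.body, sizeof res.body, "volume=%d slice=%d", volume, slice);
    }
    return res;
}
```

Because the handler holds no client context, any server behind a load balancer or intermediary could answer the same request, which is also what the layered system constraint relies on.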
Simplified REST
[Diagram: Web Browser, Web Server, REST API, Database, File Servers, Compute Engine]
Presentation layer can’t handle geoscience or local compute
Web server has the majority of control

REST with Mashup
[Diagram: Web Browser, Web Server 1, Web Server 2, REST API, Database, File Servers, Compute Engine]
Presentation layer can mash up data from 2 separate sources
REST with new application layer
[Diagram: Form window, Application, OpenGL window, Web Browser, Web Server 2, REST API, Database, File Servers, Compute Engine]
Internet architecture / legacy style code
• REST Architecture for NON – INTERNET
applications
• Can keep inside corporate networks
• Distributed systems architecture
• Predominant web API design model
• Allows for distributed development team
• Separate data model from view model
• But allows for computation on either side

Software Demo
Client Server
• FINALLY !! Interactive HPC apps made easy
• Our tabs are the client’s connection to the application
layer via a “REST” style API
• Application layer provides caching and file system
access
• Application layer provides access to heterogeneous
compute
Stateless
• Each tab does not know about other tabs
• This creates the ability to very quickly have
developers from different teams and disciplines work
independently
• Application layer provides synchronization states
• Application layer provides for off-workstation
transferability ( work from iPad on the Beach )
Cacheable
• Heuristic data sorting and precaching based on user /
algorithm needs
• Allows for compute distribution without presentation layer
needing to know
• Allows for disparate file systems
• Abstracts data location from user
• Communicate with HPC grid in more advanced manner
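A small sketch of how an application layer might hide data location behind a cache: the caller asks for a block by key and never learns whether it came from memory or from a slow source. The direct-mapped layout, the `block_cache` type, and the `demo_load` stand-in for a file server read are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define CACHE_SLOTS 8

/* Hypothetical direct-mapped cache: each key maps to exactly one slot. */
typedef struct {
    int    valid[CACHE_SLOTS];
    long   key[CACHE_SLOTS];
    double value[CACHE_SLOTS];
    size_t hits;
    size_t misses;
} block_cache;

/* Signature of the slow path (file server, HPC grid, cloud, ...). */
typedef double (*loader_fn)(long key);

/* Stand-in for a slow remote read; a real loader would do I/O. */
double demo_load(long key)
{
    return (double)key * 0.5;
}

/* Return the value for "key", loading it through "load" only on a miss. */
double cache_get(block_cache *c, long key, loader_fn load)
{
    size_t slot;

    assert(c != NULL && load != NULL && key >= 0);
    slot = (size_t)key % CACHE_SLOTS;
    if (c->valid[slot] && c->key[slot] == key) {
        c->hits++;
        return c->value[slot];
    }
    c->misses++;
    c->valid[slot] = 1;
    c->key[slot] = key;
    c->value[slot] = load(key);  /* replaces whatever was in the slot */
    return c->value[slot];
}
```

A zero-initialized `block_cache` starts empty. Precaching in the sense of the slide would amount to calling `cache_get` ahead of time for keys that a heuristic predicts the user or algorithm will need next.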

Layered System
• Allows for use of 3rd party plugins
• Allows EVERY application to connect to the HPC grid
• Graphics as plugins
• Workflows as plugins - dynamic workflow
• No menu on Amazon
• Optimize each layer independently
Code on demand
• Safer since security is controlled by application layer
• Sandbox each user and only give access with additional security
credentials
• Can download and run legacy code through P/Invoke
• DLL injection
Uniform Interface
• HTML for cross platform consistency
• User adoption and ease of use
• Internet-style decoupling of functionality from
graphics creates a better user experience and a more
intuitive workflow
• Most graphic designers do NOT know C++
• Geoscientists won’t always agree on color scheme,
styles, icons
Most important benefits
• More flexibility means rapid application development and easier
maintenance
• Presentation layer needs change as business requirements
change over time
• Hooking into outside tools that have REST APIs
• Data
• Social
• Compute engines
• Mashups
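
As a rough sketch of the "hooking into outside tools" and mashup bullets: the example below composes resource URLs and merges replies from several services. Everything named here is hypothetical (`build_url`, `mashup`, `fake_get_json`, and the `example.com` hosts are placeholders, not real services), and the HTTP call is replaced by an offline stand-in so the example runs without a network.

```python
from typing import Callable, Dict
from urllib.parse import urlencode


def build_url(base: str, resource: str, **params: str) -> str:
    """Compose a REST-style resource URL with a query string."""
    query = urlencode(sorted(params.items()))
    url = f"{base.rstrip('/')}/{resource.lstrip('/')}"
    return f"{url}?{query}" if query else url


def mashup(sources: Dict[str, str], get_json: Callable[[str], dict]) -> dict:
    """Call each service once and merge the replies under the source's name."""
    return {name: get_json(url) for name, url in sources.items()}


# Hypothetical endpoints; the example.com hosts are placeholders.
sources = {
    "data": build_url("https://data.example.com/api", "surveys", region="gulf"),
    "compute": build_url("https://hpc.example.com/api/", "/jobs", status="running"),
}


def fake_get_json(url: str) -> dict:
    """Offline stand-in for an HTTP GET that returns parsed JSON."""
    return {"requested": url}


combined = mashup(sources, fake_get_json)
```

Because `mashup` only needs a callable that turns a URL into a dictionary, a data service, a social feed, and a compute engine can be combined the same way, which is the decoupling the slide argues for.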

Key point #4
A REST architecture enables scalability,
extensible development, and mashup of
tools and ideas created for the Internet
Interesting Technologies for Big Data
Google BigQuery
• The underlying technology is called Dremel
• Uses the Google File System as the storage abstraction for the database
• Dremel can even run a complex regular-expression text match on a huge
logging table of about 35 billion rows and 20 TB in merely tens of
seconds
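
One way to picture how a regular-expression match over billions of rows can finish quickly is to split a single text column into shards, scan the shards concurrently, and add up the partial results. The sketch below shows only that structure on toy data; it is not Dremel's implementation, the function names are made up for the example, and Python threads here illustrate the fan-out pattern rather than real scan throughput.

```python
import re
from concurrent.futures import ThreadPoolExecutor
from typing import List


def count_matches(shard: List[str], pattern: str) -> int:
    """Scan one shard of a single text column and count matching rows."""
    regex = re.compile(pattern)
    return sum(1 for value in shard if regex.search(value))


def parallel_regex_count(shards: List[List[str]], pattern: str) -> int:
    """Fan the scan out across shards, then add up the partial counts."""
    with ThreadPoolExecutor(max_workers=len(shards)) as pool:
        partials = pool.map(lambda shard: count_matches(shard, pattern), shards)
        return sum(partials)


# Toy "message" column of a log table, pre-split into three shards.
shards = [
    ["GET /index.html 200", "GET /missing 404"],
    ["POST /login 500", "GET /about 200"],
    ["GET /gone 404", "GET /teapot 418"],
]

not_found = parallel_regex_count(shards, r"\s404$")
```

Only the one column being matched is read, and each shard is scanned independently, so adding workers shortens the scan until the aggregation step dominates.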

Cassandra
• Cassandra provides a structured key-value store with tunable
consistency.
• Keys map to multiple values, which are grouped into column families.
The column families are fixed when a Cassandra database is created,
but columns can be added to a family at any time.
• Furthermore, columns are added only to specified keys, so different
keys can have different numbers of columns in any given family.
• The values from a column family for each key are stored together.
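
The data model in the bullets above can be imitated with nested dictionaries. This is an in-memory toy for illustration only: `ToyKeyspace` and the sample keys and values are invented for the example, and nothing here is the Cassandra client API.

```python
from collections import defaultdict
from typing import Dict

# row key -> column name -> value
ColumnFamily = Dict[str, Dict[str, str]]


class ToyKeyspace:
    """In-memory imitation of the data model described above (not a Cassandra client)."""

    def __init__(self, *families: str):
        # Column families are fixed at creation time.
        self._families: Dict[str, ColumnFamily] = {
            name: defaultdict(dict) for name in families
        }

    def insert(self, family: str, key: str, column: str, value: str) -> None:
        if family not in self._families:
            raise KeyError(f"unknown column family: {family}")
        # Columns are added per key, so rows need not share the same columns.
        self._families[family][key][column] = value

    def row(self, family: str, key: str) -> Dict[str, str]:
        """Return all of one key's columns in a family, stored together."""
        return dict(self._families[family].get(key, {}))


ks = ToyKeyspace("profile", "activity")
ks.insert("profile", "well-17", "operator", "Acme")
ks.insert("profile", "well-17", "depth_m", "3200")
ks.insert("profile", "well-42", "operator", "Globex")
```

Here `well-17` ends up with two columns in the `profile` family and `well-42` with one, while an insert into a family that was not declared at creation is rejected.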
Palantir
• Does work for government agencies
• High security layer that sits on top of disparate data sources
• The Palantir Stack Layer
• Brings together structured and unstructured data
• Serves as the foundation for applications using the data API
• Search and discovery layer
• Granular multi-layered security model
• Revisioning database and original source tracking
• Collaboration and data editing
Ayasdi
• Topological data analysis using machine learning
• Can cross-analyze multiple data sources
• Query-free approach
Zoom Data
• Automated connectivity to third party sources
• Visualization studio
• Interactive visualizations

WebGL (OpenGL in the web browser)
• Could be used for the presentation layer on a mobile device
http://demos.vicomtech.org/x3dom/test/functional/volrenShaderBoundaryEnh.xhtml
http://ourbricks.com/viewer/178d62ac29aa44459a6d57ce474fa6b6
Key point #5
Connect to these and other tools using REST
Questions?
john@advancedseismic.com
832.544.7305

Active Inference is a veryyyyyyyyyyyyyyyyyyyyyyyy
 
Observability For You and Me with OpenTelemetry
Observability For You and Me with OpenTelemetryObservability For You and Me with OpenTelemetry
Observability For You and Me with OpenTelemetry
 
Implementations of Fused Deposition Modeling in real world
Implementations of Fused Deposition Modeling  in real worldImplementations of Fused Deposition Modeling  in real world
Implementations of Fused Deposition Modeling in real world
 
[Talk] Moving Beyond Spaghetti Infrastructure [AOTB] 2024-07-04.pdf
[Talk] Moving Beyond Spaghetti Infrastructure [AOTB] 2024-07-04.pdf[Talk] Moving Beyond Spaghetti Infrastructure [AOTB] 2024-07-04.pdf
[Talk] Moving Beyond Spaghetti Infrastructure [AOTB] 2024-07-04.pdf
 
How to Build a Profitable IoT Product.pptx
How to Build a Profitable IoT Product.pptxHow to Build a Profitable IoT Product.pptx
How to Build a Profitable IoT Product.pptx
 
Transcript: Details of description part II: Describing images in practice - T...
Transcript: Details of description part II: Describing images in practice - T...Transcript: Details of description part II: Describing images in practice - T...
Transcript: Details of description part II: Describing images in practice - T...
 
Manual | Product | Research Presentation
Manual | Product | Research PresentationManual | Product | Research Presentation
Manual | Product | Research Presentation
 

Hpc lunch and learn

  • 1. IDEAS for thought SHPC lunch and learn JULY 25, 2013
  • 2. John D. Almon • Full stack software engineer • Implemented RTM on GPU using MPI • Implemented Cloud-based WEM using SOA • Terabyte scale database design and data warehousing • Architected hybrid web interpretation and processing system • C++, Java, MPI, C, Oracle PL/SQL, HTML, Web Based Systems, XML • Managed software team • Currently serves as CEO of Advanced Seismic Technologies
  • 4. Small HPC setup - Guess what company • Fiber optic to every desktop using HPC grid • 400 Terabytes of Storage • 300 x 10 GbE ports • 1500 x 1 GbE ports • Desktop workstations automatically added to HPC grid after hours • 5,000 AMD processors + 3,000 desktop processors at night
  • 7. Monsters University • 100 Million CPU hours • 5.5 million individual hairs • 127 simulated garments • Global illumination ray tracing
  • 8. Key point #1 Perhaps we can learn new techniques from other industries that operate at scale
  • 10. Bimodal Distribution of Developers • This shapes Architecture and Design • Innovation • Loosely coupled code • Fast hardware • Open source • Closely coupled code • Slow hardware • More optimization • Geoscience Gap • Massive hardware changes
  • 11. Better compilers and cheaper hardware have changed everything about software development • No more Fortran (sort of) • Object oriented approach • Teenage internet billionaires
  • 12. Software access patterns affect memory speed (affected by data and users) • Word size affects memory bandwidth • Temporal locality and spatial locality can affect bandwidth
  • 13. Memory Mountain software code /* Iterate over first "elems" elements of array "data" with stride of * "stride". */ void test(int elems, int stride) { int i; double result = 0.0; volatile double sink; for (i = 0; i < elems; i += stride) result += data[i]; sink = result; /* So compiler doesn't optimize away the loop */ }
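The slide's `test` function reads an array named `data` that the slide does not declare. The following is a minimal, self-contained sketch of the same strided read, assuming a global array of `MAX_ELEMS` doubles; the array size and the helper names `init_data`, `strided_sum`, and `read_throughput_mb_s` are illustrative choices, not part of the original deck.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

#define MAX_ELEMS (1 << 20)  /* assumed array size; not given on the slide */

static double data[MAX_ELEMS];

/* Fill the array with 1.0 so a strided sum equals the number of elements touched. */
void init_data(void)
{
    for (size_t i = 0; i < MAX_ELEMS; i++)
        data[i] = 1.0;
}

/* Sum the first "elems" elements of "data", visiting every "stride"-th one. */
double strided_sum(int elems, int stride)
{
    assert(elems >= 0 && elems <= MAX_ELEMS && stride > 0);
    double result = 0.0;
    for (int i = 0; i < elems; i += stride)
        result += data[i];
    return result;
}

/* Time "passes" strided sweeps and return MB read per second of CPU time.
 * Returns 0.0 if the run was too short for clock() to resolve. */
double read_throughput_mb_s(int elems, int stride, int passes)
{
    volatile double sink = 0.0;  /* keeps the compiler from discarding the loop */
    clock_t start = clock();
    for (int p = 0; p < passes; p++)
        sink += strided_sum(elems, stride);
    clock_t stop = clock();

    double seconds = (double)(stop - start) / CLOCKS_PER_SEC;
    if (seconds <= 0.0)
        return 0.0;

    /* Number of elements actually read in one sweep. */
    double touched = (elems == 0) ? 0.0 : (double)((elems - 1) / stride + 1);
    return touched * passes * sizeof(double) / (1024.0 * 1024.0) / seconds;
}
```

Calling `init_data()` once and then `read_throughput_mb_s` over a grid of `elems` and `stride` values would give the kind of surface the slide title refers to. The absolute numbers depend heavily on the machine and compiler flags.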
  • 14. Everything is a cache (memory hierarchy) • Register, ~2ns • Primary cache, ~4-5ns • Secondary cache, ~30ns • Main memory, ~22ns • Magnetic Disk, ~3ms • SSD, ~100µs • File server on Gigabit ethernet • Cloud • Bottleneck is the memory bus • Bottleneck is the network
  • 15. New Paradigm for Optimization of Compute at Cluster / Cloud level • Presorting / caching of data for maximum throughput • Heuristic analysis at the application level • Optimization of hardware resources determined by the application • Hardware switching based on access patterns of application and user
  • 16. All developers are: (artists | engineers | brilliant | clueless) • There is no one right way to build a piece of software • Heterogeneous development staff builds heterogeneous solutions • What about UI / UX (User Interface / User Experience) • Business workflows should drive UI / UX • Steve Jobs was tyrannical about every detail fitting into his one overarching product vision
  • 20. Key point #2 Software developers shape the choice of architecture and available tools
  • 21. 2 Companies with really “Big Data”
  • 23. • $50 Billion in revenue • 30,000 + employees • Optimization throughout entire stack • Google Filesystem, Operating System, CHROME • 2,000,000 servers • Free food to keep their developers working long hours
  • 25. Google tools • Google Hangouts - collaboration • Google Maps • Google Compute Engine • Google BigQuery
  • 27. • $1 Billion data center in Iowa • 450,000 servers • API first development strategy • Supports multiple interface connectivity using “RESTful” applications • Compete with UI / UX • Creates user lock-in through iterative conditioning
  • 28. Iterative conditioning • Workflows are hard to learn • You should need software training to learn how to use software • Software fatigue • Switching cost • Adoption rates • Advanced features • Tracking all of this and dynamic menus and configuration
  • 29. Facebook tools and contributions • Apache Cassandra (big data database, linear scalability) • Apache Thrift (cross-language services) • Architecture choices provide insight … still have to implement for specifics of Oil and Gas
  • 30. Open Source Licensing • MIT X11 License – ANY use permissible • BSD – Identical to MIT X11 • GPL – no linking • LGPL – linking allowed • Appliances – ethical versus legal • Must read the fine print before using, but can save a very large amount of time by using these frameworks and implementations where possible
  • 31. Key point #3 Internet companies have innovation at scale
  • 34. Representational State Transfer • 6 constraints • Client Server – clients are not concerned with data storage • Stateless – server does not store client context • Cacheable – client stores responses • Layered system – client does not know if it is at end server or intermediary • Optional code on demand – client downloads code and runs • Uniform interface – decouples interface and allows each part to evolve independently
  • 35. Representational State Transfer • 6 constraints • Client Server – clients are not concerned with data storage • Stateless – server does not store client context • Cacheable – client stores responses • Layered system – client does not know if it is at end server or intermediary • Optional code on demand – client downloads code and runs • Uniform interface – decouples interface and allows each part to evolve independently
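To make the stateless and cacheable constraints concrete, here is a minimal sketch of a request handler written as a pure function: everything it needs arrives in the request, and it keeps no per-client state between calls. The `request` and `response` types, the `/traces/<id>` resource, and the JSON body are hypothetical and chosen only for illustration; a real service would sit behind an HTTP library.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical request: everything the server needs travels with it. */
typedef struct {
    const char *method;  /* "GET", "PUT", ... */
    const char *path;    /* e.g. "/traces/42" */
} request;

/* Hypothetical response with an HTTP-style status code. */
typedef struct {
    int  status;
    int  cacheable;      /* 1 if the client may store and reuse the response */
    char body[128];
} response;

/* Stateless handler: no globals and no session table, so the same
 * request always produces the same response. */
void handle_request(const request *req, response *res)
{
    int id;
    char extra;

    memset(res, 0, sizeof *res);

    if (strcmp(req->method, "GET") != 0) {
        res->status = 405;
        snprintf(res->body, sizeof res->body, "method not allowed");
        return;
    }

    /* Accept exactly "/traces/<non-negative integer>" with nothing after it. */
    if (sscanf(req->path, "/traces/%d%c", &id, &extra) == 1 && id >= 0) {
        res->status = 200;
        res->cacheable = 1;
        snprintf(res->body, sizeof res->body,
                 "{\"trace\": %d, \"samples\": %d}", id, 1000 + id);
        return;
    }

    res->status = 404;
    snprintf(res->body, sizeof res->body, "not found");
}
```

Because the handler holds no client context, any server instance can answer any request, which is what lets a layer of intermediaries or caches be placed in front of it without the client noticing.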
  • 36. Simplified REST • Web Browser • Web Server • Database • File Servers • Presentation Layer can’t handle Geoscience or local compute • Web server has the majority of control • Compute Engine • REST API
  • 38. REST with Mashup • Web Browser • Web Server 1 • Database • File Servers • Presentation Layer can mashup data from 2 separate sources • Compute Engine • Web Server 2 • REST API
  • 39. REST with new application layer • Form window • Application • Database • File Servers • Compute Engine • Web Server 2 • REST API • OpenGL Window • Web Browser
  • 40. Internet architecture / legacy style code • REST Architecture for NON-INTERNET applications • Can keep inside corporate networks • Distributed systems architecture • Predominant web API design model • Allows for distributed development team • Separate data model from view model • But allows for computation on either side
  • 42. Client Server • FINALLY !! Interactive HPC apps made easy • Our tabs are the client's connection to the application layer via a “REST” style API • Application layer provides caching and file system access • Application layer provides access to heterogeneous compute
  • 43. Stateless • Each tab does not know about other tabs • This makes it possible for developers from different teams and disciplines to quickly work independently • Application layer provides synchronization states • Application layer provides for off-workstation transferability (work from iPad on the beach)
  • 44. Cacheable • Heuristic data sorting and precaching based on user / algorithm needs • Allows for compute distribution without presentation layer needing to know • Allows for disparate file systems • Abstracts data location from user • Communicate with HPC grid in a more advanced manner
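As one possible illustration of an application-layer cache that hides data location from the presentation layer, the sketch below keeps a few recently used blocks in memory and evicts the least recently used one on a miss. The names `block_cache`, `cache_get`, and `demo_load` are hypothetical; `demo_load` merely stands in for a slow read from a file server or HPC grid, and the deck's heuristic presorting is not modeled here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define CACHE_SLOTS 4  /* assumed capacity, kept tiny for illustration */

typedef struct {
    int           valid;
    int           key;        /* block identifier */
    double        value;      /* cached payload (a single number here) */
    unsigned long last_used;  /* logical time of the most recent access */
} slot;

typedef struct {
    slot          slots[CACHE_SLOTS];
    unsigned long tick;             /* logical clock, bumped on every lookup */
    unsigned long hits, misses;
    double      (*load)(int key);   /* backend fetch used on a miss */
} block_cache;

/* Stand-in for a slow backend read; purely illustrative. */
double demo_load(int key)
{
    return key * 0.5;
}

void cache_init(block_cache *c, double (*load)(int key))
{
    memset(c, 0, sizeof *c);
    c->load = load;
}

/* Return the value for "key", loading it and evicting the least
 * recently used slot if the key is not already cached. */
double cache_get(block_cache *c, int key)
{
    slot *victim = &c->slots[0];

    c->tick++;
    for (size_t i = 0; i < CACHE_SLOTS; i++) {
        slot *s = &c->slots[i];
        if (s->valid && s->key == key) {
            s->last_used = c->tick;
            c->hits++;
            return s->value;
        }
        /* Prefer an empty slot; otherwise track the oldest valid one. */
        if (victim->valid && (!s->valid || s->last_used < victim->last_used))
            victim = s;
    }

    c->misses++;
    victim->valid = 1;
    victim->key = key;
    victim->value = c->load(key);
    victim->last_used = c->tick;
    return victim->value;
}
```

The caller only ever asks for a key; whether the value came from memory or from the backend is visible solely through the `hits` and `misses` counters.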
  • 45. Layered System • Allows for use of 3rd party plugins • Allows EVERY application to connect to HPC grid • Graphics as plugins • Workflows as plugins - dynamic workflow • No menu on Amazon • Optimize each layer independently
  • 46. Code on demand • Safer since security is controlled by application layer • Sandbox each user and only give access with additional security credentials • Can download and run legacy code through P/Invoke • DLL injection
  • 47. Uniform Interface • HTML for cross platform consistency • User adoption and ease of use • Internet style decoupling of functionality from graphics creates a better user experience and more intuitive style workflow • Most graphic designers do NOT know C++ • Geoscientists won’t always agree on color scheme, styles, icons
  • 48. Most important benefits • More flexibility means rapid application development and easier maintenance • Presentation layer needs change as business requirements change over time • Hooking into outside tools that have REST APIs • Data • Social • Compute engines • Mashups
  • 49. Key point #4 A REST architecture enables scalability, extensible development, and mashup of tools and ideas created for the Internet
  • 52. Google BigQuery • Underlying technology is called Dremel • Uses Google File System as abstraction for database • Dremel can even execute complex regular expression text matching on a huge logging table of about 35 billion rows and 20 TB in merely tens of seconds
  • 53. Cassandra • Cassandra provides a structured key-value store with tunable consistency. • Keys map to multiple values, which are grouped into column families. The column families are fixed when a Cassandra database is created, but columns can be added to a family at any time. • Furthermore, columns are added only to specified keys, so different keys can have different numbers of columns in any given family. • The values from a column family for each key are stored together.
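The data model described on this slide can be pictured with a small in-memory sketch: a column family maps each row key to its own list of named columns, so different keys can carry different numbers of columns. This is only an illustration of the layout, not the Cassandra API; the type and function names (`column_family`, `cf_put`, `cf_get`) and the fixed capacities are assumptions, and names longer than the buffers are silently truncated.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_ROWS  16   /* assumed capacities, kept small for illustration */
#define MAX_COLS  8
#define NAME_LEN  32
#define VALUE_LEN 64

typedef struct { char name[NAME_LEN]; char value[VALUE_LEN]; } column;

/* All columns for one key are stored together in the row. */
typedef struct { char key[NAME_LEN]; column cols[MAX_COLS]; size_t ncols; } row;

typedef struct { char name[NAME_LEN]; row rows[MAX_ROWS]; size_t nrows; } column_family;

void cf_init(column_family *cf, const char *name)
{
    memset(cf, 0, sizeof *cf);
    snprintf(cf->name, sizeof cf->name, "%s", name);
}

/* Return the row stored under "key", or NULL if the key is absent. */
row *cf_find_row(column_family *cf, const char *key)
{
    for (size_t i = 0; i < cf->nrows; i++)
        if (strcmp(cf->rows[i].key, key) == 0)
            return &cf->rows[i];
    return NULL;
}

/* Set column "name" to "value" for "key", creating the row or column
 * if needed. Returns 0 on success, -1 if a capacity limit is hit. */
int cf_put(column_family *cf, const char *key, const char *name, const char *value)
{
    row *r = cf_find_row(cf, key);
    if (r == NULL) {
        if (cf->nrows == MAX_ROWS)
            return -1;
        r = &cf->rows[cf->nrows++];
        snprintf(r->key, sizeof r->key, "%s", key);
    }
    for (size_t i = 0; i < r->ncols; i++) {
        if (strcmp(r->cols[i].name, name) == 0) {
            snprintf(r->cols[i].value, sizeof r->cols[i].value, "%s", value);
            return 0;
        }
    }
    if (r->ncols == MAX_COLS)
        return -1;
    column *c = &r->cols[r->ncols++];
    snprintf(c->name, sizeof c->name, "%s", name);
    snprintf(c->value, sizeof c->value, "%s", value);
    return 0;
}

/* Return the value of column "name" for "key", or NULL if either is absent. */
const char *cf_get(column_family *cf, const char *key, const char *name)
{
    row *r = cf_find_row(cf, key);
    if (r == NULL)
        return NULL;
    for (size_t i = 0; i < r->ncols; i++)
        if (strcmp(r->cols[i].name, name) == 0)
            return r->cols[i].value;
    return NULL;
}
```

For example, putting `depth_m` and `operator` under one key and only `depth_m` under another leaves the two rows with two columns and one column respectively, which is the per-key flexibility the slide describes.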
  • 54. Palantir • Does work for government agencies • High security layer that sits on top of disparate data sources • The Palantir Stack Layer • Brings together structured and unstructured data • Serves as foundation for applications using the data API • Search and discovery layer • Granular multi layered security model • Revisioning database and original source tracking • Collaboration and data editing
  • 55. Ayasdi • Topological data analysis using machine learning • Can cross analyze multiple data sources • Query free approach
  • 56. Zoom Data • Automated connectivity to third party sources • Visualization studio • Interactive visualizations
  • 57. WebGL (OpenGL in the web browser) • Could be used for presentation layer in mobile device • http://demos.vicomtech.org/x3dom/test/functional/volrenShaderBoundaryEnh.xhtml • http://ourbricks.com/viewer/178d62ac29aa44459a6d57ce474fa6b6
  • 58. Key point #5 Connect to these and other tools using REST