Discovering Emerging
Technology Through
Graph Analysis
GraphConnect | Chicago
June 2013
About Me
henry74@gmail.com || henry.hwangbo@us.pwc.com
@henry74
henry74
Founder / Director of PwC's Emerging Tech Lab
What is the Emerging Tech Lab?
We build stuff to help people get smart about applying technology to
solve problems
● Founded 3 years ago to identify and experiment with new
technologies relevant to but not widely adopted by the Enterprise
● Focuses on rapid prototyping & MVP build-outs for both
tactical internal projects and more creative, exploratory ideas
● Permanent core team, but operates a rotational program for
staff to provide them an opportunity for hands-on technical
experience, learning agile & lean principles, and exposure to a
startup-like environment
The Challenge

It usually starts with an idea…
“Build a platform to help discover emerging technologies.”
…followed by some pretty mock-ups…
…to raise expectations.
Envisioning success
● What are some emerging
technologies?
● How are they being used to solve
real problems?
● Who is talking about them?
● Who are the players?
● Are there related technologies?
● Get up to speed quickly
● Discover related topics
● Understand what is trending
● Find interesting applications
● See what's possible
What makes technology “emerging”?
● Cannot already be mainstream technology
● Needs to be more than a single event to be an emerging trend
● Must be growing in popularity, but not yet popular
● "Technology" could be a thing (e.g. nanotubes), but also an
aggregation or application of technologies (e.g. cloud
computing, quantified self)
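The criteria above can be read as a rule of thumb. The following is a minimal sketch only: the function name, the use of per-period mention counts, and both thresholds are assumptions made for illustration, not the Lab's actual scoring.

```python
def looks_emerging(mentions_per_period, mainstream_threshold=1000, min_active_periods=2):
    """Heuristic reading of the criteria; every threshold here is hypothetical."""
    active = [m for m in mentions_per_period if m > 0]
    # More than a single event: the topic must appear in several periods.
    if len(active) < min_active_periods:
        return False
    # Not mainstream / not yet popular: latest volume stays under the threshold.
    if mentions_per_period[-1] >= mainstream_threshold:
        return False
    # Growing in popularity: the latest period beats the earliest one.
    return mentions_per_period[-1] > mentions_per_period[0]
```

A steadily growing but still small series such as `[0, 3, 9, 27]` passes, while a one-off spike or an already large, flat series does not.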

The Journey
Initial design
[Pipeline diagram: Source (Data Feeds, RSS) → Pull & Store Raw Data (MongoDB) → Analyze → Visualize; remaining data stores shown: ?, Postgres]
Breaking ground
● Natural Language Processing
● Named Entity Recognition
● ???
● ???
● ???
● ???
● ???
Extract Text → Understand Text → Discover Insights
A bit more clarity
[Pipeline diagram: Source (Data Feeds, RSS) → Pull & Store Raw Data (MongoDB) → Tag & Update (3rd Party APIs) → Analyze → Visualize; remaining data stores shown: ?, Postgres]

Digging a little deeper
● Natural Language Processing
● Named Entity Recognition
● Collocation?
● K-means clustering?
● Information Ontology?
● ???
● ???
Extract Text → Understand Text → Discover Insights
The Eureka moment...
…took a bit longer than it should have
Graphs are everywhere
Final design
[Pipeline diagram: Source (Data Feeds, RSS) → Pull & Store Raw Data (MongoDB) → Tag & Update (3rd Party API) → Analyze → Visualize; remaining data stores shown: Neo4j, Postgres]
Discovering Emerging Tech through Graph Analysis - Henry Hwangbo @ GraphConnect Chicago 2013

Lesson #1 - Graph data modeling is iterative
…but more flexible than traditional data modeling
What should be a node, a relationship, or a property? It depends on:
● What will you search on?
● How do you start your searches?
● How much data do you expect to have? What data?
Expect to change your graph based on:
● Experimentation
● Query syntax available to extract and aggregate graph data
● Query performance
TIP: Plan to reload your graph many times - save the raw data, start small,
and use batch loading until you get it right
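The reload tip can be sketched as a small loop that replays saved raw data in fixed-size batches. This is an illustration under assumptions: the raw data is taken to be one JSON record per line, and `load_batch` is a hypothetical callback standing in for whatever batch import call the graph database client offers.

```python
import json
from itertools import islice

def batches(records, size):
    """Yield lists of at most `size` records."""
    it = iter(records)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk

def reload_graph(raw_lines, load_batch, batch_size=500, sample=None):
    """Replay saved raw records; `sample` limits the run while the model is still changing."""
    records = (json.loads(line) for line in raw_lines)
    if sample is not None:
        records = islice(records, sample)  # start small
    loaded = 0
    for chunk in batches(records, batch_size):
        load_batch(chunk)  # hypothetical: one round trip per batch
        loaded += len(chunk)
    return loaded
```

Keeping the raw records outside the graph is what makes repeated reloads cheap: a changed model only means a changed `load_batch`.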
Modeling the data
Simple, yet powerful
[Diagram: two DOC nodes, each linked to surrounding P, C, K, O, and T nodes; relationship types shown: TAGGED_AS, RELATES_TO, REFERS_TO, CONTAINS]
Documents are described by their
entities, concepts, and keywords
through relationships
This means:
● Documents are related to other
documents through shared
entities, concepts, and keywords
● Concepts and entities are related
to each other through shared
documents
● Incoming relationships measure
the # of referring documents
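A minimal in-memory sketch of this model follows. The class and method names are hypothetical, and which relationship type attaches to which kind of node is an assumption, since the slide lists the types without the pairing.

```python
from collections import defaultdict

class DocGraph:
    """Documents point at entities, concepts, and keywords (hypothetical structure)."""

    def __init__(self):
        self.out = defaultdict(set)       # doc -> {(rel_type, target)}
        self.incoming = defaultdict(set)  # target -> {doc}

    def link(self, doc, rel_type, target):
        self.out[doc].add((rel_type, target))
        self.incoming[target].add(doc)

    def related_documents(self, doc):
        """Other documents sharing at least one target, with the shared-target count."""
        targets = {target for _, target in self.out.get(doc, ())}
        shared = defaultdict(int)
        for target in targets:
            for other in self.incoming[target]:
                if other != doc:
                    shared[other] += 1
        return dict(shared)

    def referring_documents(self, target):
        """Incoming relationship count, i.e. the number of referring documents."""
        return len(self.incoming.get(target, ()))
```

Two documents become related as soon as they share a target, without any explicit document-to-document relationship being stored.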
Lesson #2 - Connections are important
Highly connected data creates richer
graphs and increases potential for
discovering greater insights
BUT unnecessary connections can
create noise & extra work
Don't create artificial connections, but clean up data before importing when it
makes sense (e.g. networking, networks, network)
Prevent duplication which can impact your insights based on aggregation (e.g.
# of relationships) or certain patterns
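As a rough illustration of cleaning up before import, the sketch below folds the variants from the example above into one node key so that relationship counts are not split across duplicates. `crude_stem` is a deliberately simplistic stand-in for a proper stemmer, and both function names are hypothetical.

```python
from collections import defaultdict

def crude_stem(word):
    """Strip a trailing 'ing' or 's'; a toy normalizer, not a real stemming algorithm."""
    w = word.lower()
    for suffix in ("ing", "s"):
        if w.endswith(suffix) and len(w) - len(suffix) >= 3:
            return w[: -len(suffix)]
    return w

def merge_keywords(tagged):
    """tagged: iterable of (doc_id, keyword). Returns {canonical keyword: set of doc ids}."""
    merged = defaultdict(set)
    for doc_id, keyword in tagged:
        merged[crude_stem(keyword)].add(doc_id)
    return dict(merged)
```

Because the documents are collected in a set, tagging the same document twice with the same keyword does not inflate the count.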
Keeping it clean
Techniques and their graph benefits:
Text extraction with readability scoring
● Better named entity extraction
● Improve neighbor relevance
● Minimize invalid nodes & relationships
Similarity hashing
● Improve validity of relationships
● Increase graph connectedness
Porter stemming
● Improve graph connectedness
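The slide does not say which similarity hash was used. As one possibility, the sketch below assumes a SimHash-style fingerprint over word tokens, where near-duplicate texts tend to differ in only a few bits; the tokenization, the MD5 token hash, and the 64-bit width are all illustrative choices.

```python
import hashlib
import re

def simhash(text, bits=64):
    """SimHash-style fingerprint: each token votes +1/-1 on every bit position."""
    weights = [0] * bits
    for token in re.findall(r"\w+", text.lower()):
        digest = hashlib.md5(token.encode("utf-8")).digest()
        h = int.from_bytes(digest[: bits // 8], "big")
        for i in range(bits):
            weights[i] += 1 if (h >> i) & 1 else -1
    fingerprint = 0
    for i, w in enumerate(weights):
        if w > 0:
            fingerprint |= 1 << i
    return fingerprint

def hamming_distance(a, b):
    """Number of differing bits between two fingerprints."""
    return bin(a ^ b).count("1")
```

Documents whose fingerprints fall within a small Hamming distance of each other could be treated as duplicates before import; the cutoff would have to be tuned on real data.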

Lesson #3 - Understand Cypher
● Experimenting with Cypher opens up what is possible
● SQL users will be at home - tabular results, similar
syntax
● Start without parameters, check with Neo4j shell,
move to parameterized queries for security &
performance (caching)
● Don't forget Lucene syntax
● Continues to evolve for the better - check new release
changes (http://docs.neo4j.org/refcard/1.9/)
● Let Cypher do the work
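To illustrate the move to parameterized queries, here is a hedged sketch that keeps the query text constant and passes user input only as parameters. The query uses 1.9-era Cypher (`START` with an index lookup, `{name}` parameters); the index name `keywords`, the `name` and `title` properties, the direction of `CONTAINS`, and the JSON payload shape are assumptions, and the query has not been run against a server.

```python
import json

# Constant query text: the server can cache its plan, and user input never
# becomes part of the Cypher string. Names below are hypothetical.
QUERY = (
    "START k=node:keywords({lucene_query}) "
    "MATCH doc-[:CONTAINS]->k "
    "RETURN doc.title AS title, count(k) AS hits "
    "ORDER BY hits DESC LIMIT {limit}"
)

def build_request(user_terms, limit=10):
    """Build a query-plus-params payload; terms are quoted for Lucene."""
    def quote(term):
        return '"%s"' % term.replace("\\", "\\\\").replace('"', '\\"')
    lucene_query = " OR ".join("name:" + quote(t) for t in user_terms)
    return json.dumps(
        {"query": QUERY, "params": {"lucene_query": lucene_query, "limit": limit}}
    )
```

The Lucene expression still needs its own quoting, which is why the search terms are escaped even though they travel as a parameter.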
Useful Cypher Syntax
START with an index
MATCH defines your universe
WHERE filters it down
WITH combines multiple statements
HAS checks if a property exists
AS lets you name your return values
IN checks against an array
COLLECT aggregates into an array
ORDER just like SQL
LIMIT for performance
Prototype highlights
● 4 people & 4 months (first version)
● Data Stores - Neo4j, MongoDB, Redis, Postgres
● Visuals - D3.js, Vivagraph.js, Twitter Bootstrap
● Key Languages/Libraries - Ruby, Rails, Cypher,
Knockout.js, Amplify.js, HTML5, CSS3, jQuery,
Neography gem, Resque gem
● 3rd Party - Alchemy, OpenCalais, RSS feeds,
Wikipedia
● Concepts - natural language processing, named
entity extraction, text cleansing & de-duplication
(map/reduce), similarity hashing, large-scale
information retrieval
● 1M+ nodes, 3M+ relationships, 6M+ properties after
6 months
Emerging Tech Radar Demo

Tag Cloud / Search
[Diagram: DOC nodes linked to shared concept (C) and keyword (K) nodes]
● Index keywords and search across keywords (tip: use Lucene syntax)
● Identify documents with strong relationships to keywords
● Locate concepts with strongest relationships to relevant documents
● Popularity based on number of incoming relationships
Emerging Index / Popularity / Doc List
[Diagram: a concept (C) node and an organization (O) node linked to shared DOC nodes, each document marked (E) or (NE)]
Cloud computing (Concept) and Google (Org)
● Strong relationships with documents shared between concepts to filter
and rank relevant content
● Ratio and strength of relationships to quantify emerging index
● Popularity based on number of incoming relationships of each type of
document (emerging versus non-emerging)
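The slide gives no formula for the emerging index. Purely as an illustration, the sketch below assumes it is the weighted share of emerging documents among all documents pointing at a concept, with popularity reported as a per-type count; both function names and the weighting are hypothetical.

```python
def emerging_index(links):
    """links: list of (is_emerging_doc, relationship_weight) for one concept.

    Assumed definition: weight of emerging documents / weight of all documents.
    """
    emerging = sum(weight for is_emerging, weight in links if is_emerging)
    total = sum(weight for _, weight in links)
    return emerging / total if total else 0.0

def popularity(links):
    """Incoming relationship counts, split by document type."""
    emerging = sum(1 for is_emerging, _ in links if is_emerging)
    return {"emerging": emerging, "non_emerging": len(links) - emerging}
```

A value near 1.0 would mean the concept is referenced almost exclusively by emerging documents, while a value near 0.0 would point to established coverage.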
Node Graph
[Diagram: concept (C), keyword (K), and organization (O) nodes connected through shared DOC nodes]
● Existing relationships with documents shared between concepts to
filter relevant neighbors
● Strength of relationships based on # and weight for ranking relevance
(color)
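A small sketch of the neighbor ranking described above: nodes count as neighbors when they share documents, and are ranked by a strength value. Taking strength as the sum over shared documents of the product of the two relationship weights is an assumption for illustration, as are the function name and the input shape.

```python
from collections import defaultdict

def ranked_neighbors(start, doc_links):
    """doc_links: {doc_id: {node: weight}}.

    Returns [(neighbor, shared_doc_count, strength)], strongest first.
    """
    shared = defaultdict(int)
    strength = defaultdict(float)
    for nodes in doc_links.values():
        if start not in nodes:
            continue  # only documents that mention the start node matter
        for node, weight in nodes.items():
            if node != start:
                shared[node] += 1
                strength[node] += nodes[start] * weight  # assumed weighting
    rows = [(n, shared[n], strength[n]) for n in shared]
    return sorted(rows, key=lambda r: (-r[2], -r[1], r[0]))
```

In a visualization, the strength value is what would drive the color of each neighbor.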
The Takeaway

Final Thoughts
● Graphs make it simple to generate complex insights - you don't
need to be a data scientist
● Graphs are a natural fit for anything connected...which is most
things (e.g. social media, internet of things, sensor data)
● Experimentation is the best way to learn the power of graphs
● Make graph databases a first class citizen in your technology
toolkit - many things can be solved better with a graph
The best way to discover emerging technologies is to try
them out
Thanks for Listening - Q & A
Special thanks to Max De Marzi for his neography gem
(https://github.com/maxdemarzi/neography) and ongoing advice, suggestions,
and troubleshooting

More Related Content

What's hot

Approaches to text analysis
Approaches to text analysisApproaches to text analysis
Approaches to text analysis
Sigmoid
 
Indexing, searching, and aggregation with redi search and .net
Indexing, searching, and aggregation with redi search and .netIndexing, searching, and aggregation with redi search and .net
Indexing, searching, and aggregation with redi search and .net
Stephen Lorello
 
Neo4jrb
Neo4jrbNeo4jrb
Neo4jrb
andreasronge
 
A view from the ivory tower: Participating in Apache as a member of academia
A view from the ivory tower: Participating in Apache as a member of academiaA view from the ivory tower: Participating in Apache as a member of academia
A view from the ivory tower: Participating in Apache as a member of academia
Michael Mior
 
Natural Language Processing with Graph Databases and Neo4j
Natural Language Processing with Graph Databases and Neo4jNatural Language Processing with Graph Databases and Neo4j
Natural Language Processing with Graph Databases and Neo4j
William Lyon
 
Building a Knowledge Graph with Spark and NLP: How We Recommend Novel Drugs t...
Building a Knowledge Graph with Spark and NLP: How We Recommend Novel Drugs t...Building a Knowledge Graph with Spark and NLP: How We Recommend Novel Drugs t...
Building a Knowledge Graph with Spark and NLP: How We Recommend Novel Drugs t...
Databricks
 
Sprint_1_Python_vs_R
Sprint_1_Python_vs_RSprint_1_Python_vs_R
Sprint_1_Python_vs_R
BobSmith712
 
South Big Data Hub: Text Data Analysis Panel
South Big Data Hub: Text Data Analysis PanelSouth Big Data Hub: Text Data Analysis Panel
South Big Data Hub: Text Data Analysis Panel
Trey Grainger
 
Dice.com Bay Area Search - Beyond Learning to Rank Talk
Dice.com Bay Area Search - Beyond Learning to Rank TalkDice.com Bay Area Search - Beyond Learning to Rank Talk
Dice.com Bay Area Search - Beyond Learning to Rank Talk
Simon Hughes
 
R programming for psychometrics
R programming for psychometricsR programming for psychometrics
R programming for psychometrics
Diane Talley
 
Interleaving, Evaluation to Self-learning Search @904Labs
Interleaving, Evaluation to Self-learning Search @904LabsInterleaving, Evaluation to Self-learning Search @904Labs
Interleaving, Evaluation to Self-learning Search @904Labs
John T. Kane
 
User behaviour modeling for data prefetching in web applications
User behaviour modeling for data prefetching in web applicationsUser behaviour modeling for data prefetching in web applications
User behaviour modeling for data prefetching in web applications
Kacper Łukawski
 

What's hot (12)

Approaches to text analysis
Approaches to text analysisApproaches to text analysis
Approaches to text analysis
 
Indexing, searching, and aggregation with redi search and .net
Indexing, searching, and aggregation with redi search and .netIndexing, searching, and aggregation with redi search and .net
Indexing, searching, and aggregation with redi search and .net
 
Neo4jrb
Neo4jrbNeo4jrb
Neo4jrb
 
A view from the ivory tower: Participating in Apache as a member of academia
A view from the ivory tower: Participating in Apache as a member of academiaA view from the ivory tower: Participating in Apache as a member of academia
A view from the ivory tower: Participating in Apache as a member of academia
 
Natural Language Processing with Graph Databases and Neo4j
Natural Language Processing with Graph Databases and Neo4jNatural Language Processing with Graph Databases and Neo4j
Natural Language Processing with Graph Databases and Neo4j
 
Building a Knowledge Graph with Spark and NLP: How We Recommend Novel Drugs t...
Building a Knowledge Graph with Spark and NLP: How We Recommend Novel Drugs t...Building a Knowledge Graph with Spark and NLP: How We Recommend Novel Drugs t...
Building a Knowledge Graph with Spark and NLP: How We Recommend Novel Drugs t...
 
Sprint_1_Python_vs_R
Sprint_1_Python_vs_RSprint_1_Python_vs_R
Sprint_1_Python_vs_R
 
South Big Data Hub: Text Data Analysis Panel
South Big Data Hub: Text Data Analysis PanelSouth Big Data Hub: Text Data Analysis Panel
South Big Data Hub: Text Data Analysis Panel
 
Dice.com Bay Area Search - Beyond Learning to Rank Talk
Dice.com Bay Area Search - Beyond Learning to Rank TalkDice.com Bay Area Search - Beyond Learning to Rank Talk
Dice.com Bay Area Search - Beyond Learning to Rank Talk
 
R programming for psychometrics
R programming for psychometricsR programming for psychometrics
R programming for psychometrics
 
Interleaving, Evaluation to Self-learning Search @904Labs
Interleaving, Evaluation to Self-learning Search @904LabsInterleaving, Evaluation to Self-learning Search @904Labs
Interleaving, Evaluation to Self-learning Search @904Labs
 
User behaviour modeling for data prefetching in web applications
User behaviour modeling for data prefetching in web applicationsUser behaviour modeling for data prefetching in web applications
User behaviour modeling for data prefetching in web applications
 







Discovering Emerging Tech through Graph Analysis - Henry Hwangbo @ GraphConnect Chicago 2013

  • 1. Discovering Emerging Technology Through Graph Analysis GraphConnect | Chicago June 2013
  • 2. About Me henry74@gmail.com || henry.hwangbo@us.pwc.com @henry74 henry74 Founder / Director of PwC's Emerging Tech Lab
  • 3. What is the Emerging Tech Lab? We build stuff to help people get smart about applying technology to solve problems ● Founded 3 years ago to identify and experiment with new technologies relevant to but not widely adopted by the Enterprise ● Focuses on rapid prototyping & MVP build-outs for both tactical internal projects and more creative, exploratory ideas ● Permanent core team, but operates a rotational program for staff to provide them an opportunity for hands-on technical experience, learning agile & lean principles, and exposure to a startup-like environment
  • 5. It usually starts with an idea… “Build a platform to help discover emerging technologies.”
  • 6. …followed by some pretty mock-ups… …to raise expectations.
  • 7. Envisioning success ● What are some emerging technologies? ● How are they being used to solve real problems? ● Who is talking about them? ● Who are the players? ● Are there related technologies? ● Get up to speed quickly ● Discover related topics ● Understand what is trending ● Find interesting applications ● See what's possible
  • 8. What makes technology “emerging”? ● Cannot already be mainstream technology ● Needs to be more than a single event to be an emerging trend ● Must be growing in popularity, but not yet popular ● "Technology" could be a thing (e.g. nanotubes), but also an aggregation or application of technologies (e.g. cloud computing, quantified self)
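The criteria on this slide can be read as a simple filter. The sketch below is only an illustration under stated assumptions: the per-period mention counts, the `popular_threshold` value, and the function name `is_emerging` are hypothetical and do not come from the talk.

```python
def is_emerging(mentions_per_period, popular_threshold=100):
    """Check three of the slide's criteria against per-period mention counts.

    mentions_per_period: list of counts, oldest period first (hypothetical input).
    popular_threshold: assumed cut-off above which a topic counts as mainstream.
    """
    if not mentions_per_period:
        return False
    # "More than a single event": the topic shows up in at least two periods.
    active_periods = sum(1 for count in mentions_per_period if count > 0)
    more_than_single_event = active_periods >= 2
    # "Growing in popularity": the latest period beats the earliest one.
    growing = mentions_per_period[-1] > mentions_per_period[0]
    # "Not yet popular": the latest count is still under the threshold.
    not_yet_popular = mentions_per_period[-1] < popular_threshold
    return more_than_single_event and growing and not_yet_popular
```

A real system would likely need a more robust growth measure than comparing the first and last period; this version only shows how the three conditions combine.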
  • 10. Initial design [diagram labels: Data Feeds (RSS); Pull & Store Raw Data; MongoDB; Analyze; Visualize; Source; ?; Postgres]
  • 11. Breaking ground [stages: Extract Text; Understand Text; Discover Insights] ● Natural Language Processing ● Named Entity Recognition ● ??? ● ??? ● ??? ● ??? ● ???
  • 12. A bit more clarity [diagram labels: Data Feeds (RSS); Pull & Store Raw Data; MongoDB; Analyze; Visualize; Source; ?; 3rd Party APIs; Tag & Update; Postgres]
  • 13. Digging a little deeper [stages: Extract Text; Understand Text; Discover Insights] ● Natural Language Processing ● Named Entity Recognition ● Collocation? ● K-means clustering? ● Information Ontology? ● ??? ● ???
  • 14. The Eureka moment... …took a bit longer than it should have Graphs are everywhere
  • 15. Final design [diagram labels: Data Feeds (RSS); Pull & Store Raw Data; MongoDB; Analyze; Visualize; Source; 3rd Party API; Tag & Update; Neo4j; Postgres]
  • 17. Lesson #1 - Graph data modeling is iterative …but more flexible than traditional data modeling. What should be a node, relationship, or a property? Depends on: ● What will you search on? ● How do you start your searches? ● How much data do you expect to have? What data? Expect to change your graph based on: ● Experimentation ● Query syntax available to extract and aggregate graph data ● Query performance. TIP: Plan to reload your graph many times - save the raw data, start small, use batch loading until you get it right
  • 18. Modeling the data [diagram: DOC nodes linked to P, C, K, O, and T nodes; relationship types: TAGGED_AS, RELATES_TO, REFERS_TO, CONTAINS]. Documents are described by their entities, concepts, and keywords through relationships. This means: ● Documents are related to other documents through shared entities, concepts, and keywords ● Concepts and entities are related to each other through shared documents ● The number of incoming relationships on a node measures the number of referring documents. Simple, yet powerful
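The model above can be sketched with plain dictionaries, without a graph database. This is a minimal illustration, not the lab's implementation: the sample documents, tag names, and function names are hypothetical, and every tag type (entity, concept, keyword) is treated the same way.

```python
from collections import defaultdict

# Hypothetical sample data: document id -> nodes it points at
# (entities, concepts, and keywords are not distinguished here).
DOC_TAGS = {
    "doc1": {"cloud computing", "google"},
    "doc2": {"cloud computing", "nanotubes"},
    "doc3": {"google"},
}

def build_incoming(doc_tags):
    """Map each tag node to the set of documents with a relationship to it."""
    incoming = defaultdict(set)
    for doc, tags in doc_tags.items():
        for tag in tags:
            incoming[tag].add(doc)
    return incoming

def related_documents(doc, doc_tags, incoming):
    """Other documents sharing at least one tag with doc, with the shared-tag count."""
    shared = defaultdict(int)
    for tag in doc_tags[doc]:
        for other in incoming[tag]:
            if other != doc:
                shared[other] += 1
    return dict(shared)

def related_tags(tag, doc_tags, incoming):
    """Tags that co-occur with tag in at least one document."""
    found = set()
    for doc in incoming[tag]:
        found |= doc_tags[doc]
    found.discard(tag)
    return found

INCOMING = build_incoming(DOC_TAGS)
# Incoming relationship count per tag, i.e. the number of referring documents.
POPULARITY = {tag: len(docs) for tag, docs in INCOMING.items()}
```

Here `related_documents("doc1", DOC_TAGS, INCOMING)` walks document → tag → document, which is the same two-hop traversal the slide describes for documents related through shared entities, concepts, and keywords.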
  • 19. Lesson #2 - Connections are important. Highly connected data creates richer graphs and increases the potential for discovering greater insights, BUT unnecessary connections can create noise & extra work. Don't create artificial connections, but clean up data before importing when it makes sense (e.g. networking, networks, network). Prevent duplication, which can distort insights based on aggregation (e.g. # of relationships) or certain patterns.
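One way to picture the clean-up step is to normalize each term before it becomes a node, so that variants collapse into a single node. The sketch below assumes a deliberately crude suffix stripper as a stand-in for a real stemmer; it is not the Porter algorithm, and `normalize` and `merge_terms` are hypothetical names.

```python
def normalize(term):
    """Lowercase a term and strip a trailing 'ing' or plural 's'.

    This is a crude stand-in for a real stemmer, for illustration only.
    """
    word = term.strip().lower()
    if word.endswith("ing") and len(word) - 3 >= 4:
        return word[:-3]
    if word.endswith("s") and not word.endswith("ss") and len(word) - 1 >= 4:
        return word[:-1]
    return word

def merge_terms(terms):
    """Group raw terms under one node key so duplicates are not created."""
    nodes = {}
    for raw in terms:
        nodes.setdefault(normalize(raw), []).append(raw)
    return nodes
```

With this, `merge_terms(["networking", "Networks", "network"])` yields a single `"network"` key, so relationship counts for that node are not split across three near-identical nodes.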
  • 20. Keeping it clean (technique: graph benefits). Text extraction with readability scoring: ● Better named entity extraction ● Improve neighbor relevance ● Minimize invalid nodes & relationships. Similarity hashing: ● Improve validity of relationships ● Increase graph connectedness. Porter stemming: ● Improve graph connectedness
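The slide does not say which similarity hashing scheme was used. As an illustration, the sketch below uses SimHash, one well-known scheme, under the assumption that documents are compared by whitespace-separated tokens; the function names are hypothetical.

```python
import hashlib

BITS = 64  # fingerprint width; taken from the first 8 bytes of an MD5 digest

def simhash(text):
    """Compute a 64-bit SimHash fingerprint over whitespace-separated tokens."""
    weights = [0] * BITS
    for token in text.lower().split():
        digest = hashlib.md5(token.encode("utf-8")).digest()
        token_hash = int.from_bytes(digest[:8], "big")
        # Each token votes +1 or -1 on every bit position.
        for i in range(BITS):
            weights[i] += 1 if (token_hash >> i) & 1 else -1
    fingerprint = 0
    for i, weight in enumerate(weights):
        if weight > 0:
            fingerprint |= 1 << i
    return fingerprint

def hamming_distance(a, b):
    """Number of bit positions where two fingerprints differ."""
    return bin(a ^ b).count("1")
```

Two documents whose fingerprints are within a small Hamming distance can be treated as near-duplicates and loaded as a single document node. The distance cut-off is a tuning choice that the slides do not specify.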
  • 21. Lesson #3 - Understand Cypher ● Cypher experimentation opens up the possible ● SQL users will be at home - tabular results, similar syntax ● Start without parameters, check with Neo4j shell, move to parameterized queries for security & performance (caching) ● Don't forget Lucene syntax ● Continues to evolve for the better - check new release changes (http://docs.neo4j.org/refcard/1.9/) ● Let Cypher do the work
  • 22. Useful Cypher Syntax ● START with an index ● MATCH defines your universe ● WHERE filters it down ● WITH combines multiple statements ● HAS checks if a property exists ● AS lets you name your return values ● IN checks against an array ● COLLECT aggregates into an array ● ORDER just like SQL ● LIMIT for performance
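To show how the keywords on this slide fit together, here is a sketch of a single parameterized query in the 1.9-era Cypher syntax the talk refers to, held as a Python string. The index name `keywords`, the relationship directions, and the `name`, `title`, and `source` properties are all assumptions made for illustration; the talk does not show its actual queries, and newer Neo4j releases use a different syntax.

```python
# Cypher 1.9-style query text. Index name, relationship directions and
# property names are hypothetical.
TOP_CONCEPTS_QUERY = """
START k = node:keywords({lucene})
MATCH (doc)-[:CONTAINS]->(k)
WHERE has(doc.title) AND doc.source IN {sources}
WITH doc, count(k) AS hits
MATCH (doc)-[:TAGGED_AS]->(c)
RETURN c.name AS concept, collect(doc.title) AS titles, sum(hits) AS strength
ORDER BY strength DESC
LIMIT 10
"""


def build_params(prefix, sources):
    """Build the parameter map; `prefix` is assumed to be a single plain word."""
    return {"lucene": "name:%s*" % prefix, "sources": list(sources)}


params = build_params("cloud", ["rss"])
print(params)  # {'lucene': 'name:cloud*', 'sources': ['rss']}
```

The query text and the parameter map would be handed to whichever Neo4j client is in use. Keeping values in parameters rather than in the query string is what allows the security and caching benefits mentioned in Lesson #3.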
  • 23. Prototype highlights ● 4 people & 4 months (first version) ● Data Stores - Neo4j, MongoDB, Redis, Postgres ● Visuals - D3.js, Vivagraph.js, Twitter Bootstrap ● Key Languages/Libraries - Ruby, Rails, Cypher, Knockout.js, Amplify.js, HTML5, CSS3, jQuery, Neography gem, Resque gem ● 3rd Party - Alchemy, OpenCalais, RSS feeds, Wikipedia ● Concepts - natural language processing, named entity extraction, text cleansing & de-duplication (map/reduce), similarity hashing, large-scale information retrieval ● 1M+ nodes, 3M+ relationships, 6M+ properties after 6 months
  • 25. Tag Cloud / Search [diagram: DOC nodes linked to C and K nodes] ● Index keywords and search across keywords (tip: use Lucene syntax) ● Identify documents with strong relationships to keywords ● Locate concepts with strongest relationships to relevant documents ● Popularity based on number of incoming relationships
  • 26. Emerging Index / Popularity / Doc List [diagram: Cloud computing (Concept) and Google (Org) linked through emerging (E) and non-emerging (NE) documents] ● Strong relationships with documents shared between concepts to filter and rank relevant content ● Ratio and strength of relationships to quantify emerging index ● Popularity based on number of incoming relationships of each type of document (emerging versus non-emerging)
  • 27. Node Graph [diagram: C, K, and O nodes linked through shared DOC nodes] ● Existing relationships with documents shared between concepts to filter relevant neighbors ● Strength of relationships based on # and weight for ranking relevance (color)
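The neighbor ranking described here can be sketched in plain Python as follows. The sample documents and weights are made up, and summing the weights is only one plausible way to combine "# and weight"; the slide does not give the actual formula.

```python
from collections import defaultdict

# Hypothetical data: document -> {linked node: relationship weight}.
DOC_LINKS = {
    "doc1": {"cloud computing": 1.0, "Google": 0.75, "virtualization": 0.25},
    "doc2": {"cloud computing": 0.5, "Google": 0.5},
    "doc3": {"Google": 0.5, "search": 1.0},
}


def rank_neighbors(doc_links, concept):
    """Rank nodes that share documents with `concept`.

    Returns (node, shared document count, summed weight) tuples,
    strongest first. Summing the weights is an assumption.
    """
    scores = defaultdict(lambda: [0, 0.0])
    for links in doc_links.values():
        if concept not in links:
            continue  # only documents shared with the starting concept count
        for node, weight in links.items():
            if node == concept:
                continue
            scores[node][0] += 1
            scores[node][1] += weight
    ranked = [(node, count, total) for node, (count, total) in scores.items()]
    return sorted(ranked, key=lambda row: (-row[1], -row[2], row[0]))


print(rank_neighbors(DOC_LINKS, "cloud computing"))
# [('Google', 2, 1.25), ('virtualization', 1, 0.25)]
```

`search` never appears because it shares no document with the starting concept, which mirrors the filtering bullet on the slide.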
  • 29. Final Thoughts ● Graphs make it simple to generate complex insights - you don't need to be a data scientist ● Graphs are a natural fit for anything connected... which is most things (e.g. social media, internet of things, sensor data) ● Experimentation is the best way to learn the power of graphs ● Make graph databases a first-class citizen in your technology toolkit - many things can be solved better with a graph. The best way to discover emerging technologies is to try them out
  • 30. Thanks for Listening - Q & A. Special thanks to Max De Marzi for his neography gem (https://github.com/maxdemarzi/neography) and ongoing advice, suggestions, and troubleshooting