NEW LAUNCH! Natural Language Processing for Data Analytics - MCL343 - re:Invent 2017
- 1. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Amazon Comprehend
MCL343
November 30, 2017
Nino Bice, Product Manager
Dimitris Soulios, Engineer Manager
- 2.
Let’s Take a Look Around Us
- 3.
Growth of Natural Language Experiences
• Public Content is Relevant
• Social Media
• News
• Natural Language Customer Engagement
• Reviews/Comments
• Support (call, email feedback)
- 4.
Text Analytics at Scale
• AWS Platform value
• Amazon S3 Data Lakes
• Scalable, pay-for-what-you-use analytics
- 5.
Training NLP is Hard and Expensive
NLP Model
Data Collection and Prep
Training the model
Data annotation
- 6.
Amazon Comprehend: Natural Language Processing
😃
Sentiment Entities Languages Key phrases Topic modeling
POWERED BY DEEP LEARNING
- 7.
Amazon.com, Inc. is located in Seattle, WA and was founded July 5th, 1994 by Jeff Bezos. Our customers love buying everything from books to blenders at great prices
Named Entities
• Amazon.com: Organization
• Seattle, WA: Location
• July 5th, 1994: Date
• Jeff Bezos: Person
Key phrases
• Our customers
• books
• blenders
• great prices
Sentiment
• Positive
Language
• English
Text Analysis
- 8.
Topic Modeling
Document   Topic   Proportion
Doc.txt    0       .89
Doc.txt    1       .67
Doc.txt    2       .91
Topic   Term             Weight
0       Washington       .89
1       Silicon Valley   .67
2       Roasting         .91
Keywords, Topic Groups, Document Relationship to Topics
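The two tables above can be combined to group documents under a readable topic label. The following is a minimal sketch, assuming the job output is available as CSV text with the column names `docname,topic,proportion` and `topic,term,weight`; those headers, the document names, and the extra `doc_c.txt,0,0.09` row are illustrative assumptions, not taken from the slide.

```python
import csv
import io
from collections import defaultdict

# Hypothetical rows in the (document, topic, proportion) layout of the slide.
DOC_TOPICS_CSV = """docname,topic,proportion
doc_a.txt,0,0.89
doc_b.txt,1,0.67
doc_c.txt,2,0.91
doc_c.txt,0,0.09
"""

# Topic/term/weight rows reusing the three example terms from the slide.
TOPIC_TERMS_CSV = """topic,term,weight
0,Washington,0.89
1,Silicon Valley,0.67
2,Roasting,0.91
"""


def top_term_per_topic(csv_text):
    """Map each topic id to its highest-weighted term."""
    best = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        topic, weight = int(row["topic"]), float(row["weight"])
        if topic not in best or weight > best[topic][1]:
            best[topic] = (row["term"], weight)
    return {topic: term for topic, (term, _) in best.items()}


def dominant_topic_per_document(csv_text):
    """Map each document to the topic with the largest proportion."""
    best = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        topic, proportion = int(row["topic"]), float(row["proportion"])
        doc = row["docname"]
        if doc not in best or proportion > best[doc][1]:
            best[doc] = (topic, proportion)
    return {doc: topic for doc, (topic, _) in best.items()}


def group_documents(doc_csv, term_csv):
    """Group document names under the top term of their dominant topic."""
    labels = top_term_per_topic(term_csv)
    groups = defaultdict(list)
    for doc, topic in dominant_topic_per_document(doc_csv).items():
        groups[labels.get(topic, "topic %d" % topic)].append(doc)
    return dict(groups)
```

Calling `group_documents(DOC_TOPICS_CSV, TOPIC_TERMS_CSV)` yields one list of documents per topic label, which is one way to read "Document Relationship to Topics".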
- 9.
AWS Console Demo
- 10.
Voice of Customer Analytics: Analyzing what customers are saying about your brand, products, and services
Semantic Search: Making search smarter by searching on key phrase, sentiment, and topic
Knowledge Management/Discovery: Organizing documents, categorizing by topic, and personalizing experiences
Common Use Cases
- 11.
AWS ML Stack
Application Services
• Vision: Rekognition Image, Rekognition Video
• Speech: Polly, Transcribe
• Language: Lex, Translate, Comprehend
Platform Services
• Amazon SageMaker
• AWS DeepLens
• Amazon Machine Learning
• Mechanical Turk
• Spark & EMR
Frameworks & Infrastructure
• AWS Deep Learning AMI
• Apache MXNet, PyTorch, Cognitive Toolkit, Keras, Caffe2 & Caffe, TensorFlow, Gluon
• GPU (P3 Instances), CPU (C5 Instances), Mobile, IoT (Greengrass)
- 12.
Developer Experience
- 13.
Synchronous
• DetectDominantLanguage and BatchDetectDominantLanguage – to detect the dominant language in a document. We can detect up to 100 languages.
• DetectEntities and BatchDetectEntities – to detect the entities, such as persons or places, in the document.
• DetectKeyPhrases and BatchDetectKeyPhrases – to detect key noun phrases that are most indicative of the content.
• DetectSentiment and BatchDetectSentiment – to detect the emotional sentiment (positive, negative, mixed, or neutral) of a document.
Asynchronous
• StartTopicsDetectionJob – to start a topic modeling job
• ListTopicsDetectionJobs – to list all your submitted jobs
• DescribeTopicsDetectionJob – to get progress status and information about each submitted job
API Summary
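Because the Batch operations take a list of documents per request, a larger corpus has to be split into chunks first. The sketch below assumes a limit of 25 documents per batch call (an assumption to verify against the current service limits) and takes the client as a parameter, so any object with a `batch_detect_sentiment(TextList=..., LanguageCode=...)` method, such as a boto3 Comprehend client, can be passed in. The helper names are hypothetical.

```python
# Assumption: batch calls accept at most this many documents per request.
MAX_BATCH_SIZE = 25


def chunked(texts, size=MAX_BATCH_SIZE):
    """Yield consecutive slices of `texts` with at most `size` items each."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(texts), size):
        yield list(texts[start:start + size])


def batch_detect_sentiment_all(client, texts, language_code="en"):
    """Return one sentiment label per input text, in input order.

    Entries stay None for documents the service reported in ErrorList.
    """
    labels = [None] * len(texts)
    offsets = range(0, len(texts), MAX_BATCH_SIZE)
    for offset, chunk in zip(offsets, chunked(texts)):
        response = client.batch_detect_sentiment(
            TextList=chunk, LanguageCode=language_code
        )
        # Index is relative to the chunk, so shift it by the chunk offset.
        for item in response.get("ResultList", []):
            labels[offset + item["Index"]] = item["Sentiment"]
    return labels
```

With a real client this would be called as `batch_detect_sentiment_all(boto3.client("comprehend"), texts)`; the function itself never imports boto3, which keeps it easy to exercise with a stub.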
- 14.
{
    "Languages": [
        {
            "LanguageCode": "string",
            "Score": number
        }
    ]
}
Synchronous APIs
{
    "Entities": [
        {
            "BeginOffset": number,
            "EndOffset": number,
            "Score": number,
            "Text": "string",
            "Type": "string"
        }
    ]
}
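The two response shapes above can be consumed with a few lines of plain Python. This is a minimal sketch over hand-written sample responses: the scores are made up for illustration, the helper names are hypothetical, and it assumes that `BeginOffset`/`EndOffset` can be used directly as a Python slice into the submitted text.

```python
def dominant_language(response):
    """Return the LanguageCode with the highest Score, or None if empty."""
    languages = response.get("Languages", [])
    if not languages:
        return None
    return max(languages, key=lambda lang: lang["Score"])["LanguageCode"]


def entity_spans(text, response, min_score=0.0):
    """Return (type, text slice, score) for entities at or above min_score."""
    spans = []
    for entity in response.get("Entities", []):
        if entity["Score"] >= min_score:
            # Assumption: the offsets index characters of the original text.
            snippet = text[entity["BeginOffset"]:entity["EndOffset"]]
            spans.append((entity["Type"], snippet, entity["Score"]))
    return spans


# Illustrative sample data only; not real service output.
TEXT = "It is raining today in Seattle"
LANGUAGE_RESPONSE = {
    "Languages": [
        {"LanguageCode": "en", "Score": 0.99},
        {"LanguageCode": "de", "Score": 0.01},
    ]
}
ENTITY_RESPONSE = {
    "Entities": [
        {"BeginOffset": 14, "EndOffset": 19, "Score": 0.95,
         "Text": "today", "Type": "DATE"},
        {"BeginOffset": 23, "EndOffset": 30, "Score": 0.97,
         "Text": "Seattle", "Type": "LOCATION"},
    ]
}
```

Filtering on `Score` is a design choice for downstream analytics: a threshold trades recall for precision, and the right value depends on the data.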
- 15.
import boto3
import json

# 'region' is a placeholder; replace it with an AWS region name
comprehend = boto3.client(service_name='comprehend', region_name='region')
text = "It is raining today in Seattle"

print('Calling DetectEntities')
# Pretty-print the full response returned by the service
print(json.dumps(comprehend.detect_entities(Text=text, LanguageCode='en'), sort_keys=True, indent=4))
print('End of DetectEntities\n')
Synchronous APIs
- 16.
{
    "ErrorList": [
        {
            "ErrorCode": "string",
            "ErrorMessage": "string",
            "Index": number
        }
    ],
    "ResultList": [
        {
            "Entities": [
                {
                    "BeginOffset": number,
                    "EndOffset": number,
                    "Score": number,
                    "Text": "string",
                    "Type": "string"
                }
            ],
            "Index": number
        }
    ]
}
Batch APIs
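Since results and errors come back in separate lists, the `Index` field is what ties each entry to its input document. A minimal sketch of that pairing follows; the sample response, its score, and the error code and message are invented for illustration, and `pair_batch_results` is a hypothetical helper name.

```python
def pair_batch_results(texts, response):
    """Return one record per input text with its entities or its error."""
    records = [{"text": t, "entities": None, "error": None} for t in texts]
    for item in response.get("ResultList", []):
        records[item["Index"]]["entities"] = item["Entities"]
    for err in response.get("ErrorList", []):
        records[err["Index"]]["error"] = "%s: %s" % (
            err["ErrorCode"], err["ErrorMessage"]
        )
    return records


# Illustrative sample data only; not real service output.
TEXTS = ["I love Seattle", "Today is Sunday", "Tomorrow is Monday"]
SAMPLE_RESPONSE = {
    "ResultList": [
        {"Index": 0, "Entities": [
            {"BeginOffset": 7, "EndOffset": 14, "Score": 0.98,
             "Text": "Seattle", "Type": "LOCATION"},
        ]},
        {"Index": 2, "Entities": []},
    ],
    "ErrorList": [
        {"Index": 1, "ErrorCode": "ExampleError",
         "ErrorMessage": "made-up failure for illustration"},
    ],
}
```

Keeping a record for every input, including failed ones, makes it straightforward to retry only the documents whose `error` field is set.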
- 17.
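import com.amazonaws.auth.AWSStaticCredentialsProvider;
import com.amazonaws.services.comprehend.AmazonComprehend;
import com.amazonaws.services.comprehend.AmazonComprehendClientBuilder;
import com.amazonaws.services.comprehend.model.BatchDetectEntitiesItemResult;
import com.amazonaws.services.comprehend.model.BatchDetectEntitiesRequest;
import com.amazonaws.services.comprehend.model.BatchDetectEntitiesResult;

// awsCreds is assumed to be defined elsewhere; "region" is a placeholder
AmazonComprehend client = AmazonComprehendClientBuilder.standard()
    .withCredentials(new AWSStaticCredentialsProvider(awsCreds))
    .withRegion("region")
    .build();

String[] textList = {"I love Seattle", "Today is Sunday", "Tomorrow is Monday"};

// Call the batchDetectEntities API
System.out.println("Calling BatchDetectEntities");
BatchDetectEntitiesRequest batchDetectEntitiesRequest = new BatchDetectEntitiesRequest()
    .withTextList(textList)
    .withLanguageCode("en");
BatchDetectEntitiesResult batchDetectEntitiesResult = client.batchDetectEntities(batchDetectEntitiesRequest);

// Print the result for each input document
for (BatchDetectEntitiesItemResult item : batchDetectEntitiesResult.getResultList()) {
    System.out.println(item);
}
Batch APIs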
- 18.
aws comprehend start-topics-detection-job \
    --number-of-topics topics to return \
    --job-name "job name" \
    --region region \
    --cli-input-json file:
{
    "InputDataConfig": {
        "S3Uri": "s3://input bucket/input path",
        "InputFormat": "ONE_DOC_PER_FILE"
    },
    "OutputDataConfig": {
        "S3Uri": "s3://output bucket/output path"
    },
    "DataAccessRoleArn": "arn:aws:iam::account ID:role/data access role"
}
aws comprehend describe-topics-detection-job --region region --job-id job ID
Topic Modeling APIs
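The same start-then-describe flow can be sketched in Python. The request keys mirror the JSON above; the helper names, the polling interval, and the set of statuses treated as final are assumptions for illustration. The client is passed in, so a boto3 Comprehend client (which exposes `start_topics_detection_job` and `describe_topics_detection_job`) or a stub both work.

```python
import time

# Assumption: these statuses mean the job will make no further progress.
FINAL_STATUSES = {"COMPLETED", "FAILED", "STOPPED"}


def build_topics_job_request(input_s3_uri, output_s3_uri, role_arn,
                             job_name, number_of_topics):
    """Build keyword arguments mirroring the CLI input JSON above."""
    return {
        "InputDataConfig": {
            "S3Uri": input_s3_uri,
            "InputFormat": "ONE_DOC_PER_FILE",
        },
        "OutputDataConfig": {"S3Uri": output_s3_uri},
        "DataAccessRoleArn": role_arn,
        "JobName": job_name,
        "NumberOfTopics": number_of_topics,
    }


def wait_for_topics_job(client, job_id, poll_seconds=30.0, max_polls=120,
                        sleep=time.sleep):
    """Poll the job until it reaches a final status and return that status."""
    for _ in range(max_polls):
        response = client.describe_topics_detection_job(JobId=job_id)
        status = response["TopicsDetectionJobProperties"]["JobStatus"]
        if status in FINAL_STATUSES:
            return status
        sleep(poll_seconds)
    raise TimeoutError("job %s did not finish in time" % job_id)
```

A typical use would be `job_id = client.start_topics_detection_job(**request)["JobId"]` followed by `wait_for_topics_job(client, job_id)`. The `sleep` parameter exists only so the wait can be replaced in tests.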
- 19.
Comprehend Powered Solutions
- 20.
Comprehend + AWS = Scale Text Analytics
Amazon Kinesis
Amazon ES
Amazon Redshift
Amazon EMR
• Semantic
• Rich Filtering
• Grouping, Trends
• Joining, Correlating
• Clustering
• Graph, Search
• Near real-time
• Alerts
Amazon S3
Articles, Documents
Social Media, Support
Amazon Aurora
- 21.
Launch Customer: Elementum
- 24. News Feed Analysis - Workflow
Amazon S3
News from multiple channels
Amazon Comprehend
Amazon S3
Amazon Athena
Processing Natural Language
Ad Hoc Queries
Amazon EMR
Amazon DynamoDB
User Application
Build predictions and alternative route recommendations
Feedback
- 25.
Launch Customer: Infor
- 26. Infor is an enterprise software provider and strategic technology partner for more than 90,000 organizations worldwide. Our software is purpose-built for specific industries, providing complete suites that are designed to support progress – for individuals, businesses, and the world.
Customers and innovation at the core
- 27. DATA MGMT
Data Lake
Graph
Data Catalogue
Data Services APIs
Data Pipelines
Archiving
UX
Portal
In-context
Homepages
Search
Chat
Documents (IDM)
Process integration
Activity Monitoring
Workflow
API gateway
Orchestration
Mapping
Mediation
iPAAS
Single Sign On
Users / Roles
Groups
Auditing / Monitoring
Risk & Compliancy
Insights
SECURITY
PAAS
Dev Framework
Composite Apps
Soho UX Library
Reports
Extensibility
Digital Assistant
Automated Skills
Contextual A.I.
Image recognition
A.I. PaaS
COLEMAN A.I. IOT
IoT Portal
Connectors
Embedded EAM
Analytics
Electronic Messages
Reports
Tax Engine
eAccounting
Financial Controller
Submission Portal
LOCALIZATIONS
INFOR TECHNOLOGY SUITE
- 28. AMAZON COMPREHEND + INFOR TECHNOLOGY
Unstructured Documents: Ability to extract sentiment and entity relationships in documents to automatically create actions and provide refined search capabilities
Search: Ability to accept search requests in natural language and create relationships to entities in order to generate more accurate search queries and to automatically link to Digital Assistant skills
Chat Analysis: Ability to transcribe and analyze text and voice conversations to create contextual minutes and tasks, while capturing unstructured knowledge
- 29. AMAZON COMPREHEND + INFOR TECHNOLOGY
Unstructured
Documents
Search
Sentiment
Chat
Analysis
Docs
(IDM)
S3 NLP S3
Search
Workflow
Infor
Chat
ASR NLP S3
APIs
Search
Workflow
APIs
Elasticsearch
NLP
ASR
S3
Graph
- 31.
Comprehend Demo: Social Analytics
- 32.
Twitter Stream API
Amazon Kinesis
Amazon S3
Amazon Athena
Analyze social media postings and comments to organize and classify customer feedback and look for common patterns.
Visualize results in Amazon QuickSight
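One step in a pipeline like this is attaching sentiment to each post before it is stored for querying. The sketch below is an assumption about how that step might look, not the demo's actual code: the post fields (`id`, `text`), the output field names, and the JSON Lines layout are hypothetical. `detect_sentiment` is any callable with the signature of a boto3 Comprehend client's `detect_sentiment(Text=..., LanguageCode=...)`.

```python
import json


def enrich_post(post, detect_sentiment, language_code="en"):
    """Return a copy of `post` with sentiment fields added."""
    result = detect_sentiment(Text=post["text"], LanguageCode=language_code)
    enriched = dict(post)
    enriched["sentiment"] = result["Sentiment"]
    # Flatten the per-class scores so each becomes a simple column.
    for label, score in result["SentimentScore"].items():
        enriched["sentiment_" + label.lower()] = score
    return enriched


def to_json_lines(posts, detect_sentiment):
    """Serialize enriched posts as one JSON object per line."""
    return "\n".join(
        json.dumps(enrich_post(post, detect_sentiment), sort_keys=True)
        for post in posts
    )
```

One JSON object per line with flat fields is a layout that SQL-on-S3 engines commonly read, which is why the scores are flattened rather than kept nested.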
- 33.
Amazon Comprehend: Call to action
• Free to try!
• Pay for what you use
• AWS SDK, Code Samples
• Follow us!
- 34.
Thank you
- 35.
Use topic modeling to categorize documents, improving information management and data discovery
S3 bucket full of blog posts over the years
Comprehend
Comprehend Topic Modeling Job
Blogs organized into topic groups for easier discovery
Marketing
Engineering Tips
Community
Messaging