NLP & Machine Learning
Vijay Ganti
About Me
• I am an amateur programmer and ML enthusiast
• I am developing NLP prototype systems for problems
that I find interesting, and have used models like Naive
Bayes and LDA for topic modeling of HTML data.
• I code in Python
• I have developed a deep love for solving stimulating
problems, and since I also like writing, I am intrigued by
the question “can good/great writing be detected or
one day created by ML/AI?”
• An amateur is someone who does something for love
“A journey of a thousand miles
begins with a single step”
–Laozi (604 BC - 531 BC), a contemporary of Confucius
Agenda
Why NLP & ML?
What is NLP?
Getting started with NLP & ML
Why Python?
Making it real with an NLP & ML coding demo
A program that predicts gender given name(s) as input
Some glimpses into practical issues
Next Steps
NLP powered by ML is ripe for
changing the way business gets done!
• Conversational agents are becoming an important form
of human-computer communication (Customer
support interactions using chat-bots)
• Much of human-human communication is now
mediated by computers (Email, Social Media,
Messaging)
• An enormous amount of knowledge is now available in
machine readable form as natural language text (web,
proprietary enterprise content)
My meetup
calendar is
buzzing with
NLP & ML
So is
theirs
and his
and all these folks
So what is NLP?
Get machines to understand human language
Segmentation (words, sentences, stemming)
Part of speech tagging
Named Entity Recognition
Disambiguation (Semantics and Context)
Document/Text Classification like topic modeling……
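To make a few of these tasks concrete, here is a minimal NLTK sketch of segmentation, stemming, POS tagging, and NER; the example sentence is made up, and it assumes the relevant NLTK data packages have already been downloaded.

```python
import nltk
from nltk.stem import PorterStemmer

# One-time downloads if you don't have them yet:
# nltk.download(['punkt', 'averaged_perceptron_tagger', 'maxent_ne_chunker', 'words'])

text = "Vijay gave an introductory NLP talk in San Francisco."

# Segmentation: split into sentences and word tokens
sentences = nltk.sent_tokenize(text)
tokens = nltk.word_tokenize(text)

# Stemming: reduce words to a rough root form
stems = [PorterStemmer().stem(t) for t in tokens]

# Part-of-speech tagging: label each token (noun, verb, ...)
tagged = nltk.pos_tag(tokens)

# Named Entity Recognition: find names, places, organizations
entities = nltk.ne_chunk(tagged)

print(sentences)
print(stems)
print(tagged)
print(entities)
```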
Disambiguation in language is easy for us
but hard for machines
Sentence → Relation
I ate spaghetti with meatballs → ingredient
I ate spaghetti with salad → side dish
I ate spaghetti with abandon → feeling
I ate spaghetti with a fork → instrument
I ate spaghetti with a friend → company
A few years back we faced the disambiguation problem
with images. This was one time I wanted polarization,
but the machines couldn't tell the difference!
Old vs New NLP
Old (Rule Based): Deterministic, Hard boundaries, Fixed
New (Machine Learning Based): Probabilistic, Soft boundaries, Malleable
What do you need to become good at
NLP & ML, based on experience?
Pick up Machine Learning & Distributed Computing material, as
needed
ref: https://www.linkedin.com/pulse/20141114072915-11846569-what-it-takes-to-be-a-data-scientist-advice-from-a-
non-data-scientist?trk=mp-reader-card
• Coding
• Probability Theory & Statistical Inference Theory
• Algorithm theory, both for tweaking models and for building
scalable implementations
• Look for problems to solve end-to-end and soak in
large amounts of data (data are everywhere)
Why should I study probability? We have
all tossed coins and played card games!
Outcomes are highly non-intuitive
Required to combat our primitive intuition &
build sophisticated “intuition”
EXAMPLES?
Google “Birthday Problem” to see an
example (a quick simulation is sketched below)
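As an illustration (not from the slides), here is a short Monte Carlo sketch of the birthday problem in Python; with only 23 people the chance of a shared birthday is already about 50%, far higher than most people's intuition suggests.

```python
import random

def shared_birthday_probability(group_size, trials=100_000):
    """Estimate the probability that at least two people in a group share a birthday."""
    hits = 0
    for _ in range(trials):
        birthdays = [random.randrange(365) for _ in range(group_size)]
        if len(set(birthdays)) < group_size:  # a duplicate means a shared birthday
            hits += 1
    return hits / trials

print(shared_birthday_probability(23))  # roughly 0.507 with 23 people
```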
Why Python for NLP & ML
Easy to get productive quickly
Easy to access and “pre-process” text data
Interpreted, so great for research productivity
Support for higher-order abstractions and programming
paradigms (declarative/functional, object-oriented)
Rich ecosystem with tons of modules for data science
and NLP (a tiny pre-processing sketch follows)
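As a taste of that productivity, a few lines of standard-library Python are enough for rough text pre-processing; this snippet is illustrative only, using a made-up string.

```python
import re
from collections import Counter

raw = "NLP & Machine Learning: an introductory talk. NLP is fun; machine learning is fun too!"

# Normalize: lowercase and keep only word-like character runs
words = re.findall(r"[a-z']+", raw.lower())

# Simple stop-word removal and frequency counting
stopwords = {"is", "an", "too", "the", "a"}
freq = Counter(w for w in words if w not in stopwords)

print(freq.most_common(5))
```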
Getting started with NLP & ML & some
foundational probability theory in Python
• Coursera course on Python Data Structures
• Some basic Python - Google Lectures on Python
(https://developers.google.com/edu/python/)
• NLTK - nltk.org
• Get other packages as needed like NumPy, Matplotlib,
Scikit-learn, PyBrain, pandas, IPython
• Natural Language Processing with Python (book)
• http://norvig.com/ngrams/ch14.pdf
• Azure Text Analytics API (I haven't tried it, but it looks
promising)
• http://stats.stackexchange.com/
• https://www.quora.com
Coding time to demonstrate
the ML workflow
A simple gender prediction problem, solved
interactively with a Naive Bayes
classifier, to show the ML workflow and the
importance of feature engineering (a condensed sketch follows below)
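A condensed sketch of such a demo, using NLTK's built-in names corpus and NaiveBayesClassifier; the talk's exact features may differ, and the single last-letter feature here is just the classic starting point.

```python
import random
import nltk
from nltk.corpus import names

# nltk.download(['names'])  # one-time download of the labeled names corpus

def gender_features(name):
    """Feature extraction: here just the last letter of the name."""
    return {'last_letter': name[-1].lower()}

# Build labeled data: (name, gender) pairs from the corpus
labeled = ([(n, 'male') for n in names.words('male.txt')] +
           [(n, 'female') for n in names.words('female.txt')])
random.shuffle(labeled)

# Supervised workflow: feature extraction -> train -> evaluate -> predict
featuresets = [(gender_features(n), g) for n, g in labeled]
train_set, test_set = featuresets[500:], featuresets[:500]

classifier = nltk.NaiveBayesClassifier.train(train_set)
print(nltk.classify.accuracy(classifier, test_set))   # typically around 0.75 with this single feature
print(classifier.classify(gender_features('Vijay')))  # predict on a new name
classifier.show_most_informative_features(5)
```

Swapping in a richer gender_features (last two letters, name length, vowel counts, and so on) is exactly the feature-engineering knob the demo is about.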
Supervised classification workflow
Training: Training Data → Feature Extraction → ML Algo
Prediction: Data → Feature Extraction → ML Algo → Prediction
Practical issues seen in our example - Curse of
Dimensionality (too many features isn't good)
Overfitting (sparse data for some features)
Scaling
More data vs better algorithms
More data is better than a better
algorithm
Source: “Scaling to Very Very Large Corpora for Natural Language
Disambiguation”, Michele Banko and Eric Brill,
Microsoft Research, Redmond, WA, USA
Practical lessons learned so far
Data preparation is 70% of the work
Feature Engineering is 70% of the rest of the work
Domain expertise critical for feature engineering
Modeling is more about understanding the concepts so
that you use them correctly.
It's hard to absorb all the theory at once, so don't try to.
Instead, pick up concepts as needed and ask for
help.
Next Steps
Think of use cases that will add most value for a customer
Think about the domain deeply, not the models
Think about the data deeply (acquisition, format,
processing etc.)
Contact me at ganti.vijay1@gmail.com to discuss problems worth
solving - we can hack on them together
Tweet to @vijayganti if you liked the talk and want more
“Ars longa, vita brevis”
which in English is
"Life is short, [the] craft long”
Hippocrates’ Parting Words of Caution
Backup Slides
Naive Bayes Classifier
P(A|B) = P(B|A) × P(A) / P(B)
P(Class|Feature) = P(Feature|Class) × P(Class) / P(Feature)
Posterior = Likelihood × Prior / Evidence
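A tiny worked illustration of the rule, with made-up numbers for a single "name ends in a" feature (these probabilities are hypothetical, not from the slides):

```python
# Hypothetical numbers: classify a name as female given the feature "ends in the letter a"
p_female = 0.5                     # prior P(female)
p_male = 0.5                       # prior P(male)
p_ends_a_given_female = 0.35       # likelihood P(ends in 'a' | female)
p_ends_a_given_male = 0.02         # likelihood P(ends in 'a' | male)

# Evidence P(ends in 'a') via the law of total probability
p_ends_a = p_ends_a_given_female * p_female + p_ends_a_given_male * p_male

# Posterior P(female | ends in 'a') by Bayes' rule
posterior_female = p_ends_a_given_female * p_female / p_ends_a
print(posterior_female)  # roughly 0.95 with these made-up numbers
```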
Naive Bayes Classifier
What is independence?
In NLP, let's say you are using word frequency as a feature, but
words like
United States
Damn good
Stainless steel
aren't independent words; they often occur together. Hence you
can get better classification accuracy if your initial processing uses
something called “collocation” detection to treat them as one unit (see the sketch below).
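NLTK's collocation finders do exactly this; a minimal sketch over a toy corpus (real use would run over a much larger one, and the PMI scoring choice here is just one option):

```python
import nltk
from nltk.collocations import BigramCollocationFinder, BigramAssocMeasures

# Requires the 'punkt' tokenizer data: nltk.download(['punkt'])
tokens = nltk.word_tokenize(
    "The United States exports stainless steel. "
    "Stainless steel from the United States is damn good, damn good indeed."
)

# Find bigrams that co-occur more often than chance, scored by pointwise mutual information
finder = BigramCollocationFinder.from_words(t.lower() for t in tokens)
finder.apply_freq_filter(2)  # keep only pairs seen at least twice
print(finder.nbest(BigramAssocMeasures.pmi, 5))
```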
