FUN WITH PYTHON
AGENDA
▸ Using Python to Access Web Data
▸ Using Databases with Python
▸ Processing and Visualizing Data with Python
USING PYTHON TO ACCESS WEB DATA
Access Web Data
USING PYTHON TO ACCESS WEB DATA
▸ Web Requests
▸ Web Parser
▸ Web Services
USING PYTHON TO ACCESS WEB DATA
▸ Web Requests
Requests Library
import requests
requests.get('http://www.facebook.com').text
pip install requests #install library
USING PYTHON TO ACCESS WEB DATA
▸ Web Requests
Make a Request
#GET Request



import requests
r = requests.get('http://www.facebook.com')

if r.status_code == 200:
    print("Success")
Success
USING PYTHON TO ACCESS WEB DATA
▸ Web Requests
Make a Request
#POST Request



import requests
r = requests.post('http://httpbin.org/post', data = {'key':'value'})

if r.status_code == 200:
    print("Success")
Success
USING PYTHON TO ACCESS WEB DATA
▸ Web Requests
Make a Request
#Other Types of Request



import requests
r = requests.put('http://httpbin.org/put', data = {'key':'value'})

r = requests.delete('http://httpbin.org/delete')

r = requests.head('http://httpbin.org/get') 

r = requests.options('http://httpbin.org/get')
USING PYTHON TO ACCESS WEB DATA
▸ Web Requests
Passing Parameters In URLs
#GET Request with parameter



import requests
r = requests.get('https://www.google.co.th/?hl=th')

if r.status_code == 200:
    print("Success")
Success
USING PYTHON TO ACCESS WEB DATA
▸ Web Requests
Passing Parameters In URLs
#GET Request with parameter
import requests
r = requests.get('https://www.google.co.th', params={"hl": "en"})

if r.status_code == 200:
    print("Success")
Success
USING PYTHON TO ACCESS WEB DATA
▸ Web Requests
Passing Parameters In URLs
#POST Request with parameter
import requests
r = requests.post("https://m.facebook.com",data={"key":"value"})

if r.status_code == 200:
    print("Success")
Success
USING PYTHON TO ACCESS WEB DATA
▸ Web Requests
Response Content
#Text Response
import requests



data = {"email": "…..", "pass": "……"}

r = requests.post("https://m.facebook.com", data=data)

if r.status_code == 200:
    print(r.text)
'<?xml version="1.0" encoding="utf-8"?>\n<!DOCTYPE html PUBLIC "-//WAPFORUM//DTD XHTML
Mobile 1.0//EN" "http://www.wapforum.org/DTD/xhtml-mobile10.dtd"><html xmlns="http://
www.w3.org/1999/xhtml"><head><title>Facebook</title><meta name="referrer"
content="default" id="meta_referrer" /><style type=“text/css”>/*<!………………..
USING PYTHON TO ACCESS WEB DATA
▸ Web Requests
Response Content
#Response encoding
import requests
r = requests.get('https://www.google.co.th/logos/doodles/2016/king-
bhumibol-adulyadej-1927-2016-5148101410029568.2-hp.png') 

r.encoding = 'tis-620'

if r.status_code == 200:
    print(r.text)
'<!doctype html><html itemscope="" itemtype="http://schema.org/WebPage"
lang="th"><head><meta content="text/html; charset=UTF-8" http-equiv="Content-Type"><meta
content="/logos/doodles/2016/king-bhumibol-adulyadej-1927-2016-5148101410029568.2-
hp.png" itemprop="image"><meta content="ปวงข้าพระพุทธเจ้า ขอน้อมเกล้าน้อมกระหม่อมรำลึกใน...
USING PYTHON TO ACCESS WEB DATA
▸ Web Requests
Response Content
#Binary Response




import requests
r = requests.get('https://www.google.co.th/logos/doodles/2016/king-
bhumibol-adulyadej-1927-2016-5148101410029568.2-hp.png') 

if r.status_code == 200:
    open("img.png", "wb").write(r.content)
USING PYTHON TO ACCESS WEB DATA
▸ Web Requests
Response Status Codes
#200 Response (OK)



import requests
r = requests.get('https://api.github.com/events')

if r.status_code == requests.codes.ok:
    data = r.json()
    print(data[0]['actor'])


{'url': 'https://api.github.com/users/ShaolinSarg', 'display_login': 'ShaolinSarg', 'avatar_url': 'https://
avatars.githubusercontent.com/u/6948796?', 'id': 6948796, 'login': 'ShaolinSarg', 'gravatar_id': ''}
USING PYTHON TO ACCESS WEB DATA
▸ Web Requests
Response Status Codes
#200 Response (OK)



import requests
r = requests.get('https://api.github.com/events')

print(r.status_code)
200
USING PYTHON TO ACCESS WEB DATA
▸ Web Requests
Response Status Codes
#404



import requests
r = requests.get('https://api.github.com/events/404')

print(r.status_code)

404
USING PYTHON TO ACCESS WEB DATA
▸ Web Requests
Response Headers
#Response headers



import requests
r = requests.get('http://www.sanook.com')

print(r.headers)

print(r.headers['Date'])

{'Content-Type': 'text/html; charset=UTF-8', 'Date': 'Tue, 08 Nov 2016 14:38:41 GMT', 'Cache-
Control': 'private, max-age=0', 'Age': '16', 'Content-Encoding': 'gzip', 'Content-Length': '38089',
'Connection': 'keep-alive', 'Vary': 'Accept-Encoding', 'Accept-Ranges': 'bytes'}



Tue, 08 Nov 2016 14:38:41 GMT
USING PYTHON TO ACCESS WEB DATA
▸ Web Requests
Timeouts
#Timeout



import requests
r = requests.get('http://www.sanook.com', timeout=0.001)

ReadTimeout: HTTPConnectionPool(host='www.sanook.com', port=80): Read timed out. (read
timeout=0.001)
USING PYTHON TO ACCESS WEB DATA
▸ Web Requests
Authentication
#Basic Authentication



import requests
r = requests.get('https://api.github.com/user', auth=('user', 'pass'))

print(r.status_code)

200
USING PYTHON TO ACCESS WEB DATA
▸ Web Requests
read more : http://docs.python-requests.org/en/master/
USING PYTHON TO ACCESS WEB DATA
▸ Web Requests
Quiz#1 : Tag Monitoring
1. Get webpage : http://pantip.com/tags
2. Save to file every 5 minutes (time.sleep(300))
3. Use the current date and time as the filename
(How do you get the current date and time in Python? Look it up on Google.)
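A minimal sketch of one possible solution, assuming the page is reachable and using a timestamped filename:

import time
import requests
from datetime import datetime

# Sketch for Quiz#1: fetch the tag page every 5 minutes and save each
# snapshot under a timestamped filename.
while True:
    r = requests.get('http://pantip.com/tags')
    if r.status_code == 200:
        filename = datetime.now().strftime('%Y-%m-%d_%H-%M-%S') + '.html'
        with open(filename, 'w', encoding='utf-8') as f:
            f.write(r.text)
    time.sleep(300)  # wait 5 minutes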
USING PYTHON TO ACCESS WEB DATA
▸ Web Parser
HTML Parser : beautifulsoup
from bs4 import BeautifulSoup



soup = BeautifulSoup(open("file.html"), "html.parser") #parse from file

soup = BeautifulSoup("<html>data</html>", "html.parser") #parse from text
pip install beautifulsoup4 #install library
USING PYTHON TO ACCESS WEB DATA
▸ Web Parser
Parse a document
from bs4 import BeautifulSoup
soup = BeautifulSoup("<html>data</html>", "html.parser")

print(soup)
<html>data</html>
USING PYTHON TO ACCESS WEB DATA
▸ Web Parser
Parse a document
#Navigating using tag names
from bs4 import BeautifulSoup



html_doc = """<html><head><title>The Dormouse's story</title></head><body><p class="title"><b>The Dormouse's story</b></p></body>"""
soup = BeautifulSoup(html_doc,"html.parser")

soup.head 

soup.title

soup.body.p
USING PYTHON TO ACCESS WEB DATA
▸ Web Parser
<head><title>The Dormouse's story</title></head>
<title>The Dormouse's story</title>
<p class="title"><b>The Dormouse's story</b></p>
USING PYTHON TO ACCESS WEB DATA
▸ Web Parser
Parse a document
#Access string
from bs4 import BeautifulSoup



html_doc = """<h1>hello</h1>"""
soup = BeautifulSoup(html_doc,"html.parser")

print(soup.h1.string)
hello
USING PYTHON TO ACCESS WEB DATA
▸ Web Parser
Parse a document
#Access attribute
from bs4 import BeautifulSoup



html_doc = '<a href="http://example.com/elsie" >Elsie</a>'
soup = BeautifulSoup(html_doc,"html.parser")

print(soup.a['href'])
http://example.com/elsie
USING PYTHON TO ACCESS WEB DATA
▸ Web Parser
Parse a document
#Get all text in the page
from bs4 import BeautifulSoup



html_doc = """<html><head><title>The Dormouse's story</title></head><body><p class="title"><b>The Dormouse's story</b></p></body>"""
soup = BeautifulSoup(html_doc,"html.parser")

print(soup.get_text())
The Dormouse's storyThe Dormouse's story
USING PYTHON TO ACCESS WEB DATA
▸ Web Parser
Parse a document
# find_all()
from bs4 import BeautifulSoup



html_doc = """<a href="http://example.com/elsie" class="sister"
id="link1">Elsie</a>,<a href="http://example.com/lacie" class="sister"
id="link2">Lacie</a> and <a href="http://example.com/tillie"
class="sister" id="link3">Tillie</a>;"""
soup = BeautifulSoup(html_doc,"html.parser")

for a in soup.find_all('a'):
    print(a)
USING PYTHON TO ACCESS WEB DATA
▸ Web Parser
Parse a document
<a class="sister" href="http://example.com/elsie"
id="link1">Elsie</a>

<a class="sister" href="http://example.com/lacie"
id="link2">Lacie</a>

<a class="sister" href="http://example.com/tillie"
id="link3">Tillie</a>
USING PYTHON TO ACCESS WEB DATA
▸ Web Parser
Parse a document
#find_all()

soup.find_all(id='link2')



soup.find_all(href=re.compile("elsie"))



soup.find_all(id=True) 



soup.find_all(attrs={"data-foo": "value"})



soup.find_all("a", class_="sister")



soup.find_all("a", recursive=False)

soup.p.find_all("a", recursive=False)
USING PYTHON TO ACCESS WEB DATA
▸ Web Parser
Parse a document
re.compile(…..)
<a href="http://192.x.x.x" class="c1">hello</a>

<a href="https://192.x.x.x" class="c1">hello</a>

<a href="https://www.com" class="c1">hello</a>
find_all(href=re.compile('(https|http)://[0-9.]'))
https://docs.python.org/2/howto/regex.html
USING PYTHON TO ACCESS WEB DATA
▸ Web Parser
Parse a document
read more : https://www.crummy.com/software/BeautifulSoup/bs4/doc/
USING PYTHON TO ACCESS WEB DATA
▸ Web Parser
Quiz#2 : Tag Extraction
1. Get webpage : http://pantip.com/tags
2. Extract the tag name, tag link, and number of topics in

the first 10 pages
3. Save to a file in this format:

tag name, tag link, number of topics, current datetime
4. Run every 5 minutes
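A rough sketch of the extraction loop; the selectors below are hypothetical placeholders, since the real pantip.com markup has to be inspected first:

import csv
import requests
from datetime import datetime
from bs4 import BeautifulSoup

# Sketch for Quiz#2: 'tag-item' and the following <span> are placeholder
# selectors, not the real pantip.com markup.
r = requests.get('http://pantip.com/tags')
soup = BeautifulSoup(r.text, 'html.parser')
now = datetime.now().isoformat()
with open('tags.csv', 'a', newline='', encoding='utf-8') as f:
    writer = csv.writer(f)
    for tag in soup.find_all('a', class_='tag-item'):
        name = tag.get_text(strip=True)
        link = tag['href']
        topics = tag.find_next('span').get_text(strip=True)
        writer.writerow([name, link, topics, now])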
USING PYTHON TO ACCESS WEB DATA
▸ Web Parser
JSON Parser : json
import json



json_doc = json.loads('{"key": "value"}')
built-in module
USING PYTHON TO ACCESS WEB DATA
▸ Web Parser
JSON Parser : json
#JSON string



json_doc = """{"employees":[

{"firstName":"John", "lastName":"Doe"},

{"firstName":"Anna", "lastName":"Smith"},

{"firstName":"Peter", "lastName":"Jones"}

]}"""
USING PYTHON TO ACCESS WEB DATA
▸ Web Parser
JSON Parser : json
#Parse string to object



import json
json_obj = json.loads(json_doc)

print(json_obj)
{'employees': [{'firstName': 'John', 'lastName': 'Doe'}, {'firstName': 'Anna', 'lastName': 'Smith'},
{'firstName': 'Peter', 'lastName': 'Jones'}]}
USING PYTHON TO ACCESS WEB DATA
▸ Web Parser
JSON Parser : json
#Access json object



import json
json_obj = json.loads(json_doc)

print(json_obj['employees'][0]['firstName'])

print(json_obj['employees'][0]['lastName'])
John

Doe
USING PYTHON TO ACCESS WEB DATA
▸ Web Parser
JSON Parser : json
#Create json doc



import json
json_obj = {"firstName" : "name", "lastName" : "last"} #Dictionary

print(json.dumps(json_obj,indent=1))
{

"firstName": "name",

"lastName": "last"

}
USING PYTHON TO ACCESS WEB DATA
▸ Web Parser
Quiz#3 : Post Monitoring
1. Register as Facebook Developer on
developers.facebook.com
2. Get information on the posts from the last 10 hours on the page

https://www.facebook.com/MorningNewsTV3

3. Save to a file in this format:

post id, post datetime, number of likes, current datetime
USING PYTHON TO ACCESS WEB DATA
▸ Web Parser
Quiz#3 : Post Monitoring
URL





https://graph.facebook.com/v2.8/<PageID>?fields=posts.limit(100)%7Blikes.limit(1).summary(true)%2Ccreated_time%7D&access_token=
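A hedged sketch of the fetch, reusing the URL template above; <PageID> and ACCESS_TOKEN are placeholders, and the exact response shape should be checked in the Graph API Explorer:

import requests

# Sketch for Quiz#3: fill in the page ID and a valid access token.
url = ('https://graph.facebook.com/v2.8/<PageID>'
       '?fields=posts.limit(100)%7Blikes.limit(1).summary(true)%2Ccreated_time%7D'
       '&access_token=ACCESS_TOKEN')
r = requests.get(url)
if r.status_code == 200:
    for post in r.json()['posts']['data']:
        print(post['id'], post['created_time'],
              post['likes']['summary']['total_count'])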
USING PYTHON TO ACCESS WEB DATA
▸ Web Service
USING PYTHON TO ACCESS WEB DATA
▸ Web Service
Web Service Type
USING PYTHON TO ACCESS WEB DATA
▸ Web Service
SOAP Example
USING PYTHON TO ACCESS WEB DATA
▸ Web Service
SOAP Request
USING PYTHON TO ACCESS WEB DATA
▸ Web Service
REST
USING PYTHON TO ACCESS WEB DATA
▸ Web Service
REST Request
USING PYTHON TO ACCESS WEB DATA
▸ Web Service
JSON Web Service
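As a quick illustration of consuming a JSON web service, requests parses the response body straight into Python objects with r.json() (a small sketch using the GitHub events endpoint shown earlier):

import requests

# Call a JSON web service and work with the parsed result directly.
r = requests.get('https://api.github.com/events')
if r.status_code == 200:
    events = r.json()          # a list of dicts
    print(events[0]['type'])   # access fields like any Python dict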
USING PYTHON TO ACCESS WEB DATA
▸ Web Service
Application
USING PYTHON TO ACCESS WEB DATA
▸ Web Service
JSON


{"employees":[
{"firstName":"John", "lastName":"Doe"},
{"firstName":"Anna", "lastName":"Smith"},
{"firstName":"Peter", "lastName":"Jones"}
]}
list

dict

key

value
read more : http://www.json.org/
USING PYTHON TO ACCESS WEB DATA
▸ Web Service
Create Simple Web Service
from flask_api import FlaskAPI
app = FlaskAPI(__name__)



@app.route('/example/')

def example():

    return {'hello': 'world'}



app.run(debug=False,port=5555)
pip install Flask-API
USING PYTHON TO ACCESS WEB DATA
▸ Web Service
Create Simple Web Service
#receive input



from flask_api import FlaskAPI
app = FlaskAPI(__name__)



@app.route('/hello/<name>/<lastName>')

def example(name, lastName):

    return {'hello': name}



app.run(debug=False,port=5555)
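With the service running, it can be called from another script or terminal; a quick check with requests, assuming the server above is listening on localhost:5555:

import requests

# Call the local web service started above.
r = requests.get('http://127.0.0.1:5555/hello/John/Doe')
print(r.json())   # {'hello': 'John'}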
USING PYTHON TO ACCESS WEB DATA
▸ Web Service
Quiz#4 : Top Tag Service
1. Build a getTopTagInfo web service.
2. Input: number of top topics
3. Output: tag names and topic counts in JSON

format
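One possible shape for the service, assuming a helper that reuses the Quiz#2 scraper (the stub below only returns placeholder data):

from flask_api import FlaskAPI

app = FlaskAPI(__name__)

def get_top_tags(n):
    # Placeholder: in the real quiz this would reuse the Quiz#2 scraper.
    return [{'tag': 'example', 'topics': 0}] * n

@app.route('/getTopTagInfo/<int:n>')
def get_top_tag_info(n):
    return {'topTags': get_top_tags(n)}

app.run(debug=False, port=5555)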
USING DATABASES WITH PYTHON
Databases
USING DATABASES WITH PYTHON
Zero configuration 

– SQLite does not need to be installed; there is no setup procedure to use it.
Serverless

– SQLite is not implemented as a separate server process. With SQLite, the process that wants to access the
database reads and writes directly from the database files on disk as there is no intermediary server process.
Stable Cross-Platform Database File 

– The SQLite file format is cross-platform. A database file written on one machine can be copied to and used
on a different machine with a different architecture.
Single Database File 

– An SQLite database is a single ordinary disk file that can be located anywhere in the directory hierarchy.
Compact 

– When optimized for size, the whole SQLite library with everything enabled is less than 400 KB in size.
USING DATABASES WITH PYTHON
SQLite


import sqlite3
conn = sqlite3.connect('my.db')
built-in library : sqlite3
USING DATABASES WITH PYTHON
SQLite
1. Connect to db
2. Get cursor
3. Execute command
4. Commit (insert / update/delete) / Fetch result (select)
5. Close database
Workflow
USING DATABASES WITH PYTHON
SQLite
import sqlite3

conn = sqlite3.connect('example.db') # connect db

c = conn.cursor() # get cursor
# execute1

c.execute('''CREATE TABLE stocks

(date text, trans text, symbol text, qty real, price real)''')
# execute2

c.execute("INSERT INTO stocks VALUES ('2006-01-05','BUY','RHAT',100,35.14)")
conn.commit() # commit

conn.close() # close
Workflow Example
USING DATABASES WITH PYTHON
SQLite
Data Type
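The original slide showed the type-mapping table as an image; in short, the sqlite3 module maps None → NULL, int → INTEGER, float → REAL, str → TEXT and bytes → BLOB. A quick check:

import sqlite3

conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE t (a, b, c, d, e)')
conn.execute('INSERT INTO t VALUES (?,?,?,?,?)', (None, 1, 1.5, 'text', b'blob'))
for row in conn.execute('SELECT typeof(a), typeof(b), typeof(c), typeof(d), typeof(e) FROM t'):
    print(row)   # ('null', 'integer', 'real', 'text', 'blob')
conn.close()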
USING DATABASES WITH PYTHON
Database Storage
import sqlite3
conn = sqlite3.connect('example.db') #store on disk
conn = sqlite3.connect(':memory:') #store in memory
USING DATABASES WITH PYTHON
Execute
#execute



import sqlite3
conn = sqlite3.connect('example.db')

c = conn.cursor()

t = ('RHAT',)

c.execute('SELECT * FROM stocks WHERE symbol=?', t)
USING DATABASES WITH PYTHON
Execute
#executemany



import sqlite3
conn = sqlite3.connect('example.db')

c = conn.cursor()

purchases = [('2006-03-28', 'BUY', 'IBM', 1000, 45.00),

('2006-04-05', 'BUY', 'MSFT', 1000, 72.00),

('2006-04-06', 'SELL', 'IBM', 500, 53.00),]
c.executemany('INSERT INTO stocks VALUES (?,?,?,?,?)', purchases)
USING DATABASES WITH PYTHON
fetch
#fetchone



import sqlite3
conn = sqlite3.connect('example.db')

c = conn.cursor()

c.execute('SELECT * FROM stocks')
c.fetchone()
('2006-01-05', 'BUY', 'RHAT', 100.0, 35.14)
USING DATABASES WITH PYTHON
fetch
#fetchall



import sqlite3
conn = sqlite3.connect('example.db')

c = conn.cursor()

c.execute('SELECT * FROM stocks')
for d in c.fetchall():

    print(d)
('2006-01-05', 'BUY', 'RHAT', 100.0, 35.14)

('2006-03-28', 'BUY', 'IBM', 1000.0, 45.0)

('2006-04-05', 'BUY', 'MSFT', 1000.0, 72.0)

('2006-04-06', 'SELL', 'IBM', 500.0, 53.0)
USING DATABASES WITH PYTHON
Context manager
import sqlite3
con = sqlite3.connect(":memory:")
con.execute("create table person (id integer primary key, firstname
varchar unique)")
#con.commit() is called automatically afterwards

with con:

    con.execute("insert into person(firstname) values (?)", ("Joe",))
USING DATABASES WITH PYTHON
Read more : 

https://docs.python.org/2/library/sqlite3.html

https://www.tutorialspoint.com/python/python_database_access.htm
USING DATABASES WITH PYTHON
Quiz#5 : Post DB
1. Register as Facebook Developer on
developers.facebook.com
2. Get information on the posts from the last 10 hours on the page

https://www.facebook.com/MorningNewsTV3

(post id, post datetime, number of likes, current datetime)
3. Design and create a table to store the posts
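A minimal sketch of the storage side, assuming the post fields listed above; the rows would come from the Graph API call in Quiz#3, and the single row here is a placeholder:

import sqlite3
from datetime import datetime

conn = sqlite3.connect('posts.db')
conn.execute('''CREATE TABLE IF NOT EXISTS posts
                (post_id TEXT, post_datetime TEXT,
                 like_count INTEGER, fetched_at TEXT)''')

# Placeholder row standing in for data fetched from the Graph API.
post_rows = [('123_456', '2016-11-08T14:00:00+0000', 42, datetime.now().isoformat())]
conn.executemany('INSERT INTO posts VALUES (?,?,?,?)', post_rows)
conn.commit()
conn.close()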

PROCESSING AND VISUALIZING DATA WITH PYTHON
Processing and Visualizing
PROCESSING AND VISUALIZING DATA WITH PYTHON
▸ Processing : pandas
pip install pandas
high-performance, easy-to-use data structures and
data analysis tools
PROCESSING AND VISUALIZING DATA WITH PYTHON
Pandas : Series
#create series with Array-like



import pandas as pd

from numpy.random import rand
s = pd.Series(rand(5), index=['a', 'b', 'c', 'd', 'e'])
print(s)
a 0.690232

b 0.738294

c 0.153817

d 0.619822

e 0.4347
PROCESSING AND VISUALIZING DATA WITH PYTHON
Pandas : Series
#create series with dictionary



import pandas as pd

from numpy.random import rand



d = {'a' : 0., 'b' : 1., 'c' : 2.}
s = pd.Series(d) #with dictionary
print(s)
a 0

b 1

c 2
dtype: float64
PROCESSING AND VISUALIZING DATA WITH PYTHON
Pandas : Series
#create series with Scalar



import pandas as pd

from numpy.random import rand
s = pd.Series(5., index=['a', 'b', 'a', 'd', 'a']) #index can duplicate
print(s['a'])
a 5

a 5

a 5
dtype: float64
PROCESSING AND VISUALIZING DATA WITH PYTHON
Pandas : Series
#access series data



import pandas as pd

from numpy.random import rand
s = pd.Series(5., index=['a', 'b', 'a', 'd', 'a']) #index can duplicate
print(s[0])

print(s[:3])
5.0

a 5

b 5

a 5
dtype: float64
PROCESSING AND VISUALIZING DATA WITH PYTHON
Pandas : Series
#series operations



import pandas as pd

from numpy.random import rand

import numpy as np
s = pd.Series(rand(10))
s = s + 2

s = s * s

s = np.exp(s)

print(s)

0 187.735606

1 691.660752

2 60.129741

3 595.438606

4 769.479456

5 397.052123

6 4691.926483

7 1427.593520

8 180.001824

9 410.994395
dtype: float64
PROCESSING AND VISUALIZING DATA WITH PYTHON
Pandas : Series
#series filtering



import pandas as pd

from numpy.random import rand

import numpy as np
s = pd.Series(rand(10))
s = s[s > 0.1]

print(s)

1 0.708700

2 0.910090

3 0.380613

6 0.692324

7 0.508440

8 0.763977

9 0.470675
dtype: float64
PROCESSING AND VISUALIZING DATA WITH PYTHON
Pandas : Series
#series incomplete data



import pandas as pd

from numpy.random import rand

import numpy as np
s1 = pd.Series(rand(10))

s2 = pd.Series(rand(8))
s = s1 + s2

print(s)

0 0.813747

1 1.373839

2 1.569716

3 1.624887

4 1.515665

5 0.526779

6 1.544327

7 0.740962

8 NaN

9 NaN
dtype: float64
PROCESSING AND VISUALIZING DATA WITH PYTHON
Pandas : DataFrame
A 2-dimensional labeled data structure with columns of potentially different types
PROCESSING AND VISUALIZING DATA WITH PYTHON
Pandas : DataFrame
#create dataframe with dict



d = {'one' : pd.Series([1., 2., 3.], index=['a', 'b', 'c']),

'two' : pd.Series([1., 2., 3., 4.], index=['a', 'b', 'c', 'd'])}



df = pd.DataFrame(d)

print(df)
one two

a 1 1

b 2 2

c 3 3

d NaN 4
PROCESSING AND VISUALIZING DATA WITH PYTHON
Pandas : DataFrame
#create dataframe with dict list



d = {'one' : [1., 2., 3., 4.], 'two' : [4., 3., 2., 1.]}



df = pd.DataFrame(d)

print(df)
one two

0 1 4

1 2 3

2 3 2

3 4 1
PROCESSING AND VISUALIZING DATA WITH PYTHON
Pandas : DataFrame
#access dataframe column



d = {'one' : [1., 2., 3., 4.], 'two' : [4., 3., 2., 1.]}



df = pd.DataFrame(d)

print(df['one'])
0 1

1 2

2 3

3 4
Name: one, dtype: float64
PROCESSING AND VISUALIZING DATA WITH PYTHON
Pandas : DataFrame
#access dataframe row



d = {'one' : [1., 2., 3., 4.], 'two' : [4., 3., 2., 1.]}



df = pd.DataFrame(d)

print(df.iloc[:3])
one two

0 1 4

1 2 3

2 3 2
PROCESSING AND VISUALIZING DATA WITH PYTHON
Pandas : DataFrame
#add new column



d = {'one' : [1., 2., 3., 4.], 'two' : [4., 3., 2., 1.]}

df = pd.DataFrame(d)

df['three'] = [1,2,3,2]

print(df)
one two three

0 1 4 1

1 2 3 2

2 3 2 3

3 4 1 2
PROCESSING AND VISUALIZING DATA WITH PYTHON
Pandas : DataFrame
#show data : head() and tail()



d = {'one' : [1., 2., 3., 4.], 'two' : [4., 3., 2., 1.]}

df = pd.DataFrame(d)

df['three'] = [1,2,3,2]

print(df.head())

print(df.tail())
one two three

0 1 4 1

1 2 3 2

2 3 2 3

3 4 1 2
PROCESSING AND VISUALIZING DATA WITH PYTHON
Pandas : DataFrame
#dataframe summary



d = {'one' : [1., 2., 3., 4.], 'two' : [4., 3., 2., 1.]}

df = pd.DataFrame(d)

print(df.describe())
PROCESSING AND VISUALIZING DATA WITH PYTHON
Pandas : DataFrame
#dataframe function



d = {'one' : [1., 2., 3., 4.], 'two' : [4., 3., 2., 1.]}

df = pd.DataFrame(d)
print(df.mean())
one 2.5

two 2.5

dtype: float64
PROCESSING AND VISUALIZING DATA WITH PYTHON
Pandas : DataFrame
#dataframe function



d = {'one' : [1., 2., 3., 4.], 'two' : [4., 3., 2., 1.]}

df = pd.DataFrame(d)

print(df.corr()) #calculate correlation
one two

one 1 -1

two -1 1
PROCESSING AND VISUALIZING DATA WITH PYTHON
Pandas : DataFrame
#dataframe filtering



d = {'one' : [1., 2., 3., 4.], 'two' : [4., 3., 2., 1.]}

df = pd.DataFrame(d)

print(df[(df['one'] > 1) & (df['one'] < 3)])
one two
1 2 3
PROCESSING AND VISUALIZING DATA WITH PYTHON
Pandas : DataFrame
#dataframe filtering with isin



d = {'one' : [1., 2., 3., 4.], 'two' : [4., 3., 2., 1.]}

df = pd.DataFrame(d)

print(df[df['one'].isin([2,4])])
one two
1 2 3

3 4 1
PROCESSING AND VISUALIZING DATA WITH PYTHON
Pandas : DataFrame
#dataframe with row data



d = [ [1., 2., 3., 4.], [4., 3., 2., 1.]]

df = pd.DataFrame(d)

df.columns = ["one","two","three","four"]

print(df)
one two three four

0 1 2 3 4

1 4 3 2 1
PROCESSING AND VISUALIZING DATA WITH PYTHON
Pandas : DataFrame
#dataframe sort values



d = [ [2., 1., 3., 4.], [1., 3., 2., 4.]]

df = pd.DataFrame(d)

df.columns = ["one","two","three","four"]

df = df.sort_values(["one","two"], ascending=[1,0])

print(df)
one two three four

1 1 3 2 4

0 2 1 3 4
PROCESSING AND VISUALIZING DATA WITH PYTHON
Pandas : DataFrame
#dataframe from csv file



df = pd.read_csv('file.csv')

print(df)
one two three
0 1 2 3

1 1 2 3

2 1 2 3
file.csv



one,two,three

1,2,3

1,2,3

1,2,3
PROCESSING AND VISUALIZING DATA WITH PYTHON
Pandas : DataFrame
#dataframe from csv file, without header



df = pd.read_csv('file.csv', header=None)

print(df)
0 1 2
0 1 2 3

1 1 2 3

2 1 2 3
file.csv



1,2,3

1,2,3

1,2,3
PROCESSING AND VISUALIZING DATA WITH PYTHON
Pandas : DataFrame
#dataframe from html, need to install lxml first (pip install lxml)



df = pd.read_html('https://simple.wikipedia.org/wiki/List_of_U.S._states')



print(df[0])
Abbreviation State Name Capital Became a State
1 AL Alabama Montgomery December 14, 1819
2 AK Alaska Juneau January 3, 1959
3 AZ Arizona Phoenix February 14, 1912
PROCESSING AND VISUALIZING DATA WITH PYTHON
Quiz#6 : Data Exploration
1. Go to https://archive.ics.uci.edu/ml/datasets/Adult

to read the data description
2. Parse the data into pandas using read_csv() and set the
column names
3. Explore the data to answer the following questions:

- Find the number of persons in each education level.

- Find the correlation and covariance between the continuous
fields.

- Find the average age of the United-States population where
income is >50K.
PROCESSING AND VISUALIZING DATA WITH PYTHON
Quiz#6 : Data Exploration
df[3].value_counts()
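A hedged sketch of the exploration, assuming adult.data has been downloaded locally; the column names follow the UCI dataset description:

import pandas as pd

cols = ['age', 'workclass', 'fnlwgt', 'education', 'education-num',
        'marital-status', 'occupation', 'relationship', 'race', 'sex',
        'capital-gain', 'capital-loss', 'hours-per-week',
        'native-country', 'income']
df = pd.read_csv('adult.data', header=None, names=cols, skipinitialspace=True)

print(df['education'].value_counts())   # persons per education level
num = df.select_dtypes(include='number')
print(num.corr())                       # correlation of the continuous fields
print(num.cov())                        # covariance of the continuous fields
mask = (df['native-country'] == 'United-States') & (df['income'] == '>50K')
print(df[mask]['age'].mean())           # average age, US rows with income >50K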
PROCESSING AND VISUALIZING DATA WITH PYTHON
▸ Visualizing : seaborn
pip install seaborn
visualization library based on matplotlib
PROCESSING AND VISUALIZING DATA WITH PYTHON
▸ Visualizing : seaborn
seaborn : set inline plot for jupyter
%matplotlib inline

import numpy as np

import seaborn as sns



# Generate some sequential data
x = np.array(list("ABCDEFGHI"))
y1 = np.arange(1, 10)
sns.barplot(x=x, y=y1)
PROCESSING AND VISUALIZING DATA WITH PYTHON
▸ Visualizing : seaborn
seaborn : plot result
PROCESSING AND VISUALIZING DATA WITH PYTHON
▸ Visualizing : seaborn
seaborn : set layout
%matplotlib inline

import numpy as np

import seaborn as sns

import matplotlib.pyplot as plt
f,ax = plt.subplots(1,1,figsize=(10, 10))

sns.barplot(x=[1,2,3,4,5],y=[3,2,3,4,2])
PROCESSING AND VISUALIZING DATA WITH PYTHON
▸ Visualizing : seaborn
seaborn : set layout
PROCESSING AND VISUALIZING DATA WITH PYTHON
▸ Visualizing : seaborn
seaborn : set layout
%matplotlib inline

import numpy as np

import seaborn as sns

import matplotlib.pyplot as plt
f,ax = plt.subplots(2,2,figsize=(10, 10))

sns.barplot(x=[1,2,3,4,5],y=[3,2,3,4,2],ax=ax[0,0])

sns.distplot([3,2,3,4,2],ax=ax[0,1])
PROCESSING AND VISUALIZING DATA WITH PYTHON
▸ Visualizing : seaborn
seaborn : set layout
PROCESSING AND VISUALIZING DATA WITH PYTHON
▸ Visualizing : seaborn
seaborn : axis setting
%matplotlib inline

import numpy as np

import seaborn as sns

import matplotlib.pyplot as plt
f,ax = plt.subplots(figsize=(10, 5))

sns.barplot(x=[1,2,3,4,5],y=[3,2,3,4,2])
ax.set_xlabel("number")

ax.set_ylabel("value")
PROCESSING AND VISUALIZING DATA WITH PYTHON
▸ Visualizing : seaborn
seaborn : axis setting
PROCESSING AND VISUALIZING DATA WITH PYTHON
▸ Visualizing : seaborn
seaborn : with pandas dataframe
%matplotlib inline

import numpy as np

import seaborn as sns

import matplotlib.pyplot as plt



import pandas as pd

d = {'x' : [1., 2., 3., 4.], 'y' : [4., 3., 2., 1.]}

df = pd.DataFrame(d)
f,ax = plt.subplots(figsize=(10, 5))

sns.barplot(x='x', y='y', data=df)
PROCESSING AND VISUALIZING DATA WITH PYTHON
▸ Visualizing : seaborn
seaborn : with pandas dataframe
PROCESSING AND VISUALIZING DATA WITH PYTHON
▸ Visualizing : seaborn
seaborn : plot types
http://seaborn.pydata.org/examples/index.html
PROCESSING AND VISUALIZING DATA WITH PYTHON
Quiz#7 : Adult Plot
1. Go to https://archive.ics.uci.edu/ml/datasets/Adult

to read the data description
2. Parse the data into pandas using read_csv() and set the
column names
3. Plot five charts.
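A short sketch of two of the five charts, re-loading the Adult data as in Quiz#6 (the remaining charts follow the same pattern; the choice of plots is up to the reader):

%matplotlib inline
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

cols = ['age', 'workclass', 'fnlwgt', 'education', 'education-num',
        'marital-status', 'occupation', 'relationship', 'race', 'sex',
        'capital-gain', 'capital-loss', 'hours-per-week',
        'native-country', 'income']
df = pd.read_csv('adult.data', header=None, names=cols, skipinitialspace=True)

f, ax = plt.subplots(2, 1, figsize=(10, 10))
sns.countplot(x='education', data=df, ax=ax[0])   # persons per education level
sns.distplot(df['age'], ax=ax[1])                 # age distribution
ax[0].tick_params(axis='x', rotation=90)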


