A Tour of Google Cloud Platform
- 1. Google Cloud Platform
Getting Started with Google's Infrastructure and Platform
+ColinSu
Developer Expert, Google Cloud Platform
Software Architect, Tagtoo
- 3. Google Cloud Platform
> Overview of Google Cloud Platform
> Google App Engine - Platform-as-a-Service in Google Cloud
> App Services in Google Cloud Platform
> Google Compute Engine - Infrastructure-as-a-Service in Google Cloud
> BigData Lab - What we did in GCDC 2013?
Outlines
- 6. Google Cloud Platform
> Why
+ more data than your PC/servers can store
+ more computation than your PC/servers can handle
+ your PC/servers are hard to scale
> Why not
+ need a website
+ it sounds cool
- 8. Google Cloud Platform
> Access Control
public, private networks
block storage
> Encryption
all block storage is encrypted in the
cloud to protect against leaks
> Encapsulation
all instances, virtual machines, networks and
other resources are isolated so that
no one else can take over your
precious stuff
Highly Secured Cloud
by Google
- 9. Google Cloud Platform
> You will be using Google's Infrastructure
Virtual Machines
Networks
Storage
> And it all lives in a safe place
Google's Data Center
> And Google will handle these for you
Scaling
Migrating
Maintenance
Take over anything you don't wanna do
Powered by Google
- 10. Google Cloud Platform
> The best way for Google to share its
+ Cloud Infrastructure
+ Cloud Knowledge
+ Cloud Engineers
> Your own data center, at the lowest
possible cost
So What is Google Cloud Platform
- 11. Google Cloud Platform
Google Cloud Family
Computing
Compute Engine
App Engine
Storage
Cloud SQL
Datastore
Cloud Storage
App Services
Cloud Endpoints
BigQuery
- 12. Google Cloud Platform
> Manage all API services on Google
Cloud
(e.g. Translation API, Prediction API, Maps API...)
> Compose equivalent commands for:
Command-line tools (Google Cloud SDK)
RESTful API
> Dashboard for monitoring all
resources on Google Cloud Platform
Cloud Developer Console
- 13. Google Cloud Platform
> Install/uninstall/upgrade all
command-line tools related to Google
Cloud Platform
> Notification for new release of any
Cloud SDK component
> Automation
Google Cloud SDK
- 15. Google Cloud Platform
> It's MySQL, but managed by Google
> Relational Data Storage on Google Cloud
> Use Cases
+ LAMP Applications
+ Google App Engine
Cloud SQL
- 16. Google Cloud Platform
> Non-relational database (NoSQL)
> Schema-less data
> Use Cases
+ Highly scalable application
Cloud Datastore
- 17. Google Cloud Platform
> Protected
Your data is protected at multiple physical locations
> Strong, configurable security
OAuth or simple access control on your data
> Multiple usages
+ Serve static objects directly
+ Use with other Google Cloud products (Bridge)
Cloud Storage
- 19. Google Cloud Platform
> Data Analysis Tool
+ BigQuery
+ Google Prediction API
> Cloud Endpoints
> Google Cloud DNS
App Services in Google Cloud Platform
- 20. Google Cloud Platform
> Analyze terabytes of data with just the click of a button
> Super-fast, SQL-like queries
> Convenient import/export mechanism
BigQuery
- 21. Google Cloud Platform
> Previewing of data
> Statistics of tables
> History/Cached Result
> Save query result as another
BigQuery table
BigQuery Browser Tool
- 22. Google Cloud Platform
Popular Languages on Github
SELECT repository.language, COUNT(repository.language) as num
FROM [publicdata:samples.github_nested]
GROUP BY repository.language
ORDER BY num DESC
LIMIT 10
BIGQUERY
1.6s elapsed, 12.8 MB processed
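To make the query above concrete, here is a minimal Python 3 sketch of the same aggregation: count rows per language, sort by count in descending order, and keep the top entries. The sample rows and the function name `top_languages` are made up for illustration. They are not taken from the public GitHub sample table, and this is not how BigQuery executes the query internally.

```python
from collections import Counter

# Hypothetical rows standing in for the repository.language column.
SAMPLE_ROWS = [
    {"language": "JavaScript"},
    {"language": "Ruby"},
    {"language": "JavaScript"},
    {"language": "Python"},
    {"language": "Ruby"},
    {"language": "JavaScript"},
]


def top_languages(rows, limit=10):
    """GROUP BY language, COUNT, ORDER BY count DESC, LIMIT `limit`."""
    counts = Counter(row["language"] for row in rows)
    return counts.most_common(limit)


if __name__ == "__main__":
    for language, num in top_languages(SAMPLE_ROWS):
        print(language, num)
```

On a real table the interesting part is scale: the same four clauses run over millions of rows without any change to the query text.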
- 23. Google Cloud Platform
> Command-line Tool
a full-featured command-line tool called bq is included in the Google Cloud SDK
> RESTful API
a set of APIs lets you control all components and data in BigQuery
> BigQuery Connector for Excel
Microsoft Excel? No problem: a connector lets you run BigQuery queries from within Excel
> Third-party Tools
> Make your own
More Ways To Use BigQuery
- 24. Google Cloud Platform
> Generate APIs and client libraries from an App
Engine application
> Make it easier to share a web backend with mobile
applications
Cloud Endpoints
- 25. Google Cloud Platform
Cloud Endpoints Architecture
iOS
Objective-C Client Library
Android
Java Client Library
Web Browser
JavaScript Client Library
Google Cloud Endpoints
Google App Engine
API Backend Instances
- 26. Google Cloud Platform
> Website
solid integration: define APIs in a Google App Engine
application and generate a JavaScript client library with
Endpoints, no more handmade AJAX
> API Server
define APIs with the Endpoints API and they are immediately
available as a RESTful API service
> Mobile Applications
Backend-as-a-Service
define reusable APIs on top of the various GAE services, then
generate client libraries for iOS, Android and web browsers to
share the resources you have
Use Cases
- 27. Google Cloud Platform
> Machine Learning
+ Categorical
+ Regression
> Pattern-matching
> Simple API Interface
Prediction API
- 28. Google Cloud Platform
> Recommendation System
Predict what will be liked by your users
> Spam Filtering
Categorize messages as spam or non-spam
> Sentiment Analysis
Know how your users feel, given their comments
What can you do with Prediction API
- 32. Google Cloud Platform
How do you build a fully functional web service?
a scalable, high-performance, fault-tolerant service
- 33. Google Cloud Platform
> LAMP
+ Linux
+ Apache
+ MySQL
+ Programming Languages
> Failed: not scalable
Traditional Way: LAMP
Apache
MySQL
- 34. Google Cloud Platform
> Power up Apache2 army!
> Failed: database is alone, too busy
You Need Load Balancing
Apache
MySQL
Apache Apache
- 35. Google Cloud Platform
> Replication (Master/Slave)
> Failed: Master may die
Now Scale Database
Apache
MySQL
Apache Apache
MySQL
Master
Read/Write
Slave
Read
Replication
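The master/slave setup above can be sketched as a small router: every write goes to the single master, and reads rotate across the replicas. This is a toy Python 3 model with hypothetical names (`ReplicatedDatabase`, `route`), not a real MySQL driver. It only decides which server a statement should go to.

```python
import itertools


class ReplicatedDatabase:
    """Toy router: writes go to the master, reads rotate across replicas."""

    def __init__(self, master, replicas):
        self.master = master
        self.replicas = list(replicas)
        # With no replicas configured, reads fall back to the master.
        self._read_cycle = itertools.cycle(self.replicas or [master])

    def route(self, statement):
        """Return the name of the server that should run `statement`."""
        words = statement.split()
        verb = words[0].upper() if words else ""
        if verb == "SELECT":
            return next(self._read_cycle)
        # INSERT, UPDATE, DELETE and everything else must hit the master.
        return self.master


if __name__ == "__main__":
    db = ReplicatedDatabase("master", ["slave-1", "slave-2"])
    for sql in ("SELECT * FROM users", "INSERT INTO users VALUES (1)", "SELECT 1"):
        print(sql, "->", db.route(sql))
```

The sketch also shows the weakness the slide points out: every write path ends at `self.master`, so if the master dies, writes stop.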
- 37. Google Cloud Platform
> You should care
+ Application code
+ Automated Scaling
> You should not care
+ Server management (networking, cores, memory, disks...)
+ Bootstrapping
+ Deployment
Platform-as-a-Service
- 38. Google Cloud Platform
> Application code gets executed (Runtime)
> Static Content gets served (CDN)
> Data gets stored (Database)
> Server gets secured (Sandbox)
> Service gets scaled (and automatically)
How PaaS Works
- 39. Google Cloud Platform
> Application
+ Code
> Front End
+ Load Balancer
+ Routing
+ Security
+ DNS Setting
> Storage
+ SQL/NoSQL
+ Memcache
+ Static Content
+ Block Storage
> Service Support
+ Mail
+ Authentication
+ Socket
+ Auto Scaling
+ Cron Job
+ Queue
Your Responsibility
Google's Responsibility
- 40. Google Cloud Platform
> Sandboxed containers with various runtimes
> Easy to build
All you need to do is prepare your application code
> Easy to run
Deploy with a single command, and it works
> Easy to scale
scaling on GAE is automated and easy to configure
Google App Engine
- 41. Google Cloud Platform
> Java
Java Servlets interface
Support for standard interfaces to App Engine scalable services such as JDO, JPA, JavaMail and JCache
> Python
Python 2.7 and full support for any pure Python libraries, tools and frameworks
Built-in Compiled C-extension libraries are good to go
> PHP
Currently in "Preview" stage
But enough to run WordPress
> Go
Currently in "Experimental" stage
Automated build service included, so there is no need to rebuild manually when the code changes
The interface is similar to the standard Go http package
Languages and Runtimes
- 43. Google Cloud Platform
> Datastore
schema-less, scalable object data storage
rich data modeling API
SQL-like query language, GQL (Google Query Language)
> Cloud Storage
strong, flexible, distributed storage service for serving or storing static files
> Search
Google-like search on structured data, such as full text, numbers, dates and geographic locations
> Memcache
a distributed, in-memory data cache that can greatly improve your application's performance
> Logs
programmatic access to logging system
a fully functional control panel in Cloud Console, better than a gzip file
> Migration/Backup Tools
Data on Google App Engine
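Memcache is typically placed in front of a slower store such as Datastore, following the cache-aside pattern: look in the cache first, fall back to the store on a miss, then remember the result. The sketch below uses a plain dict in place of Memcache and a caller-supplied function in place of a Datastore read. The class name `CacheAside` and its methods are illustrative assumptions, not the App Engine API.

```python
class CacheAside:
    """Cache-aside lookup: a dict stands in for Memcache."""

    def __init__(self, load_from_store):
        self._load = load_from_store   # slow lookup, e.g. a Datastore read
        self._cache = {}
        self.misses = 0

    def get(self, key):
        """Return the cached value, loading and caching it on a miss."""
        if key in self._cache:
            return self._cache[key]
        self.misses += 1
        value = self._load(key)
        self._cache[key] = value
        return value

    def invalidate(self, key):
        """Drop a key after the underlying data changes."""
        self._cache.pop(key, None)


if __name__ == "__main__":
    store = {"user:1": "Colin"}        # hypothetical stored data
    cache = CacheAside(store.get)
    cache.get("user:1")                # miss: reads the store
    cache.get("user:1")                # hit: served from the cache
    print("misses:", cache.misses)
```

A real cache also needs expiry and size limits. The sketch leaves those out to keep the read path visible.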
- 44. Google Cloud Platform
> Channel
Create a persistent connection between your application and Google servers
Send messages to JavaScript clients in real-time
> Mail
send email messages on behalf of admin or Google account users
receive mails at various custom email addresses
> URL Fetch
Efficiently issue HTTP or HTTPS requests from your web application
> Outbound Socket
Socket support without requiring any special App Engine libraries or code (just import socket in Python)
> XMPP
Enable your application to send and receive chat messages to/from any XMPP-compatible messaging service (e.g. Facebook
Chat, the former Google Talk...)
Communication
- 45. Google Cloud Platform
> Task Queue
lets your application move work out of user requests and organize it to be executed asynchronously later
> Scheduled Task (Cron Job)
configure regular tasks at scheduled times or regular intervals
Process Management
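The idea behind a task queue is that the request handler only records what needs to be done, and a worker runs it later. The following Python 3 sketch models that with an in-process deque. The `TaskQueue` class and its `add` and `run_pending` methods are hypothetical names for illustration and do not mirror the App Engine Task Queue API.

```python
from collections import deque


class TaskQueue:
    """In-process stand-in for a task queue: enqueue now, run later."""

    def __init__(self):
        self._tasks = deque()

    def add(self, func, *args, **kwargs):
        """Record a call to make later; returns immediately."""
        self._tasks.append((func, args, kwargs))

    def run_pending(self):
        """Run queued tasks in FIFO order and return their results."""
        results = []
        while self._tasks:
            func, args, kwargs = self._tasks.popleft()
            results.append(func(*args, **kwargs))
        return results


if __name__ == "__main__":
    queue = TaskQueue()
    outbox = []

    def handle_signup(email):
        # The user-facing request only enqueues the slow work.
        queue.add(outbox.append, "welcome mail for " + email)
        return "202 Accepted"

    print(handle_signup("user@example.com"), outbox)   # nothing sent yet
    queue.run_pending()                                # a worker would do this
    print(outbox)
```

A cron job fits the same model: instead of a user request, a schedule is what adds the task.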
- 46. Google Cloud Platform
> Modules
Create instances that are exempt from request deadlines and can request more memory and CPU resources for computing
> MapReduce
an optimized adaptation of the MapReduce computing model for efficiently distributing computation over large data sets
> Images API
Manipulate, combine and enhance images
Convert images between formats
Query metadata of images (height/width, colors)
Computation
- 48. Google Cloud Platform
> Connects together complex, time-consuming workflows
> Asynchronous tasks
> Built-in pipelines or implement your own pipelines
Pipeline
- 49. Google Cloud Platform
Create a Pipeline
import pipeline
from pipeline import common

class CountReport(pipeline.Pipeline):

    def run(self, email_address, entity_kind, property_name, *value_list):
        # Child pipeline that computes the count result
        split_counts = yield SplitCount(entity_kind, property_name, *value_list)
        yield common.Log.info('SplitCount result = %s', split_counts)

        # Run the following steps only after split_counts is ready, one after another
        with pipeline.After(split_counts):
            with pipeline.InOrder():
                yield common.Delay(seconds=1)
                yield common.Log.info('Done waiting')
                # Another child pipeline that sends the report by mail
                yield EmailCountReport(email_address, split_counts)
PYTHON
- 51. Google Cloud Platform
> Programming model for processing large data sets with a parallel,
distributed algorithm on a cluster
> Differs from map/reduce, a concept from functional programming, but shares
the same idea: "divide and conquer"
> Proposed by Google
> Hadoop-free
MapReduce Library
https://developers.google.com/appengine/docs/python/dataprocessing/mapreduce_library
- 52. Google Cloud Platform
> map()/reduce() in Python
> map(func(elem), list) -> list
> reduce(func(elem1, elem2), list) -> elem
MapReduce in Functional Programming
>>> map(lambda x: x*2, [1,2,3,4])
[2, 4, 6, 8]
>>> reduce(lambda x,y: x+y, [1,2,3,4])
10
PYTHON
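The session above is Python 2.7, where `map` returns a list and `reduce` is a builtin. For readers on Python 3, a minimal equivalent looks like this: `map` returns a lazy iterator, so it is wrapped in `list`, and `reduce` has to be imported from `functools`.

```python
from functools import reduce

numbers = [1, 2, 3, 4]

# map: apply a function to every element independently.
doubled = list(map(lambda x: x * 2, numbers))

# reduce: fold the elements pairwise into a single value.
total = reduce(lambda x, y: x + y, numbers)

print(doubled)  # [2, 4, 6, 8]
print(total)    # 10
```

The map step touches each element on its own, which is why it can be split across machines. The reduce step is where the partial results get combined.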
- 54. Google Cloud Platform
Configure a MapReduce Pipeline
from mapreduce import base_handler
from mapreduce import mapreduce_pipeline

class WordCountPipeline(base_handler.PipelineBase):

    def run(self, filekey, blobkey):
        output = yield mapreduce_pipeline.MapreducePipeline(
            "word_count",                                       # name of mapreduce job
            "main.word_count_map",                              # mapper function
            "main.word_count_reduce",                           # reducer function
            "mapreduce.input_readers.BlobstoreZipInputReader",  # input reader
            "mapreduce.output_writers.FileOutputWriter",        # output writer
            mapper_params={      # parameters to supply to the input reader
                "input_reader": {
                    "blob_key": blobkey,
                },
            },
            reducer_params={     # parameters to supply to the output writer
                "output_writer": {
                    "mime_type": "text/plain",
                    "output_sharding": "input",
                    "filesystem": "blobstore",
                },
            },
            shards=16)           # number of shards
        # Child pipeline that stores the location of the output
        yield StoreOutput("WordCount", filekey, output)
PYTHON
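The pipeline above only names its mapper and reducer. As a rough idea of what a word count job computes, here is a single-process Python 3 sketch of the three phases: map, shuffle (group by key), and reduce. The function bodies are assumptions written for illustration. The real App Engine MapReduce library passes different arguments to mappers and reducers and runs them across shards.

```python
from collections import defaultdict


def word_count_map(text):
    """Map phase: emit a (word, 1) pair for every word in the text."""
    for word in text.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle phase: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def word_count_reduce(word, counts):
    """Reduce phase: collapse the values of one key into a total."""
    return word, sum(counts)


def run_word_count(documents):
    """Run map, shuffle and reduce in one process over a list of strings."""
    pairs = (pair for doc in documents for pair in word_count_map(doc))
    return dict(word_count_reduce(w, c) for w, c in shuffle(pairs).items())


if __name__ == "__main__":
    print(run_word_count(["the cloud", "the platform in the cloud"]))
```

In a distributed run, each shard would execute the map phase on its slice of the input, and the framework would perform the shuffle between machines.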
- 57. Google Cloud Platform
> Google has resources
+ CPU Cores
+ Memory
+ Networking
+ Persistency (Disks, Snapshot, Cloud Storage...)
+ Well-trained engineering monkeys
> You have a business to run, and you are busy
Infrastructure-as-a-Service
- 58. Google Cloud Platform
> High-performance virtual machines
from micro-VM to large instance
> Powered by Google's global network
you could build a large cluster with strong and consistent bandwidth, provided by
Google
> Load Balancing
spread incoming traffic across instances
> Fast Bullet Reloading
quick deployment of large VMs
command-line interface
web-based console
> Highly secured
All data written to disk in Compute Engine is encrypted with a strong
encryption algorithm
Google Compute Engine
- 59. Google Cloud Platform
> KVM-based Virtual machines
> Fast booting time
routinely takes less than 30 secs
> Various OS support
> Various machine types
Instances on GCE
http://gce-demos.appspot.com
- 62. Google Cloud Platform
> A unit of CPU capacity used to describe the compute power of instance
types
> 2.75 GCEUs = the minimum power of one logical core (a hardware hyper-thread) on the Sandy Bridge platform
Google Compute Engine Units (GCEUs)
- 64. Google Cloud Platform
Machine Types (Standard)
n1-standard-n
Starts from 1 core
Starts from 3.75 GB memory
n    Virtual CPUs   Memory     GCEUs
1    1              3.75 GB    2.75
2    2              7.50 GB    5.50
4    4              15 GB      11
8    8              30 GB      22
16   16             60 GB      44
- 65. Google Cloud Platform
Machine Types (High Memory)
n1-highmem-n
Starts from 2 cores
Starts from 13 GB memory
n    Virtual CPUs   Memory     GCEUs
2    2              13 GB      5.50
4    4              26 GB      11
8    8              52 GB      22
16   16             104 GB     44
- 66. Google Cloud Platform
Machine Types (High CPU)
n1-highcpu-n
Starts from 2 cores
Starts from 1.8 GB memory
n    Virtual CPUs   Memory     GCEUs
2    2              1.8 GB     5.50
4    4              3.6 GB     11
8    8              7.2 GB     22
16   16             14.4 GB    44
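The machine-type names encode their size. Assuming each virtual CPU contributes 2.75 GCEUs, as in the GCEU definition, and that memory scales linearly per virtual CPU within a family (3.75 GB for standard, 6.5 GB for highmem, 0.9 GB for highcpu, read off the smallest row of each table), the specs can be derived with a short Python 3 sketch. The function name `machine_type` and the dictionary layout are illustrative, not part of any Google API.

```python
GCEUS_PER_VCPU = 2.75

# Memory per virtual CPU in GB, derived from the smallest type of each family.
MEMORY_GB_PER_VCPU = {
    "standard": 3.75,
    "highmem": 6.5,
    "highcpu": 0.9,
}


def machine_type(family, vcpus):
    """Derive the name, memory and GCEUs of an n1 machine type."""
    return {
        "name": "n1-%s-%d" % (family, vcpus),
        "vcpus": vcpus,
        "memory_gb": round(MEMORY_GB_PER_VCPU[family] * vcpus, 2),
        "gceus": round(GCEUS_PER_VCPU * vcpus, 2),
    }


if __name__ == "__main__":
    for family in ("standard", "highmem", "highcpu"):
        for vcpus in (2, 4, 8, 16):
            print(machine_type(family, vcpus))
```

For example, `machine_type("standard", 8)` yields 30 GB of memory and 22 GCEUs.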
- 68. Google Cloud Platform
> Transparent Maintenance
> Automatically restarts instances shut down by system events
> During transparent maintenance, you can set GCE to handle your
instance in one of two ways:
+ Live migrate
may affect performance to some degree
but keeps your instances online (no downtime)
+ Terminate and reboot
Live Migration
- 69. Google Cloud Platform
> Virtual SCSI device
> Block Storage
> Persistent until deleted
> Hot-plug to GCE instances (attach/detach)
Persistent Disk
- 70. Google Cloud Platform
> Primary Disk: OS boot volume
Persistent Disk Mode
GCE Virtual Machine
Root
Stateful Root Volume
- 71. Google Cloud Platform
> Additional Disk: Read/Write Mode for user managed data volume
Persistent Disk Mode
GCE Virtual Machine
Root
Stateful Root Volume
RW Data
Stateful Data Volume
- 72. Google Cloud Platform
> Distribution Disk: Instant distribution of static content
Persistent Disk Mode
GCE Virtual Machine
RO Data
Read-Only Data Volume
GCE Virtual Machine
GCE Virtual Machine
- 73. Google Cloud Platform
> Target Pools
> Health Checking
> Forwarding Rules
iptables for target pools
Load Balancing
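The three pieces above fit together roughly as follows: a forwarding rule matches incoming traffic and hands it to a target pool, and the pool only sends traffic to instances that pass the health check. The Python 3 sketch below is a toy model with hypothetical class names. It uses simple round-robin rotation, which is an assumption made for clarity and not necessarily how GCE distributes connections.

```python
import itertools


class TargetPool:
    """A group of instances plus a health check that gates traffic."""

    def __init__(self, instances, is_healthy):
        self.instances = list(instances)
        self.is_healthy = is_healthy      # callable: instance name -> bool
        self._counter = itertools.count()

    def pick(self):
        """Return the next healthy instance, rotating round-robin."""
        healthy = [name for name in self.instances if self.is_healthy(name)]
        if not healthy:
            raise RuntimeError("no healthy instances in the target pool")
        return healthy[next(self._counter) % len(healthy)]


class ForwardingRule:
    """Maps an external (ip, port) pair to a target pool."""

    def __init__(self, ip, port, pool):
        self.ip, self.port, self.pool = ip, port, pool

    def forward(self, ip, port):
        """Return the instance for a request, or None if the rule does not match."""
        if (ip, port) != (self.ip, self.port):
            return None
        return self.pool.pick()


if __name__ == "__main__":
    down = {"vm-2"}                       # pretend vm-2 fails its health check
    pool = TargetPool(["vm-1", "vm-2", "vm-3"], lambda name: name not in down)
    rule = ForwardingRule("203.0.113.10", 80, pool)
    print([rule.forward("203.0.113.10", 80) for _ in range(4)])
```

Because unhealthy instances are filtered out on every pick, a failed VM stops receiving traffic as soon as its health check fails, and starts receiving it again once it recovers.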
- 74. Google Cloud Platform
> GCE reserves an IP address for your instance; it won't change when the VM reboots
> You can promote an ephemeral IP to a persistent IP
> No more DNS changes
Persistent IP Addresses
- 75. Google Cloud Platform
> Networking is a first-class object on GCE
which means you can apply or remove it easily at any time
> Networks can be pre-defined before the first instance starts
Integrated Networking
- 76. Google Cloud Platform
> These resources are global
+ Images (OS Images)
+ Snapshots
+ Network
+ Firewalls
+ Routes
> And they're also first-class objects in GCE
Multi-Region Resources
- 79. Google Cloud Platform
> http://www.google.com/events/gcdc2013/
> developers in 6 regions involved
> goals
+ effective use of Google App Engine
+ originality of concept
+ integration and creative use of Google Products
Google Cloud Developer Challenge
- 80. Google Cloud Platform
> Provide a simple web interface to perform 4 big data
operations:
+ Storing (Data Source)
+ MapReduce
+ Prediction (Machine Learning)
+ Visualization
BigData Lab
- 81. Google Cloud Platform
> Google App Engine & Google Compute Engine
> Cloud Endpoints
> Google Cloud Storage
> MapReduce Module for Google App Engine
> Pipeline Module for Google App Engine
> Prediction API
> BigQuery
What Are We Using?