Building MuleSoft Applications with Google BigQuery Meetup 4
- 3. All contents © MuleSoft Inc.
Agenda
• Introductions/Community Updates
• Main Presentation
• Questions & Answers
• Announcements
• See you next time.
Who Am I?
• Background in Optometry – nearly 13 years
ago.
• Currently a Certified MuleSoft Platform &
Integration Architect, working as an
Integration Application Developer @ Wawanesa
Insurance. Started writing Mule applications in
2016.
• One of the co-organizers of the Winnipeg
Meetup.
Our Keynote Speaker?
• Eswara Pendli
• Senior MuleSoft Consultant @ Apisero.
• Loads of experience across multiple
industries, including insurance, supply chain,
and retail.
• He is also the author of the MuleSoft +
BigQuery Series blog, hosted on Apisero.
Share
• Share the Meetup on your social networks.
• Give some kudos to our speaker on
LinkedIn.
• Use Hashtags
– #MuleSoftMeetup
– #MuleSoftMeetupWinnipeg
• Instagram: @mulesoftmeetupwpg
Thank you
MuleSoft CONNECT:Now
MuleSoft CONNECT:Now is a virtual experience bringing
you a full program of technical sessions and content,
streamed online for free!
Register for free:
https://connect.mulesoft.com
Developer Meetups at CONNECT:Now events
Meet the MuleSoft Community!
● Hear technical use cases from customer and partner
MuleSoft experts around the globe
● Live chat with MuleSoft Ambassadors!
JOIN ONLINE FOR FREE:
EMEA: October 8, 2020
AMER: October 13, 2020
APAC: October 20, 2020
Register: https://connect.mulesoft.com/
Check out the technical presentations below:
Developer Meetup at CONNECT:Now EMEA
● Twitter
○ Felipe Ocadiz, MuleSoft Ambassador, IT Integration Engineer
○ How to become an Anypoint Studio ninja
● Saint-Gobain
○ Francis Edwards, MuleSoft Ambassador, Integration Analyst
○ Useful integration tools
JOIN FOR FREE: October 8, 2020 (10:30am-11:15am BST)
Register: https://connect.mulesoft.com/events/connect/emea
Check out the technical presentations below:
Developer Meetup at CONNECT:Now Americas
● AT&T
○ Brad Ringer, Principal System Engineer
○ MuleSoft Runtime Fabric: The road to success
● MuleSoft Ambassadress
○ Alexandra Martinez, Sr. MuleSoft Developer, Bits in Glass
○ Reviewing a complex DataWeave transformation
JOIN FOR FREE: October 13, 2020 (10:30am-11:15am PDT)
Register: https://connect.mulesoft.com/events/connect/amer
Check out the technical presentations below:
Developer Meetup at CONNECT:Now JAPAC
● Datacom
○ Mary Joy Sabal, Sr. Integration Developer
○ Using Maven Archetypes to create MuleSoft API Project Templates
● MuleSoft Ambassador
○ Sravan Lingam, Consultant, Virtusa
○ Create a virtual Tic-Tac-Toe game using Object Store v2
JOIN FOR FREE: October 20, 2020 (2:30pm-3:15pm AEST)
Register: https://connect.mulesoft.com/events/connect/japac
Kahoot! Trivia.
• Interactive quizzes/surveys
• From the web browser
• Visit kahoot.it and provide:
– the PIN number
– your nickname
Time to Talk
• Presentation by Eswara
• Drop your questions in the chat, and they will be addressed at the
end of the presentation.
About
• Passionate MuleSoft professional.
• Working as a Senior MuleSoft Consultant at Apisero.
• Worked across different domains (Insurance, Claims, Complaint Handling, Order Processing,
Supply Chain Management & Retail).
• Love watching animated movies.
• Email: eshwar11naidu9@gmail.com
• LinkedIn: https://www.linkedin.com/in/eswara-pendli/
Agenda
• Introduction
• Features & Quick Points
• Play with BigQuery in GCP
• BigQuery API
• Play with BigQuery in Anypoint Studio
• Summary
• Questions & Answers
• References and Documentation
Introduction
Prerequisite
• A Google Cloud / sandbox account.
Terminology
• Google Cloud Platform
• BigQuery
• Dataset
• Table
• BigQuery:
– BigQuery is a fully managed data warehouse, exposed as a RESTful web service, that lets us
run complex analytical SQL queries over large sets of data. It enables scalable,
cost-effective, and fast analysis of big data, and works in conjunction with
Google Cloud Storage.
• Dataset:
– Datasets are top-level containers that are used to organize and control access to our
tables and views.
– A table or view must belong to a dataset, and a dataset is contained within a specific project, so
we need to create at least one dataset before loading data into BigQuery.
• Table:
– A table is where the data itself is stored.
Why BigQuery
• BigQuery was first launched as a service in 2010 with general availability in
November 2011.
• BigQuery is built on top of Dremel technology.
– Dremel is Google’s interactive ad-hoc query system for analysis of read-only
nested data.
• BigQuery is designed for high query performance and cost-effectiveness.
• BigQuery now integrates with a variety of Google Cloud Platform (GCP) services and
third-party tools, which makes it more useful.
• Main factors:
– Columnar storage:
• Data is stored by column, which makes it possible to achieve a very high
compression ratio and scan throughput.
• Only the column values a query requires are scanned and transferred during query
execution.
• It separates a record into column values and stores each value on a different storage
volume, whereas traditional databases normally store the whole record on one
volume.
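The columnar storage idea above can be illustrated with a small sketch. This is only a toy model in plain Python, not how BigQuery is implemented; the records, field names, and helper functions are made up for illustration. It compares how many values a row-oriented layout and a column-oriented layout have to read to answer a query that needs a single column.

```python
# Toy model of row-oriented vs. column-oriented layout.
# The records and field names are invented for this sketch.
records = [
    {"name": "Alice", "city": "Winnipeg", "premium": 120.0},
    {"name": "Bob", "city": "Toronto", "premium": 95.5},
    {"name": "Carol", "city": "Winnipeg", "premium": 143.25},
]


def to_columns(rows):
    """Pivot a list of row dicts into one list of values per column."""
    columns = {}
    for row in rows:
        for field, value in row.items():
            columns.setdefault(field, []).append(value)
    return columns


def values_scanned_row_store(rows):
    """A row store reads every field of every row, wanted or not."""
    return sum(len(row) for row in rows)


def values_scanned_column_store(columns, wanted):
    """A column store reads only the columns the query asks for."""
    return sum(len(columns[field]) for field in wanted)


columns = to_columns(records)

# Query: total premium. Only the "premium" column is needed.
total_premium = sum(columns["premium"])
row_cost = values_scanned_row_store(records)
column_cost = values_scanned_column_store(columns, ["premium"])
print(total_premium, row_cost, column_cost)
```

With three records of three fields each, the row layout touches nine values while the column layout touches three, which is the effect the slide describes as scanning only the required columns.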
– Tree architecture: A tree execution architecture is used to dispatch queries and
aggregate results across thousands of machines.
• Ex: Book: Book1
price:
discountCnt: 1
str: "AAA,firstTitle"
– The architecture forms a massively parallel distributed tree: a query is pushed down the
tree, and the results are then aggregated from the leaves at very high speed.
– By leveraging this architecture, Google was able to implement the distributed design for
Dremel and realize the vision of a massively parallel, column-based database on the
cloud platform.
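The push-down and aggregate-up pattern described above can be sketched as follows. This is a simplified, single-process illustration, not Dremel's actual implementation; the shard contents, the fan-out of 2, and all function names are assumptions chosen for the example.

```python
# Toy model of tree-shaped query execution: leaves scan their own shard,
# intermediate nodes merge partial results, the root returns the final answer.
# Shard contents and the default fan-out are arbitrary choices for this sketch.

def leaf_scan(shard, predicate):
    """Leaf server: compute a partial (count, sum) over its local shard."""
    matching = [value for value in shard if predicate(value)]
    return len(matching), sum(matching)


def merge(partials):
    """Intermediate or root server: combine partial (count, sum) pairs."""
    count = sum(partial[0] for partial in partials)
    total = sum(partial[1] for partial in partials)
    return count, total


def run_query(shards, predicate, fan_out=2):
    """Push the predicate down to the leaves, then aggregate level by level.

    Assumes at least one shard and a fan_out of 2 or more.
    """
    level = [leaf_scan(shard, predicate) for shard in shards]
    while len(level) > 1:
        level = [merge(level[i:i + fan_out]) for i in range(0, len(level), fan_out)]
    return level[0]


shards = [[1, 5, 9], [2, 6], [7, 8, 3], [4]]
count, total = run_query(shards, lambda value: value > 4)
print(count, total)
```

Each leaf only returns a small partial result, so the amount of data moving up the tree stays small even when the shards themselves are large.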
Features & Quick Points
• BigQuery is a cost-effective cloud data warehouse designed to help us make informed
decisions quickly, so we can transform our business with ease:
– Accelerate time-to-value with a fully managed and serverless cloud data
warehouse.
– It is easy to set up and manage.
– It doesn’t require a database administrator.
– Quickly analyze gigabytes to petabytes of data using ANSI SQL (American
National Standards Institute SQL) at high speed and with zero
operational overhead.
• Have peace of mind with BigQuery’s robust security, governance, and
reliability controls, which offer high availability and a 99.9% uptime SLA.
• Data is encrypted by default, with support for customer-managed
encryption keys.
• All Features :
– Serverless
– Real-Time Analytics
– High availability
– Standard SQL
– Storage and compute separation
– Automatic backup and easy restore
– Data transfer service
– Big data ecosystem integration
– Flexible pricing models
– Data governance and security
– Geo-expansion
– Foundation for AI
– Rich monitoring and logging with Stackdriver
– Public datasets
– Commercial datasets
• TOP COMPETITORS OF GOOGLE BIGQUERY :
Play with BigQuery in GCP
• Set up the database on GCP (Google Cloud Platform):
– In the Cloud Console, on the project selector page, select or create a Cloud project:
https://console.cloud.google.com/projectselector2/home/dashboard?_ga=2.120024392.1366931291.1581598739-1744192159.1581411477
– BigQuery provides a sandbox if we do not want to provide a credit card or enable
billing for our project.
– The BigQuery web UI provides an interface to query tables, including public datasets
offered by BigQuery.
– Once the project setup is done, go to the dashboard:
https://console.cloud.google.com/home/dashboard?_ga=2.120024392.1366931291.1581598739-1744192159.1581411477&project=navigation-api-demo&folder=&organizationId=
– Or navigate directly to BigQuery:
https://console.cloud.google.com/bigquery?_ga=2.120024392.1366931291.1581598739-1744192159.1581411477&project=navigation-api-demo
• Choose the project name and click on “Create dataset”:
• In this section, provide a relevant dataset name and
choose other options as required:
• Next, create a table under <test_dataset>:
• Here, we can choose among several options for the schema:
• We will be creating a table
using ‘test_csv’ data with the
Upload option.
• Select the ‘Browse’ button
and upload the relevant test file:
• Now the table has been created with ‘testData_DOB.csv’ attached to it.
• Here, we can run multiple queries in the editor section:
• In this way, we can set up a serverless database with a high SLA.
• Now we will use this database for our integration.
• We can use the BigQuery REST API to connect to it from external sources.
• The REST API supports multiple operations.
BigQuery API
• Here are a few of the operations
supported by the BigQuery REST
API:
** The base URI of the BigQuery API is:
https://www.googleapis.com:443/
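To make the base URI concrete, the sketch below builds (but does not send) a request for the REST API's datasets list method under `/bigquery/v2/projects/{projectId}/datasets`. The project ID is the demo project name used in the console URLs of this deck, the access token is a placeholder, and the helper name is hypothetical.

```python
from urllib.request import Request

# Base URI from the slide; port 443 is the HTTPS default, so it is omitted here.
BASE_URI = "https://www.googleapis.com"


def list_datasets_request(project_id, access_token):
    """Build, without sending, a GET request that lists datasets in a project."""
    url = f"{BASE_URI}/bigquery/v2/projects/{project_id}/datasets"
    headers = {"Authorization": f"Bearer {access_token}"}
    return Request(url, headers=headers, method="GET")


# "navigation-api-demo" is the demo project; the token is a placeholder.
request = list_datasets_request("navigation-api-demo", "<access-token>")
print(request.get_method(), request.full_url)
```

Sending the request (for example with `urllib.request.urlopen`) requires a valid OAuth 2.0 access token, which is what the configuration on the following slides is for.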
• Here we will see a simple setup and configuration in Anypoint Studio:
• The following is the BigQuery REST service configuration:
• The following URLs and credentials need to be configured:
– Callback URL: http://localhost:8081/callback
– Local Authorization URL: http://localhost:8081/web
– Authorization URL: https://accounts.google.com/o/oauth2/auth
– Scopes: https://www.googleapis.com/auth/{scopeName}
– Token URL: https://oauth2.googleapis.com/token
– Client_ID: {Your Client ID}
– Client_Secret: {Your Client Secret}
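As a rough illustration of how these values fit together, the sketch below assembles the URL that the browser is sent to in the OAuth 2.0 authorization-code flow. It uses the authorization and callback URLs listed above; the `bigquery` scope name and the client ID value are assumptions for the example, and the function name is hypothetical.

```python
from urllib.parse import parse_qs, urlencode, urlparse

AUTHORIZATION_URL = "https://accounts.google.com/o/oauth2/auth"
CALLBACK_URL = "http://localhost:8081/callback"
# Assumed scope for this sketch; replace {scopeName} with the scope you need.
SCOPE = "https://www.googleapis.com/auth/bigquery"


def build_authorization_url(client_id):
    """Return the URL a user opens to grant access and receive a code."""
    params = {
        "client_id": client_id,
        "redirect_uri": CALLBACK_URL,  # must match an authorized redirect URI
        "response_type": "code",       # ask for an authorization code
        "scope": SCOPE,
    }
    return f"{AUTHORIZATION_URL}?{urlencode(params)}"


# "my-client-id" is a placeholder, not a real credential.
authorization_url = build_authorization_url("my-client-id")
query = parse_qs(urlparse(authorization_url).query)
print(authorization_url)
```

In the Mule application, the OAuth 2.0 configuration takes care of this redirect; the sketch only shows which of the configured values end up in the request.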
• In this way, we can connect to the BigQuery API using an OAuth 2.0 configuration:
• Once we are done, go to the following location, choose the relevant
project name, click on “Credentials”, and update the Authorized redirect
URIs as specified below:
https://console.cloud.google.com/apis/credentials/consent?project=navigation-api-demo
• Below is the location where we can get all the credential details:
• Once the setup is done, let’s deploy our application locally.
• After a successful local deployment, open the following URL in Google Chrome (as
Postman has difficulty obtaining an OAuth 2.0 token for Google) to get the
authorization code / access token, and choose the relevant Gmail account:
• Copy the code value and use it for OAuth 2.0 authorization.
• Trigger the request URL, using the code value as the access token, and click on Preview
Request as follows:
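For reference, in the standard OAuth 2.0 authorization-code flow the copied code is sent to the token URL to obtain an access token. The sketch below builds that request without sending it. It assumes the token and callback URLs from the configuration slide; the function name and all credential values are placeholders, and the demo application may handle this step differently.

```python
from urllib.parse import parse_qs, urlencode
from urllib.request import Request

TOKEN_URL = "https://oauth2.googleapis.com/token"


def build_token_request(code, client_id, client_secret, redirect_uri):
    """Build, without sending, the POST that exchanges a code for a token."""
    form = urlencode({
        "grant_type": "authorization_code",
        "code": code,
        "client_id": client_id,
        "client_secret": client_secret,
        "redirect_uri": redirect_uri,  # same value used when requesting the code
    }).encode("ascii")
    headers = {"Content-Type": "application/x-www-form-urlencoded"}
    return Request(TOKEN_URL, data=form, headers=headers, method="POST")


# All values below are placeholders for illustration only.
token_request = build_token_request(
    code="<code-from-callback>",
    client_id="my-client-id",
    client_secret="my-client-secret",
    redirect_uri="http://localhost:8081/callback",
)
form_fields = parse_qs(token_request.data.decode("ascii"))
print(token_request.get_method(), token_request.full_url)
```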
Play with BigQuery in Anypoint Studio
• Recently, MuleSoft released the BigQuery Connector on Anypoint Exchange; it was created
by Connectivity Partners and supports the Mule 4.x runtime.
• The connector gives organizations access to BigQuery by interfacing with the
Google BigQuery API.
• The BigQuery Connector allows customers to create, manage, share, and query data.
• Here, we will discuss the operations supported by the BigQuery Connector and a simple
end-to-end (‘E2E’) demo using the connector.
• Operations supported by the connector:
• Create Job: to create and start an asynchronous job.
• Copy Job
• Extract Job
• Load Job
• Query Job
• Get Job: to get information about a specific job based on
the given BigQuery job name.
• List Job: to list all jobs that we started in the
specified project.
• Cancel Job: to cancel a BigQuery job.
• Create Dataset: to create a new, empty dataset, which can hold one or more tables. Dataset
names should be unique.
• Get Dataset
• List Dataset
• Update Dataset
• Delete Dataset
• Create Table: to create a new table in an existing dataset. We have to provide a new table
name along with the datasetId.
• Get Table
• List Table
• Update Table
• Delete Table
• List Table Data: to list the content of a table in rows based on the given datasetId and
tableId.
• Query: to run the query associated with the request based on the given JobId.
• Get Query Result: to return the results of a query job based on the given JobId.
• Insert All: to stream data into BigQuery one record at a time, without running a load job.
• Create a dataset and a table, and insert data into the table, from Anypoint
Studio:
• Create a new project and add the “Google BigQuery” module to the palette, or add its dependency to the project.
• Configure the BigQuery global configuration as shown below.
• The global configuration for this connector is simple. We need to update it with our
service account key details, where:
• Project ID: the name of the project with which BigQuery is associated.
• Service Account Key: the JSON file downloaded from the Credentials section.
• On successful table creation, we can insert the data using the Insert All operation. We need to
update the payload in the “Row Data” section. The “Insert All” operation has a limitation: we
can send the data in batches of 10,000 records/rows at a time.
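One way to stay within the row limit mentioned above is to split the payload into batches before calling the insert operation. The sketch below shows only the batching step in plain Python; the function name and sample rows are made up, and in a Mule flow the same effect could be achieved with the runtime's own batching or splitting facilities.

```python
def batches(rows, batch_size=10_000):
    """Yield consecutive slices of rows, each at most batch_size long."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(rows), batch_size):
        yield rows[start:start + batch_size]


# 25,000 made-up rows, split into batches of at most 10,000.
rows = [{"id": i} for i in range(25_000)]
sizes = [len(batch) for batch in batches(rows)]
print(sizes)  # [10000, 10000, 5000]
```

Each yielded batch would then be sent as the “Row Data” of one Insert All call.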
Limitations:
• BigQuery cannot substitute for a relational database; it is not an OLTP system.
• BigQuery is a cloud-based solution.
• The number of updates to a table per day is limited.
• Good for scenarios where data does not change often and you want to use caching,
as it has a built-in cache.
• If you run the same query and the data in the tables has not changed (been updated),
BigQuery will just use the cached results and will not execute the query again.
• Minimize the amount of scanned data.
• Consider usage, as charges depend on the following criteria:
– Number of users (per day)
– Number of queries (per user, per day)
– Average data usage (per query)
* Pricing Calculator: https://cloud.google.com/products/calculator/
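The caching behaviour described in the list above can be pictured with a small model. This is only an illustration of the idea, not BigQuery's implementation: the class name, the use of a table version number to detect changes, and the fake query executor are all assumptions.

```python
class CachedQueryRunner:
    """Toy cache: reuse a result while the query text and table are unchanged."""

    def __init__(self, execute):
        self._execute = execute  # function that actually runs a query
        self._cache = {}         # (query text, table version) -> result
        self.executions = 0      # how many times the query really ran

    def run(self, query, table_version):
        key = (query, table_version)
        if key not in self._cache:
            self.executions += 1
            self._cache[key] = self._execute(query)
        return self._cache[key]


# The executor below is a stand-in that just echoes the query text.
runner = CachedQueryRunner(lambda query: f"result of {query}")
first = runner.run("SELECT 1", table_version=1)
second = runner.run("SELECT 1", table_version=1)  # served from the cache
third = runner.run("SELECT 1", table_version=2)   # table changed, runs again
print(runner.executions)  # 2
```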
Summary
• BigQuery is fully managed: we don’t need to deploy any
resources, such as disks and virtual machines.
• Easy to use, with a high SLA.
Happy Learning!
References and Documentation
https://apisero.com/mulesoft-bigquery-series-1/
https://apisero.com/mulesoft-bigquery-series-2/
https://pixabay.com/photos/
https://www.mulesoft.com/press-center/runtime-fabric-google-cloud-gcp
https://cloud.google.com/bigquery/docs/quickstarts/quickstart-web-ui?hl=en_US
https://cloud.google.com/bigquery/docs/interacting-with-bigquery?hl=en_US
https://cloud.google.com/bigquery/docs/reference/rest/
https://cloud.google.com/files/BigQueryTechnicalWP.pdf
https://cloud.google.com/docs/authentication/getting-started
https://eu1.anypoint.mulesoft.com/exchange/com.mulesoft.connectors/mule-bigquery-connector/
https://opendoc.gslab.com/bq_release_notes.html
https://opendoc.gslab.com/bq_user_guide.html
https://cloud.google.com/bigquery/
https://opendoc.gslab.com/bq_api_reference.html
https://www.mulesoft.com/legal/versioning-back-support-policy#anypoint-connectors
See you next time
Please send topic suggestions to the organizer
Editor's Notes
- https://apisero.com/mulesoft-bigquery-series-1/
https://apisero.com/mulesoft-bigquery-series-2/
- Encourage audience to take a look at the blog. “Blog has important reference materials related to the presentation.”