Data pipeline with Snowflake


Table of Contents

INTRODUCTION

  1. Upload and store the required data in an S3 bucket
  2. Create a pipeline using AWS Glue
  3. Write queries in Athena and build visualizations in Amazon QuickSight

ARCHITECTURE

(Architecture diagram)

SETUP

  1. Downloading data from an AWS S3 bucket requires:
  • pip install boto3
  • pip install s3fs
  2. Web scraping with Beautiful Soup requires (see the combined sketch after this list):
  • pip install bs4
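
The following is a minimal sketch of both setup steps. The bucket name, object key, and page URL are placeholders (not from this project), and the standard-library urllib is used for the HTTP fetch:

```python
import boto3
import s3fs
from urllib.request import urlopen
from bs4 import BeautifulSoup

# Hypothetical bucket, key, and URL; replace with your own values.
BUCKET = "my-sevir-bucket"
KEY = "sevir/CATALOG.csv"
STORM_URL = "https://example.com/storm-events"

# 1a. Download an object from S3 with boto3 (credentials from the standard AWS chain).
boto3.client("s3").download_file(BUCKET, KEY, "CATALOG.csv")

# 1b. ...or stream it directly with s3fs, without saving a local copy.
fs = s3fs.S3FileSystem()
with fs.open(f"{BUCKET}/{KEY}", "rb") as f:
    print(f.read(200))

# 2. Fetch a page and parse it with Beautiful Soup.
html = urlopen(STORM_URL, timeout=30).read()
soup = BeautifulSoup(html, "html.parser")
print(soup.title.text if soup.title else "no title")
```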

PROCESS

Data

  • Create an access key in AWS.
  • Create storage buckets in S3 and upload the scraped Storm data and the SEVIR data to their respective buckets (a boto3 upload sketch follows this list).
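
A minimal upload sketch, assuming hypothetical access-key values, bucket names, and local file names:

```python
import boto3

# Hypothetical credentials and names; replace with your own access key, buckets, and files.
session = boto3.Session(
    aws_access_key_id="YOUR_ACCESS_KEY_ID",
    aws_secret_access_key="YOUR_SECRET_ACCESS_KEY",
    region_name="us-east-1",
)
s3 = session.client("s3")

# Upload the scraped Storm data and the SEVIR data to their respective buckets.
s3.upload_file("storm_events.csv", "my-storm-data-bucket", "storm/storm_events.csv")
s3.upload_file("sevir_catalog.csv", "my-sevir-data-bucket", "sevir/sevir_catalog.csv")
```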

Create a pipeline using Glue

  • In the AWS console, select Glue and schedule a Glue job that creates the combined dataset and pushes it into an S3 bucket.
  • Use a Glue crawler to fetch the data from S3.
  • Once the crawler is created, run it; it will populate the Glue Data Catalog and create the tables (a boto3 sketch for both steps follows this list).
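
If you prefer to trigger the job and crawler from code rather than the console, a sketch along these lines works with boto3. The job and crawler names below are placeholders for resources created beforehand in Glue:

```python
import time
import boto3

glue = boto3.client("glue")

# Hypothetical names of a Glue job and crawler set up in the console.
JOB_NAME = "combine-storm-sevir-job"
CRAWLER_NAME = "combined-dataset-crawler"

# Kick off the Glue job that writes the combined dataset back to S3.
run = glue.start_job_run(JobName=JOB_NAME)
print("Started job run:", run["JobRunId"])

# Run the crawler so the output is registered in the Glue Data Catalog as tables.
glue.start_crawler(Name=CRAWLER_NAME)
while glue.get_crawler(Name=CRAWLER_NAME)["Crawler"]["State"] != "READY":
    time.sleep(30)
print("Crawler finished; tables are available in the Data Catalog.")
```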

View queries using Athena and visualize using QuickSight

  • Connect Athena to the tables created by the crawler and run queries against them (a sketch follows this list).
  • Lastly, view the results in Amazon QuickSight.
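
A sketch of running one such query from boto3; the database, table, and results-bucket names are placeholders for whatever the crawler created:

```python
import time
import boto3

athena = boto3.client("athena")

# Hypothetical database/table created by the crawler and an S3 location for query output.
query = "SELECT event_type, COUNT(*) AS events FROM storm_sevir_combined GROUP BY event_type"
resp = athena.start_query_execution(
    QueryString=query,
    QueryExecutionContext={"Database": "storm_sevir_db"},
    ResultConfiguration={"OutputLocation": "s3://my-athena-results-bucket/queries/"},
)
query_id = resp["QueryExecutionId"]

# Poll until the query finishes, then fetch the first page of results.
while True:
    state = athena.get_query_execution(QueryExecutionId=query_id)["QueryExecution"]["Status"]["State"]
    if state in ("SUCCEEDED", "FAILED", "CANCELLED"):
        break
    time.sleep(2)

if state == "SUCCEEDED":
    rows = athena.get_query_results(QueryExecutionId=query_id)["ResultSet"]["Rows"]
    for row in rows:
        print([col.get("VarCharValue") for col in row["Data"]])
```

The same tables can then be added as a QuickSight dataset via the Athena connector for visualization.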

REFERENCE