Unleash the Power of Serverless: Building a File Storage Service with API Gateway, Lambda, and S3

Vishal Mishra
Published in Towards AWS
15 min read · Dec 7, 2023

The need for data accessibility, scalability, and dependability in today's digital environment drives the adoption of cloud file storage services. These solutions move past the constraints of traditional storage, letting both individuals and enterprises tap into the cloud's practically unlimited capacity. Storing files in the cloud opens up amazing possibilities: it's like having a superpower for your data, with unlimited scalability!

Cloud file storage services offer a range of use cases due to their scalability, accessibility, and reliability. Some common use cases include:

  1. Data Backup and Recovery
  2. File Sharing and Collaboration
  3. Data Archiving
  4. Media Storage

This simple, step-by-step guide will show you how to quickly and easily build a cloud storage service in a matter of minutes.

What we’ll build

In this hands-on guide, we are going to create a serverless file upload service: users securely upload files to an S3 bucket through an API Gateway endpoint, and a Lambda function processes the uploads. Please note that API Gateway caps request payloads at 10 MB and a Lambda proxy integration caps them at 6 MB, so with this service you should keep each upload to about 5 MB or less to stay safely within the limits.

What we’ll need

  1. A private S3 bucket for user content. Whatever the user uploads will be stored in this bucket.
  2. A Lambda function with access to the S3 bucket created above. Using the Boto3 library, this function will put the uploaded object into the bucket.
  3. An API Gateway endpoint that invokes the Lambda function.
  4. Another private S3 bucket for web app hosting. This bucket will hold the front-end files: index.html, style.css, and app.js.
  5. A CloudFront distribution for content delivery. Users across the globe will access the app through the distribution and will have no direct access to the S3 bucket.
  6. Terraform configuration files describing all of the above resources.

Architecture

Cloud File Storage Service Architecture

Prerequisites

As in my previous projects, we are not going to create anything via the AWS Console. Instead, we'll write Terraform configurations and deploy with terraform apply.

So, before proceeding, make sure you have the AWS CLI and Terraform CLI installed and have exported your credentials in the shell:

export AWS_ACCESS_KEY_ID=<Get this from the credentials file downloaded from AWS Console>
export AWS_SECRET_ACCESS_KEY=<Copy this from the credentials file downloaded from AWS Console>

Steps

We'll write Terraform configuration files for every resource we create along the way, following a production-like directory structure: separate directories for the front-end, back-end, and infrastructure code. Before proceeding, create the directory structure below.

Directory Structure
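The screenshot above showed the layout; based on the paths used throughout this guide, it looks roughly like this (the exact Terraform file names inside tf-aws-infra are assumptions):

```
Project-3/
├── front-end/
│   ├── index.html
│   ├── style.css
│   └── app.js
├── back-end/
│   └── lambda_function.py        # zipped by Terraform's archive_file
└── tf-aws-infra/
    ├── main.tf
    └── variables/
        └── dev.tfvars
```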

Also, you can download the entire code from my GitHub repo linked at the end of this article.

We are going to write all the main resource configurations in the main.tf file under tf-aws-infra. Let’s start writing and building the app.

1. Set Up S3 Bucket:

The first thing we need to create is an S3 bucket where the uploaded files will be stored. The bucket will be kept private so that no one can access it publicly. Here is the Terraform configuration:

# Creating S3 bucket for storing user content from REST API calls

resource "aws_s3_bucket" "user_content_bucket" {
  bucket        = var.user_bucket
  force_destroy = true
}

resource "aws_s3_bucket_ownership_controls" "user_content_bucket" {
  bucket = aws_s3_bucket.user_content_bucket.id
  rule {
    object_ownership = "BucketOwnerPreferred"
  }
}

resource "aws_s3_bucket_acl" "user_content_bucket" {
  depends_on = [aws_s3_bucket_ownership_controls.user_content_bucket]

  bucket = aws_s3_bucket.user_content_bucket.id
  acl    = "private"
}
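As an optional hardening step (not part of the original configuration, so treat it as an assumed addition), you can also block every form of public access at the bucket level:

```hcl
# Optional hardening: reject public ACLs and public bucket policies outright,
# so the user-content bucket can never be exposed by accident.
resource "aws_s3_bucket_public_access_block" "user_content_bucket" {
  bucket = aws_s3_bucket.user_content_bucket.id

  block_public_acls       = true
  block_public_policy     = true
  ignore_public_acls      = true
  restrict_public_buckets = true
}
```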

2. Create AWS Lambda Function:

The next step is to create a Lambda function. Below is the functionality for this function.

  • Lambda Trigger: This Lambda function is triggered by a POST request from API Gateway.
  • Processing the Request: It extracts the file content from the request body and the filename from the query string parameters.
  • S3 Upload: Using the AWS SDK (Boto3), it uploads the file content to the specified S3 bucket and key (file path).
  • Response Handling: It returns a success message if the file upload is successful or an error message if the upload fails.
  • Grant Permissions: Configure IAM roles and permissions for the Lambda function to access S3.
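The handler itself lives in back-end/lambda_function.py and isn't reproduced in this article, so here is a hedged sketch of the logic described above (the helper name extract_payload and the default filename are my own, not from the repo):

```python
import base64
import json
import os

try:
    import boto3  # preinstalled in the AWS Lambda Python runtime
except ImportError:  # lets you import this module locally without boto3
    boto3 = None


def extract_payload(event):
    """Pull the filename and raw bytes out of an API Gateway proxy event."""
    params = event.get("queryStringParameters") or {}
    filename = params.get("filename", "upload.bin")  # assumed default name
    body = event.get("body") or ""
    if event.get("isBase64Encoded"):
        # Binary uploads arrive base64-encoded through the proxy integration.
        return filename, base64.b64decode(body)
    return filename, body.encode("utf-8")


def lambda_handler(event, context):
    filename, content = extract_payload(event)
    bucket = os.environ["USER_BUCKET"]  # set by the Terraform config below
    try:
        boto3.client("s3").put_object(Bucket=bucket, Key=filename, Body=content)
        status, message = 200, f"{filename} uploaded to {bucket}"
    except Exception as exc:
        status, message = 500, f"upload failed: {exc}"
    return {
        "statusCode": status,
        # Required for the browser; use your real domain instead of "*".
        "headers": {"Access-Control-Allow-Origin": "*"},
        "body": json.dumps({"message": message}),
    }
```

The Terraform configuration for the role, policy, packaging, and function follows.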

# Defining locals for packaging the Lambda function
locals {
  lambda_src_dir           = "${path.module}/../back-end/"
  lambda_function_zip_path = "${path.module}/lambda/lambda_function.zip"
}

# Creating an IAM role for Lambda
resource "aws_iam_role" "lambda_role" {
  name = "LambdaRole"

  assume_role_policy = jsonencode({
    Version = "2012-10-17",
    Statement = [{
      Action = "sts:AssumeRole",
      Effect = "Allow",
      Principal = {
        Service = "lambda.amazonaws.com"
      }
    }]
  })
}

# S3 and CloudWatch Logs policy for the Lambda role to get and put objects
# in the S3 bucket. (Note: "s3:CopyObject" and "s3:HeadObject" are not valid
# IAM actions; copy and head requests are authorized by s3:GetObject and
# s3:PutObject, so only those are listed here.)
data "aws_iam_policy_document" "policy" {
  statement {
    effect = "Allow"
    actions = ["s3:ListBucket", "s3:GetObject", "s3:PutObject",
      "logs:CreateLogGroup", "logs:CreateLogStream", "logs:PutLogEvents"]
    # "*" keeps the demo simple; in production, scope this to the bucket ARN
    # and the function's log group.
    resources = ["*"]
  }
}

resource "aws_iam_policy" "policy" {
  name   = "lambda-policy"
  policy = data.aws_iam_policy_document.policy.json
}

# Attaching the policy created above to the IAM role
resource "aws_iam_role_policy_attachment" "lambda_role_policy" {
  policy_arn = aws_iam_policy.policy.arn
  role       = aws_iam_role.lambda_role.name
}

# Packaging the Lambda source using the archive_file data source
data "archive_file" "lambda" {
  source_dir  = local.lambda_src_dir
  output_path = local.lambda_function_zip_path
  type        = "zip"
}

resource "aws_lambda_function" "file_uploader_lambda" {
  filename         = local.lambda_function_zip_path
  function_name    = var.lambda_function_name
  role             = aws_iam_role.lambda_role.arn
  handler          = "lambda_function.lambda_handler"
  runtime          = var.lambda_runtime
  timeout          = 20
  memory_size      = 128
  source_code_hash = data.archive_file.lambda.output_base64sha256

  environment {
    variables = {
      USER_BUCKET = var.user_bucket,
    }
  }
}

3. Create AWS API Gateway:

Moving further, let's create a REST API using AWS API Gateway with the following properties:

  • Resource and Method: The first thing to define is the resource and HTTP method. We are using the resource /upload with the POST method, since we want to send data to the endpoint.
  • Integration Type: For the POST method, we use the AWS_PROXY integration type, which forwards the request to the Lambda function.
  • Method Request and Integration Request: Method Request handles initial request validation and shaping, while Integration Request applies any transformations needed to match the backend's requirements before the request is sent on. Since we are not doing any transformations here, we don't configure these.
  • Method Response and Integration Response: Similarly, Integration Response transforms the backend's actual response to match the structure or format defined in the Method Response before returning it to the client. With the Lambda proxy integration we are using, we only define the Method Response with status codes (e.g., 200, 400, 500), headers, and body models based on the Lambda function's responses.
  • Deployment: In API Gateway, a deployment is a snapshot of an API and its configuration at a specific moment in time. It includes the API’s resources, methods, integrations, and other settings. Deployments help you manage different versions or configurations of your API.
  • Stage: Stages provide unique URLs for different environments or versions, allowing users to interact with the API according to the designated stage. For example — a dev stage for testing purposes, and a prod stage for the live production environment.
API Gateway — Properties

Below are the terraform configurations for the resources discussed above -

# Creating the API Gateway REST API

resource "aws_api_gateway_rest_api" "FileUploderService" {
  name = "FileUploderService"
}

resource "aws_api_gateway_resource" "FileUploderService" {
  parent_id   = aws_api_gateway_rest_api.FileUploderService.root_resource_id
  path_part   = "upload"
  rest_api_id = aws_api_gateway_rest_api.FileUploderService.id
}

resource "aws_api_gateway_method" "FileUploderService" {
  authorization = "NONE"
  http_method   = "POST"
  resource_id   = aws_api_gateway_resource.FileUploderService.id
  rest_api_id   = aws_api_gateway_rest_api.FileUploderService.id
}

resource "aws_api_gateway_integration" "FileUploderService" {
  http_method             = aws_api_gateway_method.FileUploderService.http_method
  resource_id             = aws_api_gateway_resource.FileUploderService.id
  rest_api_id             = aws_api_gateway_rest_api.FileUploderService.id
  integration_http_method = "POST"
  type                    = "AWS_PROXY"
  uri                     = aws_lambda_function.file_uploader_lambda.invoke_arn
}

# Method Response and enabling CORS

resource "aws_api_gateway_method_response" "FileUploderService" {
  rest_api_id = aws_api_gateway_rest_api.FileUploderService.id
  resource_id = aws_api_gateway_resource.FileUploderService.id
  http_method = aws_api_gateway_method.FileUploderService.http_method
  status_code = "200"

  response_models = {
    "application/json" = "Empty"
  }

  response_parameters = {
    "method.response.header.Access-Control-Allow-Origin"  = true,
    "method.response.header.Access-Control-Allow-Headers" = true,
    "method.response.header.Access-Control-Allow-Methods" = true
  }
}

resource "aws_api_gateway_deployment" "FileUploderService" {
  rest_api_id = aws_api_gateway_rest_api.FileUploderService.id

  triggers = {
    redeployment = sha1(jsonencode([
      aws_api_gateway_resource.FileUploderService.id,
      aws_api_gateway_method.FileUploderService.id,
      aws_api_gateway_integration.FileUploderService.id,
    ]))
  }

  lifecycle {
    create_before_destroy = true
  }
}

resource "aws_api_gateway_stage" "prod" {
  deployment_id = aws_api_gateway_deployment.FileUploderService.id
  rest_api_id   = aws_api_gateway_rest_api.FileUploderService.id
  stage_name    = "prod"
}

# Permission for API Gateway to invoke the Lambda function
resource "aws_lambda_permission" "apigw_lambda" {
  statement_id  = "AllowExecutionFromAPIGateway"
  action        = "lambda:InvokeFunction"
  function_name = aws_lambda_function.file_uploader_lambda.function_name
  principal     = "apigateway.amazonaws.com"
  source_arn    = "arn:aws:execute-api:${var.aws_region}:${var.aws_account_id}:${aws_api_gateway_rest_api.FileUploderService.id}/*/${aws_api_gateway_method.FileUploderService.http_method}${aws_api_gateway_resource.FileUploderService.path}"
}

4. Run Terraform Apply and Test endpoint Locally:

Now we are set to run terraform apply, which will create the above resources. Before that, let's run terraform plan to see how many resources will be created. Please make sure you are in the tf-aws-infra directory while running terraform plan or terraform apply. Also, provide the tfvars file path to supply values for the declared variables.

tf-aws-infra  $  terraform plan -var-file=variables/dev.tfvars

So, it's going to create around 15 resources. Let’s apply this.

tf-aws-infra  $  terraform apply -var-file=variables/dev.tfvars -auto-approve 

All the resources are created, so now we can test the API using tools like Postman or cURL to ensure it accepts file uploads and stores them in the S3 bucket. Navigate to the API Gateway console and, under Stages, copy the URL for the deployed stage to prepare the curl command.

API Gateway prod stage URL

Below is the cURL command for the same, using the API Gateway URL. Make sure the local path of the file you want to upload is correct.

curl -X POST -H "Content-Type: application/octet-stream" -T ../test/file.txt 'https://onkiuircbk.execute-api.us-east-1.amazonaws.com/prod/upload?filename=file.txt' 

curl -X POST -H "Content-Type: application/octet-stream" -T ../test/architecture.jpg 'https://onkiuircbk.execute-api.us-east-1.amazonaws.com/prod/upload?filename=architecture.jpg'
Local Testing

And there you go: you have tested the REST API successfully. It might take a minute or so for images to upload; text files are much quicker. Also, make sure each file is smaller than 5 MB.

5. Create Front-End UI:

As we have tested this locally and the API Gateway endpoint works, we can build a simple UI for the file upload service, since end users will find a UI much easier than Postman or a cURL command line. If you are an expert in front-end technologies, you can of course build the UI with a modern framework like React or Angular. Since I know very little about front-end development, I'll stick to basic HTML, CSS, and JavaScript. We are not going to walk through the front-end code, as I am pretty sure you can create a better UI than mine; the code is in the front-end directory of the repo. You can use the directory structure below. Also, remember to update the API Gateway endpoint URL in the app.js file.

Directory Structure for the UI
app.js API Gateway URL

After developing the front-end code, you can test it with a live server in your IDE (for example, the Live Server extension in VS Code). This is how it looks when running locally:

Live Server Local Testing

6. Create an S3 bucket for Web hosting and CloudFront Distribution:

We have come to the last step of this project. Since you will want to share the Cloud File Storage Service URL with external users, it's better to deploy it as a web app that anyone can use. Ideally, you would also authenticate users before they use your service; we'll cover that in the second part of this project. For now, let's focus on creating and hosting a simple web application. AWS provides many options for hosting web apps. The easiest is to serve the site from an S3 bucket and deliver the content through a Content Delivery Network, CloudFront in this case.

So, let’s create an S3 bucket first for hosting the web app files i.e. index.html, style.css, and app.js. As always, this bucket will also be a private bucket with the bucket owner owning the objects inside the bucket. No one will have access to this bucket directly.

Here is the terraform configuration for the bucket and its properties.

# Creating S3 bucket for web hosting (front-end)

resource "aws_s3_bucket" "file_uploader_app_bucket" {
  bucket        = var.webapp_bucket
  force_destroy = true

  tags = {
    Name = "File Uploader Service App Bucket"
  }
}

resource "aws_s3_bucket_ownership_controls" "file_uploader_app_bucket_owner" {
  bucket = aws_s3_bucket.file_uploader_app_bucket.id
  rule {
    object_ownership = "BucketOwnerPreferred"
  }
}

resource "aws_s3_bucket_acl" "file_uploader_app_bucket_acl" {
  depends_on = [aws_s3_bucket_ownership_controls.file_uploader_app_bucket_owner]
  bucket     = aws_s3_bucket.file_uploader_app_bucket.id
  acl        = "private"
}

To deliver the content globally with the lowest latency, we can use a CloudFront distribution with the S3 bucket as the origin.

CloudFront delivers your content through a worldwide network of data centers called edge locations. When a user requests content that you’re serving with CloudFront, the request is routed to the edge location that provides the lowest latency (time delay), so that content is delivered with the best possible performance.

Hence, we are going to create a CloudFront distribution and point it at the S3 bucket that hosts the web app files. There are many settings available when creating a CloudFront distribution, but we'll look at just the important ones, which are enough for our web app.

Here are some important settings to consider -

1. Origin Settings -

- Origin Domain Name: S3 or ALB or any other custom origin.

- Protocol Policy: HTTP only or HTTPS only.

- Origin Path: Path to resources at the origin.

2. Cache Behavior Settings:

- Path Pattern: Define how CloudFront handles requests based on URL patterns.

- Cache and Origin Request Settings: Control caching behavior, TTLs, query strings, etc.

3. Distribution Settings:

- Price Class: Choose the AWS Edge Locations based on your user base that the distribution will use. There are some cheaper options available if you don't have a fully global user base.

- SSL/TLS Settings: Configure SSL protocols.

- Default Root Object: Specify the default object to serve when users access the root URL for example — index.html in our case.

4. Behavior and Performance:

- Viewer Protocol Policy: Define whether CloudFront serves content over HTTP or HTTPS. You can also choose redirect-to-https for a secure connection.

- CORS (Cross-Origin Resource Sharing): Control access to resources from different origins.

- HTTP/2: Enable for improved performance.

5. Security and Access Control:

- Access Control: Restrict access to content using signed URLs or signed cookies.

6. Geo-Restriction:

- Restrict access based on geographic location. You can allowlist or blocklist countries, or specify None if you have no such restriction.

Below are the terraform configurations using the above settings —

# Creating a CloudFront distribution for the web app

locals {
  s3_origin_id = "FileUploaderS3Origin"
}

resource "aws_cloudfront_origin_access_control" "oac" {
  name                              = "fileuploader-oac"
  description                       = "File Uploader Policy"
  origin_access_control_origin_type = "s3"
  signing_behavior                  = "always"
  signing_protocol                  = "sigv4"
}

resource "aws_cloudfront_distribution" "s3_distribution" {
  origin {
    domain_name              = aws_s3_bucket.file_uploader_app_bucket.bucket_regional_domain_name
    origin_access_control_id = aws_cloudfront_origin_access_control.oac.id
    origin_id                = local.s3_origin_id
  }

  enabled             = true
  is_ipv6_enabled     = true
  comment             = "File Uploader web app distribution"
  default_root_object = "index.html"

  default_cache_behavior {
    allowed_methods  = ["DELETE", "GET", "HEAD", "OPTIONS", "PATCH", "POST", "PUT"]
    cached_methods   = ["GET", "HEAD"]
    target_origin_id = local.s3_origin_id

    forwarded_values {
      query_string = false

      cookies {
        forward = "none"
      }
    }

    viewer_protocol_policy = "redirect-to-https"
    min_ttl                = 0
    default_ttl            = 3600
    max_ttl                = 86400
  }

  price_class = "PriceClass_200"

  restrictions {
    geo_restriction {
      restriction_type = "none"
      locations        = []
    }
  }

  tags = {
    Environment = "production"
  }

  viewer_certificate {
    cloudfront_default_certificate = true
  }
}

Let's apply all these configurations once again and create the entire infrastructure. It's going to take a few minutes, as the CloudFront distribution can take a while to deploy.

tf-aws-infra  $  terraform apply -var-file=variables/dev.tfvars -auto-approve 
Plan: 21 to add, 0 to change, 0 to destroy.

Changes to Outputs:
+ File-Uploader-App-bucket = (known after apply)
+ Source-S3-bucket = (known after apply)
+ fileuploader-api-endpoint = (known after apply)
+ fileuploader-app-url = (known after apply)

Please note that I destroyed the resources created in the previous apply and am re-creating the entire infrastructure; that's why you see 21 to add here.

...
aws_cloudfront_distribution.s3_distribution: Still creating... [3m30s elapsed]
aws_cloudfront_distribution.s3_distribution: Creation complete after 3m37s [id=E1X508LRLG558T]
data.aws_iam_policy_document.allow_access_from_cloudfront: Reading...
data.aws_iam_policy_document.allow_access_from_cloudfront: Read complete after 0s [id=1921119105]
aws_s3_bucket_policy.allow_access_from_cloudfront: Creating...
aws_s3_bucket_policy.allow_access_from_cloudfront: Creation complete after 2s [id=file-uploader-service-app-9002]

Apply complete! Resources: 21 added, 0 changed, 0 destroyed.

Outputs:

File-Uploader-App-bucket = "file-uploader-service-app-9002"
Source-S3-bucket = "user-content-bucket-9001"
fileuploader-api-endpoint = "crc47kna7h"
fileuploader-app-url = "d1d8eikgl0gsid.cloudfront.net"

Once the distribution and other resources are created, note the distribution URL and the API endpoint ID in the outputs. Before accessing the URL, there are a couple more things to do:

  1. We need to copy the API endpoint ID and update this in the app.js.
const apiUrl = 'https://crc47kna7h.execute-api.us-east-1.amazonaws.com/prod/upload'

2. We need to copy the front-end files (index.html, style.css, and app.js) to the web hosting bucket, i.e. file-uploader-service-app-9002. We can do this directly via the AWS CLI using the commands below. Please make sure you have already exported AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY.


tf-aws-infra $ pwd
/Users/vishalmishra/Study/medium/AWSDevOpsProjects/Project-3/tf-aws-infra

tf-aws-infra $ ls -ltr ../front-end
total 24
-rw-r--r--  1 vishalmishra  staff   641 Dec  4 16:56 style.css
-rw-r--r--  1 vishalmishra  staff   614 Dec  4 16:56 index.html
-rw-r--r--  1 vishalmishra  staff  2237 Dec  7 09:34 app.js

tf-aws-infra $ aws s3 cp ../front-end/index.html s3://file-uploader-service-app-9002/
upload: ../front-end/index.html to s3://file-uploader-service-app-9002/index.html

tf-aws-infra $ aws s3 cp ../front-end/style.css s3://file-uploader-service-app-9002/
upload: ../front-end/style.css to s3://file-uploader-service-app-9002/style.css

tf-aws-infra $ aws s3 cp ../front-end/app.js s3://file-uploader-service-app-9002/
upload: ../front-end/app.js to s3://file-uploader-service-app-9002/app.js

tf-aws-infra $ aws s3 ls s3://file-uploader-service-app-9002/
2023-12-07 09:39:26       2237 app.js
2023-12-07 09:40:07        614 index.html
2023-12-07 09:40:35        641 style.css

We have come a long way. It's time to test the CloudFront URL. Copy the URL from the terraform apply output and open it in the browser. Click Choose File and select the file you want to upload; the page will display the file's details and size. Finally, click Submit, and within a few seconds your file will be uploaded to S3.

File Storage Service served via CloudFront Distribution
Uploading image less than 5 MB

Finally, let’s verify the files in the S3 bucket.

S3 user content bucket

Errors Encountered

I didn't complete this project in one go. I hit multiple errors and challenges along the way and spent a lot of time fixing them. We learn by iterating, and each iteration makes the final product better.

Here are some of the common errors -

  1. If you get the error below while uploading a file via the UI, enable CORS. You can do this either from the API Gateway console or directly in the Terraform configuration.
Access to fetch at 'https://xxxxxxx.execute-api.us-east-1.amazonaws.com/dev/upload?filename=architecture.jpg' from origin 'http://127.0.0.1:5500' 
has been blocked by CORS policy: No 'Access-Control-Allow-Origin'
header is present on the requested resource.
If an opaque response serves your needs, set the request's mode to 'no-cors' to fetch the resource with CORS disabled.
Enabling CORS
Terraform configuration for enabling CORS
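The screenshot above showed that configuration; a hedged sketch of the usual pattern, an OPTIONS preflight method backed by a MOCK integration (resource names here follow the existing FileUploderService convention but are my own), looks like this:

```hcl
# Assumed sketch: answer the browser's CORS preflight on /upload with a
# MOCK integration, so no Lambda invocation is needed for OPTIONS.
resource "aws_api_gateway_method" "options" {
  rest_api_id   = aws_api_gateway_rest_api.FileUploderService.id
  resource_id   = aws_api_gateway_resource.FileUploderService.id
  http_method   = "OPTIONS"
  authorization = "NONE"
}

resource "aws_api_gateway_integration" "options" {
  rest_api_id = aws_api_gateway_rest_api.FileUploderService.id
  resource_id = aws_api_gateway_resource.FileUploderService.id
  http_method = aws_api_gateway_method.options.http_method
  type        = "MOCK"
  request_templates = {
    "application/json" = jsonencode({ statusCode = 200 })
  }
}

resource "aws_api_gateway_method_response" "options" {
  rest_api_id = aws_api_gateway_rest_api.FileUploderService.id
  resource_id = aws_api_gateway_resource.FileUploderService.id
  http_method = aws_api_gateway_method.options.http_method
  status_code = "200"
  response_parameters = {
    "method.response.header.Access-Control-Allow-Origin"  = true
    "method.response.header.Access-Control-Allow-Headers" = true
    "method.response.header.Access-Control-Allow-Methods" = true
  }
}

resource "aws_api_gateway_integration_response" "options" {
  rest_api_id = aws_api_gateway_rest_api.FileUploderService.id
  resource_id = aws_api_gateway_resource.FileUploderService.id
  http_method = aws_api_gateway_method.options.http_method
  status_code = aws_api_gateway_method_response.options.status_code
  response_parameters = {
    "method.response.header.Access-Control-Allow-Origin"  = "'*'"
    "method.response.header.Access-Control-Allow-Headers" = "'Content-Type'"
    "method.response.header.Access-Control-Allow-Methods" = "'POST,OPTIONS'"
  }
}
```

Remember to add these resources to the deployment's redeployment triggers so the change actually goes live.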

One more thing: since I was using the proxy integration, the Lambda function itself has to return the Access-Control-Allow-Origin response header. I modified the Lambda function to include the header, and the error was resolved. Note that you shouldn't use * as the header value in production; use your actual domain name instead.

Lambda Function code snippet with Response Header
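The missing screenshot showed that snippet; a hedged reconstruction of such a response helper (the function name is mine, not from the repo) might look like:

```python
import json

# Hypothetical helper: build a Lambda proxy-integration response carrying the
# CORS headers the browser requires. Pass your real domain as allow_origin.
def build_response(status_code, message, allow_origin="*"):
    return {
        "statusCode": status_code,
        "headers": {
            "Access-Control-Allow-Origin": allow_origin,
            "Access-Control-Allow-Methods": "POST,OPTIONS",
            "Access-Control-Allow-Headers": "Content-Type",
        },
        "body": json.dumps({"message": message}),
    }
```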

2. Getting an Access Denied error while accessing the CloudFront distribution.

<Error>
<Code>AccessDenied</Code>
<Message>Access Denied</Message>
<RequestId>GT6MJX8YVDJW7N5F</RequestId>
<HostId>QKDWhlO6IU0nnk32nOpaJCRVK3Q4Ehrs/5ap9yTnaa2vRy2AhD7GYpkHx3tocfaB+QtmSrYrlxQ=</HostId>
</Error>

Make sure the bucket policy is updated to allow CloudFront access:

{
    "Version": "2008-10-17",
    "Id": "PolicyForCloudFrontPrivateContent",
    "Statement": [
        {
            "Sid": "AllowCloudFrontServicePrincipal",
            "Effect": "Allow",
            "Principal": {
                "Service": "cloudfront.amazonaws.com"
            },
            "Action": "s3:GetObject",
            "Resource": "arn:aws:s3:::file-uploader-service-app-9002/*",
            "Condition": {
                "StringEquals": {
                    "AWS:SourceArn": "arn:aws:cloudfront::<AWS Account Id>:distribution/E683SHD2Z443X"
                }
            }
        }
    ]
}
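The Terraform equivalent, matching the resource names visible in the apply output earlier (treat the exact shape as a sketch), would be:

```hcl
# Sketch: allow only this CloudFront distribution to read objects
# from the web app bucket.
data "aws_iam_policy_document" "allow_access_from_cloudfront" {
  statement {
    sid     = "AllowCloudFrontServicePrincipal"
    effect  = "Allow"
    actions = ["s3:GetObject"]

    principals {
      type        = "Service"
      identifiers = ["cloudfront.amazonaws.com"]
    }

    resources = ["${aws_s3_bucket.file_uploader_app_bucket.arn}/*"]

    condition {
      test     = "StringEquals"
      variable = "AWS:SourceArn"
      values   = [aws_cloudfront_distribution.s3_distribution.arn]
    }
  }
}

resource "aws_s3_bucket_policy" "allow_access_from_cloudfront" {
  bucket = aws_s3_bucket.file_uploader_app_bucket.id
  policy = data.aws_iam_policy_document.allow_access_from_cloudfront.json
}
```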

Conclusion

As we wrap up this project — Cloud File Storage Service, we’ve successfully harnessed the power of AWS services — API Gateway, Lambda, and S3 — to create a seamless and efficient way of uploading files to the cloud. With every upload, we’ve witnessed the synergy of these technologies, enabling us to build a reliable and scalable solution.

But our exploration doesn’t end here! In the upcoming sequel to this blog, we’re diving into the realm of authentication and security using Amazon Cognito. We’ll elevate our service to the next level, ensuring that only authenticated users can access our API and utilize this file upload functionality.

Stay tuned for the next chapter!!! Together, let’s continue our quest to craft robust and user-friendly cloud-based solutions.

My Github Project repo for DevOps Projects — https://github.com/vishal2505/AWSDevOpsProjects/tree/main
