31

So I have been trying to get hands-on with Amazon's AWS since my company's whole infrastructure is based on it.

One component I have never been able to understand properly is the Queue Service. I have searched Google quite a bit but haven't been able to find a satisfactory answer. I think a cron job and the Queue Service are somewhat similar; correct me if I am wrong.

So what exactly does SQS do? As far as I understand, it stores simple messages that other components in AWS consume in order to perform tasks, and you send messages to trigger those tasks.

In the question "Can someone explain to me what Amazon Web Services components are used in a normal web service?", the answer mentioned they used SQS to queue tasks they want performed asynchronously. Why not just give a message back to the user and do the processing later on? Why wait for SQS to do its stuff?

Also, let's say I have a web app that allows users to schedule some daily tasks. How would SQS fit into that?

3 Answers

98

No, cron and SQS are not similar. One (cron) schedules jobs while the other (SQS) stores messages. Queues are used to decouple message producers from message consumers. This is one way to architect for scale and reliability.

Let's say you've built a mobile voting app for a popular TV show and 5 to 25 million viewers are all voting at the same time (at the end of each performance). How are you going to handle that many votes in such a short space of time (say, 15 seconds)? You could build a significant web server tier and database back-end that could handle millions of messages per second but that would be expensive, you'd have to pre-provision for maximum expected workload, and it would not be resilient (for example to database failure or throttling). If few people voted then you're overpaying for infrastructure; if voting went crazy then votes could be lost.

A better solution would use a queuing mechanism that decouples the voting apps from your service, with a vote queue scalable enough to happily absorb 10 messages/sec or 10 million messages/sec. You would then have an application tier pulling messages from that queue as fast as it can to tally the votes.
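The decoupling described above can be sketched without any AWS dependency. The example below is only an in-memory illustration: Python's standard `queue.Queue` stands in for SQS, and the names `cast_votes`, `tally_votes`, and `run_demo` are hypothetical. The point it tries to show is that the producer returns as soon as a vote is enqueued, while a separate consumer drains the queue at its own pace.

```python
import queue
import threading
from collections import Counter


def cast_votes(vote_queue, votes):
    """Producer: enqueue each vote and return immediately (no tallying here)."""
    for vote in votes:
        vote_queue.put(vote)


def tally_votes(vote_queue, tally, stop):
    """Consumer: pull votes off the queue one by one until the stop marker arrives."""
    while True:
        vote = vote_queue.get()
        if vote is stop:
            break
        tally[vote] += 1


def run_demo():
    vote_queue = queue.Queue()  # in-memory stand-in for an SQS queue
    tally = Counter()
    stop = object()             # sentinel that tells the consumer to finish

    worker = threading.Thread(target=tally_votes, args=(vote_queue, tally, stop))
    worker.start()

    # A burst of votes arrives; the producer only has to enqueue them.
    cast_votes(vote_queue, ["alice"] * 3000 + ["bob"] * 2000)
    vote_queue.put(stop)
    worker.join()
    return tally
```

With a real SQS queue the producer and consumer would be separate processes or services, and you could add more consumers when the backlog grows.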

4
  • 6
    Ah so use case is sorta like load/spike distribution. But does it matter too much if you have a serverless event-driven setup? Let's say the code that handles the votes is an event-driven lambda, provision and spike doesn't really matter in this case, the best SQS could do is to do batch processing, right?
    – Mojimi
    Commented Aug 12, 2019 at 18:14
  • This way you will have one Lambda invocation triggered for every vote. Also, what if tomorrow there is a requirement to count those votes per customer? You could use an SQS queue or a FIFO SQS queue for some other requirement. How would you trigger your Lambda? Is it from an API Gateway? You can have an SQS queue in front of the API Gateway, or vice versa.
    Commented Jan 6, 2023 at 4:19
  • @AnkurKothari note that Lambda was less than one year old when this question and answer were written. Anyhow, typically a Lambda function would be invoked with a batch of messages e.g. 100 or more per invocation. There’s no API Gateway in that scenario. The AWS Lambda service polls the SQS queue and invokes the Lambda function as needed, and potentially invokes many instances of the Lambda function, each processing a batch. Counts by some grouping, if needed, can be done with DynamoDB.
    – jarmod
    Commented Jan 6, 2023 at 4:45
  • In my scenario, my Lambda reads a whole batch of messages from SQS and creates a new AWS Batch job, because of the time limit on the Lambda function.
    Commented Jan 6, 2023 at 8:33
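To make the batch invocation mentioned in the comments concrete, here is a rough sketch of a Lambda handler that tallies one SQS batch. The JSON body shape (`{"candidate": ...}`) and the helper name `tally_batch` are assumptions for illustration. The `Records`, `body`, and `messageId` fields follow the SQS event format Lambda passes in, and the `batchItemFailures` return value only has an effect if `ReportBatchItemFailures` is enabled on the event source mapping.

```python
import json
from collections import Counter


def tally_batch(records):
    """Count votes in one batch; collect the IDs of messages that could not be parsed."""
    tally = Counter()
    failures = []
    for record in records:
        try:
            vote = json.loads(record["body"])  # assumed body: {"candidate": "..."}
            tally[vote["candidate"]] += 1
        except (KeyError, TypeError, ValueError):
            # Report only this message as failed so the rest of the batch is not retried.
            failures.append({"itemIdentifier": record["messageId"]})
    return tally, failures


def handler(event, context):
    """Lambda entry point, invoked by the SQS event source with a batch of messages."""
    tally, failures = tally_batch(event.get("Records", []))
    # Persisting the counts (for example to DynamoDB) is left out of this sketch.
    print(dict(tally))
    return {"batchItemFailures": failures}
```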
11

One thing I would add to @jarmod's excellent and succinct answer is that the size of the messages does matter. For example, in SQS the maximum message size is just 256 KB unless you use the Extended Client Library, which increases the maximum to 2 GB. But note that it uses S3 as temporary storage for the payload.
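The idea of parking a large payload in S3 and sending only a pointer through the queue can be sketched by hand. This is not the Extended Client Library's API or message format; the function name `send_payload`, the key prefix, the pointer fields, and the `payload_location` attribute are all made up for illustration. The `sqs` and `s3` arguments are assumed to be boto3 clients (`boto3.client("sqs")`, `boto3.client("s3")`), and the 256 KB constant is the limit quoted above.

```python
import json
import uuid

MAX_SQS_BYTES = 256 * 1024  # message size limit quoted above


def send_payload(sqs, s3, queue_url, bucket, body):
    """Send body inline if it fits; otherwise store it in S3 and send a pointer."""
    encoded = body.encode("utf-8")
    if len(encoded) <= MAX_SQS_BYTES:
        sqs.send_message(QueueUrl=queue_url, MessageBody=body)
        return "inline"

    # Too large for the queue: upload the payload and enqueue only its location.
    key = f"sqs-payloads/{uuid.uuid4()}"  # hypothetical key layout
    s3.put_object(Bucket=bucket, Key=key, Body=encoded)
    pointer = json.dumps({"s3_bucket": bucket, "s3_key": key})
    sqs.send_message(
        QueueUrl=queue_url,
        MessageBody=pointer,
        # Hypothetical attribute so a consumer can tell pointers from inline bodies.
        MessageAttributes={
            "payload_location": {"DataType": "String", "StringValue": "s3"}
        },
    )
    return "s3"
```

A consumer would check the attribute, fetch the object from S3 when it is set, and delete the object once the work is done.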

In RabbitMQ the practical limit is around 100 KB. There is no hard-coded limit in RabbitMQ, but with larger messages the system simply stalls more or less often. From personal experience, RabbitMQ can handle a steady stream of messages of around 1 MB for about 1 to 2 hours non-stop, but then it will start to behave erratically, often becoming a zombie, and you'll need to restart the process.

7

SQS is a great way to decouple services, especially when there is a lot of heavy-duty, batch-oriented processing required.

For example, let's say you have a service where people upload photos from their mobile devices. Once the photos are uploaded your service needs to do a bunch of processing of the photos, e.g. scaling them to different sizes, applying different filters, extracting metadata, etc.

One way to accomplish this would be to post a message to an SQS queue (or perhaps multiple messages to multiple queues, depending on how you architect it). The message(s) describe work that needs to be performed on the newly uploaded image file. Once the message has been written to SQS, your application can return a success to the user because you know that you have the image file and you have scheduled the processing.
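As a minimal sketch of that step, the function below posts one message per processing task for a newly uploaded photo. The function name, the task names, and the JSON body layout are assumptions for illustration; `sqs` is assumed to be a boto3 SQS client, and `send_message` with `QueueUrl` and `MessageBody` is the standard boto3 call.

```python
import json


def enqueue_photo_jobs(sqs, queue_url, bucket, key,
                       tasks=("resize", "filter", "extract_metadata")):
    """Post one SQS message per task describing the work to do on an uploaded photo."""
    message_ids = []
    for task in tasks:
        # The body only describes the work; the image itself stays in storage.
        body = json.dumps({"bucket": bucket, "key": key, "task": task})
        response = sqs.send_message(QueueUrl=queue_url, MessageBody=body)
        message_ids.append(response["MessageId"])
    return message_ids
```

Once this returns, the upload handler can respond to the user straight away, for example after calling `enqueue_photo_jobs(boto3.client("sqs"), queue_url, "photos", "uploads/123.jpg")` with your own queue URL and object location.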

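In the background, you can have servers reading messages from SQS and performing the work specified in the messages. If one of those servers dies before it finishes and deletes a message, that message becomes visible again once its visibility timeout expires, and another server will pick it up and perform the work. SQS delivers each message at least once and keeps it until a consumer deletes it or the retention period expires, so you can be confident that the work will eventually get done.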
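A worker along those lines might look like the sketch below. The names `process_once` and `handle_job` are hypothetical, and the JSON body format is an assumption; `sqs` is assumed to be a boto3 SQS client, and `receive_message` and `delete_message` are used with their standard parameters. The key point is that a message is deleted only after the work succeeds, so a failed or crashed worker simply leaves it on the queue for a later attempt.

```python
import json


def process_once(sqs, queue_url, handle_job):
    """Fetch up to 10 messages, run handle_job on each, and delete the ones that succeed."""
    response = sqs.receive_message(
        QueueUrl=queue_url,
        MaxNumberOfMessages=10,  # SQS returns at most 10 messages per call
        WaitTimeSeconds=20,      # long polling: wait up to 20 s for messages
    )
    completed = 0
    for message in response.get("Messages", []):
        try:
            handle_job(json.loads(message["Body"]))
        except Exception:
            # Do not delete: the message reappears after its visibility timeout.
            continue
        sqs.delete_message(QueueUrl=queue_url, ReceiptHandle=message["ReceiptHandle"])
        completed += 1
    return completed
```

Because delivery is at least once, the same message can occasionally arrive twice, so `handle_job` should be safe to repeat. A real worker would call `process_once` in a loop.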
