1

For context, I am building an application with a server that is written in Python, hosted on AWS, and uses DynamoDB as its database.

My question pertains to the following operation: When a user purchases an item, I want to update the count of soldItems in my database. To do this, I essentially do

def updateItemsSold():
    soldItems = db.get("soldItems") + 1  # read the current count and add one
    db.update("soldItems", soldItems)    # write the new count back

My question concerns the situation where two different customers buy an item at the same time. Is it possible that both db.get calls happen at the same time, so that the soldItems total increases by only one instead of two? Since I'm not using threads in my server, I assume this shouldn't be a problem, as one call of updateItemsSold() will need to finish before the next operation starts. However, I also know that in production multiple servers may be spun up to manage load. In that case, it seems like this could actually be an issue. Am I correct in thinking that? If so, how should I mitigate it? Is there some kind of lock I need to use?

3
  • This sounds like a case for transactions: aws.amazon.com/blogs/aws/new-amazon-dynamodb-transactions Commented Jun 21, 2019 at 18:54
  • 2
    Keeping a counter in the database is inherently problematic. I suggest you keep a table instead, e.g. named ItemsSold, which is just a list. When you need to "increment" the counter, insert a row into ItemsSold. Then, when you need a count, just use SELECT COUNT(*) FROM ItemsSold. This design would be immune to the sort of concurrency problem you are coping with here.
    – John Wu
    Commented Jun 21, 2019 at 21:29
  • @BerinLoritsch Like other databases, DynamoDB has specialized instructions for modifying a value. Much as in SQL, you can instruct it to SET field = field + 1.
    – marstato
    Commented Jun 21, 2019 at 21:50

3 Answers

6

DynamoDB supports transactions. However, each transaction is limited to either reads or writes, rather than arbitrary actions. This isn't the limitation it might seem at first, as reads and writes support expressions in various arguments.

In this particular case, using an update expression (without a transaction) will allow for an atomic operation. DynamoDB Transactions are useful for batch reads & writes.
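As a rough illustration of the update-expression approach, here is a minimal sketch. It assumes a boto3 Table resource is passed in, and that each item is keyed by a hypothetical `itemId` attribute; the `soldItems` attribute name comes from the question, everything else is illustrative only.

```python
def build_increment_request(item_id, amount=1):
    """Build keyword arguments for Table.update_item that add `amount` to soldItems.

    "itemId" is a hypothetical partition key name, not something from the question.
    """
    return {
        "Key": {"itemId": item_id},
        # ADD is applied by DynamoDB itself, so there is no separate
        # read-then-write step on the client that could race.
        "UpdateExpression": "ADD soldItems :inc",
        "ExpressionAttributeValues": {":inc": amount},
        # Ask DynamoDB to return the attribute's value after the update.
        "ReturnValues": "UPDATED_NEW",
    }


def update_items_sold(table, item_id, amount=1):
    """Increment soldItems for one item and return the new value."""
    response = table.update_item(**build_increment_request(item_id, amount))
    return response["Attributes"]["soldItems"]


# Possible usage, assuming boto3 is installed and a table named "Items" exists
# (both are assumptions):
#   import boto3
#   table = boto3.resource("dynamodb").Table("Items")
#   update_items_sold(table, "widget-1")
```

Because the increment is expressed in the request rather than computed in the server process, two servers issuing it at the same time should each contribute their own increment.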

1
  • 1
    This is the only correct answer.
    – marstato
    Commented Jun 21, 2019 at 21:51
2

As with the other answers: yes, you absolutely can have an issue. The usual term for this specific race condition is a lost update (it is sometimes loosely called a dirty read or dirty write).

You could try to use transactions to prevent this, but I suggest you reconsider the design instead. Just insert a new record for each sale; you are probably doing this anyway. When you want to know how many times an item has been sold, retrieve the sales records for that item and sum the quantities.

It's possible to make the counter approach robust, but there are many pitfalls, and it's impossible without introducing contention. With DynamoDB, you would also be required to always use a consistent read to get the current total, which essentially means everything goes to the same node for this item. In short, you are going to negate a lot of the value of using this kind of database. Don't prematurely optimize this; use the solution that makes correctness easiest to ensure, and then see where your bottlenecks are.
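The record-per-sale design described in this answer might look roughly like the sketch below. It assumes a boto3 Table resource is passed in, and a hypothetical sales table with `itemId` as partition key and `saleId` as sort key; those names and the `quantity` attribute are illustrative assumptions, not something from the question.

```python
import uuid


def record_sale(table, item_id, quantity=1):
    """Insert one new record per sale instead of updating a shared counter."""
    sale_id = str(uuid.uuid4())  # unique sort key, so concurrent sales never collide
    table.put_item(Item={"itemId": item_id, "saleId": sale_id, "quantity": quantity})
    return sale_id


def total_sold(table, item_id):
    """Sum the quantities of every sale record for one item."""
    request = {
        "KeyConditionExpression": "itemId = :id",
        "ExpressionAttributeValues": {":id": item_id},
    }
    total = 0
    while True:
        page = table.query(**request)
        total += sum(item["quantity"] for item in page["Items"])
        # A query returns results in pages; keep going while there is a next page.
        last_key = page.get("LastEvaluatedKey")
        if last_key is None:
            return total
        request["ExclusiveStartKey"] = last_key
```

Each sale only ever inserts a new item, so there is no read-modify-write step for two servers to race on; the cost moves to the read side, where the total is computed on demand.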

0

Yes, that is your standard race condition. I am not terribly familiar with DynamoDB or Python, but it is possible your DB client will handle that for you via a transaction mechanism or some versioning exception. More than likely, though, you'll need to do that yourself.
