-1

Say you have a REST API endpoint like POST /move-money which transfers money from your main account to a savings pot. There are three path parameters:

  • accountId for the user's account
  • potId for the user's savings pot
  • transferId which is a GUID generated by the calling client

Assume there's also a body with additional details but those are irrelevant for the question.

The goal is to achieve idempotency on the endpoint so that if two concurrent requests arrive at the service, the amount will be transferred only once. So to be precise:

Is it enough to use the transferId (which is generated by the client) as an idempotency key?

Is it redundant or necessary to perform a lock on the accountId to ensure idempotency?

What would be the sequence of actions necessary to ensure idempotency, as in at what point do we store the transferId and when do we perform the check?

3
  • In general, there isn't such a thing as an "idempotent increment/decrement". What you want here is for the operation to skip if the transaction id was already used. That's different from idempotency.
    – T. Sar
    Commented Apr 25 at 13:40
  • In this case, one would want an exclusive lock so that only one transaction at a time can do any sort of modification. As part of the processing after acquiring the lock, one could check whether the transfer Id has already been applied. If not, move the money; otherwise stop and tell the caller that the transfer has already been applied. This also helps when two different transfer Ids arrive at the same time, so that the beginning and ending balances are always correct.
    – Jon Raynor
    Commented Apr 25 at 15:32
  • Yes, after reading a bit more about this and the replies here, I think the original question conflates two different problems: concurrent modification of the account balance, and idempotent execution of a request with a specific Id. The former is solved by a lock and the latter by storing and checking the Id of the request being executed.
    – MZokov
    Commented Apr 25 at 18:19
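
The sequence described in the comments above (acquire an exclusive lock, check whether the transfer Id was already applied, and only then move the money) could be sketched roughly as follows. This is a minimal in-memory illustration, not a definitive implementation: the class `LockedAccount`, the method `moveToPot`, and the `Result` values are hypothetical names, a single-process `ReentrantLock` stands in for whatever lock or database transaction the real service would use, and the funds check is left out for brevity.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical in-memory stand-in for one account and its savings pot.
class LockedAccount {
    enum Result { APPLIED, ALREADY_APPLIED }

    private final ReentrantLock lock = new ReentrantLock();
    private final Set<String> appliedTransferIds = new HashSet<>();
    private long mainCents;
    private long potCents;

    LockedAccount(long openingMainCents) {
        this.mainCents = openingMainCents;
    }

    Result moveToPot(String transferId, long amountCents) {
        lock.lock(); // only one modification of this account at a time
        try {
            // Check the transfer Id while holding the lock.
            if (appliedTransferIds.contains(transferId)) {
                return Result.ALREADY_APPLIED;
            }
            // Move the money and record the Id in the same critical section.
            mainCents -= amountCents;
            potCents += amountCents;
            appliedTransferIds.add(transferId);
            return Result.APPLIED;
        } finally {
            lock.unlock();
        }
    }

    long mainCents() {
        lock.lock();
        try { return mainCents; } finally { lock.unlock(); }
    }

    long potCents() {
        lock.lock();
        try { return potCents; } finally { lock.unlock(); }
    }
}
```

Because the check and the update happen under the same lock, a second request with the same transfer Id sees the recorded Id and stops, while a request with a different Id simply waits its turn.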

2 Answers

5

The goal is to achieve idempotency on the endpoint so that if two concurrent requests arrive at the service, the amount will be transferred only once.

This may be a little bit tangled, as the process-at-most-once property that you want isn't quite the same thing as idempotency. So let's digress for a moment.

Here's a Java example of idempotency:

import java.util.HashMap;

HashMap<String, String> example = new HashMap<>();
example.put("A", "B");
example.put("A", "B"); // same key and value again: the map does not change

assert example.size() == 1;

Notice that invoking HashMap::put twice (with the same arguments) produces the same effect as invoking the method once. That's the idempotent bit; processing the command a second time is redundant, and that's a natural consequence of the semantics of the method.

Assignment, and set operations like add/remove have idempotent semantics. Increment/decrement do not.
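
To make the contrast concrete, here is a small sketch that repeats a set insertion and an increment. The class and method names (`IdempotencyDemo`, `addTwice`, `incrementTwice`) are made up for illustration.

```java
import java.util.HashSet;
import java.util.Set;

class IdempotencyDemo {
    // Adding to a set is idempotent: the second call changes nothing.
    static int addTwice(Set<String> set, String value) {
        set.add(value);
        set.add(value);
        return set.size();
    }

    // Incrementing is not: every repeat changes the result again.
    static int incrementTwice(int counter) {
        counter++;
        counter++;
        return counter;
    }

    public static void main(String[] args) {
        System.out.println(addTwice(new HashSet<>(), "A")); // 1, same as adding once
        System.out.println(incrementTwice(0));              // 2, a single call would give 1
    }
}
```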

In the general case, things aren't idempotent.

A thing you can sometimes do is treat the collection of incoming messages themselves as a set, and you "upsert" each new message as it arrives, thus ensuring that there are zero or one copies of each message in the collection.
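
A minimal sketch of that "set of messages" idea, assuming the client-generated transferId identifies a message. `TransferInbox` and `accept` are hypothetical names, and an in-memory `ConcurrentHashMap` stands in for whatever durable store would really hold the messages.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical inbox keyed by transferId; holds at most one copy of each message.
class TransferInbox {
    private final Map<String, Long> received = new ConcurrentHashMap<>();

    // Returns true only for the first copy of a given transferId.
    // putIfAbsent is atomic, so two racing copies cannot both "win".
    boolean accept(String transferId, long amountCents) {
        return received.putIfAbsent(transferId, amountCents) == null;
    }

    int size() {
        return received.size();
    }
}
```

Only the caller that gets `true` back would go on to process the transfer; later copies of the same message are recognised and skipped.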

An alternative is to take a compare-and-swap approach, where you describe in the message some predicate that will be false if the message has already been processed. Including a sequence number/target version is one common approach - if the version doesn't match what you find at processing time, then you no-op. (If you are already familiar with conditional requests, then you will recognize this as essentially the same idea.)
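
As a rough illustration of the target-version idea, the sketch below keeps the balance and a version number in one immutable snapshot and only applies a withdrawal if the caller's expected version still matches. `VersionedAccount`, `State`, and `withdrawIfVersion` are hypothetical names, and the in-memory `AtomicReference` is an assumption standing in for a versioned row or document.

```java
import java.util.concurrent.atomic.AtomicReference;

class VersionedAccount {
    // Immutable snapshot: the balance together with the version it belongs to.
    static final class State {
        final long version;
        final long balanceCents;

        State(long version, long balanceCents) {
            this.version = version;
            this.balanceCents = balanceCents;
        }
    }

    private final AtomicReference<State> state;

    VersionedAccount(long openingBalanceCents) {
        state = new AtomicReference<>(new State(0, openingBalanceCents));
    }

    // Applies the withdrawal only if the account is still at expectedVersion.
    boolean withdrawIfVersion(long expectedVersion, long amountCents) {
        State current = state.get();
        if (current.version != expectedVersion) {
            return false; // already processed, or superseded by another write: no-op
        }
        State next = new State(current.version + 1, current.balanceCents - amountCents);
        // Fails (returns false) if another writer replaced the snapshot in between.
        return state.compareAndSet(current, next);
    }

    long balanceCents() { return state.get().balanceCents; }

    long version() { return state.get().version; }
}
```

A replayed message carries the old version, so its predicate is false on the second attempt and the balance is left alone.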

And this is all well and good in the imaginary world where you only have to worry about handling a single message at a time. But in the real world, we have to worry about concurrent requests, and the possibility that there are zombie processes still trying to do work, and goodness knows what else.

Therefore, if you are intending that your data model satisfies some constraint, like "at most one copy of each message in the list", then you are going to need some form of locking somewhere. Where that is, and what form the lock should take, tends to vary with context; I'll note in passing that the versions where the domain processing can happen asynchronously are much simpler than the synchronous/request-response cases.

If your processing includes side effects in addition to local bookkeeping, then you are going to want to be very cautious about trying to incorporate multiple concerns into the same handler.


Is it enough to use the transferId (which is generated by the client) as an idempotency key?

Maybe. Review de Graauw 2010. Part of the challenge is whether you need to worry about the "same" logical message showing up as separate requests with different ids. Example: I tried to send an HTTP request, the system seemed unresponsive, so I tried again from a different browser/machine, and now there's a second copy of the message using a different transferId. How important is it to the business to get that right the first time we process the messages?

Is it redundant or necessary to perform a lock on the accountId to ensure idempotency?

If you have two processes trying to concurrently write to a (logical) data structure, you are going to need some mechanism in place to ensure that the data doesn't get corrupted / that writes don't get lost / and so on.

That might mean acquiring a lock, or it might mean leveraging compare and swap commands. It absolutely requires recognizing the contention, and paying attention to the failure modes.
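
For the compare-and-swap option, one possible shape is a retry loop: read the balance, compute the new value, and write it back only if nobody else wrote in between. This is a sketch under the assumption of a single in-process `AtomicLong`; `CasBalance` and `withdraw` are hypothetical names.

```java
import java.util.concurrent.atomic.AtomicLong;

class CasBalance {
    private final AtomicLong cents;

    CasBalance(long openingCents) {
        cents = new AtomicLong(openingCents);
    }

    // Returns false if funds are insufficient; otherwise retries until the
    // update is applied on top of the value it actually read.
    boolean withdraw(long amountCents) {
        while (true) {
            long seen = cents.get();
            if (seen < amountCents) {
                return false;
            }
            if (cents.compareAndSet(seen, seen - amountCents)) {
                return true;
            }
            // Another writer got in first: re-read and try again.
        }
    }

    long cents() {
        return cents.get();
    }
}
```

Note that this only protects against lost or corrupted writes. It does not by itself recognise a repeated request, so the transferId check is still needed alongside it.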

1
  • Thanks for the detailed response, I really appreciate you pointing out the nuances in this and the different types of semantics that need to be considered. I do agree that in an async model this would be simpler to process than the request-response version. Unfortunately, the discussion which prompted this was strictly about the latter. Would you agree that if sameness is only based on the transactionId from the client, that would be enough to provide idempotence from the point of view of the client (i.e. if the caller uses the same id)? Store the id, then process, and no-op if the same id is received.
    – MZokov
    Commented Apr 25 at 15:43
3

You mentioned an account table and a pot table. Implicit in the question, there must also be a ledger table, which we can use to audit a sequence of transactions.

It only makes sense for the client to issue a transaction request if the account balance is at least as much as the amount being transferred. So the client should be creating such a request only after getting a good query result back from the server. As long as you are doing that, you should obtain the last ID from the ledger, or the most recently used GUID from the ledger. In this way, we can tie together the ACID guarantees of the backend RDBMS with the idempotent re-transmissions of the client.

So, for example, the client might request a transfer of $10 with ID = 123, and subsequently make a distinct request for a transfer of $10 with ID = 124. Each of these requests might be re-transmitted several times, with idempotent effect, transferring a total of $20.

If the starting account balance was $15, then the client would never have attempted to produce that second transaction request. If a pair of racing clients do attempt such a pair of transfers, the server will notify one of them that they lost the race, so that client will give up and not attempt any re-transmissions.
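
The $10 / ID 123 / ID 124 scenario can be played through with a small in-memory sketch. Everything here is an assumption made for illustration: `LedgerSketch`, `transfer`, and the `Outcome` values are hypothetical names, and a `synchronized` method stands in for the database transaction that would give the real ledger its ACID guarantees.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class LedgerSketch {
    enum Outcome { APPLIED, DUPLICATE, REJECTED }

    // Request ID -> amount in dollars, in the order the transfers were applied.
    private final Map<Long, Long> ledger = new LinkedHashMap<>();
    private long accountDollars;
    private long potDollars;

    LedgerSketch(long openingDollars) {
        this.accountDollars = openingDollars;
    }

    // synchronized stands in for BEGIN ... COMMIT on the real database.
    synchronized Outcome transfer(long requestId, long dollars) {
        if (ledger.containsKey(requestId)) {
            return Outcome.DUPLICATE; // a re-transmission: no further effect
        }
        if (dollars > accountDollars) {
            return Outcome.REJECTED; // insufficient funds
        }
        accountDollars -= dollars;
        potDollars += dollars;
        ledger.put(requestId, dollars);
        return Outcome.APPLIED;
    }

    synchronized long accountDollars() { return accountDollars; }

    synchronized long potDollars() { return potDollars; }
}
```

Starting from $100, sending ID 123 three times and ID 124 twice moves $20 in total. Starting from $15, ID 123 is applied and ID 124 is rejected, so only $10 moves.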

2
  • I think for this scenario, we can assume that the app doing the transfer does the ledger/balance check and returns an error if the balance is insufficient. So we care about the case in which the same ID (i.e. 123) is received and there's enough balance, but we want to ensure only one request is processed. I feel like locking on the account would be redundant if you check the ID as the first step and store the ID as the second step. That way you let the DB handle the race condition, as that would provide serial execution. Would you agree?
    – MZokov
    Commented Apr 25 at 15:51
  • I'm not sure what you mean by "locking on the account", perhaps acquiring some external mutex? I intended for update requests received by the server to enjoy ACID protection: BEGIN; do stuff; COMMIT; The stuff involves verifying funds are available & this is the first and only time we have encountered an ID=123 request, decrement, increment, append to audit ledger. For a well formed DB transaction like that, I am confident that a Postgres backend would prohibit double spending so money is neither created nor destroyed. I imagine default MVCC isolation level suffices, but do check the docs.
    – J_H
    Commented Apr 25 at 16:12
