7

I'm a bit confused about how to identify the elements in my application.

I have a rest backend with an sql database. I'm currently using numeric (autoincrement) ids as primary key and use these for foreign keys. I also defined uniques based on semantics.

My client is an app where I, for the most part, am mirroring the remote database. The goal of this is both to have a cache (performance) and to make possible to use the app offline. The offline usage can be temporary or permanent (user can use the app without ever connecting to the server).

So far I have been implementing the client with online usage in mind. So when the user creates a new item I would send a POST to the server, the server creates the item and gives it an id, and sends me back the item with the id, and I store it in the local database. This way I will use this id as foreign key also in the client database.

But the offline usage doesn't work like this. When I add an item without server I'll have to give it an id in the client. So this is immediately available to be updated, used in a foreign key, etc. Furthermore when offline is just temporary, that is the user connects later to the server, I have to synchronize the items which were added offline. If I'm using client-side ids, I would have to reemplace these in the server with server-generated ids, otherwise the database would look inconsistent and it may (though highly improvable) happen a clash. Or just use always clientside ids?

So my question is, how is this handled normally? I can think of 4 approaches:

  1. Drop the ids entirely and use only uniques. I can get even updates working with this. For example a unique can be a name. But this can lead to issue as follows - if user was online and created an item. Goes offline and updates the name of the item, which is the unique, 2 times. I would have to store in the client the last value of the unique when a synchronisation happened in order to do the update in the server. And cascade update. Besides of this unique also occupies more space than numeric id in the database.

  2. Drop the ids only in the client. So in the client I use only uniques (name, etc) and I send the server old-new value for updates, which is the same as in 1. The server uses ids internally but never sends this to the client.

  3. Use the server ids and do a sync call as soon as the client goes online where the client-generated ids are replaced with server-generated.

  4. Generate the ids always in the client - similar to described here maybe. (to garantee uniqueness among users e.g. device UUID + timestamp, but this leads to relatively long ids, not sure about space/performance implications here). And use them in the server instead of autoincr id. Or maybe use both?

1 Answer 1

6

There is no way you can get around having more than one way of uniquely identifying your entities. The best you can do is grit your teeth and confine the problem to the smallest possible scope.

For instance:

  • assign the real IDs on the server only. The server should never even be aware that the client uses another ID scheme, only that it delivers new records that must receive a new real ID.
  • use a distinct ID scheme in the client that has no overlap with the real IDs (e.g. negative rather than positive values...)
  • change all places in the client that deal with entity finding so that they understand both schemes, and use the other if the one isn't assigned yet
  • change the one place in the app that posts new data to the server so that it updates the "real ID" after the save has been confirmed.

It doesn't really get simpler than that. (Of course, minimizing the places that have to be changed so that this solution becomes easier is a good idea anyway, per the principle of Don't Repeat Yourself.)

7
  • I see, thanks. What about generating the ids always in the client? This would avoid the extra complexity of having to deal with 2 schemes...
    – User
    Commented Jun 18, 2015 at 9:33
  • @Ixxzz It adds much more complexity, because you cannot guarantee that the unique IDs are actually unique. You'd have to add complicated, rarely-used, devastating-if-broken logic to reconcile and change duplicate IDs. Don't go there unless clashes are so rare and so unimportant that you can just ignore them. Commented Jun 18, 2015 at 9:43
  • 1
    The problem is that this rests on a lot of assumptions that you will have to ensure forever, or risk insidious errors. Can you guarantee that device times are never reset? Are you sure that there will never be transactions where multiple items are created simultaneously? How sure are you that your random strings never, ever clash? Alternatively, could you live with a slash once in a while or not? The price of trying this is that you have to keep thinking about all these questions. Do the right thing instead, and they simply disappear. Commented Jun 18, 2015 at 10:21
  • 1
    Ok, I did some research and for now I'm going with UUIDs. My app will have always relatively small amounts of data (by it's nature), and it's not critical data, so if that highly unlikely event happens that there's a clash I can afford to trigger an exception or some possible inconsistency scenarios. I think the cost of maintaining 2 schemes is too high for my use case. I'll keep this in mind for the future, or other apps though. Thanks.
    – User
    Commented Jun 18, 2015 at 14:37
  • 2
    Never store UUID/GUID as CHAR(32), it's not text, it's a 128-bit number in hexadecimal form. 128-bits is 16-bytes so just use a BINARY(16) field if UUID isn't available. Commented Mar 10, 2017 at 7:07

Not the answer you're looking for? Browse other questions tagged or ask your own question.