4

I'm working in the context of a (kind of) microservices architecture where services can have multiple instances, each of which can create new documents in the same MongoDB collection.

There is a functional requirement that each document gets a unique ID (e.g. an employee ID or badge number). Preferably newer documents get a higher ID.

Since MongoDB doesn't have an auto-increment feature, how is this typically handled? I've seen suggestions to create a separate service that keeps track of the numbering, but I don't really like this because there can only be one instance of this service and it will need to process requests for a new number in a synchronized manner.

2
  • 2
    Generating monotonically increasing IDs needs a central authority (native in the DB or any other service) that keeps track of existing IDs and calculates new ones based on some algorithm (the simplest being adding 1). Why do you want increasing IDs? In the worst case you could do it on your own rather than relying on MongoDB.
    – Ewald B.
    Commented Nov 29, 2017 at 13:45
  • The MongoDB docs say not to use incrementing IDs; one can use the default ObjectId instead: docs.mongodb.com/v2.8/tutorial/…
    – Jon Raynor
    Commented Nov 29, 2017 at 15:57

3 Answers

6

Have the microservice that owns document creation generate the IDs. Having multiple services all access the same data store directly means that they're not individually deployable and scalable. If you just have many instances of the document service and need them to collaborate, then you need something like Twitter's Snowflake.
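
A minimal sketch of a Snowflake-style generator, assuming the commonly described layout of a millisecond timestamp, a worker ID and a per-millisecond sequence. The class name, the epoch value and the way each instance obtains a distinct `worker_id` are all hypothetical; this is an illustration of the idea, not Twitter's implementation.

```python
import threading
import time

EPOCH_MS = 1_500_000_000_000   # hypothetical custom epoch (July 2017)
WORKER_BITS = 10               # up to 1024 service instances
SEQUENCE_BITS = 12             # up to 4096 IDs per millisecond per instance
MAX_SEQUENCE = (1 << SEQUENCE_BITS) - 1


class SnowflakeGenerator:
    """Generates IDs that increase over time within one instance."""

    def __init__(self, worker_id, clock=lambda: int(time.time() * 1000)):
        if not 0 <= worker_id < (1 << WORKER_BITS):
            raise ValueError("worker_id out of range")
        self._worker_id = worker_id
        self._clock = clock
        self._lock = threading.Lock()
        self._last_ms = -1
        self._sequence = 0

    def next_id(self):
        with self._lock:
            now = self._clock()
            if now < self._last_ms:
                # Clock moved backwards: keep using the last timestamp
                # so that IDs from this instance never decrease.
                now = self._last_ms
            if now == self._last_ms:
                self._sequence = (self._sequence + 1) & MAX_SEQUENCE
                if self._sequence == 0:
                    # Sequence exhausted for this millisecond: wait for the next one.
                    while now <= self._last_ms:
                        now = self._clock()
            else:
                self._sequence = 0
            self._last_ms = now
            return (
                ((now - EPOCH_MS) << (WORKER_BITS + SEQUENCE_BITS))
                | (self._worker_id << SEQUENCE_BITS)
                | self._sequence
            )
```

Each instance needs no coordination at generation time, only a unique `worker_id`. Note that IDs are only roughly time-ordered across instances, and they are 64-bit numbers, so this does not fit a format limited to a handful of digits.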

1
  • Although there are indeed problems with the individual deployability and scalability, in this case it's mostly about having multiple instances of the service (not sure it can be called a document service) potentially creating new documents concurrently.
    – herman
    Commented Nov 29, 2017 at 22:46
2

Preferably newer documents get a higher ID.

Is this just a personal preference, or is there a technical requirement for this?

In my experience UUIDs generally make better primary keys than sequential IDs because anybody, including clients, can create them. This allows your API's Create methods to be asynchronous, since clients interested in the created document will already have its ID (they specified it themselves).

Sequential IDs make humans feel warm and fuzzy but they don't usually solve technical problems any better than UUIDs.

If your users want to see a sequential ID then by all means give them one. Just don't make it your document's primary key. As for how to source your sequential IDs, I'd have to agree that a central service is the way to go.

4
  • As the IDs are user-facing, we cannot use UUIDs: there are constraints on the format, e.g. in some cases we can't have IDs with more than 5 or 6 digits.
    – herman
    Commented Nov 29, 2017 at 13:16
  • I'm not a big fan of user-facing IDs. Your document ID concerns probably don't actually overlap with your users' concerns. You can probably get away with having a private document UUID and a second, informational user-facing ID.
    – MetaFight
    Commented Nov 29, 2017 at 13:19
  • Maybe (although I'm not sure the concerns are different), but then this informational user-facing ID would need to be unique as well, right? Seems like the problem stays the same. Also, it's not necessarily end-user facing but may need to be sent to a government organization (for declaration purposes) which only accepts a particular format, ...
    – herman
    Commented Nov 29, 2017 at 13:31
  • It's not exactly the same problem. The distinction is that you can create a new document without immediately specifying its sequential ID. In other words, your documents can eventually get a user-facing ID. The additional complexity of sourcing one doesn't get in the way of actually creating your document. Admittedly, this is a small gain.
    – MetaFight
    Commented Nov 30, 2017 at 11:28
0

I always use UUIDs for this. I don't like auto-increment numbers because I've seen that go wrong in failover scenarios.

With UUIDs you don't need a separate service to track them.

The downside of UUIDs is that they are long strings, so storage is not as efficient. On the other hand, they can give you better random writes, which might help you avoid hot-spotting.

[Edit] Regarding your format constraints, you certainly don't have to use a canonical UUID. You can take a UUID and transform it to shorten it to some reasonable length. You can also encode it as a Base64 string, which I do a lot. This gives you something à la YouTube IDs, which I find quite reasonable. I've generated even shorter ones (5 characters) that didn't have any collisions (yet). Obviously the success of this depends on how many documents you have and what collision risk your specific method entails.
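
A rough sketch of the shortening idea, assuming a random (version 4) UUID encoded as URL-safe Base64 and then truncated. The function names and the 8-character cap are choices made for this example, and the collision estimate is the usual birthday approximation, which assumes every ID is uniformly random.

```python
import base64
import math
import uuid


def short_id(length=8):
    """Return the first `length` URL-safe Base64 characters of a random UUID.

    Only the first 48 bits (8 Base64 characters) of a version 4 UUID are
    purely random; later bits include the fixed version and variant fields,
    so this sketch refuses longer prefixes.
    """
    if not 1 <= length <= 8:
        raise ValueError("length must be between 1 and 8")
    encoded = base64.urlsafe_b64encode(uuid.uuid4().bytes).decode("ascii")
    return encoded[:length]


def collision_probability(n_ids, length):
    """Approximate chance of at least one collision among n_ids random IDs."""
    space = 64 ** length  # each Base64 character carries 6 bits
    return -math.expm1(-n_ids * (n_ids - 1) / (2 * space))
```

Under these assumptions, `collision_probability(10_000, 5)` comes out at roughly 4.5%, so 5-character IDs only stay collision-free for fairly small collections. In practice you would also want a unique index on the ID field and a retry when an insert hits a duplicate.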

If you really can't use UUIDs, then you'll have to make a simple IdGenerator service that stores the latest ID in an RDBMS table. You might have to guard against simultaneous updates and deadlocks. Many, many years ago I used this approach with a stored procedure that implemented quasi-row-level locking to ensure that multiple simultaneous calls would always return unique values.
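
As an illustration of the counter-table idea, here is a small sketch using SQLite from Python's standard library in place of a full RDBMS. The table name `id_counter`, the counter name `employee` and the class itself are hypothetical; it is not the stored procedure described above, just the same principle of incrementing and reading the counter inside one write transaction.

```python
import sqlite3


class IdGenerator:
    """Hands out sequential IDs backed by a single-row-per-counter table."""

    def __init__(self, db_path):
        # isolation_level=None puts the connection in autocommit mode,
        # so transactions are controlled explicitly below.
        self._conn = sqlite3.connect(db_path, isolation_level=None)
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS id_counter ("
            "name TEXT PRIMARY KEY, last_id INTEGER NOT NULL)"
        )

    def next_id(self, name="employee"):
        cur = self._conn.cursor()
        # BEGIN IMMEDIATE takes the write lock up front, so two callers
        # cannot both read the same value before either one updates it.
        cur.execute("BEGIN IMMEDIATE")
        try:
            cur.execute(
                "INSERT OR IGNORE INTO id_counter (name, last_id) VALUES (?, 0)",
                (name,),
            )
            cur.execute(
                "UPDATE id_counter SET last_id = last_id + 1 WHERE name = ?",
                (name,),
            )
            cur.execute("SELECT last_id FROM id_counter WHERE name = ?", (name,))
            (value,) = cur.fetchone()
            cur.execute("COMMIT")
            return value
        except Exception:
            cur.execute("ROLLBACK")
            raise
```

Several service instances can each open their own connection to the same database and will still receive distinct, increasing values. The same atomic-increment idea can be applied to a counter document in MongoDB itself, using `findOneAndUpdate` with `$inc`. Either way, an ID that is handed out but never used leaves a gap in the sequence.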

2
