23

I have been practicing domain-driven design for about 8 years now, and even after all these years one thing still bugs me: checking a domain object for uniqueness against the records in data storage.

In September 2013 Martin Fowler mentioned the TellDon'tAsk principle, which should be applied to all domain objects where possible. The object should then report how the operation went (in object-oriented design this is mostly done by throwing an exception when the operation was unsuccessful).

My projects are usually divided into many parts, two of which are Domain (containing business rules and nothing else; the domain is completely persistence-ignorant) and Services. Services know about the repository layer, which is used to CRUD data.

Because the uniqueness of an attribute belonging to an object is a domain/business rule, it should belong to the domain module, so that the rule is exactly where it is supposed to be.

To check the uniqueness of a record, you need to query the current dataset, usually a database, to find out whether another record with, let's say, the same Name already exists.

Considering domain layer is persistence ignorant and has no idea how to retrieve the data but only how to do operations on them, it cannot really touch the repositories itself.

The design I have been using so far looks like this:

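abstract class ProductRepository
{
    // throws Repository.RecordNotFoundException when no product has the given SKU
    public abstract Domain.Product GetBySKU(string sku);

    // both are used by ProductCrudService below to persist the product
    public abstract void MarkFresh(Domain.Product product);
    public abstract void ProcessChanges();
}

class ProductCrudService
{
    private readonly ProductRepository pr;

    public ProductCrudService(ProductRepository repository)
    {
        pr = repository;
    }

    public void SaveProduct(Domain.Product product)
    {
        try {
            pr.GetBySKU(product.SKU);

            // no exception so far means a product with this SKU was found
            throw new Service.ProductWithSKUAlreadyExistsException("msg");
        } catch (Repository.RecordNotFoundException) {
            // expected when the SKU is free: suppress/log exception
        }

        pr.MarkFresh(product);
        pr.ProcessChanges();
    }
}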

This leads to services defining domain rules rather than the domain layer itself, and the rules end up scattered across multiple sections of your code.

I mentioned the TellDon'tAsk principle because, as you can clearly see, the service offers an action (it either saves the Product or throws an exception), but inside the method you are operating on objects in a procedural way.

The obvious solution is to create a Domain.ProductCollection class with an Add(Domain.Product) method that throws the ProductWithSKUAlreadyExistsException. However, it performs poorly, because you would need to load all the Products from data storage in order to find out in code whether some Product already has the same SKU as the Product you are trying to add.

How do you guys solve this specific issue? This is not really a problem per se; I have had the service layer represent certain domain rules for years. The service layer usually also serves more complex domain operations. I am simply wondering whether you have stumbled upon a better, more centralized solution during your career.

4
  • Why pull an entire list of names when you can just ask for one and base the logic on whether or not it was found? You're telling the persistence layer what to do and not how to do it, as long as there is an agreement on the interface.
    – JeffO
    Commented May 17, 2016 at 13:52
  • 2
    When using the term Service, we should strive toward more clarity, such as the difference between domain services in the domain layer and application services in the application layer. See gorodinski.com/blog/2012/04/14/…
    – Erik Eidt
    Commented May 17, 2016 at 15:17
  • Excellent article, @ErikEidt, thank you. I guess my design is not wrong then, if I can trust Mr. Gorodinski. He's pretty much saying the same: "A better solution is to have an application service retrieve the information required by an entity, effectively setting up the execution environment, and provide it to the entity", and he has the same objection to injecting a repository into the Domain model directly as I do, mostly because it breaks the SRP.
    – Andy
    Commented May 17, 2016 at 15:32
  • 1
    A quote from the "tell, don't ask" reference: "But personally, I don't use tell-dont-ask."
    – radarbob
    Commented May 17, 2016 at 16:39

2 Answers

13

Considering domain layer is persistence ignorant and has no idea how to retrieve the data but only how to do operations on them, it cannot really touch the repositories itself.

I would disagree with this part. Especially the last sentence.

While it is true that the domain should be persistence ignorant, it does know that there is a "collection of domain entities", and that there are domain rules that concern this collection as a whole, uniqueness being one of them. And because the implementation of the actual logic heavily depends on the specific persistence mode, there must be some kind of abstraction in the domain that specifies the need for this logic.

So it is as simple as creating an interface that can query if name already exists, which is then implemented in your data store and called by whoever needs to know if the name is unique.
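To make that concrete, here is a minimal sketch in Java. All names in it (ProductRepository.existsWithSku, ProductRegistration, InMemoryProductRepository, DuplicateSkuException) are hypothetical and only illustrate the idea. The in-memory class merely stands in for a data store; a real implementation would presumably run an existence query there.

```java
import java.util.HashMap;
import java.util.Map;

// ---- Domain layer (all names are hypothetical) ----

final class Product {
    private final String sku;
    Product(String sku) { this.sku = sku; }
    String sku() { return sku; }
}

class DuplicateSkuException extends RuntimeException {
    DuplicateSkuException(String sku) {
        super("A product with SKU " + sku + " already exists");
    }
}

// The abstraction is owned by the domain: it states what is needed, not how.
interface ProductRepository {
    boolean existsWithSku(String sku);
    void add(Product product);
}

// Domain service that owns the uniqueness rule.
final class ProductRegistration {
    private final ProductRepository products;

    ProductRegistration(ProductRepository products) { this.products = products; }

    void register(Product product) {
        if (products.existsWithSku(product.sku())) {
            throw new DuplicateSkuException(product.sku());
        }
        products.add(product);
    }
}

// ---- Infrastructure layer ----

// Assumption: a map stands in for the data store in this sketch.
final class InMemoryProductRepository implements ProductRepository {
    private final Map<String, Product> bySku = new HashMap<>();

    @Override
    public boolean existsWithSku(String sku) { return bySku.containsKey(sku); }

    @Override
    public void add(Product product) { bySku.put(product.sku(), product); }
}

class RegistrationDemo {
    public static void main(String[] args) {
        ProductRegistration registration =
                new ProductRegistration(new InMemoryProductRepository());
        registration.register(new Product("ABC-1"));
        try {
            registration.register(new Product("ABC-1"));
        } catch (DuplicateSkuException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The rule itself sits in the domain, and only the lookup is delegated. Note that the check and the insert are two separate steps here, so this sketch alone does not protect against concurrent registrations.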

And I would like to stress that repositories are DOMAIN services. They are abstractions around persistence. It is the implementation of a repository that should be separate from the domain. There is absolutely nothing wrong with a domain entity calling a domain service. There is nothing wrong with one entity being able to use a repository to retrieve another entity or to retrieve some specific information that cannot be readily kept in memory. This is the reason why Repository is a key concept in Evans' book.

9
  • Thank you for the input. Could repository actually represent the Domain.ProductCollection I had in mind, considering they are responsible for retrieving objects from the Domain layer?
    – Andy
    Commented May 17, 2016 at 13:31
  • @DavidPacker I don't really understand what you mean, because there should be no need to keep all items in memory. The implementation of the "DoesNameExist" method should (most probably) be a SQL query on the data store side.
    – Euphoric
    Commented May 17, 2016 at 13:35
  • What I mean is this: instead of storing the data in an in-memory collection, when I want to know all the Products I don't get them from Domain.ProductCollection but ask the repository instead. The same goes for asking whether the Domain.ProductCollection contains a Product with the passed SKU: again I ask the repository (this is actually the example), which queries the underlying database instead of iterating over pre-loaded products. I don't mean to store all Products in memory unless I have to; doing so would be complete nonsense.
    – Andy
    Commented May 17, 2016 at 13:42
  • Which leads to another question: should the repository then know whether an attribute should be unique? I have always implemented repositories as pretty dumb components that save what you pass them and try to retrieve data based on the passed conditions, and I have put the decision into services.
    – Andy
    Commented May 17, 2016 at 13:44
  • 1
    If there's no aggregate involved then there's always the possibility of a race condition and a db unique constraint is almost the only solution in that case.
    – plalx
    Commented May 21, 2016 at 7:34
4

You need to read Greg Young on set validation.

Short answer: before you go too far down the rat's nest, you need to make sure that you understand the value of the requirement from the business perspective. How expensive is it, really, to detect and mitigate the duplication, rather than preventing it?

The problem with “uniqueness” requirements is that, well, very often there’s a deeper underlying reason why people want them -- Yves Reynhout

Longer answer: I've seen a menu of possibilities, but they all have tradeoffs.

You can check for duplicates before sending the command to the domain. This can be done in the client, or in the service (your example shows the technique). If you aren't happy with the logic leaking out of the domain layer, you can achieve the same sort of result with a DomainService.

class Product {
    void register(SKU sku, DuplicationService skuLookup) {
        // ask the domain service whether the sku is already taken
        if (skuLookup.isKnownSku(sku)) {
            throw new ProductWithSKUAlreadyExistsException(...)
        }
        ...
    }
}

Of course, done this way, the implementation of the DuplicationService is going to need to know something about how to look up the existing skus. So while it pushes some of the work back into the domain, you are still faced with the same basic problems (needing an answer for the set validation, problems with race conditions).

You can do the validation in your persistence layer itself. Relational databases are really good at set validation. Put a uniqueness constraint on the sku column of your product, and you are good to go. The application just saves the product into the repository, and you get a constraint violation bubbling back up if there is a problem. So the application code looks good, and your race condition is eliminated, but you've got "domain" rules leaking out.
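As a rough illustration of what the constraint buys you, here is a small Java sketch. It uses no real database: a ConcurrentHashMap stands in for a table with a unique index on sku, and the names ConstraintBackedProductStore, StoredProduct and SkuAlreadyTakenException are made up for the example. With a real relational database the violation would come back from the driver instead (with JDBC, typically as some SQLException, depending on the driver).

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical exception the persistence layer raises on a constraint violation.
class SkuAlreadyTakenException extends RuntimeException {
    SkuAlreadyTakenException(String sku) {
        super("Unique constraint violated for SKU " + sku);
    }
}

final class StoredProduct {
    final String sku;
    final String name;

    StoredProduct(String sku, String name) {
        this.sku = sku;
        this.name = name;
    }
}

// Assumption: the map plays the role of a table with a unique index on sku.
final class ConstraintBackedProductStore {
    private final ConcurrentMap<String, StoredProduct> rowsBySku = new ConcurrentHashMap<>();

    void insert(StoredProduct product) {
        // putIfAbsent checks and inserts atomically, so two concurrent
        // inserts with the same SKU cannot both succeed.
        StoredProduct existing = rowsBySku.putIfAbsent(product.sku, product);
        if (existing != null) {
            throw new SkuAlreadyTakenException(product.sku);
        }
    }

    int rowCount() { return rowsBySku.size(); }
}

class ConstraintDemo {
    public static void main(String[] args) {
        ConstraintBackedProductStore store = new ConstraintBackedProductStore();
        store.insert(new StoredProduct("ABC-1", "Widget"));
        try {
            store.insert(new StoredProduct("ABC-1", "Another widget"));
        } catch (SkuAlreadyTakenException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The application code never asks whether the SKU exists; it just inserts and handles the failure, which is why the race condition goes away.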

You can create a separate aggregate in your domain that represents the set of known skus. I can think of two variations here.

One is something like a ProductCatalog; products exist somewhere else, but the relationship between products and skus is maintained by a catalog that guarantees sku uniqueness. Note that this implies that products don't have skus; skus are assigned by a ProductCatalog (if you need skus to be unique, you achieve this by having only a single ProductCatalog aggregate). Review the ubiquitous language with your domain experts -- if such a thing exists, this could well be the right approach.
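A minimal Java sketch of such a catalog might look like the following. The method names (assignSku, productFor), the exception, and the use of a UUID as the product identity are assumptions made for the example, not something the domain language here prescribes.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import java.util.UUID;

// Hypothetical domain exception for a sku that already belongs to another product.
class SkuAlreadyAssignedException extends RuntimeException {
    SkuAlreadyAssignedException(String sku) {
        super("SKU " + sku + " is already assigned to another product");
    }
}

// The single aggregate that owns the sku-to-product relationship.
final class ProductCatalog {
    private final Map<String, UUID> productIdBySku = new HashMap<>();

    // Assigns the sku to the product. Repeating the same assignment is
    // accepted; assigning the sku to a different product is rejected.
    void assignSku(String sku, UUID productId) {
        UUID owner = productIdBySku.putIfAbsent(sku, productId);
        if (owner != null && !owner.equals(productId)) {
            throw new SkuAlreadyAssignedException(sku);
        }
    }

    Optional<UUID> productFor(String sku) {
        return Optional.ofNullable(productIdBySku.get(sku));
    }
}

class CatalogDemo {
    public static void main(String[] args) {
        ProductCatalog catalog = new ProductCatalog();
        UUID widget = UUID.randomUUID();
        catalog.assignSku("ABC-1", widget);
        try {
            catalog.assignSku("ABC-1", UUID.randomUUID());
        } catch (SkuAlreadyAssignedException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The products themselves only carry an identity here; everything about skus goes through the catalog.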

An alternative is something more like a sku reservation service. The basic mechanism is the same: an aggregate knows about all of the skus, so can prevent the introduction of duplicates. But the mechanism is slightly different: you acquire a lease on a sku before assigning it to a product; when creating the product, you pass it the lease to the sku. There's still a race condition in play (different aggregates, therefore distinct transactions), but it's got a different flavor to it. The real downside here is that you are projecting into the domain model a leasing service without really having a justification in the domain language.
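Sketched in Java, the lease variant could look roughly like this. SkuReservations, its nested Lease type and LeasedProduct are hypothetical names, and the sketch leaves out everything a real lease would need beyond the basic handshake (expiry, release, persistence).

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical exception for a sku that is already leased.
class SkuAlreadyLeasedException extends RuntimeException {
    SkuAlreadyLeasedException(String sku) {
        super("SKU " + sku + " is already leased");
    }
}

// Aggregate that knows about every sku handed out so far.
final class SkuReservations {

    // A lease can only be created by SkuReservations (private constructor),
    // so holding one is proof that the sku was reserved.
    static final class Lease {
        private final String sku;
        private Lease(String sku) { this.sku = sku; }
        String sku() { return sku; }
    }

    private final Set<String> leasedSkus = new HashSet<>();

    Lease acquire(String sku) {
        // Set.add returns false when the sku was already present.
        if (!leasedSkus.add(sku)) {
            throw new SkuAlreadyLeasedException(sku);
        }
        return new Lease(sku);
    }
}

// The product is created from a lease rather than from a raw sku.
final class LeasedProduct {
    private final String sku;
    private final String name;

    private LeasedProduct(String sku, String name) {
        this.sku = sku;
        this.name = name;
    }

    static LeasedProduct create(SkuReservations.Lease lease, String name) {
        return new LeasedProduct(lease.sku(), name);
    }

    String sku() { return sku; }
    String name() { return name; }
}

class LeaseDemo {
    public static void main(String[] args) {
        SkuReservations reservations = new SkuReservations();
        SkuReservations.Lease lease = reservations.acquire("ABC-1");
        LeasedProduct product = LeasedProduct.create(lease, "Widget");
        System.out.println(product.name() + " has sku " + product.sku());
    }
}
```

Acquiring the lease and creating the product remain two separate steps, which is where the differently flavored race condition mentioned above comes from.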

You can pull all product entities into a single aggregate, i.e. the product catalog described above. You absolutely get uniqueness of the skus when you do this, but the cost is additional contention: modifying any product really means modifying the entire catalog.

I don't like the need to pull all product SKUs out of the database to do the operation in-memory.

Maybe you don't need to. If you test your sku with a Bloom filter, you can discover many unique skus without loading the set at all.

If your use case allows you to be arbitrary about which skus you reject, you could punt away all of the false positives (not a big deal if you allow the clients to test the skus they propose before submitting the command). That would allow you to avoid loading the set into memory.

(If you wanted to be more accepting, you could consider lazy loading the skus in the event of a match in the Bloom filter; you still risk loading all the skus into memory sometimes, but it shouldn't be the common case if you allow the client code to check the command for errors before sending.)
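For illustration, a toy Bloom filter in Java could look like the sketch below. The class name SkuBloomFilter, the sizing parameters, and the way the bit positions are derived from String.hashCode are all assumptions of the example; a production filter would want better hash functions and sizing based on the expected number of skus.

```java
import java.util.BitSet;

// Toy Bloom filter over sku strings (hypothetical, for illustration only).
final class SkuBloomFilter {
    private final BitSet bits;
    private final int size;
    private final int hashCount;

    SkuBloomFilter(int size, int hashCount) {
        if (size <= 0 || hashCount <= 0) {
            throw new IllegalArgumentException("size and hashCount must be positive");
        }
        this.bits = new BitSet(size);
        this.size = size;
        this.hashCount = hashCount;
    }

    // Records the sku by setting one bit per hash function.
    void add(String sku) {
        for (int i = 0; i < hashCount; i++) {
            bits.set(indexFor(sku, i));
        }
    }

    // false: the sku was definitely never added.
    // true:  the sku was probably added, but it may be a false positive.
    boolean mightContain(String sku) {
        for (int i = 0; i < hashCount; i++) {
            if (!bits.get(indexFor(sku, i))) {
                return false;
            }
        }
        return true;
    }

    // Derives the i-th bit position from two values based on hashCode
    // (simple double hashing; the | 1 keeps the step from being zero).
    private int indexFor(String sku, int i) {
        int h1 = sku.hashCode();
        int h2 = (h1 >>> 16) | 1;
        return Math.floorMod(h1 + i * h2, size);
    }
}

class BloomDemo {
    public static void main(String[] args) {
        SkuBloomFilter knownSkus = new SkuBloomFilter(1 << 16, 3);
        knownSkus.add("ABC-1");
        String proposed = "ABC-1";
        if (knownSkus.mightContain(proposed)) {
            System.out.println("Rejecting " + proposed + " (possibly a duplicate)");
        } else {
            System.out.println("Accepting " + proposed + " (definitely new)");
        }
    }
}
```

A "no" from the filter is always trustworthy, so the strict policy described above is simply to reject whenever mightContain returns true. The more accepting policy would fall back to a real lookup in that case.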

3
  • I don't like the need to pull all product SKUs out of the database to do the operation in-memory. It's the obvious solution I suggested in the initial question, where I questioned its performance. Also, IMO, relying on a constraint in your DB to be responsible for the uniqueness is bad. If you were to switch to a new database engine and the unique constraint somehow got lost during a transformation, you would have broken code, because the guarantee the database used to provide is no longer there. Anyway, thank you for the link, interesting read.
    – Andy
    Commented May 17, 2016 at 15:23
  • Perhaps a Bloom filter (see edit).
    Commented May 19, 2016 at 14:04
  • 1
    Loading all product SKUs out of the db is not even a solution. In a concurrent scenario, when two services are trying to create a new product with the same SKU, both might read all products, validate uniqueness, and insert duplicate SKUs.
    – tchelidze
    Commented Oct 16, 2020 at 17:18
