18

To clarify the question, here is my context (or something very similar).

I have an interface that I call IDataSource. The implementing classes contain the information needed to retrieve data. So I have multiple classes implementing it; let's say FileSource, DatabaseSource and WebServiceSource. So far so good.

Now, depending on the version of the program, I will need to apply some treatment to these sources' data, like decrypting, reading the first few bytes (a signature), and anything else you can imagine.

What I wanted to do was have a coordinating class that takes IDataSource objects, checks their actual type (FileSource and such), and performs the appropriate data treatment before doing stuff with the final data.
For example, I would have a FileSource that requires no treatment, and an EncryptedFileSource that requires decrypting, but the classes themselves are completely equivalent (exact same properties) in a vacuum.

So the question becomes: is it OK to have EncryptedFileSource inherit FileSource, which implements IDataSource, and in my coordinating class, check whether the IDataSource object's type is FileSource or EncryptedFileSource, and perform the decryption in the second case?

Or is there a better way to do it?

8
  • 62
    Having the coordinating class that checks concrete types is really the problem here. Commented Sep 7, 2020 at 12:57
  • 22
Yeah, this is a textbook case for the Replace Conditional With Polymorphism refactoring. As in, if I were to write a textbook on refactoring, I couldn't come up with a better example case myself if I wanted to. Commented Sep 7, 2020 at 13:28
  • 19
    "What I wanted was a coordinating class, that will take objects, check their actual type, and perform the appropriate data treatment" - that's precisely what the open-closed principle is trying to avoid!
    – Bergi
    Commented Sep 7, 2020 at 22:06
  • 2
@Bergi A plugin system is the solution for OCP in this example. (It's mentioned inside the article you linked.) "Tell, don't ask" doesn't actually solve the OCP issue, as far as file handling is concerned.
    – rwong
    Commented Sep 8, 2020 at 0:07
  • 1
    @JörgWMittag I thought all polymorphism examples were mandated to be either animals or coffee pots!!
    – corsiKa
    Commented Sep 8, 2020 at 5:43

8 Answers

51

No. Emphatic no.

Unless I misunderstood you, you propose to subclass for different behavior, but not actually put that behavior in the subclass. Instead, an outside actor checks the exact type and performs the needed behavior. That is not OK.

If you do have an EncryptedFileSource, you have to have the decryption/encryption in that class. If your design doesn't allow that, you'll have to change your design.

Also, inheritance is the wrong tool here even if the behavior were there.
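As a rough sketch of what "the behavior lives in the class" could look like (the `Read` method, the constructors, and the XOR "cipher" are illustrative placeholders, not the asker's actual API):

```csharp
using System;
using System.Linq;
using System.Text;

public interface IDataSource
{
    byte[] Read();  // every source hands back ready-to-use bytes
}

public class FileSource : IDataSource
{
    protected readonly byte[] raw;
    public FileSource(byte[] raw) { this.raw = raw; }

    // Plain source: nothing to do.
    public virtual byte[] Read() => raw;
}

public class EncryptedFileSource : FileSource
{
    private readonly byte key;
    public EncryptedFileSource(byte[] raw, byte key) : base(raw) { this.key = key; }

    // The decryption lives here, so no caller ever needs a type check.
    // (XOR stands in for a real cipher.)
    public override byte[] Read() => raw.Select(b => (byte)(b ^ key)).ToArray();
}

public static class Demo
{
    public static void Main()
    {
        byte key = 0x2A;
        byte[] cipherText = Encoding.ASCII.GetBytes("hi")
                                    .Select(b => (byte)(b ^ key)).ToArray();

        IDataSource plain = new FileSource(Encoding.ASCII.GetBytes("hi"));
        IDataSource enc = new EncryptedFileSource(cipherText, key);

        // The coordinator treats both identically:
        Console.WriteLine(Encoding.ASCII.GetString(plain.Read())); // hi
        Console.WriteLine(Encoding.ASCII.GetString(enc.Read()));   // hi
    }
}
```

The coordinator now calls `Read()` on any IDataSource and never inspects concrete types.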

6
  • 2
Exactly. As a developer I just want to call e.g. read() and get readable data. If the source is encrypted, the class should decrypt the data before giving it to me. If you want to use the same decryption code everywhere, you'd have a separate decrypt class. And if you want to first read the data and then decrypt it, you would put that in a different class. While what the OP describes is possible, it doesn't look like an optimal design. But! In case you need to implement something with minimal changes, which is often the case, then ... whatever ¯\_(ツ)_/¯ Commented Sep 7, 2020 at 21:26
A Decorator can be used for two purposes: representation and implementation. In both usages, the decorator should ensure that the responsibility (to decrypt, decompress, or decode) is fulfilled when used appropriately. However, that doesn't mean the code that implements the task (e.g. decryption algorithms) needs to sit in the Decorator classes themselves. Most likely, those algorithms come from another library (possibly a system-level library), and the Decorator merely delegates the mechanics to them.
    – rwong
    Commented Sep 8, 2020 at 0:11
  • 4
@rwong I agree. Obviously you should not re-implement cryptography in the class; you can use libraries and collaborator objects. The point is that conceptually it needs to be in there. You cannot expect that the caller will take care of something that you promised. Commented Sep 8, 2020 at 11:03
  • 2
    I agree with your conclusion but your justification is extremely weak and nonsensical and makes this answer useless. You should explain WHY this is bad, not merely say that it's bad because you said so. Commented Sep 8, 2020 at 22:19
  • 1
    @RobertBräutigam why is inheritance the wrong tool? What would the right tool be? Commented Sep 9, 2020 at 23:24
13

Sure, this can be an appropriate way to model your classes. The EncryptedFileSource is not quite empty. It changes the type that the file source object is “tagged” with, and in cases like yours that can be relevant.

However, using the type system to add tags to an object is not always appropriate. Most importantly, the tag must be known at the time when that object is constructed. In most languages, you can't change the type later. Also, things get complicated when there are multiple tags that can be combined – you'd need a class for each combination of tags.

Instead of modelling such tags in the type system, it can be more appropriate to model them more explicitly. For example, you might separate tags from operations on the data source:

class Source {
  public IDataSource DataSource;
  public bool IsEncrypted;

  ... // and some helper methods that take the IsEncrypted status into account
}

Or perhaps these tags should be modelled explicitly as part of the IDataSource interface.

In some cases, a Decorator Pattern could be more appropriate when you just want to wrap methods in an interface with extra functionality, but not add an externally visible tag:

class EncryptedSource : IDataSource {
  private IDataSource source;
  ... // forward method calls to the inner `source`
}

The idea to have a coordinator object that checks the type of the objects it is passed sounds natural, but results in very fragile programs: this violates most interpretations of the open/closed principle because the coordinator has hard-coded support for specific types. It is usually more appropriate to move those operations into the IDataSource interface so that they can be overridden, or by extending IDataSource to implement the Visitor Pattern. For example:

interface IDataSource {
  T Accept<T>(IVisitor<T> v);
}

interface IVisitor<T> {
  T DoNormalThing(IDataSource s);
  T DoEncryptedThing(IDataSource s);
}

class FileSource : IDataSource {
  ...
  public T Accept<T>(IVisitor<T> v) { return v.DoNormalThing(this); }
}

class EncryptedSource : IDataSource {
  ...
  public T Accept<T>(IVisitor<T> v) { return v.DoEncryptedThing(this); }
}

Of course, if all available data source types are known up front, then the OCP doesn't really apply and hard-coding support for certain types can be an acceptable tradeoff. Especially if you're writing an application and not a library, strictly following the OCP can be overkill and it can be more sensible to optimize for refactorability: when you can change the design in a future release, a simple solution will do for now, no fancy design patterns required.

8

I'd like to approach this problem differently. Instead of starting with principles, let's start with architecture, because honestly the bigger picture gives you a better answer and lets you adhere to principles more easily.


Implementing encryption at the data access layer implies that the same encryption could be applied to multiple kinds of data access. It could just as easily be used with the FileSource as the WebServiceSource. If you find that you cannot interchange the encryption with different data sources, then the encryption is likely specific to that data source.

Encryption usually happens just before the transport layer of an application (over a network, or over an I/O bus in the computer). It involves incompatible network protocols that may implement the same encryption algorithm but initialize and configure it differently. Reusing this code often introduces an abstraction that requires more effort to understand than just copying and pasting little bits of code.

This makes encryption specific to the data source implementation. I'm not so sure you even need an "encrypted source" anything at this layer of the application. You are better off putting encryption-specific code in each data source.

This solves your original problem with where to put EncryptedFileSource. You do not need it. It is an implementation detail of FileSource. If WebServiceSource needs encryption too, the code will likely be just different enough that a class specializing in data source encryption is not that useful.
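A minimal sketch of that idea, with encryption folded into FileSource as a private detail (the nullable key and the XOR stand-in cipher are assumptions for illustration):

```csharp
using System;
using System.Linq;

public interface IDataSource
{
    byte[] Read();
}

// Encryption is an implementation detail of this source, not a public subtype.
public class FileSource : IDataSource
{
    private readonly byte[] contents;
    private readonly byte? key;  // null => the file is stored in the clear

    public FileSource(byte[] contents, byte? decryptionKey = null)
    {
        this.contents = contents;
        this.key = decryptionKey;
    }

    // Callers never learn whether decryption happened.
    public byte[] Read() =>
        key is byte k
            ? contents.Select(b => (byte)(b ^ k)).ToArray()  // XOR stands in for a real cipher
            : contents;
}
```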

4

Can't say I like it. I'd create an EncryptedSource, which has an IDataSource as a member, reads from that IDataSource, and handles all the decryption. That way you can easily decrypt data that is stored in your database or on the web in an encrypted format.

Or you might want a data source that is an image in TIFF format. So you create a TiffSource, again with an IDataSource as a member, which uses the IDataSource to read the start of the data and checks the format. So you now have a data source that can handle TIFF images stored in files, database, internet, and everything encrypted if needed.

Very similar to having filters that you can combine.
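A sketch of this filter-style composition, with hypothetical decorators (the XOR "decryption" and the two-byte TIFF check are simplifications):

```csharp
using System;
using System.Linq;

public interface IDataSource
{
    byte[] Read();
}

public class FileSource : IDataSource
{
    private readonly byte[] bytes;
    public FileSource(byte[] bytes) { this.bytes = bytes; }
    public byte[] Read() => bytes;
}

// Decorator: decrypts whatever its inner source yields.
public class EncryptedSource : IDataSource
{
    private readonly IDataSource inner;
    private readonly byte key;
    public EncryptedSource(IDataSource inner, byte key) { this.inner = inner; this.key = key; }
    public byte[] Read() => inner.Read().Select(b => (byte)(b ^ key)).ToArray(); // XOR as a stand-in
}

// Decorator: validates a TIFF signature before passing the data on.
public class TiffSource : IDataSource
{
    private readonly IDataSource inner;
    public TiffSource(IDataSource inner) { this.inner = inner; }
    public byte[] Read()
    {
        var data = inner.Read();
        // "II" (0x49 0x49) opens a little-endian TIFF header.
        if (data.Length < 2 || data[0] != 0x49 || data[1] != 0x49)
            throw new InvalidOperationException("Not a TIFF stream.");
        return data;
    }
}

// The composition then reads like a filter chain:
// new TiffSource(new EncryptedSource(new FileSource(bytes), key))
```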

4

On the practical level, I side with Greg Burghardt's answer, because it appears to me that network protocols are designed to take this decision out of the hands of application developers. Compression and encryption tend to be taken care of by the transport layer, hence names such as "Transport Layer Security (TLS)", for real reasons such as ensuring that neither compression nor information security is compromised.


My original answer below focuses on the object-oriented design issue, and is intended to deal with a generic stream of (possibly encoded) byte data, e.g. data represented by a Stream, in the context of matching that data stream with one or more file stream transformers or decoders that will work together to decode the file.


Your question highlights a tension (struggle) between two approaches:

(1) the code behavior should be driven by wrapping the data class with a type such as EncryptedFileSource (as opposed to an ordinary IDataSource).

(2) the code behavior should be driven by observing the data that can be read off on the instance, for example, extracting the "magic bytes" from the beginning of the data stream.

As far as file handling is concerned (where the set of file formats is an open set, i.e. unconstrained), the second approach tends to produce the correct behavior needed by the application. Once the correct behavior is picked using the second approach, it can be indicated using the type system via the first approach.


To understand why this is the case, let's consider the situation where a data source is unencrypted, but the programmer somehow wraps that data source within an EncryptedFileSource. Should the program try to run the decryption code on it?


To prevent misuse of the first approach, it is important that the responsibility for analyzing the actual object content (approach 2) and creating the object wrappers (approach 1) be combined into one design pattern, and the implementation of that design pattern needs to be well-tested. The end result is a Decorator Factory, already mentioned in amon's answer.

The Decorator Factory would be given a data source, examine the first few bytes and extract the magic number, determine the type of Decorator needed to "unwrap" it, and instantiate that Decorator. The Decorator Factory may need to repeat the process until the end result is fully decoded. For example, if the data is compressed and then encrypted, it may need to instantiate a first Decorator to decrypt, and then instantiate a second Decorator to decompress.
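A compressed sketch of such a Decorator Factory, here recognizing only a gzip layer (the handler table and the full-`Read` header sniffing are simplifications; a real factory would peek at the stream instead):

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;
using System.Linq;

public interface IDataSource
{
    byte[] Read();
}

public class MemorySource : IDataSource
{
    private readonly byte[] bytes;
    public MemorySource(byte[] bytes) { this.bytes = bytes; }
    public byte[] Read() => bytes;
}

// Decorator: decompresses a gzip layer produced by an inner source.
public class GzipDecompressingSource : IDataSource
{
    private readonly IDataSource inner;
    public GzipDecompressingSource(IDataSource inner) { this.inner = inner; }
    public byte[] Read()
    {
        using var input = new MemoryStream(inner.Read());
        using var gz = new GZipStream(input, CompressionMode.Decompress);
        using var output = new MemoryStream();
        gz.CopyTo(output);
        return output.ToArray();
    }
}

public static class DecoratorFactory
{
    // Each entry: does this header match? -> which decorator unwraps it?
    private static readonly List<(Func<byte[], bool> Matches,
                                  Func<IDataSource, IDataSource> Wrap)> handlers = new()
    {
        // gzip magic number: 0x1F 0x8B
        (d => d.Length >= 2 && d[0] == 0x1F && d[1] == 0x8B,
         s => new GzipDecompressingSource(s)),
    };

    // Keep peeling layers until no handler recognizes the header.
    public static IDataSource Resolve(IDataSource source)
    {
        while (true)
        {
            var header = source.Read();  // simplification: reads everything to sniff the header
            var handler = handlers.FirstOrDefault(h => h.Matches(header));
            if (handler.Wrap == null) return source;
            source = handler.Wrap(source);
        }
    }
}
```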


It just happens that, when we solve the problem with a Decorator Factory, we will not need to perform any downcasting (type-checking against concrete subtypes), which is widely considered a code smell (a reason for concern, or an undesirable trait).

However, some of the solutions to the downcasting code-smell are inappropriate.

In particular, as far as file format handling and transformation is concerned, it would not be appropriate to add an IsEncrypted flag to the abstract interface IDataSource.

To explain why it is unnecessary and inappropriate to add flags to the abstract interface to solve a situation where Decorator is more appropriate, I will borrow an example from the Decorator chapter of the book, Head First Design Patterns.

Consider the abstract ICoffee interface. Demanding that an ICoffee interface be telling (as in "tell, don't ask") requires us to have all of these as properties on ICoffee: GramsOfCocoa, PacketsOfSplenda, MeasuresOfAgaveSyrup, MeasuresOfCaneSyrup, NumEspressoShots, WithIce, Blended, Microwaved, MeasuresOfMintSyrup, and so on. The potentially unbounded nature of add-on attributes would make it undesirable to tell such information on the parent abstract interface.


In practice, "Tell, don't ask" is usually preempted by yet another adage: "Show, don't tell".


In the case of handling compressed and encrypted files, to show is to perform the decompression and decryption on behalf of the consumer of the data, so that the consumer of the data need not be concerned about the work of decompressing or decrypting. This is what Decorator pattern is best for.

In contrast to the Decorator example from the Head First Design Patterns chapter, where the application hard-codes the construction of a "Russian-doll" of multiple Decorator instances around a base Coffee class, the Decorator Factory pattern is needed for the file handling scenario, because the construction of the Decorator instances is driven by the data, which can only be examined at runtime (while the application is running).


Even though it may be inappropriate to expose a descriptor for unbounded combinations on the abstract interface, it might still be useful sometimes. However, keep in mind that the descriptor might be incomplete, that there are certain combinations of data that cannot be fully described by such a descriptor. For example, a data stream that is compressed twice; or encrypted twice; or encrypted and then compressed (despite a programmer's common sense).

Rest assured that these quirky data sources can still be handled correctly by a Decorator Factory. The consumer will still get to the final data after the processing of multiple Decorators, even if the handling is too strange for the descriptor to represent.

In terms of file handling, there is one example where a descriptor is appropriate even though it is theoretically incomplete: the MIME type. One can always concoct some combination of file formats or data transformations that has no corresponding MIME type. MIME types resolve this incompleteness issue by disallowing such creations: an application is programmed to support a known, finite set of MIME types, and anything outside that set doesn't need to be supported by the application.


Going back to the Open-Closed Principle. What problem was it intended to address?

  • Making sure that, when new situations arise, one can extend the behavior of the existing system by writing new classes and plugging them into the existing system, without having to modify the code of any objects (classes) already in the existing system.
  • Making sure that, when we extend the system, the outcome is not fragile overall, such as ensuring that every combination of situations that may arise, every code path that is actually possible, are handled correctly.

Downcasting is often said to be the ultimate code smell that predicts whether a system will become fragile after extension or modification. Moreover, the following code sample captures the problem succinctly:

switch (fruit)
{
    case Genus.Citrus citrus:
        return new Peelers.CitrusPeeler(citrus);
    case Genus.Vitis vitis:
        return new Peelers.VitisPeeler(vitis);
    default:
        // you may replace this line with return null,
        // but it doesn't resolve the problem.
        throw new NotSupportedException("Unknown fruit.");
}

The indictment that this code violates OCP is based on its inability to handle new object types without the code above being modified. Passing a new object type to this code leads to the exception being thrown or, worse, the object being silently ignored.

Techniques that remove the need for downcasting remove that smell, but they do not always resolve an OCP issue.

At a library or framework level, for a task such as file handling, the OCP issue is not resolved unless the library or framework implements a Plugin system allowing its users to register new handlers for detecting and decoding new file types. It's the only solution compliant with OCP for this task.

The Plugin system consists of a Decorator factory that allows new Decorator classes to be registered, each with their unique file header (magic number) detection. This is how most image file format handlers work.
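A minimal sketch of such a plugin registry (the registry API and the PNG example are hypothetical):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// A minimal plugin registry: users extend the system by registering new
// (detector, decoder) pairs instead of modifying any existing switch.
public static class FormatRegistry
{
    private static readonly List<(Func<byte[], bool> Detect,
                                  Func<byte[], byte[]> Decode)> plugins = new();

    public static void Register(Func<byte[], bool> detect, Func<byte[], byte[]> decode)
        => plugins.Add((detect, decode));

    // Apply the first plugin whose magic-number check matches; otherwise pass through.
    public static byte[] Decode(byte[] data)
    {
        var plugin = plugins.FirstOrDefault(p => p.Detect(data));
        return plugin.Decode != null ? plugin.Decode(data) : data;
    }
}

// Adding support for a new format later touches no existing code, e.g.:
// FormatRegistry.Register(
//     d => d.Length >= 2 && d[0] == 0x89 && d[1] == 0x50,  // PNG signature starts 0x89 'P'
//     d => DecodePng(d));                                   // DecodePng is hypothetical
```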

Removing the downcasting by blindly applying "Tell, don't Ask" simply shifts the OCP issue elsewhere.


There are a number of do's and don'ts when applying the Decorator Factory pattern to file handling. It would take too long to cover them here; hopefully this answer gives you a head start.


Streams coming from a network connection, a decryption/encryption algorithm, or a decompression/compression algorithm are very likely to be non-rewindable. This means, for example, that after reading the first few bytes for the file header, the stream cannot be reset in order to be used in the actual file processing. The framework must therefore insert a buffered or spooled stream (as a Decorator) between any two steps where a rewind operation is needed but not natively supported. All Stream objects must correctly report their rewindability (via Stream.CanSeek). If a particular library implementation of Stream lies about this, it is the framework's responsibility to immediately wrap such a non-compliant stream with either a lightweight Stream that adds no behavior except returning Stream.CanSeek => false, or a properly buffered or spooled Stream around the original.
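A sketch of the buffering step described above, using .NET's Stream API (the in-memory spooling is a simplification; a production framework would spill large streams to a temp file):

```csharp
using System.IO;

public static class StreamGuard
{
    // Guarantee that downstream code can rewind after sniffing a file header.
    // Simplification: spools everything into memory.
    public static Stream EnsureSeekable(Stream stream)
    {
        if (stream.CanSeek) return stream;

        var buffer = new MemoryStream();
        stream.CopyTo(buffer);
        buffer.Position = 0;  // rewind so the caller starts at the header
        return buffer;
    }
}
```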

Note that buffering or spooling the stream data has data-security implications. For example, if the stream is big (more than a few megabytes) and temporarily uses the disk, care must be taken to secure the data from theft if untrusted applications running on the same computer can read parts of it from the disk. (This is an issue for servers only, not end-user computers.)

1

I don't much like the idea of naming a class after functionality it does not provide. But apart from that, the most similar solution to what you describe is a marker interface. So, for example, your FileSource could implement an Encryptable interface. You could also add one or more marker interfaces dynamically by instantiating anonymous inner classes; this way you could keep a shallow inheritance tree.

Another quick solution would be adding some flags and their getters to the root class, e.g. isEncryptable(); the flags could be set in the constructor.
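A sketch of the marker-interface idea in C# terms (the IEncryptable name and the empty stub bodies are illustrative):

```csharp
public interface IDataSource
{
    byte[] Read();
}

// A marker interface carries no behavior; it only tags a type.
public interface IEncryptable { }

public class FileSource : IDataSource
{
    public virtual byte[] Read() => System.Array.Empty<byte>();  // stub body
}

// The tag rides along in the type system, keeping the inheritance tree shallow:
public class EncryptedFileSource : FileSource, IEncryptable { }

public static class Coordinator
{
    // The coordinator tests the marker, not the concrete class:
    public static bool NeedsDecryption(IDataSource source) => source is IEncryptable;
}
```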

0

Yes, it's appropriate to have some child classes just to differentiate some cases.
The choice between adding a child class and adding an isEncrypted property can be guided by this question: could this child gain some additional methods or properties?

In your case, I would say definitely yes. You would want methods to convert an EncryptedFileSource to a FileSource and a FileSource to an EncryptedFileSource. It's a good idea to place both of them in EncryptedFileSource, so hey, it does have differences from FileSource!

But the idea that some coordinator makes decisions based on object type is bad. It's better to use polymorphism, as amon illustrated in their answer.
For example, somewhere you could write val text = dataSource.getReadableData(), and in EncryptedFileSource this method would be something like

fun getReadableData() { return decryptData(getData()) }

while in FileSource it would simply be

fun getReadableData() { return getData() }

Here you declare getReadableData on the interface and let each concrete class decide what to do.

1
  • This is a terrible idea without some kind of guarantee that an EncryptedFileSource is actually encrypted. Commented Sep 8, 2020 at 22:24
0

Yes. Have you seen how many times Exception has been subclassed just to add a new type of exception?

  • Open for extension (in this case, the extension is adding a type)
  • Closed for modification (in this case, you didn't change Exception's behavior)

I'd say you are well within the Open-Closed principle guidelines.
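For illustration, the Exception pattern mentioned above looks like this in C# (DataSourceException is a made-up name):

```csharp
using System;

// Subclassing purely to introduce a distinguishable type: no new behavior,
// just a new name that callers can catch.
public class DataSourceException : Exception
{
    public DataSourceException(string message) : base(message) { }
}
```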

Now, if you end up with too many base types, you might want to consolidate them, in the spirit of "You Aren't Gonna Need It", into fewer, more general base types. But that's a completely different principle.
