5

What are some commonly used strategies for dividing software into modules, other than avoiding cyclic dependencies between modules? Some ways I can think of:

  1. Group everything that relates to a set of related classes into one module. I could have one module for 1d data (curves), and one module for 2d data (images). It may happen that multiple types support a particular operation. In this case, both a curve and an image can be interpolated.

  2. Group everything that relates to a particular set of operations into one module. This way, the interpolation variants would live in one module, min/max search in another and so on.

In general, a module should group related functionality, but what exactly is related?

Option (1) has the benefit of not having to update any algorithm library when a new type is introduced. At the same time, some algorithm implementations are directly applicable to multiple types, which makes this approach less attractive.

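Option (2) is naturally the opposite of option (1): an algorithm that applies to several types lives in one place, but the operation modules may need updating when a new type is introduced.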
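To make the two options concrete, here is a minimal sketch. All names (`Curve`, `Image`, `lerp`, `interpolate_curve`, `interpolate_image`) and the file layout in the comments are hypothetical, and it assumes samples sit at integer positions with at least two samples per axis. It shows the case where an algorithm written for one type is directly reusable for another:

```python
from dataclasses import dataclass

# Option (1): one module per data type, each carrying its own operations.
#   curves.py        -> Curve, interpolate_curve, ...
#   images.py        -> Image, interpolate_image, ...
#
# Option (2): one module per operation, shared across types.
#   data.py          -> Curve, Image
#   interpolation.py -> lerp, interpolate_curve, interpolate_image
#   extrema.py       -> min/max search


@dataclass
class Curve:
    samples: list[float]      # values at positions 0, 1, 2, ...


@dataclass
class Image:
    rows: list[list[float]]   # rows[y][x]


def lerp(a: float, b: float, t: float) -> float:
    """Linear blend between a and b for t in [0, 1]."""
    return a + (b - a) * t


def interpolate_curve(curve: Curve, x: float) -> float:
    """Linear interpolation, assuming 0 <= x <= len(samples) - 1."""
    i = min(int(x), len(curve.samples) - 2)
    return lerp(curve.samples[i], curve.samples[i + 1], x - i)


def interpolate_image(image: Image, x: float, y: float) -> float:
    """Bilinear interpolation built on top of the 1d routine."""
    j = min(int(y), len(image.rows) - 2)
    top = interpolate_curve(Curve(image.rows[j]), x)
    bottom = interpolate_curve(Curve(image.rows[j + 1]), x)
    return lerp(top, bottom, y - j)


print(interpolate_image(Image([[0.0, 10.0], [20.0, 30.0]]), 0.5, 0.5))  # 15.0
```

With option (1), `interpolate_image` in the image module would have to import from the curve module (or duplicate `lerp`). With option (2), both live side by side in the interpolation module.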

3
  • What does "module" mean to you? There are many different ways to modularise code, and different programming languages or ecosystems use the term "module" for different things. It may refer to a file, a package, a library, an object…
    – Bergi
    Commented Jan 15 at 9:20
  • Related concepts: coupling and cohesion. Commented Jan 15 at 10:34
  • @Bergi: I think it is clear enough what the OP means to make this answerable.
    – Doc Brown
    Commented Jan 16 at 6:38

5 Answers

7

Both of your examples could be valid groupings to create modules.

Besides grouping related functionality, modules have another, possibly more important, property: they communicate a concept to others within your organization, a concept at a higher abstraction level than a class or a function.

In that sense, it is equally valid to have a module for Nd data structures and their operations (including both 1d and 2d data structures).

The important part about modules is that you can talk with a fellow architect about the X module and they understand

  • what is in that module, even if they have never seen the code and don't know the exact classes/functions that make up the module
  • how it makes your life and that of your team easier by grouping the functionality in that particular way.

You can look at creating modules as similar to deciding what concept a class should describe, only at a higher abstraction level.

6

Others have posted good reasons, such as communications, making code "easy to find", and mental models. There are also technical reasons for dividing code into modules. Though the Single Responsibility Principle is the current fave, and it isn't wrong if properly understood, it is a source of constant confusion and questions here on Stack Overflow.

IMO, David Parnas said it much more clearly, and best, way back in his 1972 paper (emphasis added):

We propose instead that one begins with a list of difficult design decisions or design decisions which are likely to change. Each module is then designed to hide such a decision from the others.

So, if you do change your mind, that change is "hidden", at least as best as can be, from the rest of the code. And it will be possible to change without tearing everything apart.
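As a minimal sketch of hiding a design decision (the class `SampledCurve` and its methods are hypothetical names, not from the paper or any library): the decision likely to change here is how the samples are stored. Callers only see `add_sample` and `value_at`, so swapping the sorted parallel lists for, say, an array or a tree would not touch them.

```python
import bisect


class SampledCurve:
    """Public surface: add_sample() and value_at().

    Hidden decision: samples are kept in two parallel lists sorted by x.
    Nothing outside this class may rely on that layout.
    """

    def __init__(self):
        self._xs = []   # sorted x positions (private)
        self._ys = []   # y values, same order as _xs (private)

    def add_sample(self, x, y):
        # Insert while keeping _xs sorted, so lookups can use bisection.
        i = bisect.bisect_left(self._xs, x)
        self._xs.insert(i, x)
        self._ys.insert(i, y)

    def value_at(self, x):
        """Linear interpolation, clamped to the first/last sample."""
        if not self._xs:
            raise ValueError("curve has no samples")
        i = bisect.bisect_left(self._xs, x)
        if i == 0:
            return self._ys[0]
        if i == len(self._xs):
            return self._ys[-1]
        x0, x1 = self._xs[i - 1], self._xs[i]
        y0, y1 = self._ys[i - 1], self._ys[i]
        return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


curve = SampledCurve()
curve.add_sample(2.0, 20.0)
curve.add_sample(0.0, 0.0)
print(curve.value_at(1.0))  # 10.0
```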

4

The number one strategy for dividing software into modules is to make things easy to find.

You can actually use more than one organizing principle in the same code base. But you must be clear when you're sliding from one to the other. That change should be clearly signaled.

However, the groupings you're talking about are really paradigms. Multiple types that support the same operation? That's polymorphism. It's a big feature of Object Oriented programming. And focusing on the operations is a more Functional programming approach.

But, like I said, you can use more than one organizing principle in the same code base. Just make it easy to find stuff. If you've done it right it should be obvious where new stuff should go and when it's in the wrong place.
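The two paradigms mentioned above can be sketched side by side. This is only an illustration under assumed names (`Curve`, `Image`, `maximum`, `minimum` are all hypothetical): the first half attaches the operation to each type, and the second half keeps the operation in one place with a case per type, using the standard library's `functools.singledispatch`.

```python
from functools import singledispatch


# Type-centred (OO) style: each type carries the operation as a method.
class Curve:
    def __init__(self, samples):
        self.samples = list(samples)

    def maximum(self):
        return max(self.samples)


class Image:
    def __init__(self, rows):
        self.rows = [list(row) for row in rows]

    def maximum(self):
        return max(max(row) for row in self.rows)


# Operation-centred style: the operation lives in one place,
# with one registered implementation per supported type.
@singledispatch
def minimum(data):
    raise TypeError(f"minimum() does not support {type(data).__name__}")


@minimum.register
def _(data: Curve):
    return min(data.samples)


@minimum.register
def _(data: Image):
    return min(min(row) for row in data.rows)


print(Curve([3, 1, 2]).maximum())         # 3
print(minimum(Image([[4, 5], [0, 9]])))   # 0
```

Either way, the point stands: pick one, and make it obvious where the next type or the next operation should go.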

4

How source code is best organised is related to the mental conceptualisation of the developers who produce that code.

The basics of software are data structures and algorithms - at least as the late Niklaus Wirth had it.

A module will correspond to whatever set of structures and algorithms are designed as, and conveniently thought of as, a single unit.

A small, simple application doesn't have to have any modules, but in larger applications there invariably have to be internal divisions which are not necessarily apparent to users of the application through its shell. There may in fact be more than one level of modularisation.

A basic principle for where to draw the boundary lines of modules is the ability to explain the design and workings of the module almost purely (that is, to a very great extent) in its own terms. To explain one module, the design artefacts it contains, and why the design takes a particular form and not another, you shouldn't already have to explain the fine details of everything else in the application.

Another principle is that the explanation of the internal workings of a module should be much more complicated than explaining its exterior interface.

By having modules with complicated interiors but simplified exteriors, you can then reason about how two or more such exteriors interface, whilst hopefully not having to think as much about the internal details at the same time.

Even though exteriors of modules themselves may seem fairly simple when considered individually, reasoning about the behaviour of assemblages of them is a source of significant additional complexity and mental workload.

When modules are composed, the complexity of the whole often seems to grow as a multiple, or even a power, of their individual complexities. Keeping the exteriors of modules as simple as possible, whilst containing as much complexity as possible within them, moderates the complexity that has to be analysed when they are composed. That is how an application of considerable total complexity can be built and still remain manageable by developers with finite intellect and mental capacities.

Another aspect of modules may be to aid the developer in navigating around the source code and understanding how data flows around the application, or how different parts of an application come into use at different stages of a business process (and that business process itself may be staged and conceptually modularised in order to assist in its comprehension by the staff who manage and operate it, so then the software which helps mechanise the process takes on the same modularisation).

What is nevertheless very difficult to do is to explain how an application should be modularised without being familiar with exactly what it is supposed to do, and without having a subjective sense of how much complexity it involves.

When I design any kind of software, there are often multiple iterations of design, in which I discover that complexity in a particular place is running out of my control or that it is becoming subjectively awkward to understand, and I then devise targeted ways to reduce that complexity or find a way to re-conceptualise in a way that I find imposes fewer mental demands on me all at once.

1

Some excellent answers here, but there is one thing I would like to add here which I think the other answers did not mention. This is the aspect of which functionality should not become part of the same module, because the grouping could otherwise cause undesired issues.

When you group functionality into modules, a program P referencing a module M, even if P requires only one functionality of M, becomes dependent on M "as a whole". That gives you the following challenges:

  • whenever M is changed (even if there is only a change to some component which P does not use), one has to check if P might be affected, and if P should or should not upgrade to the newest version of M. Depending on your case, this may happen automatically, or it may cause a lot of manual work. All in all, this aspect is called configuration management.

  • M may have deployment requirements which P did not have before, caused by components which P does not even use. For example, M may itself require additional 3rd party dependencies. That can make the deployment of P harder.

To avoid these issues, it is probably best to keep these potential problems in mind when grouping components into modules. One solution is to avoid large modules, or to deliberately split a larger module into smaller ones, so that each of the smaller sub-modules can be reused on its own, without unneeded dependencies. There are also a lot of technical approaches to deal with the issues I mentioned, but I won't go into them here, because I think that would lead too far and drive us away from your main question.
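A toy illustration of the effect, not a real build tool: the dependency graphs and all names in them (`M_interpolation`, `M_plotting`, `numeric_lib`, `plotting_lib`) are made up, and `transitive_deps` is a hypothetical helper. It just computes everything P ends up depending on before and after M is split.

```python
def transitive_deps(graph, start):
    """Return every module reachable from `start` (excluding `start` itself)."""
    seen = set()
    stack = list(graph.get(start, ()))
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(graph.get(node, ()))
    return seen


# Before: P only needs interpolation, but M as a whole drags in plotting_lib.
before = {
    "P": ["M"],
    "M": ["numeric_lib", "plotting_lib"],
}

# After: M is split, and P references only the part it actually uses.
after = {
    "P": ["M_interpolation"],
    "M_interpolation": ["numeric_lib"],
    "M_plotting": ["M_interpolation", "plotting_lib"],
}

print(sorted(transitive_deps(before, "P")))  # ['M', 'numeric_lib', 'plotting_lib']
print(sorted(transitive_deps(after, "P")))   # ['M_interpolation', 'numeric_lib']
```

After the split, a change to the plotting part or to `plotting_lib` no longer forces anyone to re-check or redeploy P.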

So all in all, this means one has to find a balance between

  • what could logically belong in a module

  • which kind of grouping may cause technical issues in terms of configuration management and deployment.

That's not always a simple decision, of course.
