
I'm currently experimenting with various use cases for C++20 modules, taking advantage of the (somewhat) stable support for C++20 modules in CMake and major compilers.

My goal is to utilize a public member defined in a base class with module linkage while having a derived class that is exported. During my exploration, I've tried several approaches and observed that some of them work as expected, while others do not.

I've tried to find information on this specific case in the C++ standard draft, but my understanding of the issue is still limited.

To provide some context, I've organized my code into three files:

libr5.cppm (a primary interface unit)
libr5_internal.cppm (a partition of libr5)
main.cpp (the main file, which uses the exported entity from libr5)

main.cpp is the same for all three examples:

// main.cpp
import libr5;
#include <iostream>

int main(int argc, char *argv[]) {
  ClassInLibr5 libr5_class_instance {};
  libr5_class_instance.int_value = 50;

  std::cout << libr5_class_instance.int_value << std::endl;
  std::cout << libr5_class_instance.double_value << std::endl;

  return 0;
}

The compile commands are also the same for all three examples:

clang++ -std=c++20 -fprebuilt-module-path=. --precompile libr5_internal.cppm
clang++ -std=c++20 -fprebuilt-module-path=. --precompile -fmodule-file="libr5:internal=libr5_internal.pcm" libr5.cppm
clang++ -std=c++20 -fprebuilt-module-path=. libr5.pcm libr5_internal.pcm -c
clang++ -std=c++20 -fprebuilt-module-path=. main.cpp -c -o main.o
clang++ main.o libr5.o libr5_internal.o

Compiler information (built manually from source):
clang version 18.0.0git
Target: aarch64-unknown-linux-gnu
Thread model: posix
InstalledDir: /usr/local/bin

First attempt (compiles):

// libr5.cppm
export module libr5;

export import :internal;

export {
struct ClassInLibr5: Internal {
  ClassInLibr5(): Internal() {}
};
}

// libr5_internal.cppm
export module libr5:internal;

struct Internal {
  Internal(): int_value(0), double_value(0) {}
  int int_value;
  double double_value;
};

Second attempt (does not compile):

// libr5.cppm
export module libr5;

import :internal;

export {
struct ClassInLibr5: Internal {
  ClassInLibr5(): Internal() {}
};
}

// libr5_internal.cppm
module libr5:internal;

struct Internal {
  Internal(): int_value(0), double_value(0) {}
  int int_value;
  double double_value;
};

The compilation error in the second attempt looks like this:

main.cpp:6:24: error: declaration of 'int_value' must be imported from module 'libr5' before it is required
    6 |   libr5_class_instance.int_value = 50;
      |                        ^
libr5_internal.cppm:5:7: note: declaration here is not visible
    5 |   int int_value;
...

Third attempt (compiles):

// libr5.cppm
export module libr5;

import :internal;

export {
struct ClassInLibr5: Internal {
  ClassInLibr5(): Internal() {}
  using Internal::int_value;
  using Internal::double_value;
};
}

// libr5_internal.cppm
module libr5:internal;

struct Internal {
  Internal(): int_value(0), double_value(0) {}
  int int_value;
  double double_value;
};

The difference between the first and second attempts: in the first attempt, class Internal is defined in an interface partition unit, whereas in the second attempt it is defined in an implementation partition unit. In both cases it should(?) have module linkage only.

The difference between the second and third attempts: the third attempt explicitly names those members with 'using' in the derived class, whereas the second does not.
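To illustrate what a using-declaration does at the language level, here is a minimal single-file sketch without modules. It is only an analogy: private inheritance stands in for the base being hidden, and the names Base, Derived, and read_back are hypothetical and do not appear in the code above. The point it shows is that a using-declaration introduces the base member's name as a declaration in the derived class's own scope.

```cpp
// Hypothetical stand-in for the non-exported base class.
struct Base {
  int int_value = 0;
  double double_value = 0.0;
};

// Private inheritance hides every Base member from outside code...
struct Derived : private Base {
  // ...unless a using-declaration re-declares the name in Derived's scope.
  using Base::int_value;
};

// Writes through the re-declared name and reads the value back.
constexpr int read_back(int v) {
  Derived d{};
  d.int_value = v;        // OK: public in Derived via the using-declaration
  // d.double_value = 1;  // would not compile: still private in Derived
  return d.int_value;
}
```

Whether this is the mechanism that makes the third attempt compile under clang is an assumption here; the sketch only demonstrates the name-introduction effect in ordinary C++.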

So here are my questions:
[1] Is the difference between the First Attempt and the Second Attempt 'well-defined' behavior in the C++20 standard (i.e., no undefined behavior)?
[2] Is the Third Attempt a 'well-defined' way to make the Second Attempt work?

If anyone could provide insights, or point me to the relevant sections of the standard along with a human-readable interpretation, I would greatly appreciate it.

  • FYI: "My goal is to utilize a public member defined in a base class with module linkage while having a derived class that is exported" I don't think this is a good idea. Declarations with module linkage ought to be implementation details of the interface. Ideally, a user ought to be able to use a module by just looking at the exported declarations. If a member is public and the user is expected to talk to it, then that class ought to be exported too. It functions as part of the class's interface and therefore should not be hidden. Commented Dec 24, 2023 at 18:13
  • @NicolBolas: I think there are fine reasons to have such “partial hiding”: any place something needs to be (in) a base class that the client isn’t supposed to use separately benefits from not being nameable outside the module even if it is used therefrom. Obviously documentation has to take this into account, but in general module linkage is an improvement over the namespace detail idiom for such things. Commented Dec 24, 2023 at 21:51
  • @DavisHerring: I wouldn't say that it should be in a detail namespace either. It's not an implementation detail; it's an actual part of the interface. Commented Dec 25, 2023 at 0:59
  • Yeah, I absolutely agree with 'exporting the interface' where possible. However, my use case is similar to a private header (which is not a thing pre-C++20). I also think this is the only way I can EXPOSE my base class's structure while forbidding classes in other modules from inheriting from it, though maybe I'm wrong. (With the older approach I needed to make my structure unnecessarily complex to achieve this, using friend + private, etc.) Commented Dec 25, 2023 at 1:42
  • @Meng-ZeChen: You can’t necessarily “forbid” such inheritance: decltype and template argument deduction are powerful tools for obtaining access to entities you can’t name. That doesn’t stop it from being useful to indicate your intent, though. Commented Dec 25, 2023 at 14:28
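
The last comment's point about decltype and template argument deduction can be sketched in a single file. This is an analogy, not module code: a private nested class stands in for a class with module linkage, and the names Library, Exported, owner_of, Recovered, and Sneaky are all hypothetical. It shows that a type which cannot be named can still be deduced from a pointer to one of its members, and then used as a base class.

```cpp
#include <type_traits>

class Library {
  // Private nested type: outside code cannot write the name Library::Internal.
  struct Internal {
    int int_value = 0;
  };

public:
  // The public derived type plays the role of the exported class.
  struct Exported : Internal {};
};

// Declared only; used inside decltype to deduce the class a member belongs to.
template <class C, class M>
C owner_of(M C::*);

// &Library::Exported::int_value has type `int Library::Internal::*`,
// so C deduces to the hidden base without ever naming it.
using Recovered = decltype(owner_of(&Library::Exported::int_value));

// Inheriting from the recovered type works: access control applies to names.
struct Sneaky : Recovered {};

static_assert(std::is_base_of_v<Recovered, Library::Exported>);
static_assert(!std::is_same_v<Recovered, Library::Exported>);

// Uses the inherited member through the class that "should not" exist.
constexpr int sneaky_value(int v) {
  Sneaky s{};
  s.int_value = v;
  return s.int_value;
}
```

The sketch assumes C++17 or later. The same deduction trick is expected to apply to a base with module linkage, but that has not been verified here against any particular compiler's modules implementation.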

2 Answers

3

All of these should have equivalent behavior; you’re just seeing compiler bugs (still). To be fair, there is a core issue for the corresponding case of lookup in an enumeration, but the intent is clear.

1

Answering my own question. After digging through many posts and terms, I believe reachability is the more accurate term I should have searched for, and I'm strongly inclined to believe this is not a bug in clang.

In short, my understanding is that the First Attempt is well-defined behavior, whereas the Second and Third Attempts rely on something the standard draft explicitly advises against.

The reason: only in the First Attempt is Internal defined in an interface unit, and only an interface unit is covered by the rule called necessarily reachable.

Only when an entity is necessarily reachable can we reliably access the members of a class.

From the standard draft, https://eel.is/c++draft/module.reach#2:

All translation units that are necessarily reachable are reachable. Additional translation units on which the point within the program has an interface dependency may be considered reachable, but it is unspecified which are and under what circumstances.
[Note 2: It is advisable to avoid depending on the reachability of any additional translation units in programs intending to be portable. — end note]

(emphasis mine, not in the standard)

Another very useful post I found: https://vector-of-bool.github.io/2019/03/31/modules-2.html
