1

I have an Abstract Base Class AbstractModel

class AbstractModel {
 public:
  struct predictionStructure{};
  virtual predictionStructure predict(CompanyLib::Matrix<double> data) = 0;
  std::string modelType;
 public:
  //constants
  const uint32_t IMPORTER_VERSION = 1;
 protected:
  Preprocessor dataProcessor;//object calls func's to transform data. Children need capability, made into obj for SRP?
};

and child classes ABCModel

class ABCModel: AbstractModel {
 public:
  struct ABCPredictionStruct : AbstractModel::predictionStructure {
    CompanyLib::Matrix<double> scores;
    std::vector<double> q;
    std::vector<double> tsqs;
  };
  ABCModel(std::ifstream & modelFile);
  virtual predictionStructure predict(CompanyLib::Matrix<double> data) override;
 private:
  uint32_t varCount;
  std::vector<double> otherData;
};

and XYZModel

class XYZModel: AbstractModel {
 public:
  struct XYZPredictionStruct : AbstractModel::predictionStructure {
    CompanyLib::Matrix<double> y;
  };
  XYZModel(std::ifstream & modelFile);
  virtual predictionStructure predict(CompanyLib::Matrix<double> data) override;
 private:
  CompanyLib::Matrix<double> means;
};

The problem is, I need to read the first ~64 bytes of an input file to decide which type of child class to construct, as it is a flag in the input file. While I could do this with a static function in AbstractModel, I am trying to minimize the amount of concrete data/methods in the base class (Sr. Eng said to make it Pure Virtual Class but I need to require children to construct/use Preprocessor).

What is the best way to go about this? static protected function in AbstractModel? Make a ModelCreator class that handles file read & creation, and make AbstractModel more interface-like (called factory design pattern I think?).

I struggle a lot with designing a project structure, and this feels messy considering the Sr. Engineer said I should make AbstractModel a Pure Virtual Class/interface, and I have non-virtual things in it. If anyone can point out other mistakes I'm making in terms of best practices?

3
  • 1
    Using inheritance to share code (in your case, sharing Preprocessor) is generally not a good idea, and you should favour composition instead, which is possibly why the Sr. Engineer suggested a pure virtual class. That being said, it does not seem like it grants you any benefits for the problem you are describing. Commented Aug 11, 2021 at 20:15
  • @VincentSavard After reading up on composition over inheritance, it seems a bit overkill and like it will reduce code readability, since my project is small-scale and I'm an intern so my project will be given to someone to use/finish in a year or two, and Preprocessor just calls already-defined functions in a specified order, with specified arguments, upon model creation or model prediction.
    – casey ryan
    Commented Aug 11, 2021 at 20:30
  • 1
    You know your use case better, of course, so this is your call. If I were you, I would ask for clarifications with the Sr. Engineer to make sure you are both in agreement, especially since someone else will finish your system in the future. Commented Aug 11, 2021 at 20:36

1 Answer 1

3

You need to read the first 64 bytes which model you need to create.

Then, based on some criteria you chose the model to use:

  • Option 1: just implement this in a reading function, with a simple condition to construct the right model. Super easy, super simple.

  • Option 2: since the model is abstract, there is an aim to allow easy extension of the models. The issue then, is that you’d have to modify the reading function. An alternative is then to implement a factory function, that takes a parameter to decide which model to instantiate.

  • Option 3: like option 2, but you find a clever trick to let each model register to the factory along with a parameter or a condition function.

The factory should not be a static function of the abstract class: this would break the Open/Closed principle, since for every new cild class, you’d need to modify the parent class. Moreover, one could argue that this also breaks the single responsibility principle, since there would be more than one reason to change.

A separate factory allows you to keep things as independent as needed: abstract doesn’t need to know all its children, and the factory only needs to know the children needed for your current project )”(unless you use option 3 and keep the factory very general and reusable across several projects.

2
  • The goal is to make this extensible so the engineers can add model types, but each model has different types/amounts of data in the binary file being read. I made a common header for each model, and in there it specifies what type of model the file is. Would my best course of action be making a ModelFactory class, that reads the common header, and then calls a more specific ABCModelFactory?
    – casey ryan
    Commented Aug 11, 2021 at 20:21
  • 2
    @caseyryan yes, exactly. The need for extendibility eliminates option 1. The others stay valid: one the model created you’d give it the input stream to process it. You also may have a look at serialization techniques if the file contains the model to be processed (this is not suitable if the file contains data to be processed according to a model)
    – Christophe
    Commented Aug 11, 2021 at 20:28

Not the answer you're looking for? Browse other questions tagged or ask your own question.