3
\$\begingroup\$

Here is my take at a lightweight interface implementation, where I focus on discoverability of suitable classes from strings (for simplicity, class name is used as an id). Each interface has it's own "registry" of implementations.

My choice of abc module is due to presence of registration method there, better support from IDEs, and the fact that it's a part of Python standard library. I have not that much explored typing, but there is still one illustration for it as well.

My examples are borrowed from this https://realpython.com/python-interface/#formal-interfaces, though I've enhanced the original.

I am mostly interested critiques towards the interface_meta module. The rest of the code is an illustration.

Targeting Python 3.8+ here.

from __future__ import annotations

##### interface_meta module
import inspect
from abc import ABCMeta
from abc import abstractmethod
from typing import Type


def _registration_hook(cls, subclass):
    subclass_name = subclass.__name__
    if subclass_name in cls._all_classes:
        print(f"Already registered {subclass_name}")
    cls._all_classes[subclass_name] = subclass


def all_classes(cls):
    return cls._all_classes


def for_id(cls, an_id):
    return cls._all_classes.get(an_id)


def comparable(cls, subclass, attr):
    cls_attr = getattr(cls, attr, None)
    if not hasattr(subclass, attr):
        return f"Attribute {attr!r} of {cls.__name__!r} not implemented in {subclass.__name__!r}"
    subclass_attr = getattr(subclass, attr, None)
    if callable(cls_attr) == callable(subclass_attr) == False:
        return
    cls_argspec = inspect.getfullargspec(cls_attr)
    subclass_argspec = inspect.getfullargspec(subclass_attr)
    if cls_argspec != subclass_argspec:
        return (f"\nSignature mismatch '{cls.__name__}.{attr}' <-> '{subclass.__name__}.{attr}'."
                f"\nIn the interface : {cls_argspec}."
                f"\nIn concrete class: {subclass_argspec}")


def subclasshook(cls, subclass):
    cls._registration_hook(cls, subclass)
    errors = [comparable(cls, subclass, am) for am in cls.__abstractmethods__]
    if any(errors):
        raise TypeError("".join(e for e in errors if e))
    return True


class InterfaceMeta(ABCMeta):
    def __new__(mcs, *args, **kwargs):
        i = super().__new__(mcs, *args, **kwargs)
        i._all_classes = {}
        i._registration_hook = _registration_hook
        i.__subclasshook__ = classmethod(subclasshook)
        i.all_classes = classmethod(all_classes)
        i.for_id = classmethod(for_id)
        return i


##### examples of interfaces in different modules

class FormalParserInterface(metaclass=InterfaceMeta):

    @abstractmethod
    def load_data_source(self, path: str, file_name: str) -> str:
        """Load in the data set"""

    @abstractmethod
    def extract_text(self, full_file_path: str) -> dict:
        """Extract text from the data set"""


FormalParserInterfaceType = Type[FormalParserInterface]


class SuccessfulSubFormalParserInterface(FormalParserInterface):
    @abstractmethod
    def extract_html(self, full_file_path: str) -> dict:
        """Extract html from the data set"""
        return {}


#####
class ParserClassificationInterface(metaclass=InterfaceMeta):

    @property
    @abstractmethod
    def category(self):
        """For classification"""

    def get_name(self):
        """Example of concrete method. OK to provide concrete methods when then depend only on other methods
        in the same interface"""
        return self.__class__.__name__


ParserClassificationInterfaceType = Type[ParserClassificationInterface]


##### Implementation modules
@FormalParserInterface.register
@ParserClassificationInterface.register
class EmlParserNew(FormalParserInterface, ParserClassificationInterface):
    """Extract text from an email."""

    category = "EML"

    def load_data_source(self, path: str, file_name: str) -> str:
        """Overrides FormalParserInterface.load_data_source()"""
        print(self.__class__, "load_data_source", path, file_name)
        return ""

    def extract_text(self, full_file_path: str) -> dict:
        """A method defined only in EmlParser.
        Does not override FormalParserInterface.extract_text()
        """
        return {}


#####
@ParserClassificationInterface.register
@FormalParserInterface.register
class PdfParserNew(FormalParserInterface, ParserClassificationInterface):
    """Extract text from a PDF."""

    category = "PDF"

    def load_data_source(self, path: str, file_name: str) -> str:
        """Overrides FormalParserInterface.load_data_source()"""
        print(self.__class__, "load_data_source", path, file_name)
        return "does not matter"

    def extract_text(self, full_file_path: str) -> dict:
        """Overrides FormalParserInterface.extract_text()"""
        print(self.__class__, "extract_text", full_file_path)
        return {"k": "does not matter"}


@ParserClassificationInterface.register
@FormalParserInterface.register
class PdfParserNewest(PdfParserNew):
    """Extract text from a PDF."""

    def load_data_source(self, path: str, file_name: str) -> str:
        """Overrides FormalParserInterface.load_data_source()"""
        print(self.__class__, "load_data_source", path, file_name)
        return "does not matter"


##### Usage examples


def get_classification() -> (ParserClassificationInterface | FormalParserInterface):
    return ParserClassificationInterface.for_id("PdfParserNew")()


if __name__ == "__main__":
    assert issubclass(PdfParserNew, FormalParserInterface) is True
    assert issubclass(PdfParserNew, ParserClassificationInterface) is True
    assert issubclass(SuccessfulSubFormalParserInterface, FormalParserInterface) is True
    try:
        issubclass(SuccessfulSubFormalParserInterface, ParserClassificationInterface)
    except TypeError as e:
        assert str(e) == """Attribute 'category' of 'ParserClassificationInterface' 
        not implemented in 'SuccessfulSubFormalParserInterface'"""
    pdf_parser = PdfParserNew()
    pdf_parser.load_data_source("", "")
    pdf_parser2 = PdfParserNewest()
    pdf_parser2.load_data_source("", "")
    eml_parser = EmlParserNew()
    eml_parser.load_data_source("", "")

    print(FormalParserInterface.all_classes())
    print(ParserClassificationInterface.all_classes())

    some_parser = get_classification()
    print(some_parser.load_data_source("", ""))

    assert pdf_parser.category == "PDF"
    assert pdf_parser2.category == "PDF"
    assert eml_parser.category == "EML"
    assert eml_parser.get_name() == "EmlParserNew"

    assert isinstance(pdf_parser2, ParserClassificationInterface)
    assert isinstance(pdf_parser2, FormalParserInterface)
    assert issubclass(PdfParserNew, ParserClassificationInterface)
    assert issubclass(PdfParserNewest, FormalParserInterface)
    
    # Ability to find implementation by string id.
    print(FormalParserInterface.for_id("PdfParserNew")().load_data_source("", ""))
    assert FormalParserInterface.for_id("PdfParserHtml") is None

I am aware of this question, but in my implementation I wanted to to implement dispatching. And yes I am aware of zope.interface, but this implementation is supposed to be lightweight.

Answering some questions. Demo has two lines of examples: One from the Real Python article, and an example of "classification", purpose of which is just to add an attribute or property. There is no greater purpose.

The for_id is runtime discovery of specific implementation, this is the main reason for the whole library. String id can be easily stored or communicated between systems, making it easy to implement completely declarative code, for example, for rules engine.

My code at the time of this review can be found on github.

\$\endgroup\$
3
  • \$\begingroup\$ I like the simplicity of the InterfaceMeta, but why not build this on typing.Protocol instead? The interface-checking logic itself would be more or less the same, but with the added benefit of users being able to re-use your Protocol for static type hinting. \$\endgroup\$ Commented Sep 30, 2021 at 17:31
  • \$\begingroup\$ On second though, you don't get the __subclasshook__ behavior with typing.Protocol. I'll have to ponder this. \$\endgroup\$ Commented Sep 30, 2021 at 18:10
  • \$\begingroup\$ I need to explore typing.Protocol better before seeing it's benefits (it's quite fresh, and probably not mature enough?). At the moment I am more interested in being able to discover implementations rather than strictly checking anything so this code is next step from purely 'marker interfaces'. I like zope.interface adapters and multi-dispatch features, but they are quite verbose. And actually I can probably even go "meta-level" by having registration on interfaces as well - so those can be discovered "from strings"... hmmm \$\endgroup\$
    – Roman Susi
    Commented Sep 30, 2021 at 18:10

1 Answer 1

2
\$\begingroup\$

Initial impressions

  • Re-using the ABC registration system is a nice idea. Note that this will not raise any errors at class definition time, only at instantiation time.
    • I personally think is a mis-feature of ABCs and not really your fault, but maybe you are happy with that behavior.
    • Another option is to entirely skip runtime type-checking of subclasses of interfaces, leaving all of this work to the type checker.
  • Write docstrings! Some of this code is what I would call "non-obvious". You should document what each of these functions does.
    • At the very least, you should put in a comment on each of the functions that are meant to be attached to Interface classes, because it took a couple re-reads to realize what you were doing.
  • Some names are a bit "opaque".
    • _registration_hook could be _register_interface_subclass
    • comparable could be is_valid_interface_subclass
  • In comparable(), returning a string-ified "error message" is an awkward signaling mechanism. You should structure this data somehow, and only string-ify it for presentation to the user. Otherwise it will makes this library difficult to work with. See below.
  • In subclasshook(), you probably don't want "".join. Perhaps you want "\n".join or similar?
  • You can (and should) type-hint InterfaceMeta and its associated functions.
  • Use warnings.warn() instead of print().
  • I'm not sure I understand how for_id() is supposed to be used, nor do I understand the get_classification() function in the demo. Is it meant to be some kind of dynamic implementation lookup?
  • What is a "classification" supposed to be, anyway? This doesn't look like it's related to the design of your interface system, but it's part of your demo and I don't understand it.

Don't use strings to convey structured information

The stringly-typed output from comparable() is not great. You might want to build a more-structured hierarchy of interface-subclassing failures, which is only string-ified "on-demand" by the user.

For example:

from __future__ import annotations
import inspect
from typing import TypeVar, Sequence


_Interface = TypeVar('_Interface', bound='Interface')


@dataclass
class ImplementationSubclassingError(TypeError):
    """Base class for errors in an implementation of an Interface."""
    parent: _Interface
    subclass: Type[_Interface]


@dataclass
class AttributeMissing(ImplementationSubclassingError):
    attribute_name: str

    def __str__(self) -> str:
        return (
            f"Attribute {self.attribute_name} of {self.parent.__name__} is "
            f"not implemented in {self.subclass.__name__}."
        )


@dataclass
class SignatureMismatch(ImplementationSubclassingError):
    method_name: str
    parent_argspec: inspect.Argspec
    subclass_argspec: inspect.Argspec
    
    def __str__(self) -> str:
        return (
            f"Signature of {self.subclass.__name__}.{self.method_name} "
            f"does not match "
            f"signature of {self.parent.__name__}.{self.method_name}."
        )


class ImplementationInvalid(TypeError):
    # Re-use the "args" attribute from BaseException
    errors: Sequence[ImplementationSubclassingError, ...]

    def __init__(*errors: ImplementationSubclassingError):
        self.errors = errors
        super().__init__(*errors)

    def __str__(self) -> str:
        return "\n".join(map(str, self.errors))

This kind of setup gives you a programmable and introspectable system, which also results in nice error messages for the user.

\$\endgroup\$
7
  • \$\begingroup\$ Very good points, thanks! I will go through it thoroughly, and accept over weekend. Now that you subclassed TypeError, I got an idea if I interfaces should somehow include exception information as well... or be themselves "interface-able" \$\endgroup\$
    – Roman Susi
    Commented Oct 1, 2021 at 6:31
  • \$\begingroup\$ Added answers to the questions in the question. The for_id is central here. The name is probably not good, but here some background is explained: softwareengineering.stackexchange.com/questions/351389/… . The "id" can be obtained from the class being registered in many different ways. Here the simplest way is used - class name. \$\endgroup\$
    – Roman Susi
    Commented Oct 1, 2021 at 17:14
  • \$\begingroup\$ In that case, I'd suggest removing the layer of indirection; just let the user do the lookup in a dict! Unless you need to perform some additional validation/processing before or after the dict lookup. \$\endgroup\$ Commented Oct 1, 2021 at 20:37
  • \$\begingroup\$ Hmmm. I like this idea, but then I do not want users to modify the dict, also I want a cleaner syntax for access. So FormalParserInterface.implementations.get("SpecificImplementationId") gets the get... or then FormalParserInterface.implementations["SpecificImplementationId"] when try-except can be used - something like this? I was also thinking of more advanced cases than the dict allows, but can be good for starters. \$\endgroup\$
    – Roman Susi
    Commented Oct 2, 2021 at 6:48
  • 1
    \$\begingroup\$ For the record, I asked how to make interface metaclass to be more mypy-friendly and received quite simple suggestion for that: stackoverflow.com/questions/69417027/… - that is, for_id and all_classes can be there in the metaclass (at least I do not see any reason not to) \$\endgroup\$
    – Roman Susi
    Commented Oct 2, 2021 at 14:23

Not the answer you're looking for? Browse other questions tagged or ask your own question.