Should I duplicate or inherit a python dataclass which changes attributes based on the version of an API endpoint?

Question

I'm working on a Python library for a REST API. I'm using Python data classes to represent the structure of the returned JSON.

Version 2 of this API returns a slightly different object than v1. Here's an example:

v1

{
    "animal": {
        "breed": "retriever",
        "color": "brown",
        "name": "spot"
    }
}

v2

{
    "animal": {
        "genus": "retriever",
        "color": "brown",
        "name": "spot"
    }
}

(Biologists, please don't flame me. This is just an example.)

What is the right way to create a data class for this?

1. Create a data class with just the attributes common to v1 and v2. Then create an inheriting subclass for each version that adds the attributes specific to it. The subclass is what represents the API response.
2. Maintain a separate class for each version. There is some repetition, but the responses for a given version aren't going to change, so the data class will not need to be modified later anyway.
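
For concreteness, the two options could look roughly like the sketch below, using the standard `dataclasses` module. The class names (`AnimalBase`, `AnimalV1`, `AnimalV2`, `AnimalV1Standalone`) are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, fields


# Option 1: shared attributes live in a base class (hypothetical names).
@dataclass
class AnimalBase:
    color: str
    name: str


@dataclass
class AnimalV1(AnimalBase):
    breed: str  # only present in v1 responses


@dataclass
class AnimalV2(AnimalBase):
    genus: str  # only present in v2 responses


# Option 2: each version is a standalone class that repeats the common fields.
@dataclass
class AnimalV1Standalone:
    breed: str
    color: str
    name: str


v1_payload = {"animal": {"breed": "retriever", "color": "brown", "name": "spot"}}
v2_payload = {"animal": {"genus": "retriever", "color": "brown", "name": "spot"}}

animal_v1 = AnimalV1(**v1_payload["animal"])
animal_v2 = AnimalV2(**v2_payload["animal"])
standalone_v1 = AnimalV1Standalone(**v1_payload["animal"])

# With inheritance, base-class fields come first in the generated __init__.
print([f.name for f in fields(AnimalV1)])
print(animal_v1, animal_v2, standalone_v1, sep="\n")
```

One practical difference: with option 1, code that only needs `color` and `name` can accept any `AnimalBase`, while option 2 gives no shared type to program against.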

Are there any downsides with these options I haven't considered?

JimmyJames · Accepted Answer · 2022-11-10 18:57:52Z

2

Once I got around to finding out how to do this with Pydantic, I found it is exceedingly easy, at least as long as you are dealing with such a small difference between the two structures. You'll need to install pydantic to try the following working example:

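# Written against the Pydantic 1.x API. In Pydantic 2 the equivalents are
# populate_by_name, model_dump_json() and model_validate_json().
from pydantic import BaseModel, Field


class Animal(BaseModel):
    # The attribute uses the v2 name; the alias is the v1 name.
    genus: str = Field(alias="breed")
    color: str
    name: str

    class Config:
        # Accept "genus" as well as the alias "breed" when populating.
        allow_population_by_field_name = True


spot = Animal(genus="retriever", color="brown", name="spot")

json_v1 = spot.json(by_alias=True)  # serialize with the alias: "breed"
json_v2 = spot.json()  # serialize with the field name: "genus"

print("v1 out:", json_v1)
print("v2 out:", json_v2)

# Both payloads parse into the same model.
animal_v1 = Animal.parse_raw(json_v1)
animal_v2 = Animal.parse_raw(json_v2)

print("v1 in:", animal_v1)
print("v2 in:", animal_v2)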

Output:

v1 out: {"breed": "retriever", "color": "brown", "name": "spot"}
v2 out: {"genus": "retriever", "color": "brown", "name": "spot"}
v1 in: genus='retriever' color='brown' name='spot'
v2 in: genus='retriever' color='brown' name='spot'

If you aren't familiar with Pydantic, it offers dataclass-style model definitions and adds a lot of really useful features, such as validation and JSON serialization. Every time I dig into it, I find more interesting things. I even used it recently to help me generate an antiquated wire format that you can't get libraries for, at least not for free. I've wrestled with serialization for decades and it's a really good tool. Other really good frameworks like FastAPI are built upon it as well.

edited Nov 10, 2022 at 18:57

answered Nov 9, 2022 at 21:45

JimmyJames


It does appear to have tools built with this exact problem in mind. Thanks for the lesson.
– candied_orange
Commented Nov 10, 2022 at 10:14
Thanks! So in every data class, I have a separate method just to parse the JSON into the class attributes. I’m seeing that Pydantic does it natively. I’ve also noticed the API I’m using supports OpenAPI as well. Is there a way I can use that to generate Pydantic models?
– rsn
Commented Nov 10, 2022 at 18:17
@rsn FastAPI generates OpenAPI schemas from Pydantic models. It seems like it might be possible to go in the other direction. I don't know that anyone has done it, though. Not something that I ever considered. It's really the JSON Schema that you'd need to transform, I think.
– JimmyJames
Commented Nov 10, 2022 at 18:59
@rsn The main problem with going from JSON Schema to Pydantic models is that Pydantic supports a much richer set of types, both out of the box and through customization. For example, it supports dates and decimals, which map to string and number respectively. But generating a skeleton using generic types seems pretty doable, either with a library or even with just a decent regex-enabled editor.
– JimmyJames
Commented Nov 10, 2022 at 19:17
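
As a rough illustration of the "skeleton using generic types" idea, here is a minimal sketch that builds a standard-library dataclass from the flat `properties` of a JSON Schema object. The function name `dataclass_from_schema`, the type mapping, and the sample schema are all assumptions made for this example. It ignores nested objects, `required`, and `format`, and it assumes every property name is a valid Python identifier.

```python
from dataclasses import fields, make_dataclass

# Assumed mapping from JSON Schema primitive types to generic Python types.
_JSON_TO_PY = {"string": str, "number": float, "integer": int, "boolean": bool}


def dataclass_from_schema(name: str, schema: dict) -> type:
    """Build a dataclass from the flat `properties` of a JSON Schema object."""
    spec = [
        # Unknown or missing types fall back to `object`.
        (prop, _JSON_TO_PY.get(details.get("type"), object))
        for prop, details in schema.get("properties", {}).items()
    ]
    return make_dataclass(name, spec)


# Hypothetical schema for the v2 animal object.
animal_schema = {
    "type": "object",
    "properties": {
        "genus": {"type": "string"},
        "color": {"type": "string"},
        "name": {"type": "string"},
    },
}

Animal = dataclass_from_schema("Animal", animal_schema)
print([(f.name, f.type) for f in fields(Animal)])
print(Animal(genus="retriever", color="brown", name="spot"))
```

Richer types such as dates or decimals would still need to be filled in by hand, which is the limitation described in the comment above.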
@rsn The Pydantic documentation has a section about exporting schemas. Nothing jumped out at me when I skimmed it that suggested you could generate a Pydantic model from a schema.
– JimmyJames
Commented Nov 10, 2022 at 20:34


candied_orange · Accepted Answer · 2022-11-08 18:21:32Z

1

2 is by far the simplest. However, that just perpetuates the problem.

How is this used? If it's possible to get to a point where you can use this without knowing which version it is then that's work worth doing. Any work you do along the lines of 1 should be aimed at that goal. What you build shouldn't just emerge from looking at the data. Consider what's going to be done with this.

You said the responses aren't going to change. Which makes me think these are immutable. So we don't have to worry about saving updates. In that case I'd lean towards your own data class that can only be populated with what you need. Write methods to populate it from either version. Now you can call things what you want to call them.
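
A minimal sketch of that idea, assuming the caller only needs three values. The class name `Animal`, the field name `kind`, and the method names `from_v1` and `from_v2` are hypothetical, chosen for the caller rather than copied from either API version.

```python
from dataclasses import FrozenInstanceError, dataclass


@dataclass(frozen=True)  # frozen: the responses are treated as immutable
class Animal:
    kind: str
    color: str
    name: str

    @classmethod
    def from_v1(cls, payload: dict) -> "Animal":
        data = payload["animal"]
        return cls(kind=data["breed"], color=data["color"], name=data["name"])

    @classmethod
    def from_v2(cls, payload: dict) -> "Animal":
        data = payload["animal"]
        return cls(kind=data["genus"], color=data["color"], name=data["name"])


v1 = Animal.from_v1({"animal": {"breed": "retriever", "color": "brown", "name": "spot"}})
v2 = Animal.from_v2({"animal": {"genus": "retriever", "color": "brown", "name": "spot"}})

print(v1 == v2)  # both versions produce the same value

try:
    v1.name = "rex"
except FrozenInstanceError:
    print("instances are read-only")
```

Code that receives an `Animal` no longer has any way to tell which API version it came from, which is the goal.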

edited Nov 8, 2022 at 18:21

answered Nov 8, 2022 at 18:01

candied_orange


You raise a good point about knowing the version. Unfortunately, the endpoint has no concept of versioning; the URLs are all the same. The only way I know whether I'm hitting v1 or v2 is to query the version of the backend during the authentication phase and, based on that, propagate a value to every subsequent method in my library to make it version aware.
– rsn
Commented Nov 8, 2022 at 18:04
@rsn What I'm talking about is, after figuring out what version you're dealing with, doing the work needed so you don't have to know that anymore. Don't let the need to know the version spread farther than you have to.
– candied_orange
Commented Nov 8, 2022 at 18:15
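
One way this could look, as a sketch only: the version is resolved once when the client is created, and no other method takes a version argument. `Client`, `parse_animal`, `_KIND_KEY`, and the `"v1"`/`"v2"` labels are hypothetical names, and how the version is actually obtained during authentication is left out.

```python
from dataclasses import dataclass

# Hypothetical lookup: which JSON key carries the "kind" of animal per version.
_KIND_KEY = {"v1": "breed", "v2": "genus"}


@dataclass(frozen=True)
class Animal:
    kind: str
    color: str
    name: str


class Client:
    def __init__(self, backend_version: str) -> None:
        # Resolved once, e.g. right after the authentication step.
        self._kind_key = _KIND_KEY[backend_version]

    def parse_animal(self, payload: dict) -> Animal:
        # No version argument: the difference is already folded into the client.
        data = payload["animal"]
        return Animal(kind=data[self._kind_key], color=data["color"], name=data["name"])


old = Client("v1").parse_animal(
    {"animal": {"breed": "retriever", "color": "brown", "name": "spot"}}
)
new = Client("v2").parse_animal(
    {"animal": {"genus": "retriever", "color": "brown", "name": "spot"}}
)
print(old == new)
```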
I would probably have one data class that matches the new version and then in the endpoint, transform the structure when the old version is requested. If you haven't looked at Pydantic, it adds a lot of nice features to data classes which might help here.
– JimmyJames
Commented Nov 9, 2022 at 15:50
@JimmyJames My point was to ignore either version and create a data class that is focused on how this data will be used. Then transform the others to it as needed. You seem to assume that is what the new version is. If we were told that I missed it.
– candied_orange
Commented Nov 9, 2022 at 16:14
Sorry, I thought I was agreeing with you but I see the distinction you are making. That approach is probably the more 'rigorous' way but one of the nice things about Pydantic is that it does deserialization, validation, and serialization for json with almost no effort. You can add in 'hooks' to do tweaks like this. I should probably write my own answer, not because I disagree with what you are saying but just as an alternate.
– JimmyJames
Commented Nov 9, 2022 at 17:48

