25

A getter is a failure to design an object. It violates encapsulation, which is a core principle of object-oriented programming.

Now please tell me, how do you design a library's hash table collection under that philosophy?

I recently opened an answer of mine with the previous two lines. They roused a response from one of our long-time contributors:

"Now please tell me, how do you design a library's hash table collection under that philosophy?" - I feel a negative vibe here. I'm not judging, I do that sometimes too. I'm more than happy to answer, if I feel there's an honest effort to try to understand.

Robert Bräutigam

To which I am responding with this question. Now sure, it may sound like I'm asking two questions since the title and body differ, but I'm trying to grok much more than hash table design here.

I've come to believe that object oriented programming is good. But that not every line of a program can be purely object oriented. And that is fine. Much like you can't write a program free of static methods (you at least need main) but that doesn't mean it all has to be static.

That idea is what my somewhat clumsy example was trying to illustrate. But if that idea is flawed I would love to know why.

One of the practices of OOP is that when a method needs data, it's better to move the method to the data than to move the data to the method. This enables encapsulation. State is not something to share, simply something that changes behavior.
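A minimal Java sketch of "move the method, not the data". The `Account` class and its `canAfford` method are hypothetical names made up for illustration; nothing in this discussion prescribes them.

```java
// Hypothetical example: Account and canAfford are illustrative names only.
final class Account {
    private final long balanceInCents; // state stays private; no getter exposes it

    Account(long balanceInCents) {
        this.balanceInCents = balanceInCents;
    }

    // The method lives with the data: the caller asks a question
    // instead of pulling the balance out and comparing it itself.
    boolean canAfford(long priceInCents) {
        return balanceInCents >= priceInCents;
    }
}

class AccountDemo {
    public static void main(String[] args) {
        Account account = new Account(5_000);
        System.out.println(account.canAfford(1_999));
        System.out.println(account.canAfford(9_999));
    }
}
```

The balance only influences the answer that `canAfford` gives; it is never handed to the caller.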

However, some boundaries, like library boundaries, make it impossible to move the method and so you move data. Thus getter-ridden Data Transfer "Objects" are born. And so OOP ideals are compromised in the face of reality. Rather than give up entirely, I treat that as a compromise to isolate rather than spread. This is why I think of OOP everywhere as unachievable but still think OOP is good when you can do it.

For bonus points, please do show me how an OOP purist would design a hash table without getters.

41
  • 3
    @Ewan The word getter here is not used to mean a getter technically but any kind of method (getters included) that returns the internal state of an object.
    – Ced
    Commented May 11 at 21:06
  • 4
    @Ewan Getters are methods for returning state, there being special syntax in some languages for nicer usage is immaterial. Commented May 11 at 21:06
  • 40
    "A getter is a failure to design an object." - I think that the crux of the issue is that this is too general a statement, taken at face value. What many people do with getters is they just expose each private field. While this offers some level of indirection, it is minimal, and if your goal was to encapsulate the internals in the first place, then doing this soon makes it difficult to change those internals, or to extract a sufficiently abstract interface, without a bunch of rewriting. That's what people caution against. But if getting out a value is a core behavior, a getter is fine. Commented May 11 at 23:06
  • 14
    It is quite clear you want to give Robert B. a stage with this question. But maybe a more honest title would have been "How far can you push Robert B's idea of Object Oriented Programming", or "How far can you push Tell-Don't-Ask", because that is IMHO the core point of Robert's approach to OO. Note also that there are other schools of thought, people who don't conflate OOP with TDA.
    – Doc Brown
    Commented May 12 at 6:17
  • 10
    You -- or someone you are referring to, I can't really sort it out here -- may be suffering from what I call Object Happiness Disorder: the belief that OOP is a goal in itself, rather than one technique for achieving a goal. The question to ask is not "is this choice aligned with OOP dogma?" but rather "does this choice make my program more correct, legible, usable, robust, feature complete, extensible, or whatever actual goals you have?" Commented May 13 at 17:50

15 Answers

10

However, some boundaries, like library boundaries, make it impossible to move the method and so you move data.

I agree. Sometimes you do not (or cannot) know the behavior that will be attached to your data; in these cases you'll publish the data.

So if your question is whether you can (in a reasonable way) always write code without returning instance variables / internal state, i.e. "getters", my answer is no.

Example: here is a small HTTP/REST library I've written. It has a couple of "getters" like ContentResponse.getContent(). As @ced pointed out, the domain is to return "content" from an HTTP call, and I don't know how that content is supposed to be used, so a getter is appropriate.

But, and this is my point, these exceptions occur far less often than people assume. Like orders of magnitude less.

It's as if people only showered once or twice a month and I argued that you should shower every day, to which you say "he can't mean every single day?".

In the library above, there are about 3 getters in total, and that number is unlikely to grow, even with new features of the library.

Even in articles like: Data Boundaries are the root cause of Maintenance Problems, I'm pretty consciously and repeatedly saying this only applies to the "inside" of the application (or library). That's not to say that library boundaries can be full of getters. Often you can actually find a suitable behavior instead of giving out data.

My point is that most designs are riddled with internal, data-oriented boundaries, DTOs, Beans, layers, etc. We can discuss the fine points and the gray areas of OOP, but the truth is, we're not even close to those at the moment for most projects.

So now that we've established that it's ok to sometimes have getters in the external interface of a library, a more interesting question would be: is it ok to have a Layered Architecture or Clean Architecture on the inside of an application? And if yes, under what circumstances / requirements?

18
  • 2
    I agree about the DTOs but is Account.display really better than Account.GetTotalAsPrimitiveType() ? It seems to me that NumberView is leaking to Account as much as Account would leak to the UI
    – Ewan
    Commented May 12 at 11:02
  • 1
    @Ewan one difference is that you are now working with 1 class instead of two (the view can be hidden), with the domain object acting kind of as a facade for the view with Account.displayCard, Account.displayDetail. This also adds more discoverability than the more common: AccountCard(account), AccountDetailView(account). And yet another benefit is that your object is now a tree with the domain at the top. So your object tree goes from business relevancy to details.
    – Ced
    Commented May 12 at 13:00
  • 1
    @Ewan but most important is that your account object is closed. You know how its data is going to be used, which you cannot know in advance with Account.GetTotalAsPrimitiveType(). The result could be used for display, a calculation, anything really; the point is you do not know. Whether that's a real issue is another question
    – Ced
    Commented May 12 at 13:01
  • 1
    @Ewan inheriting won’t change the interface. Wrapping doesn’t move it to be with the data. It does let you isolate the non object oriented code. So maybe we’re saying the same thing different ways. I just wanted to be sure. Commented May 12 at 17:03
  • 1
    I find it problematic to encapsulate data + behaviour into a single class whenever dealing with inter-dependent data from multiple external sources. For example, consider a business rule which uses data from an inbound HTTP request to fetch a JWT Bearer token cached in redis, which in turn leads us to query a database (based on the JWT claims and the inbound request together), which in turn requires sending a request to a downstream API, and so on. Trying to encapsulate this chain of events and all of its data into a single class is in danger of becoming a "God object". Commented May 13 at 7:25
26

I think the principle that all getters somehow violate OOP principles at some fundamental level is wrong. If I have a string class and I want to know how long the string is by calling getLength, does that mean it's not pure OOP? No! Of course it doesn't; that's preposterous. A string of characters has a length. It's inherent to the properties of the string. Just like people have names, ages, birthdays, and preferences of color. Exposing those properties doesn't violate the principles of encapsulation in OOP.

Sharing information inherent to the model or problem domain doesn't violate OOP. What good is an array, list, or collection if you can't get the contents of it? And this gets at the limits of what encapsulation, as a principle, is trying to solve. Encapsulation is there to prevent implementation coupling. It's not trying to prevent coupling of data. If I call getLength on a string, then I can't work with a string that doesn't have a length. Therefore I'm coupled to, or depend upon, that data or detail.

Coupling happens. It has to happen in order to build software. But all OOP is meant to handle is implementation coupling, not other forms of coupling like data or interface coupling. That's a much lower bar, and really OOP only provides the mechanisms that let you clear it. Critics are quick to confuse encapsulation with solving coupling in all its forms, and then discredit OO for not solving all of them when it never intended to.

And this is where the concept of interfaces comes in. OO programs couple themselves to the interfaces they use, meaning the methods that are called. But if another object implements the same interface, then you could substitute one for the other and the code would work all the same. So most OO languages provide some mechanism to formally declare these interfaces as integration points. And most have evolved to focus on interfaces based around methods rather than properties, because methods provide far more flexibility than data properties do.
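To make the substitution point concrete, here is a small Java sketch. `Greeter`, `FormalGreeter`, `CasualGreeter` and `Reception` are hypothetical names invented for this illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical interface: callers couple to this method, not to any class.
interface Greeter {
    String greet(String name);
}

final class FormalGreeter implements Greeter {
    @Override
    public String greet(String name) {
        return "Good day, " + name + ".";
    }
}

final class CasualGreeter implements Greeter {
    @Override
    public String greet(String name) {
        return "Hi " + name + "!";
    }
}

final class Reception {
    // Depends only on the Greeter interface, so either implementation fits.
    static List<String> welcomeAll(Greeter greeter, List<String> names) {
        List<String> greetings = new ArrayList<>();
        for (String name : names) {
            greetings.add(greeter.greet(name));
        }
        return greetings;
    }
}
```

`Reception.welcomeAll` never learns which class it was given; swapping `FormalGreeter` for `CasualGreeter` requires no change to it.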

This tendency of OO to focus on methods instead of data is why getters enter the picture: OO places more value on methods for their flexibility. For example, String.getBytes must violate encapsulation, right? It's exposing the String's implementation detail, right? Well, not exactly. I could implement a String with a char array and dynamically transform that to a byte array when the getBytes method is called, providing the String in various encodings (e.g. UTF-8, UTF-16, etc.). If you needed the byte array and didn't have a method that returned it, then you'd have to get at the underlying char array in the String in order to write that method outside of the String object. That would most definitely violate encapsulation.
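A rough sketch of that idea in Java. The `Text` class is hypothetical and is not how `java.lang.String` is actually implemented; it only shows a char-array-backed type producing bytes on demand.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical string-like class, for illustration only.
final class Text {
    private final char[] chars; // internal representation, never handed out

    Text(String value) {
        this.chars = value.toCharArray();
    }

    // The owner of the data performs the conversion when asked.
    // The result is a freshly encoded array, not the internal char array.
    byte[] getBytes(Charset charset) {
        return new String(chars).getBytes(charset);
    }
}

class TextDemo {
    public static void main(String[] args) {
        Text text = new Text("\u00e9"); // the single character 'é'
        System.out.println(text.getBytes(StandardCharsets.UTF_8).length);
        System.out.println(text.getBytes(StandardCharsets.ISO_8859_1).length);
    }
}
```

The same `Text` yields byte arrays of different lengths depending on the requested encoding, which is a sign that the method is computing a result rather than leaking a field.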

And that's what is tricky about this. If your requirements demand that you have knowledge about the character encodings of a String, the best way to handle that is to delegate the conversion to the owner of that data. That's how OO and encapsulation help you. By calling that getBytes method you are codifying that as a requirement in your program. OO and encapsulation don't let you transcend the requirements.

In fact I will even go another step and say that if you have getters/setters on your object you are practicing encapsulation. And YES, even if you return the internal guts of your implementation through a getter! Controversial stance I know, but hear me out. Because you've used a method to reveal potential internal workings of a class to the client, you haven't prevented yourself from changing those internals later. That's the core test of encapsulation. If you allowed direct access to the instance variables you'd have violated encapsulation and thereby locked yourself into not being able to refactor it. But if you use a getter you can always change the object's internal representation and simply change your getter method without affecting clients. And that is the point of encapsulation. Now you are locked into providing that method (until all clients are changed), but you are still insulating your ability to change from clients.
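Here is a hedged Java sketch of that claim. `PersonV1` and `PersonV2` are hypothetical names standing for two successive versions of the same class: the internal representation changes, while the constructor and the getter stay the same.

```java
// Hypothetical "before" version: stores one combined field.
final class PersonV1 {
    private final String fullName;

    PersonV1(String firstName, String lastName) {
        this.fullName = firstName + " " + lastName;
    }

    String getName() {
        return fullName; // returns the internal field directly
    }
}

// Hypothetical "after" version: the internals were refactored into two fields.
final class PersonV2 {
    private final String firstName;
    private final String lastName;

    PersonV2(String firstName, String lastName) {
        this.firstName = firstName;
        this.lastName = lastName;
    }

    // Same signature and same result; clients of getName() need no change.
    String getName() {
        return firstName + " " + lastName;
    }
}
```

Had `fullName` been a public field instead, the refactoring in `PersonV2` would have broken every client that read it.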

That's why OO focuses on methods as its preferred way of coupling. They offer a flexible place to insert logic and provide an integration point on which to base your dependencies.

14
  • "Exposing those properties doesn't violate the principles of encapsulation in oop.". It does though. This is not a matter of opinion, but just fact. Encapsulation is "encapsulating", erecting a capsule, a barrier around, i.e. not making available, internal state, i.e. instance variables. It doesn't matter whether it is impossible to avoid, it doesn't matter if this is good or bad. The only thing that matters pragmatically is: do you lose control of the data or not. If you expose it, you lose control, if you don't, you don't. Commented May 14 at 7:09
  • 5
    @RobertBräutigam In what universe does encapsulation equate to creating a 100% opaque wall around an object. If I have a person instance, what's so damaging about me being able to know their name (via some property getter or method)?
    – Peregrine
    Commented May 14 at 7:17
  • 1
    @Peregrine It's not 100% opaque. It is a wall of behavior that is derived from requirements, i.e. the Ubiquitous Language. The "damage" of exposing a name is that you cannot reason about it anymore. What is it for? Do I need one string, do I need firstname and lastname, do I need a salutation? You have no clue. If you include behavior instead of giving it out, you'll know the context, and can therefore reason and change in isolation. This is the best I can do in a comment, HTH. Commented May 14 at 7:53
  • 1
    @RobertBräutigam: if you expose the "length" of a string, but do not allow the length to be modified, how is that losing control of the data? Commented May 14 at 12:16
  • 3
    @GregBurghardt String.length is a bad example, it sits at the edge of a library. Still, "losing control" means that as soon as you publish that, you can't modify your string to be an unlimited string. If the length is int, now you can't have a longer string than that. If you calculated the length based on bytes before, and people started using it to store it, you'll have a huge problem when unicode comes along. As soon as its published, it's out there. You can't take it back, you'll have to support it. This is what I mean by "lost control". Commented May 14 at 16:28
18

This is not a getter

Just because a method has get in its name doesn't mean it's a getter; nor does renaming a getter to, say, fetchName stop it from being a getter.

In fact, the getItem method is more of a lookup: search for item x in your inventory and return it. Under the hood, you could have all sorts of elaborate logic, including fetching results from a database, reading files, etc: you are not simply returning an attribute.

In contrast, suppose that you implement the HashDict as a wrapper around an existing implementation of it (e.g. to be able to swap it later). Then a getCollection method that returned the internal dictionary would be a getter (and thus something to avoid); you should instead use the other interface methods to interact with it.
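A minimal Java sketch of that distinction. The shape of `HashDict` is not given here, so the generic signature, `putItem`, and the use of `Optional` are all assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical wrapper around an existing map implementation.
final class HashDict<K, V> {
    private final Map<K, V> entries = new HashMap<>(); // can be swapped later

    void putItem(K key, V value) {
        entries.put(key, value);
    }

    // A lookup: takes a key, searches, and returns what it found (if anything).
    Optional<V> getItem(K key) {
        return Optional.ofNullable(entries.get(key));
    }

    // Deliberately absent: a getter that hands out the wrapped map, such as
    //     Map<K, V> getCollection() { return entries; }
    // Callers holding that map would bypass this class entirely.
}
```

Because nothing outside `HashDict` ever sees the `HashMap`, replacing it with another implementation only touches this one class.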

To summarize, this is just naming confusion: a getItem method of a dictionary is not a getter.

10

I'll take a stab at it but I'll just touch on the hash table question.

Firstly, the person that replied to you is an advocate of Object oriented domain driven development (a term coined by him), and therefore I'll answer in that context. The term has a specific meaning: object orientation (with state encapsulation) applied to domain objects, i.e. objects that have a meaning in an application. In that sense a HashTable is not a domain object, at least in most applications; it's a tool that domain objects can use. You will not talk with your customer about a HashTable being a business-relevant entity / value.

Now what if your task was to design a storage system that can "get an item given a key", "store an item given a key" and "return the count of stored items"?

In that case those are business requirements and are also fine, as per the domain. That's what the domain dictates you to do.

The key difference is that Product.getName() does not answer a question you might ask your customer: "what can be done with the product?" In the case of a storage system, by contrast, Storage.getItem('x') does answer the question "what can be done with the storage?".

Secondly, you'll note that Product.getName() is a pure getter, as it has no parameter: it does not need additional input to compute the returned value, in contrast to Storage.getItem('x'). However, the fact that a method has no parameter is just a hint, as Storage.count() would be okay, given it is a business requirement (point 1).

Third, the point of this concept is to try to group functionalities vertically instead of horizontally. Product.getName() most likely implies a layered architecture where the data is not grouped with the behavior.

Note: I make no judgement here whether layered architectures are bad, or whether this type of OO is a good way of achieving functional cohesion / grouping functionalities vertically, it is outside the scope of this answer. However this is what I got from reading his articles.

9
  • 2
    I feel like I understand what you're saying, but I also feel like you have neither agreed nor disagreed with my central thesis: you can't have OOP everywhere. I think you've hinted agreement but I fear that's just my wishful thinking. Commented May 12 at 3:04
  • 3
    @candied_orange Agreed, I had the same feeling when posting. I'd say yes, you can't, ultimately, but an interesting follow-up question might be: under what circumstances can you, and most importantly, when should you?
    – Ced
    Commented May 12 at 13:06
  • @candied_orange: Hypothetically, if you posit that OOP cannot be deterministically attributed down to each individual line of code in isolation, then your question would effectively be unanswerable. Consider the possibility that there might not be a conclusive yes/no to your question, simply because it's not that deterministically and granularly defined.
    – Flater
    Commented May 13 at 4:07
  • @candied_orange the conclusion “you can't have OOP everywhere” is based on the assumption that there is a set of rules to be strictly followed to qualify as “object oriented”. That’s just wrong. There is no single definition of OOP that all people agree on.
    – Holger
    Commented May 13 at 11:11
  • 1
    @pjc50 So Holger can talk past me and double down but I can't? Aw man. Fine. I'll stick to a strictly necessary hostile approach. Commented May 13 at 14:19
5

You can't infer OOP from a line of code

This question is a bit of a semantic puzzle in what it's asking.

But that not every line of a program can be purely object oriented.

Yeah, because object orientation cannot be inferred from a line of code. That's like looking at a brick and asking if it's part of a two-story house. The brick doesn't know. The brick can neither prove nor disprove whether it's part of a two-story house.

Object orientation isn't an architecture either, in the sense that e.g. Clean Architecture is (this is not CA-specific, I'm just using it as an example architecture). For architectures, there's an ideal codebase and most of the time your supposedly "Clean Architecture" codebase is a mix of said architecture and some real world compromises you've made along the way. When we argue whether something is or isn't a particular architecture, what we're really disagreeing on is what percentage of the codebase adheres to the architecture. Put differently, different people have different opinions on what percentage of impurity (i.e. compromise) they find acceptable.

Object orientation is a modeling philosophy. I want to start from an oversimplified example. You could nitpick this to death, but its intention is only to act as a very simple baseline, not a meticulously precise one.
The measure of OOP-ness is not expressed as a percentage of what parts are(n't) OOP, it's a measure of overall design quality and depth of implementation.

Consider the following brief:

We need an application in which we can save vendors, validate our contracts with them, and purchase items according to these contracts.

An object-oriented programmer is going to see Vendor, Contract, Item. While the focus of this answer isn't on defining functional programming, the contrasting takeaway that a functional programmer would end up with is saveVendor, validateContract, purchaseItem (I believe it would be more correct to omit the noun from these names to be true FP, but I'm more in the OOP camp so I might be biased towards including them).

This isn't a matter of wrong or right, or even of labeling what % of your codebase is or isn't object-oriented.

At the end of the day, that object-oriented programmer is still going to have to write the save, validate and purchase logic, but they will categorize this as part of the objects that they designed. Similarly, a functional programmer is still going to have to create some kind of data container that contains e.g. a vendor's information, but they're going to design it based on how it slots into the functionality they've built.
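As a rough illustration of the two orientations, here is a Java sketch for the contract part of the brief. Everything in it is hypothetical: the brief says nothing about quantities, so `maxQuantity`, `allows`, `ContractData` and `Purchasing` are made-up names and rules.

```java
// Object-first: the noun is the top-level structure and owns the logic.
final class Contract {
    private final int maxQuantity; // hypothetical contract rule

    Contract(int maxQuantity) {
        this.maxQuantity = maxQuantity;
    }

    boolean allows(int quantity) {
        return quantity <= maxQuantity;
    }
}

// Function-first: the verb is the top-level structure; the data container
// is shaped by what the function needs.
final class ContractData {
    final int maxQuantity;

    ContractData(int maxQuantity) {
        this.maxQuantity = maxQuantity;
    }
}

final class Purchasing {
    static boolean validateContract(ContractData contract, int quantity) {
        return quantity <= contract.maxQuantity;
    }
}
```

Both halves contain the same comparison; what differs is which thing (the object or the function) was used as the first order of structure.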

"Oriented" means "what you used as the first order of structure". It doesn't mean "only does this".

Tangent on statics

Much like you can't write a program free of static methods (you at least need main)

This is a nitpick, but an important one to connect to the rest of this answer. You don't need main. But as long as you don't tell the program what method to run on startup, then you do need a default method that will be run. And statics overlap with defaults in the Highlander sense: there can be only one.

The specific nature of how statics operate as opposed to instances is irrelevant as to how you decide what needs to be executed on startup. There's enough similarity that one can represent the other, but they're not inherently tied together. The compiler could just as easily have defaulted to doing (new Program()).Main() as opposed to Program.Main, without really making any difference as to the default nature of the startup method being called.
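The same point can be sketched in Java, where the conventional entry point is also static. In this hypothetical `Program` class (the `run` method and its message are assumptions for illustration), the static `main` does nothing but create an instance and hand over control, which is what a "default to `new Program().main()`" rule would amount to.

```java
// Hypothetical Program class: the static entry point is only a default
// starting location; all actual work happens on an instance.
final class Program {
    private final StringBuilder log = new StringBuilder();

    String run(String[] args) {
        log.append("started with ").append(args.length).append(" argument(s)");
        return log.toString();
    }

    public static void main(String[] args) {
        System.out.println(new Program().run(args));
    }
}
```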

OOP and sharing state

This enables encapsulation. State is not something to share, simply something that changes behavior.

That is not a core OOP tenet. It's good advice, but it's not what drove the concept of OOP to exist in the first place. Experienced OOP devs and FP devs alike will agree with this.

The key distinction being made here that's pro-encapsulation is that processes are significantly more ephemeral than static definitions are. You can more easily adjust the logical implementation of a process (while maintaining its overall input-output structure) than you can change a static definition while still keeping it compatible with its previous structure.

To put it differently, the fact that a function/method needs to map an input to a process and then (optionally) to an output inherently enforces loose coupling between the caller and the implementation. It's possible to achieve similar loose coupling with static definitions (e.g. an interface around a class), but it's easier to forget to do this and sometimes it requires more effort.

For OOP specifically, there's an additional consideration here whereby you want thematically linked processes grouped together, e.g. you're going to find the way to contact a vendor and the way to update a vendor's details both in the Vendor class.
When you expose state externally, you are giving outsiders the ability to write their own logic based on your state fields. But, if this logic hinges on your state fields, the odds are fairly high that this logic should be grouped with you instead of existing in some other consumer of your class.

This leads to a general disapproval of sharing state as a means of preventing others from developing processes that you should be keeping ownership over. By prohibiting state from being shared, you inhibit that from happening.

Should we ban access to state?

Now, I also wrote this in my answer on the question that you answered (which sparked you to post this question), but I very much agree with the spirit (not developing logic external to the class that owns the related fields) while also very much disagreeing with the notion that blanket-banning something is the correct way to enforce that.

So my feedback is going to be more that exposing some state is reasonable, but when consuming said state it should be evaluated whether the consumer is the right place to add this new logic that you're creating.

I preach pragmatism, not blind dogma, which means that I'm generally against any kind of blanket rule unless you can exhaustively prove that there is no counter against it.

However, some boundaries, like library boundaries, make it impossible to move the method and so you move data. [..] And so OOP ideals are compromised in the face of reality.

Just to reiterate, the thing that's suspended in this example scenario of a library isn't an OOP ideal, but it's still something we'd like to be able to do regardless.

A library boundary means you lose meaningful access to the source code: you can no longer adapt it to your extended needs. That's sort of what distinguishes a library from "other files in my codebase". This is also a bit of a semantic puzzle.

Let's explore a world where we could change this - open source libraries have that ability. While it would be nice to be able to extend it, this becomes an issue of ownership. If the library now has a bug, the original authors are not particularly on the hook for it, since they neither understand nor actively support your extensions of their logic. Similarly, you might not be able to troubleshoot the issue either.
For the example of open source code, you are free to make your own fork of the code, but the library's author has no responsibility over your fork. At best, they might be charitable enough to help you out.

This ropes in another consideration for good development practices: clear ownership. This can only really be ensured if you enforce the separation of logic from different authors.

Note that by "author" I don't mean a specific person, I mean an entity who bears responsibility of any and all people that came before them or worked alongside them and authored code. If Bob leaves MyCompany and Tom now works for them and he inherits Bob's code, then MyCompany is the singular author of this code.

There are other ways to integrate libraries into your codebase that don't require you to break that boundary of ownership. Wrapping is the straightforward solution here. Extension methods can help your logic feel more integrated with the library but they're really just side-loaded helper logic that can only access the public data.

Usually with explicit consent of the author (through their design), it's possible to provide inheritable classes so that you can access protected state and integrate your logic more than just by wrapping it. This works well but it usually requires some design considerations by the library authors so it's up to them whether they really want to take on that additional effort (and if it's worth the added users of their library).
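A small Java sketch of wrapping, using `java.time.LocalDate` as the stand-in library type (it is a final class, so it cannot be subclassed). The `DueDate` wrapper and its methods are hypothetical names chosen for illustration.

```java
import java.time.LocalDate;

// Hypothetical wrapper around a library type we cannot modify or subclass.
final class DueDate {
    private final LocalDate date; // the library object stays inside

    DueDate(LocalDate date) {
        this.date = date;
    }

    // Our own logic lives in our own class, so ownership stays clear:
    // the library authors own LocalDate, we own DueDate.
    boolean isOverdueOn(LocalDate today) {
        return today.isAfter(date);
    }

    DueDate extendedByDays(long days) {
        return new DueDate(date.plusDays(days));
    }
}
```

The wrapper only uses the library's public API (`isAfter`, `plusDays`), so a bug in `DueDate` is unambiguously ours and a bug in `LocalDate` is unambiguously theirs.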

Extra credit and examples of why dogma is bad.

For bonus points, please do show me how an OOP purist would design a hash table without getters.

This is more of another semantic puzzle. "Getter" means different things to different people. In a language that has actual property getters (such as C#'s public string MyProp { get; }), does "getter" refer to specifically that, or also any method like public string GetMyProp() { ... }?

If the former, then the easy (but in my opinion cop-out) answer is that you can replace every property with a private field and accompanying methods to access it, and then you've technically created something without a getter. For obvious reasons, that's not really an answer to your question though.

Additionally, this question paints a scenario that completely ignores the purpose of the advice that sparked the previous question (which in turn sparked this one).
It's impossible for me to conclusively tell you how to design [..] without already being aware of how an external consumer would make use of whatever I expose, if and how they build logic around it, and if that logic would've been more appropriate to add into my class instead of the consumer's.

There is no pure OOP way of solving this problem without knowing the consumer and what they intend to do with your object. Simply put, if you follow the "expose no state, only behavior" guideline (shortened for brevity), then you're going to run into a lot of issues when you realize that OOP starts from an object's definition and then builds its behavior inside of that definition. This inherently means that the author of that object definition must invariably be the author of the behavior included in that type definition.

If it's outside of the type definition, then it violates your "add behavior to the type" goal of exposing behavior.

And if it's a different author, then it would be bad to have one type be designed by two different authors without them at the very least being aware of each other. As we established, you don't know your consumer yet in this scenario, so active knowledge of your consumer is not possible here.

This is a really good example of why I detest blind dogma and consider it one of the bigger issues that plagues software development. The question you asked here is a blind application of the letter of the guideline, and you did not notice that your question actually does not touch on the spirit of the guideline.

8
  • 1
    "so active knowledge of your consumer is not possible here." It's really confusing how you condemn my thinking while supporting my argument. Commented May 13 at 4:57
  • 1
    @candied_orange The guideline refers to cases where there is a consumer who wishes to do something that is not yet possible and thus tries to build it for themselves based on your object's state. The guideline states that you should not expose state so that this does not happen, and to instead add behavior to the object instead of consuming its state this way. This is inherently only relevant in a scenario where (a) you know what your consumer is trying to add and (b) when you're actually able to add said behavior to the consumed object (i.e. its design is not finalized).
    – Flater
    Commented May 13 at 5:07
  • 1
    @candied_orange By contrast, your "bonus question" is one where you (a) have not explained anything about what a consumer would like to see added and (b) are asking for a finalized design that would account for all future consumers. It's orthogonal to the guideline's entire context, hence why I believe you've not quite understood what the guideline is trying to convey.
    – Flater
    Commented May 13 at 5:08
  • That conundrum is exactly what I was trying to depict. That's the boundary. Now imagine yourself as an OOP purist and tell me how you deal with that. Commented May 14 at 13:04
  • @candied_orange I don't know whether you're trolling or patently unwilling to actually hear why your question is not answerable based on the outlined parameters you've set for it. Just because it makes grammatical sense in English does not mean that it can be meaningfully answered.
    – Flater
    Commented May 14 at 23:10
3

I almost responded with an answer similar to this on the other question and decided not to. I've been tempted twice, I'll bite.

What are we even arguing about here?

Why do we do Object Oriented programming? (or Functional Programming, or Constraint-Based Logic Programming, or...). I mean it's safe to say that any program could be written in a Procedural style and almost every programming language (including Java) offers ways to write Procedural code.

The answer to the question is going to vary by who you ask (and when you asked them, the answer in 1997 was not exactly the same as in 2024). But the answer they give is probably going to include words and phrases like encapsulation, subtype polymorphism, data abstraction, etc.

Although I think a lot of arguments against Procedural are strawmen that at best may have been applicable 20 years ago, I don't think anyone would argue that Procedural code does great with those things in the general case.

So is Procedural bad? Well, it tends to be bad at those specific things. Are those things we want? Again answers differ and people also argue about how well OOP does or doesn't deliver on them, but the answer is generally yes.

But we don't need them all the time. There's plenty of Procedural code, even large codebases of Procedural code, that are just fine without them, or at least without the OOP version of them. The fact that they aren't necessary creates a division:

  1. The first camp says that since you don't need them all the time, it's blindly dogmatic to enforce adherence. It forces programmers to contort perfectly reasonable square pegs in order to fit arbitrary round holes. YAGNI: we're over-engineering stuff to appease the Cargo Cult Priests of OO orthodoxy. I read your question as resonating with this viewpoint. I know I do.

  2. The second camp says that habits rule. Discipline rules. It's useful to get into the habit and discipline of doing it "right" even when it's not strictly necessary, for the times when it is, because you don't always know in advance which times those are. Bad habits like primitive obsession have been known anti-patterns for long enough now that we shouldn't have to argue about them anymore. We've all been burned by a requirements change that invalidated assumptions that were perfectly reasonable when we started but didn't really need to be made in the first place.

This dichotomy shows up any time a discipline is sometimes useful but not universally necessary. Static typing vs dynamic. Strong static typing (Haskell, Rust, ML) vs mainstream static typing (Java, C#, TypeScript). TDD vs not-TDD. The argument over the desirability of getters and data boundaries has the exact same shape. It's not a coincidence.

This split is further driven by the kinds of software we write. The requirements for an embedded system, a relational database engine, a video game, and a desktop GUI look very different, even if they're all written in the same language. It is entirely possible that Bob lives in the first world where the problems almost always stem from over-engineering and people should use a g*dd*mn hashmap of strings instead of trying to model every single tiny entity in the system as a "domain object". It's entirely possible that Alice lives almost exclusively in the second world where even the most minor of requirements change requires touching a dozen or more classes because somebody sprayed their assumptions promiscuously all over the codebase with no thought to the future. Bob and Alice are going to argue a lot on the internet, possibly without even realizing they are coming from two very different places.

We think we're arguing about the same things. I'm willing to bet that doesn't happen as often as it seems. A lot of the debate is driven by what boils down to personal preferences, but those preferences are forged in the crucible of our experiences, and those vary from person to person.

So it's all just preference?

Not entirely. There are real tradeoffs. But they're tradeoffs: different people are going to want to be at different points on the curve. Just remember that it's easy to point to the cases where your preferences work and theirs fail. But so is the reverse. If you find yourself getting into this disagreement with people then I think it's worth addressing the elephant in the room that both the underlying world views are valid in their own way even though they give different answers.

2

Does this count as a "pure" HashTable? Note that any data container, like an array or hash table, can expose a ForEach method that takes a function pointer as an argument and calls that function with the values inside the container. You don't really need to expose the function pointer; you can encapsulate that as well, but then you'd have to create a new HashTable class for every action you may need to perform on one of its elements, and that will quickly spiral out of control (which is why we don't do such a thing in professional software engineering regardless: getters allow reusable code, but I'm sure you know that already).

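using System;
using System.Collections.Generic;

public class ThePureHashTable<TKey, TValue>
{
    // Assumption: the original left storage unspecified; a private Dictionary
    // stands in here so the sketch compiles. No getter or indexer is exposed.
    private readonly Dictionary<TKey, TValue> items = new Dictionary<TKey, TValue>();

    public void DoSomethingWithValue(TKey key, Action<TValue> something)
    {
        // Throws KeyNotFoundException if the key is absent.
        something(items[key]);
    }

    public void Add(TKey key, TValue value)
    {
        items.Add(key, value);
    }
}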
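The ForEach idea mentioned above is not shown in the class, so here is a minimal Python sketch of it. The class name `CallbackTable`, its methods, and the private dict used as backing storage are all assumptions made for illustration; this is not how any particular library does it.

```python
from typing import Callable, Dict, Generic, Hashable, TypeVar

K = TypeVar("K", bound=Hashable)
V = TypeVar("V")


class CallbackTable(Generic[K, V]):
    """Hypothetical getter-free table: callers pass behavior in."""

    def __init__(self) -> None:
        # Assumption: a private dict stands in for the real storage.
        self._items: Dict[K, V] = {}

    def add(self, key: K, value: V) -> None:
        self._items[key] = value

    def for_each(self, action: Callable[[K, V], None]) -> None:
        # Visit every pair; the table never hands back its storage.
        for key, value in self._items.items():
            action(key, value)


table: CallbackTable[str, int] = CallbackTable()
table.add("a", 1)
table.add("b", 2)

seen = []
table.for_each(lambda key, value: seen.append((key, value)))
```

Note that the callback can still copy values out (as `seen` does here), so this inverts the call direction rather than hiding the data from a determined caller.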
2
  • Not a thing to do with general-purpose data structures, but for the sake of the argument, yeah a getter free design would look something like that (though you could argue this is not exactly a hash table anymore). You would not encapsulate the action (by which I presume you mean make it entirely internal to the class) - as that's a very different design. Passing in an action is just dependency injection (it could also be an object) - you'd do this sort of thing if your class needs to call something, but cannot know what that something is. But again, not how hash tables are conceptualized. Commented May 11 at 23:24
  • Yeah, you can push OOP as far as you like; you just flip the way you call functions. I think this question is just a continuation of the argument on the previous one
    – Ewan
    Commented May 12 at 11:51
2

The actual question you are asking is: "How far can you push the "Tell, don't ask"-principle."

The answer is, you should push it as long as it improves encapsulation, but you can easily push it so far it will have the opposite effect.

The principle is another way to say an object should control its own state. An external client should not perform "open brain surgery" on an object and mutate its internal state, like in this example:

 var counter = new Counter();
 // increment counter
 counter.value = counter.value + 1;
 // print counter
 console.log('Count:', counter.value);
 // reset counter
 counter.value = 0;

Let's apply "Tell, don't ask" to let the counter control its own state:

 var counter = new Counter();
 // increment counter
 counter.increment();
 // reset counter
 counter.reset();

But it would clearly be taking it too far to do things like this:

 counter.printTo(console);

Since now the counter suddenly needs to be aware of how we are going to use the counter value, and this breaks separation of concerns. Exposing the counter value through a getter is perfectly fine. How it is used should not be the concern of the counter.
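The snippets above never show the `Counter` class itself. A minimal sketch of what it could look like, written in Python rather than JavaScript, with a read-only `value` property as the getter (the class body is my assumption, not taken from any real code):

```python
class Counter:
    """Hypothetical counter that owns every change to its state."""

    def __init__(self) -> None:
        self._value = 0

    def increment(self) -> None:
        self._value += 1

    def reset(self) -> None:
        self._value = 0

    @property
    def value(self) -> int:
        # Read-only: no setter is defined, so callers cannot assign to it.
        return self._value


counter = Counter()
counter.increment()
counter.increment()
print("Count:", counter.value)
counter.reset()
```

The caller decides what to do with the value (print it, log it, compare it), while the counter alone decides how the value changes.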

Getters appropriately used are not "compromising" the ideals of OOP.

1

My view is that OOP primarily exists to permit decentralised designership of software.

That is, it's intended to facilitate the ability of separate teams to design and maintain software components, with less coordination amongst themselves than would be necessary otherwise.

In OOP, everything is done via a dynamic procedure call, with no direct access to in-memory storage. This means that the provider of the component retains an interception point at all the outer limits of their component, behind which they ultimately reserve control of the entire internal design.

It's not a panacea in this regard, and I don't want to go through all the ways in which this idea fails, but I want to set the basic scene.

There's another, conflated, purpose for which OOP is used, and that is for a single design team (or individual designer), to attempt to manage complexity within the piece of software under their purview. The idea here is not to separate design control between people, but to divide-and-conquer the complexity of a single application.

There's lots of stuff written about OOP which is essentially about pursuing this secondary purpose, rather than the first.

Again, I don't want to get into all the ways this idea fails. I only want to note that there are in fact multiple purposes at play, which are not really connected, and that ideas about each typically become conflated.

Now, I'm less clear about how this idea of not sharing data between objects comes into the picture. Which of either of these purposes I've mentioned does it serve?

One of the essential purposes of software in general is to process data, and therefore the existence of internal flows of data is intrinsic and irreducible (not to mention how data ultimately flows externally to and from the computer users).

How do you get data from an object, or show anything to the user, if not by a getter (and put data into another object, or take input from the user, if not by some kind of setter)?

I appreciate @RobertBräutigam's point that we don't have to take everything too literally, and ignore the gist of the point, but I think (though he uses different language) he is essentially diagnosing the problem with software as being the existence of internal data flows - and his prescription is to stop data flowing.

Certainly, it is possible to have software with badly disorganised internal data flows, or data flows which (though not disorderly) are unnecessary in their complexity.

He might be familiar in his own experience with software designs where there are too many objects and there is too much data passing from one object to the next, and each object is doing little or nothing with the data it contains. His advice would make sense in that context: get rid of the long bucket brigade of getters and DTOs, and just put a method on the source object which does the relevant thing with the data it already has.
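To make that narrower reading concrete, here is a small Python sketch. The `Order` and `LineItem` classes are invented for illustration; the point is only that the total is computed where the data lives instead of being assembled by a caller from a chain of getters.

```python
class LineItem:
    """Hypothetical line item; prices are kept in cents to avoid float issues."""

    def __init__(self, unit_price_cents: int, quantity: int) -> None:
        self._unit_price_cents = unit_price_cents
        self._quantity = quantity

    def subtotal_cents(self) -> int:
        return self._unit_price_cents * self._quantity


class Order:
    """Hypothetical order that answers questions about its own items."""

    def __init__(self) -> None:
        self._items = []

    def add(self, item: LineItem) -> None:
        self._items.append(item)

    def total_cents(self) -> int:
        # The order does the work itself instead of handing out its items.
        return sum(item.subtotal_cents() for item in self._items)


order = Order()
order.add(LineItem(unit_price_cents=250, quantity=2))
order.add(LineItem(unit_price_cents=100, quantity=3))
total = order.total_cents()
```

Whether `total_cents()` is itself "a getter" is exactly the kind of question the slogan does not settle.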

But I think he's gone far too far in suggesting that "no getters" - or any equivalent language - is a useful general guideline for how to design object models. It's a farcical caricature, from his very own hand, of the more nuanced and more narrowly scoped point he probably wanted to make about certain special cases of how designs can go wrong.

1
  • 1
    Component split defines what is an implementation detail. Components have to hide their state and be split along state access seams because state management is usually the hardest problem. Hence intercomponent state access indicates that either state is not hard to manage or the component split is invalid (does not match current requirements). Hence getters are a symptom of either a design problem or complexity that is not state related. Good answer, but the conclusion is too polite.
    – Basilevs
    Commented May 16 at 0:48
-1

I have little to no experience with the philosophical concepts of OOP (which for me means object oriented programming). When facing this kind of question I try to correlate it with real life, where objects are inert things that stay where and how they are placed by external forces (forces in the physical sense). Their internal state is unknown to outsiders, who cannot even deduce it from their appearance. Hence it is not possible to do anything with objects without using force (again in the physical, or more specifically mechanical, sense).

So...

How far can you push Object Oriented Programming?

Until you have built a beautiful inert system that is useless without something to animate it, though at least it would be beautiful, delighting everyone around it. What would a database be without read access? Pardon the syrupiness.

1
  • 1
    Database reading is Platon's table eating. While getters are dealing with database's storage and table leg count.
    – Basilevs
    Commented May 16 at 0:35
-1

This isn't exactly a question. Nonetheless I will attempt to answer it as it was not-quite-asked.

First, what is a getter in a system without functions? I normally assume that we have identity functions; that is, we can write something like the following Python:

def id(x): return x

Can this be a method on an object? Of course:

class Id:
    def run(self, x): return x
id = Id().run

But is it a getter? If so, then we have a failure of composition of functions and we can't even write standard function-oriented code without leaving the object-oriented world. To sharpen this, consider the K combinator:

def k(x): return lambda y: x

If we turn it into classes, then we must explicitly close over that first argument:

class Konst1:
    def run(self, x): return Konst2(x)
class Konst2:
    def __init__(self, x): self.x = x
    def run(self, y): return self.x

So we can't even do something as simple as combinatory logic without getters!

I think that a better question is whether getters are actually emblematic of object-oriented programming. The E family of languages does not have getters, because it doesn't have classes or prototypes; instead, objects are written as immediate literals which implicitly close over their surrounding scopes and a compiler is required to compute the implied class structures. Let me write the K combinator at an E REPL:

? def k { to run(x) { return def _ { to run(y) { return x } } } }
# value: <k>

? k(4)(2)
# value: 4

I think that this exposes a fundamental issue with the way you've framed things, although I'm not sure how to improve the framing. If I write a hash table in E, is it inherently less getter-ish than in Python? Is it less object-oriented? I would say yes and no respectively. So "object" is polysemous; it could refer to something with classes (Java), with prototypes (Python), or with object literals (E).

4
  • 2
    if we let "getters" include "functions that return state" surely E has getters?
    – Ewan
    Commented May 13 at 8:43
  • @Ewan: That's what I'm asking, yeah. Is it a getter to return something within one's closure? If so, isn't all of functional and combinatory programming excluded?
    – Corbin
    Commented May 13 at 15:05
  • 1
    I guess thats my second question, do you think functional programming is still OOP? surely they are different and you dont expect them to follow the same rules?
    – Ewan
    Commented May 13 at 15:20
  • @Ewan: To me, object-oriented programming is functional programming with message sending. It can be pure, it can have limited mutability, it can be statically/gradually typed. The biggest difference between e.g. E and OCaml is that OCaml doesn't have easy syntax for asynchronous message delivery.
    – Corbin
    Commented May 13 at 18:17
-1

A hash table is an association between keys and values with requirements on the keys to enable certain operations to be faster.

Specifically, keys need to have a fast map to a "hash value", an integer, that is as close to uniformly spread over the set of integer values as possible and as "random" as possible, yet any two keys that compare equal must have the same hash value.
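As a small illustration of that last requirement, here is a Python sketch of a key type whose equality ignores case. The class `CaseInsensitiveKey` is made up for this example; the rule it follows (hash the same normalized form that equality compares) is how Python's `__eq__` and `__hash__` are expected to agree.

```python
class CaseInsensitiveKey:
    """Hypothetical key type: equal keys must produce equal hashes."""

    def __init__(self, text: str) -> None:
        self._text = text

    def __eq__(self, other: object) -> bool:
        if not isinstance(other, CaseInsensitiveKey):
            return NotImplemented
        return self._text.lower() == other._text.lower()

    def __hash__(self) -> int:
        # Hash the same normalized form that __eq__ compares.
        return hash(self._text.lower())


table = {CaseInsensitiveKey("Alice"): 1}
found = table[CaseInsensitiveKey("ALICE")]
```

If `__hash__` used the raw text instead, the two keys would compare equal but usually land in different buckets, and the lookup would fail.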

This allows a set of techniques to be used that make checking for collisions and lookup much faster than with the more traditional balanced tree approach.

So if we are using a Hash table we must care about this distinction; something about the speed of collision detection must be of value. In addition, these tables do not store the keys in a "sensible" order (not alphabetical etc).

The kinds of operations a hash table supports are these:

  1. A "foreach member" operation, which does something on each and every key-value pair. The order it does so is unspecified. This is an expensive (O(n)) operation.

  2. A "fast match" subsetting operation, which operates on the key and does something only on elements that match. This is a fast (O(1)) operation.

There is often some business logic added to hash tables, like "only one element for each key", that is not inherent in the properties of a hash table.

The Key-Value structure of a hash table can also be discarded; what you really have is elements, with an equality-comparison (forming equivalence classes) and a hash operation on it. The equality and hash operation need not examine the entire state of the elements.

We typically project our elements into (K,V) pairs then operate on the K component with Hash(K) and Equal(K,K).

In some cases, hash tables actually care about all of the elements in a given hash bucket. To go further down the purity route, we can discard the equality operation entirely and leave that up to the next layer.

So now, our Hash Table of E is something that is provided with a Hash(E) operation. It places the elements into buckets, where all elements with the same Hash(E) value go in the same bucket, but no guarantee that two elements with different Hash(E) are in different buckets.

We can then boil this down further and have our Bucket table. The Bucket table takes (K,V) pairs where the K is an integer, and stores them in Buckets. The operations on it are "look at everything" and "look at a given Bucket". It resizes itself as you add elements so that the number of cross-Key-value Bucket collisions stays low.

A Bucket table:

Bucket<E> is Iterable<(K,E)> and Iterable<E>

you can iterate it as a list of (K,E) values.

Bucket<E>.Sieve( K, F(Iterable<K,E> or Iterable<E>) )

you can "Sieve" out elements that are in the same Bucket as a given K.

Hash then adapts a Bucket table. It uses a projection P to map elements to integers, and provides:

Hash<E,P> is Bucket<E>
Hash<E,P>.Sieve( X, F(Iterable<K, E> or Iterable<E>) )

where Sieve(X, ?) means Sieve(P(X), ?).

We can add collision rules (a uniquehashtable?), insert, exists operations on top of this, sort of like this:

hash.exists(x) := hash.sieve( x, any_of( e->(e==x) ) )
hash.insert(x) := hash.sieve( x, l->l.insert(x) ) // or maybe insert({p(x),x})
hash.access(x, f) := hash.sieve( x, filter( e->(e==x), f ) )
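A rough Python sketch of the Bucket and Hash layering described above. All class and method names are my own, the bucket count is fixed (the resizing mentioned earlier is left out), and insert is written as a direct method rather than going through sieve, so treat it as one possible reading of the pseudocode rather than a faithful translation.

```python
from typing import Any, Callable, Iterator, List, Tuple

Pair = Tuple[int, Any]


class BucketTable:
    """Stores (integer key, element) pairs and exposes operations only."""

    def __init__(self, bucket_count: int = 8) -> None:
        # Assumption: fixed bucket count, no resizing in this sketch.
        self._buckets: List[List[Pair]] = [[] for _ in range(bucket_count)]

    def _bucket_for(self, key: int) -> List[Pair]:
        # Python's % is non-negative for a positive modulus, even if key < 0.
        return self._buckets[key % len(self._buckets)]

    def insert(self, key: int, element: Any) -> None:
        self._bucket_for(key).append((key, element))

    def sieve(self, key: int, f: Callable[[Iterator[Pair]], Any]) -> Any:
        # Run f over the one bucket that key falls into and return its result.
        return f(iter(self._bucket_for(key)))


class HashTable:
    """Adapts a BucketTable using a projection from elements to integers."""

    def __init__(self, projection: Callable[[Any], int] = hash) -> None:
        self._project = projection
        self._table = BucketTable()

    def insert(self, x: Any) -> None:
        self._table.insert(self._project(x), x)

    def sieve(self, x: Any, f: Callable[[Iterator[Pair]], Any]) -> Any:
        return self._table.sieve(self._project(x), f)

    def exists(self, x: Any) -> bool:
        return self.sieve(x, lambda pairs: any(e == x for _, e in pairs))

    def access(self, x: Any, f: Callable[[Any], None]) -> None:
        def visit(pairs: Iterator[Pair]) -> None:
            for _, element in pairs:
                if element == x:
                    f(element)

        self.sieve(x, visit)


table = HashTable()
table.insert("apple")
table.insert("pear")
hits = []
table.access("apple", hits.append)
```

The caller only ever hands functions to the table; the bucket list itself never leaves it.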

Now, one interesting thing with this "lack of getters" approach is that if our modest hash table was actually a massively distributed table, the operations still make sense.

Imagine if our buckets were actually distinct servers, and our elements were massive beasts of data that we really can't afford to transmit, but our operations were tiny and easy to serialize.

Now our bucket.sieve operation bundles up our function and sends it to the server, where it is run there and any results computed. Then the result is sent back.

Our hash built on top of bucket can do the same thing.

Similarly, if our table was mostly not in memory and "loading it" was expensive, the operation based model means that we know the domain of validity of what we are providing. We give access to the ability to iterate on our buckets, not the bucket list itself. And the iterable operation similarly gives the ability to visit the elements and do things to them, not "get a copy" or "get a reference".

...

I find that when thinking about OOP, it is a good idea to step back to its origins. OOP was about messaging protocols between opaque actors, not about methods and the like.

"You have a bunch of data. Here are some operations I want to run on all of the data, please do so and collect the results using this operation." is a messaging based operation.

Imagining that the things you are communicating between aren't living in the same memory space or even time zone also helps.

This doesn't mean your protocols should enforce marshaling - but rather you should see "do I really need a shared memory space to do this efficiently?", and if not, decouple it from the assumption. And don't force people into writing proxy objects that fake "the same memory space" to do it.

2
  • Why explain hashtable? People are coming to this site specifically because they do not want to read and answer algorithmic questions on Stackoverflow :)
    – Basilevs
    Commented May 16 at 1:13
  • 1
    This answer boils down to Tell Don't Ask, but the OP is well aware of the principle. The real question is - where is the principle not applicable?
    – Basilevs
    Commented May 16 at 1:19
-2

You can easily create a programming language which is 100% object oriented. For example, the Ruby language only knows objects. There is nothing that is not an object. There is nothing like the "literals" of Java which live outside of Java's object-orientedness. Even classes themselves are objects. Methods are objects. Closures are objects. And if you want to go real deep you can google Ruby's super-classes, modules, mixins, which give you a whole lot of further OOP power (while deftly solving the multiple-inheritance problem).

When programming with Ruby, at no point do you have the feeling that they have pushed it too far. Everything is very easy, normal, straightforward, elegant and there is not a single thing that yells "there is too much OOP here".

Obviously you are still, as a developer, free to (mis-)use this particular language in many ways. You can write "scripts" which just consist of a long long bunch of iterative spaghetti, which operate on whatever objects come with the language and the standard library, but don't define a single class or method of their own. On the other hand, you can fetch your old paper copy of the good old Design Patterns book and pattern the hell out of your program, until it is absolutely unrecognizable what it even does. Or you can write classes that are basically "C structs" with some code associated with them. None of that is a failure of OOP; it's simply the failure of the programmer overdoing or misunderstanding it.

I assume that the feeling that you can "overdo OOP" is mostly applicable to languages that either have a non-complete OOP implementation (like Java with its separation of primitives like integers, and associated two-class citizenships); or maybe to languages that were purely iterative/modular/package based before and where OOP features were added retroactively (e.g., C++, Perl, Python...), and where actually using the object-based "world" can be more difficult than just not doing it.

Interestingly, Ruby "hides" getters/setters by the convention that a getter does not have the get_ prefix - i.e. if you want to have a getter that fetches the name of a person, you would simply call the method name. The setter does not have a set_ prefix, but is simply a method name= (the "=" is valid in method names, in Ruby). This means that in many cases, you see code like print person.name or person.name = "Hans". The actual attributes of an object live in a separate namespace, so there is no issue. In many cases you also get away with defining catch-all methods (i.e., a single getter and single setter per class) that avoids having to type so much, or makes it very straight-forward to create libraries which encapsulate DAO-style DB access purely at run-time.

By the way, I fail to see what your example with the avoidable getter/setter in the context of a hash class brings to the question:

Specifically, when designing a class implementing a hash, a "getter" is perfectly fine OOP design, and it encapsulates the implementation details perfectly.

A getter in a hash is not a trivial "getName()"; it is the method that knows how to traverse the (possibly quite involved) implementation of the hash. The hash could be based on in-memory buckets with a hash over the content of the stored objects (which is in itself an interesting problem: how do you get such a hash code for an object in the first place?). It could also be an implementation hiding a distributed Redis service, with authorization, security, failover topics etc. It will probably handle multithreading synchronization in any case. You must have getter/setter methods in a hash that encapsulate the actual implementation.

1
  • 1
    On this site OOP is usually not a property of a language, but a paradigm or philosophy of data and behavior grouping. So language discussion seems off-topic.
    – Basilevs
    Commented May 16 at 1:04
-2

Here's my attempt to answer this supposedly unanswerable question:

You design a library's hash table collection exactly the way you see libraries designing their hash table collections.

I'm a big believer in OOP and TDA. But I hold on to my ideals in the face of reality by acknowledging that they only work in certain circumstances. I don't push them into situations they can't handle. What you see across these boundaries isn't OOP. A collection doesn't encapsulate what it contains. It has no idea what methods would provide the behavior you need. This isn't an object. Not in that sense.

Now sure, each collection has some internal state that it encapsulates just fine. And in that sense it's an object. Just not to you when you're trying to move your method to be with this data.

That limits what OOP can do for you. If you accept those limits for what they are you can actually get away with being fairly fanatical about OOP. Rather than getters being something to rationalize about they are something whose use should be isolated once inside a space that is safe for OOP.

What you do is make some actual OOP objects and back them with these helpful library tools. Yes you can use your objects to abstract them away. So no, you didn't move the method all the way to the data but you got it as close as you can. And that's good enough.
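Here's a minimal Python sketch of what I mean. The `Inventory` class and its methods are hypothetical; the only real library piece is the built-in `dict`, whose getter-style access stays isolated inside the object.

```python
class Inventory:
    """Hypothetical domain object backed by a plain library dict."""

    def __init__(self) -> None:
        self._stock = {}  # the library hash table, kept private

    def receive(self, sku: str, quantity: int) -> None:
        self._stock[sku] = self._stock.get(sku, 0) + quantity

    def can_ship(self, sku: str, quantity: int) -> bool:
        return self._stock.get(sku, 0) >= quantity

    def ship(self, sku: str, quantity: int) -> None:
        # dict.get is a getter, but its use never leaks past this class.
        available = self._stock.get(sku, 0)
        if quantity > available:
            raise ValueError(f"only {available} of {sku} in stock")
        self._stock[sku] = available - quantity


inventory = Inventory()
inventory.receive("widget", 5)
inventory.ship("widget", 3)
```

The rest of the program tells the inventory what to do and never sees the dict. The hash table didn't need a redesign for that.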

objects should contain every business function that are highly related to other functions and data in the object.

Data Boundaries are the root cause of Maintenance Problems

There's the ideal right there. And it's absolutely true. It's just not something your hash table is going to let you do. Build a space for your OOP. Don't insist that your whole program is OOP. None of them are. OOP is something you slip in where you can.

If you're willing to do that you don't need the hash table to be designed any differently.

2
  • 1
    Why should OOP consider state of collection items to be part of Collection itself? Collection does not have visibility inside items, so it makes no sense to move methods into it even when adhering to OOP. Collection manages item storage, not its data.
    – Basilevs
    Commented May 16 at 1:02
  • @Basilevs because that’s the data the method needs. Doesn’t matter what the collection thinks. What matters is you can’t move the method. OOP wants what it wants. It’s not that collections are a special exception to OOP. This is just reality vs ideals. Commented May 16 at 8:56
-2

TL;DR

API should be concerned with and hide details of the main function of a component. Getters may be useful, when state management is not the main function.

This has nothing to do with OOP

Component design goal is to split the problem into subproblems in a way that complexity of each component is manageable while complexity of their interactions is minimal. This has nothing to do with OOP.

Components hide their implementation details by definition. This has nothing to do with OOP.

Components are usually focused on exactly one complex subproblem. This has nothing to do with OOP.

State is a hard problem to manage. Usually state management is isolated to better manage its complexity. This has nothing to do with OOP.

OOP isolates state within components. FP isolates state outside of components (for example with IO monad).

OOP components hide state to avoid exposing its complexity. They expose the state if in doing so, the overall complexity decreases. This may happen when the subproblem complexity does not come from state management.

Getters expose state. Therefore their presence indicates that either the component design is not suitable for the problem at hand (it leaks implementation detail) or the design is focused on a problem that is harder to solve than state management.

In other words, question "are getters bad" is a special case of component design question - is state of this component the hardest problem it is trying to solve? If the component is not concerned with state management, it can expose whatever private fields it wants under the assumption that this would help other components to get the answer to the actual hard subproblem this one solves (obviously, details of the solution should never be exposed, just the result).

The heuristic "getters are bad" comes up due to the hard problem of state management and an attempt to isolate state management by default. The heuristic is only applicable for components managing state. Not every component has to manage state and some components manage one part of state but expose another.

Split your components to minimize API surface, isolate current, actual hardest problems as implementation detail. This has nothing to do with OOP.

Examples

String.length is a leak of abstraction (for multiple reasons including multi byte encodings) because mutable strings should not expose their state allowing indexed access. For immutable strings with fixed-length encodings this is not a leak, because they can afford indexed access and expose it as API (like arrays).

Array.length is not a leak, because it is not changing and indexed access is main purpose of Arrays.

HashMap.size is not a leak because Collections are explicitly designed stateful and the main problem being solved is storage, state management is left to clients.

Account.balance is a leak in context of bank transaction and is not a leak in context of Internet Bank UI.

Dog.legCount is a leak in context of house pet management and is not a leak in context of pet clothing shop or veterinary clinic.

DTO.property is not a leak, because value delivery is the main purpose of DTO.
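To illustrate the multi byte encoding remark in the String.length example, here is a short Python sketch showing that "length" depends on which representation you count. The helper `lengths` is made up for this example; the `str.encode` calls are standard.

```python
def lengths(s: str) -> dict:
    """Count the same string in three different units."""
    return {
        "code_points": len(s),
        "utf8_bytes": len(s.encode("utf-8")),
        # utf-16-le has no byte order mark, so two bytes equal one code unit.
        "utf16_units": len(s.encode("utf-16-le")) // 2,
    }


text = "na\u00efve"        # "naive" spelled with a precomposed i-diaeresis
thumbs_up = "\U0001F44D"   # one emoji outside the Basic Multilingual Plane

text_lengths = lengths(text)
emoji_lengths = lengths(thumbs_up)
```

A single length number silently commits the API to one of these units, which is the sense in which it can expose the storage choice.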

This has nothing to do with OOP

Split your components to minimize API surface, isolate current, actual hardest problems as implementation detail. This has nothing to do with OOP. Getters are just a minor part of API design. They are not special, apply other heuristics.
