115

I'm heavily using std::set<int> and often I simply need to check if such a set contains a number or not.

I'd find it natural to write:

if (myset.contains(number))
   ...

But because of the lack of a contains member, I need to write the cumbersome:

if (myset.find(number) != myset.end())
  ..

or the not as obvious:

if (myset.count(element) > 0) 
  ..

Is there a reason for this design decision ?

12
  • 7
    Most of the standard library works with iterators so normally functions returning iterators is what you would expect. Not to hard though to write a function to abstract that away. Most likely the compiler will inline it since it should only be a line or 2 of code and you will get the same performance. Commented Mar 1, 2017 at 13:07
  • 4
    Another (more fundamental) problem with the count() approach is that it does more work than a countains() would have to do. Commented Mar 2, 2017 at 9:44
  • 11
    The fundamental reason behind that design decision is that contains() which returns a bool would lose valuable information about where the element is in the collection. find() preserves and returns that information in the form of an iterator, therefore is a better choice for a generic library like STL. (That's not to say that a bool contains() isn't a very nice-to-have or even necessary, though.) Commented Mar 2, 2017 at 9:56
  • 3
    It's easy to write a contains(set, element) free function using the public interface of the set. Therefore, the set's interface is functionally complete; adding a convenience method just increases the interface without enabling any additional function, which isn't the C++ way. Commented Mar 2, 2017 at 10:55
  • 3
    Are we closing everything these days? How is this question "Primarily opinion based" in any way?
    – Mr. Alien
    Commented Mar 3, 2017 at 4:16

11 Answers 11

160

I think it was probably because they were trying to make std::set and std::multiset as similar as possible. (And obviously count has a perfectly sensible meaning for std::multiset.)

Personally I think this was a mistake.

It doesn't look quite so bad if you pretend that count is just a misspelling of contains and write the test as:

if (myset.count(element)) 
   ...

It's still a shame though.

20
  • 6
    Incidentally, it's exactly the same with maps and multimaps (which is just as ugly, but less ugly than all these comparisons with .end()). Commented Mar 1, 2017 at 13:17
  • 8
    Alternatively, they may have seen no need for an additional member contains(), on the grounds that it would be redundant because for any std::set<T> s and T t, the result of s.contains(t) is exactly identical to the result of static_cast<bool>(s.count(t)). As using the value in a conditional expression would implicitly cast it to bool, they may have felt that count() served the purpose well enough. Commented Mar 1, 2017 at 20:57
  • 2
    That said, there are situations when having a function that returns the result as a bool would be useful, and the name contains is slightly clearer than the name count, which would allow code to be parsed faster by people that aren't familiar with set. Commented Mar 1, 2017 at 21:00
  • 3
    Misspelling? if (myset.ICanHaz(element)) ... :D Commented Mar 2, 2017 at 13:35
  • 3
    @MartinBonner It doesn't really matter if the reasons for leaving it out were dumb. It also doesn't really matter if the conversation wasn't the 100% final rationale. Your answer here is just a reasonable guess on how you think it should be. The conversation, and answer from somebody not only involved in it, but tasked with proposing it (even though they didn't) is unarguably closer to the truth than this guess, no matter how you look at it. At the bare minimum you should at least mention it in this answer, that would be a great improvement and would be the responsible thing to do.
    – Jason C
    Commented Mar 4, 2017 at 19:02
53
+100

To be able to write if (s.contains()), contains() has to return a bool (or a type convertible to bool, which is another story), like binary_search does.

The fundamental reason behind the design decision not to do it this way is that contains() which returns a bool would lose valuable information about where the element is in the collection. find() preserves and returns that information in the form of an iterator, therefore is a better choice for a generic library like STL. This has always been the guiding principle for Alex Stepanov, as he has often explained (for example, here).

As to the count() approach in general, although it's often an okay workaround, the problem with it is that it does more work than a contains() would have to do.

That is not to say that a bool contains() isn't a very nice-to-have or even necessary. A while ago we had a long discussion about this very same issue in the ISO C++ Standard - Future Proposals group.

5
  • 10
    And it is interesting to note that that discussion ended with near consensus on it being desirable and you being asked to write a proposal for it.
    – PJTraill
    Commented Mar 2, 2017 at 20:54
  • @PJTraill True, and the reason I didn't go forward is that contains() would, obviously, interact strongly with existing containers and algorithms, which would be strongly impacted by concepts and ranges — at the time expected to come into C++17 — and I was convinced (as a result of the discussion as well as a couple of private email exchanges) that it was a better idea to wait for them first. Of course, in 2015 it wasn't clear that neither concepts nor ranges would not make it into C++17 (in fact, there were high hopes that they would). I'm not sure it's worth pursuing it now, though. Commented Mar 3, 2017 at 11:56
  • 2
    For std::set (which is what the question asks about), I don't see how count does more work than contains would have to do. The glibc implementation of count is (roughly) return find(value) == end() ? 0 : 1;. Apart from details about the ternary operator vs just returning != end() (which I would expect the optimizer to remove), I can't see how there is any more work. Commented Mar 4, 2017 at 14:11
  • 8
    " ... contains() which returns a bool would lose valuable information about where the element is in the collection" -- If the user calls myset.contains() (if it existed), that would be a pretty strong indication that that information is not valuable (to the user in that context). Commented Mar 17, 2017 at 21:17
  • 1
    Why does count() do more work than contains() would have to do for std::set? It's unique so count() could just be return contains(x) ? 1 : 0; which is exactly the same.
    – Timmmm
    Commented Nov 21, 2017 at 13:02
24

It lacks it because nobody added it. Nobody added it because the containers from the STL that the std library incorporated where designed to be minimal in interface. (Note that std::string did not come from the STL in the same way).

If you don't mind some strange syntax, you can fake it:

template<class K>
struct contains_t {
  K&& k;
  template<class C>
  friend bool operator->*( C&& c, contains_t&& ) {
    auto range = std::forward<C>(c).equal_range(std::forward<K>(k));
    return range.first != range.second;
    // faster than:
    // return std::forward<C>(c).count( std::forward<K>(k) ) != 0;
    // for multi-meows with lots of duplicates
  }
};
template<class K>
containts_t<K> contains( K&& k ) {
  return {std::forward<K>(k)};
}

use:

if (some_set->*contains(some_element)) {
}

Basically, you can write extension methods for most C++ std types using this technique.

It makes a lot more sense to just do this:

if (some_set.count(some_element)) {
}

but I am amused by the extension method method.

The really sad thing is that writing an efficient contains could be faster on a multimap or multiset, as they just have to find one element, while count has to find each of them and count them.

A multiset containing 1 billion copies of 7 (you know, in case you run out) can have a really slow .count(7), but could have a very fast contains(7).

With the above extension method, we could make it faster for this case by using lower_bound, comparing to end, and then comparing to the element. Doing that for an unordered meow as well as an ordered meow would require fancy SFINAE or container-specific overloads however.

5
  • 2
    1 billion copies of 7? And here I thought std::set cannot contain duplicates and therefore std::set::count will always return 0 or 1.
    – nwp
    Commented Mar 2, 2017 at 8:58
  • 5
    @nwp std::multiset::count can Commented Mar 2, 2017 at 9:00
  • 2
    @nwp My lack of backticks around the word "set" is because I'm not referring to std::set specifically. To make you feel better, I'll add multi Commented Mar 2, 2017 at 13:33
  • 3
    I seem to be missing the joke for whatever "meow" was supposed to be a reference to. Commented Mar 2, 2017 at 18:39
  • 2
    @user2357112 meow is a placeholder for "set or map". Ask the STL why. Commented Mar 2, 2017 at 18:41
13

You are looking into particular case and not seeing bigger picture. As stated in documentation std::set meets requirement of AssociativeContainer concept. For that concept it does not make any sense to have contains method, as it is pretty much useless for std::multiset and std::multimap, but count works fine for all of them. Though method contains could be added as an alias for count for std::set, std::map and their hashed versions (like length for size() in std::string ), but looks like library creators did not see real need for it.

8
  • 8
    Note that string is a monster: it existed prior to the STL, where it had length and all those methods which are index-based, and then was "containerified" to fit within the STL model... without removing the existing methods for backward compatibility reasons. See GotW #84: Monoliths Unstrung => string really violates the "minimum amount of member functions" design principle. Commented Mar 1, 2017 at 16:02
  • 6
    But then the question becomes "Why is it worth having an AssociativeContainer concept like that?" - and I'm not sure it hindsight it was. Commented Mar 1, 2017 at 16:50
  • 27
    Asking if a multiset, a multimap or map contains something makes perfect sense to me. In fact, contains is equal in effort on a set/map, but can be made faster than count on a multiset/multimap. Commented Mar 1, 2017 at 18:48
  • 5
    AssociativeContainer doesn't require classes to not have a contains method. Commented Mar 2, 2017 at 0:51
  • 8
    @Slava That's like saying size() and empty() are duplicates, yet many containers have both.
    – Barry
    Commented Mar 2, 2017 at 16:07
10

Although I don't know why std::set has no contains but count which only ever returns 0 or 1, you can write a templated contains helper function like this:

template<class Container, class T>
auto contains(const Container& v, const T& x)
-> decltype(v.find(x) != v.end())
{
    return v.find(x) != v.end();
}

And use it like this:

    if (contains(myset, element)) ...
10
  • 3
    -1, because this is straightly contradicted by the fact that in fact the contains method does exist, it's just named in a stupid way. Commented Mar 1, 2017 at 14:33
  • 4
    "The STL strives to offer a minimal interface" chough std::string cough
    – bolov
    Commented Mar 1, 2017 at 14:56
  • 6
    @bolov: your point? std.::string is NOT part of the STL! It's part of the standard library and was retroactively templated...
    – MFH
    Commented Mar 1, 2017 at 15:22
  • 3
    @MatteoItalia count can be slower since it effectively needs to do two finds to get the beginning and the end of the range if the code is shared with multiset. Commented Mar 1, 2017 at 15:30
  • 2
    The OP already knows it is redundant, but apparently wants the code to explicitly read contains. I see nothing wrong with that. @MarkRansom the little SFINAE is to prevent this template from binding to things it shouldn't.
    – rustyx
    Commented Mar 1, 2017 at 15:35
9

Since c++20,

bool contains( const Key& key ) const

is available.

7

The true reason for set is a mystery for me, but one possible explanation for this same design in map could be to prevent people from writing inefficient code by accident:

if (myMap.contains("Meaning of universe"))
{
    myMap["Meaning of universe"] = 42;
}

Which would result in two map lookups.

Instead, you are forced to get an iterator. This gives you a mental hint that you should reuse the iterator:

auto position = myMap.find("Meaning of universe");
if (position != myMap.cend())
{
    position->second = 42;
}

which consumes only one map lookup.

When we realize that set and map are made from the same flesh, we can apply this principle also to set. That is, if we want to act on an item in the set only if it is present in the set, this design can prevent us from writing code as this:

struct Dog
{
    std::string name;
    void bark();
}

operator <(Dog left, Dog right)
{
    return left.name < right.name;
}

std::set<Dog> dogs;
...
if (dogs.contain("Husky"))
{
    dogs.find("Husky")->bark();
}

Of course all this is a mere speculation.

4
  • 1
    Yes, but for sets of ints this does not apply. Commented Mar 1, 2017 at 13:40
  • 8
    Except people can just write if (myMap.count("Meaning of universe")) just fine, so... ?
    – Barry
    Commented Mar 1, 2017 at 13:59
  • @MichaelWalz Oops, you're right. I modified my answer to include also the set example. However, the reasoning for a set of ints is a mystery to me. Commented Mar 1, 2017 at 14:13
  • 2
    This cannot be right. They can just as easily write your inefficient code with contains as with count. Commented Mar 1, 2017 at 16:52
3

I'd like to point out , as mentioned by Andy, that since C++20 the standard added the contains Member function for maps or set:

bool contains( const Key& key ) const;  (since C++20)

Now I'd like to focus my answer regarding performance vs readability. In term of performance if you compare the two versions:

#include <unordered_map>
#include <string>
using hash_map = std::unordered_map<std::string,std::string>;
hash_map a;

std::string get_cpp20(hash_map& x,std::string str)
{
    if(x.contains(str))
        return x.at(str);
    else
        return "";
};

std::string get_cpp17(hash_map& x,std::string str)
{
    if(const auto it = x.find(str); it !=x.end())
        return it->second;
    else
        return "";
};

You will find that the cpp20 version takes two calls to std::_Hash_find_last_result while the cpp17 takes only one call.

Now I find myself with many data structure with nested unordered_map. So you end up with something like this:

using my_nested_map = std::unordered_map<std::string,std::unordered_map<std::string,std::unordered_map<int,std::string>>>;

std::string get_cpp20_nested(my_nested_map& x,std::string level1,std::string level2,int level3)
{
    if(x.contains(level1) &&
        x.at(level1).contains(level2) &&
        x.at(level1).at(level2).contains(level3))

        return x.at(level1).at(level2).at(level3);
    else
        return "";
};

std::string get_cpp17_nested(my_nested_map& x,std::string level1,std::string level2,int level3)
{
    if(const auto it_level1=x.find(level1); it_level1!=x.end())
        if(const auto it_level2=it_level1->second.find(level2);it_level2!=it_level1->second.end())
            if(const auto it_level3=it_level2->second.find(level3);it_level3!=it_level2->second.end())
                return it_level3->second;

    return "";
};

Now if you have plenty of condition in-between these ifs, using the iterator really is painful, very error prone and unclear, I often find myself looking back at the definition of the map to understand what kind of object was at level 1 or level2, while with the cpp20 version , you see at(level1).at(level2).... and understand immediately what you are dealing with. So in term of code maintenance/review, contains is a very nice addition.

1
  • 4
    The usage of contains in get_cpp20 is abusive. contains should of course be used only if you want to know if an element is in the map or not without actually retrieving the element. So using contains makes a lot of sense with sets, but much less with maps. Commented May 25, 2022 at 15:40
0

What about binary_search ?

 set <int> set1;
 set1.insert(10);
 set1.insert(40);
 set1.insert(30);
 if(std::binary_search(set1.begin(),set1.end(),30))
     bool found=true;
2
  • It would't work for std::unordered_set, but for std::set it would. Commented Feb 8, 2019 at 16:21
  • It's normal, binary_search works just for binary trees. Commented Feb 8, 2019 at 16:31
0

contains() has to return a bool. Using C++ 20 compiler I get the following output for the code:

#include<iostream>
#include<map>
using namespace std;

int main()
{
    multimap<char,int>mulmap;
    mulmap.insert(make_pair('a', 1)); //multiple similar key
    mulmap.insert(make_pair('a', 2)); //multiple similar key
    mulmap.insert(make_pair('a', 3)); //multiple similar key
    mulmap.insert(make_pair('b', 3));
    mulmap.insert({'a',4});
    mulmap.insert(pair<char,int>('a', 4));
    
    cout<<mulmap.contains('c')<<endl;  //Output:0 as it doesn't exist
    cout<<mulmap.contains('b')<<endl;  //Output:1 as it exist
}
-1

Another reason is that it would give a programmer the false impression that std::set is a set in the math set theory sense. If they implement that, then many other questions would follow: if an std::set has contains() for a value, why doesn't it have it for another set? Where are union(), intersection() and other set operations and predicates?

The answer is, of course, that some of the set operations are already implemented as functions in (std::set_union() etc.) and other are as trivially implemented as contains(). Functions and function objects work better with math abstractions than object members, and they are not limited to the particular container type.

If one need to implement a full math-set functionality, he has not only a choice of underlying container, but also he has a choice of implementation details, e.g., would his theory_union() function work with immutable objects, better suited for functional programming, or would it modify its operands and save memory? Would it be implemented as function object from the start or it'd be better to implement is a C-function, and use std::function<> if needed?

As it is now, std::set is just a container, well-suited for the implementation of set in math sense, but it is nearly as far from being a theoretical set as std::vector from being a theoretical vector.

Not the answer you're looking for? Browse other questions tagged or ask your own question.